Currently, when one side of a connection in TAO goes down, the other side does not notice until it makes a call. Because the failure is only discovered during the invocation, the call fails with a completion status of COMPLETED_MAYBE. This prevents the ORB from falling back on the original profile (if the reference had been forwarded) and thus prevents key ImplRepo functionality from working. The same problem appears when a client makes one-way calls on an ImplRepo-using server; one-ways cannot be forwarded because the caller does not wait for a response. So the plan is to re-enable LOCATE_REQUEST support and to add other strategies, such as issuing a LOCATE_REQUEST or a _non_existent() call with the first (or every) request to a server.
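Until such a strategy exists inside the ORB, an application can approximate it by probing the reference itself. A minimal sketch, assuming the "file://server.ior" location and the include path are placeholders (it uses only the standard CORBA::Object::_non_existent() operation, which forces a round trip before any real or one-way request is sent):

#include "tao/corba.h"
#include <iostream>

int main (int argc, char *argv[])
{
  try
    {
      CORBA::ORB_var orb = CORBA::ORB_init (argc, argv);

      // "file://server.ior" is a placeholder for however the reference
      // is normally obtained (Naming Service, command line, etc.).
      CORBA::Object_var obj =
        orb->string_to_object ("file://server.ior");

      // _non_existent() forces a round trip to the (possibly forwarded)
      // target before any real request is sent, so a dead server surfaces
      // here, while the ORB can still fall back on the original profile,
      // instead of failing a later call with COMPLETED_MAYBE.
      if (obj->_non_existent ())
        std::cerr << "Target object no longer exists" << std::endl;

      orb->destroy ();
    }
  catch (const CORBA::Exception &)
    {
      std::cerr << "Probe failed; re-resolve the reference and retry"
                << std::endl;
    }
  return 0;
}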
Darrell, I think we can optimize some of this stuff for cases where we're a TAO client talking with a TAO server. The trick is to use a special TAO encoding of persistent object references so we can tell quickly whether an IOR points to the ImplRepo or to a real server. If it points at the real server, then we could skip the LOCATE_REQUEST.
Darrell knows about this issue. He should be the point man.
I suppose...
This is not a bug. It is always possible that the kernel will not report any errors on a connection until after the request is sent. Using a LocateRequest or a similar mechanism can help minimize the problem, but the server can still crash after the connection has been declared "live" by the client and before the request is received by the server. Adding a complex protocol in an attempt to detect whether the connection is live would only decrease performance as well as the effective bandwidth on the network. We recommend that the reader consult a good book on distributed algorithms, in particular the detection of crash failures on networks with unbounded delays and the Two Generals problem. Both theoretical problems are closely related to this one, and neither can be solved. Under those circumstances it is better to leave the ORB as it is and deal with the problem (if possible) at the application layer.
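A minimal sketch of that application-layer handling, assuming a hypothetical IDL-generated proxy with an idempotent do_work() operation (none of these names come from TAO itself); the point is that only the application knows whether a request that failed with COMPLETED_MAYBE can safely be reissued:

#include "tao/corba.h"

// "Worker_ptr" stands in for any IDL-generated proxy type with an
// idempotent operation do_work(); both names are hypothetical.
template <typename Worker_ptr>
void invoke_with_retry (Worker_ptr server, int max_attempts = 3)
{
  for (int attempt = 0; attempt < max_attempts; ++attempt)
    {
      try
        {
          server->do_work ();
          return;
        }
      catch (const CORBA::TRANSIENT &)
        {
          // The request never reached the server; retrying is always safe.
        }
      catch (const CORBA::COMM_FAILURE &ex)
        {
          // COMPLETED_MAYBE means the request may already have executed
          // once; retry only because do_work() is assumed idempotent.
          if (ex.completed () != CORBA::COMPLETED_MAYBE)
            throw;
        }
    }
}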
*** Bug 213 has been marked as a duplicate of this bug. ***