Bug 303 - Connection Verification/LOCATE_REQUEST Support
Summary: Connection Verification/LOCATE_REQUEST Support
Status: RESOLVED
Alias: None
Product: TAO
Classification: Unclassified
Component: ORB
Version: 1.0.3
Hardware: All All
Importance: P2 enhancement
Assignee: Nanbor Wang
URL:
Depends on:
Blocks: 212 213
Reported: 1999-09-06 16:37 CDT by Nanbor Wang
Modified: 2000-02-08 15:26 CST

Description Nanbor Wang 1999-09-06 16:37:02 CDT
Currently, when one side of a connection in TAO goes down, the other side does
not notice until after it makes a call.  Because the failure is only discovered
once the request has been sent, the call fails with a completion status of
COMPLETED_MAYBE.  With that status the ORB cannot tell whether the request was
executed, so it cannot fall back on the original profile (if the reference has
been forwarded), and this prevents key ImplRepo functionality from working.

This problem also appears when a client makes one-way calls on an IR-using
server.  One-way calls cannot be forwarded, since the caller does not wait for
a response.

So the plan is to re-enable LOCATE_REQUEST support and to add other
strategies, such as sending a LOCATE_REQUEST or a _non_existent call before
the first call (or before every call) to a server.
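
The probing strategy described above can be sketched roughly as follows.  This
is only an illustration under stated assumptions, not TAO code: Profile,
probe, invoke and invoke_with_probe are hypothetical names, and each profile
is modelled as a pair of callables rather than a real GIOP connection.  The
point is that a failed probe happens before the real request is sent, so
falling back to the original profile is safe.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for an object reference profile (not a TAO type).
struct Profile {
    std::function<bool()> probe;   // cheap liveness check, e.g. a LOCATE_REQUEST
    std::function<int()>  invoke;  // the real request, returning some result
};

// Probe the forwarded profile before sending the real request.  If the probe
// fails, nothing has been sent to the forwarded server yet, so the request
// can be sent through the original profile instead.
int invoke_with_probe(const Profile& forwarded, const Profile& original) {
    if (forwarded.probe()) {
        return forwarded.invoke();
    }
    return original.invoke();
}
```

Probing before every call costs an extra round trip per request, while
probing only before the first call leaves later calls unprotected; which
trade-off is acceptable depends on the application.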
Comment 1 Douglas C. Schmidt 1999-09-06 17:15:59 CDT
Darrell, I think we can optimize some of this stuff for cases where we're a
TAO client talking with a TAO server.  The trick is to use a special TAO
encoding of persistent object references so we can tell quickly whether an
IOR points to an ImplRepo or to a real server.  If we're pointing at the real
server, then we could skip the LOCATE_REQUEST.
Comment 2 Ossama Othman 1999-09-07 11:11:59 CDT
Darrell knows about this issue.  He should be the point man.
Comment 3 Nanbor Wang 1999-09-15 10:50:59 CDT
I suppose...
Comment 4 Carlos O'Ryan 2000-02-08 15:20:59 CST
This is not a bug.  It is always possible that the kernel will not report any
errors on a connection until after the request is sent.
Trying to use a LocateRequest or a similar mechanism can help to minimize the
problem, but it is always possible for the server to crash after the connection
has been declared "live" by the client, but before the request is received by
the server.
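
A small simulation can make that window concrete.  The sketch below is an
assumption-laden toy, not ORB code: FakeServer, Outcome and probe_then_call
are invented for illustration, and the "between" callback stands for whatever
happens after the probe returns and before the request arrives.

```cpp
#include <cassert>
#include <functional>

// Toy server: a single flag stands in for "process is up and reachable".
struct FakeServer {
    bool alive = true;
    bool probe() const { return alive; }    // models a LocateRequest
    bool request() const { return alive; }  // models the real request
};

enum class Outcome { Delivered, ProbeFailed, LostAfterProbe };

// Probe first, then send the request.  `between` runs in the gap between the
// two steps; a crash there is invisible to the probe.
Outcome probe_then_call(FakeServer& server,
                        const std::function<void()>& between) {
    if (!server.probe()) {
        return Outcome::ProbeFailed;
    }
    between();  // anything can happen here, including a server crash
    return server.request() ? Outcome::Delivered : Outcome::LostAfterProbe;
}
```

Passing a callback that sets alive to false yields LostAfterProbe even though
the probe succeeded, which is the case the probe cannot rule out.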

Adding a complex protocol in an attempt to detect if the connection is live
would only decrease performance as well as decrease the effective bandwidth on
the network.

We recommend that the reader consult a good book on distributed algorithms, in
particular on the detection of crash failures in networks with unbounded delays
and on the two generals problem.  Both theoretical problems are closely related
to this one, and neither can be solved.

In those circumstances it is better to leave the ORB as it is, and deal with the
problem (if possible) at the application layer.
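
One possible shape for such application-level handling is a retry wrapper
that uses the completion status to decide whether a retry is safe.  This is a
minimal sketch under assumptions: Completion, CallResult and call_with_retry
are hypothetical names, not part of TAO or the CORBA C++ mapping, and the
caller is assumed to know whether the operation is idempotent.

```cpp
#include <cassert>
#include <functional>

// Mirrors the three CORBA completion states; the type itself is hypothetical.
enum class Completion { Yes, No, Maybe };

struct CallResult {
    bool ok;               // true if the call returned normally
    Completion completed;  // meaningful only when ok is false
};

// Retry a failed call only when doing so cannot execute the operation twice
// with harmful effect: always after Completion::No, and after
// Completion::Maybe only if the caller declares the operation idempotent.
bool call_with_retry(const std::function<CallResult()>& call,
                     bool idempotent, int max_attempts) {
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        CallResult r = call();
        if (r.ok) return true;
        if (r.completed == Completion::Yes) return false;  // already executed
        if (r.completed == Completion::Maybe && !idempotent) return false;
    }
    return false;  // attempts exhausted
}
```

For non-idempotent operations the application needs something stronger, such
as request identifiers that let the server discard duplicates; that is beyond
this sketch.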
Comment 5 Carlos O'Ryan 2000-02-08 15:26:59 CST
*** Bug 213 has been marked as a duplicate of this bug. ***