Bug 1185 - ORB blocks during connect() calls
Summary: ORB blocks during connect() calls
Status: RESOLVED FIXED
Alias: None
Product: TAO
Classification: Unclassified
Component: ORB
Version: 1.2.2
Hardware: All All
Importance: P1 critical
Assignee: Nanbor Wang
URL:
Depends on:
Blocks: 189 940 1131
Reported: 2002-04-11 10:52 CDT by Carlos O'Ryan
Modified: 2002-05-09 14:36 CDT

Description Carlos O'Ryan 2002-04-11 10:52:39 CDT
The ORB performs blocking connect() calls.  This causes problems with the
timeout policy (as described in bug 1131), but it also makes middle-tier servers
prone to deadlocks, for example, if two ORBs are trying to connect simultaneously.
Using blocking connect() calls makes it hard to add any configuration options to
work around the problems described in bug 189.

With the current architecture in the Leader/Followers loop it should be possible
to add yet another 'LF_Event' that waits for a connection without blocking.
But it will probably require some changes to the pluggable protocols framework.
In detail: the connectors will need access to some strategy to set up
connections; the strategy will depend on whether the ORB is using
Leader/Followers, the Reactor or blocking connects [1].  If we are using the
Leader/Followers stuff then the connector will set the ACE_Asynch_Options
accordingly, allocate an LF_Event on the stack, and block until the LF_Event is
satisfied.  More or less the same thing should happen with a concurrency
strategy based on the reactor (without LF).  The blocking strategy is just what
we are doing now.
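
As a rough illustration of the strategy selection described above, here is a
minimal C++ sketch.  Every name in it (Concurrency, ConnectStrategy,
make_connect_strategy and so on) is hypothetical and not taken from TAO or ACE.
It only shows how a connector could pick a connect strategy from the ORB's
concurrency model and fall back to blocking when a protocol cannot connect
asynchronously.

```cpp
#include <memory>
#include <string>

// Hypothetical sketch: none of these types exist in TAO or ACE.
enum class Concurrency { LeaderFollowers, Reactor, Blocking };

class ConnectStrategy {
public:
  virtual ~ConnectStrategy() = default;
  virtual std::string name() const = 0;
  // True if the calling thread sits inside connect() until it completes.
  virtual bool blocks_in_connect() const = 0;
};

class BlockingConnect : public ConnectStrategy {
public:
  std::string name() const override { return "blocking"; }
  bool blocks_in_connect() const override { return true; }
};

class ReactorConnect : public ConnectStrategy {
public:
  std::string name() const override { return "reactor"; }
  bool blocks_in_connect() const override { return false; }
};

class LeaderFollowersConnect : public ConnectStrategy {
public:
  std::string name() const override { return "leader-followers"; }
  bool blocks_in_connect() const override { return false; }
};

// Chosen once per ORB configuration and shared by all protocols, so the
// selection logic is not repeated in each pluggable protocol.
std::unique_ptr<ConnectStrategy>
make_connect_strategy(Concurrency model, bool protocol_supports_async) {
  // Protocols that cannot connect asynchronously simply block.
  if (!protocol_supports_async)
    return std::make_unique<BlockingConnect>();
  switch (model) {
    case Concurrency::LeaderFollowers:
      return std::make_unique<LeaderFollowersConnect>();
    case Concurrency::Reactor:
      return std::make_unique<ReactorConnect>();
    case Concurrency::Blocking:
      break;
  }
  return std::make_unique<BlockingConnect>();
}
```

The factory keeps the choice outside the protocol code: a connector would only
ask for a strategy and never inspect the concurrency model itself.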

As is the case with the Waiting strategies, we have to be careful to decouple
the strategies from the pluggable protocol code; there is no sense in repeating
this code in each protocol.  I do not believe that all protocols will be able to
support asynchronous connects; those that cannot will simply have to block.

[1] This can be done with yet another option or we may start fixing the option
madness and use a single option (-ORBConcurrency seems reasonable) to set the
Waiting Strategy and this new thing.  For backwards compatibility we can keep
the old option, as an override over the defaults provided by -ORBConcurrency.
Check bug 940 for more details on this.
Comment 1 Carlos O'Ryan 2002-04-11 10:53:28 CDT
Added dependencies; fixing this bug is required to fix a number of other things.
Comment 2 Nanbor Wang 2002-04-11 11:41:13 CDT
Assigning the bug to me
Comment 3 Nanbor Wang 2002-04-11 11:41:58 CDT
Accepting this bug 
Comment 4 Nanbor Wang 2002-04-11 17:50:58 CDT
After going through the ACE code I have the following observations to make:

- Whenever we have a timeout set in the ORB, i.e. a roundtrip timeout through the
RELATIVE_RT_TIMEOUT_POLICY or through the hook (I am not sure why we still need
this; IMHO it should be removed), the ORB does a non-blocking connect by
default.

- When we don't have a timeout set, the ORB does a blocking connect ().

We can address the latter, i.e. prevent the ORB from doing a blocking connect, by
using the strategies pointed out by Carlos. But we have a problem in
that the Strategy Connector does not return control to the developer
when the system connect () returns EINPROGRESS. The problem is exacerbated by
the fact that ACE assumes (AFAICS) non-blocking connects only when a timeout
argument is passed.  I am not sure how to address these issues, other
than by hacking up portions of ACE. Any help would be great, including pointing
out any misunderstandings on my part.
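
For reference, the EINPROGRESS behaviour discussed above can be sketched with
plain POSIX calls.  This is not the ACE Strategy Connector code, and the names
ConnectState, start_connect, finish_connect and loopback_demo are made up for
illustration.  The point is only that a non-blocking connect can hand control
back to the caller and be completed later, with or without a timeout.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical names; only the POSIX calls are real.
enum class ConnectState { Connected, InProgress, Failed };

// Switch the socket to non-blocking mode and start the connect.
// Control returns to the caller even if the handshake is still pending.
ConnectState start_connect(int fd, const sockaddr* addr, socklen_t len) {
  int flags = fcntl(fd, F_GETFL, 0);
  if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
    return ConnectState::Failed;
  if (connect(fd, addr, len) == 0)
    return ConnectState::Connected;  // completed immediately
  return errno == EINPROGRESS ? ConnectState::InProgress
                              : ConnectState::Failed;
}

// Wait for the pending connect to finish.  timeout_ms < 0 waits without a
// limit, which corresponds to the "no timeout set" case.  For brevity an
// interrupted poll() (EINTR) is reported as Failed.
ConnectState finish_connect(int fd, int timeout_ms) {
  pollfd pfd{fd, POLLOUT, 0};
  int n = poll(&pfd, 1, timeout_ms);
  if (n == 0) return ConnectState::InProgress;  // timed out, still pending
  if (n < 0) return ConnectState::Failed;
  int err = 0;
  socklen_t len = sizeof err;
  if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1 || err != 0)
    return ConnectState::Failed;
  return ConnectState::Connected;
}

// Self-check against a loopback listener on a kernel-chosen port.
bool loopback_demo() {
  int srv = socket(AF_INET, SOCK_STREAM, 0);
  int cli = socket(AF_INET, SOCK_STREAM, 0);
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0;  // let the kernel pick a free port
  socklen_t len = sizeof addr;
  sockaddr* sa = reinterpret_cast<sockaddr*>(&addr);
  ConnectState state = ConnectState::Failed;
  if (srv >= 0 && cli >= 0 && bind(srv, sa, len) == 0 &&
      listen(srv, 1) == 0 && getsockname(srv, sa, &len) == 0) {
    state = start_connect(cli, sa, len);
    if (state == ConnectState::InProgress)
      state = finish_connect(cli, 1000);
  }
  if (cli >= 0) close(cli);
  if (srv >= 0) close(srv);
  return state == ConnectState::Connected;
}
```

In an event-driven ORB the poll() step would presumably be replaced by
registering the handle with the reactor, so that no thread waits on a single
connection.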

Further, I seem to be missing the connection between this bug and bug 1131.

Comment 5 Nanbor Wang 2002-05-09 14:36:08 CDT
Added non-blocking connects per se, and that actually works ;). Here is the entry:

Sat Apr 27 16:54:22 2002  Balachandran Natarajan  <bala@cs.wustl.edu>

But the option madness hasn't been fixed. That will be done in the next round of
changes.  I am closing this bug since the requirements raised in this one
should now have been addressed.