Bug 1292

Summary: Fix race conditions in new Load Balancer
Product: TAO
Reporter: Ossama Othman <ossama.othman>
Component: Load Balancer
Assignee: DOC Center Support List (internal) <tao-support>
Status: NEW
Severity: normal
CC: jai
Priority: P3
Version: 1.2.4
Hardware: All
OS: All
Bug Depends on:
Bug Blocks: 1277

Description Ossama Othman 2002-08-23 18:40:59 CDT
The new load balancer has race conditions in all methods that implement those
found in the PortableGroup::{GenericFactory, ObjectGroupManager,
PropertyManager} interfaces.

TAO's PortableGroup library has a separate lock for each of these interfaces.
However, locking at that level is too fine-grained.  Higher level code, such as
that found in the new LB, ends up having race conditions since the state of the
underlying PortableGroup code may change during a call to the higher level LB
code.

Remove the locks from the PortableGroup library, and add locks to the LB code
that calls the PortableGroup implementation.  This makes synchronization coarser
grained, but addresses the race conditions.  Performance isn't an issue here
since the methods in question are not in the critical path of the load balancer.
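
A minimal sketch of the proposed arrangement, using standard C++ rather than
the actual TAO/ACE classes.  GroupStore and CoarseLockedLB are hypothetical
stand-ins for the PortableGroup implementation (with its locks removed) and
the LB code that calls it; add_member_if_below() is an invented compound
operation used only to show the check and the update sharing one lock.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <set>
#include <string>

// Hypothetical stand-in for the PortableGroup implementation after its
// internal locks have been removed.  It is NOT thread-safe on its own.
class GroupStore {
public:
  void add_member(const std::string& group, const std::string& member) {
    groups_[group].insert(member);
  }
  std::size_t member_count(const std::string& group) const {
    auto it = groups_.find(group);
    return it == groups_.end() ? 0 : it->second.size();
  }
private:
  std::map<std::string, std::set<std::string>> groups_;
};

// Hypothetical stand-in for the LB code.  One coarse lock guards every
// call into the store, including multi-step operations.
class CoarseLockedLB {
public:
  // The count check and the insertion run under the same lock, so the
  // group cannot change between the two steps.
  bool add_member_if_below(const std::string& group,
                           const std::string& member,
                           std::size_t limit) {
    assert(limit > 0);  // precondition of this sketch
    std::lock_guard<std::mutex> guard(lock_);
    if (store_.member_count(group) >= limit) return false;
    store_.add_member(group, member);
    return true;
  }
  std::size_t member_count(const std::string& group) const {
    std::lock_guard<std::mutex> guard(lock_);
    return store_.member_count(group);
  }
private:
  mutable std::mutex lock_;
  GroupStore store_;
};
```

With per-interface locks inside GroupStore instead, the count could change
between member_count() and add_member(), which is the kind of race described
above.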
Comment 1 Ossama Othman 2002-08-23 18:41:22 CDT
Blocker for the TAO 1.3 release.
Comment 2 Ossama Othman 2002-08-23 18:41:57 CDT
Jai should be kept apprised of this bug, too.
Comment 3 Ossama Othman 2002-09-13 09:04:46 CDT
Mine.
Comment 4 Ossama Othman 2002-10-22 17:31:58 CDT
It turns out there aren't as many race conditions as I thought.  The primary
race conditions exist in the load balancing strategies that call back on the
LoadManager.  Strictly speaking, the so-called race conditions are not race
conditions since all operations are performed atomically.  The real problem is
that the built-in load balancing strategies retrieve information from the
LoadManager through the standard public methods.  Unfortunately, the retrieved
information, such as group membership, may already be obsoleted by related calls
made by another thread.
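
The problem can be sketched as follows, assuming a simplified manager in
which every public method is atomic.  AtomicLoadManager and pick_first() are
hypothetical names for illustration, not the real LoadManager interface.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>

// Hypothetical stand-in for the LoadManager: each public method takes the
// lock, so each individual call is atomic.
class AtomicLoadManager {
public:
  void add_member(const std::string& m) {
    std::lock_guard<std::mutex> guard(lock_);
    members_.insert(m);
  }
  void remove_member(const std::string& m) {
    std::lock_guard<std::mutex> guard(lock_);
    members_.erase(m);
  }
  // Returns a copy; the copy can go stale as soon as the lock is released.
  std::set<std::string> members() const {
    std::lock_guard<std::mutex> guard(lock_);
    return members_;
  }
  bool is_member(const std::string& m) const {
    std::lock_guard<std::mutex> guard(lock_);
    return members_.count(m) != 0;
  }
private:
  mutable std::mutex lock_;
  std::set<std::string> members_;
};

// Hypothetical strategy step: choose from a snapshot retrieved earlier.
std::string pick_first(const std::set<std::string>& snapshot) {
  assert(!snapshot.empty());  // precondition of this sketch
  return *snapshot.begin();
}
```

If a strategy calls members(), another thread then calls remove_member() on
one of the returned members, and the strategy finally calls pick_first() on
its copy, the chosen member may no longer be in the group even though no
single call was interrupted.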

So, we need to make a decision.  Should the built-in load balancing strategies
have the ability to lock out other threads when calling back on the LoadManager,
or should a "protocol" be defined that specifies how the strategies should
behave if the retrieved information is obsolete?
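
One possible shape for the second option, purely as an illustration and not a
decision recorded in this bug: stamp the retrieved information with a version
number and have the strategy retry when the stamp is out of date.  All names
below (MemberSnapshot, VersionedLoadManager, commit_choice, choose_member)
are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <set>
#include <string>

// Hypothetical snapshot: membership plus the version it was read at.
struct MemberSnapshot {
  std::set<std::string> members;
  std::uint64_t version;
};

class VersionedLoadManager {
public:
  void add_member(const std::string& m) {
    std::lock_guard<std::mutex> guard(lock_);
    if (members_.insert(m).second) ++version_;  // bump only on real change
  }
  void remove_member(const std::string& m) {
    std::lock_guard<std::mutex> guard(lock_);
    if (members_.erase(m) > 0) ++version_;
  }
  MemberSnapshot snapshot() const {
    std::lock_guard<std::mutex> guard(lock_);
    return MemberSnapshot{members_, version_};
  }
  // Accepts the choice only if membership is unchanged since the snapshot.
  bool commit_choice(const MemberSnapshot& snap, const std::string& chosen) {
    std::lock_guard<std::mutex> guard(lock_);
    if (snap.version != version_ || members_.count(chosen) == 0) return false;
    last_choice_ = chosen;
    return true;
  }
private:
  mutable std::mutex lock_;
  std::set<std::string> members_;
  std::uint64_t version_ = 0;
  std::string last_choice_;
};

// Hypothetical strategy: pick the first member, retrying on stale data.
bool choose_member(VersionedLoadManager& lm, std::string& chosen,
                   int max_attempts = 3) {
  assert(max_attempts > 0);  // precondition of this sketch
  for (int i = 0; i < max_attempts; ++i) {
    MemberSnapshot snap = lm.snapshot();
    if (snap.members.empty()) return false;
    chosen = *snap.members.begin();
    if (lm.commit_choice(snap, chosen)) return true;
  }
  return false;  // membership kept changing; caller decides what to do
}
```

The first option would instead expose the manager's lock to the strategy for
the duration of the callback, which is simpler for strategy authors but lets
a slow strategy block every other caller.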
Comment 5 Johnny Willemsen 2008-03-31 04:22:09 CDT
lowering severity