Bug 1292 - Fix race conditions in new Load Balancer
Summary: Fix race conditions in new Load Balancer
Status: NEW
Alias: None
Product: TAO
Classification: Unclassified
Component: Load Balancer
Version: 1.2.4
Hardware: All All
Importance: P3 normal
Assignee: DOC Center Support List (internal)
URL:
Depends on:
Blocks: 1277
Reported: 2002-08-23 18:40 CDT by Ossama Othman
Modified: 2008-03-31 04:22 CDT
Description Ossama Othman 2002-08-23 18:40:59 CDT
The new load balancer has race conditions in all methods that implement those
found in the PortableGroup::{GenericFactory, ObjectGroupManager,
PropertyManager} interfaces.

TAO's PortableGroup library has a separate lock for each of these interfaces.
However, locking at that level is too fine-grained.  Higher level code, such as
that found in the new LB, ends up having race conditions since the state of the
underlying PortableGroup code may change during a call to the higher level LB
code.

Remove the locks from the PortableGroup library, and add locks to the LB code
that calls the PortableGroup implementation.  This makes synchronization
coarser-grained, but addresses the race conditions.  Performance isn't an issue here
since the methods in question are not in the critical path of the load balancer.
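
A minimal sketch of the proposed arrangement, for illustration only.  It uses
std::mutex rather than ACE/TAO's own synchronization wrappers, and the
GroupManager and LoadBalancer classes below are hypothetical stand-ins, not the
real PortableGroup or LB classes.  The lower layer holds no lock of its own; the
higher layer takes one lock around each compound (check-then-act) operation, so
the underlying state cannot change partway through.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <string>

// Hypothetical stand-in for the PortableGroup layer: no internal locking.
class GroupManager {
public:
  void add_member(const std::string& m) { members_.insert(m); }
  void remove_member(const std::string& m) { members_.erase(m); }
  bool has_member(const std::string& m) const { return members_.count(m) != 0; }
  std::size_t size() const { return members_.size(); }

private:
  std::set<std::string> members_;
};

// Hypothetical stand-in for the LB layer: owns the single coarse lock.
class LoadBalancer {
public:
  void add_member(const std::string& m) {
    std::lock_guard<std::mutex> guard(lock_);
    groups_.add_member(m);
  }

  // The membership check and the update happen under one lock, so no other
  // thread can alter the group between the two steps.
  bool replace_member(const std::string& old_m, const std::string& new_m) {
    std::lock_guard<std::mutex> guard(lock_);
    if (!groups_.has_member(old_m))
      return false;
    groups_.remove_member(old_m);
    groups_.add_member(new_m);
    assert(groups_.has_member(new_m));
    return true;
  }

  std::size_t member_count() const {
    std::lock_guard<std::mutex> guard(lock_);
    return groups_.size();
  }

private:
  mutable std::mutex lock_;
  GroupManager groups_;
};
```

With per-interface locks in the lower layer instead, has_member() and
remove_member() would each be atomic on their own, yet another thread could
still change the group between them.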
Comment 1 Ossama Othman 2002-08-23 18:41:22 CDT
Blocker for the TAO 1.3 release.
Comment 2 Ossama Othman 2002-08-23 18:41:57 CDT
Jai should be kept apprised of this bug, too.
Comment 3 Ossama Othman 2002-09-13 09:04:46 CDT
Mine.
Comment 4 Ossama Othman 2002-10-22 17:31:58 CDT
It turns out there aren't as many race conditions as I thought.  The primary
race conditions exist in the load balancing strategies that call back on the
LoadManager.  Strictly speaking, the so-called race conditions are not race
conditions since all operations are performed atomically.  The real problem is
that the built-in load balancing strategies retrieve information from the
LoadManager through the standard public methods.  Unfortunately, the retrieved
information, such as group membership, may already have been made obsolete by
related calls from another thread.

So, we need to make a decision.  Should the built-in load balancing strategies
have the ability to lock out other threads when calling back on the LoadManager,
or should a "protocol" be defined that specifies how the strategies should
behave if the retrieved information is obsolete?
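
To make the two options concrete, here is a rough sketch of both.  Everything in
it is an assumption made for illustration: Manager, Snapshot, pick_locked and
pick_with_retry are hypothetical names, not part of the LoadManager interface,
and the generation counter is just one possible form the "protocol" could take.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical membership snapshot, tagged with the generation it was read at.
struct Snapshot {
  std::uint64_t generation;
  std::vector<std::string> members;
};

// Hypothetical stand-in for the LoadManager; not TAO's real interface.
class Manager {
public:
  void add_member(std::string m) {
    std::lock_guard<std::mutex> g(lock_);
    members_.push_back(std::move(m));
    ++generation_;  // every membership change invalidates older snapshots
  }

  Snapshot snapshot() const {
    std::lock_guard<std::mutex> g(lock_);
    return Snapshot{generation_, members_};
  }

  bool is_current(const Snapshot& s) const {
    std::lock_guard<std::mutex> g(lock_);
    return s.generation == generation_;
  }

  // Option 1: run the strategy with other threads locked out.  The strategy
  // must not call back into Manager's public methods here, because the mutex
  // is not recursive.
  template <typename Strategy>
  std::string pick_locked(Strategy pick) const {
    std::lock_guard<std::mutex> g(lock_);
    return pick(members_);
  }

private:
  mutable std::mutex lock_;
  std::uint64_t generation_ = 0;
  std::vector<std::string> members_;
};

// Option 2: the strategy works on a snapshot and retries if it went stale.
template <typename Strategy>
std::string pick_with_retry(const Manager& mgr, Strategy pick,
                            int max_attempts = 3) {
  assert(max_attempts > 0);
  std::string choice;
  for (int i = 0; i < max_attempts; ++i) {
    Snapshot s = mgr.snapshot();
    choice = pick(s.members);
    if (mgr.is_current(s))
      break;  // nothing changed while the strategy was deciding
  }
  return choice;
}
```

Option 1 keeps the strategy's view consistent at the cost of holding the lock
for the duration of the callback.  Option 2 is only best effort: the choice can
still become obsolete right after the final check, so callers would need to
tolerate that.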
Comment 5 Johnny Willemsen 2008-03-31 04:22:09 CDT
lowering severity