Bug 3307 - Shutdown race involving ORBInitializer_Registry
Summary: Shutdown race involving ORBInitializer_Registry
Status: ASSIGNED
Alias: None
Product: TAO
Classification: Unclassified
Component: Portable Interceptors (show other bugs)
Version: 1.6.4
Hardware: All Windows XP
: P3 normal
Assignee: ciju john
URL:
Depends on:
Blocks:
 
Reported: 2008-04-25 11:28 CDT by ciju john
Modified: 2008-04-26 02:42 CDT (History)
1 user (show)

See Also:


Attachments
TAO patch (15.91 KB, patch)
2008-04-25 11:28 CDT, ciju john
Details
ACE patch (3.66 KB, patch)
2008-04-25 11:29 CDT, ciju john
Details

Note You need to log in before you can comment on or make changes to this bug.
Description ciju john 2008-04-25 11:28:50 CDT
Created attachment 957 [details]
TAO patch

We (OCI) encountered a race condition involving ORBInitializer_Registry and ORBInitializer's registered by services in shared libraries.

The PI library gets automatically loaded by whichever service attempting the first ORBInitializer registration. Registering service A causes shared libs to be loaded in order A -> PI. The PI as part of its destruction clears out all registered ORBInitializers. However since at PI's destruction, library A has already been unloaded the process cores trying to access A's ORBInitializer dstr. We have seen this problem in client tests involving CSD, however we haven't been able to reproduce this in clean (outside client code) environment.


The solution is to tie library A's unloading with its ORBInitializers lifetime. I will next upload the proposed fix for this issue. This fix has been pushed into OCI 1.5a and I have tested it locally on DOC head. If there are no objections I will push this patch in next week.

Ciju
Comment 1 ciju john 2008-04-25 11:29:19 CDT
Created attachment 958 [details]
ACE patch
Comment 2 Johnny Willemsen 2008-04-25 12:37:47 CDT
wouldn't it be a better solution to have an API to remove an already registered ORBInitializer? When working on the sg/sc issues with Iliyan this was one of the things that came to my mind, it was on my list to report an issue to the OMG about this. 

Also couldn't this be resolved by the sg/sc changes from Iliyan, we fixed there some ordering problems when unloading. we had similar problems there
Comment 3 Johnny Willemsen 2008-04-25 12:49:05 CDT
patch seems to have some debug statements that are just for testing
Comment 4 ciju john 2008-04-25 12:50:01 CDT
(In reply to comment #2)
> wouldn't it be a better solution to have an API to remove an already registered
> ORBInitializer? When working on the sg/sc issues with Iliyan this was one of
> the things that came to my mind, it was on my list to report an issue to the
> OMG about this. 
> 
> Also couldn't this be resolved by the sg/sc changes from Iliyan, we fixed there
> some ordering problems when unloading. we had similar problems there
> 

Comment 5 ciju john 2008-04-25 12:54:47 CDT
(In reply to comment #2)
> wouldn't it be a better solution to have an API to remove an already registered
> ORBInitializer?

But someone will still have to defer the dll unloading until ORBInitializer destruction. This is done by the PortableInterceptor::DLL_Resident_ORB_Initializer class.

> When working on the sg/sc issues with Iliyan this was one of
> the things that came to my mind, it was on my list to report an issue to the
> OMG about this. 
> 
> Also couldn't this be resolved by the sg/sc changes from Iliyan, we fixed there
> some ordering problems when unloading. we had similar problems there

I am not sure about that. Iliyan do your changes target the scenario described earlier?

thanks,
Ciju
Comment 6 Johnny Willemsen 2008-04-25 12:56:28 CDT
As part of Iliyan's changes we did definitively fix some dll unload order problems. I think we should hold this change until there is a reproducer, I am afraid that we are making changes for something that already got resolved on svn head. Having an api to register an orbinitializer is something I consider usefull, independent of this issue, what do you think?
Comment 7 Iliyan Jeliazkov 2008-04-25 13:03:03 CDT
(In reply to comment #6)
> ... Having an api to register an orbinitializer is something I consider
> usefull, independent of this issue, what do you think?

I think you mean de-register, right? (sorry, if I'm missing the point)
Yes, I think there must be an equivalent of init/fini mechanism so the initializers can be un-plugged explicitly when unloading their DLLs from memory. However, I haven't given this a lot of thought and can't say how it should be best approached.
Comment 8 Johnny Willemsen 2008-04-25 13:07:41 CDT
(In reply to comment #7)
> (In reply to comment #6)
> > ... Having an api to register an orbinitializer is something I consider
> > usefull, independent of this issue, what do you think?
> 
> I think you mean de-register, right? (sorry, if I'm missing the point)
> Yes, I think there must be an equivalent of init/fini mechanism so the
> initializers can be un-plugged explicitly when unloading their DLLs from
> memory. However, I haven't given this a lot of thought and can't say how it
> should be best approached.
> 

Yes, unregister, typed to fast. I think we should file an issue to the OMG and then maybe add already a TAO specific extension
Comment 9 ciju john 2008-04-25 14:18:38 CDT
(In reply to comment #6)
> As part of Iliyan's changes we did definitively fix some dll unload order
> problems. I think we should hold this change until there is a reproducer, I am
> afraid that we are making changes for something that already got resolved on
> svn head. Having an api to register an orbinitializer is something I consider
> useful, independent of this issue, what do you think?

I agree. Having a deregistration API would have been great and could be another way to solve this problem. I didn't go that route because it changes the initializer destruction order in which things are currently done and therefore more prone to bugs creeping in. It seemed an easier fix to keep the dll around till the time the registry finishes up with all registered initializers.

Currently I don't have a test for this. CSD or SSLIOP tests may show up this problem.

Ciju
Comment 10 Johnny Willemsen 2008-04-26 02:42:53 CDT
(In reply to comment #9)
> (In reply to comment #6)
> > As part of Iliyan's changes we did definitively fix some dll unload order
> > problems. I think we should hold this change until there is a reproducer, I am
> > afraid that we are making changes for something that already got resolved on
> > svn head. Having an api to register an orbinitializer is something I consider
> > useful, independent of this issue, what do you think?
> 
> I agree. Having a deregistration API would have been great and could be another
> way to solve this problem. I didn't go that route because it changes the
> initializer destruction order in which things are currently done and therefore
> more prone to bugs creeping in. It seemed an easier fix to keep the dll around
> till the time the registry finishes up with all registered initializers.

It would be a much better fix to deregister things, that can also be used for cases where things are not in a dll. when the dll does it cleanup it can cleanup the installed initializers. 

> Currently I don't have a test for this. CSD or SSLIOP tests may show up this
> problem.

We really need a test to see if this still is a problem. we had issues on doc head that dll's got unloaded to early, Iliyan addressed this in the service config framework.