Please report new issues at https://github.com/DOCGroup
Hi,

We have a system where each machine has its own naming service. For example, if we have 2 machines in our system, each machine will have a NamingContext (Machine-A) pointing to the root of machine A's Naming Service and a NamingContext (Machine-B) pointing to the root of machine B's Naming Service.

Using the NamingViewer, the naming services should look like this:

For Machine A:
Root
  Machine-A
  Machine-B
  TestServant

For Machine B:
Root
  Machine-A
  Machine-B
  TestServant

The deadlock occurs when both machines try to resolve the TestServant at the same time:

On machine B:
  Machine-B-Root->resolvestr("Machine-A\TestServant");
On machine A:
  Machine-A-Root->resolvestr("Machine-B\TestServant");

where Machine-A-Root and Machine-B-Root are the roots of their respective NamingContexts.

The deadlock occurs in the Root NamingContext of each machine. Machine B holds the lock on its own Root context while it resolves the Machine-A context, so machine A's request cannot proceed in machine B's Root context. Likewise, machine B cannot resolve the TestServant object because the Root context of machine A is locked by machine A itself.

The deadlock is in the implementation of the Naming Service (the TAO_Hash_Naming_Context and TAO_Storable_Naming_Context classes), in the resolve method and in all other methods that take a CosNaming::Name parameter.

CORBA::Object_ptr
TAO_Hash_Naming_Context::resolve (const CosNaming::Name& n
                                  ACE_ENV_ARG_DECL)
{
  ACE_GUARD_THROW_EX (TAO_SYNCH_RECURSIVE_MUTEX, ace_mon, this->lock_,
                      CORBA::INTERNAL ());
  {
    ...
    // If there are any exceptions, they will propagate up.
    ACE_TRY
      {
        CORBA::Object_ptr resolved_ref;
->>>>> Here, it should release the mutex acquired before calling the context of another machine.
        resolved_ref = context->resolve (rest_of_name ACE_ENV_ARG_PARAMETER);
        ACE_TRY_CHECK;
        return resolved_ref;
      }
    ACE_CATCH (CORBA::TIMEOUT, timeoutEx)
      {
        ACE_PRINT_EXCEPTION (timeoutEx, "Hash_Naming_Context::resolve (), Caught CORBA::TIMEOUT exception");
        // throw a CannotProceed exception back to the client
        //
        ACE_TRY_THROW (CosNaming::NamingContext::CannotProceed (context.in (), rest_of_name));
      }
    ACE_ENDTRY;
  }
  ...
}
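To illustrate the idea behind the suggested change, here is a minimal standalone sketch in standard C++. It is not the TAO code and not the attached patch. The Context class, its bind_object and bind_context helpers, the cross_resolve_ok function, and the use of '/' as the name separator are all hypothetical names and assumptions made for this example. Each context looks up the first name component while holding its own lock, then drops the lock before forwarding the rest of the name to the other context.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for a naming context; NOT the TAO class.
class Context {
public:
  void bind_object(const std::string& name, const std::string& ref) {
    std::lock_guard<std::recursive_mutex> guard(lock_);
    objects_[name] = ref;
  }

  void bind_context(const std::string& name, Context* ctx) {
    std::lock_guard<std::recursive_mutex> guard(lock_);
    contexts_[name] = ctx;
  }

  // Resolves "name" or "context/rest". Returns "" when nothing is bound.
  std::string resolve(const std::string& path) {
    Context* next = nullptr;
    std::string rest;
    {
      // The lock only protects the lookup in this context's own tables.
      std::lock_guard<std::recursive_mutex> guard(lock_);
      const std::string::size_type slash = path.find('/');
      if (slash == std::string::npos) {
        auto it = objects_.find(path);
        return it == objects_.end() ? std::string() : it->second;
      }
      auto it = contexts_.find(path.substr(0, slash));
      if (it == contexts_.end()) return std::string();
      next = it->second;
      rest = path.substr(slash + 1);
    }  // Lock released here, before calling into the other context.
    assert(next != nullptr);
    return next->resolve(rest);
  }

private:
  std::recursive_mutex lock_;
  std::map<std::string, std::string> objects_;
  std::map<std::string, Context*> contexts_;
};

// Two roots that refer to each other, resolved concurrently from two threads.
bool cross_resolve_ok() {
  Context root_a, root_b;
  root_a.bind_object("TestServant", "servant-on-A");
  root_b.bind_object("TestServant", "servant-on-B");
  root_a.bind_context("Machine-B", &root_b);
  root_b.bind_context("Machine-A", &root_a);

  bool ok_a = true, ok_b = true;
  std::thread ta([&] {
    for (int i = 0; i < 1000; ++i)
      ok_a = ok_a && root_a.resolve("Machine-B/TestServant") == "servant-on-B";
  });
  std::thread tb([&] {
    for (int i = 0; i < 1000; ++i)
      ok_b = ok_b && root_b.resolve("Machine-A/TestServant") == "servant-on-A";
  });
  ta.join();
  tb.join();
  return ok_a && ok_b;
}
```

Calling cross_resolve_ok() runs both cross-resolutions in parallel and returns true if every lookup found the expected servant. If the guard instead stayed in scope across the forwarded resolve call, each thread would hold its own root's lock while waiting for the other root's lock, which is the same circular wait described in the report. A recursive mutex does not help in that case because the second acquisition comes from a different thread.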
Created attachment 283: Patch to resolve this bug.
I'll handle this bug report.
Mine.
Created attachment 287: There was a problem with the previous patch in the resolve method. Here is the latest patch.
Comment on attachment 283: Patch to resolve this bug. Incorrect.