Bug 4128 - Temporary deadlocks in multi-threaded TAO application
Summary: Temporary deadlocks in multi-threaded TAO application
Status: NEW
Alias: None
Product: TAO
Classification: Unclassified
Component: ORB
Version: 2.2.1
Hardware: All Linux
Importance: P3 normal
Assignee: DOC Center Support List (internal)
URL: https://groups.google.com/d/topic/com...
Depends on:
Blocks:
 
Reported: 2013-09-13 16:23 CDT by milan.cvetkovic
Modified: 2013-09-13 16:47 CDT

Attachments
client.cpp (12.01 KB, text/x-c++src)
2013-09-13 16:23 CDT, milan.cvetkovic
client.cpp (3.39 KB, text/x-c++src)
2013-09-13 16:34 CDT, milan.cvetkovic
Bug_4128_Regression.mpc (511 bytes, text/plain)
2013-09-13 16:35 CDT, milan.cvetkovic
MyInterfaceImpl.h (97 bytes, text/x-c++src)
2013-09-13 16:35 CDT, milan.cvetkovic
MyInterfaceImpl.cpp (66 bytes, text/x-c++src)
2013-09-13 16:36 CDT, milan.cvetkovic
Test.idl (46 bytes, application/octet-stream)
2013-09-13 16:36 CDT, milan.cvetkovic
server.cpp (1.58 KB, text/x-c++src)
2013-09-13 16:36 CDT, milan.cvetkovic
README (373 bytes, text/plain)
2013-09-13 16:39 CDT, milan.cvetkovic

Description milan.cvetkovic 2013-09-13 16:23:49 CDT
Created attachment 1472 [details]
client.cpp

     TAO VERSION: 2.2.1
     ACE VERSION: 6.2.1

     HOST MACHINE and OPERATING SYSTEM:
        Linux debian 6.0 on amd64

     COMPILER NAME AND VERSION: g++-4.4.5

     $ACE_ROOT/ace/config.h FILE: config-linux.h

     THE $ACE_ROOT/include/makeinclude/platform_macros.GNU FILE:
        platform-linux.GNU

     CONTENTS OF $ACE_ROOT/bin/MakeProjectCreator/config/default.features:
       unmodified

     AREA/CLASS/EXAMPLE AFFECTED: LF_Strategy

     DOES THE PROBLEM AFFECT:
         EXECUTION: Yes

     SYNOPSIS:
In a two-threaded application, we see intermittent CORBA::TIMEOUT
exceptions from a thread making an RPC call while the other thread is busy.

     DESCRIPTION:
I have 2 threads, both running orb->run() in a TP_Reactor. The main
task of the application executes every 5 minutes and is run by a
timer. The application also makes a client RPC every 1 minute to another
CORBA server, which normally returns immediately. Before invoking the RPC,
this thread invokes set_upcall_thread(). What I am seeing is that this RPC
throws CORBA::TIMEOUT. The remote CORBA server does not appear to be
busy at that time.

Here is the code layout:

int Obj1::handle_timeout (...)
{
      // very long op, perhaps 2 minutes (1)
      ACE_OS::sleep (20);
      return 0;
}

int Obj2::handle_timeout (...)
{
    try {
      remote->ping();
    } catch (const CORBA::TIMEOUT&) {
       // the RPC comes back with TIMEOUT
    }
    return 0;
}

It appears that the thread executing the RPC is not able to handle the response.

If I put these lines right after comment (1):
orb->orb_core()->lf_strategy().set_upcall_thread (
	orb->orb_core()->leader_follower());

I get the expected behavior: if the server is up, I get a response; if not, I get an immediate CORBA::TRANSIENT.

Attached is the sample code exposing the problem.
Comment 1 milan.cvetkovic 2013-09-13 16:34:10 CDT
Created attachment 1473 [details]
client.cpp
Comment 2 milan.cvetkovic 2013-09-13 16:35:11 CDT
Created attachment 1474 [details]
Bug_4128_Regression.mpc
Comment 3 milan.cvetkovic 2013-09-13 16:35:31 CDT
Created attachment 1475 [details]
MyInterfaceImpl.h
Comment 4 milan.cvetkovic 2013-09-13 16:36:07 CDT
Created attachment 1476 [details]
MyInterfaceImpl.cpp
Comment 5 milan.cvetkovic 2013-09-13 16:36:25 CDT
Created attachment 1477 [details]
Test.idl
Comment 6 milan.cvetkovic 2013-09-13 16:36:53 CDT
Created attachment 1478 [details]
server.cpp
Comment 7 milan.cvetkovic 2013-09-13 16:39:01 CDT
Created attachment 1479 [details]
README
Comment 8 milan.cvetkovic 2013-09-13 16:47:18 CDT
A couple of notes:

- If there is no relative timeout policy in the client ORB, the RPC does not complete until the sleep in the other thread has finished.

- If sleep() is preceded by:
orb_->orb_core()->lf_strategy().set_upcall_thread (
   orb_->orb_core()->leader_follower());
the RPC returns correctly with CORBA::TRANSIENT.

- If the number of threads is reduced to 1 and only the RPC timer is started, without the lf_strategy code above, the RPC also returns correctly with CORBA::TRANSIENT.
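
The observations above can be reproduced in a toy model that does not use TAO at all. The sketch below is only one possible reading of the report, not TAO's actual leader/follower code: it assumes that the reply is already available, and that a client thread cannot process it while another thread still holds the leader role. ToyLeaderFollower, Outcome and run_scenario are hypothetical names made up for this illustration, and the 400 ms "long op" stands in for the sleep in Obj1::handle_timeout.

```cpp
#include <chrono>
#include <condition_variable>
#include <future>
#include <mutex>
#include <thread>

using Ms = std::chrono::milliseconds;

// Toy stand-in for a leader/follower group (hypothetical, not a TAO class).
class ToyLeaderFollower {
public:
  // A thread running the event loop takes the leader role.
  void become_leader() {
    std::lock_guard<std::mutex> guard(mutex_);
    has_leader_ = true;
  }

  // Analogue of the set_upcall_thread() workaround: give up the leader
  // role so that a waiting client thread can proceed.
  void release_leader() {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      has_leader_ = false;
    }
    followers_.notify_all();
  }

  // Client side of an RPC whose reply is assumed to be already available.
  // While another thread is leader, the caller waits as a follower.
  // Returns false on timeout (the analogue of CORBA::TIMEOUT).
  bool wait_for_reply(Ms timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    return followers_.wait_for(lock, timeout, [this] { return !has_leader_; });
  }

private:
  std::mutex mutex_;
  std::condition_variable followers_;
  bool has_leader_ = false;
};

struct Outcome {
  bool replied;         // true if the "RPC" completed before the timeout
  long long waited_ms;  // how long the client thread was blocked
};

// One thread plays Obj1::handle_timeout (long operation), the calling
// thread plays Obj2::handle_timeout (the RPC).
Outcome run_scenario(bool release_before_long_op, Ms rpc_timeout) {
  const Ms long_op(400);
  ToyLeaderFollower lf;
  std::promise<void> in_upcall;

  std::thread timer_thread([&] {
    lf.become_leader();
    if (release_before_long_op) lf.release_leader();  // the workaround
    in_upcall.set_value();
    std::this_thread::sleep_for(long_op);             // the long operation
    lf.release_leader();                              // back to the event loop
  });

  in_upcall.get_future().wait();  // start the RPC only once the upcall runs
  const auto start = std::chrono::steady_clock::now();
  const bool replied = lf.wait_for_reply(rpc_timeout);
  const auto waited = std::chrono::duration_cast<Ms>(
      std::chrono::steady_clock::now() - start);

  timer_thread.join();
  return {replied, waited.count()};
}
```

In this model, run_scenario(false, Ms(50)) times out, run_scenario(true, Ms(50)) completes, and run_scenario(false, Ms(5000)), standing in for the case without a timeout policy, completes only after the long operation ends. Whether TAO behaves this way for the same reason would still need to be confirmed against the LF_Strategy code.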