Please report new issues at https://github.com/DOCGroup
Oneways in TAO are implemented as asynchronous messages. The code responsible for sending them is in TAO_Transport::send_asynchronous_message_i(). TAO tries to send messages directly for as long as it can; when it cannot, it puts new messages in a queue inside TAO_Transport. Later, depending on the buffering constraints and the flushing strategy, it either sends the queued messages immediately (blocking strategy) or schedules output in the ORB's reactor (leader_follower and reactive strategies). The latter path is problematic for Solaris builds.

I observed the following problem on that system. If many oneways are sent in quick succession, at some point write() on the socket fails with errno=EAGAIN and the message is put on the queue as described above. However, the constraints are such that the flushing code is never called. At the same time, TAO assumes a default sync scope of Messaging::SYNC_WITH_TRANSPORT for oneway calls, but this assumption is not applied consistently throughout the code. It looks like TAO needs a default hook set in TAO_ORB_Core::sync_scope_hook_ that sets the sync scope appropriately (i.e. to Messaging::SYNC_WITH_TRANSPORT).

The reproducer for this issue is the Single_Read test.
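To make the failure mode above concrete, here is a minimal, self-contained sketch. It does not use TAO at all: FakeSocket, FakeTransport, and every member name are hypothetical stand-ins invented for illustration, and the real logic in TAO_Transport is considerably more involved. The sketch only shows the general pattern: write directly while the socket accepts data, queue on EAGAIN, and rely on something else to call flush() later.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical stand-in for a non-blocking socket: it accepts a fixed number
// of bytes, then fails with EAGAIN until drain() frees the buffer again.
struct FakeSocket {
  std::size_t capacity;
  std::size_t used = 0;

  explicit FakeSocket(std::size_t cap) : capacity(cap) {}

  // Returns the number of bytes written, or -1 with errno = EAGAIN.
  long write(const std::string& msg) {
    if (used + msg.size() > capacity) {
      errno = EAGAIN;
      return -1;
    }
    used += msg.size();
    return static_cast<long>(msg.size());
  }

  // Simulates the peer reading everything that was buffered.
  void drain() { used = 0; }
};

// Hypothetical transport: sends directly when possible, queues otherwise.
struct FakeTransport {
  FakeSocket& sock;
  std::deque<std::string> queue;
  std::size_t sent = 0;

  explicit FakeTransport(FakeSocket& s) : sock(s) {}

  void send_oneway(const std::string& msg) {
    // Once anything is queued, later messages must queue behind it
    // so that ordering is preserved.
    if (queue.empty() && sock.write(msg) >= 0) {
      ++sent;
      return;
    }
    queue.push_back(msg);
  }

  // What a flushing strategy would trigger once the socket is writable.
  void flush() {
    while (!queue.empty() && sock.write(queue.front()) >= 0) {
      queue.pop_front();
      ++sent;
    }
  }
};

// Returns how many messages stay queued when nothing ever calls flush().
std::size_t queued_without_flush() {
  FakeSocket sock(10);
  FakeTransport transport(sock);
  for (int i = 0; i < 5; ++i) transport.send_oneway("abcd");
  assert(transport.sent == 2);  // only the first two 4-byte messages fit
  return transport.queue.size();
}
```

In this sketch the three messages that hit EAGAIN stay in the queue indefinitely unless some caller eventually invokes flush(), which mirrors the situation described for the Solaris builds.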
Also check the changes Carlos made yesterday.
Created attachment 1156: A proposed fix. It fixes the Single_Read and AMH_Oneway tests on Solaris.
(In reply to comment #1)
> check also the changes of Carlos of yesterday

I'll check my fix with the latest code from SVN.
Make sure the argument names in the header and the .cpp file are the same; otherwise Doxygen gets confused.
Created attachment 1157: A new fix against recent code from SVN. The first one was not complete. However, with this new fix Big_AMI starts failing (again on Solaris). This happens because the client in that test assumes that no replies can arrive from the server until a synchronous message is sent. That is not true with sync scope SYNC_WITH_TRANSPORT: we run the reactor to flush queued messages and, as a side effect, receive replies from the server. So I think Big_AMI has to be changed.
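The side effect described above can be shown with a small stand-alone sketch. Nothing here is TAO code: FakeReactor, FakeClient, and replies_seen_during_flush() are hypothetical names, and the model assumes a single-threaded reactor that dispatches ready handlers in arrival order and cannot restrict itself to output events.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical single-threaded reactor: dispatches ready handlers in order.
struct FakeReactor {
  std::deque<std::function<void()>> ready;

  void post(std::function<void()> handler) {
    ready.push_back(std::move(handler));
  }

  // Dispatches one ready handler; returns false if nothing was ready.
  bool handle_one() {
    if (ready.empty()) return false;
    std::function<void()> handler = std::move(ready.front());
    ready.pop_front();
    handler();
    return true;
  }
};

// Hypothetical client state: pending output and replies already processed.
struct FakeClient {
  FakeReactor& reactor;
  int queued_out = 0;
  int replies_seen = 0;

  explicit FakeClient(FakeReactor& r) : reactor(r) {}

  // Models a flush that drives the reactor until the output queue is empty.
  void flush_via_reactor() {
    while (queued_out > 0 && reactor.handle_one()) {
    }
  }
};

// Returns how many replies were dispatched while the client was only
// trying to flush its own output.
int replies_seen_during_flush() {
  FakeReactor reactor;
  FakeClient client(reactor);
  client.queued_out = 1;

  // A reply to an earlier request becomes readable before the socket
  // becomes writable, so its handler is dispatched first.
  reactor.post([&client] { ++client.replies_seen; });
  reactor.post([&client] { --client.queued_out; });

  client.flush_via_reactor();
  return client.replies_seen;
}
```

In this model the client has already processed one reply by the time its flush completes, even though it never ran the event loop explicitly. A test that counts replies only after an explicit run of the ORB would therefore see fewer replies than it expects.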
Vladimir, can you also have a look at 3683? It seems related: it sends a lot of data using the RW strategy.
(In reply to comment #6)
> Vladimir, can you also have a look at 3683? This seems related, sending a lot
> of data using rw.

Johnny, how should it fail? In my local OpenSolaris environment (it was the fastest to check) 3683 works. I checked it with my patch applied.
I have to commit an updated run-test.pl for you to run it. It failed on one host here: the client just hangs forever. I am not sure whether you are also testing with the RW strategy, but it seems the client strategy and the flushing strategy have some dependency.
In 85640.

Mon Jun 15 10:19:16 UTC 2009  Vladimir Zykov  <vz@prismtech.com>

        ....

        * tests/Big_AMI/client.cpp:
        * tests/Portable_Interceptors/AMI/client.cpp:
        * tests/Bug_1270_Regression/client.cpp:
        * tests/Bug_1270_Regression/Echo.cpp:
        * tests/Bug_1270_Regression/server.cpp:

          Fixed tests after the change for Bug#3682. In these tests
          it was assumed that nothing could be received from server
          until we run orb explicitly. The latter is not true with
          synch scope policy SYNC_WITH_TRANSPORT.

        ....

        * tao/ORB_Core.cpp:
        * tao/ORB_Core.h:
        * tao/Messaging/Messaging_Policy_i.cpp:

          This fixes Bug#3682. SYNC_WITH_TRANSPORT is now really
          default synch scope policy in TAO. This must fix Single_Read
          and AMH_Oneway tests on Solaris.
Reverted the change in 85652 as there are problems. Tue Jun 16 07:06:14 UTC 2009 Vladimir Zykov <vz@prismtech.com>
In rev 86599.

Thu Sep 3 09:01:53 UTC 2009  Vladimir Zykov  <vz@prismtech.com>

        ....

        * tests/Big_AMI/client.cpp:
        * tests/Bug_1361_Regression/Echo_Caller.cpp:
        * tests/Bug_1361_Regression/server.cpp:
        * tests/Bug_1361_Regression/Server_Timer.cpp:
        * tests/Bug_1361_Regression/Server_Timer.h:
        * tests/Bug_1361_Regression/client.cpp:
        * tests/Bug_1361_Regression/Echo.cpp:
        * tests/Bug_1361_Regression/Server_Thread_Pool.cpp:
        * tests/Portable_Interceptors/AMI/client.cpp:
        * tests/Bug_1270_Regression/client.cpp:
        * tests/Bug_1270_Regression/Echo.cpp:
        * tests/Bug_1270_Regression/Echo_Caller.cpp:
        * tests/Bug_1270_Regression/server.cpp:
        * tests/Bug_1270_Regression/Server_Timer.cpp:
        * tests/Bug_1270_Regression/Server_Timer.h:
        * tests/Bug_1270_Regression/run_test.pl:

          Fixed tests after the change for Bug#3682 and Bug#3697.
          In some of these tests it was assumed that nothing could be
          received from server until we run orb explicitly. The latter
          is not true with synch scope policy SYNC_WITH_TRANSPORT.
          Cleaned up the code of the tests.

        * tao/ORB_Core.cpp:
        * tao/Messaging/Messaging_Policy_i.cpp:
        * tao/ORB_Core.h:

          This fixes Bug#3682. SYNC_WITH_TRANSPORT is now really
          default synch scope policy in TAO. This must fix Single_Read
          and AMH_Oneway tests on Solaris.

        ....
In revision 86672.

Wed Sep 9 12:38:15 UTC 2009  Vladimir Zykov  <vz@prismtech.com>

        * tao/ORB_Core.cpp:
        * tao/Leader_Follower_Flushing_Strategy.cpp:
        * tao/Messaging/Messaging_Policy_i.cpp:
        * tao/ORB_Core.h:

          Reverted fixes for bug#3682 and bug#3697 as there are
          problems with them and the new x.7.3 release is very close.
In revision 88011.

Wed Dec 9 09:40:10 UTC 2009  Vladimir Zykov  <vladimir.zykov@prismtech.com>

        * tests/Bug_1361_Regression/Echo_Caller.cpp:
        * tests/Bug_1361_Regression/Echo_Caller.h:
        * tests/Bug_1361_Regression/server.cpp:
        * tests/Bug_1361_Regression/Server_Thread_Pool.cpp:
        * tests/Bug_1361_Regression/Server_Thread_Pool.h:
        * tests/Bug_1361_Regression/run_test.pl:

          Changed the test so that it doesn't shutdown the orb until
          all threads are done with the remote calls. Substantially
          extended the time for server shutdown since threads in
          server's pool don't handle shutdown message until they send
          all (50) remote messages.

        * tao/ORB_Core.cpp:
        * tao/Messaging/Messaging_Policy_i.cpp:
        * tao/ORB_Core.h:

          This fixes Bug#3682. SYNC_WITH_TRANSPORT is now really
          default synch scope policy in TAO.

        * tao/Leader_Follower_Flushing_Strategy.cpp:

          Changed the code to poll the reactor instead of running
          it indefinitely. This fixes bug#3697.
This bug is fixed. The last problems for which I had to reopen it are fixed now.