Please report new issues athttps://github.com/DOCGroup
The TAO_ROOT/tests/AMI_Buffering tests fails when timeouts are used (start client with -t). The strage thing is that the test fails without debugging, at level 1 and 2 but at level 3 and higher the test runs. I am thinking about a debug line that has a side effect due to which the test runs on level 3 or higher. When running the test without debugging or level 1 or 2 I get the following errors in the log: TAO (344|240) - Connection_Handler[264]::close_connection, purging entry from cache TAO (344|240) - Connection_Handler[264]::close_connection, removing from the reactor TAO (344|240) - Connection_Handler[264]::close_connection, cancel all timers (344|240) Got except (344|240) EXCEPTION, Reply handler receive_data: system exception, ID 'IDL:omg.org/CORBA/COMM_FAILURE:1.0' Unknown vendor minor code id (0), minor code = 0, completed = MAYBE At a certain moment in time the connection are closed and then we get a lot of failures. Also during testing I sometimes get the following assert: ACE_ASSERT: file .\Transport.cpp, line 150 assertion failed for 'this->head_ == 0'.Aborting... I have added an ACE_ASSERT to ACE_EH_Dispatch_Info::set() to check whether a valid handle and event_handler* is passed, sometimes I get the following assert: ACE_ASSERT: file ..\ace/TP_Reactor.i, line 33 assertion failed for 'event_handler != 0'.Aborting...
Found the following in tao/Transport.cpp, there is an ACE_ERROR_RETURN that should be an ACE_ERROR with a separate return. Diff to fix below: =================================================================== RCS file: /project/cvs-repository/ACE_wrappers- repository/TAO/tao/Transport.cpp,v retrieving revision 1.100 diff -u -u -r1.100 Transport.cpp --- Transport.cpp 11 May 2004 13:00:31 -0000 1.100 +++ Transport.cpp 12 May 2004 11:27:18 -0000 @@ -1157,10 +1157,12 @@ if (retval == -1) { if (TAO_debug_level > 2) - ACE_DEBUG ((LM_DEBUG, - "TAO (%P|%t) - Transport[%d]::handle_input, " - "error while parsing the head of the queue\n", - this->id())); + { + ACE_DEBUG ((LM_DEBUG, + "TAO (%P|%t) - Transport[%d]::handle_input, " + "error while parsing the head of the queue\n", + this->id())); + } } return retval; @@ -1333,10 +1335,12 @@ if (retval == -1) { if (TAO_debug_level > 2) - ACE_DEBUG ((LM_DEBUG, - "TAO (%P|%t) - Transport[% d]::parse_incoming_messages, " - "error in incoming message\n", - this->id ())); + { + ACE_DEBUG ((LM_DEBUG, + "TAO (%P|%t) - Transport[% d]::parse_incoming_messages, " + "error in incoming message\n", + this->id ())); + } return -1; } @@ -1836,12 +1840,12 @@ { if (TAO_debug_level) { - ACE_ERROR_RETURN ((LM_ERROR, - "TAO (%P|%t) - Transport[% d]::process_parsed_messages, " - "received MessageError, closing connection\n", - this->id ()), - -1); + ACE_ERROR ((LM_ERROR, + "TAO (%P|%t) - Transport[%d]::process_parsed_messages, " + "received MessageError, closing connection\n", + this->id ())); } + return -1; } // If not, just return back.. ***** CVS exited normally with code 1 *****
The problem I see looks very much like 1815. A backtrace from the problem In the method below we do a send and this returns zero. int TAO_Transport::drain_queue_helper (int &iovcnt, iovec iov[]) { size_t byte_count = 0; ssize_t retval = this->send (iov, iovcnt, byte_count); <== returns zero We return now -1 We are now coming back in int TAO_Transport::drain_queue_i (void) { Becase we return a value not equal to 0, we also return -1 We go back to int TAO_Transport::drain_queue (void) { ACE_GUARD_RETURN (ACE_Lock, ace_mon, *this->handler_lock_, -1); int retval = this->drain_queue_i (); We return also -1 Come to int TAO_Transport::handle_output (void) { Return also here -1 Come to int TAO_Connection_Handler::handle_output_eh ( ACE_HANDLE, ACE_Event_Handler * eh) We leave the handle suspended and return -1 And now we are back in int TAO_IIOP_Connection_Handler::handle_output (ACE_HANDLE handle) { And we call this->close_connection (); This gives the results that the reply handlers get an exception. So we have another send call returning zero. Have to debug further why it returns 0.
I added a debug line to printout errno and %m when send returns zero int TAO_Transport::drain_queue_helper (int &iovcnt, iovec iov[]) { The output: TAO (400|1580) - Transport[10035]::drain_queue_helper, send() returns 0, errno 1 776, Resource temporarily unavailable
Fixed by applying the patch below: =================================================================== RCS file: /project/cvs-repository/ACE_wrappers- repository/ace/OS_NS_sys_socket.inl,v retrieving revision 1.13 diff -u -u -r1.13 OS_NS_sys_socket.inl --- OS_NS_sys_socket.inl 12 May 2004 06:34:59 -0000 1.13 +++ OS_NS_sys_socket.inl 12 May 2004 13:15:20 -0000 @@ -667,6 +667,7 @@ if (result == SOCKET_ERROR) { ACE_OS::set_errno_to_wsa_last_error (); + return -1; } # else for (int i = 0; i < n; ++i)
reopening, not everything is working still.
I now see another problem, now in the method below the connection is closed before all replies from the server are received int TAO_IIOP_Connection_Handler::handle_input (ACE_HANDLE h) { int result = this->handle_input_eh (h, this); if (result == -1) { this->close_connection (); return 0; } return result; }
In the code below, in case of a failure the recv return 0. But I have to mention now that at the moment this happens in the client, the server shutdowns, so maybe the problem is now in the server. This also not happens always, sometimes the tests runs without problems. ssize_t TAO_IIOP_Transport::recv (char *buf, size_t len, const ACE_Time_Value *max_wait_time) { ssize_t n = this->connection_handler_->peer ().recv (buf, len, max_wait_time);
changed severity