Please report new issues at https://github.com/DOCGroup
First of all, this bug is very similar to bug 1269. In fact, they *might* be duplicates, but I filed them separately because:
1) The stack traces are different, so they could actually be two problems.
2) This one crashes inside the Reactor.
3) I want to know when each one is fixed, and I'm mostly interested in this one.

The problem has been dragging on for a while (see bug 1202 and its brethren for more details), but basically the ORB is running the Leader/Follower loop waiting for a send to complete. During a handle_output() call the peer death is detected, so we return -1 from handle_output() to close the socket. The socket is removed from the reactor, but *ONLY* from the write mask; it is left there on the read mask. The problem is that the ORB closes the socket too, and somehow the Connection_Handler is also deleted. The Reactor then has this invalid file descriptor on the read mask (remember it was only removed from the write mask), yet it tries to use the corresponding Event_Handler (which was deleted) and the whole thing crashes.

Multiple funny things are going on at the same time, most of them way out of my expertise:
- Why does the reactor only remove the write mask but call handle_close()?
- Why does the ORB delete the event handler without calling remove_handler() first?
- Why does reactor::remove_handle(Event_Handler*) try to use the get_handle() method?

I'll be attaching a regression test shortly.
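To make the failure mode described above a bit more concrete, here is a toy model of the bookkeeping involved. It is only a sketch, not ACE or TAO code: ToyReactor, ToyHandler, remove_mask(), and remove_all() are hypothetical names invented for illustration, and a shared_ptr stands in for the raw Event_Handler pointer so the example itself stays well-defined. It shows that clearing only the write mask leaves a live read registration behind, which is the entry the real reactor later dereferences after the handler is gone.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>

// Toy model only: these names are hypothetical and are NOT the ACE API.
enum Mask : unsigned { READ_MASK = 1u, WRITE_MASK = 2u };

struct ToyHandler {
    int fd;
    explicit ToyHandler(int f) : fd(f) {}
};

class ToyReactor {
public:
    // Register (or extend the registration of) a handler for the given mask bits.
    void register_handler(std::shared_ptr<ToyHandler> h, unsigned mask) {
        Entry &e = table_[h->fd];
        e.handler = std::move(h);
        e.mask |= mask;
    }

    // Clear only the given bits; the entry disappears when no bits remain.
    void remove_mask(int fd, unsigned mask) {
        auto it = table_.find(fd);
        if (it == table_.end()) return;
        it->second.mask &= ~mask;
        if (it->second.mask == 0) table_.erase(it);
    }

    // Drop every registration for this descriptor. In this model, this is
    // what must happen before the handler is destroyed.
    void remove_all(int fd) { table_.erase(fd); }

    bool is_registered(int fd, unsigned mask) const {
        auto it = table_.find(fd);
        return it != table_.end() && (it->second.mask & mask) == mask;
    }

    // The handler the reactor would dispatch to, or nullptr if none.
    const ToyHandler *handler_for(int fd) const {
        auto it = table_.find(fd);
        return it == table_.end() ? nullptr : it->second.handler.get();
    }

private:
    struct Entry {
        std::shared_ptr<ToyHandler> handler;
        unsigned mask = 0;
    };
    std::map<int, Entry> table_;
};
```

In this model, registering descriptor 7 for READ_MASK | WRITE_MASK and then calling remove_mask(7, WRITE_MASK) still leaves handler_for(7) non-null. If the owner deleted a raw handler at that point, the reactor would be holding a dangling pointer on the read mask. Calling remove_all(7) before destroying the handler avoids that in the toy model; whether the real fix takes the same shape is not something this sketch claims.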
Created attachment 135: The regression test, tarred
This bug should block 1202.
Carlos seems to have a better handle on this. So assigning it to him.
Bug 1305 was getting triggered inside the ORB too. Adding a dependency.
Not mine anymore. I submitted the patches and everything. Returning to the tao-support tarpit.
Fixed! Details are available in the ChangeLog entry:
Mon Oct 21 22:45:02 2002 Balachandran Natarajan <bala@isis-server.isis.vanderbilt.edu>
I ran the tests for these for the past two days in various ways. The only problem that I have seen is that the test crashes because of stack overflow. With some aggressive testing over the past two days, we can give some assurance that this is fixed.
What is the expected result from $TAO_ROOT/tests/Bug_1270_Regression? I ran it for 5 minutes and it just repeated like:

[harris_s@paris Bug_1270_Regression]$ perl run_test.pl
(14143|1024) Echo::echo_payload, sleeping
(14142|1024) Echo::echo_payload, sleeping
(14144|1024) Echo::echo_payload, sleeping
(14143|1024) Echo::echo_payload, aborting
(14142|1024) Echo::echo_payload, aborting
(14144|1024) Echo::echo_payload, aborting
(14159|1024) Echo::echo_payload, sleeping
(14158|1024) Echo::echo_payload, sleeping
(14157|1024) Echo::echo_payload, sleeping
...

I am using gcc 2.96 on RH 7.2.
The server shouldn't die even when the client dies.
Should it run for the full 70 minutes in the perl script?