Please report new issues at https://github.com/DOCGroup
OK, we have been here before and fixed many of these bugs, but it is happening again. Obviously the regression tests are missing the problem, and the connection management is busted yet another time. I will be adding a regression test shortly, but having the server crash in the middle of a request like this:

  // IDL
  interface Echo {
    void echo_payload(in Payload x);
  };

is not good news. I think it also crashes if the operation above is a oneway. Notice that the client has to be in the right state, i.e. blocked trying to write or using the Reactor/handle_output() loop to send the data. And it helps if the error is detected by write() instead of read(). But at this point my analysis is probably premature (read that as "most likely wrong"); the core files don't lie though :-)
Created attachment 134 [details] Regression test for this bug (tarred)
Doh! The repo is frozen, so I attached the regression test to the bug. I will commit it once the beta is out and they thaw the repo.
Last I heard, Carlos was looking into fixing this. If not, we need to take care of it.
Adding dependency on 1305
Created attachment 150 [details] Patches to the ORB core.
Created attachment 151 [details] Patches to the protocols.
OK. I attached two patches that fix this bug.

The first patch: http://deuce.doc.wustl.edu/bugzilla/showattachment.cgi?attach_id=150 modifies the ORB core and solves the problem (at least as far as I can solve it.)

The second patch: http://deuce.doc.wustl.edu/bugzilla/showattachment.cgi?attach_id=151 simply modifies the pluggable protocols to match the changes in the ORB Core, so it needs no explanation.

As to the first patch, here are the changes in detail:

1) It eliminates the pending_upcall_ vs. refcount_ fields in the Connection_Handler. Having two reference counts is hard to debug and extremely hard to get right. It also makes it hard to state when the object is deleted and hard to analyze the reference counting rules, and it does not actually help with anything I can see, so it is zapped.
2) The transport_ field in the Connection_Handler is atomically modified.
3) Closing connections is also atomic.
4) When a connection is closed *all* the activations in the Reactor are removed.

The last one is the really important change, but it does not help without (3).

I also documented the reference counting with REFCNT comments in the places where it is incremented or decremented. That way we can analyze reference counting statically and convince ourselves that it is done right.

Please review the changes and let me know what you think. Be advised, I do not have much time to break the changes into smaller portions, so if there is something you do not like, you had better change it yourselves.
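For readers without the attachments, here is a rough sketch of the idea behind changes (1), (3) and (4): a single reference count, a close that only one caller can win, and removal of every Reactor registration on close. This is not the code from the patches. It uses standard C++11 atomics instead of the ACE/TAO primitives, and SketchHandler, SketchReactor, READ_MASK and WRITE_MASK are hypothetical names made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>

class SketchReactor;  // hypothetical stand-in, not ACE_Reactor

// Hypothetical event masks, for illustration only.
constexpr unsigned READ_MASK = 1u;
constexpr unsigned WRITE_MASK = 2u;

// Hypothetical handler with ONE reference count (no pending_upcall_).
class SketchHandler {
public:
  static std::atomic<int> live;  // number of live instances, demo only

  // REFCNT: starts at 1, owned by the creator.
  SketchHandler() : refcount_(1), closed_(false) { ++live; }

  void add_ref() { refcount_.fetch_add(1); }  // REFCNT: incremented

  void remove_ref() {  // REFCNT: decremented, last release deletes
    long previous = refcount_.fetch_sub(1);
    assert(previous > 0);
    if (previous == 1) delete this;
  }

  // Atomic close: exactly one caller wins and removes all registrations.
  // The caller must hold its own reference across this call.
  bool close(SketchReactor &reactor);

  long refcount() const { return refcount_.load(); }

private:
  ~SketchHandler() { --live; }  // only reachable through remove_ref()
  std::atomic<long> refcount_;
  std::atomic<bool> closed_;
};

std::atomic<int> SketchHandler::live(0);

// Hypothetical reactor that only tracks registrations per handler.
class SketchReactor {
public:
  void register_handler(SketchHandler *h, unsigned mask) {
    std::lock_guard<std::mutex> guard(lock_);
    auto it = masks_.find(h);
    if (it == masks_.end()) {
      h->add_ref();  // REFCNT: the reactor holds one reference per handler
      masks_[h] = mask;
    } else {
      it->second |= mask;
    }
  }

  // Remove *all* activations for h, whatever masks were registered.
  void remove_all(SketchHandler *h) {
    {
      std::lock_guard<std::mutex> guard(lock_);
      if (masks_.erase(h) == 0) return;  // nothing was registered
    }
    h->remove_ref();  // REFCNT: the reactor's reference is released
  }

  unsigned mask_for(SketchHandler *h) const {
    std::lock_guard<std::mutex> guard(lock_);
    auto it = masks_.find(h);
    return it == masks_.end() ? 0u : it->second;
  }

private:
  mutable std::mutex lock_;
  std::map<SketchHandler *, unsigned> masks_;
};

inline bool SketchHandler::close(SketchReactor &reactor) {
  if (closed_.exchange(true)) return false;  // somebody else already closed
  reactor.remove_all(this);
  return true;
}
```

In this sketch a handler registered for both reading and writing has a count of 2 (creator plus reactor). The first close() drops both registrations and the reactor's reference in one step, a second close() is a no-op, and the creator's remove_ref() then deletes the object. Without the atomic flag, two threads could both try to tear down the registrations, which is presumably why (4) does not help without (3).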
Not mine anymore. I submitted the patches and everything. Returning to the tao-support tarpit.
Fixed! Details are available in the ChangeLog entry Mon Oct 21 22:45:02 2002 Balachandran Natarajan <bala@isis-server.isis.vanderbilt.edu>. I have been running the tests for this in various ways for almost two days. The only crash I have seen in the test was caused by a stack overflow. After this aggressive testing, we can give some assurance that this is fixed.