Please report new issues athttps://github.com/DOCGroup
The dev_poll reactor currently doesn't suspend/resume event handlers, for TAO we do need this. We can add a TP_Dev_Poll reactor, but it is nicer to have a strategy to the dev_poll reactor
added block
Why do you think ACE_Dev_Poll_Reactor doesn't suspend/remove handlers?
Because there have been reports from users that it doesn't do so and also with TAO we do see problems that multiple threads try to read from the same socket at the same time
Well, it does do suspend/resume. It has a problem in the Reactor_Remove_Resume_Test, which may explain the TAO issues. If you think so, let's change this title. I figured out why the test fails but don't have a fix yet.
Feel free to update the summary to the problem you think is there
Reactor_Remove_Resume_Test fails the remove part when using ACE_Dev_Poll_Reactor. This failure occurs because when a callback returns -1 to close the handler, the reactor calls back the handler currently registered for the handle that triggered the callback, not the handler that was called back. This error occurs when a particular handle ends up being registered to a different handler (such as when a handle is closed then reused quickly), the called back handler no longer controls the handle, and the callback returns -1.
The problem for TAO is probably the fact that the dev_poll reactor doesn't check the event_handler->resume_handler() operation. TAO does resume handlers as application in some cases, the reactor should do that. The TP_Reactor is the only reactor that uses this method (as a note in the header of eh says), but the dev_poll should also check this method and not always resume the handle
I think my original post was unclear, the dev poll reactor does have suspend/resume methods, but the event handler is not automatically suspended when it is used by one thread, as done by the tp_reactor. This is what we need for TAO and I can't find this back in the code.
Ah, I see now. What is the factor that made addition of a manual-resume hack necessary for ACE_TP_Reactor? The behavior of ACE_Dev_Poll_Reactor need not be level-triggered, if that would help. I would hate to hack in a capability just because ACE_TP_Reactor does it that way. Since I don't know the innards of TAO and why this is an important feature, I need your help to understand the rationale and whether or not there's a better way to address it.
(In reply to comment #9) > Ah, I see now. What is the factor that made addition of a manual-resume hack > necessary for ACE_TP_Reactor? The behavior of ACE_Dev_Poll_Reactor need not be > level-triggered, if that would help. TAO is written in such a way that only one thread handles a certain connection at a moment. Also TAO does indicate to the reactor who resumes the handle, see bug 3188. > I would hate to hack in a capability just because ACE_TP_Reactor does it that > way. Since I don't know the innards of TAO and why this is an important > feature, I need your help to understand the rationale and whether or not > there's a better way to address it. I was more thinking in ways of a policy to control the suspend/resume, not a full new class. The same could maybe be done for the TP_Reactor, change the select reactor to have a policy that controls suspend/resume behaviour
As further background, during a discussion with Bala I asked why TAO needs the varying suspend/resume behavior. Here is his response: Resume() stuff is/was necessary for the following reasons: (1) Nested Upcall -- When a incoming call (from peer A) to servant (in peer B) makes an outbound call to peer A, which in turn (from Peer A) makes a call to peer B, the ORB can deadlock. How? Peer A can use the same connection to send a request back to peer B. If the connection is in suspended state, peer B cannot see the message. By resuming the connection we could get around this deadlock (2) There were select()'isms in TAO code. I am not sure how much of them have been removed. For example, there used to be paths where a thread can pick up parts of the message that it requires, leaving rest of the message in the socket. The thread that picks up part of the message can release the connection for another thread to be sent down to the handler. I know TAO fixed some of this, but I am not 100% sure whether all the select()'isms have been removed. TAO should have removed them to work well with SSL. But I am not sure how much SSLIOP has been tested with TAO to be confident enough.
Fixed: Fri Jun 12 00:45:09 UTC 2009 Steve Huston <shuston@riverace.com>