TP_Reactor waits until the upcall completes before resuming the socket for the select() call. This is not suitable in some cases, since the socket may become ready for select() before the upcall completes. One such example is oneway calls in TAO. Oneway calls from the same client can be processed simultaneously if multiple threads are available on the server. Currently, since the socket is not resumed until the upcall completes, the oneway calls are serialized. The right solution here would be to resume the socket once the oneway request has been read, so that other threads can immediately work on other oneway requests arriving from the client. A protocol needs to be developed between the user and the TP_Reactor so that the TP_Reactor does not try to resume a handle that the user has already resumed (a sketch of such a protocol follows below).

There are three issues that need to be addressed in this context:

1. Early resumption: the protocol mentioned above will cause the user to write different code when dealing with the TP_Reactor.

2. Concurrent replies: since replies can be sent back concurrently, it will be important to synchronize access to the socket. This will require additional locking and will be another source of difference in code when dealing with the TP_Reactor.

3. Multithreaded upcalls: another aspect of the current serialized scheme is that upcall code can be single threaded. The above changes would require additional smarts in the upcall code. This is especially relevant when a second upcall indicates an EOF, which would eventually delete the handler. The upcall code will therefore have to be much smarter than it is now, and again this will be another source of difference in code when dealing with the TP_Reactor.

I think it is important to carefully address these three issues before making changes to the TP_Reactor.
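To make the early-resumption idea concrete, here is a minimal sketch of a handler that resumes its own handle as soon as the request has been read, before the upcall runs. This is illustrative only: Oneway_Handler and process_request() are hypothetical names, and it assumes the standard ACE_Reactor suspend/resume API.

    #include "ace/Event_Handler.h"
    #include "ace/Reactor.h"
    #include "ace/SOCK_Stream.h"

    class Oneway_Handler : public ACE_Event_Handler
    {
    public:
      virtual ACE_HANDLE get_handle (void) const
      {
        return this->peer_.get_handle ();
      }

      virtual int handle_input (ACE_HANDLE)
      {
        char buffer[4096];
        ssize_t n = this->peer_.recv (buffer, sizeof buffer);
        if (n <= 0)
          return -1; // EOF or error: let the reactor remove us.

        // The request is fully read and nothing below touches shared
        // per-connection state, so resume the handle *now*; another
        // thread can immediately pick up the next oneway request.
        this->reactor ()->resume_handler (this);

        // Note: without the protocol discussed above, the TP_Reactor
        // would resume the handle a second time when this returns.
        this->process_request (buffer, static_cast<size_t> (n));
        return 0;
      }

    private:
      void process_request (const char *, size_t) { /* long upcall */ }

      ACE_SOCK_Stream peer_;
    };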
Carlos and others, please capture additional design concerns regarding these changes into this bugzilla entry so that we can track them. Thanks!
You seem to have captured the main issues. The problem is not limited to oneways; asynchronous messaging and multiplexed connections have the same problems in TAO. Do you notice certain similarities between this code and the POA? I do; it looks like you have the same forces: concurrent upcalls, destruction of the object in the middle of an upcall, thread-aware vs. single-threaded servants, etc. Time to use the same patterns! I.e., reference-count the entries in the dispatching table, delay destruction until the reference count drops to 0, etc.
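As a rough illustration of the POA-style pattern being suggested, here is a minimal sketch of a reference-counted dispatch-table entry whose destruction is deferred until the last upcall in flight releases it. Dispatch_Entry and its methods are hypothetical names, not actual TAO classes.

    #include "ace/Synch.h"

    // Each upcall calls add_ref() before invoking the handler and
    // remove_ref() afterwards; the entry is destroyed only once no
    // thread is still inside an upcall.
    class Dispatch_Entry
    {
    public:
      Dispatch_Entry (void) : refcount_ (1) {}

      void add_ref (void)
      {
        ACE_GUARD (ACE_Thread_Mutex, guard, this->lock_);
        ++this->refcount_;
      }

      void remove_ref (void)
      {
        unsigned long count = 0;
        {
          ACE_GUARD (ACE_Thread_Mutex, guard, this->lock_);
          count = --this->refcount_;
        }
        if (count == 0)
          delete this; // Delayed destruction: last reference is gone.
      }

    private:
      ~Dispatch_Entry (void) {} // Only remove_ref() may destroy us.

      ACE_Thread_Mutex lock_;
      unsigned long refcount_;
    };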
Here is another way in which this bug is hurting us. Suppose the ORB receives an upcall and the reply is a big message. The ORB gets flow controlled while sending the reply; at this point we could attempt to use the reactor to send the message as needed. But it does not work! The handle_output() call is never invoked because the handler is still suspended. The only workaround that I can see is to return control to the event loop while the message is still queued, but that requires allocating the buffer from the heap, and it would ruin a number of ORB optimizations for the output data path.
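For context, the reactive-write pattern that the suspended handle defeats would look roughly like this sketch. Reply_Sender and queue_reply() are hypothetical; the point is that handle_output() only fires while the handle is registered, and not suspended, for WRITE events.

    #include "ace/Event_Handler.h"
    #include "ace/Reactor.h"
    #include "ace/SOCK_Stream.h"
    #include "ace/Message_Block.h"

    class Reply_Sender : public ACE_Event_Handler
    {
    public:
      virtual ACE_HANDLE get_handle (void) const
      {
        return this->peer_.get_handle ();
      }

      // Called by the reactor whenever the socket becomes writable.
      // With this bug it is never called: the handle stays suspended.
      virtual int handle_output (ACE_HANDLE)
      {
        ssize_t n = this->peer_.send (this->pending_->rd_ptr (),
                                      this->pending_->length ());
        if (n > 0)
          this->pending_->rd_ptr (static_cast<size_t> (n)); // Consume.

        if (this->pending_->length () == 0)
          // Reply fully sent: stop asking for WRITE events.
          this->reactor ()->cancel_wakeup
            (this, ACE_Event_Handler::WRITE_MASK);
        return 0;
      }

      void queue_reply (ACE_Message_Block *mb)
      {
        this->pending_ = mb;
        // Ask for handle_output() when the socket is writable again.
        this->reactor ()->schedule_wakeup
          (this, ACE_Event_Handler::WRITE_MASK);
      }

    private:
      ACE_SOCK_Stream peer_;
      ACE_Message_Block *pending_;
    };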
Unless I can be convinced otherwise, we shouldn't cut 1.2 with this bug. It leads to all sorts of deadlocks, and the scalability of the ORB suffers considerably. I'm also updating the owner of the bug; I believe Bala is working on it. Bala: don't forget to record your guesstimate for when the bug will be fixed.
I am accepting the bug. I can provide a guesstimate only after I am done with 884.
Adding myself to CC. We (at easybase) are connecting from JacORB middleware (a multithreaded client) to TAO servers, and it seems that we have hit this bug. TAO queues all the requests and executes them in sequence. (As a workaround we create a pool of ORBs in our client and round-robin over them; this way each JacORB ORB gets its own socket.)
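For the record, the workaround translates roughly as follows on the C++ side (the reporter's client is JacORB/Java, so the names and pool size here are purely illustrative): distinct ORB ids yield distinct ORB instances, each with its own connection to the server.

    #include "tao/corba.h"
    #include <cstdio>
    #include <vector>

    int main (int argc, char *argv[])
    {
      const int pool_size = 4; // Illustrative pool size.
      std::vector<CORBA::ORB_var> pool;

      for (int i = 0; i < pool_size; ++i)
        {
          char orb_id[32];
          std::sprintf (orb_id, "pool_orb_%d", i);
          // Each ORB id gives an independent ORB, and therefore an
          // independent connection (socket) to the server.
          pool.push_back (CORBA::ORB_init (argc, argv, orb_id));
        }

      // Round-robin: request k is issued through pool[k % pool_size],
      // so concurrent requests travel over different sockets and are
      // not serialized on one suspended handle.
      return 0;
    }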
Trying to capture the discussions in the doc_group mailing list here:

------------------------------------------------
Balachandran Natarajan <bala@cs.wustl.edu> wrote:
**********************************************

Context: The thread pool reactor suspends the handler during an upcall and resumes the handler only after the upcall is completed (i.e. when the application returns from the handle_xxxxx() call).

Problem (while using TP_Reactor in TAO): This creates problems with nested upcalls in BiDirGIOP, and there have been reports in the users group of deadlocks. The deadlock was occurring because the client made a oneway request to the server and the server made a twoway request to the client as a part of the call.

Solutions:

(1) Strategize the TP_Reactor. The default strategy would have the present semantics; the new strategy would allow the user to control the resumption of the handler. This was tried by adding a flag in the TP_Reactor: when the flag is set, the TP_Reactor would not resume handlers but wait for the application to resume them. This does not work. Why? The TP_Reactor has other handlers registered with it as well, and it would not resume any of them. Expecting the application to resume all the handlers that are registered with the reactor does not make sense.

(2) We can have a virtual method in the event handler (ACE_Event_Handler) class. This method will return a default value. The TP_Reactor, before resuming, can check the value returned by invoking the method on the event handler. A user/application interested in controlling the resumption can override this method in their event handler implementation and return a non-default value. That way they can take control of the resumption.

#2 seems to work, but I would like to get the views of the doc_group before I go ahead. Please pitch in with your views on why we SHOULDN'T be using #2, and if so, please let me know of other alternatives :)

Carlos O'Ryan <coryan@uci.edu> wrote:
**********************************************

Basically the problem is that during an upcall the socket that received the request is completely ignored... so if the upcall needs to wait for anything coming through that socket the ORB will deadlock. There are quite a few application scenarios that can fall into that trap; the oneway-twoway madness is one, but it can also happen with just twoways:

1) Application A is using connection muxing.
2) Thread 1 in app A makes a twoway request R1.
3) Application B receives the request R1 and makes a distributed callback C1 to A.
4) Now A receives C1 and makes another call R2 to B, using the same connection.
5) B never hears about R2 (because the socket is suspended), and the app deadlocks.

There are infinite variations on this theme, involving bidir-GIOP requests through the same socket, or AMI callbacks that try to use the same connection, etc. BTW, this is also a severe scalability problem: effectively the ORB can only handle one request at a time on each connection, which in many cases means that each client will perceive the server as single threaded. Not exactly the kind of behavior expected from a thread-pool model.

Carlos O'Ryan <coryan@uci.edu> wrote:
**********************************************

> Why was TP_Reactor designed this way?

Read the Leader/Followers paper. In short:

1) It simplifies synchronization inside the Event_Handler: you know for a fact that only one thread will be messing with handle_input() at a time.
2) If you don't suspend the handle until "enough" data has been read, then the next leader will call select() and try to read the data again. With TCP sockets this can result in data corruption as two threads call read() and get inconsistent pieces of the message.

> Right. So, there must be a way around it? What's the right thing?

That's exactly what we are talking about here... The handle must be resumed as soon as enough data has been read, i.e. as soon as the Transport is in a state where another call to handle_input() will not produce data corruption. The question is what *mechanism* should be put in place to support this use-case.

> Should the ORB do its own queueing of requests rather than rely on TCP
> buffers?

*Please* read the Leader/Followers paper: there are consequences to that, namely memory allocations, data movement, context switching and priority inversions.

Irfan Pyarali <irfan@cs.wustl.edu> wrote:
**********************************************

> Would it do harm, if the handler got resumed twice?

Resumption is expensive: the leader has to be woken up, the handle set is changed, and finally a new leader is chosen to run the reactor. Two resumptions (where the second one was not necessary) in the critical path can be very expensive. I would stay away from this.

I had asked Bala this question last night but did not get a complete/convincing answer: once the handle is resumed, the event handler is fair game for other threads. Is there any state in the TAO event handlers (or their helper classes) that requires synchronization? How about the optimization for single reads? Will it work with multiple upcall threads? If we end up synchronizing the event handlers, we might as well leave the resumption to the TP_Reactor.

Carlos O'Ryan <coryan@uci.edu> wrote:
***************************************

> > handler is fair game for other threads. Is there any state in the TAO
> > event handlers (or their helper classes) that require synchronization?
>
> At the risk of contributing an unconvincing answer, I don't think so.

Well, right now there is none, because handle_input() is never invoked from two separate threads :-) :-) But actually there is plenty of state; for example, the Transport must keep track of multiple GIOP fragments and reassemble them into a single Message. As to whether this state requires synchronization, the answer is probably "not if you do this *very carefully*".

> > How about the optimization for single reads?
>
> Don't know. Not familiar with this.

And the crowd goes wild as Irfan nails the second piece of state: assume you do read() and you get more than one request (it does happen). Chances are that the tail of your buffer will contain the header of the next request; the next read() operation has to use that to generate a complete message, parse it and deliver it to higher level components in the ORB.

So essentially there are two pieces of state:

1) The GIOP Fragment messages that must be reassembled to generate a complete Request or Reply.
2) The "current" GIOP message that is reassembled over multiple read() calls before being delivered to the upper layers of the ORB.

The trick is that the handle_input() method does not have to resume the handle until it stops messing with this state; in other words, the method should look more or less like this:

    Foo::handle_input ()
    {
      read (buffer...);
      update_state (buffer);
      if (!has_complete_GIOP_Message_to_dispatch)
        {
          resume_me ();
          return 0;
        }
      // We have a message (request or reply) to dispatch...
      // The internal state of this connection_handler/transport is
      // stable, so:
      resume_me ();
      dispatch_message (...);
      return 0;
    }

No extra synchronization, but you can still have another thread follow immediately after resume_me().

> > Will it work with multiple upcall threads?
>
> What are multiple upcall threads?

I think Irfan is talking about multiple threads going through handle_input() simultaneously. I think the trick is to resume the handler once the critical section has been left. BTW, this is why the TP_Reactor works the way it does: the correct point to resume is extremely application specific.
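For what it's worth, option #2 from Bala's message might look roughly like the sketch below. The resume_handler() hook and its return values mirror what eventually appeared in ACE's ACE_Event_Handler, but treat the details here as illustrative rather than authoritative.

    #include "ace/Event_Handler.h"
    #include "ace/Reactor.h"

    class My_Handler : public ACE_Event_Handler
    {
    public:
      // Non-default value tells the TP_Reactor that the application,
      // not the reactor, will resume this handle.
      virtual int resume_handler (void)
      {
        return ACE_Event_Handler::ACE_APPLICATION_RESUMES_HANDLER;
      }

      virtual int handle_input (ACE_HANDLE)
      {
        // ... read just enough that the Transport state is consistent ...

        // Safe point: another thread may now enter handle_input(),
        // so resume the handle before the (possibly long) upcall.
        this->reactor ()->resume_handler (this);

        // ... parse and dispatch the request ...
        return 0;
      }
    };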
Added myself to the CC: list - btrask@contactsystems.com
Here is the update. The code is in the branch bug_575_stage_1. The branched code is available for $ACE_ROOT/ace and $TAO_ROOT/tao. The ChangeLog entries for ACE and TAO are in the branch bug_575; the file is named ChangeLog_bug_575 in $ACE_ROOT and $TAO_ROOT.

Tasks remaining:
- SHMIOP needs to be dealt with properly
- Clean up code and documentation
- Support GIOP Lite and fragments
- Run Quantify again to see whether I have used more resources than needed
I guess I am coming to a stage where most of the bugs opened up by this change have been fixed. As a matter of practice, here are the relevant ChangeLog entries:

TAO
-----
Thu Jul 5 23:30:07 2001 Balachandran Natarajan <bala@cs.wustl.edu>
Mon Jul 23 11:44:30 2001 Balachandran Natarajan <bala@cs.wustl.edu>

ACE
----
Thu Jul 5 23:22:21 2001 Balachandran Natarajan <bala@cs.wustl.edu>

Many bugs uncovered by the above changes were fixed along the way; the ChangeLogs have the details. So with this we close a bug that has been around for more than a year and a half. Thanks to the DOC group folks for allowing me to work on this... :-)
It's been fixed officially.