Please report new issues athttps://github.com/DOCGroup
We need to do the following enhancements to the TP_Reactor after the release goes out. - The TPR (Thread Pool Reactor) does not guarantee fairness to the handle sets. It gives unfair advantage to the notify handle. Further it gives more weightage to handles that appear first in the handle set. If we happen to keep getting lots of events on say handle 5, 6, 7 & 8 and there is only one thread in the pool. The thread will first dispatch events on handle 5. After dispatching 5 if the thread happens to call select again, and if there is a event on 5, the thread dispatches 5 again, thereby making the events on 6, 7 & 8 to wait. We need to remedy this. (The description is not accurate to the word but represents an idea of the problem that we have). This affects Oneway_Buffering tests in TAO. - The TPR's (or for that matter the Reactor's) handle set needs to be reference counted. We see race conditions like the one in bug 1020. The problem comes when one of threads is making an upcall to the handler and another thread tries to remove the handler from the Reactors handle set. The race is between one of the threads (the leader) in the reactor holding the reference to the handler, which is being dispatched, and another thread potentially trying to remove the same handler from the reactor. - We may want to have a Single threaded reactor in ACE derived from TPR (for TAO specifically) instead of using the select's ST Reactor. Others if you have ideas please put them here. I am not adding Don's idea. I will request him to add that.
The ACE_Select_Reactor family of classes, which included TPR, use ACE_Token to implement a FIFO thread scheduling mechanism. But this isn't always the most efficient. Therefore, adding the ability to select a thread scheduling strategy, e.g., LIFO, either by deriving from or strategizing ACE_Token, could increase performance.
Added myself to cc list
Accepting this one too..
This is after a long discussion with Irfan on this. Looks like the enhanced reactor should have a way to dispatch to the handles based on priority of the handle/connection. A naive dispatching mechanism would hurt RTCORBA a lot. For example the reactor could manage 3 connections opened up by threads of different priorities from the client. If the messages are handled naively we could easily be servicing the lowest priority connection. (This seems definitely possible if the server thread has a zero priority and uses client propagated policy model). A solution for this problem is to allow setting of priorities to the handles so that dispatching can be done after reodering the handlesets based on priority. The methods can be on the Reactor so that an application would be able to change the priority of the handle at run time. Irfan , if you would like to add more information, please go ahead.
*** Bug 1074 has been marked as a duplicate of this bug. ***
Look at bug 1074 for a test case
Created attachment 94 [details] One possible implementation and use of event handler ref counting to prevent the race condition of bugs 1074 and 1020.
If it's of any interest, the previous attachment shows changes that we are using, that add reference counting to ACE_Event_Handler, and that use these in place of the pending_upcalls counter in the TP_Reactor to control the lifetime of IIOP connection handlers. This fixes the race condition in bug 1074 (and presumably that in bug 1020 as well, although we have not tested the latter). Here's a summary of the changes: ACE_Event_Handler changes: - Add add_ref() and remove_ref() methods. add_ref() increments a reference count, and remove_ref() decrements the count, deleting the event handler object when the counter reaches zero. - Initialize the ref count to 1 in the constructor. ACE_TP_Reactor changes: - Change handle_socket_events() so that it increments the event handler's ref count before dispatching to its callback method, and decrements it afterwards. TAO_IIOP_Connection_Handler changes: - Change handle_close() to call handle_close_i() unconditionally, rather than checking pending_upcalls first. - Change handle_close_i() to call remove_ref(this) instead of directly deleting itself. The destruction of the event handler and associated Transport object will be done by the connection handler's destructor. - In handle_input(), remove the increment, decrement, and check of the pending_upcalls counter. Don't call handle_close_i(). The changes apply to the following files: ACE_wrappers/ace/Event_Handler.h v 4.41 ACE_wrappers/ace/Event_Handler.cpp v 4.19 ACE_wrappers/ace/TP_Reactor.cpp v 4.47 ACE_wrappers/TAO/tao/IIOP_Connection_Handler.cpp v 1.46
Created attachment 95 [details] Bug fix in event handler ref counting implementation
The previous attachment is diffs to ace/Select_Reactor_Base.cpp that prevent dispatch_notify() from closing the handle twice if the event handler callback detects an error. This fix applies to the event handler mods of attachment 94 [details].
Another feature which is needed and which has not been captured explicitly is this - When the Reactor holds a refcounted handler in its internal map, the handler would not be deleted till all the references to the handler goes off. In other words the external references to the handler should be zero. In the mean time if a connection is accepted for the same handle (this is very much possible), the handler shoudl not be added to the internal maps immediately. The addition should be delayed to make sure that all the references to the original handler go away.
The TP_Reactor does not have signal handling capabilities. Needs to be implemented.
Yet another use case that the enhancements should take care is the following - The reactor code releases the guard and then uses an event handler. Yes, the handler is suspended to avoid multiple threads from calling handle_XXX(), but there is no guarantee that the application will not call remove_handler(..., DONT_CALL) and then remove the handler. Nor can the application find out if the handle is suspended and therefore in the middle of an upcall. In other words, the TP_Reactor is at risk of using released memory, small risk, but a real one. The solution to this reference counting and a new memory management scheme.
Fixed. This ChangeLogTag: Mon Jul 07 18:00:38 2003 Irfan Pyarali <irfan@oomworks.com> describes the proper fixes to these problems.
Reopening the bug since one of the points is not addressed
(In reply to comment #15) > Reopening the bug since one of the points is not addressed In particular, the fairness issue is unresolved, correct?