This is a bug report from Carlos at ATD.

---------------------------------------------------------------------

----------------
Sun Dec 29 08:44:06 2002  Balachandran Natarajan  <bala@isis-server.isis.vanderbilt.edu>

        * tao/Muxed_TMS.cpp:
        * tao/Muxed_TMS.h:

          Use a recursive mutex instead of a simple mutex. This prevents a
          deadlock that might occur when a thread tries to send a request
          during reply dispatching.
----------------

I do not believe the fix is correct, or at least it is not immediately
obvious that the fix is correct. Consider the following code:

----------------------------------------------------------------
191  TAO_Muxed_TMS::connection_closed (void)
192  {
193    ACE_GUARD (TAO_SYNCH_RECURSIVE_MUTEX, ace_mon, this->lock_);
194    // @@ This should be done using a mutex, the table
195
196    REQUEST_DISPATCHER_TABLE::ITERATOR end =
197      this->dispatcher_table_.end ();
198    for (REQUEST_DISPATCHER_TABLE::ITERATOR i =
199           this->dispatcher_table_.begin ();
200         i != end;
201         ++i)
202      {
203        (*i).int_id_->connection_closed ();
204      }
205    this->dispatcher_table_.unbind_all ();
206  }
207
----------------------------------------------------------------

In this code, line 203 may invoke application code as part of an AMI reply
handler (to handle the exception). The application can do anything crazy at
that point, including making new requests. At that point the iterator
declared on line 198 may become invalidated. If that happens, the system may
crash.

I cannot convince myself that the problem is not real. Can anybody guide me
there?

Also, I can easily see that only the thread calling connection_closed() can
manipulate the map. But I cannot convince myself that calling
connection_closed() on the first reply dispatcher won't result in some other
reply dispatcher becoming invalid. For example, what if the application runs
an event loop while in the middle of the first connection_closed() upcall?
What if all the other reply dispatchers are AMI or DII deferred calls and
they time out during that nested event loop? The dispatchers are deleted as
part of the nested loop and removed from the map. That either invalidates
the iterator or makes it possible to access a deleted object.

I thought about changing the code as follows:

----------------------------------------------------------------
191  TAO_Muxed_TMS::connection_closed (void)
192  {
193    ACE_GUARD (TAO_SYNCH_RECURSIVE_MUTEX, ace_mon, this->lock_);
194    REQUEST_DISPATCHER_TABLE tmp =
195      this->dispatcher_table_;
196    this->dispatcher_table_.unbind_all ();
197
198    REQUEST_DISPATCHER_TABLE::ITERATOR end = tmp.end ();
199    for (REQUEST_DISPATCHER_TABLE::ITERATOR i = tmp.begin ();
200         i != end;
201         ++i)
202      {
203        (*i).int_id_->connection_closed ();
204      }
205  }
----------------------------------------------------------------

That deals with the iterators, but not with the deleted objects. There is
also the extra cost of lines 194/195; if the ACE hash map supported a swap()
operation a la STL (hint hint), we could implement the function using:

----------------------------------------------------------------
194  REQUEST_DISPATCHER_TABLE tmp;
195  tmp.swap (this->dispatcher_table_);
----------------------------------------------------------------
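For reference, here is a minimal, self-contained sketch of the
snapshot-and-clear idea discussed above. It is written against std::map,
std::shared_ptr and std::recursive_mutex rather than the ACE hash map, and
the class and member names are hypothetical stand-ins, not TAO's actual API.
Emptying the table under the lock before making any upcalls means a
re-entrant bind/unbind from application code cannot invalidate the
iteration, and the shared_ptr snapshot keeps each dispatcher alive even if
it is cancelled or times out during a nested event loop, which is the
deleted-object concern:

----------------------------------------------------------------
#include <cstdio>
#include <map>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a TAO reply dispatcher.
struct Reply_Dispatcher
{
  explicit Reply_Dispatcher (int id) : id_ (id) {}
  void connection_closed (void)
  { std::printf ("dispatcher %d: connection closed\n", this->id_); }
  int id_;
};

class Muxed_TMS
{
public:
  void bind_dispatcher (int request_id,
                        std::shared_ptr<Reply_Dispatcher> rd)
  {
    std::lock_guard<std::recursive_mutex> guard (this->lock_);
    this->dispatcher_table_[request_id] = std::move (rd);
  }

  void connection_closed (void)
  {
    // Snapshot the table and empty it while holding the lock, so a
    // re-entrant bind/unbind from an upcall sees an empty map and
    // cannot invalidate the iteration below.
    std::vector<std::shared_ptr<Reply_Dispatcher>> snapshot;
    {
      std::lock_guard<std::recursive_mutex> guard (this->lock_);
      snapshot.reserve (this->dispatcher_table_.size ());
      for (auto const &entry : this->dispatcher_table_)
        snapshot.push_back (entry.second);
      this->dispatcher_table_.clear ();
    }
    // Upcalls run on the local copy; the shared_ptr keeps each
    // dispatcher alive even if it is removed from the table or
    // destroyed elsewhere during a nested event loop.
    for (auto const &rd : snapshot)
      rd->connection_closed ();
  }

private:
  std::recursive_mutex lock_;
  std::map<int, std::shared_ptr<Reply_Dispatcher>> dispatcher_table_;
};

int main (void)
{
  Muxed_TMS tms;
  tms.bind_dispatcher (1, std::make_shared<Reply_Dispatcher> (1));
  tms.bind_dispatcher (2, std::make_shared<Reply_Dispatcher> (2));
  tms.connection_closed ();
  return 0;
}
----------------------------------------------------------------

Note that this sketch makes the upcalls after releasing the lock, whereas
the TAO code quoted above holds the recursive lock across the upcalls; the
snapshot technique works either way, but the locking policy is a separate
design decision.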
Created attachment 193: Test TAR for Bala, from CTS
Fixed. Please see the ChangeLog entry of Thu Apr 3 13:13:24 2003 by Balachandran Natarajan <bala@dre.vanderbilt.edu> for details.
Reopening the bug. The problem hasn't been completely fixed; I still see deadlocks in my branch.
Fixed, and this time for real! See the ChangeLog entry of Sat Oct 4 23:09:51 2003 by Balachandran Natarajan <bala@dre.vanderbilt.edu>.