Bug 1506 - SSLIOP leaks memory and connections on client
Summary: SSLIOP leaks memory and connections on client
Status: RESOLVED FIXED
Alias: None
Product: TAO
Classification: Unclassified
Component: SSLIOP Pluggable Protocol (show other bugs)
Version: 1.4
Hardware: x86 Windows 2000
: P2 normal
Assignee: DOC Center Support List (internal)
URL:
Depends on:
Blocks:
 
Reported: 2003-05-13 04:42 CDT by david.kinder
Modified: 2007-02-20 12:39 CST (History)
0 users

See Also:


Attachments
tarball containing simple client/server which expose problem (5.48 KB, application/octet-stream)
2007-02-20 12:18 CST, Ken Sedgwick
Details
Patch for TAO 1.5.6 which fixes problem. (807 bytes, patch)
2007-02-20 12:19 CST, Ken Sedgwick
Details

Note You need to log in before you can comment on or make changes to this bug.
Description david.kinder 2003-05-13 04:42:23 CDT
On my current project I have a test scenario which involves a client repeatedly 
connecting to CORBA objects incarnated by a server. Each connection leads to 
the server destroying and recreating the CORBA objects it incarnates, so that 
the client is forever creating new SSL connections to new CORBA objects.

When the transport is IIOP, memory use is constant. With SSLIOP, my server 
seems stable, but the client consumes ever more memory until it crashes.

Tracing the cause of this with a debugger, I can see that for each new 
connection ACE_SSL_SOCK_Connector::ssl_connect() calls ::SSL_connect() to 
create the SSL connection, but the corresponding ::SSL_free() call in 
ACE_SSL_SOCK_Stream's destructor is never called, because the instance of 
TAO_SSLIOP_Connection_Handler is never deleted.

When the reactor decides to remove the handler, it calls 
TAO_SSLIOP_Connection_Handler::close_connection(), which calls 
close_connection_eh(). This in turn calls decr_refcount(), which should lead to 
the handler being deleted (as under IIOP). However, after close_connection(), 
the reference count on the handler is 1.

Comparison of the code paths taken when using IIOP and SSLIOP shows that in the 
SSLIOP case the Connector's connect() method creates a connection handler with 
a reference count of 4, while for IIOP the reference count is only 3.

I believe that the problem lies in TAO_SSLIOP_Connector::ssliop_connect(). 
First of all, I looked at the corresponding IIOP connector method: 
TAO_IIOP_Connector::make_connection(). This creates the connection handler with 
a call to base_connector_.connect(), and then later in the method decreases the 
reference count. In comparison, in TAO_SSLIOP_Connector::ssliop_connect() there 
are three interesting steps:

1) Create the handler with a call to base_connector_.creation_strategy()
   .make_svc_handler()
2) Make a new connection with base_connector_.connect()
3) Decrement the reference count on the connection handler after connection

The problem is that step 2 makes a call to make_svc_handler(), passing in the 
handler created in step 1. This second call to make_svc_handler() increments 
the handler's reference count, so steps 1 and 2 both increment the reference 
count, but step 3 only decrements it once. As a result, the SSLIOP connection 
handler's reference count never gets to 0 and it is never deleted.

The solution seems to be to add the lines

    // base_connector_.connect () will increment the handler's reference count
    // once more. This is not needed as we already hold a reference to the
    // handler, therefore we discard this second reference.
    svc_handler->decr_refcount ();

immediately below the call to base_connector_.connect() in 
TAO_SSLIOP_Connector::ssliop_connect().

This was tested and found to work with TAO 1.3.0. Looking at the CVS, this 
problem is present in the latest TAO.
Comment 1 Ossama Othman 2003-05-16 14:41:15 CDT
Hi David,

Thanks for analyzing the problem.  I don't recall the reference count being
incremented when calling make_svc_handler().  It looks like this was added after
I wrote the code in SSLIOP_Connector.*.

Actually, I think it was a bad idea (not mine :-)) to increment the reference
count by one in such a factory method since it altered the behavior of code that
expected things to work in the canonical way.  In particular, a factory method
that creates a reference counted object would typically return an object with a
reference count equal to one.  Instead it is returning an object with a
reference count equal to two the first time into this method.

Unfortunately, this change to the make_svc_handler() method introduced the
problem you detected.  Thanks alot for your fix!  I'll commit it.  However, I'd
really like to see that reference count increment in make_svc_handler() removed.
Comment 2 Ossama Othman 2003-05-19 13:11:23 CDT
Fixed.

Mon May 19 11:05:26 2003  Ossama Othman  <ossama@dre.vanderbilt.edu>
...
	From David Kinder <david.kinder@sophos.com>
	* orbsvcs/orbsvcs/SSLIOP/SSLIOP_Connector.cpp (ssliop_connect):

	  The base_connector_.connect() call will increment the handler's
	  reference count once more.  This is not needed as we already
	  hold a reference to the handler.  Therefore we discard this
	  second reference.  Fixes a memory leak.  [Bug 1506]
	  
	  (retrieve_credentials):

	  For the default certificate and private key case, assign the
	  return value of the TAO_SSLIOP_Credentials_var::out() call to a
	  reference to a TAO_SSLIOP_Credentials_ptr.  Previously, the
	  return value was assigned to a local TAO_SSLIOP_Credentials_ptr
	  variable that was lost when leaving the scope of this method.
	  Fixes a memory leak.  [Bug 1508]
Comment 3 david.kinder 2004-01-27 04:14:25 CST
This bug was reintroduced in TAO 1.3.4, when a refactoring branch was merged 
in. To be exact, when revision 1.76 of this file was committed to CVS. The 
problem is in 1.4.0, and is the same as before: the reference count on 
svc_handler is incremented twice, but only decremented once, so that SSLIOP 
connection handlers are never deleted.

Inspection with a debugger on a breakpoint immediately after the call to

  this->base_connector_.connect()

in ssliop_connect() shows that the reference count is either three (for success 
or pending completion) or two (for failure), contrary to the comment in the 
code immediately below, which states the desired values for the reference count 
of either two or one.

The fix is the same as before: immediately after the base_connector_.connect() 
call, add the following:

  // base_connector_.connect() will increment the handler's #REFCOUNT#
  // once more. This is not required as we already hold a reference to
  // the handler, so we discard this second reference.
  svc_handler->remove_reference ();
Comment 4 Ken Sedgwick 2007-02-20 12:16:46 CST
This bug was observed in our production environment.  It caused clients to leak
file descriptors until they became inoperative.

I wrote a simple test setup which demonstrates this problem.

I verified that the previously described patch fixes the problem in the test setup.
Comment 5 Ken Sedgwick 2007-02-20 12:18:20 CST
Created attachment 662 [details]
tarball containing simple client/server which expose problem
Comment 6 Ken Sedgwick 2007-02-20 12:19:41 CST
Created attachment 663 [details]
Patch for TAO 1.5.6 which fixes problem.
Comment 7 Douglas C. Schmidt 2007-02-20 12:39:13 CST
Applied the fix from Ken Sedgwick.