Bug 1356

Summary: Excessive memory usage by TAO client processing fragmented reply
Product: TAO
Reporter: Simon McQueen <sm>
Component: ORB
Assignee: Simon McQueen <sm>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P3    
Version: 1.2.5   
Hardware: All   
OS: All   

Description Simon McQueen 2002-11-04 09:34:23 CST
A TAO client processing a fragmented IIOP reply (as produced by Orbix 2000, for 
example) consumes and holds an excessive quantity of memory. For example, for a 
message size of ~1.5 MB, the process will hold a memory allocation of ~700 MB.
Comment 1 Simon McQueen 2002-11-04 09:40:19 CST
Accepting
Comment 2 Simon McQueen 2002-11-08 07:17:03 CST
Some progress on this. It still falls over a lot, but it uses less memory when 
it does so :-)

Memory consumption can be brought under some level of control with the 
following change:

Index: GIOP_Message_Base.cpp
===================================================================
RCS file: /cvs/ACE_wrappers-repository/TAO/tao/GIOP_Message_Base.cpp,v
retrieving revision 1.88
diff -r1.88 GIOP_Message_Base.cpp
527c527
<                  mb->size () + incoming_size);
---
>                  mb->length () + incoming_size);
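
To illustrate why sizing the grow request from size() rather than length() can 
compound, here is a minimal sketch. It does not use the ACE API: ToyBlock, 
grow() and the round-up-to-a-power-of-two policy are all assumptions made for 
illustration, and the real allocation policy may differ. The point is only 
that a request based on the already rounded-up capacity forces a fresh, larger 
allocation for every fragment, whereas a request based on the payload actually 
held does not.

```cpp
#include <cstddef>

// Hypothetical stand-in for a message block; NOT the ACE_Message_Block API.
struct ToyBlock {
    std::size_t capacity = 0;  // analogous to size(): bytes allocated
    std::size_t used = 0;      // analogous to length(): bytes of payload held
};

// Assumed growth policy: round the requested size up to a power of two.
std::size_t round_up_pow2(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// Reallocate only when the requested minimum exceeds the current capacity.
void grow(ToyBlock& b, std::size_t min_capacity) {
    if (min_capacity > b.capacity) b.capacity = round_up_pow2(min_capacity);
}

enum class Basis { Capacity, Used };

// Append `count` fragments of `frag` bytes each, sizing every grow request
// from the current capacity (as before the patch) or from the current
// payload (as after the patch). Returns the final allocated capacity.
std::size_t final_capacity(Basis basis, std::size_t frag, int count) {
    ToyBlock b;
    for (int i = 0; i < count; ++i) {
        const std::size_t base =
            (basis == Basis::Capacity) ? b.capacity : b.used;
        grow(b, base + frag);
        b.used += frag;
    }
    return b.capacity;
}
```

Under these assumptions, ten 1000-byte fragments end with 524288 bytes 
allocated when the request is based on capacity, against 16384 bytes when it 
is based on the payload length.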


This does not completely fix the interoperability issue, however. What also 
seems to happen is that sometimes, in the middle of successfully processing a 
single (but fragmented) response from the Orbix server, the following occurs:

TAO (328|424) - Transport[544]::process_queue_head
TAO (328|424) - IIOP_Connection_Handler[544]::handle_input, handle = 544/544, 
refcount = 2, retval = 0
TAO (328|424) - IIOP_Connection_Handler[544]::handle_input, handle = 544/544, 
refcount = 3
TAO (328|424) - Transport[544]::handle_input_i
TAO (328|424) - Transport[544]::process_queue_head
TAO (328|424) - Transport[544]::handle_input_i, read 4 bytes
TAO (328|424) - Transport[544]::consolidate_extra_messages
TAO (328|424) - Transport[544]::consolidate_extra_messages, extracting extra 
messages
TAO (328|424) - Transport[544]::process_queue_head
TAO (328|424) - GIOP_Message_Base::dump_msg, recv GIOP v1.2 msg, -8 data bytes, 
my endian, Type Fragment[0]
GIOP message - HEXDUMP 4 bytes
47 49 4f 50                                       GIOP            
(328|424) exception thrown but client is not waiting a response
(328|424) EXCEPTION, TAO_GIOP_Message_Base::process_request[2]
system exception, ID 'IDL:omg.org/CORBA/MARSHAL:1.0'
TAO exception, minor code = 0 (unknown location; unspecified errno), completed 
= NO

TAO (328|424) - IIOP_Connection_Handler[544]::handle_input, handle = 544/544, 
refcount = 2, retval = 0
TAO (328|424) - IIOP_Connection_Handler[544]::handle_input, handle = 544/544, 
refcount = 3
TAO (328|424) - Transport[544]::handle_input_i
TAO (328|424) - Transport[544]::process_queue_head
TAO (328|424) - Transport[544]::handle_input_i, read 1020 bytes
TAO (328|424) - Transport[544]::consolidate_extra_messages
TAO (328|424) - GIOP_Message_State::parse_message_header_i
TAO (328|424) - GIOP_Message_State::get_version_info
TAO (328|424) - GIOP_Message_State::get_byte_order_info
TAO (328|424) - Transport[544]::consolidate_extra_messages, extracting extra 
messages
TAO (328|424) - Transport[544]::process_queue_head
TAO (328|424) - GIOP_Message_Base::dump_msg, recv GIOP v0.4 msg, -12 data 
bytes, other endian, Type Request[0]
GIOP message - HEXDUMP 0 bytes
(328|424) exception thrown but client is not waiting a response
(328|424) EXCEPTION, TAO_GIOP_Message_Base::process_request[2]
system exception, ID 'IDL:omg.org/CORBA/MARSHAL:1.0'
TAO exception, minor code = 0 (unknown location; unspecified errno), completed 
= NO
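
The negative byte counts in the two dumps above ("-8 data bytes" with a 4 byte 
hexdump, "-12 data bytes" with a 0 byte hexdump) are consistent with the 
buffered length minus the fixed 12-byte GIOP header, i.e. a message is being 
dispatched before a whole header has been read. The sketch below shows that 
arithmetic. The helper names are hypothetical and the subtraction is an 
assumption about how dump_msg arrives at its figure, not taken from the TAO 
source.

```cpp
#include <cstddef>
#include <cstring>

// GIOP header layout: magic "GIOP" (4) + version (2) + flags (1)
// + message type (1) + message size (4) = 12 bytes.
constexpr std::size_t kGiopHeaderLen = 12;

// Hypothetical helper: the "data bytes" figure, assuming it is computed as
// buffered length minus header length using signed arithmetic.
std::ptrdiff_t reported_data_bytes(std::size_t buffered) {
    return static_cast<std::ptrdiff_t>(buffered) -
           static_cast<std::ptrdiff_t>(kGiopHeaderLen);
}

// Hypothetical guard: true only when a full header is buffered and it
// starts with the GIOP magic. Anything shorter should wait for more input.
bool has_complete_header(const unsigned char* buf, std::size_t buffered) {
    return buffered >= kGiopHeaderLen && std::memcmp(buf, "GIOP", 4) == 0;
}
```

With only the 4 magic bytes buffered, reported_data_bytes gives -8 and 
has_complete_header is false, which matches the first dump.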

TAO (328|424) - IIOP_Connection_Handler[544]::handle_input, handle = 544/544, 
refcount = 2, retval = 0
TAO (328|424) - IIOP_Connection_Handler[544]::handle_input, handle = 544/544, 
refcount = 3
TAO (328|424) - Transport[544]::handle_input_i
TAO (328|424) - Transport[544]::process_queue_head
TAO (328|424) - Transport[544]::handle_input_i, read 1020 bytes
TAO (328|424) - Transport[544]::consolidate_message_queue
(328|424) TAO_Transport[6351932]::consolidate_message_queue
(328|424) TAO_Transport[6352068]::consolidate_message_queue
(328|424) TAO_Transport[6352376]::consolidate_message_queue
(328|424) TAO_Transport[6352504]::consolidate_message_queue
TAO (328|424) - IIOP_Connection_Handler[544]::handle_input, handle = 544/544, 
refcount = 2, retval = 0
TAO (328|424) - IIOP_Connection_Handler[544]::handle_input, handle = 544/544, 
refcount = 3
TAO (328|424) - Transport[544]::handle_input_i
TAO (328|424) - Transport[544]::process_queue_head
TAO (328|424) - Transport[544]::handle_input_i, read 1020 bytes
TAO (328|424) - GIOP_Message_State::parse_message_header_i
TAO (328|424) - GIOP_Message_State::get_version_info
TAO (328|424) - GIOP_Message_State::get_byte_order_info
TAO (328|424) - Transport[544]::consolidate_message
TAO (328|424) - Transport[544]::consolidate_message, read 15364 bytes on attempt
TAO (328|424) - Transport[544]::consolidate_message, queueing up the message
TAO (328|424) - IIOP_Connection_Handler[544]::handle_input, handle = 544/544, 
refcount = 2, retval = 0
TAO (328|424) - IIOP_Connection_Handler[544]::handle_input, handle = 544/544, 
refcount = 3
TAO (328|424) - Transport[544]::handle_input_i
TAO (328|424) - Transport[544]::process_queue_head
TAO (328|424) - Transport[544]::handle_input_i, read 1020 bytes
TAO (328|424) - GIOP_Message_State::parse_message_header_i
TAO (328|424) - GIOP_Message_State::get_version_info
TAO (328|424) - GIOP_Message_State::get_byte_order_info
TAO (328|424) - Transport[544]::consolidate_message
TAO (328|424) - Transport[544]::consolidate_message, read 15364 bytes on attempt
TAO (328|424) - Transport[544]::consolidate_fragments

... at this point the incoming message queue holds *2* TAO_Queued_Data 
nodes. The first is the unfinished first part of the fragmented message. TAO 
consolidates the remainder of the fragments into the new node on the tail of 
the queue.

When the final fragment is received and consolidated into the tail, the 
Transport::process_queue_head call fails to process the message: although the 
tail node is no longer 'is_tail_fragmented', the first node still has 
more_fragments_ set to true because it holds only part of the whole message, 
and hence 'is_head_complete' returns false.
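
The stuck state described above can be modelled with a toy queue. This is a 
simplified sketch, not the TAO implementation: Node stands in for 
TAO_Queued_Data, and the three functions are assumed readings of what 
is_head_complete, is_tail_fragmented and process_queue_head check.

```cpp
#include <deque>

// Hypothetical, simplified stand-in for TAO_Queued_Data.
struct Node {
    bool more_fragments;  // true while further fragments are still expected
};

// Assumed reading: the head can be processed only once it expects no more.
bool is_head_complete(const std::deque<Node>& q) {
    return !q.empty() && !q.front().more_fragments;
}

// Assumed reading: the tail is still collecting fragments.
bool is_tail_fragmented(const std::deque<Node>& q) {
    return !q.empty() && q.back().more_fragments;
}

// Dequeue the head only when it is complete; report whether it was taken.
bool process_queue_head(std::deque<Node>& q) {
    if (!is_head_complete(q)) return false;
    q.pop_front();
    return true;
}

// Build the state from the trace: one message split across two nodes.
std::deque<Node> split_message_queue() {
    std::deque<Node> q;
    q.push_back({true});  // unfinished first part of the message
    q.push_back({true});  // new tail node collecting the remaining fragments
    q.back().more_fragments = false;  // final fragment consolidated into tail
    return q;
}
```

In this model, once the final fragment has been merged into the tail, the tail 
is no longer fragmented but the head never becomes complete, so 
process_queue_head returns false and both nodes stay queued.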

A quick attempt to kludge it by detecting the break and consolidating the 
fragments back together met with a marshal exception, so it seems that we are 
losing something in the confusion. 

Comment 3 Simon McQueen 2006-02-23 04:59:22 CST
Frank's fragmentation fixes have finally laid this to rest:
ChangeLogTag: Wed Feb 22 20:37:00 UTC 2006 Frank Rehberger
<frehberger@prismtech.com>