Bug 3562 - Demarshal octet sequence has poor performance when replacing a buffer
Status: NEW
Alias: None
Product: TAO
Classification: Unclassified
Component: other
Version: 1.6.7
Hardware: All All
Importance: P3 enhancement
Assignee: DOC Center Support List (internal)
URL:
Depends on:
Blocks: 3574
Reported: 2009-01-29 15:01 CST by Aaron Brown
Modified: 2009-02-17 07:16 CST (History)

See Also:


Attachments
test program (8.82 KB, text/plain)
2009-01-29 15:01 CST, Aaron Brown
patch (801 bytes, patch)
2009-01-30 09:24 CST, Aaron Brown

Description Aaron Brown 2009-01-29 15:01:47 CST
Created attachment 1066
test program

Overview: 
I noticed poorer than expected performance when trying to demarshal an octet sequence from a TAO_InputCDR.  TAO was compiled with the TAO_NO_COPY_OCTET_SEQUENCES flag enabled and I had the ACE_Message_Block::DONT_DELETE flag disabled to take advantage of the sequence buffer replacement code in demarshal_sequence for octets.

After inspecting the octet version of demarshal_sequence in Unbounded_Sequence_CDR_T.h, I saw that the temporary sequence's length is set before the CDR's message block flags are checked and before the temporary sequence's buffer is replaced with the message block from the CDR.

Setting the sequence length causes the buffer to be initialized, which can be computationally expensive.  I don't think that initializing the buffer is necessary if the buffer will just be replaced.
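The cost difference between the two orderings can be sketched with a toy model. This is only an illustration under stated assumptions: ToyOctetSeq, demarshal_length_first and demarshal_length_last are hypothetical names, not TAO classes or functions, and the toy length() zero-fills its buffer to stand in for whatever initialization the real sequence performs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an octet sequence; NOT TAO's sequence class.
class ToyOctetSeq {
public:
  // Mimics length(): allocates an owned buffer and zero-fills it (O(n) work).
  void length(std::size_t n) {
    owned_.assign(n, 0);
    bytes_initialized_ += n;
    data_ = owned_.data();
    size_ = n;
  }
  // Mimics buffer replacement: adopt external storage without copying.
  void replace(unsigned char* external, std::size_t n) {
    owned_.clear();
    data_ = external;
    size_ = n;
  }
  unsigned char* buffer() { return data_; }
  std::size_t size() const { return size_; }
  std::size_t bytes_initialized() const { return bytes_initialized_; }

private:
  std::vector<unsigned char> owned_;
  unsigned char* data_ = nullptr;
  std::size_t size_ = 0;
  std::size_t bytes_initialized_ = 0;
};

// Ordering described in this report: length is set before the replace check.
// Returns how many bytes were initialized along the way.
std::size_t demarshal_length_first(unsigned char* src, std::size_t n,
                                   bool can_replace) {
  assert(src != nullptr || n == 0);
  ToyOctetSeq tmp;
  tmp.length(n);                       // initializes n bytes up front
  if (can_replace) {
    tmp.replace(src, n);               // the initialized buffer is discarded
    return tmp.bytes_initialized();
  }
  if (n > 0) std::memcpy(tmp.buffer(), src, n);
  return tmp.bytes_initialized();
}

// Proposed ordering: length is set only when the copy path is taken.
std::size_t demarshal_length_last(unsigned char* src, std::size_t n,
                                  bool can_replace) {
  assert(src != nullptr || n == 0);
  ToyOctetSeq tmp;
  if (can_replace) {
    tmp.replace(src, n);               // no initialization at all
    return tmp.bytes_initialized();
  }
  tmp.length(n);                       // still needed before copying
  if (n > 0) std::memcpy(tmp.buffer(), src, n);
  return tmp.bytes_initialized();
}
```

In this model, the replace path initializes n bytes with the first ordering and none with the second, while the copy path does the same work in both.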

I have attached a modified version of $TAO_ROOT/tests/OctetSeq/OctetSeq.cpp to reproduce the problem.  It allocates a single buffer, uses a TAO_OutputCDR to marshal a sequence to that buffer, then uses a TAO_InputCDR to demarshal that buffer into an octet sequence.  The result should be that the octet sequence is just referencing memory in the original buffer.

The program uses the same arguments as the standard OctetSeq test.
        -l <low>
                Sets the minimum size of the sequences tested.

        -h <high>
                The maximum size of the sequences tested.

        -s <step>
                Increase the size of the sequence from <low> to <high>
                in increments of <step>

        -n <iter>
                The number of iterations (marshaling/demarshaling)
                done for each loop.

        -q
                Be quiet, only print the summary data....

        Example:

$ ./OctetSeq -l 4096 -h 8192 -s 16 -n 32 -q
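The expected outcome of the test program (the demarshaled sequence referencing memory in the original buffer) can be illustrated without TAO. The sketch below is not the attached test and does not use the CDR classes; OctetView, marshal_octets, demarshal_octets and aliases are hypothetical names, and the 4-byte host-order length prefix is an assumption made for the example, not the CDR wire format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical non-owning view of octets; NOT a TAO type.
struct OctetView {
  const unsigned char* data;
  std::size_t size;
};

// Append a 4-byte length (host byte order, assumed for this sketch)
// followed by the payload bytes.
void marshal_octets(std::vector<unsigned char>& buf,
                    const unsigned char* payload, std::uint32_t n) {
  assert(payload != nullptr || n == 0);
  unsigned char header[sizeof n];
  std::memcpy(header, &n, sizeof n);
  buf.insert(buf.end(), header, header + sizeof n);
  buf.insert(buf.end(), payload, payload + n);
}

// Read the length and return a view that points into buf (no copy).
// Returns false if buf is too short to hold the header or the payload.
bool demarshal_octets(const std::vector<unsigned char>& buf, OctetView& out) {
  std::uint32_t n = 0;
  if (buf.size() < sizeof n) return false;
  std::memcpy(&n, buf.data(), sizeof n);
  if (buf.size() - sizeof n < n) return false;
  out.data = buf.data() + sizeof n;
  out.size = n;
  return true;
}

// True when the view lies entirely inside buf's storage.
bool aliases(const std::vector<unsigned char>& buf, const OctetView& v) {
  std::less_equal<const unsigned char*> le;
  return le(buf.data(), v.data) &&
         le(v.data + v.size, buf.data() + buf.size());
}
```

After marshaling a payload into a buffer and demarshaling it, aliases(buf, view) holding true is the toy equivalent of the octet sequence sharing storage with the original buffer.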

Below is a unified diff of a fix for Unbounded_Sequence_CDR_T.h
--- ACE_wrappers.ORIG/TAO/tao/Unbounded_Sequence_CDR_T.h        2009-01-18 14:57:23.000000000 -0500
+++ ACE_wrappers.NEW/TAO/tao/Unbounded_Sequence_CDR_T.h 2009-01-18 15:01:10.000000000 -0500
@@ -109,7 +109,6 @@
       return false;
     }
     sequence tmp(new_length);
-    tmp.length(new_length);
     if (ACE_BIT_DISABLED (strm.start ()->flags (), ACE_Message_Block::DONT_DELETE))
     {
       TAO_ORB_Core* orb_core = strm.orb_core ();
@@ -123,6 +122,7 @@
         return true;
       }
     }
+    tmp.length(new_length);
     typename sequence::value_type * buffer = tmp.get_buffer();
     if (!strm.read_octet_array (buffer, new_length)) {
       return false;



Build Date & Platform: 

    Linux 2.6.12.6 Sun Nov 16 22:43:23 EST 2008 ppc

Additional Builds and Platforms: 

    Linux 2.6.18-53.el5 Wed Oct 10 16:34:02 EDT 2007 i686

Additional Information: 
The results of the test program should be more pronounced as the sequence size and number of iterations increase.
Comment 1 Aaron Brown 2009-01-30 09:24:30 CST
Created attachment 1068
patch

This patch moves the setting of the sequence length so that it occurs after the check for whether the sequence buffer can be replaced.