Bug 4168

Summary: CORBA server receives NULL pointer for 'in string' argument
Product: TAO Reporter: milan.cvetkovic
Component: ORBAssignee: Phil Mesnier <mesnierp>
Status: ASSIGNED ---    
Severity: normal CC: mesnierp
Priority: P3    
Version: 2.2.6   
Hardware: All   
OS: Linux   
Attachments: Prevent invoking servant with NULL pointer for in string parameter

Description milan.cvetkovic 2014-05-29 14:08:18 CDT
TAO VERSION: 2.2.1
     ACE VERSION: 6.2.1

     HOST MACHINE and OPERATING SYSTEM: Linux Debian 7.0 on amd64

     COMPILER NAME AND VERSION (AND PATCHLEVEL): gcc (Debian 4.7.2-5)

     THE $ACE_ROOT/ace/config.h FILE: config-linux.h

     THE $ACE_ROOT/include/makeinclude/platform_macros.GNU FILE:
       platform_linux.GNU

     CONTENTS OF 
$ACE_ROOT/bin/MakeProjectCreator/config/default.features: unmodified

     AREA/CLASS/EXAMPLE AFFECTED: TAO_IDL ?

     DOES THE PROBLEM AFFECT:
         EXECUTION? YES

     SYNOPSIS: When
- TAO server runs with ISO8859-1 as native codeset for chars
- client negotiates UTF-8 encoding
- client sends *invalid* UTF-8 sequence
TAO server receives NULL pointer as parameter for string

     DESCRIPTION:
==============
* IDL:

typedef sequence<octet> OctetSeq;
interface CorbaSimple
{
	OctetSeq hello_s (in string s);
};

==============
* Implementation
OctetSeq CorbaSimple_i::hello_s (const char* s)
{
	if (!s) std::cout << "GOT NULL" << std::endl;
	return new OctetSeq;
}

==============
* TAO server starts with no codeset options specified.

byte jim[]= { (byte)0xef, (byte)0xbf, (byte)0xbd };

==============
Jacorb client code snippet:
     byte badbytes[]= { (byte)0xbd };
     String inp = new String (badbytes, "UTF-8");
     byte[] rv = srv.hello_s (inp);


While the client is obviously misbehaving, I believe TAO server should 
never pass NULL pointer to application code.

Here is portion of debug output with -ORBDebugLevel 10:
GIOP message - HEXDUMP 134 bytes
47 49 4f 50 01 02 00 00  00 00 00 7a 00 00 00 00   GIOP.......z....
03 00 00 00 00 00 00 00  00 00 00 3e 14 01 0f 00   ...........>....
4e 55 50 00 00 00 20 01  00 00 00 01 00 00 00 52   NUP... ........R
6f 6f 74 50 4f 41 00 43  6f 72 62 61 53 69 6d 70   ootPOA.CorbaSimp
6c 65 5f 50 4f 41 00 00  00 00 00 01 00 00 00 43   le_POA.........C
6f 72 62 61 53 69 6d 70  6c 65 00 00 00 00 00 08   orbaSimple......
68 65 6c 6c 6f 5f 73 00  00 00 00 01 00 00 00 01   hello_s.........
00 00 00 0c 00 00 00 00  05 01 00 01 00 01 01 09   ................
00 00 00 02 fd 00                                  ......
TAO (5827|140450518775552) - Codeset_Manager_i::process_service_context, 
using tcsc <UTF-8> (05010001), tcsw <UTF-16> (00010109)
    (140450518775552) calling CorbaSimple_i::hello in file 
`CorbaSimple_i.cpp' on line 48
(5827|140450518775552|2014-05-28 15:09:26.839608) DIED...
    (140450518775552) leaving CorbaSimple_i::hello
TAO (5827|140450518775552) - GIOP_Message_Base::dump_msg, send GIOP 
message v1.2, 16 data bytes, my endian, Type Reply[0]
GIOP message - HEXDUMP 28 bytes
47 49 4f 50 01 02 01 01  10 00 00 00 00 00 00 00   GIOP............
00 00 00 00 00 00 00 00  00 00 00 00               ............
Comment 1 milan.cvetkovic 2014-05-29 14:12:16 CDT
Further analysis:

The stack trace:
#0  TAO_2_2_1::TAO_InputCDR::throw_skel_exception (error_num=0) at CDR.cpp:360
#1  0x00007ff33e4acfc3 in TAO_2_2_1::TAO::Upcall_Wrapper::pre_upcall (this=0x7ff33aebb30f, cdr=..., args=0x7ff33aebb310, nargs=2)
    at Upcall_Wrapper.cpp:236
#2  0x00007ff33e4acb28 in TAO_2_2_1::TAO::Upcall_Wrapper::upcall (this=0x7ff33aebb30f, server_request=..., args=0x7ff33aebb310,
    nargs=2, command=..., servant_upcall=0x7ff33aebb450, exceptions=0x0, nexceptions=0) at Upcall_Wrapper.cpp:46
#3  0x00007ff33b6dd24e in POA_mpathix::test::CorbaSimple::hello_s_skel (server_request=..., servant_upcall=0x7ff33aebb450,
    servant=0x1c2c300) at CorbaSimpleS.cpp:355

Faulty behaviour:

- TAO_UTF8_Latin1_Translator::read_string:
    . detects invalid UTF-8 sequence
    . de-allocates the string
    . does *not* set errno
    . returns 0
- TAO::Upcall_Wrapper::pre_upcall invokes
    TAO_InputCDR::throw_skel_exception(errno), errno happens to be 0
- TAO_InputCDR::throw_skel_exception(0)
     is a no-op
- as a result MyServant::op () is invoked with NULL pointer for string argument

It appears that:
- all execution paths in TAO_InputCDR::read* and TAO_OutputCDR::write* returning
  0 should simultaneously set errno, *or* Upcall_Wrapper::pre_upcall should set
  one if it wasn't set.
- TAO_InputCDR::throw_(skel/stub)_exception should 'assert(error_number!=0)'

Comment 2 Johnny Willemsen 2014-05-29 14:23:04 CDT
Asserts are bad, shouldn't do that. What is the exception and minor code the client should get?
Comment 3 milan.cvetkovic 2014-05-29 16:46:42 CDT
(In reply to comment #2)
> Asserts are bad, shouldn't do that.

I guess that is a point of view. My take on it is that this bug could have been detected a lot earlier if there were assert in those lines. It doesn't feel right to throw an exception if error_code==0. I remember seeing NULL ptrs once in a blue moon, never figured out where they come from...

> What is the exception and minor code the client should get?

Not too sure about this. For this partucular case (bad UTF-8 sequence) reading from 04-03-12.pdf, chapter 13.10.2.6 "Code Set Negotiation"

> A DATA_CONVERSION exception is raised when a client or server attempts
> to transmit a character that does not map into the negotiated
> transmission code set. For example, not all characters in Taiwan Chinese map
> into Unicode. When an attempt is made to transmit one of these characters via
> Unicode, an ORB is required to raise a DATA_CONVERSION exception, with standard
> minor code 1.

so, it looks like errno should be set to ERANGE, for this case.



Comment 4 milan.cvetkovic 2014-05-30 09:54:29 CDT
Created attachment 1491 [details]
Prevent invoking servant with NULL pointer for in string parameter

Proposed patch:

- Upcall_Wrapper.cpp,
  AMI_Arguments_Converter_Impl.cpp,
  DII_Arguments_Converter_Impl.cpp
    Rake sure TAO_{Input/Output}CDR::throw_skel_exception is always invoked with
    non zero argument. Use errno, or EPROTO if errno is not set. EPROTO will
    throw CORBA::MARSHAL exception.
- UTF8_Latin1_Translator::read_string
  set errno to ERANGE when UTF-8 sequence cannot be translated to ISO8859-1
  char. This will generate CORBA::DATA_CONVERSION with minor code 1
Comment 5 Johnny Willemsen 2014-05-30 10:52:11 CDT
It could be that the throw_skel_exception/throw_stub_exception are called from places that are not in our repo (some extensions), therefor it would be better to handle a errno of 0 in these methods itself instead of in the calling code. That would reduce the amount of changes also

Also it is pretty expensive to use errno on some platforms
Comment 6 milan.cvetkovic 2014-05-30 11:33:46 CDT
The problem is that the interface to TAO::Argument is not stipulating anything about using errno, and TAO::Upcall_Wrapper,TAO_UTF8_Latin1_Translator,and  TAO_{Input,Output}CDR are using it incosistently.

The only way not to break any of 3rd party libs relying on this assumption is to actually go through the all possible execution paths in TAO_UTF8_Latin1_Translator (and presumably TAO_UTF16_BOM_Translator), and make sure errno is set when 0 is returned.

My patch modified TAO::Upcall_Wrapper::pre_upcall and similar functions to replace this loop

  for (TAO::Argument * const * i = begin; i != end; ++i)
     if (!(*i)->demarshal (cdr))
        TAO_InputCDR::throw_skel_exception (errno ? errno : EPROTO);

with

  for (TAO::Argument * const * i = begin; i != end; ++i)
     if (!(*i)->demarshal (cdr))
        TAO_InputCDR::throw_skel_exception (errno);

This has a potential to break codeset translation libs which assumed that when errno==0, no exception should be thrown, even though demarsharl() returns 0.

(the errno usage is subotimal, since is used twice, but that would be easy to avoid by introducing local int var...)

Comment 7 Johnny Willemsen 2014-05-30 12:15:00 CDT
When looking at Svn history I still stay with my idea, but this has to be tested
Comment 8 Phil Mesnier 2014-09-02 08:48:22 CDT
I just became aware of this bug. Back in the day, when implementing the codeset translation feature, I needed a way to indicate different errors to raise various spec defined exceptions. When I created the helper functions, I made two mistakes. First, I assumed that all CDR functions set errno. They do not. The solution for this is to initialize errno prior to marshaling operations. The second mistake, for which I cannot recall a motivation, was to assume errno of 0 is not an error.

In all cases, the various throw_*_exceptions are called when marshaling fails. At that point message processing should stop and an exception raised, even when the errno is 0. 

I have a test driver provided to me by a customer. I need to recreate that in a fashion that will fit our test suite. That will happen this week.
Comment 9 Phil Mesnier 2014-09-02 08:50:02 CDT
taking
Comment 10 Johnny Willemsen 2014-11-10 04:15:43 CST
Phil, is this fixed?
Comment 11 Phil Mesnier 2014-11-10 09:44:14 CST
It is fixed but I never got around to building the test I mentioned. Here's the change log, I actually fixed it before I found this bug.

Fri Aug 29 20:26:57 UTC 2014  Phil Mesnier  <mesnier_p@ociweb.com>

        * tao/CDR.cpp:
        * tao/DynamicInterface/DII_Arguments_Converter_Impl.cpp:
        * tao/Messaging/AMI_Arguments_Converter_Impl.cpp:
        * tao/PortableServer/Upcall_Wrapper.cpp:
          Fixed a vulnerability. In cases where errno happened to be
          zero when demarshaling a malformed string (length > available
          buffer) does not result in an exception being raised, rather
          a null char * will be passed to the servant. Also ensured the
          errno is set to a known state as CDR does not set errno in all
          failure cases, but always returns the status of the goodbit,
          so a demashal fail with an errno of zero still needs to raise
          the appropriate MARSHAL exception.