Bug 1047 - stdin as pipe or close(0) can cause ORB_init to fail
Status: ASSIGNED
Alias: None
Product: TAO
Classification: Unclassified
Component: ORB
Version: 1.1.16
Hardware: SPARC Solaris
Importance: P3 normal
Assignee: DOC Center Support List (internal)
Reported: 2001-10-03 19:33 CDT by pphillip
Modified: 2001-11-13 08:07 CST

Description pphillip 2001-10-03 19:33:49 CDT
This is an odd problem that came up when trying to fork() & exec() a TAO CORBA
server process. Sorry about the old revision (1.1.16). I'm pretty sure this
will happen on 1.1.21 but have not had time to try it yet. I know the compiler
is pretty lame but I'm pretty sure the bug is straightforward.

It turns out that if stdin is a pipe or (alternatively) if you close stdin
before calling CORBA::ORB_init(), the call will fail with a CORBA::INITIALIZE
exception. The output and source for the test program are below.

The problem, roughly, comes from CORBA::ORB_init in ORB.cpp. This code does the
following sequence of operations:

  int result = TAO_Internal::open_services (argc, argv);

  // Check for errors returned from <TAO_Internal::open_services>.
  if (result != 0 && errno != ENOENT)
    {
      ACE_ERROR ((LM_ERROR,
                  ACE_TEXT ("(%P|%t) %p\n"),
                  ACE_TEXT ("Unable to initialize the ")
                  ACE_TEXT ("Service Configurator")));
      ACE_THROW_RETURN (CORBA::INITIALIZE (
                          CORBA_SystemException::_tao_minor_code (
                            TAO_ORB_CORE_INIT_LOCATION_CODE,
                            0),
                          CORBA::COMPLETED_NO),
                        CORBA::ORB::_nil ());
    }
Note the check "errno != ENOENT". That is, the code is not meant to fail if
svc.conf does not exist.
The problem is in TAO_Internal::open_services_i in TAO_Internal.cpp. The code
there looks like:

           result = ACE_Service_Config::open (argc, argv,
                                               ACE_DEFAULT_LOGGER_KEY,
                                               0, // Don't ignore static services.
                                               ignore_default_svc_conf_file);
        }

      // Handle RTCORBA library special case.  Since RTCORBA needs
      // its init method call to register several hooks, call it here
      // if it hasn't already been called.
      ACE_Service_Object *rt_loader =
        ACE_Dynamic_Service<ACE_Service_Object>::instance ("RT_ORB_Loader");
      if (rt_loader != 0)
          rt_loader->init (0, 0);

      // @@ What the heck do these things do and do we need to avoid
      // calling them if we're not invoking the svc.conf file?
      if (TAO_Internal::resource_factory_args_ != 0)
        ACE_Service_Config::process_directive (
          TAO_Internal::resource_factory_args_);

So result & errno are set appropriately by the call to
ACE_Service_Config::open(), but ACE_Service_Config::process_directive() can
overwrite errno. I really don't know the details but I think the various
"ace_yy" routines which process_directive() uses will ultimately perform an
"lseek()" operation on stdin and get the ESPIPE error ("Illegal seek" in the
output below). Note that if you close(0) (what my fork/exec code did), file
descriptor 0 is guaranteed to be a pipe because the reactor calls pipe().
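
The close(0) claim can be checked in isolation. The sketch below assumes a
POSIX system where pipe() gives the read end the lowest free descriptor (true
on the common implementations, although POSIX itself only promises that the two
lowest free descriptors are used). The helper names are made up for
illustration and are not part of ACE or TAO.

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>

// Closes stdin, creates a pipe, and returns the descriptor of the read end.
// Returns -1 if pipe() fails. The pipe is deliberately left open, just as
// the reactor's pipe would be.
int read_end_after_closing_stdin()
{
  ::close(0);        // free descriptor 0, as the fork/exec code did
  int fds[2];
  if (::pipe(fds) != 0)
    return -1;
  return fds[0];     // expected to be 0, the lowest free descriptor
}

// Returns the errno produced by seeking on fd, or 0 if the seek succeeds.
int seek_errno(int fd)
{
  assert(fd >= 0);   // precondition: a valid descriptor number
  errno = 0;
  if (::lseek(fd, 0, SEEK_CUR) == static_cast<off_t>(-1))
    return errno;
  return 0;
}
```

Calling read_end_after_closing_stdin() and then seek_errno(0) should give 0 and
ESPIPE respectively, which is the "Illegal seek" reported in this bug.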

The fix: Obviously the code should preserve errno after the call to
ACE_Service_Config::open(). Ideally the ace_yy* routines shouldn't play with
file descriptors if they are only parsing strings. The bug is not a big deal
but could cause trouble for UNIX users who play with fork()/exec().
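
A minimal sketch of the "preserve errno" idea, using stand-ins rather than the
real ACE/TAO calls. Errno_Saver, noisy_step, open_then_process and should_throw
are hypothetical names; noisy_step only imitates a later step that overwrites
errno without failing. ACE's own ACE_Errno_Guard class appears to be meant for
the same job.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical helper (not part of ACE or TAO): remembers errno on
// construction and puts it back on destruction.
class Errno_Saver
{
public:
  Errno_Saver() : saved_(errno) {}
  ~Errno_Saver() { errno = saved_; }
private:
  Errno_Saver(const Errno_Saver&);             // non-copyable
  Errno_Saver& operator=(const Errno_Saver&);
  int saved_;
};

// Stand-in for a later step that clobbers errno although nothing failed.
void noisy_step() { errno = ESPIPE; }

// Stand-in for open_services_i(): the "open" step fails with open_errno
// (or succeeds when open_errno is 0), then a later step runs under the guard.
int open_then_process(int open_errno)
{
  assert(open_errno >= 0);
  errno = open_errno;
  const int result = (open_errno == 0) ? 0 : -1;
  {
    Errno_Saver guard;  // captures errno as left by the "open" step
    noisy_step();       // overwrites errno
  }                     // guard restores errno here
  return result;
}

// The caller's check, shaped like the one in ORB_init.
bool should_throw(int result) { return result != 0 && errno != ENOENT; }
```

With the guard in place, should_throw(open_then_process(ENOENT)) is false, so a
missing svc.conf no longer turns into an exception; without the guard, the
ESPIPE left behind by noisy_step would make the same check true.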

---------------- output
% ./a.out
CORBA::ORB_init() returned OK
% cat /dev/null | ./a.out
(15688|1) Unable to initialize the Service Configurator: Illegal seek
CORBA::ORB_init threw exception IDL:omg.org/CORBA/INITIALIZE:1.0
---------------- source
#include <stdio.h>   // fprintf

#include <tao/corba.h>
#include <tao/TAO_Internal.h>

int main( int argc, char *argv[] )
{
        CORBA::ORB_ptr theOrb;

        TAO_Internal::default_svc_conf_entries( "", "", "" );

        try
        {
                theOrb = CORBA::ORB_init( argc, argv );
        }
        catch ( const CORBA::Exception& e )
        {
                const char* id = e._id();

                fprintf( stderr, "CORBA::ORB_init threw exception %s\n", id );

                return 0;
        }

        fprintf( stderr, "CORBA::ORB_init() returned OK\n" );

        return 0;
}
Comment 1 Ossama Othman 2001-10-17 12:30:59 CDT
I tried your test code using a post TAO 1.2 CVS snapshot.  Unfortunately, I
wasn't able to reproduce the error that you're seeing.  Here's what I get (on
Linux):

$ ./test 
CORBA::ORB_init() returned OK
$ cat /dev/null | ./test
CORBA::ORB_init() returned OK

Note that ACE's Service Configurator parser was replaced with a brand new
reentrant parser (generated with GNU Bison) in ACE 5.1.19.  Furthermore,
significant changes were made to the scanner (still GNU Flex based) in that ACE
beta as well.

Can you please try to reproduce this problem with TAO 1.2?
Comment 2 pphillip 2001-10-17 18:35:05 CDT
It still seems to happen on ACE/TAO 1.2 but is very Solaris specific.
Here is a stack trace.  I think the problem is the fflush(yyin) call which on
Solaris has the side effect of calling llseek() and setting errno if stdin
is a pipe:

(/opt/SUNWspro/bin/../WS5.0/bin/sparcv9/dbx) where
current thread: t@1
  [1] _lseek64(0x0, 0x0, 0x0, 0x1, 0xff3dc9d0, 0xa1), at 0xff29290c
  [2] _fflush_u(0xff2b63ec, 0x0, 0xff30c524, 0x5257, 0xff3dc9d0, 0xff2b65c0), at 0xff289cf0
  [3] _fflush_u_iops(0x0, 0xff2b63ec, 0xff2b2118, 0xff2b6584, 0x13, 0x0), at 0xff289c78
=>[4] ace_yywrap(), line 97 in "Svc_Conf.l"
  [5] ace_yylex(ace_yylval = 0xffbef0a4), line 170 in "Svc_Conf.l"
  [6] ace_yyparse(), line 432 in "bison.simple"
  [7] ACE_Service_Config::process_directives_i(), line 385 in "Service_Config.cpp"
  [8] ACE_Service_Config::process_directive(directive = 0x4b6a4 ""), line 418 in "Service_Config.cpp"
  [9] TAO_Internal::open_services_i(argc = 1, argv = 0x65f48, ignore_default_svc_conf_file = 0, skip_service_config_open = 0), line 256 in "TAO_Internal.cpp"
  [10] TAO_Internal::open_services(argc = 1, argv = 0xffbefbcc), line 175 in "TAO_Internal.cpp"
  [11] CORBA::ORB_init(argc = 1, argv = 0xffbefbcc, orbid = 0xfe9f7191 "", _ACE_CORBA_Environment_variable = CLASS), line 1426 in "ORB.cpp"
  [12] CORBA::ORB_init(argc = 1, argv = 0xffbefbcc, orb_name = (nil)), line 1304 in "ORB.cpp"
  [13] main(0x1, 0xffbefbcc, 0xffbefbd4, 0x62400, 0x0, 0x0), at 0x2986c


Line 97 of Svc_Conf.l is the fflush():
int
yywrap (void)
{
  ::fflush (yyin);  // <-- right here.
  yytext[0] = '#';
  yyleng = 0;

  return 1;
}

This doesn't happen on HP-UX, perhaps because their fflush() is more careful
(ditto for Linux).

The following program simulates the behaviour in miniature.
#include <stdio.h>
#include <errno.h>

int main( int argc, char* argv[] )
{
        int res;
        int flush_errno;

        errno = 0;

        res = fflush( stdin );
        flush_errno = errno;
        if ( res != 0 )
        {
                perror( "fflush failed on stdin" );
        }
        else
        {
                fprintf( stderr, "fflush() on stdin succeeded\n" );
        }

        if ( flush_errno != 0 )
        {
                fprintf( stderr, "fflush set errno to %d\n", flush_errno );
                errno = flush_errno;
                perror( "that translates to:" );
        }

        return 0;
}

On HP-UX 11:
% ./a.out
fflush() on stdin succeeded
% cat /dev/null | ./a.out
fflush() on stdin succeeded

On Solaris 2.7:
% ./a.out 
fflush() on stdin succeeded
% cat /dev/null | ./a.out 
fflush() on stdin succeeded
fflush set errno to 29
that translates to:: Illegal seek

I think this is what is happening to the service parser--it is accidentally
causing errno to get set.

Trivia note: FreeBSD 3.2 and 4.2 completely refuse:
% ./a.out
fflush failed on stdin: Bad file descriptor
fflush set errno to 9
% cat /dev/null | ./a.out
fflush failed on stdin: Bad file descriptor
fflush set errno to 9
that translates to:: Bad file descriptor
Comment 3 pphillip 2001-10-17 18:45:08 CDT
I forgot to add: the bug won't happen if you have an svc.conf file.
On Solaris:

% touch svc.conf
% ./a.out 
CORBA::ORB_init() returned OK
% cat /dev/null | ./a.out 
CORBA::ORB_init() returned OK
% rm svc.conf
% ./a.out 
CORBA::ORB_init() returned OK
% cat /dev/null | ./a.out 
(22421|1) Unable to initialize the Service Configurator: Illegal seek
CORBA::ORB_init threw exception IDL:omg.org/CORBA/INITIALIZE:1.0
Comment 4 pphillip 2001-10-24 13:29:36 CDT
I've tracked this down.  Summary:
1. fflush() under Solaris can accidentally change errno even though
   it returns no error.
2. fflush() can be called with NULL (I did not know this).
3. fflush() is called with NULL when parsing strings in ace/Svc_Conf_l.cpp
   around line 1822 or so:

int
ace_yywrap (void)
{
  ::fflush (ace_yyin);
  ace_yytext[0] = '#';
  ace_yyleng = 0;

  return 1;
}

4. In the case of parsing strings, ace_yyin is NULL.
   I have no idea if the parser meant to do this, since calling fflush(NULL)
   is supposed to flush all open output streams in the process.

5. A patch: change the flush call as follows:
   if (ace_yyin) ::fflush (ace_yyin);
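
The patch in item 5 can be taken one step further so that a successful flush
never leaks an errno change, which also covers item 1. This is a sketch only:
flush_if_open and check_flush_if_open are hypothetical names, not ACE
functions, and the check uses stdout because flushing an input stream is not
defined by the C standard.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>

// Hypothetical helper: flush `stream` only if it is non-null, and leave
// errno untouched unless fflush() itself reports failure.
// Returns 0 on success or when there is nothing to flush, EOF on failure.
int flush_if_open(std::FILE* stream)
{
  if (stream == 0)
    return 0;             // parsing a string: there is no stream to flush
  const int saved_errno = errno;
  const int result = std::fflush(stream);
  if (result == 0)
    errno = saved_errno;  // success: discard any incidental errno change
  return result;
}

// Self-check: a null stream is a no-op, and errno survives a clean flush.
void check_flush_if_open()
{
  errno = ENOENT;
  assert(flush_if_open(0) == 0);
  assert(errno == ENOENT);
  assert(flush_if_open(stdout) == 0);
  assert(errno == ENOENT);
}
```

The null check alone is enough to avoid the fflush(NULL) call; restoring errno
on success is an extra precaution for platforms whose fflush() behaves like the
Solaris one described above.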

Maybe this bug is fixed in a later version of Solaris?  I'm testing on
"5.7 Generic_106541-04 sun4u sparc SUNW,Ultra-2"
Comment 5 Nanbor Wang 2001-11-13 08:07:36 CST
Accepting the bug for tao-support.