Please report new issues athttps://github.com/DOCGroup
Notification service core dumps with SIGBUS signal when using it with -ORBDaemon parameter, to daemonize it. Previous version (1.5.2) work correctly.
Steve, can someon of OCI have a look at this?
Accepted on behalf of OCI.
janka.ace@mailnull.com (Sam) wrote: > Notification service core dumps with SIGBUS signal when using it with > -ORBDaemon parameter, to daemonize it. Previous version (1.5.2) work > correctly. Can you provide more information about the failure? For example: - Stack trace from the core file - Complete list of command line parameters provided to the Notification Service server - Any Service Configurator (svc.conf) options provided to the Notification Service server Thanks! Steve Totten
COMMAND: LD_LIBRARY_PATH=/opt/TAO-1.5.6/lib /opt/TAO-1.5.6/bin/Notify_Service -IORoutput /tmp/nos.ior -DispatchingThreads 10 -ORBRunThreads 10 -AllocateTaskperProxy -AsynchUpdates -ORBEndpoint iiop://1.0@belladona-3 -ORBInitRef NameService=file:///tmp/ns.ior -ORBDebugLevel 10 -ORBLogFile=/tmp/logx -ORBDaemon It doesn't creates core dump, i see it make sigbus only when running via truss (solaris strace). Here is its output: 3/1: lwp_park(0x00000000, 0) = 0 1643/1: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF] 1643/1: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF] 1643/1: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF] 1643/1: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF] 1643/1: fstat64(3, 0xFFBFEBA0) = 0 1643/1: fstat64(3, 0xFFBFEA48) = 0 1643/1: ioctl(3, TCGETA, 0xFFBFEB2C) Err#22 EINVAL 1643/1: write(3, " I O R : 0 0 0 0 0 0 0 0".., 250) = 250 1643/1: close(3) = 0 1643/1: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF] 1643/1: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF] 1643/3: lwp_park(0x00000000, 0) = 0 1643/3: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF] 1643/3: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF] 1643/3: pollsys(0xFDE7B9E8, 3, 0x00000000, 0x00000000) = 1 1643/3: read(2, " I O R : 0 0 0 0", 8) = 8 1643/3: lwp_unpark(2) = 0 1643/3: Incurred fault #5, FLTACCESS %pc = 0xFE5B8E2C 1643/3: siginfo: SIGBUS BUS_ADRALN addr=0x494F5276 1643/3: Received signal #10, SIGBUS [default] 1643/3: siginfo: SIGBUS BUS_ADRALN addr=0x494F5276 Ok, and when running without truss it simply exits with retval 0, and i there is no information about any problem, it just doesn't runs. Btw. Naming_Service version 1.5.6 with -ORBDaemon param works without problems. Notification_Service from 1.5.2 with the same params, same environment work correctly.
Sam, thanks a lot for the additional information. This would help reproducing the problem. Can I also ask you to add ACE_DEBUG=1 to your commadline and post the resulting output (run it directly, without truss)? Thanks, --Iliyan
janka.ace@mailnull.com (Sam) wrote: > COMMAND: > LD_LIBRARY_PATH=/opt/TAO-1.5.6/lib /opt/TAO-1.5.6/bin/Notify_Service -IORoutput > /tmp/nos.ior -DispatchingThreads 10 -ORBRunThreads 10 -AllocateTaskperProxy > -AsynchUpdates -ORBEndpoint iiop://1.0@belladona-3 -ORBInitRef > NameService=file:///tmp/ns.ior -ORBDebugLevel 10 -ORBLogFile=/tmp/logx -ORBDaemon That's a lot of threads. I don't know if you realize, but with -AllocateTaskperProxy, the number of dispatching threads specified with -DispatchingThreads will be multiplied on a per supplier proxy basis. So, each new consumer that connects will get a new supplier proxy (unless it connects to an existing supplier proxy, but that's not the usual usage scenario), which will mean another 10 dispatching threads will be created. If you had 20 consumers, that's 200 dispatching threads, plus the 10 ORBRunThreads. Since you are capturing ORB debug output in a file, can you attach the contents of that file? It may provide some clues. > It doesn't creates core dump, Perhaps ulimit is not set to allow core dumps on your system for your userid. What does "ulimit -c" show? > i see it make sigbus only when running via truss > (solaris strace). Here is its output: > > 3/1: lwp_park(0x00000000, 0) = 0 > 1643/1: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF > [0x0000FFFF] > 1643/1: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF > [0x0000FFFF] > 1643/1: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF > [0x0000FFFF] > 1643/1: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF > [0x0000FFFF] > 1643/1: fstat64(3, 0xFFBFEBA0) = 0 > 1643/1: fstat64(3, 0xFFBFEA48) = 0 > 1643/1: ioctl(3, TCGETA, 0xFFBFEB2C) Err#22 EINVAL > 1643/1: write(3, " I O R : 0 0 0 0 0 0 0 0".., 250) = 250 > 1643/1: close(3) = 0 > 1643/1: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF > [0x0000FFFF] > 1643/1: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF > [0x0000FFFF] > 1643/3: lwp_park(0x00000000, 0) = 0 > 1643/3: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF > [0x0000FFFF] > 1643/3: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF > [0x0000FFFF] > 1643/3: pollsys(0xFDE7B9E8, 3, 0x00000000, 0x00000000) = 1 > 1643/3: read(2, " I O R : 0 0 0 0", 8) = 8 > 1643/3: lwp_unpark(2) = 0 > 1643/3: Incurred fault #5, FLTACCESS %pc = 0xFE5B8E2C > 1643/3: siginfo: SIGBUS BUS_ADRALN addr=0x494F5276 > 1643/3: Received signal #10, SIGBUS [default] > 1643/3: siginfo: SIGBUS BUS_ADRALN addr=0x494F5276 > > > Ok, and when running without truss it simply exits with retval 0, and i there is > no information about any problem, it just doesn't runs. Btw. Naming_Service > version 1.5.6 with -ORBDaemon param works without problems. Notification_Service > from 1.5.2 with the same params, same environment work correctly. I see you are using g++ 4.1.1 on Solaris (you didn't specify which version of Solaris). Is it possible the Notification Service is linked with libraries built with either a different version of g++, or with a different compiler (e.g., a Sun compiler)? What does "cd $TAO_ROOT/orbsvcs/Notify_Service; ldd Notify_Service" show? Steve Totten
Reassigning to Iliyan.
Sorry i've not pasted whole uname: HW is Solaris 10, latest patches. All used software is self compiled, including gcc (bootstraped) and TAO, linking using Sun's solaris linker bundled with sol10 (because gnu binutils works horibly on solaris). All C++ libraries used by Notification are build with the same version of g++ (gcc and SUN compiler have different mangling schemes, so it's quite impossible to mix them, and SUN compiler had ago some problems compiling TAO when i've tried it). But here is ldd: ldd of Notifcation_Server: LD_LIBRARY_PATH=/opt/TAO-1.5.6/lib ldd /opt/TAO-1.5.6/bin/Notify_Service libTAO_IORTable.so.1 => /opt/TAO-1.5.6/lib/libTAO_IORTable.so.1 libTAO_CosNaming.so.1 => /opt/TAO-1.5.6/lib/libTAO_CosNaming.so.1 libTAO_CosNotification_Serv.so.1 => /opt/TAO-1.5.6/lib/libTAO_CosNotification_Serv.so.1 libTAO_ETCL.so.1 => /opt/TAO-1.5.6/lib/libTAO_ETCL.so.1 libTAO_DynamicAny.so.1 => /opt/TAO-1.5.6/lib/libTAO_DynamicAny.so.1 libTAO_CosNotification_Skel.so.1 => /opt/TAO-1.5.6/lib/libTAO_CosNotification_Skel.so.1 libTAO_Svc_Utils.so.1 => /opt/TAO-1.5.6/lib/libTAO_Svc_Utils.so.1 libTAO_Valuetype.so.1 => /opt/TAO-1.5.6/lib/libTAO_Valuetype.so.1 libTAO_CosEvent_Skel.so.1 => /opt/TAO-1.5.6/lib/libTAO_CosEvent_Skel.so.1 libTAO_CosNotification.so.1 => /opt/TAO-1.5.6/lib/libTAO_CosNotification.so.1 libTAO_CosEvent.so.1 => /opt/TAO-1.5.6/lib/libTAO_CosEvent.so.1 libTAO_PortableServer.so.1 => /opt/TAO-1.5.6/lib/libTAO_PortableServer.so.1 libTAO_AnyTypeCode.so.1 => /opt/TAO-1.5.6/lib/libTAO_AnyTypeCode.so.1 libTAO.so.1 => /opt/TAO-1.5.6/lib/libTAO.so.1 libACE-5.5.6.so => /opt/TAO-1.5.6/lib/libACE-5.5.6.so librt.so.1 => /lib/librt.so.1 libgen.so.1 => /lib/libgen.so.1 libnsl.so.1 => /lib/libnsl.so.1 libsocket.so.1 => /lib/libsocket.so.1 libstdc++.so.6 => /opt/TAO-1.5.6/lib/libstdc++.so.6 libm.so.2 => /lib/libm.so.2 libc.so.1 => /lib/libc.so.1 libgcc_s.so.1 => /opt/devtools/lib/libgcc_s.so.1 libaio.so.1 => /lib/libaio.so.1 libmd5.so.1 => /lib/libmd5.so.1 libmp.so.2 => /lib/libmp.so.2 libscf.so.1 => /lib/libscf.so.1 libdoor.so.1 => /lib/libdoor.so.1 libuutil.so.1 => /lib/libuutil.so.1 /platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1 /platform/SUNW,Sun-Fire-V240/lib/libmd5_psr.so.1 I could finaly produce core-dump (don't know why it didn't worked before), backtrace: #0 0xfe0b0bb8 in strlen () from /lib/libc.so.1 (gdb) bt #0 0xfe0b0bb8 in strlen () from /lib/libc.so.1 #1 0xfe116138 in _ndoprnt () from /lib/libc.so.1 #2 0xfe118858 in vsnprintf () from /lib/libc.so.1 #3 0xfe5d8524 in ACE_OS::snprintf (buf=0x2e0f9 "", maxlen=4056, format=0xffbfef11 "%s: %s") at ../../ace/OS_NS_stdio.inl:927 #4 0xfe5c6d50 in ACE_Log_Msg::log (this=0x2e0f9, format_str=0xfd8 <Address 0xfd8 out of bounds>, log_priority=4290768657, argp=0xd) at ../../ace/Log_Msg.cpp:1274 #5 0xfe5c7334 in ACE_Log_Msg::log (this=0x2dc78, log_priority=94511, format_str=0x80 <Address 0x80 out of bounds>) at ../../ace/Log_Msg.cpp:951 #6 0x00013fc8 in TAO_Notify_Service_Driver::parse_args () #7 0x000147ec in TAO_Notify_Service_Driver::init () #8 0x00013b7c in main () ORBLogFile has not any contents, it is not even created. ACE_DEBUG=1 output: // (14372|0) SG::insert repo=291b8 (opened=0) - enqueue TAO_CosNotify_Service, active=0. // (14372|0) SG::insert repo=291b8 (opened=0) - enqueue Notify_Default_Event_Manager_Objects_Factory, active=0. ACE (14372|1) SG::process_directive_i, repo=291b8 - Concrete_Valuetype_Adapter_Factory, dll=<null>, force=0 ACE (14372|1) SR::insert, repo=291b8 [34] (1024), name=Concrete_Valuetype_Adapter_Factory, type=2fa10, impl=2f9b8, object=0, active=-32046948 ACE (14372|1) SG::add_processed_statisc_svc, repo=291b8 - Concrete_Valuetype_Adapter_Factory ACE (14372|1) DSB::instance, repo=291b8, name=TAO_ORB_Core_Static_Resources type=2e588 => 2f568 // (14372|0) SG::insert repo=291b8 (opened=0) - enqueue TAO_CosNotify_Service, active=0. // (14372|0) SG::insert repo=291b8 (opened=0) - enqueue Notify_Default_Event_Manager_Objects_Factory, active=0. // (14372|0) SG::insert repo=291b8 (opened=0) - enqueue TAO_CosNotify_Service, active=0. // (14372|0) SG::insert repo=291b8 (opened=0) - enqueue Notify_Default_Event_Manager_Objects_Factory, active=0. Segmentation Fault (core dumped)
looks a problem using '%p' in ACE LogMsg. Seems a nill string is used which can cause a crash on solaris. Maybe use the same logic as with '%s' in log_msg, if the string is 0, then print (null). This then seems more a problem in ACE
Wed Apr 4 13:04:25 UTC 2007 Chad Elliott <elliott_c@ociweb.com> * orbsvcs/Notify_Service/Notify_Service.cpp: Added the missing parameter for %p in an ACE_ERROR_RETURN().
Hi, got same problem with latest TAO 1.5.10, same scenario, same environment, only TAO upgraded. Again without -ORBDaemon Notify service works just fine. Command line: /opt/TAO/bin/Notify_Service -IORoutput /tmp/nos.ior -ORBRunThreads 10 -AllocateTaskperProxy -AsynchUpdates -ORBDebugLevel 10 -ORBLogFile /tmp/nos.log -ORBEndpoint iiop://1.0@belladona-3 -ORBInitRef NameService=file:///tmp/ns.ior -ORBDaemon ORBDebugLog: TAO (24812|1) Default_Resource_Factory - codeset manager=3bbb8 TAO (24812|1) - Codeset_Manager_i::init_ccs, Loaded Codeset translator <UTF8_Latin1_Factory>, ncs = 00010001 tcs = 05010001 TAO (24812|1) - UTF16_BOM_Translator: forceBE 0 TAO (24812|1) Loaded default protocol <IIOP_Factory> TAO (24812|1) created new ORB <> TAO (24812|1) - IIOP_Acceptor::open, address==belladona-3, options=(null) TAO (24812|1) - IIOP_Acceptor::open, specified host=belladona-3:0 TAO (24812|1) - IIOP_Acceptor::open_i, listening on: <belladona-3:41779> Loading the Cos Notification Service... in TAO_Properties ctos 3e374 Running 10 ORB threads Starting up the Notification Service... object:499e0 created object:497e8 created Created POA : 1 (24812|1) Topology persistence disabled. Activating object with id = 1 in POA : 1 object:499e0 incr refcount = 1 (24812|1) TAO_Notify_EventChannelFactory::activate_selfTAO (24812|1) - Transport_Cache_Manager::find_i, no idle transport is available TAO (24812|1) - Transport_Cache_Manager::fill_set_i, current_size = 0, cache_maximum = 128 TAO (24812|1) - IIOP_Connector::begin_connection, to <belladona-3:40940> which should block TAO (24812|1) - IIOP_Connection_Handler::open, The local addr is <192.168.0.103:41780> TAO (24812|1) - IIOP_Connection_Handler::open, IIOP connection to peer <192.168.0.103:40940> on 7 TAO (24812|1) - IIOP_Connector::make_connection, new connected connection to <belladona-3:40940> on Transport[7] TAO (24812|1) - Transport_Cache_Manager::bind_i, 0xffbfedc4 -> 0x6ca58 Transport[7] TAO (24812|1) - Transport_Cache_Manager::bind_i, cache size is [1] TAO (24812|1) - Transport[7]::register_handler TAO (24812|1) - Transport_Connector::connect, opening Transport[7] in TAO_CLIENT_ROLE TAO (24812|1) - Codeset_Manager_i::set_tcs, No codeset component in profile TAO (24812|1) - Muxed_TMS[7]::request_id, <1> TAO (24812|1) - Codeset_Manager_i::generate_service_context, using tcs_c = 00010001, tcs_w = 00010109 TAO (24812|1) - GIOP_Message_Base::dump_msg, send GIOP v1.0 msg, 142 data bytes, my endian, Type Request[1] GIOP message - HEXDUMP 154 bytes 47 49 4f 50 01 00 00 00 00 00 00 8e 00 00 00 01 GIOP............ 00 00 00 01 00 00 00 0c 00 00 00 00 00 01 00 01 ................ 00 01 01 09 00 00 00 01 01 00 00 00 00 00 00 3a ...............: 14 01 0f 00 4e 55 50 00 00 00 1c 00 00 00 00 00 ....NUP......... 00 00 01 52 6f 6f 74 50 4f 41 00 4e 61 6d 65 53 ...RootPOA.NameS 65 72 76 69 63 65 00 00 00 00 00 00 00 00 01 4e ervice.........N 61 6d 65 53 65 72 76 69 63 65 00 00 00 00 00 08 ameService...... 74 6f 5f 6e 61 6d 65 00 00 00 00 00 00 00 00 1a to_name......... 4e 6f 74 69 66 79 45 76 65 6e 74 43 68 61 6e 6e NotifyEventChann 65 6c 46 61 63 74 6f 72 79 00 elFactory. TAO (24812|1) - Transport[7]::cleanup_queue, byte_count = 154 TAO (24812|1) - Transport[7]::cleanup_queue, after transfer, bc = 0, all_sent = 1, ml = 0 TAO (24812|1) - Transport[7]::drain_queue_helper, byte_count = 154, head_is_empty = 1 TAO (24812|1) - Transport[7]::drain_queue_i, helper retval = 1 TAO (24812|1) - Transport[7]::make_idle TAO (24812|1) - Leader_Follower[7]::wait_for_event, (leader) enter reactor event loop TAO (24812|1) - Connection_Handler[7]::handle_input, handle = 7/7 TAO (24812|1) - Transport[7]::handle_input TAO (24812|1) - Transport[7]::process_queue_head, 0 enqueued TAO (24812|1) - Transport[7]::handle_input_parse_data, enter TAO (24812|1) - Transport[7]::handle_input_parse_data, read 65 bytes TAO (24812|1) - GIOP_Message_State::parse_message_header_i TAO (24812|1) - GIOP_Message_State::get_version_info TAO (24812|1) - GIOP_Message_State::get_byte_order_info TAO (24812|1) - Transport[7]::process_parsed_messages, entering (missing data == 0) TAO (24812|1) - GIOP_Message_Base::dump_msg, recv GIOP v1.0 msg, 53 data bytes, my endian, Type Reply[1] GIOP message - HEXDUMP 65 bytes 47 49 4f 50 01 00 00 01 00 00 00 35 00 00 00 00 GIOP.......5.... 00 00 00 01 00 00 00 00 00 00 00 01 00 00 00 1a ................ 4e 6f 74 69 66 79 45 76 65 6e 74 43 68 61 6e 6e NotifyEventChann 65 6c 46 61 63 74 6f 72 79 00 2f 48 00 00 00 01 elFactory./H.... 00 . TAO (24812|1) - TAO_Muxed_TMS::dispatch_reply, id = 1 TAO (24812|1) - Connection_Handler[7]::handle_input, handle = 7/7, retval = 0 TAO (24812|1) - Leader_Follower[7]::wait_for_event, (leader) exit reactor event loop TAO (24812|1) - Transport_Cache_Manager::is_entry_idle, state is [0] TAO (24812|1) - Transport_Cache_Manager::find_i, at index 0 (Transport[7]) - idle TAO (24812|1) - Transport_Connector::connect, got an existing connected Transport[7] in role TAO_CLIENT_ROLE TAO (24812|1) - Muxed_TMS[7]::request_id, <2> TAO (24812|1) - GIOP_Message_Base::dump_msg, send GIOP v1.0 msg, 255 data bytes, my endian, Type Request[2] GIOP message - HEXDUMP 267 bytes 47 49 4f 50 01 00 00 00 00 00 00 ff 00 00 00 00 GIOP............ 00 00 00 02 01 00 00 0c 00 00 00 3a 14 01 0f 00 ...........:.... 4e 55 50 00 00 00 1c 00 00 00 00 00 00 00 01 52 NUP............R 6f 6f 74 50 4f 41 00 4e 61 6d 65 53 65 72 76 69 ootPOA.NameServi 63 65 00 00 00 00 00 00 00 00 01 4e 61 6d 65 53 ce.........NameS 65 72 76 69 63 65 00 00 00 00 00 07 72 65 62 69 ervice......rebi 6e 64 00 53 00 00 00 00 00 00 00 01 00 00 00 1a nd.S............ 4e 6f 74 69 66 79 45 76 65 6e 74 43 68 61 6e 6e NotifyEventChann 65 6c 46 61 63 74 6f 72 79 00 74 43 00 00 00 01 elFactory.tC.... 00 6c 46 61 00 00 00 26 49 44 4c 3a 4e 6f 74 69 .lFa...&IDL:Noti 66 79 45 78 74 2f 45 76 65 6e 74 43 68 61 6e 6e fyExt/EventChann 65 6c 46 61 63 74 6f 72 79 3a 31 2e 30 00 00 00 elFactory:1.0... 00 00 00 01 00 00 00 00 00 00 00 3f 00 01 00 00 ...........?.... 00 00 00 0c 62 65 6c 6c 61 64 6f 6e 61 2d 33 00 ....belladona-3. a3 33 00 00 00 00 00 23 14 01 0f 00 4e 55 54 46 .3.....#....NUTF b1 cb d2 00 0b ba 4f 00 00 00 01 00 00 00 01 00 ......O......... 00 00 00 00 00 00 01 00 00 00 01 ........... TAO (24812|1) - Transport[7]::cleanup_queue, byte_count = 267 TAO (24812|1) - Transport[7]::cleanup_queue, after transfer, bc = 0, all_sent = 1, ml = 0 TAO (24812|1) - Transport[7]::drain_queue_helper, byte_count = 267, head_is_empty = 1 TAO (24812|1) - Transport[7]::drain_queue_i, helper retval = 1 TAO (24812|1) - Transport[7]::make_idle TAO (24812|1) - Leader_Follower[7]::wait_for_event, (leader) enter reactor event loop TAO (24812|1) - Connection_Handler[7]::handle_input, handle = 7/7 TAO (24812|1) - Transport[7]::handle_input TAO (24812|1) - Transport[7]::process_queue_head, 0 enqueued TAO (24812|1) - Transport[7]::handle_input_parse_data, enter TAO (24812|1) - Transport[7]::handle_input_parse_data, read 24 bytes TAO (24812|1) - GIOP_Message_State::parse_message_header_i TAO (24812|1) - GIOP_Message_State::get_version_info TAO (24812|1) - GIOP_Message_State::get_byte_order_info TAO (24812|1) - Transport[7]::process_parsed_messages, entering (missing data == 0) TAO (24812|1) - GIOP_Message_Base::dump_msg, recv GIOP v1.0 msg, 12 data bytes, my endian, Type Reply[2] GIOP message - HEXDUMP 24 bytes 47 49 4f 50 01 00 00 01 00 00 00 0c 00 00 00 00 GIOP............ 00 00 00 02 00 00 00 00 ........ TAO (24812|1) - TAO_Muxed_TMS::dispatch_reply, id = 2 TAO (24812|1) - Connection_Handler[7]::handle_input, handle = 7/7, retval = 0 TAO (24812|1) - Leader_Follower[7]::wait_for_event, (leader) exit reactor event loop Registered with the naming service as: NotifyEventChannelFactory ../../../../TAO/orbsvcs/Notify_Service/Notify_Service.cpp: Running the Notification Service TAO (24812|2) - ORB_Core::run, start [run] TAO (24812|2) - ORB_Core::run, calling handle_events() TAO (24812|3) - ORB_Core::run, start [run] TAO (24812|3) - ORB_Core::run, calling handle_events() TAO (24812|4) - ORB_Core::run, start [run] TAO (24812|4) - ORB_Core::run, calling handle_events() TAO (24812|5) - ORB_Core::run, start [run] TAO (24812|5) - ORB_Core::run, calling handle_events() TAO (24812|6) - ORB_Core::run, start [run] TAO (24812|6) - ORB_Core::run, calling handle_events() TAO (24812|7) - ORB_Core::run, start [run] TAO (24812|7) - ORB_Core::run, calling handle_events() TAO (24812|8) - ORB_Core::run, start [run] TAO (24812|8) - ORB_Core::run, calling handle_events() TAO (24812|9) - ORB_Core::run, start [run] TAO (24812|9) - ORB_Core::run, calling handle_events() TAO (24812|10) - ORB_Core::run, start [run] TAO (24812|10) - ORB_Core::run, calling handle_events() TAO (24812|11) - ORB_Core::run, start [run] TAO (24812|11) - ORB_Core::run, calling handle_events() TAO (24812|2) - Transport_Cache_Manager::fill_set_i, current_size = 1, cache_maximum = 128 TAO (24812|2) - Concurrency_Strategy::activate_svc_handler, opened as TAO_SERVER_ROLE TAO (24812|2) - IIOP_Connection_Handler::open, The local addr is <192.168.0.103:41779> TAO (24812|2) - IIOP_Connection_Handler::open, IIOP connection to peer <192.168.0.103:41788> on 3 TAO (24812|2) - Transport_Cache_Manager::bind_i, 0xfdffb854 -> 0x92628 Transport[3] TAO (24812|2) - Transport_Cache_Manager::bind_i, cache size is [2] TAO (24812|2) - Transport[3]::register_handler sleep_hook failed: Broken pipe TRACE: 26157/2: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF] 26157/2: write(2, " T A O ( 2 6 1 5 7 | 2".., 84) = 84 26157/2: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF] 26157/2: door_info(5, 0xFDFF9018) = 0 26157/2: door_call(5, 0xFDFF9000) = 0 26157/2: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF] 26157/2: write(2, " T A O ( 2 6 1 5 7 | 2".., 67) = 67 26157/2: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF] 26157/2: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF] 26157/2: write(2, " T A O ( 2 6 1 5 7 | 2".., 47) = 47 26157/2: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF] 26157/2: write(4, "\0\0\0\0\0\0\004", 8) Err#32 EPIPE 26157/2: Received signal #13, SIGPIPE [ignored] 26157/2: siginfo: SIG#0 26157/2: open("/opt/nms/nls/C/SUNW_OST_OSLIB", O_RDONLY) Err#2 ENOENT 26157/2: open("/usr/lib/locale/C/LC_MESSAGES/SUNW_OST_OSLIB.mo", O_RDONLY) Err#2 ENOENT 26157/2: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF] 26157/2: write(2, " s l e e p _ h o o k f".., 31) = 31 26157/2: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF] 26157/11: lwp_park(0x00000000, 0) = 0 26157/11: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF] 26157/11: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF] 26157/11: pollsys(0xFD6FB9E0, 1, 0x00000000, 0x00000000) (sleeping...) 26157/2: lwp_park(0x00000000, 0) (sleeping...)
Forgot to TAO version change.
(In reply to comment #11) Hi Sam, > /opt/TAO/bin/Notify_Service -IORoutput /tmp/nos.ior -ORBRunThreads 10 > -AllocateTaskperProxy -AsynchUpdates -ORBDebugLevel 10 -ORBLogFile /tmp/nos.log > -ORBEndpoint iiop://1.0@belladona-3 -ORBInitRef NameService=file:///tmp/ns.ior > -ORBDaemon Something doesn't make sense to me. If this is (part of) the log file: > ORBDebugLog: > > ... > (24812|1) Topology persistence disabled. > Activating object with id = 1 in POA : 1 > object:499e0 incr refcount = 1 > (24812|1) TAO_Notify_EventChannelFactory::activate_selfTAO (24812|1) - ... > bytes, my endian, Type Request[1] > GIOP message - HEXDUMP 154 bytes > 47 49 4f 50 01 00 00 00 00 00 00 8e 00 00 00 01 GIOP............ > 00 00 00 01 00 00 00 0c 00 00 00 00 00 01 00 01 ................ > 00 01 01 09 00 00 00 01 01 00 00 00 00 00 00 3a ...............: > 14 01 0f 00 4e 55 50 00 00 00 1c 00 00 00 00 00 ....NUP......... > 00 00 01 52 6f 6f 74 50 4f 41 00 4e 61 6d 65 53 ...RootPOA.NameS > 65 72 76 69 63 65 00 00 00 00 00 00 00 00 01 4e ervice.........N > 61 6d 65 53 65 72 76 69 63 65 00 00 00 00 00 08 ameService...... > 74 6f 5f 6e 61 6d 65 00 00 00 00 00 00 00 00 1a to_name......... > 4e 6f 74 69 66 79 45 76 65 6e 74 43 68 61 6e 6e NotifyEventChann > 65 6c 46 61 63 74 6f 72 79 00 elFactory. > TAO (24812|1) - Transport[7]::cleanup_queue, byte_count = 154 > TAO (24812|1) - Transport[7]::cleanup_queue, after transfer, bc = 0, all_sent = > 1, ml = 0 > TAO (24812|1) - Transport[7]::drain_queue_helper, byte_count = 154, > head_is_empty = 1 ... > my endian, Type Reply[1] > GIOP message - HEXDUMP 65 bytes > 47 49 4f 50 01 00 00 01 00 00 00 35 00 00 00 00 GIOP.......5.... > 00 00 00 01 00 00 00 00 00 00 00 01 00 00 00 1a ................ > 4e 6f 74 69 66 79 45 76 65 6e 74 43 68 61 6e 6e NotifyEventChann > 65 6c 46 61 63 74 6f 72 79 00 2f 48 00 00 00 01 elFactory./H.... > 00 . > TAO (24812|1) - TAO_Muxed_TMS::dispatch_reply, id = 1 > TAO (24812|1) - Connection_Handler[7]::handle_input, handle = 7/7, retval = 0 > TAO (24812|1) - Leader_Follower[7]::wait_for_event, (leader) exit reactor event > loop > TAO (24812|1) - Transport_Cache_Manager::is_entry_idle, state is [0] > TAO (24812|1) - Transport_Cache_Manager::find_i, at index 0 (Transport[7]) - > idle > TAO (24812|1) - Transport_Connector::connect, got an existing connected > Transport[7] in role TAO_CLIENT_ROLE > TAO (24812|1) - Muxed_TMS[7]::request_id, <2> > TAO (24812|1) - GIOP_Message_Base::dump_msg, send GIOP v1.0 msg, 255 data > bytes, my endian, Type Request[2] > GIOP message - HEXDUMP 267 bytes > 47 49 4f 50 01 00 00 00 00 00 00 ff 00 00 00 00 GIOP............ > 00 00 00 02 01 00 00 0c 00 00 00 3a 14 01 0f 00 ...........:.... > 4e 55 50 00 00 00 1c 00 00 00 00 00 00 00 01 52 NUP............R > 6f 6f 74 50 4f 41 00 4e 61 6d 65 53 65 72 76 69 ootPOA.NameServi > 63 65 00 00 00 00 00 00 00 00 01 4e 61 6d 65 53 ce.........NameS > 65 72 76 69 63 65 00 00 00 00 00 07 72 65 62 69 ervice......rebi > 6e 64 00 53 00 00 00 00 00 00 00 01 00 00 00 1a nd.S............ > 4e 6f 74 69 66 79 45 76 65 6e 74 43 68 61 6e 6e NotifyEventChann > 65 6c 46 61 63 74 6f 72 79 00 74 43 00 00 00 01 elFactory.tC.... > 00 6c 46 61 00 00 00 26 49 44 4c 3a 4e 6f 74 69 .lFa...&IDL:Noti > 66 79 45 78 74 2f 45 76 65 6e 74 43 68 61 6e 6e fyExt/EventChann > 65 6c 46 61 63 74 6f 72 79 3a 31 2e 30 00 00 00 elFactory:1.0... > 00 00 00 01 00 00 00 00 00 00 00 3f 00 01 00 00 ...........?.... > 00 00 00 0c 62 65 6c 6c 61 64 6f 6e 61 2d 33 00 ....belladona-3. > a3 33 00 00 00 00 00 23 14 01 0f 00 4e 55 54 46 .3.....#....NUTF > b1 cb d2 00 0b ba 4f 00 00 00 01 00 00 00 01 00 ......O......... > 00 00 00 00 00 00 01 00 00 00 01 ........... > TAO (24812|1) - Transport[7]::cleanup_queue, byte_count = 267 > TAO (24812|1) - Transport[7]::cleanup_queue, after transfer, bc = 0, all_sent = > 1, ml = 0 > TAO (24812|1) - Transport[7]::drain_queue_helper, byte_count = 267, > head_is_empty = 1 ... > my endian, Type Reply[2] > GIOP message - HEXDUMP 24 bytes > 47 49 4f 50 01 00 00 01 00 00 00 0c 00 00 00 00 GIOP............ > 00 00 00 02 00 00 00 00 ........ > TAO (24812|1) - TAO_Muxed_TMS::dispatch_reply, id = 2 > TAO (24812|1) - Connection_Handler[7]::handle_input, handle = 7/7, retval = 0 > TAO (24812|1) - Leader_Follower[7]::wait_for_event, (leader) exit reactor event > loop > Registered with the naming service as: NotifyEventChannelFactory > ../../../../TAO/orbsvcs/Notify_Service/Notify_Service.cpp: Running the > Notification Service > ... > TAO (24812|2) - Transport_Cache_Manager::fill_set_i, current_size = 1, > cache_maximum = 128 > TAO (24812|2) - Concurrency_Strategy::activate_svc_handler, opened as > TAO_SERVER_ROLE > TAO (24812|2) - IIOP_Connection_Handler::open, The local addr is > <192.168.0.103:41779> > TAO (24812|2) - IIOP_Connection_Handler::open, IIOP connection to peer > <192.168.0.103:41788> on 3 > TAO (24812|2) - Transport_Cache_Manager::bind_i, 0xfdffb854 -> 0x92628 > Transport[3] > TAO (24812|2) - Transport_Cache_Manager::bind_i, cache size is [2] > TAO (24812|2) - Transport[3]::register_handler > sleep_hook failed: Broken pipe I see the process id was 24812, hoever below the truss output shows 26157?! > 26157/2: lwp_sigmask(SIG_SETMASK, 0xFFBFFEFF, 0x0000FFF7) = 0xFFBFFEFF > [0x0000FFFF] > 26157/2: write(2, " T A O ( 2 6 1 5 7 | 2".., 84) = 84 > 26157/2: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF > ... Looks like you are running a (TAO) client under truss, not the Notification Service. And what we see failing is that client process. Is this correct? If, so - can you post the full command line for that client? Also, please post the stack trace at the point where the client causes the problem. What happens to the NS process at that time? Does it go down, or remain running? Is there more to the NS server log _after_ the client dies? Thanks, --Iliyan Jeliazkov Object Computing, Inc.
Sorry, i've mixed log files when i was pasting them, but trace is from running Notify service. Logs are not from the same run, but Notify still fails in the same place. This is how i was tracing Notify: truss -f -o /tmp/nos.trace /opt/TAO/bin/Notify_Service -IORoutput /tmp/nos.ior -ORBRunThreads 10 -AllocateTaskperProxy -AsynchUpdates -ORBDebugLevel 10 -ORBLogFile /tmp/nos.log -ORBEndpoint iiop://1.0@belladona-3 -ORBInitRef NameService=file:///tmp/ns.ior -ORBDaemon I'm going to setup again tao-1.5.10 testing environment on new machine so i can test it continuosly. As soon i'll have backtrace from debuger, i'll post it here.
Finaly got a backtrace of the bug. Core was generated by `/opt/TAO/bin/Notify_Service -IORoutput /tmp/nos.ior -ORBRunThreads 10 -Allocate'. Program terminated with signal 10, Bus error. #0 0xfde7ce0c in ACE_Event_Handler::Reference_Counting_Policy::value (this=0x494f5272) at ../../ace/Event_Handler.cpp:232 232 ../../ace/Event_Handler.cpp: No such file or directory. in ../../ace/Event_Handler.cpp (gdb) bt #0 0xfde7ce0c in ACE_Event_Handler::Reference_Counting_Policy::value (this=0x494f5272) at ../../ace/Event_Handler.cpp:232 #1 0xfdf0a6c8 in ACE_Select_Reactor_Notify::dispatch_notify (this=0x3e5b0, buffer=@0xfd7fb9b8) at ../../ace/Select_Reactor_Base.cpp:817 #2 0xfdf29da0 in ACE_TP_Reactor::handle_notify_events (this=0x3d6e8, guard=@0xfd7fbad4) at ../../ace/TP_Reactor.cpp:396 #3 0xfdf2a780 in ACE_TP_Reactor::dispatch_i (this=0x3d6e8, max_wait_time=0x0, guard=@0xfd7fbad4) at ../../ace/TP_Reactor.cpp:238 #4 0xfdf2a8fc in ACE_TP_Reactor::handle_events (this=0x3d6e8, max_wait_time=0x0) at ../../ace/TP_Reactor.cpp:174 #5 0xfdeeeea8 in ACE_Reactor::handle_events (this=0x3dfc8, max_wait_time=0x0) at ../../ace/Reactor.cpp:420 #6 0xfe1b4b24 in TAO_ORB_Core::run (this=0x33ab0, tv=0x0, perform_work=0) at ../../../TAO/tao/ORB_Core.cpp:2177 #7 0xfe1ae19c in CORBA::ORB::run (this=0x3f610, tv=0x0) at ../../../TAO/tao/ORB.cpp:202 #8 0xfe1ae1e0 in CORBA::ORB::run (this=0x3f610) at ../../../TAO/tao/ORB.cpp:188 #9 0x00015368 in Worker::svc (this=0xffbff350) at ../../../../TAO/orbsvcs/Notify_Service/Notify_Service.cpp:451 #10 0xfdf2bc7c in ACE_Task_Base::svc_run (args=0xffbff350) at ../../ace/Task.cpp:271 #11 0xfdf2cce4 in ACE_Thread_Adapter::invoke_i (this=0x41558) at ../../ace/Thread_Adapter.cpp:146 #12 0xfdf2cdf8 in ACE_Thread_Adapter::invoke (this=0x41558) at ../../ace/Thread_Adapter.cpp:95 #13 0xfde5e174 in ace_thread_adapter (args=0x41558) at ../../ace/Base_Thread_Adapter.cpp:116 #14 0xfd9bfda4 in _lwp_start () from /lib/libc.so.1 #15 0xfd9bfda4 in _lwp_start () from /lib/libc.so.1 Backtrace stopped: previous frame identical to this frame (corrupt stack?) Logfile strip: ACE (13366|11) SG::ctor - this = 4d698, pss = 0 ACE (13366|11) Service_Config_Guard:<ctor> - repo=95498 superceded by repo=2e3c8 TAO (13366|11) - ORB_Core::run, start [run] TAO (13366|11) - ORB_Core::run, calling handle_events() TAO (13366|2) - Transport_Cache_Manager::fill_set_i, current_size = 1, cache_maximum = 128 TAO (13366|2) - Concurrency_Strategy::activate_svc_handler, opened as TAO_SERVER_ROLE TAO (13366|2) - IIOP_Connection_Handler::open, The local addr is <192.168.0.102:41826> TAO (13366|2) - IIOP_Connection_Handler::open, IIOP connection to peer <192.168.0.102:41836> on 3 TAO (13366|2) - Transport_Cache_Manager::bind_i, 0xfd7fb638 -> 0x96628 Transport[3] TAO (13366|2) - Transport_Cache_Manager::bind_i, cache size is [2] TAO (13366|2) - Transport[3]::register_handler sleep_hook failed: Broken pipe
lowering priority