Summary: | IFR database corruption leading to IFR_Service crash | ||
---|---|---|---|
Product: | TAO | Reporter: | Richard Spence <richard.spence.extern> |
Component: | Interface Repository | Assignee: | DOC Center Support List (internal) <tao-support> |
Status: | ASSIGNED --- | ||
Severity: | normal | ||
Priority: | P3 | ||
Version: | 1.4.8 | ||
Hardware: | x86 | ||
OS: | Linux |
Description
Richard Spence
2006-01-26 03:44:46 CST
Fixed

Thu Jan 26 20:36:47 UTC 2006 Jeff Parsons <j.parsons@vanderbilt.edu>

* orbsvcs/IFR_Service/be_produce.cpp (BE_cleanup): Removed code to destroy the temporary holding scope entry in the repository after each IDL file is processed. Instead the lifetime of that entry is now tied to the repository itself.

* orbsvcs/IFR_Service/ifr_adding_visitor.cpp (visit_typedef): Removed code that replaces a typedef with the same repo id with a new entry, which would invalidate any references to the typedef entry that other entries may hold. The IFR will now throw the BAD_PARAM minor code that corresponds to an attempt to create an entry for a repo id that already exists in the repository. Thanks to Richard Spence <richard dot spence dot extern at icn dot siemens dot de> for reporting the problem when the typedef is used as an operation parameter. This closes [BUGID:2381].

* orbsvcs/orbsvcs/IFRService/IFR_Service_Utils.cpp (name_exists): Changed the loop to be a FOR loop using the explicit section names, rather than a while loop calling enumerate_sections() to get each section name.

Retested the given scenario with TAO 1.4.9. I now get a segfault during step 4. GDB backtrace follows:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1214523712 (LWP 7998)]
0xb7bec5c7 in TAO::Invocation_Adapter::~Invocation_Adapter () at /lvol1/ACE_wrappers/TAO/tao/Sequence_T.i:15
(gdb) info stack
#0 0xb7bec5c7 in TAO::Invocation_Adapter::~Invocation_Adapter () at /lvol1/ACE_wrappers/TAO/tao/Sequence_T.i:15
#1 0xb7e154ba in CORBA::Container::create_alias (this=0x0, id=0x4 <Address 0x4 out of bounds>, name=0x4 <Address 0x4 out of bounds>, version=0x4 <Address 0x4 out of bounds>, original_type=0x4) at IFR_Client/IFR_BaseC.cpp:4209
#2 0xb7fe5021 in ifr_adding_visitor::visit_typedef (this=0xbffff580, node=0x80886d8) at ifr_adding_visitor.cpp:2393
#3 0xb7f87979 in AST_Typedef::ast_accept (this=0x80886d8, visitor=0x4) at ast/ast_typedef.cpp:185
#4 0xb7fe32bf in ifr_adding_visitor::visit_scope (this=0xbffff580, node=0x8088700) at ifr_adding_visitor.cpp:98
#5 0xb7fe5df3 in ifr_adding_visitor::visit_root (this=0xbffff580, node=0x807b2b8) at ifr_adding_visitor.cpp:2442
#6 0xb7f822d1 in AST_Root::ast_accept (this=0x807b240, visitor=0x4) at ast/ast_root.cpp:215
#7 0xb7fe3079 in BE_produce () at be_produce.cpp:229
#8 0x0804e8dc in DRV_drive (s=0x807b0a8 "t2.idl") at /lvol1/ACE_wrappers/TAO/TAO_IDL/tao_idl.cpp:261
#9 0x0804ed29 in main (argc=2, argv=0x0) at /lvol1/ACE_wrappers/TAO/TAO_IDL/tao_idl.cpp:345

I tried this on Windows and Linux workspaces, and each time I get the expected exception. From your stack trace, it looks like this line in ifr_adding_visitor::visit_typedef()

if (be_global->ifr_scopes ().top (current_scope) == 0)

is succeeding but putting 0 into current_scope. That may be because this line in ifr_adding_visitor::visit_root()

if (be_global->ifr_scopes ().push (be_global->repository ()) != 0)

is succeeding, but be_global->repository() is returning 0. This in turn may be because code in be_produce.cpp (BE_ifr_repo_init) is not working as expected.
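The failure mode guessed at above (a stack whose top() reports success while handing back a null scope) can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: `Container`, `ScopeStack`, `ScopeStatus` and `checked_top` are stand-ins invented for illustration and are not TAO or ACE types. The only thing borrowed from the quoted code is the convention that a return value of 0 means success.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a repository scope reference; not a TAO type.
struct Container {
  std::string name;
};

// Hypothetical stack following the convention visible in the quoted code:
// push() and top() return 0 on success and -1 on failure.
class ScopeStack {
public:
  int push(Container* scope) {
    // Assumption for illustration: a null pointer is accepted silently,
    // which is the situation suspected in visit_root().
    items_.push_back(scope);
    return 0;
  }

  int top(Container*& out) const {
    if (items_.empty()) return -1;
    out = items_.back();
    return 0;
  }

private:
  std::vector<Container*> items_;
};

enum class ScopeStatus { Ok, EmptyStack, NilScope };

// Separates "top() failed" from "top() succeeded but stored a null scope",
// so the caller never invokes an operation through a null scope.
ScopeStatus checked_top(const ScopeStack& scopes, Container*& out) {
  out = nullptr;
  if (scopes.top(out) != 0) return ScopeStatus::EmptyStack;
  if (out == nullptr) return ScopeStatus::NilScope;
  return ScopeStatus::Ok;
}
```

A caller would test for `ScopeStatus::Ok` before using the scope. A `NilScope` result would point at whatever pushed the entry rather than at the code that read it.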
Please check these steps out in your debugger, and see if any of my guesses are on the right track. It would help a lot in tracking down the source of the problem.

After further debugging as requested I can report the following. Inside ifr_adding_visitor::visit_typedef, the variable current_scope is non-null:

(gdb) p current_scope
$7 = 0x808f4c8
(gdb) p *$
$8 = {<CORBA::IRObject> = {<CORBA::Object> = {_vptr.Object = 0xb7ef5988, servant_ = 0x0, proxy_broker_ = 0xb7c88e38, is_collocated_ = false, is_local_ = false, is_evaluated_ = true, ior_ = {<TAO_Var_Base_T<IOP::IOR>> = {ptr_ = 0x0}, <No data fields>}, orb_core_ = 0x806e870, protocol_proxy_ = 0x8089280, refcount_ = 1, refcount_lock_ = 0x808f618}, _vptr.IRObject = 0xb7ef5900, the_TAO_IRObject_Proxy_Broker_ = 0x0}, _vptr.Container = 0xb7ef5878, static _tc_Description = 0xb7f116d4, static _tc_DescriptionSeq = 0xb7f116c0, the_TAO_Container_Proxy_Broker_ = 0x0}

To my non-expert eye this all looks OK. Execution reaches create_alias() as follows:

CORBA::Container::create_alias (this=0x808f4c8, id=0x8088818 "IDL:US:1.0", name=0x8088818 "IDL:US:1.0", version=0x8088818 "IDL:US:1.0", original_type=0x8088818) at /lvol1/ACE_wrappers/TAO/tao/Object.i:81

I can trace execution as far as _tao_call.invoke(...), where everything falls apart. I will try to debug further, but in the meantime my programming sixth sense is screaming that I should check build configuration and hygiene. I built 1.4.9 (as always) from a clean tarball using the 'standard' traditional process, the only deviation being 'TAO_ORBSVCS = IFRService'. I will get a colleague to build ACE+TAO (5.4.9/1.4.9) and test this behaviour in his environment as a sanity check.

Jeff, if you have a non-vanilla Linux build process, can you please post it so that I can see if it 'fixes' this problem in my environment?
Here are the specs on my Linux workspace:

Linux version 2.4.21-27.0.2.ELsmp (bhcompile@tweety.build.redhat.com) (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-53)), GNU Make version 3.79.1.
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --host=i386-redhat-linux
Thread model: posix

All the makefiles are generated from MPC as usual. Hope this helps.

The debug messages seem to indicate that the problem is at a lower level in the client-side or server-side ORB, i.e., the stub (invoke method) or the POA (invocation_adapter). Have you tried any examples with remote calls to something other than the IFR?

Any further progress on this issue? I'd like to tie up this loose end before we cut the next beta. |