908 – BUS error when in TAO_Singleton_Manager dtor

Status:	RESOLVED WORKSFORME

Alias:	None

Product:	TAO
Classification:	Unclassified
Component:	ORB
Version:	1.1.14
Hardware:	SPARC Solaris

Importance:	P3 normal
Assignee:	Ossama Othman

URL:	http://users.tellurian.net/fooz/TAO_O...

Depends on:
Blocks:	1277

Reported:	2001-05-10 14:47 CDT by Steve Hespelt
Modified:	2002-09-23 19:15 CDT
CC List:	0 users

Description Steve Hespelt 2001-05-10 14:47:20 CDT

BUS error when trying to delete object at line 382 in $ACE_ROOT/ace/OS.cpp in 
~ACE_Cleanup_Info_Node() [next_ member variable]. ~TAO_Singleton_Manager() is in 
the call chain when the crash occurs.

    DESCRIPTION:
My program is a mixture of ACE [outside of TAO's usage of ACE] & TAO code. I 
have an ACE_Task derived object (RFActiveORB) that has as a member variable orb_ 
[a CORBA::ORB_var]. The RFActiveORB::open() method sets the orb_ variable via 
orb_ = CORBA::ORB_init( argv_->argc(), argv_->argv(), orbName.c_str() );
Ok, so far so good. 
The problem is that long after the RFActiveORB object has been destructed 
[orb_->shutdown() has been invoked prior to the RFActiveORB dtor being invoked], 
the ACE_OS_Object_Manager_Manager dtor is trying to delete memory pointed to by 
an ACE_Cleanup_Info_Node::next_ member variable. The pointer value in the next_ 
field is the pointer obtained at line 91 in tao/TAO_Singleton.cpp [the 
TAO_Singleton object is obtained just prior to registering the TAO_Singleton for 
destruction with the TAO_Singleton_Manager].
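
For readers unfamiliar with this kind of teardown, here is a minimal sketch of a cleanup list whose nodes are chained through a next pointer and released when the owning manager is destroyed. It is an illustration only, not ACE's or TAO's actual code: CleanupNode, CleanupList and record_cleanup are hypothetical names, and the sketch assumes a simple LIFO list. It shows why a node whose next pointer refers to memory the list does not own would fault during the walk.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a cleanup-node list.
// This is NOT the ACE_Cleanup_Info_Node implementation.
typedef void (*CleanupHook)(void *object, void *param);

struct CleanupNode {
  void *object;       // object registered for destruction
  CleanupHook hook;   // function that cleans the object up
  void *param;        // opaque argument passed to the hook
  CleanupNode *next;  // next node in the chain (owned by the list)

  CleanupNode(void *o, CleanupHook h, void *p, CleanupNode *n)
    : object(o), hook(h), param(p), next(n) {}
};

class CleanupList {
public:
  CleanupList() : head_(nullptr) {}
  ~CleanupList() { run_all(); }

  CleanupList(const CleanupList &) = delete;
  CleanupList &operator=(const CleanupList &) = delete;

  // Register an object; the newest registration becomes the head.
  void at_exit(void *object, CleanupHook hook, void *param) {
    head_ = new CleanupNode(object, hook, param, head_);
  }

  // Run hooks newest-first and free each node. Returns the number run.
  // If any next pointer referred to memory this list did not allocate,
  // the delete below would be undefined behaviour (for example a bus error).
  int run_all() {
    int count = 0;
    while (head_ != nullptr) {
      CleanupNode *node = head_;
      head_ = node->next;  // read next before the node is freed
      if (node->hook != nullptr)
        node->hook(node->object, node->param);
      delete node;
      ++count;
    }
    return count;
  }

private:
  CleanupNode *head_;
};

// Example hook: appends the cleaned-up object's name to a log vector.
inline void record_cleanup(void *object, void *param) {
  static_cast<std::vector<std::string> *>(param)
    ->push_back(*static_cast<std::string *>(object));
}
```

In this sketch every node is allocated and freed by the same list, so the walk is safe. The symptom described above would correspond to a node reachable from one list that was allocated, or already freed, somewhere else.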

I've attached the script session recorded while using dbx to trace out where the
ACE_Cleanup_Info_Node objects are instantiated. I took the liberty of converting 
the script file with dos2unix to remove the CR characters. There are still lots 
of escape sequences in the session file (terminal sequences for bolding, etc.) - 
sorry. This is also an edited version to save space. I have the un-edited 
version if it is needed.
At the end of the dbx session file, you'll see that the address of the 
ACE_Cleanup_Info_Node with the bogus next_ value is 0x1553fd8. The object_ field 
value is 0x155ba68 which by examining the arguments passed to 
TAO_Singleton_Manager::at_exit() is the TAO_Singleton object instantiated at 
line 91 in tao/TAO_Singleton.cpp. The crash info is at the end of the dbx 
session.

An interesting observation is that the value of the next_ field at the time of 
the crash is 0x1553ee8. It is an ACE_Cleanup_Info_Node object (look for: this = 
0x1553ee8). I'm using several dbx ' when in ACE_Cleanup_Info_Node { print this; 
where; } ' commands to try to determine where this object pointed to by the 
next_ field is created. Since I am doing this for every ACE_Cleanup_Info_Node 
constructor, I am expecting to see this address show up as a value of this in a 
trace of a constructor call. I don't. But I do see it as a next_ field value 
(again, look for: next_ = 0x1553ee8). So insert is being invoked on an 
ACE_Cleanup_Info_Node object that I can't see where it is being constructed. 
Now, 1553ee8 is quad aligned so I am assuming it is allocated off the heap.
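
The same check can be done in code rather than with dbx breakpoints. The sketch below is hypothetical and not part of ACE: TracedNode and is_aligned are made-up names, and the registry assumes single-threaded use. Each constructor records its this pointer so that a next pointer can later be tested against the set of nodes that really were constructed, and a small helper performs the alignment check.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical instrumented node: every constructor records `this`,
// mirroring the dbx "when in ACE_Cleanup_Info_Node { print this; }" trace.
class TracedNode {
public:
  explicit TracedNode(TracedNode *next = nullptr) : next_(next) {
    live().insert(this);
  }
  ~TracedNode() { live().erase(this); }

  // Copying would bypass the registry, so it is disabled in this sketch.
  TracedNode(const TracedNode &) = delete;
  TracedNode &operator=(const TracedNode &) = delete;

  TracedNode *next() const { return next_; }

  // True only if `p` is the address of a currently live TracedNode.
  static bool is_live(const void *p) { return live().count(p) != 0; }

private:
  // Registry of live node addresses (not thread-safe; illustration only).
  static std::set<const void *> &live() {
    static std::set<const void *> addresses;
    return addresses;
  }

  TracedNode *next_;
};

// True if `address` is a multiple of `alignment` (alignment must be non-zero).
inline bool is_aligned(std::uintptr_t address, std::uintptr_t alignment) {
  return address % alignment == 0;
}
```

A next pointer for which TracedNode::is_live() returns false matches the situation described above: the address appears in the chain but never as the this of a traced constructor. The address 0x1553ee8 is a multiple of 8, so it passes both a 4-byte and an 8-byte alignment check.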

    REPEAT BY:
I let the app start up and then I did a 'kill -TERM pid' which our 
ACE_Task_Base-derived RFSigMgr class [signals are trapped on one thread only via 
a RFSigMgr private reactor] intercepts & dispatches, causing the app's 'main' 
object to be closed(), thus causing the various ACE_Task derived objects to go 
out of scope.

I'm putting the entire email from 5/9/2001 onto a web site 
this evening [http://users.tellurian.net/fooz/TAO_Object_Manager.txt] so you can 
grab it instead of me pasting the dbx session log here.

Comment 1 Ossama Othman 2001-05-22 13:01:30 CDT

I guess that this one is mine.

Comment 2 Ossama Othman 2001-05-23 12:32:28 CDT

Mine!

Comment 3 Carlos O'Ryan 2002-08-10 21:53:17 CDT

Looks like a crash, thus it is a blocker.

Comment 4 Ossama Othman 2002-09-21 12:09:45 CDT

Does this problem still occur in TAO 1.2.4?  A number of changes have been 
made to this code since TAO 1.1.14 was released.  Furthermore, can you please 
provide a simple test that reproduces this problem?

Comment 5 Steve Hespelt 2002-09-23 09:20:59 CDT

Our production code is still at TAO 1.1.14 but I'm in the middle of porting our 
code to Forte 6 update#2 and to TAO 1.2.x (1.2.2 currently but will be going to 
1.2.4 ASAP). I'm not running into this problem anymore & frankly, I don't know 
why not...  I'd be tempted to change the status to WORKSFORME until I can 
reproduce it under either 1.1.14 or 1.2.x
-steve

Comment 6 Ossama Othman 2002-09-23 19:15:58 CDT

Hi Steve,

Thanks for the confirmation!  If you can wait another week, TAO 1.2.5 should
be ready by then.  Otherwise, 1.2.4 should be good too.

Out of curiosity, is the port from TAO 1.1.x to TAO 1.2.x difficult? 
Presumably, the new emulated exception handling macros and library split make
things difficult.

I'll mark the bug "WORKSFORME" for now.  Please feel free to reopen it if you're
able to reproduce the problem.

Thanks!