Bug 2846 - TAO mistakenly identifies a remote object as collocated
Status: NEW
Alias: None
Product: TAO
Classification: Unclassified
Component: ORB
Version: 1.5
Hardware: x86 Windows XP
Importance: P5 minor
Assignee: DOC Center Support List (internal)
Reported: 2007-03-06 11:16 CST by Tim Bomgardner
Modified: 2007-03-13 04:09 CDT

Attachments
Zip containing client and two servers, VS6 project files. (8.46 KB, application/octet-stream)
2007-03-06 11:19 CST, Tim Bomgardner

Description Tim Bomgardner 2007-03-06 11:16:34 CST
I have two computers, X and Y, running identical servers which implement
interface Test with operation A.  At one point, the X server operation A
attempts to invoke operation A on server Y.  The end result is that server X ORB
decides that the object is collocated, which it isn't, and an OBJECT_NOT_EXIST
exception is thrown.

I had originally thought that the 0xdead pointer was the cause, but it turns out
that problem simply unmasks another problem related to collocation.  There are
two preconditions to recreate the collocation problem (which perhaps violate
some rule I'm unaware of):

1. The servers must declare more than one endpoint.  In my case, I have

   -ORBEndpoint iiop://localhost:5122
   -ORBEndpoint iiop://X:5122

on machine X and

   -ORBEndpoint iiop://localhost:5122
   -ORBEndpoint iiop://Y:5122

on machine Y.  If there is only a single endpoint per server, the problem does
not occur.

2. The two servers must be using the same port number.  If the port used on
machine Y is 5123 instead of 5122, the problem does not occur.
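
The two preconditions are consistent with a collocation check that matches
endpoints by host name and port alone.  The sketch below is a hypothetical
model, not TAO's actual code: Endpoint, server_endpoints, and looks_collocated
are made-up names, and the premise that server Y's object reference advertises
both of its endpoints (including localhost:5122) is an assumption that this
report does not confirm.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a host/port pair; not a TAO type.
struct Endpoint {
    std::string host;
    unsigned short port;
    bool operator==(const Endpoint& other) const {
        return host == other.host && port == other.port;
    }
};

// Endpoints as configured in the report: every server listens on
// "localhost" and on its own host name, both on the same port.
std::vector<Endpoint> server_endpoints(const std::string& host,
                                       unsigned short port) {
    return { {"localhost", port}, {host, port} };
}

// Naive check (assumed, for illustration): treat an object as collocated
// if ANY endpoint in its reference equals ANY endpoint this ORB listens on.
bool looks_collocated(const std::vector<Endpoint>& local,
                      const std::vector<Endpoint>& profiles) {
    for (const Endpoint& p : profiles) {
        if (std::find(local.begin(), local.end(), p) != local.end()) {
            return true;  // "localhost:5122" names a different machine here
        }
    }
    return false;
}
```

Under this model, server X (localhost:5122, X:5122) checking a reference from
server Y (localhost:5122, Y:5122) finds a match on localhost:5122 and wrongly
treats the object as local.  Moving Y to port 5123, or giving each server only
its host-name endpoint, leaves no endpoint in common, which agrees with the
behavior described above.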

In "real life", there will be zillions of these servers (i.e., more than 64K, so
they couldn't have individual port numbers even if we wanted them to).  Each
machine hosting a server typically hosts a client as well, which connects to its
local server as "localhost".  Clients can and typically do connect to other
servers simultaneously using a specified hostname and the well-known port
number.  The servers can and typically do connect to each other in a similar
fashion.

As for the 0xdead problem, the "bad" pointer is created deliberately by the IDL
compiler in interface_ss.cpp/961.  The pointer is compared to 0 in
Invocation_Adapter.cpp/80, which results in the execution path being diverted
into TAO_ORB_Core::collocation_strategy.  A strategy of TAO_CS_THRU_POA_STRATEGY
will be returned, and ultimately an OBJECT_NOT_EXIST exception will be thrown.  

At this point, however, the damage has already been done--the object in question
has already been marked as collocated.  This appears to occur when the object is
first narrowed after a string_to_object call, at ServerSupport.cpp/81 in the
attached Server VS project.  narrow() makes a call to _is_a in Object_T.cpp/27.
A round-trip call to the target server is performed (you'd think there'd be
enough local knowledge to not have to do that).  In any case, at some point
during that process, the ORB erroneously decides that the object is collocated.

The Bugzilla "enter a new bug" page doesn't seem to provide a mechanism to
attach a file.  I've been told an opportunity will come at a later time.  If
that's true, there will be a zip file attached by the time you read this.  If
not, contact me directly and I'll forward the file to you.

I'm using static ACE/TAO libraries, built with the supplied static
workspace/project files.  My config.h contains

#define ACE_HAS_WINSOCK2 1
#define ACE_DISABLE_WIN32_ERROR_WINDOWS
#define ACE_HAS_STANDARD_CPP_LIBRARY 0
#define ACE_DISABLE_WIN32_INCREASE_PRIORITY
#include "ace/config-win32.h"

Note: As a workaround, specifying -Sp -Sd options to the IDL compiler seems to
fix it, but the documentation says this is for "pure clients" only, so I don't
know what additional trouble may occur down the line.
Comment 1 Tim Bomgardner 2007-03-06 11:19:02 CST
Created attachment 674
Zip containing client and two servers, VS6 project files.
Comment 2 Tim Bomgardner 2007-03-06 11:38:04 CST
To run the testcase, unzip to a convenient location, then build with VS6. 
"Server" must run on one machine, "Server2" on another (no need to copy files;
just share and "map network drive" the drive on the first computer to the
second).  Launch Server2, then Server.  Server will ask for the hostname for
Server2.  Launch Client.  Client will invoke operationA on Server, which in turn
will attempt to invoke operationA on Server2.  All three executables are Windows
"console apps", meaning you'll get a DOS-like command window for each.  If/when
Server fails in its attempt to connect to Server2, you should see the results in
the associated window.
Comment 3 Johnny Willemsen 2007-03-07 02:38:56 CST
I have added an MPC file and can build this.  The code needs some more work
before it can go into the repository, because we don't allow iostream usage in
test code.  Once it is in the repo we will have to see when we can test this.
Comment 4 Johnny Willemsen 2007-03-08 04:47:26 CST
I have tested this with a Linux host running Server2 and a Windows host running
Server and Client.  This works without problems with the latest svn head version.
The test itself is not really portable.  It would take some work to get it to a
level where it can be added to the repo, but we can't run it automatically
because we need multiple hosts to test this.

I am going to accept this issue only because the test has to be cleaned up and
added to the repo.
Comment 5 Johnny Willemsen 2007-03-13 04:09:35 CDT
Lowering priority because this is working on head; we only want to integrate the
regression test.