Please report new issues at https://github.com/DOCGroup
I have two computers, X and Y, running identical servers which implement interface Test with operation A. At one point, the X server operation A attempts to invoke operation A on server Y. The end result is that server X ORB decides that the object is collocated, which it isn't, and an OBJECT_NOT_EXIST exception is thrown. I had originally thought that the 0xdead pointer was the cause, but it turns out that problem simply unmasks another problem related to collocation.

There are two preconditions to recreate the collocation problem (which perhaps violate some rule I'm unaware of):

1. The servers must declare more than one endpoint. In my case, I have
     -ORBEndpoint iiop://localhost:5122 -ORBEndpoint iiop://X:5122
   on machine X and
     -ORBEndpoint iiop://localhost:5122 -ORBEndpoint iiop://Y:5122
   on machine Y. If there is only a single endpoint per server, the problem does not occur.
2. The two servers must be using the same port number. If the port used on machine Y is 5123 instead of 5122, the problem does not occur.

In "real life", there will be zillions of these servers (i.e., more than 64K, so they couldn't have individual port numbers even if we wanted them to). Each machine hosting a server typically hosts a client as well, which connects to its local server as "localhost". Clients can and typically do connect to other servers simultaneously using a specified hostname and the well-known port number. The servers can and typically do connect to each other in a similar fashion.

As for the 0xdead problem, the "bad" pointer is created deliberately by the IDL compiler in interface_ss.cpp/961. The pointer is compared to 0 in Invocation_Adapter.cpp/80, which results in the execution path being diverted into TAO_ORB_Core::collocation_strategy. A strategy of TAO_CS_THRU_POA_STRATEGY will be returned, and ultimately an OBJECT_NOT_EXIST exception will be thrown.
At this point, however, the damage has already been done--the object in question has already been marked as collocated. This appears to occur when the object is first narrowed after a string_to_object call, at ServerSupport.cpp/81 in the attached Server VS project. narrow() makes a call to _is_a in Object_T.cpp/27. A round-trip call to the target server is performed (you'd think there'd be enough local knowledge to not have to do that). In any case, at some point during that process, the ORB erroneously decides that the object is collocated.

The Bugzilla "enter a new bug" page doesn't seem to provide a mechanism to attach a file. I've been told an opportunity will come at a later time. If that's true, there will be a zip file attached by the time you read this. If not, contact me directly and I'll forward the file to you.

I'm using static ACE/TAO libraries, built with the supplied static workspace/project files. My config.h contains

  #define ACE_HAS_WINSOCK2 1
  #define ACE_DISABLE_WIN32_ERROR_WINDOWS
  #define ACE_HAS_STANDARD_CPP_LIBRARY 0
  #define ACE_DISABLE_WIN32_INCREASE_PRIORITY
  #include "ace/config-win32.h"

Note: As a workaround, specifying the -Sp -Sd options to the IDL compiler seems to fix it, but the documentation says this is for "pure clients" only, so I don't know what additional trouble may occur down the line.
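The two preconditions above (more than one endpoint per server, and the same port on both machines) would be consistent with a collocation check that compares endpoints textually, because both servers advertise iiop://localhost:5122. The following is only a minimal sketch of that hypothesis. The Endpoint type and the naive_is_collocated function are hypothetical names made up for illustration; they are not TAO classes, and the report does not confirm that TAO's collocation logic works this way.

```cpp
#include <string>
#include <vector>

// Hypothetical endpoint record for illustration only; not a TAO type.
struct Endpoint {
    std::string host;
    unsigned short port;
};

// Purely textual comparison: "localhost" on machine X and "localhost"
// on machine Y look identical here even though they are different hosts.
inline bool same_endpoint(const Endpoint& a, const Endpoint& b) {
    return a.host == b.host && a.port == b.port;
}

// Assumed (unconfirmed) naive check: treat the target object as collocated
// if ANY endpoint it advertises matches ANY endpoint the local ORB listens on.
bool naive_is_collocated(const std::vector<Endpoint>& local,
                         const std::vector<Endpoint>& target) {
    for (const Endpoint& l : local) {
        for (const Endpoint& t : target) {
            if (same_endpoint(l, t)) {
                return true;
            }
        }
    }
    return false;
}
```

With the endpoint lists from the report, {localhost:5122, X:5122} against {localhost:5122, Y:5122}, this sketch reports a match on localhost:5122. Changing the port on Y to 5123, or dropping the localhost endpoint from both servers, removes the match, which lines up with the two preconditions described above.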
Created attachment 674: Zip containing client and two servers, VS6 project files.
To run the testcase, unzip to a convenient location, then build with VS6. "Server" must run on one machine, "Server2" on another (no need to copy files; just share and "map network drive" the drive on the first computer to the second). Launch Server2, then Server. Server will ask for the hostname for Server2. Launch Client. Client will invoke operationA on Server, which in turn will attempt to invoke operationA on Server2. All three executables are Windows "console apps", meaning you'll get a DOS-like command window for each. If/when Server fails in its attempt to connect to Server2, you should see the results in the associated window.
I have added an MPC file and can build this. The code needs some more work before it can go into the repository; we don't allow iostream usage in test code. Once it is in the repo, we will have to see when we can test this.
I have tested this with a Linux host running Server2 and a Windows host running Server and Client. This works without problems with the latest svn head version. The test itself is not really portable; it would take some work to get it to a level where it can be added to the repo, and we can't run it automatically because we need multiple hosts to test this. I am going to accept this issue just for the fact that the test has to be cleaned up and added to the repo.
Lowering priority because this is working on head; we only want to integrate the regression test.