Bug 2289 - Collocated optimisation doesn't work after location forward
Summary: Collocated optimisation doesn't work after location forward
Status: RESOLVED FIXED
Alias: None
Product: TAO
Classification: Unclassified
Component: ORB (show other bugs)
Version: 1.4.7
Hardware: All All
: P3 normal
Assignee: Simon McQueen
URL:
Depends on: 1495
Blocks: 1919
  Show dependency tree
 
Reported: 2005-10-31 10:23 CST by Simon McQueen
Modified: 2006-02-23 04:40 CST (History)
0 users

See Also:


Attachments
Bug_2289_Regression.tar.gz - test code (6.16 KB, application/x-gzip)
2005-10-31 10:30 CST, Simon McQueen
Details
bugzilla_2289.patch - updated unidiff patch for this issue (25.85 KB, patch)
2006-01-11 09:31 CST, Simon McQueen
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Simon McQueen 2005-10-31 10:23:55 CST
If a client receives a location forward response directing it to an object that
exists within the client process, the client does not detect this collocation
and continues to use remote invocation.

May well be the cause of the problems seen in bug #1919.

This seems to fix it:

Index: tao/Invocation_Adapter.cpp
===================================================================
RCS file:
/project/cvs-repository/ACE_wrappers-repository/TAO/tao/Invocation_Adapter.cpp,v
retrieving revision 1.21
diff -U15 -r1.21 Invocation_Adapter.cpp
--- tao/Invocation_Adapter.cpp  10 Jun 2005 21:26:18 -0000      1.21
+++ tao/Invocation_Adapter.cpp  31 Oct 2005 16:16:03 -0000
@@ -370,26 +370,35 @@
     // The object pointer has to be changed to a TAO_Stub pointer
     // in order to obtain the profiles.
     TAO_Stub *stubobj =
       effective_target->_stubobj ();
 
     if (stubobj == 0)
       ACE_THROW (CORBA::INTERNAL (
         CORBA::SystemException::_tao_minor_code (
           TAO_INVOCATION_LOCATION_FORWARD_MINOR_CODE,
           errno),
         CORBA::COMPLETED_NO));
 
 
     // Reset the profile in the stubs
     stub->add_forward_profiles (stubobj->base_profiles ());
+    
+    if (effective_target->_is_collocated ())
+      {
+        this->target_->set_collocated_servant (effective_target->_servant ());
+        if (stubobj->servant_orb_var ())
+          {
+            stub->servant_orb (stubobj->servant_orb_var ());
+          }
+      }
 
     if (stub->next_profile () == 0)
       ACE_THROW (CORBA::TRANSIENT (
         CORBA::SystemException::_tao_minor_code (
           TAO_INVOCATION_LOCATION_FORWARD_MINOR_CODE,
           errno),
         CORBA::COMPLETED_NO));
 
     return;
   }
 } // End namespace TAO

XRef: Scarab TAO#135
Comment 1 Simon McQueen 2005-10-31 10:30:33 CST
Created attachment 410 [details]
Bug_2289_Regression.tar.gz - test code
Comment 2 Johnny Willemsen 2005-10-31 13:19:43 CST
Hi Simon,

I fixed issues in this area April 2004. From what I remember the target is the
object the initial invocation is done on, the effective_target the one we got
forwarded to. 

Thu Apr 22 06:48:13 UTC 2004  Johnny Willemsen  <jwillemsen@remedy.nl>

        * tao/Collocated_Invocation.cpp:
        * tao/Collocated_Invocation.h:
          Swapped the first arguments to target and then effective_target
          so that this matches the signature of the base class.

        * tao/Invocation_Adapter.cpp:
        * tao/Invocation_Adapter.h:
          Updated for the signature change of Collocated_Invocation. Also
          when we have a collocated proxy broker set we could by theory
          use a collocated invocation, so then use
          TAO_ORB_Core::collocation_strategy to see how the collocation
          strategy is set. Dependent on this we do a collocated or remote
          invocation. If is now possible that a remote invocation that
          gets a forward location to a collocated servant now switches
          from the remote invocation path to the collocated invocation path.
          This fixes Bug 1495. Thanks to Bala for helping me with this.

Comment 3 Johnny Willemsen 2005-10-31 13:52:29 CST
I attached a debugger, it seems the _narrow results in a _is_a, which returns a
forward location, this is then handled correctly and is forwarded, but then the
next operation is again being done remotely. 

It looks this is a special case because the regression puts the collocated ior
in its ior table. When this isn't done (for example location forward in an
interceptor), then my tests show that things work fine. 

The patch looks strange on the first sight, why change the target and
servant_orb, why does it work for the normal case.
Comment 4 Johnny Willemsen 2005-10-31 14:47:46 CST
I think the problem is more in the area of Narrow_Utils<T>::unchecked_narrow.
The _is_a call is detected as forward location, but then this information is not
stored correctly. When a location forward on a normal operation is done, it
works fine. The Narrow_Utils<T> is something special in this case, which we else
don't get. 
Comment 5 Johnny Willemsen 2005-11-01 02:18:24 CST
Simon, can you add the regression now to cvs so that we an look at the bugfix
more easier?
Comment 6 Johnny Willemsen 2005-11-01 02:18:49 CST
added blocks
Comment 7 klyk 2005-11-01 03:24:00 CST
I agree that solving this problem will likely solve 1919. Although note that 
the solution that we submitted circumvents interaction with the ImR and hence 
the location_forward, for collocated servants, and is therefore more optimal 
than the solution you describe.  
 
It seems a shame to have this overhead when we are talking about collocation 
optimization. Considering that proxy objects are usually short-lived entities, 
used for only a few request invocations, the overhead of repeatedly 
communicating with the ImR (whenever a new object proxy is created and setup) 
may be seen by some as an unacceptable inefficiency, which negates the 
optimization for collocation. It doesn't seem fully optimal if we just take 
the location forward into account. 
Comment 8 Johnny Willemsen 2005-11-01 03:31:18 CST
This will cause all of the above changes to be overwritten, except for the
changes Simon, also what should be tested is forwarding from collocated to remote.

Btw, I tested 1495. It seems we get forwarded from remote to the collocated
servant correctly the first call, the calls after that we call the new servant,
but use the remote path. No idea why this happens, tested this in the past and
then it worked. 
Comment 9 Simon McQueen 2005-11-01 03:49:15 CST
> ------- Additional Comments From jwillemsen@remedy.nl  2005-11-01 02:18 -------
> Simon, can you add the regression now to cvs so that we an look at the bugfix
> more easier?

Tue Nov  1 09:35:36 2005  Simon McQueen  <sm@prismtech.com>

        * tests/Bug_2289_Regression/Bug_2289_Regression.mpc:
        * tests/Bug_2289_Regression/MyInterfaceImpl.cpp:
        * tests/Bug_2289_Regression/MyInterfaceImpl.h:
        * tests/Bug_2289_Regression/MyStruct.idl:
        * tests/Bug_2289_Regression/MyStructC.cpp:
        * tests/Bug_2289_Regression/MyStructC.h:
        * tests/Bug_2289_Regression/MyStructC.inl:
        * tests/Bug_2289_Regression/MyStructS.cpp:
        * tests/Bug_2289_Regression/MyStructS.h:
        * tests/Bug_2289_Regression/MyStructS.inl:
        * tests/Bug_2289_Regression/MyStructS_T.cpp:
        * tests/Bug_2289_Regression/MyStructS_T.h:
        * tests/Bug_2289_Regression/MyStructS_T.inl:
        * tests/Bug_2289_Regression/README:
        * tests/Bug_2289_Regression/Test.idl:
        * tests/Bug_2289_Regression/client.cpp:
        * tests/Bug_2289_Regression/run_test.pl:
        * tests/Bug_2289_Regression/server.cpp:

          Added regression test for this bug. Bug is not yet fixed so it
          *will* fail.
Comment 10 Johnny Willemsen 2005-11-01 03:54:05 CST
Updated bug 1495 with some extra tests, it is failing now also. Removed the
generated files from the repo for bug 2289
Comment 11 Johnny Willemsen 2005-11-01 03:56:08 CST
Simon, in case the forwarded one fails at some moment, with your patch, can we
then fallback on the original?
Comment 12 Simon McQueen 2005-11-01 04:15:33 CST
> ------- Additional Comments From jwillemsen@remedy.nl  2005-11-01 03:54 -------
> Updated bug 1495 with some extra tests, it is failing now also. Removed the
> generated files from the repo for bug 2289
> 

Hi Johnny,

I think you've missed the point of the test - please see the README file:

"2/ A definition of the struct (MyStruct.idl). The stub / skel code for this has
been pre-generated and altered such that it will fail to marshal to / from a CDR
stream (see MyStructC.cpp)."
...and...
"If the fix is present the client ORB will pass the MyStruct object by reference
- if the fix is not present it will attempt to marshal it and the hand-broken
stub code will fail, causing the test to fail."

The handcrafted committed stub code is the whole reason that this test is able
to work. The build file does not generate code from MyStruct.idl - if it did you
would have a test that always passes.
Comment 13 Johnny Willemsen 2005-11-01 04:19:00 CST
Ok, sorry that I removed the files, but keeping generated files in the repo is
really a maintenance nightmare. Especially if we change generation (and I will
do that soon again) then we have to regenerate all these files. I have now been
able to regenerate the problem with bug 1495. Because this regression will
deliver us a maintenance problem I propose or to change this test so that we can
build it normally or back it out again. 
Comment 14 klyk 2005-11-10 06:10:35 CST
I tried out the patch quoted at the beginning (to Invocation_Adapter.cpp) but 
it doesn't appear to work. Here is my stack as proof: 
 
#0  0xffffe410 in ?? () 
#1  0xbfff6b08 in ?? () 
#2  0x00000400 in ?? () 
#3  0xbfff6bf0 in ?? () 
#4  0x412f3f63 in __read_nocancel () from /lib/tls/libpthread.so.0 
#5  0x40cd72f0 in recv (handle=18, buf=0xbfff6bf0, n=1024, timeout=0x0) at 
OS_NS_unistd.inl:869 
#6  0x40af666b in TAO_IIOP_Transport::recv (this=0x80a1bf8, buf=0xbfff6bf0 
"0eL@", len=1024, max_wait_time=0x0) 
    at IPC_SAP.inl:18 
#7  0x40b7b181 in TAO_Transport::handle_input (this=0x80a1bf8, rh=@0xbfff7040, 
max_wait_time=0x0) at Message_Block.inl:365 
#8  0x40b9b095 in TAO_Wait_On_Read::wait (this=0x80a0b90, max_wait_time=0x0, 
rd=@0xbfff7130) at Wait_On_Read.cpp:43 
#9  0x40b5fb52 in TAO::Synch_Twoway_Invocation::wait_for_reply 
(this=0xbfff7400, max_wait_time=0x0, rd=@0xbfff7130, 
    bd=@0xbfff7110) at Transport.inl:42 
#10 0x40b60022 in TAO::Synch_Twoway_Invocation::remote_twoway 
(this=0xbfff7400, max_wait_time=0x0) 
    at Synch_Invocation.cpp:162 
#11 0x40afc3b3 in TAO::Invocation_Adapter::invoke_twoway (this=0xbfff7640, 
details=@0xbfff75a0, 
    effective_target=@0xbfff7540, r=@0xbfff74d0, max_wait_time=@0xfffffe00) at 
Invocation_Adapter.cpp:310 
#12 0x40afc823 in TAO::Invocation_Adapter::invoke_remote_i (this=0xbfff7640, 
stub=0x80a18e0, details=@0xbfff75a0, 
    effective_target=@0xbfff7540, max_wait_time=@0xbfff753c) at 
Invocation_Adapter.cpp:277 
#13 0x40afc91e in TAO::Invocation_Adapter::invoke_i (this=0xbfff7640, 
stub=0x80a18e0, details=@0xbfff75a0) 
    at Invocation_Adapter.cpp:85 
#14 0x40afca99 in TAO::Invocation_Adapter::invoke (this=0xbfff7640, 
ex_data=0xfffffe00, ex_count=4294966784) 
    at Invocation_Adapter.cpp:44 
#15 0x40b4f60c in TAO::Remote_Object_Proxy_Broker::_is_a (this=0x40c26f90, 
target=Internal: global symbol `Object' found inObject.cpp psymtab but not in 
symtab. 
Object may be an inlined function, or may be a template function 
(if a template, try specifying an instantiation: Object<type>). 
) at Remote_Object_Proxy_Broker.cpp:38 
#16 0x40b1ad0a in CORBA::Object::_is_a (this=0x80a1ab0, type_id=0x4020bef9 
"IDL:MO_Agent:1.0") at Object.cpp:216 
#17 0x401f1dd9 in TAO::Narrow_Utils<MO_Agent>::narrow (obj=Variable "obj" is 
not available. 
) at Object_T.cpp:27 
#18 0x401e328b in MO_Agent::_narrow (_tao_objref=0x80a1ab0) at 
MObjectC.cc:5207 
#19 0x402d0a51 in Agent (this=0xbfffb970, argc=18, argv=0xbfffbd14) at 
Pseudo_VarOut_T.inl:66 
#20 0x0804c226 in main (argc=18, argv=0xbfffbd14) at Srv_Main.cc:33 
 
The object in question (MO_Agent) has its servant collated. As can be seen the 
request still goes remote. 
 
My code is as follows:  
 
// the agentIORstr contains the endpoints of the IMR 
CORBA::Object_var tmpVar(orb_->string_to_object(agentIORstr)); 
if (CORBA::is_nil(tmpVar.in())) { 
   throw AgentInitFailed("Agent.cc: MO_Agent string_to_object failed"); 
} 
MO_Agent_var moagent = MO_Agent::_narrow(tmpVar.in()); 
 
Comment 15 Simon McQueen 2006-01-04 12:13:34 CST
accepting
Comment 16 Simon McQueen 2006-01-11 09:31:50 CST
Created attachment 440 [details]
bugzilla_2289.patch - updated unidiff patch for this issue
Comment 17 Simon McQueen 2006-01-11 10:11:19 CST
The fix (attached in the previous post) has been updated to deal with the use
case where the stub falls back from a forwarded location to another location of
a differing collocated status. The fix will also now cope with location forwards
(and fallbacks) when more than one ORB is collocated.

The approach can be summarised as:
1/ Moved the collocation indicator, the collocated servant pointer, and the
strategized object proxy reference from CORBA::Object to TAO::Stub. This enables
these values to be changed during the lifetime of an object in response to the
location forward handling (which is dealt with in the stub).
2/ Removed the Object ref parameter of ::initialize_collocated_object on
TAO_Adapter and its implementations so that this method can be invoked from
circumstances when the CORBA::Object(s) related to s tub are not available.

Hi Kostas -

Your stack trace shows the initial is_a request being made remotely. This is the
correct behaviour in the absence of the enhanced ImR reference handling you
proposed on bug 1919. Without the patch for this (bug 2289) bug, both the is_a
during reference narrowing *and* the application method invocation woiuld have
been dispatched remotely, which is incorrect. 
Comment 18 Simon McQueen 2006-02-23 04:40:22 CST
Fixed: Tue Feb 21 17:48:24 UTC 2006  Simon McQueen  <sm@prismtech.com>
Comment 19 Simon McQueen 2006-02-23 04:40:45 CST
Fixed: Tue Feb 21 17:48:24 UTC 2006  Simon McQueen  <sm@prismtech.com>