Please report new issues at https://github.com/DOCGroup
This bug was reported by an OCI customer. It was found on VxWorks but is not limited to VxWorks: it can occur on any RTE platform, or on standard platforms when the -ORBEndPoint option is used.

The problem is in the Notification Service. It is an edge case that can be reproduced with the following scenario. Start a normal Notification Service on a host (not VxWorks). Start a consumer on VxWorks and a producer on a host (not the same host as the consumer, but it can be the same host as the Notification Service). The producer sends a continuous stream of messages to the consumer. The stream does not have to be very fast, only fast enough that at least one message is sent during a VxWorks reboot.

Two paths are possible from here.

In the first, the VxWorks machine is rebooted while the Notification Proxy is sending data to the consumer. The Notification Service transport detects that the connection to the consumer, which is used for the upcalls that deliver the data, has failed, and the transport cache entry and the Proxy are cleaned up.

In the second, the consumer first disconnects from the Notification Service and then reboots. In this case the Notification Service transport does not detect the failure, and the upcall connection remains in the transport cache. When the new consumer connects back to the Notification Service, a new Proxy is created. When this new Proxy tries to send its first message to the consumer, it finds the old connection left in the transport cache and tries to use it. The old connection is no longer valid and the send fails, which causes the Proxy to be deleted, so no messages flow to the new consumer.

This problem can be fixed by validating any existing connection to the consumer during Proxy initialization. A bad connection is then found and cleaned up at that point, so when the new Proxy sends real data it creates a new, good connection and the new consumer receives data.
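The second path and the proposed fix can be illustrated with a small toy model. This is only a sketch and does not use the TAO API: Connection, TransportCache, Proxy, new_consumer_receives and the endpoint string are all hypothetical names made up for illustration. It also assumes the restarted consumer comes back on the same endpoint, which is presumably why a fixed -ORBEndPoint makes the problem reproducible on other platforms.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a cached transport connection to a consumer.
struct Connection {
  bool alive = true;  // becomes false once the peer has gone away
  bool send() { return alive; }
};

// Hypothetical transport cache keyed by consumer endpoint.
class TransportCache {
public:
  // Reuse a cached connection if one exists, otherwise open a new one.
  std::shared_ptr<Connection> find_or_create(const std::string& endpoint) {
    auto it = cache_.find(endpoint);
    if (it != cache_.end()) return it->second;
    auto conn = std::make_shared<Connection>();
    cache_[endpoint] = conn;
    return conn;
  }
  // Drop the cached connection for this endpoint if it is no longer usable.
  void validate(const std::string& endpoint) {
    auto it = cache_.find(endpoint);
    if (it != cache_.end() && !it->second->alive) cache_.erase(it);
  }
  void purge(const std::string& endpoint) { cache_.erase(endpoint); }

private:
  std::map<std::string, std::shared_ptr<Connection>> cache_;
};

// Hypothetical proxy that pushes events to one consumer endpoint.
class Proxy {
public:
  Proxy(TransportCache& cache, std::string endpoint, bool validate_on_init)
      : cache_(cache), endpoint_(std::move(endpoint)) {
    // The proposed fix: check any existing connection before real data flows.
    if (validate_on_init) cache_.validate(endpoint_);
  }
  // Returns false if the send failed; a failed send destroys the proxy.
  bool push() {
    if (destroyed_) return false;
    auto conn = cache_.find_or_create(endpoint_);
    if (!conn->send()) {
      cache_.purge(endpoint_);
      destroyed_ = true;
      return false;
    }
    return true;
  }

private:
  TransportCache& cache_;
  std::string endpoint_;
  bool destroyed_ = false;
};

// Replays the "disconnect, then reboot" path and reports whether the
// first push to the reconnected consumer succeeds.
bool new_consumer_receives(bool validate_on_init) {
  TransportCache cache;
  const std::string endpoint = "consumer-host:12345";  // made-up endpoint
  {
    Proxy old_proxy(cache, endpoint, validate_on_init);
    bool ok = old_proxy.push();  // normal delivery before the reboot
    assert(ok);
    (void)ok;
  }  // consumer disconnects cleanly; the connection stays in the cache

  // Consumer reboots: the cached connection is dead, but nothing noticed.
  cache.find_or_create(endpoint)->alive = false;

  Proxy new_proxy(cache, endpoint, validate_on_init);
  return new_proxy.push();
}
```

In this model, new_consumer_receives(false) reproduces the failure (the new proxy picks up the stale cached connection and is destroyed on its first push), while new_consumer_receives(true) succeeds because the stale entry is removed during proxy construction.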
The Notification Service is started with the command-line option -UseSeparateDispatchingORB 1 and a svc.conf file containing static TAO_CosNotify_Service "-DispatchingThreads 1". Because this problem can also occur on non-VxWorks platforms when the consumer uses the -ORBEndPoint option, it should be possible to create a regression test. I will attempt to create such a test and add it after the current beta is complete.