Please report new issues at https://github.com/DOCGroup
Created attachment 1405: Patch and unit test

Overview:
Using code points above U+007F with the UTF8_Latin1_Translator yields incorrect results. Multibyte sequences are not handled properly when translating from UTF-8 to Latin1. Code points in the range [U+0080 - U+00BF] are incorrectly written as single bytes when translating from Latin1 to UTF-8.

Steps to Reproduce:
1) Run the attached unit test. It starts a client with UTF-8 and a server with Latin1 as the native codeset. The server uses the translator. The client sends two strings: one containing all ASCII code points, the other containing all additional Latin1 code points (those above U+007F). Both strings are encoded in UTF-8. The server translates them to Latin1, translates them back to UTF-8, and sends them back to the client. The client compares the received strings with the ones it sent to check that they are identical.

Actual Results:
The string with ASCII characters is handled correctly; the other string is not.

Expected Results:
Both strings should be handled correctly.

Build Date & Platform:
TAO version 2.1.2, released Sat May 19 14:28:57 CEST 2012, tested on Windows XP

Additional Builds and Platforms:
TAO version 2.0.2, released Wed Apr 20 09:52:52 CEST 2011, tested on AIX 5.3

Additional Information:
I have no idea how to create a patch. I've attached a zip archive containing the unit test, a unified diff for the translator cpp, and a fixed version of the translator cpp.
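
For reference, the conversion the translator is expected to perform can be sketched roughly as below. This is only an illustration of the UTF-8 encoding rules for code points up to U+00FF, not the attached patch and not TAO code. The function names latin1_to_utf8 and utf8_to_latin1 are hypothetical, and it is an assumption of this sketch that anything outside Latin1, or any malformed sequence, is simply rejected.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of TAO): encode Latin1 bytes as UTF-8.
// Every byte >= 0x80 becomes a two-byte sequence, including the range
// 0x80..0xBF that the report says is wrongly written as a single byte.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));                  // ASCII: unchanged
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));    // lead byte 0xC2 or 0xC3
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));  // continuation byte
        }
    }
    return out;
}

// Hypothetical helper (not part of TAO): decode UTF-8 into Latin1.
// Returns false if the input is malformed or contains a code point
// above U+00FF (assumption: such input is rejected, not substituted).
bool utf8_to_latin1(const std::string& in, std::string& out) {
    out.clear();
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned char b0 = static_cast<unsigned char>(in[i]);
        if (b0 < 0x80) {                       // ASCII: unchanged
            out.push_back(static_cast<char>(b0));
            continue;
        }
        // Only lead bytes 0xC2 and 0xC3 encode U+0080..U+00FF.
        if ((b0 != 0xC2 && b0 != 0xC3) || i + 1 >= in.size()) {
            return false;
        }
        unsigned char b1 = static_cast<unsigned char>(in[++i]);
        if ((b1 & 0xC0) != 0x80) {             // must be a continuation byte
            return false;
        }
        out.push_back(static_cast<char>(((b0 & 0x1F) << 6) | (b1 & 0x3F)));
    }
    return true;
}
```

With these helpers, the Latin1 byte 0xA9 would be encoded as the two bytes 0xC2 0xA9 and 0xE9 as 0xC3 0xA9, and decoding either pair gives the original byte back. That is the same round trip the unit test performs through the server.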