Text File localisation and encodings
Moderated by: Renate.Reinartz, Markus.Kreisel, Jaakko.Salmenius, Ilkka.Salmenius

bikemike Member
I have just run into a problem with the customised NSIS installer strings used to deploy my application. In development all went well: installing in a non-original language, such as Russian, displayed the translated strings in Russian text. Now, testing and installing on other PCs, the strings are shown as a series of repeated characters, something like this:

πïSπïSπïS

The original text, for example, is this:

is not installed

I think that on a development PC without specific Russian support, if I open the file created locally, I see this:

? ?????????????

If Russian support is available when the file is created, then I see this:

íå èíñòàëëèðîâàí

If Russian is available both at the time of creation and when viewing, I see this:

не инсталлирован

I just looked at the Sisulizer project and see that the encoding for the Russian file is Windows Cyrillic (1251). I also noticed that the C# program I use to post-process the file (it regionalises the header line for each localised file - ;!insertmacro LANGFILE "Neutral" "Neutral") does not specify any encoding.

What should I be doing to ensure I always see the Russian text when installing in Russian? Do I change the encoding in the Sisulizer project, and/or specify an encoding for the file handling in my C# app?

bikemike Member
OK, some views but no replies... perhaps I didn't phrase a good question. What about looking at it this way: what is required for the Russian install to work? In particular, it installed with readable Russian on my PC in the past, but now it does not, so I have changed something...

Should I keep the ANSI encoding in Sisulizer and have my C# app set the ANSI encoding for each of the installer custom strings files it edits, so as to keep the ANSI encoding? And will this work? Is there something else that needs to be done in Sisulizer, NSIS, the build PC or the install PC in order for either of these to work?

By the way, I also have to consider Chinese, Korean, and other languages, mostly European.

Many thanks. I've read this, but I'm still not sure where the problem lies in my case: http://www.joelonsoftware.com/articles/Unicode.html

Markus.Kreisel Administrator
Hi Mike, sorry for the late answer. There are many things that can go wrong with code page conversions. The problem simply is that an 8-bit Cyrillic file uses the same 256 possible values that an English or German file uses. The same char value, e.g. 192, can represent two different chars. If you load a file made for a Cyrillic code page, you have to have the Cyrillic code page on the target system.

Sisulizer always uses Unicode internally. If it writes out an 8-bit file, it has to decide which code page to use. You can see which one is the default under Project - Edit Source -> Encodings.

If .NET reads the 8-bit file, it does not know in which code page the file was stored. 8-bit text files do not have headers where this information is stored. It does not know whether it should default to a German umlaut or a Russian char. It simply assumes that on German systems it must be an umlaut and that on Russian systems it is Cyrillic. This is the reason we often see Mojibake (http://en.wikipedia.org/wiki/Mojibake) :-(

If you use the Windows conversion routines, you might often see this: ?????. The reason is simple. If you have Unicode containing German umlauts and you convert it into 8-bit Cyrillic, the converter simply does not know what to do. A German ö is not part of Cyrillic. In this case it writes ? for every char it cannot convert correctly. The downside: once it writes a ?, the original information is lost and cannot be converted back anymore.

I'm glad to hear that you can use UTF-8 for your files. Yes, this is the solution for your problem. Convert your source file into UTF-8 or even UTF-16. You can use Windows Notepad to do that: simply change the file type in Notepad's Save As dialog. Notepad will write a Byte Order Mark. If you read the article you linked to, you now know what that is. Sisulizer will then also default to UTF-8 or UTF-16 for your file. If you use Scan for Changes on an existing file, you might have to change it manually in the settings I mentioned above.
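
The code page ambiguity and the lossy ? substitution described above can be reproduced in a few lines. This is a minimal Python sketch, not code from this thread; it assumes that Python's built-in cp1251 (Windows Cyrillic) and cp1252 (Windows Western European) codecs stand in for the corresponding Windows code pages, and all variable names are illustrative.

```python
# Byte value 192 has no meaning by itself; the code page decides the character.
raw = bytes([192])
as_western = raw.decode("cp1252")   # 'À' (U+00C0, Latin capital A with grave)
as_cyrillic = raw.decode("cp1251")  # 'А' (U+0410, Cyrillic capital A)

# Russian text stored as Windows-1251 but read back as Windows-1252
# yields the accented-letter mojibake quoted in the first post.
russian = "не инсталлирован"
mojibake = russian.encode("cp1251").decode("cp1252")  # 'íå èíñòàëëèðîâàí'

# Converting to a code page that lacks a character substitutes '?'.
# After that, the original character can no longer be recovered.
lossy = "ö".encode("cp1251", errors="replace")  # b'?'
round_trip = lossy.decode("cp1251")             # '?', the ö is gone
```

Reversing the second step (encoding the mojibake as cp1252 and decoding it as cp1251) recovers the Russian text, because no information was lost there. The ? case is different: the byte written to the file is a real question mark.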

If you use UTF-16, .NET does not need to convert at all. UTF-8, on the other hand, produces smaller files. Both can handle Unicode and should put an end to your mojibake problems.

Markus
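
To put rough numbers on the size remark, here is a small Python sketch (an illustration only, not part of the original reply) that compares the two encodings for the strings quoted in this thread and shows the Byte Order Mark. The helper name `encoded_sizes` is made up for this example.

```python
import codecs

english = "is not installed"
russian = "не инсталлирован"

def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count), both without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

english_sizes = encoded_sizes(english)  # (16, 32): ASCII needs one byte per char in UTF-8
russian_sizes = encoded_sizes(russian)  # (31, 32): Cyrillic needs two bytes per letter in UTF-8

# The "utf-8-sig" codec prepends the UTF-8 Byte Order Mark (EF BB BF) when
# encoding and strips it again when decoding.
with_bom = russian.encode("utf-8-sig")
has_bom = with_bom.startswith(codecs.BOM_UTF8)  # True
decoded = with_bom.decode("utf-8-sig")          # BOM removed, text intact
```

For mostly Latin text UTF-8 is about half the size of UTF-16; for Cyrillic text the two come out nearly equal.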
____________________ http://www.sisulizer.com - Three simple steps to localize

bikemike Member
... some time later ...

Would that the installer did handle UTF-8. It did not. I have since discovered that it is all ANSI and that there is a side project for Unicode. This was the missing link. So I reverted to ANSI and simply changed my intermediate process to both read and write in specified code pages. Thus far it appears to work. Thanks for your time.
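
For anyone landing here later, the "read and write in specified code pages" step can be sketched roughly as follows. This is a hedged Python illustration, not the actual C# post-processor from this thread: the `CODE_PAGES` keys, the function name `rewrite_header` and the replacement header text are all hypothetical. In C# the same idea would be to pass an explicit `Encoding` (for example `Encoding.GetEncoding(1251)`) to both the reader and the writer instead of relying on the system default.

```python
# Assumed mapping from a language label to its Windows ANSI code page,
# expressed as Python codec names. The labels are hypothetical.
CODE_PAGES = {
    "Russian": "cp1251",            # Windows Cyrillic
    "German": "cp1252",             # Windows Western European
    "Korean": "cp949",
    "SimplifiedChinese": "cp936",
}

def rewrite_header(path, codec, new_header):
    """Replace the first line of a language file, reading and writing
    with the same explicitly named code page."""
    # newline="" leaves the file's own line endings untouched.
    with open(path, "r", encoding=codec, newline="") as f:
        lines = f.readlines()
    if not lines:
        return  # nothing to rewrite in an empty file
    first = lines[0]
    ending = first[len(first.rstrip("\r\n")):]  # keep the original line ending
    lines[0] = new_header + ending
    with open(path, "w", encoding=codec, newline="") as f:
        f.writelines(lines)
```

A call such as `rewrite_header("Russian.nsh", CODE_PAGES["Russian"], "; hypothetical regionalised header")` (file name invented for the example) changes only the first line; the translated strings below it are written back as the same bytes they were read from.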