German characters in ZIP extracted wrong

TÜV
Posts: 31
Joined: 21 Sep 2009, 14:37

German characters in ZIP extracted wrong

Post by TÜV »

Problem Report:
I have got a ZIP file from a client which contains the German characters Ä ü Ü ö ä in the names of the directories inside the ZIP file.
Attached is this ZIP file with most of its content deleted by Altap Salamander 2.53 beta 1.
changed by Salamander.zip
Characters Ä ü Ü ö ä in this ZIP file are wrong with Salamander only
(536 Bytes) Downloaded 642 times
The directories of this ZIP file can be extracted with Salamander, but the German characters Ä ü Ü ö ä are wrong after extraction.

With WinZip the German characters are shown and extracted correctly.
Testing the attached ZIP file with WinZip 12.0 created the attached report.
Test ZIP changed by Salamander.txt
Test report of the ZIP file created by WinZip 12.0
(7.25 KiB) Downloaded 570 times

From this report I suspect that a character translation performed by WinZip 12.0 is missing in Altap Salamander 2.53 beta 1.
SelfMan
Posts: 1144
Joined: 05 Apr 2006, 20:51

Re: German characters in ZIP extracted wrong

Post by SelfMan »

All this is because of the lack of Unicode support in Salamander.
A partial workaround is to set the Regional and Language Options so that the system supports these characters.
Jan Patera
Plugin Developer
Posts: 707
Joined: 08 Dec 2005, 14:33
Location: Prague, Czech Republic

Re: German characters in ZIP extracted wrong

Post by Jan Patera »

TÜV wrote:I have got a ZIP file from a client which contains the German characters Ä ü Ü ö ä in the names of the directories inside the ZIP file.
Attached is this ZIP file with most of its content deleted by Altap Salamander 2.53 beta 1.
I think it is a bug of all the other tools around or Salamander is too smart. A file created with AS would have Zip2.0/DOS as the creator, while the archive you posted here has Zip1.0/Win32 as the creator. That's the problem: the ZIP header claims that the file names are stored in the ANSI code page (1252), but apparently they are stored in the OEM code page (850). This can be seen from the fact that 0x81 is used for the 'ü' (u-umlaut) character. Salamander is smart enough to do the code page conversion (OEM->ANSI) only in the DOS case, while all the other tools assume the file name is always in the OEM code page.
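
To illustrate the mismatch described above, here is a minimal Python sketch. It only demonstrates the two code pages; it is not Salamander's code. The helper name `decode_name` is hypothetical, and the directory name is taken from the test archive posted later in this thread.

```python
# Directory name taken from the test archive in this thread.
name = "Testverzeichnis_ä_ö_ü"

# Bytes as a writer using the OEM code page (850) would store them.
raw = name.encode("cp850")


def decode_name(data: bytes, codepage: str) -> str:
    """Decode stored name bytes; bytes undefined in the code page become U+FFFD."""
    return data.decode(codepage, errors="replace")


as_oem = decode_name(raw, "cp850")    # treats the bytes as OEM: name is intact
as_ansi = decode_name(raw, "cp1252")  # trusts an "ANSI" claim: umlauts are garbled

print(raw[-1:])   # the single byte stored for the trailing 'ü'
print(as_oem)
print(as_ansi)
```

Decoding the same bytes with the code page the header implies (1252) instead of the one actually used (850) is enough to produce the wrong characters seen after extraction.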

It has nothing to do with Unicode support.

The same problem has been encountered here.

It is also possible that Salamander screwed up something (or rather removed some additional info) when you modified the archive with it, so it would be useful to see the original ZIP file (you can email it directly to us). For example, some ZIP files contain file names stored in several alternative encodings (OEM or ANSI, plus UTF-8), or the file names can be stored directly in UTF-8. While that is not the case here, it is IMHO noteworthy that AS 2.53 beta 1 and later detect UTF-8 if it is the only encoding used to store the file names (i.e. alternative file names are (still) always thrown away).
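
For anyone who wants to check what an archive claims about its own file names, the following is a rough sketch using Python's standard `zipfile` module (an assumption for illustration, not the tool used in this thread). It reports the creator fields and whether the UTF-8 name flag (general purpose bit 11) is set. The function name `describe_names` and the in-memory sample archive are hypothetical.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: file name is stored as UTF-8


def describe_names(data: bytes):
    """Return (filename, create_system, create_version, is_utf8) for each entry."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [
            (info.filename,
             info.create_system,   # host system part of "version made by"
             info.create_version,  # ZIP spec version part, e.g. 20 for 2.0
             bool(info.flag_bits & UTF8_FLAG))
            for info in zf.infolist()
        ]


# Build a small sample archive in memory; Python stores non-ASCII names as UTF-8.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("Testverzeichnis_ä_ö_ü/test.txt", "not empty")

entries = describe_names(buf.getvalue())
for entry in entries:
    print(entry)
```

Running `describe_names` on the bytes of a real archive shows whether the names are flagged as UTF-8 and which creator the header reports, which are the two pieces of information discussed above.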
TÜV
Posts: 31
Joined: 21 Sep 2009, 14:37

Re: German characters in ZIP extracted wrong

Post by TÜV »

Maybe WinZip recognizes from the length of a name and from a special character in it that this cannot be a DOS name, and switches to code page 1252 automatically.

Could you add a selectable option "use codepage 1252 for names in ZIP" in the options of Salamander?

Here is an unchanged test file created by the client:
Testverzeichnis_ä_ö_ü.zip
ZIP-Testfile with German characters in directory name and file name
(180 Bytes) Downloaded 594 times
TÜV
Posts: 31
Joined: 21 Sep 2009, 14:37

Re: German characters in ZIP extracted wrong

Post by TÜV »

Here is the test file with a non-empty file inside:
Attachments
Testverzeichnis_ä_ö_ü.zip
ZIP-Testfile with a not empty file inside
(196 Bytes) Downloaded 575 times
TÜV
Posts: 31
Joined: 21 Sep 2009, 14:37

Re: German characters in ZIP extracted wrong

Post by TÜV »

The client has now told me that he used only Windows XP (no external program) to create the ZIP file.
Jan Patera
Plugin Developer
Posts: 707
Joined: 08 Dec 2005, 14:33
Location: Prague, Czech Republic

Re: German characters in ZIP extracted wrong

Post by Jan Patera »

Jan Patera wrote:I think it is a bug of all the other tools around or Salamander is too smart.
I slightly changed the condition when AS decides to convert filenames from OEM to ANSI codepage. Can I ask you (and all the other enthusiastic AS users around) to test support for accented characters in ZIP archives created by other (non-AS) tools in the upcoming AS 2.53 beta 2 (so that it gets well tested before the official release 2.53)?
TÜV
Posts: 31
Joined: 21 Sep 2009, 14:37

Re: German characters in ZIP extracted wrong

Post by TÜV »

Jan Patera wrote:Can I ask you (and all the other enthusiastic AS users around) to test support for accented characters in ZIP archives created by other (non-AS) tools in the upcoming AS 2.53 beta 2 (so that it gets well tested before the official release 2.53)?
Of course