My findings were the following:
- In the search window, no conversion is applied automatically and there is no way to select one. In effect, the text/hex given is converted to a binary pattern (text according to Salamander's system ANSI codepage) and then a binary search is performed. Regular expression search is handled accordingly. Upper/lowercase conversion appears to be done for ASCII characters only (i.e. case is not ignored for accented or foreign characters, even in a case-insensitive search).
- Search in the viewer appears to work the same way. However, it is not the bit pattern of the original file that is searched, but the one to which the current conversion has been applied.
- Hex view does not show the values stored in the file, but the ones resulting from the selected conversion, in both the hex values and the textual representation.
- You might have to ignore the BOM if present.
- You need detection code for UTF8/UTF16LE/UTF16BE.
- You need a converter to convert text from those encodings to the (ANSI) codepage used by the textbox control, converting unknown/invalid code points to some invalid-character marker (just as you already use '?' for characters that have no representation in the destination codepage).
- You might need some changes to map character positions to byte positions in the file, as this is no longer a 1:1-relationship.
- You might have to clear selections when switching between conversions that do not share the same character/byte mapping (i.e. to/from UTF8 and between a 1-byte code and a 2-byte code).
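To make the points about BOM handling, detection, conversion and position mapping more concrete, here is a rough sketch in Python. It is only an illustration, not Salamander's actual code: the function names (detect_encoding, to_ansi, char_byte_offsets) and the cp1252 default are my own assumptions, and UTF-16 without a BOM is not detected at all.

```python
import codecs

# BOMs for the three encodings discussed; checked in this order.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)

def detect_encoding(data: bytes):
    """Return (encoding, bom_length); encoding is None if nothing matched."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    # No BOM: treat the data as UTF-8 only if it decodes without errors.
    try:
        data.decode("utf-8")
        return "utf-8", 0
    except UnicodeDecodeError:
        return None, 0

def to_ansi(data: bytes, ansi_codepage: str = "cp1252") -> bytes:
    """Convert UTF-8/UTF-16 file content to the textbox codepage (assumed cp1252)."""
    encoding, bom_len = detect_encoding(data)
    if encoding is None:
        raise ValueError("content is not recognised as UTF-8/UTF-16")
    # Skip the BOM; invalid input sequences become U+FFFD while decoding.
    text = data[bom_len:].decode(encoding, errors="replace")
    # U+FFFD and characters missing from the codepage both end up as '?'.
    return text.encode(ansi_codepage, errors="replace")

def char_byte_offsets(text: str, encoding: str, bom_len: int = 0) -> list:
    """offsets[i] is the file offset of character i; the last entry is the end.

    Only meaningful for text that decoded without replacement characters.
    """
    offsets = [bom_len]
    for ch in text:
        offsets.append(offsets[-1] + len(ch.encode(encoding)))
    return offsets
```

For example, to_ansi(codecs.BOM_UTF8 + "é Ω".encode("utf-8")) keeps the 'é' as a single cp1252 byte and turns the 'Ω' into '?'. The offset table shows why selections cannot simply be carried over between conversions: in UTF-8 the characters of "aé€" occupy 1, 2 and 3 bytes respectively.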
For files that cannot be converted (e.g. non-Unicode files, binary files or files in a different text encoding) you might choose to force the user to switch to a different conversion, or just display some error indication.
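One way to decide whether a file "cannot be converted" is a strict decode that reports where it first fails. The following Python sketch is only an assumption about how such a check could look; first_invalid_offset is a made-up name, not an existing Salamander function.

```python
def first_invalid_offset(data: bytes, encoding: str = "utf-8"):
    """Return the byte offset of the first invalid sequence, or None if the data is valid."""
    try:
        data.decode(encoding, errors="strict")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte that could not be decoded.
        return exc.start
    return None
```

A result of None means the conversion can be applied as is; any other value could be shown in the error indication, or used to suggest a different conversion to the user.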
If possible, you might even consider using a different codepage in the textbox, e.g. to show Cyrillic or Greek text on a western system, but then you might have to disable text search or switch that input control accordingly.
What do you think about it?