While adding support for editing and viewing text encoded in UTF-8 to HxD’s hex editor control itself, it turns out I have to query Unicode property tables, that go beyond the basic ones included with Delphi (and most other languages / default libraries).
Parsing the structured text files, provided by the Unicode consortium, at each startup is too inefficient, and merely storing the parsed text into a simple integer array wastes too much memory.
A more efficient storage uses a dictionary-like approach, to compress the needed data using a few layers of indirections, while still giving array-like performance with constant (and negligible) overhead.
In the following, I’ll briefly present the solution I found.
Continue reading