Compressing arrays of integers while keeping fast indexing

While adding support for editing and viewing UTF-8 encoded text to HxD’s hex editor control, it turned out that I have to query Unicode property tables that go beyond the basic ones included with Delphi (and most other languages / default libraries).

Parsing the structured text files provided by the Unicode Consortium at each startup is too inefficient, and merely storing the parsed data in a simple integer array wastes too much memory.

A more efficient storage scheme uses a dictionary-like approach that compresses the needed data through a few layers of indirection, while still giving array-like performance with constant (and negligible) overhead.
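As a rough illustration of the general idea (not necessarily the layout HxD ends up using), a two-stage table splits the array into fixed-size blocks, stores each distinct block only once, and keeps a small index that maps block numbers to stored blocks. The block size of 256 and the names `compress` and `lookup` are assumptions made for this sketch.

```python
BLOCK_SIZE = 256  # assumed block size, chosen only for illustration


def compress(values, block_size=BLOCK_SIZE):
    """Split values into blocks and store every distinct block once.

    Returns (index, blocks): index[i] is the position in blocks of the
    block that covers values[i * block_size : (i + 1) * block_size].
    """
    blocks = []        # unique blocks, in order of first appearance
    seen = {}          # block contents -> position in blocks
    index = []         # first stage: one entry per block of the input
    for start in range(0, len(values), block_size):
        block = tuple(values[start:start + block_size])
        if block not in seen:
            seen[block] = len(blocks)
            blocks.append(block)
        index.append(seen[block])
    return index, blocks


def lookup(index, blocks, i, block_size=BLOCK_SIZE):
    """Constant-time access: two array reads instead of one."""
    return blocks[index[i // block_size]][i % block_size]
```

Tables with long runs of identical values, which is typical for per-code-point properties, shrink considerably, because all identical blocks share one stored copy while a lookup still costs only two indexing operations.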

In the following, I’ll briefly present the solution I found.

Continue reading

Solving logic game Uluru with Prolog

Before starting to learn Prolog, I used various logic-based systems, such as the SPIN model checker, or reasoners that work on ontologies encoded in OWL. I used the latter to reason about (visual) objects in RoboCup.

Prolog, however, seems to encode many problems in a more natural and fluent way, so I set out to make a few toy examples to test how well I could make it work and to get a feel for its advantages and limitations.

Many concepts in AI are implicitly based on specific formulations / terminology as used for Prolog or its derivatives. Vague-sounding words / expressions, often taken from general contexts, really mean something rather specific, and learning about Prolog sharpens the understanding of these wordings.

Continue reading

Understanding neural nets

There has been interesting research on making machine learning models more understandable, such as “Unmasking Clever Hans predictors and assessing what machines really learn”. Also see practical implementations of this approach:

A fully functioning DIY bread-board computer

Ben Eater has created an excellent 8-bit computer that is true to the essential architecture of modern computers, yet is simple enough to fit on a few breadboards. It uses DIP-switches and push buttons as inputs, and LEDs and 7-segment displays as (debug) outputs. Even step-wise execution by stepping the computer clock is possible, such that every part of the computer can be observed as it functions and the internal state and memory can be modified by switches.

Continue reading

How pull-up resistors really work

There are many explanations of pull-up or pull-down resistors that gloss a bit too much over the details, keeping you in doubt about how they really work, especially in conjunction with microcontrollers.

To improve our understanding, we will use a simplified schematic that models a microcontroller input pin connected to a switch and a pull-up resistor. Then we calculate the voltages on the input pin resulting from a closed or open switch.
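The kind of calculation involved can be sketched as a voltage divider. This is a minimal model under stated assumptions, not the schematic from the full post: the input pin is treated as a single large resistance to ground, the switch connects the pin to ground, and all component values (5 V supply, 10 kΩ pull-up, 10 MΩ input resistance) are hypothetical.

```python
def pin_voltage(vcc, r_pullup, r_input, switch_closed, r_switch=0.0):
    """Voltage at the input pin of a simple pull-up circuit.

    Assumed model: r_pullup goes from vcc to the pin, r_input goes from
    the pin to ground (the pin's input resistance), and the switch with
    resistance r_switch connects the pin to ground when closed.
    """
    if switch_closed:
        if r_switch == 0.0:
            return 0.0  # ideal switch shorts the pin to ground
        # Switch and input resistance are in parallel to ground.
        r_low = (r_switch * r_input) / (r_switch + r_input)
    else:
        r_low = r_input
    # Voltage divider between the pull-up and the path to ground.
    return vcc * r_low / (r_pullup + r_low)


# Hypothetical values: 5 V supply, 10 kOhm pull-up, 10 MOhm input resistance.
v_open = pin_voltage(5.0, 10e3, 10e6, switch_closed=False)
v_closed = pin_voltage(5.0, 10e3, 10e6, switch_closed=True)
```

With these values the open switch leaves the pin just below the supply voltage (read as high), because the pull-up is tiny compared to the input resistance, and the closed switch pulls the pin to ground (read as low).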

Continue reading

Character encoding confusion

HxD will extend character encoding support, and I am looking for the best way to name character encodings. So far, you can only pick between the following four to affect the text display in the editor window:

  • Windows (ANSI)
  • DOS/IBM-PC (OEM)
  • Macintosh
  • EBCDIC
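Why the choice matters can be seen by decoding the same byte under each of the four families above. The sketch below uses Python's built-in codecs; picking `cp1252`, `cp437`, `mac_roman`, and `cp037` as representatives is an assumption for illustration, since each family (ANSI, OEM, EBCDIC) contains many code pages and this is not how HxD implements it.

```python
# One representative code page per family; the specific picks are assumptions.
ENCODINGS = {
    "Windows (ANSI)": "cp1252",
    "DOS/IBM-PC (OEM)": "cp437",
    "Macintosh": "mac_roman",
    "EBCDIC": "cp037",
}


def decode_all(raw):
    """Decode the same bytes once per encoding and return the results."""
    return {name: raw.decode(codec) for name, codec in ENCODINGS.items()}


# The single byte 0xE4 turns into a different character in each encoding.
results = decode_all(b"\xE4")
for name, text in results.items():
    print(f"{name:18} {text}")
```

Labels such as “ANSI” or “OEM” therefore identify a family rather than one specific mapping, which is part of what makes naming the encodings clearly so difficult.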

Additionally, in the Search window, Unicode (UCS-2LE) can be selected using a checkbox to override the current editor window encoding. I’d like the character encoding selection to be more uniform, flexible, and clear in the future.

Continue reading

How to understand raw data in a hex editor?

All data in a computer, including files, is a sequence of numbers. But almost no program other than a hex editor shows data in such a raw format, which can make this concept pretty confusing and abstract. (This actually was one of the motivations for me to write such a program: to understand data and representations better.)

Data encoding, decoding and representation is a big topic, but for many applications of hex editors a few concepts are enough. We’ll start with a brief answer to this question: how do I make sense of (hexadecimal) numbers in a hex editor?

These or similar formulations seem to be popular variations of the above question:

How do you translate hex to English?

Can I change the text to English (or another language)?

What are those ‘random’ numbers on the left in the hex editor?

How do you know if the text representation on the right in a hex editor is valid?
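A tiny hex dump helps to see what the three columns of a typical hex editor contain. This is a simplified sketch, not HxD's code: the function name `hex_dump` and the choice to show only printable ASCII in the text column are assumptions.

```python
def hex_dump(data, width=16):
    """Format bytes as lines of: offset, hex values, text interpretation."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Middle column: every byte as a two-digit hexadecimal number.
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Right column: the same bytes read as ASCII; others shown as ".".
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Left column: position of the line's first byte within the data.
        lines.append(f"{offset:08X}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return lines


for line in hex_dump(b"Hello, hex!\n", width=8):
    print(line)
```

The numbers on the left are offsets, that is, positions in the data, and the text on the right is only one possible interpretation of the same bytes shown in the middle.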

Continue reading

Reading process memory / RAM

One of the most common questions I get regarding HxD is about obtaining source code to understand and reproduce some of its basic functionality. The most frequent points of interest are how the RAM editor reads / writes memory of other running programs (i.e., processes) and how the disk editor reads from / writes to storage media.

Yesterday, when Mohamed A. Mansour, who will teach an Operating Systems class, suggested this information might be useful for his students, I thought it was a good opportunity to start my blog and write it up in a proper way.

This post is about reading memory that belongs to another running program (a.k.a. a process), and will be the first in a series of posts about the implementation and design of HxD.

Continue reading