Jump to content

Recommended Posts

Geschrieben

The OLED 128x64 Bricklet 2.0 has an embedded font. Code points 32 through 126 overlap with ASCII.

Is there a table (somewhere) mapping UTF to the other codepoints, where there is a match? E.g. "½" => 171 ?

Geschrieben

Hi,

The embedded font is Code page 437. For the openHAB bindings I've used the following mapping to convert from UTF-16:

static final List<Integer> CP437 = Arrays.asList(0x0000, 0x263A, 0x263B, 0x2665, 0x2666, 0x2663, 0x2660, 0x2022,
        0x25D8, 0x25CB, 0x25D9, 0x2642, 0x2640, 0x266A, 0x266B, 0x263C, 0x25BA, 0x25C4, 0x2195, 0x203C, 0x00B6,
        0x00A7, 0x25AC, 0x21A8, 0x2191, 0x2193, 0x2192, 0x2190, 0x221F, 0x2194, 0x25B2, 0x25BC, 0x0020, 0x0021,
        0x0022, 0x0023, 0x0024, 0x0025, 0x0026, 0x0027, 0x0028, 0x0029, 0x002A, 0x002B, 0x002C, 0x002D, 0x002E,
        0x002F, 0x0030, 0x0031, 0x0032, 0x0033, 0x0034, 0x0035, 0x0036, 0x0037, 0x0038, 0x0039, 0x003A, 0x003B,
        0x003C, 0x003D, 0x003E, 0x003F, 0x0040, 0x0041, 0x0042, 0x0043, 0x0044, 0x0045, 0x0046, 0x0047, 0x0048,
        0x0049, 0x004A, 0x004B, 0x004C, 0x004D, 0x004E, 0x004F, 0x0050, 0x0051, 0x0052, 0x0053, 0x0054, 0x0055,
        0x0056, 0x0057, 0x0058, 0x0059, 0x005A, 0x005B, 0x005C, 0x005D, 0x005E, 0x005F, 0x0060, 0x0061, 0x0062,
        0x0063, 0x0064, 0x0065, 0x0066, 0x0067, 0x0068, 0x0069, 0x006A, 0x006B, 0x006C, 0x006D, 0x006E, 0x006F,
        0x0070, 0x0071, 0x0072, 0x0073, 0x0074, 0x0075, 0x0076, 0x0077, 0x0078, 0x0079, 0x007A, 0x007B, 0x007C,
        0x007D, 0x007E, 0x2302, 0x00C7, 0x00FC, 0x00E9, 0x00E2, 0x00E4, 0x00E0, 0x00E5, 0x00E7, 0x00EA, 0x00EB,
        0x00E8, 0x00EF, 0x00EE, 0x00EC, 0x00C4, 0x00C5, 0x00C9, 0x00E6, 0x00C6, 0x00F4, 0x00F6, 0x00F2, 0x00FB,
        0x00F9, 0x00FF, 0x00D6, 0x00DC, 0x00A2, 0x00A3, 0x00A5, 0x20A7, 0x0192, 0x00E1, 0x00ED, 0x00F3, 0x00FA,
        0x00F1, 0x00D1, 0x00AA, 0x00BA, 0x00BF, 0x2310, 0x00AC, 0x00BD, 0x00BC, 0x00A1, 0x00AB, 0x00BB, 0x2591,
        0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556, 0x2555, 0x2563, 0x2551, 0x2557, 0x255D, 0x255C,
        0x255B, 0x2510, 0x2514, 0x2534, 0x252C, 0x251C, 0x2500, 0x253C, 0x255E, 0x255F, 0x255A, 0x2554, 0x2569,
        0x2566, 0x2560, 0x2550, 0x256C, 0x2567, 0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256B,
        0x256A, 0x2518, 0x250C, 0x2588, 0x2584, 0x258C, 0x2590, 0x2580, 0x03B1, 0x00DF, 0x0393, 0x03C0, 0x03A3,
        0x03C3, 0x00B5, 0x03C4, 0x03A6, 0x0398, 0x03A9, 0x03B4, 0x221E, 0x03C6, 0x03B5, 0x2229, 0x2261, 0x00B1,
        0x2265, 0x2264, 0x2320, 0x2321, 0x00F7, 0x2248, 0x00B0, 0x2219, 0x00B7, 0x221A, 0x207F, 0x00B2, 0x25A0,
        0x00A0);

public static String utf16ToCP437(String utf16) {
    StringBuilder result = new StringBuilder();
    utf16.codePoints().map(c -> CP437.indexOf(c)).map(i -> i == -1 ? 0xDB : i)
            .forEach(c -> result.append((char) c));
    return result.toString();
}

The lookup table should work in any language. Many programming language standard libraries can do this conversion. For example in Python:

'test ½'.encode('cp437', 'replace')

will return

b'test \xab'

Using 'replace' will insert encoded '?' chars if a non-encodeable unicode character is encountered.

Geschrieben (bearbeitet)

Genius. This saves me a lot of work. I completely missed that it is the character set of the IBM PC.

In Ruby:

'test ½'.encode('cp437')
=> "test \xAB"

'test ½'.encode('cp437', :replace => '?')
=> "test \xAB"

'test ‹'.encode('cp437', :replace => '?')
=> "test ?"

Thanks!

bearbeitet von Superp
clarify invalid chars in example
  • 2 weeks later...
Geschrieben

...but things are not that simple.

The IBM437 encoding in Ruby (and some other languages) does not include code points 0..31 and 127, which traditionally were control characters like bell and tab.

This means you can either:

  1. Build your own complete lookup table with 256 code points, duplicating the encoding already available on your system, but adding 0..31 and 127. Not dry.
  2. Use the encoding, and miss •, ○, ♫, →, ♥, ⌂ and other useful characters.
  3. Use the encoding, with a fallback table for 0..31 and 127.

I opted to do 3. Commit here.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Gast
Reply to this topic...

×   Du hast formatierten Text eingefügt.   Formatierung jetzt entfernen

  Only 75 emoji are allowed.

×   Dein Link wurde automatisch eingebettet.   Einbetten rückgängig machen und als Link darstellen

×   Dein vorheriger Inhalt wurde wiederhergestellt.   Clear editor

×   Du kannst Bilder nicht direkt einfügen. Lade Bilder hoch oder lade sie von einer URL.

×
×
  • Neu erstellen...