Superp Geschrieben March 7, 2022 at 10:18 Geschrieben March 7, 2022 at 10:18 The OLED 128x64 Bricklet 2.0 has an embedded font. Code points 32 through 126 overlap with ASCII. Is there a table (somewhere) mapping UTF to the other codepoints, where there is a match? E.g. "½" => 171 ? Zitieren
rtrbt Geschrieben March 7, 2022 at 11:07 Geschrieben March 7, 2022 at 11:07 Hi, The embedded font is Code page 437. For the openHAB bindings I've used the following mapping to convert from UTF-16: static final List<Integer> CP437 = Arrays.asList(0x0000, 0x263A, 0x263B, 0x2665, 0x2666, 0x2663, 0x2660, 0x2022, 0x25D8, 0x25CB, 0x25D9, 0x2642, 0x2640, 0x266A, 0x266B, 0x263C, 0x25BA, 0x25C4, 0x2195, 0x203C, 0x00B6, 0x00A7, 0x25AC, 0x21A8, 0x2191, 0x2193, 0x2192, 0x2190, 0x221F, 0x2194, 0x25B2, 0x25BC, 0x0020, 0x0021, 0x0022, 0x0023, 0x0024, 0x0025, 0x0026, 0x0027, 0x0028, 0x0029, 0x002A, 0x002B, 0x002C, 0x002D, 0x002E, 0x002F, 0x0030, 0x0031, 0x0032, 0x0033, 0x0034, 0x0035, 0x0036, 0x0037, 0x0038, 0x0039, 0x003A, 0x003B, 0x003C, 0x003D, 0x003E, 0x003F, 0x0040, 0x0041, 0x0042, 0x0043, 0x0044, 0x0045, 0x0046, 0x0047, 0x0048, 0x0049, 0x004A, 0x004B, 0x004C, 0x004D, 0x004E, 0x004F, 0x0050, 0x0051, 0x0052, 0x0053, 0x0054, 0x0055, 0x0056, 0x0057, 0x0058, 0x0059, 0x005A, 0x005B, 0x005C, 0x005D, 0x005E, 0x005F, 0x0060, 0x0061, 0x0062, 0x0063, 0x0064, 0x0065, 0x0066, 0x0067, 0x0068, 0x0069, 0x006A, 0x006B, 0x006C, 0x006D, 0x006E, 0x006F, 0x0070, 0x0071, 0x0072, 0x0073, 0x0074, 0x0075, 0x0076, 0x0077, 0x0078, 0x0079, 0x007A, 0x007B, 0x007C, 0x007D, 0x007E, 0x2302, 0x00C7, 0x00FC, 0x00E9, 0x00E2, 0x00E4, 0x00E0, 0x00E5, 0x00E7, 0x00EA, 0x00EB, 0x00E8, 0x00EF, 0x00EE, 0x00EC, 0x00C4, 0x00C5, 0x00C9, 0x00E6, 0x00C6, 0x00F4, 0x00F6, 0x00F2, 0x00FB, 0x00F9, 0x00FF, 0x00D6, 0x00DC, 0x00A2, 0x00A3, 0x00A5, 0x20A7, 0x0192, 0x00E1, 0x00ED, 0x00F3, 0x00FA, 0x00F1, 0x00D1, 0x00AA, 0x00BA, 0x00BF, 0x2310, 0x00AC, 0x00BD, 0x00BC, 0x00A1, 0x00AB, 0x00BB, 0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556, 0x2555, 0x2563, 0x2551, 0x2557, 0x255D, 0x255C, 0x255B, 0x2510, 0x2514, 0x2534, 0x252C, 0x251C, 0x2500, 0x253C, 0x255E, 0x255F, 0x255A, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256C, 0x2567, 0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256B, 0x256A, 0x2518, 0x250C, 0x2588, 0x2584, 0x258C, 0x2590, 0x2580, 0x03B1, 0x00DF, 0x0393, 0x03C0, 0x03A3, 0x03C3, 0x00B5, 0x03C4, 0x03A6, 0x0398, 0x03A9, 0x03B4, 0x221E, 0x03C6, 0x03B5, 0x2229, 0x2261, 0x00B1, 0x2265, 0x2264, 0x2320, 0x2321, 0x00F7, 0x2248, 0x00B0, 0x2219, 0x00B7, 0x221A, 0x207F, 0x00B2, 0x25A0, 0x00A0); public static String utf16ToCP437(String utf16) { StringBuilder result = new StringBuilder(); utf16.codePoints().map(c -> CP437.indexOf(c)).map(i -> i == -1 ? 0xDB : i) .forEach(c -> result.append((char) c)); return result.toString(); } The lookup table should work in any language. Many programming language standard libraries can do this conversion. For example in Python: 'test ½'.encode('cp437', 'replace') will return b'test \xab' Using 'replace' will insert encoded '?' chars if a non-encodeable unicode character is encountered. 1 Zitieren
Superp Geschrieben March 7, 2022 at 12:10 Autor Geschrieben March 7, 2022 at 12:10 (bearbeitet) Genius. This saves me a lot of work. I completely missed that it is the character set of the IBM PC. In Ruby: 'test ½'.encode('cp437') => "test \xAB" 'test ½'.encode('cp437', :replace => '?') => "test \xAB" 'test ‹'.encode('cp437', :replace => '?') => "test ?" Thanks! bearbeitet March 7, 2022 at 12:22 von Superp clarify invalid chars in example Zitieren
Superp Geschrieben March 17, 2022 at 07:32 Autor Geschrieben March 17, 2022 at 07:32 ...but things are not that simple. The IBM437 encoding in Ruby (and some other languages) does not include code points 0..31 and 127, which traditionally were control characters like bell and tab. This means you can either: Build your own complete lookup table with 256 code points, duplicating the encoding already available on your system, but adding 0..31 and 127. Not dry. Use the encoding, and miss •, ○, ♫, →, ♥, ⌂ and other useful characters. Use the encoding, with a fallback table for 0..31 and 127. I opted to do 3. Commit here. Zitieren
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.