Awesome Open Source

unibits | Reveal the Unicode

A Ruby library and CLI command that visualizes various Unicode and ASCII/single-byte encodings in the terminal:

  • Makes analyzing encodings easier
  • Helps you with debugging strings
  • Highlights invalid/special/blank bytes/characters/codepoints
  • Supports UTF-8, UTF-16LE/UTF-16BE, UTF-32LE/UTF-32BE, ISO-8859-X, Windows-125X, IBMX, CP85X, macX, TIS-620/Windows-874, KOI8-R/KOI8-U, 7-Bit ASCII/GB1988, and arbitrary BINARY data

Color Coding

Each byte of the given string is highlighted using the following mechanism (characters -> codepoints):

  • Red for invalid bytes
  • Light blue for blanks
  • Blue for control characters
  • Pink for non-control formatting characters
  • Green for marks (Unicode only)
  • Orange for unassigned codepoints
  • Lighter orange for unassigned codepoints which are also ignorable
  • Random color for all other codepoints

The same colors are used in the higher-level companion tool uniscribe.
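To make the categories above more concrete, here is a rough sketch of how a single character of a UTF-8 string could be sorted into them with Ruby's Unicode regexp properties. `color_category` is a hypothetical helper and not part of unibits; the gem's actual rules may differ, "blank" is assumed to mean a space separator, and the ignorable/unassigned distinction is left out.

```ruby
# Hypothetical helper, not part of unibits: maps a single character to one
# of the color categories from the list above.
def color_category(char)
  # Regexp matching raises on broken byte sequences, so test validity first.
  return :invalid unless char.valid_encoding?

  case char
  when /\A\p{Zs}\z/ then :blank       # space separators (assumed meaning of "blank")
  when /\A\p{Cc}\z/ then :control     # control characters
  when /\A\p{Cf}\z/ then :format      # non-control formatting characters
  when /\A\p{M}\z/  then :mark        # combining marks
  when /\A\p{Cn}\z/ then :unassigned  # no character assigned to this codepoint
  else :other
  end
end

# Print the codepoint and category of a few sample characters.
"a \u0308\u200B\t".each_char do |char|
  printf("U+%04X %s\n", char.ord, color_category(char))
end
```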


Setup

Make sure you have Ruby installed and that installing gems works properly. Then run:

$ gem install unibits


Usage

Pass the string you want to debug to unibits:

From CLI

$ unibits " Idiosyncrätic "

From Ruby

require 'unibits/kernel_method'
unibits " Idiosyncrätic "

Advanced Options

unibits accepts the following options:

  • encoding (e): The encoding of the given string (uses the string's default encoding if none given)
  • convert (c): An encoding the string should be converted to before visualizing it
  • stats: Whether to show a short stats header (default: true). On the CLI, deactivate it with --no-stats
  • wide-ambiguous: Treat characters of ambiguous width as 2 columns wide instead of 1
  • width (w): Set a custom column width. If not set, unibits retrieves the width from the terminal or falls back to 80
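The difference between encoding and convert is easy to mix up. The sketch below illustrates it with plain Ruby string methods, on the assumption that encoding corresponds to relabeling the bytes (`force_encoding`) and convert to transcoding them (`encode`). `encoding_vs_convert` is a hypothetical helper for illustration only, not the gem's implementation.

```ruby
# Hypothetical helper, not part of unibits: shows what the two options
# plausibly mean in terms of Ruby's own String API.
def encoding_vs_convert(raw_bytes, encoding:, convert:)
  # "encoding": declare how the existing bytes should be read. No byte changes.
  tagged = raw_bytes.dup.force_encoding(encoding)
  # "convert": transcode into another encoding. The bytes change, the text does not.
  converted = tagged.encode(convert)
  { tagged: tagged, converted: converted }
end

# Bytes without encoding information, e.g. as read from a socket.
raw = "Idiosyncr\xC3\xA4tic".b

result = encoding_vs_convert(raw, encoding: "UTF-8", convert: "UTF-16LE")
puts result[:tagged].bytesize     # 14 (same bytes as the input)
puts result[:converted].bytesize  # 26 (13 characters, 2 bytes each)
```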

Examples of Valid Encodings


CLI: $ unibits -e utf-8 -c utf-8 " Idiosyncrätic "

Ruby: unibits " Idiosyncrätic ", encoding: 'utf-8', convert: 'utf-8'

Screenshot UTF-8


CLI: $ unibits -e utf-8 -c utf-16le " Idiosyncrätic "

Ruby: unibits " Idiosyncrätic ", encoding: 'utf-8', convert: 'utf-16le'

Screenshot UTF-16LE


CLI: $ unibits -e utf-8 -c utf-32be " Idiosyncrätic "

Ruby: unibits " Idiosyncrätic ", encoding: 'utf-8', convert: 'utf-32be'

Screenshot UTF-32BE


CLI: $ unibits -e binary " Idiosyncrätic "

Ruby: unibits " Idiosyncrätic ", encoding: 'binary'

Screenshot BINARY


CLI: $ unibits -e utf-8 -c ascii "ascii"

Ruby: unibits "ascii", encoding: 'utf-8', convert: 'ascii'

Screenshot ASCII

Examples of Invalid Encodings


Example in Ruby: unibits "unexpected \x80 | not enough \xF0\x9F\x8C | overlong \xE0\x81\x81 | surrogate \xED\xA0\x80 | too large \xF5\x8F\xBF\xBF"

Screenshot invalid UTF-8
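The five kinds of invalid UTF-8 in the example above can also be checked without the gem. This is a minimal sketch using only Ruby's built-in `String#valid_encoding?` and `String#scrub`; `invalid_bytes` and `SAMPLES` are hypothetical names introduced here, and the code says nothing about how unibits detects invalid bytes internally.

```ruby
# Hypothetical helper, not part of unibits: collects every byte that Ruby
# considers part of an invalid sequence in the string's current encoding.
def invalid_bytes(string)
  bad = []
  # scrub yields each invalid byte sequence; the block's return value
  # replaces it in the cleaned copy, which is discarded here.
  string.scrub { |sequence| bad.concat(sequence.bytes); "" }
  bad
end

# The invalid sequences from the example above, keyed by their labels.
SAMPLES = {
  "unexpected" => "\x80",
  "not enough" => "\xF0\x9F\x8C",
  "overlong"   => "\xE0\x81\x81",
  "surrogate"  => "\xED\xA0\x80",
  "too large"  => "\xF5\x8F\xBF\xBF",
}

SAMPLES.each do |label, bytes|
  hex = invalid_bytes(bytes).map { |byte| format("%02X", byte) }.join(" ")
  puts "#{label}: valid=#{bytes.valid_encoding?} invalid bytes=#{hex}"
end
```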


Example in Ruby: unibits " Idiosyncrätic ", encoding: 'ascii'

Screenshot invalid ASCII
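A string read as 7-bit ASCII becomes invalid as soon as any byte has its high bit set. The sketch below locates such bytes with plain Ruby; `non_ascii_offsets` is a hypothetical helper, not part of unibits, and the sample string is assumed to contain a UTF-8 encoded "ä".

```ruby
# Hypothetical helper, not part of unibits: byte offsets that cannot be
# 7-bit ASCII because their high bit is set.
def non_ascii_offsets(string)
  string.each_byte.with_index
        .select { |byte, _index| byte > 0x7F }
        .map { |_byte, index| index }
end

text = "Idiosyncr\u00E4tic"                     # UTF-8 string
as_ascii = text.dup.force_encoding("US-ASCII")  # same bytes, read as ASCII

puts as_ascii.valid_encoding?   # false
p non_ascii_offsets(as_ascii)   # [9, 10], the two bytes of the UTF-8 "ä"
```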


More info

Related gems

Lots of thanks to @damienklinnert for the motivation and inspiration required to build this!

Copyright (C) 2017-2021 Jan Lelis. Released under the MIT license.
