Software I like: The Unicode® Standard

The Unicode standard is approximately a superset of the Universal Character Set (ISO/IEC 10646) (ref, ref).

To find out which fonts include a particular character, use the FileFormat.Info Unicode Character Search to find the character and click on ‘Fonts that support …’. This not only lists fonts that support that character, but can also attempt to display the character in every font installed on the local computer. However, a font may appear to contain a character that it lacks, because the browser substitutes a glyph from another font.

shapecatcher.com offers Unicode character lookup based on drawing the desired character. As of 2018 Apr 7, ‘Japanese, Korean and Chinese characters are currently not supported’, although at least some Japanese hiragana and katakana are found.

Chinese (Han) characters can be looked up using the Unihan Database Lookup Tool. It is possible to look up characters by code; by copying and pasting a character; by searching on the definition or pronunciation; and by stroke count, giving the number of strokes in the radical and the number of additional strokes beyond the radical.
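The first two kinds of lookup (by code and by pasted character) can be roughly imitated offline with Python's standard unicodedata module. This is only a sketch: the function name describe is made up for illustration, and the module carries just the character name, not the definitions, pronunciations or radical/stroke data that the Unihan database provides.

```python
import unicodedata

def describe(query: str) -> str:
    """Hypothetical helper: accept 'U+4E2D', '4E2D', or one pasted character."""
    q = query.strip()
    if len(q) == 1:
        ch = q                      # a single pasted character
    else:
        if q.upper().startswith("U+"):
            q = q[2:]               # drop the 'U+' prefix
        ch = chr(int(q, 16))        # interpret the rest as a hex code point
    return f"U+{ord(ch):04X} {ch} {unicodedata.name(ch, '<unnamed>')}"

print(describe("U+4E2D"))  # U+4E2D 中 CJK UNIFIED IDEOGRAPH-4E2D
print(describe("中"))       # same result, starting from the character itself
```

A one-character query is always treated as a pasted character, so a bare hex digit such as "A" is read as the letter, not as code point U+000A.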

Unicode Font Guide For Free/Libre Open Source Operating Systems

The characters of the Control Pictures block (U+2400–U+243F), e.g., the symbols for HT, CR and SP (which may or may not be displayed properly here: ␉ ␍ ␠), are in the Debian ttf-ancient-fonts package (along with many other things).
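The Control Pictures block is laid out so that the picture for each C0 control character sits at U+2400 plus the control's own code, with the symbol for space at U+2420 and the symbol for DEL at U+2421. A minimal sketch of that mapping, using a made-up helper name control_picture:

```python
import unicodedata

def control_picture(ch: str) -> str:
    """Hypothetical helper: map a C0 control, space, or DEL to its picture."""
    code = ord(ch)
    if code <= 0x20:              # C0 controls and space -> U+2400..U+2420
        return chr(0x2400 + code)
    if code == 0x7F:              # DELETE -> U+2421
        return chr(0x2421)
    return ch                     # anything else is returned unchanged

for ch in "\t\r ":
    pic = control_picture(ch)
    print(f"U+{ord(pic):04X} {unicodedata.name(pic)}")
# U+2409 SYMBOL FOR HORIZONTAL TABULATION
# U+240D SYMBOL FOR CARRIAGE RETURN
# U+2420 SYMBOL FOR SPACE
```

Whether the resulting characters are actually visible still depends on having a font that covers the block.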

The Unicode standard includes the concepts of compatibility characters, canonical and compatibility decomposition, etc. This is quite confusing. For example, U+00B5 MICRO SIGN has no canonical decomposition mapping, but it does have a compatibility decomposition mapping to U+03BC GREEK SMALL LETTER MU. See Section 3.7 of the standard. (Note that in the STIX font U+00B5 and U+03BC have slightly different shapes.)
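The difference between the two kinds of decomposition can be seen with Python's standard unicodedata module. The sketch below uses U+00E9 (é) as a contrasting example of a character with a canonical decomposition; that choice is mine and is not part of the discussion above.

```python
import unicodedata

micro = "\u00b5"    # MICRO SIGN
mu = "\u03bc"       # GREEK SMALL LETTER MU
e_acute = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE (contrast case)

# The '<compat>' tag marks a compatibility (not canonical) mapping.
print(unicodedata.decomposition(micro))    # <compat> 03BC
print(unicodedata.decomposition(e_acute))  # 0065 0301

# Canonical normalization leaves the micro sign alone ...
print(unicodedata.normalize("NFD", micro) == micro)   # True
print(unicodedata.normalize("NFC", micro) == micro)   # True

# ... while compatibility normalization replaces it with Greek mu.
print(unicodedata.normalize("NFKD", micro) == mu)     # True
print(unicodedata.normalize("NFKC", micro) == mu)     # True
```

So text that has been through NFKC or NFKD no longer distinguishes the two characters, whereas NFC and NFD keep them apart.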

Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the United States and other countries.

R. Funnell

Last modified: 2018-04-07 13:45:34