Unicode Character Lookup

Explore the world of Unicode characters. Type any character or enter a hex code point (U+XXXX) to see detailed encoding information: Unicode name, UTF-8 and UTF-16 byte sequences, HTML entity, general category, and Unicode block. The page also includes a searchable reference table of common symbols.

FAQ

UTF-8 uses 1 to 4 bytes per character, is backward compatible with ASCII, and is the dominant encoding on the web. UTF-16 uses 2 or 4 bytes per character and is the internal string representation in JavaScript and Windows. This tool shows both encodings for any character.

A code point is a unique number assigned to each character in the Unicode standard. For example, the letter "A" is U+0041 (decimal 65). Code points range from U+0000 to U+10FFFF and can be represented in HTML as numeric entities like &#65; (decimal) or &#x41; (hexadecimal).
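As a minimal sketch of the relationship between a character, its code point, and its HTML numeric entities, the following Python snippet may help. The function name describe_code_point is hypothetical and is not part of this tool.

```python
def describe_code_point(ch: str) -> dict:
    """Return common notations for the code point of a single character."""
    cp = ord(ch)  # integer code point, e.g. 65 for "A"
    return {
        "notation": f"U+{cp:04X}",      # U+ followed by at least 4 hex digits
        "decimal": cp,
        "html_decimal": f"&#{cp};",     # decimal numeric entity
        "html_hex": f"&#x{cp:X};",      # hexadecimal numeric entity
    }


info = describe_code_point("A")
print(info["notation"], info["decimal"], info["html_decimal"], info["html_hex"])
```

For "A" this yields U+0041, 65, and the two entity forms mentioned above.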

Unicode is the universal character encoding standard. It assigns a unique number (a code point) to each character across the world's writing systems, and it covers more than 150,000 characters in more than 160 scripts. Unicode enables consistent text representation across different platforms, languages, and programs.
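To illustrate the kind of per-character data the standard defines, here is a short sketch using Python's built-in unicodedata module. The helper name name_and_category is hypothetical. The standard library does not expose the Unicode block, so only the name and general category are shown.

```python
import unicodedata


def name_and_category(ch: str) -> tuple[str, str]:
    """Look up the Unicode name and general category of one character."""
    # Some code points (e.g. control characters) have no name, so pass a default.
    name = unicodedata.name(ch, "<unnamed>")
    category = unicodedata.category(ch)  # two-letter code such as "Lu" or "Sc"
    return name, category


for ch in ["A", "\u20ac"]:  # "A" and the euro sign
    print(ch, *name_and_category(ch))
```

The exact names available depend on the Unicode version bundled with the Python interpreter in use.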

UTF-8 uses 1 to 4 bytes per character. ASCII characters take 1 byte, which makes UTF-8 backward compatible with ASCII and space-efficient for English text. UTF-16 uses 2 or 4 bytes; characters in the Basic Multilingual Plane, which includes most characters in common use, take 2 bytes. UTF-32 uses exactly 4 bytes per character, which is the simplest scheme but wastes space. UTF-8 is the dominant web encoding.
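The size differences can be checked directly. This is a small sketch, assuming Python's built-in codecs; the function name encoded_lengths is hypothetical. The big-endian variants are used so that no byte order mark is added to the counts.

```python
def encoded_lengths(ch: str) -> dict:
    """Return the number of bytes one character occupies in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # -be avoids a leading BOM
        "utf-32": len(ch.encode("utf-32-be")),
    }


# ASCII letter, accented letter, euro sign, and an emoji outside the BMP
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))
```

Only the emoji, which lies outside the Basic Multilingual Plane, needs 4 bytes in all three encodings.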

Yes. Paste any emoji (or any other character) into the search box to see its Unicode details. To search by code point, type U+1F600 (for 😀). For characters beyond the Basic Multilingual Plane (BMP), the tool handles supplementary planes correctly.
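For readers curious about what handling supplementary planes involves, the sketch below parses a U+XXXX string and computes the UTF-16 code units, including the surrogate pair used for code points above U+FFFF. It is an illustration only, not this tool's implementation; parse_code_point and utf16_code_units are hypothetical names.

```python
def parse_code_point(text: str) -> str:
    """Turn a string such as 'U+1F600' into the character it names."""
    if not text.upper().startswith("U+"):
        raise ValueError(f"expected U+XXXX, got {text!r}")
    return chr(int(text[2:], 16))  # raises ValueError if out of range


def utf16_code_units(ch: str) -> list[int]:
    """Return the UTF-16 code units for one character."""
    cp = ord(ch)
    if cp < 0x10000:
        return [cp]  # BMP character: a single 16-bit unit
    offset = cp - 0x10000            # 20-bit value split across two units
    high = 0xD800 + (offset >> 10)   # high (lead) surrogate: top 10 bits
    low = 0xDC00 + (offset & 0x3FF)  # low (trail) surrogate: bottom 10 bits
    return [high, low]


ch = parse_code_point("U+1F600")
print([f"{unit:04X}" for unit in utf16_code_units(ch)])  # ['D83D', 'DE00']
```

This is why a single emoji can report a length of 2 in languages whose strings are sequences of UTF-16 code units, such as JavaScript.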