Unicode Character Lookup

Explore the world of Unicode characters. Type any character or enter a hex code point (U+XXXX) to see detailed encoding information: Unicode name, UTF-8 and UTF-16 byte sequences, HTML entity, general category, and Unicode block. Includes a searchable reference table of common symbols.
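
As a rough illustration of the kind of details listed above, here is a minimal Python sketch using only the standard library. The helper name describe is hypothetical and is not how this tool is implemented. The Unicode block is left out because Python's unicodedata module does not expose it.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect basic Unicode details for a single character (hypothetical helper)."""
    cp = ord(ch)
    return {
        "code_point": f"U+{cp:04X}",
        # Fall back to a placeholder for characters without a formal name.
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        # Big-endian UTF-16 without a byte order mark, shown as hex bytes.
        "utf8": ch.encode("utf-8").hex(" ").upper(),
        "utf16": ch.encode("utf-16-be").hex(" ").upper(),
        # Decimal numeric character reference.
        "html_entity": f"&#{cp};",
    }


info = describe("€")
for key, value in info.items():
    print(f"{key}: {value}")
```

For the euro sign this reports U+20AC, the name EURO SIGN, and the general category Sc (currency symbol).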

FAQ

What's the difference between UTF-8 and UTF-16?

UTF-8 uses 1-4 bytes per character and is the dominant encoding on the web (backwards compatible with ASCII). UTF-16 uses 2 or 4 bytes and is used internally by JavaScript and Windows. This tool shows both encodings for any character.
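
The ASCII compatibility mentioned above can be checked directly. The following Python sketch encodes a short ASCII-only string both ways; the sample text is arbitrary, and little-endian UTF-16 without a byte order mark is an assumption made to keep the output short.

```python
text = "Hello"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")  # little-endian, no byte order mark

# For ASCII-only text, the UTF-8 bytes are identical to the ASCII bytes.
is_ascii_compatible = utf8_bytes == text.encode("ascii")

print(utf8_bytes.hex(" "))    # 48 65 6c 6c 6f
print(utf16_bytes.hex(" "))   # 48 00 65 00 6c 00 6c 00 6f 00
print(is_ascii_compatible)    # True
```

UTF-16 pads every ASCII character to two bytes, so a parser that expects plain ASCII cannot read it unchanged.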

What is a code point?

A code point is the unique number assigned to each character in the Unicode standard. For example, the letter "A" is U+0041 (decimal 65). Code points range from U+0000 to U+10FFFF and can be written in HTML as numeric entities such as &#65; (decimal) or &#x41; (hexadecimal).
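
A short Python sketch of the same idea, using only built-ins and the standard html module. The variable names are illustrative.

```python
import html

cp = ord("A")                  # character -> code point (65)
label = f"U+{cp:04X}"          # conventional notation: U+0041
back = chr(0x41)               # code point -> character

# Decimal and hexadecimal numeric entities decode to the same character.
from_decimal = html.unescape("&#65;")
from_hex = html.unescape("&#x41;")

# U+10FFFF is the last valid code point; anything above it is rejected.
try:
    chr(0x110000)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True

print(cp, label, back, from_decimal, from_hex, out_of_range_rejected)
```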

What is Unicode?

Unicode is the universal character encoding standard that assigns a unique number (code point) to every character across all writing systems. It covers 150,000+ characters across 160+ scripts. Unicode enables consistent text representation across different platforms, languages, and programs.

What's the difference between UTF-8, UTF-16, and UTF-32?

UTF-8 uses 1-4 bytes per character (ASCII chars use 1 byte, making it backward compatible and space-efficient for English). UTF-16 uses 2 or 4 bytes (most common chars use 2 bytes). UTF-32 uses exactly 4 bytes per character (simplest but wastes space). UTF-8 is the dominant web encoding.
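
To make the size trade-off concrete, this Python sketch counts the bytes each encoding needs for four sample characters. The helper byte_lengths is hypothetical, and the little-endian variants are used so that no byte order mark is counted.

```python
def byte_lengths(ch: str) -> dict:
    """Return the encoded size of one character in each encoding (hypothetical helper)."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


# One sample each for the 1-, 2-, 3-, and 4-byte UTF-8 ranges.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
for ch in samples:
    sizes = byte_lengths(ch)
    print(f"U+{ord(ch):04X}", sizes["utf-8"], sizes["utf-16"], sizes["utf-32"])
```

The UTF-32 column stays at 4 throughout, while UTF-8 grows from 1 to 4 and UTF-16 jumps from 2 to 4 only for the emoji.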

Can I search for emoji?

Yes. Paste any emoji (or any other character) into the search box to see its Unicode details. To search by code point, type U+1F600 (for 😀). For characters beyond the Basic Multilingual Plane (BMP), the tool handles supplementary planes correctly.
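
For the curious, here is a minimal Python sketch of what handling a supplementary-plane character involves: parsing a U+XXXX query and splitting the code point into a UTF-16 surrogate pair. Both function names are hypothetical, the query is assumed to be either U+XXXX or a single character, and this is not the tool's actual code.

```python
from typing import Optional, Tuple


def parse_query(query: str) -> str:
    """Turn 'U+XXXX' into its character; return a literal character unchanged."""
    q = query.strip()
    if q.upper().startswith("U+"):
        return chr(int(q[2:], 16))
    return q


def surrogate_pair(ch: str) -> Optional[Tuple[int, int]]:
    """Return the UTF-16 (high, low) surrogates, or None for a BMP character."""
    cp = ord(ch)
    if cp < 0x10000:
        return None  # BMP characters fit in a single 16-bit unit
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


ch = parse_query("U+1F600")
pair = surrogate_pair(ch)
print([f"{unit:04X}" for unit in pair])  # ['D83D', 'DE00']
```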
