Character encoding
Definition
This article focuses on character encoding.
See also: Codec (encoding and decoding of compression formats for simple files, archive files, or files with multiple contents, e.g. multimedia container formats).
Links
Specifications
(some)
- Unicode Home Page (includes, for example, code charts and the "Unicode and the Web" FAQ)
- Character Model for the World Wide Web 1.0: Fundamentals (W3C Recommendation 2005).
- Unicode in XML and other Markup Languages (W3C Technical Report)
- W3C I18N GEO Working Group
Charts
- IANA Character Sets table for the Internet
- Unicode 4.1.0 Chart
- Unicode Character Code Charts (PDF files)
- HTML specific
- HTML entities table
- Character Converter (Iain Tucker)
- ISO8859-1 (Latin-1) (HEX/Dec/Entities)
- iso8859-1 Table
- Table of character entity references in HTML 4
- ASCII - ISO 8859-1 (Latin-1) Table with HTML Entity Names
- The ISO 8859 Alphabet Soup
See HTML links for other HTML-related links.
Online converters
- Text to UTF-8 or HTML Entities Tool
- Unicode (UTF-8) to HTML entity online converter
- UTF Converter
- Converter for funny characters into the proper HTML
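As a rough illustration of what such converters do, the following minimal Python sketch turns text into its UTF-8 byte sequence and into HTML numeric character references. The function names `to_utf8_hex` and `to_html_entities` are made up for this example and are not taken from any of the tools listed above; only the Python standard library is assumed.

```python
import html


def to_utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated hex pairs."""
    return " ".join(f"{byte:02X}" for byte in text.encode("utf-8"))


def to_html_entities(text: str) -> str:
    """Escape markup characters, then replace every non-ASCII character
    with a decimal numeric character reference (e.g. &#233;)."""
    escaped = html.escape(text, quote=True)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_html_entities(text: str) -> str:
    """Reverse direction: turn character references back into characters."""
    return html.unescape(text)
```

For example, `to_utf8_hex("é")` gives `C3 A9`, and `to_html_entities("café")` gives `caf&#233;`. Numeric references are used here instead of named entities because they work for any Unicode character, not only those that have an HTML entity name.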
Tutorials
- Characters and encodings by Jukka "Yucca" Korpela. Very good reading; see e.g. A tutorial on character code issues and On the use of some MS Windows characters in HTML.
- HTML and Browsers Character encoding, entity references and UTF-8. Good short tutorial.
- Some Wikipedia entries regarding Wikipedia contents
Wikipedia is a good example of how a modern website can handle most character sets.
- Help:Multilingual support
- Indic scripts (as an example)
- More general Wikipedia entries
URL encoding
- URL Encoding (or: 'What are those "%20" codes in URLs?') by Brian Wilson
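
To make the idea of URL encoding concrete, here is a minimal Python sketch of percent-encoding using the standard library's `urllib.parse`. The helper names `percent_encode` and `percent_decode` are hypothetical and only serve this illustration; the sketch assumes UTF-8 as the underlying character encoding, which is the default for `quote` and `unquote`.

```python
from urllib.parse import quote, unquote


def percent_encode(text: str) -> str:
    """Percent-encode `text` as UTF-8.

    With safe="" every character except ASCII letters, digits and
    "_.-~" is replaced by %XX escapes, one per UTF-8 byte.
    """
    return quote(text, safe="")


def percent_decode(text: str) -> str:
    """Turn %XX escapes back into characters, interpreting bytes as UTF-8."""
    return unquote(text)
```

For example, `percent_encode("a b")` gives `a%20b`, and `percent_encode("é")` gives `%C3%A9`, because "é" occupies two bytes in UTF-8. Passing `safe=""` is a choice made for this sketch so that reserved characters such as "/" are escaped as well; when encoding a whole path, one would usually leave "/" unescaped.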