summaryrefslogtreecommitdiffstats
path: root/python/python-webencodings/README
blob: c48004c0f24f56aed620c9fa349c67b1fab22947 (plain)
webencodings is a Python implementation of the WHATWG Encoding standard.

In order to be compatible with legacy web content when interpreting something
like Content-Type: text/html; charset=latin1, tools need to use a particular
set of aliases for encoding labels as well as some overriding rules. For
example, US-ASCII and iso-8859-1 on the web are actually aliases for
windows-1252, and an UTF-8 or UTF-16 BOM takes precedence over any other
encoding declaration. The Encoding standard defines all such details so that
implementations do not have to reverse-engineer each other.

This module has encoding labels and BOM detection, but the actual
implementation for encoders and decoders is Python's.