Character encoding

A character encoding specifies how a sequence of Unicode characters is converted to bytes; decoding transforms such bytes back into the respective Unicode characters.
In Windows, a character encoding is identified by a Windows code page.
Unicode and ISO 10646 define several encodings for the UCS (Universal Character Set), among them UTF-8, UTF-16 and UTF-32.
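
As a small illustration of encoding and decoding, the following Python sketch encodes one string with three UCS encodings and with a Windows code page, then decodes it again. The sample string, the helper name encode_all and the choice of cp1252 are assumptions made for this example only.

```python
def encode_all(text, codecs=("utf-8", "utf-16-le", "utf-32-le", "cp1252")):
    """Return {codec name: encoded bytes} for the given text."""
    return {codec: text.encode(codec) for codec in codecs}


sample = "Zürich €"          # arbitrary sample text with two non-ASCII characters
encoded = encode_all(sample)

for codec, data in encoded.items():
    # Decoding with the same codec restores the original string.
    assert data.decode(codec) == sample
    print(f"{codec:10} {len(data):2} bytes  {data.hex(' ')}")
```

The same eight characters occupy a different number of bytes in each encoding, which is why bytes cannot be interpreted without knowing the encoding that produced them.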

Determining the character encoding of a file or a byte stream

uchardet (by freedesktop.org) takes a sequence of bytes in an unknown character encoding and attempts to determine the encoding of the text. The returned encoding names are iconv-compatible.
In a Unix shell, the character encoding of a file might be determined with file or file -i.
With Python, the encoding of a byte stream can be guessed with bs4.UnicodeDammit (part of Beautiful Soup).
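
To show the idea behind such detectors, here is a deliberately simplified sketch that uses only the Python standard library: it checks for a byte order mark and then tries a short list of candidate encodings in order. The function name guess_encoding and the candidate list are assumptions for this example. This is not how uchardet, file or UnicodeDammit work internally, as those rely on more elaborate heuristics.

```python
import codecs

# UTF-32 BOMs are checked first because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def guess_encoding(data, candidates=("ascii", "utf-8", "cp1252")):
    """Return a plausible codec name for data, or None if nothing fits."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name  # first candidate that decodes without error
    return None


print(guess_encoding("Zürich".encode("utf-8")))
print(guess_encoding("Zürich".encode("cp1252")))
```

A successful decode only shows that the bytes are valid in that encoding, not that the encoding is the one the author used, so the result is a guess and the order of the candidates matters.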

See also

iconv can be used to convert text from one character encoding to another.
Specifying a character set within an HTML document
Some known character encodings
The Accept-Charset HTTP request header (which should not be used anymore).
The encoding parameter of dbConnect
In .NET, a character encoding is represented by the System.Text.Encoding class.
An example that creates files in different encodings is here.
Using the .NET methods System.IO.File::ReadAllText and WriteAllText to change a file's encoding.
PEP 263 - Defining Python Source Code Encodings
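
Changing a file's encoding, as iconv or the .NET ReadAllText/WriteAllText pair mentioned above do, can be sketched in Python as follows. The function name transcode_file, the file names and the cp1252 to UTF-8 direction are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path


def transcode_file(src, dst, from_encoding, to_encoding):
    """Read src as from_encoding and write it to dst as to_encoding."""
    # Working on bytes avoids any newline translation by the text layer.
    text = Path(src).read_bytes().decode(from_encoding)
    Path(dst).write_bytes(text.encode(to_encoding))


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "latin.txt"   # hypothetical input file
    dst = Path(tmp) / "utf8.txt"    # hypothetical output file
    src.write_bytes("Zürich €\r\n".encode("cp1252"))

    transcode_file(src, dst, "cp1252", "utf-8")
    result = dst.read_bytes()
    print(result)
```

On the command line, a roughly equivalent conversion would be iconv -f CP1252 -t UTF-8 latin.txt > utf8.txt.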