HTML Charsets - Detailed Overview

Introduction to HTML Charsets

A character set (charset), more precisely a character encoding, defines how the bytes of a web page map to characters. Declaring the correct charset in HTML tells the browser how to decode the page, so that text displays correctly across different browsers and devices.
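
To illustrate what goes wrong when the charset is wrong, here is a minimal Python sketch. It is not something a browser runs; it only mimics the decoding step. The sample string is an assumption chosen for illustration.

```python
# Text as an editor configured for UTF-8 would save it to disk.
original = "café, naïve, Zürich"
raw_bytes = original.encode("utf-8")

# Decoding with the matching charset recovers the text.
correct = raw_bytes.decode("utf-8")

# Decoding the same bytes as ISO-8859-1 garbles every accented letter:
# each two-byte UTF-8 sequence is read as two separate Latin-1 characters.
garbled = raw_bytes.decode("iso-8859-1")

print(correct)
print(garbled)
```

The second line printed shows the kind of garbled text a visitor would see if a UTF-8 page were interpreted as ISO-8859-1.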

Common Character Sets

Here are some commonly used character sets in HTML:

| Charset | Description | Example |
|---|---|---|
| UTF-8 | The most widely used character encoding; it supports all characters in the Unicode standard. | <meta charset="UTF-8"> |
| ISO-8859-1 | A single-byte encoding for Western European languages, also known as Latin-1. | <meta charset="ISO-8859-1"> |
| Windows-1252 | A single-byte encoding used by Windows for Western European languages. It matches ISO-8859-1 except that it assigns printable characters, such as € and curly quotes, to the byte range 0x80 to 0x9F. | <meta charset="Windows-1252"> |
| Shift_JIS | A character encoding for the Japanese language, used primarily in Japan. | <meta charset="Shift_JIS"> |
| GB2312 | A character encoding for Simplified Chinese characters, used in mainland China. | <meta charset="GB2312"> |
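
The differences between these charsets can be sketched by encoding the same text with each one. The Python snippet below is a rough illustration only: the helper name `encoded_size` and the sample strings are assumptions, and Python's codec names are used as stand-ins for the HTML charset labels, which browsers may map slightly differently.

```python
from typing import Optional

# Python codec names standing in for the charsets in the table above.
CHARSETS = ["utf-8", "iso-8859-1", "windows-1252", "shift_jis", "gb2312"]


def encoded_size(text: str, charset: str) -> Optional[int]:
    """Return the byte length of text in charset, or None if it cannot be encoded."""
    try:
        return len(text.encode(charset))
    except UnicodeEncodeError:
        return None


# Print, for each sample, how many bytes each charset needs (None = not representable).
for sample in ["€", "日本語", "中文"]:
    sizes = {name: encoded_size(sample, name) for name in CHARSETS}
    print(sample, sizes)
```

A result of `None` means the charset has no representation for that text, which is the practical limitation of the legacy encodings compared with UTF-8.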

Examples

The charset is declared with a <meta> tag placed early in the document's <head>. The two examples below declare UTF-8 and ISO-8859-1:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Charset Example</title>
</head>
<body>
    <p>This page uses UTF-8 charset.</p>
</body>
</html>

Rendered output: This page uses UTF-8 charset.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="ISO-8859-1">
    <title>Charset Example</title>
</head>
<body>
    <p>This page uses ISO-8859-1 charset.</p>
</body>
</html>

Rendered output: This page uses ISO-8859-1 charset.

Choosing the Right Charset

When creating web pages, choose the charset based on the languages and characters the page contains, and make sure the declared charset matches the encoding the file is actually saved in. UTF-8 is recommended for modern web pages because it covers every Unicode character, so a single encoding works for all languages and symbols.
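
One way to check that a file matches its declaration is sketched below in Python. This is a simplified illustration under stated assumptions, not the detection algorithm browsers use: the function names `declared_charset` and `matches_declaration` are hypothetical, the regular expression only recognises the `<meta charset="...">` form, and the declared label is passed straight to Python's codec lookup, which does not accept every label a browser would.

```python
import re
from typing import Optional

# Simplified pattern for <meta charset="...">; quotes around the value are optional.
META_CHARSET = re.compile(rb'<meta\s+charset\s*=\s*["\']?([\w-]+)', re.IGNORECASE)


def declared_charset(html_bytes: bytes) -> Optional[str]:
    """Return the charset named in a <meta charset> tag within the first 1024 bytes."""
    match = META_CHARSET.search(html_bytes[:1024])
    return match.group(1).decode("ascii") if match else None


def matches_declaration(html_bytes: bytes) -> bool:
    """Return True if the bytes decode without error under the declared charset."""
    charset = declared_charset(html_bytes)
    if charset is None:
        return False
    try:
        html_bytes.decode(charset)
        return True
    except (UnicodeDecodeError, LookupError):
        # Invalid byte sequence, or a label Python does not recognise.
        return False


page = '<!DOCTYPE html><html><head><meta charset="UTF-8"></head><body><p>café</p></body></html>'
print(matches_declaration(page.encode("utf-8")))       # file saved as UTF-8
print(matches_declaration(page.encode("iso-8859-1")))  # file saved as Latin-1 but labelled UTF-8
```

A clean decode is only a necessary condition. Single-byte encodings such as ISO-8859-1 accept any byte sequence, so this check is most useful for catching files that are labelled UTF-8 but were saved in a legacy encoding.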