ASCII Character Set and HTML Codes

ASCII stands for the American Standard Code for Information Interchange. ASCII was the first character set (encoding standard) used between computers on the Internet.

Both ISO-8859-1 (the default in HTML 4.01) and UTF-8 (the default in HTML5) are built on ASCII, which was the character encoding of the early web.
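
To illustrate what "built on ASCII" means in practice, here is a small Python sketch. It assumes an arbitrary sample string and simply checks that text made only of ASCII characters produces the same bytes under all three encodings.

```python
# Pure-ASCII text encodes to the same bytes in ASCII, ISO-8859-1 and UTF-8.
text = "Hello, Web!"  # arbitrary sample string (an assumption for this demo)

ascii_bytes = text.encode("ascii")
latin1_bytes = text.encode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

print(ascii_bytes == latin1_bytes == utf8_bytes)  # True
print(list(ascii_bytes[:5]))  # [72, 101, 108, 108, 111]
```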

Binary Information

Computer information (numbers, text and pictures) is stored as binary ones and zeros (for example, 01000101). To standardize the storage of alphanumeric characters, the American Standard Code for Information Interchange (ASCII) was created. It defines a unique 7-bit binary number for each character.
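
As a minimal sketch of the idea, the Python snippet below converts a character to its 7-bit binary form and back. The helper name `to_ascii_bits` is made up for this example.

```python
def to_ascii_bits(ch: str) -> str:
    """Return the 7-bit binary string for a single ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return format(code, "07b")  # zero-padded to 7 binary digits


print(to_ascii_bits("E"))  # 1000101

# Reading the bits back: 01000101 is decimal 69, which is the letter E.
value = int("01000101", 2)
print(value, chr(value))  # 69 E
```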

Number of Characters in ASCII

2^7 = 128 (0 to 127)

ASCII is a 7-bit character set containing 128 characters. It contains control characters, numbers from 0 to 9, upper and lower case English letters from A to Z, and some special characters, punctuation and symbols.

Since ASCII used only 7 bits of each byte for the character (the eighth bit was used for transmission parity control), it could represent only 128 different characters. In addition, 33 of these codes (0 to 31 and 127) were reserved for control purposes. The biggest weakness of ASCII was that it excluded non-English letters.
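
The parity bit can be sketched in Python as follows. This assumes even parity with the parity bit stored in the top (eighth) bit, which is one common convention; the text above does not say which scheme was used, and the function names are hypothetical.

```python
def add_even_parity(ch: str) -> int:
    """Pack a 7-bit ASCII code into one byte, with an even-parity bit on top."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    ones = bin(code).count("1")
    parity = ones % 2  # 1 only when the 7 data bits hold an odd number of 1s
    return (parity << 7) | code


def parity_ok(byte: int) -> bool:
    """True when the byte holds an even number of 1-bits."""
    return bin(byte).count("1") % 2 == 0


sent = add_even_parity("E")          # E = 1000101 has three 1-bits
print(format(sent, "08b"))           # 11000101
print(parity_ok(sent))               # True
print(parity_ok(sent ^ 0b00000100))  # False (one bit flipped in transit)
```

A single flipped bit changes the count of 1-bits from even to odd, which is how the receiver notices the error.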

  • 0 to 31 and 127: Control Characters
  • 32 to 47: Special Characters
  • 48 to 57: Numbers
  • 58 to 64: Special Characters
  • 65 to 90: Uppercase Letters
  • 91 to 96: Special Characters
  • 97 to 122: Lowercase Letters
  • 123 to 126: Special Characters
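
As a rough sketch, the ASCII ranges can be expressed as a small Python lookup. The function name `classify` and the category labels are made up for illustration; the boundaries follow standard ASCII, where lowercase letters run from 97 (a) to 122 (z).

```python
from collections import Counter


def classify(code: int) -> str:
    """Return the category of an ASCII code (0 to 127)."""
    if not 0 <= code <= 127:
        raise ValueError("ASCII codes run from 0 to 127")
    if code <= 31 or code == 127:
        return "control"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    return "special"  # 32-47, 58-64, 91-96 and 123-126


# Count how many of the 2^7 = 128 codes fall into each category.
counts = Counter(classify(c) for c in range(2 ** 7))
print(counts["control"], counts["digit"], counts["uppercase"],
      counts["lowercase"], counts["special"])  # 33 10 26 26 33
```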

ASCII Printable Characters (32 - 126)

ASCII Characters Description HTML Entity Codes
  space &#32;
! exclamation mark &#33;
" quotation mark &#34;
# number sign &#35;
$ dollar sign &#36;
% percent sign &#37;
& ampersand &#38;
' apostrophe &#39;
( left parenthesis &#40;
) right parenthesis &#41;
* asterisk &#42;
+ plus sign &#43;
, comma &#44;
- hyphen &#45;
. period &#46;
/ slash &#47;
0 digit 0 &#48;
1 digit 1 &#49;
2 digit 2 &#50;
3 digit 3 &#51;
4 digit 4 &#52;
5 digit 5 &#53;
6 digit 6 &#54;
7 digit 7 &#55;
8 digit 8 &#56;
9 digit 9 &#57;
: colon &#58;
; semicolon &#59;
< less-than &#60;
= equals sign &#61;
> greater-than &#62;
? question mark &#63;
@ at sign &#64;
A uppercase A &#65;
B uppercase B &#66;
C uppercase C &#67;
D uppercase D &#68;
E uppercase E &#69;
F uppercase F &#70;
G uppercase G &#71;
H uppercase H &#72;
I uppercase I &#73;
J uppercase J &#74;
K uppercase K &#75;
L uppercase L &#76;
M uppercase M &#77;
N uppercase N &#78;
O uppercase O &#79;
P uppercase P &#80;
Q uppercase Q &#81;
R uppercase R &#82;
S uppercase S &#83;
T uppercase T &#84;
U uppercase U &#85;
V uppercase V &#86;
W uppercase W &#87;
X uppercase X &#88;
Y uppercase Y &#89;
Z uppercase Z &#90;
[ left square bracket &#91;
\ backslash &#92;
] right square bracket &#93;
^ caret &#94;
_ underscore &#95;
` grave accent &#96;
a lowercase a &#97;
b lowercase b &#98;
c lowercase c &#99;
d lowercase d &#100;
e lowercase e &#101;
f lowercase f &#102;
g lowercase g &#103;
h lowercase h &#104;
i lowercase i &#105;
j lowercase j &#106;
k lowercase k &#107;
l lowercase l &#108;
m lowercase m &#109;
n lowercase n &#110;
o lowercase o &#111;
p lowercase p &#112;
q lowercase q &#113;
r lowercase r &#114;
s lowercase s &#115;
t lowercase t &#116;
u lowercase u &#117;
v lowercase v &#118;
w lowercase w &#119;
x lowercase x &#120;
y lowercase y &#121;
z lowercase z &#122;
{ left curly brace &#123;
| vertical bar &#124;
} right curly brace &#125;
~ tilde &#126;
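
The entity codes in this table follow one pattern: `&#` followed by the decimal ASCII code and a semicolon. A minimal Python sketch, assuming a hypothetical helper named `to_entity`, shows the conversion in both directions (decoding uses the standard library's `html.unescape`).

```python
import html


def to_entity(ch: str) -> str:
    """Build the numeric HTML entity code for a single character."""
    return f"&#{ord(ch)};"


print(to_entity("A"))  # &#65;
print(to_entity("<"))  # &#60;

# Going the other way: decode entity codes back into characters.
print(html.unescape("&#72;&#105;&#33;"))  # Hi!
```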

Control Characters (0 - 31 and 127)

ASCII reserves the first 32 codes (0 to 31), along with code 127 (DEL), for control characters. These codes were intended to control peripheral devices (such as printers) or to provide meta-information about data streams. They do not represent printable characters.

ASCII Characters Description HTML Entity Codes
NUL null character &#00;
SOH start of heading &#01;
STX start of text &#02;
ETX end of text &#03;
EOT end of transmission &#04;
ENQ enquiry &#05;
ACK acknowledge &#06;
BEL bell (ring) &#07;
BS backspace &#08;
HT horizontal tab &#09;
LF line feed &#10;
VT vertical tab &#11;
FF form feed &#12;
CR carriage return &#13;
SO shift out &#14;
SI shift in &#15;
DLE data link escape &#16;
DC1 device control 1 &#17;
DC2 device control 2 &#18;
DC3 device control 3 &#19;
DC4 device control 4 &#20;
NAK negative acknowledge &#21;
SYN synchronize &#22;
ETB end transmission block &#23;
CAN cancel &#24;
EM end of medium &#25;
SUB substitute &#26;
ESC escape &#27;
FS file separator &#28;
GS group separator &#29;
RS record separator &#30;
US unit separator &#31;
DEL delete (rubout) &#127;
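
For a quick cross-check of this table, the Python sketch below lists the codes that Unicode classifies as control characters (category `Cc`) within the ASCII range. Using `unicodedata` for this is a choice made for the example.

```python
import unicodedata

# Collect every ASCII code whose Unicode general category is "Cc" (control).
control_codes = [c for c in range(128)
                 if unicodedata.category(chr(c)) == "Cc"]

print(len(control_codes))  # 33
print(control_codes[-1])   # 127

# A few control characters still in everyday use: tab, line feed, carriage return.
print(ord("\t"), ord("\n"), ord("\r"))  # 9 10 13
```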

Extended ASCII (128 - 255)

Extended ASCII character encodings are 8-bit encodings that include the standard 7-bit ASCII characters plus additional characters.
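
As a hedged illustration, the Python sketch below decodes the same bytes with two 8-bit encodings. ISO-8859-1 and code page 437 are picked here only as examples of extended ASCII: the first 128 codes agree, while a byte above 127 maps to a different character in each.

```python
raw = bytes([72, 105, 0xE9])  # "Hi" plus one byte above 127

as_latin1 = raw.decode("iso-8859-1")
as_cp437 = raw.decode("cp437")

print(as_latin1)                       # Hié
print(as_latin1[:2] == as_cp437[:2])   # True  (the ASCII part is identical)
print(as_latin1[2] == as_cp437[2])     # False (byte 0xE9 differs per encoding)

# Plain 7-bit ASCII cannot decode the extra byte at all.
try:
    raw.decode("ascii")
except UnicodeDecodeError:
    print("byte 0xE9 is outside 7-bit ASCII")
```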