00000000 | nul | 00000001 | soh | 00000010 | stx | 00000011 | etx | 00000100 | eot | 00000101 | enq | 00000110 | ack | 00000111 | bel |
00001000 | bs | 00001001 | tab | 00001010 | nl | 00001011 | vt | 00001100 | ff | 00001101 | cr | 00001110 | so | 00001111 | si |
00010000 | dle | 00010001 | dc1 | 00010010 | dc2 | 00010011 | dc3 | 00010100 | dc4 | 00010101 | nak | 00010110 | syn | 00010111 | etb |
00011000 | can | 00011001 | em | 00011010 | sub | 00011011 | esc | 00011100 | fs | 00011101 | gs | 00011110 | rs | 00011111 | us |
00100000 | sp | 00100001 | ! | 00100010 | " | 00100011 | # | 00100100 | $ | 00100101 | % | 00100110 | & | 00100111 | ' |
00101000 | ( | 00101001 | ) | 00101010 | * | 00101011 | + | 00101100 | , | 00101101 | - | 00101110 | . | 00101111 | / |
00110000 | 0 | 00110001 | 1 | 00110010 | 2 | 00110011 | 3 | 00110100 | 4 | 00110101 | 5 | 00110110 | 6 | 00110111 | 7 |
00111000 | 8 | 00111001 | 9 | 00111010 | : | 00111011 | ; | 00111100 | < | 00111101 | = | 00111110 | > | 00111111 | ? |
01000000 | @ | 01000001 | A | 01000010 | B | 01000011 | C | 01000100 | D | 01000101 | E | 01000110 | F | 01000111 | G |
01001000 | H | 01001001 | I | 01001010 | J | 01001011 | K | 01001100 | L | 01001101 | M | 01001110 | N | 01001111 | O |
01010000 | P | 01010001 | Q | 01010010 | R | 01010011 | S | 01010100 | T | 01010101 | U | 01010110 | V | 01010111 | W |
01011000 | X | 01011001 | Y | 01011010 | Z | 01011011 | [ | 01011100 | \ | 01011101 | ] | 01011110 | ^ | 01011111 | _ |
01100000 | ` | 01100001 | a | 01100010 | b | 01100011 | c | 01100100 | d | 01100101 | e | 01100110 | f | 01100111 | g |
01101000 | h | 01101001 | i | 01101010 | j | 01101011 | k | 01101100 | l | 01101101 | m | 01101110 | n | 01101111 | o |
01110000 | p | 01110001 | q | 01110010 | r | 01110011 | s | 01110100 | t | 01110101 | u | 01110110 | v | 01110111 | w |
01111000 | x | 01111001 | y | 01111010 | z | 01111011 | { | 01111100 | | | 01111101 | } | 01111110 | ~ | 01111111 | del |
000 | nul | 001 | soh | 002 | stx | 003 | etx | 004 | eot | 005 | enq | 006 | ack | 007 | bel |
010 | bs | 011 | tab | 012 | nl | 013 | vt | 014 | ff | 015 | cr | 016 | so | 017 | si |
020 | dle | 021 | dc1 | 022 | dc2 | 023 | dc3 | 024 | dc4 | 025 | nak | 026 | syn | 027 | etb |
030 | can | 031 | em | 032 | sub | 033 | esc | 034 | fs | 035 | gs | 036 | rs | 037 | us |
040 | sp | 041 | ! | 042 | " | 043 | # | 044 | $ | 045 | % | 046 | & | 047 | ' |
050 | ( | 051 | ) | 052 | * | 053 | + | 054 | , | 055 | - | 056 | . | 057 | / |
060 | 0 | 061 | 1 | 062 | 2 | 063 | 3 | 064 | 4 | 065 | 5 | 066 | 6 | 067 | 7 |
070 | 8 | 071 | 9 | 072 | : | 073 | ; | 074 | < | 075 | = | 076 | > | 077 | ? |
100 | @ | 101 | A | 102 | B | 103 | C | 104 | D | 105 | E | 106 | F | 107 | G |
110 | H | 111 | I | 112 | J | 113 | K | 114 | L | 115 | M | 116 | N | 117 | O |
120 | P | 121 | Q | 122 | R | 123 | S | 124 | T | 125 | U | 126 | V | 127 | W |
130 | X | 131 | Y | 132 | Z | 133 | [ | 134 | \ | 135 | ] | 136 | ^ | 137 | _ |
140 | ` | 141 | a | 142 | b | 143 | c | 144 | d | 145 | e | 146 | f | 147 | g |
150 | h | 151 | i | 152 | j | 153 | k | 154 | l | 155 | m | 156 | n | 157 | o |
160 | p | 161 | q | 162 | r | 163 | s | 164 | t | 165 | u | 166 | v | 167 | w |
170 | x | 171 | y | 172 | z | 173 | { | 174 | | | 175 | } | 176 | ~ | 177 | del |
00 | nul | 01 | soh | 02 | stx | 03 | etx | 04 | eot | 05 | enq | 06 | ack | 07 | bel |
08 | bs | 09 | tab | 10 | nl | 11 | vt | 12 | ff | 13 | cr | 14 | so | 15 | si |
16 | dle | 17 | dc1 | 18 | dc2 | 19 | dc3 | 20 | dc4 | 21 | nak | 22 | syn | 23 | etb |
24 | can | 25 | em | 26 | sub | 27 | esc | 28 | fs | 29 | gs | 30 | rs | 31 | us |
32 | sp | 33 | ! | 34 | " | 35 | # | 36 | $ | 37 | % | 38 | & | 39 | ' |
40 | ( | 41 | ) | 42 | * | 43 | + | 44 | , | 45 | - | 46 | . | 47 | / |
48 | 0 | 49 | 1 | 50 | 2 | 51 | 3 | 52 | 4 | 53 | 5 | 54 | 6 | 55 | 7 |
56 | 8 | 57 | 9 | 58 | : | 59 | ; | 60 | < | 61 | = | 62 | > | 63 | ? |
64 | @ | 65 | A | 66 | B | 67 | C | 68 | D | 69 | E | 70 | F | 71 | G |
72 | H | 73 | I | 74 | J | 75 | K | 76 | L | 77 | M | 78 | N | 79 | O |
80 | P | 81 | Q | 82 | R | 83 | S | 84 | T | 85 | U | 86 | V | 87 | W |
88 | X | 89 | Y | 90 | Z | 91 | [ | 92 | \ | 93 | ] | 94 | ^ | 95 | _ |
96 | ` | 97 | a | 98 | b | 99 | c | 100 | d | 101 | e | 102 | f | 103 | g |
104 | h | 105 | i | 106 | j | 107 | k | 108 | l | 109 | m | 110 | n | 111 | o |
112 | p | 113 | q | 114 | r | 115 | s | 116 | t | 117 | u | 118 | v | 119 | w |
120 | x | 121 | y | 122 | z | 123 | { | 124 | | | 125 | } | 126 | ~ | 127 | del |
00 | nul | 01 | soh | 02 | stx | 03 | etx | 04 | eot | 05 | enq | 06 | ack | 07 | bel |
08 | bs | 09 | tab | 0a | nl | 0b | vt | 0c | ff | 0d | cr | 0e | so | 0f | si |
10 | dle | 11 | dc1 | 12 | dc2 | 13 | dc3 | 14 | dc4 | 15 | nak | 16 | syn | 17 | etb |
18 | can | 19 | em | 1a | sub | 1b | esc | 1c | fs | 1d | gs | 1e | rs | 1f | us |
20 | sp | 21 | ! | 22 | " | 23 | # | 24 | $ | 25 | % | 26 | & | 27 | ' |
28 | ( | 29 | ) | 2a | * | 2b | + | 2c | , | 2d | - | 2e | . | 2f | / |
30 | 0 | 31 | 1 | 32 | 2 | 33 | 3 | 34 | 4 | 35 | 5 | 36 | 6 | 37 | 7 |
38 | 8 | 39 | 9 | 3a | : | 3b | ; | 3c | < | 3d | = | 3e | > | 3f | ? |
40 | @ | 41 | A | 42 | B | 43 | C | 44 | D | 45 | E | 46 | F | 47 | G |
48 | H | 49 | I | 4a | J | 4b | K | 4c | L | 4d | M | 4e | N | 4f | O |
50 | P | 51 | Q | 52 | R | 53 | S | 54 | T | 55 | U | 56 | V | 57 | W |
58 | X | 59 | Y | 5a | Z | 5b | [ | 5c | \ | 5d | ] | 5e | ^ | 5f | _ |
60 | ` | 61 | a | 62 | b | 63 | c | 64 | d | 65 | e | 66 | f | 67 | g |
68 | h | 69 | i | 6a | j | 6b | k | 6c | l | 6d | m | 6e | n | 6f | o |
70 | p | 71 | q | 72 | r | 73 | s | 74 | t | 75 | u | 76 | v | 77 | w |
78 | x | 79 | y | 7a | z | 7b | { | 7c | | | 7d | } | 7e | ~ | 7f | del |
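The four charts above give every code in binary, octal, decimal, and hexadecimal. As a minimal sketch of how one row of each chart can be reproduced, the following Python function (the name `ascii_row` is hypothetical, introduced only for this illustration) formats a single code in all four bases, using the same zero-padding as the charts.

```python
def ascii_row(code: int) -> dict:
    """Return the four numeric spellings of one ASCII code (0-127)."""
    if not 0 <= code <= 127:
        raise ValueError("ASCII codes run from 0 to 127")
    return {
        "binary": format(code, "08b"),   # eight binary digits, e.g. 01000001
        "octal": format(code, "03o"),    # three octal digits, e.g. 101
        "decimal": format(code, "02d"),  # at least two decimal digits, e.g. 65
        "hex": format(code, "02x"),      # two lowercase hex digits, e.g. 41
    }


# Look up the letter A in all four charts at once.
print(ascii_row(ord("A")))
```

Going the other way, `int("101", 8)`, `int("41", 16)`, and `int("01000001", 2)` all evaluate to 65, the decimal code for `A`.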
The basic POSIX character classes are shown by color-coding as follows:
Control Characters | [:cntrl:] |
Space | |
Punctuation | [:punct:] |
Digits | [:digit:] |
Upper Case Letters | [:upper:] |
Lower Case Letters | [:lower:] |
Notice that the space character stands on its own and is not included in any basic class.
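One way to check this partition is to assign every ASCII code to exactly one basic class and count the members. The sketch below assumes the "C"/POSIX locale restricted to ASCII; the function name `basic_class` and the label `"Space"` (which is the lone space character, not the `[:space:]` class) are choices made for this illustration only.

```python
from collections import Counter
import string


def basic_class(ch: str) -> str:
    """Return the basic class of one ASCII character, following the legend above."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    if code < 0x20 or code == 0x7F:
        return "[:cntrl:]"
    if ch == " ":
        return "Space"  # stands alone; not a member of any basic class
    if ch in string.digits:
        return "[:digit:]"
    if ch in string.ascii_uppercase:
        return "[:upper:]"
    if ch in string.ascii_lowercase:
        return "[:lower:]"
    return "[:punct:]"  # everything else that is printable


# Every code gets exactly one label, so the classes are disjoint by construction.
counts = Counter(basic_class(chr(c)) for c in range(128))
for name, n in counts.items():
    print(f"{name:10} {n}")
```

The counts come to 33 control characters, 1 space, 32 punctuation characters, 10 digits, and 26 letters of each case, which add up to all 128 codes.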
Most of the control characters should not appear in normal text. The ones likely to appear are:
0x09 | TAB | horizontal tab |
0x0A | NL | newline/linefeed |
0x0D | CR | carriage return |
The usual derived classes are as follows.
Class | Definition |
---|---|
[:alpha:] | [:upper:] ∪ [:lower:] |
[:alnum:] | [:alpha:] ∪ [:digit:] |
[:xdigit:] | [:digit:] ∪ [AaBbCcDdEeFf] |
[:graph:] | [:alnum:] ∪ [:punct:] |
[:print:] | [:graph:] ∪ Space |
[:blank:] | Space ∪ Tab |
[:space:] | [:blank:] ∪ [NL VT FF CR] |
[:word:] | [:alnum:] ∪ Underscore |
All but [:word:] are defined in the POSIX standard. [:word:] is not a POSIX class (pace the bash manual) but reflects the fact that in quite a few programming languages the characters in this class are those permitted in identifiers.
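The definitions in the table translate directly into set unions. The following is a sketch for the ASCII range only, with Python sets standing in for the classes; the variable names are illustrative, and `print_chars` is used instead of `print` simply to avoid shadowing the built-in function.

```python
import string

# Basic classes (ASCII only).
upper = set(string.ascii_uppercase)
lower = set(string.ascii_lowercase)
digit = set(string.digits)
punct = set(string.punctuation)
SPACE, TAB = " ", "\t"

# Derived classes, built exactly as in the table above.
alpha = upper | lower
alnum = alpha | digit
xdigit = digit | set("AaBbCcDdEeFf")
graph = alnum | punct
print_chars = graph | {SPACE}
blank = {SPACE, TAB}
space = blank | set("\n\v\f\r")   # NL, VT, FF, CR
word = alnum | {"_"}              # not a POSIX class

for name, members in [("alpha", alpha), ("alnum", alnum), ("xdigit", xdigit),
                      ("graph", graph), ("print", print_chars),
                      ("blank", blank), ("space", space), ("word", word)]:
    print(f"[:{name}:] has {len(members)} members")
```

As a cross-check against the standard library, `space` built this way equals `set(string.whitespace)` and `xdigit` equals `set(string.hexdigits)`.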
The principle governing the classification of characters outside the ASCII range is that the structure of the system as applied to ASCII must be maintained, except that additional classes may be created. The rules for the derived classes must continue to hold, and the basic classes must remain disjoint.
The first 32 characters (0-31 decimal, 000-037 octal, 0x00-0x1F hex) together with the last character DEL (decimal 127, octal 177, hex 0x7F) are the "control characters". These were originally used to control teletype machines. Only a few of them are still meaningful to most devices in use today. However, they are often used for other purposes, for example, as commands to programs. When you press the control key on a keyboard at the same time as one of the letters, the code sent to the computer is the corresponding control code. That is, CTRL-A sends 001 (octal), CTRL-B sends 002, CTRL-C sends 003, etc.
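The relationship between a key and its control code can be expressed as clearing the upper bits of the key's ASCII code. The sketch below assumes a terminal that follows this traditional mapping; `ctrl_code` is a hypothetical helper name, not part of any library.

```python
def ctrl_code(key: str) -> int:
    """Return the code sent by CTRL + key for '@', letters, '[', '\\', ']', '^', '_'."""
    base = ord(key.upper())          # CTRL-a and CTRL-A send the same code
    if not 0x40 <= base <= 0x5F:
        raise ValueError(f"no control code is defined here for {key!r}")
    return base & 0x1F               # keep only the low five bits


# Print the codes in octal, as in the text above.
for key in "ABC":
    print(f"CTRL-{key} sends {ctrl_code(key):03o}")
```

For example, `ctrl_code("I")` is 9, the horizontal tab, and `ctrl_code("M")` is 13, the carriage return, which is why those keystrokes act like the Tab and Return keys in many terminals.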
Control characters are sometimes referred to by names like "Control-A", also written "Ctrl-A" or "^A". The correspondence is as follows: The null character, 0x00, is designated "Control-@". 0x01 is "Control-A", 0x02 is "Control-B", and so on through 0x1A, which is "Control-Z". 0x1B is "Control-[", 0x1C "Control-\", 0x1D "Control-]", 0x1E "Control-^", and 0x1F "Control-_". In other words, the control characters 0x00-0x1F are regarded as "Control" versions of the range 0x40-0x5F.
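The naming scheme amounts to adding 0x40 to the control code and writing the resulting character after a caret. A minimal sketch, covering only the codes 0x00 through 0x1F (the helper name `caret_name` is an assumption made for this example):

```python
def caret_name(code: int) -> str:
    """Return the caret-style name of a control code in the range 0x00-0x1F."""
    if not 0x00 <= code <= 0x1F:
        raise ValueError("only codes 0x00-0x1F are handled in this sketch")
    return "^" + chr(code + 0x40)    # 0x01 -> 'A', 0x1B -> '[', and so on


for code in (0x00, 0x01, 0x1A, 0x1B, 0x1F):
    print(f"0x{code:02X} is {caret_name(code)}")
```

Since the codes involved have no bits in common with 0x40, the addition could equally be written as `code | 0x40`.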
The original meanings of the control characters are as follows:
ACK | acknowledge |
BEL | bell - rings the bell |
BS | backspace - moves the cursor or print head back one space |
CAN | cancel |
CR | carriage return - moves the cursor or print head back to the beginning of the line |
DC1 | device control 1 |
DC2 | device control 2 |
DC3 | device control 3 |
DC4 | device control 4 |
DEL | delete |
DLE | data link escape |
EM | end of medium |
ENQ | enquiry |
EOT | end of transmission |
ESC | escape |
ETB | end of transmission block |
ETX | end of text |
FF | form feed - advances the paper to the top of the next page |
FS | file separator |
GS | group separator |
NAK | negative acknowledge |
NL | newline, also known as LF (line feed) - originally moved the print head or cursor to the next line |
NUL | null |
RS | record separator |
SI | shift in - switches output device back to default character set |
SO | shift out - switches output device to alternate character set |
SOH | start of heading |
STX | start of text |
SUB | substitute |
SYN | synchronous idle |
TAB | horizontal tab |
US | unit separator |
VT | vertical tab |