AI For DIY

Saturday, August 20, 2022

Code 2

1. About the code

A computer works only with whether electricity is flowing (1) or not flowing (0), which makes it awkward to express the number systems and languages people use in everyday life. To bridge the gap, specific binary sequences are given agreed meanings, and such agreements are collectively called codes.
 
There is no formal classification, but codes fall into two broad categories: codes for detecting or reducing errors in computer hardware, and codes that make numbers and characters convenient to write. There are more codes than the ones covered below; this post looks only at a few of those made for human convenience.

 

2. 8421 code

Humans use decimal numbers by default, while computers use binary, which is hard for people to read. Converting is especially inconvenient when the number of decimal digits grows, or when the value is not a plain decimal number, such as a date or time.
  For this reason, 4 binary digits are allocated to each decimal digit, and only the 10 fixed patterns corresponding to 0₁₀ through 9₁₀ are used; the other possible patterns are left unused. This kind of code is called BCD (Binary Coded Decimal). BCD variants include the 8421 code, excess-3 code, 2421 code, 5421 code, and 51111 code (5 binary digits). Of these, the 8421 code is the most representative.
  The 8421 code gives weights of 8, 4, 2, and 1 to the 4 binary digits, so each decimal digit maps to the same pattern as its ordinary binary value. Converting the decimal number 127₁₀ to 8421 code gives the following.

| Decimal   | 1       | 2       | 7       |
|-----------|---------|---------|---------|
| Weight    | 8 4 2 1 | 8 4 2 1 | 8 4 2 1 |
| 8421 code | 0 0 0 1 | 0 0 1 0 | 0 1 1 1 |

 

Each decimal digit gets its own group of 4 binary digits, and the weights are placed in order so that the weighted sum of each group equals its decimal digit. Only 0000₂ through 1001₂ are used, for the decimal digits 0₁₀ through 9₁₀; the remaining patterns 1010₂ through 1111₂ are unused.
  The 8421 code and ordinary binary differ in length. For example, 127₁₀ needs only 7 digits in plain binary, but the 8421 code allocates 12, so it often takes more space than necessary. In addition, six patterns are skipped between 9₁₀ (1001) and 10₁₀ (0001 0000), so care is needed when performing arithmetic.
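The conversion and the arithmetic caveat can be sketched in a few lines of Python. This is only an illustration: the function names `to_bcd8421`, `from_bcd8421`, and `add_bcd_digits` are made up for this post, and the "add 6" correction is a commonly used technique for BCD addition rather than something described above.

```python
def to_bcd8421(n: int) -> str:
    """Encode a non-negative integer as 8421 BCD, one 4-bit group per decimal digit."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return " ".join(format(int(digit), "04b") for digit in str(n))


def from_bcd8421(code: str) -> int:
    """Decode space-separated 4-bit groups, rejecting the unused patterns 1010 to 1111."""
    digits = []
    for group in code.split():
        value = int(group, 2)
        if len(group) != 4 or value > 9:
            raise ValueError(f"{group!r} is not a valid 8421 digit")
        digits.append(str(value))
    return int("".join(digits))


def add_bcd_digits(a: int, b: int) -> tuple:
    """Add two BCD digits: plain binary add, then +6 to jump over the six unused patterns."""
    total = a + b
    if total > 9:
        total += 6
    return total >> 4, total & 0b1111  # (carry into the next digit, resulting digit)


print(to_bcd8421(127))                 # 0001 0010 0111  (12 binary digits)
print(format(127, "b"))                # 1111111         (plain binary: 7 digits)
print(from_bcd8421("0001 0010 0111"))  # 127
print(add_bcd_digits(9, 1))            # (1, 0), i.e. 0001 0000 = 10
```

Without the +6 step, 9 + 1 would produce 1010, which is one of the unused patterns and not a valid 8421 digit.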

 

3. ASCII

Numbers map naturally onto binary, but a computer has no built-in concept of characters, so words and sentences can be expressed only by assigning a code to every character in a writing system. In the United States, a 7-bit code covering punctuation marks, digits, and the Roman (Latin) letters was first published in 1963, with lowercase letters added in the 1967 revision; this is ASCII (American Standard Code for Information Interchange). It is used all over the world as a standard for information interchange, and for transmission 1 parity bit for error checking is often added to the 7 ASCII bits.
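As a rough illustration of the parity bit mentioned above, here is a small Python sketch. It assumes even parity and places the parity bit in the highest (8th) position; both are choices made for this example, and real links may use odd parity or a different layout. The function names are hypothetical.

```python
def add_even_parity(ch: str) -> int:
    """Return an 8-bit value: the 7-bit ASCII code plus an even-parity bit on top."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    parity = bin(code).count("1") % 2  # 1 when the number of 1 bits is odd
    return (parity << 7) | code


def check_even_parity(byte: int) -> bool:
    """True when the 8 bits hold an even number of 1s (no single-bit error detected)."""
    return bin(byte & 0xFF).count("1") % 2 == 0


sent = add_even_parity("A")             # 'A' is 1000001 (two 1 bits), so parity is 0
print(format(sent, "08b"))              # 01000001
print(check_even_parity(sent))          # True
print(check_even_parity(sent ^ 0b100))  # False: a single flipped bit is caught
```

A single parity bit only detects an odd number of flipped bits; it cannot tell which bit changed, and two flipped bits go unnoticed.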
  Besides ASCII, there is EBCDIC (Extended Binary Coded Decimal Interchange Code), which extends BCD to represent characters. It is an 8-bit code, but its symbol layout is less intuitive and it is less convenient to use than ASCII, so it is not as widely used.
  The table below is the standard ASCII table. (The binary values are long, so they are shown in hexadecimal instead.)

 

| Dec | Hex | Char | Dec | Hex | Char  | Dec | Hex | Char | Dec | Hex | Char |
|-----|-----|------|-----|-----|-------|-----|-----|------|-----|-----|------|
| 0   | 00  | NUL  | 32  | 20  | Space | 64  | 40  | @    | 96  | 60  | `    |
| 1   | 01  | SOH  | 33  | 21  | !     | 65  | 41  | A    | 97  | 61  | a    |
| 2   | 02  | STX  | 34  | 22  | "     | 66  | 42  | B    | 98  | 62  | b    |
| 3   | 03  | ETX  | 35  | 23  | #     | 67  | 43  | C    | 99  | 63  | c    |
| 4   | 04  | EOT  | 36  | 24  | $     | 68  | 44  | D    | 100 | 64  | d    |
| 5   | 05  | ENQ  | 37  | 25  | %     | 69  | 45  | E    | 101 | 65  | e    |
| 6   | 06  | ACK  | 38  | 26  | &     | 70  | 46  | F    | 102 | 66  | f    |
| 7   | 07  | BEL  | 39  | 27  | '     | 71  | 47  | G    | 103 | 67  | g    |
| 8   | 08  | BS   | 40  | 28  | (     | 72  | 48  | H    | 104 | 68  | h    |
| 9   | 09  | TAB  | 41  | 29  | )     | 73  | 49  | I    | 105 | 69  | i    |
| 10  | 0A  | LF   | 42  | 2A  | *     | 74  | 4A  | J    | 106 | 6A  | j    |
| 11  | 0B  | VT   | 43  | 2B  | +     | 75  | 4B  | K    | 107 | 6B  | k    |
| 12  | 0C  | FF   | 44  | 2C  | ,     | 76  | 4C  | L    | 108 | 6C  | l    |
| 13  | 0D  | CR   | 45  | 2D  | -     | 77  | 4D  | M    | 109 | 6D  | m    |
| 14  | 0E  | SO   | 46  | 2E  | .     | 78  | 4E  | N    | 110 | 6E  | n    |
| 15  | 0F  | SI   | 47  | 2F  | /     | 79  | 4F  | O    | 111 | 6F  | o    |
| 16  | 10  | DLE  | 48  | 30  | 0     | 80  | 50  | P    | 112 | 70  | p    |
| 17  | 11  | DC1  | 49  | 31  | 1     | 81  | 51  | Q    | 113 | 71  | q    |
| 18  | 12  | DC2  | 50  | 32  | 2     | 82  | 52  | R    | 114 | 72  | r    |
| 19  | 13  | DC3  | 51  | 33  | 3     | 83  | 53  | S    | 115 | 73  | s    |
| 20  | 14  | DC4  | 52  | 34  | 4     | 84  | 54  | T    | 116 | 74  | t    |
| 21  | 15  | NAK  | 53  | 35  | 5     | 85  | 55  | U    | 117 | 75  | u    |
| 22  | 16  | SYN  | 54  | 36  | 6     | 86  | 56  | V    | 118 | 76  | v    |
| 23  | 17  | ETB  | 55  | 37  | 7     | 87  | 57  | W    | 119 | 77  | w    |
| 24  | 18  | CAN  | 56  | 38  | 8     | 88  | 58  | X    | 120 | 78  | x    |
| 25  | 19  | EM   | 57  | 39  | 9     | 89  | 59  | Y    | 121 | 79  | y    |
| 26  | 1A  | SUB  | 58  | 3A  | :     | 90  | 5A  | Z    | 122 | 7A  | z    |
| 27  | 1B  | ESC  | 59  | 3B  | ;     | 91  | 5B  | [    | 123 | 7B  | {    |
| 28  | 1C  | FS   | 60  | 3C  | <     | 92  | 5C  | \    | 124 | 7C  | \|   |
| 29  | 1D  | GS   | 61  | 3D  | =     | 93  | 5D  | ]    | 125 | 7D  | }    |
| 30  | 1E  | RS   | 62  | 3E  | >     | 94  | 5E  | ^    | 126 | 7E  | ~    |
| 31  | 1F  | US   | 63  | 3F  | ?     | 95  | 5F  | _    | 127 | 7F  | DEL  |

 

In this table, 0₁₀ (00₁₆) through 31₁₀ (1F₁₆) and 127₁₀ (7F₁₆) are control characters; they do not represent printable characters. Apart from these, the table consists of the 10 digits starting at 48₁₀ (30₁₆), the 26 uppercase letters starting at 65₁₀ (41₁₆), the 26 lowercase letters starting at 97₁₀ (61₁₆), and the space and punctuation marks that fill the rest. With these letters, digits, and symbols, it is possible to express words, sentences, and numbers.
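The ranges described above can be checked with Python's built-in `ord` and `chr`. The helper `classify_ascii` below is a hypothetical name used only for this sketch, and its category labels are my own.

```python
def classify_ascii(ch: str) -> str:
    """Place one ASCII character into the ranges described in the text."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    if code < 32 or code == 127:
        return "control"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    return "space or punctuation"


for ch in ["A", "a", "7", "~", "\n"]:
    # Show the character, its decimal code, its hexadecimal code, and its category.
    print(repr(ch), ord(ch), format(ord(ch), "02X"), classify_ascii(ch))

print(ord("a") - ord("A"))  # 32: lowercase sits exactly one table column after uppercase
print(chr(ord("0") + 5))    # 5: digit characters are consecutive starting at 48
```

Because the letters and digits occupy consecutive codes, simple arithmetic on the code values is enough to change case or turn a digit character into its number.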

 

4. Unicode

As ASCII spread worldwide, each country extended it as needed, assigning its own characters to additional codes from 128₁₀ (80₁₆) upward. However, these independently created character sets are not compatible with one another when exchanged internationally. To solve this, a coding system meant to handle all of the world's characters was created and published in 1991: Unicode (Unique, Universal, and Uniform character encoding).
  The name reflects the purpose, and characters, historic as well as modern, are continually being added with the goal of representing every writing system. Where ASCII has 128 characters in 7 binary digits, Unicode uses up to 21 binary digits, giving room for more than 1 million codes. Even after assigning the characters in use today there is space left over, so pictographs, emoticons, and game symbols have been added as well.
  In Unicode, a character is written as the prefix U+ followed by 4 to 6 hexadecimal digits. Here are a few ranges that are actually in use:

 

From U+0000 to U+007F (128 characters): Basic Latin (same content as ASCII)

From U+0250 to U+02AF (96 characters): IPA Extensions

From U+0370 to U+03FF (135 assigned characters): Greek and Coptic

From U+3131 to U+318E (94 characters): Hangul Compatibility Jamo

From U+AC00 to U+D7A3 (11172 characters): Hangul Syllables

From U+1F300 to U+1F5FF (768 characters): Miscellaneous Symbols and Pictographs

From U+1F600 to U+1F64F (80 characters): Emoticons
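The U+ notation and the size of a range like the ones listed above are easy to reproduce in Python. The helper names `u_plus` and `range_size` are invented for this sketch.

```python
def u_plus(ch: str) -> str:
    """Format a character's code point in U+ notation, padded to at least 4 hex digits."""
    return f"U+{ord(ch):04X}"


def range_size(first: int, last: int) -> int:
    """Number of code points from first to last, both ends included."""
    return last - first + 1


print(u_plus("A"))                   # U+0041
print(u_plus(chr(0xAC00)))           # U+AC00 (first Hangul syllable)
print(u_plus(chr(0x1F600)))          # U+1F600 (first character of Emoticons)

print(range_size(0x0000, 0x007F))    # 128
print(range_size(0xAC00, 0xD7A3))    # 11172
print(range_size(0x1F300, 0x1F5FF))  # 768
print(range_size(0x1F600, 0x1F64F))  # 80
```

Note that `range_size` counts code points in the range; the number of characters actually assigned can be smaller when a block has unassigned positions.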

 

Characters written with 4 hexadecimal digits belong to the Basic Multilingual Plane (BMP, U+0000~U+FFFF), and most characters in use today fall into it. Beyond it are the Supplementary Multilingual Plane (SMP, U+10000~U+1FFFF), the Supplementary Ideographic Plane (SIP, U+20000~U+2FFFF), and the Tertiary Ideographic Plane (TIP, U+30000~U+3FFFF). The SMP mostly holds less common scripts and symbols, including the pictographs and emoticons listed above, and the ideographic planes are assigned mostly to Chinese characters.
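Since each plane covers 65,536 (10000₁₆) code points, the plane of a character can be read from the digits above the lowest four hexadecimal digits. A minimal Python sketch follows; `plane_of` and `PLANE_NAMES` are hypothetical names, and the upper limit U+10FFFF is the end of the Unicode code space, a detail not stated above.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point in the Unicode code space

PLANE_NAMES = {0: "BMP", 1: "SMP", 2: "SIP", 3: "TIP"}


def plane_of(code_point: int) -> int:
    """Return the plane number: everything above the low 16 bits of the code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError("outside the Unicode code space")
    return code_point >> 16


for cp in (0x0041, 0xAC00, 0x1F600, 0x20000):
    plane = plane_of(cp)
    print(f"U+{cp:04X}", plane, PLANE_NAMES.get(plane, "other"))

print(MAX_CODE_POINT.bit_length())  # 21 binary digits are enough for every code point
print(MAX_CODE_POINT + 1)           # 1114112 possible code points
```

The last two lines match the figures in the text: 21 binary digits, and a little over 1 million possible codes.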

The chart at https://www.unicode.org/roadmaps/bmp/index.html shows how the BMP is currently allocated. Thanks to Unicode, characters can be read and written the same way anywhere in the world just by passing along information arranged in binary.

 

5. Conclusion

You only need to know which kinds of codes exist; when you need the details, find a table on the Internet and use it.
