
[Python Skill Tree Co-construction] Character Encoding and Decoding


What are character encoding and decoding in Python

In Python 3, strings are Unicode: every character, Chinese included, is stored as a Unicode code point. Encoding converts characters into a byte stream, and decoding is the reverse operation.
Before getting started, a few basic concepts need sorting out.
Strings in Python
In a computer, 8 bits make up one byte, so the largest integer a single byte can represent is 255 (binary 1111 1111).
Representing larger integers takes more bytes: 2 bytes can represent values up to 65535, and 4 bytes up to 4294967295.
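
The byte limits above follow from the formula 2^(8n) - 1 for n bytes. A minimal sketch that checks the three figures; the helper name max_unsigned is hypothetical and chosen only for this illustration:

```python
def max_unsigned(num_bytes):
    """Largest unsigned integer that fits in num_bytes bytes (hypothetical helper)."""
    return (1 << (8 * num_bytes)) - 1

for n in (1, 2, 4):
    print(n, "byte(s):", max_unsigned(n), bin(max_unsigned(n)))

# int.to_bytes enforces the same limit: 255 fits in one byte, 256 does not.
print((255).to_bytes(1, "big"))
try:
    (256).to_bytes(1, "big")
except OverflowError as exc:
    print("256 needs more than one byte:", exc)
```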
Various encodings grew out of this principle. ASCII, for example, uses 7 bits and defines 128 characters (a full byte has room for 256 values), but it covers only English letters, digits, and a few symbols. Chinese needs far more characters, hence the GB2312 encoding (later extended by GBK), which holds 6763 Chinese characters. That is still not enough for all the world's languages, so more characters are needed.
This is where the Unicode character set comes in: it brings the characters of all languages together in one set. To save space when storing and transmitting data, the UTF-8 encoding was introduced.
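
To make the space argument concrete, the sketch below compares how many bytes a single character takes in UTF-8, UTF-16, and UTF-32. The helper encoded_sizes and the three sample characters are assumptions picked for illustration; the encoding names are standard Python codecs.

```python
def encoded_sizes(ch, encodings=("utf-8", "utf-16-le", "utf-32-le")):
    """Return {encoding name: number of bytes needed for ch} (hypothetical helper)."""
    return {enc: len(ch.encode(enc)) for enc in encodings}

for ch in ("A", "é", "爬"):
    print(repr(ch), f"U+{ord(ch):04X}", encoded_sizes(ch))

# ASCII has no byte sequence for a Chinese character at all.
try:
    "爬".encode("ascii")
except UnicodeEncodeError as exc:
    print("ASCII cannot encode it:", exc.reason)
```

UTF-8 is variable-length: ASCII characters stay at one byte, while a common Chinese character takes three, so mostly-English text is stored much more compactly than with a fixed 4-byte encoding.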

How to use it?

Basic use of encoding in Python

The ord() function returns the integer code point of a character, and chr() converts an integer back to a character, as in the following code

print(ord('爬'))  # 29228
print(chr(29228))  # 爬

Since characters map to integers, the integer can be written in decimal or hexadecimal.
For example, 29228 = 0x722c, so the character can also be written as \u722c

print(chr(int('722c', 16)))

You can also use a Unicode conversion tool to do this.
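
The same conversions can be done without an external tool. A minimal sketch, using the character 爬 from the example above, that moves between the character, its decimal and hexadecimal code point, and its \u escape form (the variable names are illustrative only):

```python
ch = "\u722c"                      # the escape sequence denotes the character 爬
code_point = ord(ch)

print(code_point)                  # decimal form
print(hex(code_point))             # hexadecimal form
print(f"U+{code_point:04X}")       # conventional Unicode notation

# Round trip: integer back to the character, from decimal or hexadecimal.
assert chr(29228) == chr(0x722c) == ch

# The unicode_escape codec produces the escape text itself.
escaped = ch.encode("unicode_escape").decode("ascii")
print(escaped)
```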

Encoding and decoding functions in Python

The str.encode() and bytes.decode() methods perform encoding and decoding respectively: "en" for encode, "de" for decode.

my_b = '技能树'.encode('utf-8')
print('Encoded:', my_b)  # Encoded: b'\xe6\x8a\x80\xe8\x83\xbd\xe6\xa0\x91'

Decoding works as follows:

my_b = '技能树'.encode('utf-8')
print('Encoded:', my_b)  # Encoded: b'\xe6\x8a\x80\xe8\x83\xbd\xe6\xa0\x91'

my_str = my_b.decode('utf-8')
print("Decoded:", my_str)  # Decoded: 技能树

Note that the encoded output looks like a string but carries the prefix b: it is a bytes object, not a str.
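
The difference between str and bytes shows up in length and indexing. A short sketch, using the three-character string 技能树 ("skill tree") as sample input:

```python
text = "技能树"
data = text.encode("utf-8")

print(type(text), type(data))   # str versus bytes
print(len(text), len(data))     # 3 characters, 9 bytes (3 bytes per character here)

# Indexing a bytes object yields integers in the range 0..255, not characters.
print(data[0], hex(data[0]))
print(list(data[:3]))           # the three bytes that encode the first character

# Decoding with the same encoding restores the original string.
assert data.decode("utf-8") == text
```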

Notes

If the encoding used to encode and the one used to decode do not match, the result is garbled text or an error, as in the following code

my_b = '技能树'.encode('gbk')
print('Encoded:', my_b)  # Encoded: b'\xbc\xbc\xc4\xdc\xca\xf7'

my_str = my_b.decode('utf-8')  # raises UnicodeDecodeError
print("Decoded:", my_str)

The error message is as follows:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbc in position 0: invalid start byte

When this kind of error occurs, the fix is to find out which encoding the bytes were originally written in and decode with that encoding.
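
When the original encoding is unknown, one pragmatic approach is to try a short list of likely encodings in order. The sketch below is only an illustration under stated assumptions: the helper name decode_with_fallback and the candidate list ("utf-8", "gbk") are hypothetical, and the order matters because a permissive codec can decode the wrong bytes into garbled text without raising.

```python
def decode_with_fallback(data, candidates=("utf-8", "gbk")):
    """Try each candidate encoding in turn; return (text, encoding_used).

    Hypothetical helper: if every candidate fails, fall back to the first
    one with errors="replace", which substitutes U+FFFD for bad bytes.
    """
    for enc in candidates:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    return data.decode(candidates[0], errors="replace"), candidates[0]

raw = "技能树".encode("gbk")
text, used = decode_with_fallback(raw)
print(text, used)

# errors="replace" avoids the exception but loses the original characters.
lossy = raw.decode("utf-8", errors="replace")
print(lossy)
```

Strict UTF-8 is tried first because it rejects most byte sequences that are not UTF-8, so a successful decode is strong evidence that the guess was right.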
