In Python, to find out how many characters a string contains (that is, its length), use the built-in len() function. Combined with the encode() method, len() can also report how many bytes a string occupies.
The basic syntax of the len() function is:
len(string)
where string is the string whose length is to be counted.
For example, define a string with the content "http://c.biancheng.net" and then use the len() function to compute its length:
>>> a='http://c.biancheng.net'
>>> len(a)
22
In real-world development, besides the length of a string, you sometimes also need the number of bytes it occupies.
In Python, the number of bytes a character occupies depends on the encoding used. Digits, English letters, the decimal point, the underscore, and the space each take one byte in both UTF-8 and GBK, whereas a Chinese character may take 2 to 4 bytes. For example, a Chinese character takes 2 bytes in GBK/GB2312 and usually 3 bytes in UTF-8.
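As a quick check of the per-character sizes described above, the following sketch encodes a few single characters under UTF-8 and GBK and records the byte counts. The sample characters and the names samples and sizes are chosen here purely for illustration.

```python
# Sample characters: a letter, a digit, a decimal point, an underscore,
# a space, and one Chinese character (chosen only for illustration).
samples = ["a", "7", ".", "_", " ", "中"]

# Map each character to (bytes in UTF-8, bytes in GBK).
sizes = {
    ch: (len(ch.encode("utf-8")), len(ch.encode("gbk")))
    for ch in samples
}

for ch, (utf8_len, gbk_len) in sizes.items():
    print(repr(ch), "UTF-8:", utf8_len, "GBK:", gbk_len)
```

The ASCII characters take one byte in both encodings, while the Chinese character takes 3 bytes in UTF-8 and 2 bytes in GBK.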
With UTF-8 Coding, for example , character string “ Life is too short , I use Python” The number of bytes occupied is shown in the following figure .
We can do that by using encode() Method , Encode the string and then get its bytes . for example , use UTF-8 Encoding mode , Calculation “ Life is too short , I use Python” Bytes of , You can execute the following code :
>>> str1 = " Life is too short , I use Python"
>>> len(str1.encode())
27
The string contains 7 Chinese characters and Chinese punctuation marks, which occupy 21 bytes, plus 6 English letters, which occupy 6 bytes, for a total of 27 bytes.
Similarly, to get the length of the string when it is encoded with GBK, run the following code:
>>> str1 = " Life is too short , I use Python"
>>> len(str1.encode('gbk'))
20