One of the most appealing things about Python is that its standard library and third-party libraries are so rich: many tasks in daily development can be solved directly with them. Let's first introduce some commonly used modules in the Python standard library; later we will continue with the purposes and usage of common third-party libraries.
The printable characters used by Base64 include A-Z, a-z, and 0-9, which gives 62 characters; the other two printable symbols are usually + and /, and = is used for padding at the end of the encoded output. For details of Base64 encoding, you can refer to the article "Base64 Notes". The base64 module in the Python standard library provides two functions, b64encode and b64decode, dedicated to Base64 encoding and decoding. The following demonstrates the effect of running these two functions in the Python interactive environment.
>>> import base64
>>>
>>> content = 'Man is distinguished, not only by his reason, but by this singular passion from other animals, which is a lust of the mind, that by a perseverance of delight in the continued and indefatigable generation of knowledge, exceeds the short vehemence of any carnal pleasure.'
>>> base64.b64encode(content.encode())
b'TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlzIHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2YgdGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGludWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRoZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4='
>>> content = b'TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlzIHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2YgdGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGludWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRoZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4='
>>> base64.b64decode(content).decode()
'Man is distinguished, not only by his reason, but by this singular passion from other animals, which is a lust of the mind, that by a perseverance of delight in the continued and indefatigable generation of knowledge, exceeds the short vehemence of any carnal pleasure.'
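The padding rule mentioned above can be seen on very short inputs. The following is a minimal sketch, not part of the original article: it encodes one, two, and three bytes, and also shows `urlsafe_b64encode`, a standard-library variant that swaps `+` and `/` for `-` and `_`. The sample byte values are chosen only for illustration.

```python
import base64

# Every 3 input bytes map to 4 Base64 characters; a shorter final
# group is padded with '=' so the output length stays a multiple of 4.
encoded_samples = {}
for raw in (b'M', b'Ma', b'Man'):
    encoded_samples[raw] = base64.b64encode(raw)
    print(raw, '->', encoded_samples[raw])

# Two bytes chosen so that the standard alphabet emits '+' and '/'.
data = bytes([0xfb, 0xff])
standard = base64.b64encode(data)
url_safe = base64.urlsafe_b64encode(data)
print(standard, url_safe)

# Decoding restores the original bytes in both variants.
assert base64.b64decode(standard) == data
assert base64.urlsafe_b64decode(url_safe) == data
```

The URL-safe form is convenient when the encoded text has to appear in a URL or a file name, where `+` and `/` have special meaning.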
The collections module provides many very useful data structures, mainly including:
namedtuple: a named tuple. It is a class factory that creates a class from a type name and a list of field names.
deque: a double-ended queue, an alternative implementation to the list. A Python list is backed by an array, while a deque is backed by a doubly linked list, so when elements need to be added or removed at both the head and the tail, deque performs better, with an asymptotic time complexity of O(1).
Counter: a subclass of dict whose keys are elements and whose values are the counts of those elements; its most_common() method returns the most frequent elements. I think the inheritance relationship between Counter and dict is debatable: according to the Composite/Aggregate Reuse Principle (CARP), it would be more reasonable to design the relationship between Counter and dict as association rather than inheritance.
OrderedDict: a subclass of dict that records the order in which key-value pairs are inserted; it behaves both like a dictionary and like a linked list.
defaultdict: similar to the dictionary type, but the default value for a missing key is supplied by a default factory function; this is more efficient than the dictionary's setdefault() method.
The following is an example of using namedtuple to create a playing card class in the Python interactive environment.
>>> from collections import namedtuple
>>>
>>> Card = namedtuple('Card', ('suite', 'face'))
>>> card1 = Card('Heart', 5)
>>> card2 = Card('Club', 9)
>>> card1
Card(suite='Heart', face=5)
>>> card2
Card(suite='Club', face=9)
>>> print(f'{card1.suite}{card1.face}')
Heart5
>>> print(f'{card2.suite}{card2.face}')
Club9
The following uses the Counter class to find the three elements that appear most frequently in a list.
from collections import Counter

words = [
    'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
    'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around',
    'the', 'eyes', "don't", 'look', 'around', 'the', 'eyes',
    'look', 'into', 'my', 'eyes', "you're", 'under'
]
counter = Counter(words)
# Print the 3 most frequent elements in the words list and their counts
for elem, count in counter.most_common(3):
    print(elem, count)
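The other structures in the list above (deque, defaultdict, and OrderedDict) can be sketched in the same way. This is an illustrative example with made-up sample data, not code from the original article.

```python
from collections import OrderedDict, defaultdict, deque

# deque: cheap insertion and removal at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)      # add at the head
queue.append(4)          # add at the tail
first = queue.popleft()  # remove from the head
last = queue.pop()       # remove from the tail
print(first, last, list(queue))

# defaultdict: the factory (list) supplies a value for missing keys,
# so there is no need to call setdefault() before appending
groups = defaultdict(list)
for word in ['apple', 'avocado', 'banana', 'blueberry', 'cherry']:
    groups[word[0]].append(word)
print(dict(groups))

# OrderedDict: remembers insertion order and can reorder keys
ordered = OrderedDict(a=1, b=2, c=3)
ordered.move_to_end('a')  # move key 'a' to the last position
print(list(ordered))
```

Grouping with `defaultdict(list)` avoids building a throwaway empty list on every call, which is what `dict.setdefault(key, [])` does.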
A hash function, also called a hash algorithm or digest function, is a way of creating a "digital fingerprint" (hash digest) for existing data. The hash function compresses the data into a digest, and for the same input it always produces the same digest. Note that this process is not reversible: the input cannot be computed from the digest. A good hash function produces different digests for different inputs, so the probability of a hash collision (different inputs producing the same digest) is very low. MD5 and the SHA family are well-known hash functions of this kind, although practical collision attacks are now known for MD5 and SHA-1, so SHA-256 or stronger is preferable where security matters.
Note: In 2011, RFC 6151 stated that MD5 is no longer acceptable where collision resistance is required and that new protocol designs should not use MD5 as a keyed-hash message authentication code (HMAC-MD5). This issue is beyond the scope of our discussion.
The hashlib module in the Python standard library wraps these hash functions. With constructors such as md5, sha1, and sha256, we can easily generate a "digital fingerprint". Take a simple example: when a user registers, we want to save the user's password in the database. Obviously we cannot store the password in plain text, because that could leak the user's private data. So what is usually saved in the database is the "fingerprint" of the password; when the user logs in, the hash function computes the fingerprint of the submitted password, which is then compared with the stored one to decide whether the login succeeds.
import hashlib

# Compute the MD5 digest of the string "123456"
print(hashlib.md5('123456'.encode()).hexdigest())
# Compute the MD5 digest of the file "Python-3.7.1.tar.xz"
hasher = hashlib.md5()
with open('Python-3.7.1.tar.xz', 'rb') as file:
    data = file.read(512)
    while data:
        hasher.update(data)
        data = file.read(512)
print(hasher.hexdigest())
Note: Many websites publish a hash digest next to the download link. After downloading the file, we can compute its hash digest and check whether it matches the one published on the website (fingerprint comparison). If the computed digest does not match, the download was probably corrupted or the file was tampered with in transit, and the file should not be used.
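Returning to the password example: a single fast hash such as MD5 is generally considered too weak for stored passwords, so real systems usually add a random per-user salt and use a deliberately slow key-derivation function. The sketch below uses `hashlib.pbkdf2_hmac` from the standard library. The helper names `hash_password` and `verify_password`, the 16-byte salt, and the iteration count are illustrative assumptions, not something the original article prescribes.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, digest) for a password (hypothetical helper)."""
    if salt is None:
        salt = os.urandom(16)  # random per-user salt
    digest = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=200_000):
    """Recompute the digest with the stored salt and compare it."""
    _, digest = hash_password(password, salt, iterations)
    # compare_digest avoids leaking information through timing
    return hmac.compare_digest(digest, expected)


# At registration: store salt and digest instead of the password
salt, stored = hash_password('123456')
# At login: recompute and compare
print(verify_password('123456', salt, stored))
print(verify_password('654321', salt, stored))
```

Both the salt and the digest would be saved in the user's database row; the salt does not need to be secret.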
The heapq module implements the heap queue (priority queue) algorithm. If you need heap-based sorting, and especially if you need to solve the Top-K problem (finding the K largest or smallest elements in a sequence), you can use this module directly. The code is as follows.
import heapq
list1 = [34, 25, 12, 99, 87, 63, 58, 78, 88, 92]
# Find the three largest elements in the list
print(heapq.nlargest(3, list1))
# Find the smallest three elements in the list
print(heapq.nsmallest(3, list1))
list2 = [
    {'name': 'IBM', 'shares': 100, 'price': 91.1},
    {'name': 'AAPL', 'shares': 50, 'price': 543.22},
    {'name': 'FB', 'shares': 200, 'price': 21.09},
    {'name': 'HPQ', 'shares': 35, 'price': 31.75},
    {'name': 'YHOO', 'shares': 45, 'price': 16.35},
    {'name': 'ACME', 'shares': 75, 'price': 115.65}
]
# Find the three stocks with the highest price
print(heapq.nlargest(3, list2, key=lambda x: x['price']))
# Find the three stocks with the highest number of shares
print(heapq.nlargest(3, list2, key=lambda x: x['shares']))
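To show what happens underneath a Top-K query, here is a minimal sketch that keeps a min-heap of at most K elements using `heappush` and `heapreplace`. The function name `top_k` is hypothetical, and this is one common way to solve the problem, not necessarily how `nlargest` is implemented internally.

```python
import heapq


def top_k(iterable, k):
    """Return the k largest items in descending order (illustrative helper)."""
    if k <= 0:
        return []
    heap = []  # min-heap holding the k largest items seen so far
    for item in iterable:
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # heap[0] is the smallest of the current top k; replace it
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


print(top_k([34, 25, 12, 99, 87, 63, 58, 78, 88, 92], 3))  # [99, 92, 88]
```

Because the heap never grows beyond K elements, this approach works on streams that are too large to sort in memory.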
The itertools module can help us generate various kinds of iterators. Take a look at the following examples.
import itertools

# Generate all permutations of ABCD
for value in itertools.permutations('ABCD'):
    print(value)
# Generate all combinations of three elements chosen from ABCDE
for value in itertools.combinations('ABCDE', 3):
    print(value)
# Generate the Cartesian product of ABCD and 123
for value in itertools.product('ABCD', '123'):
    print(value)
# Generate an infinite cyclic sequence of A, B, C
it = itertools.cycle(('A', 'B', 'C'))
print(next(it))
print(next(it))
print(next(it))
print(next(it))
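As a quick sanity check on the examples above, the following sketch counts how many items each iterator yields and uses `itertools.islice` to take a finite slice of the infinite `cycle` iterator. It is an illustrative addition, not part of the original article.

```python
import itertools

perms = list(itertools.permutations('ABCD'))        # 4! arrangements
combos = list(itertools.combinations('ABCDE', 3))   # C(5, 3) selections
pairs = list(itertools.product('ABCD', '123'))      # 4 * 3 pairs
print(len(perms), len(combos), len(pairs))  # 24 10 12

# islice stops after 5 items, so the infinite cycle is safe to consume
first_five = list(itertools.islice(itertools.cycle('ABC'), 5))
print(first_five)  # ['A', 'B', 'C', 'A', 'B']
```

These iterators are lazy: nothing is computed until the values are requested, which is why `cycle` can represent an endless sequence.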
We have used the random module many times before to generate random numbers, shuffle sequences, and take random samples. Here is a list of its commonly used functions.
getrandbits(k): returns an integer with k random bits.
randrange(start, stop[, step]): returns a randomly selected element from range(start, stop, step) without actually building a range object.
randint(a, b): returns a random integer N such that a <= N <= b; equivalent to randrange(a, b+1).
choice(seq): returns a random element from the non-empty sequence seq. If seq is empty, IndexError is raised.
choices(population, weights=None, *, cum_weights=None, k=1): selects from population with replacement and returns a list of k elements. If population is empty, IndexError is raised.
shuffle(x): shuffles the sequence x in place. Older versions also accepted an optional random argument, which was removed in Python 3.11.
sample(population, k): returns a list of k unique elements chosen from the population sequence; used for random sampling without replacement.
random(): returns the next random floating-point number in the range [0.0, 1.0).
expovariate(lambd): exponential distribution.
gammavariate(alpha, beta): gamma distribution.
gauss(mu, sigma) / normalvariate(mu, sigma): normal distribution.
paretovariate(alpha): Pareto distribution.
weibullvariate(alpha, beta): Weibull distribution.
The os.path module encapsulates utility functions for manipulating paths. If a program needs to join or split file paths, or to check whether a file exists and read its other attributes, this module is very helpful. Here are some commonly used functions.
dirname(path): returns the directory name of the path path.
exists(path): returns True if path refers to an existing path or an open file descriptor.
getatime(path) / getmtime(path) / getctime(path): return the last access time / last modification time / creation time of path. Note that on Unix systems getctime returns the time of the last metadata change rather than the creation time.
getsize(path): returns the size of path in bytes. If the file does not exist or is inaccessible, OSError is raised.
isfile(path): returns True if path is an existing regular file.
isdir(path): returns True if path is an existing directory (folder).
join(path, *paths): joins one or more path components intelligently. The return value is the concatenation of path and all members of paths, with a directory separator (os.sep) following each non-empty part except the last. This means that the result ends with a separator only if the last part is empty. If a component is an absolute path, all previous components are discarded and joining continues from the absolute path component.
splitext(path): splits the path path into a pair (root, ext) such that root + ext == path, where ext is empty or begins with a period and contains at most one period.
The uuid module can help us generate universally unique identifiers (UUIDs). The module provides four functions for generating UUIDs:
uuid1(): generated from the MAC address, the current timestamp, and a random number, which is intended to make the result globally unique.
uuid3(namespace, name): obtained by computing the MD5 hash digest ("fingerprint") of the namespace and the name. It guarantees that different names in the same namespace, and names in different namespaces, get different UUIDs, but the same name in the same namespace always generates the same UUID.
uuid4(): a UUID generated from random numbers. There is a probability of duplication, and that probability can be calculated.
uuid5(namespace, name): the same algorithm as uuid3, except that the hash function is SHA-1 instead of MD5.
Because uuid4 can in principle produce duplicates, it is better avoided where a strictly unique identifier is required, although with 122 random bits the probability of a collision is extremely small in practice. In a distributed environment uuid1 is a good choice, because combining the host's MAC address with a timestamp is designed to make the generated IDs globally unique; be aware that this also embeds the MAC address in the identifier. The following uses the uuid1 function in the Python interactive environment to generate globally unique identifiers.
>>> import uuid
>>> uuid.uuid1().hex
'622a8334baab11eaaa9c60f81da8d840'
>>> uuid.uuid1().hex
'62b066debaab11eaaa9c60f81da8d840'
>>> uuid.uuid1().hex
'642c0db0baab11eaaa9c60f81da8d840'
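The difference between the name-based and random functions can be checked directly. In the sketch below, the name 'example.com' is an arbitrary sample value; `uuid.NAMESPACE_DNS` and `uuid.NAMESPACE_URL` are predefined namespaces in the standard library.

```python
import uuid

# Name-based UUIDs are deterministic: same namespace + same name -> same UUID
a = uuid.uuid5(uuid.NAMESPACE_DNS, 'example.com')
b = uuid.uuid5(uuid.NAMESPACE_DNS, 'example.com')
# A different namespace gives a different UUID for the same name
c = uuid.uuid5(uuid.NAMESPACE_URL, 'example.com')
# uuid3 follows the same scheme with MD5 instead of SHA-1
d = uuid.uuid3(uuid.NAMESPACE_DNS, 'example.com')
# uuid4 is random, so every call returns a new value
r = uuid.uuid4()

print(a == b)                            # True
print(a == c)                            # False
print(a.version, d.version, r.version)   # 5 3 4
```

Name-based UUIDs are handy when the same logical entity must map to the same identifier on every machine without any coordination.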
The Python standard library contains a large number of modules. For many common tasks in daily development, it already provides ready-made functions or classes, and this is one of the most appealing aspects of the Python language.