This section covers some Python operations that are easy to confuse.
1.1 Random sampling with and without replacement
import random
random.choices(seq, k=1) # Returns a list of length k, sampled with replacement
random.sample(seq, k) # Returns a list of length k, sampled without replacement
1.2 Parameters of lambda functions
func = lambda y: x + y # x is looked up when the function is called
func = lambda y, x=x: x + y # x is bound when the function is defined
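The difference is easiest to see when lambdas are created in a loop. A minimal sketch, where the names late and early are illustrative only:

```python
# Late binding: each lambda looks x up when it is called,
# so all of them see the final value of the loop variable.
late = [lambda y: x + y for x in range(3)]

# Early binding: the default argument x=x captures the value x has
# at the moment each lambda is defined.
early = [lambda y, x=x: x + y for x in range(3)]

late_results = [f(10) for f in late]
early_results = [f(10) for f in early]
print(late_results)   # [12, 12, 12]
print(early_results)  # [10, 11, 12]
```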
1.3 copy and deepcopy
import copy
y = copy.copy(x) # Copy only the top layer
y = copy.deepcopy(x) # Copy all nested parts
Copies are easy to confuse with variable aliases:
a = [1, 2, [3, 4]]
# Alias.
b_alias = a
assert b_alias == a and b_alias is a
# Shallow copy.
b_shallow_copy = a[:]
assert b_shallow_copy == a and b_shallow_copy is not a and b_shallow_copy[2] is a[2]
# Deep copy.
import copy
b_deep_copy = copy.deepcopy(a)
assert b_deep_copy == a and b_deep_copy is not a and b_deep_copy[2] is not a[2]
Changes made through an alias affect the original variable. The elements of a (shallow) copy are aliases of the elements in the original list. Deep copying is recursive, so modifying a deep copy does not affect the original variable.
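A short sketch of what this means when the objects are mutated, using the same list as above:

```python
import copy

a = [1, 2, [3, 4]]
b_shallow_copy = a[:]
b_deep_copy = copy.deepcopy(a)

# Mutating the nested list through the shallow copy also changes a,
# because both outer lists refer to the same inner list.
b_shallow_copy[2].append(5)

# Rebinding a top-level slot of the shallow copy does not touch a.
b_shallow_copy[0] = 100

# The deep copy owns its own inner list, so a is unaffected.
b_deep_copy[2].append(6)

print(a)               # [1, 2, [3, 4, 5]]
print(b_shallow_copy)  # [100, 2, [3, 4, 5]]
print(b_deep_copy)     # [1, 2, [3, 4, 6]]
```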
1.4 == and is
x == y # Whether the two referenced objects have equal values
x is y # Whether the two references point to the same object
1.5 Checking types
type(a) == int # Ignores polymorphism in object-oriented design (subclasses do not match)
isinstance(a, int) # Takes polymorphism in object-oriented design into account (subclasses match)
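A small sketch of the difference. The classes Animal and Dog are made up for illustration:

```python
class Animal:
    pass


class Dog(Animal):
    pass


d = Dog()
exact_match = type(d) == Animal            # False: type(d) is Dog, not Animal
polymorphic_match = isinstance(d, Animal)  # True: Dog is a subclass of Animal

# The same applies to built-in types: bool is a subclass of int.
bool_exact = type(True) == int             # False
bool_polymorphic = isinstance(True, int)   # True
```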
1.6 String search
str.find(sub, start=None, end=None); str.rfind(...) # Returns -1 if not found
str.index(sub, start=None, end=None); str.rindex(...) # Raises ValueError if not found
1.7 Reverse indexing of lists
This is only a matter of habit. Forward indexing starts from 0; if you want reverse indexing to start from 0 as well, you can use ~.
print(a[-1], a[-2], a[-3])
print(a[~0], a[~1], a[~2])
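This works because ~i equals -i - 1 for integers, so ~0 is -1, ~1 is -2, and so on. One place where it reads naturally is pairing elements from both ends. A sketch with an illustrative list:

```python
a = ['p', 'q', 'r', 's']

# ~i == -i - 1, so a[i] and a[~i] are mirror positions.
mirror_pairs = [(a[i], a[~i]) for i in range(len(a) // 2)]
print(mirror_pairs)  # [('p', 's'), ('q', 'r')]


def is_palindrome(seq):
    """Compare each element with its mirror position."""
    return all(seq[i] == seq[~i] for i in range(len(seq) // 2))
```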
Many Python users came over from C/C++. The two languages differ in syntax, code style, and so on. This section gives a brief introduction.
2.1 Very large and very small numbers
The C/C++ habit is to define a very large number. Python has inf and -inf:
a = float('inf')
b = float('-inf')
2.2 Boolean values
The C/C++ habit is to use 0 and nonzero values to represent False and True. Python recommends using True and False directly to represent Boolean values.
a = True
b = False
2.3 Checking for None
The C/C++ habit for checking null pointers is if (a) and if (!a). In Python, the check for None is:
if x is None:
    pass
If you use if not x, other objects (for example a zero-length string, list, tuple, or dict) are also treated as False.
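A sketch that separates the two checks. The function name describe is made up for illustration:

```python
def describe(x):
    """Distinguish a missing value from a falsy one."""
    if x is None:
        return 'missing'
    if not x:
        return 'falsy'
    return 'truthy'


print(describe(None))  # missing
print(describe(''))    # falsy
print(describe([]))    # falsy
print(describe(0))     # falsy
print(describe([0]))   # truthy
```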
2.4 Swapping values
The C/C++ habit is to define a temporary variable and swap the values through it. With Python's tuple unpacking, it takes one step.
a, b = b, a
2.5 Comparisons
The C/C++ habit is to use two conditions. With Python's chained comparisons, it takes one step.
if 0 < a < 5:
    pass
2.6 Setters and getters for class members
The C/C++ custom is to make class members private and access their values through a series of Set and Get functions. In Python you can also define the corresponding functions with @property and its setter and deleter decorators, but unnecessary abstraction should be avoided: it is about 4 to 5 times slower than direct access.
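A minimal sketch of both styles. The classes Account and CheckedAccount are hypothetical. A property is worth its cost only when the access needs extra logic such as validation, and it can be introduced later without changing the caller's syntax:

```python
class Account:
    """Plain attribute: the usual choice when no extra logic is needed."""

    def __init__(self, balance):
        self.balance = balance


class CheckedAccount:
    """Property: same attribute syntax for the caller, plus validation."""

    def __init__(self, balance):
        self.balance = balance  # goes through the setter below

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError('balance must be non-negative')
        self._balance = value


plain = Account(10)
checked = CheckedAccount(10)
plain.balance = 5    # direct attribute write
checked.balance = 5  # same syntax, but the setter runs
```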
2.7 Input and output parameters of functions
The C/C++ habit is to list both the input and the output parameters as function parameters, change the output parameters through pointers, and return the execution status. The caller checks the return value to determine whether the call succeeded. In Python the caller does not need to check the return value: when a function runs into a special case, it raises an exception directly.
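A sketch of this style. The function parse_port and its range check are invented for illustration. The result is the return value, and failure is signalled by an exception rather than a status code:

```python
def parse_port(text):
    """Return the port number encoded in text, or raise ValueError."""
    port = int(text)  # int() itself raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError('port out of range: %d' % port)
    return port


try:
    port = parse_port('8080')
except ValueError as exc:
    port = None
    print('bad port:', exc)
```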
2.8 Reading files
Compared with C/C++, reading files in Python is much easier. The opened file is an iterable object that returns one line at a time.
with open(file_path, 'rt', encoding='utf-8') as f:
    for line in f:
        print(line) # The trailing \n is kept
2.9 Joining file paths
The C/C++ habit is to join paths directly with +, which is error-prone. Python's os.path.join automatically inserts the / or \ separator between path components according to the operating system:
import os
os.path.join('usr', 'lib', 'local')
2.10 Parsing command line options
Although Python can parse command line options directly from sys.argv, as in C/C++, the ArgumentParser tool in argparse is more convenient and more powerful.
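A minimal ArgumentParser sketch. The option names and the description are invented for illustration:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description='Hypothetical example tool.')
    parser.add_argument('input', help='path of the input file')
    parser.add_argument('-n', '--count', type=int, default=1,
                        help='number of repetitions')
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='print extra information')
    return parser


# Passing a list parses it instead of sys.argv[1:], which is handy for testing.
args = build_parser().parse_args(['data.txt', '-n', '3', '--verbose'])
print(args.input, args.count, args.verbose)  # data.txt 3 True
```

Calling parse_args() with no argument reads the real command line, and argparse generates the -h/--help output from the help strings.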
2.11 Calling external commands
Although Python can call external commands directly with os.system, as in C/C++, subprocess.check_output lets you choose whether to run the command through the shell and also gives you the output of the external command.
import subprocess
# If the external command returns a nonzero exit code, subprocess.CalledProcessError is raised
result = subprocess.check_output(['cmd', 'arg1', 'arg2']).decode('utf-8')
# Collect both standard output and standard error
result = subprocess.check_output(['cmd', 'arg1', 'arg2'], stderr=subprocess.STDOUT).decode('utf-8')
# Run a shell command (pipes, redirection, etc.); shlex.quote() can be used to quote arguments safely
result = subprocess.check_output('grep python | wc > out', shell=True).decode('utf-8')
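A runnable sketch of the success and failure paths. It uses the running interpreter (sys.executable) as the external command only so that the example does not depend on any particular program being installed:

```python
import subprocess
import sys

# A command that succeeds: its standard output is returned as bytes.
cmd = [sys.executable, '-c', 'print("hello")']
output = subprocess.check_output(cmd).decode('utf-8').strip()

# A command that exits with a nonzero status raises CalledProcessError,
# which carries the exit status in its returncode attribute.
try:
    subprocess.check_output([sys.executable, '-c', 'import sys; sys.exit(3)'])
    exit_code = 0
except subprocess.CalledProcessError as exc:
    exit_code = exc.returncode

print(output, exit_code)
```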
2.12 Don't reinvent the wheel
Don't reinvent the wheel. Python is described as "batteries included", which means that it ships with solutions to many common problems.
3.1 Reading and writing CSV files
import csv
# Reading and writing without a header
with open(name, 'rt', encoding='utf-8', newline='') as f: # newline='' stops Python from translating line endings
    for row in csv.reader(f):
        print(row[0], row[1]) # Everything read from a CSV file is of type str
with open(name, mode='wt', newline='') as f:
    f_csv = csv.writer(f)
    f_csv.writerow(['symbol', 'change'])
# Reading and writing with a header
with open(name, mode='rt', newline='') as f:
    for row in csv.DictReader(f):
        print(row['symbol'], row['change'])
with open(name, mode='wt', newline='') as f:
    header = ['symbol', 'change']
    f_csv = csv.DictWriter(f, header)
    f_csv.writeheader()
    f_csv.writerow({'symbol': xx, 'change': xx})
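A self-contained round trip, sketched with an in-memory buffer (io.StringIO) instead of a real file. The sample row is made up:

```python
import csv
import io

buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, ['symbol', 'change'])
writer.writeheader()
writer.writerow({'symbol': 'AA', 'change': -0.18})

# Rewind and read the same data back.
buffer.seek(0)
rows = list(csv.DictReader(buffer))

# Every value comes back as str, so numbers must be converted explicitly.
change = float(rows[0]['change'])
print(rows[0]['symbol'], change)
```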
Note that an error is raised when a field in the CSV file is too large: _csv.Error: field larger than field limit (131072). Fix it by raising the limit:
import sys
csv.field_size_limit(sys.maxsize)
csv can also read tab-separated data:
f = csv.reader(f, delimiter='\t')
3.2 Iterator tools
itertools defines many iterator tools, for example tools for subsequences:
import itertools
itertools.islice(iterable, start, stop[, step])
# islice('ABCDEF', 2, None) -> C, D, E, F
itertools.filterfalse(predicate, iterable) # Keeps the elements for which predicate is False
# filterfalse(lambda x: x < 5, [1, 4, 6, 4, 1]) -> 6
itertools.takewhile(predicate, iterable) # Stops iterating as soon as predicate is False
# takewhile(lambda x: x < 5, [1, 4, 6, 4, 1]) -> 1, 4
itertools.dropwhile(predicate, iterable) # Starts iterating as soon as predicate is False
# dropwhile(lambda x: x < 5, [1, 4, 6, 4, 1]) -> 6, 4, 1
itertools.compress(iterable, selectors) # Selects elements according to whether the matching item in selectors is True or False
# compress('ABCDEF', [1, 0, 1, 0, 1, 1]) -> A, C, E, F
Ordering sequences:
sorted(iterable, key=None, reverse=False)
itertools.groupby(iterable, key=None) # Groups consecutive equal values; iterable needs to be sorted first
# groupby(sorted([1, 4, 6, 4, 1])) -> (1, iter1), (4, iter4), (6, iter6)
itertools.permutations(iterable, r=None) # Permutations; each result is a tuple
# permutations('ABCD', 2) -> AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB, DC
itertools.combinations(iterable, r) # Combinations; each result is a tuple
itertools.combinations_with_replacement(...)
# combinations('ABCD', 2) -> AB, AC, AD, BC, BD, CD
Merging multiple sequences:
itertools.chain(*iterables) # Concatenates multiple sequences directly
# chain('ABC', 'DEF') -> A, B, C, D, E, F
import heapq
heapq.merge(*iterables, key=None, reverse=False) # Merges multiple sorted sequences into one sorted sequence
# merge('ABF', 'CDE') -> A, B, C, D, E, F
zip(*iterables) # Stops when the shortest sequence is exhausted; the result can only be consumed once
itertools.zip_longest(*iterables, fillvalue=None) # Stops when the longest sequence is exhausted; the result can only be consumed once
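Two of the points above are easy to get wrong: groupby only groups adjacent items, and a zip result is exhausted after one pass. A small sketch with made-up data:

```python
import heapq
import itertools

words = ['apple', 'bob', 'cat', 'alice', 'bee']


def initial(word):
    return word[0]


# Without sorting, equal keys that are not adjacent form separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(words, key=initial)]

# Sorting by the same key first gives one group per key.
by_initial = {
    key: list(group)
    for key, group in itertools.groupby(sorted(words, key=initial), key=initial)
}

# heapq.merge expects inputs that are already sorted.
merged = list(heapq.merge([1, 4, 7], [2, 3, 9]))

# A zip object is an iterator: the second pass yields nothing.
pairs = zip('ab', [1, 2])
first_pass = list(pairs)
second_pass = list(pairs)

print(unsorted_keys)  # ['a', 'b', 'c', 'a', 'b']
print(by_initial)     # {'a': ['apple', 'alice'], 'b': ['bob', 'bee'], 'c': ['cat']}
print(merged)         # [1, 2, 3, 4, 7, 9]
```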
3.3 Counter
A Counter counts the number of occurrences of each element in an iterable object.
import collections
# Create
counter = collections.Counter(iterable)
# Frequency
counter[key] # Number of occurrences of key
# Return the n most frequent elements and their counts; if n is None, return all elements
counter.most_common(n=None)
# Insert / update
counter.update(iterable)
counter1 + counter2; counter1 - counter2 # Addition and subtraction of counters
# Check whether two sequences are made up of the same elements
collections.Counter(list1) == collections.Counter(list2)
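A concrete run of these operations on illustrative strings:

```python
import collections

counter = collections.Counter('abracadabra')
a_count = counter['a']         # 5
missing = counter['z']         # 0: missing keys count as zero, no KeyError
top_two = counter.most_common(2)

# update adds counts instead of replacing them.
counter.update('aaa')
a_after_update = counter['a']  # 8

# Subtraction drops entries whose count falls to zero or below.
difference = collections.Counter('aab') - collections.Counter('ab')

# Two sequences with the same elements in a different order compare equal.
same_letters = collections.Counter('listen') == collections.Counter('silent')

print(top_two)     # [('a', 5), ('b', 2)]
print(difference)  # Counter({'a': 1})
```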
3.4 Dict with default values
When a key that does not exist is accessed, defaultdict sets it to a default value.
import collections
collections.defaultdict(type) # On the first access to dict[key], type is called with no arguments to provide an initial value for dict[key]
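Two common uses, sketched with made-up data: grouping with list as the factory and counting with int as the factory. Note that merely reading a missing key inserts it:

```python
import collections

# Grouping: a missing key starts as an empty list.
groups = collections.defaultdict(list)
for word in ['apple', 'bob', 'alice']:
    groups[word[0]].append(word)

# Counting: a missing key starts as int(), which is 0.
counts = collections.defaultdict(int)
for ch in 'hello':
    counts[ch] += 1

# Reading a missing key creates the entry as a side effect.
had_z_before = 'z' in counts
value_for_z = counts['z']
has_z_after = 'z' in counts

print(dict(groups))  # {'a': ['apple', 'alice'], 'b': ['bob']}
```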
3.5 Ordered dict
import collections
collections.OrderedDict(items=None) # Keep the original insertion order during iteration
4.1 Printing error and warning messages
Write a message to standard error:
import sys
sys.stderr.write('')
Emit a warning message:
import warnings
warnings.warn(message, category=UserWarning)
# category can be DeprecationWarning, SyntaxWarning, RuntimeWarning, ResourceWarning, FutureWarning
Control the output of warning messages:
$ python -W all # Show all warnings; equivalent to warnings.simplefilter('always')
$ python -W ignore # Ignore all warnings; equivalent to warnings.simplefilter('ignore')
$ python -W error # Turn all warnings into exceptions; equivalent to warnings.simplefilter('error')
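The same filters can be applied inside a program. A sketch in which the function old_api is hypothetical, and warnings.catch_warnings keeps the filter changes local to each block:

```python
import warnings


def old_api():
    warnings.warn('old_api is deprecated', DeprecationWarning)
    return 42


# 'always': record every warning instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    result = old_api()

# 'error': the warning is raised as an exception.
with warnings.catch_warnings():
    warnings.simplefilter('error')
    try:
        old_api()
        raised = False
    except DeprecationWarning:
        raised = True

print(result, len(caught), raised)
```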
4.2 Debug code inside the program
Sometimes, for debugging, we want to add some extra code, usually a few print statements. It can be written as:
# Debug-only part of the code
if __debug__:
    pass
Once debugging is complete, running with the -O option on the command line skips this part of the code:
$ python -O main.py
4.3 Checking code style
pylint performs many code style and syntax checks and can find some errors before the program runs:
pylint main.py
4.4 Timing code
Profile a whole program:
$ python -m cProfile main.py
Time a block of code:
# Definition of the block timer
from contextlib import contextmanager
from time import perf_counter
@contextmanager
def timeblock(label):
    tic = perf_counter()
    try:
        yield
    finally:
        toc = perf_counter()
        print('%s : %s' % (label, toc - tic))
# Timing a block of code
with timeblock('counting'):
    pass
Some principles for optimizing running time:
Focus on optimizing the places where performance bottlenecks occur, not all of the code.
Avoid using global variables. Local variable lookup is faster than global variable lookup; moving code from global scope into a function usually makes it 15%-30% faster.
Avoid accessing attributes with the dot operator. Using from module import name is faster, and frequently accessed class member variables such as self.member can be put into a local variable.
Use built-in data structures where possible. str, list, set, dict, and so on are implemented in C and run very fast.
Avoid creating unnecessary intermediate variables and calling copy.deepcopy().
String concatenation such as a + ':' + b + ':' + c creates many useless intermediate variables; ':'.join([a, b, c]) is much more efficient. Also consider whether the concatenation is necessary at all: print(':'.join([a, b, c])) is less efficient than print(a, b, c, sep=':').
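A sketch of how one of these principles might be measured with timeit. The two functions are made up for illustration, and the timings depend on the machine and the Python version, so no specific speedup is claimed here:

```python
import math
import timeit


def with_attribute_lookup(values):
    # math.sqrt is looked up through the module on every iteration.
    return [math.sqrt(v) for v in values]


def with_local_name(values):
    sqrt = math.sqrt  # look the attribute up once and keep it in a local
    return [sqrt(v) for v in values]


values = list(range(1000))
same_result = with_attribute_lookup(values) == with_local_name(values)

t_attribute = timeit.timeit(lambda: with_attribute_lookup(values), number=200)
t_local = timeit.timeit(lambda: with_local_name(values), number=200)
print('attribute lookup: %.4fs, local name: %.4fs' % (t_attribute, t_local))
```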
5.1 argmin and argmax
items = [2, 1, 3, 4]
argmin = min(range(len(items)), key=items.__getitem__)
argmax works the same way.
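For completeness, a sketch of both on the same list. When values tie, min and max return the first matching index:

```python
items = [2, 1, 3, 4]

# Search over the indices, comparing them by the value stored at each index.
argmin = min(range(len(items)), key=items.__getitem__)
argmax = max(range(len(items)), key=items.__getitem__)
print(argmin, argmax)  # 1 3

# With ties, the first index wins.
tied = [5, 1, 1, 5]
first_min = min(range(len(tied)), key=tied.__getitem__)  # 1
first_max = max(range(len(tied)), key=tied.__getitem__)  # 0
```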
5.2 Transposing a 2D list
A = [['a11', 'a12'], ['a21', 'a22'], ['a31', 'a32']]
A_transpose = list(zip(*A)) # list of tuple
A_transpose = list(list(col) for col in zip(*A)) # list of list
5.3 Reshaping a one-dimensional list into a two-dimensional list
A = [1, 2, 3, 4, 5, 6]
# Preferred.
list(zip(*[iter(A)] * 2))
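The expression works because [iter(A)] * 2 is a list holding the same iterator twice, so zip draws consecutive elements from it in turn. A sketch that spells this out, with a helper name chunk that is illustrative only:

```python
A = [1, 2, 3, 4, 5, 6]

# Equivalent, step by step: one iterator, passed to zip twice.
it = iter(A)
pairs = list(zip(it, it))
print(pairs)  # [(1, 2), (3, 4), (5, 6)]


def chunk(seq, n):
    """Split seq into lists of length n; leftover elements are dropped."""
    return [list(group) for group in zip(*[iter(seq)] * n)]


print(chunk(A, 3))                # [[1, 2, 3], [4, 5, 6]]
print(chunk([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4]]
```

Because zip stops at the shortest input, a trailing incomplete group is silently discarded.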
Author: Zhang Hao https://zhuanlan.zhihu.com/p/48293468