When looping over a sequence of tasks in Python, it is common to load all of the parameters to be traversed into memory first. This is often unnecessary: the parameters are likely to be used only once, and in many scenarios they do not need to be in memory at the same time. In such cases we can use the generator keyword yield introduced in this article.
Let's start with an example that illustrates the basic use of yield. The goal is to write a function that generates a sequence of squares. Ordinarily we would create an empty list, append the result of each calculation to it, and return the list at the end; this corresponds to the function square_number here. The other function, square_number_yield, is built with yield to demonstrate it. yield is written the same way as return; the difference is that only one value is produced at a time:
def square_number(length):
    s = []
    for i in range(length):
        s.append(i ** 2)
    return s

def square_number_yield(length):
    for i in range(length):
        yield i ** 2

if __name__ == '__main__':
    length = 10
    sn1 = square_number(length)
    sn2 = square_number_yield(length)
    for i in range(length):
        print(sn1[i], '\t', end='')
        print(next(sn2))
In the main block we compare the results of the two approaches by printing them on the same line. Passing end='' to print suppresses the newline that print would otherwise append. The results are as follows:
$ python3 test_yield.py
0 0
1 1
4 4
9 9
16 16
25 25
36 36
49 49
64 64
81 81
As you can see, the two approaches produce the same results. In some scenarios the values returned by the function do need to be kept around, and this can also be done with yield, as in the following example:
def square_number(length):
    s = []
    for i in range(length):
        s.append(i ** 2)
    return s

def square_number_yield(length):
    for i in range(length):
        yield i ** 2

if __name__ == '__main__':
    length = 10
    sn1 = square_number(length)
    sn2 = square_number_yield(length)
    sn3 = list(square_number_yield(length))
    for i in range(length):
        print(sn1[i], '\t', end='')
        print(next(sn2), '\t', end='')
        print(sn3[i])
Here the generator object produced by yield is converted directly into a list. Writing sn3 = [i for i in square_number_yield(length)] also works, and there should be little difference in performance. The output of the code above is as follows:
$ python3 test_yield.py
0 0 0
1 1 1
4 4 4
9 9 9
16 16 16
25 25 25
36 36 36
49 49 49
64 64 64
81 81 81
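One practical reason to store the values in a list is that a generator can only be traversed once. The following is a minimal sketch of that behaviour, reusing square_number_yield from above; the variable names (first_pass, second_pass, sentinel, stored) are only illustrative.

```python
def square_number_yield(length):
    for i in range(length):
        yield i ** 2


gen = square_number_yield(3)
first_pass = list(gen)       # consumes every value: [0, 1, 4]
second_pass = list(gen)      # the generator is already exhausted: []

# next() on an exhausted generator raises StopIteration unless a default is given.
sentinel = next(gen, None)

# Storing the values once lets them be reused or indexed as often as needed.
stored = list(square_number_yield(3))
print(first_pass, second_pass, sentinel, stored[2])
```

If the values are needed more than once, either keep the list or call the generator function again to get a fresh generator.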
As mentioned at the beginning, yield can reduce a program's memory usage. Here we test this by computing the sum of squares of 100000 random numbers. With the ordinary approach the program looks like this (for how to track Python memory usage, you can refer to this blog post):
import tracemalloc
import time
import numpy as np

tracemalloc.start()

start_time = time.time()

ss_list = np.random.randn(100000)
s = 0
for ss in ss_list:
    s += ss ** 2
end_time = time.time()
print('Time cost is: {}s'.format(end_time - start_time))
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
for stat in top_stats[:5]:
    print(stat)
This program uses time to measure the execution time and tracemalloc to track memory usage. It first calls np.random.randn() to produce an array of 100000 random numbers in one go. These numbers have to be kept in memory for the whole calculation, so they occupy a corresponding amount of memory. With yield, only one random number is generated at a time, and, as shown earlier, the values produced this way can still be converted into a complete list if needed. Note that the version below draws its numbers with np.random.random() (uniform on [0, 1)) rather than np.random.randn() (standard normal); this changes the value of the sum but not the memory comparison:
import tracemalloc
import time
import numpy as np

tracemalloc.start()

start_time = time.time()

def ss_list(length):
    for i in range(length):
        yield np.random.random()

s = 0
ss = ss_list(100000)
for i in range(100000):
    s += next(ss) ** 2
end_time = time.time()
print('Time cost is: {}s'.format(end_time - start_time))
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
for stat in top_stats[:5]:
    print(stat)
The results of the two examples are shown together below for comparison:
$ python3 square_sum.py
Time cost is: 0.24723434448242188s
square_sum.py:9: size=781 KiB, count=2, average=391 KiB
square_sum.py:12: size=24 B, count=1, average=24 B
square_sum.py:11: size=24 B, count=1, average=24 B
$ python3 yield_square_sum.py
Time cost is: 0.23023390769958496s
yield_square_sum.py:9: size=136 B, count=1, average=136 B
yield_square_sum.py:14: size=112 B, count=1, average=112 B
yield_square_sum.py:11: size=79 B, count=2, average=40 B
yield_square_sum.py:10: size=76 B, count=2, average=38 B
yield_square_sum.py:15: size=28 B, count=1, average=28 B
The comparison shows that the two approaches take almost the same time, but yield has a clear advantage in memory usage. This example may not be the most representative one, but the main aim of this article is to introduce how yield is used and where it applies.
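The same comparison can be sketched in a single script. This is an illustrative variant, not the measurement above: it assumes the standard-library random module in place of NumPy so that it runs without extra dependencies, and it reports the peak traced memory from tracemalloc.get_traced_memory() instead of a snapshot taken at the end. The helper names random_values and peak_bytes are made up for this sketch.

```python
import random
import tracemalloc


def random_values(length):
    """Yield one random float at a time (standard-library stand-in for NumPy)."""
    for _ in range(length):
        yield random.random()


def peak_bytes(compute):
    """Run compute() and return (result, peak traced memory in bytes)."""
    tracemalloc.start()
    result = compute()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, peak


n = 100000
# Materialise every value in a list first, then sum the squares.
_, peak_list = peak_bytes(
    lambda: sum(x ** 2 for x in [random.random() for _ in range(n)]))
# Stream the values: only one float needs to be alive at a time.
_, peak_gen = peak_bytes(
    lambda: sum(x ** 2 for x in random_values(n)))

print('list peak:     ', peak_list, 'bytes')
print('generator peak:', peak_gen, 'bytes')
```

The exact numbers depend on the Python version and platform, but the list version has to hold all 100000 floats at once, while the generator version does not.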
One use mentioned in reference 1 is the infinite iterator, for example one that returns all prime numbers in order. Returning all of the elements with return and storing them in a list would be very uneconomical (and, for an infinite sequence, impossible), so yield can be used to generate them one at a time. The source code from reference 1 is as follows:
def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1
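The excerpt above does not include is_prime, so it cannot be run as it stands. The sketch below fills that gap with a hypothetical trial-division helper (an assumption for illustration, not necessarily what reference 1 uses) and takes the first ten primes with itertools.islice from the standard library.

```python
from itertools import islice


def is_prime(number):
    """Hypothetical helper: trial division up to the square root."""
    if number < 2:
        return False
    divisor = 2
    while divisor * divisor <= number:
        if number % divisor == 0:
            return False
        divisor += 1
    return True


def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1


# islice stops after ten values, so the infinite loop never runs away.
first_ten = list(islice(get_primes(2), 10))
print(first_ten)
```

Because the generator is infinite, something on the caller's side (here islice) has to decide when to stop asking for values.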
Similarly, while True can be used for an even simpler case, returning all even numbers:
def yield_range2(i):
    while True:
        yield i
        i += 2

# Named evens rather than iter, to avoid shadowing the built-in iter()
evens = yield_range2(0)
for i in range(10):
    print(next(evens))
Because we limit the number of iterations to 10 here, the program prints the first 10 even numbers:
$ python3 yield_iter.py
0
2
4
6
8
10
12
14
16
18
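For simple arithmetic sequences like this one, the standard library already offers an infinite counter, itertools.count. The sketch below assumes a "values below 20" stopping condition, chosen only to reproduce the ten numbers above, and checks that itertools.count gives the same result as the hand-written generator.

```python
from itertools import count, takewhile


def yield_range2(i):
    while True:
        yield i
        i += 2


# itertools.count(start, step) is a built-in infinite counter.
# takewhile stops at the first value that fails the condition.
evens_below_20 = list(takewhile(lambda n: n < 20, count(0, 2)))

# The hand-written generator produces the same sequence.
hand_written = list(takewhile(lambda n: n < 20, yield_range2(0)))
print(evens_below_20)
print(evens_below_20 == hand_written)
```

A hand-written generator is still the natural choice when the next value depends on more involved logic, as in the prime number example.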
This article introduced Python's generator keyword yield. A simple way to think of yield is as a return that hands back a single element at a time. This gives a first understanding of the syntax of yield, and also a general idea of its advantage: during the calculation only one element is held in memory at a time, instead of a large number of elements being kept in memory throughout.
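The "single-element return" picture can be made concrete by recording when the body of a generator function actually runs. In this minimal sketch, the function traced_squares and the events list are illustrative names that do not appear in the examples above.

```python
def traced_squares(length, events):
    """Yield squares while recording when the function body actually runs."""
    events.append('body started')
    for i in range(length):
        events.append(f'about to yield {i ** 2}')
        yield i ** 2
    events.append('body finished')


events = []
gen = traced_squares(2, events)
events.append('generator created')   # nothing in the body has run yet

first = next(gen)                     # runs the body up to the first yield
events.append(f'caller received {first}')
second = next(gen)                    # resumes right after the first yield
events.append(f'caller received {second}')

for line in events:
    print(line)
```

Calling the function only creates the generator object. Each next() call then runs the body until the following yield and suspends it there, which is why only one value needs to exist at a time.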