In a previous article we introduced the iterator pattern. The iterator pattern is a very common behavioral design pattern; most object-oriented programming languages ship an implementation of it along with supporting utility classes. Iterators are used to retrieve the items of a container in a required order. That article used simple, concrete examples to illustrate Python iterators and the usage of the yield keyword.

Python's yield keyword and generators are the subject of this article.
Iterators are what power Python's iteration constructs. Whenever the interpreter encounters an iteration context (a for loop, a comprehension, tuple unpacking, and so on), it automatically calls the built-in iter() with the object as its argument. iter() proceeds roughly as follows:

1. Check whether the object implements __iter__; if so, call it and use the iterator it returns.
2. If __iter__ is missing but __getitem__ is implemented, Python builds an iterator that fetches elements by index, starting from 0.
3. If that also fails, raise TypeError, complaining that the object is not iterable.
The second step above exists for backward compatibility with older code and may well be removed in the future. Meanwhile, as long as an object obj implements __iter__, isinstance(obj, abc.Iterable) returns True, because the abc.Iterable class implements the __subclasshook__ magic method we introduced earlier.
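A minimal sketch of both behaviors (the class name Squares is our own, chosen for illustration): an object that implements only __getitem__ can still be iterated via the fallback path, yet fails the abc.Iterable check because it lacks __iter__.

```python
from collections.abc import Iterable, Iterator

class Squares:
    """Implements only __getitem__ -- no __iter__."""
    def __getitem__(self, index):
        if index >= 5:
            raise IndexError(index)
        return index * index

s = Squares()
it = iter(s)                     # falls back to the __getitem__ protocol
print(isinstance(it, Iterator))  # True: the fallback object is a real iterator
print(list(s))                   # [0, 1, 4, 9, 16]
print(isinstance(s, Iterable))   # False: the ABC check only looks for __iter__
```

Note the asymmetry: the object iterates fine in practice, but the Iterable ABC does not recognise it, which is exactly why relying on isinstance checks alone can be misleading here.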
Why do we need iterators at all? Given an iterable object, we want to take out its elements one by one, but whether we are iterating over a sequence, a dictionary, a tuple, or even a tree or a graph, we do not care how the data structure is implemented internally; we only care about the elements we get out of it. Moreover, a particular structure may support several iteration orders: a binary tree, for example, can be traversed pre-order, in-order, or post-order. How, then, do we keep all this complexity out of the way during sequential iteration? Wrapping it in a uniform object that exposes a small, abstract iteration interface is a very elegant solution, and that is exactly what the iterator pattern does. The iterator operates on the iterated object while presenting a uniform usage pattern to the upper-level client, hiding the details of the concrete iteration.
If you want your object to be iterable, and you cannot guarantee that the keys accepted by your __getitem__ implementation allow elements to be fetched sequentially starting from 0, then you must implement __iter__ and have it return an abc.Iterator object. The iterator you return does not need to inherit from abc.Iterator explicitly: as long as it implements both __iter__ and __next__, the __subclasshook__ method of the abc.Iterator class makes isinstance(obj, abc.Iterator) return True for any object implementing those two methods.
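This duck-typed recognition can be demonstrated directly (the CountDown class below is our own illustrative example, not from the original article): it never inherits from abc.Iterator, yet the isinstance check passes.

```python
from collections.abc import Iterator

class CountDown:
    """An iterator that does NOT inherit from abc.Iterator."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

c = CountDown(3)
# __subclasshook__ recognises it purely from the two methods:
print(isinstance(c, Iterator))  # True
print(list(c))                  # [3, 2, 1]
```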
__iter__: creates and returns an iterator. An iterable object typically uses it to build and return an iterator object; an iterator object uses it to return a reference to itself.

__next__: returns the next element of the iteration. When the iteration is exhausted it must raise StopIteration; this exception is the only way iteration completion can be perceived in Python's iterator design. In loops, generators, comprehensions, and other iteration contexts, the interpreter catches and handles this exception automatically.
import re
import reprlib

RE_WORD = re.compile(r'\w+')

class Sentence:
    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)

    def __iter__(self):
        return SentenceIterator(self.words)

class SentenceIterator:
    def __init__(self, words):
        self.words = words
        self.index = 0

    def __next__(self):
        try:
            word = self.words[self.index]
        except IndexError:
            raise StopIteration()
        self.index += 1
        return word

    def __iter__(self):
        return self
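The automatic StopIteration handling described above is easy to see by hand-desugaring a for loop. A rough sketch of what `for w in words` expands to (variable names are ours):

```python
words = ['hello', 'world']

# What `for w in words: collected.append(w)` roughly expands to:
it = iter(words)           # calls words.__iter__()
collected = []
while True:
    try:
        w = next(it)       # calls it.__next__()
    except StopIteration:  # the loop ends silently when this is raised
        break
    collected.append(w)

print(collected)  # ['hello', 'world']
```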
Implementing both __iter__ and __next__ in a dedicated iterator class like this is not idiomatic Python; it is too cumbersome. Instead, using a generator function is simpler and easier to understand.
import re
import reprlib

RE_WORD = re.compile(r'\w+')

class Sentence:
    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)

    def __iter__(self):
        for match in RE_WORD.finditer(self.text):
            yield match.group()
As long as a function definition contains the yield keyword, that function is a generator function: calling it returns a generator object. In other words, a generator function is a generator factory.
>>> def gen123():
... print('start')
... yield 1
... print('continue')
... yield 2
... print('end')
... yield 3
...
>>> g = gen123()
>>> next(g)
start
1
>>> next(g)
continue
2
>>> next(g)
end
3
>>> next(g)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>>
>>> res = [x*3 for x in gen123()]
start
continue
end
>>> res
[3, 6, 9]
As we can see, the generator function returns a generator object, and this generator object behaves exactly like an iterator.
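That equivalence can be checked directly: a generator object satisfies the iterator protocol, its __iter__ returns itself, and like any iterator it is exhausted after one pass. A small sketch:

```python
from collections.abc import Iterator

def gen123():
    yield 1
    yield 2
    yield 3

g = gen123()
print(isinstance(g, Iterator))  # True: generators implement the iterator protocol
print(iter(g) is g)             # True: __iter__ returns the generator itself
print(list(g))                  # [1, 2, 3]
print(list(g))                  # []: exhausted after a single pass
```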
Since a generator function is a function, can it return a value? Before Python 3.3 it could not, but starting with Python 3.3 Python introduced the concept of coroutines, and when a generator function is used as a coroutine its return value becomes meaningful. Even so, the return statement still causes StopIteration to be raised, but the exception carries the returned value with it. We will devote a separate article to coroutines; it is coming soon.
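The returned value riding along on StopIteration can be observed directly (the function name gen_with_return is our own):

```python
def gen_with_return():
    yield 1
    yield 2
    return 'finished'   # allowed since Python 3.3 (PEP 380)

g = gen_with_return()
print(next(g))  # 1
print(next(g))  # 2
try:
    next(g)
except StopIteration as exc:
    # the return value is attached to the exception
    print(exc.value)  # finished
```

Note that for loops and comprehensions swallow StopIteration, so the return value is invisible to them; only yield from (covered below) or explicit exception handling can retrieve it.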
For the Sentence class above, there is yet another way to implement iteration: a generator expression. Sometimes a generator expression is the more convenient choice.
import re
import reprlib

RE_WORD = re.compile(r'\w+')

class Sentence:
    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)

    def __iter__(self):
        return (match.group() for match in RE_WORD.finditer(self.text))
A generator expression is a piece of Python syntactic sugar: it is essentially the same as a generator function, and in form it closely resembles a list comprehension. But generator expressions differ fundamentally from list comprehensions. A list comprehension creates all of its elements at once, so if the list has many elements, memory usage grows accordingly. The generator object produced by a generator function or a generator expression, by contrast, records the program's execution context and produces only one element per next call, thereby saving memory. When dealing with large amounts of data, iterators, generator expressions, and generator functions are an excellent solution.
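The memory difference is easy to measure with sys.getsizeof: the list comprehension's size grows with the number of elements, while the generator expression stays a small, constant-size object (exact byte counts vary by Python version, so we only compare them).

```python
import sys

list_comp = [x * x for x in range(1_000_000)]  # materialises every element now
gen_expr = (x * x for x in range(1_000_000))   # stores only execution state

print(sys.getsizeof(list_comp) > sys.getsizeof(gen_expr))  # True
print(sum(gen_expr) == sum(list_comp))                     # True: same values produced
```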
Sometimes we need our generator function to produce the values of another generator or iterator.
>>> def geniter(iterable):
... for i in iterable:
... yield i
...
>>> g = geniter(range(3))
>>> next(g)
0
>>> next(g)
1
>>> next(g)
2
>>> next(g)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
In the example above, our generator function produces its data by looping over another generator. The yield from expression lets us eliminate that loop from our code.
>>> def geniter(iterable):
... yield from iterable
...
>>> g = geniter(range(3))
>>> next(g)
0
>>> next(g)
1
>>> next(g)
2
>>> next(g)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
yield from is not merely syntactic sugar; it is closely tied to Python's coroutines. For more on this, watch for the upcoming article on Python coroutines.
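One glimpse of that deeper behavior, specified in PEP 380: a yield from expression evaluates to the subgenerator's return value, which a plain loop cannot retrieve. A minimal sketch (function names inner/outer are our own):

```python
def inner():
    yield 1
    yield 2
    return 'inner done'   # becomes the value of the yield from expression

def outer():
    result = yield from inner()  # drives inner() to exhaustion
    yield result                 # inner's StopIteration never reaches the caller

print(list(outer()))  # [1, 2, 'inner done']
```

A `for i in inner(): yield i` loop would silently discard 'inner done'; yield from also transparently forwards send() and throw() calls to the subgenerator, which is what makes it more than sugar.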