Python introductory development study notes: mastering process pools and thread pools

Editor: Python

Key points of this section

Master the usage of process pools and thread pools
Master callback functions
This section should take no more than 30 minutes

One Process pool and thread pool

When we first learn multiprocessing or multithreading, we are eager to implement concurrent socket communication by starting one process or thread per client. The fatal flaw of this approach is that the number of processes or threads the server opens grows with the number of concurrent clients. This puts great pressure on the server host and can even overwhelm and paralyze it. We must therefore limit the number of processes or threads the server opens, so that the machine runs within a load it can bear. This is the purpose of a process pool or thread pool. A process pool, for example, is a pool that holds processes: it is still based on multiple processes, it just caps how many are open at once.
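To make the idea of a capped pool concrete, here is a minimal sketch using a thread pool. The helper names record_worker and distinct_workers are hypothetical and exist only for this illustration. It runs 20 short tasks on a pool limited to 3 workers and collects the names of the threads that actually did the work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def record_worker(_):
    """Simulate a short task and report which worker thread ran it."""
    time.sleep(0.01)
    return threading.current_thread().name


def distinct_workers(num_tasks, max_workers):
    """Run num_tasks tasks on a bounded pool; return the set of worker names used."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return set(pool.map(record_worker, range(num_tasks)))


if __name__ == "__main__":
    workers = distinct_workers(num_tasks=20, max_workers=3)
    # The pool never creates more than max_workers threads,
    # no matter how many tasks are queued.
    print(len(workers), "worker threads handled 20 tasks")
```

However many tasks are submitted, the set never holds more than max_workers names, which is exactly the limit the pool is meant to enforce.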

Introduction

Official documentation: https://docs.python.org/dev/library/concurrent.futures.html
The concurrent.futures module provides a high-level interface for asynchronous calls
ThreadPoolExecutor: a thread pool that provides asynchronous calls
ProcessPoolExecutor: a process pool that provides asynchronous calls
Both implement the same interface, which is defined by the abstract Executor class.

Basic methods

1、submit(fn, *args, **kwargs)
Submit a task asynchronously; returns a Future object
2、map(func, *iterables, timeout=None, chunksize=1)
Replaces a for loop of submit calls
3、shutdown(wait=True)
With wait=True, roughly equivalent to pool.close()+pool.join() on a multiprocessing.Pool
wait=True: wait until all tasks in the pool have finished and resources have been reclaimed before continuing
wait=False: return immediately without waiting for the tasks in the pool to finish
Whatever the value of wait, the program as a whole does not exit until all pending tasks have completed
submit and map must be called before shutdown
4、result(timeout=None)
Method of the Future returned by submit: get the task's result, blocking until it is ready
5、add_done_callback(fn)
Method of the Future returned by submit: register a callback function to run when the task finishes
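The following minimal sketch exercises several of the methods listed above on a thread pool. The names slow_square and demo are hypothetical, and the delay and timeout values are arbitrary choices for illustration. It shows that submit returns at once, that result(timeout=...) raises a timeout error if the task is not finished in time, and that submit is rejected after shutdown.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError


def slow_square(n, delay=0.5):
    """A deliberately slow task so that a short timeout can expire."""
    time.sleep(delay)
    return n * n


def demo():
    pool = ThreadPoolExecutor(max_workers=2)
    future = pool.submit(slow_square, 4)    # returns immediately with a Future

    try:
        future.result(timeout=0.01)         # far shorter than the task's delay
        timed_out = False
    except FutureTimeoutError:
        timed_out = True

    value = future.result()                 # no timeout: block until the task is done
    pool.shutdown(wait=True)                # wait for workers and release resources

    try:
        pool.submit(slow_square, 5)         # submitting after shutdown is an error
        rejected = False
    except RuntimeError:
        rejected = True

    return timed_out, value, rejected


if __name__ == "__main__":
    print(demo())
```

A timeout in result only stops the waiting; the task itself keeps running, which is why the second result call still returns the value.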

Two Process pool

Introduction

The ProcessPoolExecutor class is an Executor subclass that uses a pool of processes to execute calls asynchronously. ProcessPoolExecutor uses the multiprocessing module, which allows it to side-step the Global Interpreter Lock but also means that only picklable objects can be executed and returned.
class concurrent.futures.ProcessPoolExecutor(max_workers=None, mp_context=None)
An Executor subclass that executes calls asynchronously using a pool of at most max_workers processes. If max_workers is None or not given, it will default to the number of processors on the machine. If max_workers is lower or equal to 0, then a ValueError will be raised.
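Two constraints from the description above can be checked without starting any worker process. This is only a sketch: is_picklable and rejects_non_positive_workers are hypothetical helper names, and the pickle check is an approximation of what the pool needs, not the pool's own validation.

```python
import pickle
from concurrent.futures import ProcessPoolExecutor


def is_picklable(obj):
    """Return True if obj can be serialized with pickle, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def rejects_non_positive_workers():
    """Return True if the constructor refuses max_workers=0 with ValueError."""
    try:
        ProcessPoolExecutor(max_workers=0)
    except ValueError:
        return True
    return False


if __name__ == "__main__":
    print(is_picklable(len))             # built-in functions pickle by reference
    print(is_picklable(lambda x: x))     # lambdas cannot be pickled
    print(rejects_non_positive_workers())
```

In practice this means the functions submitted to a process pool should be defined at module level, and their arguments and return values should be plain data.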

Usage

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import os, time, random

def task(n):
    print('%s is running' % os.getpid())
    time.sleep(random.randint(1, 3))
    return n ** 2

if __name__ == '__main__':
    executor = ProcessPoolExecutor(max_workers=3)

    futures = []
    for i in range(11):
        future = executor.submit(task, i)  # submit asynchronously, get a Future back
        futures.append(future)
    executor.shutdown(True)  # wait for all 11 tasks to finish
    print('+++>')
    for future in futures:
        print(future.result())  # tasks are already done, so this returns at once

Three Thread pool

Introduction

ThreadPoolExecutor is an Executor subclass that uses a pool of threads to execute calls asynchronously.
class concurrent.futures.ThreadPoolExecutor(max_workers=None, thread_name_prefix='')
An Executor subclass that uses a pool of at most max_workers threads to execute calls asynchronously.
Changed in version 3.5: If max_workers is None or not given, it will default to the number of processors on the machine, multiplied by 5, assuming that ThreadPoolExecutor is often used to overlap I/O instead of CPU work and the number of workers should be higher than the number of workers for ProcessPoolExecutor.
Changed in version 3.8: The default value of max_workers is changed to min(32, os.cpu_count() + 4).
New in version 3.6: The thread_name_prefix argument was added to allow users to control the threading.Thread names for worker threads created by the pool for easier debugging.

Usage

Replace ProcessPoolExecutor with ThreadPoolExecutor; everything else is used in the same way.
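As a minimal sketch of that swap, the process pool example can be rewritten as follows. The random sleep is dropped to keep it short, and run_in_threads and MAIN_PID are hypothetical names introduced for this illustration. Each task also reports the process id and thread name it ran under.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor

MAIN_PID = os.getpid()  # process id of the program that owns the pool


def task(n):
    """Return n squared together with the pid and thread name that ran the task."""
    return n ** 2, os.getpid(), threading.current_thread().name


def run_in_threads(count=11, max_workers=3):
    executor = ThreadPoolExecutor(max_workers=max_workers)
    futures = [executor.submit(task, i) for i in range(count)]
    executor.shutdown(wait=True)            # same call as with the process pool
    results = [f.result() for f in futures]
    squares = [r[0] for r in results]
    pids = {r[1] for r in results}          # every thread shares one process
    return squares, pids


if __name__ == "__main__":
    squares, pids = run_in_threads()
    print(squares)
    print(pids == {MAIN_PID})
```

Unlike the process pool version, every task reports the same process id, because the workers are threads inside the main process.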

Four The map method

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import os, time, random

def task(n):
    print('%s is running' % os.getpid())
    time.sleep(random.randint(1, 3))
    return n ** 2

if __name__ == '__main__':
    executor = ThreadPoolExecutor(max_workers=3)

    # for i in range(11):
    #     future = executor.submit(task, i)

    executor.map(task, range(1, 12))  # map replaces the for + submit loop
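The example above discards what map returns. A minimal sketch of collecting the results, next to the equivalent submit loop, might look like this. The names square, squares_via_map, and squares_via_submit are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    return n ** 2


def squares_via_map(numbers, max_workers=3):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map returns an iterator that yields results in input order,
        # regardless of which task finishes first.
        return list(executor.map(square, numbers))


def squares_via_submit(numbers, max_workers=3):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(square, n) for n in numbers]
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(squares_via_map(range(1, 12)))
    print(squares_via_submit(range(1, 12)))
```

Both functions return the same list. The with statement calls shutdown(wait=True) on exit, so no explicit shutdown is needed.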

Five Callback function

You can bind a function to each task submitted to a process pool or thread pool. The function is triggered automatically once the task completes and receives the task's Future object as its argument; the task's return value is obtained from it by calling result(). Such a function is called a callback function.

from concurrent.futures import ProcessPoolExecutor
import requests
import os

def get_page(url):
    print('<process %s> get %s' % (os.getpid(), url))
    response = requests.get(url)
    if response.status_code == 200:
        return {'url': url, 'text': response.text}
    # any other status code falls through and returns None

def parse_page(future):
    res = future.result()  # the callback receives a Future, not the return value
    if res is None:  # get_page returned nothing for a non-200 response
        return
    print('<process %s> parse %s' % (os.getpid(), res['url']))
    parse_res = 'url:<%s> size:[%s]\n' % (res['url'], len(res['text']))
    with open('db.txt', 'a') as f:
        f.write(parse_res)

if __name__ == '__main__':
    urls = [
        'https://www.baidu.com',
        'https://www.python.org',
        'https://www.openstack.org',
        'https://help.github.com/',
        'http://www.sina.com.cn/'
    ]

    p = ProcessPoolExecutor(3)
    for url in urls:
        # parse_page receives a Future object and must call result() on it
        p.submit(get_page, url).add_done_callback(parse_page)
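The same callback pattern can be tried without network access. The following is only a sketch under stated assumptions: fetch is a hypothetical stand-in that fabricates page text instead of downloading it, a thread pool replaces the process pool, and results are collected in a list rather than written to db.txt.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    """Hypothetical stand-in for a download: returns fake text, no network used."""
    return {"url": url, "text": "x" * len(url)}


def make_collector():
    """Return a list and a callback that appends one summary line per finished task."""
    lines = []
    lock = threading.Lock()

    def on_done(future):
        res = future.result()               # the callback is handed the Future
        with lock:                          # callbacks may run in different threads
            lines.append("url:<%s> size:[%s]" % (res["url"], len(res["text"])))

    return lines, on_done


def crawl(urls, max_workers=3):
    lines, on_done = make_collector()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for url in urls:
            pool.submit(fetch, url).add_done_callback(on_done)
    # Leaving the with block waits for all tasks, and their callbacks, to finish.
    return sorted(lines)


if __name__ == "__main__":
    for line in crawl(["https://a.example", "https://bb.example"]):
        print(line)
```

The lock is there because, with a thread pool, callbacks run in whichever worker thread finished the task, so several may append at the same time.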