I have recently been writing an automation script that fetches mobile device specifications in batches from an e-commerce website. The script is built on Python + requests, but its runtime efficiency has not been particularly satisfactory. I came across HTTPX by chance, and both its features and its performance impressed me.
This article walks through the basic usage and the advanced features of HTTPX.
HTTPX is a fully featured HTTP client for Python 3. It provides both synchronous and asynchronous APIs, and supports HTTP/1.1 and HTTP/2.
The official project pages summarize its features:
GitHub: https://github.com/encode/httpx
Documentation: https://www.python-httpx.org/
HTTPX is easy to install: a single pip command is enough.
The installation command is as follows:
pip install httpx
HTTPX also provides a command-line client, which requires installing httpx[cli]:
pip install 'httpx[cli]'
Example usage:
httpx http://httpbin.org/json
To send a GET or POST request, call the get or post function directly on the package. The usage is very similar to the requests library.
The request parameters are similar to those of get in the requests library. HTTPX also supports sending requests through a proxy, following redirects, certificate authentication, and so on.
The code is as follows:
import httpx

r = httpx.get('http://www.baidu.com')
print(r.status_code)
POST requests have comprehensive support for JSON, form data, and file uploads.
The code is as follows:
r = httpx.post(url='http://api.bakend.com/saveResult', json={'name': 'mike'})
print(r.json())
The other HTTP methods follow the same pattern:
r = httpx.delete('http://www.baidu.com')
r = httpx.put('http://www.baidu.com')
r = httpx.head('http://www.baidu.com')
r = httpx.options('http://www.baidu.com')
That covers the basic usage. There is nothing special here, so let's move on.
If you are coming from requests, httpx.Client() can be used in place of requests.Session(). The main advantage is more efficient use of network resources. When you send requests with the top-level API, HTTPX has to establish a new connection for every single request (connections are not reused). As the number of requests to a host increases, this quickly becomes inefficient.
A Client instance, on the other hand, uses HTTP connection pooling. This means that when you make several requests to the same host, the Client reuses the underlying TCP connection instead of creating a new one for every request.
This can lead to significant performance gains.
The recommended way to use a Client is as a context manager: the with block ensures that connections are properly cleaned up when the block exits.
The code is as follows:
with httpx.Client() as client:
    headers = {'os': 'Android'}
    r = client.get('https://www.baidu.com', headers=headers)
Alternatively, you can close the connection pool explicitly with .close(), without using a with block:
client = httpx.Client()
try:
    client.get('https://www.baidu.com')
finally:
    client.close()
Anyone who has used the requests library knows that in scenarios such as batch requests and web crawlers, the script has to wait for each request to complete before sending the next one, so its efficiency is mediocre.
HTTPX can send network requests asynchronously. Async is a concurrency model that is more efficient than multithreading. It can provide significant performance benefits and supports long-lived network connections such as WebSockets.
To make an asynchronous request, you need an AsyncClient, and you call its get method with the await keyword.
import asyncio
import httpx

async def get_result():
    async with httpx.AsyncClient() as client:
        resp = await client.get('http://httpbin.org/get')
        assert resp.status_code == 200
        html = resp.text

asyncio.run(get_result())
Here we compare the time taken to send requests with HTTPX asynchronous requests and with requests.
The requests Session approach
import json
import time

import requests

# host, platform and scenes are presumably defined elsewhere in the author's test project.

def request_post_json():
    print('----------- Synchronous request -----------')
    url = host + "/responseTime/insert"
    headers = {
        'Content-Type': 'application/json',
    }
    payload = {}
    payload['taskname'] = 'Response test'
    payload['appnname'] = 999
    payload['platform'] = platform.Android.value
    payload['scenes'] = scenes.home.value
    payload['videocount'] = 5  # number of videos
    payload['grouptime'] = ','.join(["11.5", "11.6", "11.3"])  # list of elapsed times
    payload['avgtime'] = 5.55  # average elapsed time
    payload['author'] = 'xinxi'
    print(json.dumps(payload, indent=4))
    r = requests.Session()
    response = r.post(url, headers=headers, json=payload)
    print(response.text)

startTime = time.time()  # start the timer before sending the 100 requests
for i in range(100):
    request_post_json()
endTime = time.time()
print(endTime - startTime)
Elapsed time:
15.002s
The HTTPX asynchronous approach
import asyncio
import time

import httpx

# host, platform and scenes are presumably defined elsewhere in the author's test project.

async def httpx_post_json():
    print('----------- Asynchronous request -----------')
    url = host + "/responseTime/insert"
    print(url)
    headers = {
        'Content-Type': 'application/json',
    }
    payload = {}
    payload['taskname'] = 'Response test'
    payload['appnname'] = 99
    payload['platform'] = platform.Android.value
    payload['scenes'] = scenes.home.value
    payload['videocount'] = 5  # number of videos
    payload['grouptime'] = ','.join(["11.5", "11.6", "11.3"])  # list of elapsed times
    payload['avgtime'] = 5.55  # average elapsed time
    payload['author'] = 'xinxi'
    async with httpx.AsyncClient() as client:
        resp = await client.post(url, headers=headers, json=payload)
        assert resp.status_code == 200
        html = resp.text
        print(html)

async def main():
    # Build the 100 coroutines and run them concurrently on the event loop.
    tasks = [httpx_post_json() for i in range(100)]
    await asyncio.gather(*tasks)

startTime = time.time()
asyncio.run(main())
endTime = time.time()
print(endTime - startTime)
Elapsed time:
3.070s
In this test, the HTTPX asynchronous approach took far less time than the requests approach: 3.070s versus 15.002s for 100 requests.
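Much of that gap comes from overlapping the waiting time: sequential requests cost roughly the sum of their latencies, while concurrent ones cost roughly the longest single latency. The standard-library sketch below illustrates this with asyncio.sleep standing in for a network call. The 0.05 second delay and the count of 5 are assumed values, not measurements from the test above.

```python
import asyncio
import time

DELAY = 0.05  # assumed per-request latency in seconds
COUNT = 5     # assumed number of requests


async def fake_request():
    # Stand-in for a network call: just waits for DELAY seconds.
    await asyncio.sleep(DELAY)


async def run_sequential():
    # Each request starts only after the previous one has finished.
    for _ in range(COUNT):
        await fake_request()


async def run_concurrent():
    # All requests wait at the same time.
    await asyncio.gather(*(fake_request() for _ in range(COUNT)))


def timed(coro_fn):
    start = time.perf_counter()
    asyncio.run(coro_fn())
    return time.perf_counter() - start


sequential_elapsed = timed(run_sequential)
concurrent_elapsed = timed(run_concurrent)
print(f"sequential: {sequential_elapsed:.3f}s, concurrent: {concurrent_elapsed:.3f}s")
```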
That covers some of the usage of HTTPX. In day-to-day work it can replace requests, and its advanced features can greatly improve efficiency.