requests is the most widely used third-party Python package for fetching URL resources. It makes GET/POST requests, API testing, and similar tasks easy to implement.
Installing requests is not covered in detail here; just install it with pip:
pip install requests
Import the package with import requests before use, then call the get() method to send a GET request. The code is as follows:
import requests
# Fetch the tags shown on the Douban movie homepage
url = 'https://movie.douban.com/j/search_tags?type=movie&source=index'
r = requests.get(url)
r.encoding = 'utf-8'
data = r.json()
print(data)
Running this code raises the error shown below. The reason is that Douban requires the request headers to include browser information in the User-Agent field, which indicates that the request comes from a browser.
...
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
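One way to see why the request looks different from a browser's is to check what requests sends when no headers are supplied. The following is a minimal sketch that makes no network call; it relies on requests.utils.default_headers() and requests.utils.default_user_agent(), and the exact version string depends on the installed release.

```python
import requests

# Headers that requests attaches when the caller supplies none
default_headers = requests.utils.default_headers()
default_ua = requests.utils.default_user_agent()

print(default_ua)                     # python-requests/<installed version>
print(default_headers["User-Agent"])  # same value as above
```

Because this value identifies a script rather than a browser, a site can filter on it, which is consistent with the error above.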
How do you get a browser's User-Agent?
Open the browser and press F12 (or open the developer tools from the settings menu). Select the Network tab, click any request, look at its Headers, and copy the value of User-Agent.
The revised code is as follows :
import requests
# Add browser information to the request header
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.134 Safari/537.36 Edg/103.0.1264.71'
}
# Fetch the tags shown on the Douban movie homepage
url = 'https://movie.douban.com/j/search_tags?type=movie&source=index'
r = requests.get(url, headers=headers)
r.encoding = 'utf-8'
data = r.json()
print(data)
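Before sending anything, the request can be built and inspected locally to confirm that the header and query string match what the browser showed. The sketch below is an illustration rather than part of the original code: it uses requests.Request(...).prepare(), passes the query string through params instead of hard-coding it in the URL, and uses a shortened placeholder in place of a real User-Agent value.

```python
import requests

# Placeholder value; paste the User-Agent copied from your own browser
headers = {"User-Agent": "Mozilla/5.0 (placeholder value)"}

# Build the request without sending it
req = requests.Request(
    "GET",
    "https://movie.douban.com/j/search_tags",
    params={"type": "movie", "source": "index"},
    headers=headers,
)
prepared = req.prepare()

print(prepared.method)                 # HTTP method that will be used
print(prepared.url)                    # final URL including the query string
print(prepared.headers["User-Agent"])  # header that will be sent
```

Passing params as a dictionary keeps the base URL readable and lets requests handle the URL encoding.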
After execution, the output is as follows:
{'tags': ['hot', 'newest', 'Douban high score', 'Popular film', 'Chinese', 'Europe and the United States', 'South Korea', 'Japan']}
That covers the basic use of the requests package. The Headers information for the URL above can be viewed in the browser's developer tools; before writing code to fetch a URL, it usually helps to analyze the corresponding request headers first.
General (basic information): Request Method: GET means this is a GET request, so we call the requests.get() method.
Response Headers (response header information): Content-Type: application/json; charset=utf-8 means the returned content is JSON encoded as UTF-8, so we set r.encoding = 'utf-8' and call data = r.json() to get the JSON data.
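The same checks can be made in code rather than in the developer tools. The following is a minimal sketch, assuming a hypothetical helper named parse_json_response: it raises on HTTP error statuses, reports an unexpected Content-Type, and returns None instead of crashing when the body is not JSON. The fetch_tags wrapper and the 10-second timeout are illustrative choices, and requests.exceptions.JSONDecodeError is the exception class that appears in the traceback earlier.

```python
import requests


def parse_json_response(r):
    """Return the decoded JSON body, or None if the body is not valid JSON."""
    r.raise_for_status()  # raises requests.exceptions.HTTPError on 4xx/5xx
    content_type = r.headers.get("Content-Type", "")
    if "json" not in content_type.lower():
        print(f"Unexpected Content-Type: {content_type!r}")
    try:
        return r.json()
    except requests.exceptions.JSONDecodeError:
        return None


def fetch_tags(url, headers):
    """Send the GET request (needs network access) and parse the reply."""
    r = requests.get(url, headers=headers, timeout=10)
    return parse_json_response(r)

# Usage (performs a real request):
#   data = fetch_tags(url, headers)
```

Returning None keeps the calling code simple; a stricter variant could re-raise the exception so the failure is not silently ignored.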
Request Headers (request header information): the main fields here are Cookie, User-Agent, token, and so on. Cookie usually stores the browser's authentication information, such as the user's identity; in general the same cookie means the same user is making the request. Some sites pass the authentication information in a token instead.
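If a site does require a cookie or a token, both can be attached through a requests.Session. The sketch below is illustrative only: the cookie name sessionid, its value, and the Authorization: Bearer header are placeholder assumptions (each site defines its own names), and the request is prepared locally rather than sent.

```python
import requests

session = requests.Session()

# Headers set on the session are sent with every request it makes
session.headers.update({
    "User-Agent": "Mozilla/5.0 (placeholder value)",
    "Authorization": "Bearer example-token",  # hypothetical token header
})

# Hypothetical cookie; copy the real name and value from the browser
session.cookies.set("sessionid", "abc123")

# Merge the session settings into a request without sending it
req = requests.Request("GET", "https://movie.douban.com/j/search_tags")
prepared = session.prepare_request(req)

print(prepared.headers.get("Cookie"))
print(prepared.headers["Authorization"])
```

Using a session means the headers and cookies are defined once and reused, instead of being repeated in every requests.get() call.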
This section covered the basics of sending a GET request with the requests package, along with how to find the browser's header information and what the basic header fields mean. I hope it helps.