您现在的位置：程式師世界 >> 編程語言 > >> 更多編程語言 >> Python

Python crawler, record the crawling process, list data, page turning, post mode, save dictionary

編輯：Python

Record the process of your own crawler , I'm working on a project recently .

The website to be crawled is relatively simple .

The problem is ：

post The way , Some of the website's data needs to be used post Way to get .

such as ,

This part should see 《 Projects initiated 》, Mouse click required , At first I thought it was ajax, In fact, it's not , yes js The way to get .

therefore , A careful study reveals , In fact, the website is like this .

https://s*****view.php?id=GKUdgjKayCQvY

Specific parts are omitted , Look at this website , It's nothing , But check through the browser , You can find , Mouse click 《 Projects initiated 》, There will be one. js action .

If there is only one page ,

like this

Then you won't find js action . But if there are many , Need to click , You will find , need js 了 .

This action , Is included post Of .

The specific parameters are as follows

therefore , In fact, the requested URL , It can be formed in this way .

https://sd.zhiyuanyun.com/app/api/view.php?m=get_opps&type=2&id=89608371&p=3

therefore , Here is id,p It's the page . Others are default parameters .

And then use post The way , Construct the request .

def get_proj_number(id):
print("((((((((( >>>>>>>> now obtain organization A total of How many projects ")
params = (('m', 'get_opps'), ('type', '2'), ('id', id), ('p', "1"), )
response = requests.get(
'https://sd.zhiyuanyun.com/app/api/view.php', headers=headers, params=params)
selector = Selector(response)

such , hold p The parameter is made into a for Circulation is all right .