Hello everyone! It's Little Panda again.
Let's go straight to the code this time. Before we start, here is what you need:
requests >>> pip install requests
parsel >>> pip install parsel
re >>> built-in module, no installation needed
Interpreter: Python 3.8
Editor: PyCharm
1. Send a request
2. Get the data
3. Parse the data
4. Save the data
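To try the parsing and saving steps without touching the network, here is a minimal sketch that works on a hard-coded HTML string standing in for the response text from the request steps. The helper names (`parse_links`, `save`), the sample HTML, and the simplified pattern are all made up for illustration; the real page markup may differ.

```python
import os
import re
import tempfile

# Made-up stand-in for the text a real request would return (requests.get(...).text).
SAMPLE_HTML = '''
<h2 class="book_name"><a href="//example.com/chapter/1">Chapter 1</a></h2>
<h2 class="book_name"><a href="//example.com/chapter/2">Chapter 2</a></h2>
'''

# Simplified pattern: one capture group for the link, one for the title.
LINK_PATTERN = re.compile(r'<h2 class="book_name"><a href="(.*?)".*?>(.*?)</a></h2>')


def parse_links(html):
    """Parse step: return (absolute_link, title) pairs found in the HTML."""
    return [('https:' + link, title) for link, title in LINK_PATTERN.findall(html)]


def save(path, text):
    """Save step: append text to a UTF-8 encoded file."""
    with open(path, mode='a', encoding='utf-8') as f:
        f.write(text)


chapters = parse_links(SAMPLE_HTML)

# Write into a fresh temporary directory so repeated runs do not pile up.
out_path = os.path.join(tempfile.mkdtemp(), 'novel_demo.txt')
for link, title in chapters:
    save(out_path, f'{title}\n{link}\n\n')
```

Keeping parsing and saving in separate functions makes it easy to check the regular expression against a saved copy of the page before sending any real requests.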
I had to remove a few things from the code (such as the URL and cookie) so that the post would pass review. If you need them, check the comments or message me privately ~
import requests  # Send requests
import re  # Regular expressions for parsing

# Disguise the request as coming from a normal browser
headers = {
    'cookie': '',
    'referer': '',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.84 Safari/537.36',
}
url = ''  # Page that lists the links to scrape (left blank here)
html_data = requests.get(url=url, headers=headers).text
# Capture each link and its title
info_list = re.findall('<h2 class="book_name"><a href="(.*?)" target="_blank" data-eid=".*?" data-cid=".*?" alt=".*?" title=".*?">(.*?)</a></h2>', html_data)
for link, title in info_list:
    # The captured links have no scheme, so prepend one
    link = 'https:' + link
    # print(link, title)
    # 1. Send a request
    response = requests.get(url=link, headers=headers)
    # 2. Get the data
    link_data = response.text
    # print(link_data)
    # 3. Parse the data
    # Common web page tags: <p></p> <a></a> <div></div> <img />
    # Target: <div class="read-content j_readContent" id=".*?">(.*?)</div>
    # re.S lets "." match newlines, because the content spans several lines
    text = re.findall('<div class="read-content j_readContent" id=".*?">(.*?)</div>', link_data, re.S)[0]
    text = text.replace('<p>', '\n')
    text = title + '\n\n' + text
    print(text)
    # 4. Save the data (mode='a' appends every chapter to the same file)
    with open('The girlfriend of online love is the goddess of heaven.txt', mode='a', encoding='utf-8') as f:
        f.write(text + '\n\n')  # blank line between chapters
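One thing worth guarding against in the parsing step: `re.findall(...)[0]` raises an `IndexError` when the pattern finds nothing (for example, if the site returns a verification page instead of the chapter), and replacing only `<p>` leaves other tags and HTML entities in the text. The sketch below is one possible way to handle both. `extract_chapter_text` is a hypothetical helper name, and the sample HTML is invented for the demo.

```python
import html
import re

CONTENT_PATTERN = re.compile(
    r'<div class="read-content j_readContent" id=".*?">(.*?)</div>', re.S
)


def extract_chapter_text(page_html):
    """Return the chapter body as plain text, or None if the content div is missing."""
    match = CONTENT_PATTERN.search(page_html)
    if match is None:
        return None
    body = match.group(1)
    body = body.replace('<p>', '\n')     # each <p> starts a new line
    body = re.sub(r'<[^>]+>', '', body)  # drop any remaining tags such as </p>
    return html.unescape(body).strip()   # decode entities like &amp; and trim whitespace


# Invented sample page, only for demonstration.
sample = (
    '<div class="read-content j_readContent" id="j_1">'
    '<p>Hello &amp; welcome.</p><p>Second line.</p></div>'
)
print(extract_chapter_text(sample))
print(extract_chapter_text('<html>no chapter here</html>'))
```

In the loop, a `None` result could simply be skipped with `continue`, so one bad page does not stop the whole run.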
Okay, that's the end of this article!
If you have any suggestions or questions, leave a comment or send me a private message! Let's keep working hard together (ง •_•)ง
If you enjoyed it, follow the blogger, or like, bookmark, and comment on the article!!!