
An exciting challenge: crawling Bilibili (station B) cover images with Python


Motivation

For an article, the title is its distilled essence; for a video, the cover is probably its most eye-catching frame. Bilibili (station B) has recently become a popular short-video platform. Its dance section is full of all kinds of dances, especially otaku dance, which homebodies love. (Don't talk to me about black stockings or JK uniforms; I really don't like them.)


So I decided to try fetching Bilibili covers with a crawler.

Fetching the web page

Bilibili has anti-crawling measures. I started by analyzing the web pages, to no avail.

Then I thought: Bilibili is so popular that I can't be the only one who wants to crawl it. So I started searching for related articles and videos.

Sure enough, I soon found an article that fetches the cover image from a video's AV number. I tried it, and it really works 🤩 (ecstatic).

# Get the cover from the aid (AV number)
https://api.bilibili.com/x/web-interface/view?aid=(aid)
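
As a rough sketch of how this endpoint might be called, the snippet below builds the request URL and defines a small JSON fetcher. It uses the standard library's urllib rather than the requests package used later in the article, and the helper names `view_url_for_aid` and `fetch_json`, the placeholder User-Agent string, and the sample aid are all assumptions made for illustration. Whether the API answers without further headers is not guaranteed.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the article
VIEW_API = "https://api.bilibili.com/x/web-interface/view"


def view_url_for_aid(aid):
    """Build the view-API URL for a given AV number (aid)."""
    return f"{VIEW_API}?{urlencode({'aid': aid})}"


def fetch_json(url, user_agent="Mozilla/5.0"):
    """GET a URL and parse the body as JSON (this performs a network call)."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # 170001 is an arbitrary sample aid used only to show the URL shape
    print(view_url_for_aid(170001))
```

Calling `fetch_json(view_url_for_aid(aid))` would then return the parsed response, from which the cover URL can be read.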

Then it hit me: since last year, Bilibili has been using BV numbers, so where would I get an AV number, and where did the article get its AV numbers? I checked the article's date again: 2019. Fair enough; when it was written, Bilibili had not yet switched.

There are always more solutions than problems. At least I now know how to use an AV number, so all I need to do is find the AV number from the BV number. How clever of me.

After some searching, I found that an expert had shared the APIs for BV numbers.

I took a look: a big name on Bilibili, no less. That's hardly playing fair, teaching others how to crawl Bilibili (but I like it 🤪).

# Get the cid from the BV number
https://api.bilibili.com/x/player/pagelist?bvid=(bvid, including the leading "BV")
# Get the video playlist from the BV number and cid
https://api.bilibili.com/x/player/playurl?cid=(cid)&qn=(qn)&bvid=(bvid, including the leading "BV")
# Get the aid from the BV number and cid
https://api.bilibili.com/x/web-interface/view?cid=(cid)&bvid=(bvid, including the leading "BV")
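
The three URL templates above can be sketched as small builder functions. The function names, the check for the leading "BV", and the sample cid and qn values are assumptions made for illustration; only the endpoints and parameter names come from the list above.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.bilibili.com/x"


def _check_bvid(bvid):
    """The bvid must keep its leading 'BV', as the templates above stress."""
    if not bvid.startswith("BV"):
        raise ValueError("bvid must include the leading 'BV'")
    return bvid


def pagelist_url(bvid):
    """URL that returns the cid for a BV number."""
    return f"{API_ROOT}/player/pagelist?{urlencode({'bvid': _check_bvid(bvid)})}"


def playurl_url(bvid, cid, qn):
    """URL that returns the video playlist for a BV number and cid."""
    query = urlencode({"cid": cid, "qn": qn, "bvid": _check_bvid(bvid)})
    return f"{API_ROOT}/player/playurl?{query}"


def view_url(bvid, cid):
    """URL that returns the aid for a BV number and cid."""
    query = urlencode({"cid": cid, "bvid": _check_bvid(bvid)})
    return f"{API_ROOT}/web-interface/view?{query}"


if __name__ == "__main__":
    # The BV number is the one used later in the article; cid and qn are placeholders
    print(pagelist_url("BV1C5411P7qM"))
    print(playurl_url("BV1C5411P7qM", 12345, 80))
    print(view_url("BV1C5411P7qM", 12345))
```

Building the query with `urlencode` keeps the parameters in the same order as the templates and escapes any unexpected characters.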

Putting the APIs above together, the plan is clear: just follow the expert's lead.

First, find the cid from the BV number; then get the aid from the BV number and cid; finally, get the cover from the aid.

The data returned at each step is JSON, where:

the cid is at ['data'][0]['cid']

the aid is at ['data']['aid']

the cover image URL is at ['data']['pic']
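
The three lookup paths above can be tried out on hand-made data. The sketch below walks each path with a small helper that raises a readable error when a key or index is missing. The helper name `dig` and the sample payloads are hypothetical; real responses contain many more fields.

```python
def dig(payload, *path):
    """Walk a parsed JSON structure along path, raising a clear error on a miss."""
    current = payload
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError) as exc:
            raise ValueError(f"missing {step!r} in path {path!r}") from exc
    return current


# Hypothetical payloads shaped like the three responses described above
pagelist = {"data": [{"cid": 111}]}
view_by_bvid = {"data": {"aid": 222}}
view_by_aid = {"data": {"pic": "http://example.com/cover.jpg"}}

cid = dig(pagelist, "data", 0, "cid")     # ['data'][0]['cid']
aid = dig(view_by_bvid, "data", "aid")    # ['data']['aid']
pic = dig(view_by_aid, "data", "pic")     # ['data']['pic']
print(cid, aid, pic)
```

Plain indexing, as in the article's code, works just as well on a successful response; the helper only makes a malformed or failed response easier to diagnose.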

The more detailed process is written in the code comments.

Complete code

# -*- coding: UTF-8 -*-
# @Time: 2021/8/17 20:12
# @Author: Stars in the distance
# @CSDN: https://blog.csdn.net/qq_44921056
import os
import json
import requests
import chardet
from fake_useragent import UserAgent

# Random User-Agent generator
ua = UserAgent(verify_ssl=False, path='D:/Pycharm/fake_useragent.json')


# Build request headers with a random User-Agent
def random_ua():
    headers = {
        "accept-encoding": "gzip",  # gzip compression reduces the amount of data transferred
        "user-agent": ua.random
    }
    return headers


# Create the output folder if it does not exist yet
def path_creat():
    _path = "D:/B Station cover/"
    if not os.path.exists(_path):
        os.mkdir(_path)
    return _path


# Fetch a URL and parse the response body as JSON
def get_text(url):
    res = requests.get(url=url, headers=random_ua())
    res.encoding = chardet.detect(res.content)['encoding']  # use the detected character encoding
    res = res.text
    data = json.loads(res)  # parse the JSON text
    return data


# Get the AV number (aid) from the BV number
def get_aid(bv):
    url_1 = 'https://api.bilibili.com/x/player/pagelist?bvid={}'.format(bv)
    response = get_text(url_1)
    cid = response['data'][0]['cid']  # get the cid
    url_2 = 'https://api.bilibili.com/x/web-interface/view?cid={}&bvid={}'.format(cid, bv)
    response_2 = get_text(url_2)
    aid = response_2['data']['aid']  # get the aid
    return aid


# Get the cover image from the AV number
def get_image(aid):
    url_3 = 'https://api.bilibili.com/x/web-interface/view?aid={}'.format(aid)
    response_3 = get_text(url_3)
    image_url = response_3['data']['pic']  # get the image download link
    image = requests.get(url=image_url, headers=random_ua()).content  # download the image bytes
    return image


# Save the cover to disk (the with block closes the file automatically)
def download(image, file_name):
    with open(file_name, 'wb') as f:
        f.write(image)


def main():
    k = 'Y'
    while k == 'Y':  # keep looping for as long as the user wants
        path = path_creat()  # create the folder that stores the covers
        bv = input("Enter the video's BV number: ")
        image_name = input("Enter a name for the cover you want to download: ")
        aid = get_aid(bv)
        image = get_image(aid)
        file_name = path + '{}.jpg'.format(image_name)
        download(image, file_name)
        print("Cover extraction completed ^_^")
        k = input("Press Y to continue, or Q to quit: ").strip().upper()


if __name__ == '__main__':
    main()

The code can be copied and run directly. If it helps you, remember to give it a like; that is the greatest encouragement for the author. Any shortcomings can be pointed out and discussed in the comments section.
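
The script above saves every cover with a .jpg extension. In case a cover URL ends in a different extension, one possible variation is to reuse the extension found in the image URL. The sketch below is only an illustration of that idea; the function name `cover_path`, its parameters, and the example URLs are hypothetical and not part of the original script.

```python
import os
from urllib.parse import urlparse


def cover_path(folder, name, image_url, default_ext=".jpg"):
    """Join folder and name, reusing the extension found in the image URL."""
    # urlparse(...).path drops any query string before the extension is read
    ext = os.path.splitext(urlparse(image_url).path)[1] or default_ext
    return os.path.join(folder, name + ext)


if __name__ == "__main__":
    # Example URLs are placeholders, not real cover links
    print(cover_path("covers", "dance", "http://example.com/abc.png"))
    print(cover_path("covers", "dance", "http://example.com/abc"))
```

In the script, the result of such a helper could take the place of the `path + '{}.jpg'.format(image_name)` expression, provided `get_image` also returned the image URL.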

Running results: pretty girl, here you go.

  • Example input: the video with BV number BV1C5411P7qM.

Lossless image upscaling

Online tool: https://bigjpg.com/zh

This tool runs online and can upscale and denoise your image. If you are interested, try it yourself; I think the results are decent.

Reference articles

Reference article 1: Crawling Bilibili covers with Python

Reference article 2: Bilibili's new BV number API

Author: Stars in the distance  CSDN: https://blog.csdn.net/qq_44921056

This article is for learning and exchange only. Reproduction without the author's permission is prohibited, as is use for any other purpose; violations will be pursued.

