
How to display a real-time progress bar when a Python crawler downloads video

Editor: Python

Contents

I. Full code
II. Explanation
  1. with closing
     with usage (context management not implemented)
     with usage (context management implemented)
     closing usage (solves these problems neatly)
  2. File stream (stream)
  3. response.headers['content-length']
  4. response.iter_content()
  5. \r and %
III. Result
IV. Summary

Preface:

When crawling and downloading videos from the web, a real-time progress bar lets us see the download progress at a glance.

I. Full code

    from contextlib import closing
    from requests import get

    url = 'https://v26-web.douyinvod.com/57cdd29ee3a718825bf7b1b14d63955b/615d475f/video/tos/cn/tos-cn-ve-15/72c47fb481464cfda3d415b9759aade7/?a=6383&br=2192&bt=2192&cd=0%7C0%7C0&ch=26&cr=0&cs=0&cv=1&dr=0&ds=4&er=&ft=jal9wj--bz7ThWG4S1ct&l=021633499366600fdbddc0200fff0030a92169a000000490f5507&lr=all&mime_type=video_mp4&net=0&pl=0&qs=0&rc=ank7OzU6ZnRkNjMzNGkzM0ApNmY4aGU8MzwzNzo3ZjNpZWdiYXBtcjQwLXNgLS1kLTBzczYtNS0tMmE1Xi82Yy9gLTE6Yw%3D%3D&vl=&vr='

    with closing(get(url, stream=True)) as response:
        chunk_size = 1024  # maximum number of bytes read per chunk
        # response.headers['content-length'] is a str, not an int
        content_size = int(response.headers['content-length'])  # total file size
        data_count = 0  # number of bytes received so far
        with open('filename.mp4', "wb") as file:
            for data in response.iter_content(chunk_size=chunk_size):
                file.write(data)
                # number of filled blocks in a 50-block bar
                done_block = int((data_count / content_size) * 50)
                # size of the part of the file downloaded so far
                data_count = data_count + len(data)
                # real-time progress as a percentage
                now_jd = (data_count / content_size) * 100
                # %% prints a literal %
                print("\r [%s%s] %d%% " % (done_block * '█', ' ' * (50 - 1 - done_block), now_jd), end=" ")

Note: the URL above has expired. You need to find a video URL on the website yourself.

II. Explanation

1. with closing

When reading file resources, we often use the with open() as f: statement.

But the with statement has a requirement: an object can be used in a with statement only if it implements context management correctly, which is done through the two methods __enter__ and __exit__.

with usage (context management not implemented)

    class Door():
        def open(self):
            print('Door is opened')

        def close(self):
            print('Door is closed')

    with Door() as d:
        d.open()
        d.close()

Running this raises an error, because Door defines neither __enter__ nor __exit__.

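with usage (context management implemented)

Context management is implemented with __enter__ and __exit__:

    class Door():
        def __enter__(self):
            print('Begin')
            return self

        def __exit__(self, exc_type, exc_value, traceback):
            if exc_type:
                print('Error')
            else:
                print('End')

        def open(self):
            print('Door is opened')

        def close(self):
            print('Door is closed')

    with Door() as d:
        d.open()
        d.close()

This time the code runs correctly, printing Begin before the body of the with block and End after it.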

closing usage (solves these problems neatly)

An object that does not implement context management cannot be used in a with statement. In that case, closing() from contextlib can turn the object into a context manager. It calls the object's close() method when the with block exits, so the object only needs to provide close().

    from contextlib import closing

    class Door():
        def open(self):
            print('Door is opened')

        def close(self):
            print('Door is closed')

    with closing(Door()) as d:
        d.open()

For example, closing() lets us use get(url) from requests in a with statement.

That is the case in this article: with closing() is used to download a video from a web page.
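To make the behaviour of closing() more concrete, here is a minimal sketch of a wrapper that does roughly the same job. MiniClosing is a hypothetical name used only for illustration, not the actual contextlib source, and Door gets an extra closed attribute so the effect can be checked. The sketch shows that close() is called even when the body of the with block raises.

```python
class MiniClosing:
    """Hypothetical stand-in that mimics what contextlib.closing does."""

    def __init__(self, thing):
        self.thing = thing

    def __enter__(self):
        # The wrapped object itself is what "as d" receives.
        return self.thing

    def __exit__(self, exc_type, exc_value, traceback):
        # Always close, whether or not the body raised.
        self.thing.close()
        return False  # do not swallow the exception


class Door:
    def __init__(self):
        self.closed = False  # added for illustration only

    def open(self):
        print('Door is opened')

    def close(self):
        self.closed = True
        print('Door is closed')


door = Door()
try:
    with MiniClosing(door) as d:
        d.open()
        raise RuntimeError('something went wrong')
except RuntimeError:
    pass

print(door.closed)
```

In real code, contextlib.closing is the better choice; the sketch is only meant to show why an object with nothing more than a close() method becomes usable in a with statement.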

2. File stream (stream)

Think of reading a file as pumping water into a pool. A synchronous read blocks the program, and an asynchronous read still has to wait for the result. What if the pool is very large?

That is where a file stream comes in: you use the water as it is pumped, without waiting for the pool to fill.

So for large files (videos of several GB), the stream=True parameter is generally used. It also works for small files.
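The "use the water as it is pumped" idea can be sketched without any network access. In this illustration, in-memory io.BytesIO objects stand in for the remote video and the local file, and copy_in_chunks is a hypothetical helper name. At no point does it hold more than chunk_size bytes of the source in a single read.

```python
import io


def copy_in_chunks(source, target, chunk_size=1024):
    """Copy source to target one chunk at a time.

    Returns (total_bytes, largest_chunk) so the memory bound is visible.
    """
    total = 0
    largest = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes means the source is exhausted
            break
        target.write(chunk)
        total += len(chunk)
        largest = max(largest, len(chunk))
    return total, largest


source = io.BytesIO(b"x" * 10_000)  # stand-in for a large download
target = io.BytesIO()               # stand-in for the output file
total, largest = copy_in_chunks(source, target, chunk_size=1024)
print(total, largest)  # prints: 10000 1024
```

With requests, stream=True plays the role of the pump: the body is fetched piece by piece as the program iterates over it, instead of being loaded into memory in full first.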

3. response.headers['content-length']

This is the total size of the file being fetched. The value is a str rather than an int, so it must be converted before doing arithmetic with it.
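One caveat: a server is not obliged to send this header (for example, when it uses chunked transfer encoding), in which case the lookup raises KeyError. A minimal sketch of a defensive conversion follows. parse_content_length is a hypothetical helper, and a plain dict with a lowercase key stands in for response.headers.

```python
def parse_content_length(headers):
    """Return Content-Length as an int, or None if it is missing or malformed."""
    value = headers.get('content-length')
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


print(parse_content_length({'content-length': '2048'}))  # prints: 2048
print(parse_content_length({}))                          # prints: None
```

When the function returns None, a percentage cannot be computed, so a crawler could fall back to printing only the number of bytes received so far.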

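4. response.iter_content()

This method is generally used to download files and web pages from the Internet (it is called on the response returned by requests.get(url)).

Here chunk_size is the maximum number of bytes read into memory on each iteration. A chunk can be shorter than that, which is why the code counts len(data) rather than chunk_size.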

5. \r and %

\r is a carriage return: it moves the cursor back to the beginning of the line, so the next print overwrites the previous bar.

% is a placeholder in the format string.

In %%, the first % acts as an escape, so a literal percent sign % is output.
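These pieces can be seen together in a small formatting function. This is a sketch, not the code from the full listing: render_bar is a hypothetical helper, and it pads the bar to a fixed width rather than using the 50 - 1 - done_block expression.

```python
def render_bar(done_bytes, total_bytes, width=50):
    """Build one progress-bar line; the leading \\r rewinds to the line start."""
    fraction = done_bytes / total_bytes
    filled = int(fraction * width)   # number of filled blocks
    percent = int(fraction * 100)
    # %s and %d are placeholders; %% produces a literal percent sign.
    return "\r[%s%s] %d%%" % ('█' * filled, ' ' * (width - filled), percent)


# repr() makes the otherwise invisible \r visible in the output.
print(repr(render_bar(512, 1024, width=10)))
```

When printing the returned string in a loop, pass end='' to print so that no newline is emitted and each call redraws the same line.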

III. Result

IV. Summary

I have seen many progress bars before. They move, but they do not track the actual content being downloaded (their parameters are either hard-coded or unrelated to the file size), so they do not reflect real progress. This progress bar does. Give it a try!

The progress bar shown here handles the download of a single URL. You can add it to your crawler's loop, so that a real-time progress bar is shown for each video it downloads.
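One way to use the progress bar inside a crawler loop is sketched below. It assumes the download logic is wrapped in a function that accepts any object with a headers mapping and an iter_content() method, and that content-length is present and non-zero. save_with_progress and FakeResponse are hypothetical names: FakeResponse only imitates a streamed response so the sketch runs offline, and in a real crawler you would pass the response from closing(get(url, stream=True)) instead.

```python
import os
import tempfile


def save_with_progress(response, path, label='', chunk_size=1024, width=50):
    """Write a streamed response to path, redrawing a progress bar per chunk."""
    total = int(response.headers['content-length'])
    received = 0
    with open(path, 'wb') as file:
        for data in response.iter_content(chunk_size=chunk_size):
            file.write(data)
            # Count len(data), not chunk_size: a chunk may be shorter.
            received += len(data)
            filled = int(received / total * width)
            percent = int(received / total * 100)
            print("\r%s [%s%s] %d%%"
                  % (label, '█' * filled, ' ' * (width - filled), percent),
                  end='', flush=True)
    print()  # end the line so the next bar starts on a fresh one
    return received


class FakeResponse:
    """Offline stand-in for a streamed response (illustration only)."""

    def __init__(self, body):
        self.body = body
        self.headers = {'content-length': str(len(body))}

    def iter_content(self, chunk_size=1):
        for start in range(0, len(self.body), chunk_size):
            yield self.body[start:start + chunk_size]


folder = tempfile.mkdtemp()
bodies = [b'a' * 3000, b'b' * 5000]  # stand-ins for two videos
sizes = []
for index, body in enumerate(bodies):
    path = os.path.join(folder, 'video_%d.mp4' % index)
    sizes.append(save_with_progress(FakeResponse(body), path,
                                    label='video %d' % index))
print(sizes)
```

Giving each bar a label keeps the output readable when several videos are downloaded one after another.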
