
When reading a file, can I skip characters that cannot be decoded and keep reading? (Language: Python)

Editor: Python

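I have recently been learning web scraping by crawling novels, and I ran into a page that contains a garbled character. The page is encoded as gb2312; I tried gb2312, gbk, and utf-8 in turn, and none of them can decode it. Because I scrape the text one whole page at a time, a single error means an entire chapter is not downloaded, which is frustrating.
Is there a way to simply ignore the character that cannot be decoded and write the rest of the extracted content to the file?
The download code is as follows: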

# Download
async def download(url, name):
    async with semaphore:
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as reques:
                reques.encoding = 'gbk'
                page = bs4.BeautifulSoup(await reques.text(), 'html.parser')
                div = page.find('div', class_="read_chapterDetail")
                p = div.find_all('p')
                # Open the file in binary write mode
                with open(f'{name}.txt', mode='wb') as f:
                    for i in p:
                        text = i.text + '\n'
                        f.write(text.encode('utf-8'))
                print(f'{name} download complete!')
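One possible approach, offered as a sketch rather than a confirmed fix for this particular site: Python's decoders take an `errors` argument, and passing `'ignore'` (drop the bad bytes) or `'replace'` (substitute U+FFFD) lets decoding continue past an invalid byte sequence instead of raising `UnicodeDecodeError`. The helper name `decode_lenient` and the sample bytes below are hypothetical. Using `gb18030` as the default is an assumption: it is a superset of gbk and gb2312, so it may cover characters that a page declared as gb2312 actually contains.

```python
def decode_lenient(raw: bytes, encoding: str = "gb18030", errors: str = "ignore") -> str:
    """Decode raw bytes, skipping or replacing invalid byte sequences.

    errors="ignore"  drops bytes that cannot be decoded
    errors="replace" substitutes U+FFFD for them
    errors="strict"  raises UnicodeDecodeError (Python's default)
    """
    return raw.decode(encoding, errors=errors)


# Hypothetical sample: valid GBK text with one stray invalid byte (0xFF) in the middle.
sample = "你好".encode("gbk") + b"\xff" + "世界".encode("gbk")

try:
    decode_lenient(sample, errors="strict")
except UnicodeDecodeError as exc:
    print("strict decoding failed:", exc.reason)

print(decode_lenient(sample, errors="ignore"))   # bad byte dropped
print(decode_lenient(sample, errors="replace"))  # bad byte shown as U+FFFD
```

The same `errors` argument should be usable in the download code above. `aiohttp`'s `ClientResponse.text()` accepts `encoding` and `errors` parameters, so `await reques.text(encoding='gb18030', errors='ignore')` would be the likely replacement for setting `reques.encoding`, which does not appear to influence how `text()` decodes the body. When reading from a file, the built-in `open()` takes the same argument, for example `open(path, encoding='gbk', errors='ignore')`.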


Copyright © 程式師世界 All Rights Reserved