
Why can't the results of a Python crawler be stored in MySQL?


Why can't the results of my Python crawler be stored in MySQL? The script below runs, but nothing ever ends up in the hupu_datas table:

# coding:utf-8
import requests
from bs4 import BeautifulSoup
import time
import pymysql

# Crawl data
def get_information(page=0):
    url = 'https://tieba.baidu.com/f?ie=utf-8&kw=%E5%A4%8D%E6%97%A6%E5%A4%A7%E5%AD%A6' + str(page+1)
    headers = {
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36",
        "Referer": "https://tieba.baidu.com/f?ie=utf-8&kw=%E5%A4%8D%E6%97%A6%E5%A4%A7%E5%AD%A6"
    }
    r = requests.get(url, headers=headers)
    soup = BeautifulSoup(r.content.decode("utf-8"), "html.parser")
    out = soup.find("ul", attrs={"class": "for-list"})
    datas = out.find_all('li')
    datas_list = []
    try:
        for data in datas:
            title = data.find('a', attrs={"class": "truetit"}).text.split()[0]
            artical_link = "https://bbs.hupu.com" + data.find('a', attrs={"class": "truetit"}).attrs['href']
            author = data.find('a', class_="aulink").text
            author_link = data.find('a', class_="aulink").attrs['href']
            create_time = data.find('a').text
            lastest_reply = data.find('span', class_='endauthor').text
            datas_list.append({"title": title, "artical_link": artical_link,
                               "author": author, "author_link": author_link,
                               "create_time": create_time,
                               "lastest_reply": lastest_reply})
    except:
        None
    return datas_list

if __name__ == "__main__":
    config = {
        'host': 'localhost',
        'port': 3306,
        'user': 'root',
        'password': 'root',
        'charset': 'utf8',
        'database': 'xinxiz',
    }
    connection = pymysql.connect(**config)  # Create the connection
    try:
        cur = connection.cursor()  # Create a cursor
        for page in range(2):
            datas = get_information(page)
            for data in datas:
                cur.execute("INSERT INTO hupu_datas (title, artical_link, author, author_link, create_time, lastest_reply) VALUES (%s,%s,%s,%s,%s,%s)",
                            (data['title'], data['artical_link'], data['author'],
                             data['author_link'], data['create_time'],
                             data['lastest_reply']))
            print("Crawling page %s" % (page + 1))
            time.sleep(1)
    except:
        connection.rollback()  # Roll back if something goes wrong
    finally:
        cur.close()  # Close the cursor
        connection.commit()  # Commit the transaction
        connection.close()  # Close the connection

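Even with the parsing fixed, a failed INSERT would be hidden the same way: if the hupu_datas table does not exist, or the connection charset cannot represent a title (MySQL's utf8 is limited to 3 bytes per character and rejects emoji, while utf8mb4 accepts them), cursor.execute raises and the bare except: turns it into a silent rollback. To rule that out, create the table up front. The snippet below is only a guess reconstructed from the INSERT statement: the column names come from the question, while the types and lengths are assumptions.

# Hypothetical schema for hupu_datas; column names taken from the INSERT,
# types and lengths assumed. Run once before starting the crawler.
import pymysql

conn = pymysql.connect(host='localhost', port=3306, user='root',
                       password='root', database='xinxiz', charset='utf8mb4')
try:
    with conn.cursor() as cur:
        cur.execute("""
            CREATE TABLE IF NOT EXISTS hupu_datas (
                id            INT AUTO_INCREMENT PRIMARY KEY,
                title         VARCHAR(255),
                artical_link  VARCHAR(512),
                author        VARCHAR(64),
                author_link   VARCHAR(512),
                create_time   VARCHAR(64),
                lastest_reply VARCHAR(64)
            ) DEFAULT CHARSET=utf8mb4
        """)
    conn.commit()
finally:
    conn.close()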