Hello everyone! Today I'd like to share a very practical tip: how to save crawled data to an Excel file. Python has quite a few libraries for working with Excel, such as xlrd, xlwt, xlwings, openpyxl, and xlsxwriter. In this post I use openpyxl to save the scraped information to Excel.
openpyxl is a library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.
To install openpyxl, press Win+R, open cmd, and run:
pip install openpyxl
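To confirm the installation worked, you can print the installed version from Python (a quick optional check, not something from the original post):

import openpyxl
print(openpyxl.__version__)   # prints the installed version, e.g. 3.x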
Next, let's look at how to create an Excel file:
import openpyxl

# Create a new workbook (kept in memory; the file is written when we save)
wb = openpyxl.Workbook()
# Get the active worksheet
sheet1 = wb.active
# Append a row to sheet1; this first row can serve as the header
sheet1.append(['Name', 'Gender'])
# Create a new worksheet (placed after the existing sheets by default)
sheet2 = wb.create_sheet('title')
# Rename the worksheet
sheet2.title = 'new sheet'
# Set the colour of the sheet tab
sheet1.sheet_properties.tabColor = '000000'
# Save the workbook to disk
wb.save('filename.xlsx')
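Since openpyxl can read workbooks as well as write them, here is a minimal sketch of loading the file back and printing its rows to verify the save worked (the filename matches the example above):

import openpyxl

# Load the workbook we just saved and read it row by row
wb = openpyxl.load_workbook('filename.xlsx')
sheet = wb.active
for row in sheet.iter_rows(values_only=True):
    print(row)   # each row comes back as a tuple of cell values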
As an example, here is a dating website that I crawled.
First, we need to analyze the site's request URL and the JSON data it returns.
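The complete code further down calls a get_data helper that the original post does not show. As a reference only, here is a minimal sketch of what such a helper might look like, assuming the site exposes a JSON API; the URL and parameter names below are placeholders, not the site's real interface:

import requests

def get_data(page, startage, endage, gender, startheight, endheight, salary):
    """Hypothetical sketch: fetch one page of search results as JSON."""
    url = 'https://example.com/api/search'   # placeholder endpoint
    params = {
        'page': page,
        'startage': startage,
        'endage': endage,
        'gender': gender,
        'startheight': startheight,
        'endheight': endheight,
        'salary': salary,
    }
    headers = {'User-Agent': 'Mozilla/5.0'}  # many sites reject requests without a UA
    resp = requests.get(url, params=params, headers=headers, timeout=10)
    resp.raise_for_status()
    return resp.json()                        # expected to contain data -> list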
The information we need lives in the list field, which itself sits under data in the JSON response, so we can extract it with a for loop. The code is as follows:
for item in json['data']['list']:
    username = item['username']
    gender = item['gender']
    userid = item['userid']
    province = item['province']
    height = item['height']
    city = item['city']
    astro = item['astro']
    birthdayyear = item['birthdayyear']
    salary = item['salary']
    avatar = item['avatar']
    monolog = item['monolog']
    print("ID:" + userid, "Name:" + username, "Gender:" + gender,
          "Province:" + province, "City:" + city, "Birth year:" + birthdayyear,
          "Height:" + height, "Salary:" + salary, "Photo:" + avatar,
          "Constellation:" + astro, "Monologue:" + monolog)
Since we want to save this information to Excel, we put the loop above inside the code that creates the Excel workbook. The complete code is as follows:
import openpyxl

# Create a workbook and get the active worksheet
wb = openpyxl.Workbook()
sheet1 = wb.active
# The first row acts as the header
sheet1.append(['ID', 'Name', 'Gender', 'Province', 'City', 'Birth year',
               'Height (cm)', 'Salary', 'Photo', 'Constellation', 'Monologue'])

for page in range(1, 10):  # fetch pages 1 to 9
    # Request the data the server returns for the user's search criteria
    # (note: the name json here shadows the standard json module)
    json = get_data(page, startage, endage, gender, startheight, endheight, salary)
    # print(json['data']['list'])
    for item in json['data']['list']:
        username = item['username']
        gender = item['gender']
        userid = item['userid']
        province = item['province']
        height = item['height']
        city = item['city']
        astro = item['astro']
        birthdayyear = item['birthdayyear']
        salary = item['salary']
        avatar = item['avatar']
        monolog = item['monolog']
        print("ID:" + userid, "Name:" + username, "Gender:" + gender,
              "Province:" + province, "City:" + city, "Birth year:" + birthdayyear,
              "Height:" + height, "Salary:" + salary, "Photo:" + avatar,
              "Constellation:" + astro, "Monologue:" + monolog)
        print('Writing to Excel, please wait...', end='')
        xx_info = [userid, username, gender, province, city, birthdayyear,
                   height, salary, avatar, astro, monolog]
        sheet1.append(xx_info)
        print('Written successfully\n')

# Save the workbook
wb.save('Dating website data.xlsx')
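If you want the header row to stand out, openpyxl can also style cells and adjust column widths. A minimal sketch, run before the final wb.save call (the column widths here are arbitrary choices, not values from the original post):

from openpyxl.styles import Font

# Make the header row bold
for cell in sheet1[1]:                        # sheet1[1] is the first row
    cell.font = Font(bold=True)

# Widen a few columns so long values stay readable
sheet1.column_dimensions['A'].width = 12      # ID
sheet1.column_dimensions['I'].width = 40      # Photo URL
sheet1.column_dimensions['K'].width = 60      # Monologue

wb.save('Dating website data.xlsx')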
Okay, that's all from me for today.