I like to use it myself pandas, It's strong enough .
Pandas Is an open source third party Python library , be based on Numpy and Matplotlib Built on , The mainstream of data analysis in the industry 【 Three piece set 】(NumPy、Matplotlib、Pandas).Pandas Has become a Python Necessary advanced tools for data analysis , It's as powerful 、 flexible 、 Data analysis tools that can support any programming language , Very popular .
The first way ( recommend ):
Open the terminal , Input :pip install pandas
Command to install .
The second way :
find Pandas Download library for :https://www.lfd.uci.edu/~gohlke/pythonlibs/
Then download the version you want .
Excel It's a table , It contains 【 That's ok 】 and 【 Column 】, These are two areas that we need to focus on , Whether it's reading data , Or write data , It's actually operating on rows and columns .
Very simple , Let's implement a simple 【 User table 】, The general idea is :
The specific code is as follows :
# First step , Import pandas library , Take an alias
import pandas as pd
# The second step , Prepare the data , And create a worksheet
# DataFrame Data frame , Key concepts
# It is equivalent to us 【 workbook 】 One of them 【 Worksheet 】
df = pd.DataFrame({
'id':[1,2,3],
'name':[' Zhang San ', ' Li Si ', ' Wang Wu '],
'age':[22, 33, 44]
})
# The third step ,Pandas The default index will be used
# But let's customize the index , Otherwise, the generated worksheet will have its default index , Not so good
df = df.set_index('id')
print(df)
# Step four , Writes data to excel In file
df.to_excel('people.xlsx')
print('Done!')
There's a premise , You have to have one first excel file , Don't be too nervous to forget .
Just one key method :read_excel()
You can specify which file to read ? Which worksheet to read ? Which line to start with ?……
We , Let's look at the code , It's simple :
# Parameters 1: Specify read hello.xlsx file
# Parameters 2:header = 3 It means starting from the fourth line , Because the index is from 0 At the beginning
# Parameters 3:sheet_name Specify the worksheet to read
user= pd.read_excel('user.xlsx', header=3, sheet_name='sheet1')
# Output column name
print(user.columns)
# If your Excel There is no opening title in , It can be used header = None To set it up
user= pd.read_excel('user.xlsx', header=None)
user.columns = ['id', 'name', 'age']
print(user.columns)
# adopt index_id To assign id Column as index
user = pd.read_excel('user.xlsx', index_col='id')
# head The default is before output 5 That's ok , At this point, the default index will not be generated
print(user.head())
# skiprows Skip the first few lines
# usecols Use the data in those columns
# dtype Set the type of a column
peoples = pd.read_excel('user.xlsx', skiprows = 3, usecols='E:H', dytpe={
'ID':str, 'gender':str, 'birthday':str})