例:生成服從標准正態分布的24乘4隨機矩陣,行名為20191101到20191124,列名為A,B,C,D並保存為dataframe數據結構
import pandas as pd
import numpy as np
dates=pd.date_range(start="20191101",end="20191124",freq="D")#Generate a set of time series data
a=pd.DataFrame(np.random.randn(24,4),index=dates,columns=list('ABCD'));a
這裡有csv和xlsx兩種格式
a.to_excel("dataframe.xlsx")
a.to_csv("dataframe1.csv")
或者:
f=pd.ExcelWriter("data.xlsx")
a.to_excel(f,'sheet1')
b=a+1
b.to_excel(f,'sheet2')
f.save()
That way you can be in oneexcelsee two tables:
c=pd.read_csv("dataframe1.csv",usecols=range(1,5))#Read the file and display the line names
d=pd.read_excel("data.xlsx",'sheet2',usecols=range(1,5))#Read the second table of the file and display the row names
import pandas as pd
import numpy as np
dates=pd.date_range(start="20191101",end="20191124",freq="D")#Generate a set of time series data
d=pd.DataFrame(np.random.randn(24,4),index=dates,columns=list('ABCD'));a
d1=d[:4]#Get the first four rows of data
d2=d[4:]#Read the data after five lines of data
d3=pd.concat([d1],[d2])#合並行數據
s1=d.groupby("A").mean()#數據分組求均值
s2=d.groupby("A").apply(sum)#s數據分組求和
import pandas as pd
import numpy as np
data=pd.DataFrame(np.random.randint(1,3,(3,3)),index=["m","v","p"],columns=["one",'two','three'])
data.loc['m','one']=np.nan#Modify the data in the first row and first column to be null
data.iloc[1:3,0:2]#extract data2到3行,第1到2列
data["four"]="shit"#Add a fourth column of data
a2=data.reindex(["m",'v','p'])
a2.dropna()#Delete indeterminate values
a2
參考文章:https://book.douban.com/subject/35066598/