Hello everyone ~
In this issue I recommend some efficient pandas data-processing functions (updated continuously). I hope they help you. The examples use the following sample data:
import pandas as pd

df_dict = {
    'name': ['Alice_001', 'Bob_002', 'Cindy_003', 'Eric_004', 'Helen_005', 'Grace_006'],
    'sex': ['female', 'male', 'female', 'male', 'female', 'male'],
    'math': [90, 89, 99, 78, 97, 93],
    'english': [95, 94, 80, 94, 94, 90],
}
# [1] Pass the dictionary directly as the first argument
df = pd.DataFrame(df_dict)
# [2] Pass the dictionary through the data keyword
df = pd.DataFrame(data=df_dict)
# Working copy used by the df1 examples below
df1 = df.copy()
Split on a delimiter:
# n is keyword-only in recent pandas versions; n=1 splits at the first underscore only
df1[['name', 'id']] = df1['name'].str.split('_', n=1, expand=True)
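To make the effect of the n argument concrete, here is a minimal sketch on a small hypothetical Series (the value "Ann_Lee_007" is invented for illustration and is not part of the sample data above). It assumes a pandas version where n is passed by keyword.

```python
import pandas as pd

# Hypothetical names; the last one contains two underscores on purpose.
names = pd.Series(["Alice_001", "Bob_002", "Ann_Lee_007"])

# n=1 splits only at the first underscore, so there are at most two columns.
left_split = names.str.split("_", n=1, expand=True)
left_split.columns = ["name", "id"]

# rsplit counts from the right, which keeps the trailing id intact.
right_split = names.str.rsplit("_", n=1, expand=True)
right_split.columns = ["name", "id"]

print(left_split)
print(right_split)
```

When the id is always the last piece, rsplit may be the safer choice, because extra underscores inside the name then stay in the name column.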
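Extract with a regular expression:
df2 = df.copy()
# Raw strings keep the backslash in \d intact; expand=False returns a Series
df2['name2'] = df2['name'].str.extract(r'([A-Z]+[a-z]+)', expand=False)
df2['id2'] = df2['name'].str.extract(r'(\d+)', expand=False)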
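As an alternative sketch, both pieces can be pulled out in one call by using named capture groups, which str.extract turns into column names. The pattern below assumes every value looks like letters, an underscore, then digits; the Series is a hypothetical subset of the sample data.

```python
import pandas as pd

names = pd.Series(["Alice_001", "Bob_002", "Cindy_003"])

# Each named group (?P<...>) becomes a column in the resulting DataFrame.
extracted = names.str.extract(r"(?P<name2>[A-Za-z]+)_(?P<id2>\d+)")

# The extracted id is still a string; convert it if a number is needed.
extracted["id_num"] = extracted["id2"].astype(int)

print(extracted)
```

Rows that do not match the pattern would come back as NaN in every group, so the integer conversion only works when all rows match.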
Concatenate two columns with a custom separator:
df1["name_id"] = df1["name"].str.cat(df1["id"],sep='_'*3)
Join all values of one column into a single string:
df1["name"].str.cat(sep='*'*5)
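The two uses of str.cat behave differently: with another Series it joins element by element, and without one it collapses the whole column into one string. A small sketch on hypothetical two-row data:

```python
import pandas as pd

name = pd.Series(["Alice", "Bob"])
id_ = pd.Series(["001", "002"])

# Element-wise: one joined string per row.
name_id = name.str.cat(id_, sep="_" * 3)

# No "others" argument: the whole Series becomes a single Python string.
all_names = name.str.cat(sep="*" * 5)

# Missing values are skipped unless na_rep supplies a placeholder.
with_gap = pd.Series(["Alice", None])
skipped = with_gap.str.cat(sep="-")
filled = with_gap.str.cat(sep="-", na_rep="?")

print(name_id.tolist(), all_names, skipped, filled)
```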
Left padding:
df1["id"] = df1["id"].str.pad(10, fillchar="*")
# Equivalent to rjust()
df1["id"] = df1["id"].str.rjust(10, fillchar="*")
Right padding:
df1["id"] = df1["id"].str.pad(10,side="right",fillchar="*")
Padding on both sides:
df1["id"] = df1["id"].str.pad(10,side="both",fillchar="*")
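Running the three padding statements one after another on the same column only shows the first one, because the strings are already 10 characters wide afterwards. The sketch below applies each side to a fresh hypothetical Series instead, so the results can be compared side by side; the width of 7 is chosen arbitrarily.

```python
import pandas as pd

ids = pd.Series(["001", "042"])

# side="left" is the default: fill characters go in front of the value.
left = ids.str.pad(7, fillchar="*")
# side="right": fill characters go after the value.
right = ids.str.pad(7, side="right", fillchar="*")
# side="both": fill characters are split between the two ends.
both = ids.str.pad(7, side="both", fillchar="*")

# pad(side="left") matches rjust, and pad(side="right") matches ljust.
same_as_rjust = left.tolist() == ids.str.rjust(7, fillchar="*").tolist()
same_as_ljust = right.tolist() == ids.str.ljust(7, fillchar="*").tolist()

print(left.tolist(), right.tolist(), both.tolist())
```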
Select numeric columns:
df1.select_dtypes(include=['float64', 'int64'])
Select object columns:
df1.select_dtypes(include=['object'])
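Listing 'float64' and 'int64' explicitly misses numeric columns stored at other widths, such as int32. As a hedged alternative, select_dtypes also accepts the generic "number" selector, sketched here on a small hypothetical frame:

```python
import pandas as pd

df_small = pd.DataFrame({
    "name": ["Alice", "Bob"],
    "math": pd.Series([90, 89], dtype="int32"),  # deliberately not int64
    "ratio": [0.5, 0.7],
})

# "number" matches every numeric dtype, whatever its width.
numeric = df_small.select_dtypes(include="number")
# exclude="number" keeps everything else, here only the text column.
non_numeric = df_small.select_dtypes(exclude="number")
# The explicit list from the text skips the int32 column.
explicit = df_small.select_dtypes(include=["float64", "int64"])

print(list(numeric.columns), list(non_numeric.columns), list(explicit.columns))
```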
Rank by English score:
df1['e_rank'] = df1['english'].rank(method='min',ascending=False)
There are three scores of 94, so those three students are tied for 2nd place. With method='min', the next score (90) is ranked 5th.
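The choice of method decides how ties are numbered. This sketch ranks the same English scores as the sample data with three common methods, so the difference is visible on the three 94s:

```python
import pandas as pd

english = pd.Series([95, 94, 80, 94, 94, 90])

# min: tied values share the lowest rank of the group; the next rank skips ahead.
rank_min = english.rank(method="min", ascending=False)
# dense: tied values share a rank and the next rank follows without a gap.
rank_dense = english.rank(method="dense", ascending=False)
# average (the default): tied values get the mean of the ranks they occupy.
rank_avg = english.rank(method="average", ascending=False)

print(pd.DataFrame({
    "english": english,
    "min": rank_min,
    "dense": rank_dense,
    "average": rank_avg,
}))
```

rank returns floats, so a cast with astype(int) may be wanted for display when method is 'min' or 'dense'.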
That is everything for this issue. Go try it out!