Sometimes when performing summary calculation , We need to convert the data before processing into the data after processing .
Before processing :
After processing :
Prepare the data in advance :
import pandas as pd
# Prepare the data
df = pd.DataFrame({
' full name ': ['A', 'A', 'B', 'B', 'C', 'C', 'C'],
' department ':[' The sales department ', ' The sales department ', ' The sales department ', ' The sales department ', ' The personnel department ', ' The personnel department ', ' The personnel department '],
' Management area ':[' south China ', ' The north China ', ' Central China ', ' East China ', ' south China ', ' The north China ', ' Central China ']})
df
result=df.groupby(df[' full name ']).agg( Department =(' department ',lambda x:','.join(x.unique())),
Managed area =(' Management area ',lambda x:','.join(x.unique()))).reset_index()
according to 2.1 In this way, we can achieve our expectations , But when there are many aggregated fields , We will face the trouble of writing multiple anonymous functions , Not very convenient , So how to solve this problem ?
def string_concat(column_name,sep=','):
return sep.join(column_name.unique())
result=df.groupby(df[' full name ']).agg( Department =(' department ',string_concat),
Managed area =(' Management area ',string_concat)).reset_index()
result
be based on 2.2 We solved the problem of writing anonymous functions repeatedly , But in reality , If we use different delimiters for multiple columns , It's not very humanized , Then how to solve this problem ?
def custome_str_cat(sep='|'):
def str_cat(column_name):
return sep.join(column_name.unique())
if sep:
return str_cat
result=df.groupby(df[' full name ']).agg( Department =(' department ', custome_str_cat('*')), Managed area =(' Management area ',
custome_str_cat()), Managed area 2=(' Management area ', custome_str_cat('—'))).reset_index()
The effect diagram after implementation is as follows :
The above is about the aggregate function of text class , amount to postgresql In the database string_agg function ,Oracle Medium wm_concat function ,MySQL Medium GROUP_CONCAT function . So how to be in pandas Turn one row into multiple rows ? That is, the processed data in this article will be transformed into the data before processing , You can refer to this blog , Portal :https://blog.csdn.net/qq_41780234/article/details/121623812?spm=1001.2014.3001.5502