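Problem: during recent feature work I ran into a grouped, weighted-sum calculation. The rows are grouped by the field 'CPN'. For each PO line, the Actual CT2R is multiplied by that line's share of the group's total quantity, and the result is named 'Weighted(QTY)CT2R'. The per-line 'Weighted(QTY)CT2R' values are then summed over all rows with the same 'CPN' to give the total 'Weighted(QTY)CT2R'. In the figure below, the cells filled in yellow are the target values.
The calculation logic is as follows: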
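Written as a formula, the logic reads as follows. The symbols are introduced here for illustration only and do not appear in the original: $q_i$ is the quantity of PO line $i$, $c_i$ is its Actual CT2R, and $g$ is the set of lines sharing one CPN.

```latex
w_i = \frac{q_i}{\sum_{j \in g} q_j} \, c_i ,
\qquad
W_g = \sum_{i \in g} w_i = \frac{\sum_{i \in g} q_i \, c_i}{\sum_{j \in g} q_j}
```

For the sample data used in this post, CPN '01-0989' has nine lines with quantity 10 and CT2R 90, plus one line with quantity 200 and CT2R 50, so $W_g = (9 \cdot 10 \cdot 90 + 200 \cdot 50) / 290 = 18100 / 290 \approx 62.41$. For '02-0437', every line has CT2R 80, so the weighted total is simply 80.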
The requirement above can be implemented in pandas as follows:
import pandas as pd
df = pd.DataFrame([['01-0989', 10, 90],
                   ['01-0989', 10, 90],
                   ['01-0989', 10, 90],
                   ['01-0989', 10, 90],
                   ['01-0989', 10, 90],
                   ['01-0989', 10, 90],
                   ['01-0989', 10, 90],
                   ['01-0989', 10, 90],
                   ['01-0989', 10, 90],
                   ['01-0989', 200, 50],
                   ['02-0437', 20, 80],
                   ['02-0437', 20, 80],
                   ['02-0437', 20, 80]
                   ], columns=['cpn', 'po_line_qty', 'actual_ct2r'])
# Group by 'cpn' and sum 'po_line_qty'; store the result as total.
# The string 'sum' is used because passing the builtin sum to agg is deprecated in recent pandas.
total = df.groupby('cpn').agg({'po_line_qty': 'sum'}).reset_index()
# Rename 'po_line_qty' to 'total_po_line_qty'
total = total.rename(columns={'po_line_qty': 'total_po_line_qty'})
# Left-join df with total on 'cpn'; store the result as new_res
new_res = pd.merge(df, total, how='left', on='cpn')
def weighted_qty_ct2r(row):
    # This line's share of the total quantity for its CPN
    scale = row['po_line_qty'] / row['total_po_line_qty']
    # Weight the line's actual CT2R by that share
    return scale * row['actual_ct2r']
# Create the 'weighted_qty_ct2r' column
new_res['weighted_qty_ct2r'] = new_res.apply(weighted_qty_ct2r, axis=1)
# Group by 'cpn' and sum 'weighted_qty_ct2r'; store the result as df_result
df_result = new_res.groupby('cpn').agg({'weighted_qty_ct2r': 'sum'})
df
total
new_res
df_result
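As a possible alternative to the merge plus row-wise `apply` above, the same numbers can be obtained with `groupby(...).transform('sum')`, which broadcasts each group's total back onto its rows. This is only a sketch of that approach, not the method used in this post. The names `share`, `weighted`, and `df_result_vec` are introduced here for illustration.

```python
import pandas as pd

# Same sample data as above, built more compactly.
df = pd.DataFrame(
    [['01-0989', 10, 90]] * 9
    + [['01-0989', 200, 50]]
    + [['02-0437', 20, 80]] * 3,
    columns=['cpn', 'po_line_qty', 'actual_ct2r'],
)

# Each line's share of its CPN's total quantity, computed without a merge.
share = df['po_line_qty'] / df.groupby('cpn')['po_line_qty'].transform('sum')

# Per-line weighted CT2R, then the per-CPN total.
weighted = (share * df['actual_ct2r']).rename('weighted_qty_ct2r')
df_result_vec = weighted.groupby(df['cpn']).sum().reset_index()

print(df_result_vec)
```

On large tables this vectorized form usually runs faster than `apply(..., axis=1)`, since it avoids calling a Python function once per row.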