
3 Frequently Used Pandas Functions


WeChat official account: 尤而小屋
Author: Peter
Editor: Peter

Hello everyone, I'm Peter.

This article introduces three pandas functions that come up constantly in day-to-day work: apply, agg and transform.

Sample data

First, build a small sample dataset.

In [1]:

import pandas as pd
import numpy as np

In [2]:

df = pd.DataFrame(
{"name":["xiaoming","sunjun","jimmy","tom"],
"sex":["male","female","female","male"],
"chinese":[100,80,90,92],
"math":[90,100,88,90]
})
df

Out[2]:

       name     sex  chinese  math
0  xiaoming    male      100    90
1    sunjun  female       80   100
2     jimmy  female       90    88
3       tom    male       92    90

The apply function

apply is a very flexible function: it runs a given function over a whole DataFrame or Series.

The function can be user-defined, a Python or pandas built-in, or an anonymous (lambda) function.

Usage 1: a built-in function

Change a column's dtype from int64 to float64.

In [3]:

df.dtypes  # before the change

Out[3]:

name object
sex object
chinese int64
math int64
dtype: object

In [4]:

df["chinese"] = df["chinese"].apply(float)

In [5]:

df.dtypes  # after the change

Out[5]:

name object
sex object
chinese float64
math int64
dtype: object
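For a plain dtype conversion, `Series.astype` is a vectorized alternative to `apply(float)`. The sketch below is only an illustration: it rebuilds the chinese column on its own, and the names `scores`, `via_apply` and `via_astype` are made up for this example rather than taken from the article.

```python
import pandas as pd

# Illustrative data: the "chinese" column from the sample DataFrame
scores = pd.Series([100, 80, 90, 92], name="chinese")

via_apply = scores.apply(float)         # calls float() on each element
via_astype = scores.astype("float64")   # converts the whole column at once

print(via_apply.dtype, via_astype.dtype)
print(via_apply.equals(via_astype))
```

Both routes should end up with a float64 column holding the same values; `astype` simply skips the per-element Python call.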

Usage 2: a custom function

In [6]:

def change_sex(x):  # male -> 0, female -> 1
    return 0 if x == "male" else 1

In [7]:

df["sex"] = df["sex"].apply(change_sex)
df  # after the change
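To make the effect of the custom function concrete, here is a small self-contained sketch on the sex column alone. It also shows `Series.map` with a dict as one possible alternative for a fixed lookup; the names `sex`, `encoded_apply` and `encoded_map` are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Illustrative data: the "sex" column from the sample DataFrame
sex = pd.Series(["male", "female", "female", "male"], name="sex")

def change_sex(x):  # male -> 0, anything else -> 1
    return 0 if x == "male" else 1

encoded_apply = sex.apply(change_sex)

# A dict lookup gives the same encoding here; values missing from
# the dict would become NaN instead of falling through to 1
encoded_map = sex.map({"male": 0, "female": 1})

print(encoded_apply.tolist())
print(encoded_map.tolist())
```

The two differ only for unexpected values: the function sends anything that is not "male" to 1, while the dict lookup leaves unknown values as NaN.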

Usage 3: an anonymous function (lambda)

In [8]:

# float--->int
df["chinese"] = df["chinese"].apply(lambda x: int(x))
df.dtypes

Out[8]:

name object
sex int64
chinese int64
math int64
dtype: object

In [9]:

# Capitalize the first letter of each name
df["name"] = df["name"].apply(lambda x: x.title())
df

# Operate on two columns at once; remember axis=1
df["score"] = df.apply(lambda x: x["chinese"] + x["math"], axis=1)
df
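With `axis=1`, the lambda receives one row at a time. For simple arithmetic like this, adding the two columns directly gives the same numbers. The sketch below uses a cut-down copy of the sample data; `df_demo`, `row_wise` and `vectorized` are illustrative names.

```python
import pandas as pd

# Illustrative data: only the two score columns from the sample DataFrame
df_demo = pd.DataFrame({"chinese": [100, 80, 90, 92], "math": [90, 100, 88, 90]})

# axis=1 passes each row to the lambda as a Series
row_wise = df_demo.apply(lambda row: row["chinese"] + row["math"], axis=1)

# Column arithmetic computes the same totals without a Python-level loop
vectorized = df_demo["chinese"] + df_demo["math"]

print(row_wise.tolist())
print((row_wise == vectorized).all())
```

Row-wise apply is still the tool to reach for when the per-row logic cannot be expressed as column operations.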

The agg function

On a Series

In [11]:

# 1
df["chinese"].agg(["mean", "sum"])

Out[11]:

mean 90.5
sum 362.0
Name: chinese, dtype: float64
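The shape of the result depends on what is passed to agg. As a rough sketch on a standalone copy of the chinese column (the names `scores`, `single` and `several` are illustrative):

```python
import pandas as pd

# Illustrative data: the "chinese" column from the sample DataFrame
scores = pd.Series([100, 80, 90, 92], name="chinese")

single = scores.agg("mean")             # one function name -> a scalar
several = scores.agg(["mean", "sum"])   # a list of names -> a Series indexed by name

print(single)
print(several)
```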

On a DataFrame

In [12]:

# 2
df[["chinese","math"]].agg({"chinese":["sum"], "math":["mean"]})

Out[12]:

      chinese  math
sum     362.0   NaN
mean      NaN  92.0

In [13]:

# 3
df[["chinese","math"]].agg({"chinese":["sum","mean"], "math":["mean"]})

Out[13]:

      chinese  math
sum     362.0   NaN
mean     90.5  92.0
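The NaN cells come from alignment: when each column gets its own list of functions, the result has one row per function name that appears anywhere in the dict, and a column is left as NaN for any function that was not requested for it. A minimal sketch on a cut-down copy of the data (`df_demo` and `result` are illustrative names):

```python
import pandas as pd

# Illustrative data: only the two score columns from the sample DataFrame
df_demo = pd.DataFrame({"chinese": [100, 80, 90, 92], "math": [90, 100, 88, 90]})

result = df_demo.agg({"chinese": ["sum", "mean"], "math": ["mean"]})

# "sum" was requested only for chinese, so math has no value in that row
print(result.loc["sum", "chinese"])
print(pd.isna(result.loc["sum", "math"]))
print(result.loc["mean", "math"])
```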

Combining groupby with agg:

In [14]:

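# 4
# Select the numeric columns first: newer pandas versions may raise
# when "mean" is applied to the string column "name"
df.groupby("sex")[["chinese","math","score"]].agg(["mean","sum"])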

# 5
df.groupby("sex").agg({"chinese":["mean"], "math":["sum","min","max"]})

You can also name the newly generated columns:

df.groupby("sex").agg(chinese_mean=("chinese","mean"), math_min=("math","min"))
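In this named-aggregation form, each keyword becomes an output column and its value is a (source column, function) pair. The sketch below runs it on a cut-down copy of the data with sex already encoded as 0/1; `df_demo` and `summary` are illustrative names.

```python
import pandas as pd

# Illustrative data: sex encoded as 0 (male) / 1 (female), plus the two score columns
df_demo = pd.DataFrame({
    "sex": [0, 1, 1, 0],
    "chinese": [100, 80, 90, 92],
    "math": [90, 100, 88, 90],
})

summary = df_demo.groupby("sex").agg(
    chinese_mean=("chinese", "mean"),  # output column name = (source column, function)
    math_min=("math", "min"),
)

print(summary)
```

The result has flat column names (chinese_mean, math_min) rather than the two-level columns produced by passing a dict of lists.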

The transform function

At this point df has the columns name, sex, chinese, math and score.

Suppose we need the average chinese score for each sex, appended to df as a new column. How can we do that?

Method 1: groupby + merge

In [18]:

# 1. groupby first
df1 = df.groupby("sex")["chinese"].mean().reset_index()
df1.columns = ["sex", "average"]
df1

# 2. merge the group averages back onto df
df = pd.merge(df, df1, on="sex")
df
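As a self-contained sketch of the groupby + merge route, the version below uses `as_index=False` and `rename` to build the lookup table, and `how="left"` so that every original row is kept in its original order. These are choices made for this illustration, not the article's exact code, and `df_demo`, `avg` and `merged` are made-up names.

```python
import pandas as pd

# Illustrative data: sex encoded as 0 (male) / 1 (female), plus the chinese scores
df_demo = pd.DataFrame({"sex": [0, 1, 1, 0], "chinese": [100, 80, 90, 92]})

# One row per group, with the mean stored under a new column name
avg = (
    df_demo.groupby("sex", as_index=False)["chinese"]
    .mean()
    .rename(columns={"chinese": "average"})
)

# A left join keeps every row of df_demo in its original order
merged = df_demo.merge(avg, on="sex", how="left")

print(merged)
```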

Method 2: groupby + map

In [20]:

dic = df.groupby("sex")["chinese"].mean().to_dict()
dic

Out[20]:

{0: 96.0, 1: 85.0}

In [21]:

df["average_map"] = df["sex"].map(dic)
df

Method 3: transform

transform does it in a single step.

df["average_tran"] = df.groupby("sex")["chinese"].transform("mean")
df
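What makes transform a one-step solution is that it returns one value per original row, aligned to the original index, whereas a plain groupby mean returns one value per group. A small sketch on a cut-down copy of the data (`df_demo`, `group_mean`, `per_row` and `mapped` are illustrative names):

```python
import pandas as pd

# Illustrative data: sex encoded as 0 (male) / 1 (female), plus the chinese scores
df_demo = pd.DataFrame({"sex": [0, 1, 1, 0], "chinese": [100, 80, 90, 92]})

group_mean = df_demo.groupby("sex")["chinese"].mean()            # one value per group
per_row = df_demo.groupby("sex")["chinese"].transform("mean")    # one value per row

# The map route from Method 2 produces the same per-row values
mapped = df_demo["sex"].map(group_mean)

print(len(group_mean), len(per_row))
print(per_row.tolist())
```

Because the transform result already lines up with df, it can be assigned straight to a new column with no merge or lookup in between.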

Did you get all of that?

