This article introduces several common uses of the apply() function in Pandas. apply() is very flexible: it can run a function over a Series element by element, or over a DataFrame column by column or row by row, which makes it convenient and concise, much like applying NumPy functions to arrays.
When using apply(), you usually pass in a lambda expression or a named function as the operation. The official signature of apply() is:
DataFrame.apply(self, func, axis=0, raw=False, result_type=None, args=(), **kwds)
apply() runs the data through the function and returns the result as a Series or a DataFrame.
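As a minimal sketch of that last point (the variable names elementwise and reduced are only illustrative), the return type depends on what the function gives back for each column: a function that keeps the column's length yields a DataFrame, while a function that collapses each column to one value yields a Series.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=["A", "B"])

# np.sqrt returns one value per element, so each column keeps its length
# and the overall result is a DataFrame.
elementwise = df.apply(np.sqrt)

# np.sum collapses each column to a single number, so the result is a
# Series indexed by the column names.
reduced = df.apply(np.sum)

print(type(elementwise).__name__)
print(type(reduced).__name__)
```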
Here are a few concrete examples of using apply():
1. Calculate the square root of each element
For convenience, this example uses NumPy's sqrt function directly:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])
>>> df
   A  B
0  4  9
1  4  9
2  4  9
>>> df.apply(np.sqrt)
     A    B
0  2.0  3.0
1  2.0  3.0
2  2.0  3.0
2. Calculate the average value of each column
Here the data is passed to the function one column at a time, which corresponds to axis=0. Since that is the default, the argument can be omitted:
>>> df.apply(np.mean)
A    4.0
B    9.0
dtype: float64
3. Calculate the average value of each row
The difference from example 2 is that the data is passed to the function one row at a time, so the argument axis=1 must be added:
>>> df.apply(np.mean, axis=1)
0    6.5
1    6.5
2    6.5
dtype: float64
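Because every row of the df above is identical, the effect of axis is easy to miss. The following sketch uses a different, made-up DataFrame (the values are assumptions chosen only for illustration) so that the column-wise and row-wise results are clearly distinct.

```python
import pandas as pd

# Illustrative data, not from the article: each row and column differs.
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# axis=0 (the default): the function receives each column as a Series.
col_means = df.apply(lambda col: col.mean())

# axis=1: the function receives each row as a Series.
row_means = df.apply(lambda row: row.mean(), axis=1)

print(col_means)  # one value per column, indexed by 'A' and 'B'
print(row_means)  # one value per row, indexed by 0, 1, 2
```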
4. Add a new column C whose values are the sum of columns A and B
The simplest way to do this takes a single line of code:
df['C'] = df.A + df.B
Here, however, we use apply() to show how it handles operations across columns. There are two steps:
1. Define a function that computes column A + column B;
2. Pass this function to apply(). Because the values have to be added row by row, set axis=1.
>>> def Add_a(x):
...     return x.A + x.B
...
>>> df['C'] = df.apply(Add_a, axis=1)
>>> df
   A  B   C
0  4  9  13
1  4  9  13
2  4  9  13
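The args parameter in the signature can forward extra positional arguments to a row-wise function like the one above. The sketch below assumes a hypothetical helper, weighted_sum, and arbitrary weights; neither comes from the original example.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=["A", "B"])

# Hypothetical helper: the first argument is the row supplied by apply(),
# the remaining arguments are supplied through args=(...).
def weighted_sum(row, w_a, w_b):
    return w_a * row["A"] + w_b * row["B"]

# axis=1 passes one row at a time; args forwards the two weights.
df["C"] = df.apply(weighted_sum, axis=1, args=(2, 1))

print(df)
```

For a plain sum, the one-line df.A + df.B remains the simpler choice; a row-wise function is mainly useful when the per-row logic does not reduce to ordinary column arithmetic.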
Using apply() on a Series is similar to using it on a DataFrame. The main difference in usage is that you first select a column, in the form DataFrame.column_name.
1. Add 1 to column A
Without apply():
df.A = df.A + 1
With apply(), here using a lambda function:
>>> df.A = df.A.apply(lambda x: x + 1)
>>> df
   A  B   C
0  5  9  13
1  5  9  13
2  5  9  13
2. Check whether each element in column A is divisible by 2, and mark it with Yes or No
>>> df.A = df.A.apply(lambda x: str(x) + "\tYes" if x % 2 == 0 else str(x) + "\tNo")
>>> df
        A  B   C
0  5\tNo  9  13
1  5\tNo  9  13
2  5\tNo  9  13
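The example above overwrites column A with strings, so its numeric values are lost. One alternative, sketched here with a hypothetical helper parity_label, a hypothetical column name A_even, and made-up values, is to store the mark in a separate column.

```python
import pandas as pd

# Illustrative data, not from the article.
df = pd.DataFrame({"A": [5, 6, 7]})

# Hypothetical helper: returns the mark for a single element.
def parity_label(x):
    return "Yes" if x % 2 == 0 else "No"

# Series.apply calls parity_label once per element of column A;
# column A itself stays numeric.
df["A_even"] = df["A"].apply(parity_label)

print(df)
```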
The points above cover most uses of apply(). The examples listed here are fairly simple, but they are enough to understand the basic usage.
That's all for this article. Thank you for reading!