您现在的位置：程式師世界 >> 編程語言 > >> 更多編程語言 >> Python

In depth understanding of pandas sorting mechanism

編輯：Python

author ：Peter edit ：Peter

Hello everyone , I am a Peter~

In a previous article , Describes in detail how to use pandas Built in functions for sort_values To sort the data . This article explains how to use custom methods to achieve sorting ：

Mapping relationship implementation
CategoricalDtype Type implementation

Analog data

First simulate a simple data ：

import pandas as pd
import numpy as np
df = pd.DataFrame({
"nick":["aaa","bbb","aba","abc","cac","ccc"], # nickname
"math":[100,120,130,111,100,128], # mathematics
"english":[140,80,120,90,125,116], # English
"size":["S","M","L","XS","XL","L"] # Clothing size
})
df

sort_values

DataFrame.sort_values(by,
axis=0,
ascending=True,
inplace=False,
kind='quicksort',
na_position='last', # last,first; The default is last
ignore_index=False,
key=None)

The specific interpretation of the parameter is ：

by： Indicates what field or index to sort by , It can be one or more
axis： Is the sorting on the horizontal or vertical axis , The default is the vertical axis axis=0
ascending： Whether the sorting result is in ascending or descending order , The default is ascending
inplace： Indicates whether the sorting result is directly modified in place on the original data or generated a new DatFrame
kind： Indicates the algorithm using sorting , Quick line up quicksort,, Merger mergesort, Heap sort heapsort, Stable sequencing stable , The default is ： Quick line up quicksort
na_position： Location handling of missing values , The default is last , Another option is to first
ignore_index： Whether the index of the newly generated data frame is rearranged , Default False（ Use the index of the original data ）
key： Functions used before sorting

Here are a few simple examples to review sort_values Use ：

Single field sorting

adopt nick Field sorting , Strings are based on letters ASCII code ; The default is ascending from small to large . Same first letter , Compare the second , Reason by analogy ：

Arrange in ascending order according to the size of the values ：

You can change the sorting method to descending ：

Sorting multiple fields

Sorting multiple fields at the same time , Default is also ascending . When the value of the first field is the same , Then arrange them in ascending order according to the second field

Assign different sorting methods to different fields ：

Then compare the two different ways completely ：

Above is sort_values Methods .

Custom sort

Use sort_values Methods are used to sort by the size of the built-in alphabetic or numeric data , When you encounter the following situations , How to operate ？

When we according to the size of the clothes size Sort by , And what you get is ：

Obviously, this sort of sorting is not what we expected , In our cognition ：

XS： Very small
S： Small
M： secondary
L： Big
XL： Super large

How to solve this problem ？ There are two ways ：

Method 1： By mapping

1、 Find each size The value size corresponding to the order of

2、 Generate new fields order

3、 We are right. order Sort

Method 2： Use CategoricalDtype

CategoricalDtype Is a type of categorical data with a category and order , Can create our custom sort data type . Official website address ：

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.CategoricalDtype.html

1、 Specify a classified data type CategoricalDtype

category_size = pd.CategoricalDtype(
['XS', 'S', 'M', 'L', 'XL'],
ordered=True)
category_size

2、 take size The field is set to the above CategoricalDtype type

3、 We're right size Use sort_values We can achieve our goal , And the above map The effect of mapping is the same

And by looking at df Data type of , We also see size The type is category：

上一篇文章： Classic case of pandas data analysis
下一篇文章： Simple use of pandas (text processing)

Python

不懂業務場景，你的Python就白學了。

都說得數據者，得天下。現在，不管是阿裡、騰訊這種大企業，還是

[original launch] learn Python recursive algorithm from Xiao Hui

p{margin:10px 0}.markdown-body

徹底搞懂Python中的zip()

Python 的 zip() 函數創建了一個迭代器，它將聚合

Exchange change for Python - Backpack

This problem needs a backpack

Python（七）Socket編程、IO多路復用、SocketServer

本章內容：SocketIO多路復用（select）Socke

/usr/bin/python: No module named pip

In the installation pip The to

[dynamic programming] divide an array containing m integers into n arrays. The sum of each array should be as close as possible and its deformation (Python Implementation)

A preliminary understanding of print in Python

Introduction to Python development learning notes understanding the use of selectors module

Introduction to Python crawler-34-scratch framework, understanding of the function of the scratch architecture module

Leetcode sword finger offer 55 - I. depth of binary tree_ Python

In depth understanding of Pythons type hints

Factoring n in Python

Python judging prime number (prime number): understanding and example application of for else loop

Artifact! This Python artifact can enlarge pictures and videos n times without damage and clarity!

熱門圖文

淺析C語言字中的符串格式化顯示實現壓縮access(*.mdb)數據庫的方法 C#異常處理中try和catch語句及finally語句的用法示例 c語言-C語言宏定義的語法問題求解 C#設計模式之享元設計模式（Flyweight）編寫安全 PHP應用程序的七個習慣深入分析 poj 1201-差分約束+spfa Lazarus實戰開發之串口通信(WINCE/WIN32)

欄目導航