
Python: using the Pauta criterion (3σ rule) to remove outliers from Excel data

Editor: Python

1. Brief introduction
The Pauta criterion (also called the 3σ rule) assumes that a data set contains only random errors. You compute the mean and standard deviation of the data, then use them to define an interval that a value should fall into with a given probability: for the 3σ rule, the mean plus or minus three standard deviations. Any value outside this interval is treated as an outlier. The rule is appropriate when the data follow a normal or approximately normal distribution.
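As a minimal sketch of the rule described above, the check can be written as a small NumPy function. The function name three_sigma_mask and the sample values are hypothetical and chosen only for illustration; they do not come from the article's dataset.

```python
import numpy as np

def three_sigma_mask(values):
    """Return a boolean array that is True where a value is a 3-sigma outlier."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    std = values.std(ddof=1)  # sample standard deviation, as in the article's code
    return np.abs(values - mean) > 3 * std

# Hypothetical sample: 19 readings close to 10 and one gross error (25.0)
sample = [10.0, 10.1, 9.9] * 6 + [10.0, 25.0]
mask = three_sigma_mask(sample)
print(np.flatnonzero(mask))  # positions of the flagged values
```

One caveat worth keeping in mind: when the sample standard deviation is used, no value can lie more than (n-1)/sqrt(n) standard deviations from the mean, so with 10 or fewer data points the 3σ rule can never flag anything.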

2. Sample dataset

3. Complete processing code

import numpy as np
import pandas as pd

# Path of the Excel file to read
datapath = "traning Before processing .xlsx"
data = pd.read_excel(datapath)

# shape[0] is the number of rows, shape[1] is the number of columns.
# Column 0 is skipped, as in the original loop (presumably a label or index column).
for i in range(1, data.shape[1]):
    print("Processing column " + str(i))
    lie = data.iloc[:, i].to_numpy()
    # Mean and sample standard deviation (ddof=1) of the column
    mea = np.mean(lie)
    s = np.std(lie, ddof=1)
    print("Mean and standard deviation: " + str(mea) + " " + str(s))
    # Check every row, starting at row 0, for values more than
    # three standard deviations away from the mean
    for t in range(data.shape[0]):
        if abs(lie[t] - mea) > 3 * s:
            print(">3sigma" + " " + str(t) + " " + str(i))
            # Replace the outlier with NaN, which is written as an empty cell
            data.iloc[t, i] = np.nan

# Write the processed data back to the original file
data.to_excel(datapath)
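The same filtering can also be done without explicit loops. The following is a sketch of a vectorized alternative, not the article's own method; the function name mask_3sigma_outliers, the columns parameter, and the demo DataFrame are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def mask_3sigma_outliers(df, columns=None):
    """Return a copy of df with 3-sigma outliers in `columns` replaced by NaN."""
    if columns is None:
        # Assumption: default to every numeric column
        columns = list(df.select_dtypes(include="number").columns)
    result = df.copy()
    block = result[columns].astype(float)
    mean = block.mean()
    std = block.std(ddof=1)  # sample standard deviation per column
    # True where a value is more than 3 standard deviations from its column mean
    outliers = (block - mean).abs().gt(3 * std, axis="columns")
    result[columns] = block.mask(outliers)
    return result

# Hypothetical data standing in for the spreadsheet contents
demo = pd.DataFrame({
    "id": range(20),
    "x": [10.0, 10.1, 9.9] * 6 + [10.0, 25.0],
})
cleaned = mask_3sigma_outliers(demo, columns=["x"])
print(cleaned["x"].isna().sum())  # number of values blanked out
```

In practice the input would come from pd.read_excel and the result would be saved with DataFrame.to_excel. Writing to a new file instead of overwriting the source keeps the raw data available for rechecking.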

4. Running results
