
Python: Removing outliers from an Excel sheet with the Pauta criterion (3σ rule)

Editor: Python

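1. Introduction
The Pauta criterion (also known as the 3σ rule) starts from the assumption that a data set contains only random error. It computes the standard deviation of the data, derives an interval that corresponds to a chosen probability, and treats any value outside that interval as an outlier. With the usual 3σ interval, a value x is an outlier when |x - mean| > 3σ; for normally distributed data about 99.73% of values fall inside this interval. The criterion is suitable when the data are normally or approximately normally distributed.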
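
To make the rule concrete, here is a minimal sketch of the criterion applied to a single array. The function name pauta_outliers and the sample readings are made up for illustration and are not part of the original article; the sample standard deviation (ddof=1) is used to match the processing code in section 3.

```python
import numpy as np


def pauta_outliers(values):
    """Return a boolean mask that is True where |x - mean| > 3 * s.

    s is the sample standard deviation (ddof=1).
    """
    x = np.asarray(values, dtype=float)
    mean = x.mean()
    s = x.std(ddof=1)
    return np.abs(x - mean) > 3 * s


# Twenty readings close to 10 plus one gross error (hypothetical data)
readings = [10.1, 9.9, 10.0, 10.2, 9.8] * 4 + [50.0]
mask = pauta_outliers(readings)
print(np.flatnonzero(mask))  # -> [20]
```

Note that the rule needs a reasonably large sample: with the sample standard deviation, no value can lie more than (n-1)/sqrt(n) standard deviations from the mean, so a column with 10 or fewer values can never produce a 3σ outlier.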

2. Sample dataset

3. Complete processing code

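import numpy as np
import pandas as pd

# Path of the Excel file to process
datapath = "traning處理前.xlsx"
data = pd.read_excel(datapath)

# shape[0] is the number of rows, shape[1] the number of columns.
# Loop over the columns; column 0 is skipped, as in the original code
# (presumably it holds row labels rather than measurements).
for i in range(1, data.shape[1]):
    print("Processing column " + str(i))
    lie = data.iloc[:, i].to_numpy()
    # Mean (mea) and sample standard deviation (s) of this column
    mea = np.mean(lie)
    s = np.std(lie, ddof=1)
    print("Mean and standard deviation: " + str(mea) + " " + str(s))
    # Find the rows that deviate from the mean by more than 3 standard deviations.
    # Start at row 0: read_excel has already consumed the header row.
    for t in range(data.shape[0]):
        if abs(lie[t] - mea) > 3 * s:
            print(">3sigma" + " " + str(t) + " " + str(i))
            # Blank out the outlier (NaN is written as an empty cell)
            data.iloc[t, i] = np.nan

# Write the processed data back to the original file,
# without adding the DataFrame index as an extra column
data.to_excel(datapath, index=False)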
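
The same per-column check can also be written without explicit loops. The following is a sketch of one possible vectorized variant, not the author's code: the function name mask_3sigma_outliers is hypothetical, and it assumes that every numeric column should be checked and that non-numeric columns (such as labels) should be left untouched.

```python
import numpy as np
import pandas as pd


def mask_3sigma_outliers(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with 3-sigma outliers in numeric columns set to NaN."""
    result = df.copy()
    numeric = result.select_dtypes(include="number")
    mean = numeric.mean()          # per-column mean
    std = numeric.std(ddof=1)      # per-column sample standard deviation
    # True where a cell deviates from its column mean by more than 3 * std
    outliers = (numeric - mean).abs() > 3 * std
    result[numeric.columns] = numeric.mask(outliers)
    return result


# Small made-up table: one label column and one measurement column
demo = pd.DataFrame({
    "name": [f"r{k}" for k in range(21)],
    "x": [10.0] * 20 + [50.0],
})
cleaned = mask_3sigma_outliers(demo)
print(cleaned["x"].isna().sum())
```

In the workflow above, this could be applied as mask_3sigma_outliers(pd.read_excel(datapath)) followed by to_excel(datapath, index=False). Because the function returns a copy, the original DataFrame stays available for comparison.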

4. Results
