
[Teacher Zhao Yuqiang] Using Python to analyze the distribution characteristics of data


Once the quality of the data has been assessed, the next step is to analyze and compute the characteristics of the data, optionally drawing charts to display them. Common approaches include distribution analysis, comparative analysis, statistical analysis, periodicity analysis, contribution analysis (Pareto analysis), correlation analysis, and normality testing.

Distribution analysis can reveal the distribution characteristics and types of data .

  • For quantitative data, to see whether the distribution is symmetric or skewed and to spot unusually large or small values, draw a frequency distribution histogram or a stem-and-leaf plot for visual analysis;
  • For qualitative data, a pie chart or bar chart can be used to display the distribution visually.
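
As a rough illustration of the stem-and-leaf plot mentioned for quantitative data, the sketch below builds one in plain Python. The function name `stem_and_leaf` and the sample values are made up for this example (they are not taken from the order data), and the function assumes non-negative integers, using the last digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a simple stem-and-leaf display.

    Assumes non-negative integers. The stem is the value with its
    last digit dropped; the leaf is the last digit.
    """
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(int(v), 10)
        groups[stem].append(leaf)

    lines = []
    # Include empty stems so that gaps in the data stay visible
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return lines


# Hypothetical amounts (made-up values, not from the article's dataset)
amounts = [12, 15, 21, 22, 22, 27, 30, 31, 34, 35, 38, 41, 44, 98]
lines = stem_and_leaf(amounts)
for line in lines:
    print(line)
```

In the printed display most leaves cluster on stems 1 to 4, while the single leaf on stem 9 stands apart after several empty stems, which is how an unusually large value shows up in this kind of plot.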

The following example demonstrates distribution analysis for both quantitative and qualitative data. The test data is the sales order data for digital cameras covering the whole of 1998. The first 10 rows are:

[Image: the first 10 rows of the order data]
  • Analyzing quantitative data

For quantitative data, the most common way to show the distribution is a histogram. Also called a quality distribution chart, it is a statistical chart that represents the distribution of the data as a series of vertical bars or line segments of different heights. The horizontal axis generally represents the data categories, and the vertical axis shows the distribution.

The histogram can be drawn with the following steps:

  1. Find the range. For the same indicator, the larger the range, the less stable the data
  2. Group the data and decide the cut points
  3. Build the frequency distribution table
  4. Plot the frequency distribution histogram

The code below finds the range of the order amount, then totals the orders for each month of the year and displays the totals as a bar chart.

import pandas as pd
import matplotlib.pyplot as plt

# The file path and column names are as given in the article; adjust them to match your data
data = pd.read_csv("/root/data/Digital camera order data.csv")

# Find the range of the order amount. For the same indicator,
# the larger the range, the less stable the data
dr = data["Order amount"].max() - data["Order amount"].min()
print("The order amount range is:", dr)

# Build a DataFrame. Only the order time and the order amount are needed here
df = pd.DataFrame({"datetime": data["The order time"], "amount": data["Order amount"]})

# Extract the month from the order time (missing dates become month 0)
df["datetime"] = pd.to_datetime(df["datetime"])
df["month"] = df["datetime"].dt.month.fillna(0).astype("int")

# Total the order amount by month; selecting [["amount"]] keeps the result a DataFrame
result = df.groupby("month")[["amount"]].sum()

# Print the distribution table
print(result)

# Draw the distribution chart (one bar per month)
result.plot(kind="bar")
plt.xlabel("Month")
plt.ylabel("Total Sales")
plt.show()

The resulting chart is shown below.

[Image: bar chart of total sales by month]
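
The four histogram steps (range, grouping, frequency table, histogram) can also be sketched in plain Python. This is a minimal illustration, not the article's own code: the function name `frequency_table`, the choice of equal-width intervals, and the sample amounts are all assumptions made for the example, and the histogram is printed as text so that no plotting library is needed.

```python
def frequency_table(values, n_bins):
    """Group values into n_bins equal-width intervals and count each one.

    Assumes the values are not all identical (the range must be > 0).
    """
    lo, hi = min(values), max(values)
    data_range = hi - lo                       # step 1: find the range
    width = data_range / n_bins                # step 2: decide the cut points
    edges = [lo + i * width for i in range(n_bins + 1)]

    counts = [0] * n_bins                      # step 3: frequency table
    for v in values:
        # The maximum sits on the last edge, so clamp it into the last bin
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return data_range, edges, counts


# Hypothetical order amounts (made-up values, not from the article's dataset)
amounts = [100, 120, 150, 180, 210, 220, 260,
           300, 310, 330, 350, 390, 420, 500]

data_range, edges, counts = frequency_table(amounts, 4)
print("Range:", data_range)

# Step 4: draw the histogram, here as one row of '#' marks per interval
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:>5.0f} - {right:<5.0f} | {'#' * count} ({count})")
```

Each interval includes its left edge and excludes its right edge, except the last one, which also includes the maximum. The number of intervals is a judgment call: too few hides the shape of the distribution, too many leaves most intervals nearly empty.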
  • Analyzing qualitative data

Qualitative data is usually grouped by the categories of the variable, and its distribution is most often shown with a pie chart or a bar chart. For example, a pie chart shows the size of each item in a data series relative to the sum of all items. Each data point in the pie chart is displayed as a percentage of the whole pie.
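
To make the "percentage of the whole" idea concrete, the sketch below counts a qualitative variable by category and converts each count to a share of the total. The function name `category_shares` and the payment-method values are hypothetical; the article's dataset is not described as having such a column.

```python
from collections import Counter


def category_shares(labels):
    """Return {category: (count, percent of all items)}, largest count first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cat: (n, 100 * n / total) for cat, n in counts.most_common()}


# Hypothetical payment methods for ten orders (made-up values)
payments = ["card", "cash", "card", "transfer", "card",
            "cash", "card", "card", "cash", "transfer"]

shares = category_shares(payments)
for cat, (n, pct) in shares.items():
    print(f"{cat:<10}{n:>3}{pct:>7.1f}%")
```

These percentages are what the slices of a pie chart encode. When plotting with matplotlib, `plt.pie` accepts an `autopct` format string such as `'%1.1f%%'` to print the percentage on each slice.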


The following is a pie chart example. Simply take the DataFrame (result) produced at the end of the histogram code above and plot it directly as a pie chart, as shown below.

# Draw a pie chart of the data
result.plot.pie(subplots=True, figsize=(11, 11))
plt.show()
# Note: the slices of this pie chart are the months, 1 through 12.

The resulting pie chart is shown below.

[Image: pie chart of total sales by month]
