
[Monthly summary] Database & Python & Excel_202206

Editor: Python

Excel

1、 Convert a data range into an Excel table ("super table")

How: press Ctrl+T. Effect: when new data is added, charts built on the table update automatically.

2、 Create a pivot table

Shortcut: Alt + D, then P

3、 Dynamic charts

https://jingyan.baidu.com/article/5225f26bb4e00ee6fb090811.html

4、 INDEX function

Syntax: INDEX(reference, row_num, column_num, area_num)
Parameters:
  • reference: Required. A reference to one or more cell ranges.
  • row_num: The number of the row in the reference from which to return a reference.
  • column_num: Optional. The number of the column in the reference from which to return a reference.
  • area_num: Optional. Selects the range within reference from which to return the intersection of row_num and column_num.

5、 Date difference as an integer

Enter: =DATEDIF(C5,C6,"D"). Note: C5 is the start date, C6 is the end date, and "D" returns the difference in days.

Database

1、 Insert data into specified columns of a database table

insert into sku_data (sku,cw_type) values (%s,%s)
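
The statement above can be run from Python with a parameterized query. The sketch below is only an illustration: it uses the standard-library sqlite3 module with an in-memory database so that it runs anywhere, and sqlite3 uses ? placeholders where drivers such as PyMySQL use %s. The table schema, the extra note column, and the sample rows are assumptions made up for the example.

```python
import sqlite3

# In-memory database so the example has no external dependencies.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: only sku and cw_type appear in the original statement.
conn.execute("CREATE TABLE sku_data (sku TEXT, cw_type TEXT, note TEXT)")

# Hypothetical sample rows; values are passed separately from the SQL text.
rows = [("SKU001", "A"), ("SKU002", "B")]
conn.executemany("INSERT INTO sku_data (sku, cw_type) VALUES (?, ?)", rows)
conn.commit()

# Columns that were not listed in the INSERT are left as NULL (None in Python).
result = conn.execute(
    "SELECT sku, cw_type, note FROM sku_data ORDER BY sku"
).fetchall()
print(result)
```

Passing the values as parameters, rather than formatting them into the SQL string, lets the driver handle quoting and avoids SQL injection.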

Python

1、 Rank within groups using Python:

Group by Wangwang ID, order number, and amount of goods, then rank by payment time within each group:
data1['ranks'] = data.groupby(['Wangwang', 'Order number', 'Amount of goods'])['Time of payment'].rank().astype(int)

2、 Build a two-level column index with Python and flatten it into a single level.

data2 = pd.pivot_table(data1, values=['Amount of goods'], index=['Wangwang', 'Time of payment'], columns=['ranks'])
data2.columns.tolist()
# Rename the columns, joining both levels into one string, e.g. 'Amount of goods time 1'
data2.columns = [str(s1) + " time " + str(s2) for (s1, s2) in data2.columns.tolist()]
data2.reset_index(inplace=True)  # Reset the index so the row index levels become ordinary columns
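
As a rough end-to-end sketch of items 1 and 2, the example below ranks payments within each customer, pivots the ranks into columns, and flattens the resulting two-level column index. The column names (customer, paid_at, amount) and the data are hypothetical stand-ins for the original columns, and rank(method="first") is a choice made here so that ties cannot produce fractional ranks.

```python
import pandas as pd

# Hypothetical sample data standing in for the original order table.
data = pd.DataFrame({
    "customer": ["a", "a", "a", "b", "b"],
    "paid_at": pd.to_datetime(
        ["2022-06-01", "2022-06-03", "2022-06-02", "2022-06-05", "2022-06-04"]
    ),
    "amount": [10.0, 30.0, 20.0, 50.0, 40.0],
})

# Rank each payment by time within its customer group (1 = earliest).
data["ranks"] = data.groupby("customer")["paid_at"].rank(method="first").astype(int)

# Passing values as a list produces two-level columns: (value name, rank).
wide = pd.pivot_table(data, values=["amount"], index=["customer"], columns=["ranks"])

# Flatten the two levels into single strings such as "amount_1".
wide.columns = [f"{name}_{rank}" for name, rank in wide.columns.tolist()]

# Turn the row index back into an ordinary column.
wide = wide.reset_index()
print(wide)
```

Each row of wide then holds one customer, with one column per payment in chronological order; customers with fewer payments get NaN in the later columns.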

3、 Distribution density plot. Plotting function: sns.distplot() (deprecated since seaborn 0.11 in favor of sns.histplot() and sns.displot())

Reference (auxiliary) line: plt.axvline()

4、 Format a proportion as a percentage

bk_std_fsl['Proportion of orders'] = bk_std_fsl['Number of orders'].apply(lambda x: '%.2f%%' % (x / bk_std_fsl['Number of orders'].sum() * 100))
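
A small sketch of the same formatting on made-up data may help. The frame and column names (orders, order_count, order_share) are hypothetical; the total is computed once outside the lambda, which gives the same result as the one-liner above without re-summing the column for every row.

```python
import pandas as pd

# Hypothetical order counts per channel.
orders = pd.DataFrame({"channel": ["A", "B", "C"], "order_count": [1, 1, 2]})

# Compute the total once, then format each share as a percentage string.
total = orders["order_count"].sum()
orders["order_share"] = orders["order_count"].apply(
    lambda x: "%.2f%%" % (x / total * 100)
)
print(orders)
```

Note that the new column holds strings, so it is suited to display rather than further arithmetic.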

5、 Remove missing values

data.dropna(inplace = True)   # Remove missing values

6、 Remove duplicates with pandas

Remove all duplicate rows: df.drop_duplicates(inplace=True)
Note: inplace=True means the method does not return a new DataFrame; instead, it deletes the duplicates from the original DataFrame.

7、 Replace several kinds of values at once

For example, replace null values and certain placeholder strings in a DataFrame with 0:
df.replace([np.nan, 'No data', 'countless'], 0, inplace=True)
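
Items 5 to 7 can be tried together on a tiny frame. This is only a sketch on hypothetical data (the sku and stock columns are invented), and it returns new DataFrames instead of using inplace=True so that each step can be inspected separately.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a duplicate row, a placeholder string, and a missing value.
df = pd.DataFrame({
    "sku": ["a", "a", "b", "c"],
    "stock": [3, 3, "No data", np.nan],
})

# Item 5: drop rows that contain any missing value.
without_missing = df.dropna()

# Item 6: drop rows that are exact duplicates of an earlier row.
deduped = df.drop_duplicates()

# Item 7: replace both NaN and the placeholder string with 0.
filled = deduped.replace([np.nan, "No data"], 0)

print(without_missing)
print(filled)
```

Without inplace=True the original df is left unchanged, which makes it easier to compare the result of each step.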

8、 Transposition

data.T

9、 numpy.concatenate()

numpy provides numpy.concatenate((a1, a2, a3, ...), axis=0), which joins several arrays in one call; a1, a2, a3 are array-like arguments.
a = np.array([2, 3, 4])
b = np.array([11, 22, 22])
c = np.array([55, 88, 99])
np.concatenate((a, b, c), axis=0)  # axis=0 is the default and can be omitted
# array([ 2,  3,  4, 11, 22, 22, 55, 88, 99])
# One-dimensional arrays have only axis 0, so there is no other axis to choose.
For example:
angle = np.linspace(0, 2*np.pi, 4, endpoint=False)  # Set the angular position of each data point
angle = np.concatenate((angle, [angle[0]]))  # Append the first angle so the shape closes
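
The two snippets above can be checked with a short, self-contained run. The variable names joined and closed are introduced here only for illustration.

```python
import numpy as np

a = np.array([2, 3, 4])
b = np.array([11, 22, 22])
c = np.array([55, 88, 99])

# Join the three one-dimensional arrays end to end (axis=0 is the default).
joined = np.concatenate((a, b, c))

# Four evenly spaced angles in [0, 2*pi), then repeat the first one at the end
# so that a polygon drawn through them returns to its starting point.
angle = np.linspace(0, 2 * np.pi, 4, endpoint=False)
closed = np.concatenate((angle, [angle[0]]))

print(joined)
print(closed)
```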

10、pd.cut() Parameters

pd.cut(x,bins,right=True,labels=None,retbins=False,precision=3,include_lowest=False,duplicates='raise')
  • x: One-dimensional array to bin.
  • bins: An integer, a sequence of scalars, or an IntervalIndex; it defines how the data is grouped.
         If an integer n, the range of x is split into n equal-width bins (each bin spans approximately the same distance between its lower and upper edge).
         If a sequence of scalars, the values are the bin edges.
         If an IntervalIndex, its intervals must not overlap.
  • right: Boolean, default True; whether each bin includes its right edge.
         With right=True (the default), bins=[1, 2, 3, 4] means (1,2], (2,3], (3,4].
         Ignored when bins is an IntervalIndex.
  • labels: Array or False, optional. Labels for the returned bins.
         If an array, its length must equal the number of bins; for example, bins=[1, 2, 3, 4] gives 3 intervals, so labels needs 3 entries.
         If False, only the integer index of each bin is returned, that is, which bin each value of x falls into.
         Ignored when bins is an IntervalIndex.
  • retbins: Whether to also return the bin edges. Default False. When bins is an integer, set retbins=True to get the computed edges.
  • precision: Integer, default 3. The precision at which bin labels are stored and displayed.
  • include_lowest: Boolean, default False. Whether the first interval is closed on the left, that is, whether it includes the lowest edge.
  • duplicates: What to do if the bin edges are not unique: 'raise' (the default) raises a ValueError, 'drop' discards the non-unique edges.
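
A minimal sketch of how these parameters interact, using the bins=[1, 2, 3, 4] case from the list above. The input values and the label names ("low", "mid", "high") are made up for the example.

```python
import pandas as pd

values = pd.Series([1, 2, 3, 4])
edges = [1, 2, 3, 4]

# Default: intervals (1,2], (2,3], (3,4]; the value 1 falls outside and becomes NaN.
binned = pd.cut(values, bins=edges)

# include_lowest=True closes the first interval on the left, so 1 is included;
# labels=False returns the integer position of each bin instead of intervals.
codes = pd.cut(values, bins=edges, labels=False, include_lowest=True)

# Three intervals need exactly three labels.
named = pd.cut(values, bins=edges, labels=["low", "mid", "high"], include_lowest=True)

# With an integer number of bins, retbins=True also returns the computed edges.
auto_binned, auto_edges = pd.cut(values, bins=3, retbins=True)

print(binned)
print(codes.tolist())
print(named.tolist())
print(auto_edges)
```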

11、matplotlib.pyplot.axvline()

Parameters:
  • x: Position on the x-axis, in data coordinates, at which to draw the vertical line.
  • ymin: Where the line starts along the y-axis, as a value between 0 and 1; 0 is the bottom of the axes and 1 is the top.
  • ymax: Where the line ends along the y-axis, as a value between 0 and 1; 0 is the bottom of the axes and 1 is the top.
  • **kwargs: Other optional parameters that set the line's properties, such as color and line width.
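
The sketch below combines items 3 and 11: a density-style histogram with a vertical reference line at the mean. It uses plt.hist(density=True) from matplotlib rather than seaborn, purely to keep the example dependency-light; the random sample and the styling values are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # Non-interactive backend so the example runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample: 500 draws from a normal distribution.
rng = np.random.default_rng(0)
values = rng.normal(loc=50, scale=10, size=500)

plt.figure()
# density=True scales the bars so that the histogram area sums to 1.
plt.hist(values, bins=30, density=True, alpha=0.6)

# Vertical line at the sample mean, spanning the full height of the axes.
mean_line = plt.axvline(
    x=values.mean(), ymin=0, ymax=1, color="r", linestyle="--", linewidth=1.5
)
```

Because ymin and ymax are fractions of the axes height, the line keeps spanning the full plot even if the y-axis limits change later.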

12、 Set gridlines

Add gridlines and set their style with: grid(color='color', linestyle='linestyle', linewidth=number)
Parameters:
  • color: 'b' blue, 'm' magenta, 'g' green, 'y' yellow, 'r' red, 'k' black, 'w' white, 'c' cyan, or a hex RGB string such as '#008000'.
  • linestyle: '-' solid, '--' dashed, '-.' dash-dot, ':' dotted.
  • linewidth: Line width, given as a number.
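
As a minimal illustration of the grid() call described above, the sketch below draws a small line chart and adds green dashed gridlines. The plotted data and the chosen style values are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # Non-interactive backend so the example runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [2, 4, 3])

# Passing style arguments turns the grid on and applies the style to its lines.
ax.grid(color="g", linestyle="--", linewidth=0.5)

# Inspect one of the resulting x-axis gridlines.
first_gridline = ax.get_xgridlines()[0]
```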

13、 matplotlib stacked area chart

data.plot.area(colormap='', figsize=(x, y)), where colormap is the name of the colormap to use and figsize is the figure size in inches
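
A short sketch of a stacked area chart, assuming a small DataFrame whose columns are the series to stack. The data, the column names, and the "viridis" colormap are illustrative choices only.

```python
import matplotlib
matplotlib.use("Agg")  # Non-interactive backend so the example runs without a display
import pandas as pd

# Hypothetical monthly values for two product lines.
data = pd.DataFrame(
    {"product_a": [1, 2, 3], "product_b": [2, 2, 2]},
    index=["Apr", "May", "Jun"],
)

# Each column becomes one band; bands are stacked by default.
ax = data.plot.area(colormap="viridis", figsize=(6, 4))
ax.set_ylabel("amount")
```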

14、 matplotlib radar chart

  • To draw a radar chart, first set up polar coordinates.
  • Once the polar axes exist, bar charts, line charts, and so on can be drawn in them; in most cases a line chart is used, forming an irregular closed polygon.
Plot several points, with the last point equal to the first, so that the figure is closed.
plt.polar(theta, r, "ro", lw=width)    # Angles are given in radians; in "ro", r means red and o is the circle marker; lw sets the line width
For example, 360 degrees is 2π (2*np.pi) and 180 degrees is π (np.pi):
plt.polar(0.25*np.pi, 20, "ro")
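
Putting the pieces together, a radar chart can be sketched as below: evenly spaced angles, the first point repeated at the end to close the polygon, and plt.polar to draw it. The category names and scores are hypothetical, and the "ro-" format (red circles joined by a line) is a choice made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # Non-interactive backend so the example runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical categories and scores.
labels = ["price", "quality", "service", "speed"]
scores = np.array([4, 3, 5, 2])

# One angle per category, in radians, then repeat the first point to close the shape.
angle = np.linspace(0, 2 * np.pi, len(scores), endpoint=False)
closed_angle = np.concatenate((angle, [angle[0]]))
closed_scores = np.concatenate((scores, [scores[0]]))

plt.figure()
# plt.polar creates polar axes on the empty figure and plots on them.
lines = plt.polar(closed_angle, closed_scores, "ro-", lw=2)
plt.fill(closed_angle, closed_scores, alpha=0.25)

# Label each spoke; thetagrids expects angles in degrees.
plt.thetagrids(np.degrees(angle), labels)
```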