
Python Chapter 13: Introduction to Modeling Libraries in Python

Editor: Python

In this book, I have already introduced the basics of Python programming for data analysis. Because data analysts and scientists spend so much of their time on data wrangling and preparation, the book focuses on mastering those capabilities.

Which library to use for developing models depends on the application. Many statistical problems can be solved with simple techniques such as ordinary least squares regression, while others may call for more advanced machine learning methods. Fortunately, Python has become one of the languages of choice for implementing these analytical methods, so there are many tools you can explore after finishing this book.

In this chapter, I will review some features of pandas that may come in handy when you move back and forth between data wrangling with pandas and model fitting and scoring. Then I will briefly introduce two popular modeling toolkits, statsmodels and scikit-learn. Each of these projects deserves a book of its own, so I will not attempt a comprehensive treatment. Instead, I recommend studying the online documentation of both projects, along with other Python-based books on data science, statistics, and machine learning.

13.1 Interfacing Between pandas and Model Code

A common workflow for model development is to use pandas for data loading and cleaning, then switch to a modeling library to build the model itself. An important part of model development is what machine learning calls "feature engineering". The term covers any data transformation or analysis that extracts information from a raw dataset that may be useful in a modeling context. The data aggregation and GroupBy tools are often used for feature engineering.
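To make the role of GroupBy in feature engineering concrete, here is a minimal sketch. The `orders` table, its column names, and the chosen statistics are hypothetical and serve only as an illustration. It builds one table of per-group summary features and one row-level feature derived from a group statistic.

```python
import pandas as pd

# Hypothetical raw data: one row per purchase.
orders = pd.DataFrame({
    "customer": ["a", "a", "b", "b", "b"],
    "amount": [10.0, 30.0, 5.0, 15.0, 40.0],
})

# Aggregate per customer to build candidate model features
# (one row per customer, one column per summary statistic).
features = orders.groupby("customer")["amount"].agg(
    n_orders="count",
    mean_amount="mean",
    max_amount="max",
)

# Broadcast a group-level statistic back onto each original row
# with transform, then derive a row-level feature from it.
group_mean = orders.groupby("customer")["amount"].transform("mean")
orders["amount_vs_mean"] = orders["amount"] - group_mean

print(features)
print(orders)
```

Whether aggregated or row-level features are more useful depends on what a single observation represents in the model.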

The details of good feature engineering are beyond the scope of this book, but I will show some methods that make it easier to switch between data manipulation with pandas and modeling.

The point of contact between pandas and other analysis libraries is usually NumPy arrays. To convert a DataFrame to a NumPy array, use the .values attribute:
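A minimal sketch of this conversion follows, assuming a small DataFrame whose column names and values are made up for illustration. It also shows the round trip back to a DataFrame and what happens when the columns have mixed types.

```python
import pandas as pd

# Hypothetical numeric data; names and values are illustrative only.
data = pd.DataFrame({
    "x0": [1, 2, 3, 4, 5],
    "x1": [0.01, -0.01, 0.25, -4.1, 0.0],
    "y": [-1.5, 0.0, 3.6, 1.3, -2.0],
})

# .values returns a 2-D NumPy array; the integer column is upcast
# so that all columns share a single float64 dtype.
arr = data.values
print(arr.dtype, arr.shape)

# Going back: wrap the array in a DataFrame and supply column names.
df2 = pd.DataFrame(arr, columns=["one", "two", "three"])

# With a non-numeric column mixed in, the result is an array of
# Python objects, which most modeling libraries cannot use directly.
mixed = data.copy()
mixed["strings"] = ["a", "b", "c", "d", "e"]
mixed_arr = mixed.values
print(mixed_arr.dtype)
```

In recent pandas versions, `data.to_numpy()` performs the same conversion and is the method the pandas documentation recommends over `.values`.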