
Python Data Analysis and Machine Learning 44: Generating Time Series in Python


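Contents

  • 1. Generating time series in Python
  • 2. Generating time series with different intervals
  • 3. Truncating a time range
  • 4. Timestamps and time arithmetic
  • 5. Resampling data
  • 6. Moving window functions
  • References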

1. Generating time series in Python

Time series

  • Timestamp: a single point in time
  • Period: a fixed span of time
  • Interval: the length of time between two points
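The three concepts above can be sketched with pandas objects. This is a minimal illustration, not taken from the original article: it assumes that "interval" refers to a duration, represented here by pd.Timedelta, and the variable names are hypothetical.

```python
import pandas as pd

# A timestamp marks a single instant.
ts = pd.Timestamp("2022-07-25 10:15")

# A period covers a whole span of time, here the month of July 2022.
month = pd.Period("2022-07", freq="M")

# A period knows its own boundaries, so we can check whether the instant falls inside it.
inside = month.start_time <= ts <= month.end_time

# Assumption: "interval" is read as a duration, i.e. the gap between two timestamps.
gap = pd.Timestamp("2022-07-28") - pd.Timestamp("2022-07-25")

print(ts)      # the instant
print(month)   # the span
print(inside)  # whether the instant lies within the span
print(gap)     # the duration between the two dates
```

A Timestamp answers "when exactly", a Period answers "during which span", and a Timedelta answers "how long".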

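date_range

  • Takes a start time together with a number of periods (or an end time) and a frequency
  • H: hourly
  • D: daily
  • M: monthly, anchored at month end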
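As a rough sketch of how the three frequencies behave, the example below builds one short index per frequency. It uses offset objects (pd.offsets.Hour, Day, MonthEnd) rather than the string aliases, on the assumption that this runs unchanged across pandas versions: to my knowledge, recent releases spell the hourly alias 'h' and the month-end alias 'ME'.

```python
import pandas as pd

start = "2022-07-01"

# Hourly steps (string alias 'H'; newer pandas spells it 'h').
hourly = pd.date_range(start, periods=3, freq=pd.offsets.Hour())

# Daily steps (string alias 'D').
daily = pd.date_range(start, periods=3, freq=pd.offsets.Day())

# Month-end steps (string alias 'M'; newer pandas spells it 'ME').
monthly = pd.date_range(start, periods=3, freq=pd.offsets.MonthEnd())

for name, idx in [("hourly", hourly), ("daily", daily), ("monthly", monthly)]:
    print(name, [str(t) for t in idx])
```

Note that the month-end index does not start on the requested start date: the first entry is the first month end on or after it.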

2. Generating time series with different intervals

Code:

import pandas as pd
import numpy as np
import datetime as dt
# Starting at 2022-07-01, generate 10 timestamps spaced 3 days apart
rng = pd.date_range('2022-07-01', periods = 10, freq = '3D')
print(rng)
print("#####################")
# Specify a start time, an end time, and a frequency (month end)
# Note: recent pandas releases deprecate 'M' here in favor of 'ME'
data=pd.date_range('2022-01-01','2023-01-01',freq='M')
print(data)
print("#####################")
# Starting at 2022-01-01, attach 20 random values to a daily index
time=pd.Series(np.random.randn(20),
    index=pd.date_range(dt.datetime(2022,1,1),periods=20))
print(time)
print("#####################")
# A non-standard step: each period spans 25 hours
# Note: recent pandas releases deprecate 'H' in favor of 'h'
p1 = pd.period_range('2022-01-01 10:10', freq = '25H', periods = 10)
print(p1)
print("######################################")
# Use a date range as the index of a Series
rng = pd.date_range('2022 Jul 1', periods = 10, freq = 'D')
print(pd.Series(range(len(rng)), index = rng))
print("######################################")

Test output:

DatetimeIndex(['2022-07-01', '2022-07-04', '2022-07-07', '2022-07-10',
'2022-07-13', '2022-07-16', '2022-07-19', '2022-07-22',
'2022-07-25', '2022-07-28'],
dtype='datetime64[ns]', freq='3D')
#####################
DatetimeIndex(['2022-01-31', '2022-02-28', '2022-03-31', '2022-04-30',
'2022-05-31', '2022-06-30', '2022-07-31', '2022-08-31',
'2022-09-30', '2022-10-31', '2022-11-30', '2022-12-31'],
dtype='datetime64[ns]', freq='M')
#####################
2022-01-01 -0.957412
2022-01-02 -0.333720
2022-01-03 1.079960
2022-01-04 0.050675
2022-01-05 0.270313
2022-01-06 -0.222715
2022-01-07 -0.560258
2022-01-08 1.009430
2022-01-09 -0.678157
2022-01-10 0.213557
2022-01-11 -0.720791
2022-01-12 0.332096
2022-01-13 -0.986449
2022-01-14 -0.357303
2022-01-15 -0.559618
2022-01-16 0.480281
2022-01-17 -0.443998
2022-01-18 1.541631
2022-01-19 -0.094559
2022-01-20 1.875012
Freq: D, dtype: float64
#####################
PeriodIndex(['2022-01-01 10:00', '2022-01-02 11:00', '2022-01-03 12:00',
'2022-01-04 13:00', '2022-01-05 14:00', '2022-01-06 15:00',
'2022-01-07 16:00', '2022-01-08 17:00', '2022-01-09 18:00',
'2022-01-10 19:00'],
dtype='period[25H]', freq='25H')
######################################
2022-07-01 0
2022-07-02 1
2022-07-03 2
2022-07-04 3
2022-07-05 4
2022-07-06 5
2022-07-07 6
2022-07-08 7
2022-07-09 8
2022-07-10 9
Freq: D, dtype: int64
######################################
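The first listing above fixes the length of the index with periods, the second with an end date. A small sketch (variable names are hypothetical) suggests the two forms describe the same index when the end date lands on the grid, and confirms that every step is 3 days:

```python
import pandas as pd

# Same start and step, once bounded by a count and once by an end date.
by_count = pd.date_range("2022-07-01", periods=10, freq="3D")
by_end = pd.date_range("2022-07-01", "2022-07-28", freq="3D")

# Both calls should yield the same ten timestamps.
print(by_count.equals(by_end))

# Differences between consecutive entries; a regular index has exactly one distinct step.
steps = (by_count[1:] - by_count[:-1]).unique()
print(steps)
```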

3. Truncating a time range

Code:

import pandas as pd
import numpy as np
import datetime as dt
# Starting at 2022-01-01, attach 20 random values to a daily index
time=pd.Series(np.random.randn(20),
    index=pd.date_range(dt.datetime(2022,1,1),periods=20))
print(time)
print("#####################")
# Keep only the data from 2022-01-10 onward
print(time.truncate(before='2022-1-10'))
print("#####################")
# Keep only the data up to and including 2022-01-10
print(time.truncate(after='2022-1-10'))
print("#####################")
# Select a date range by slicing (both endpoints are included)
print(time['2022-01-15':'2022-01-20'])
print("#####################")

Test output:

2022-01-01 -0.203552
2022-01-02 -1.035483
2022-01-03 0.252587
2022-01-04 -1.046993
2022-01-05 0.152435
2022-01-06 -0.534518
2022-01-07 0.770170
2022-01-08 -0.038129
2022-01-09 0.531485
2022-01-10 0.499937
2022-01-11 0.815295
2022-01-12 2.315740
2022-01-13 -0.443379
2022-01-14 -0.689247
2022-01-15 0.667250
2022-01-16 -2.067246
2022-01-17 -0.105151
2022-01-18 -0.420562
2022-01-19 1.012943
2022-01-20 0.509710
Freq: D, dtype: float64
#####################
2022-01-10 0.499937
2022-01-11 0.815295
2022-01-12 2.315740
2022-01-13 -0.443379
2022-01-14 -0.689247
2022-01-15 0.667250
2022-01-16 -2.067246
2022-01-17 -0.105151
2022-01-18 -0.420562
2022-01-19 1.012943
2022-01-20 0.509710
Freq: D, dtype: float64
#####################
2022-01-01 -0.203552
2022-01-02 -1.035483
2022-01-03 0.252587
2022-01-04 -1.046993
2022-01-05 0.152435
2022-01-06 -0.534518
2022-01-07 0.770170
2022-01-08 -0.038129
2022-01-09 0.531485
2022-01-10 0.499937
Freq: D, dtype: float64
#####################
2022-01-15 0.667250
2022-01-16 -2.067246
2022-01-17 -0.105151
2022-01-18 -0.420562
2022-01-19 1.012943
2022-01-20 0.509710
Freq: D, dtype: float64
#####################
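Because the listing above uses random values, the boundaries are easier to check on a deterministic series. The sketch below (hypothetical names, integer values 0 to 19) shows that truncate keeps the boundary date on both sides and that passing both before and after matches label slicing:

```python
import pandas as pd

idx = pd.date_range("2022-01-01", periods=20, freq="D")
s = pd.Series(range(20), index=idx)

# before= drops everything earlier than the date; the date itself is kept.
tail = s.truncate(before="2022-01-10")

# after= drops everything later than the date; the date itself is kept.
head = s.truncate(after="2022-01-10")

# Using both bounds gives the same rows as an inclusive label slice.
middle = s.truncate(before="2022-01-15", after="2022-01-20")

print(len(tail), len(head))                          # 11 10
print(middle.equals(s["2022-01-15":"2022-01-20"]))   # True
```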

4. Timestamps and time arithmetic

Code:

import pandas as pd
import numpy as np
import datetime as dt
# Timestamps
print(pd.Timestamp('2022-07-25'))
print(pd.Timestamp('2022-07-25 10'))
print(pd.Timestamp('2022-07-25 10:15'))
print("######################################")
# Periods
print(pd.Period('2022-01'))
print(pd.Period('2022-01-01'))
print("######################################")
# Time arithmetic
#help(pd.Timedelta)
print(pd.Period('2022-01-01 10:10') + pd.Timedelta('1 day'))
print(pd.Period('2022-01-01 10:10:10') + pd.Timedelta('1 s'))
print("######################################")

Test output:

2022-07-25 00:00:00
2022-07-25 10:00:00
2022-07-25 10:15:00
######################################
2022-01
2022-01-01
######################################
2022-01-02 10:10
2022-01-01 10:10:11
######################################
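The same arithmetic also works on Timestamp objects, and a Period can be shifted by a whole number of its own units. The following is a small sketch with hypothetical variable names, not part of the original listing:

```python
import pandas as pd

t0 = pd.Timestamp("2022-01-01 10:10")

# Adding a Timedelta moves a timestamp forward.
t1 = t0 + pd.Timedelta(days=1, hours=2)

# Subtracting two timestamps yields a Timedelta.
elapsed = t1 - t0

# Adding an integer to a Period shifts it by that many periods (here: months).
p = pd.Period("2022-01", freq="M")
next_p = p + 1

print(t1)                        # 2022-01-02 12:10:00
print(elapsed.total_seconds())   # 93600.0
print(next_p)                    # 2022-02
```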

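5. Resampling data

Resampling

  • Converts time series data from one frequency to another
  • Downsampling: moving to a lower frequency (for example daily to monthly), which aggregates values
  • Upsampling: moving to a higher frequency (for example every 3 days to daily), which creates gaps that need filling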
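Both directions can be seen on a tiny deterministic series. This is a minimal sketch with made-up values (1 through 6) and hypothetical names:

```python
import pandas as pd

idx = pd.date_range("2022-01-01", periods=6, freq="D")
ts = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

# Downsampling: six daily values collapse into two 3-day buckets.
sums = ts.resample("3D").sum()     # 1+2+3 and 4+5+6
means = ts.resample("3D").mean()   # each sum divided by 3

# Upsampling: going back to daily frequency creates rows with no data (NaN).
up = means.resample("D").asfreq()

print(sums.tolist())    # [6, 15]
print(means.tolist())   # [2.0, 5.0]
print(up)
```

The upsampled series only spans 2022-01-01 to 2022-01-04, because the index of the 3-day series ends at the label of its last bucket.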

Code:

import pandas as pd
import numpy as np
import datetime as dt
# Generate a daily time series
rng = pd.date_range('1/1/2022', periods=90, freq='D')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
#print(ts.head())
# Sum by month (recent pandas releases deprecate 'M' here in favor of 'ME')
print(ts.resample('M').sum())
print("######################################")
# Sum in 3-day buckets
print(ts.resample('3D').sum())
print("######################################")
# Mean of each 3-day bucket
day3Ts = ts.resample('3D').mean()
print(day3Ts)
print("######################################")
# Upsample the 3-day series to daily frequency; most of the new rows are NaN
# Ways to fill the gaps:
# 1. ffill: carry the previous value forward
# 2. bfill: carry the next value backward
# 3. interpolate: fill linearly between known values
print(day3Ts.resample('D').asfreq())
print("######################################")
# limit=1 fills at most one consecutive NaN after each known value
print(day3Ts.resample('D').ffill(limit=1))
print("######################################")
# limit=1 fills at most one consecutive NaN before each known value
print(day3Ts.resample('D').bfill(limit=1))
print("######################################")
print(day3Ts.resample('D').interpolate('linear'))
print("######################################")

Test output:

2022-01-31 0.904974
2022-02-28 -1.930083
2022-03-31 7.617911
Freq: M, dtype: float64
######################################
2022-01-01 0.104413
2022-01-04 2.255400
2022-01-07 -0.993552
2022-01-10 1.234344
2022-01-13 -0.621381
2022-01-16 -0.072830
2022-01-19 -0.215890
2022-01-22 0.050444
2022-01-25 -1.794619
2022-01-28 0.030952
2022-01-31 -1.022843
2022-02-03 -1.035522
2022-02-06 -1.124857
2022-02-09 1.915781
2022-02-12 0.263875
2022-02-15 0.927552
2022-02-18 0.760483
2022-02-21 -2.771669
2022-02-24 2.157336
2022-02-27 0.107964
2022-03-02 -0.852413
2022-03-05 1.252628
2022-03-08 -0.529793
2022-03-11 2.110139
2022-03-14 1.624062
2022-03-17 -0.241604
2022-03-20 -2.165326
2022-03-23 2.975993
2022-03-26 1.389412
2022-03-29 0.874324
dtype: float64
######################################
2022-01-01 0.034804
2022-01-04 0.751800
2022-01-07 -0.331184
2022-01-10 0.411448
2022-01-13 -0.207127
2022-01-16 -0.024277
2022-01-19 -0.071963
2022-01-22 0.016815
2022-01-25 -0.598206
2022-01-28 0.010317
2022-01-31 -0.340948
2022-02-03 -0.345174
2022-02-06 -0.374952
2022-02-09 0.638594
2022-02-12 0.087958
2022-02-15 0.309184
2022-02-18 0.253494
2022-02-21 -0.923890
2022-02-24 0.719112
2022-02-27 0.035988
2022-03-02 -0.284138
2022-03-05 0.417543
2022-03-08 -0.176598
2022-03-11 0.703380
2022-03-14 0.541354
2022-03-17 -0.080535
2022-03-20 -0.721775
2022-03-23 0.991998
2022-03-26 0.463137
2022-03-29 0.291441
dtype: float64
######################################
2022-01-01 0.034804
2022-01-02 NaN
2022-01-03 NaN
2022-01-04 0.751800
2022-01-05 NaN
2022-01-06 NaN
2022-01-07 -0.331184
2022-01-08 NaN
2022-01-09 NaN
2022-01-10 0.411448
2022-01-11 NaN
2022-01-12 NaN
2022-01-13 -0.207127
2022-01-14 NaN
2022-01-15 NaN
2022-01-16 -0.024277
2022-01-17 NaN
2022-01-18 NaN
2022-01-19 -0.071963
2022-01-20 NaN
2022-01-21 NaN
2022-01-22 0.016815
2022-01-23 NaN
2022-01-24 NaN
2022-01-25 -0.598206
2022-01-26 NaN
2022-01-27 NaN
2022-01-28 0.010317
2022-01-29 NaN
2022-01-30 NaN
...
2022-02-28 NaN
2022-03-01 NaN
2022-03-02 -0.284138
2022-03-03 NaN
2022-03-04 NaN
2022-03-05 0.417543
2022-03-06 NaN
2022-03-07 NaN
2022-03-08 -0.176598
2022-03-09 NaN
2022-03-10 NaN
2022-03-11 0.703380
2022-03-12 NaN
2022-03-13 NaN
2022-03-14 0.541354
2022-03-15 NaN
2022-03-16 NaN
2022-03-17 -0.080535
2022-03-18 NaN
2022-03-19 NaN
2022-03-20 -0.721775
2022-03-21 NaN
2022-03-22 NaN
2022-03-23 0.991998
2022-03-24 NaN
2022-03-25 NaN
2022-03-26 0.463137
2022-03-27 NaN
2022-03-28 NaN
2022-03-29 0.291441
Freq: D, Length: 88, dtype: float64
######################################
2022-01-01 0.034804
2022-01-02 0.034804
2022-01-03 NaN
2022-01-04 0.751800
2022-01-05 0.751800
2022-01-06 NaN
2022-01-07 -0.331184
2022-01-08 -0.331184
2022-01-09 NaN
2022-01-10 0.411448
2022-01-11 0.411448
2022-01-12 NaN
2022-01-13 -0.207127
2022-01-14 -0.207127
2022-01-15 NaN
2022-01-16 -0.024277
2022-01-17 -0.024277
2022-01-18 NaN
2022-01-19 -0.071963
2022-01-20 -0.071963
2022-01-21 NaN
2022-01-22 0.016815
2022-01-23 0.016815
2022-01-24 NaN
2022-01-25 -0.598206
2022-01-26 -0.598206
2022-01-27 NaN
2022-01-28 0.010317
2022-01-29 0.010317
2022-01-30 NaN
...
2022-02-28 0.035988
2022-03-01 NaN
2022-03-02 -0.284138
2022-03-03 -0.284138
2022-03-04 NaN
2022-03-05 0.417543
2022-03-06 0.417543
2022-03-07 NaN
2022-03-08 -0.176598
2022-03-09 -0.176598
2022-03-10 NaN
2022-03-11 0.703380
2022-03-12 0.703380
2022-03-13 NaN
2022-03-14 0.541354
2022-03-15 0.541354
2022-03-16 NaN
2022-03-17 -0.080535
2022-03-18 -0.080535
2022-03-19 NaN
2022-03-20 -0.721775
2022-03-21 -0.721775
2022-03-22 NaN
2022-03-23 0.991998
2022-03-24 0.991998
2022-03-25 NaN
2022-03-26 0.463137
2022-03-27 0.463137
2022-03-28 NaN
2022-03-29 0.291441
Freq: D, Length: 88, dtype: float64
######################################
2022-01-01 0.034804
2022-01-02 NaN
2022-01-03 0.751800
2022-01-04 0.751800
2022-01-05 NaN
2022-01-06 -0.331184
2022-01-07 -0.331184
2022-01-08 NaN
2022-01-09 0.411448
2022-01-10 0.411448
2022-01-11 NaN
2022-01-12 -0.207127
2022-01-13 -0.207127
2022-01-14 NaN
2022-01-15 -0.024277
2022-01-16 -0.024277
2022-01-17 NaN
2022-01-18 -0.071963
2022-01-19 -0.071963
2022-01-20 NaN
2022-01-21 0.016815
2022-01-22 0.016815
2022-01-23 NaN
2022-01-24 -0.598206
2022-01-25 -0.598206
2022-01-26 NaN
2022-01-27 0.010317
2022-01-28 0.010317
2022-01-29 NaN
2022-01-30 -0.340948
...
2022-02-28 NaN
2022-03-01 -0.284138
2022-03-02 -0.284138
2022-03-03 NaN
2022-03-04 0.417543
2022-03-05 0.417543
2022-03-06 NaN
2022-03-07 -0.176598
2022-03-08 -0.176598
2022-03-09 NaN
2022-03-10 0.703380
2022-03-11 0.703380
2022-03-12 NaN
2022-03-13 0.541354
2022-03-14 0.541354
2022-03-15 NaN
2022-03-16 -0.080535
2022-03-17 -0.080535
2022-03-18 NaN
2022-03-19 -0.721775
2022-03-20 -0.721775
2022-03-21 NaN
2022-03-22 0.991998
2022-03-23 0.991998
2022-03-24 NaN
2022-03-25 0.463137
2022-03-26 0.463137
2022-03-27 NaN
2022-03-28 0.291441
2022-03-29 0.291441
Freq: D, Length: 88, dtype: float64
######################################
2022-01-01 0.034804
2022-01-02 0.273803
2022-01-03 0.512801
2022-01-04 0.751800
2022-01-05 0.390805
2022-01-06 0.029811
2022-01-07 -0.331184
2022-01-08 -0.083640
2022-01-09 0.163904
2022-01-10 0.411448
2022-01-11 0.205256
2022-01-12 -0.000935
2022-01-13 -0.207127
2022-01-14 -0.146177
2022-01-15 -0.085227
2022-01-16 -0.024277
2022-01-17 -0.040172
2022-01-18 -0.056068
2022-01-19 -0.071963
2022-01-20 -0.042371
2022-01-21 -0.012778
2022-01-22 0.016815
2022-01-23 -0.188192
2022-01-24 -0.393199
2022-01-25 -0.598206
2022-01-26 -0.395365
2022-01-27 -0.192524
2022-01-28 0.010317
2022-01-29 -0.106771
2022-01-30 -0.223859
...
2022-02-28 -0.070721
2022-03-01 -0.177429
2022-03-02 -0.284138
2022-03-03 -0.050244
2022-03-04 0.183649
2022-03-05 0.417543
2022-03-06 0.219496
2022-03-07 0.021449
2022-03-08 -0.176598
2022-03-09 0.116728
2022-03-10 0.410054
2022-03-11 0.703380
2022-03-12 0.649371
2022-03-13 0.595363
2022-03-14 0.541354
2022-03-15 0.334058
2022-03-16 0.126762
2022-03-17 -0.080535
2022-03-18 -0.294281
2022-03-19 -0.508028
2022-03-20 -0.721775
2022-03-21 -0.150518
2022-03-22 0.420740
2022-03-23 0.991998
2022-03-24 0.815711
2022-03-25 0.639424
2022-03-26 0.463137
2022-03-27 0.405905
2022-03-28 0.348673
2022-03-29 0.291441
Freq: D, Length: 88, dtype: float64
######################################
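The three fill strategies are easier to compare on two known points than on random data. The sketch below uses hypothetical values 0.0 and 3.0 placed three days apart:

```python
import pandas as pd

idx = pd.date_range("2022-01-01", periods=2, freq="3D")
coarse = pd.Series([0.0, 3.0], index=idx)

# Upsample to daily frequency: 2022-01-02 and 2022-01-03 have no data.
daily = coarse.resample("D")

forward = daily.ffill(limit=1)                # 0.0, 0.0, NaN, 3.0
backward = daily.bfill(limit=1)               # 0.0, NaN, 3.0, 3.0
linear = daily.interpolate(method="linear")   # 0.0, 1.0, 2.0, 3.0

print(forward.tolist())
print(backward.tolist())
print(linear.tolist())
```

With limit=1 one of the two missing days stays NaN, which matches the pattern in the output above; dropping the limit would fill both.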

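6. Moving window functions

Code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Generate a daily time series of 600 random values
df = pd.Series(np.random.randn(600), index = pd.date_range('7/1/2022', freq = 'D', periods = 600))
# Create a rolling window over 10 observations
r = df.rolling(window = 10)
# Print the mean of the most recent 10 values at each point
print(r.mean())
# Plot the raw series (red) and its rolling mean (blue)
plt.figure(figsize=(15, 5))
df.plot(style='r')
df.rolling(window=10).mean().plot(style='b')
plt.show()

Test output:

(The plot from the original page is not preserved. The code draws the raw series in red with its 10-value rolling mean overlaid in blue.)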
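To see what the rolling mean computes, here is a small sketch on five known values with a window of 3 (hypothetical data, not from the original article). The min_periods argument is an addition for illustration: by default the window must be full, so the first window - 1 results are NaN.

```python
import pandas as pd

idx = pd.date_range("2022-07-01", periods=5, freq="D")
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)

# Default: a result appears only once 3 values are available.
strict = s.rolling(window=3).mean()

# min_periods=1 averages whatever is available at the start of the series.
relaxed = s.rolling(window=3, min_periods=1).mean()

print(strict.tolist())    # first two entries are NaN, then 2.0, 3.0, 4.0
print(relaxed.tolist())   # 1.0, 1.5, 2.0, 3.0, 4.0
```

By the same rule, the 10-value window used in the article leaves the first 9 entries of the rolling mean as NaN.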

References:

  1. https://study.163.com/course/introduction.htm?courseId=1003590004#/courseDetail?tab=1
