Method of Automatically Decomposing Time Series Data with Python

Intro

Time series decomposition splits time series data into individual components, each capturing one category of pattern: trend, seasonality, and noise. In this tutorial, I will show you how to use Python to decompose time series data automatically.

Time Series Decomposition

To start, let’s discuss the parts that make up a time series:

  • Noise: The variability in the data that the model cannot explain. This component is what is left after the trend and seasonality have been separated out of the time series.
  • Trend: Whether the time series increases, decreases, or stays constant over time.
  • Seasonality: The periodic signal in the data, a pattern that repeats at a fixed interval.
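To make the three components concrete, here is a minimal sketch that builds a synthetic monthly series from a trend, a seasonal signal, and noise. Every number in it (the slope, the amplitude, the noise scale, the start date) is made up for illustration and has nothing to do with the Air Passengers data.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly index covering four years (48 observations).
index = pd.date_range("2000-01-01", periods=48, freq="MS")
t = np.arange(48)

# Illustrative components; shapes and magnitudes are assumptions.
trend = 100 + 2.0 * t                            # steady upward drift
seasonal = 10 * np.sin(2 * np.pi * t / 12)       # repeats every 12 months
rng = np.random.default_rng(seed=0)
noise = rng.normal(loc=0.0, scale=3.0, size=48)  # unexplained variation

# Additive series: the components are summed.
additive = pd.Series(trend + seasonal + noise, index=index)

# Multiplicative series: seasonality and noise scale with the trend level.
multiplicative = pd.Series(
    trend * (1 + seasonal / 100) * (1 + noise / 100), index=index
)
```

Decomposition runs this construction in reverse: given only the final series, it tries to recover the three parts.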

For the example, I will use the Air Passengers dataset from Kaggle.

Python code:

import pandas as pd
import numpy as np
from statsmodels.tsa.seasonal import seasonal_decompose

# https://www.kaggle.com/rakannimer/air-passengers
df = pd.read_csv('AirPassengers.csv')

df.head()

The first step is to set the Month column as the index and convert it to datetime values.

Python code

df.set_index('Month', inplace=True)
df.index = pd.to_datetime(df.index)
# drop null values
df.dropna(inplace=True)
df.plot()

The Decomposition

We will use seasonal_decompose, a function from the statsmodels library that performs a seasonal decomposition.

result=seasonal_decompose(df['#Passengers'], model='multiplicative', period=12)

seasonal_decompose needs to know which model to use: additive or multiplicative. As a rule of thumb, check the plot to see whether the seasonal swings stay roughly the same size over time and the trend looks roughly linear. If so, choose an additive model. If instead the seasonal swings grow or shrink along with the level of the series, use a multiplicative model. In the passenger data the seasonal swings get larger as the number of passengers rises, which is why the call above uses model='multiplicative'.
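If reading the plot by eye feels too subjective, a rough numeric check can back it up. The sketch below is my own heuristic, not part of statsmodels: the helper name swing_growth is hypothetical, and the two test series are synthetic. It compares the range (max minus min) of the last full cycle with that of the first.

```python
import numpy as np

def swing_growth(values, period=12):
    """Ratio of the last full cycle's range (max - min) to the first cycle's."""
    values = np.asarray(values, dtype=float)
    n_cycles = len(values) // period
    # One row per complete cycle; leftover observations are ignored.
    cycles = values[: n_cycles * period].reshape(n_cycles, period)
    swing = cycles.max(axis=1) - cycles.min(axis=1)
    return swing[-1] / swing[0]

# Synthetic examples (made-up numbers): ten years of monthly data.
t = np.arange(120)
trend = 100 + 2.0 * t
season = np.sin(2 * np.pi * t / 12)

additive_series = trend + 10 * season                # constant seasonal swing
multiplicative_series = trend * (1 + 0.2 * season)   # swing grows with the level

additive_ratio = swing_growth(additive_series)
multiplicative_ratio = swing_growth(multiplicative_series)
print(additive_ratio, multiplicative_ratio)
```

A ratio close to 1 points toward an additive model, while a ratio well above or below 1 points toward a multiplicative one. On real, noisy data this is only a rough aid and does not replace looking at the plot.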

The data is aggregated by month, and the seasonal pattern we want to capture repeats every year, so the period is set to 12.
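To see why the period matters, it helps to look at how a classical moving-average decomposition estimates the trend: it averages over a window exactly one period long, so the seasonal ups and downs cancel out. The following is a sketch of that general idea under my own assumptions (the function name centered_moving_average and the synthetic series are hypothetical), not a copy of the statsmodels source.

```python
import numpy as np

def centered_moving_average(values, period=12):
    """Estimate the trend with a centered moving average over one full period."""
    values = np.asarray(values, dtype=float)
    if period % 2 == 0:
        # Even period: half weight on both end points keeps the window centered.
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(len(values), np.nan)
    # 'valid' mode yields only the positions where the whole window fits,
    # so the first and last `half` entries stay NaN.
    trend[half : len(values) - half] = np.convolve(values, weights, mode="valid")
    return trend

# Synthetic monthly series (made-up numbers): linear trend plus a yearly cycle.
t = np.arange(48)
true_trend = 100 + 2.0 * t
series = true_trend + 10 * np.sin(2 * np.pi * t / 12)

estimated_trend = centered_moving_average(series, period=12)
```

With the period set to 12, the yearly cycle averages out and the estimate follows the underlying line. With a wrong period, part of the seasonal pattern would leak into the trend.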

We obtain the components with the following:

Python code

result.seasonal.plot()

Python code

result.trend.plot()

Alternatively, we can plot all of the components at once.

Python code

result.plot()

Conclusion

When examining time series data, it can be hard to pull out the trend or determine the seasonality by hand. With Python, you can decompose a time series quickly using this automatic method. It also gives a clearer view of the components: the trend is much easier to see once the seasonality has been removed from the data, and vice versa.
