Time Series Analysis and Demand Forecasting
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
from sklearn.linear_model import LinearRegression
from sklearn import metrics
import pylab as pl
import math
For the exercises in this lab, we will use the oil price dataset from Kaggle. We assume that you are using Google Colab, so upload your data files to the sample_data directory.
Upload and explore the data
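As a starting point, the sketch below reads a CSV of oil prices into a date-indexed Series and fills missing values. The file name (sample_data/oil.csv), the column names (date, dcoilwtico) and the interpolation strategy are assumptions for illustration, not something this lab specifies; adjust them to match the file you uploaded. A small in-memory sample stands in for the real file so that the cell runs on its own.

```python
import io
import pandas as pd

# Hypothetical stand-in for the uploaded file. In Colab you would instead pass
# a path such as "sample_data/oil.csv" (file and column names are assumptions).
SAMPLE_CSV = io.StringIO(
    "date,dcoilwtico\n"
    "2013-01-01,\n"
    "2013-01-02,93.14\n"
    "2013-01-03,92.97\n"
    "2013-01-04,93.12\n"
    "2013-01-07,\n"
    "2013-01-08,93.21\n"
)

def load_oil(source, date_col="date", price_col="dcoilwtico"):
    """Read oil prices into a date-indexed Series and fill missing values."""
    df = pd.read_csv(source, parse_dates=[date_col], index_col=date_col)
    prices = df[price_col].sort_index()
    # Time-weighted linear interpolation fills interior gaps;
    # bfill() covers a missing value at the very start of the series.
    return prices.interpolate(method="time").bfill()

oil = load_oil(SAMPLE_CSV)
print(oil.head())
print(oil.describe())
```

Other ways of handling the gaps (forward fill, dropping rows) are equally defensible and will change later results slightly.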
Depicting the trend of the Oil Price
A trend exists when there is a long-term increase or decrease in the data. It does not have to be linear. A trend is sometimes described as “changing direction” when it goes from an increasing trend to a decreasing trend, or vice versa.
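One simple way to make a trend visible is to smooth the series with a centred rolling mean. The sketch below uses a synthetic series (an assumption, standing in for the oil prices) whose trend rises and then falls; the 91-day window and the 30-day slope lag are arbitrary illustrative choices.

```python
import numpy as np
import pandas as pd

# Synthetic daily series standing in for the oil prices (illustration only).
rng = np.random.default_rng(0)
dates = pd.date_range("2015-01-01", periods=730, freq="D")
t = np.arange(len(dates))
# Rises during the first year, then falls: a trend that changes direction.
level = np.where(t < 365, 50 + 0.05 * t, 50 + 0.05 * 365 - 0.04 * (t - 365))
prices = pd.Series(level + rng.normal(0, 1.5, len(dates)), index=dates)

# A centred 91-day rolling mean smooths out short-term noise.
trend = prices.rolling(window=91, center=True, min_periods=1).mean()

# Sign of the 30-day change in the smoothed series:
# +1 while the trend is rising, -1 while it is falling.
direction = np.sign(trend.diff(30).dropna())
print(direction.value_counts())
```

Plotting prices and trend on the same axes (for example with prices.plot() followed by trend.plot()) shows the smoothed line against the raw data.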
Seasonal Component
A seasonal pattern occurs when a time series is affected by seasonal factors such as the time of the year or the day of the week. Seasonality is always of a fixed and known frequency.
Using the oil data and groupby, find the seasonal index per month. You can start by computing the average oil price for each month in each year, and then average those monthly averages across the years to get the seasonal index per month. Your output should look like:
Note: the values of the seasonal index may differ slightly depending on the technique used to handle missing values.
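The groupby steps described above can be sketched as follows. The data is synthetic (an assumption, so the cell runs without the uploaded file). The text does not say whether the index should be the raw average of averages or a ratio to the overall mean, so the sketch computes both; the normalised version is a common convention rather than something this lab confirms.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices with a yearly cycle (illustration only).
rng = np.random.default_rng(1)
dates = pd.date_range("2013-01-01", "2016-12-31", freq="D")
cycle = 5 * np.sin(2 * np.pi * (dates.month.to_numpy() - 1) / 12)
prices = pd.Series(60 + cycle + rng.normal(0, 1, len(dates)), index=dates)

# Step 1: average price for each (year, month) pair.
by_year_month = prices.groupby([prices.index.year, prices.index.month]).mean()
by_year_month.index.names = ["year", "month"]

# Step 2: average those values across years, giving one value per calendar month.
monthly_avg = by_year_month.groupby(level="month").mean()

# Optional step 3 (assumed convention): express each month relative to the
# mean of the twelve monthly averages, so values above 1 mark high months.
seasonal_index = monthly_avg / monthly_avg.mean()

print(pd.DataFrame({"monthly_avg": monthly_avg, "seasonal_index": seasonal_index}).round(3))
```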
Time series decomposition
There are three types of time series patterns: trend, seasonality and cycles. When decomposing a time series into components, we usually combine the trend and cycle into a single trend-cycle component (sometimes called the trend for simplicity). Thus, we think of a time series as comprising three components: a trend-cycle (or trend) component, a seasonal component, and a remainder component (containing anything else in the time series).
During the lecture, you learned about different techniques (‘Additive’, ‘Multiplicative’ and ‘STL’) for decomposing time series. Now, you need to apply them to the oil dataset and comment on your findings.
You may look at this website for more information.
Demand Forecasting
We will again use the oil dataset. We will remove the values of the last two weeks from the data and use them to evaluate the different forecasting models. We will test forecasting using moving average, linear regression, ARIMA and exponential smoothing from the statsmodels package.
We will start by reading the data and splitting it into training and test sets (the test set will contain the readings of the last two weeks in the time series).