Lab 6 - Time Series Methods
SOLUTIONS
Introduction
In today’s lab, you’ll practice building workflows with recipes, parsnip models, rsample cross validations, and model comparison in the context of time series data.
Packages
The Data
Today we will be using electricity demand data, based on a paper by James W Taylor:
Taylor, J.W. (2003) Short-term electricity demand forecasting using double seasonal exponential smoothing. Journal of the Operational Research Society, 54, 799-805.
The data can be found in the timetk package as timetk::taylor_30_min, a tibble with dimensions 4,032 x 2:
- date: A date-time variable in 30-minute increments
- value: Electricity demand in megawatts
Exercise 1: EDA
Plot the data using the functions timetk::plot_time_series, timetk::plot_acf_diagnostics (using 100 lags), and timetk::plot_seasonal_diagnostics.
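One possible solution, as a minimal sketch. The object names (data, p_series, p_acf, p_seasonal) are assumptions made here, and .interactive = FALSE is chosen so each call returns a static ggplot rather than a plotly widget.

```r
library(timetk)

# Assumption: the raw 30-minute data is stored under the name `data`
data <- timetk::taylor_30_min

# Demand over time
p_series <- timetk::plot_time_series(data, date, value, .interactive = FALSE)

# ACF and PACF for the first 100 lags
p_acf <- timetk::plot_acf_diagnostics(
  data, date, value,
  .lags = 100,
  .interactive = FALSE
)

# Distribution of demand by calendar feature (hour, weekday, week, ...)
p_seasonal <- timetk::plot_seasonal_diagnostics(
  data, date, value,
  .interactive = FALSE
)

print(p_series)
print(p_acf)
print(p_seasonal)
```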
Exercise 2: Time scaling
The raw data has 30-minute intervals between data points. Downscale the data to 60-minute intervals using timetk::summarise_by_time, revising the electricity demand (value) variable by adding the two 30-minute values in each 60-minute interval. Assign the downscaled data to the variable taylor_60_min.
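A minimal sketch of the aggregation. Grouping by "hour" floors each timestamp to the start of its hour, so the two half-hour readings in that hour are summed into one row.

```r
library(timetk)

taylor_60_min <- timetk::taylor_30_min |>
  timetk::summarise_by_time(
    .date_var = date,
    .by       = "hour",
    value     = sum(value)  # add the two 30-minute readings in each hour
  )

# Quick check: half as many rows, same total demand
nrow(taylor_60_min)
sum(taylor_60_min$value)
```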
Exercise 3: Training and test datasets
- Split the new (60 min) time series into training and test sets using timetk::time_series_split
  - set the training period (‘initial’) to ‘2 months’ and the assessment period to ‘1 week’
- Prepare the data resample specification with timetk::tk_time_series_cv_plan() and plot it with timetk::plot_time_series_cv_plan
- Separate the training and test data sets using rsample.
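The three steps above can be sketched as follows. The names splits, cv_plan_plot, train and test are assumptions made for this sketch, and cumulative = FALSE is spelled out so that the training window is the two months before the assessment week rather than the whole history.

```r
library(timetk)
library(rsample)

# Hourly data from Exercise 2, rebuilt here so the block runs on its own
taylor_60_min <- timetk::taylor_30_min |>
  timetk::summarise_by_time(.date_var = date, .by = "hour", value = sum(value))

# Single split: 2 months of training data, then 1 week held out for testing
splits <- timetk::time_series_split(
  taylor_60_min,
  date_var   = date,
  initial    = "2 months",
  assess     = "1 week",
  cumulative = FALSE
)

# Resample plan as a long table, then plotted
cv_plan_plot <- splits |>
  timetk::tk_time_series_cv_plan() |>
  timetk::plot_time_series_cv_plan(date, value, .interactive = FALSE)
print(cv_plan_plot)

# Extract the two data sets
train <- rsample::training(splits)
test  <- rsample::testing(splits)
```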
Exercise 4: recipes
- Create a base recipe (base_rec) using the formula value ~ date and the training data. This will be used for non-regression models.
- Create a recipe (lm_rec) using the formula value ~ . and the training data. This will be used for regression models. For this recipe:
  - add time series signature features using timetk::step_timeseries_signature with the appropriate argument,
  - add a step to select the columns value, date_index.num, date_month.lbl, date_wday.lbl, date_hour,
  - add a normalization step targeting date_index.num,
  - add a step to mutate date_hour, changing it to a factor,
  - add a step to one-hot encode nominal predictors.
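A sketch of the two recipes, assuming the split from Exercise 3 (rebuilt here so the block runs on its own). Two choices are assumptions of this sketch rather than requirements of the exercise: date_hour is converted with explicit levels 0:23 so the factor has the same levels on any new data, and recipes::all_nominal_predictors() is used as the selector for the one-hot step. Newer recipes releases may warn that step_select() is deprecated.

```r
library(timetk)
library(rsample)
library(recipes)

# Setup from the earlier exercises
taylor_60_min <- timetk::taylor_30_min |>
  timetk::summarise_by_time(.date_var = date, .by = "hour", value = sum(value))
splits <- timetk::time_series_split(
  taylor_60_min, date_var = date,
  initial = "2 months", assess = "1 week", cumulative = FALSE
)
train <- rsample::training(splits)

# Base recipe for the non-regression (ets, arima) models
base_rec <- recipes::recipe(value ~ date, data = train)

# Recipe for the regression model
lm_rec <- recipes::recipe(value ~ ., data = train) |>
  # calendar features derived from the date column
  timetk::step_timeseries_signature(date) |>
  # keep only the outcome and four of the signature features
  recipes::step_select(
    value, date_index.num, date_month.lbl, date_wday.lbl, date_hour
  ) |>
  # centre and scale the numeric time index
  recipes::step_normalize(date_index.num) |>
  # treat hour of day as a category, not a number
  recipes::step_mutate(date_hour = factor(date_hour, levels = 0:23)) |>
  # indicator columns for the nominal predictors
  recipes::step_dummy(recipes::all_nominal_predictors(), one_hot = TRUE)

# Inspect the processed training data
baked <- lm_rec |>
  recipes::prep() |>
  recipes::bake(new_data = NULL)
names(baked)
```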
Exercise 5: models
Now we will create several models to estimate electricity demand, as follows:
- Create a model specification for an exponential smoothing model using engine ‘ets’
- Create a model specification for an arima model using engine ‘auto_arima’
- Create a model specification for a linear model using engine ‘glmnet’ and penalty = 0.02, mixture = 0.5
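The three specifications can be sketched as below. The names ets_spec, arima_spec and lm_spec are assumptions of this sketch. Loading modeltime is what makes the ‘ets’ and ‘auto_arima’ engines available to parsnip; fitting later also needs the forecast and glmnet packages installed.

```r
library(parsnip)
library(modeltime)

# Exponential smoothing, fitted through the 'ets' engine
ets_spec <- modeltime::exp_smoothing() |>
  parsnip::set_engine("ets")

# ARIMA with automatic order selection
arima_spec <- modeltime::arima_reg() |>
  parsnip::set_engine("auto_arima")

# Elastic-net linear regression with fixed penalty and mixture
lm_spec <- parsnip::linear_reg(penalty = 0.02, mixture = 0.5) |>
  parsnip::set_engine("glmnet")
```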
Exercise 6: model fitting
Create a workflow for each model using workflows::workflow.
- Add a recipe to the workflow
  - the linear model uses the lm_rec recipe created above
  - the ets and arima models use the base_rec recipe created above
- Add a model to each workflow
- Fit with the training data
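A sketch of the three fitted workflows. The data, recipes and model specifications from Exercises 2 to 5 are rebuilt in compact form so the block runs on its own; the names wf_ets, wf_arima and wf_lm are assumptions of this sketch. The auto_arima fit on hourly data can take a little while.

```r
library(timetk)
library(rsample)
library(recipes)
library(parsnip)
library(workflows)
library(modeltime)

# --- Setup from the earlier exercises ---
taylor_60_min <- timetk::taylor_30_min |>
  timetk::summarise_by_time(.date_var = date, .by = "hour", value = sum(value))
splits <- timetk::time_series_split(
  taylor_60_min, date_var = date,
  initial = "2 months", assess = "1 week", cumulative = FALSE
)
train <- rsample::training(splits)

base_rec <- recipes::recipe(value ~ date, data = train)
lm_rec <- recipes::recipe(value ~ ., data = train) |>
  timetk::step_timeseries_signature(date) |>
  recipes::step_select(
    value, date_index.num, date_month.lbl, date_wday.lbl, date_hour
  ) |>
  recipes::step_normalize(date_index.num) |>
  recipes::step_mutate(date_hour = factor(date_hour, levels = 0:23)) |>
  recipes::step_dummy(recipes::all_nominal_predictors(), one_hot = TRUE)

ets_spec   <- modeltime::exp_smoothing() |> parsnip::set_engine("ets")
arima_spec <- modeltime::arima_reg() |> parsnip::set_engine("auto_arima")
lm_spec    <- parsnip::linear_reg(penalty = 0.02, mixture = 0.5) |>
  parsnip::set_engine("glmnet")

# --- Exercise 6: recipe + model + fit, once per model ---
wf_ets <- workflows::workflow() |>
  workflows::add_recipe(base_rec) |>
  workflows::add_model(ets_spec) |>
  fit(data = train)

wf_arima <- workflows::workflow() |>
  workflows::add_recipe(base_rec) |>
  workflows::add_model(arima_spec) |>
  fit(data = train)

wf_lm <- workflows::workflow() |>
  workflows::add_recipe(lm_rec) |>
  workflows::add_model(lm_spec) |>
  fit(data = train)
```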
Exercise 7: calibrate
In this exercise we’ll use the testing data with our fitted models.
- Create a table with the fitted workflows using modeltime::modeltime_table
- Using the table you just created, run a calibration on the test data with the function modeltime::modeltime_calibrate.
- Compare the accuracy of the models using modeltime::modeltime_accuracy() on the results of the calibration
Which is the best model by the rmse metric?
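A sketch of the calibration and accuracy comparison. Everything from the earlier exercises is rebuilt so the block runs on its own; fit_wf is a hypothetical helper introduced only to shorten that setup, and the names model_tbl, calibration_tbl, accuracy_tbl and best are likewise assumptions. The model with the lowest rmse is picked from the table rather than stated here, since the result depends on the fitted models.

```r
library(timetk)
library(rsample)
library(recipes)
library(parsnip)
library(workflows)
library(modeltime)

# --- Setup from the earlier exercises ---
taylor_60_min <- timetk::taylor_30_min |>
  timetk::summarise_by_time(.date_var = date, .by = "hour", value = sum(value))
splits <- timetk::time_series_split(
  taylor_60_min, date_var = date,
  initial = "2 months", assess = "1 week", cumulative = FALSE
)
train <- rsample::training(splits)
test  <- rsample::testing(splits)

base_rec <- recipes::recipe(value ~ date, data = train)
lm_rec <- recipes::recipe(value ~ ., data = train) |>
  timetk::step_timeseries_signature(date) |>
  recipes::step_select(
    value, date_index.num, date_month.lbl, date_wday.lbl, date_hour
  ) |>
  recipes::step_normalize(date_index.num) |>
  recipes::step_mutate(date_hour = factor(date_hour, levels = 0:23)) |>
  recipes::step_dummy(recipes::all_nominal_predictors(), one_hot = TRUE)

# Hypothetical helper: build and fit one workflow on the training data
fit_wf <- function(rec, spec) {
  workflows::workflow() |>
    workflows::add_recipe(rec) |>
    workflows::add_model(spec) |>
    fit(data = train)
}
wf_ets   <- fit_wf(base_rec, modeltime::exp_smoothing() |> parsnip::set_engine("ets"))
wf_arima <- fit_wf(base_rec, modeltime::arima_reg() |> parsnip::set_engine("auto_arima"))
wf_lm    <- fit_wf(
  lm_rec,
  parsnip::linear_reg(penalty = 0.02, mixture = 0.5) |> parsnip::set_engine("glmnet")
)

# --- Exercise 7 ---
# Collect the fitted workflows in one table
model_tbl <- modeltime::modeltime_table(wf_ets, wf_arima, wf_lm)

# Predict on the held-out week and store the residuals
calibration_tbl <- model_tbl |>
  modeltime::modeltime_calibrate(new_data = test)

# One row of accuracy metrics per model
accuracy_tbl <- calibration_tbl |>
  modeltime::modeltime_accuracy()
print(accuracy_tbl)

# Row with the smallest rmse
best <- accuracy_tbl[which.min(accuracy_tbl$rmse), ]
print(best$.model_desc)
```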
Exercise 8: forecast - test data
Use the calibration table with modeltime::modeltime_forecast to graphically compare the fits to the testing data with the observed values.
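A sketch of the test-period forecast plot. The setup is rebuilt so the block runs on its own; fit_wf is a hypothetical helper and the other object names are assumptions of this sketch. Passing actual_data adds the observed series to the output so predictions and observations appear on the same plot.

```r
library(timetk)
library(rsample)
library(recipes)
library(parsnip)
library(workflows)
library(modeltime)

# --- Setup from the earlier exercises ---
taylor_60_min <- timetk::taylor_30_min |>
  timetk::summarise_by_time(.date_var = date, .by = "hour", value = sum(value))
splits <- timetk::time_series_split(
  taylor_60_min, date_var = date,
  initial = "2 months", assess = "1 week", cumulative = FALSE
)
train <- rsample::training(splits)
test  <- rsample::testing(splits)

base_rec <- recipes::recipe(value ~ date, data = train)
lm_rec <- recipes::recipe(value ~ ., data = train) |>
  timetk::step_timeseries_signature(date) |>
  recipes::step_select(
    value, date_index.num, date_month.lbl, date_wday.lbl, date_hour
  ) |>
  recipes::step_normalize(date_index.num) |>
  recipes::step_mutate(date_hour = factor(date_hour, levels = 0:23)) |>
  recipes::step_dummy(recipes::all_nominal_predictors(), one_hot = TRUE)

# Hypothetical helper: build and fit one workflow on the training data
fit_wf <- function(rec, spec) {
  workflows::workflow() |>
    workflows::add_recipe(rec) |>
    workflows::add_model(spec) |>
    fit(data = train)
}
calibration_tbl <- modeltime::modeltime_table(
  fit_wf(base_rec, modeltime::exp_smoothing() |> parsnip::set_engine("ets")),
  fit_wf(base_rec, modeltime::arima_reg() |> parsnip::set_engine("auto_arima")),
  fit_wf(
    lm_rec,
    parsnip::linear_reg(penalty = 0.02, mixture = 0.5) |> parsnip::set_engine("glmnet")
  )
) |>
  modeltime::modeltime_calibrate(new_data = test)

# --- Exercise 8 ---
# Predictions for the test week, stacked with the observed series
test_forecast_tbl <- calibration_tbl |>
  modeltime::modeltime_forecast(
    new_data    = test,
    actual_data = taylor_60_min
  )

p_test <- test_forecast_tbl |>
  modeltime::plot_modeltime_forecast(.interactive = FALSE)
print(p_test)
```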
Exercise 9: forecast - future
Now refit the models using the full data set (using the calibration table and modeltime::modeltime_refit). Save the result in the variable refit_tbl.
- Use the refit data in the variable refit_tbl, along with modeltime::modeltime_forecast and argument h = ‘2 weeks’ (remember to also set the actual_data argument). This will use the models to forecast electricity demand two weeks into the future.
- Plot the forecast with modeltime::plot_modeltime_forecast.
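A sketch of the refit and the two-week forecast. The setup is rebuilt so the block runs on its own; fit_wf is a hypothetical helper, and future_forecast_tbl and p_future are names assumed for this sketch. With h set, the future timestamps are generated from the dates in actual_data, which is why that argument is needed.

```r
library(timetk)
library(rsample)
library(recipes)
library(parsnip)
library(workflows)
library(modeltime)

# --- Setup from the earlier exercises ---
taylor_60_min <- timetk::taylor_30_min |>
  timetk::summarise_by_time(.date_var = date, .by = "hour", value = sum(value))
splits <- timetk::time_series_split(
  taylor_60_min, date_var = date,
  initial = "2 months", assess = "1 week", cumulative = FALSE
)
train <- rsample::training(splits)
test  <- rsample::testing(splits)

base_rec <- recipes::recipe(value ~ date, data = train)
lm_rec <- recipes::recipe(value ~ ., data = train) |>
  timetk::step_timeseries_signature(date) |>
  recipes::step_select(
    value, date_index.num, date_month.lbl, date_wday.lbl, date_hour
  ) |>
  recipes::step_normalize(date_index.num) |>
  recipes::step_mutate(date_hour = factor(date_hour, levels = 0:23)) |>
  recipes::step_dummy(recipes::all_nominal_predictors(), one_hot = TRUE)

# Hypothetical helper: build and fit one workflow on the training data
fit_wf <- function(rec, spec) {
  workflows::workflow() |>
    workflows::add_recipe(rec) |>
    workflows::add_model(spec) |>
    fit(data = train)
}
calibration_tbl <- modeltime::modeltime_table(
  fit_wf(base_rec, modeltime::exp_smoothing() |> parsnip::set_engine("ets")),
  fit_wf(base_rec, modeltime::arima_reg() |> parsnip::set_engine("auto_arima")),
  fit_wf(
    lm_rec,
    parsnip::linear_reg(penalty = 0.02, mixture = 0.5) |> parsnip::set_engine("glmnet")
  )
) |>
  modeltime::modeltime_calibrate(new_data = test)

# --- Exercise 9 ---
# Refit every model on the full hourly series
refit_tbl <- calibration_tbl |>
  modeltime::modeltime_refit(data = taylor_60_min)

# Forecast two weeks past the end of the data
future_forecast_tbl <- refit_tbl |>
  modeltime::modeltime_forecast(
    h           = "2 weeks",
    actual_data = taylor_60_min
  )

p_future <- future_forecast_tbl |>
  modeltime::plot_modeltime_forecast(.interactive = FALSE)
print(p_future)
```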
Grading
Total points available: 30 points.
Component | Points |
---|---|
Ex 1 - 9 | 30 |