If you are interested in forecasting time series data, you may have heard of ARIMA and SARIMA models. These are two common methods of statistical modeling that can capture the patterns and trends in your data. But what are the differences between them, and when should you use one over the other? In this article, we will explain the basics of ARIMA and SARIMA models, and discuss their advantages and disadvantages for different scenarios.

ARIMA stands for AutoRegressive Integrated Moving Average. It is a model that describes the current value of a time series in terms of its own past values and past forecast errors. It has three parameters: p, d, and q. The p parameter is the number of autoregressive terms, that is, lagged values of the series. The d parameter is the degree of differencing, meaning how many times the series is differenced to make it stationary. The q parameter is the number of moving average terms, that is, lagged forecast errors. For example, an ARIMA(1,1,1) model differences the series once, then models each differenced value as a function of the previous differenced value, the previous error, and a new random error.
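To make the roles of p, d, and q concrete, here is a minimal sketch that simulates an ARIMA(1,1,1) series using only the Python standard library. The function names and the coefficient values `phi` and `theta` are illustrative assumptions, not taken from any particular library.

```python
import random


def simulate_arima_111(n, phi=0.6, theta=0.3, sigma=1.0, seed=0):
    """Simulate n points of an ARIMA(1,1,1) series (illustrative coefficients)."""
    rng = random.Random(seed)
    y = [0.0]          # level of the series
    prev_diff = 0.0    # previous first difference
    prev_err = 0.0     # previous random error
    for _ in range(n - 1):
        err = rng.gauss(0.0, sigma)
        # p = 1 and q = 1: the difference depends on the previous
        # difference and the previous error, plus a new error.
        diff = phi * prev_diff + err + theta * prev_err
        # d = 1: the level is the running sum of the differences.
        y.append(y[-1] + diff)
        prev_diff, prev_err = diff, err
    return y


def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every valid t."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


series = simulate_arima_111(200)
diffs = difference(series)  # the part the AR and MA terms actually describe
```

Differencing the simulated series once recovers the stationary part that the autoregressive and moving average terms describe, which is what the d parameter stands for.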

SARIMA stands for Seasonal AutoRegressive Integrated Moving Average. It is an extension of ARIMA that can account for seasonal patterns in the data. It has four additional parameters: P, D, Q, and s. The P, D, and Q parameters play the same roles as p, d, and q, but they act at seasonal lags, that is, at multiples of s. The s parameter is the length of the seasonal cycle, such as 12 for monthly data with a yearly pattern. For example, a SARIMA(1,1,1)(1,1,1)12 model applies one regular difference and one seasonal difference at lag 12, then models the result with one autoregressive and one moving average term at lag 1, plus one seasonal autoregressive and one seasonal moving average term at lag 12.
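The effect of the seasonal difference (D) and the regular difference (d) can be seen on a toy series. The sketch below uses an invented monthly pattern on top of a linear trend, so the numbers are assumptions chosen only to make the arithmetic easy to follow.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every valid t."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Invented example: a fixed 12-month pattern on top of a linear trend.
monthly_pattern = [5, 3, 0, -2, -4, -5, -3, 0, 2, 4, 6, 10]
series = [2 * t + monthly_pattern[t % 12] for t in range(48)]

# D = 1 with s = 12: subtract the value from the same month one year earlier.
# The repeating pattern cancels, leaving the yearly growth of the trend.
seasonal_diff = difference(series, lag=12)

# d = 1: subtract the previous value, which removes what is left of the trend.
fully_differenced = difference(seasonal_diff, lag=1)
```

Because this toy series has no noise, the seasonal difference is constant and the second difference is all zeros. On real data the remainder is what the autoregressive and moving average terms are fitted to.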

SARIMA models have some advantages over ARIMA models when the data exhibits strong seasonal patterns. For example, if you are forecasting monthly sales data, you may expect higher sales in certain months due to holidays or seasonal demand. SARIMA models can capture this effect and adjust the forecasts accordingly. Note that a standard SARIMA model handles a single seasonal period s. Data with several overlapping cycles, such as electricity consumption with daily, weekly, and yearly patterns, usually calls for an extension or a different method, for example Fourier terms added as regressors or a TBATS model.

SARIMA models also have some disadvantages compared to ARIMA models. One of them is that they require more parameters to estimate, which can increase the complexity and computational cost of the model. Another is that they may not perform well when the seasonal pattern itself shifts, for instance through changes in consumer behavior or market conditions. SARIMA models assume that the seasonal pattern is stable and consistent over time, which may not be realistic for some data. Finally, when the data has no real seasonality, the seasonal terms add nothing: ARIMA is simply SARIMA with P, D, and Q set to zero, so in that case the simpler ARIMA model is the more robust choice.

There is no definitive answer to whether you should use ARIMA or SARIMA models for your data. It depends on the characteristics of your data, the purpose of your analysis, and the trade-off between simplicity and accuracy. A good way to start is to plot your data and look for any seasonal patterns or trends. You can also use statistical tests and criteria to guide the choice. The Augmented Dickey-Fuller test checks whether a series is stationary, which helps you choose the degree of differencing. The Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) compare fitted models, with lower values indicating a better balance between fit and complexity. You can also evaluate your models on their forecasting performance over a hold-out period, using measures such as the mean absolute error or the mean squared error.
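The information criteria and error measures mentioned above follow short formulas, sketched here with the standard library only. The hold-out values and the two sets of forecasts are made-up numbers for illustration, and the exact parameter count that goes into AIC and BIC varies between software packages, so treat `n_params` as an assumption to check against your tool.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def mae(actual, forecast):
    """Mean absolute error over a hold-out period."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean squared error over a hold-out period."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Made-up hold-out values and forecasts from two hypothetical models.
actual = [120, 135, 150, 160]
forecast_a = [118, 130, 155, 158]
forecast_b = [110, 140, 140, 170]

for name, forecast in [("model A", forecast_a), ("model B", forecast_b)]:
    print(name, "MAE:", mae(actual, forecast), "MSE:", mse(actual, forecast))
```

BIC penalizes each extra parameter more heavily than AIC once the sample has more than about seven observations, so it tends to favor the smaller model when the two criteria disagree.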

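ARIMA-based models (which include SARIMA) can be compared using standard information criteria, such as the Akaike Information Criterion, as long as both models use the same differencing to obtain a stationary series. With different differencing, the models are fitted to different transformed series, and their criteria are not directly comparable.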
