Variable Selection when forecasting

Q: What is variable selection, and why does it matter for forecasting accuracy?

Variable selection is the process of identifying the most relevant variables or leading indicators that impact forecast accuracy.

Q: How does your feature determine which variables and transformations to include or exclude?

The feature uses machine learning to automatically rank variables based on their impact on forecast accuracy.

Q: Does variable selection work for both univariate and multivariate time series?

In Indicio, variable selection is applied only to multivariate models. Univariate models can include other variables only through exogenous modeling.

Q: What methods do you use (e.g., regularization, feature importance, SHAP) to rank variable relevance?

The platform uses machine learning methods including feature importance analysis to rank variable relevance.

Q: How does the feature prevent overfitting, especially with many candidate variables?

The feature uses cross-validation to test different sets of variables and ensure models perform well on out-of-sample data.

Key features in Indicio

Automated Feature Importance Analysis
Harness the power of machine learning to automatically rank variables based on their impact on forecast accuracy. Save time by allowing our system to highlight the most critical variables, ensuring you focus on the ones that truly drive results.
Manual override
While automation is powerful, you know your data best. Easily override automated selections by manually selecting or deselecting variables. This gives you full control over the forecasting process and enables customization for specific business needs.
Cross-Validation
Test different sets of variables against one another using built-in cross-validation. This ensures that the selected variables perform well on data that the model hasn't seen yet (out-of-sample), improving the model’s predictive performance.
Time Series Feature Engineering
Automatically generate lag features, moving averages, and seasonal factors relevant to time series forecasting. Easily adjust these features to match your data’s granularity and forecasting horizon, improving long-term accuracy.
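
As a rough illustration of the lag and moving-average features mentioned above, here is a minimal sketch in plain Python. The function names (lag, moving_average) and the sales figures are hypothetical and chosen for illustration only; this is not how Indicio generates features internally.

```python
def lag(series, k):
    """Shift a series by k steps so position t holds the value from t - k."""
    if k <= 0:
        return list(series)
    # The first k positions have no history, so they are left empty.
    return [None] * k + list(series[:-k])


def moving_average(series, window):
    """Trailing moving average over the current value and the window - 1 before it."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(series[i + 1 - window : i + 1]) / window)
    return out


# Hypothetical monthly sales figures.
sales = [10, 12, 14, 13, 15, 18]
lag_1 = lag(sales, 1)            # last month's value as a feature
ma_3 = moving_average(sales, 3)  # three-month trailing average

for row in zip(sales, lag_1, ma_3):
    print(row)
```

Leaving the first entries empty rather than padding them keeps the features honest about how much history is available at each point.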

How it works

Create a Forecast
Begin by setting up your forecast. Give it a unique name and define the forecast horizon, specifying how far into the future you want to predict—whether it’s days, months, or even years.
Import Data
Seamlessly bring your data into the platform by uploading files, fetching data from our integrations with 3rd-party data vendors, or connecting directly to your database. Indicio supports a wide range of formats to ensure flexibility.
Analyze Variable Importance
Let Indicio automatically analyze and rank the importance of each variable. Our system evaluates which features are most likely to impact forecast accuracy, helping you focus on the key drivers of your model.
Build Models
Choose from an extensive library of both statistical and machine learning models to build your forecast. Whether you prefer time-tested approaches like ARIMA or cutting-edge neural networks, Indicio has you covered.
Evaluate Models
Use cross-validation to evaluate your models and get out-of-sample accuracy. Compare models based on the metrics most relevant to your business needs—RMSE, MAPE, MASE, or Hit-ratio—and choose the best performer for deployment.
Export Forecast
Once your forecast is finalized, easily export it to your IT environment. Share the results with your team to help your organization make informed, data-driven decisions with confidence.
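
The accuracy metrics named in the Evaluate Models step can be written out in a few lines. The sketch below uses common textbook definitions in plain Python. The hit ratio in particular has several definitions in practice; the one here (share of periods where the forecast gets the direction of change right) is an assumption and may differ from the one Indicio reports. The data is made up.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mase(actual, forecast, train):
    """Mean absolute scaled error: forecast MAE divided by the in-sample
    MAE of a naive one-step forecast (previous value) on the training data."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_mae = sum(abs(train[i] - train[i - 1]) for i in range(1, len(train))) / (len(train) - 1)
    return mae / naive_mae


def hit_ratio(actual, forecast, last_value):
    """Share of periods where forecast and actual move in the same
    direction relative to the previous actual value (assumed definition)."""
    previous = [last_value] + list(actual[:-1])
    hits = sum(1 for a, f, p in zip(actual, forecast, previous) if (a - p) * (f - p) > 0)
    return hits / len(actual)


# Hypothetical training history, held-out actuals, and forecasts.
train = [100, 102, 101, 105]
actual = [106, 104, 108]
forecast = [107, 107, 106]

print("RMSE:", rmse(actual, forecast))
print("MAPE:", mape(actual, forecast))
print("MASE:", mase(actual, forecast, train))
print("Hit ratio:", hit_ratio(actual, forecast, last_value=train[-1]))
```

A MASE below 1 means the model beats the naive previous-value forecast on average, which makes it a convenient metric for comparing series with different scales.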

Frequently asked questions

What is variable selection, and why does it matter for forecasting accuracy?

Variable selection is the process of choosing which variables (features) your model should actually use, such as price, promotions, weather, holidays, macro indicators, or custom business signals. Instead of feeding the model every possible variable, we keep the signals that add predictive value and drop those that add noise.

How does your feature determine which variables and transformations to include or exclude?

Our feature offers several strategies to choose variables and transformations. It can use search algorithms (backward, forward, stepwise) to test many variable combinations, Lasso to shrink small coefficients to zero, and Bayesian methods that keep variables with high posterior inclusion probability.
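
To make the Lasso idea concrete, here is a minimal sketch of Lasso fitted by coordinate descent with NumPy. It is a generic textbook version, not Indicio's implementation. The data is synthetic, the penalty value is arbitrary, and the function names are hypothetical. It assumes roughly standardized predictors and a target with mean close to zero, so no intercept is fitted.

```python
import numpy as np


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return exactly zero if it is small."""
    return np.sign(value) * max(abs(value) - penalty, 0.0)


def lasso_coordinate_descent(X, y, penalty, n_sweeps=200):
    """Minimize (1 / 2n) * ||y - X b||^2 + penalty * ||b||_1 one coefficient at a time."""
    n_obs, n_vars = X.shape
    coef = np.zeros(n_vars)
    for _ in range(n_sweeps):
        for j in range(n_vars):
            # Residual with variable j's current contribution added back.
            partial_residual = y - X @ coef + X[:, j] * coef[j]
            rho = X[:, j] @ partial_residual / n_obs
            scale = X[:, j] @ X[:, j] / n_obs
            coef[j] = soft_threshold(rho, penalty) / scale
    return coef


# Synthetic example: only the first two of four candidate variables drive y.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.standard_normal(200)

coef = lasso_coordinate_descent(X, y, penalty=0.5)
selected = [j for j in range(X.shape[1]) if coef[j] != 0.0]
print("coefficients:", coef)
print("selected variables:", selected)
```

The penalty controls how aggressive the selection is: a larger value zeroes out more coefficients, and it also shrinks the coefficients that survive.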

Can I combine automatic selection with my own expert-picked variables?

Yes, you can override the variable selection results if you need to have specific variables in your forecasting models.

How do you handle multicollinearity and redundant predictors?

Multicollinearity mainly affects classical statistical models, while Lasso and Bayesian approaches already penalize it. For classical models, you can drop variables flagged in multicollinearity warnings or let variable selection remove them using a model that is sensitive to multicollinearity.
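
One common way to spot redundant predictors before fitting a classical model is the variance inflation factor (VIF). The sketch below computes it with NumPy on synthetic data. It is an illustration of the general diagnostic, and whether Indicio's multicollinearity warnings are based on VIF is not stated here. The rule of thumb of 10 in the comment is a convention, not a hard limit.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
    column j on all other columns plus an intercept."""
    n_obs, n_vars = X.shape
    factors = []
    for j in range(n_vars):
        target = X[:, j]
        others = np.column_stack([np.ones(n_obs), np.delete(X, j, axis=1)])
        beta = np.linalg.lstsq(others, target, rcond=None)[0]
        residual = target - others @ beta
        ss_res = np.sum(residual ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        r_squared = 1.0 - ss_res / ss_tot
        factors.append(1.0 / (1.0 - r_squared))
    return factors


# Synthetic example: the third column is almost a copy of the first.
rng = np.random.default_rng(1)
x0 = rng.standard_normal(300)
x1 = rng.standard_normal(300)
x2 = x0 + 0.05 * rng.standard_normal(300)
X = np.column_stack([x0, x1, x2])

vif = variance_inflation_factors(X)
for name, value in zip(["x0", "x1", "x2"], vif):
    # Values well above 10 are often treated as a sign of strong collinearity.
    print(name, round(value, 1))
```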

Does variable selection work for both univariate and multivariate time series?

In Indicio, variable selection is applied only to multivariate models. Univariate models can include other variables only through exogenous modeling, which requires forecasts of those variables. Evaluating such a model with the actual values of the exogenous variables would introduce look-ahead bias.

What methods do you use (e.g., regularization, feature importance, SHAP) to rank variable relevance?

Indicio offers several methods for ranking variables by relevance. During variable selection, it uses search algorithms (backward, forward, stepwise) that test variable combinations, Lasso to shrink small coefficients to zero, and Bayesian methods that keep variables with high posterior inclusion probability.

Variable relevance can also be ranked in the last step of the forecasting process, where SHAP translates complex forecast models into drivers and barriers.

How does the feature prevent overfitting, especially with many candidate variables?

Indicio limits overfitting in several ways: train/validation splits and cross-validation, regularization (Lasso and Bayesian shrinkage), and automated variable selection that removes weak or redundant predictors.

Tip: comparing in-sample and out-of-sample results helps spot overfitting.
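
The tip above can be sketched with an expanding-window (rolling-origin) split, a common form of time series cross-validation in which each test block always follows its training block in time. The function name, the toy series, and the deliberately simple "forecast the training mean" model are all assumptions for illustration. The exact splitting scheme Indicio uses is not described here.

```python
def expanding_window_splits(n_obs, initial_train, horizon):
    """Yield (train_indices, test_indices) pairs where the test block
    always comes directly after the training block."""
    end = initial_train
    while end + horizon <= n_obs:
        yield list(range(end)), list(range(end, end + horizon))
        end += horizon


def mean_abs_error(values, prediction):
    return sum(abs(v - prediction) for v in values) / len(values)


# Toy upward-trending series; a constant-mean model underfits it on purpose.
series = [float(v) for v in range(1, 11)]
splits = list(expanding_window_splits(n_obs=len(series), initial_train=4, horizon=2))

gaps = []
for train_idx, test_idx in splits:
    train = [series[i] for i in train_idx]
    test = [series[i] for i in test_idx]
    level = sum(train) / len(train)            # "model": forecast the training mean
    in_sample = mean_abs_error(train, level)   # error on data the model has seen
    out_sample = mean_abs_error(test, level)   # error on data it has not seen
    gaps.append((in_sample, out_sample))
    print(f"train size {len(train)}: in-sample {in_sample:.2f}, out-of-sample {out_sample:.2f}")
```

A large and persistent gap between the two columns is the signal to look for: the model describes the past better than it predicts the future.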

Can I see transparency/explainability on why a variable was selected or dropped?

Yes. You can inspect diagnostics such as coefficients and impact on accuracy. Together these show which variables were kept or dropped, how strongly they influence the model, and whether they help or hurt forecast performance.

How does variable selection impact training speed and inference latency at scale?

Variable selection adds some overhead, since it needs to test and compare different subsets of predictors. At scale, that cost is offset by smaller final models: fewer predictors speed up training of the chosen model and reduce inference latency in production.

What data prep is required (missing values, seasonality/holiday flags, categorical encodings) for best results?

Indicio automatically detects and treats missing values and seasonality. You can also flag and handle outliers and calendar effects such as holidays to further improve model performance.
