Forecast Quarterly House Sales with MindsDB
In this tutorial, we’ll use a time-series model to forecast quarterly house sales.
This tutorial uses the Lightwood integration that requires the mindsdb/mindsdb:lightwood
Docker image. Learn more here.
Connect a data source
We start by connecting a demo database to MindsDB using the CREATE DATABASE
statement.
Let’s preview the data that will be used to train the model.
Deploy and train an ML model
Now, lets specify that we want to forecast the ma
column, which is a moving average of the historical median price for house sales. Looking at the data, you can see several entries for the same date, which depend on two factors: how many bedrooms the properties have, and whether properties are “houses” or “units”. This means that we can have up to ten different groupings here. Let’s look at the data for one of them.
We want to generate forecasts to predict the behavior of this and the other series for the next year. MindsDB makes it simple so that we don’t need to repeat the predictor creation process for every group there is. Instead, we can just group for both columns and the predictor will learn from all series and enable all forecasts. We are going to use the CREATE MODEL
statement, where we specify what data to train FROM
and what we want to PREDICT
.
You can check the status of the model as below:
Make predictions
Once the model’s status is complete, you can query it as a table to get forecasts for a given period of time. Usually, you’ll want to know what happens right after the latest training data point that was fed, for which we have a special bit of syntax, the LATEST
keyword.
Now, try changing the value of type
and bedrooms
columns and check how the forecast varies. This is because MindsDB recognizes each grouping as being its own different time series.
Automate continuous improvement of the model
Now, we can take this even further. MindsDB includes powerful automation features called Jobs which allow us to automate queries in MindsDB. This is very handy for production AI/ML systems which all require automation logic to help them to work.
We use the CREATE JOB
statement to create a Job.
Now, let’s use a Job to retrain the model every two days, just like we might in production. You can retrain the model to improve predictions every time when either new data or new MindsDB version is available. And, if you want to retrain your model considering only new data, then go for finetuning it.
This job will execute every 2 days only if there is new data available in the house_sales
table. Learn more about the LAST
keyword here.
And there you have it! You created an end-to-end automated production ML system in a few short minutes.