Retrain a Model
Description
The `RETRAIN` statement is used to retrain already trained predictors with new data. The predictor is updated to leverage the new data in optimizing its predictive capabilities.
Retraining takes at least as long as the original training of the predictor did, because the dataset used for retraining contains new or updated data in addition to the old data.
Syntax
Here is the syntax:
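A sketch of the statement, assembled from the expressions described in this section. The clause order and the `USING` parameter names (`engine`, `tag`, `active`) are assumptions here and should be checked against the MindsDB version in use. Square brackets mark the optional part.

```sql
-- Square brackets mark optional clauses; they are not part of the statement.
-- All names below are placeholders.
RETRAIN project_name.predictor_name
[FROM integration_name
    (SELECT column_name, ... FROM table_name)
PREDICT target_column
USING
    engine = 'engine_name',  -- assumed key for the ML engine
    tag = 'tag_name',        -- assumed key for the tag
    active = 0/1];           -- 0: retrained version inactive, 1: active
```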
On execution, we get:
Where:
Expression | Description |
---|---|
`project_name` | Name of the project where the model resides. |
`predictor_name` | Name of the model to be retrained. |
`integration_name` | Optional. Name of the integration created using the `CREATE DATABASE` statement or file upload. |
`(SELECT column_name, ... FROM table_name)` | Optional. Query that selects the data to be used for training and validation. |
`target_column` | Optional. Column to be predicted. |
`engine_name` | Optional. ML engine used to retrain the model. |
`tag_name` | Optional. Tag that is visible in the `training_options` column of the `mindsdb.models` table. |
`active` | Optional. Setting it to `0` makes the retrained version inactive; setting it to `1` makes the retrained version active. |
Model Versions
Every time the model is retrained, a new version is created with an incremented version number.
You can query for all model versions like this:
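A minimal sketch of such a query, assuming the project exposes a `models_versions` table with a `name` column (both names are assumptions and may differ between MindsDB releases):

```sql
-- List every stored version of one model.
-- project_name and predictor_name are placeholders.
SELECT *
FROM project_name.models_versions
WHERE name = 'predictor_name';
```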
For more information on managing model versions, see the MindsDB documentation on model versions.
When to `RETRAIN` the Model?
It is advised to `RETRAIN` the predictor whenever the `update_status` column value from the `mindsdb.models` table is set to `available`.
Here is when the `update_status` column value is set to `available`:
- When a new version of MindsDB is available that makes the predictor obsolete.
- When new data is available in the table that was used to train the predictor.
To find out whether you need to retrain your model, query the `mindsdb.models` table and look for the `update_status` column.
Here are the possible values of the `update_status` column:
Name | Description |
---|---|
`available` | The model should be updated. |
`updating` | The model is currently being retrained. |
`up_to_date` | The model is up to date and does not need to be retrained. |
Let’s run the query.
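A sketch of the status check, assuming the `mindsdb.models` table stores the model name in a `name` column:

```sql
-- Check whether a model needs retraining.
-- predictor_name is a placeholder for the model's name.
SELECT name, update_status
FROM mindsdb.models
WHERE name = 'predictor_name';
```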
On execution, we get:
Where:
Name | Description |
---|---|
`predictor_name` | Name of the model to be retrained. |
`update_status` | Column informing whether the model needs to be retrained. |
Example
Let’s look at an example using the `home_rentals_model` model.
First, we check the status of the predictor.
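The status check for this model might look as follows, again assuming the model name is stored in the `name` column of `mindsdb.models`:

```sql
-- Look up the retraining status of home_rentals_model.
SELECT name, update_status
FROM mindsdb.models
WHERE name = 'home_rentals_model';
```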
On execution, the `update_status` column shows the value `available`, which tells us that we should retrain the model.
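A minimal sketch of the retraining statement, assuming the model resides in the default `mindsdb` project and that none of the optional clauses are needed:

```sql
-- Retrain the model on its original data source,
-- which now includes the new or updated rows.
RETRAIN mindsdb.home_rentals_model;
```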
After running the `RETRAIN` statement, let’s check the status again. While retraining is in progress, the `update_status` column shows `updating`. After the retraining process is completed, it shows `up_to_date`.