This is a general-purpose Retrieval-Augmented Generation (RAG) handler that can be used to create, train, and deploy models within MindsDB.
It supports the following:
Setup
MindsDB provides the RAG handler that enables you to use RAG methods for training models within MindsDB.
AI Engine
Before creating a model, it is required to create an AI engine based on the provided handler.
If you installed MindsDB locally, make sure to install all RAG dependencies by running pip install .[rag] or by installing them from the requirements.txt file.
You can create a RAG engine with the following command, providing the parameters for either OpenAI or Writer. The example lists both sets of parameters; supply only the ones for the LLM you use:
CREATE ML_ENGINE rag_engine
FROM rag
USING
openai_api_key="openai-api-key",
writer_org_id="writer-org",
writer_api_key="writer-api-key";
The name of the engine (here, rag_engine) should be used as the value of the engine parameter in the USING clause of the CREATE MODEL statement.
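Before moving on to model creation, it can help to confirm that the engine was registered. The following is a minimal sketch; it assumes your MindsDB version supports the SHOW ML_ENGINES statement and that the engine was created under the name rag_engine, as in the command above.

```sql
-- List all registered ML engines.
-- If the CREATE ML_ENGINE statement succeeded, rag_engine should appear in the result.
SHOW ML_ENGINES;
```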
AI Model
The CREATE MODEL statement is used to create, train, and deploy models within MindsDB.
CREATE MODEL rag_model
FROM datasource
(SELECT * FROM table)
PREDICT answer
USING
engine = 'rag_engine',
llm_type = 'openai',
url = 'link-to-webpage',
vector_store_folder_name = 'db_connection';
Where:
| Name | Description |
|---|---|
| llm_type | It defines which LLM is used. |
| url | It is used to provide training data from a website. |
| vector_store_folder_name | It is used to provide training data from a vector database. |
Usage
Simple Example
Below is a complete usage example of the RAG handler.
Create an ML engine - here, we are going to use OpenAI.
CREATE ML_ENGINE rag_engine
FROM rag
USING
openai_api_key = 'sk-xxx';
Create a model using this engine.
CREATE MODEL mindsdb_rag_model
PREDICT answer
USING
engine = "rag_engine",
llm_type = "openai",
url='https://docs.mindsdb.com/what-is-mindsdb';
Check the status of the model.
DESCRIBE mindsdb_rag_model;
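As an alternative to DESCRIBE, the model status can usually be read from the models table of the project. This is a sketch under assumptions: it presumes the model was created in the default mindsdb project and that the table exposes name, status, and error columns, which may differ between MindsDB versions.

```sql
-- Assumption: the model lives in the default `mindsdb` project.
-- The model is typically ready to query once status shows `complete`;
-- if creation failed, the error column should describe the cause.
SELECT name, status, error
FROM mindsdb.models
WHERE name = 'mindsdb_rag_model';
```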
Now you can use the model to answer your questions.
SELECT *
FROM mindsdb_rag_model
WHERE question = 'What ML use cases does MindsDB support?';
On execution, we get:
+--------+------------------+----------+
| answer | source_documents | question |
+--------+------------------+----------+
| MindsDB supports supervised learning tasks such as regression, classification, and time series forecasting, as well as unsupervised learning tasks such as clustering and anomaly detection. | {} | What ML use cases does MindsDB support? |
+--------+------------------+----------+
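The query above returns every output column. The sketch below shows two common variations: selecting only the answer, and answering several questions in one statement by joining the model with a table. The datasource.questions table and its question column are hypothetical, and batch querying through a JOIN follows the general MindsDB pattern; whether the RAG handler supports it in your version is an assumption to verify.

```sql
-- Return only the generated answer for a single question.
SELECT answer
FROM mindsdb_rag_model
WHERE question = 'What ML use cases does MindsDB support?';

-- Answer many questions at once by joining the model with a table.
-- `datasource.questions` and its `question` column are hypothetical names.
SELECT q.question, m.answer
FROM datasource.questions AS q
JOIN mindsdb_rag_model AS m;
```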
Advanced Examples
OpenAI
Create the RAG engine, providing credentials for the LLM you want to use.
CREATE ML_ENGINE rag_openai
FROM rag
USING
openai_api_key = "value";
Create a model and embed input data.
CREATE MODEL rag_openai_model
FROM datasource
(SELECT * FROM table)
PREDICT answer
USING
engine="rag_openai",
top_k=4,
llm_type="openai",
summarize_context=true,
vector_store_name="faiss",
run_embeddings=true,
vector_store_folder_name='rag_handler_openai_test',
embeddings_model_name="BAAI/bge-base-en",
prompt_template='Use the following pieces of context to answer the question at the end. If you do not know the answer, just say that you do not know, do not try to make up an answer.
Context: {context}
Question: {question}
Helpful Answer:';
Now that the model is created, trained, and deployed, you can query for predictions.
SELECT *
FROM rag_openai_model
WHERE question = 'what product is best for treating a cold?';
Writer
Create the RAG engine, providing credentials for the LLM you want to use.
CREATE ML_ENGINE rag_writer
FROM rag
USING
writer_org_id = "value",
writer_api_key = "value";
Create a model and embed input data.
CREATE MODEL rag_writer_model
FROM datasource
(SELECT * FROM table)
PREDICT answer
USING
engine="rag_writer",
top_k=4,
llm_type="writer",
summarize_context=true,
vector_store_name="faiss",
run_embeddings=true,
vector_store_folder_name='rag_handler_openai_test',
embeddings_model_name="BAAI/bge-base-en",
prompt_template='Use the following pieces of context to answer the question at the end. If you do not know the answer, just say that you do not know, do not try to make up an answer.
Context: {context}
Question: {question}
Helpful Answer:';
Now that the model is created, trained, and deployed, you can query for predictions.
SELECT *
FROM rag_writer_model
WHERE question = 'what product is best for treating a cold?';
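Once you are done experimenting, the models and engines created in the advanced examples can be removed. This cleanup step is not part of the original walkthrough; it is a short sketch using the standard DROP MODEL and DROP ML_ENGINE statements and the object names from the examples above.

```sql
-- Remove the models first, then the engines they were created with.
DROP MODEL rag_openai_model;
DROP MODEL rag_writer_model;

DROP ML_ENGINE rag_openai;
DROP ML_ENGINE rag_writer;
```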