Ollama
Ollama is a project that enables easy local deployment of Large Language Models (LLMs).
All models supported by Ollama are available in MindsDB through this integration.
For now, this integration works only on macOS; Linux and Windows support is planned.
Locally deployed LLMs can be desirable for a wide variety of reasons, such as data privacy, faster developer feedback loops, and reduced inference costs.
As with other LLM-focused integrations (e.g. OpenAI, Anthropic, Cohere), ideal use cases involve language understanding and generation, including but not limited to:
- zero-shot text classification
- sentiment analysis
- question answering
- summarization
- translation
Setup
- A macOS machine with an M1 chip or newer.
- A working Ollama installation. For instructions, refer to their website; installation is straightforward.
- For 7B models, at least 8GB RAM is recommended.
- For 13B models, at least 16GB RAM is recommended.
- For 70B models, at least 64GB RAM is recommended.
Minimum specs can vary depending on the model; refer to the Ollama documentation for more information.
AI Engine
Before creating a model, you need to create an AI engine based on the provided handler.
You can create an Ollama engine using this command:
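```sql
-- A minimal example; assumes the Ollama handler is registered as 'ollama'.
CREATE ML_ENGINE ollama
FROM ollama;
```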
The name of the engine (here, `ollama`) should be used as a value for the `engine` parameter in the `USING` clause of the `CREATE MODEL` statement.
AI Model
The `CREATE MODEL` statement is used to create, train, and deploy models within MindsDB.
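A minimal sketch, assuming the `ollama` engine created above, Llama 2 as the underlying model, and `completion` as the target column (any model supported by your Ollama installation can be substituted):

```sql
CREATE MODEL ollama_model       -- model name used in the examples below
PREDICT completion              -- column that will hold the generated text
USING
    engine = 'ollama',          -- the engine created above
    model_name = 'llama2';      -- any model supported by your Ollama installation
```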
Where:
| Name | Description |
| --- | --- |
| `engine` | It defines the Ollama engine. |
| `model_name` | It is used to provide the name of the model to be used. |
Supported commands for describing Ollama models are:
```sql
DESCRIBE ollama_model;
DESCRIBE ollama_model.model;
DESCRIBE ollama_model.features;
```
Usage
Once you have created an Ollama model, you can use it to make predictions.
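For example, the following query sketches a single prediction with the `ollama_model` created above; it assumes the model takes its input from a `text` column:

```sql
-- Prompt the model and return its generated completion.
SELECT text, completion
FROM ollama_model
WHERE text = 'Classify the sentiment of this sentence: I love MindsDB!';
```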