Granite on Mac
Overview
In this guide we’ll be using Ollama, an open-source tool that makes it easy to download and run AI models locally.
Install Ollama
Install Ollama using Homebrew. You can also download the app directly from Ollama.
brew install ollama
Once you’ve installed Ollama, start the server.
Note: the trailing & keeps the service running in the background; omit it if you prefer to run the server in the foreground.
ollama serve &
Download the models
Now, let’s download the models. Determining which model to use depends on your needs and what your device can handle. Generally, larger models will produce better results but also require more resources.
In this guide, we’ll fetch both the 3b and 20b models. These are large files and will take some time to download.
ollama pull granite-code:20b
ollama pull granite-code:3b
Run Granite
To run the model, type:
ollama run granite-code:3b
If you want to run a different size of Granite Code model, replace :3b with the desired size tag. For example, to run the 20b variant, use:
ollama run granite-code:20b
You can also interact with the model using a curl command to send requests through the OpenAI-compatible API:
curl -X POST -H "Content-Type: application/json" http://localhost:11434/v1/chat/completions -d '{"model": "granite-code:3b", "messages": [{"role": "user", "content": "How are you today?"}]}'
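If you prefer Python, the same request can be built and sent with the standard library alone. This is a minimal sketch assuming the Ollama server is running on its default port (11434); the helper names below are illustrative, not part of Ollama's API.

```python
import json
import urllib.request

# Ollama's OpenAI-compatible chat completions endpoint (default port).
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model, prompt):
    """Build the JSON payload for a single-turn chat completion."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def send_chat(payload, url=OLLAMA_URL):
    """POST the payload to the running Ollama server and return the reply text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The OpenAI-compatible response nests the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]

payload = build_chat_request("granite-code:3b", "How are you today?")
print(json.dumps(payload))
# To actually send it (requires `ollama serve` to be running):
# print(send_chat(payload))
```

The payload printed above matches the one passed to curl, so either client can be used interchangeably against the same server.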