Self-Hosted Copilot With Ollama

During the end of 2024, there's been a lot of coding companions on the market. One I've liked is Cursor, which is a fork of VSCode and uses Clyde as an underlying model for the companion, among a few others. Its free tier worked well, and you could game the system a bit by going through the signup process multiple time to get more use out of the tool. However they started to patch some of these holes, and I had to look for a fully self hosted solution, as the pricing model for a personal use coding companion is, in my opionion, a little steep.
Self host it!
Hosting LLMs via LM Studio or Ollama are pretty good ways to make this project happen. Since I've used LM Studio enough before, I used Ollama to self host my LLM.
Various VSCode plugins are built for self-hosted LLMs, and it's pretty simple to mix and match. In this experiment I'll try a few tools:
## Running a model
So it's pretty easy to run a model in Ollama. First, get a list of ollama models with ollama list
. Then use ollama run
to download and run your intended model.
ollama listollama run stable-code:3b-code-q4_0
Querying the model directly
By default you'd be able to just directly interface with the LLM from the interactive prompt. The inference is quite fast.
Querying via the API is also simple. The run command starts a server, much like in LM Studio, which has the OpenAI style endpoints that you can curl with some example curls:
curl http://localhost:11434/api/generate -d '{ "model": "stable-code:3b-code-q4_0", "prompt": "Write a hello world program in Java.", "stream": false}'
Packages for Ollama are available in most programming languages if you want to use the OpenAI sytle api programmatically.
LLama Coder Plugin
The Llama Code plugin is able to do tab complete via your local LLM. Luckily it expects a default Ollama setup, which we just configured.All we need is the `ollama` cli installed and it should be able to detect it once we install the extension.
Settings
Next we'll visit the plugin page, click on the settings wheel, and select Settings. For a default installation nothing needs to be saved.
Usage
When writing code, after a space bar press and time waits, the LLM will be fed with the context and a suggestion will be given.
Aider-composer
This plugin supposedly allows for composer like features (code diff that you can apply/reject).
Setup
Certain python packages are required:
pip install aider-chat flask
Resources
Andrea Grandi's dive into this topic