On code completions via LLMs
LLMs are most commonly used for chat, and for that the endpoint is:
POST /v1/chat/completions
But there is another endpoint, called “Fill in the Middle” (FIM), which looks like this:
POST /v1/fim/completions
The idea is that certain LLM providers have trained their models to complete code between two points: a prefix and a suffix.
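To make that concrete, a FIM request body carries the code before and after the cursor. Here is a minimal sketch in Python, assuming Mistral-style field names (prompt for the prefix, suffix for the rest); build_fim_request is my own illustrative helper, not part of any SDK:

```python
# Sketch of a fill-in-the-middle request body, assuming a Mistral-style
# /v1/fim/completions schema (model, prompt, suffix, max_tokens).
import json

def build_fim_request(prefix, suffix, model="codestral-latest", max_tokens=64):
    """Build the JSON body for a FIM completion request (illustrative helper)."""
    return {
        "model": model,
        "prompt": prefix,    # code before the cursor
        "suffix": suffix,    # code after the cursor
        "max_tokens": max_tokens,
        "temperature": 0,    # deterministic completions suit editors best
    }

body = build_fim_request("(defun add (a b)\n  ", ")")
print(json.dumps(body, indent=2))
```

You would POST that body, JSON-encoded, to the provider's /v1/fim/completions endpoint with your API key in the Authorization header.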
The ones I’ve played with are Codestral from Mistral and Mercury Coder Small from Inception Labs.
Codestral is free to use once you create an API key, via this endpoint: https://codestral.mistral.ai. The free tier allows 1 request per second; exceed that and you’ll get blocked. If you sign up for a paid account, you get a different endpoint, https://api.mistral.ai, with 6 requests per second.
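If you’re on the free tier, a simple client-side throttle keeps you under that 1 request per second limit. A minimal sketch in Python (the Throttle class is my own illustration, not part of any Mistral SDK):

```python
import time

class Throttle:
    """Client-side throttle: delay calls so they stay under max_rps per second."""
    def __init__(self, max_rps):
        self.min_interval = 1.0 / max_rps
        self.last = 0.0

    def wait(self):
        # Sleep just long enough to keep min_interval between calls.
        now = time.monotonic()
        remaining = self.min_interval - (now - self.last)
        if remaining > 0:
            time.sleep(remaining)
        self.last = time.monotonic()

throttle = Throttle(max_rps=1)  # free Codestral tier: 1 request/second
# Call throttle.wait() before each POST to /v1/fim/completions.
```

This won’t save you from bursts issued by other clients sharing the key, but it is enough for a single editor session.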
The second option is Mercury Coder Small, which only has a paid option, but access is much faster: 200 requests per second.
Another benefit of Mercury Coder Small is that it gives good completions without trailing parentheses (or at least it did in the tests I ran).
What do I mean by that?
When you have the following code prefix:
(defun add (a b)
and suffix:
)
you’ll get the following suggestion:
(+ a b)
with the final result looking like:
(defun add (a b)
  (+ a b))
with Mercury Coder Small.
But with Codestral you’ll get an extra trailing paren, which creates unbalanced code since the closing “)” suffix is already there:
(defun add (a b)
  (+ a b))) ;; <- note the extra paren
There is a solution for that: you could define a completion hook which calls a utility function to clean up completions, copilot-balancer: https://github.com/copilot-emacs/copilot.el/blob/main/copilot-balancer.el. This is standalone functionality that doesn’t require using the full copilot package/mode.
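The core idea behind such cleanup can be sketched in a few lines. Here is a toy Python version of the heuristic (my own simplification, not copilot-balancer’s actual algorithm, and it ignores strings and comments): count the paren depth across prefix, completion, and suffix, and trim surplus closers from the completion’s tail.

```python
def balance_completion(prefix, completion, suffix):
    """Trim trailing ')' from completion so prefix+completion+suffix balances.
    Toy heuristic inspired by copilot-balancer; ignores strings and comments."""
    def depth(s):
        d = 0
        for ch in s:
            if ch == "(":
                d += 1
            elif ch == ")":
                d -= 1
        return d

    # Negative surplus means too many closers overall; strip them from the tail.
    surplus = depth(prefix) + depth(completion) + depth(suffix)
    while surplus < 0 and completion.rstrip().endswith(")"):
        completion = completion.rstrip()[:-1]
        surplus += 1
    return completion

# The article's Codestral example: one closer too many gets trimmed.
print(balance_completion("(defun add (a b)\n  ", "(+ a b))", ")"))  # prints (+ a b)
```

The real copilot-balancer does considerably more (it understands strings, comments, and multiple bracket types), but the shape of the fix is the same: reconcile the completion with what the suffix already closes.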