Open Interpreter can use an OpenAI-compatible server (LM Studio, Ollama, etc.) to run models locally.

Run interpreter with the api_base URL of your inference server (for LM Studio, the default is http://localhost:1234/v1):

interpreter --api_base "http://localhost:1234/v1" --api_key "fake_key"

Alternatively, you can use Llamafile without installing any third-party software by running:

interpreter --local

For a more detailed guide, check out this video by Mike Bird.

How to run LM Studio in the background.

  1. Download LM Studio, then start it.
  2. Select a model then click ↓ Download.
  3. Click the ↔️ button on the left (below 💬).
  4. Select your model at the top, then click Start Server.

Once the server is running, you can begin your conversation with Open Interpreter.

(When you run the command interpreter --local and select LM Studio, these steps will be displayed.)

Local mode sets your context_window to 3000, and your max_tokens to 1000. If your model has different requirements, set these parameters manually.
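If you use the Python package, overriding these limits could look like the sketch below. It assumes the settings are exposed as interpreter.llm.context_window and interpreter.llm.max_tokens, mirroring interpreter.llm.api_base. The helper name apply_local_limits and the check that max_tokens stays below context_window are illustrative additions, not part of Open Interpreter. A stand-in object is used so the sketch runs without the package installed.

```python
from types import SimpleNamespace


def apply_local_limits(interp, context_window=3000, max_tokens=1000):
    """Set context_window and max_tokens on an interpreter-like object.

    The defaults match the values local mode uses. The attribute names
    interp.llm.context_window and interp.llm.max_tokens are assumptions.
    """
    # Sanity check (our own addition): the reply must fit inside the window.
    if max_tokens >= context_window:
        raise ValueError("max_tokens must be smaller than context_window")
    interp.llm.context_window = context_window
    interp.llm.max_tokens = max_tokens
    return interp


# Stand-in object so the sketch runs without Open Interpreter installed.
# With the real package you would pass `interpreter` from
# `from interpreter import interpreter` instead.
demo = SimpleNamespace(llm=SimpleNamespace())
apply_local_limits(demo, context_window=8000, max_tokens=2000)
print(demo.llm.context_window, demo.llm.max_tokens)
```

Pick values that match your model's documented context length; the numbers above are placeholders.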


Compared to the terminal interface, our Python package gives you more granular control over each setting.

You can point interpreter.llm.api_base at any OpenAI-compatible server (including one running locally).

For example, to connect to LM Studio, use these settings:

from interpreter import interpreter

interpreter.offline = True # Disables online features like Open Procedures
interpreter.llm.model = "openai/x" # Tells OI to send messages in OpenAI's format
interpreter.llm.api_key = "fake_key" # LiteLLM, which we use to talk to LM Studio, requires this
interpreter.llm.api_base = "http://localhost:1234/v1" # Point this at any OpenAI compatible server

Ensure that LM Studio, or any other OpenAI-compatible server, is running at api_base.
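One way to confirm that a server is listening at api_base before starting a conversation is to request its model list. The sketch below uses only the standard library and assumes the server implements the OpenAI-style GET /models route under api_base (LM Studio is expected to, but other servers may not). The function name list_models is hypothetical.

```python
import json
import urllib.request


def list_models(api_base, timeout=2.0):
    """Return the model ids served at api_base, or None if unreachable.

    Assumes an OpenAI-style response: {"data": [{"id": "..."}, ...]}.
    """
    url = api_base.rstrip("/") + "/models"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and HTTP errors;
        # ValueError covers malformed URLs and non-JSON responses.
        return None
    if not isinstance(payload, dict):
        return []
    return [m.get("id") for m in payload.get("data", []) if isinstance(m, dict)]


models = list_models("http://localhost:1234/v1")
if models is None:
    print("No server reachable at api_base; start LM Studio's server first.")
else:
    print("Server is up. Models:", models)
```

A None result means nothing answered at that address, so check the port and that the server has been started.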