Local Providers
Best Practices
Most settings, such as model architecture and GPU offloading, can be adjusted through your LLM provider (for example, LM Studio).
However, max_tokens and context_window should be set via Open Interpreter.
In local mode, a smaller context window uses less RAM, so we recommend trying a much shorter window (~1000 tokens) if the model is failing or running slowly.
Make sure max_tokens is less than context_window.
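As an illustration, here is a minimal sketch of applying these settings through Open Interpreter's Python library. It assumes a recent version where the settings live under interpreter.llm; the specific values (1000 and 600) are placeholders following the recommendations above, to be adapted to your model:

```python
from interpreter import interpreter

# A shorter context window (~1000 tokens) keeps RAM usage down
# when running local models.
interpreter.llm.context_window = 1000

# Keep max_tokens below context_window so the prompt itself
# still fits inside the window.
interpreter.llm.max_tokens = 600

interpreter.chat("Hello!")
```

If you prefer the command line, recent versions of the CLI accept equivalent flags (e.g. interpreter --context_window 1000 --max_tokens 600).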