max_tokens and context_window should be set via Open Interpreter.
For local mode, smaller context windows use less RAM, so we recommend trying a much shorter window (~1000) if it’s failing or slow.
Make sure max_tokens is less than context_window.
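
As a rough illustration of the constraint between these two settings, here is a minimal sketch that checks a pair of values before they are used. `LLMLimits` and `check_limits` are hypothetical helpers written for this example and are not part of Open Interpreter. The value 250 for max_tokens is an arbitrary choice that satisfies the constraint with the ~1000 window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMLimits:
    """Hypothetical container for the two settings (not an Open Interpreter class)."""
    max_tokens: int
    context_window: int


def check_limits(limits: LLMLimits) -> LLMLimits:
    """Return the limits unchanged if they are consistent, else raise ValueError."""
    if limits.max_tokens <= 0 or limits.context_window <= 0:
        raise ValueError("max_tokens and context_window must both be positive")
    if limits.max_tokens >= limits.context_window:
        raise ValueError(
            f"max_tokens ({limits.max_tokens}) must be less than "
            f"context_window ({limits.context_window})"
        )
    return limits


# A short window for local mode, in the spirit of the ~1000 suggestion.
# The max_tokens value here is an assumed example, not a recommended default.
local_limits = check_limits(LLMLimits(max_tokens=250, context_window=1000))
```

The checked values would then be handed to Open Interpreter. In recent releases this is typically done by assigning `interpreter.llm.max_tokens` and `interpreter.llm.context_window`, but the exact attribute path may differ between versions, so treat it as an assumption and confirm it against your installed release.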
