Streaming Response
You can stream messages, code, and code outputs out of Open Interpreter by setting stream=True
in an interpreter.chat(message)
call.
Note: Setting display=True
won’t change the behavior of the streaming response, it will just render a display in your terminal.
Anatomy
Each chunk of the streamed response is a dictionary, that has a “role” key that can be either “assistant” or “computer”. The “type” key describes what the chunk is. The “content” key contains the actual content of the chunk.
Every ‘message’ is made up of chunks, and begins with a “start” chunk, and ends with an “end” chunk. This helps you parse the streamed response into messages.
Let’s break down each part of the streamed response.
Code
In this example, the LLM decided to start writing code first. It could have decided to write a message first, or to only write code, or to only write a message.
Every streamed chunk of type “code” has a format key that specifies the language. In this case it decided to write python
.
This can be any language defined in our languages directory.
Then, the LLM decided to write some code. The code is sent token-by-token:
When the LLM finishes writing code, it will send an “end” chunk:
Code Output
After the LLM finishes writing a code block, Open Interpreter will attempt to run it.
Before it runs it, the following chunk is sent:
If you check for this object, you can break (or get confirmation) before executing the code.
While the code is being executed, you’ll receive the line of code that’s being run:
We use this to highlight the active line of code on our UI, which keeps the user aware of what Open Interpreter is doing.
You’ll then receive its output, if it produces any:
When the code is finished executing, this flag will be sent:
Message
Finally, the LLM decided to write a message. This is streamed token-by-token as well:
For an example in JavaScript on how you might process these streamed chunks, see the migration guide