API calls to LLMs are stateless. Even if the same model
is specified, it does not preserve information about previous calls. Each call is independent of the others.
For applications which require multiple steps of processing by the LLM, all the necessary information must be provided with each API call. In the chatbot context, this would mean passing the entire conversation history with each message, so that the model can understand the context of the current message.
Let's walk through an example application where we use an LLM to complete a task in multiple steps. In this example, an LLM monitors alerts from a server, diagnoses the issue, and generates commands to address it.
First we create a system message which instructs the LLM about how it should handle alerts:
Step 1: LLM Receives Alert Message
An alert is sent by a server:
The application calls the LLM API to send the alert to the LLM:
Step 2: LLM Generates Commands to Diagnose the Alert
The LLM API responds with commands to diagnose the issue:
The application then executes the generated commands on the server:
Resulting in the following diagnostic information:
Step 3: Diagnostic Results Are Sent to the LLM
We want the application to provide the LLM with the results of of executing the diagnostic command, along with the context of the original alert, and the command it previously generated. To do so, we send the following API call:
We send the entire list of messages that have been sent so far by both the application and the LLM, in the order they were sent. The message originally sent by the LLM has the assistant
role. This allows the LLM to keep track of what it has already tried.
Step 4: LLM Analyzes Results and Generates Commands to Clear the Alert
Based on the diagnostic results, the LLM responds with the following command to address the high CPU usage:
We may then execute the command to try to clear the alert:
We can continue this process, sending the results of each step back to the LLM along with all of the previous context, allowing the LLM to act autonomously in response to additional inputs, or to provide further instructions by bringing in human operators when necessary.
Notebooks
The following notebooks demonstrate how to diagnose alerts using LLMs from Google, OpenAI, and Anthropic. These links open in Google Colab, where you can run the code yourself.