Fix incomplete unknown event stream from V2 Chat Agent
This is a high-priority MR for Switch to Chat Agent V2 (gitlab-org#13533 - closed). Please prioritize the review and merge.
## What does this MR do and why?
This MR fixes [V2 Chat Agent Bug] A1002 Gitlab::Llm::Chain::... (#490668 - closed) and Chunked encoding streaming from AI Gateway is n... (#490376 - closed).
`Gitlab::HTTP` appears to have a bug: it does not iterate the streamed events as they are sent by AI Gateway, but instead splits the data further into chunks of `BUFSIZE = 1024 * 16` bytes. As a result, whenever a single event exceeds that size, the event cannot be parsed or recognized in the step executor, which surfaces as the "Unknown event" error.
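To make the failure mode concrete, here is a standalone Ruby sketch (illustrative only; it does not use the actual `Gitlab::HTTP` internals, and the sizes mirror the repro patch below): an SSE event larger than the read buffer gets split across chunks, so a parser that treats each chunk as a complete event fails, while accumulating chunks until the `\n\n` terminator recovers the full event.

```ruby
# Illustrative only: mimics fixed-size buffered reads, not Gitlab::HTTP itself.
BUFSIZE = 1024 * 16

# One server-sent event whose payload exceeds the buffer size.
stream = "event: unknown\ndata: #{'a' * 20_000}\n\n"

# Fixed-size reads split the event across two chunks.
chunks = stream.scan(/.{1,#{BUFSIZE}}/m)
raise "expected the event to be split" unless chunks.size == 2
# The first chunk contains no "\n\n" terminator, so per-chunk parsing
# cannot recognize it as a complete event.
raise "unexpectedly complete" if chunks.first.include?("\n\n")

# Fix sketch: accumulate chunks in a buffer and emit only
# terminator-delimited events.
buffer = +""
events = []
chunks.each do |chunk|
  buffer << chunk
  while (idx = buffer.index("\n\n"))
    events << buffer.slice!(0, idx + 2)
  end
end

puts events.size         # one complete event
puts events.first.length # the full event, terminator included
```

The key point is that completeness of an event is a property of the accumulated buffer, not of any individual read.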
## MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
## How to reproduce
- Apply the following patch in AI Gateway:

  ```diff
  diff --git a/ai_gateway/chat/agents/react.py b/ai_gateway/chat/agents/react.py
  index b274e666..6c93ebba 100644
  --- a/ai_gateway/chat/agents/react.py
  +++ b/ai_gateway/chat/agents/react.py
  @@ -234,22 +234,24 @@ class ReActAgent(Prompt[ReActAgentInputs, TypeAgentEvent]):
           astream = super().astream(input, config=config, **kwargs)
           len_final_answer = 0
  -        async for event in astream:
  -            if is_feature_enabled(FeatureFlag.EXPANDED_AI_LOGGING):
  -                log.info("Response streaming", source=__name__, streamed_event=event)
  -
  -            if isinstance(event, AgentFinalAnswer) and len(event.text) > 0:
  -                yield AgentFinalAnswer(
  -                    text=event.text[len_final_answer:],
  -                )
  -
  -                len_final_answer = len(event.text)
  -
  -            events.append(event)
  -
  -        if any(isinstance(e, AgentFinalAnswer) for e in events):
  -            pass # no-op
  -        elif any(isinstance(e, AgentToolAction) for e in events):
  -            yield events[-1]
  -        elif isinstance(events[-1], AgentUnknownAction):
  -            yield events[-1]
  +        yield AgentUnknownAction(text="a" * 20000)
  +        # 16338
  +        # async for event in astream:
  +        #     if is_feature_enabled(FeatureFlag.EXPANDED_AI_LOGGING):
  +        #         log.info("Response streaming", source=__name__, streamed_event=event)
  +
  +        #     if isinstance(event, AgentFinalAnswer) and len(event.text) > 0:
  +        #         yield AgentFinalAnswer(
  +        #             text=event.text[len_final_answer:],
  +        #         )
  +
  +        #         len_final_answer = len(event.text)
  +
  +        #     events.append(event)
  +
  +        # if any(isinstance(e, AgentFinalAnswer) for e in events):
  +        #     pass # no-op
  +        # elif any(isinstance(e, AgentToolAction) for e in events):
  +        #     yield events[-1]
  +        # elif isinstance(events[-1], AgentUnknownAction):
  +        #     yield events[-1]
  ```
- Run GDK
- Execute the chat command in a Rails console:

  ```ruby
  prompt_message = Gitlab::Llm::ChatMessage.new(
    content: "Hello",
    role: "user",
    user: User.first,
    ai_action: "chat",
    context: Gitlab::Llm::AiMessageContext.new(resource: User.first)
  )
  ai_prompt_class = nil
  options = { content: "Hello", extra_resource: {}, action: :chat }
  Gitlab::Llm::Completions::Chat.new(prompt_message, ai_prompt_class, options).execute
  ```