Fix incomplete unknown event stream from V2 Chat Agent
This is a high-priority MR for Switch to Chat Agent V2 (gitlab-org#13533 - closed). Please prioritize the review and merge.
## What does this MR do and why?
This MR fixes [V2 Chat Agent Bug] A1002 Gitlab::Llm::Chain::... (#490668 - closed) and Chunked encoding streaming from AI Gateway is n... (#490376 - closed).
`Gitlab::HTTP` appears to have a bug: it does not iterate the streamed events as they are sent by AI Gateway, but instead splits the data further into chunks of `BUFSIZE = 1024 * 16` bytes. As a result, whenever a single event exceeds that size, the event cannot be parsed or recognized in the step executor, which surfaces as the "Unknown event" error.
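To make the failure mode concrete, here is a standalone Ruby sketch (illustrative only; it does not use the actual `Gitlab::HTTP` internals, and the sizes mirror the repro patch below): an SSE event larger than the read buffer gets split across chunks, so a parser that treats each chunk as a complete event fails, while accumulating chunks until the `\n\n` terminator recovers the full event.

```ruby
# Illustrative only: mimics fixed-size buffered reads, not Gitlab::HTTP itself.
BUFSIZE = 1024 * 16

# One server-sent event whose payload exceeds the buffer size.
stream = "event: unknown\ndata: #{'a' * 20_000}\n\n"

# Fixed-size reads split the event across two chunks.
chunks = stream.scan(/.{1,#{BUFSIZE}}/m)
raise "expected the event to be split" unless chunks.size == 2
# The first chunk contains no "\n\n" terminator, so per-chunk parsing
# cannot recognize it as a complete event.
raise "unexpectedly complete" if chunks.first.include?("\n\n")

# Fix sketch: accumulate chunks in a buffer and emit only
# terminator-delimited events.
buffer = +""
events = []
chunks.each do |chunk|
  buffer << chunk
  while (idx = buffer.index("\n\n"))
    events << buffer.slice!(0, idx + 2)
  end
end

puts events.size         # one complete event
puts events.first.length # the full event, terminator included
```

The key point is that completeness of an event is a property of the accumulated buffer, not of any individual read.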
## MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
## How to reproduce
- Apply the following patch in AI Gateway:

  ```diff
  diff --git a/ai_gateway/chat/agents/react.py b/ai_gateway/chat/agents/react.py
  index b274e666..6c93ebba 100644
  --- a/ai_gateway/chat/agents/react.py
  +++ b/ai_gateway/chat/agents/react.py
  @@ -234,22 +234,24 @@ class ReActAgent(Prompt[ReActAgentInputs, TypeAgentEvent]):
           astream = super().astream(input, config=config, **kwargs)
           len_final_answer = 0
  -        async for event in astream:
  -            if is_feature_enabled(FeatureFlag.EXPANDED_AI_LOGGING):
  -                log.info("Response streaming", source=__name__, streamed_event=event)
  -
  -            if isinstance(event, AgentFinalAnswer) and len(event.text) > 0:
  -                yield AgentFinalAnswer(
  -                    text=event.text[len_final_answer:],
  -                )
  -
  -                len_final_answer = len(event.text)
  -
  -            events.append(event)
  -
  -        if any(isinstance(e, AgentFinalAnswer) for e in events):
  -            pass # no-op
  -        elif any(isinstance(e, AgentToolAction) for e in events):
  -            yield events[-1]
  -        elif isinstance(events[-1], AgentUnknownAction):
  -            yield events[-1]
  +        yield AgentUnknownAction(text="a" * 20000)
  +        # 16338
  +        # async for event in astream:
  +        #     if is_feature_enabled(FeatureFlag.EXPANDED_AI_LOGGING):
  +        #         log.info("Response streaming", source=__name__, streamed_event=event)
  +
  +        #     if isinstance(event, AgentFinalAnswer) and len(event.text) > 0:
  +        #         yield AgentFinalAnswer(
  +        #             text=event.text[len_final_answer:],
  +        #         )
  +
  +        #         len_final_answer = len(event.text)
  +
  +        #     events.append(event)
  +
  +        # if any(isinstance(e, AgentFinalAnswer) for e in events):
  +        #     pass # no-op
  +        # elif any(isinstance(e, AgentToolAction) for e in events):
  +        #     yield events[-1]
  +        # elif isinstance(events[-1], AgentUnknownAction):
  +        #     yield events[-1]
  ```
- Run GDK
- Execute the chat command in a Rails console:

  ```ruby
  prompt_message = Gitlab::Llm::ChatMessage.new(
    content: "Hello",
    role: "user",
    user: User.first,
    ai_action: "chat",
    context: Gitlab::Llm::AiMessageContext.new(resource: User.first)
  )
  ai_prompt_class = nil
  options = { content: "Hello", extra_resource: {}, action: :chat }
  Gitlab::Llm::Completions::Chat.new(prompt_message, ai_prompt_class, options).execute
  ```