bug(core): same-index tool_call chunks in one streamed delta can split into blank tool_calls and invalid_tool_calls #36627

@lyc280705

Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find similar issues.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain.
  • I posted a self-contained, minimal, reproducible example.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-core
  • Other / not sure / general

Related Issues / PRs

Related symptom, but different root cause and code path:

Those issues are about the Responses API _advance(...) index handling. This report is about AIMessageChunk.init_tool_calls() on the chat.completions-style path when a provider emits multiple same-index tool-call fragments inside a single streamed delta.

Reproduction Steps / Example Code (Python)

from langchain_core.messages import AIMessageChunk
from langchain_core.messages.tool import tool_call_chunk as create_tool_call_chunk

first = AIMessageChunk(
    content="",
    tool_call_chunks=[
        create_tool_call_chunk(name="search", args="", id="id1", index=0),
        create_tool_call_chunk(name=None, args="{", id=None, index=0),
    ],
)

merged = first + AIMessageChunk(
    content="",
    tool_call_chunks=[
        create_tool_call_chunk(
            name=None,
            args='"query": "bar"}',
            id=None,
            index=0,
        )
    ],
)

print("tool_call_chunks=", merged.tool_call_chunks)
print("tool_calls=", merged.tool_calls)
print("invalid_tool_calls=", merged.invalid_tool_calls)

Actual output on current stable:

tool_call_chunks= [
    {
        "name": "search",
        "args": '"query": "bar"}',
        "id": "id1",
        "index": 0,
        "type": "tool_call_chunk",
    },
    {
        "name": None,
        "args": "{",
        "id": None,
        "index": 0,
        "type": "tool_call_chunk",
    },
]
tool_calls= [{"name": "", "args": {}, "id": None, "type": "tool_call"}]
invalid_tool_calls= [
    {
        "name": "search",
        "args": '"query": "bar"}',
        "id": "id1",
        "error": None,
        "type": "invalid_tool_call",
    }
]

Expected behavior:

tool_call_chunks= [
    {
        "name": "search",
        "args": '{"query": "bar"}',
        "id": "id1",
        "index": 0,
        "type": "tool_call_chunk",
    }
]
tool_calls= [
    {"name": "search", "args": {"query": "bar"}, "id": "id1", "type": "tool_call"}
]
invalid_tool_calls= []

Description

The failure mode appears when tool_call_chunks already contains multiple fragments for the same logical tool call before AIMessageChunk.init_tool_calls() parses them.

A real provider payload that triggered this looked like this on the first streamed delta:

{
  "tool_calls": [
    {
      "index": 0,
      "id": "call_...",
      "type": "function",
      "function": {
        "name": "bocha_websearch_tool",
        "arguments": ""
      }
    },
    {
      "index": 0,
      "function": {
        "arguments": "{"
      }
    }
  ]
}

LangChain correctly turns those into two ToolCallChunks, but later stream continuations are merged into the first entry while the dangling second same-index fragment remains in the list. When init_tool_calls() then parses the list, the incomplete fragment becomes a blank tool_call, and the real call falls into invalid_tool_calls.

The underlying problem is that init_tool_calls() parses self.tool_call_chunks as-is, without first normalizing same-index continuation fragments that are already present inside a single AIMessageChunk.
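To illustrate why parsing the list as-is produces this split, here is a minimal standalone sketch on plain dicts. It is not LangChain's code: `lenient_parse` and `split_chunks` are hypothetical names, and `lenient_parse` is only a crude stand-in for a best-effort partial-JSON parser (it tries the string as-is, then with one closing brace appended).

```python
import json
from typing import Any, Optional


def lenient_parse(args: str) -> Optional[dict]:
    """Hypothetical stand-in for a best-effort JSON parser (assumption)."""
    if not args:
        return {}
    # Try the raw string, then the string with one closing brace appended.
    for candidate in (args, args + "}"):
        try:
            value = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(value, dict):
            return value
    return None


def split_chunks(chunks: list[dict[str, Any]]) -> tuple[list[dict], list[dict]]:
    """Parse every chunk independently, with no same-index normalization."""
    tool_calls, invalid = [], []
    for chunk in chunks:
        parsed = lenient_parse(chunk["args"] or "")
        if parsed is not None:
            tool_calls.append(
                {"name": chunk["name"] or "", "args": parsed, "id": chunk["id"]}
            )
        else:
            invalid.append(
                {"name": chunk["name"], "args": chunk["args"], "id": chunk["id"]}
            )
    return tool_calls, invalid


# The merged chunk list from the "Actual output" above.
merged_chunks = [
    {"name": "search", "args": '"query": "bar"}', "id": "id1", "index": 0},
    {"name": None, "args": "{", "id": None, "index": 0},
]
tool_calls, invalid_tool_calls = split_chunks(merged_chunks)
print("tool_calls=", tool_calls)
print("invalid_tool_calls=", invalid_tool_calls)
```

Under these assumptions the dangling `"{"` fragment parses to an empty dict and becomes a nameless tool call, while the entry carrying the name and id has no opening brace and cannot be parsed, which mirrors the output reported above.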

Proposed fix

Before best-effort JSON parsing in AIMessageChunk.init_tool_calls(), normalize self.tool_call_chunks by merging same-index continuation fragments using the existing merge_lists() semantics. That preserves the current behavior for truly parallel same-index tool calls with different non-empty ids, while fixing the case where a provider emits multiple fragments for one logical tool call inside the same delta.
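A rough sketch of the normalization idea, written against plain dicts so it runs without LangChain installed. `normalize_same_index_chunks` and `_ids_compatible` are hypothetical names, and the merge rule here (same integer index, and ids either missing or equal) is my approximation of the semantics described above, not a copy of `merge_lists()`.

```python
import json
from typing import Any, Optional

Chunk = dict[str, Any]


def _ids_compatible(left: Optional[str], right: Optional[str]) -> bool:
    # Two different non-empty ids are treated as distinct parallel calls.
    return not left or not right or left == right


def normalize_same_index_chunks(chunks: list[Chunk]) -> list[Chunk]:
    """Fold same-index continuation fragments into one chunk, in list order."""
    merged: list[Chunk] = []
    for chunk in chunks:
        target = None
        if isinstance(chunk.get("index"), int):
            for candidate in merged:
                if candidate.get("index") == chunk["index"] and _ids_compatible(
                    candidate.get("id"), chunk.get("id")
                ):
                    target = candidate
                    break
        if target is None:
            merged.append(dict(chunk))  # copy so the input is not mutated
            continue
        target["name"] = ((target.get("name") or "") + (chunk.get("name") or "")) or None
        target["args"] = (target.get("args") or "") + (chunk.get("args") or "")
        target["id"] = target.get("id") or chunk.get("id")
    return merged


# First streamed delta from the reproduction: two fragments, same index.
first_delta = [
    {"name": "search", "args": "", "id": "id1", "index": 0},
    {"name": None, "args": "{", "id": None, "index": 0},
]
normalized_first = normalize_same_index_chunks(first_delta)

# A later continuation then lands on a single entry.
continuation = {"name": None, "args": '"query": "bar"}', "id": None, "index": 0}
final = normalize_same_index_chunks(normalized_first + [continuation])
print(final)
print(json.loads(final[0]["args"]))

# Parallel calls with different non-empty ids stay separate.
parallel = normalize_same_index_chunks(
    [
        {"name": "a", "args": "{}", "id": "id1", "index": 0},
        {"name": "b", "args": "{}", "id": "id2", "index": 0},
    ]
)
```

One caveat with this sketch: it concatenates `args` in list order, so it has to run on each delta's own fragments before later continuations are merged in. Applied to the already-merged list from the "Actual output" above, it would append `"{"` after `'"query": "bar"}'` and still yield unparseable arguments.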

System Info

  • OS: macOS
  • Python: 3.12.8
  • langchain: 1.2.15
  • langchain-core: 1.2.26
  • langchain-openai: 1.1.12
