Skip to content

fix: preserve redacted_thinking blocks in multi-turn Bedrock conversations#2880

Open
laiman1107 wants to merge 8 commits intoopenai:mainfrom
laiman1107:fix/bedrock-redacted-thinking-blocks
Open

fix: preserve redacted_thinking blocks in multi-turn Bedrock conversations#2880
laiman1107 wants to merge 8 commits intoopenai:mainfrom
laiman1107:fix/bedrock-redacted-thinking-blocks

Conversation

@laiman1107
Copy link
Copy Markdown

@laiman1107 laiman1107 commented Apr 13, 2026

Problem

When using Claude with extended thinking in multi-turn agent conversations, thinking blocks are silently dropped during history reconstruction, causing providers to reject subsequent turns.

Two concrete failure modes, both caused by the same bug:

1. AWS Bedrock — redacted_thinking blocks

Bedrock returns redacted_thinking blocks with a data field (base64-encrypted content) instead of thinking/signature:

{"type": "redacted_thinking", "data": "SGVsbG8..."}

These blocks were completely invisible to the serialisation code — they have no "thinking" key and no "signature" key — so they were silently dropped. Bedrock then rejected the next turn with:

litellm.BadRequestError: BedrockException - {"message":"The model returned the following errors:
messages.1.content.26: `thinking` or `redacted_thinking` blocks in the latest assistant message
cannot be modified. These blocks must remain as they were in the original response."}

2. Anthropic API — display: "omitted" thinking blocks

When thinking is configured with display: "omitted", Claude returns blocks with an empty thinking field but a valid signature:

{"type": "thinking", "thinking": "", "signature": "EosnCkYICxIMMb3..."}

Reference: https://platform.claude.com/docs/en/build-with-claude/extended-thinking#redacted-thinking-blocks

These blocks were filtered out of content_items (because thinking == "" is falsy), meaning content_items was empty. The reconstruction guard and content_items then short-circuited and skipped the entire block — even though encrypted_content still held the signature.


Root cause (shared)

In message_to_output_items, blocks were decomposed into two separate fields:

  • reasoning_item.content — thinking text (filtered: skips empty and missing thinking)
  • reasoning_item.encrypted_content — signatures joined by "\n" (skips missing signature)

In items_to_messages, reconstruction required content_items to be non-empty as a guard. Any scenario where content_items was empty (omitted thinking, redacted thinking) caused the entire block list to be silently dropped.

The Anthropic API requirement is clear: thinking and redacted_thinking blocks must be passed back verbatim in multi-turn conversations. Any modification or omission is rejected.

Fix

Serialise the complete block list as JSON in encrypted_content so every block type survives the round-trip unchanged:

  • message_to_output_items: replace per-field extraction with json.dumps(blocks_as_dicts). Still populates content with visible thinking text for display/summary purposes.
  • items_to_messages: try json.loads(encrypted_content) first (new format, handles all block types). Fall back to the legacy "\n"-joined signatures format so any in-flight sessions are not broken. Replace and content_items guard with and encrypted_content — omitted/redacted blocks have no content items but do have encrypted_content.

Tests

  • Updated two existing assertions that checked encrypted_content == "sig1\nsig2" to verify the JSON round-trip instead.
  • test_redacted_thinking_blocks_preserved_across_turns: pure redacted_thinking block survives message_to_output_itemsitems_to_messages verbatim.
  • test_mixed_thinking_and_redacted_thinking_blocks_preserved: mixed thinking + redacted_thinking blocks are all preserved in original order.

All 44 tests in the affected files pass.

…sations

The previous serialisation stored only the thinking text and signature
from each block, silently discarding any block whose type is
"redacted_thinking".  redacted_thinking blocks carry a "data" field
(base64-encrypted content) instead of "thinking"/"signature", so they
were filtered out in both message_to_output_items and never reconstructed
in items_to_messages.

Providers like Bedrock enforce that thinking/redacted_thinking blocks in
the latest assistant message must be passed back verbatim on the next
turn.  With redacted_thinking blocks missing from the reconstructed
history, Bedrock rejected subsequent turns with:

  "thinking or redacted_thinking blocks in the latest assistant message
  cannot be modified. These blocks must remain as they were in the
  original response."

Fix: serialise the complete block list as JSON in encrypted_content so
every block type (thinking and redacted_thinking) survives the round-trip
unchanged.  A legacy fallback decodes the old "\n"-joined signatures
format so any in-flight sessions are not broken.

The condition that gated thinking-block reconstruction on content_items
being non-empty is also removed — redacted_thinking blocks produce no
content items, so the old guard prevented reconstruction entirely.

Tests: two new cases cover the pure-redacted_thinking path and the mixed
thinking + redacted_thinking path.
@github-actions github-actions bot added bug Something isn't working feature:chat-completions labels Apr 13, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a9dbad820d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Address review feedback: the previous change replaced
with  as the reconstruction guard, which silently
dropped thinking blocks for older persisted turns where signatures were
absent and only content text was stored.

Fix by using  so both paths
are covered:
- encrypted_content present: try JSON first, fall back to legacy
  '\n'-joined signatures format
- encrypted_content absent but content_items present: reconstruct
  signature-less thinking blocks from content text only, preserving
  the prior behaviour for those older sessions
@laiman1107
Copy link
Copy Markdown
Author

Good catch. The guard did regress the case where older persisted turns have content items but no encrypted_content (e.g. thinking blocks whose signature was absent).

Fixed in the latest commit by using as the gate, then branching inside:

  • encrypted_content present → try JSON, fall back to legacy -joined signatures
  • encrypted_content absent, content_items present → reconstruct signature-less thinking blocks from content text only (restores previous behaviour for those older sessions)

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4e46151014

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…locks

- Import ResponseReasoningItem for proper type narrowing
- Replace hasattr guards with assert isinstance(reasoning_item, ResponseReasoningItem)
  and cast(ResponseReasoningItem, ...) so mypy can verify attribute access
- Add explicit 'is not None' assertions before json.loads calls to satisfy
  the str | None -> str narrowing mypy requires
- Run ruff format to satisfy the format check
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 36eebafbda

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c126f17ab6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

The last-resort fallback in LitellmConverter produces dicts with only
a 'thinking' key and no 'type'.  Including these in the JSON payload
caused the replay-side validation (which requires every block to have
a 'type') to fall back to legacy signature parsing, misinterpreting the
whole JSON string as a newline-joined signature and producing malformed
thinking blocks.

Filter to only serialise blocks that carry a 'type' field, making the
serialisation and validation paths consistent.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a09b7e175d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Rather than filtering out blocks that lack a 'type' field (which silently
drops thinking content and can still cause Bedrock to reject subsequent
turns), inject 'type': 'thinking' for any block that has a 'thinking' key
but no 'type'.  This covers the last-resort fallback path in
LitellmConverter that emits {"thinking": str(block)} dicts.

Blocks that are not dicts, or have neither 'type' nor 'thinking', are
still discarded as unrecoverable.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 363a59484e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

reasoning_item.get('content', []) returns None when the key is present
but explicitly set to None (e.g. streamed Anthropic responses where
chatcmpl_stream_handler sets a signature without populating content).
The old 'and content_items' guard would short-circuit before iterating;
the new '(content_items or encrypted_content)' gate no longer does.

Use 'or []' to ensure content_items is always iterable regardless of
whether the key is absent or explicitly None.
@seratch seratch removed the bug Something isn't working label Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants