fix(watsonx): emit response events with logger by pragnyanramtha · Pull Request #4153 · traceloop/openllmetry

pragnyanramtha · 2026-05-17T02:26:35Z

Summary

pass the Watsonx event logger through response event emission paths
normalize dict and list-shaped Watsonx responses before emitting gen_ai.choice events
add regression coverage for dict, batch/list, and stream response event logs

Validation

uv run pytest tests/test_events.py -q
uv run pytest tests/ -q
uv run ruff check .

Summary by CodeRabbit

Bug Fixes
- Consistent emission of choice events for single, batched, and streaming responses; assistant role and default finish reason are applied uniformly.
Tests
- Added tests covering single, indexed batched, streamed responses and skipping invalid batched items.
Chores
- Redacted real upload URLs in test cassettes, replacing them with placeholder URLs.

CLAassistant · 2026-05-17T02:26:40Z

All committers have signed the CLA.

coderabbitai · 2026-05-17T02:26:41Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4e977813-9a1f-4590-b895-76f915d490f4

📥 Commits

Reviewing files that changed from the base of the PR and between f34d62d and d5ba8b6.

📒 Files selected for processing (1)

packages/opentelemetry-instrumentation-watsonx/tests/test_events.py

📝 Walkthrough

Walkthrough

Refactors _emit_response_events to accept an event_logger, normalize single/list responses (flatten results), filter non-dict items, and emit gen_ai.choice/ChoiceEvent logs. Updates _handle_stream_response to pass event_logger. Adds tests and redacts pre-signed upload URLs in several test cassettes.

Changes

Response Event Emission and Testing

Layer / File(s)	Summary
Response event emission refactoring `packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/__init__.py`	`_emit_response_events` is refactored to accept `event_logger`, normalize single dict or list responses into a messages list, filter non-dict items, flatten `results` when present, and emit `ChoiceEvent` / `gen_ai.choice` log records. `_handle_stream_response` updated to pass `event_logger` to the refactored function.
Test infrastructure and validation `packages/opentelemetry-instrumentation-watsonx/tests/test_events.py`	New test module adding `RecordingSpan` test-double, `event_logger` fixture (uses `LoggerProvider` + `InMemoryLogExporter`), and tests asserting emitted `gen_ai.choice` log records for single response, list responses (indices 0/1), invalid batched inputs, and stream responses (mapping `generated_text` and finish reason).

Test Cassette Redactions

Layer / File(s)	Summary
Redact pre-signed upload URLs in cassettes `packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_file_attachments_mocked.yaml`, `packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_in_memory_attachment.yaml`, `packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_mixed_attachments.yaml`	Replaced captured `upload_url` pre-signed S3 URLs with `https://example.invalid/redacted-upload-url` placeholders in multiple test cassettes; preserved `storage_key`, `expires_at`, and `method` fields.

🎯 3 (Moderate) | ⏱️ ~20 minutes

🐰 I hopped through code with eager paws,
Normalized responses, smoothed out the laws,
Lists or single, each choice now logs,
Streams whisper events through tidy logs,
Tests confirm the meadow's cause.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'fix(watsonx): emit response events with logger' directly describes the main change: passing the event logger through Watsonx response event emission and normalizing response handling.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Emit Watsonx response choice events through the provided event logger, add regression coverage for single, list, and stream response shapes, and keep dataset attachment cassette upload URLs redacted with deterministic non-live placeholders.

coderabbitai

🧹 Nitpick comments (1)

packages/opentelemetry-instrumentation-watsonx/tests/test_events.py (1)
58-83: ⚡ Quick win

Add a regression test for mixed invalid entries in batched responses.

Please add one case where the list includes non-dict items and results contains non-dict elements, then assert only valid dict messages emit logs. This locks in the new filtering behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opentelemetry-instrumentation-watsonx/tests/test_events.py` around
lines 58 - 83, Add a regression test in tests/test_events.py alongside
test_emit_response_events_writes_choice_log_for_list_response that calls
_emit_response_events with a mixed list containing non-dict items (e.g., a
string or None) and entries whose "results" list includes non-dict elements;
then use the event_logger fixture/exporter to collect finished logs and assert
that only the valid dict messages produced gen_ai.choice logs (with correct
index/message/finish_reason) and that invalid entries produced no logs.
Reference _emit_response_events, event_logger, and exporter when locating where
to add the new assertions.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@packages/opentelemetry-instrumentation-watsonx/tests/test_events.py`:
- Around line 58-83: Add a regression test in tests/test_events.py alongside
test_emit_response_events_writes_choice_log_for_list_response that calls
_emit_response_events with a mixed list containing non-dict items (e.g., a
string or None) and entries whose "results" list includes non-dict elements;
then use the event_logger fixture/exporter to collect finished logs and assert
that only the valid dict messages produced gen_ai.choice logs (with correct
index/message/finish_reason) and that invalid entries produced no logs.
Reference _emit_response_events, event_logger, and exporter when locating where
to add the new assertions.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b3f80768-4106-44e0-9a69-2b0c3e8e0066

📥 Commits

Reviewing files that changed from the base of the PR and between 8ab3c08 and f34d62d.

📒 Files selected for processing (5)

packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/__init__.py
packages/opentelemetry-instrumentation-watsonx/tests/test_events.py
packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_file_attachments_mocked.yaml
packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_in_memory_attachment.yaml
packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_mixed_attachments.yaml

✅ Files skipped from review due to trivial changes (2)

packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_in_memory_attachment.yaml
packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_mixed_attachments.yaml

pragnyanramtha · 2026-05-20T17:10:08Z

@coderabbitai resume

coderabbitai · 2026-05-20T17:10:16Z

✅ Actions performed

Reviews resumed.

pragnyanramtha marked this pull request as ready for review May 17, 2026 02:29

Copilot AI review requested due to automatic review settings May 17, 2026 02:29

Copilot AI reviewed May 17, 2026

View reviewed changes

pragnyanramtha force-pushed the codex/fix-watsonx-response-events-clean branch from 8ab3c08 to f34d62d Compare May 20, 2026 08:21

coderabbitai Bot reviewed May 20, 2026

View reviewed changes

test(watsonx): cover invalid batched response entries

d5ba8b6

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(watsonx): emit response events with logger#4153

fix(watsonx): emit response events with logger#4153
pragnyanramtha wants to merge 2 commits into
traceloop:mainfrom
pragnyanramtha:codex/fix-watsonx-response-events-clean

pragnyanramtha commented May 17, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

CLAassistant commented May 17, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented May 17, 2026 •

edited

Loading

Walkthrough

Changes

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

pragnyanramtha commented May 20, 2026

Uh oh!

coderabbitai Bot commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

pragnyanramtha commented May 17, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Summary by CodeRabbit

Uh oh!

CLAassistant commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

pragnyanramtha commented May 20, 2026

Uh oh!

coderabbitai Bot commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

pragnyanramtha commented May 17, 2026 •

edited by coderabbitai Bot

Loading

CLAassistant commented May 17, 2026 •

edited

Loading

coderabbitai Bot commented May 17, 2026 •

edited

Loading