Skip to content

fix(watsonx): emit response events with logger#4153

Open
pragnyanramtha wants to merge 2 commits into
traceloop:mainfrom
pragnyanramtha:codex/fix-watsonx-response-events-clean
Open

fix(watsonx): emit response events with logger#4153
pragnyanramtha wants to merge 2 commits into
traceloop:mainfrom
pragnyanramtha:codex/fix-watsonx-response-events-clean

Conversation

@pragnyanramtha
Copy link
Copy Markdown

@pragnyanramtha pragnyanramtha commented May 17, 2026

Summary

  • pass the Watsonx event logger through response event emission paths
  • normalize dict and list-shaped Watsonx responses before emitting gen_ai.choice events
  • add regression coverage for dict, batch/list, and stream response event logs

Validation

  • uv run pytest tests/test_events.py -q
  • uv run pytest tests/ -q
  • uv run ruff check .

Summary by CodeRabbit

  • Bug Fixes

    • Consistent emission of choice events for single, batched, and streaming responses; assistant role and default finish reason are applied uniformly.
  • Tests

    • Added tests covering single, indexed batched, streamed responses and skipping invalid batched items.
  • Chores

    • Redacted real upload URLs in test cassettes, replacing them with placeholder URLs.

Review Change Stack

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented May 17, 2026

CLA assistant check
All committers have signed the CLA.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4e977813-9a1f-4590-b895-76f915d490f4

📥 Commits

Reviewing files that changed from the base of the PR and between f34d62d and d5ba8b6.

📒 Files selected for processing (1)
  • packages/opentelemetry-instrumentation-watsonx/tests/test_events.py

📝 Walkthrough

Walkthrough

Refactors _emit_response_events to accept an event_logger, normalize single/list responses (flatten results), filter non-dict items, and emit gen_ai.choice/ChoiceEvent logs. Updates _handle_stream_response to pass event_logger. Adds tests and redacts pre-signed upload URLs in several test cassettes.

Changes

Response Event Emission and Testing

Layer / File(s) Summary
Response event emission refactoring
packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/__init__.py
_emit_response_events is refactored to accept event_logger, normalize single dict or list responses into a messages list, filter non-dict items, flatten results when present, and emit ChoiceEvent / gen_ai.choice log records. _handle_stream_response updated to pass event_logger to the refactored function.
Test infrastructure and validation
packages/opentelemetry-instrumentation-watsonx/tests/test_events.py
New test module adding RecordingSpan test-double, event_logger fixture (uses LoggerProvider + InMemoryLogExporter), and tests asserting emitted gen_ai.choice log records for single response, list responses (indices 0/1), invalid batched inputs, and stream responses (mapping generated_text and finish reason).

Test Cassette Redactions

Layer / File(s) Summary
Redact pre-signed upload URLs in cassettes
packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_file_attachments_mocked.yaml, packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_in_memory_attachment.yaml, packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_mixed_attachments.yaml
Replaced captured upload_url pre-signed S3 URLs with https://example.invalid/redacted-upload-url placeholders in multiple test cassettes; preserved storage_key, expires_at, and method fields.

🎯 3 (Moderate) | ⏱️ ~20 minutes

🐰 I hopped through code with eager paws,
Normalized responses, smoothed out the laws,
Lists or single, each choice now logs,
Streams whisper events through tidy logs,
Tests confirm the meadow's cause.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(watsonx): emit response events with logger' directly describes the main change: passing the event logger through Watsonx response event emission and normalizing response handling.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@pragnyanramtha pragnyanramtha marked this pull request as ready for review May 17, 2026 02:29
Copilot AI review requested due to automatic review settings May 17, 2026 02:29
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Emit Watsonx response choice events through the provided event logger, add regression coverage for single, list, and stream response shapes, and keep dataset attachment cassette upload URLs redacted with deterministic non-live placeholders.
@pragnyanramtha pragnyanramtha force-pushed the codex/fix-watsonx-response-events-clean branch from 8ab3c08 to f34d62d Compare May 20, 2026 08:21
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-watsonx/tests/test_events.py (1)

58-83: ⚡ Quick win

Add a regression test for mixed invalid entries in batched responses.

Please add one case where the list includes non-dict items and results contains non-dict elements, then assert only valid dict messages emit logs. This locks in the new filtering behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opentelemetry-instrumentation-watsonx/tests/test_events.py` around
lines 58 - 83, Add a regression test in tests/test_events.py alongside
test_emit_response_events_writes_choice_log_for_list_response that calls
_emit_response_events with a mixed list containing non-dict items (e.g., a
string or None) and entries whose "results" list includes non-dict elements;
then use the event_logger fixture/exporter to collect finished logs and assert
that only the valid dict messages produced gen_ai.choice logs (with correct
index/message/finish_reason) and that invalid entries produced no logs.
Reference _emit_response_events, event_logger, and exporter when locating where
to add the new assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@packages/opentelemetry-instrumentation-watsonx/tests/test_events.py`:
- Around line 58-83: Add a regression test in tests/test_events.py alongside
test_emit_response_events_writes_choice_log_for_list_response that calls
_emit_response_events with a mixed list containing non-dict items (e.g., a
string or None) and entries whose "results" list includes non-dict elements;
then use the event_logger fixture/exporter to collect finished logs and assert
that only the valid dict messages produced gen_ai.choice logs (with correct
index/message/finish_reason) and that invalid entries produced no logs.
Reference _emit_response_events, event_logger, and exporter when locating where
to add the new assertions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b3f80768-4106-44e0-9a69-2b0c3e8e0066

📥 Commits

Reviewing files that changed from the base of the PR and between 8ab3c08 and f34d62d.

📒 Files selected for processing (5)
  • packages/opentelemetry-instrumentation-watsonx/opentelemetry/instrumentation/watsonx/__init__.py
  • packages/opentelemetry-instrumentation-watsonx/tests/test_events.py
  • packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_file_attachments_mocked.yaml
  • packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_in_memory_attachment.yaml
  • packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_mixed_attachments.yaml
✅ Files skipped from review due to trivial changes (2)
  • packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_in_memory_attachment.yaml
  • packages/traceloop-sdk/tests/datasets/cassettes/test_dataset_with_attachments/test_create_dataset_with_mixed_attachments.yaml

@pragnyanramtha
Copy link
Copy Markdown
Author

@coderabbitai resume

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 20, 2026

✅ Actions performed

Reviews resumed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants