
feat(rootly): add rootly integration plugin#8877

Merged: klesh merged 14 commits into apache:main from dncrews:feat/rootly on May 16, 2026

Conversation

@dncrews (Contributor) commented May 14, 2026

Summary

This PR adds a new Rootly data-source plugin that collects incident-management data from Rootly and maps it into DevLake's ticket domain model.

Rootly Plugin Implementation (backend/plugins/rootly/):

  • Adds the full Rootly plugin implementation with connection, scope, scope config, API, migration, model, and blueprint pipeline support.
  • Adds Rootly connection management using Bearer token authentication, including create/list/update/delete/test connection APIs.
  • Adds remote scope discovery for Rootly services, including search and paginated service selection.
  • Collects Rootly services and incidents through Rootly's JSON:API endpoints.
  • Extracts Rootly incidents, services, and related users into tool-layer tables.
  • Converts Rootly services into DevLake boards and Rootly incidents into DevLake ticket issues, board issues, and issue assignees.
  • Maps Rootly incident statuses, severities, timestamps, creators, and lifecycle role users into DevLake domain fields.
  • Supports blueprint v2.0 pipeline plan generation for selected Rootly service scopes.
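As a rough illustration of the Bearer-token connection test, the request could be built along these lines. This is a sketch under assumptions, not the plugin's code: `newTestConnectionRequest` is a hypothetical helper, the `/users/me` path is the one a later commit in this PR settles on, and the endpoint value is only an example.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newTestConnectionRequest is a hypothetical helper that builds the
// "who am I" request used to validate a token. The endpoint is expected
// to already carry the API version prefix (for example ".../v1/").
func newTestConnectionRequest(endpoint, token string) (*http.Request, error) {
	target := strings.TrimRight(endpoint, "/") + "/users/me"
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return nil, err
	}
	// Bearer authentication: the token travels in the Authorization header.
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	// Example endpoint and token only; real values come from the connection.
	req, err := newTestConnectionRequest("https://rootly.example/v1/", "example-token")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

A caller would pass the request to an `http.Client` and treat any non-2xx status as a failed connection test.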

Config UI:

  • Registers Rootly in the config UI plugin list.
  • Adds the Rootly connection form with endpoint, token, proxy, and rate-limit configuration.
  • Adds the Rootly icon and stable documentation URLs.

Grafana Dashboard:

  • Adds a Rootly data-source dashboard for incident resolution status and MTTR metrics.
  • Filters dashboard data by Rootly service boards using the standard DevLake board/issue/board_issue domain tables.

Testing:

  • Adds unit coverage for incident query construction, extraction edge cases, status/severity mapping, lead-time calculation, and assignee deduplication.
  • Adds an e2e data-flow fixture test covering raw Rootly incidents through tool-layer extraction and ticket-domain conversion.
  • Registers Rootly in plugin table-info and server-startup plugin coverage.

Does this close any open issues?

Closes #8876

Screenshots

[Image: screenshot taken 2026-05-13 at 17:52:38]

Other Information

Data model mapping:

  • Rootly Service → DevLake Board
  • Rootly Incident → DevLake Issue
  • Rootly Incident Service relationship → DevLake BoardIssue
  • Rootly incident role users → DevLake User and IssueAssignee
  • Rootly status/severity fields → DevLake ticket status, priority, and severity

Validation run locally:

  • jq empty grafana/dashboards/Rootly.json
  • go test ./plugins/rootly/...

dncrews added 14 commits May 12, 2026 13:37
Add the plugin entry point, impl shell with all required plugin
interfaces except DataSourcePluginBlueprintV200 (wired in U2),
connection model with Bearer auth, service + scope-config + placeholder
incident/user/assignment models, and the initial migration script. API
resources and subtasks are intentionally empty and will be populated by
U2-U5.

Wire up connection test, remote scope list/search, scope CRUD, scope
sync state, and blueprint v200 endpoints. Remote scope listing speaks
JSON:API and page-based pagination to match the Rootly API shape.
TestConnection validates the bearer token via GET /users/current.
impl.go now wires api.Init and restores DataSourcePluginBlueprintV200
so the plugin can produce pipeline plans for selected Rootly services.

Add services_collector, services_extractor, and service_converter
subtasks. Collector fetches GET /services/{id} for the scoped service,
unwraps the JSON:API envelope, and populates _tool_rootly_services.
Converter emits one ticket.Board per service using the standard
domain-id generator, feeding into the existing TICKET pipeline.

Also harden remote-scope pagination termination to key off the page
we requested rather than meta.current_page so a response missing that
field does not silently truncate the service list.
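A minimal Go sketch of that termination rule, not the plugin's actual code: the loop tracks the 1-based page number it requested and compares that against the reported page count, so a zero or absent `current_page` cannot end the walk early. The `pageMeta` type, its `TotalPages` field, and `fetchPage` are assumptions made for illustration.

```go
package main

import "fmt"

// pageMeta is a hypothetical view of the pagination block in a list
// response. CurrentPage is deliberately never consulted below.
type pageMeta struct {
	CurrentPage int
	TotalPages  int
}

// hasNextPage reports whether another page should be requested. It keys
// off the page number the caller asked for, not meta.CurrentPage.
func hasNextPage(requestedPage int, meta pageMeta, itemsOnPage int) bool {
	if itemsOnPage == 0 {
		return false // an empty page always ends the walk
	}
	return requestedPage < meta.TotalPages
}

// fetchPage is a stand-in for the remote call: three pages of two
// services each, with current_page missing (zero) in every response.
func fetchPage(page int) ([]string, pageMeta) {
	items := []string{
		fmt.Sprintf("svc-%d-a", page),
		fmt.Sprintf("svc-%d-b", page),
	}
	return items, pageMeta{CurrentPage: 0, TotalPages: 3}
}

// listAllServices walks every page, starting from page 1.
func listAllServices() []string {
	var all []string
	for page := 1; ; page++ {
		items, meta := fetchPage(page)
		all = append(all, items...)
		if !hasNextPage(page, meta, len(items)) {
			return all
		}
	}
}

func main() {
	fmt.Println(len(listAllServices()), "services collected")
}
```

Because the requested page is a local counter, the loop still visits all three pages even though every simulated response reports `current_page` as zero.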
Finalize the Incident and User models and add role-specific user-id
fields (creator, started_by, mitigated_by, resolved_by, closed_by)
directly on the Incident row. Add the incidents_collector and
incidents_extractor subtasks. The collector is single-phase,
filter[services]-scoped and filter[updated_at][gt]-incremental; the
extractor unwraps the JSON:API envelope and pulls inline nested user
objects from the incident attributes, emitting deduplicated User rows
per incident.

Drop the Assignment table entirely. Rootly's incident data model is
role-per-lifecycle-event, not a list of assignees, so PagerDuty's
n-assignees shape does not fit. The schema and GetTablesInfo are
adjusted accordingly.

Add incidents_converter. For each tool-layer incident, emit a
ticket.Issue (Type=INCIDENT) with status, priority, lead time, and
resolution date derived from the corresponding Rootly fields; emit one
ticket.IssueAssignee per distinct role user (creator, started_by,
mitigated_by, resolved_by, closed_by); and emit a ticket.BoardIssue
linking the incident to its service board. This feeds DevLake's
existing DORA pipeline so change-failure rate and MTTR compute
correctly for teams running on Rootly.
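The "one assignee per distinct role user" rule can be sketched as follows. This is an illustrative approximation: the struct and field names are assumptions, and `RoleUserIds` only mirrors the helper name mentioned in a later commit of this PR.

```go
package main

import "fmt"

// Incident holds the role-specific user ids described above. The field
// names are assumptions for this sketch.
type Incident struct {
	CreatorId     string
	StartedById   string
	MitigatedById string
	ResolvedById  string
	ClosedById    string
}

// RoleUserIds returns the role user ids in lifecycle order, including
// empty entries for roles nobody has filled yet.
func (i Incident) RoleUserIds() []string {
	return []string{i.CreatorId, i.StartedById, i.MitigatedById, i.ResolvedById, i.ClosedById}
}

// distinctAssignees drops empty ids and repeats, keeping first-seen order,
// so a user who holds several roles yields a single assignee.
func distinctAssignees(i Incident) []string {
	seen := make(map[string]struct{})
	out := []string{}
	for _, id := range i.RoleUserIds() {
		if id == "" {
			continue
		}
		if _, dup := seen[id]; dup {
			continue
		}
		seen[id] = struct{}{}
		out = append(out, id)
	}
	return out
}

func main() {
	inc := Incident{CreatorId: "u1", StartedById: "u1", ResolvedById: "u2"}
	fmt.Println(distinctAssignees(inc))
}
```

An incident with no role users produces an empty slice, so no assignee rows would be emitted for it.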

Unknown incident statuses fall back to IN_PROGRESS with a warning log
rather than panicking (a deliberate divergence from the PagerDuty
plugin, motivated by Rootly's more volatile status enum). Severity
mapping accepts case-insensitive sev0-sev4; unknown severities are
preserved as-is.
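A hedged sketch of the fallback behavior described here. The status table and the upper-cased severity labels are placeholders, since the commit message does not list the actual mappings; only the fallback to IN_PROGRESS, the case-insensitive sev0-sev4 match, and the pass-through of unknown severities come from the text above. The `(mapped, known)` return shape follows a later commit in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	statusInProgress = "IN_PROGRESS"
	statusDone       = "DONE"
)

// statusTable is a hypothetical mapping used only for illustration.
var statusTable = map[string]string{
	"started":   statusInProgress,
	"mitigated": statusInProgress,
	"resolved":  statusDone,
	"closed":    statusDone,
}

// mapStatus returns the mapped status and whether the raw value was known.
// Unknown values fall back to IN_PROGRESS instead of panicking.
func mapStatus(raw string) (mapped string, known bool) {
	if s, ok := statusTable[raw]; ok {
		return s, true
	}
	return statusInProgress, false
}

// mapSeverity matches sev0-sev4 case-insensitively and returns any other
// value unchanged. Upper-casing the match is an assumption of this sketch.
func mapSeverity(raw string) string {
	switch strings.ToLower(raw) {
	case "sev0", "sev1", "sev2", "sev3", "sev4":
		return strings.ToUpper(raw)
	}
	return raw
}

func main() {
	for _, raw := range []string{"resolved", "something_new"} {
		status, known := mapStatus(raw)
		if !known {
			fmt.Printf("warning: unknown status %q, using %s\n", raw, status)
			continue
		}
		fmt.Println(raw, "->", status)
	}
	fmt.Println(mapSeverity("Sev2"), mapSeverity("critical"))
}
```

Returning the `known` flag keeps the logging decision with the caller, which makes the mapping itself easy to unit test.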

Guard computeLeadTime against resolved-before-started timestamps (clock
skew or backfill anomalies) by returning nil rather than the wraparound
garbage a naive uint() cast would produce. Tighten test coverage on the
dedup, key-fallback, and known-status-warning boundaries flagged during
code review.
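The guard might look roughly like this. It is a sketch only: the pointer-based signature and the minutes unit are assumptions, and `ts` is a hypothetical helper that exists just to build example timestamps.

```go
package main

import (
	"fmt"
	"time"
)

// ts builds a fixed-date UTC timestamp for the examples below.
func ts(hour, minute int) *time.Time {
	t := time.Date(2026, time.May, 1, hour, minute, 0, 0, time.UTC)
	return &t
}

// computeLeadTime returns the whole minutes between started and resolved,
// or nil when either timestamp is missing or resolved precedes started.
// Without the ordering check, converting the negative duration to uint
// would wrap around to a huge value.
func computeLeadTime(started, resolved *time.Time) *uint {
	if started == nil || resolved == nil {
		return nil
	}
	if resolved.Before(*started) {
		return nil
	}
	minutes := uint(resolved.Sub(*started) / time.Minute)
	return &minutes
}

func main() {
	if lt := computeLeadTime(ts(10, 0), ts(11, 30)); lt != nil {
		fmt.Println("lead time in minutes:", *lt)
	}
	skewed := computeLeadTime(ts(11, 30), ts(10, 0))
	fmt.Println("skewed timestamps yield nil:", skewed == nil)
}
```

Returning nil rather than zero lets downstream metrics distinguish "no usable lead time" from "resolved instantly".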
Add an end-to-end test that drives the extract and convert subtasks
over a crafted raw-incident fixture and verifies the tool-layer and
domain-layer output against snapshot CSVs. Fixtures cover every branch
of the status and severity mapping tables, the same-user-across-roles
dedup path, the zero-user case, and the safety-net filter that drops
incidents whose relationships point at a different service than the
one the task was scoped to.

Add rootly to the backend plugin startup test and the table-info test
so CI exercises it alongside the other plugins. Register RootlyConfig
in the config-ui plugin list so operators can create connections,
browse scopes, and run blueprints through the UI.

The cloud endpoint default includes the /v1/ API version prefix,
which is what Rootly's REST API actually expects; without it requests
land on the marketing site and return 404. The DOC_URL entries point
at a Configuration/Rootly docs page that still needs to be written.

Use the Rootly wordmark glyph as the plugin icon, rewritten as a
fill-aware (currentColor) SVG so the config-ui can recolor it for
selected/unselected states the same way it does every other plugin
icon.

The U1 init migration's archived Connection, Service, and ScopeConfig
models were missing columns contributed by the live models' embedded
helpers, so AutoMigrate produced tables the live models could not
read or write. Each gap surfaced as an "Unknown column" error from
MySQL the first time the table was touched:

- Connection was missing endpoint, proxy, rate_limit_per_hour
  (contributed by helper.RestConnection on the live struct).
- Service was missing scope_config_id (contributed by common.Scope).
- ScopeConfig was missing connection_id and name (contributed by
  common.ScopeConfig, which the archived base type does not include).

Fold the missing fields into the archived models so a single init
migration produces the correct schema. Since rootly has not been
deployed anywhere, keeping one init migration is cleaner than
chaining follow-up ALTERs; a fresh migrate creates the correct
tables in one pass.

The original plugin shape was built from docs summaries rather than an
actual response capture and diverged from the Rootly API in ways that
silently produced zero incidents. Reconcile every code path with
ground truth from the OpenAPI spec and a captured GET /v1/incidents
response:

- Connection test hits /users/me (Rootly's real "who am I" path); the
  original /users/current returns 404.
- Incident list filter is filter[service_ids]=<uuid>, not
  filter[services]=<uuid>. The latter exists but accepts names and
  silently matches nothing for a UUID.
- Role-bearing user fields (user, started_by, mitigated_by,
  resolved_by, closed_by) and severity are JSON:API response
  envelopes nested on attributes: {"data":{"id":...,"attributes":...}}.
  The previous flat NestedUser / SeverityAttrs shapes were reading the
  wrong paths, so those fields were always empty.
- Service membership lives on the sibling relationships block as
  JSON:API id+type pointers, not on attributes. The safety-net
  scope-filter check now reads from the right place.
- The incident resource does not have an urgency field. Drop the
  corresponding column from the model and archived schema.
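To make the envelope shapes in the list above concrete, here is a small decoding sketch using only `encoding/json`. The Go type names, the `services` relationship key, and the sample payload are assumptions for illustration; only the nested `{"data":{"id":...,"attributes":...}}` shape and the id+type pointers come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// userEnvelope matches a role field nested on attributes as
// {"data":{"id":...,"attributes":{...}}}. Data stays nil when the role
// is null or absent.
type userEnvelope struct {
	Data *struct {
		Id         string `json:"id"`
		Attributes struct {
			Name  string `json:"name"`
			Email string `json:"email"`
		} `json:"attributes"`
	} `json:"data"`
}

// serviceRef is a JSON:API id+type pointer from the relationships block.
type serviceRef struct {
	Id   string `json:"id"`
	Type string `json:"type"`
}

type incident struct {
	Id         string `json:"id"`
	Attributes struct {
		User      userEnvelope `json:"user"`
		StartedBy userEnvelope `json:"started_by"`
	} `json:"attributes"`
	Relationships struct {
		Services struct {
			Data []serviceRef `json:"data"`
		} `json:"services"`
	} `json:"relationships"`
}

// parseIncident decodes one incident resource object.
func parseIncident(raw []byte) (incident, error) {
	var inc incident
	err := json.Unmarshal(raw, &inc)
	return inc, err
}

// inScope is the safety-net check: does the incident reference the
// service this task was scoped to?
func inScope(inc incident, serviceId string) bool {
	for _, ref := range inc.Relationships.Services.Data {
		if ref.Id == serviceId {
			return true
		}
	}
	return false
}

// sample is a made-up payload, not a captured Rootly response.
const sample = `{
  "id": "inc-1",
  "attributes": {
    "user": {"data": {"id": "u1", "attributes": {"name": "Ada", "email": "ada@example.com"}}},
    "started_by": null
  },
  "relationships": {"services": {"data": [{"id": "svc-1", "type": "services"}]}}
}`

func main() {
	inc, err := parseIncident([]byte(sample))
	if err != nil {
		panic(err)
	}
	if u := inc.Attributes.User.Data; u != nil {
		fmt.Println("creator:", u.Id, u.Attributes.Name)
	}
	fmt.Println("started_by present:", inc.Attributes.StartedBy.Data != nil)
	fmt.Println("in scope for svc-1:", inScope(inc, "svc-1"))
}
```

Reading a flat `name` directly off `attributes.user` would decode without error and simply leave the field empty, which is consistent with the silent failure described above.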

Also harden the collector: split the ResponseParser / next-page logic
so pagination state is captured during parse (rather than re-reading
the already-drained response body in GetNextPageCustomData), and add
lightweight request/response diagnostics gated behind Debug logging.

Verified end-to-end against a live Rootly tenant: 3 of 6 scoped
services returned incidents, all 3 extracted and converted into
ticket.Issue rows with creator assignees and board linkage.
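The corrected incident filter can be captured in a small pure function, which also makes the parameter literal easy to pin in a unit test. This is a sketch under assumptions: the function signature, the `page[number]` and `page[size]` parameter names, and the RFC 3339 timestamp format are illustrative, while `filter[service_ids]`, `filter[updated_at][gt]`, and the 10000-page cap come from the commit messages in this PR.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"time"
)

// maxPages caps how far a single collection run will page.
const maxPages = 10000

// buildIncidentsQuery assembles the query string for listing incidents
// of one service. It has no side effects, so it can be tested directly.
func buildIncidentsQuery(serviceId string, since *time.Time, page, pageSize int) url.Values {
	q := url.Values{}
	// The service filter takes a UUID; filter[services] matches by name.
	q.Set("filter[service_ids]", serviceId)
	if since != nil {
		// Incremental runs only ask for incidents updated after `since`.
		q.Set("filter[updated_at][gt]", since.UTC().Format(time.RFC3339))
	}
	q.Set("page[number]", strconv.Itoa(page)) // 1-based (assumed name)
	q.Set("page[size]", strconv.Itoa(pageSize))
	return q
}

// withinPageCap reports whether the given 1-based page may be requested.
func withinPageCap(page int) bool {
	return page >= 1 && page <= maxPages
}

func main() {
	since := time.Date(2026, time.May, 1, 0, 0, 0, 0, time.UTC)
	q := buildIncidentsQuery("service-uuid", &since, 1, 100)
	fmt.Println(q.Encode())
	fmt.Println("page 10001 allowed:", withinPageCap(10001))
}
```

Note that `url.Values.Encode` percent-encodes the square brackets, so the literal parameter names are best asserted with `q.Get` rather than by matching the encoded string.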
Match PagerDuty's comment density. Keep the few comments that flag
non-obvious invariants (archived-base field overrides, 1-based
pagination, deliberate divergence from PagerDuty's panic-on-unknown
behavior, clock-skew guard).

Apply fixes from a multi-lens code review pass:

- API: rename swagger {serviceId} to {scopeId} to match registered
  route; remove dead Proxy handler.
- Models: add Sanitize() on RootlyConn; add RoleUserIds() helper on
  Incident; index ServiceId; drop unused Url field on User; remove
  dead RootlyResponse/ApiUserResponse types.
- Migrations: mirror live schema in archived models (index on
  service_id; drop user.url).
- Collector: switch pagination to reqData.Pager.Page (avoids
  divide-by-zero), cap at 10000 pages, extract buildIncidentsQuery
  as a pure helper, drop unreachable lastPageEmpty branch and unused
  TotalCount, remove diagnostic logs; add unit test pinning the
  filter[service_ids] param literal as a regression guard.
- Services: preserve ScopeConfigId across re-collections; declare
  ProductTables on collector and extractor metas.
- Extractor: skip emitting User rows with neither name nor email so
  sibling scope tasks can fill in fuller data; use generic resolve()
  for SequentialId; type ServiceRef as a named struct.
- Converter: consolidate mapStatus to return (mapped, known); use
  Incident.RoleUserIds() instead of an inline slice.
- impl.go: comment justifying services-before-incidents subtask
  ordering.
- e2e: rewrite raw incident fixtures to JSON:API envelope shape;
  regenerate snapshots (drop urgency column).
@dosubot (bot) added the size:XXL, add-a-plugin, component/plugins, and pr-type/feature-development labels May 14, 2026

@klesh (Contributor) left a comment

LGTM
Thanks for your contribution.

@klesh klesh merged commit aff3487 into apache:main May 16, 2026
15 checks passed
@dncrews dncrews deleted the feat/rootly branch May 18, 2026 16:41

Development

Successfully merging this pull request may close these issues.

[Feature][Plugin] Add Rootly Integration

2 participants