feat(rootly): add rootly integration plugin #8877
Merged
Add the plugin entry point, impl shell with all required plugin interfaces except DataSourcePluginBlueprintV200 (wired in U2), connection model with Bearer auth, service + scope-config + placeholder incident/user/assignment models, and the initial migration script. API resources and subtasks are intentionally empty and will be populated by U2-U5.
Wire up connection test, remote scope list/search, scope CRUD, scope sync state, and blueprint v200 endpoints. Remote scope listing speaks JSON:API and page-based pagination to match the Rootly API shape. TestConnection validates the bearer token via GET /users/current. impl.go now wires api.Init and restores DataSourcePluginBlueprintV200 so the plugin can produce pipeline plans for selected Rootly services.
Add services_collector, services_extractor, and service_converter
subtasks. Collector fetches GET /services/{id} for the scoped service,
unwraps the JSON:API envelope, and populates _tool_rootly_services.
Converter emits one ticket.Board per service using the standard
domain-id generator, feeding into the existing TICKET pipeline.
Also harden remote-scope pagination termination to key off the page
we requested rather than meta.current_page so a response missing that
field does not silently truncate the service list.
Finalize the Incident and User models and add role-specific user-id fields (creator, started_by, mitigated_by, resolved_by, closed_by) directly on the Incident row. Add the incidents_collector and incidents_extractor subtasks. The collector is single-phase, filter[services]-scoped and filter[updated_at][gt]-incremental; the extractor unwraps the JSON:API envelope and pulls inline nested user objects from the incident attributes, emitting deduplicated User rows per incident. Drop the Assignment table entirely. Rootly's incident data model is role-per-lifecycle-event, not a list of assignees, so PagerDuty's n-assignees shape does not fit. The schema and GetTablesInfo are adjusted accordingly.
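To illustrate the role-per-lifecycle-event shape, here is a sketch of an incident row carrying one user id per role, plus a helper that collapses them into distinct ids. The struct, its field names, and `distinctRoleUserIds` are hypothetical stand-ins and not the plugin's model.

```go
package main

import "fmt"

// incidentRow is a hypothetical, trimmed-down incident with one user id per role.
type incidentRow struct {
	Id                string
	CreatorUserId     string
	StartedByUserId   string
	MitigatedByUserId string
	ResolvedByUserId  string
	ClosedByUserId    string
}

// distinctRoleUserIds returns the non-empty role user ids in role order,
// keeping only the first occurrence when one user holds several roles.
func distinctRoleUserIds(inc incidentRow) []string {
	roles := []string{
		inc.CreatorUserId,
		inc.StartedByUserId,
		inc.MitigatedByUserId,
		inc.ResolvedByUserId,
		inc.ClosedByUserId,
	}
	seen := make(map[string]struct{}, len(roles))
	var out []string
	for _, id := range roles {
		if id == "" {
			continue // role not filled on this incident
		}
		if _, dup := seen[id]; dup {
			continue
		}
		seen[id] = struct{}{}
		out = append(out, id)
	}
	return out
}

func main() {
	inc := incidentRow{Id: "inc-1", CreatorUserId: "u1", StartedByUserId: "u1", ResolvedByUserId: "u2"}
	fmt.Println(distinctRoleUserIds(inc))
}
```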
Add incidents_converter. For each tool-layer incident, emit a ticket.Issue (Type=INCIDENT) with status, priority, lead time, and resolution date derived from the corresponding Rootly fields; emit one ticket.IssueAssignee per distinct role user (creator, started_by, mitigated_by, resolved_by, closed_by); and emit a ticket.BoardIssue linking the incident to its service board. This feeds DevLake's existing DORA pipeline so change-failure rate and MTTR compute correctly for teams running on Rootly. Unknown incident statuses fall back to IN_PROGRESS with a warning log rather than panicking (a deliberate divergence from the PagerDuty plugin, motivated by Rootly's more volatile status enum). Severity mapping accepts case-insensitive sev0-sev4; unknown severities are preserved as-is. Guard computeLeadTime against resolved-before-started timestamps (clock skew or backfill anomalies) by returning nil rather than the wraparound garbage a naive uint() cast would produce. Tighten test coverage on the dedup, key-fallback, and known-status-warning boundaries flagged during code review.
Add an end-to-end test that drives the extract and convert subtasks over a crafted raw-incident fixture and verifies the tool-layer and domain-layer output against snapshot CSVs. Fixtures cover every branch of the status and severity mapping tables, the same-user-across-roles dedup path, the zero-user case, and the safety-net filter that drops incidents whose relationships point at a different service than the one the task was scoped to.
Add rootly to the backend plugin startup test and the table-info test so CI exercises it alongside the other plugins. Register RootlyConfig in the config-ui plugin list so operators can create connections, browse scopes, and run blueprints through the UI. The cloud endpoint default includes the /v1/ API version prefix, which is what Rootly's REST API actually expects; without it requests land on the marketing site and return 404. The DOC_URL entries point at a Configuration/Rootly docs page that still needs to be written. Use the Rootly wordmark glyph as the plugin icon, rewritten as a fill-aware (currentColor) SVG so the config-ui can recolor it for selected/unselected states the same way it does every other plugin icon.
The U1 init migration's archived Connection, Service, and ScopeConfig models were missing columns contributed by the live models' embedded helpers, so AutoMigrate produced tables the live models could not read or write. Each gap surfaced as an "Unknown column" error from MySQL the first time the table was touched:
- Connection was missing endpoint, proxy, rate_limit_per_hour (contributed by helper.RestConnection on the live struct).
- Service was missing scope_config_id (contributed by common.Scope).
- ScopeConfig was missing connection_id and name (contributed by common.ScopeConfig, which the archived base type does not include).
Fold the missing fields into the archived models so a single init migration produces the correct schema. Since rootly has not been deployed anywhere, keeping one init migration is cleaner than chaining follow-up ALTERs; a fresh migrate creates the correct tables in one pass.
The original plugin shape was built from docs summaries rather than an
actual response capture and diverged from the Rootly API in ways that
silently produced zero incidents. Reconcile every code path with
ground truth from the OpenAPI spec and a captured GET /v1/incidents
response:
- Connection test hits /users/me (Rootly's real "who am I" path); the
original /users/current returns 404.
- Incident list filter is filter[service_ids]=<uuid>, not
filter[services]=<uuid>. The latter exists but accepts names and
silently matches nothing for a UUID.
- Role-bearing user fields (user, started_by, mitigated_by,
resolved_by, closed_by) and severity are JSON:API response
envelopes nested on attributes: {"data":{"id":...,"attributes":...}}.
The previous flat NestedUser / SeverityAttrs shapes were reading the
wrong paths, so those fields were always empty.
- Service membership lives on the sibling relationships block as
JSON:API id+type pointers, not on attributes. The safety-net
scope-filter check now reads from the right place.
- The incident resource does not have an urgency field. Drop the
corresponding column from the model and archived schema.
Also harden the collector: split the ResponseParser / next-page logic
so pagination state is captured during parse (rather than re-reading
the already-drained response body in GetNextPageCustomData), and add
lightweight request/response diagnostics gated behind Debug logging.
Verified end-to-end against a live Rootly tenant: 3 of 6 scoped
services returned incidents, all 3 extracted and converted into
ticket.Issue rows with creator assignees and board linkage.
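A sketch of decoding the shapes described in this commit: role users nested on attributes as {"data":{"id":...,"attributes":...}} envelopes, and service membership on the sibling relationships block. All type names, the helper functions, and the sample payload are hypothetical; the attribute names `title`, `name`, and `email` are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resource and envelope model a nested JSON:API object: {"data":{"id":..,"attributes":..}}.
type resource[T any] struct {
	Id         string `json:"id"`
	Type       string `json:"type"`
	Attributes T      `json:"attributes"`
}

type envelope[T any] struct {
	Data *resource[T] `json:"data"` // stays nil when the field is null or absent
}

type userAttrs struct {
	Name  string `json:"name"`
	Email string `json:"email"`
}

// ref is a JSON:API id+type pointer as found under relationships.
type ref struct {
	Id   string `json:"id"`
	Type string `json:"type"`
}

type incident struct {
	Id         string `json:"id"`
	Attributes struct {
		Title     string              `json:"title"`
		User      envelope[userAttrs] `json:"user"`
		StartedBy envelope[userAttrs] `json:"started_by"`
	} `json:"attributes"`
	Relationships struct {
		Services struct {
			Data []ref `json:"data"`
		} `json:"services"`
	} `json:"relationships"`
}

func decodeIncident(raw []byte) (incident, error) {
	var inc incident
	err := json.Unmarshal(raw, &inc)
	return inc, err
}

// belongsToService is the safety-net check: membership is read from relationships.
func belongsToService(inc incident, serviceId string) bool {
	for _, r := range inc.Relationships.Services.Data {
		if r.Id == serviceId {
			return true
		}
	}
	return false
}

// sample is a made-up payload, not a captured Rootly response.
const sample = `{
  "id": "inc-1", "type": "incidents",
  "attributes": {
    "title": "Checkout latency",
    "user": {"data": {"id": "u-1", "type": "users",
      "attributes": {"name": "Ada", "email": "ada@example.com"}}},
    "started_by": null
  },
  "relationships": {"services": {"data": [{"id": "svc-1", "type": "services"}]}}
}`

func main() {
	inc, err := decodeIncident([]byte(sample))
	if err != nil {
		panic(err)
	}
	if u := inc.Attributes.User.Data; u != nil {
		fmt.Println("creator:", u.Id, u.Attributes.Name)
	}
	fmt.Println("in scope:", belongsToService(inc, "svc-1"))
}
```

Reading the user from a flat path such as attributes.user.name would leave the field empty against this shape, which matches the "always empty" symptom the commit describes.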
Match PagerDuty's comment density. Keep the few comments that flag non-obvious invariants (archived-base field overrides, 1-based pagination, deliberate divergence from PagerDuty's panic-on-unknown behavior, clock-skew guard).
Apply fixes from a multi-lens code review pass:
- API: rename swagger {serviceId} to {scopeId} to match registered
route; remove dead Proxy handler.
- Models: add Sanitize() on RootlyConn; add RoleUserIds() helper on
Incident; index ServiceId; drop unused Url field on User; remove
dead RootlyResponse/ApiUserResponse types.
- Migrations: mirror live schema in archived models (index on
service_id; drop user.url).
- Collector: switch pagination to reqData.Pager.Page (avoids
divide-by-zero), cap at 10000 pages, extract buildIncidentsQuery
as a pure helper, drop unreachable lastPageEmpty branch and unused
TotalCount, remove diagnostic logs; add unit test pinning the
filter[service_ids] param literal as a regression guard.
- Services: preserve ScopeConfigId across re-collections; declare
ProductTables on collector and extractor metas.
- Extractor: skip emitting User rows with neither name nor email so
sibling scope tasks can fill in fuller data; use generic resolve()
for SequentialId; type ServiceRef as a named struct.
- Converter: consolidate mapStatus to return (mapped, known); use
Incident.RoleUserIds() instead of an inline slice.
- impl.go: comment justifying services-before-incidents subtask
ordering.
- e2e: rewrite raw incident fixtures to JSON:API envelope shape;
regenerate snapshots (drop urgency column).
klesh
approved these changes
May 16, 2026
Contributor
klesh
left a comment
LGTM
Thanks for your contribution.
Summary
This PR adds a new Rootly data-source plugin that collects incident-management data from Rootly and maps it into DevLake's ticket domain model.
Rootly Plugin Implementation (backend/plugins/rootly/):
Config UI:
Grafana Dashboard:
Testing:
Does this close any open issues?
Closes #8876
Screenshots
Other Information
Data model mapping:
Validation run locally:
jq empty grafana/dashboards/Rootly.json
go test ./plugins/rootly/...