[codex] Apply generated cpflow GitHub Actions flow#732
[codex] Apply generated cpflow GitHub Actions flow#732
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Upstream token not passed to copy-image command
- Added the required -t upstream token argument to the cpflow copy-image-from-upstream command while keeping the token sourced from secrets via the environment.
Or push these changes by commenting:
@cursor push 3606e16506
Preview (3606e16506)
diff --git a/.github/workflows/cpflow-promote-staging-to-production.yml b/.github/workflows/cpflow-promote-staging-to-production.yml
--- a/.github/workflows/cpflow-promote-staging-to-production.yml
+++ b/.github/workflows/cpflow-promote-staging-to-production.yml
@@ -203,14 +203,13 @@
- name: Copy image from staging
env:
- # Pass the upstream token via env rather than `-t` so it doesn't appear in /proc/<pid>/cmdline.
CPLN_UPSTREAM_TOKEN: ${{ secrets.CPLN_TOKEN_STAGING }}
PRODUCTION_APP_NAME: ${{ vars.PRODUCTION_APP_NAME }}
CPLN_ORG_PRODUCTION: ${{ vars.CPLN_ORG_PRODUCTION }}
shell: bash
run: |
set -euo pipefail
- cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" --org "${CPLN_ORG_PRODUCTION}"
+ cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" -t "${CPLN_UPSTREAM_TOKEN}" --org "${CPLN_ORG_PRODUCTION}"
- name: Deploy image to production
env:You can send follow-ups to the cloud agent here.
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" --org "${CPLN_ORG_PRODUCTION}" |
There was a problem hiding this comment.
Upstream token not passed to copy-image command
High Severity
The copy-image-from-upstream step sets CPLN_UPSTREAM_TOKEN as an environment variable but never passes the upstream token to the cpflow command. The cpflow documentation states this command requires --upstream-token or -t flag — there is no documented support for a CPLN_UPSTREAM_TOKEN env var in cpflow 4.2.0 (the version installed by cpflow-setup-environment). The old workflow correctly used -t ${{ secrets.CPLN_TOKEN_STAGING }}. Without the flag, the command cannot authenticate with the staging org and production promotion will fail at the image copy step.
Reviewed by Cursor Bugbot for commit eb1dbb2. Configure here.
There was a problem hiding this comment.
Leaving this unchanged. cpflow copy-image-from-upstream in cpflow 4.2.0 supports CPLN_UPSTREAM_TOKEN as an env var, so passing -t here would reintroduce the token-in-process-argv exposure this generated flow is intentionally avoiding. Reference: https://github.com/shakacode/control-plane-flow/blob/add-github-flow-generator/lib/command/copy_image_from_upstream.rb#L34-L36
Code Review: cpflow GitHub Actions MigrationThis is a solid migration from hand-written to generated Security — meaningful fixes
Bugs fixed
Potential issues
Nits
Overall this is well-structured with meaningful security and reliability improvements. The one item I'd prioritize is the |
| if cpflow exists -a "${APP_NAME}" --org "${CPLN_ORG}"; then | ||
| echo "exists=true" >> "$GITHUB_OUTPUT" | ||
| else | ||
| echo "exists=false" >> "$GITHUB_OUTPUT" |
There was a problem hiding this comment.
This if/else around cpflow exists conflates two different failure modes: "app does not exist" (expected non-zero) and "API/auth error" (unexpected non-zero). Both land in the else branch and set exists=false.
Compare delete-app.sh in this same PR, which carefully pattern-matches the output ("Double check your org", "Unknown API token format", "ERROR", etc.) to distinguish a real failure from a genuine not-found and surface it as a hard error.
If there's a transient auth issue when a PR is pushed, this step sets exists=false and the pull_request path silently writes a step-summary saying the app doesn't exist yet — masking the real problem until the deploy is manually retried.
Suggested approach: mirror delete-app.sh's pattern — capture the output, check exit code, inspect the text for known error tokens, and exit non-zero on a real failure rather than treating it as "not found".
There was a problem hiding this comment.
Fixed in b9095c6 by capturing cpflow exists output and failing hard on auth/API error tokens instead of treating every non-zero exit as app-not-found.
|
|
||
| if [[ -f "${HOME}/.ssh/cpflow_build_key" ]]; then | ||
| eval "$(ssh-agent -s)" | ||
| trap 'ssh-agent -k >/dev/null; rm -f "${HOME}/.ssh/cpflow_build_key"' EXIT |
There was a problem hiding this comment.
The EXIT trap kills the SSH agent and removes cpflow_build_key, but the ~/.ssh/known_hosts file written in the previous step is not cleaned up here. On ephemeral GitHub-hosted runners this is harmless (the VM is discarded), but if this action is ever adopted on self-hosted runners with a persistent ~/.ssh, the custom known_hosts entries would accumulate across runs.
Consider adding rm -f "${HOME}/.ssh/known_hosts" to the trap, or using a temp file under $RUNNER_TEMP instead of ~/.ssh for the known_hosts.
| default: "3.3.1" | ||
| cpflow_version: | ||
| description: cpflow gem version | ||
| required: false |
There was a problem hiding this comment.
The CLI version is pinned here as a default, which is good for reproducibility. Worth noting: since sudo npm install -g fetches from the npm registry at runtime, a compromised release of @controlplane/cli@3.3.1 would silently affect all runs. Consider adding a SHA/integrity check (e.g. via npm install --integrity) or at minimum locking to a digest-pinned download, especially since this step runs in workflows that access staging/production tokens.
|
|
||
| current_image="${selected_image:-${first_image}}" | ||
| current_version="${selected_version:-${first_version}}" | ||
|
|
There was a problem hiding this comment.
The staging env-var parity check only compares names, not values. It will catch a missing DATABASE_URL in production, but it won't catch a case where production has the right key names but wrong or stale values (e.g. a staging secret was rotated but the production counterpart wasn't updated). This is expected behaviour for a name-only check — just worth documenting explicitly so it's clear what the check does and doesn't guarantee.
There was a problem hiding this comment.
Documented in b9095c6: the production promotion workflow checks that production has the same environment variable names as staging, but it does not compare secret values.
|
Code Review: Replace hand-written CI/CD Actions with generated cpflow- flow* Well-considered overhaul that hardens security, safety, and operational reliability. No blocking issues found. Security improvements Token no longer passed via --token flag — old setup-environment exposed it in /proc/cmdline and ps; now uses CPLN_TOKEN env var only. SSH key scoped to a dedicated step so ACTIONS_STEP_DEBUG cannot leak it from the build step. Delete command now gated on author_association (old workflow had no permission check). Fork PR protection blocks syncs before any secret is consumed. Old contains() command matching replaced with exact match. Safety improvements cancel-in-progress changed from true to false everywhere — old staging deploy risked partial GVC deployments mid-rollout. Prefix-based delete guard replaces the old substring check for production/staging names. Production rollback logic added with per-workload state capture and restore — old workflow had no rollback at all. Health checks now configurable (HEALTH_CHECK_RETRIES, HEALTH_CHECK_INTERVAL, HEALTH_CHECK_ACCEPTED_STATUSES). Issue 1: Fork-PR /deploy-review-app comment silently fails with no PR feedback When a collaborator comments /deploy-review-app on a fork PR, the job condition passes but Validate review app deployment source exits 1 with only a runner log message. No PR comment explains the rejection. Suggest posting an issues.createComment before exit 1, similar to how the unconfigured-secrets path writes to GITHUB_STEP_SUMMARY. Issue 2: cpflow exists error detection is string-matching fragile delete-app.sh uses case matching over stderr to distinguish app-not-found from auth/network errors. The inline TODO acknowledges this. If cpflow changes an error message, the pattern will not match and a real failure will be silently treated as app not found, causing delete to return success vacuously. The TODO to switch to a structured exit code is the right long-term fix. Issue 3: gem install / npm install without integrity pinning cpflow-setup-environment/action.yml installs @controlplane/cli@3.3.1 and cpflow 4.2.0 without hash verification. A re-tagged or compromised release would install silently. For production-adjacent CI, consider verifying SHA256 after install or using lockfiles with integrity entries. Issue 4: Nightly cleanup has no failure notification cpflow-cleanup-stale-review-apps.yml runs on cron with no failure notification. Silent failures accumulate stale apps over days. Consider a step gated on if: failure() that posts a GitHub issue or annotation. Minor observations APP_NAME / PR_NUMBER double-assignment is intentional (workflow-level env needed early for concurrency.group, step overwrites with resolved values) but a short comment would clarify intent. cpflow-review-app-help.yml using pull_request_target is correct and well-documented. deploy job in cpflow-deploy-staging.yml re-runs cpflow-setup-environment unnecessarily (only needs cpflow deploy-image) — minor redundancy, acceptable for simplicity. |
| echo "🗑️ Deleting application: $APP_NAME" | ||
| cpflow delete -a "$APP_NAME" --org "$CPLN_ORG" --yes | ||
|
|
||
| echo "✅ Successfully deleted application: $APP_NAME" |
There was a problem hiding this comment.
The case pattern matching over stderr is the right approach given cpflow exists doesn't expose a structured signal, and the inline TODO is the right way to track this. One additional defensive measure worth considering: log the full exists_output to the runner even when the "not found" path is taken, so there's an audit trail if unexpected output silently passed through the filter. Right now the output is printed only if non-empty, but it's easy to miss in logs.
Also worth noting: the pattern list doesn't include "timeout", "connection refused", or "ENOTFOUND" — all plausible network failure messages. If cpflow's Ruby HTTP client times out, it may surface Errno::ECONNREFUSED or Net::ReadTimeout, which would match "Net::", so that's covered. But curl-style timeouts would not. Low risk given this is a Ruby gem, but worth keeping in mind when bumping the cpflow version.
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| cpflow cleanup-stale-apps -a "${{ vars.REVIEW_APP_PREFIX }}" --org "${{ vars.CPLN_ORG_STAGING }}" --yes |
There was a problem hiding this comment.
No failure notification exists for this cron job. If cpflow cleanup-stale-apps fails (auth error, API hiccup, etc.), stale review apps accumulate silently across days.
Consider appending a step:
- name: Notify on failure
if: failure()
run: |
echo "::error::Stale review app cleanup failed. Check the workflow logs."Or, if the team monitors GitHub issues, use gh issue create to open a tracking issue on failure so it's not lost in workflow history.
| } >> "$GITHUB_STEP_SUMMARY" | ||
| exit 0 | ||
| fi | ||
|
|
There was a problem hiding this comment.
When a collaborator comments /deploy-review-app on a fork PR, execution reaches this exit 1 with only a runner log message — no feedback is posted to the PR thread. The collaborator sees a red workflow run with no explanation in the PR.
Consider posting a comment before exiting. The GH_TOKEN is available from the Resolve PR ref and commit step but not inherited here; you'd need to add GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} to this step's env block. Something like:
gh api "repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/comments" \
--method POST \
--field body="Fork PRs cannot use /deploy-review-app because this workflow uses repository secrets. Ask a maintainer to deploy from a branch in the base repository." \
--silent || true
There was a problem hiding this comment.
Fixed in 1e46bfe: fork PR /deploy-review-app rejections now post a PR comment before exiting, while still keeping the deployment blocked because the workflow uses repository secrets.
| run: | | ||
| set -euo pipefail | ||
|
|
||
| sudo npm install -g @controlplane/cli@${{ inputs.cpln_cli_version }} |
There was a problem hiding this comment.
Version-pinned but not hash-verified. A malicious actor who gains write access to the @controlplane/cli npm package (or re-tags a release) could substitute a compromised binary. Same applies to gem install cpflow below.
For production-adjacent pipelines, consider adding a post-install checksum check, or at minimum document the expected SHA256 for each pinned version in a comment so drift is detectable during future version bumps.
| run: | | ||
| # Best-effort rollback: try every workload, aggregate failures, exit non-zero at the end | ||
| # if any failed. A single cpln hiccup shouldn't leave other workloads mid-promotion. | ||
| set -uo pipefail |
There was a problem hiding this comment.
The intentional omission of -e here (using set -uo pipefail instead of set -euo pipefail) is the correct design — it lets rollback continue across all workloads even if one cpln call fails, with rollback_failures tracking the aggregate. Worth a brief comment explaining the omission so a future reader doesn't add -e thinking it was forgotten:
| set -uo pipefail | |
| # Intentionally no -e: we want every workload to attempt rollback even if earlier ones fail. | |
| set -uo pipefail |
There was a problem hiding this comment.
Fixed in 1e46bfe with a comment explaining that set -e is intentionally omitted so rollback can attempt every workload and aggregate failures.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Token not persisted for subsequent workflow steps
- The composite action now writes CPLN_TOKEN to GITHUB_ENV so subsequent cpflow and cpln workflow steps inherit authentication without passing the token on the command line.
Or push these changes by commenting:
@cursor push f1eee564c7
Preview (f1eee564c7)
diff --git a/.github/actions/cpflow-setup-environment/action.yml b/.github/actions/cpflow-setup-environment/action.yml
--- a/.github/actions/cpflow-setup-environment/action.yml
+++ b/.github/actions/cpflow-setup-environment/action.yml
@@ -63,6 +63,12 @@
exit 1
fi
+ {
+ echo "CPLN_TOKEN<<__CPLN_TOKEN__"
+ echo "$CPLN_TOKEN"
+ echo "__CPLN_TOKEN__"
+ } >> "$GITHUB_ENV"
+
create_output=""
if ! create_output="$(cpln profile create default --org "$ORG" 2>&1)"; then
if ! echo "$create_output" | grep -qi "already exists"; thenYou can send follow-ups to the cloud agent here.
Code Review — PR #732: Generated cpflow GitHub Actions FlowOverviewThis is a well-structured replacement of hand-written CI/CD workflows with the generated Significant Improvements ✅
IssuesBug:
|
| fi | ||
| deploy_args+=(--org "${CPLN_ORG}" --verbose) | ||
|
|
||
| cpflow deploy-image "${deploy_args[@]}" |
There was a problem hiding this comment.
Bug: deploy runs even when build fails.
When a job has an explicit if:, GitHub Actions drops the implicit success() guard. Because this condition only checks validate-branch's output, the deploy job will start even if the build job failed, wasting runner minutes and attempting to deploy a missing image.
| cpflow deploy-image "${deploy_args[@]}" | |
| if: needs.validate-branch.outputs.is_deployable == 'true' && needs.build.result == 'success' |
| echo "🗑️ Deleting application: $APP_NAME" | ||
| cpflow delete -a "$APP_NAME" --org "$CPLN_ORG" --yes | ||
|
|
||
| echo "✅ Successfully deleted application: $APP_NAME" |
There was a problem hiding this comment.
The string-matching heuristic on cpflow exists error output (lines 43–54) is the main reliability risk in this file. The consequence of an unmatched error pattern is silently exiting 0 ("app not found") when the API actually returned an error, meaning the delete is skipped without any visible failure.
An alternative that inverts the logic: treat any non-zero exit with non-empty output as a failure unless it matches a known "not found" pattern:
# instead of matching known error strings, match known "not found" patterns
case "$exists_output" in
*"not found"*|*"does not exist"*)
echo "⚠️ Application does not exist: $APP_NAME"
exit 0
;;
*)
echo "❌ ERROR: failed to determine whether application exists" >&2
printf '%s\n' "$exists_output" >&2
exit 1
;;
esacThis is safer by default — unknown output is treated as an error rather than a no-op. The tradeoff is that you'd need to know the exact "not found" wording from cpflow exists, which has the same coupling problem in the other direction. The TODO comment already captures this; just flagging it for awareness.
| echo "ready=false" >> "$GITHUB_OUTPUT" | ||
| { | ||
| echo "Control Plane review app automation is not configured yet." | ||
| echo |
There was a problem hiding this comment.
This top-level APP_NAME is computed from the raw event payload and is later overwritten in $GITHUB_ENV by the Resolve PR ref and commit step (echo "APP_NAME=..." >> "$GITHUB_ENV"). The step-level write wins, so this declaration is dead configuration and can be confusing. Consider removing it and relying solely on the step-computed value, which is already the authoritative one (it handles all three event types correctly).
Code Review: cpflow Generated GitHub Actions FlowThis is a meaningful security and reliability upgrade over the handwritten workflows. Several issues from the old setup are fixed here; I've also flagged a few new items worth addressing. ✅ Notable Improvements
🔴 Issues1. The delete workflow triggers on 2. The same fragile string-matching logic ( 🟡 Warnings3. In 4. Production health check accepts 301/302 by default
5. Nightly cleanup has no failure notification
ℹ️ Informational6.
7. These will pick up any non-breaking updates automatically, which is generally fine. However, |
| name: Delete Review App | ||
|
|
||
| on: | ||
| pull_request_target: |
There was a problem hiding this comment.
pull_request_target deserves an explanation comment here.
pull_request_target runs in the context of the base repo (write permissions, access to secrets) and is required for fork PR cleanup — pull_request can't post comments on fork PRs because the token is read-only. This is safe here because no PR code is ever checked out (the actions/checkout@v4 below fetches base-branch code only).
cpflow-review-app-help.yml has a good comment explaining this exact pattern. Without a parallel comment here, a future maintainer may "fix" this to pull_request thinking it's safer, silently breaking fork PR cleanup.
| pull_request_target: | |
| # pull_request_target is intentional: it has write permission to comment on PRs from | |
| # forks, where `pull_request` would be read-only. This is safe because no PR code is | |
| # checked out — the job only calls cpflow with base-repo credentials. | |
| # Do not switch this to `pull_request` or add a checkout of the PR ref without re-evaluating. | |
| pull_request_target: |
| # Make the token available to later workflow steps without putting it on argv. | ||
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | ||
| { | ||
| echo "CPLN_TOKEN<<${token_delimiter}" | ||
| printf '%s\n' "$CPLN_TOKEN" | ||
| echo "${token_delimiter}" | ||
| } >> "$GITHUB_ENV" |
There was a problem hiding this comment.
Writing CPLN_TOKEN to GITHUB_ENV propagates it to every subsequent step in the calling workflow — not just the steps that explicitly use it. This is a necessary tradeoff (cpflow picks it up from the environment automatically), but it's worth documenting so future maintainers don't inadvertently add steps that log environment state.
| # Make the token available to later workflow steps without putting it on argv. | |
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | |
| { | |
| echo "CPLN_TOKEN<<${token_delimiter}" | |
| printf '%s\n' "$CPLN_TOKEN" | |
| echo "${token_delimiter}" | |
| } >> "$GITHUB_ENV" | |
| # Persist CPLN_TOKEN to GITHUB_ENV so all subsequent steps can authenticate | |
| # with cpflow/cpln without putting the token on the command line. | |
| # Trade-off: the token is present in every subsequent step's environment. | |
| # Do not add steps that dump env vars (e.g., `env | sort`) after this point. | |
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | |
| { | |
| echo "CPLN_TOKEN<<${token_delimiter}" | |
| printf '%s\n' "$CPLN_TOKEN" | |
| echo "${token_delimiter}" | |
| } >> "$GITHUB_ENV" |
| # TODO: replace this string-matching with a structured signal once `cpflow exists` exposes one | ||
| # (e.g. a distinct exit code for "not found" vs. API/auth errors, or `cpflow exists --json`). | ||
| # Until then, keep this list in sync if `cpflow exists` starts emitting new error patterns — | ||
| # any unmatched error string would otherwise be silently treated as "app not found". | ||
| exists_output="" | ||
| if ! exists_output="$(cpflow exists -a "$APP_NAME" --org "$CPLN_ORG" 2>&1)"; then | ||
| case "$exists_output" in | ||
| *"Double check your org"*|*"Unknown API token format"*|*"ERROR"*|*"Error:"*|*"Traceback"*|*"Net::"*) |
There was a problem hiding this comment.
The same cpflow exists error-pattern matching appears verbatim in the Check if review app exists step in cpflow-deploy-review-app.yml (line ~195). The TODO comment here is accurate but only covers this file — the workflow copy is uncovered.
If cpflow adds a new error pattern (e.g. "rate limit" or "timeout"), both locations need updating in sync. Consider extracting this into a shared script (e.g. .github/scripts/cpflow-app-exists.sh) that both the action and the workflow call, so the pattern list is maintained in one place.
| # repo variable to tighten this for apps that expose a dedicated health endpoint | ||
| # (e.g. "200" for a plain /health, or "200 401 403" for apps that auth-gate / without | ||
| # redirecting). | ||
| HEALTH_CHECK_ACCEPTED_STATUSES: ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }} |
There was a problem hiding this comment.
Accepting 301/302 as "healthy" means a redirect to a login page, maintenance page, or even a misconfigured CDN passes the health check. The comment correctly explains why (curl without -L, Rails auth-gating /), but a redirect can pass even when the destination is broken.
Consider adding a step summary warning when a non-200 status is accepted, so operators reviewing the promotion log have visibility:
if [[ "${http_status}" == "${accepted}" ]]; then
if [[ "${http_status}" != "200" ]]; then
echo "::warning::Health check passed with HTTP ${http_status} (not 200) — verify the redirect destination is healthy."
fi
echo "healthy=true" >> "$GITHUB_OUTPUT"
exit 0
fiAlso worth documenting in .controlplane/shakacode-team.md that setting HEALTH_CHECK_ACCEPTED_STATUSES=200 and adding a /health endpoint is the recommended production posture.
| on: | ||
| workflow_dispatch: | ||
| schedule: | ||
| - cron: "0 0 * * *" |
There was a problem hiding this comment.
Scheduled workflows that fail silently are easy to miss — GitHub does send an email for failed scheduled runs but only to the repo owner by default, and only after 7 days of consecutive failures (as of current GitHub behavior).
Consider adding a failure notification step, or at minimum documenting that the team should verify GitHub's "Notify me of failed workflow runs" notification setting is enabled. Stale review apps that aren't cleaned up accrue Control Plane costs.
| - cron: "0 0 * * *" | |
| - cron: "0 3 * * *" # 3 AM UTC — avoids exact midnight when many repos run scheduled jobs simultaneously |
(The cron timing is a minor suggestion — staggering off midnight reduces GitHub's scheduler contention, making the run more likely to start on time.)
Code Review: Replace hand-written CI/CD with generated
|
| Old | New |
|---|---|
setup-environment action |
cpflow-setup-environment action |
build-docker-image action |
cpflow-build-docker-image action |
delete-control-plane-app action (no prefix guard) |
cpflow-delete-control-plane-app action (prefix-guarded) |
deploy-to-control-plane-review-app.yml |
cpflow-deploy-review-app.yml |
delete-review-app.yml (pull_request closed) |
cpflow-delete-review-app.yml (pull_request_target closed) |
deploy-to-control-plane-staging.yml (triggered on all branches) |
cpflow-deploy-staging.yml (main/master only) |
nightly-remove-stale-review-apps.yml |
cpflow-cleanup-stale-review-apps.yml |
promote-staging-to-production.yml |
cpflow-promote-staging-to-production.yml |
help-command.yml |
cpflow-help-command.yml + cpflow-review-app-help.yml |
Security Improvements ✅
- Token never on argv:
cpflow-setup-environmentpassesCPLN_TOKENvia environment variable rather than--tokenon the command line, avoiding/proc/<pid>/cmdlineandpsleakage. The oldsetup-environmentusedcpln profile update default --org "$ORG" --token "$TOKEN". - SSH key isolation:
cpflow-build-docker-imagededicates a separate step for writing the SSH key soDOCKER_BUILD_SSH_KEYis not present in the main build step's environment, even underACTIONS_STEP_DEBUG=true. - Prefix guard on deletes:
cpflow-delete-control-plane-app/delete-app.shchecks thatAPP_NAMEstarts withREVIEW_APP_PREFIX-before callingcpflow delete. The old action had no such guard. - Fork PR gating:
cpflow-deploy-review-app.ymlexplicitly blocks fork PRs onpull_requestevents (cannot access secrets anyway) and addsauthor_associationchecks for comment-triggered deploys. The old workflow triggered on any PRsynchronize/reopenedwithout fork checks. pull_request_targetused correctly:cpflow-delete-review-app.ymlusespull_request_targetfor the closed-PR cleanup path (to access secrets from fork PRs) and checks out only the base repo — the comment in the file explains this. This is a well-known footgun that has been handled correctly here.
Robustness Improvements ✅
cancel-in-progress: falseon all deploy/delete concurrency groups. Mid-rollout cancellation ofcpflow deploy-imagecan leave workloads in a partially-deployed state; letting in-flight runs finish is the right call.- Multi-workload rollback: The production promotion workflow now captures per-container image state for all workloads and restores each one on failure, rather than only rolling back the single
railsworkload image. - Randomized heredoc delimiters in
cpflow-promote-staging-to-production.ymlprevent a strayEOFin a variable value from truncating the multi-lineGITHUB_OUTPUTwrite. --run-release-phaseis now opt-in:cpflow-deploy-review-app.ymlandcpflow-deploy-staging.ymlinspectcpflow configto decide whether to pass--run-release-phase, rather than always passing it. This avoids failures when the flag isn't configured.- Health check window is larger: 24 retries × 25s = ~10 min (old: 12 × 10s = ~2 min). More headroom for slow Rails cold boots.
Issues and Suggestions
Medium: Stale cpflow exists error-matching is fragile
delete-app.sh and cpflow-deploy-review-app.yml both parse free-form stderr output from cpflow exists to distinguish "app not found" from authentication/API errors. The comment in the file acknowledges this with a TODO. This approach is inherently brittle if cpflow changes its error output. Consider filing an issue against cpflow to request a structured signal (distinct exit code or --json output) and linking it from the TODO so it doesn't get forgotten.
Low: cpflow-deploy-review-app.yml — steps.source.outputs.allowed can be empty on workflow_dispatch
The Validate review app deployment source step sets allowed=true for same-repo branches and allowed=false for fork PRs via pull_request events. For workflow_dispatch, steps.resolve-pr.outputs.same_repo could be "false" if the PR is from a fork, which falls into the issue_comment branch of the condition and calls gh pr comment — but workflow_dispatch on a fork PR should perhaps just fail rather than try to comment. Low risk since workflow_dispatch is manual, but the code path is slightly inconsistent.
Low: cpflow-setup-environment — redundant null checks
The Setup Control Plane profile and registry login step validates that CPLN_TOKEN and ORG are non-empty via shell if [[ -z ... ]] guards. These variables are both declared required: true in the action's inputs: block, so Actions will already refuse to call the action without them. The guards add no safety but don't hurt anything.
Low: Old staging workflow triggered on branches: ['*']
The deleted deploy-to-control-plane-staging.yml ran on every push to every branch and relied on a validate-branch job to short-circuit. The new cpflow-deploy-staging.yml correctly restricts the on.push.branches filter to ["main", "master"] with a comment explaining the GitHub limitation on vars in branch filters. This is an improvement, but it means pushes to a non-main/master STAGING_APP_BRANCH will not trigger the workflow unless the file is edited. The comment in the file explains this trade-off clearly — no action needed, just confirming the approach is understood.
Nit: cpflow-cleanup-stale-review-apps.yml — no contents: write permission for release creation
The promote workflow needs contents: write to create GitHub releases; the cleanup workflow does not. Both look correct on permissions.
Nit: cpflow-deploy-review-app.yml line 1011 — long if: conditions
Several steps have near-identical if: conditions like:
if: steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'true' && (steps.check-app.outputs.exists == 'true' || steps.setup-review-app.outcome == 'success')
This is repetitive across 10+ steps. A job-level output or an environment variable set once would be cleaner, but this is a style point and not a functional issue in generated code.
Documentation ✅
.controlplane/readme.mdandshakacode-team.mdare updated to reference the new workflow names, slash commands, andREVIEW_APP_PREFIXnaming convention.- Required/optional repository secrets and variables are now fully documented in
shakacode-team.md. - The
pull_request_targetusage incpflow-delete-review-app.ymlandcpflow-review-app-help.ymlis explained with inline comments that document why the trigger was chosen, which is important for future maintainers.
Summary
The new generated flow is substantially more secure and robust than what it replaces. The main open item is the cpflow exists output-parsing fragility, which the code already acknowledges. The PR is in good shape to merge once the draft status is resolved and tests can be run.
| exists_output="" | ||
| if ! exists_output="$(cpflow exists -a "$APP_NAME" --org "$CPLN_ORG" 2>&1)"; then | ||
| case "$exists_output" in | ||
| *"Double check your org"*|*"Unknown API token format"*|*"ERROR"*|*"Error:"*|*"Traceback"*|*"Net::"*) |
There was a problem hiding this comment.
The TODO comment here is well-placed. To make this easier to track, consider filing an issue against the cpflow gem linking to this file, so the workaround has a visible upstream ticket. The risk of silent misclassification (an unrecognized error string treated as "app not found") is the main hazard — a regression test that mocks cpflow exists returning an exit-1 with a novel error string would catch drift before it reaches production.
| echo "Failed to post fork rejection comment to PR #${PR_NUMBER}" >&2 | ||
| fi | ||
|
|
||
| echo "${message}" >&2 |
There was a problem hiding this comment.
When the trigger is workflow_dispatch and the PR being deployed is from a fork (same_repo == "false"), neither the pull_request branch (line 143) nor the same_repo == "true" branch (line 138) matches, so execution falls through to this gh pr comment call followed by exit 1. That is technically correct — the deployment is blocked — but it reaches the gh pr comment path that was written for the issue_comment case, which may be surprising. A small guard makes the intent explicit:
| echo "${message}" >&2 | |
| # workflow_dispatch with a fork PR: block the run without attempting a PR comment. | |
| if [[ "${{ github.event_name }}" == "workflow_dispatch" ]]; then | |
| echo "${message}" >&2 | |
| exit 1 | |
| fi | |
| message="Review app deploys from fork pull requests are not allowed because this workflow uses repository secrets." | |
| if ! gh pr comment "${PR_NUMBER}" --body "${message}"; then | |
| echo "Failed to post fork rejection comment to PR #${PR_NUMBER}" >&2 | |
| fi | |
| echo "${message}" >&2 |
| run: | | ||
| set -euo pipefail | ||
|
|
||
| if [[ -z "$CPLN_TOKEN" ]]; then |
There was a problem hiding this comment.
Both token and org are declared required: true in inputs:, so Actions will already refuse to invoke this composite action without them. These null checks will never fire. They're harmless, but if left in they may mislead a future reader into thinking there's a code path where these are empty.
PR #732 Review: cpflow Generated GitHub Actions FlowOverviewThis PR replaces hand-written Control Plane GitHub Actions with the output of Security Improvements (Notable)
Correctness Improvements
Issues FoundMisleading comment:
|
| # Override these by editing this file or by setting the matching repository variable. | ||
| # Worst-case wall time per attempt is HEALTH_CHECK_INTERVAL plus the curl --max-time below | ||
| # (10s), so the defaults give a ~10 minute window (24 × (15 + 10) = 600s) — enough for | ||
| # most Rails cold boots (asset precompile + db:migrate + workload readiness). | ||
| HEALTH_CHECK_RETRIES: 24 | ||
| HEALTH_CHECK_INTERVAL: 15 |
There was a problem hiding this comment.
The comment says these can be overridden "by setting the matching repository variable", but unlike HEALTH_CHECK_ACCEPTED_STATUSES (line 28 which reads ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }}), these two are hardcoded literals — setting a repo variable with the same name has no effect.
Either wire them up to match the pattern used for HEALTH_CHECK_ACCEPTED_STATUSES:
| # Override these by editing this file or by setting the matching repository variable. | |
| # Worst-case wall time per attempt is HEALTH_CHECK_INTERVAL plus the curl --max-time below | |
| # (10s), so the defaults give a ~10 minute window (24 × (15 + 10) = 600s) — enough for | |
| # most Rails cold boots (asset precompile + db:migrate + workload readiness). | |
| HEALTH_CHECK_RETRIES: 24 | |
| HEALTH_CHECK_INTERVAL: 15 | |
| HEALTH_CHECK_RETRIES: ${{ vars.HEALTH_CHECK_RETRIES || 24 }} | |
| HEALTH_CHECK_INTERVAL: ${{ vars.HEALTH_CHECK_INTERVAL || 15 }} |
…or remove the "setting the matching repository variable" claim from the comment above so the next person who tries to tune these doesn't spend time debugging why their repo var is ignored.
| # Until then, keep this list in sync if `cpflow exists` starts emitting new error patterns — | ||
| # any unmatched error string would otherwise be silently treated as "app not found". | ||
| exists_output="" | ||
| if ! exists_output="$(cpflow exists -a "$APP_NAME" --org "$CPLN_ORG" 2>&1)"; then | ||
| case "$exists_output" in | ||
| *"Double check your org"*|*"Unknown API token format"*|*"ERROR"*|*"Error:"*|*"Traceback"*|*"Net::"*) |
There was a problem hiding this comment.
The same error-token list and detection logic is duplicated verbatim in cpflow-deploy-review-app.yml (the "Check if review app exists" step at line ~204). When this list needs to grow (e.g., a new cpflow exists error pattern), both files need to change in sync — which the TODO already flags.
A practical option: extract the detection to a small shared shell script (e.g., .github/actions/cpflow-delete-control-plane-app/cpflow-app-exists.sh) that returns 0/1/2 for exists/not-found/error, then source or call it from both places. That makes the error-token list a single source of truth and keeps the TODO resolution contained to one file.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Unnecessary CI work on every new PR open
- Removed the opened pull_request trigger so the deploy workflow only runs on synchronize and reopened events while PR-open help remains handled by the help workflow.
Or push these changes by commenting:
@cursor push 2c303a005e
Preview (2c303a005e)
diff --git a/.github/workflows/cpflow-deploy-review-app.yml b/.github/workflows/cpflow-deploy-review-app.yml
--- a/.github/workflows/cpflow-deploy-review-app.yml
+++ b/.github/workflows/cpflow-deploy-review-app.yml
@@ -4,7 +4,7 @@
on:
pull_request:
- types: [opened, synchronize, reopened]
+ types: [synchronize, reopened]
issue_comment:
types: [created]
workflow_dispatch:
@@ -35,7 +35,7 @@
jobs:
deploy:
- # Skip synchronize/opened events from fork PRs at the job level — they cannot access
+ # Skip pull_request events from fork PRs at the job level — they cannot access
# repository secrets anyway, so running any steps just burns billable minutes. Users
# can still manually deploy a fork PR via `/deploy-review-app` (gated below by
# author_association) or workflow_dispatch.You can send follow-ups to the cloud agent here.
Reviewed by Cursor Bugbot for commit 9a390c1. Configure here.
|
|
||
| on: | ||
| pull_request: | ||
| types: [opened, synchronize, reopened] |
There was a problem hiding this comment.
Unnecessary CI work on every new PR open
Low Severity
Adding opened to the pull_request types (the old workflow only had synchronize, reopened) causes the full workflow to run on every new PR — including Ruby setup, npm install of the CLI, and gem install — only to determine the review app doesn't exist yet and write a step summary. That summary is only visible in the Actions tab, while the separate cpflow-review-app-help.yml already posts a help comment directly on the PR. A brand-new PR can never have an existing review app, so opened adds several minutes of billable CI time per PR with no deployment benefit.
Reviewed by Cursor Bugbot for commit 9a390c1. Configure here.



Summary
cpflow-*action and workflow set fromcpflow generate-github-actions.cpflow-*action/workflow names.Verification
BUNDLE_GEMFILE=/tmp/cpflow-pr-278/Gemfile bin/conductor-exec bundle exec ruby /tmp/cpflow-pr-278/bin/cpflow github-flow-readiness— no blocking readiness issues detected.bin/conductor-exec ruby -e 'require "yaml"; paths = Dir[".github/**/*.yml"] + Dir[".github/**/*.yaml"] + Dir[".controlplane/**/*.yml"]; paths.sort.each { |path| YAML.load_file(path, aliases: true); puts path }'bin/conductor-exec git diff --cached --checkbin/conductor-exec bundle exec rubocopNot Run / Blocked
bin/conductor-exec bundle exec rspeccould not complete locally because PostgreSQL was not running at/tmp/.s.PGSQL.5432; after the first Rails load failure, Coveralls setup also reported repeated coverage initialization errors.Note
Medium Risk
Replaces and rewires deployment automation (review apps, staging deploys, and production promotion), so misconfiguration could break deployments or cleanup even though changes are mostly workflow/script-level.
Overview
Switches Control Plane CI/CD from the older hand-written GitHub Actions to the generated
cpflow-*composite actions and workflows, removing the previoussetup-environment,build-docker-image, review-app deploy/delete, staging deploy, promotion, and nightly cleanup workflows.Adds a full generated flow: opt-in review apps via
/deploy-review-app(with fork safeguards), automated redeploys once created, gated staging deploys, a scheduled stale review-app cleanup, and a more robust manual promotion workflow (env-var name parity check, health checks, multi-workload rollback support, and GitHub release creation).Updates Control Plane docs/internal notes to reference the new
cpflow-*workflow names and clarifies required repo secrets/vars (includingREVIEW_APP_PREFIXnaming expectations and optional Docker SSH build settings).Reviewed by Cursor Bugbot for commit 9a390c1. Bugbot is set up for automated code reviews on this repo. Configure here.