fix: dynamic Action jobs can stall by marking themselves blocked (#10658)

#10647 introduced a regression which was detected by the [matrix-dynamic end-to-end test](0e0b1429e6/actions/example-matrix-dynamic/.forgejo/workflows/test.yml), resulting in [test failures](https://code.forgejo.org/forgejo/end-to-end/actions/runs/4608/jobs/2/attempt/1). The cause of this regression was an added code path which is invoked when a job is being reparsed due to being incomplete, in which the original `needs` value of the job is preserved. To describe it in detail, here's an incomplete job... ``` on: [push] jobs: # ... matrix-job: runs-on: docker needs: define-matrix strategy: matrix: ${{ fromJSON(needs.define-matrix.outputs.matrix-value) }} steps: # ... ``` This job is "incomplete" when it is parsed initially because the `define-matrix` job needs to be completed before its matrix can be evaluated into the correct jobs to run. It `needs: define-matrix`. When it is first parsed by Forgejo, `needs: define-matrix` is pulled out of the job definition and stored in the database, in the `action_run_job` table's `needs` column, because Forgejo will be the one managing when it is run by detecting the completion of the `define-matrix` job. During #10647, a change was made which caused new jobs' which have their own `needs` (from reusable workflows) to have their `needs` be merged with the `needs` that were stored in the database. The regression is that when a job is expanded, it will still be considered a blocked job because it has a `needs` list (even though we know that those jobs are complete): 03e2ecfbc4/models/actions/run.go (L369-L370) As for why it was originally added in #10647? Unfortunately the comment that I left in the code doesn't explain the functional problem it was attempting to solve, and just describes it as-if the original needs are still needed for *something*. 🙄 It makes perfect sense to me, right now, that we would **not** keep `needs` values that are already known to be complete. I've run through my new reusable workflow end-to-end test (https://code.forgejo.org/forgejo/end-to-end/pulls/1362) with this codepath removed and it works perfectly fine. ## Checklist The [contributor guide](https://forgejo.org/docs/next/contributor/) contains information that will be helpful to first time contributors. There also are a few [conditions for merging Pull Requests in Forgejo repositories](https://codeberg.org/forgejo/governance/src/branch/main/PullRequestsAgreement.md). You are also welcome to join the [Forgejo development chatroom](https://matrix.to/#/#forgejo-development:matrix.org). ### Tests - I added test coverage for Go changes... - [x] in their respective `*_test.go` for unit tests. - [ ] in the `tests/integration` directory if it involves interactions with a live Forgejo server. - I added test coverage for JavaScript changes... - [ ] in `web_src/js/*.test.js` if it can be unit tested. - [ ] in `tests/e2e/*.test.e2e.js` if it requires interactions with a live Forgejo server (see also the [developer guide for JavaScript testing](https://codeberg.org/forgejo/forgejo/src/branch/forgejo/tests/e2e/README.md#end-to-end-tests)). ### Documentation - [ ] I created a pull request [to the documentation](https://codeberg.org/forgejo/docs) to explain to Forgejo users how to use this change. - [x] I did not document these changes and I do not expect someone else to do it. ### Release notes - [x] I do not want this change to show in the release notes. - [ ] I want the title to show in the release notes with a link to this pull request. - [ ] I want the content of the `release-notes/<pull request number>.md` to be be used for the release notes instead of the title. Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/10658 Reviewed-by: Andreas Ahlenstorf <aahlenst@noreply.codeberg.org> Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net> Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
2026-05-12 22:10:25 +00:00 · 2026-01-01 14:22:11 +01:00 · 2026-01-01 14:22:11 +01:00 · c2649d4055
commit c2649d4055
parent 2566924ff8
2 changed files with 2 additions and 20 deletions
--- a/services/actions/job_emitter.go
+++ b/services/actions/job_emitter.go
@ -298,7 +298,7 @@ func tryHandleIncompleteMatrix(ctx context.Context, blockedJob *actions_model.Ac
 		// it to be a "persistent incomplete" job with some error that needs to be reported to the user.  If the
 		// re-evaluated job has a different job ID, then it's likely an expanded job -- such as from a reusable workflow
 		// -- which could have it's own `needs` that allows it to expand into a correct job in the future.
-		jobID, job := swf.Job()
+		jobID, _ := swf.Job()
 		if jobID == blockedJob.JobID {
 			if swf.IncompleteMatrix {
 				errorCode, errorDetails := persistentIncompleteMatrixError(blockedJob, swf.IncompleteMatrixNeeds)
@ -322,23 +322,6 @@ func tryHandleIncompleteMatrix(ctx context.Context, blockedJob *actions_model.Ac
 				// Return `true` to skip running this job in this invalid state
 				return true, nil
 			}
-
-			// When `InsertRunJobs` is run on a job (including `blockedJob` when it was persisted), the `needs` are
-			// removed from it's WorkflowPayload and moved up to the database record so that Forgejo can manage ordering
-			// the run execution.  Now that `blockedJob` has been changed from incomplete->complete by reparsing it and
-			// providing its inputs, it would have been reparsed with an empty `needs` entry because of this earlier
-			// removal.  And the returned record could have its own new `needs` if it was a reusable workflow with inner
-			// jobs.  So, merge the old database list of needs with the new reparsed list of needs, and store them in
-			// the new `SingleWorkflow` which will be paseed to `InsertRunJobs` where it will be ripped out again.
-			//
-			// This is only relevant for `blockedJob` and not for any other generated jobs since they wouldn't have yet
-			// gone through `InsertRunJobs` to have this mutation performed.
-			newNeeds := append(job.Needs(), blockedJob.Needs...)
-			_ = job.RawNeeds.Encode(newNeeds)
-			err := swf.SetJob(jobID, job)
-			if err != nil {
-				return false, fmt.Errorf("failure to update needs in job: %w", err)
-			}
 		}
 	}

--- a/services/actions/job_emitter_test.go
+++ b/services/actions/job_emitter_test.go
@ -475,7 +475,7 @@ func Test_tryHandleIncompleteMatrix(t *testing.T) {
 			needs: map[string][]string{
 				"define-workflow-call":    nil,
 				"inner my-workflow-input": nil,
-				"perform-workflow-call":   {"define-workflow-call", "perform-workflow-call.inner_job"},
+				"perform-workflow-call":   {"perform-workflow-call.inner_job"},
 			},
 		},
 		// Before reusable workflow expansion, there weren't any cases where evaluating a job in the job emitter could
@ -502,7 +502,6 @@ func Test_tryHandleIncompleteMatrix(t *testing.T) {
 				"inner define-runs-on my-workflow-input": nil,
 				"inner incomplete-job my-workflow-input": {"perform-workflow-call.define-runs-on"},
 				"perform-workflow-call": {
-					"define-workflow-call",
 					"perform-workflow-call.define-runs-on",
 					"perform-workflow-call.scalar-job",
 				},