fix: dynamic Action jobs can stall by marking themselves blocked (#10658)

#10647 introduced a regression which was detected by the [matrix-dynamic end-to-end test](0e0b1429e6/actions/example-matrix-dynamic/.forgejo/workflows/test.yml), resulting in [test failures](https://code.forgejo.org/forgejo/end-to-end/actions/runs/4608/jobs/2/attempt/1).

The cause of this regression was an added code path which is invoked when a job is being reparsed due to being incomplete, in which the original `needs` value of the job is preserved.  To describe it in detail, here's an incomplete job...

```
on: [push]
jobs:
  # ...
  matrix-job:
    runs-on: docker
    needs: define-matrix
    strategy:
      matrix: ${{ fromJSON(needs.define-matrix.outputs.matrix-value) }}
    steps: # ...
```

This job is "incomplete" when it is parsed initially because the `define-matrix` job needs to be completed before its matrix can be evaluated into the correct jobs to run.  It `needs: define-matrix`.  When it is first parsed by Forgejo, `needs: define-matrix` is pulled out of the job definition and stored in the database, in the `action_run_job` table's `needs` column, because Forgejo will be the one managing when it is run by detecting the completion of the `define-matrix` job.

During #10647, a change was made which caused new jobs' which have their own `needs` (from reusable workflows) to have their `needs` be merged with the `needs` that were stored in the database.  The regression is that when a job is expanded, it will still be considered a blocked job because it has a `needs` list (even though we know that those jobs are complete):

03e2ecfbc4/models/actions/run.go (L369-L370)

As for why it was originally added in #10647?  Unfortunately the comment that I left in the code doesn't explain the functional problem it was attempting to solve, and just describes it as-if the original needs are still needed for *something*. 🙄  It makes perfect sense to me, right now, that we would **not** keep `needs` values that are already known to be complete.

I've run through my new reusable workflow end-to-end test (https://code.forgejo.org/forgejo/end-to-end/pulls/1362) with this codepath removed and it works perfectly fine.

## Checklist

The [contributor guide](https://forgejo.org/docs/next/contributor/) contains information that will be helpful to first time contributors. There also are a few [conditions for merging Pull Requests in Forgejo repositories](https://codeberg.org/forgejo/governance/src/branch/main/PullRequestsAgreement.md). You are also welcome to join the [Forgejo development chatroom](https://matrix.to/#/#forgejo-development:matrix.org).

### Tests

- I added test coverage for Go changes...
  - [x] in their respective `*_test.go` for unit tests.
  - [ ] in the `tests/integration` directory if it involves interactions with a live Forgejo server.
- I added test coverage for JavaScript changes...
  - [ ] in `web_src/js/*.test.js` if it can be unit tested.
  - [ ] in `tests/e2e/*.test.e2e.js` if it requires interactions with a live Forgejo server (see also the [developer guide for JavaScript testing](https://codeberg.org/forgejo/forgejo/src/branch/forgejo/tests/e2e/README.md#end-to-end-tests)).

### Documentation

- [ ] I created a pull request [to the documentation](https://codeberg.org/forgejo/docs) to explain to Forgejo users how to use this change.
- [x] I did not document these changes and I do not expect someone else to do it.

### Release notes

- [x] I do not want this change to show in the release notes.
- [ ] I want the title to show in the release notes with a link to this pull request.
- [ ] I want the content of the `release-notes/<pull request number>.md` to be be used for the release notes instead of the title.

Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/10658
Reviewed-by: Andreas Ahlenstorf <aahlenst@noreply.codeberg.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
This commit is contained in:
Mathieu Fenniak 2026-01-01 14:22:11 +01:00 committed by Mathieu Fenniak
parent 2566924ff8
commit c2649d4055
2 changed files with 2 additions and 20 deletions

View file

@ -298,7 +298,7 @@ func tryHandleIncompleteMatrix(ctx context.Context, blockedJob *actions_model.Ac
// it to be a "persistent incomplete" job with some error that needs to be reported to the user. If the
// re-evaluated job has a different job ID, then it's likely an expanded job -- such as from a reusable workflow
// -- which could have it's own `needs` that allows it to expand into a correct job in the future.
jobID, job := swf.Job()
jobID, _ := swf.Job()
if jobID == blockedJob.JobID {
if swf.IncompleteMatrix {
errorCode, errorDetails := persistentIncompleteMatrixError(blockedJob, swf.IncompleteMatrixNeeds)
@ -322,23 +322,6 @@ func tryHandleIncompleteMatrix(ctx context.Context, blockedJob *actions_model.Ac
// Return `true` to skip running this job in this invalid state
return true, nil
}
// When `InsertRunJobs` is run on a job (including `blockedJob` when it was persisted), the `needs` are
// removed from it's WorkflowPayload and moved up to the database record so that Forgejo can manage ordering
// the run execution. Now that `blockedJob` has been changed from incomplete->complete by reparsing it and
// providing its inputs, it would have been reparsed with an empty `needs` entry because of this earlier
// removal. And the returned record could have its own new `needs` if it was a reusable workflow with inner
// jobs. So, merge the old database list of needs with the new reparsed list of needs, and store them in
// the new `SingleWorkflow` which will be paseed to `InsertRunJobs` where it will be ripped out again.
//
// This is only relevant for `blockedJob` and not for any other generated jobs since they wouldn't have yet
// gone through `InsertRunJobs` to have this mutation performed.
newNeeds := append(job.Needs(), blockedJob.Needs...)
_ = job.RawNeeds.Encode(newNeeds)
err := swf.SetJob(jobID, job)
if err != nil {
return false, fmt.Errorf("failure to update needs in job: %w", err)
}
}
}

View file

@ -475,7 +475,7 @@ func Test_tryHandleIncompleteMatrix(t *testing.T) {
needs: map[string][]string{
"define-workflow-call": nil,
"inner my-workflow-input": nil,
"perform-workflow-call": {"define-workflow-call", "perform-workflow-call.inner_job"},
"perform-workflow-call": {"perform-workflow-call.inner_job"},
},
},
// Before reusable workflow expansion, there weren't any cases where evaluating a job in the job emitter could
@ -502,7 +502,6 @@ func Test_tryHandleIncompleteMatrix(t *testing.T) {
"inner define-runs-on my-workflow-input": nil,
"inner incomplete-job my-workflow-input": {"perform-workflow-call.define-runs-on"},
"perform-workflow-call": {
"define-workflow-call",
"perform-workflow-call.define-runs-on",
"perform-workflow-call.scalar-job",
},