jojo/services/actions
Mathieu Fenniak 0ae6235386 fix: allow Actions runner to recover tasks lost during fetching from intermittent errors (#11401)
Probably fixes (or improves, at least) https://code.forgejo.org/forgejo/runner/issues/1391, paired with the runner implementation https://code.forgejo.org/forgejo/runner/pulls/1393.

When the FetchTask() API is invoked to create a task, unpreventable environmental errors may occur, such as network disconnects and timeouts. These errors can occur after the server has already assigned a task to the runner during the API call, in which case the task is lost between the two systems: the server believes it is assigned to the runner, but the runner never received it. This can cause jobs to appear stuck at "Set up job".

The solution implemented here is idempotency in the FetchTask() API call, which means that the "same" FetchTask() API call is expected to return the same values. Specifically, the runner creates a unique identifier, which defines the sameness of the call and is transmitted to the server as an `x-runner-request-key` header with each FetchTask() invocation. The runner retains the value until the API call receives a successful response. The server returns the same tasks if a second (or Nth) call is received with the same `x-runner-request-key` header. To accomplish this, it records the `x-runner-request-key` value used with each request that assigns tasks.
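The request-key handshake described above can be sketched as follows. This is a minimal in-memory model, not the Forgejo or runner implementation: the `server` and `runner` types, `fetchOnce`, `newRequestKey`, and the `dropResponses` fault injection are all hypothetical names introduced for illustration, and the key is passed as a plain argument rather than as an HTTP header.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
)

// server is a hypothetical stand-in for the server side of FetchTask().
// It remembers which task each request key was assigned.
type server struct {
	nextTaskID int64
	assigned   map[string]int64 // request key -> assigned task ID
}

// fetchTask assigns a new task the first time a key is seen and returns
// the same task for every repeat of that key.
func (s *server) fetchTask(requestKey string) int64 {
	if id, ok := s.assigned[requestKey]; ok {
		return id
	}
	s.nextTaskID++
	s.assigned[requestKey] = s.nextTaskID
	return s.nextTaskID
}

// runner is a hypothetical stand-in for the runner side.
type runner struct {
	srv           *server
	pendingKey    string // retained until a call succeeds
	dropResponses int    // number of responses to "lose" (simulated network errors)
}

var errResponseLost = errors.New("response lost in transit")

// newRequestKey returns a random identifier for one logical fetch.
func newRequestKey() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// fetchOnce performs one FetchTask() attempt. The key is only cleared after
// a successful response, so a retry repeats the "same" call.
func (r *runner) fetchOnce() (int64, error) {
	if r.pendingKey == "" {
		r.pendingKey = newRequestKey()
	}
	id := r.srv.fetchTask(r.pendingKey) // the server assigns before the reply is sent
	if r.dropResponses > 0 {
		r.dropResponses--
		return 0, errResponseLost
	}
	r.pendingKey = ""
	return id, nil
}
```

In this model, a runner whose first two responses are lost still ends up with the task that was assigned on the first attempt, and the server assigns no extra tasks along the way.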

As a complication, the Forgejo server is unable to return the same `${{ secrets.forgejo_token }}` for the task because the server stores only a one-way hash of that value in the database. To resolve this, the server regenerates the token when retrieving tasks for a second time.
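The token regeneration step can be illustrated with a small sketch. Everything here is assumed for illustration: the `storedTask` type, the `regenerateToken` helper, and the use of SHA-256 are hypothetical and do not reflect Forgejo's actual schema or hashing scheme.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
)

// storedTask is a hypothetical database row: only the hash of the token is
// kept, so the plaintext cannot be read back later.
type storedTask struct {
	ID        int64
	TokenHash string
}

// hashToken returns a one-way hash of the token (SHA-256 is an assumption).
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// newToken returns a fresh random plaintext token.
func newToken() string {
	b := make([]byte, 20)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// regenerateToken issues a new plaintext token for an already-assigned task
// and replaces the stored hash, since the original plaintext is unrecoverable.
func regenerateToken(task *storedTask) string {
	token := newToken()
	task.TokenHash = hashToken(token)
	return token
}
```

One consequence in this sketch is that the previously issued token no longer matches the stored hash. That is acceptable under the assumption that the runner never received the first response.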

## Checklist

The [contributor guide](https://forgejo.org/docs/next/contributor/) contains information that will be helpful to first time contributors. There also are a few [conditions for merging Pull Requests in Forgejo repositories](https://codeberg.org/forgejo/governance/src/branch/main/PullRequestsAgreement.md). You are also welcome to join the [Forgejo development chatroom](https://matrix.to/#/#forgejo-development:matrix.org).

### Tests for Go changes

(can be removed for JavaScript changes)

- I added test coverage for Go changes...
  - [x] in their respective `*_test.go` for unit tests.
  - [x] in the `tests/integration` directory if it involves interactions with a live Forgejo server.
- I ran...
  - [x] `make pr-go` before pushing

### Documentation

- [ ] I created a pull request [to the documentation](https://codeberg.org/forgejo/docs) to explain to Forgejo users how to use this change.
- [x] I did not document these changes and I do not expect someone else to do it.

### Release notes

- [x] This change will be noticed by a Forgejo user or admin (feature, bug fix, performance, etc.). I suggest to include a release note for this change.
- [ ] This change is not visible to a Forgejo user or admin (refactor, dependency upgrade, etc.). I think there is no need to add a release note for this change.

*The decision if the pull request will be shown in the release notes is up to the mergers / release team.*

The content of the `release-notes/<pull request number>.md` file will serve as the basis for the release notes. If the file does not exist, the title of the pull request will be used instead.

Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/11401
Reviewed-by: Andreas Ahlenstorf <aahlenst@noreply.codeberg.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
2026-02-22 23:24:38 +01:00
Test_checkJobsOfRun fix: newly expanded dynamic matrix jobs can become stuck in a 'blocked' state (#11184) 2026-02-07 14:36:49 +01:00
Test_tryHandleIncompleteMatrix fix: empty dynamic matrix can leave action run hanging incomplete (#11063) 2026-01-27 17:10:59 +01:00
Test_tryHandleWorkflowCallOuterJob fix: re-running an expanded reusable workflow causes duplicate "attempt 1" job (#10666) 2026-01-02 15:26:11 +01:00
TestActions_CancelOrApproveRun refactor: migrate from lib/pq to jackc/pgx (#10219) 2025-11-30 17:47:45 +01:00
TestActions_consistencyCheckRun feat(actions): support referencing ${{ needs... }} variables in runs-on (#10308) 2025-12-05 18:14:43 +01:00
TestActionsNotifier_IsTrusted chore(refactor): replace ifNeedApproval with trust management 2025-11-06 11:07:39 +01:00
TestActionsTrust_GetPullRequestUserIsTrustedWithActions feat: trust management for runs created from a forked pull request 2025-11-06 11:07:38 +01:00
TestCancelAbandonedJobs fix: don't abandon Action jobs waiting for approval (#11145) 2026-02-04 16:00:18 +01:00
TestCancelPreviousJobs refactor: migrate from lib/pq to jackc/pgx (#10219) 2025-11-30 17:47:45 +01:00
TestCancelPreviousWithConcurrencyGroup refactor: migrate from lib/pq to jackc/pgx (#10219) 2025-11-30 17:47:45 +01:00
TestCreateCommitStatus fix: don't duplicate commit status records on workflows with empty name (#10678) 2026-01-02 19:02:10 +01:00
TestExpandLocalReusableWorkflows feat: expand reusable workflow calls into their inner jobs (#10525) 2025-12-24 20:47:21 +01:00
TestGetSecretsOfJob feat: support jobs.<job_id>.secrets with reusable workflow expansion (#10627) 2025-12-30 17:33:21 +01:00
TestServiceActions_startTask fix: a corrupted Forgejo Actions scheduled workflow is disabled (#8942) 2025-08-18 22:45:10 +02:00
TestServicesActions_TransferLingeringLogs refactor: migrate from lib/pq to jackc/pgx (#10219) 2025-11-30 17:47:45 +01:00
auth.go feat: add OIDC workload identity federation support (#10481) 2026-01-15 03:39:00 +01:00
auth_test.go feat: add OIDC workload identity federation support (#10481) 2026-01-15 03:39:00 +01:00
cleanup.go feat: implement ephemeral runners (#9962) 2026-02-16 18:56:56 +01:00
cleanup_test.go feat: implement ephemeral runners (#9962) 2026-02-16 18:56:56 +01:00
clear_tasks.go fix: ensure actions logs are transferred when a task is done (#10008) 2026-02-22 05:11:22 +01:00
clear_tasks_test.go fix: don't abandon Action jobs waiting for approval (#11145) 2026-02-04 16:00:18 +01:00
commit_status.go fix: retain Forgejo Action's commit_status entries with distinct descriptions (#10696) 2026-01-05 14:47:27 +01:00
commit_status_test.go fix: retain Forgejo Action's commit_status entries with distinct descriptions (#10696) 2026-01-05 14:47:27 +01:00
context.go feat: add Forgejo server version to runner context (#10642) 2025-12-30 22:39:34 +01:00
context_test.go feat: support workflow inputs on expanded reusable workflows (#10614) 2025-12-29 15:37:44 +01:00
init.go chore: branding import path (#7337) 2025-03-27 19:40:14 +00:00
interface.go feat: add HTTP API endpoint for runner registration (#10677) 2026-01-05 04:59:04 +01:00
job_emitter.go fix: newly expanded dynamic matrix jobs can become stuck in a 'blocked' state (#11184) 2026-02-07 14:36:49 +01:00
job_emitter_test.go fix: newly expanded dynamic matrix jobs can become stuck in a 'blocked' state (#11184) 2026-02-07 14:36:49 +01:00
log.go fix: garbage collect lingering actions logs (#10009) 2025-11-18 18:59:01 +01:00
log_test.go chore: fix typos throughout the codebase (#10753) 2026-01-26 22:57:33 +01:00
main_test.go chore: move all test blank imports in a single package (#10662) 2026-01-02 05:32:32 +01:00
notifier.go fix: cancel runs pending approval when a PR is closed (#11134) 2026-02-02 23:20:41 +01:00
notifier_helper.go fix: prevent intermittent test failures caused by uncancellable tasks (#10713) 2026-01-06 15:34:43 +01:00
notifier_helper_test.go feat: expand reusable workflow calls into their inner jobs (#10525) 2025-12-24 20:47:21 +01:00
rerun.go chore: branding import path (#7337) 2025-03-27 19:40:14 +00:00
rerun_test.go chore: branding import path (#7337) 2025-03-27 19:40:14 +00:00
reusable_workflows.go feat: expand reusable workflow calls into their inner jobs (#10525) 2025-12-24 20:47:21 +01:00
reusable_workflows_test.go feat: expand reusable workflow calls into their inner jobs (#10525) 2025-12-24 20:47:21 +01:00
run.go chore: fix typos throughout the codebase (#10753) 2026-01-26 22:57:33 +01:00
run_test.go feat(actions): support referencing ${{ needs... }} variables in runs-on (#10308) 2025-12-05 18:14:43 +01:00
schedule_tasks.go fix: make concurrency group job cancellation effect runs that are failed (#10863) 2026-01-16 10:54:01 +01:00
schedule_tasks_test.go fix: make concurrency group job cancellation effect runs that are failed (#10863) 2026-01-16 10:54:01 +01:00
secret.go feat: support jobs.<job_id>.secrets with reusable workflow expansion (#10627) 2025-12-30 17:33:21 +01:00
secret_test.go feat: support jobs.<job_id>.secrets with reusable workflow expansion (#10627) 2025-12-30 17:33:21 +01:00
task.go fix: allow Actions runner to recover tasks lost during fetching from intermittent errors (#11401) 2026-02-22 23:24:38 +01:00
task_test.go feat: add OIDC workload identity federation support (#10481) 2026-01-15 03:39:00 +01:00
trust.go fix: cancel runs pending approval when a PR is closed (#11134) 2026-02-02 23:20:41 +01:00
trust_test.go fix: cancel runs pending approval when a PR is closed (#11134) 2026-02-02 23:20:41 +01:00
variables.go fix: actions variable and secret names validation (#10682) 2026-01-14 04:19:21 +01:00
variables_test.go fix: allow Forgejo Actions environment variables starting with CI (#8850) 2025-08-10 22:56:16 +02:00
workflows.go feat: expand reusable workflow calls into their inner jobs (#10525) 2025-12-24 20:47:21 +01:00
workflows_test.go Update module code.forgejo.org/forgejo/runner/v11 to v12 (forgejo) (#10213) 2025-11-23 15:58:57 +01:00