fix: realign indexes on the 'action' table (#10040)
Fixes #9963. This realigns all the indexes on the `action` table to best match their intended usages.
New:
- `IDX_action_created_unix (created_unix)`
- Intended for usage in `DeleteOldActions`.
- `IDX_action_repo_id_created_unix (repo_id, created_unix)`
- Intended for usage when fetching action feeds for a repo and a team, with the same logic as that described below in `IDX_action_user_id_created_unix`.
- `IDX_action_repo_id_op_type (repo_id, op_type)`
- Intended for `DeleteIssueActions` when it searches for CreateIssue & CreatePullRequest actions for cleanup. Could be optimized further with a denormalization of the issue identifier into a field, but there's no current evidence that this is required.
Replaced:
- `IDX_action_c_u (created_unix, user_id)`
- Replaced with `IDX_action_user_id_created_unix (user_id, created_unix)`. When action feeds are created w/ `ORDER BY created_unix DESC LIMIT 20`, an index beginning with `created_unix` will have to index scan until it can satisfy 20 records; the `user_id` portion of the index is effectively useless until two records appear at the same time. By inverting the order, the database will be able to identify the records created by a user and then pop the most recent 20 from the index order.
- At the scale of database I have access to, the performance difference is unmeasurable. This change is supported by theoretical grounds and the findings of #9963, but no experimental evidence.
Removed:
- `IDX_action_user_id (user_id)`
- Redundant with the new `IDX_action_user_id_created_unix`.
- `IDX_action_r_u (repo_id, user_id)`
- No clear consumer for this index.
Retained with no modification:
- `IDX_action_comment_id (comment_id)`
- Used in `DeleteIssueActions`.
- `IDX_action_au_r_c_u (act_user_id, repo_id, created_unix, user_id)`
- Heat map generation.
## Checklist
The [contributor guide](https://forgejo.org/docs/next/contributor/) contains information that will be helpful to first time contributors. There also are a few [conditions for merging Pull Requests in Forgejo repositories](https://codeberg.org/forgejo/governance/src/branch/main/PullRequestsAgreement.md). You are also welcome to join the [Forgejo development chatroom](https://matrix.to/#/#forgejo-development:matrix.org).
### Tests
- I added test coverage for Go changes...
- [ ] in their respective `*_test.go` for unit tests.
- [ ] in the `tests/integration` directory if it involves interactions with a live Forgejo server.
- I added test coverage for JavaScript changes...
- [ ] in `web_src/js/*.test.js` if it can be unit tested.
- [ ] in `tests/e2e/*.test.e2e.js` if it requires interactions with a live Forgejo server (see also the [developer guide for JavaScript testing](https://codeberg.org/forgejo/forgejo/src/branch/forgejo/tests/e2e/README.md#end-to-end-tests)).
### Documentation
- [ ] I created a pull request [to the documentation](https://codeberg.org/forgejo/docs) to explain to Forgejo users how to use this change.
- [x] I did not document these changes and I do not expect someone else to do it.
### Release notes
- [ ] I do not want this change to show in the release notes.
- [x] I want the title to show in the release notes with a link to this pull request.
- [ ] I want the content of the `release-notes/<pull request number>.md` to be be used for the release notes instead of the title.
Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/10040
Reviewed-by: Gusted <gusted@noreply.codeberg.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
2025-11-18 18:34:25 +01:00
|
|
|
// Copyright 2025 The Forgejo Authors. All rights reserved.
|
|
|
|
|
// SPDX-License-Identifier: GPL-3.0-or-later
|
|
|
|
|
|
|
|
|
|
package forgejo_migrations
|
|
|
|
|
|
|
|
|
|
import (
|
|
|
|
|
"xorm.io/xorm"
|
|
|
|
|
"xorm.io/xorm/schemas"
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
func init() {
|
|
|
|
|
registerMigration(&Migration{
|
|
|
|
|
Description: "rework indexes on table action",
|
|
|
|
|
Upgrade: reworkActionIndexes,
|
|
|
|
|
})
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
type v14bAction struct {
|
|
|
|
|
ID int64 `xorm:"pk autoincr"`
|
|
|
|
|
UserID int64 // Receiver user id.
|
|
|
|
|
OpType int
|
|
|
|
|
ActUserID int64 // Action user id.
|
|
|
|
|
RepoID int64
|
|
|
|
|
CommentID int64 `xorm:"INDEX"` // indexed to support `DeleteIssueActions`
|
|
|
|
|
CreatedUnix int64 `xorm:"created INDEX"` // indexed to support `DeleteOldActions`
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// TableName sets the name of this table
|
|
|
|
|
func (a *v14bAction) TableName() string {
|
|
|
|
|
return "action"
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// TableIndices implements xorm's TableIndices interface. It is used here to ensure indexes with specified column order
|
|
|
|
|
// are created, which can't be created through xorm tags on the struct.
|
|
|
|
|
func (a *v14bAction) TableIndices() []*schemas.Index {
|
|
|
|
|
// Index to support getUserHeatmapData, which searches for data that is visible-to (user_id) and performed-by
|
|
|
|
|
// (act_user_id) a user, but only includes visible repos (repo_id).
|
|
|
|
|
actUserIndex := schemas.NewIndex("au_r_c_u", schemas.IndexType)
|
|
|
|
|
actUserIndex.AddColumn("act_user_id", "repo_id", "created_unix", "user_id")
|
|
|
|
|
|
|
|
|
|
// GetFeeds is a common access point to Action and requires that all action feeds be queried based upon one of
|
|
|
|
|
// user_id (opts.RequestedUser), repo_id (opts.RequestedTeam... kinda), and/or repo_id (opts.RequestedRepo), and
|
|
|
|
|
// then the results are ordered by created_unix and paginated. The most efficient indexes to support those queries
|
|
|
|
|
// are:
|
|
|
|
|
requestedUser := schemas.NewIndex("user_id_created_unix", schemas.IndexType)
|
|
|
|
|
requestedUser.AddColumn("user_id", "created_unix")
|
|
|
|
|
requestedRepo := schemas.NewIndex("repo_id_created_unix", schemas.IndexType)
|
|
|
|
|
requestedRepo.AddColumn("repo_id", "created_unix")
|
|
|
|
|
|
|
|
|
|
// To support `DeleteIssueActions` search for createissue / createpullrequest actions; this isn't a great search
|
|
|
|
|
// because `DeleteIssueActions` searches by `content` as well, but it should be sufficient performance-wise for
|
|
|
|
|
// infrequent deleting of issues.
|
|
|
|
|
repoOpType := schemas.NewIndex("repo_id_op_type", schemas.IndexType)
|
|
|
|
|
repoOpType.AddColumn("repo_id", "op_type")
|
|
|
|
|
|
|
|
|
|
indices := []*schemas.Index{actUserIndex, requestedUser, requestedRepo, repoOpType}
|
|
|
|
|
|
|
|
|
|
return indices
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
func reworkActionIndexes(x *xorm.Engine) error {
|
|
|
|
|
if err := dropIndexIfExists(x, "action", "IDX_action_c_u"); err != nil {
|
|
|
|
|
return err
|
|
|
|
|
}
|
|
|
|
|
if err := dropIndexIfExists(x, "action", "IDX_action_r_u"); err != nil {
|
|
|
|
|
return err
|
|
|
|
|
}
|
|
|
|
|
if err := dropIndexIfExists(x, "action", "IDX_action_user_id"); err != nil {
|
|
|
|
|
return err
|
|
|
|
|
}
|
|
|
|
|
|
ci: introduce `semgrep` to prevent using `xorm.Sync()` incorrectly in new migrations (#11142)
Adds a CI check which detects any usage of xorm's `Sync` method that doesn't include `IgnoreDropIndices: true`, and causes an error.
`semgrep` is a semantic grep tool that allows for the relatively easy authoring of linting tools that are customized to a project's specific needs, rather than generic like `golangci` and related tools. Although `semgrep` offers a suite of out-of-the-box rules (and a paid set of rules), neither of those are used here -- only one Forgejo-specific rule is added in `.semgrep/xorm.yaml`.
My intent with this change is to introduce the idea and infrastructure of `semgrep` with a single minimal rule. Once in-place, this will become a tool that we can use when we recognize bad coding patterns and wish to correct them permanently, rather than relying on human code review. While generic linting tools do this well for general patterns, this will allow Forgejo to apply domain-specific checks. For example, in #11112, an error indicates that it might be appropriate for us to always use `.StorageEngine("InnoDB")` when using an xorm engine -- if we made that determination, it could be cemented in-place with a `semgrep` rule relatively easily.
This specific rule looks for any access for xorm's `Sync` or `SyncWithOptions` methods on the `*xorm.Engine` or `*xorm.Session`. They are then considered errors if they don't include `IgnoreDropIndices: true`. This is *typically* correct and safe, but can also be ignored when specifically needed. In the `.semgrep/tests` folder, test code is added which validates that the `semgrep` rule matches the expected patterns; this self-test is run before `semgrep` runs on the PR in CI.
As a demonstration, when `IgnoreDropIndices` is removed from a migration, here's an error: https://codeberg.org/forgejo/forgejo/actions/runs/135750/jobs/12/attempt/1
```
models/forgejo_migrations/v14b_add-action_run-preexecutionerrorcode.go
❯❯❱ semgrep.xorm-sync-missing-ignore-drop-indices
xorm Sync operation may drop indices if used on an incomplete bean definition for an existing table.
Use SyncWithOptions with IgnoreDropIndices: true instead.
22┆ _, err := x.SyncWithOptions(xorm.SyncOptions{}, new(ActionRun))
```
## Checklist
The [contributor guide](https://forgejo.org/docs/next/contributor/) contains information that will be helpful to first time contributors. There also are a few [conditions for merging Pull Requests in Forgejo repositories](https://codeberg.org/forgejo/governance/src/branch/main/PullRequestsAgreement.md). You are also welcome to join the [Forgejo development chatroom](https://matrix.to/#/#forgejo-development:matrix.org).
### Tests
- I added test coverage for Go changes...
- [ ] in their respective `*_test.go` for unit tests.
- [ ] in the `tests/integration` directory if it involves interactions with a live Forgejo server.
- I added test coverage for JavaScript changes...
- [ ] in `web_src/js/*.test.js` if it can be unit tested.
- [ ] in `tests/e2e/*.test.e2e.js` if it requires interactions with a live Forgejo server (see also the [developer guide for JavaScript testing](https://codeberg.org/forgejo/forgejo/src/branch/forgejo/tests/e2e/README.md#end-to-end-tests)).
### Documentation
- [ ] I created a pull request [to the documentation](https://codeberg.org/forgejo/docs) to explain to Forgejo users how to use this change.
- [x] I did not document these changes and I do not expect someone else to do it.
### Release notes
- [ ] This change will be noticed by a Forgejo user or admin (feature, bug fix, performance, etc.). I suggest to include a release note for this change.
- [x] This change is not visible to a Forgejo user or admin (refactor, dependency upgrade, etc.). I think there is no need to add a release note for this change.
*The decision if the pull request will be shown in the release notes is up to the mergers / release team.*
The content of the `release-notes/<pull request number>.md` file will serve as the basis for the release notes. If the file does not exist, the title of the pull request will be used instead.
Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/11142
Reviewed-by: Gusted <gusted@noreply.codeberg.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
2026-02-07 21:52:43 +01:00
|
|
|
return x.Sync(new(v14bAction)) // nosemgrep:xorm-sync-missing-ignore-drop-indices
|
fix: realign indexes on the 'action' table (#10040)
Fixes #9963. This realigns all the indexes on the `action` table to best match their intended usages.
New:
- `IDX_action_created_unix (created_unix)`
- Intended for usage in `DeleteOldActions`.
- `IDX_action_repo_id_created_unix (repo_id, created_unix)`
- Intended for usage when fetching action feeds for a repo and a team, with the same logic as that described below in `IDX_action_user_id_created_unix`.
- `IDX_action_repo_id_op_type (repo_id, op_type)`
- Intended for `DeleteIssueActions` when it searches for CreateIssue & CreatePullRequest actions for cleanup. Could be optimized further with a denormalization of the issue identifier into a field, but there's no current evidence that this is required.
Replaced:
- `IDX_action_c_u (created_unix, user_id)`
- Replaced with `IDX_action_user_id_created_unix (user_id, created_unix)`. When action feeds are created w/ `ORDER BY created_unix DESC LIMIT 20`, an index beginning with `created_unix` will have to index scan until it can satisfy 20 records; the `user_id` portion of the index is effectively useless until two records appear at the same time. By inverting the order, the database will be able to identify the records created by a user and then pop the most recent 20 from the index order.
- At the scale of database I have access to, the performance difference is unmeasurable. This change is supported by theoretical grounds and the findings of #9963, but no experimental evidence.
Removed:
- `IDX_action_user_id (user_id)`
- Redundant with the new `IDX_action_user_id_created_unix`.
- `IDX_action_r_u (repo_id, user_id)`
- No clear consumer for this index.
Retained with no modification:
- `IDX_action_comment_id (comment_id)`
- Used in `DeleteIssueActions`.
- `IDX_action_au_r_c_u (act_user_id, repo_id, created_unix, user_id)`
- Heat map generation.
## Checklist
The [contributor guide](https://forgejo.org/docs/next/contributor/) contains information that will be helpful to first time contributors. There also are a few [conditions for merging Pull Requests in Forgejo repositories](https://codeberg.org/forgejo/governance/src/branch/main/PullRequestsAgreement.md). You are also welcome to join the [Forgejo development chatroom](https://matrix.to/#/#forgejo-development:matrix.org).
### Tests
- I added test coverage for Go changes...
- [ ] in their respective `*_test.go` for unit tests.
- [ ] in the `tests/integration` directory if it involves interactions with a live Forgejo server.
- I added test coverage for JavaScript changes...
- [ ] in `web_src/js/*.test.js` if it can be unit tested.
- [ ] in `tests/e2e/*.test.e2e.js` if it requires interactions with a live Forgejo server (see also the [developer guide for JavaScript testing](https://codeberg.org/forgejo/forgejo/src/branch/forgejo/tests/e2e/README.md#end-to-end-tests)).
### Documentation
- [ ] I created a pull request [to the documentation](https://codeberg.org/forgejo/docs) to explain to Forgejo users how to use this change.
- [x] I did not document these changes and I do not expect someone else to do it.
### Release notes
- [ ] I do not want this change to show in the release notes.
- [x] I want the title to show in the release notes with a link to this pull request.
- [ ] I want the content of the `release-notes/<pull request number>.md` to be be used for the release notes instead of the title.
Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/10040
Reviewed-by: Gusted <gusted@noreply.codeberg.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
2025-11-18 18:34:25 +01:00
|
|
|
}
|