Allow skipping forks and mirrors from being indexed #23187

Merged
merged 21 commits into from
May 25, 2023
Changes from 12 commits
Commits
397c81a
Allow skipping forks and mirrors from being indexed
techknowlogick Feb 28, 2023
2ef2557
Merge branch 'main' into allowlist-code-search
techknowlogick Mar 8, 2023
3aa4251
Merge branch 'main' into allowlist-code-search
techknowlogick Mar 14, 2023
5350d59
Merge branch 'main' into allowlist-code-search
techknowlogick Mar 14, 2023
1d12dab
Merge remote-tracking branch 'upstream/main' into allowlist-code-search
techknowlogick May 15, 2023
9547e99
skip repos from being indexed based on units
techknowlogick May 15, 2023
7960c8e
Update docs
techknowlogick May 16, 2023
2ea8c5b
update per feedback
techknowlogick May 16, 2023
5d56cb3
Merge branch 'main' into allowlist-code-search
techknowlogick May 16, 2023
e5129d6
fix default
techknowlogick May 16, 2023
7e360f6
Merge branch 'allowlist-code-search' of github.com:techknowlogick/git…
techknowlogick May 16, 2023
5f672f1
match docs
techknowlogick May 16, 2023
62a6a72
plural
techknowlogick May 16, 2023
0fa595f
Merge branch 'main' into allowlist-code-search
techknowlogick May 19, 2023
e9417f7
Merge branch 'main' into allowlist-code-search
techknowlogick May 22, 2023
7ac8788
Update config-cheat-sheet.en-us.md
techknowlogick May 23, 2023
ce4d58a
Merge branch 'main' into allowlist-code-search
techknowlogick May 23, 2023
61b1b9b
update variable name
techknowlogick May 23, 2023
ffc9e83
Merge branch 'main' into allowlist-code-search
techknowlogick May 23, 2023
af3253d
Merge branch 'main' into allowlist-code-search
techknowlogick May 25, 2023
c622015
Merge branch 'main' into allowlist-code-search
GiteaBot May 25, 2023
4 changes: 4 additions & 0 deletions custom/conf/app.example.ini
@@ -1380,6 +1380,10 @@ ROUTER = console
;; repo indexer by default disabled, since it uses a lot of disk space
;REPO_INDEXER_ENABLED = false
;;
;; repo indexer units, the items to index: `sources`, `forks`, `mirrors`, `templates`, or any comma-separated combination of them.
;; If empty, it defaults to `sources` only. To disable code indexing entirely, see REPO_INDEXER_ENABLED.
;REPO_INDEXER_REPO_TYPES = sources,forks,mirrors,templates
;;
;; Code search engine type, could be `bleve` or `elasticsearch`.
;REPO_INDEXER_TYPE = bleve
;;
@@ -465,6 +465,7 @@ relation to port exhaustion.
- `ISSUE_INDEXER_PATH`: **indexers/issues.bleve**: Index file used for issue search; available when ISSUE_INDEXER_TYPE is bleve and elasticsearch. Relative paths will be made absolute against _`AppWorkPath`_.

- `REPO_INDEXER_ENABLED`: **false**: Enables code search (uses a lot of disk space, about 6 times more than the repository size).
- `REPO_INDEXER_REPO_TYPES`: **sources,forks,mirrors,templates**: Repo indexer units. The items to index can be `sources`, `forks`, `mirrors`, `templates`, or any comma-separated combination of them. If empty, it defaults to `sources` only. To disable code indexing entirely, see `REPO_INDEXER_ENABLED`.
- `REPO_INDEXER_TYPE`: **bleve**: Code search engine type, could be `bleve` or `elasticsearch`.
- `REPO_INDEXER_PATH`: **indexers/repos.bleve**: Index file used for code search.
- `REPO_INDEXER_CONN_STR`: ****: Code indexer connection string, available when `REPO_INDEXER_TYPE` is elasticsearch. i.e. http://elastic:changeme@localhost:9200
27 changes: 27 additions & 0 deletions modules/indexer/code/indexer.go
@@ -19,6 +19,7 @@ import (
	"code.gitea.io/gitea/modules/queue"
	"code.gitea.io/gitea/modules/setting"
	"code.gitea.io/gitea/modules/timeutil"
	"code.gitea.io/gitea/modules/util"
)

// SearchResult result of performing a search in a repo
@@ -91,6 +92,32 @@ func index(ctx context.Context, indexer Indexer, repoID int64) error {
		return err
	}

	units := setting.Indexer.RepoIndexerUnits

	if len(units) == 0 {
		units = []string{"sources"}
	}

	// skip forks from being indexed if unit is not present
	if !util.SliceContains(units, "forks") && repo.IsFork {
		return nil
	}

	// skip mirrors from being indexed if unit is not present
	if !util.SliceContains(units, "mirrors") && repo.IsMirror {
		return nil
	}

	// skip templates from being indexed if unit is not present
	if !util.SliceContains(units, "templates") && repo.IsTemplate {
		return nil
	}

	// skip regular repos from being indexed if unit is not present
	if !util.SliceContains(units, "sources") && !repo.IsFork && !repo.IsMirror && !repo.IsTemplate {
		return nil
	}

	sha, err := getDefaultBranchSha(ctx, repo)
	if err != nil {
		return err
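For reviewers who want to reason about the skip checks in isolation, here is a minimal standalone sketch of the same decision as a pure function. It is not code from this PR: `repoKind`, `contains`, and `shouldIndex` are hypothetical names, and `contains` only stands in for `util.SliceContains`.

```go
package main

import "fmt"

// repoKind is a hypothetical stand-in for the three flags the hunk reads
// from the repository model (IsFork, IsMirror, IsTemplate).
type repoKind struct {
	IsFork     bool
	IsMirror   bool
	IsTemplate bool
}

// contains reports whether s holds v; it plays the role of util.SliceContains.
func contains(s []string, v string) bool {
	for _, x := range s {
		if x == v {
			return true
		}
	}
	return false
}

// shouldIndex mirrors the checks in the hunk: a repository is skipped as
// soon as one of its flags maps to a type that is not enabled.
func shouldIndex(types []string, r repoKind) bool {
	if len(types) == 0 {
		types = []string{"sources"}
	}
	if r.IsFork && !contains(types, "forks") {
		return false
	}
	if r.IsMirror && !contains(types, "mirrors") {
		return false
	}
	if r.IsTemplate && !contains(types, "templates") {
		return false
	}
	if !r.IsFork && !r.IsMirror && !r.IsTemplate && !contains(types, "sources") {
		return false
	}
	return true
}

func main() {
	types := []string{"sources", "templates"}
	fmt.Println(shouldIndex(types, repoKind{}))                               // true
	fmt.Println(shouldIndex(types, repoKind{IsFork: true}))                   // false
	fmt.Println(shouldIndex(types, repoKind{IsTemplate: true}))               // true
	fmt.Println(shouldIndex(types, repoKind{IsFork: true, IsTemplate: true})) // false
}
```

One consequence of checking each flag separately: a repository that is both a fork and a template is indexed only when both `forks` and `templates` are enabled.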
3 changes: 3 additions & 0 deletions modules/setting/indexer.go
@@ -24,6 +24,7 @@ var Indexer = struct {
	StartupTimeout time.Duration

	RepoIndexerEnabled bool
	RepoIndexerUnits []string
	RepoType string
	RepoPath string
	RepoConnStr string
@@ -40,6 +41,7 @@ var Indexer = struct {
	IssueIndexerName: "gitea_issues",

	RepoIndexerEnabled: false,
	RepoIndexerUnits: []string{"sources", "forks", "mirrors", "templates"},
	RepoType: "bleve",
	RepoPath: "indexers/repos.bleve",
	RepoConnStr: "",
@@ -71,6 +73,7 @@ func loadIndexerFrom(rootCfg ConfigProvider) {
	Indexer.IssueIndexerName = sec.Key("ISSUE_INDEXER_NAME").MustString(Indexer.IssueIndexerName)

	Indexer.RepoIndexerEnabled = sec.Key("REPO_INDEXER_ENABLED").MustBool(false)
	Indexer.RepoIndexerUnits = strings.Split(sec.Key("REPO_INDEXER_REPO_TYPES").MustString("sources,forks,mirrors,templates"), ",")
	Indexer.RepoType = sec.Key("REPO_INDEXER_TYPE").MustString("bleve")
	Indexer.RepoPath = filepath.ToSlash(sec.Key("REPO_INDEXER_PATH").MustString(filepath.ToSlash(filepath.Join(AppDataPath, "indexers/repos.bleve"))))
	if !filepath.IsAbs(Indexer.RepoPath) {
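A note on the comma-splitting in this hunk: in Go, `strings.Split` with a non-empty separator always returns at least one element, so splitting an empty string yields a slice holding one empty string rather than an empty slice, and entries keep any surrounding whitespace. The sketch below shows one possible way to normalize such a value. It is an illustration under assumptions, not part of this PR: `parseRepoTypes` and `knownRepoTypes` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// knownRepoTypes lists the four values named in the documentation hunk.
var knownRepoTypes = map[string]bool{
	"sources":   true,
	"forks":     true,
	"mirrors":   true,
	"templates": true,
}

// parseRepoTypes (hypothetical helper) splits raw on commas, trims
// whitespace, drops empty entries, and falls back to "sources" when nothing
// usable remains. Unknown entries are returned separately so that a caller
// could log them instead of silently ignoring a typo such as "fork".
func parseRepoTypes(raw string) (types, unknown []string) {
	for _, part := range strings.Split(raw, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		if knownRepoTypes[part] {
			types = append(types, part)
		} else {
			unknown = append(unknown, part)
		}
	}
	if len(types) == 0 {
		types = []string{"sources"}
	}
	return types, unknown
}

func main() {
	// Splitting an empty string gives one empty element, not zero elements.
	fmt.Println(len(strings.Split("", ","))) // 1

	types, unknown := parseRepoTypes("sources, fork ,mirrors")
	fmt.Println(types, unknown) // [sources mirrors] [fork]

	types, unknown = parseRepoTypes("")
	fmt.Println(types, unknown) // [sources] []
}
```

Returning the unknown entries rather than discarding them is a design choice for the sketch; it makes a singular/plural mismatch visible at load time.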