diff --git a/README.md b/README.md index 4698eb9d..cea92a58 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -*[English](README_en.md) ∙ [简体中文](README_cn.md)* +*[English](README_en.md) ∙ [简体中文](README_cn.md) ∙ [日本語](README_ja.md)* `CSGHub Server` is a part of the open source and reliable large model assets management platform - [CSGHub](https://github.com/OpenCSGs/CSGHub/). It focuses on management of models、datasets and other LLM assets through REST API。 @@ -44,7 +44,7 @@ docker-compose -f docker-compose.yml up -d - Enable content moderation on demand, and choose any third-party content moderation service. ## Roadmap -- [ ] Support more Git Servers: Currently supports Gitea, and plans to support mainstream Git repositories in the future. +- [x] Support more Git Servers: Currently supports Gitea, and plans to support mainstream Git repositories in the future. - [x] Git LFS: Git LFS supports large files, and supports Git command operations and online download through the Web UI. - [x] DataSet online viewer: Data set preview, supports the Top20/TopN loading preview of LFS format data sets. - [x] Model/Dataset AutoTag: Supports custom metadata and automatic extraction of model/dataset tags. 
diff --git a/README_cn.md b/README_cn.md index 4a502e29..3a67bd91 100644 --- a/README_cn.md +++ b/README_cn.md @@ -1,4 +1,4 @@ -*[English](README_en.md) ∙ [简体中文](README_cn.md)* +*[English](README_en.md) ∙ [简体中文](README_cn.md) ∙ [日本語](README_ja.md)* CSGHub Server是开源、可信的大模型资产管理平台[CSGHub](https://github.com/OpenCSGs/CSGHub/)的服务端部分开源项目,提供基于REST API的模型、数据集等大模型资产管理功能。 @@ -43,7 +43,7 @@ docker compose -f docker-compose.yml up -d - 按需开启内容审核,选择任意第三方内容审核服务 ## 技术规划 -- [ ] 支持更多Git Server: 目前内置了对gitea的支持,未来计划实现对主流Git仓库的支持 +- [x] 支持更多Git Server: 目前内置了对gitea的支持,未来计划实现对主流Git仓库的支持 - [x] 支持Git LFS: Git LFS支持超大文件, 支持git命令操作和Web UI在线下载 - [x] 数据集在线预览: 数据集预览,支持LFS格式数据集的Top20/TopN加载预览 - [x] 模型和数据集自动打标签::支持自定义元数据和自动化提取模型/数据集标签 diff --git a/README_en.md b/README_en.md index 4698eb9d..1ad3b723 100644 --- a/README_en.md +++ b/README_en.md @@ -44,7 +44,7 @@ docker-compose -f docker-compose.yml up -d - Enable content moderation on demand, and choose any third-party content moderation service. ## Roadmap -- [ ] Support more Git Servers: Currently supports Gitea, and plans to support mainstream Git repositories in the future. +- [x] Support more Git Servers: Currently supports Gitea, and plans to support mainstream Git repositories in the future. - [x] Git LFS: Git LFS supports large files, and supports Git command operations and online download through the Web UI. - [x] DataSet online viewer: Data set preview, supports the Top20/TopN loading preview of LFS format data sets. - [x] Model/Dataset AutoTag: Supports custom metadata and automatic extraction of model/dataset tags. 
diff --git a/README_ja.md b/README_ja.md new file mode 100644 index 00000000..a9be0f0d --- /dev/null +++ b/README_ja.md @@ -0,0 +1,75 @@ +*[English](README_en.md) ∙ [简体中文](README_cn.md) ∙ [日本語](README_ja.md)* + +`CSGHub Server`は、オープンソースで信頼性の高い大規模モデル資産管理プラットフォーム - [CSGHub](https://github.com/OpenCSGs/CSGHub/)の一部です。REST APIを通じてモデル、データセット、その他のLLM資産の管理に焦点を当てています。 + +## 主な機能: +- ユーザーと組織の作成と管理 +- モデルとデータセットのラベルの自動タグ付け +- ユーザー、組織、モデル、データの検索 +- データセットファイルのオンラインプレビュー、例えば `.parquet` ファイル +- テキストと画像のコンテンツモデレーション +- 個々のファイルのダウンロード、LFSファイルを含む +- モデルとデータセットのアクティビティデータの追跡、ダウンロード数やいいね数など + +## デモ +CSGHubの機能と使用方法を迅速に理解するために、デモビデオを録画しました。このビデオを視聴することで、プログラムの主な機能と操作手順を迅速に理解できます。 +- CSGHubのデモビデオは以下の通りです。また、[YouTube](https://www.youtube.com/watch?v=SFDISpqowXs)や[Bilibili](https://www.bilibili.com/video/BV12T4y187bv/)でもご覧いただけます。 + + +強力な管理機能を体験するには、[OpenCSGウェブサイト](https://portal.opencsg.com/models)をご覧ください。 + +## クイックスタート +> システムリソース要件: 4c CPU/8GBメモリ + +Dockerをインストールしてください。このプロジェクトはUbuntu22環境でテストされています。 + +docker-composeを使用してローカライズされた`CSGHub Server`サービスを迅速にデプロイできます: +```shell +# APIトークンは少なくとも128文字の長さである必要があり、csghub-serverへのHTTPリクエストにはAPIトークンをBearerトークンとして送信して認証を行う必要があります。 +export STARHUB_SERVER_API_TOKEN= +mkdir -m 777 gitea minio_data +curl -L https://raw.githubusercontent.com/OpenCSGs/csghub-server/main/docker-compose.yml -o docker-compose.yml +docker-compose -f docker-compose.yml up -d +``` + +## 技術アーキテクチャ +
+ csghub-server architecture +
+ +### 拡張性とカスタマイズ性 +- Gitea、GitLabなどの異なるGitサーバーをサポート +- LFSストレージシステムの柔軟な構成をサポートし、S3プロトコルに対応したローカルまたは任意のサードパーティクラウドストレージサービスを使用できます +- 必要に応じてコンテンツモデレーションを有効にし、任意のサードパーティコンテンツモデレーションサービスを選択できます + +## ロードマップ +- [x] さらに多くのGitサーバーをサポート: 現在はGiteaをサポートしており、将来的には主流のGitリポジトリをサポートする予定です。 +- [x] Git LFS: Git LFSは大きなファイルをサポートし、Gitコマンド操作とWeb UIを通じたオンラインダウンロードをサポートします。 +- [x] データセットのオンラインビューア: データセットのプレビュー、LFS形式のデータセットのTop20/TopNの読み込みプレビューをサポートします。 +- [x] モデル/データセットの自動タグ付け: カスタムメタデータとモデル/データセットタグの自動抽出をサポートします。 +- [x] S3プロトコルのサポート: S3(MinIO)ストレージプロトコルをサポートし、より高い信頼性とストレージコスト効率を提供します。 +- [ ] モデルフォーマットの変換: 主流のモデルフォーマットの変換。 +- [x] モデルのワンクリックデプロイ: OpenCSG llm-inferenceとの統合をサポートし、ワンクリックでモデル推論を開始します。 + +## ライセンス +Apache 2.0ライセンスを使用しています。詳細は`LICENSE`ファイルをご覧ください。 + +## 貢献 +貢献したい場合は、[貢献ガイドライン](docs/en/contributing.md)に従ってください。貢献を非常に楽しみにしています! + +## 謝辞 +このプロジェクトは、Gin、DuckDB、minio、Giteaなどのオープンソースプロジェクトに基づいています。これらのオープンソースの貢献に心から感謝します! + +### お問い合わせ +使用中に問題が発生した場合は、以下のいずれかの方法でお問い合わせください: +1. GitHubでissueを発行する +2. WeChatヘルパーのQRコードをスキャンしてWeChatグループに参加する +3. 公式Discordチャンネルに参加する: [OpenCSG Discord Channel](https://discord.gg/bXnu4C9BkR) +4. Slackワークスペースに参加する: [OpenCSG Slack Channel](https://join.slack.com/t/opencsghq/shared_invite/zt-2fmtem7hs-s_RmMeoOIoF1qzslql2q~A) +
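Note on the quick start in `README_ja.md` above: its shell comment states that the API token must be at least 128 characters long and must be sent as a Bearer token on HTTP requests to csghub-server. The following is a minimal Go sketch of that requirement, not code from this change. The helper names `newAPIToken` and `newAuthedRequest` and the URL `http://localhost:8080/` are illustrative assumptions; only the 128-character minimum, the Bearer scheme, and the `STARHUB_SERVER_API_TOKEN` variable name come from the README.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
)

// newAPIToken (hypothetical helper) returns a random hex string that is
// exactly 2*nBytes characters long.
func newAPIToken(nBytes int) (string, error) {
	buf := make([]byte, nBytes)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// newAuthedRequest (hypothetical helper) builds a request that carries the
// token in the Authorization header using the Bearer scheme.
func newAuthedRequest(method, url, token string) (*http.Request, error) {
	req, err := http.NewRequest(method, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	// 64 random bytes encode to 128 hex characters, the README's minimum.
	token, err := newAPIToken(64)
	if err != nil {
		panic(err)
	}
	fmt.Printf("export STARHUB_SERVER_API_TOKEN=%s\n", token)

	// The URL is a placeholder; the real host, port, and path depend on the deployment.
	req, err := newAuthedRequest(http.MethodGet, "http://localhost:8080/", token)
	if err != nil {
		panic(err)
	}
	fmt.Println("token length:", len(token))
	fmt.Println("header set:", req.Header.Get("Authorization") != "")
}
```

Generating the token from `crypto/rand` rather than typing one by hand is a design choice of this sketch; any string that satisfies the length requirement should work equally well with the `export` line in the quick start.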
diff --git a/api/handler/runtime_architecture.go b/api/handler/runtime_architecture.go new file mode 100644 index 00000000..88dca304 --- /dev/null +++ b/api/handler/runtime_architecture.go @@ -0,0 +1,187 @@ +package handler + +import ( + "fmt" + "log/slog" + "strconv" + + "github.com/gin-gonic/gin" + "opencsg.com/csghub-server/api/httpbase" + "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/common/types" + "opencsg.com/csghub-server/component" +) + +func NewRuntimeArchitectureHandler(config *config.Config) (*RuntimeArchitectureHandler, error) { + nrc, err := component.NewRepoComponent(config) + if err != nil { + return nil, fmt.Errorf("fail to create repo component, %w", err) + } + nrac, err := component.NewRuntimeArchitectureComponent(config) + if err != nil { + return nil, fmt.Errorf("fail to create runtime arch component, %w", err) + } + + return &RuntimeArchitectureHandler{ + rc: nrc, + rac: nrac, + }, nil +} + +type RuntimeArchitectureHandler struct { + rc *component.RepoComponent + rac *component.RuntimeArchitectureComponent +} + +// GetArchitectures godoc +// @Security ApiKey +// @Summary Get runtime framework architectures +// @Description get runtime framework architectures +// @Tags RuntimeFramework +// @Accept json +// @Produce json +// @Param id path int true "runtime framework id" +// @Success 200 {object} types.Response{} "OK" +// @Failure 400 {object} types.APIBadRequest "Bad request" +// @Failure 500 {object} types.APIInternalServerError "Internal server error" +// @Router /runtime_framework/{id}/architecture [get] +func (r *RuntimeArchitectureHandler) ListByRuntimeFrameworkID(ctx *gin.Context) { + strID := ctx.Param("id") + id, err := strconv.ParseInt(strID, 10, 64) + if err != nil { + slog.Error("invalid runtime framework ID", slog.Any("error", err)) + httpbase.BadRequest(ctx, "invalid runtime framework ID format") + return + } + resp, err := r.rac.ListByRuntimeFrameworkID(ctx, id) + if err != nil { + slog.Error("fail to list 
runtime architectures", slog.Any("error", err)) + httpbase.ServerError(ctx, err) + return + } + httpbase.OK(ctx, resp) +} + +// UpdateArchitectures godoc +// @Security ApiKey +// @Summary Set runtime framework architectures +// @Description set runtime framework architectures +// @Tags RuntimeFramework +// @Accept json +// @Produce json +// @Param id path int true "runtime framework id" +// @Param body body types.RuntimeArchitecture true "body" +// @Success 200 {object} types.Response{} "OK" +// @Failure 400 {object} types.APIBadRequest "Bad request" +// @Failure 500 {object} types.APIInternalServerError "Internal server error" +// @Router /runtime_framework/{id}/architecture [put] +func (r *RuntimeArchitectureHandler) UpdateArchitecture(ctx *gin.Context) { + var req types.RuntimeArchitecture + if err := ctx.ShouldBindJSON(&req); err != nil { + slog.Error("Bad request format", "error", err) + httpbase.BadRequest(ctx, err.Error()) + return + } + + id, err := strconv.ParseInt(ctx.Param("id"), 10, 64) + if err != nil { + slog.Error("Bad request runtime framework id format", "error", err) + httpbase.BadRequest(ctx, err.Error()) + return + } + + res, err := r.rac.SetArchitectures(ctx, id, req.Architectures) + if err != nil { + slog.Error("Failed to set architectures", slog.Any("error", err)) + httpbase.ServerError(ctx, err) + return + } + + httpbase.OK(ctx, res) +} + +// DeleteArchitectures godoc +// @Security ApiKey +// @Summary Delete runtime framework architectures +// @Description Delete runtime framework architectures +// @Tags RuntimeFramework +// @Accept json +// @Produce json +// @Param id path int true "runtime framework id" +// @Param body body types.RuntimeArchitecture true "body" +// @Success 200 {object} types.Response{} "OK" +// @Failure 400 {object} types.APIBadRequest "Bad request" +// @Failure 500 {object} types.APIInternalServerError "Internal server error" +// @Router /runtime_framework/{id}/architecture [delete] +func (r *RuntimeArchitectureHandler) 
DeleteArchitecture(ctx *gin.Context) { + var req types.RuntimeArchitecture + if err := ctx.ShouldBindJSON(&req); err != nil { + slog.Error("Bad request format", "error", err) + httpbase.BadRequest(ctx, err.Error()) + return + } + + id, err := strconv.ParseInt(ctx.Param("id"), 10, 64) + if err != nil { + slog.Error("Bad request runtime framework id format", "error", err) + httpbase.BadRequest(ctx, err.Error()) + return + } + + list, err := r.rac.DeleteArchitectures(ctx, id, req.Architectures) + if err != nil { + slog.Error("Failed to delete architectures", slog.Any("error", err)) + httpbase.ServerError(ctx, err) + return + } + + httpbase.OK(ctx, list) +} + +// ScanArchitecture godoc +// @Security ApiKey +// @Summary Scan runtime architecture +// @Description Scan runtime architecture +// @Tags RuntimeFramework +// @Accept json +// @Produce json +// @Param id path int true "runtime framework id" +// @Param scan_type query int true "scan_type(0:all models, 1:new models, 2:old models)" Enums(0, 1, 2) +// @Param body body types.RuntimeFrameworkModels true "body" +// @Success 200 {object} types.Response{} "OK" +// @Failure 400 {object} types.APIBadRequest "Bad request" +// @Failure 500 {object} types.APIInternalServerError "Internal server error" +// @Router /runtime_framework/{id}/scan [post] +func (r *RuntimeArchitectureHandler) ScanArchitecture(ctx *gin.Context) { + id, err := strconv.ParseInt(ctx.Param("id"), 10, 64) + if err != nil { + slog.Error("Bad request runtime framework id format", "error", err) + httpbase.BadRequest(ctx, err.Error()) + return + } + + scanTypeStr := ctx.Query("scan_type") + if scanTypeStr == "" { + slog.Error("Bad request scan type") + httpbase.BadRequest(ctx, "bad request scan type") + return + } + scanType, err := strconv.Atoi(scanTypeStr) + if err != nil { + slog.Error("Bad request scan format", "error", err) + httpbase.BadRequest(ctx, err.Error()) + return + } + + var req types.RuntimeFrameworkModels + ctx.ShouldBindJSON(&req) + + err = 
r.rac.ScanArchitecture(ctx, id, scanType, req.Models) + if err != nil { + slog.Error("Failed to scan architecture", slog.Any("error", err)) + httpbase.ServerError(ctx, err) + return + } + + httpbase.OK(ctx, nil) +} diff --git a/api/middleware/git_http_param.go b/api/middleware/git_http_param.go index c0268b9f..8ee8a49f 100644 --- a/api/middleware/git_http_param.go +++ b/api/middleware/git_http_param.go @@ -84,7 +84,7 @@ func GetCurrentUserFromHeader() gin.HandlerFunc { userStore := database.NewUserStore() return func(c *gin.Context) { authHeader := c.Request.Header.Get("Authorization") - if authHeader != "" { + if authHeader != "" && !strings.HasPrefix(authHeader, "X-OPENCSG-Sync-Token") { authHeader = strings.TrimPrefix(authHeader, "Basic ") authInfo, err := base64.StdEncoding.DecodeString(authHeader) if err != nil { diff --git a/api/router/api.go b/api/router/api.go index 7528268a..c40851bc 100644 --- a/api/router/api.go +++ b/api/router/api.go @@ -299,7 +299,12 @@ func NewRouter(config *config.Config, enableSwagger bool) (*gin.Engine, error) { event := apiGroup.Group("/events") event.POST("", eventHandler.Create) - createRuntimeFrameworkRoutes(apiGroup, needAPIKey, modelHandler) + runtimeArchHandler, err := handler.NewRuntimeArchitectureHandler(config) + if err != nil { + return nil, fmt.Errorf("error creating runtime framework architecture handler:%w", err) + } + + createRuntimeFrameworkRoutes(apiGroup, needAPIKey, modelHandler, runtimeArchHandler) syncHandler, err := handler.NewSyncHandler(config) if err != nil { @@ -640,7 +645,7 @@ func createUserRoutes(apiGroup *gin.RouterGroup, needAPIKey gin.HandlerFunc, use apiGroup.GET("/user/:username/run/serverless", needAPIKey, userHandler.GetRunServerless) } -func createRuntimeFrameworkRoutes(apiGroup *gin.RouterGroup, needAPIKey gin.HandlerFunc, modelHandler *handler.ModelHandler) { +func createRuntimeFrameworkRoutes(apiGroup *gin.RouterGroup, needAPIKey gin.HandlerFunc, modelHandler *handler.ModelHandler, 
runtimeArchHandler *handler.RuntimeArchitectureHandler) { runtimeFramework := apiGroup.Group("/runtime_framework") { runtimeFramework.GET("/:id/models", modelHandler.ListByRuntimeFrameworkID) @@ -648,6 +653,11 @@ func createRuntimeFrameworkRoutes(apiGroup *gin.RouterGroup, needAPIKey gin.Hand runtimeFramework.POST("/:id", needAPIKey, modelHandler.UpdateModelRuntimeFrameworks) runtimeFramework.DELETE("/:id", needAPIKey, modelHandler.DeleteModelRuntimeFrameworks) runtimeFramework.GET("/models", modelHandler.ListModelsOfRuntimeFrameworks) + + runtimeFramework.GET("/:id/architecture", needAPIKey, runtimeArchHandler.ListByRuntimeFrameworkID) + runtimeFramework.PUT("/:id/architecture", needAPIKey, runtimeArchHandler.UpdateArchitecture) + runtimeFramework.DELETE("/:id/architecture", needAPIKey, runtimeArchHandler.DeleteArchitecture) + runtimeFramework.POST("/:id/scan", needAPIKey, runtimeArchHandler.ScanArchitecture) } } @@ -691,7 +701,6 @@ func createHFRoutes(r *gin.Engine, hfdsHandler *handler.HFDatasetHandler, repoCo } } } - } func createDiscussionRoutes(apiGroup *gin.RouterGroup, needAPIKey gin.HandlerFunc, discussionHandler *handler.DiscussionHandler) { diff --git a/builder/git/gitserver.go b/builder/git/gitserver.go index 0ca697dc..3d32ab0d 100644 --- a/builder/git/gitserver.go +++ b/builder/git/gitserver.go @@ -7,13 +7,14 @@ import ( "opencsg.com/csghub-server/builder/git/gitserver/gitaly" "opencsg.com/csghub-server/builder/git/gitserver/gitea" "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/common/types" ) func NewGitServer(config *config.Config) (gitserver.GitServer, error) { - if config.GitServer.Type == "gitea" { + if config.GitServer.Type == types.GitServerTypeGitea { gitServer, err := gitea.NewClient(config) return gitServer, err - } else if config.GitServer.Type == "gitaly" { + } else if config.GitServer.Type == types.GitServerTypeGitaly { gitServer, err := gitaly.NewClient(config) return gitServer, err } diff --git 
a/builder/git/gitserver/gitaly/commit.go b/builder/git/gitserver/gitaly/commit.go index 672d0035..0af84b63 100644 --- a/builder/git/gitserver/gitaly/commit.go +++ b/builder/git/gitserver/gitaly/commit.go @@ -2,6 +2,7 @@ package gitaly import ( "context" + "errors" "fmt" "io" "math" @@ -142,7 +143,7 @@ func (c *Client) GetSingleCommit(ctx context.Context, req gitserver.GetRepoLastC if err != nil { return nil, err } - if commitResp != nil { + if commitResp != nil && commitResp.Commit != nil { commit = types.Commit{ ID: string(commitResp.Commit.Id), CommitterName: string(commitResp.Commit.Committer.Name), @@ -160,6 +161,8 @@ func (c *Client) GetSingleCommit(ctx context.Context, req gitserver.GetRepoLastC }) } + } else { + return nil, errors.New("commit not found") } result = types.CommitResponse{ Commit: &commit, diff --git a/builder/git/gitserver/gitaly/mirror.go b/builder/git/gitserver/gitaly/mirror.go index 8e1e3818..fe8ac567 100644 --- a/builder/git/gitserver/gitaly/mirror.go +++ b/builder/git/gitserver/gitaly/mirror.go @@ -11,25 +11,28 @@ import ( ) func (c *Client) CreateMirrorRepo(ctx context.Context, req gitserver.CreateMirrorRepoReq) (int64, error) { - var authorHeader string + var ( + remoteCheckReq *gitalypb.FindRemoteRepositoryRequest + authorHeader string + err error + ) repoType := fmt.Sprintf("%ss", string(req.RepoType)) ctx, cancel := context.WithTimeout(ctx, timeoutTime) defer cancel() - remoteCheckReq := &gitalypb.FindRemoteRepositoryRequest{ - Remote: req.CloneUrl, - StorageName: c.config.GitalyServer.Storge, - } - - resp, err := c.remoteClient.FindRemoteRepository(ctx, remoteCheckReq) - if err != nil { - return 0, err - } - if !resp.Exists { - return 0, fmt.Errorf("invalid clone url") - } - - if req.Username != "" && req.AccessToken != "" { + if req.MirrorToken == "" { + remoteCheckReq = &gitalypb.FindRemoteRepositoryRequest{ + Remote: req.CloneUrl, + StorageName: c.config.GitalyServer.Storge, + } + + resp, err := 
c.remoteClient.FindRemoteRepository(ctx, remoteCheckReq) + if err != nil { + return 0, err + } + if !resp.Exists { + return 0, fmt.Errorf("invalid clone url") + } authorHeader = base64.StdEncoding.EncodeToString([]byte(fmt.Sprintf("%s:%s", req.Username, req.AccessToken))) } @@ -41,10 +44,13 @@ func (c *Client) CreateMirrorRepo(ctx context.Context, req gitserver.CreateMirro Url: req.CloneUrl, Mirror: true, } - if authorHeader != "" { + if req.MirrorToken != "" { + gitalyReq.HttpAuthorizationHeader = fmt.Sprintf("X-OPENCSG-Sync-Token%s", req.MirrorToken) + } else if authorHeader != "" { gitalyReq.HttpAuthorizationHeader = authorHeader + } else { + gitalyReq.HttpAuthorizationHeader = "" } - _, err = c.repoClient.CreateRepositoryFromURL(ctx, gitalyReq) if err != nil { return 0, err @@ -70,12 +76,12 @@ func (c *Client) CreateMirrorForExistsRepo(ctx context.Context, req gitserver.Cr }, } - if req.Username != "" && req.AccessToken != "" { - authorHeader = base64.StdEncoding.EncodeToString([]byte(fmt.Sprintf("%s:%s", req.Username, req.AccessToken))) - } - - if authorHeader != "" { + if req.MirrorToken != "" { + fetchRemoteReq.RemoteParams.HttpAuthorizationHeader = fmt.Sprintf("X-OPENCSG-Sync-Token%s", req.MirrorToken) + } else if authorHeader != "" { fetchRemoteReq.RemoteParams.HttpAuthorizationHeader = authorHeader + } else { + fetchRemoteReq.RemoteParams.HttpAuthorizationHeader = "" } _, err := c.repoClient.FetchRemote(ctx, fetchRemoteReq) @@ -108,12 +114,12 @@ func (c *Client) MirrorSync(ctx context.Context, req gitserver.MirrorSyncReq) er CheckTagsChanged: true, } - if req.Username != "" && req.AccessToken != "" { - authorHeader = base64.StdEncoding.EncodeToString([]byte(fmt.Sprintf("%s:%s", req.Username, req.AccessToken))) - } - - if authorHeader != "" { + if req.MirrorToken != "" { + fetchRemoteReq.RemoteParams.HttpAuthorizationHeader = fmt.Sprintf("X-OPENCSG-Sync-Token%s", req.MirrorToken) + } else if authorHeader != "" { 
fetchRemoteReq.RemoteParams.HttpAuthorizationHeader = authorHeader + } else { + fetchRemoteReq.RemoteParams.HttpAuthorizationHeader = "" } _, err := c.repoClient.FetchRemote(ctx, fetchRemoteReq) diff --git a/builder/git/gitserver/gitaly/repo.go b/builder/git/gitserver/gitaly/repo.go index 1c5b47f5..3932e991 100644 --- a/builder/git/gitserver/gitaly/repo.go +++ b/builder/git/gitserver/gitaly/repo.go @@ -79,5 +79,19 @@ func (c *Client) DeleteRepo(ctx context.Context, req gitserver.DeleteRepoReq) er } func (c *Client) GetRepo(ctx context.Context, req gitserver.GetRepoReq) (*gitserver.CreateRepoResp, error) { - return nil, nil + repoType := fmt.Sprintf("%ss", string(req.RepoType)) + ctx, cancel := context.WithTimeout(ctx, timeoutTime) + defer cancel() + gitalyReq := &gitalypb.FindDefaultBranchNameRequest{ + Repository: &gitalypb.Repository{ + StorageName: c.config.GitalyServer.Storge, + RelativePath: BuildRelativePath(repoType, req.Namespace, req.Name), + }, + } + resp, err := c.refClient.FindDefaultBranchName(ctx, gitalyReq) + if err != nil { + return nil, err + } + + return &gitserver.CreateRepoResp{DefaultBranch: string(resp.Name)}, nil } diff --git a/builder/git/gitserver/gitaly/user.go b/builder/git/gitserver/gitaly/user.go index b87e560c..31247d29 100644 --- a/builder/git/gitserver/gitaly/user.go +++ b/builder/git/gitserver/gitaly/user.go @@ -9,7 +9,10 @@ import ( ) func (c *Client) CreateUser(u gitserver.CreateUserRequest) (user *gitserver.CreateUserResponse, err error) { - return + return &gitserver.CreateUserResponse{ + GitID: 0, + Password: "", + }, nil } func (c *Client) UpdateUser(u *types.UpdateUserRequest, user *database.User) (*database.User, error) { diff --git a/builder/git/gitserver/types.go b/builder/git/gitserver/types.go index 72c6f7c5..23bd47de 100644 --- a/builder/git/gitserver/types.go +++ b/builder/git/gitserver/types.go @@ -171,6 +171,7 @@ type MirrorSyncReq struct { CloneUrl string `json:"clone_url"` Username string `json:"username"` 
AccessToken string `json:"access_token"` + MirrorToken string `json:"mirror_token"` } type MirrorTaskInfo struct { diff --git a/builder/git/mirrorserver.go b/builder/git/mirrorserver.go index 91f0bbbe..89f352a5 100644 --- a/builder/git/mirrorserver.go +++ b/builder/git/mirrorserver.go @@ -6,13 +6,14 @@ import ( "opencsg.com/csghub-server/builder/git/mirrorserver" "opencsg.com/csghub-server/builder/git/mirrorserver/gitea" "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/common/types" ) func NewMirrorServer(config *config.Config) (mirrorserver.MirrorServer, error) { if !config.MirrorServer.Enable { return nil, nil } - if config.MirrorServer.Type == "gitea" { + if config.MirrorServer.Type == types.GitServerTypeGitea { mirrorServer, err := gitea.NewMirrorClient(config) return mirrorServer, err } diff --git a/builder/mirror/queue/queue.go b/builder/mirror/queue/queue.go deleted file mode 100644 index e42779a3..00000000 --- a/builder/mirror/queue/queue.go +++ /dev/null @@ -1,98 +0,0 @@ -package queue - -import ( - "container/heap" - "sync" - - "opencsg.com/csghub-server/common/types" -) - -type Priority int - -func (p Priority) Int() int { return int(p) } - -const ( - HighPriority Priority = 0 - MediumPriority Priority = 1 - LowPriority Priority = 2 -) - -var PriorityMap = map[types.MirrorPriority]Priority{ - types.HighMirrorPriority: HighPriority, - types.MediumMirrorPriority: MediumPriority, - types.LowMirrorPriority: LowPriority, -} - -type MirrorTask struct { - MirrorID int64 - Priority Priority - Index int -} - -type MirrorQueue []*MirrorTask - -func (mq MirrorQueue) Len() int { return len(mq) } - -func (mq MirrorQueue) Less(i, j int) bool { return mq[i].Priority < mq[j].Priority } - -func (mq MirrorQueue) Swap(i, j int) { - mq[i], mq[j] = mq[j], mq[i] - mq[i].Index, mq[j].Index = i, j -} - -func (mq *MirrorQueue) Push(x interface{}) { - n := len(*mq) - item := x.(*MirrorTask) - item.Index = n - *mq = append(*mq, item) -} - -func (mq 
*MirrorQueue) Pop() interface{} { - old := *mq - n := len(old) - item := old[n-1] - item.Index = -1 - *mq = old[0 : n-1] - return item -} - -type PriorityQueue struct { - Queue MirrorQueue - lock sync.Mutex - cond *sync.Cond -} - -var instance *PriorityQueue -var once sync.Once - -func NewPriorityQueue() *PriorityQueue { - mq := &PriorityQueue{ - Queue: MirrorQueue{}, - } - mq.cond = sync.NewCond(&mq.lock) - heap.Init(&mq.Queue) - return mq -} - -func (pq *PriorityQueue) Push(mt *MirrorTask) { - pq.lock.Lock() - defer pq.lock.Unlock() - heap.Push(&pq.Queue, mt) - pq.cond.Signal() -} - -func (pq *PriorityQueue) Pop() *MirrorTask { - pq.lock.Lock() - defer pq.lock.Unlock() - for pq.Queue.Len() == 0 { - pq.cond.Wait() - } - return heap.Pop(&pq.Queue).(*MirrorTask) -} - -func GetPriorityQueueInstance() *PriorityQueue { - once.Do(func() { - instance = NewPriorityQueue() - }) - return instance -} diff --git a/builder/mirror/service.go b/builder/mirror/service.go deleted file mode 100644 index 94c07be3..00000000 --- a/builder/mirror/service.go +++ /dev/null @@ -1,350 +0,0 @@ -package mirror - -import ( - "bytes" - "context" - "encoding/json" - "fmt" - "log/slog" - "net/http" - "path/filepath" - "strings" - "sync" - "time" - - "github.com/minio/minio-go/v7" - "opencsg.com/csghub-server/builder/git" - "opencsg.com/csghub-server/builder/git/gitserver" - "opencsg.com/csghub-server/builder/mirror/queue" - "opencsg.com/csghub-server/builder/store/database" - "opencsg.com/csghub-server/builder/store/s3" - "opencsg.com/csghub-server/common/config" - "opencsg.com/csghub-server/common/types" -) - -type MirrorService struct { - mq *queue.PriorityQueue - tasks chan queue.MirrorTask - numWorkers int - wg sync.WaitGroup - tokenStore *database.GitServerAccessTokenStore - saas bool - mirrorStore *database.MirrorStore - repoStore *database.RepoStore - modelStore *database.ModelStore - datasetStore *database.DatasetStore - codeStore *database.CodeStore - mirrorSourceStore 
*database.MirrorSourceStore - namespaceStore *database.NamespaceStore - lfsMetaObjectStore *database.LfsMetaObjectStore - git gitserver.GitServer - s3Client *minio.Client - lfsBucket string - config *config.Config -} - -func NewMirrorService(config *config.Config, numWorkers int) (*MirrorService, error) { - var err error - s := &MirrorService{} - s.git, err = git.NewGitServer(config) - if err != nil { - newError := fmt.Errorf("fail to create git server,error:%w", err) - slog.Error(newError.Error()) - return nil, newError - } - s.s3Client, err = s3.NewMinio(config) - if err != nil { - newError := fmt.Errorf("fail to init s3 client for code,error:%w", err) - slog.Error(newError.Error()) - return nil, newError - } - s.lfsBucket = config.S3.Bucket - s.modelStore = database.NewModelStore() - s.datasetStore = database.NewDatasetStore() - s.codeStore = database.NewCodeStore() - s.repoStore = database.NewRepoStore() - s.mirrorStore = database.NewMirrorStore() - s.tokenStore = database.NewGitServerAccessTokenStore() - s.mirrorSourceStore = database.NewMirrorSourceStore() - s.namespaceStore = database.NewNamespaceStore() - s.lfsMetaObjectStore = database.NewLfsMetaObjectStore() - s.saas = config.Saas - s.config = config - s.mq = queue.GetPriorityQueueInstance() - s.tasks = make(chan queue.MirrorTask) - s.numWorkers = numWorkers - return s, nil -} - -func (ms *MirrorService) Enqueue(task *queue.MirrorTask) { - ms.mq.Push(task) -} - -func (ms *MirrorService) Start() { - for i := 1; i <= ms.numWorkers; i++ { - ms.wg.Add(1) - go ms.worker(i) - } - go ms.dispatcher() - ms.wg.Wait() -} - -func (ms *MirrorService) EnqueueMirrorTasks() { - mirrorStore := database.NewMirrorStore() - mirrors, err := mirrorStore.ToSync(context.Background()) - if err != nil { - slog.Error("fail to get mirror to sync", slog.String("error", err.Error())) - return - } - - for _, mirror := range mirrors { - ms.mq.Push(&queue.MirrorTask{MirrorID: mirror.ID, Priority: queue.Priority(mirror.Priority)}) - 
mirror.Status = types.MirrorWaiting - err = mirrorStore.Update(context.Background(), &mirror) - if err != nil { - slog.Error("fail to update mirror status", slog.Int64("mirrorId", mirror.ID), slog.String("error", err.Error())) - continue - } - } -} - -func (ms *MirrorService) worker(id int) { - defer ms.wg.Done() - defer func() { - if r := recover(); r != nil { - ms.wg.Add(1) - go ms.worker(id) - slog.Info("worker ecovered from panic ", slog.Int("workerId", id)) - } - }() - slog.Info("worker start", slog.Int("workerId", id)) - for { - task := <-ms.tasks - slog.Info("start to mirror", slog.Int64("mirrorId", task.MirrorID), slog.Int("priority", task.Priority.Int()), slog.Int("workerId", id)) - err := ms.Mirror(context.Background(), task.MirrorID) - if err != nil { - slog.Info("fail to mirror", slog.Int64("mirrorId", task.MirrorID), slog.Int("priority", task.Priority.Int()), slog.Int("workerId", id), slog.String("error", err.Error())) - } - slog.Info("finish to mirror", slog.Int64("mirrorId", task.MirrorID), slog.Int("priority", task.Priority.Int()), slog.Int("workerId", id)) - } -} - -func (ms *MirrorService) dispatcher() { - for { - task := ms.mq.Pop() - if task != nil { - ms.tasks <- *task - } - } -} - -func (c *MirrorService) Mirror(ctx context.Context, mirrorID int64) error { - mirror, err := c.mirrorStore.FindByID(ctx, mirrorID) - if err != nil { - return fmt.Errorf("failed to get mirror: %v", err) - } - mirror.Status = types.MirrorRunning - err = c.mirrorStore.Update(ctx, mirror) - if err != nil { - return fmt.Errorf("failed to update mirror status: %v", err) - } - if mirror.Repository == nil { - return fmt.Errorf("mirror repository is nil") - } - namespace := strings.Split(mirror.Repository.Path, "/")[0] - name := strings.Split(mirror.Repository.Path, "/")[1] - - slog.Info("Start to sync mirror", "repo_type", mirror.Repository.RepositoryType, "namespace", namespace, "name", name) - err = c.git.MirrorSync(ctx, gitserver.MirrorSyncReq{ - Namespace: namespace, - 
Name: name, - CloneUrl: mirror.SourceUrl, - Username: mirror.Username, - AccessToken: mirror.AccessToken, - RepoType: mirror.Repository.RepositoryType, - }) - - if err != nil { - return fmt.Errorf("failed mirror remote repo in git server: %v", err) - } - slog.Info("Mirror remote repo in git server successfully", "repo_type", mirror.Repository.RepositoryType, "namespace", namespace, "name", name) - slog.Info("Start to sync lfs files", "repo_type", mirror.Repository.RepositoryType, "namespace", namespace, "name", name) - err = c.syncLfsFiles(ctx, mirror) - if err != nil { - mirror.Status = types.MirrorIncomplete - mirror.LastMessage = err.Error() - err = c.mirrorStore.Update(ctx, mirror) - if err != nil { - return fmt.Errorf("failed to update mirror: %w", err) - } - return fmt.Errorf("failed to sync lfs files: %v", err) - } - mirror.NextExecutionTimestamp = time.Now().Add(24 * time.Hour) - mirror.Status = types.MirrorFinished - mirror.Priority = types.LowMirrorPriority - err = c.mirrorStore.Update(ctx, mirror) - if err != nil { - return fmt.Errorf("failed to update mirror: %w", err) - } - - return nil -} - -func (c *MirrorService) syncLfsFiles(ctx context.Context, mirror *database.Mirror) error { - var pointers []*types.Pointer - namespace := strings.Split(mirror.Repository.Path, "/")[0] - name := strings.Split(mirror.Repository.Path, "/")[1] - branches, err := c.git.GetRepoBranches(ctx, gitserver.GetBranchesReq{ - Namespace: namespace, - Name: name, - RepoType: mirror.Repository.RepositoryType, - }) - if err != nil { - return fmt.Errorf("failed to get repo branches: %v", err) - } - for _, branch := range branches { - lfsPointers, err := c.getAllLfsPointersByRef(ctx, mirror.Repository.RepositoryType, namespace, name, branch.Name) - if err != nil { - return fmt.Errorf("failed to get all lfs pointers: %v", err) - } - for _, lfsPointer := range lfsPointers { - pointers = append(pointers, &types.Pointer{ - Oid: lfsPointer.FileOid, - Size: lfsPointer.FileSize, - }) - } - 
} - - pointers, err = c.GetLFSDownloadURLs(ctx, mirror, pointers) - if err != nil { - return fmt.Errorf("failed to get LFS download URLs: %v", err) - } - err = c.DownloadAndUploadLFSFiles(ctx, mirror, pointers) - if err != nil { - return err - } - - return nil -} - -func (c *MirrorService) getAllLfsPointersByRef(ctx context.Context, RepoType types.RepositoryType, namespace, name, ref string) ([]*types.LFSPointer, error) { - return c.git.GetRepoAllLfsPointers(ctx, gitserver.GetRepoAllFilesReq{ - Namespace: namespace, - Name: name, - Ref: ref, - RepoType: RepoType, - }) -} - -func (c *MirrorService) GetLFSDownloadURLs(ctx context.Context, mirror *database.Mirror, pointers []*types.Pointer) ([]*types.Pointer, error) { - var resPointers []*types.Pointer - requestPayload := types.LFSBatchRequest{ - Operation: "download", - } - - for _, pointer := range pointers { - requestPayload.Objects = append(requestPayload.Objects, types.LFSBatchObject{ - Oid: pointer.Oid, - Size: pointer.Size, - }) - } - - lfsAPIURL := mirror.SourceUrl + "/info/lfs/objects/batch" - - payload, err := json.Marshal(requestPayload) - if err != nil { - return resPointers, fmt.Errorf("failed to marshal request payload: %v", err) - } - - resp, err := http.Post(lfsAPIURL, "application/json", bytes.NewReader(payload)) - if err != nil { - return resPointers, fmt.Errorf("failed to get LFS download URL: %v", err) - } - defer resp.Body.Close() - - if resp.StatusCode != http.StatusOK { - return resPointers, fmt.Errorf("failed to get LFS download URL, status code: %d", resp.StatusCode) - } - - var batchResponse types.LFSBatchResponse - err = json.NewDecoder(resp.Body).Decode(&batchResponse) - if err != nil { - return resPointers, fmt.Errorf("failed to decode LFS batch response: %v", err) - } - - if len(batchResponse.Objects) == 0 { - return resPointers, fmt.Errorf("no objects found in LFS batch response") - } - for _, object := range batchResponse.Objects { - resPointers = append(resPointers, &types.Pointer{ - 
Oid: object.Oid, - Size: object.Size, - DownloadURL: object.Actions.Download.Href, - }) - } - - return resPointers, nil -} - -func (c *MirrorService) DownloadAndUploadLFSFiles(ctx context.Context, mirror *database.Mirror, pointers []*types.Pointer) error { - var finishedLFSFileCount int - lfsFilesCount := len(pointers) - for _, pointer := range pointers { - objectKey := filepath.Join("lfs", pointer.RelativePath()) - fileInfo, err := c.s3Client.StatObject(ctx, c.config.S3.Bucket, objectKey, minio.StatObjectOptions{}) - if err != nil && err.Error() != "The specified key does not exist." { - slog.Error("failed to check if LFS file exists", slog.Any("error", err)) - continue - } - if (err != nil && err.Error() != "The specified key does not exist.") || fileInfo.Size != pointer.Size { - err = c.DownloadAndUploadLFSFile(ctx, mirror, pointer) - if err != nil { - slog.Error("failed to download and upload LFS file", slog.Any("error", err)) - } - } - - lfsMetaObject := database.LfsMetaObject{ - Size: pointer.Size, - Oid: pointer.Oid, - RepositoryID: mirror.Repository.ID, - Existing: true, - } - _, err = c.lfsMetaObjectStore.UpdateOrCreate(ctx, lfsMetaObject) - if err != nil { - slog.Error("failed to update or create LFS meta object", slog.Any("error", err)) - return fmt.Errorf("failed to update or create LFS meta object: %w", err) - } - finishedLFSFileCount += 1 - mirror.Progress = int8(finishedLFSFileCount * 100 / lfsFilesCount) - err = c.mirrorStore.Update(ctx, mirror) - if err != nil { - return fmt.Errorf("failed to update mirror progress: %w", err) - } - } - return nil -} - -func (c *MirrorService) DownloadAndUploadLFSFile(ctx context.Context, mirror *database.Mirror, pointer *types.Pointer) error { - objectKey := filepath.Join("lfs", pointer.RelativePath()) - slog.Info("downloading LFS file from", slog.Any("url", pointer.DownloadURL)) - resp, err := http.Get(pointer.DownloadURL) - if err != nil { - return err - } - defer resp.Body.Close() - - if resp.StatusCode != 
http.StatusOK { - return fmt.Errorf("failed to download LFS file: %s", resp.Status) - } - slog.Info("uploading LFS file", slog.Any("object_key", objectKey)) - uploadInfo, err := c.s3Client.PutObject(ctx, c.config.S3.Bucket, objectKey, resp.Body, resp.ContentLength, minio.PutObjectOptions{}) - if err != nil { - return fmt.Errorf("failed to upload to Minio: %w", err) - } - - if uploadInfo.Size != pointer.Size { - return fmt.Errorf("uploaded file size does not match expected size: %d != %d", uploadInfo.Size, pointer.Size) - } - - return nil -} diff --git a/builder/store/cache/cache.go b/builder/store/cache/cache.go index c29dcb2a..8de1ec90 100644 --- a/builder/store/cache/cache.go +++ b/builder/store/cache/cache.go @@ -50,3 +50,12 @@ end` func (c *Cache) FlushAll(ctx context.Context) error { return c.core.FlushAll(ctx).Err() } + +func (c *Cache) ZAdd(ctx context.Context, key string, z redis.Z) error { + _, err := c.core.ZAdd(ctx, key, z).Result() + return err +} + +func (c *Cache) ZPopMax(ctx context.Context, key string, count int64) ([]redis.Z, error) { + return c.core.ZPopMax(ctx, key, count).Result() +} diff --git a/builder/store/database/lfs_meta_object.go b/builder/store/database/lfs_meta_object.go index 990baec6..c8b1acf8 100644 --- a/builder/store/database/lfs_meta_object.go +++ b/builder/store/database/lfs_meta_object.go @@ -38,6 +38,18 @@ func (s *LfsMetaObjectStore) FindByOID(ctx context.Context, RepoId int64, Oid st return &lfsMetaObject, nil } +func (s *LfsMetaObjectStore) FindByRepoID(ctx context.Context, repoID int64) ([]LfsMetaObject, error) { + var lfsMetaObjects []LfsMetaObject + err := s.db.Operator.Core.NewSelect(). + Model(&lfsMetaObjects). + Where("repository_id=?", repoID). + Scan(ctx) + if err != nil { + return nil, err + } + return lfsMetaObjects, nil +} + func (s *LfsMetaObjectStore) Create(ctx context.Context, lfsObj LfsMetaObject) (*LfsMetaObject, error) { err := s.db.Operator.Core.NewInsert(). Model(&lfsObj). 
@@ -75,3 +87,12 @@ func (s *LfsMetaObjectStore) UpdateOrCreate(ctx context.Context, input LfsMetaOb return &input, nil } + +func (s *LfsMetaObjectStore) BulkUpdateOrCreate(ctx context.Context, input []LfsMetaObject) error { + _, err := s.db.Core.NewInsert(). + Model(&input). + On("CONFLICT (oid, repository_id) DO UPDATE"). + Set("size = EXCLUDED.size, updated_at = EXCLUDED.updated_at, existing = EXCLUDED.existing"). + Exec(ctx) + return err +} diff --git a/builder/store/database/migrations/20240719025744_create_table_runtime_architecture.go b/builder/store/database/migrations/20240719025744_create_table_runtime_architecture.go new file mode 100644 index 00000000..badc7738 --- /dev/null +++ b/builder/store/database/migrations/20240719025744_create_table_runtime_architecture.go @@ -0,0 +1,22 @@ +package migrations + +import ( + "context" + + "github.com/uptrace/bun" +) + +type RuntimeArchitecture struct { + ID int64 `bun:",pk,autoincrement" json:"id"` + RuntimeFrameworkID int64 `bun:",notnull" json:"runtime_framework_id"` + ArchitectureName string `bun:",notnull" json:"architecture_name"` +} + +func init() { + Migrations.MustRegister(func(ctx context.Context, db *bun.DB) error { + err := createTables(ctx, db, RuntimeArchitecture{}) + return err + }, func(ctx context.Context, db *bun.DB) error { + return dropTables(ctx, db, RuntimeArchitecture{}) + }) +} diff --git a/builder/store/database/migrations/20240719031622_create_index_runtime_architecture.down.sql b/builder/store/database/migrations/20240719031622_create_index_runtime_architecture.down.sql new file mode 100644 index 00000000..2237ba46 --- /dev/null +++ b/builder/store/database/migrations/20240719031622_create_index_runtime_architecture.down.sql @@ -0,0 +1,5 @@ +SET statement_timeout = 0; + +--bun:split + +DROP INDEX IF EXISTS idx_unique_runtime_architecture; diff --git a/builder/store/database/migrations/20240719031622_create_index_runtime_architecture.up.sql 
b/builder/store/database/migrations/20240719031622_create_index_runtime_architecture.up.sql new file mode 100644 index 00000000..abfbe569 --- /dev/null +++ b/builder/store/database/migrations/20240719031622_create_index_runtime_architecture.up.sql @@ -0,0 +1,6 @@ +SET statement_timeout = 0; + +--bun:split + +CREATE UNIQUE INDEX IF NOT EXISTS idx_unique_runtime_architecture ON runtime_architectures (runtime_framework_id, architecture_name); + diff --git a/builder/store/database/migrations/20240913042113_add_index_to_lfs_meta_objects.down.sql b/builder/store/database/migrations/20240913042113_add_index_to_lfs_meta_objects.down.sql new file mode 100644 index 00000000..084985cf --- /dev/null +++ b/builder/store/database/migrations/20240913042113_add_index_to_lfs_meta_objects.down.sql @@ -0,0 +1,6 @@ + +SET statement_timeout = 0; + +--bun:split + +DROP INDEX IF EXISTS idx_lfs_meta_objects_repository_id_oid; \ No newline at end of file diff --git a/builder/store/database/migrations/20240913042113_add_index_to_lfs_meta_objects.up.sql b/builder/store/database/migrations/20240913042113_add_index_to_lfs_meta_objects.up.sql new file mode 100644 index 00000000..4c9970ef --- /dev/null +++ b/builder/store/database/migrations/20240913042113_add_index_to_lfs_meta_objects.up.sql @@ -0,0 +1,6 @@ + +SET statement_timeout = 0; + +--bun:split + +CREATE UNIQUE INDEX IF NOT EXISTS idx_lfs_meta_objects_repository_id_oid ON lfs_meta_objects(repository_id, oid); \ No newline at end of file diff --git a/builder/store/database/mirror.go b/builder/store/database/mirror.go index a135375f..e8e93c97 100644 --- a/builder/store/database/mirror.go +++ b/builder/store/database/mirror.go @@ -225,11 +225,23 @@ func (s *MirrorStore) Finished(ctx context.Context) ([]Mirror, error) { return mirrors, nil } -func (s *MirrorStore) ToSync(ctx context.Context) ([]Mirror, error) { +func (s *MirrorStore) ToSyncRepo(ctx context.Context) ([]Mirror, error) { var mirrors []Mirror err := 
s.db.Operator.Core.NewSelect(). Model(&mirrors). - Where("next_execution_timestamp < ?", time.Now()). + Where("next_execution_timestamp < ? or status in (?,?,?)", time.Now(), types.MirrorIncomplete, types.MirrorFailed, types.MirrorWaiting). + Scan(ctx) + if err != nil { + return nil, err + } + return mirrors, nil +} + +func (s *MirrorStore) ToSyncLfs(ctx context.Context) ([]Mirror, error) { + var mirrors []Mirror + err := s.db.Operator.Core.NewSelect(). + Model(&mirrors). + Where("next_execution_timestamp < ? or status = ?", time.Now(), types.MirrorRepoSynced). Scan(ctx) if err != nil { return nil, err diff --git a/builder/store/database/repository.go b/builder/store/database/repository.go index dd78f9d8..1a98c6c7 100644 --- a/builder/store/database/repository.go +++ b/builder/store/database/repository.go @@ -587,6 +587,36 @@ func (s *RepoStore) CountByRepoType(ctx context.Context, repoType types.Reposito return s.db.Core.NewSelect().Model(&Repository{}).Where("repository_type = ?", repoType).Count(ctx) } +func (s *RepoStore) GetRepoWithoutRuntimeByID(ctx context.Context, rfID int64, paths []string) ([]Repository, error) { + var res []Repository + q := s.db.Operator.Core.NewSelect().Model(&res) + if len(paths) > 0 { + q.Where("path in (?)", bun.In(paths)) + } + err := q.Where("repository_type = ?", types.ModelRepo). + Where("id not in (select repo_id from repositories_runtime_frameworks where runtime_framework_id = ?)", rfID). + Scan(ctx) + if err != nil { + return nil, fmt.Errorf("select repos without runtime failed, %w", err) + } + return res, nil +} + +func (s *RepoStore) GetRepoWithRuntimeByID(ctx context.Context, rfID int64, paths []string) ([]Repository, error) { + var res []Repository + q := s.db.Operator.Core.NewSelect().Model(&res) + if len(paths) > 0 { + q.Where("path in (?)", bun.In(paths)) + } + err := q.Where("repository_type = ?", types.ModelRepo). 
+ Where("id in (select repo_id from repositories_runtime_frameworks where runtime_framework_id = ?)", rfID). + Scan(ctx) + if err != nil { + return nil, fmt.Errorf("select repos with runtime failed, %w", err) + } + return res, nil +} + func (s *RepoStore) FindWithBatch(ctx context.Context, batchSize, batch int) ([]Repository, error) { var res []Repository err := s.db.Operator.Core.NewSelect(). diff --git a/builder/store/database/repository_runtime_framework.go b/builder/store/database/repository_runtime_framework.go index 57d3c530..841a29f3 100644 --- a/builder/store/database/repository_runtime_framework.go +++ b/builder/store/database/repository_runtime_framework.go @@ -2,6 +2,7 @@ package database import ( "context" + "fmt" ) type RepositoriesRuntimeFrameworkStore struct { @@ -53,6 +54,14 @@ func (m *RepositoriesRuntimeFrameworkStore) Delete(ctx context.Context, runtimeF return err } +func (m *RepositoriesRuntimeFrameworkStore) DeleteByRepoID(ctx context.Context, repoID int64) error { + _, err := m.db.Operator.Core.NewDelete().Model((*RepositoriesRuntimeFramework)(nil)).Where("repo_id = ?", repoID).Exec(ctx) + if err != nil { + return fmt.Errorf("delete repo runtime failed, %w", err) + } + return nil +} + func (m *RepositoriesRuntimeFrameworkStore) GetByIDsAndType(ctx context.Context, runtimeFrameworkID, repoID int64, deployType int) ([]RepositoriesRuntimeFramework, error) { var result []RepositoriesRuntimeFramework _, err := m.db.Operator.Core.NewSelect().Model(&result).Where("type = ? and repo_id=? and runtime_framework_id = ?", deployType, repoID, runtimeFrameworkID).Exec(ctx, &result) @@ -70,3 +79,12 @@ func (m *RepositoriesRuntimeFrameworkStore) GetByRepoIDsAndType(ctx context.Cont _, err := m.db.Operator.Core.NewSelect().Model(&result).Where("type = ? 
and repo_id=?", deployType, repoID).Exec(ctx, &result) return result, err } + +func (m *RepositoriesRuntimeFrameworkStore) GetByRepoIDs(ctx context.Context, repoID int64) ([]RepositoriesRuntimeFramework, error) { + var result []RepositoriesRuntimeFramework + _, err := m.db.Operator.Core.NewSelect().Model(&result).Where("repo_id=?", repoID).Exec(ctx, &result) + if err != nil { + return nil, fmt.Errorf("get runtime by repoid failed, %w", err) + } + return result, nil +} diff --git a/builder/store/database/runtime_architecture.go b/builder/store/database/runtime_architecture.go new file mode 100644 index 00000000..fa2d2781 --- /dev/null +++ b/builder/store/database/runtime_architecture.go @@ -0,0 +1,71 @@ +package database + +import ( + "context" + "database/sql" + "errors" + "fmt" +) + +type RuntimeArchitecturesStore struct { + db *DB +} + +func NewRuntimeArchitecturesStore() *RuntimeArchitecturesStore { + return &RuntimeArchitecturesStore{ + db: defaultDB, + } +} + +type RuntimeArchitecture struct { + ID int64 `bun:",pk,autoincrement" json:"id"` + RuntimeFrameworkID int64 `bun:",notnull" json:"runtime_framework_id"` + ArchitectureName string `bun:",notnull" json:"architecture_name"` +} + +func (ra *RuntimeArchitecturesStore) ListByRuntimeFrameworkID(ctx context.Context, id int64) ([]RuntimeArchitecture, error) { + var result []RuntimeArchitecture + _, err := ra.db.Operator.Core.NewSelect().Model(&result).Where("runtime_framework_id = ?", id).Exec(ctx, &result) + if err != nil { + return nil, fmt.Errorf("error happened while getting runtime architecture, %w", err) + } + return result, nil +} + +func (ra *RuntimeArchitecturesStore) Add(ctx context.Context, arch RuntimeArchitecture) error { + res, err := ra.db.Core.NewInsert().Model(&arch).Exec(ctx, &arch) + if err := assertAffectedOneRow(res, err); err != nil { + return fmt.Errorf("creating runtime architecture in the db failed,error:%w", err) + } + return nil +} + +func (ra *RuntimeArchitecturesStore) 
DeleteByRuntimeIDAndArchName(ctx context.Context, id int64, archName string) error { + var arch RuntimeArchitecture + _, err := ra.db.Core.NewDelete().Model(&arch).Where("runtime_framework_id = ? and architecture_name = ?", id, archName).Exec(ctx) + if err != nil { + return fmt.Errorf("deleting runtime architecture in the db failed, error:%w", err) + } + return nil +} + +func (ra *RuntimeArchitecturesStore) FindByRuntimeIDAndArchName(ctx context.Context, id int64, archName string) (*RuntimeArchitecture, error) { + var arch RuntimeArchitecture + _, err := ra.db.Core.NewSelect().Model(&arch).Where("runtime_framework_id = ? and architecture_name = ?", id, archName).Exec(ctx, &arch) + if errors.Is(err, sql.ErrNoRows) { + return nil, nil + } + if err != nil { + return nil, fmt.Errorf("getting runtime architecture in the db failed, error:%w", err) + } + return &arch, nil +} + +func (ra *RuntimeArchitecturesStore) ListByRArchName(ctx context.Context, archName string) ([]RuntimeArchitecture, error) { + var result []RuntimeArchitecture + _, err := ra.db.Operator.Core.NewSelect().Model(&result).Where("architecture_name = ?", archName).Exec(ctx, &result) + if err != nil { + return nil, fmt.Errorf("error happened while getting runtime architecture, %w", err) + } + return result, nil +} diff --git a/builder/store/database/runtime_framework.go b/builder/store/database/runtime_framework.go index e03f54b8..eab7366a 100644 --- a/builder/store/database/runtime_framework.go +++ b/builder/store/database/runtime_framework.go @@ -4,6 +4,8 @@ import ( "context" "fmt" "log/slog" + + "github.com/uptrace/bun" ) type RuntimeFrameworksStore struct { @@ -93,3 +95,12 @@ func (rf *RuntimeFrameworksStore) ListAll(ctx context.Context) ([]RuntimeFramewo } return result, nil } + +func (rf *RuntimeFrameworksStore) ListByIDs(ctx context.Context, ids []int64) ([]RuntimeFramework, error) { + var result []RuntimeFramework + _, err := rf.db.Operator.Core.NewSelect().Model(&result).Where("id in (?)", 
bun.In(ids)).Exec(ctx, &result) + if err != nil { + return nil, fmt.Errorf("query runtimes failed, %w", err) + } + return result, nil +} diff --git a/cmd/csghub-server/cmd/cron/create_push_mirror.go b/cmd/csghub-server/cmd/cron/create_push_mirror.go index b0426105..84e83785 100644 --- a/cmd/csghub-server/cmd/cron/create_push_mirror.go +++ b/cmd/csghub-server/cmd/cron/create_push_mirror.go @@ -8,6 +8,7 @@ import ( "github.com/spf13/cobra" "opencsg.com/csghub-server/builder/store/database" "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/common/types" "opencsg.com/csghub-server/component" ) @@ -44,7 +45,7 @@ var cmdCreatePushMirror = &cobra.Command{ return } - if config.GitServer.Type != "gitea" { + if config.GitServer.Type != types.GitServerTypeGitea { return } diff --git a/cmd/csghub-server/cmd/cron/cron.go b/cmd/csghub-server/cmd/cron/cron.go index bbd93438..b013151e 100644 --- a/cmd/csghub-server/cmd/cron/cron.go +++ b/cmd/csghub-server/cmd/cron/cron.go @@ -8,7 +8,6 @@ func init() { // add subcommands here Cmd.AddCommand(cmdCalcRecomScore) Cmd.AddCommand(cmdCreatePushMirror) - Cmd.AddCommand(cmdSyncAsClient) Cmd.AddCommand(cmdGenTelemetry) } diff --git a/cmd/csghub-server/cmd/git/generate_lfs_meta_objects.go b/cmd/csghub-server/cmd/git/generate_lfs_meta_objects.go index fa33456e..0ef1c178 100644 --- a/cmd/csghub-server/cmd/git/generate_lfs_meta_objects.go +++ b/cmd/csghub-server/cmd/git/generate_lfs_meta_objects.go @@ -49,7 +49,7 @@ var generateLfsMetaObjectsCmd = &cobra.Command{ return } - if config.GitServer.Type == "gitea" { + if config.GitServer.Type == types.GitServerTypeGitea { return } diff --git a/cmd/csghub-server/cmd/mirror/check_mirror_progress.go b/cmd/csghub-server/cmd/mirror/check_mirror_progress.go index de5bd1fa..d5532bac 100644 --- a/cmd/csghub-server/cmd/mirror/check_mirror_progress.go +++ b/cmd/csghub-server/cmd/mirror/check_mirror_progress.go @@ -10,6 +10,7 @@ import ( "opencsg.com/csghub-server/builder/store/cache" 
"opencsg.com/csghub-server/builder/store/database" "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/common/types" "opencsg.com/csghub-server/component" ) @@ -54,7 +55,7 @@ var checkMirrorProgress = &cobra.Command{ return } - if config.GitServer.Type != "gitea" { + if config.GitServer.Type != types.GitServerTypeGitea { return } diff --git a/cmd/csghub-server/cmd/mirror/lfs_sync.go b/cmd/csghub-server/cmd/mirror/lfs_sync.go new file mode 100644 index 00000000..3d9564a6 --- /dev/null +++ b/cmd/csghub-server/cmd/mirror/lfs_sync.go @@ -0,0 +1,40 @@ +package mirror + +import ( + "github.com/spf13/cobra" + "opencsg.com/csghub-server/builder/store/database" + "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/mirror" +) + +var lfsSyncCmd = &cobra.Command{ + Use: "lfs-sync", + Short: "Start the repository lfs files sync server", + Example: lfsSyncExample(), + RunE: func(*cobra.Command, []string) (err error) { + cfg, err := config.LoadConfig() + if err != nil { + return err + } + dbConfig := database.DBConfig{ + Dialect: database.DatabaseDialect(cfg.Database.Driver), + DSN: cfg.Database.DSN, + } + database.InitDB(dbConfig) + + lfsSyncWorker, err := mirror.NewLFSSyncWorker(cfg, cfg.Mirror.WorkerNumber) + if err != nil { + return err + } + lfsSyncWorker.Run() + + return nil + }, +} + +func lfsSyncExample() string { + return ` +# for development +csghub-server mirror lfs-sync +` +} diff --git a/cmd/csghub-server/cmd/mirror/mirror.go b/cmd/csghub-server/cmd/mirror/mirror.go index b62aebf3..da38e4b5 100644 --- a/cmd/csghub-server/cmd/mirror/mirror.go +++ b/cmd/csghub-server/cmd/mirror/mirror.go @@ -8,6 +8,8 @@ func init() { // add subcommands here Cmd.AddCommand(createMirrorRepoFromFile) Cmd.AddCommand(checkMirrorProgress) + Cmd.AddCommand(lfsSyncCmd) + Cmd.AddCommand(repoSyncCmd) } var Cmd = &cobra.Command{ diff --git a/cmd/csghub-server/cmd/mirror/repo_sync.go b/cmd/csghub-server/cmd/mirror/repo_sync.go new file mode 100644 index 
00000000..d6f030de --- /dev/null +++ b/cmd/csghub-server/cmd/mirror/repo_sync.go @@ -0,0 +1,42 @@ +package mirror + +import ( + "github.com/spf13/cobra" + "opencsg.com/csghub-server/builder/store/database" + "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/mirror" +) + +var repoSyncCmd = &cobra.Command{ + Use: "repo-sync", + Short: "Start the repository sync server", + Example: repoSyncExample(), + RunE: func(*cobra.Command, []string) (err error) { + cfg, err := config.LoadConfig() + if err != nil { + return err + } + + dbConfig := database.DBConfig{ + Dialect: database.DatabaseDialect(cfg.Database.Driver), + DSN: cfg.Database.DSN, + } + database.InitDB(dbConfig) + + repoSYncer, err := mirror.NewRepoSyncWorker(cfg, cfg.Mirror.WorkerNumber) + if err != nil { + return err + } + + repoSYncer.Run() + + return nil + }, +} + +func repoSyncExample() string { + return ` +# for development +csghub-server mirror repo-sync +` +} diff --git a/cmd/csghub-server/cmd/root.go b/cmd/csghub-server/cmd/root.go index e2b6d876..21494f5c 100644 --- a/cmd/csghub-server/cmd/root.go +++ b/cmd/csghub-server/cmd/root.go @@ -14,7 +14,7 @@ import ( "opencsg.com/csghub-server/cmd/csghub-server/cmd/migration" "opencsg.com/csghub-server/cmd/csghub-server/cmd/mirror" "opencsg.com/csghub-server/cmd/csghub-server/cmd/start" - "opencsg.com/csghub-server/cmd/csghub-server/cmd/syncversion" + "opencsg.com/csghub-server/cmd/csghub-server/cmd/sync" "opencsg.com/csghub-server/cmd/csghub-server/cmd/trigger" "opencsg.com/csghub-server/cmd/csghub-server/cmd/user" ) @@ -55,7 +55,7 @@ func init() { cron.Cmd, mirror.Cmd, accounting.Cmd, - syncversion.Cmd, + sync.Cmd, user.Cmd, git.Cmd, ) diff --git a/cmd/csghub-server/cmd/start/server.go b/cmd/csghub-server/cmd/start/server.go index 9c587b51..4bd80ae8 100644 --- a/cmd/csghub-server/cmd/start/server.go +++ b/cmd/csghub-server/cmd/start/server.go @@ -9,10 +9,11 @@ import ( "opencsg.com/csghub-server/api/router" 
"opencsg.com/csghub-server/builder/deploy" "opencsg.com/csghub-server/builder/event" - "opencsg.com/csghub-server/builder/mirror" "opencsg.com/csghub-server/builder/store/database" "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/common/types" "opencsg.com/csghub-server/docs" + "opencsg.com/csghub-server/mirror" ) var enableSwagger bool @@ -81,14 +82,13 @@ var serverCmd = &cobra.Command{ ) // Initialize mirror service - mirrorService, err := mirror.NewMirrorService(cfg, 5) + mirrorService, err := mirror.NewMirrorPriorityQueue(cfg) if err != nil { return fmt.Errorf("failed to init mirror service: %w", err) } - if cfg.MirrorServer.Enable { + if cfg.MirrorServer.Enable && cfg.GitServer.Type == types.GitServerTypeGitaly { mirrorService.EnqueueMirrorTasks() - go mirrorService.Start() } server.Run() diff --git a/cmd/csghub-server/cmd/start/start.go b/cmd/csghub-server/cmd/start/start.go index 628ef60a..39d4d85c 100644 --- a/cmd/csghub-server/cmd/start/start.go +++ b/cmd/csghub-server/cmd/start/start.go @@ -14,7 +14,6 @@ import ( func init() { Cmd.AddCommand(serverCmd) Cmd.AddCommand(rproxyCmd) - Cmd.AddCommand(syncServerCmd) } var Cmd = &cobra.Command{ diff --git a/cmd/csghub-server/cmd/start/syncserver.go b/cmd/csghub-server/cmd/start/syncserver.go deleted file mode 100644 index 4e422704..00000000 --- a/cmd/csghub-server/cmd/start/syncserver.go +++ /dev/null @@ -1,49 +0,0 @@ -package start - -import ( - "fmt" - - "github.com/spf13/cobra" - "opencsg.com/csghub-server/api/httpbase" - "opencsg.com/csghub-server/builder/store/database" - "opencsg.com/csghub-server/common/config" - "opencsg.com/csghub-server/multisync/router" -) - -var syncServerCmd = &cobra.Command{ - Use: "sync-server", - Short: "Start the multi source sync server", - Example: rproxyExample(), - RunE: func(*cobra.Command, []string) (err error) { - cfg, err := config.LoadConfig() - if err != nil { - return err - } - - dbConfig := database.DBConfig{ - Dialect: 
database.DatabaseDialect(cfg.Database.Driver), - DSN: cfg.Database.DSN, - } - database.InitDB(dbConfig) - r, err := router.NewRouter(cfg) - if err != nil { - return fmt.Errorf("failed to init router: %w", err) - } - server := httpbase.NewGracefulServer( - httpbase.GraceServerOpt{ - Port: cfg.Mirror.Port, - }, - r, - ) - server.Run() - - return nil - }, -} - -func syncServerExample() string { - return ` -# for development -csghub-server start sync-server -` -} diff --git a/cmd/csghub-server/cmd/cron/multi_sync.go b/cmd/csghub-server/cmd/sync/client.go similarity index 97% rename from cmd/csghub-server/cmd/cron/multi_sync.go rename to cmd/csghub-server/cmd/sync/client.go index b9e985ad..fe75057c 100644 --- a/cmd/csghub-server/cmd/cron/multi_sync.go +++ b/cmd/csghub-server/cmd/sync/client.go @@ -1,4 +1,4 @@ -package cron +package sync import ( "context" @@ -53,6 +53,10 @@ var cmdSyncAsClient = &cobra.Command{ return } + if !config.MultiSync.Enabled { + return + } + locker, err := cache.NewCache(ctx, cache.RedisConfig{ Addr: config.Redis.Endpoint, Username: config.Redis.User, diff --git a/cmd/csghub-server/cmd/syncversion/syncversion.go b/cmd/csghub-server/cmd/sync/sync.go similarity index 75% rename from cmd/csghub-server/cmd/syncversion/syncversion.go rename to cmd/csghub-server/cmd/sync/sync.go index 8fe2017f..c1c7fe40 100644 --- a/cmd/csghub-server/cmd/syncversion/syncversion.go +++ b/cmd/csghub-server/cmd/sync/sync.go @@ -1,4 +1,4 @@ -package syncversion +package sync import ( "github.com/spf13/cobra" @@ -6,11 +6,11 @@ import ( func init() { // add subcommands here - Cmd.AddCommand(InitCmd) + Cmd.AddCommand(cmdSyncAsClient) } var Cmd = &cobra.Command{ - Use: "sync-version", + Use: "sync", Short: "entry point for mirror jobs", Run: func(cmd *cobra.Command, args []string) { _ = cmd.Help() diff --git a/cmd/csghub-server/cmd/syncversion/init.go b/cmd/csghub-server/cmd/syncversion/init.go deleted file mode 100644 index cfd6e954..00000000 --- 
a/cmd/csghub-server/cmd/syncversion/init.go +++ /dev/null @@ -1,98 +0,0 @@ -package syncversion - -import ( - "context" - "fmt" - "log/slog" - - "github.com/spf13/cobra" - "opencsg.com/csghub-server/builder/store/database" - "opencsg.com/csghub-server/common/config" - "opencsg.com/csghub-server/common/types" - "opencsg.com/csghub-server/component" -) - -var InitCmd = &cobra.Command{ - Use: "init", - Short: "init syncversion table", - PersistentPreRunE: func(cmd *cobra.Command, args []string) (err error) { - config, err := config.LoadConfig() - if err != nil { - return fmt.Errorf("failed to load config,%w", err) - } - - dbConfig := database.DBConfig{ - Dialect: database.DatabaseDialect(config.Database.Driver), - DSN: config.Database.DSN, - } - - database.InitDB(dbConfig) - if err != nil { - return fmt.Errorf("initializing DB connection: %w", err) - } - ctx := context.WithValue(cmd.Context(), "config", config) - cmd.SetContext(ctx) - return - }, - Run: func(cmd *cobra.Command, args []string) { - ctx := cmd.Context() - config, ok := ctx.Value("config").(*config.Config) - if !ok { - slog.Error("config not found in context") - return - } - var versions []database.SyncVersion - repoComponent, err := component.NewRepoComponent(config) - if err != nil { - slog.Error("failed to create repository component: %v", err) - return - } - mirrorRepo := database.NewMirrorStore() - syncVersionStore := database.NewSyncVersionStore() - - mirrors, err := mirrorRepo.Finished(ctx) - if err != nil { - slog.Error("error finding mirror repositories: %v", err) - return - } - for _, mirror := range mirrors { - repo := mirror.Repository - if repo == nil { - continue - } - if repo.Private { - continue - } - namespace, name := repo.NamespaceAndName() - req := &types.GetCommitsReq{ - Namespace: namespace, - Name: name, - Ref: repo.DefaultBranch, - RepoType: repo.RepositoryType, - } - commit, err := repoComponent.LastCommit(ctx, req) - if err != nil { - slog.Error("error getting repository last 
commit: %v", err) - continue - } - - versions = append(versions, database.SyncVersion{ - SourceID: types.SyncVersionSourceOpenCSG, - RepoPath: repo.Path, - RepoType: repo.RepositoryType, - LastModifiedAt: repo.UpdatedAt, - ChangeLog: commit.Message, - }) - } - if len(versions) == 0 { - slog.Error("there are no finished mirror repositories") - return - } - err = syncVersionStore.BatchCreate(ctx, versions) - if err != nil { - slog.Error("failed to init sync version error: %v", err) - return - } - slog.Info("sync versions successfully") - }, -} diff --git a/common/config/config.go b/common/config/config.go index fb24d043..717f6306 100644 --- a/common/config/config.go +++ b/common/config/config.go @@ -21,6 +21,7 @@ type Config struct { Token string `envconfig:"STARHUB_SERVER_MIRROR_Token" default:""` Port int `envconfig:"STARHUB_SERVER_MIRROR_PORT" default:"8085"` SessionSecretKey string `envconfig:"STARHUB_SERVER_MIRROR_SESSION_SECRET_KEY" default:"mirror"` + WorkerNumber int `envconfig:"STARHUB_SERVER_MIRROR_WORKER_NUMBER" default:"5"` } DocsHost string `envconfig:"STARHUB_SERVER_SERVER_DOCS_HOST" default:"http://localhost:6636"` @@ -167,7 +168,7 @@ type Config struct { MultiSync struct { SaasAPIDomain string `envconfig:"OPENCSG_SAAS_API_DOMAIN" default:"https://hub.opencsg.com"` SaasSyncDomain string `envconfig:"OPENCSG_SAAS_SYNC_DOMAIN" default:"https://sync.opencsg.com"` - // Enabled bool `envconfig:"STARHUB_SERVER_MULTI_SYNC_ENABLED" default:"false"` + Enabled bool `envconfig:"STARHUB_SERVER_MULTI_SYNC_ENABLED" default:"false"` } Telemetry struct { diff --git a/common/types/git_server_type.go b/common/types/git_server_type.go new file mode 100644 index 00000000..fa3e28f5 --- /dev/null +++ b/common/types/git_server_type.go @@ -0,0 +1,6 @@ +package types + +const ( + GitServerTypeGitaly string = "gitaly" + GitServerTypeGitea string = "gitea" +) diff --git a/common/types/lfs.go b/common/types/lfs.go index e061b288..522e7e79 100644 --- a/common/types/lfs.go +++ 
b/common/types/lfs.go @@ -90,8 +90,15 @@ type LFSBatchResponse struct { } type LFSBatchRequest struct { - Operation string `json:"operation"` - Objects []LFSBatchObject `json:"objects"` + Operation string `json:"operation"` + Objects []LFSBatchObject `json:"objects"` + Ref LFSBatchObjectRef `json:"ref"` + Transfers []string `json:"transfers,omitempty"` + HashAlog string `json:"hash_alog,omitempty"` +} + +type LFSBatchObjectRef struct { + Name string `json:"name"` } type LFSBatchObject struct { diff --git a/common/types/mirror.go b/common/types/mirror.go index 24c40da4..f9e6cecc 100644 --- a/common/types/mirror.go +++ b/common/types/mirror.go @@ -90,6 +90,7 @@ type MirrorTaskStatus string const ( MirrorWaiting MirrorTaskStatus = "waiting" MirrorRunning MirrorTaskStatus = "running" + MirrorRepoSynced MirrorTaskStatus = "repo_synced" MirrorFinished MirrorTaskStatus = "finished" MirrorFailed MirrorTaskStatus = "failed" MirrorIncomplete MirrorTaskStatus = "incomplete" @@ -106,9 +107,9 @@ const ( type MirrorPriority int const ( - HighMirrorPriority MirrorPriority = 2 - MediumMirrorPriority MirrorPriority = 1 - LowMirrorPriority MirrorPriority = 0 + HighMirrorPriority MirrorPriority = 3 + MediumMirrorPriority MirrorPriority = 2 + LowMirrorPriority MirrorPriority = 1 ) type Mirror struct { diff --git a/component/callback/git_callback.go b/component/callback/git_callback.go index da6a63b2..0a309dcb 100644 --- a/component/callback/git_callback.go +++ b/component/callback/git_callback.go @@ -39,6 +39,10 @@ type GitCallbackComponent struct { rrs *database.RepoRelationsStore mirrorStore *database.MirrorStore svGen *SyncVersionGenerator + rrf *database.RepositoriesRuntimeFrameworkStore + rac *component.RuntimeArchitectureComponent + ras *database.RuntimeArchitecturesStore + rfs *database.RuntimeFrameworksStore // set visibility if file content is sensitive setRepoVisibility bool } @@ -61,10 +65,17 @@ func NewGitCallback(config *config.Config) (*GitCallbackComponent, error) { 
mirrorStore := database.NewMirrorStore() checker := component.NewSensitiveComponent(config) sc, err := component.NewSpaceComponent(config) + ras := database.NewRuntimeArchitecturesStore() if err != nil { return nil, err } svGen := NewSyncVersionGenerator() + rrf := database.NewRepositoriesRuntimeFramework() + rac, err := component.NewRuntimeArchitectureComponent(config) + if err != nil { + return nil, err + } + rfs := database.NewRuntimeFrameworksStore() return &GitCallbackComponent{ config: config, gs: gs, @@ -78,6 +89,10 @@ func NewGitCallback(config *config.Config) (*GitCallbackComponent, error) { mirrorStore: mirrorStore, checker: checker, svGen: svGen, + rrf: rrf, + rac: rac, + ras: ras, + rfs: rfs, }, nil } @@ -172,6 +187,8 @@ func (c *GitCallbackComponent) HandlePush(ctx context.Context, req *types.GiteaC func (c *GitCallbackComponent) modifyFiles(ctx context.Context, repoType, namespace, repoName, ref string, fileNames []string) error { for _, fileName := range fileNames { slog.Debug("modify file", slog.String("file", fileName)) + // update model runtime + c.updateModelRuntimeFrameworks(ctx, repoType, namespace, repoName, ref, fileName, false) // only care about readme file under root directory if fileName != ReadmeFileName { continue @@ -207,6 +224,8 @@ func (c *GitCallbackComponent) removeFiles(ctx context.Context, repoType, namesp // delete tagss for _, fileName := range fileNames { slog.Debug("remove file", slog.String("file", fileName)) + // update model runtime + c.updateModelRuntimeFrameworks(ctx, repoType, namespace, repoName, ref, fileName, true) // only care about readme file under root directory if fileName == ReadmeFileName { // use empty content to clear all the meta tags @@ -248,6 +267,8 @@ func (c *GitCallbackComponent) removeFiles(ctx context.Context, repoType, namesp func (c *GitCallbackComponent) addFiles(ctx context.Context, repoType, namespace, repoName, ref string, fileNames []string) error { for _, fileName := range fileNames { 
slog.Debug("add file", slog.String("file", fileName)) + // update model runtime + c.updateModelRuntimeFrameworks(ctx, repoType, namespace, repoName, ref, fileName, false) // only care about readme file under root directory if fileName == ReadmeFileName { content, err := c.getFileRaw(repoType, namespace, repoName, ref, fileName) @@ -420,3 +441,90 @@ func (c *GitCallbackComponent) setPrivate(ctx context.Context, repoType, namespa return err } + +func (c *GitCallbackComponent) updateModelRuntimeFrameworks(ctx context.Context, repoType, namespace, repoName, ref, fileName string, deleteAction bool) { + slog.Debug("update model relation for git callback", slog.Any("namespace", namespace), slog.Any("repoName", repoName), slog.Any("repoType", repoType), slog.Any("fileName", fileName), slog.Any("branch", ref)) + // must be model repo and config.json + if repoType != ModelRepoType || fileName != component.ConfigFileName || ref != ("refs/heads/"+component.MainBranch) { + return + } + repo, err := c.rs.FindByPath(ctx, types.ModelRepo, namespace, repoName) + if err != nil || repo == nil { + slog.Warn("fail to query repo for git callback", slog.Any("namespace", namespace), slog.Any("repoName", repoName), slog.Any("error", err)) + return + } + // delete event + if deleteAction { + err := c.rrf.DeleteByRepoID(ctx, repo.ID) + if err != nil { + slog.Warn("fail to remove repo runtimes for git callback", slog.Any("namespace", namespace), slog.Any("repoName", repoName), slog.Any("repoid", repo.ID), slog.Any("error", err)) + } + return + } + arch, err := c.rac.GetArchitectureFromConfig(ctx, namespace, repoName) + if err != nil { + slog.Warn("fail to get config.json content for git callback", slog.Any("namespace", namespace), slog.Any("repoName", repoName), slog.Any("error", err)) + return + } + slog.Debug("get arch for git callback", slog.Any("namespace", namespace), slog.Any("repoName", repoName), slog.Any("arch", arch)) + runtimes, err := c.ras.ListByRArchName(ctx, arch) + if err != 
nil { + slog.Warn("fail to get runtime ids by arch for git callback", slog.Any("arch", arch), slog.Any("error", err)) + return + } + slog.Debug("get runtimes by arch for git callback", slog.Any("namespace", namespace), slog.Any("repoName", repoName), slog.Any("arch", arch), slog.Any("runtimes", runtimes)) + var frameIDs []int64 + for _, runtime := range runtimes { + frameIDs = append(frameIDs, runtime.RuntimeFrameworkID) + } + slog.Debug("get new frame ids for git callback", slog.Any("namespace", namespace), slog.Any("repoName", repoName), slog.Any("frameIDs", frameIDs)) + newFrames, err := c.rfs.ListByIDs(ctx, frameIDs) + if err != nil { + slog.Warn("fail to get runtime frameworks for git callback", slog.Any("arch", arch), slog.Any("error", err)) + return + } + slog.Debug("get new frames by arch for git callback", slog.Any("namespace", namespace), slog.Any("repoName", repoName), slog.Any("newFrames", newFrames)) + var newFrameMap map[string]string = make(map[string]string) + for _, frame := range newFrames { + newFrameMap[string(frame.ID)] = string(frame.ID) + } + slog.Debug("get new frame map by arch for git callback", slog.Any("namespace", namespace), slog.Any("repoName", repoName), slog.Any("newFrameMap", newFrameMap)) + oldRepoRuntimes, err := c.rrf.GetByRepoIDs(ctx, repo.ID) + if err != nil { + slog.Warn("fail to get repo runtimes for git callback", slog.Any("repo.ID", repo.ID), slog.Any("error", err)) + return + } + slog.Debug("get old frames by arch for git callback", slog.Any("namespace", namespace), slog.Any("repoName", repoName), slog.Any("oldRepoRuntimes", oldRepoRuntimes)) + var oldFrameMap map[string]string = make(map[string]string) + // get map + for _, runtime := range oldRepoRuntimes { + oldFrameMap[string(runtime.RuntimeFrameworkID)] = string(runtime.RuntimeFrameworkID) + } + slog.Debug("get old frame map by arch for git callback", slog.Any("namespace", namespace), slog.Any("repoName", repoName), slog.Any("oldFrameMap", oldFrameMap)) + // remove 
incorrect relation + for _, old := range oldRepoRuntimes { + // check if it need remove + _, exist := newFrameMap[string(old.RuntimeFrameworkID)] + if !exist { + // remove incorrect relations + err := c.rrf.Delete(ctx, old.RuntimeFrameworkID, repo.ID, old.Type) + if err != nil { + slog.Warn("fail to delete old repo runtimes for git callback", slog.Any("repo.ID", repo.ID), slog.Any("runtime framework id", old.RuntimeFrameworkID), slog.Any("error", err)) + } + } + } + + // add new relation + for _, new := range newFrames { + // check if it need add + _, exist := oldFrameMap[string(new.ID)] + if !exist { + // add new relations + err := c.rrf.Add(ctx, new.ID, repo.ID, new.Type) + if err != nil { + slog.Warn("fail to add new repo runtimes for git callback", slog.Any("repo.ID", repo.ID), slog.Any("runtime framework id", new.ID), slog.Any("error", err)) + } + } + } + +} diff --git a/component/mirror.go b/component/mirror.go index 759ae506..5e613ae3 100644 --- a/component/mirror.go +++ b/component/mirror.go @@ -13,12 +13,12 @@ import ( "opencsg.com/csghub-server/builder/git" "opencsg.com/csghub-server/builder/git/gitserver" "opencsg.com/csghub-server/builder/git/mirrorserver" - "opencsg.com/csghub-server/builder/mirror/queue" "opencsg.com/csghub-server/builder/store/database" "opencsg.com/csghub-server/builder/store/s3" "opencsg.com/csghub-server/common/config" "opencsg.com/csghub-server/common/types" "opencsg.com/csghub-server/common/utils/common" + "opencsg.com/csghub-server/mirror/queue" ) type MirrorComponent struct { @@ -39,6 +39,7 @@ type MirrorComponent struct { lfsMetaObjectStore *database.LfsMetaObjectStore userStore *database.UserStore config *config.Config + mq *queue.PriorityQueue } func NewMirrorComponent(config *config.Config) (*MirrorComponent, error) { @@ -50,6 +51,10 @@ func NewMirrorComponent(config *config.Config) (*MirrorComponent, error) { slog.Error(newError.Error()) return nil, newError } + c.mq, err = queue.GetPriorityQueueInstance() + if err != nil 
{ + return nil, fmt.Errorf("failed to get priority queue: %v", err) + } c.repoComp, err = NewRepoComponent(config) if err != nil { return nil, fmt.Errorf("fail to create repo component,error:%w", err) @@ -231,7 +236,7 @@ func (c *MirrorComponent) CreateMirrorRepo(ctx context.Context, req types.Create mirror.SourceRepoPath = fmt.Sprintf("%s/%s", req.SourceNamespace, req.SourceName) mirror.Priority = types.HighMirrorPriority var taskId int64 - if c.config.GitServer.Type == "gitea" { + if c.config.GitServer.Type == types.GitServerTypeGitea { taskId, err = c.mirrorServer.CreateMirrorRepo(ctx, mirrorserver.CreateMirrorRepoReq{ Namespace: "root", Name: mirror.LocalRepoPath, @@ -254,8 +259,8 @@ func (c *MirrorComponent) CreateMirrorRepo(ctx context.Context, req types.Create return nil, fmt.Errorf("failed to create mirror") } - if c.config.GitServer.Type == "gitaly" { - queue.GetPriorityQueueInstance().Push(&queue.MirrorTask{ + if c.config.GitServer.Type == types.GitServerTypeGitaly { + c.mq.PushRepoMirror(&queue.MirrorTask{ MirrorID: reqMirror.ID, Priority: queue.PriorityMap[reqMirror.Priority], }) diff --git a/component/repo.go b/component/repo.go index 0ec9e4f2..c9b13a02 100644 --- a/component/repo.go +++ b/component/repo.go @@ -25,13 +25,13 @@ import ( "opencsg.com/csghub-server/builder/git/gitserver" "opencsg.com/csghub-server/builder/git/membership" "opencsg.com/csghub-server/builder/git/mirrorserver" - "opencsg.com/csghub-server/builder/mirror/queue" "opencsg.com/csghub-server/builder/rpc" "opencsg.com/csghub-server/builder/store/database" "opencsg.com/csghub-server/builder/store/s3" "opencsg.com/csghub-server/common/config" "opencsg.com/csghub-server/common/types" "opencsg.com/csghub-server/common/utils/common" + "opencsg.com/csghub-server/mirror/queue" ) const ( @@ -73,6 +73,7 @@ type RepoComponent struct { srs *database.SpaceResourceStore lfsMetaObjectStore *database.LfsMetaObjectStore recom *database.RecomStore + mq *queue.PriorityQueue } func 
NewRepoComponent(config *config.Config) (*RepoComponent, error) { @@ -96,6 +97,11 @@ func NewRepoComponent(config *config.Config) (*RepoComponent, error) { slog.Error(newError.Error()) return nil, newError } + mq, err := queue.GetPriorityQueueInstance() + if err != nil { + return nil, fmt.Errorf("failed to get priority queue: %v", err) + } + c.mq = mq c.mirrorServer, err = git.NewMirrorServer(config) if err != nil { newError := fmt.Errorf("fail to create git mirror server,error:%w", err) @@ -458,7 +464,7 @@ func (c *RepoComponent) CreateFile(ctx context.Context, req *types.CreateFileReq ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) defer cancel() - if c.config.GitServer.Type == "gitaly" { + if c.config.GitServer.Type == types.GitServerTypeGitaly { useLfs, req = c.checkIfShouldUseLfs(ctx, req) } @@ -570,7 +576,7 @@ func (c *RepoComponent) UpdateFile(ctx context.Context, req *types.UpdateFileReq return nil, fmt.Errorf("fail to check namespace, cause: %w", err) } - if c.config.GitServer.Type == "gitaly" { + if c.config.GitServer.Type == types.GitServerTypeGitaly { useLfs, req = c.checkIfShouldUseLfsUpdate(ctx, req) } @@ -941,6 +947,7 @@ func (c *RepoComponent) Tree(ctx context.Context, req *types.GetFileReq) ([]*typ Path: req.Path, RepoType: req.RepoType, } + getRepoFileTree.Ref = repo.DefaultBranch tree, err := c.git.GetRepoFileTree(ctx, getRepoFileTree) if err != nil { return nil, fmt.Errorf("failed to get git %s repository file tree, error: %w", req.RepoType, err) @@ -1420,7 +1427,7 @@ func (c *RepoComponent) CreateMirror(ctx context.Context, req types.CreateMirror mirror.RepositoryID = repo.ID if c.config.Saas { - if c.config.GitServer.Type == "gitea" { + if c.config.GitServer.Type == types.GitServerTypeGitea { mirror.PushUsername = req.CurrentUser mirror.PushAccessToken = pushAccessToken.Token taskId, err = c.mirrorServer.CreateMirrorRepo(ctx, mirrorserver.CreateMirrorRepoReq{ @@ -1436,7 +1443,7 @@ func (c *RepoComponent) 
CreateMirror(ctx context.Context, req types.CreateMirror } } } else { - if c.config.GitServer.Type == "gitea" { + if c.config.GitServer.Type == types.GitServerTypeGitea { err = c.git.MirrorSync(ctx, gitserver.MirrorSyncReq{ Namespace: req.Namespace, Name: req.Name, @@ -1458,10 +1465,11 @@ func (c *RepoComponent) CreateMirror(ctx context.Context, req types.CreateMirror return nil, fmt.Errorf("failed to create mirror") } - if c.config.GitServer.Type == "gitaly" { - queue.GetPriorityQueueInstance().Push(&queue.MirrorTask{ - MirrorID: reqMirror.ID, - Priority: queue.PriorityMap[reqMirror.Priority], + if c.config.GitServer.Type == types.GitServerTypeGitaly { + c.mq.PushRepoMirror(&queue.MirrorTask{ + MirrorID: reqMirror.ID, + Priority: queue.PriorityMap[reqMirror.Priority], + CreatedAt: mirror.CreatedAt.Unix(), }) reqMirror.Status = types.MirrorWaiting err = c.mirror.Update(ctx, reqMirror) @@ -1485,7 +1493,7 @@ func (c *RepoComponent) MirrorFromSaas(ctx context.Context, namespace, name, cur } } if m != nil { - err := c.mirrorFromSaasSync(ctx, repo, namespace, name, repoType) + err := c.mirrorFromSaasSync(ctx, m, namespace, name, repoType) if err != nil { return fmt.Errorf("failed to trigger mirror sync, error: %w", err) } @@ -1521,8 +1529,6 @@ func (c *RepoComponent) MirrorFromSaas(ctx context.Context, namespace, name, cur Namespace: namespace, Name: name, CloneUrl: mirror.SourceUrl, - Username: mirror.Username, - AccessToken: mirror.AccessToken, RepoType: repoType, MirrorToken: syncClientSetting.Token, Private: false, @@ -1538,6 +1544,16 @@ func (c *RepoComponent) MirrorFromSaas(ctx context.Context, namespace, name, cur if err != nil { return fmt.Errorf("failed to create mirror") } + + if c.config.GitServer.Type == types.GitServerTypeGitaly { + c.mq.PushRepoMirror(&queue.MirrorTask{ + MirrorID: mirror.ID, + Priority: queue.Priority(mirror.Priority), + CreatedAt: mirror.CreatedAt.Unix(), + MirrorToken: syncClientSetting.Token, + }) + } + repo.SyncStatus = 
types.SyncStatusInProgress _, err = c.repo.UpdateRepo(ctx, *repo) if err != nil { @@ -1546,36 +1562,30 @@ func (c *RepoComponent) MirrorFromSaas(ctx context.Context, namespace, name, cur return nil } -func (c *RepoComponent) mirrorFromSaasSync(ctx context.Context, repo *database.Repository, namespace, name string, repoType types.RepositoryType) error { +func (c *RepoComponent) mirrorFromSaasSync(ctx context.Context, mirror *database.Mirror, namespace, name string, repoType types.RepositoryType) error { var err error - if c.config.GitServer.Type == "gitea" { + syncClientSetting, err := c.syncClientSetting.First(ctx) + if err != nil { + return fmt.Errorf("failed to find sync client setting, error: %w", err) + } + if c.config.GitServer.Type == types.GitServerTypeGitea { err = c.git.MirrorSync(ctx, gitserver.MirrorSyncReq{ - Namespace: namespace, - Name: name, - RepoType: repoType, + Namespace: namespace, + Name: name, + RepoType: repoType, + MirrorToken: syncClientSetting.Token, }) if err != nil { return fmt.Errorf("failed to sync mirror, error: %w", err) } - } else if c.config.GitServer.Type == "gitaly" { - mirror, err := c.mirror.FindByRepoID(ctx, repo.ID) - if err != nil { - return fmt.Errorf("failed to find mirror, error: %w", err) - } - queue.GetPriorityQueueInstance().Push(&queue.MirrorTask{ - MirrorID: mirror.ID, - Priority: queue.PriorityMap[mirror.Priority], - }) - mirror.Status = types.MirrorWaiting - err = c.mirror.Update(ctx, mirror) - if err != nil { - return fmt.Errorf("failed to update mirror status: %v", err) - } } - repo.SyncStatus = types.SyncStatusInProgress - _, err = c.repo.UpdateRepo(ctx, *repo) - if err != nil { - return fmt.Errorf("failed to update repo sync status") + if c.config.GitServer.Type == types.GitServerTypeGitaly { + c.mq.PushRepoMirror(&queue.MirrorTask{ + MirrorID: mirror.ID, + Priority: queue.Priority(mirror.Priority), + CreatedAt: mirror.CreatedAt.Unix(), + MirrorToken: syncClientSetting.Token, + }) } return nil } @@ -2238,7 
+2248,7 @@ func (c *RepoComponent) SyncMirror(ctx context.Context, repoType types.Repositor return fmt.Errorf("failed to find mirror, error: %w", err) } mirror.Priority = types.HighMirrorPriority - if c.config.GitServer.Type == "gitea" { + if c.config.GitServer.Type == types.GitServerTypeGitea { err = c.mirrorServer.MirrorSync(ctx, mirrorserver.MirrorSyncReq{ Namespace: "root", Name: mirror.LocalRepoPath, @@ -2246,10 +2256,11 @@ func (c *RepoComponent) SyncMirror(ctx context.Context, repoType types.Repositor if err != nil { return fmt.Errorf("failed to sync mirror, error: %w", err) } - } else if c.config.GitServer.Type == "gitaly" { - queue.GetPriorityQueueInstance().Push(&queue.MirrorTask{ - MirrorID: mirror.ID, - Priority: queue.PriorityMap[mirror.Priority], + } else if c.config.GitServer.Type == types.GitServerTypeGitaly { + c.mq.PushRepoMirror(&queue.MirrorTask{ + MirrorID: mirror.ID, + Priority: queue.PriorityMap[mirror.Priority], + CreatedAt: mirror.CreatedAt.Unix(), }) mirror.Status = types.MirrorWaiting err = c.mirror.Update(ctx, mirror) diff --git a/component/runtime_architecture.go b/component/runtime_architecture.go new file mode 100644 index 00000000..8fe65aa5 --- /dev/null +++ b/component/runtime_architecture.go @@ -0,0 +1,229 @@ +package component + +import ( + "context" + "encoding/json" + "fmt" + "log/slog" + "strings" + "sync" + + "opencsg.com/csghub-server/builder/git/gitserver" + "opencsg.com/csghub-server/builder/store/database" + "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/common/types" +) + +var ( + MainBranch string = "main" + ConfigFileName string = "config.json" + ScanLock sync.Mutex +) + +type RuntimeArchitectureComponent struct { + r *RepoComponent + ras *database.RuntimeArchitecturesStore +} + +func NewRuntimeArchitectureComponent(config *config.Config) (*RuntimeArchitectureComponent, error) { + c := &RuntimeArchitectureComponent{} + c.ras = database.NewRuntimeArchitecturesStore() + repo, err := 
NewRepoComponent(config) + if err != nil { + return nil, fmt.Errorf("fail to create repo component, %w", err) + } + c.r = repo + return c, nil +} + +func (c *RuntimeArchitectureComponent) ListByRuntimeFrameworkID(ctx context.Context, id int64) ([]database.RuntimeArchitecture, error) { + archs, err := c.ras.ListByRuntimeFrameworkID(ctx, id) + if err != nil { + return nil, fmt.Errorf("list runtime arch failed, %w", err) + } + return archs, nil +} + +func (c *RuntimeArchitectureComponent) SetArchitectures(ctx context.Context, id int64, architectures []string) ([]string, error) { + _, err := c.r.rtfm.FindByID(ctx, id) + if err != nil { + return nil, fmt.Errorf("invalid runtime framework id, %w", err) + } + var failedArchs []string + for _, arch := range architectures { + if len(strings.Trim(arch, " ")) < 1 { + continue + } + err := c.ras.Add(ctx, database.RuntimeArchitecture{ + RuntimeFrameworkID: id, + ArchitectureName: strings.Trim(arch, " "), + }) + if err != nil { + failedArchs = append(failedArchs, arch) + } + } + return failedArchs, nil +} + +func (c *RuntimeArchitectureComponent) DeleteArchitectures(ctx context.Context, id int64, architectures []string) ([]string, error) { + _, err := c.r.rtfm.FindByID(ctx, id) + if err != nil { + return nil, fmt.Errorf("invalid runtime framework id, %w", err) + } + var failedDeletes []string + for _, arch := range architectures { + if len(strings.Trim(arch, " ")) < 1 { + continue + } + err := c.ras.DeleteByRuntimeIDAndArchName(ctx, id, strings.Trim(arch, " ")) + if err != nil { + failedDeletes = append(failedDeletes, arch) + } + } + return failedDeletes, nil +} + +func (c *RuntimeArchitectureComponent) ScanArchitecture(ctx context.Context, id int64, scanType int, models []string) error { + frame, err := c.r.rtfm.FindByID(ctx, id) + if err != nil { + return fmt.Errorf("invalid runtime framework id, %w", err) + } + archs, err := c.ras.ListByRuntimeFrameworkID(ctx, id) + if err != nil { + return fmt.Errorf("list runtime arch 
failed, %w", err) + } + var archMap map[string]string = make(map[string]string) + for _, arch := range archs { + archMap[arch.ArchitectureName] = arch.ArchitectureName + } + + if ScanLock.TryLock() { + go func() { + slog.Info("scan models to update runtime frameworks started") + defer ScanLock.Unlock() + if scanType == 0 || scanType == 2 { + err := c.scanExistModels(ctx, types.ScanReq{ + FrameID: id, + FrameType: frame.Type, + ArchMap: archMap, + Models: models, + }) + if err != nil { + slog.Error("scan old models failed", slog.Any("error", err)) + } + } + + if scanType == 0 || scanType == 1 { + err := c.scanNewModels(ctx, types.ScanReq{ + FrameID: id, + FrameType: frame.Type, + ArchMap: archMap, + Models: models, + }) + if err != nil { + slog.Error("scan new models failed", slog.Any("error", err)) + } + } + slog.Info("scan models to update runtime frameworks done") + }() + } else { + return fmt.Errorf("architecture scan is already in progress") + } + return nil +} + +func (c *RuntimeArchitectureComponent) scanNewModels(ctx context.Context, req types.ScanReq) error { + repos, err := c.r.repo.GetRepoWithoutRuntimeByID(ctx, req.FrameID, req.Models) + if err != nil { + return fmt.Errorf("failed to get repos without runtime by ID, %w", err) + } + if repos == nil { + return nil + } + for _, repo := range repos { + fields := strings.Split(repo.Path, "/") + arch, err := c.GetArchitectureFromConfig(ctx, fields[0], fields[1]) + if err != nil { + slog.Warn("failed to get arch for create relation", slog.Any("ConfigFileName", ConfigFileName), slog.Any("repo", repo.Path), slog.Any("error", err)) + continue + } + if len(arch) < 1 { + continue + } + _, exist := req.ArchMap[arch] + if !exist { + continue + } + err = c.r.rrtfms.Add(ctx, req.FrameID, repo.ID, req.FrameType) + if err != nil { + slog.Warn("fail to create relation", slog.Any("repo", repo.Path), slog.Any("frameid", req.FrameID), slog.Any("error", err)) + } + } + return nil +} + +func (c *RuntimeArchitectureComponent) 
scanExistModels(ctx context.Context, req types.ScanReq) error { + repos, err := c.r.repo.GetRepoWithRuntimeByID(ctx, req.FrameID, req.Models) + if err != nil { + return fmt.Errorf("fail to get repos with runtime by ID, %w", err) + } + if repos == nil { + return nil + } + for _, repo := range repos { + fields := strings.Split(repo.Path, "/") + arch, err := c.GetArchitectureFromConfig(ctx, fields[0], fields[1]) + if err != nil { + slog.Warn("failed to get arch for remove relation", slog.Any("ConfigFileName", ConfigFileName), slog.Any("repo", repo.Path), slog.Any("error", err)) + continue + } + if len(arch) < 1 { + continue + } + _, exist := req.ArchMap[arch] + if exist { + continue + } + err = c.r.rrtfms.Delete(ctx, req.FrameID, repo.ID, req.FrameType) + if err != nil { + slog.Warn("fail to remove relation", slog.Any("repo", repo.Path), slog.Any("frameid", req.FrameID), slog.Any("error", err)) + } + } + return nil +} + +func (c *RuntimeArchitectureComponent) GetArchitectureFromConfig(ctx context.Context, namespace, name string) (string, error) { + content, err := c.getConfigContent(ctx, namespace, name) + if err != nil { + return "", fmt.Errorf("fail to read config.json for relation, %w", err) + } + var config struct { + Architectures []string `json:"architectures"` + } + if err := json.Unmarshal([]byte(content), &config); err != nil { + return "", fmt.Errorf("fail to unmarshal config, %w", err) + } + slog.Debug("unmarshal config", slog.Any("config", config)) + if config.Architectures == nil { + return "", nil + } + if len(config.Architectures) < 1 { + return "", nil + } + slog.Debug("architectures of config", slog.Any("Architectures", config.Architectures)) + return config.Architectures[0], nil +} + +func (c *RuntimeArchitectureComponent) getConfigContent(ctx context.Context, namespace, name string) (string, error) { + content, err := c.r.git.GetRepoFileRaw(ctx, gitserver.GetRepoInfoByPathReq{ + Namespace: namespace, + Name: name, + Ref: MainBranch, + Path: 
ConfigFileName, + RepoType: types.ModelRepo, + }) + if err != nil { + return "", fmt.Errorf("get RepoFileRaw for relation, %w", err) + } + return content, nil +} diff --git a/component/tagparser/nameparser.go b/component/tagparser/nameparser.go index 0065fb6a..f6dd7061 100644 --- a/component/tagparser/nameparser.go +++ b/component/tagparser/nameparser.go @@ -36,14 +36,14 @@ func LibraryTag(filePath string) string { } func isPytorch(filename string) bool { - return strings.HasPrefix(filename, "pytorch_model") && strings.HasSuffix(filename, ".bin") + return (strings.HasPrefix(filename, "pytorch_model") && strings.HasSuffix(filename, ".bin")) || strings.HasSuffix(filename, ".pt") } func isTensorflow(filename string) bool { return strings.HasPrefix(filename, "tf_model") && strings.HasSuffix(filename, ".h5") } func isSafetensors(filename string) bool { - return strings.HasPrefix(filename, "model") && strings.HasSuffix(filename, ".safetensors") + return strings.HasSuffix(filename, ".safetensors") } func isJAX(filename string) bool { return strings.HasPrefix(filename, "flax_model") && strings.HasSuffix(filename, ".msgpack") diff --git a/component/tagparser/nameparser_test.go b/component/tagparser/nameparser_test.go index e0cd19df..fdeaea29 100644 --- a/component/tagparser/nameparser_test.go +++ b/component/tagparser/nameparser_test.go @@ -14,6 +14,7 @@ func TestLibraryTag(t *testing.T) { {name: "case insensitive", args: args{filePath: "Pytorch_model.Bin"}, want: "pytorch"}, {name: "pytorch", args: args{filePath: "pytorch_model.bin"}, want: "pytorch"}, + {name: "pytorch", args: args{filePath: "model.pt"}, want: "pytorch"}, {name: "pytorch", args: args{filePath: "pytorch_model_001.bin"}, want: "pytorch"}, {name: "not pytorch", args: args{filePath: "1-pytorch_model_001.bin"}, want: ""}, {name: "not pytorch", args: args{filePath: "pytorch_model-bin"}, want: ""}, @@ -25,8 +26,9 @@ func TestLibraryTag(t *testing.T) { {name: "safetensors", args: args{filePath: 
"model.safetensors"}, want: "safetensors"}, {name: "safetensors", args: args{filePath: "model_001.safetensors"}, want: "safetensors"}, - {name: "not safetensors", args: args{filePath: "1-model.safetensors"}, want: ""}, - {name: "not safetensors", args: args{filePath: "model-safetensors"}, want: ""}, + {name: "safetensors", args: args{filePath: "adpter_model.safetensors"}, want: "safetensors"}, + {name: "not safetensors", args: args{filePath: "1-test.safeten"}, want: ""}, + {name: "not safetensors", args: args{filePath: "test-safetensors"}, want: ""}, {name: "flax_model", args: args{filePath: "flax_model.msgpack"}, want: "jax"}, {name: "flax_model", args: args{filePath: "flax_model-001.msgpack"}, want: "jax"}, diff --git a/docs/docs.go b/docs/docs.go index d87e1f19..7a994101 100644 --- a/docs/docs.go +++ b/docs/docs.go @@ -6302,6 +6302,167 @@ const docTemplate = `{ } } }, + "/runtime_framework/{id}/architecture": { + "get": { + "security": [ + { + "ApiKey": [] + } + ], + "description": "get runtime framework architectures", + "consumes": [ + "application/json" + ], + "produces": [ + "application/json" + ], + "tags": [ + "RuntimeFramework" + ], + "summary": "Get runtime framework architectures", + "parameters": [ + { + "type": "integer", + "description": "runtime framework id", + "name": "id", + "in": "path", + "required": true + } + ], + "responses": { + "200": { + "description": "OK", + "schema": { + "$ref": "#/definitions/types.Response" + } + }, + "400": { + "description": "Bad request", + "schema": { + "$ref": "#/definitions/types.APIBadRequest" + } + }, + "500": { + "description": "Internal server error", + "schema": { + "$ref": "#/definitions/types.APIInternalServerError" + } + } + } + }, + "put": { + "security": [ + { + "ApiKey": [] + } + ], + "description": "set runtime framework architectures", + "consumes": [ + "application/json" + ], + "produces": [ + "application/json" + ], + "tags": [ + "RuntimeFramework" + ], + "summary": "Set runtime framework 
architectures", + "parameters": [ + { + "type": "integer", + "description": "runtime framework id", + "name": "id", + "in": "path", + "required": true + }, + { + "description": "body", + "name": "body", + "in": "body", + "required": true, + "schema": { + "$ref": "#/definitions/types.RuntimeArchitecture" + } + } + ], + "responses": { + "200": { + "description": "OK", + "schema": { + "$ref": "#/definitions/types.Response" + } + }, + "400": { + "description": "Bad request", + "schema": { + "$ref": "#/definitions/types.APIBadRequest" + } + }, + "500": { + "description": "Internal server error", + "schema": { + "$ref": "#/definitions/types.APIInternalServerError" + } + } + } + }, + "delete": { + "security": [ + { + "ApiKey": [] + } + ], + "description": "Delete runtime framework architectures", + "consumes": [ + "application/json" + ], + "produces": [ + "application/json" + ], + "tags": [ + "RuntimeFramework" + ], + "summary": "Delete runtime framework architectures", + "parameters": [ + { + "type": "integer", + "description": "runtime framework id", + "name": "id", + "in": "path", + "required": true + }, + { + "description": "body", + "name": "body", + "in": "body", + "required": true, + "schema": { + "$ref": "#/definitions/types.RuntimeArchitecture" + } + } + ], + "responses": { + "200": { + "description": "OK", + "schema": { + "$ref": "#/definitions/types.Response" + } + }, + "400": { + "description": "Bad request", + "schema": { + "$ref": "#/definitions/types.APIBadRequest" + } + }, + "500": { + "description": "Internal server error", + "schema": { + "$ref": "#/definitions/types.APIInternalServerError" + } + } + } + } + }, "/runtime_framework/{id}/models": { "get": { "security": [ @@ -6383,6 +6544,75 @@ const docTemplate = `{ } } }, + "/runtime_framework/{id}/scan": { + "post": { + "security": [ + { + "ApiKey": [] + } + ], + "description": "Scan runtime architecture", + "consumes": [ + "application/json" + ], + "produces": [ + "application/json" + ], + "tags": [ + 
"RuntimeFramework" + ], + "summary": "Scan runtime architecture", + "parameters": [ + { + "type": "integer", + "description": "runtime framework id", + "name": "id", + "in": "path", + "required": true + }, + { + "enum": [ + 0, + 1, + 2 + ], + "type": "integer", + "description": "scan_type(0:all models, 1:new models, 2:old models)", + "name": "scan_type", + "in": "query" + }, + { + "description": "body", + "name": "body", + "in": "body", + "required": true, + "schema": { + "$ref": "#/definitions/types.RuntimeFrameworkModels" + } + } + ], + "responses": { + "200": { + "description": "OK", + "schema": { + "$ref": "#/definitions/types.Response" + } + }, + "400": { + "description": "Bad request", + "schema": { + "$ref": "#/definitions/types.APIBadRequest" + } + }, + "500": { + "description": "Internal server error", + "schema": { + "$ref": "#/definitions/types.APIInternalServerError" + } + } + } + } + }, "/space_resources": { "get": { "security": [ @@ -14043,6 +14273,32 @@ const docTemplate = `{ "types.CreateFileResp": { "type": "object" }, + "types.CreateJWTReq": { + "type": "object", + "required": [ + "current_user", + "uuid" + ], + "properties": { + "current_user": { + "type": "string" + }, + "uuid": { + "type": "string" + } + } + }, + "types.CreateJWTResp": { + "type": "object", + "properties": { + "expire_at": { + "type": "string" + }, + "token": { + "type": "string" + } + } + }, "types.CreateMirrorParams": { "type": "object", "properties": { @@ -15010,6 +15266,9 @@ const docTemplate = `{ "description": "unique name of the organization", "type": "string" }, + "user_id": { + "type": "integer" + }, "verified": { "type": "boolean" } @@ -15169,6 +15428,17 @@ const docTemplate = `{ } } }, + "types.RuntimeArchitecture": { + "type": "object", + "properties": { + "architectures": { + "type": "array", + "items": { + "type": "string" + } + } + } + }, "types.RuntimeFramework": { "type": "object", "properties": { @@ -15694,6 +15964,9 @@ const docTemplate = `{ "homepage": { 
"type": "string" }, + "id": { + "type": "integer" + }, "last_login_at": { "type": "string" }, diff --git a/docs/swagger.json b/docs/swagger.json index 433774ac..9a4b2406 100644 --- a/docs/swagger.json +++ b/docs/swagger.json @@ -6291,6 +6291,167 @@ } } }, + "/runtime_framework/{id}/architecture": { + "get": { + "security": [ + { + "ApiKey": [] + } + ], + "description": "get runtime framework architectures", + "consumes": [ + "application/json" + ], + "produces": [ + "application/json" + ], + "tags": [ + "RuntimeFramework" + ], + "summary": "Get runtime framework architectures", + "parameters": [ + { + "type": "integer", + "description": "runtime framework id", + "name": "id", + "in": "path", + "required": true + } + ], + "responses": { + "200": { + "description": "OK", + "schema": { + "$ref": "#/definitions/types.Response" + } + }, + "400": { + "description": "Bad request", + "schema": { + "$ref": "#/definitions/types.APIBadRequest" + } + }, + "500": { + "description": "Internal server error", + "schema": { + "$ref": "#/definitions/types.APIInternalServerError" + } + } + } + }, + "put": { + "security": [ + { + "ApiKey": [] + } + ], + "description": "set runtime framework architectures", + "consumes": [ + "application/json" + ], + "produces": [ + "application/json" + ], + "tags": [ + "RuntimeFramework" + ], + "summary": "Set runtime framework architectures", + "parameters": [ + { + "type": "integer", + "description": "runtime framework id", + "name": "id", + "in": "path", + "required": true + }, + { + "description": "body", + "name": "body", + "in": "body", + "required": true, + "schema": { + "$ref": "#/definitions/types.RuntimeArchitecture" + } + } + ], + "responses": { + "200": { + "description": "OK", + "schema": { + "$ref": "#/definitions/types.Response" + } + }, + "400": { + "description": "Bad request", + "schema": { + "$ref": "#/definitions/types.APIBadRequest" + } + }, + "500": { + "description": "Internal server error", + "schema": { + "$ref": 
"#/definitions/types.APIInternalServerError" + } + } + } + }, + "delete": { + "security": [ + { + "ApiKey": [] + } + ], + "description": "Delete runtime framework architectures", + "consumes": [ + "application/json" + ], + "produces": [ + "application/json" + ], + "tags": [ + "RuntimeFramework" + ], + "summary": "Delete runtime framework architectures", + "parameters": [ + { + "type": "integer", + "description": "runtime framework id", + "name": "id", + "in": "path", + "required": true + }, + { + "description": "body", + "name": "body", + "in": "body", + "required": true, + "schema": { + "$ref": "#/definitions/types.RuntimeArchitecture" + } + } + ], + "responses": { + "200": { + "description": "OK", + "schema": { + "$ref": "#/definitions/types.Response" + } + }, + "400": { + "description": "Bad request", + "schema": { + "$ref": "#/definitions/types.APIBadRequest" + } + }, + "500": { + "description": "Internal server error", + "schema": { + "$ref": "#/definitions/types.APIInternalServerError" + } + } + } + } + }, "/runtime_framework/{id}/models": { "get": { "security": [ @@ -6372,6 +6533,75 @@ } } }, + "/runtime_framework/{id}/scan": { + "post": { + "security": [ + { + "ApiKey": [] + } + ], + "description": "Scan runtime architecture", + "consumes": [ + "application/json" + ], + "produces": [ + "application/json" + ], + "tags": [ + "RuntimeFramework" + ], + "summary": "Scan runtime architecture", + "parameters": [ + { + "type": "integer", + "description": "runtime framework id", + "name": "id", + "in": "path", + "required": true + }, + { + "enum": [ + 0, + 1, + 2 + ], + "type": "integer", + "description": "scan_type(0:all models, 1:new models, 2:old models)", + "name": "scan_type", + "in": "query" + }, + { + "description": "body", + "name": "body", + "in": "body", + "required": true, + "schema": { + "$ref": "#/definitions/types.RuntimeFrameworkModels" + } + } + ], + "responses": { + "200": { + "description": "OK", + "schema": { + "$ref": 
"#/definitions/types.Response" + } + }, + "400": { + "description": "Bad request", + "schema": { + "$ref": "#/definitions/types.APIBadRequest" + } + }, + "500": { + "description": "Internal server error", + "schema": { + "$ref": "#/definitions/types.APIInternalServerError" + } + } + } + } + }, "/space_resources": { "get": { "security": [ @@ -14032,6 +14262,32 @@ "types.CreateFileResp": { "type": "object" }, + "types.CreateJWTReq": { + "type": "object", + "required": [ + "current_user", + "uuid" + ], + "properties": { + "current_user": { + "type": "string" + }, + "uuid": { + "type": "string" + } + } + }, + "types.CreateJWTResp": { + "type": "object", + "properties": { + "expire_at": { + "type": "string" + }, + "token": { + "type": "string" + } + } + }, "types.CreateMirrorParams": { "type": "object", "properties": { @@ -14999,6 +15255,9 @@ "description": "unique name of the organization", "type": "string" }, + "user_id": { + "type": "integer" + }, "verified": { "type": "boolean" } @@ -15158,6 +15417,17 @@ } } }, + "types.RuntimeArchitecture": { + "type": "object", + "properties": { + "architectures": { + "type": "array", + "items": { + "type": "string" + } + } + } + }, "types.RuntimeFramework": { "type": "object", "properties": { @@ -15683,6 +15953,9 @@ "homepage": { "type": "string" }, + "id": { + "type": "integer" + }, "last_login_at": { "type": "string" }, diff --git a/docs/swagger.yaml b/docs/swagger.yaml index 0f596f38..fbee72f4 100644 --- a/docs/swagger.yaml +++ b/docs/swagger.yaml @@ -969,6 +969,23 @@ definitions: type: object types.CreateFileResp: type: object + types.CreateJWTReq: + properties: + current_user: + type: string + uuid: + type: string + required: + - current_user + - uuid + type: object + types.CreateJWTResp: + properties: + expire_at: + type: string + token: + type: string + type: object types.CreateMirrorParams: properties: mirror_source_id: @@ -1624,6 +1641,8 @@ definitions: path: description: unique name of the organization type: string + 
user_id: + type: integer verified: type: boolean type: object @@ -1734,6 +1753,13 @@ definitions: total: type: integer type: object + types.RuntimeArchitecture: + properties: + architectures: + items: + type: string + type: array + type: object types.RuntimeFramework: properties: container_port: @@ -2088,6 +2114,8 @@ definitions: type: string homepage: type: string + id: + type: integer last_login_at: type: string nickname: @@ -7772,6 +7800,109 @@ paths: summary: Set model runtime frameworks tags: - RuntimeFramework + /runtime_framework/{id}/architecture: + delete: + consumes: + - application/json + description: Delete runtime framework architectures + parameters: + - description: runtime framework id + in: path + name: id + required: true + type: integer + - description: body + in: body + name: body + required: true + schema: + $ref: '#/definitions/types.RuntimeArchitecture' + produces: + - application/json + responses: + "200": + description: OK + schema: + $ref: '#/definitions/types.Response' + "400": + description: Bad request + schema: + $ref: '#/definitions/types.APIBadRequest' + "500": + description: Internal server error + schema: + $ref: '#/definitions/types.APIInternalServerError' + security: + - ApiKey: [] + summary: Delete runtime framework architectures + tags: + - RuntimeFramework + get: + consumes: + - application/json + description: get runtime framework architectures + parameters: + - description: runtime framework id + in: path + name: id + required: true + type: integer + produces: + - application/json + responses: + "200": + description: OK + schema: + $ref: '#/definitions/types.Response' + "400": + description: Bad request + schema: + $ref: '#/definitions/types.APIBadRequest' + "500": + description: Internal server error + schema: + $ref: '#/definitions/types.APIInternalServerError' + security: + - ApiKey: [] + summary: Get runtime framework architectures + tags: + - RuntimeFramework + put: + consumes: + - application/json + description: set 
runtime framework architectures + parameters: + - description: runtime framework id + in: path + name: id + required: true + type: integer + - description: body + in: body + name: body + required: true + schema: + $ref: '#/definitions/types.RuntimeArchitecture' + produces: + - application/json + responses: + "200": + description: OK + schema: + $ref: '#/definitions/types.Response' + "400": + description: Bad request + schema: + $ref: '#/definitions/types.APIBadRequest' + "500": + description: Internal server error + schema: + $ref: '#/definitions/types.APIInternalServerError' + security: + - ApiKey: [] + summary: Set runtime framework architectures + tags: + - RuntimeFramework /runtime_framework/{id}/models: get: consumes: @@ -7826,6 +7957,51 @@ paths: summary: Get Visible models by runtime framework for current user tags: - RuntimeFramework + /runtime_framework/{id}/scan: + post: + consumes: + - application/json + description: Scan runtime architecture + parameters: + - description: runtime framework id + in: path + name: id + required: true + type: integer + - description: scan_type(0:all models, 1:new models, 2:old models) + enum: + - 0 + - 1 + - 2 + in: query + name: scan_type + type: integer + - description: body + in: body + name: body + required: true + schema: + $ref: '#/definitions/types.RuntimeFrameworkModels' + produces: + - application/json + responses: + "200": + description: OK + schema: + $ref: '#/definitions/types.Response' + "400": + description: Bad request + schema: + $ref: '#/definitions/types.APIBadRequest' + "500": + description: Internal server error + schema: + $ref: '#/definitions/types.APIInternalServerError' + security: + - ApiKey: [] + summary: Scan runtime architecture + tags: + - RuntimeFramework /runtime_framework/models: get: consumes: diff --git a/mirror/lfs_sync_worker.go b/mirror/lfs_sync_worker.go new file mode 100644 index 00000000..6940a07e --- /dev/null +++ b/mirror/lfs_sync_worker.go @@ -0,0 +1,17 @@ +package mirror + +import 
( + "context" + + "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/mirror/lfssyncer" +) + +type LFSSyncWorker interface { + Run() + SyncLfs(ctx context.Context, workerID int, mirrorID int64) error +} + +func NewLFSSyncWorker(config *config.Config, numWorkers int) (LFSSyncWorker, error) { + return lfssyncer.NewMinioLFSSyncWorker(config, numWorkers) +} diff --git a/mirror/lfssyncer/minio.go b/mirror/lfssyncer/minio.go new file mode 100644 index 00000000..3051c140 --- /dev/null +++ b/mirror/lfssyncer/minio.go @@ -0,0 +1,276 @@ +package lfssyncer + +import ( + "bytes" + "context" + "encoding/json" + "fmt" + "log/slog" + "net/http" + "net/url" + "path/filepath" + "sync" + + "github.com/minio/minio-go/v7" + "opencsg.com/csghub-server/builder/store/database" + "opencsg.com/csghub-server/builder/store/s3" + "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/common/types" + "opencsg.com/csghub-server/mirror/queue" +) + +type MinioLFSSyncWorker struct { + mq *queue.PriorityQueue + tasks chan queue.MirrorTask + wg sync.WaitGroup + mirrorStore *database.MirrorStore + lfsMetaObjectStore *database.LfsMetaObjectStore + s3Client *minio.Client + config *config.Config + numWorkers int +} + +func NewMinioLFSSyncWorker(config *config.Config, numWorkers int) (*MinioLFSSyncWorker, error) { + var err error + w := &MinioLFSSyncWorker{} + w.numWorkers = numWorkers + w.s3Client, err = s3.NewMinio(config) + if err != nil { + newError := fmt.Errorf("fail to init s3 client for code,error:%w", err) + slog.Error(newError.Error()) + return nil, newError + } + w.mirrorStore = database.NewMirrorStore() + w.lfsMetaObjectStore = database.NewLfsMetaObjectStore() + w.config = config + mq, err := queue.GetPriorityQueueInstance() + if err != nil { + return nil, fmt.Errorf("fail to get priority queue: %w", err) + } + w.mq = mq + w.tasks = make(chan queue.MirrorTask) + return w, nil +} + +func (w *MinioLFSSyncWorker) Run() { + for i := 1; i <= w.numWorkers; i++ { + 
w.wg.Add(1) + go w.worker(i) + } + go w.dispatcher() + w.wg.Wait() +} + +func (w *MinioLFSSyncWorker) dispatcher() { + for { + task := w.mq.PopLfsMirror() + if task != nil { + w.tasks <- *task + } + } +} + +func (w *MinioLFSSyncWorker) worker(id int) { + defer w.wg.Done() + defer func() { + if r := recover(); r != nil { + w.wg.Add(1) + go w.worker(id) + slog.Info("worker recovered from panic", slog.Int("workerId", id)) + } + }() + slog.Info("worker start", slog.Int("workerId", id)) + for { + task := <-w.tasks + ctx := context.Background() + err := w.SyncLfs(ctx, id, task.MirrorID) + if err != nil { + slog.Error("fail to sync lfs", slog.Int("workerId", id), slog.String("error", err.Error())) + continue + } + } +} + +func (w *MinioLFSSyncWorker) SyncLfs(ctx context.Context, workerId int, mirrorID int64) error { + var pointers []*types.Pointer + mirror, err := w.mirrorStore.FindByID(ctx, mirrorID) + if err != nil { + slog.Error("fail to get mirror", slog.Int("workerId", workerId), slog.String("error", err.Error())) + return fmt.Errorf("fail to get mirror: %w", err) + } + lfsMetaObjects, err := w.lfsMetaObjectStore.FindByRepoID(ctx, mirror.Repository.ID) + if err != nil { + slog.Error("fail to get lfs meta objects", slog.Int("workerId", workerId), slog.String("error", err.Error())) + return fmt.Errorf("fail to get lfs meta objects: %w", err) + } + for _, lfsMetaObject := range lfsMetaObjects { + pointers = append(pointers, &types.Pointer{ + Oid: lfsMetaObject.Oid, + Size: lfsMetaObject.Size, + }) + } + + pointers, err = w.GetLFSDownloadURLs(ctx, mirror, pointers) + if err != nil { + return fmt.Errorf("fail to get LFS download URL: %w", err) + } + err = w.DownloadAndUploadLFSFiles(ctx, mirror, pointers) + if err != nil { + return fmt.Errorf("fail to download and upload LFS files: %w", err) + } + return nil +} + +func (w *MinioLFSSyncWorker) GetLFSDownloadURLs(ctx context.Context, mirror *database.Mirror, pointers []*types.Pointer) ([]*types.Pointer, error) { + var 
resPointers []*types.Pointer + requestPayload := types.LFSBatchRequest{ + Operation: "download", + } + + for _, pointer := range pointers { + requestPayload.Objects = append(requestPayload.Objects, types.LFSBatchObject{ + Oid: pointer.Oid, + Size: pointer.Size, + }) + } + requestPayload.HashAlog = "sha256" + requestPayload.Transfers = []string{"lfs-standalone-file", "basic", "bash"} + + lfsAPIURL := mirror.SourceUrl + "/info/lfs/objects/batch" + + payload, err := json.Marshal(requestPayload) + if err != nil { + return resPointers, fmt.Errorf("failed to marshal request payload: %v", err) + } + + req, err := http.NewRequest("POST", lfsAPIURL, bytes.NewReader(payload)) + if err != nil { + return resPointers, fmt.Errorf("failed to create LFS batch request: %v", err) + } + + parsedURL, err := url.Parse(lfsAPIURL) + if err != nil { + return resPointers, fmt.Errorf("failed to parse LFS API URL: %v", err) + } + + req.Header.Set("Host", parsedURL.Host) + req.Header.Set("Accept", "application/vnd.git-lfs+json") + req.Header.Set("Content-Type", "application/vnd.git-lfs+json; charset=utf-8") + req.Header.Set("User-Agent", "git-lfs/3.5.1") + + client := &http.Client{} + resp, err := client.Do(req) + if err != nil { + return resPointers, fmt.Errorf("failed to send LFS batch request: %v", err) + } + defer resp.Body.Close() + + if resp.StatusCode != http.StatusOK { + return resPointers, fmt.Errorf("failed to get LFS download URL, status code: %d", resp.StatusCode) + } + + var batchResponse types.LFSBatchResponse + err = json.NewDecoder(resp.Body).Decode(&batchResponse) + if err != nil { + return resPointers, fmt.Errorf("failed to decode LFS batch response: %v", err) + } + + if len(batchResponse.Objects) == 0 { + return resPointers, fmt.Errorf("no objects found in LFS batch response") + } + for _, object := range batchResponse.Objects { + resPointers = append(resPointers, &types.Pointer{ + Oid: object.Oid, + Size: object.Size, + DownloadURL: object.Actions.Download.Href, + }) + } + 
+ return resPointers, nil +} + +func (w *MinioLFSSyncWorker) DownloadAndUploadLFSFiles(ctx context.Context, mirror *database.Mirror, pointers []*types.Pointer) error { + var finishedLFSFileCount int + lfsFilesCount := len(pointers) + for _, pointer := range pointers { + objectKey := filepath.Join("lfs", pointer.RelativePath()) + fileInfo, err := w.s3Client.StatObject(ctx, w.config.S3.Bucket, objectKey, minio.StatObjectOptions{}) + if err != nil && err.Error() != "The specified key does not exist." { + slog.Error("failed to check if LFS file exists", slog.Any("error", err)) + continue + } + if (err != nil && err.Error() != "The specified key does not exist.") || fileInfo.Size != pointer.Size { + err = w.DownloadAndUploadLFSFile(ctx, mirror, pointer) + if err != nil { + slog.Error("failed to download and upload LFS file", slog.Any("error", err)) + } + } + + lfsMetaObject := database.LfsMetaObject{ + Size: pointer.Size, + Oid: pointer.Oid, + RepositoryID: mirror.Repository.ID, + Existing: true, + } + _, err = w.lfsMetaObjectStore.UpdateOrCreate(ctx, lfsMetaObject) + if err != nil { + slog.Error("failed to update or create LFS meta object", slog.Any("error", err)) + return fmt.Errorf("failed to update or create LFS meta object: %w", err) + } + slog.Info("finish to download and upload LFS file", slog.Any("objectKey", objectKey)) + finishedLFSFileCount += 1 + mirror.Progress = int8(finishedLFSFileCount * 100 / lfsFilesCount) + err = w.mirrorStore.Update(ctx, mirror) + if err != nil { + return fmt.Errorf("failed to update mirror progress: %w", err) + } + } + mirror.Status = types.MirrorFinished + err := w.mirrorStore.Update(ctx, mirror) + if err != nil { + return fmt.Errorf("failed to update mirror status: %w", err) + } + return nil +} + +func (w *MinioLFSSyncWorker) DownloadAndUploadLFSFile(ctx context.Context, mirror *database.Mirror, pointer *types.Pointer) error { + objectKey := filepath.Join("lfs", pointer.RelativePath()) + slog.Info("downloading LFS file from", 
slog.Any("url", pointer.DownloadURL)) + + req, err := http.NewRequest("GET", pointer.DownloadURL, nil) + if err != nil { + return fmt.Errorf("failed to create download request: %w", err) + } + + parsedURL, err := url.Parse(pointer.DownloadURL) + if err != nil { + return fmt.Errorf("failed to parse LFS download URL: %v", err) + } + + req.Header.Set("Host", parsedURL.Host) + req.Header.Set("Accept", "application/vnd.git-lfs+json") + req.Header.Set("Content-Type", "application/vnd.git-lfs+json; charset=utf-8") + req.Header.Set("User-Agent", "git-lfs/3.5.1") + + client := &http.Client{} + resp, err := client.Do(req) + if err != nil { + return fmt.Errorf("failed to download LFS file: %w", err) + } + defer resp.Body.Close() + + if resp.StatusCode != http.StatusOK { + return fmt.Errorf("failed to download LFS file: %s", resp.Status) + } + slog.Info("uploading LFS file", slog.Any("object_key", objectKey)) + uploadInfo, err := w.s3Client.PutObject(ctx, w.config.S3.Bucket, objectKey, resp.Body, resp.ContentLength, minio.PutObjectOptions{}) + if err != nil { + return fmt.Errorf("failed to upload to Minio: %w", err) + } + + if uploadInfo.Size != pointer.Size { + return fmt.Errorf("uploaded file size does not match expected size: %d != %d", uploadInfo.Size, pointer.Size) + } + + return nil +} diff --git a/mirror/prioriy_queue.go b/mirror/prioriy_queue.go new file mode 100644 index 00000000..9c94a328 --- /dev/null +++ b/mirror/prioriy_queue.go @@ -0,0 +1,73 @@ +package mirror + +import ( + "context" + "fmt" + "log/slog" + + "opencsg.com/csghub-server/builder/store/database" + "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/common/types" + "opencsg.com/csghub-server/mirror/queue" +) + +type MirrorPriorityQueue struct { + mq *queue.PriorityQueue + tasks chan queue.MirrorTask + numWorkers int +} + +func NewMirrorPriorityQueue(config *config.Config) (*MirrorPriorityQueue, error) { + s := &MirrorPriorityQueue{} + mq, err := queue.GetPriorityQueueInstance() + if err 
!= nil { + return nil, fmt.Errorf("fail to get priority queue: %w", err) + } + s.mq = mq + s.tasks = make(chan queue.MirrorTask) + s.numWorkers = config.Mirror.WorkerNumber + return s, nil +} + +func (ms *MirrorPriorityQueue) EnqueueMirrorTasks() { + mirrorStore := database.NewMirrorStore() + mirrors, err := mirrorStore.ToSyncRepo(context.Background()) + if err != nil { + slog.Error("fail to get mirror to sync", slog.String("error", err.Error())) + return + } + + for _, mirror := range mirrors { + ms.mq.PushRepoMirror(&queue.MirrorTask{ + MirrorID: mirror.ID, + Priority: queue.Priority(mirror.Priority), + CreatedAt: mirror.CreatedAt.Unix(), + }) + mirror.Status = types.MirrorWaiting + err = mirrorStore.Update(context.Background(), &mirror) + if err != nil { + slog.Error("fail to update mirror status", slog.Int64("mirrorId", mirror.ID), slog.String("error", err.Error())) + continue + } + } + + mirrors, err = mirrorStore.ToSyncLfs(context.Background()) + if err != nil { + slog.Error("fail to get mirror to sync", slog.String("error", err.Error())) + return + } + + for _, mirror := range mirrors { + ms.mq.PushLfsMirror(&queue.MirrorTask{ + MirrorID: mirror.ID, + Priority: queue.Priority(mirror.Priority), + CreatedAt: mirror.CreatedAt.Unix(), + }) + mirror.Status = types.MirrorWaiting + err = mirrorStore.Update(context.Background(), &mirror) + if err != nil { + slog.Error("fail to update mirror status", slog.Int64("mirrorId", mirror.ID), slog.String("error", err.Error())) + continue + } + } +} diff --git a/mirror/queue/queue.go b/mirror/queue/queue.go new file mode 100644 index 00000000..6779b898 --- /dev/null +++ b/mirror/queue/queue.go @@ -0,0 +1,136 @@ +package queue + +import ( + "context" + "encoding/json" + "fmt" + "sync" + "time" + + "github.com/redis/go-redis/v9" + "opencsg.com/csghub-server/builder/store/cache" + "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/common/types" +) + +type Priority int + +func (p Priority) Int() int { return 
int(p) } + +const ( + HighPriority Priority = 3 + MediumPriority Priority = 2 + LowPriority Priority = 1 +) + +var PriorityMap = map[types.MirrorPriority]Priority{ + types.HighMirrorPriority: HighPriority, + types.MediumMirrorPriority: MediumPriority, + types.LowMirrorPriority: LowPriority, +} + +const ( + repoQueueName = "repo_mirror_queue" + lfsQueueName = "lfs_mirror_queue" +) + +type MirrorTask struct { + MirrorID int64 `json:"mirror_id"` + Priority Priority `json:"priority"` + CreatedAt int64 `json:"created_at"` + MirrorToken string `json:"mirror_token"` +} + +type MirrorQueue struct { + redis *cache.Cache + QueueName string +} + +func (m *MirrorTask) MarshalBinary() ([]byte, error) { + return json.Marshal(m) +} + +func (m *MirrorTask) UnmarshalBinary(data []byte) error { + return json.Unmarshal(data, m) +} + +func (mq *MirrorQueue) Push(t *MirrorTask) { + if t.CreatedAt == 0 { + t.CreatedAt = time.Now().Unix() + } + mq.redis.ZAdd(context.Background(), mq.QueueName, redis.Z{ + Score: float64(t.CreatedAt) * float64(t.Priority), + Member: t, + }) +} + +func (mq *MirrorQueue) Pop() *MirrorTask { + r, _ := mq.redis.ZPopMax(context.Background(), mq.QueueName, 1) + if len(r) == 0 { + return nil + } + var task MirrorTask + json.Unmarshal([]byte(r[0].Member.(string)), &task) + return &task +} + +type PriorityQueue struct { + RepoMirrorQueue MirrorQueue + LfsMirrorQueue MirrorQueue +} + +var ( + instance *PriorityQueue + once sync.Once + err error + c *config.Config +) + +func NewPriorityQueue(ctx context.Context, config *config.Config) (*PriorityQueue, error) { + redis, err := cache.NewCache(ctx, cache.RedisConfig{ + Addr: config.Redis.Endpoint, + Username: config.Redis.User, + Password: config.Redis.Password, + }) + if err != nil { + return nil, fmt.Errorf("initializing redis: %w", err) + } + mq := &PriorityQueue{ + RepoMirrorQueue: MirrorQueue{ + redis: redis, + QueueName: repoQueueName, + }, + LfsMirrorQueue: MirrorQueue{ + redis: redis, + QueueName: lfsQueueName, 
+ }, + } + return mq, nil +} + +func (pq *PriorityQueue) PushRepoMirror(mt *MirrorTask) { + pq.RepoMirrorQueue.Push(mt) +} + +func (pq *PriorityQueue) PopRepoMirror() *MirrorTask { + return pq.RepoMirrorQueue.Pop() +} + +func (pq *PriorityQueue) PushLfsMirror(mt *MirrorTask) { + pq.LfsMirrorQueue.Push(mt) +} + +func (pq *PriorityQueue) PopLfsMirror() *MirrorTask { + return pq.LfsMirrorQueue.Pop() +} + +func GetPriorityQueueInstance() (*PriorityQueue, error) { + once.Do(func() { + c, err = config.LoadConfig() + instance, err = NewPriorityQueue(context.Background(), c) + }) + if err != nil { + return nil, err + } + return instance, nil +} diff --git a/mirror/repo_sync_worker.go b/mirror/repo_sync_worker.go new file mode 100644 index 00000000..9a9bfa4e --- /dev/null +++ b/mirror/repo_sync_worker.go @@ -0,0 +1,18 @@ +package mirror + +import ( + "context" + + "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/mirror/queue" + "opencsg.com/csghub-server/mirror/reposyncer" +) + +type RepoSyncWorker interface { + Run() + SyncRepo(ctx context.Context, task queue.MirrorTask) error +} + +func NewRepoSyncWorker(config *config.Config, numWorkers int) (RepoSyncWorker, error) { + return reposyncer.NewLocalMirrorWoker(config, numWorkers) +} diff --git a/mirror/reposyncer/local_woker.go b/mirror/reposyncer/local_woker.go new file mode 100644 index 00000000..298eea60 --- /dev/null +++ b/mirror/reposyncer/local_woker.go @@ -0,0 +1,233 @@ +package reposyncer + +import ( + "context" + "fmt" + "log/slog" + "strconv" + "strings" + "sync" + + "opencsg.com/csghub-server/builder/git" + "opencsg.com/csghub-server/builder/git/gitserver" + "opencsg.com/csghub-server/builder/store/database" + "opencsg.com/csghub-server/common/config" + "opencsg.com/csghub-server/common/types" + "opencsg.com/csghub-server/mirror/queue" +) + +type LocalMirrorWoker struct { + mq *queue.PriorityQueue + tasks chan queue.MirrorTask + numWorkers int + wg sync.WaitGroup + saas bool + mirrorStore 
*database.MirrorStore + lfsMetaObjectStore *database.LfsMetaObjectStore + repoStore *database.RepoStore + git gitserver.GitServer + config *config.Config +} + +func NewLocalMirrorWoker(config *config.Config, numWorkers int) (*LocalMirrorWoker, error) { + var err error + w := &LocalMirrorWoker{} + w.numWorkers = numWorkers + w.git, err = git.NewGitServer(config) + if err != nil { + newError := fmt.Errorf("fail to create git server,error:%w", err) + slog.Error(newError.Error()) + return nil, newError + } + w.mirrorStore = database.NewMirrorStore() + w.repoStore = database.NewRepoStore() + w.lfsMetaObjectStore = database.NewLfsMetaObjectStore() + w.saas = config.Saas + w.config = config + mq, err := queue.GetPriorityQueueInstance() + if err != nil { + return nil, fmt.Errorf("fail to get priority queue: %w", err) + } + w.mq = mq + w.tasks = make(chan queue.MirrorTask) + w.numWorkers = numWorkers + return w, nil +} + +func (w *LocalMirrorWoker) Run() { + for i := 1; i <= w.numWorkers; i++ { + w.wg.Add(1) + go w.worker(i) + } + go w.dispatcher() + w.wg.Wait() +} + +func (w *LocalMirrorWoker) dispatcher() { + for { + task := w.mq.PopRepoMirror() + if task != nil { + w.tasks <- *task + } + } +} + +func (w *LocalMirrorWoker) worker(id int) { + defer w.wg.Done() + defer func() { + if r := recover(); r != nil { + w.wg.Add(1) + go w.worker(id) + slog.Info("worker recovered from panic", slog.Int("workerId", id)) + } + }() + slog.Info("worker start", slog.Int("workerId", id)) + for { + task := <-w.tasks + slog.Info("start to mirror", slog.Int64("mirrorId", task.MirrorID), slog.Int("priority", task.Priority.Int()), slog.Int("workerId", id)) + err := w.SyncRepo(context.Background(), task) + if err != nil { + slog.Info("fail to mirror", slog.Int64("mirrorId", task.MirrorID), slog.Int("priority", task.Priority.Int()), slog.Int("workerId", id), slog.String("error", err.Error())) + } + slog.Info("finish to mirror", slog.Int64("mirrorId", task.MirrorID), slog.Int("priority", 
task.Priority.Int()), slog.Int("workerId", id)) + } +} + +func (w *LocalMirrorWoker) SyncRepo(ctx context.Context, task queue.MirrorTask) error { + mirror, err := w.mirrorStore.FindByID(ctx, task.MirrorID) + if err != nil { + return fmt.Errorf("failed to get mirror: %v", err) + } + mirror.Status = types.MirrorRunning + mirror.Priority = types.LowMirrorPriority + err = w.mirrorStore.Update(ctx, mirror) + if err != nil { + return fmt.Errorf("failed to update mirror status: %v", err) + } + if mirror.Repository == nil { + return fmt.Errorf("mirror repository is nil") + } + namespace := strings.Split(mirror.Repository.Path, "/")[0] + name := strings.Split(mirror.Repository.Path, "/")[1] + + slog.Info("Start to sync mirror repo", "repo_type", mirror.Repository.RepositoryType, "namespace", namespace, "name", name) + req := gitserver.MirrorSyncReq{ + Namespace: namespace, + Name: name, + CloneUrl: mirror.SourceUrl, + Username: mirror.Username, + AccessToken: mirror.AccessToken, + RepoType: mirror.Repository.RepositoryType, + } + if task.MirrorToken != "" { + req.MirrorToken = task.MirrorToken + } + err = w.git.MirrorSync(ctx, req) + + if err != nil { + return fmt.Errorf("failed mirror remote repo in git server: %v", err) + } + slog.Info("Mirror remote repo in git server successfully", "repo_type", mirror.Repository.RepositoryType, "namespace", namespace, "name", name) + + resp, err := w.git.GetRepo(ctx, gitserver.GetRepoReq{ + Namespace: namespace, + Name: name, + RepoType: mirror.Repository.RepositoryType, + }) + if err != nil { + return fmt.Errorf("failed to get repo default branch: %w", err) + } + parts := strings.Split(string(resp.DefaultBranch), "/") + branch := parts[len(parts)-1] + + mirror.Repository.DefaultBranch = branch + _, err = w.repoStore.UpdateRepo(ctx, *mirror.Repository) + if err != nil { + return fmt.Errorf("failed to update repo: %w", err) + } + slog.Info("Update repo default branch successfully", slog.Any("repo_type", mirror.Repository.RepositoryType), 
slog.Any("namespace", namespace), slog.Any("name", name)) + slog.Info("Start to sync lfs files", "repo_type", mirror.Repository.RepositoryType, "namespace", namespace, "name", name) + err = w.generateLfsMetaObjects(ctx, mirror) + if err != nil { + mirror.Status = types.MirrorIncomplete + mirror.LastMessage = err.Error() + err = w.mirrorStore.Update(ctx, mirror) + if err != nil { + return fmt.Errorf("failed to update mirror: %w", err) + } + return fmt.Errorf("failed to sync lfs files: %v", err) + } + mirror.Status = types.MirrorRepoSynced + err = w.mirrorStore.Update(ctx, mirror) + if err != nil { + return fmt.Errorf("failed to update mirror: %w", err) + } + w.mq.PushLfsMirror(&queue.MirrorTask{ + MirrorID: mirror.ID, + Priority: queue.Priority(mirror.Priority), + CreatedAt: mirror.CreatedAt.Unix(), + MirrorToken: task.MirrorToken, + }) + + return nil +} + +func (c *LocalMirrorWoker) generateLfsMetaObjects(ctx context.Context, mirror *database.Mirror) error { + var lfsMetaObjects []database.LfsMetaObject + namespace := strings.Split(mirror.Repository.Path, "/")[0] + name := strings.Split(mirror.Repository.Path, "/")[1] + branches, err := c.git.GetRepoBranches(ctx, gitserver.GetBranchesReq{ + Namespace: namespace, + Name: name, + RepoType: mirror.Repository.RepositoryType, + }) + if err != nil { + return fmt.Errorf("failed to get repo branches: %v", err) + } + for _, branch := range branches { + lfsPointers, err := c.getAllLfsPointersByRef(ctx, mirror.Repository.RepositoryType, namespace, name, branch.Name) + if err != nil { + return fmt.Errorf("failed to get all lfs pointers: %v", err) + } + for _, lfsPointer := range lfsPointers { + lfsMetaObjects = append(lfsMetaObjects, database.LfsMetaObject{ + Size: lfsPointer.FileSize, + Oid: lfsPointer.FileOid, + RepositoryID: mirror.Repository.ID, + Existing: true, + }) + } + } + lfsMetaObjects = removeDuplicateLfsMetaObject(lfsMetaObjects) + + err = c.lfsMetaObjectStore.BulkUpdateOrCreate(ctx, lfsMetaObjects) + if err != 
nil { + return fmt.Errorf("failed to bulk update or create lfs meta objects: %v", err) + } + + return nil +} + +func (c *LocalMirrorWoker) getAllLfsPointersByRef(ctx context.Context, RepoType types.RepositoryType, namespace, name, ref string) ([]*types.LFSPointer, error) { + return c.git.GetRepoAllLfsPointers(ctx, gitserver.GetRepoAllFilesReq{ + Namespace: namespace, + Name: name, + Ref: ref, + RepoType: RepoType, + }) +} + +func removeDuplicateLfsMetaObject(objects []database.LfsMetaObject) []database.LfsMetaObject { + seen := make(map[string]bool) + uniqueObjects := []database.LfsMetaObject{} + + for _, obj := range objects { + key := obj.Oid + "_" + strconv.Itoa(int(obj.RepositoryID)) + if !seen[key] { + uniqueObjects = append(uniqueObjects, obj) + seen[key] = true + } + } + + return uniqueObjects +} diff --git a/multisync/accounting/aync_quota_statement.go b/multisync/accounting/aync_quota_statement.go deleted file mode 100644 index 41386325..00000000 --- a/multisync/accounting/aync_quota_statement.go +++ /dev/null @@ -1,53 +0,0 @@ -package accounting - -import ( - "bytes" - "encoding/json" - "fmt" - "net/http" - "time" -) - -type SyncQuotaStatement struct { - ID int64 `json:"id"` - UserID int64 `json:"user_id"` - RepoPath string `json:"repo_path"` - RepoType string `json:"repo_type"` - CreatedAt time.Time `json:"created_at"` -} - -type SyncQuotaStatementRes struct { - Message string `json:"msg"` - Data SyncQuotaStatement `json:"data"` -} - -type GetSyncQuotaStatementsReq struct { - RepoPath string `json:"repo_path"` - RepoType string `json:"repo_type"` - AccessToken string `json:"-"` -} - -type CreateSyncQuotaStatementReq = GetSyncQuotaStatementsReq - -func (c *AccountingClient) CreateSyncQuotaStatement(opt *CreateSyncQuotaStatementReq) (*Response, error) { - header := http.Header{"content-type": []string{"application/json"}} - body, err := json.Marshal(&opt) - if err != nil { - return nil, err - } - if opt.AccessToken != "" { - header.Add("Authorization", 
"Bearer "+opt.AccessToken) - } - _, resp, err := c.getResponse("POST", "/accounting/multisync/downloads", header, bytes.NewReader(body)) - return resp, err -} - -func (c *AccountingClient) GetSyncQuotaStatement(opt *GetSyncQuotaStatementsReq) (*SyncQuotaStatement, *Response, error) { - s := new(SyncQuotaStatementRes) - header := http.Header{} - if opt.AccessToken != "" { - header.Add("Authorization", "Bearer "+opt.AccessToken) - } - resp, err := c.getParsedResponse("GET", fmt.Sprintf("/accounting/multisync/download?repo_path=%s&repo_type=%s", opt.RepoPath, opt.RepoType), header, nil, s) - return &s.Data, resp, err -} diff --git a/multisync/accounting/client.go b/multisync/accounting/client.go deleted file mode 100644 index db3db65d..00000000 --- a/multisync/accounting/client.go +++ /dev/null @@ -1,135 +0,0 @@ -package accounting - -import ( - "context" - "encoding/json" - "fmt" - "io" - "net/http" - "sync" - "time" - - "opencsg.com/csghub-server/common/config" -) - -type AccountingClient struct { - baseURL string - httpClient *http.Client - mutex sync.RWMutex - ctx context.Context -} - -type Response struct { - *http.Response -} - -func NewAccountingClient(config *config.Config) (*AccountingClient, error) { - if config.Accounting.Host == "" { - return nil, fmt.Errorf("accounting host should be configured") - } - - if config.Accounting.Port == 0 { - return nil, fmt.Errorf("accounting port should be configured") - } - if config.APIToken == "" { - return nil, fmt.Errorf("api token should be configured") - } - - return &AccountingClient{ - baseURL: fmt.Sprintf("%s:%d", config.Accounting.Host, config.Accounting.Port), - httpClient: &http.Client{ - Timeout: time.Second * 5, - }, - ctx: context.Background(), - }, nil -} - -func (c *AccountingClient) getParsedResponse(method, path string, header http.Header, body io.Reader, obj interface{}) (*Response, error) { - data, resp, err := c.getResponse(method, path, header, body) - if err != nil { - return resp, err - } - return 
resp, json.Unmarshal(data, obj)
-}
-
-func (c *AccountingClient) getResponse(method, path string, header http.Header, body io.Reader) ([]byte, *Response, error) {
-	resp, err := c.doRequest(method, path, header, body)
-	if err != nil {
-		return nil, resp, err
-	}
-	defer resp.Body.Close()
-
-	// check for errors
-	data, err := statusCodeToErr(resp)
-	if err != nil {
-		return data, resp, err
-	}
-
-	// success (2XX), read body
-	data, err = io.ReadAll(resp.Body)
-	if err != nil {
-		return nil, resp, err
-	}
-
-	return data, resp, nil
-}
-
-// Converts a response for a HTTP status code indicating an error condition
-// (non-2XX) to a well-known error value and response body. For non-problematic
-// (2XX) status codes nil will be returned. Note that on a non-2XX response, the
-// response body stream will have been read and, hence, is closed on return.
-func statusCodeToErr(resp *Response) (body []byte, err error) {
-	// no error
-	if resp.StatusCode/100 == 2 {
-		return nil, nil
-	}
-
-	//
-	// error: body will be read for details
-	//
-	defer resp.Body.Close()
-	data, err := io.ReadAll(resp.Body)
-	if err != nil {
-		return nil, fmt.Errorf("body read on HTTP error %d: %v", resp.StatusCode, err)
-	}
-
-	// Try to unmarshal and get an error message
-	errMap := make(map[string]interface{})
-	if err = json.Unmarshal(data, &errMap); err != nil {
-		// when the JSON can't be parsed, data was probably empty or a
-		// plain string, so we try to return a helpful error anyway
-		path := resp.Request.URL.Path
-		method := resp.Request.Method
-		header := resp.Request.Header
-		return data, fmt.Errorf("unknown API Error: %d\nRequest: '%s' with '%s' method '%s' header and '%s' body", resp.StatusCode, path, method, header, string(data))
-	}
-
-	if msg, ok := errMap["message"]; ok {
-		return data, fmt.Errorf("%v", msg)
-	}
-
-	// If no error message, at least give status and data
-	return data, fmt.Errorf("%s: %s", resp.Status, string(data))
-}
-
-func (c *AccountingClient) doRequest(method, path string, header http.Header, body io.Reader) (*Response, error) {
-	c.mutex.RLock()
-	req, err := http.NewRequestWithContext(c.ctx, method, c.baseURL+"/api/v1"+path, body)
-	if err != nil {
-		c.mutex.RUnlock()
-		return nil, err
-	}
-
-	client := c.httpClient
-	c.mutex.RUnlock()
-
-	for k, v := range header {
-		req.Header[k] = v
-	}
-
-	resp, err := client.Do(req)
-	if err != nil {
-		return nil, err
-	}
-	return &Response{resp}, nil
-}
diff --git a/multisync/accounting/sync_quota.go b/multisync/accounting/sync_quota.go
deleted file mode 100644
index 77b6860f..00000000
--- a/multisync/accounting/sync_quota.go
+++ /dev/null
@@ -1,52 +0,0 @@
-package accounting
-
-import (
-	"bytes"
-	"encoding/json"
-	"net/http"
-)
-
-type GetSyncQuotaReq struct {
-	AccessToken string `json:"access_token"`
-}
-
-type SyncQuota struct {
-	RepoCountLimit int64  `json:"repo_count_limit"`
-	TrafficLimit   int64  `json:"traffic_limit"`
-	AccessToken    string `json:"-"`
-	RepoCountUsed  int64  `json:"repo_count_used"`
-	SpeedLimit     int64  `json:"speed_limit"`
-	TrafficUsed    int64  `json:"traffic_used"`
-}
-
-type SyncQuotaRes struct {
-	Message string    `json:"msg"`
-	Data    SyncQuota `json:"data"`
-}
-
-type CreateSyncQuotaReq = SyncQuota
-
-type UpdateSyncQuotaReq = SyncQuota
-
-func (c *AccountingClient) CreateOrUpdateSyncQuota(opt *CreateSyncQuotaReq) (*Response, error) {
-	header := http.Header{"content-type": []string{"application/json"}}
-	body, err := json.Marshal(&opt)
-	if err != nil {
-		return nil, err
-	}
-	if opt.AccessToken != "" {
-		header.Add("Authorization", "Bearer "+opt.AccessToken)
-	}
-	_, resp, err := c.getResponse("POST", "/accounting/multisync/quotas", header, bytes.NewReader(body))
-	return resp, err
-}
-
-func (c *AccountingClient) GetSyncQuota(opt *GetSyncQuotaReq) (*SyncQuota, *Response, error) {
-	s := new(SyncQuotaRes)
-	header := http.Header{}
-	if opt.AccessToken != "" {
-		header.Add("Authorization", "Bearer "+opt.AccessToken)
-	}
-	resp, err := c.getParsedResponse("GET", "/accounting/multisync/quota", header, nil, s)
-	return &s.Data, resp, err
-}
diff --git a/multisync/component/mirror_proxy.go b/multisync/component/mirror_proxy.go
deleted file mode 100644
index 5935e3b1..00000000
--- a/multisync/component/mirror_proxy.go
+++ /dev/null
@@ -1,77 +0,0 @@
-package component
-
-import (
-	"context"
-	"fmt"
-	"net/http"
-	"strconv"
-
-	"github.com/gin-gonic/gin"
-	"opencsg.com/csghub-server/builder/store/database"
-	"opencsg.com/csghub-server/common/config"
-	"opencsg.com/csghub-server/multisync/accounting"
-	"opencsg.com/csghub-server/multisync/types"
-)
-
-type MirrorProxyComponent struct {
-	ac   *accounting.AccountingClient
-	user *database.UserStore
-}
-
-func NewMirrorProxyComponent(config *config.Config) (*MirrorProxyComponent, error) {
-	ac, err := accounting.NewAccountingClient(config)
-	if err != nil {
-		return nil, err
-	}
-	return &MirrorProxyComponent{
-		ac:   ac,
-		user: database.NewUserStore(),
-	}, nil
-}
-
-func (c *MirrorProxyComponent) Serve(ctx context.Context, req *types.GetSyncQuotaStatementReq) error {
-	sq, _, err := c.ac.GetSyncQuota(&accounting.GetSyncQuotaReq{
-		AccessToken: req.AccessToken,
-	})
-	if err != nil {
-		return fmt.Errorf("error getting sync quota: %v", err)
-	}
-	if sq.RepoCountLimit <= sq.RepoCountUsed {
-		return fmt.Errorf("sync repository count limit exceeded")
-	}
-	sqs, _, err := c.ac.GetSyncQuotaStatement(&accounting.GetSyncQuotaStatementsReq{
-		AccessToken: req.AccessToken,
-		RepoPath:    req.RepoPath,
-		RepoType:    req.RepoType,
-	})
-	if err != nil {
-		return fmt.Errorf("error getting sync quota statement: %v", err)
-	}
-	if sqs.ID != 0 {
-		return nil
-	}
-	resp, err := c.ac.CreateSyncQuotaStatement(&accounting.CreateSyncQuotaStatementReq{
-		AccessToken: req.AccessToken,
-		RepoPath:    req.RepoPath,
-		RepoType:    req.RepoType,
-	})
-	if err != nil {
-		return fmt.Errorf("error creating sync quota statement: %v", err)
-	}
-	if resp.StatusCode != http.StatusOK {
-		return fmt.Errorf("error creating sync quota statement")
-	}
-	return nil
-}
-
-func (c *MirrorProxyComponent) LfsDownload(ctx *gin.Context, token string) error {
-	sq, _, err := c.ac.GetSyncQuota(&accounting.GetSyncQuotaReq{
-		AccessToken: token,
-	})
-	if err != nil {
-		return fmt.Errorf("error getting sync quota: %v", err)
-	}
-
-	ctx.Request.Header.Add("X-OPENCSG-Speed-Limit", strconv.FormatInt(sq.SpeedLimit, 10))
-	return nil
-}
diff --git a/multisync/handler/mirror_proxy.go b/multisync/handler/mirror_proxy.go
deleted file mode 100644
index 70f13b19..00000000
--- a/multisync/handler/mirror_proxy.go
+++ /dev/null
@@ -1,85 +0,0 @@
-package handler
-
-import (
-	"fmt"
-	"log/slog"
-	"strings"
-
-	"github.com/gin-gonic/gin"
-	"opencsg.com/csghub-server/api/httpbase"
-	"opencsg.com/csghub-server/builder/proxy"
-	"opencsg.com/csghub-server/common/config"
-	"opencsg.com/csghub-server/multisync/component"
-	"opencsg.com/csghub-server/multisync/types"
-)
-
-const MirrorTokenHeaderKey = "X-OPENCSG-Sync-Token"
-
-type MirrorProxyHandler struct {
-	gitServerURL string
-	mpComp       *component.MirrorProxyComponent
-}
-
-func NewMirrorProxyHandler(config *config.Config) (*MirrorProxyHandler, error) {
-	mpComp, err := component.NewMirrorProxyComponent(config)
-	if err != nil {
-		return nil, fmt.Errorf("failed to create repo component,%w", err)
-	}
-
-	return &MirrorProxyHandler{
-		mpComp:       mpComp,
-		gitServerURL: config.GitServer.URL,
-	}, nil
-}
-
-func (r *MirrorProxyHandler) Serve(ctx *gin.Context) {
-	var req types.GetSyncQuotaStatementReq
-	token := getMirrorTokenFromContext(ctx)
-	repoType := ctx.Param("repo_type")
-	namespace := ctx.Param("namespace")
-	name := ctx.Param("name")
-	name, _ = strings.CutSuffix(name, ".git")
-	req.RepoPath = fmt.Sprintf("%s/%s", namespace, name)
-	req.RepoType = strings.TrimSuffix(repoType, "s")
-	req.AccessToken = token
-
-	if strings.HasSuffix(ctx.Request.URL.Path, "git-upload-pack") {
-		err := r.mpComp.Serve(ctx, &req)
-		if err != nil {
-			slog.Error("failed to serve git upload pack request:", slog.Any("err", err))
-			httpbase.BadRequest(ctx, err.Error())
-			return
-		}
-	}
-
-	path := strings.Replace(ctx.Request.URL.Path, fmt.Sprintf("%s/", repoType), fmt.Sprintf("%s_", repoType), 1)
-	rp, _ := proxy.NewReverseProxy(r.gitServerURL)
-	rp.ServeHTTP(ctx.Writer, ctx.Request, path)
-}
-
-func (r *MirrorProxyHandler) ServeLFS(ctx *gin.Context) {
-	var req types.GetSyncQuotaStatementReq
-	token := getMirrorTokenFromContext(ctx)
-	repoType := ctx.Param("repo_type")
-	namespace := ctx.Param("namespace")
-	name := ctx.Param("name")
-	name, _ = strings.CutSuffix(name, ".git")
-	req.RepoPath = fmt.Sprintf("%s/%s", namespace, name)
-	req.RepoType = strings.TrimSuffix(repoType, "s")
-	req.AccessToken = token
-
-	err := r.mpComp.LfsDownload(ctx, token)
-	if err != nil {
-		slog.Error("failed to serve lfs download request:", slog.Any("err", err))
-		httpbase.BadRequest(ctx, err.Error())
-		return
-	}
-
-	path := strings.Replace(ctx.Request.URL.Path, fmt.Sprintf("%s/", repoType), fmt.Sprintf("%s_", repoType), 1)
-	rp, _ := proxy.NewReverseProxy(r.gitServerURL)
-	rp.ServeHTTP(ctx.Writer, ctx.Request, path)
-}
-
-func getMirrorTokenFromContext(ctx *gin.Context) string {
-	return ctx.GetHeader(MirrorTokenHeaderKey)
-}
diff --git a/multisync/router/api.go b/multisync/router/api.go
deleted file mode 100644
index a384fc8e..00000000
--- a/multisync/router/api.go
+++ /dev/null
@@ -1,47 +0,0 @@
-package router
-
-import (
-	"fmt"
-
-	"github.com/gin-gonic/gin"
-	"opencsg.com/csghub-server/api/middleware"
-	"opencsg.com/csghub-server/common/config"
-	"opencsg.com/csghub-server/multisync/handler"
-)
-
-func NewRouter(config *config.Config) (*gin.Engine, error) {
-	r := gin.New()
-	r.Use(gin.Recovery())
-	r.Use(middleware.Log())
-	// store := cookie.NewStore([]byte(config.Mirror.SessionSecretKey))
-	// store.Options(sessions.Options{
-	// 	SameSite: http.SameSiteNoneMode,
-	// 	Secure: config.EnableHTTPS,
-	// })
-	// r.Use(sessions.Sessions("jwt_session", store))
-	// r.Use(middleware.BuildJwtSession(config.JWT.SigningKey))
-
-	mpHandler, err := handler.NewMirrorProxyHandler(config)
-	if err != nil {
-		return nil, fmt.Errorf("error creating rproxy handler:%w", err)
-	}
-	rGroup := r.Group("/:repo_type/:namespace/:name")
-	{
-		rGroup.POST("/git-upload-pack", mpHandler.Serve)
-		rGroup.POST("/git-receive-pack", mpHandler.Serve)
-		rGroup.GET("/info/refs", mpHandler.Serve)
-		rGroup.GET("/HEAD", mpHandler.Serve)
-		rGroup.GET("/objects/info/alternates", mpHandler.Serve)
-		rGroup.GET("/objects/info/http-alternates", mpHandler.Serve)
-		rGroup.GET("/objects/info/packs", mpHandler.Serve)
-		rGroup.GET("/objects/info/:file", mpHandler.Serve)
-		rGroup.GET("/objects/:head/:hash", mpHandler.Serve)
-		rGroup.GET("/objects/pack/pack-:file", mpHandler.Serve)
-		rGroup.POST("/info/lfs/objects/batch", mpHandler.ServeLFS)
-		rGroup.GET("/info/lfs/objects/:oid", mpHandler.ServeLFS)
-	}
-
-	// r.Any("/*api", handler.Serve)
-
-	return r, nil
-}
diff --git a/multisync/types/mirror_proxy.go b/multisync/types/mirror_proxy.go
deleted file mode 100644
index 21c9ef8d..00000000
--- a/multisync/types/mirror_proxy.go
+++ /dev/null
@@ -1,7 +0,0 @@
-package types
-
-type GetSyncQuotaStatementReq struct {
-	RepoPath    string `json:"repo_path"`
-	RepoType    string `json:"repo_type"`
-	AccessToken string `json:"access_token"`
-}
diff --git a/scripts/init.sh b/scripts/init.sh
index b59ce6d1..78832f4e 100755
--- a/scripts/init.sh
+++ b/scripts/init.sh
@@ -126,7 +126,7 @@ if [ "$STARHUB_SERVER_SAAS" == "false" ]; then
     else
         echo "Creating cron job for sync saas sync verions..."
         read_and_set_cron "STARHUB_SERVER_CRON_SYNC_AS_CLIENT" "0 * * * *"
-        (crontab -l ;echo "$cron STARHUB_SERVER_PUBLIC_DOMAIN=$STARHUB_SERVER_PUBLIC_DOMAIN STARHUB_DATABASE_DSN=$STARHUB_DATABASE_DSN STARHUB_SERVER_GITSERVER_HOST=$STARHUB_SERVER_GITSERVER_HOST STARHUB_SERVER_GITSERVER_USERNAME=$STARHUB_SERVER_GITSERVER_USERNAME STARHUB_SERVER_GITSERVER_PASSWORD=$STARHUB_SERVER_GITSERVER_PASSWORD STARHUB_SERVER_REDIS_ENDPOINT=$STARHUB_SERVER_REDIS_ENDPOINT STARHUB_SERVER_REDIS_USER=$STARHUB_SERVER_REDIS_USER STARHUB_SERVER_REDIS_PASSWORD=$STARHUB_SERVER_REDIS_PASSWORD /starhub-bin/starhub cron sync-as-client >> /starhub-bin/cron-sync-as-client.log 2>&1") | crontab -
+        (crontab -l ;echo "$cron STARHUB_SERVER_PUBLIC_DOMAIN=$STARHUB_SERVER_PUBLIC_DOMAIN STARHUB_DATABASE_DSN=$STARHUB_DATABASE_DSN STARHUB_SERVER_GITSERVER_HOST=$STARHUB_SERVER_GITSERVER_HOST STARHUB_SERVER_GITSERVER_USERNAME=$STARHUB_SERVER_GITSERVER_USERNAME STARHUB_SERVER_GITSERVER_PASSWORD=$STARHUB_SERVER_GITSERVER_PASSWORD STARHUB_SERVER_REDIS_ENDPOINT=$STARHUB_SERVER_REDIS_ENDPOINT STARHUB_SERVER_REDIS_USER=$STARHUB_SERVER_REDIS_USER STARHUB_SERVER_REDIS_PASSWORD=$STARHUB_SERVER_REDIS_PASSWORD /starhub-bin/starhub sync sync-as-client >> /starhub-bin/cron-sync-as-client.log 2>&1") | crontab -
     fi
 else
     echo "Saas does not need sync-as-client cron job"
diff --git a/user/component/member.go b/user/component/member.go
index 56808383..19b9d881 100644
--- a/user/component/member.go
+++ b/user/component/member.go
@@ -30,7 +30,7 @@ func NewMemberComponent(config *config.Config) (*MemberComponent, error) {
 	if err != nil {
 		return nil, fmt.Errorf("failed to create git server:%w", err)
 	}
-	if config.GitServer.Type == "gitea" {
+	if config.GitServer.Type == types.GitServerTypeGitea {
 		gms, err = git.NewMemberShip(*config)
 		if err != nil {
 			return nil, fmt.Errorf("failed to create git membership:%w", err)
@@ -91,7 +91,7 @@ func (c *MemberComponent) OrgMembers(ctx context.Context, orgName, currentUser s
 }
 
 func (c *MemberComponent) InitRoles(ctx context.Context, org *database.Organization) error {
-	if c.config.GitServer.Type == "gitea" {
+	if c.config.GitServer.Type == types.GitServerTypeGitea {
 		return c.gitMemberShip.AddRoles(ctx, org.Name,
 			[]membership.Role{membership.RoleAdmin, membership.RoleRead, membership.RoleWrite})
 	} else {
@@ -108,7 +108,7 @@ func (c *MemberComponent) SetAdmin(ctx context.Context, org *database.Organizati
 		err = fmt.Errorf("failed to create member,caused by:%w", err)
 		return err
 	}
-	if c.config.GitServer.Type == "gitea" {
+	if c.config.GitServer.Type == types.GitServerTypeGitea {
 		return c.gitMemberShip.AddMember(ctx, org.Name, user.Username, membership.RoleAdmin)
 	} else {
 		return nil
@@ -193,7 +193,7 @@ func (c *MemberComponent) AddMembers(ctx context.Context, orgName string, users
 			err = fmt.Errorf("failed to create db member, org:%s, user:%s,caused by:%w", orgName, userName, err)
 			return err
 		}
-		if c.config.GitServer.Type == "gitea" {
+		if c.config.GitServer.Type == types.GitServerTypeGitea {
 			err = c.gitMemberShip.AddMember(ctx, orgName, userName, c.toGitRole(role))
 			if err != nil {
 				return fmt.Errorf("failed to add git member, org:%s, user:%s caused by:%w", orgName, userName, err)
@@ -251,7 +251,7 @@ func (c *MemberComponent) Delete(ctx context.Context, orgName, userName, operato
 		err = fmt.Errorf("failed to delete member,caused by:%w", err)
 		return err
 	}
-	if c.config.GitServer.Type == "gitea" {
+	if c.config.GitServer.Type == types.GitServerTypeGitea {
 		return c.gitMemberShip.RemoveMember(ctx, orgName, userName, c.toGitRole(role))
 	} else {
 		return nil