Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor sensitive check into a new moderation service #168

Merged
merged 1 commit into from
Nov 6, 2024

Conversation

Rader
Copy link
Collaborator

@Rader Rader commented Nov 6, 2024

  • new moderation service to replace sensitive checking
  • introduce Temporal workflow to run repo content sensitive checking
  • add DFA based local sensitive words detecting

MR Summary:

The summary is added by @codegpt.

The Merge Request introduces a new moderation service to handle sensitive content checking in repository content. It replaces the existing sensitive checking mechanism with a more structured approach, including the integration of Temporal workflow for asynchronous processing and a DFA (Deterministic Finite Automaton) based local sensitive words detection mechanism. Key updates include:

  1. Implementation of a new moderation service for sensitive content checking.
  2. Integration of Temporal workflow for asynchronous processing of repo content checks.
  3. Introduction of a DFA based mechanism for detecting sensitive words locally.
  4. Refactoring of existing code to utilize the new moderation service across various components such as user, organization, and repository handling.
  5. Addition of new tests and modification of existing ones to align with the new moderation service.
  6. Creation of new handlers, components, and workflow activities to support the moderation service functionalities.
  7. Update of documentation and Swagger API definitions to reflect the changes introduced by the moderation service.

@starship-github
Copy link

Linter Issue Report

During the code review, a list issues were found. These issues could affect the code quality, maintainability, and consistency. Below is the detailed Linter issue report:

builder/rpc/moderation_svc_client.go

Lint Issue: undefined: RequestOption

  • Location: Line 25
  • Code Context:
    func NewModerationSvcHttpClient(endpoint string, opts ...RequestOption) ModerationSvcClient {
        return &ModerationSvcHttpClient{
            hc: NewHttpClient(endpoint, opts...),
        }
    }
  • Actionable Suggestion: Ensure that the RequestOption type is defined within the package or imported correctly. If RequestOption is intended to be a part of an external package, verify that the package is imported correctly at the top of your file. If it's a custom type that should be defined in your project, ensure that the definition is available in the scope where NewModerationSvcHttpClient function is defined.
builder/store/database/repository_file.go

Lint Issue: undefined: DB

  • Location: Line 25, Column 6
  • Code Snippet:
    type RepoFileStore struct {
        db *DB
    }
  • Suggestion: It seems like the DB type is not defined within this file or imported from another package. Ensure that you have defined the DB type within the scope of this package or import it from the correct package where it is defined. If DB is supposed to be a part of an external library, make sure the library is correctly imported at the top of your file.

Lint Issue: undefined: defaultDB

  • Location: Line 30, Column 7
  • Code Snippet:
    func NewRepoFileStore() *RepoFileStore {
        return &RepoFileStore{
            db: defaultDB,
        }
    }
  • Suggestion: The variable defaultDB is not defined in the visible scope. If defaultDB is a global variable, ensure it is declared and initialized before use. Alternatively, if defaultDB is intended to be a constant or a configuration value, verify that it is correctly imported or defined within the package. It might also be helpful to check if there are any missing package imports or if the variable name has been misspelled.
builder/store/database/migrations/20240812113006_create_table_repository_files.go

Lint Issue: undefined: Migrations

  • Location: Line 14, Column 2
  • Code Snippet:
    func init() {
        Migrations.MustRegister(func(ctx context.Context, db *bun.DB) error {
  • Suggestion: Ensure that the Migrations variable or type is defined within the package or imported correctly. If it's part of an external package, verify that the package is imported and that Migrations is accessible from this file.

Lint Issue: undefined: createTables

  • Location: Line 15, Column 10
  • Code Snippet:
            err := createTables(ctx, db, database.RepositoryFile{}, RepositoryFileCheck{})
  • Suggestion: Check if the createTables function is defined in the current or another package. If it's supposed to be part of the current package, ensure it's implemented. If it's from another package, verify that the package is imported correctly and accessible.

Lint Issue: undefined: dropTables

  • Location: Line 44, Column 10
  • Code Snippet:
    }, func(ctx context.Context, db *bun.DB) error {
            return dropTables(ctx, db, database.RepositoryFile{}, RepositoryFileCheck{})
  • Suggestion: Similar to the previous issues, ensure that dropTables is defined and accessible within the context of this file. If it's part of another package, check the import statements and access modifiers.
cmd/csghub-server/cmd/trigger/init.go

Lint Issue: undefined: fixOrgDataCmd

  • Location: Line 16, Column 3
  • Code Snippet:
    Cmd.AddCommand(
        gitCallbackCmd,
        fixOrgDataCmd, // Issue is here
        fixUserDataCmd,
        updateRepoCmd,
    )
  • Suggestion: Ensure that fixOrgDataCmd is defined and correctly imported into this file. If it's supposed to be a command handled by this package, verify that it's declared in the same package or an imported package. If it's missing, you may need to implement or import it.

Lint Issue: undefined: fixUserDataCmd

  • Location: Line 17, Column 3
  • Code Snippet:
    Cmd.AddCommand(
        gitCallbackCmd,
        fixOrgDataCmd,
        fixUserDataCmd, // Issue is here
        updateRepoCmd,
    )
  • Suggestion: Check if fixUserDataCmd is defined and properly imported. Similar to the previous issue, this command needs to be declared within the scope of this package or an imported one. If it's not available, consider implementing it or ensuring it's correctly imported.

Please make the suggested changes to improve the code quality.

Copy link
Collaborator

@ganisback ganisback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@starship-github
Copy link

Possible Issues And Suggestions:

  • component/callback/git_callback.go

    • Comments:
      • The removal of sensitive content checking logic might reduce the security of the system by not setting repositories private when sensitive content is detected.
  • Line 79 in component/callback/git_callback.go

    • Comments:
      • The addition of ModerationSvcClient without checking its initialization might lead to nil pointer dereference if config.SensitiveCheck.Enable is false.
    • Suggestions:
      if c.modSvcClient != nil {
          // Use c.modSvcClient safely here
      }
      
  • api/handler/model.go

    • Comments:
      • The method CheckRequestV2 is used but its definition or modification is not shown. Ensure it exists and is correctly implemented to replace CheckRequest.
  • Line 25 in user/handler/organization.go

    • Comments:
      • Error wrapping with %w in fmt.Errorf should be used carefully to avoid exposing internal errors to the API consumers.
    • Suggestions:
      return nil, fmt.Errorf("error creating sensitive component: %v", err)
      
  • Line 22 in api/handler/sshkey.go

    • Comments:
      • Error message "error creating sensitive component" could be more descriptive to aid debugging.
    • Suggestions:
      return nil, fmt.Errorf("error creating sensitive component in SSHKeyHandler: %w", err)
      
  • moderation/checker/text_file_checker.go

    • Comments:
      • The variable 'contentChecker' is not defined in this scope.
    • Suggestions:
      // Assuming contentChecker is a valid instance of SensitiveChecker
      return &TextFileChecker{
          SensitiveChecker: contentChecker,
      }
      
  • moderation/checker/text_file_checker.go

    • Comments:
      • 'localWordChecker.ContainsSensitiveWord(txt)' might fail if 'localWordChecker' is not initialized.
  • cmd/csghub-server/cmd/moderation/launch.go

    • Comments:
      • API token length check should consider being configurable or documented.
  • moderation/workflow/repo_full_check.go

    • Comments:
      • RetryPolicy commented out. If retries are desired, these should be uncommented and configured appropriately.
    • Suggestions:
      retryPolicy := &temporal.RetryPolicy{
          InitialInterval:    time.Second,
          BackoffCoefficient: 2.0,
          MaximumInterval:    time.Minute,
          MaximumAttempts:    3,
      }
      
  • moderation/checker/local_sensitive_word_checker.go

    • Comments:
      • The function 'isIgnoredCharacter' could be optimized by using a map instead of a slice for faster lookups.
    • Suggestions:
      var ignoredCharacters = map[rune]bool{' ': true, '\u3000': true, '\t': true, '&': true, '%': true, '$': true, '@': true, '*': true, '!': true, '!': true, '#': true, '^': true, '~': true, '_': true, '—': true, '|': true, '\'': true, '\"': true, ';': true, '.': true, ',': true, ',': true, '?': true, '<': true, '>': true, '《': true, '》': true, ':': true, ':': true}
      func isIgnoredCharacter(c rune) bool {
          return ignoredCharacters[c]
      }
      
  • moderation/checker/init.go

    • Comments:
      • The panic call for disabled SensitiveCheck may cause the application to terminate unexpectedly. Consider using error handling instead.
    • Suggestions:
      // Consider returning an error instead of panicking
      return errors.New("SensitiveCheck is not enabled")
      
  • cmd/csghub-server/cmd/trigger/init.go

    • Comments:
      • Error from database.InitDB(dbConfig) is not checked. This could lead to runtime issues if the database initialization fails.
    • Suggestions:
      // Check error from database initialization
      if err := database.InitDB(dbConfig); err != nil {
          return fmt.Errorf("initializing DB connection: %w", err)
      }
      
  • builder/store/database/repository_file_check.go

    • Comments:
      • The 'Upsert' method uses 'NewInsert' instead of 'NewUpdate' for the update operation. This might not perform the intended update operation.
    • Suggestions:
      // Use NewUpdate for the update operation
      _, err := s.db.Operator.Core.NewUpdate().Model(&history).WherePK().Exec(ctx)
      
  • Line 38 in builder/sensitive/sensitive_checker.go

    • Comments:
      • Changing the return type of PassTextCheck and PassImageCheck to *CheckResult may break existing code that expects boolean values.

MR Evaluation:

This feature is still under test, evaluation are given by AI and might be inaccurate.

After evaluation, the code changes in the Merge Request get score: 100.

Tips

CodeReview Commands (invoked as MR or PR comments)

  • @codegpt /review to trigger an code review.
  • @codegpt /evaluate to trigger code evaluation process.
  • @codegpt /describe to regenerate the summary of the MR.
  • @codegpt /secscan to scan security vulnerabilities for the MR or the Repository.
  • @codegpt /help to get help.

CodeReview Discussion Chat

There are 2 ways to chat with Starship CodeReview:

  • Review comments: Directly reply to a review comment made by StarShip.
    Example:
    • @codegpt How to fix this bug?
  • Files and specific lines of code (under the "Files changed" tab):
    Tag @codegpt in a new review comment at the desired location with your query.
    Examples:
    • @codegpt generate unit testing code for this code snippet.

Note: Be mindful of the bot's finite context window.
It's strongly recommended to break down tasks such as reading entire modules into smaller chunks.
For a focused discussion, use review comments to chat about specific files and their changes, instead of using the MR/PR comments.

CodeReview Documentation and Community

  • Visit our Documentation
    for detailed information on how to use Starship CodeReview.

About Us:

Visit the OpenCSG StarShip website for the Dashboard and detailed information on CodeReview, CodeGen, and other StarShip modules.

@Rader Rader merged commit 811fd11 into main Nov 6, 2024
@Rader Rader deleted the new-moderation-service branch November 6, 2024 04:08
@starship-github
Copy link

The StarShip CodeReviewer was triggered but terminated because it encountered an issue: The MR state is not opened.

Tips

CodeReview Commands (invoked as MR or PR comments)

  • @codegpt /review to trigger an code review.
  • @codegpt /evaluate to trigger code evaluation process.
  • @codegpt /describe to regenerate the summary of the MR.
  • @codegpt /secscan to scan security vulnerabilities for the MR or the Repository.
  • @codegpt /help to get help.

CodeReview Discussion Chat

There are 2 ways to chat with Starship CodeReview:

  • Review comments: Directly reply to a review comment made by StarShip.
    Example:
    • @codegpt How to fix this bug?
  • Files and specific lines of code (under the "Files changed" tab):
    Tag @codegpt in a new review comment at the desired location with your query.
    Examples:
    • @codegpt generate unit testing code for this code snippet.

Note: Be mindful of the bot's finite context window.
It's strongly recommended to break down tasks such as reading entire modules into smaller chunks.
For a focused discussion, use review comments to chat about specific files and their changes, instead of using the MR/PR comments.

CodeReview Documentation and Community

  • Visit our Documentation
    for detailed information on how to use Starship CodeReview.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants