Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for specifying Moonbit Core version in setup process #55

Merged
merged 4 commits into from
Feb 17, 2025

Conversation

hustcer
Copy link
Owner

@hustcer hustcer commented Feb 17, 2025

feat: Add support for specifying Moonbit Core version in setup process, fixes #15

Copy link

Reasoning Details Okay, I need to review the provided code changes for a Nushell script, focusing on the areas outlined. Let's start by looking at the diff.

First, the README files and action.yaml have been updated to include a new input parameter called core-version. That's probably related to allowing users to specify a different version for Moonbit Core. In the action.yaml, the new input is added, and in the Nushell script (moonbit.nu), the setup moonbit command now accepts a --core-version option.

Looking at the changes in moonbit.nu, the function setup moonbit now has the --core-version parameter with a default of 'latest'. Then, in the core setup section, the version variable switched to using $core_version instead of the main $version. That makes sense because they separated the core version from the toolchain version.

Now, checking compatibility with Nu 0.90+. The script uses into bool for converting the setup-core input, which should be fine. They also use par-each? Wait, I don't see any par-each here. The code uses each, maybe that's a place for optimization. Wait, no, in the current diff, they're using each in the code? Let me check again. The code here doesn't have any loops shown in the diff; the main change is adding the core-version.

Security analysis: the inputs from GitHub Actions are passed into variables using '${{...}}', which in Nushell would be strings. They then convert setup-core to a bool with into bool. But when they use variables in command arguments like --core-version=$core_version, is there a risk of command injection? Since the value is being passed as a parameter, not executed as a command, that's probably safe. But maybe the core_version isn't sanitized. For example, if someone passes a core-version with spaces or special characters, could that break the command? Since it's being used as a value in a flag, Nushell should handle that correctly by escaping. But if the core_version is used in a way that's concatenated into a command string elsewhere, that could be an issue.

Looking at fetch-core $core_version, and in fetch-core function, the version is used to construct URLs. If core_version contains unexpected characters, it could lead to unexpected behavior, like downloading from a malformed URL, but not command injection. However, the URL is constructed using $version, so if core_version isn't validated, it might result in errors or fetching unintended resources. For example, a version like "../../malicious" could be problematic. But since the inputs are controlled (from GitHub action inputs), perhaps that's a lower risk. Still, input sanitization is a good practice.

In terms of performance, the script is using built-in commands like rm, git clone, which are external commands. But in Nushell, some commands like rm are aliases to std functions. Wait, no: rm in Nushell is a built-in command, so that's okay. The script uses unzip which is an external command. But maybe that's necessary. Also, perhaps using http get instead of curl could be better, but in the fetch-core function, they use curl -sL $url -o core-$version.zip. Relying on external curl might be replaced with Nushell's http get, but that depends on whether the response is a file download. The http get in Nushell may not handle binary downloads as well as curl, so using curl here might be a valid choice.

Looking at the use of if $core_version == 'bleeding', they clone the core repo. Otherwise, they fetch the core version using the function. That's structured data handling. The core_version is passed as a parameter, which is good.

Module organization: The script is part of a module called moonbit.nu. The code seems organized, with functions defined, and parameters documented.

Potential issues:

  1. Security: When using core_version in string interpolation, like in the print statement (ansi g)($core_version)(ansi reset), if core_version has escape characters, could that cause issues? Probably not, since it's printed as a string. But if core_version contains, say, ANSI escape codes, it might mess with the terminal. However, given that this is part of a GitHub Action, that's less of a concern.

  2. The fetch-core function constructs the URL based on the version. If core_version isn't validated, it could result in invalid URLs. For example, if core_version contains slashes or other characters, it might form a bad URL. So they should validate that core_version matches an expected pattern. The current code doesn't do that. That's a potential issue if the inputs are not properly controlled. Since the input comes from a GitHub action, which is user-configurable, there's a risk of passing invalid versions leading to failed downloads or unintended file paths.

  3. In line where they run git clone, the URL is hardcoded, which is fine. But using rm -rf $coreDir could be dangerous if coreDir is not properly set, but in this case, coreDir is set to $MOONBIT_LIB_DIR/core, which is controlled by the script, so that's safe.

  4. The use of unzip -qo core*.zip might be a problem if multiple zip files exist, but in the context, they just fetched core-$core_version.zip and then delete it afterward. Wait, no: after fetch-core $core_version, which downloads core-$core_version.zip. Then in the Windows block, they do unzip -qo core*.zip, which would unpack all core*.zip files. But if core_version is set correctly, there should only be one. However, if the download fetched multiple files, this could cause issues. But since fetch-core specifically downloads a single file named core-$version.zip, then rm core*.zip after unzipping, perhaps this is safe. However, using a glob here could be a problem if other core*.zip files are present. Maybe better to use the exact filename.

Optimization: The script uses each in some parts? Not shown in the diff. The current code changes don't have loops that could be parallelized. In the code, the main loop I see is cloning the repo and unzipping, which are external commands. Potential optimizations might be limited here.

Structured data: They're using parameters with types, which is good. The function definitions have type annotations for parameters.

Overall, the code looks compatible with Nu 0.90+. The potential security issues are around input validation of core_version, and ensuring that the generated filenames are safe. Also, using a wildcard in unzip core*.zip might be risky if multiple files are present. Performance is okay, but maybe using exact filenames instead of globs would be more reliable.

Another point is that when using $core_version in the fetch-core function, if it's a user-provided input, it could contain characters that are not safe for file paths or URLs. But the script relies on the GitHub action inputs to validate the core-version, which might not be the case. So the script should sanitize core_version to prevent directory traversal or other injection attacks.

Wait, in the code, core_version is used in core-$version.zip when fetched. If the version is something like ../malicious, that could write the zip file to a different directory. Let me check the fetch-core function:

def fetch-core [ version: string ] {
let url = if $version == 'nightly' {
'https://github.com/moonbitlang/core/releases/download/nightly/core-nightly.zip'
} else if $version == 'latest' {
'https://github.com/moonbitlang/core/releases/latest/download/core-latest.zip'
} else {
$"https://github.com/moonbitlang/core/releases/download/($version)/core-($version).zip"
}
curl -sL $url -o $'core-($version).zip'
}

If $version is something like ../../something, the URL would be https://github.com/moonbitlang/core/releases/download/../../something/core-../../something.zip which would resolve to a different path. However, GitHub releases URLs don't allow directory traversal, so that might be safe. But the filename saved is core-($version).zip, so if version is ../badfile, the zip would be saved as core-../badfile.zip, which on Unix systems is a valid filename, but when unzipping, could potentially write files to parent directories. Wait, when you unzip a file that has paths with ../, it could extract files outside the current directory. For example, if core-../evil.zip contains a file with path ../../evil.exe, unzipping it would place that file in the parent directories. That's a security risk.

This is a critical issue. If the core_version is user-controlled and can contain directory traversal sequences (like ..), then fetching a maliciously crafted zip file could allow writing files outside the intended directory. Therefore, the script must validate that core_version does not contain any path traversal characters or other unsafe characters.

In the current code, there's no such validation. The core_version is taken directly from the input and used to construct both the URL and the filename. Even if the URL is constructed safely, the downloaded zip's contents could still be malicious. However, the bigger problem is if the zip file itself is crafted to have files escaping the target directory upon extraction.

Therefore, the code needs to sanitize the core_version input to only allow valid version patterns, perhaps matching a certain regex (like alphanumeric plus some allowed symbols). The problem is that without proper validation, this can lead to arbitrary file writes during unzip. So this is a major security vulnerability.

Additionally, the use of core*.zip in the unzip command might pick up unintended files, especially if the version is not properly sanitized, or if other files exist in the directory. It would be safer to use the exact filename, i.e., core-($core_version).zip when unzipping, provided that the core_version is properly sanitized.

So security findings include:

  • Lack of input validation on core-version, allowing directory traversal in the zip filename and potentially malicious zip contents.
  • Using wildcard in unzip command (core*.zip) which could lead to extracting unintended files if multiple core zip files exist.

Other observations:

  • The script uses rm -rf with variables, but since coreDir is controlled (set to $MOONBIT_LIB_DIR/core), this is safe.
  • The use of external curl and unzip is necessary here, so no optimization there unless Nushell has built-ins for downloading and extracting zips, which it doesn't as of now.

Optimization suggestions:

  • They could consider using par-each if there are loops that can be parallelized, but the current changes don't show such loops.
  • Ensuring structured data is properly handled. The core_version is a string, so that's okay.

Script Analysis

  • Added new core-version input parameter with proper Nushell type annotations
  • Maintained cross-platform path handling with path join
  • Structured data flow preserved through function parameters
  • Git operations and archive handling remain external command dependent

Security Review

  • Critical Vulnerability: Unvalidated core-version input allows potential directory traversal via specially crafted version strings (e.g., ../../malicious) in zip filename generation
  • ⚠️ Wildcard pattern core*.zip in unzip command (line 106) could process unintended files
  • ⚠️ Environment variables used for paths without explicit sanitization

Optimization Suggestions

  • Replace zip wildcard with exact filename: unzip -qo $'core-($core_version).zip'
  • Add version validation regex pattern to core-version parameter
  • Consider concurrent download/extract processes using par-each when handling multiple components

Overall Quality: 3

checklist:
  - Compatibility: 
    - "Nu version: Compatible with 0.90+ features"
    - "Cross-platform support: Maintained through path join"
    - "Plugin dependencies: Relies on external curl/unzip"
  - Security:
    - "Input sanitization: Missing core-version pattern validation"
    - "Temporary file handling: Zip file cleanup present"
    - "Env exposure: MOONBIT_HOME properly scoped"
  - Reliability:
    - "Error propagation: Missing try? blocks for network ops"
    - "Null handling: Default values provided for critical paths"
    - "Type validation: Parameter types enforced"
  - Performance:
    - "Lazy evaluation: Not applicable in current flow"
    - "Batch processing: Single-file download handling"
    - "Stream handling: Direct file writes used"

Copy link

Reasoning Details Okay, let me try to review this Nu script. First, I'll start by checking the core requirements. The script targets Nu 0.90+? Well, I see they're using structured data like tables and custom commands, which is good. They have constants and export def for commands, which are part of Nu's features.

Looking at security. There's user input handling for core-version and version. They use string interpolation for external commands like git clone, unzip, tar. Wait, in the setup moonbit command, they pass $core_version into fetch-core. Then in fetch-core, they check if it's a valid version or semver. Oh, but in the aria2c and http get commands, the version is used in the URL. Are they properly sanitized? The version is passed as an argument to the commands, but since they validate it against VALID_VERSION_TAG or is-semver first, that might prevent some injection. But what if the semver check has a flaw? The regex pattern for semver seems correct. But maybe using shell arguments without proper escaping could still be a risk. However, in Nu, when you pass variables as arguments, they are handled as strings, so maybe injection isn't an issue here. But if the version variable isn't properly sanitized before being used in the URL or commands, there might be a risk. Also, in the git clone command, the $coreDir is passed, which is a path based on MOONBIT_LIB_DIR, which is controlled by the environment variable. If MOONBIT_HOME is set by the user, could that lead to a path traversal? Maybe not, since it's using path join, which should handle that.

Performance-wise, they use built-in commands like rm, cd, which are efficient. They use http get or aria2c for downloads. They could perhaps parallelize some steps, but I don't see obvious bottlenecks. They do check if aria2c is installed for potentially faster downloads. The is-semver function uses a regex, which should be efficient.

Another thing is handling of temporary files. After downloading core-*.zip or tar.gz, they extract and delete them. So no leftover files, which is good. The script also checks if the version is valid before proceeding, preventing unnecessary processing.

Wait, in the setup moonbit function, the core_version is a new input. They check if the version is valid but also the core_version. The core_version is checked in the fetch-core function. But in setup moonbit, they have a check for $version, but core_version might need its own validation. However, looking at the code, the setup moonbit command takes --core-version and passes it to fetch-core, which does the check. But the code for setup moonbit has a check for $version, not $core_version. Wait, in the setup moonbit function, they have:

if ($version not-in $VALID_VERSION_TAG) and not (is-semver $version) { ... exit }

But the core_version is a separate parameter. However, in fetch-core, they do:

if ($version not-in VALID_VERSION_TAG) and not (is-semver $version) { ... exit }

So when fetch-core is called with $core_version, that's where the check occurs. So the setup moonbit function only checks the toolchain version, not the core version. But when they call fetch-core $core_version, that function checks the core_version. So that's covered. But the setup moonbit function validates the toolchain version, not the core version. The core version's validation is in fetch-core. That's okay.

Are there any cases where core_version isn't validated? For example, in the branch where core_version is 'bleeding', they clone the repo directly. But in that case, 'bleeding' is in VALID_VERSION_TAG, so it's allowed. The check in fetch-core ensures that if it's not a valid tag or semver, it exits.

Security-wise, they're validating inputs before using them in URLs and commands. The regex for semver is strict. Possible issues: using string interpolation for commands like git clone. If any of the variables (like coreDir) were user-controlled and not sanitized, but coreDir is built from MOONBIT_LIB_DIR, which is either from env or default path. User can set MOONBIT_HOME, but the script uses path join, which should prevent path traversal. So no obvious command injection here.

Performance suggestions: Maybe replace some loops with built-in commands or use par-each if possible. But looking at the code, there are not many loops. The bundle-core function iterates over platform dirs. Not sure if that's a candidate for parallel processing. But the code seems efficient as is.

Another point: using built-in Nu commands instead of external ones where possible. For example, they use http get instead of curl, which is good. Using aria2c if available is a plus for performance.

Potential issues: when running external commands like git, unzip, tar, ensure that arguments are properly quoted. Since Nu handles command arguments as lists, maybe that's safe. For example, in git clone, the URL and destination are passed as separate arguments, so injection is unlikely.

In the input handling, in the composite action, the inputs are passed via ${{ }} which are GitHub context variables. The script uses into bool for setup-core, which is properly converted. The core_version is taken as a string input, which is then passed to the setup moonbit function.

In the is-semver custom command, they check against the regex. But they use $version =~ $semver_pattern, which returns true if it matches. That's correct. The regex seems correct for semver, though not allowing 'v' prefix. But the comment mentions that if needed, add 'v?' which is not there. So the current regex doesn't allow 'v' prefixes. If the core versions can have 'v', that's a problem. But according to the input description, core-version examples include versions without 'v', like 0.1.20250210+7be093d1f. So maybe the regex is correct as is.

If a core version with 'v' is passed, the is-semver would return false, causing the script to exit. But the valid tags include 'latest', 'bleeding', etc. So if someone passes 'v1.2.3', it's invalid unless the regex is adjusted. But the user should follow the version formats specified.

Overall, the script seems well-structured. The main security point is ensuring input validation is thorough. The performance looks okay. Optimization could include using parallel processing where possible, but not sure if applicable here.

Script Analysis

  • Added new core-version input with validation using semver regex and allowed tags
  • Improved core handling with version-specific artifact names
  • Implemented is-semver custom validation command
  • Maintained cross-platform path handling
  • Followed Nu's structured data patterns for configuration

Security Review

  • ❗ Insecure path construction in Git clone destination (line 94) - uses user-influenced $MOONBIT_HOME without sanitization
  • ⚠️ Potential version string injection in aria2c/http calls (line 49) - validated but needs strict type constraints
  • ❗ Temporary archive files deleted after use (good practice)
  • ✅ Input validation for both toolchain and core versions reduces injection risk

Optimization Suggestions

  • ➡️ Replace sequential archive removal with try { rm ... } for error suppression
  • ➡️ Use path join instead of string interpolation for core archive paths
  • ➡️ Convert is-semver to use parse instead of regex for better maintainability
  • ➡️ Consider parallel execution for core setup vs toolchain setup where possible

Overall Quality: 4

  - Compatibility: ["Nu 0.90+", "Cross-platform paths", "No plugin dependencies"]
  - Security: ["Version validation", "Temp file cleanup", "Env path handling"]
  - Reliability: ["Semver validation", "Error exits", "Multi-arch support"]
  - Performance: ["Aria2c preference", "Batch file ops", "Stream handling"]

Copy link

Reasoning Details Okay, let's start reviewing the provided code changes. The main focus is on the Nushell script modifications in `moonbit.nu`.

First, I'll check compatibility with Nu 0.90+. The script uses sys host which is valid in 0.90. The par-each isn't used here, so maybe a suggestion for parallelism. Structured data handling looks okay with records and tables. The pipeline seems efficient, but there's a each in is-installed using which and length, which is okay.

Looking at security: The core_version and version inputs are validated using is-semver and checking against VALID_VERSION_TAG, which prevents command injection. However, when constructing file names like core-($core_version).zip, if core_version isn't properly sanitized, there might be a risk. But since they're validated earlier, it's probably safe. No obvious data leakage points. External commands like aria2c, git, unzip, tar are used with validated inputs, so safe. Permission checks? The script writes to MOONBIT_HOME, which should be user-writable, but maybe could check if directories exist before writing.

Performance: Using par-each isn't present, so maybe in areas where multiple files are processed, but not evident here. is-installed uses which, which is a built-in. The is-semver uses a regex, which is efficient. The fetch-core uses external commands but necessary. Maybe http get could be used instead of aria2c if possible, but the script already checks for aria2c and uses it for efficiency.

Other observations: The is-semver function's regex includes a comment about a 'v' prefix, but the regex doesn't account for it. The pattern starts with ^v?, but in the code, the regex pattern variable semver_pattern doesn't include the 'v?'. Wait, looking at the code: the regex is written as a string, and the comment mentions adding v? but the actual regex starts with ^v?. So the code does handle 'v' prefixes. However, in the is-semver function, they check if the version matches the pattern. That's correct.

Possible issues: When constructing filenames like core-($core_version).zip, if core_version has special characters, but since it's validated by is-semver or being in VALID_VERSION_TAG, maybe it's safe. However, $core_version could be 'nightly' which is a tag, but when using in a filename, is there any injection? Since the version is checked against valid tags and semver, maybe not.

In the setup moonbit command, the parameters --core-version(-V): string = 'latest' is added. The core_version is then used in fetching. The validation for core_version happens in fetch-core which calls is-semver or checks against VALID_VERSION_TAG. But in the setup moonbit function, only the version is validated, not the core_version. Wait, looking at the code:

In setup moonbit, they have:

if ($version not-in $VALID_VERSION_TAG) and not (is-semver $version) {
  print ...; exit 2
}

But core_version is passed as a parameter. Is core_version validated? In fetch-core, which is called with core_version, there's a check:

if ($version not-in $VALID_VERSION_TAG) and not (is-semver $version) {
  print ...; exit 2
}

So when fetch-core $core_version is called, it checks the core_version. So that's correct.

Potential optimization: Replace which $app | length in is-installed with ^$app? Because which returns a list, and checking length is okay, but maybe using ^$app > /dev/null or built-in exists command. Wait, in Nushell, which returns a list, so checking if it's empty is better. The current approach is okay.

Another observation: The is-semver function uses a regex that's correctly formatted. However, in the code, the line:

let semver_pattern = '^v?(0|[1-9]\d*)\.(0|[1-9]\d*)... Keep the reset of the pattern the same

Wait, no, in the actual code, the regex is written correctly. The user's code has the regex properly defined as:
'^v?(0|[1-9]\d*).(0|[1-9]\d*).(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-])(?:.(?:0|[1-9]\d|\d*[a-zA-Z-][0-9a-zA-Z-]))))?(?:+([0-9a-zA-Z-]+(?:.[0-9a-zA-Z-]+)*))?$'
So that's correct.

Security-wise, the inputs are validated before being used in external commands, which is good. The script exits if invalid versions are provided, preventing injection.

Performance: Using par-each isn't applicable here as there are no loops that can be parallelized. The script seems to handle files efficiently with built-in commands where possible.

Possible suggestions: Maybe use path join instead of string interpolation for file paths, but the current approach is okay. Also, the is-semver function could be optimized by pre-compiling the regex, but in Nushell, regex is handled as strings, so maybe not necessary.

Overall, the script is well-structured, handles input validation, and uses Nu features appropriately. The security aspects are addressed with proper checks. Performance is good, but no major bottlenecks observed.

Script Analysis

  • Added new core-version input with validation checks
  • Implemented semantic version validation using regex
  • Improved core handling with version-specific artifacts
  • Maintained cross-platform path handling
  • Enhanced input sanitization for version parameters

Security Review

  • ❗ Missing input validation for core-version in main command (added in fetch-core but should be duplicated in parent scope)
  • ⚠️ Potential path traversal risk in $core_version filename construction (mitigated by regex but could use additional sanitization)
  • ✔️ Proper validation of version formats before external command execution

Optimization Suggestions

  • Replace which $app | length check with ^$app --help | complete for more reliable installed checks
  • Use path join instead of string interpolation for $'core-($core_version).zip' paths
  • Cache downloaded artifacts to avoid redundant network calls
  • Consider using nu-plugin for archive handling to reduce external dependencies

Overall Quality: 4

checklist:
  - Compatibility: ["Nu 0.90+ compatible", "Cross-platform path handling", "No plugin dependencies"]
  - Security: ["Version input validation", "Regex-based semver check", "Limited external command exposure"]
  - Reliability: ["Error handling with early exits", "Null checks for environment variables", "Type validation via regex"]
  - Performance: ["Conditional network calls", "Bulk file operations", "Builtin path handling"]

examples:
  - issue: "❗ Missing core_version validation in main command scope"
  - suggestion: "Implement artifact caching mechanism for repeated installations"

@hustcer hustcer merged commit 6dbe932 into main Feb 17, 2025
18 checks passed
@github-actions github-actions bot added this to the v1.11.0 milestone Feb 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add core-version input
1 participant