-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: adapter new html format on nodejs.org/dist #728
Conversation
WalkthroughThe changes involve modifications to the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant NodeBinary
participant HTTPClient
participant HTMLParser
User->>NodeBinary: fetch()
NodeBinary->>HTTPClient: Request HTML from Node.js distribution
HTTPClient-->>NodeBinary: HTML Response
NodeBinary->>HTMLParser: Parse HTML for file links
HTMLParser-->>NodeBinary: Extracted file data
NodeBinary-->>User: Return file data
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #728 +/- ##
==========================================
- Coverage 96.46% 96.46% -0.01%
==========================================
Files 188 188
Lines 18803 18879 +76
Branches 2463 2481 +18
==========================================
+ Hits 18139 18211 +72
- Misses 664 668 +4 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (4)
app/common/adapter/binary/NodeBinary.ts (3)
25-32
: Improve format documentation clarityThe comments showing HTML format examples use a future date (2024). Consider:
- Using actual dates from nodejs.org
- Structuring the comments to clearly separate old vs new formats
- Adding JSDoc-style documentation explaining the format differences
37-42
: LGTM! Consider adding type annotationThe directory handling logic is correct. The use of
basename
properly handles both relative and absolute paths.Consider adding a type annotation for clarity:
- let name = m[1]; + let name: string = m[1];
42-46
: Consider parsing size values with unitsThe size handling could be improved to parse values with units (KB, B) into a standardized format.
Consider adding a helper function:
function parseSize(sizeStr: string): number | null { if (sizeStr === '-') return null; const match = sizeStr.trim().match(/^([\d.]+)\s*([KMG]B|B)?$/i); if (!match) return null; const [, value, unit] = match; const baseValue = parseFloat(value); const multipliers = { B: 1, KB: 1024, MB: 1024 * 1024, GB: 1024 * 1024 * 1024 }; return baseValue * (unit ? multipliers[unit.toUpperCase()] : 1); }test/common/adapter/binary/NodeBinary.test.ts (1)
87-103
: Maintain consistent assertion style across testsThe new test uses
assert.equal()
while other tests use theassert()
with direct comparison. Consider updating the assertions to match the existing style for consistency:- assert.equal(item.date, '-'); - assert.equal(item.isDir, true); - assert.equal(item.size, '-'); + assert(item.date === '-'); + assert(item.isDir === true); + assert(item.size === '-');
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
app/common/adapter/binary/NodeBinary.ts
(2 hunks)test/common/adapter/binary/NodeBinary.test.ts
(1 hunks)test/fixtures/nodejs.org/site/v18.15.0/index.html
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- test/fixtures/nodejs.org/site/v18.15.0/index.html
🔇 Additional comments (2)
app/common/adapter/binary/NodeBinary.ts (2)
49-57
: LGTM! Clean implementation
The item construction is well-structured and properly handles optional configuration.
33-33
: Verify regex pattern robustness
The regex pattern handles both old and new formats, but let's verify it handles all edge cases correctly.
✅ Verification successful
Regex pattern is correctly handling all edge cases
The regex pattern has been thoroughly verified through the test cases which cover:
- Both old and new HTML formats from nodejs.org
- Directory entries with "-" size and various date formats
- File entries with different size formats (e.g., "3.2 KB", "20416228")
- Various date formats including "-" and "DD-MMM-YYYY HH:MM"
- Both relative ("/dist/v18.15.0/file") and direct ("file") href formats
- Zero-size file exclusion
The test suite demonstrates the pattern successfully parses:
- Root directory listings
- Version subdirectories (v16, v18)
- Nightly build directories
- Different file types (docs, binaries, checksums)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Test regex pattern against actual nodejs.org directory listings
# to ensure it handles all possible formats correctly.
# Fetch a sample of actual HTML content from different Node.js versions
curl -s "https://nodejs.org/dist/latest/" > latest.html
curl -s "https://nodejs.org/dist/latest-v18.x/" > v18.html
curl -s "https://nodejs.org/dist/latest-v16.x/" > v16.html
# Test the regex pattern against actual content
cat latest.html v18.html v16.html | grep -P '<a href="[^\"]+?"[^>]*?>[^<]+?</a>\s+?((?:[\w\-]+? \w{2}\:\d{2})|\-)\s+?([\d\.\-\s\w]+)'
# Clean up
rm latest.html v18.html v16.html
Length of output: 308
Script:
#!/bin/bash
# Let's analyze the regex pattern usage in the codebase to understand its context and test cases
# First, find where this regex is used
rg -l "re = /<a href=\"" --type ts
# Then look for any test files that might validate this regex
fd "test|spec" -e ts -e js | xargs rg "NodeBinary|href"
# Look for the complete implementation to understand the parsing logic
ast-grep --pattern 'const re = /<a href=' -A 10
Length of output: 2044
Script:
#!/bin/bash
# Let's examine the test file to understand the expected behavior and test cases
cat test/common/adapter/binary/NodeBinary.test.ts
# Also get the full implementation context
cat app/common/adapter/binary/NodeBinary.ts
Length of output: 14300
matchDir = true; | ||
} | ||
if (item.name === 'SHASUMS256.txt') { | ||
assert.equal(item.date, '04-Nov-2024 17:29'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Test fixtures contain future dates that need updating
The future dates are coming from the test fixtures file (test/fixtures/nodejs.org/site/v18.15.0/index.html
). These dates are used to mock HTTP responses in tests. While the tests will pass today, they will start failing when we reach and pass these dates (October/November 2024).
Recommendations:
- Update the fixture file to use past dates (e.g., March 2023)
- Update the corresponding test assertions in
NodeBinary.test.ts
- Consider using relative dates or date mocking utilities for more maintainable tests
🔗 Analysis chain
Verify test dates: Future dates detected
The test uses dates from November and October 2024, which are in the future. This could cause test failures when these dates become past dates. Consider:
- Verifying if these dates are correct
- Using past dates for more reliable testing
- Using date assertions that are resilient to time passage
Also applies to: 100-100
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if the dates in the test match the actual nodejs.org distribution dates
# Fetch the actual page to verify dates
curl -sI "https://nodejs.org/dist/v18.15.0/SHASUMS256.txt" | grep "Last-Modified:"
curl -sI "https://nodejs.org/dist/v18.15.0/node-v18.15.0-win-x64.zip" | grep "Last-Modified:"
Length of output: 172
Script:
#!/bin/bash
# Let's examine the test file to understand the context of these dates
cat test/common/adapter/binary/NodeBinary.test.ts
Length of output: 11115
Script:
#!/bin/bash
# Let's check the fixture file to understand if these dates are part of test fixtures
cat test/fixtures/nodejs.org/site/v18.15.0/index.html
Length of output: 5069
if (item.name === 'SHASUMS256.txt') { | ||
assert.equal(item.date, '04-Nov-2024 17:29'); | ||
assert.equal(item.isDir, false); | ||
assert.equal(item.size, '3.2 KB'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
🛠️ Refactor suggestion
Test coverage for size format parsing is essential
Based on the implementation and test fixtures, the review comment is valid. The NodeBinary implementation directly uses the size string from the HTML without proper parsing or validation, while the test now expects formatted sizes ("3.2 KB", "29 MB"). This change needs proper test coverage as:
- The implementation in
NodeBinary.ts
simply usesm[3].trim()
without any size format parsing - The nodejs.org fixture shows various size formats (B, KB, MB) that need to be handled
- Current tests don't verify edge cases in size parsing
Key areas needing test coverage:
- Various size units (B, KB, MB, GB)
- Decimal places handling (e.g., "3.2 KB", "8.6 MB")
- Edge cases (e.g., "478 KB", "310 B")
- Invalid formats
🔗 Analysis chain
Add test coverage for size format parsing
The test introduces human-readable sizes ("3.2 KB", "29 MB") while other tests use byte counts. This suggests a change in the NodeBinary.fetch()
implementation to handle formatted sizes. Consider adding test cases for:
- Edge cases in size formatting (e.g., "1.0 KB", "0.9 MB")
- Various size units (B, KB, MB, GB)
- Invalid size formats
Also applies to: 102-102
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if the size format change is consistent across the nodejs.org distribution site
# Search for different size format patterns in the HTML files
echo "Searching for size formats in HTML files..."
fd -e html -x rg -l '\d+(\.\d+)?\s*(B|KB|MB|GB)' {} \;
# Check if any files still use byte counts
echo "Checking for byte count formats..."
fd -e html -x rg -l '\d{4,}(?!\s*[KMGT]?B)' {} \;
Length of output: 3464
Script:
#!/bin/bash
# Let's check the implementation and test files to understand the size format handling
# Search for size-related code in NodeBinary implementation
echo "Searching for size handling in NodeBinary implementation..."
ast-grep --pattern 'size'
# Check all test files for size assertions
echo "Checking test files for size assertions..."
fd -e test.ts -x rg -A 2 'assert.*size' {} \;
# Look for size format parsing implementation
echo "Looking for size format parsing implementation..."
rg -l "size.*format|format.*size|parseSize|formatSize"
Length of output: 13444
Script:
#!/bin/bash
# Let's check the NodeBinary implementation for size formatting logic
# Check NodeBinary implementation
echo "Checking NodeBinary implementation..."
rg -A 10 'size.*=.*m\[3\]' app/common/adapter/binary/NodeBinary.ts
# Check the test fixture to understand the format change
echo "Checking nodejs.org test fixture..."
cat test/fixtures/nodejs.org/site/v18.15.0/index.html
Length of output: 5622
[skip ci] ## [3.67.1](v3.67.0...v3.67.1) (2024-11-14) ### Bug Fixes * adapter new html format on nodejs.org/dist ([#728](#728)) ([914b59c](914b59c))
Summary by CodeRabbit
New Features
Bug Fixes
Tests
fetch()
method to verify functionality against the Node.js distribution version 18.15.0.