Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: adapter new html format on nodejs.org/dist #728

Merged
merged 1 commit into from
Nov 14, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 19 additions & 5 deletions app/common/adapter/binary/NodeBinary.ts
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
import { basename } from 'node:path';
import { SingletonProto } from '@eggjs/tegg';
import { BinaryType } from '../../enum/Binary';
import binaries, { BinaryName } from '../../../../config/binaries';
Expand All @@ -21,26 +22,39 @@ export class NodeBinary extends AbstractBinary {
// <a href="index.tab">index.tab</a> 17-Dec-2021 23:16 136319
// <a href="node-0.0.1.tar.gz">node-0.0.1.tar.gz</a> 26-Aug-2011 16:22 2846972
// <a href="node-v14.0.0-nightly20200119b318926634-linux-armv7l.tar.xz">node-v14.0.0-nightly20200119b318926634-linux-ar..&gt;</a> 19-Jan-2020 06:07 18565976
const re = /<a href="([^\"]+?)"[^>]*?>[^<]+?<\/a>\s+?([\w\-]+? \w{2}\:\d{2})\s+?(\d+|\-)/ig;

// new html format
// <a href="docs/">docs/</a> - -
// <a href="win-x64/">win-x64/</a> - -
// <a href="win-x86/">win-x86/</a> - -
// <a href="/dist/v18.15.0/SHASUMS256.txt.asc">SHASUMS256.txt.asc</a> 04-Nov-2024 17:29 3.7 KB
// <a href="/dist/v18.15.0/SHASUMS256.txt.sig">SHASUMS256.txt.sig</a> 04-Nov-2024 17:29 310 B
// <a href="/dist/v18.15.0/SHASUMS256.txt">SHASUMS256.txt</a> 04-Nov-2024 17:29 3.2 KB
const re = /<a href="([^\"]+?)"[^>]*?>[^<]+?<\/a>\s+?((?:[\w\-]+? \w{2}\:\d{2})|\-)\s+?([\d\.\-\s\w]+)/ig;
const matchs = html.matchAll(re);
const items: BinaryItem[] = [];
for (const m of matchs) {
const name = m[1];
let name = m[1];
const isDir = name.endsWith('/');
if (!isDir) {
// /dist/v18.15.0/SHASUMS256.txt => SHASUMS256.txt
name = basename(name);
}
const fileUrl = isDir ? '' : `${url}${name}`;
const date = m[2];
const size = m[3];
const size = m[3].trim();
if (size === '0') continue;
if (binaryConfig.ignoreFiles?.includes(`${dir}${name}`)) continue;

items.push({
const item = {
name,
isDir,
url: fileUrl,
size,
date,
ignoreDownloadStatuses: binaryConfig.options?.ignoreDownloadStatuses,
});
};
items.push(item);
}
return { items, nextParams: null };
}
Expand Down
39 changes: 39 additions & 0 deletions test/common/adapter/binary/NodeBinary.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,45 @@ describe('test/common/adapter/binary/NodeBinary.test.ts', () => {
assert(matchFile);
});

it('should fetch subdir: /v18.15.0/ work', async () => {
app.mockHttpclient('https://nodejs.org/dist/v18.15.0/', 'GET', {
data: await TestUtil.readFixturesFile('nodejs.org/site/v18.15.0/index.html'),
});
const result = await binary.fetch('/v18.15.0/', 'node');
assert(result);
assert(result.items.length > 0);
let matchDir = false;
let matchFile = false;
for (const item of result.items) {
if (item.name === 'docs/') {
assert.equal(item.date, '-');
assert.equal(item.isDir, true);
assert.equal(item.size, '-');
matchDir = true;
}
if (item.name === 'SHASUMS256.txt') {
assert.equal(item.date, '04-Nov-2024 17:29');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Test fixtures contain future dates that need updating

The future dates are coming from the test fixtures file (test/fixtures/nodejs.org/site/v18.15.0/index.html). These dates are used to mock HTTP responses in tests. While the tests will pass today, they will start failing when we reach and pass these dates (October/November 2024).

Recommendations:

  • Update the fixture file to use past dates (e.g., March 2023)
  • Update the corresponding test assertions in NodeBinary.test.ts
  • Consider using relative dates or date mocking utilities for more maintainable tests
🔗 Analysis chain

Verify test dates: Future dates detected

The test uses dates from November and October 2024, which are in the future. This could cause test failures when these dates become past dates. Consider:

  1. Verifying if these dates are correct
  2. Using past dates for more reliable testing
  3. Using date assertions that are resilient to time passage

Also applies to: 100-100

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the dates in the test match the actual nodejs.org distribution dates

# Fetch the actual page to verify dates
curl -sI "https://nodejs.org/dist/v18.15.0/SHASUMS256.txt" | grep "Last-Modified:"
curl -sI "https://nodejs.org/dist/v18.15.0/node-v18.15.0-win-x64.zip" | grep "Last-Modified:"

Length of output: 172


Script:

#!/bin/bash
# Let's examine the test file to understand the context of these dates
cat test/common/adapter/binary/NodeBinary.test.ts

Length of output: 11115


Script:

#!/bin/bash
# Let's check the fixture file to understand if these dates are part of test fixtures
cat test/fixtures/nodejs.org/site/v18.15.0/index.html

Length of output: 5069

assert.equal(item.isDir, false);
assert.equal(item.size, '3.2 KB');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

🛠️ Refactor suggestion

Test coverage for size format parsing is essential

Based on the implementation and test fixtures, the review comment is valid. The NodeBinary implementation directly uses the size string from the HTML without proper parsing or validation, while the test now expects formatted sizes ("3.2 KB", "29 MB"). This change needs proper test coverage as:

  1. The implementation in NodeBinary.ts simply uses m[3].trim() without any size format parsing
  2. The nodejs.org fixture shows various size formats (B, KB, MB) that need to be handled
  3. Current tests don't verify edge cases in size parsing

Key areas needing test coverage:

  • Various size units (B, KB, MB, GB)
  • Decimal places handling (e.g., "3.2 KB", "8.6 MB")
  • Edge cases (e.g., "478 KB", "310 B")
  • Invalid formats
🔗 Analysis chain

Add test coverage for size format parsing

The test introduces human-readable sizes ("3.2 KB", "29 MB") while other tests use byte counts. This suggests a change in the NodeBinary.fetch() implementation to handle formatted sizes. Consider adding test cases for:

  1. Edge cases in size formatting (e.g., "1.0 KB", "0.9 MB")
  2. Various size units (B, KB, MB, GB)
  3. Invalid size formats

Also applies to: 102-102

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the size format change is consistent across the nodejs.org distribution site

# Search for different size format patterns in the HTML files
echo "Searching for size formats in HTML files..."
fd -e html -x rg -l '\d+(\.\d+)?\s*(B|KB|MB|GB)' {} \;

# Check if any files still use byte counts
echo "Checking for byte count formats..."
fd -e html -x rg -l '\d{4,}(?!\s*[KMGT]?B)' {} \;

Length of output: 3464


Script:

#!/bin/bash
# Let's check the implementation and test files to understand the size format handling

# Search for size-related code in NodeBinary implementation
echo "Searching for size handling in NodeBinary implementation..."
ast-grep --pattern 'size'

# Check all test files for size assertions
echo "Checking test files for size assertions..."
fd -e test.ts -x rg -A 2 'assert.*size' {} \;

# Look for size format parsing implementation
echo "Looking for size format parsing implementation..."
rg -l "size.*format|format.*size|parseSize|formatSize"

Length of output: 13444


Script:

#!/bin/bash
# Let's check the NodeBinary implementation for size formatting logic

# Check NodeBinary implementation
echo "Checking NodeBinary implementation..."
rg -A 10 'size.*=.*m\[3\]' app/common/adapter/binary/NodeBinary.ts

# Check the test fixture to understand the format change
echo "Checking nodejs.org test fixture..."
cat test/fixtures/nodejs.org/site/v18.15.0/index.html

Length of output: 5622

assert.equal(item.url, 'https://nodejs.org/dist/v18.15.0/SHASUMS256.txt');
matchFile = true;
}
if (item.name === 'node-v18.15.0-win-x64.zip') {
assert.equal(item.date, '30-Oct-2024 18:04');
assert.equal(item.isDir, false);
assert.equal(item.size, '29 MB');
assert.equal(item.url, 'https://nodejs.org/dist/v18.15.0/node-v18.15.0-win-x64.zip');
matchFile = true;
}
if (!item.isDir) {
assert(typeof item.size === 'string');
assert(item.size.length > 2);
}
}
assert(matchDir);
assert(matchFile);
});

it('should fetch subdir: /v14.0.0-nightly20200119b318926634/ work', async () => {
app.mockHttpclient('https://nodejs.org/download/nightly/v14.0.0-nightly20200119b318926634/', 'GET', {
data: await TestUtil.readFixturesFile('nodejs.org/download/nightly/v14.0.0-nightly20200119b318926634/index.html'),
Expand Down
55 changes: 55 additions & 0 deletions test/fixtures/nodejs.org/site/v18.15.0/index.html
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@

<!DOCTYPE html><html>
<head>
<title>Index of /dist/v18.15.0/</title>
<style>
@media (prefers-color-scheme: dark) {
body {
color: white;
background-color: #1c1b22;
}
a {
color: #3391ff;
}
a:visited {
color: #C63B65;
}
}
</style>
</head>
<body>
<h1>Index of /dist/v18.15.0/</h1><hr><pre><a href="../">../</a>
<a href="docs/">docs/</a> - -
<a href="win-x64/">win-x64/</a> - -
<a href="win-x86/">win-x86/</a> - -
<a href="/dist/v18.15.0/SHASUMS256.txt.asc">SHASUMS256.txt.asc</a> 04-Nov-2024 17:29 3.7 KB
<a href="/dist/v18.15.0/SHASUMS256.txt.sig">SHASUMS256.txt.sig</a> 04-Nov-2024 17:29 310 B
<a href="/dist/v18.15.0/SHASUMS256.txt">SHASUMS256.txt</a> 04-Nov-2024 17:29 3.2 KB
<a href="/dist/v18.15.0/node-v18.15.0-aix-ppc64.tar.gz">node-v18.15.0-aix-ppc64.tar.gz</a> 30-Oct-2024 18:04 57 MB
<a href="/dist/v18.15.0/node-v18.15.0-darwin-arm64.tar.gz">node-v18.15.0-darwin-arm64.tar.gz</a> 30-Oct-2024 18:04 40 MB
<a href="/dist/v18.15.0/node-v18.15.0-darwin-arm64.tar.xz">node-v18.15.0-darwin-arm64.tar.xz</a> 30-Oct-2024 18:04 20 MB
<a href="/dist/v18.15.0/node-v18.15.0-darwin-x64.tar.gz">node-v18.15.0-darwin-x64.tar.gz</a> 30-Oct-2024 18:04 42 MB
<a href="/dist/v18.15.0/node-v18.15.0-darwin-x64.tar.xz">node-v18.15.0-darwin-x64.tar.xz</a> 30-Oct-2024 18:04 22 MB
<a href="/dist/v18.15.0/node-v18.15.0-headers.tar.gz">node-v18.15.0-headers.tar.gz</a> 30-Oct-2024 18:04 8.6 MB
<a href="/dist/v18.15.0/node-v18.15.0-headers.tar.xz">node-v18.15.0-headers.tar.xz</a> 04-Nov-2024 17:29 478 KB
<a href="/dist/v18.15.0/node-v18.15.0-linux-arm64.tar.gz">node-v18.15.0-linux-arm64.tar.gz</a> 30-Oct-2024 18:04 44 MB
<a href="/dist/v18.15.0/node-v18.15.0-linux-arm64.tar.xz">node-v18.15.0-linux-arm64.tar.xz</a> 30-Oct-2024 18:04 23 MB
<a href="/dist/v18.15.0/node-v18.15.0-linux-armv7l.tar.gz">node-v18.15.0-linux-armv7l.tar.gz</a> 30-Oct-2024 18:04 41 MB
<a href="/dist/v18.15.0/node-v18.15.0-linux-armv7l.tar.xz">node-v18.15.0-linux-armv7l.tar.xz</a> 30-Oct-2024 18:04 21 MB
<a href="/dist/v18.15.0/node-v18.15.0-linux-ppc64le.tar.gz">node-v18.15.0-linux-ppc64le.tar.gz</a> 30-Oct-2024 18:04 46 MB
<a href="/dist/v18.15.0/node-v18.15.0-linux-ppc64le.tar.xz">node-v18.15.0-linux-ppc64le.tar.xz</a> 30-Oct-2024 18:04 24 MB
<a href="/dist/v18.15.0/node-v18.15.0-linux-s390x.tar.gz">node-v18.15.0-linux-s390x.tar.gz</a> 30-Oct-2024 18:04 44 MB
<a href="/dist/v18.15.0/node-v18.15.0-linux-s390x.tar.xz">node-v18.15.0-linux-s390x.tar.xz</a> 30-Oct-2024 18:04 22 MB
<a href="/dist/v18.15.0/node-v18.15.0-linux-x64.tar.gz">node-v18.15.0-linux-x64.tar.gz</a> 30-Oct-2024 18:04 44 MB
<a href="/dist/v18.15.0/node-v18.15.0-linux-x64.tar.xz">node-v18.15.0-linux-x64.tar.xz</a> 30-Oct-2024 18:04 24 MB
<a href="/dist/v18.15.0/node-v18.15.0-win-x64.7z">node-v18.15.0-win-x64.7z</a> 30-Oct-2024 18:04 18 MB
<a href="/dist/v18.15.0/node-v18.15.0-win-x64.zip">node-v18.15.0-win-x64.zip</a> 30-Oct-2024 18:04 29 MB
<a href="/dist/v18.15.0/node-v18.15.0-win-x86.7z">node-v18.15.0-win-x86.7z</a> 30-Oct-2024 18:04 17 MB
<a href="/dist/v18.15.0/node-v18.15.0-win-x86.zip">node-v18.15.0-win-x86.zip</a> 30-Oct-2024 18:04 27 MB
<a href="/dist/v18.15.0/node-v18.15.0-x64.msi">node-v18.15.0-x64.msi</a> 30-Oct-2024 18:04 32 MB
<a href="/dist/v18.15.0/node-v18.15.0-x86.msi">node-v18.15.0-x86.msi</a> 30-Oct-2024 18:04 30 MB
<a href="/dist/v18.15.0/node-v18.15.0.pkg">node-v18.15.0.pkg</a> 30-Oct-2024 18:04 70 MB
<a href="/dist/v18.15.0/node-v18.15.0.tar.gz">node-v18.15.0.tar.gz</a> 30-Oct-2024 18:04 85 MB
<a href="/dist/v18.15.0/node-v18.15.0.tar.xz">node-v18.15.0.tar.xz</a> 30-Oct-2024 18:04 40 MB
</pre><hr /></body>
</html>
Loading