Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-ASCII character support #45736

Merged
merged 1 commit into from
Feb 18, 2023
Merged

Conversation

mertcanaltin
Copy link
Member

@mertcanaltin mertcanaltin commented Dec 4, 2022

Fixes #45706
Fixes #46508

  • Is the character's ASCII code greater than 127? (In this case, the function returns true)
  • Is the character a number? (In this case, the function returns false)
  • Is the character a "#" symbol? (In this case, the function returns false)
  • If no conditions are met, the function returns true.

@nodejs-github-bot nodejs-github-bot added dont-land-on-v14.x needs-ci PRs that need a full CI run. test_runner Issues and PRs related to the test runner subsystem. labels Dec 4, 2022
@mertcanaltin
Copy link
Member Author

mertcanaltin commented Dec 4, 2022

test

 const  isLiteralSymbol = (char) => {
    const code = char.charCodeAt(0);

    if (code > 127) {
      return true;
    }

    if (char >= '0' && char <= 9) {
      return false;
    }

    if (code === 35) {
      return false;
    }

    return true;
  }

output

isLiteralSymbol('أهلا')
true

lib/internal/test_runner/tap_lexer.js Outdated Show resolved Hide resolved
lib/internal/test_runner/tap_lexer.js Outdated Show resolved Hide resolved
test/parallel/test-runner-tap-lexer.js Outdated Show resolved Hide resolved
@anonrig anonrig added the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Dec 4, 2022
@mertcanaltin
Copy link
Member Author

@anonrig I applied the changes, thank you very much for the review 🚀

test/parallel/test-runner-tap-lexer.js Outdated Show resolved Hide resolved
lib/internal/test_runner/tap_lexer.js Outdated Show resolved Hide resolved
lib/internal/test_runner/tap_lexer.js Outdated Show resolved Hide resolved
lib/internal/test_runner/tap_lexer.js Outdated Show resolved Hide resolved
lib/internal/test_runner/tap_lexer.js Outdated Show resolved Hide resolved
@anonrig anonrig requested a review from MoLow December 4, 2022 23:03
@anonrig
Copy link
Member

anonrig commented Dec 4, 2022

@manekinekko Can you review this?

Copy link
Member

@MoLow MoLow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change LGTM, but this seems to break quite a few tests

lib/internal/test_runner/tap_lexer.js Outdated Show resolved Hide resolved
assert.strictEqual(isLiteralSymbol('#'), false);
assert.strictEqual(isLiteralSymbol('أ'), true);
assert.strictEqual(isLiteralSymbol('ت'), true);
assert.strictEqual(isLiteralSymbol('ث'), true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you:

  1. group these into 2 groups: literals (true) and non-literals (false)?
  2. add more samples of non-Latin characters? you can cherry-pick from this list here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manekinekko thank you so much i will fly here 🚀

Copy link
Member Author

@mertcanaltin mertcanaltin Dec 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manekinekko would that be healthy

{
  const literals = ['A', 'a', '-', '+', 'أ', 'ت', 'ث', '讲', '演', '講'];
  const nonLiterals = ['0', '#', '\\', '+', '-'];
  
  literals.forEach((literal) => {
  assert.strictEqual(isLiteralSymbol(literal), true);
  });
  
  nonLiterals.forEach((nonLiteral) => {
  assert.strictEqual(isLiteralSymbol(nonLiteral), false);
  });
}

@mertcanaltin mertcanaltin requested review from manekinekko, MoLow and anonrig and removed request for manekinekko, MoLow and anonrig December 5, 2022 14:59
@mertcanaltin
Copy link
Member Author

I'm so sorry I triggered you all to review, sorry for the extra notifications :/

lib/internal/test_runner/tap_lexer.js Outdated Show resolved Hide resolved
lib/internal/test_runner/tap_lexer.js Outdated Show resolved Hide resolved
test/parallel/test-runner-tap-lexer.js Outdated Show resolved Hide resolved
@mertcanaltin
Copy link
Member Author

@manekinekko I sent the edits thank you very much

Copy link
Member

@BridgeAR BridgeAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we really accept zero width characters as acceptable input? I would skip all of them. We also already have a function to check for these:

const isZeroWidthCodePoint = (code) => {
return code <= 0x1F || // C0 control codes
(code >= 0x7F && code <= 0x9F) || // C1 control codes
(code >= 0x300 && code <= 0x36F) || // Combining Diacritical Marks
(code >= 0x200B && code <= 0x200F) || // Modifying Invisible Characters
// Combining Diacritical Marks for Symbols
(code >= 0x20D0 && code <= 0x20FF) ||
(code >= 0xFE00 && code <= 0xFE0F) || // Variation Selectors
(code >= 0xFE20 && code <= 0xFE2F) || // Combining Half Marks
(code >= 0xE0100 && code <= 0xE01EF); // Variation Selectors
};

@mertcanaltin
Copy link
Member Author

mertcanaltin commented Dec 7, 2022

I didn't accept zero-width characters @BridgeAR i will update the code like this

  if (typeof char !== 'string' || util.inspect.isZeroWidthCodePoint(char)) {
      return false;
    }

can I do it like this?

@mertcanaltin mertcanaltin reopened this Dec 7, 2022
@cjihrig
Copy link
Contributor

cjihrig commented Dec 12, 2022

@mertcanaltin I think that would be ok.

@MoLow
Copy link
Member

MoLow commented Feb 8, 2023

CC @nodejs/test_runner @manekinekko additional reviews will be appreciated

@MoLow
Copy link
Member

MoLow commented Feb 8, 2023

@anonrig can you please dismiss your review/approve?

@MoLow MoLow removed the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Feb 8, 2023
@MoLow MoLow added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. and removed author ready PRs that have at least one approval, no pending requests for changes, and a CI started. labels Feb 9, 2023
@MoLow MoLow added the request-ci Add this label to start a Jenkins CI on a PR. label Feb 18, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Feb 18, 2023
@nodejs-github-bot
Copy link
Collaborator

@MoLow MoLow added the commit-queue Add this label to land a pull request using GitHub Actions. label Feb 18, 2023
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Feb 18, 2023
@nodejs-github-bot nodejs-github-bot merged commit 3c6547f into nodejs:main Feb 18, 2023
@nodejs-github-bot
Copy link
Collaborator

Landed in 3c6547f

@mertcanaltin mertcanaltin changed the title [WIP] Non-ASCII character support Non-ASCII character support Feb 18, 2023
@mertcanaltin
Copy link
Member Author

🎉

MoLow pushed a commit to MoLow/node that referenced this pull request Feb 18, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
MylesBorins pushed a commit that referenced this pull request Feb 18, 2023
PR-URL: #45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
@MylesBorins MylesBorins mentioned this pull request Feb 19, 2023
MylesBorins pushed a commit that referenced this pull request Feb 20, 2023
PR-URL: #45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
MoLow pushed a commit to MoLow/node that referenced this pull request Feb 25, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
MoLow pushed a commit to MoLow/node that referenced this pull request Feb 25, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
@MoLow MoLow added the backport-open-v18.x Indicate that the PR has an open backport. label Feb 25, 2023
MoLow pushed a commit to MoLow/node that referenced this pull request Feb 25, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
juanarbol pushed a commit that referenced this pull request Mar 3, 2023
PR-URL: #45736
Backport-PR-URL: #46839
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
@juanarbol juanarbol mentioned this pull request Mar 3, 2023
juanarbol pushed a commit that referenced this pull request Mar 5, 2023
PR-URL: #45736
Backport-PR-URL: #46839
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-open-v18.x Indicate that the PR has an open backport. needs-ci PRs that need a full CI run. test_runner Issues and PRs related to the test runner subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

test runner regression in 19.2.0 Tap parser fails when test name includes non ASCII characters
9 participants