Use Rf_charIsASCII for IS_ASCII instead of testing LEVELS on R >= 4.5 #6422

Merged
aitap merged 10 commits into master from R_4_5_isCharASCII on Mar 5, 2025

Contributor

@aitap aitap commented Aug 28, 2024

This part is split from #6420 (but includes #6420; can rebase if needed) because the R version containing Rf_charIsASCII has not been released yet. The overall issue is #6180. Not yet sure how to change NEWS.md in case we're expecting data.table releases in the meantime.

aitap added 4 commits August 28, 2024 16:36
Since data.table now depends on R >= 3.3, the backports are no longer
needed. Moreover, MAYBE_SHARED is currently a function, while
MAYBE_REFERENCED expands to !NO_REFERENCES (which is a function).

In debugging output, show MAYBE_REFERENCED (NAMED > 0) instead of NAMED.

getCharCE appeared in R-2.7, making it possible to check for strings
_marked_ as UTF-8 or Latin-1.

There is no marking as ASCII, so fixing IS_ASCII will have to wait for R
>= 4.5.

There's no explicit encoding code for ASCII, so use charIsASCII()
("eapi", expected to appear in R-4.5.0).
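As a rough illustration of the property these commit messages are about, the following standalone sketch checks whether a byte string is pure ASCII by scanning it. This is a swapped-in technique for illustration only: the diff in this PR tests a cached bit rather than scanning bytes, and `sk_bytes_are_ascii` is a hypothetical name that is not part of R or data.table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an R API): returns 1 when every byte of s[0..n)
 * is below 0x80, i.e. the buffer is pure ASCII, and 0 otherwise. */
int sk_bytes_are_ascii(const char *s, size_t n) {
    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* Cast first: plain char may be signed, so high bytes can be negative. */
        if ((unsigned char)s[i] > 0x7F)
            return 0;
    }
    return 1;
}
```

An encoding mark such as UTF-8 or Latin-1 says how to interpret bytes above 0x7F; a pure-ASCII string has none, which is why it needs a separate predicate rather than an encoding code.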

github-actions bot commented Aug 28, 2024

Comparison Plot

Generated via commit 090dc37

Download link for the artifact containing the test results: ↓ atime-results.zip

| Task | Duration |
| --- | --- |
| R setup and installing dependencies | 4 minutes and 44 seconds |
| Installing different package versions | 8 minutes and 23 seconds |
| Running and plotting the test cases | 2 minutes and 24 seconds |

@MichaelChirico
Member

> Not yet sure how to change NEWS.md in case we're expecting data.table releases in the meantime.

Don't mind that, we maintainers will take care of it :)

@MichaelChirico MichaelChirico changed the base branch from master to nonapi_b_gone August 28, 2024 16:39
@MichaelChirico
Member

Set #6420 as the target of this PR to make the chain clearer.

Base automatically changed from nonapi_b_gone to master August 28, 2024 16:42
Member

@HughParsonage HughParsonage left a comment

Is it an issue that the pre-4.5.0 version returns 1 or 0, whereas the new version returns an Rboolean?
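To make the question concrete, here is a minimal standalone sketch contrasting a raw bit test with a normalized boolean result. Note that the bit test shown in the diff, `(LEVELS(x) & 64)`, evaluates to 0 or 64 rather than 0 or 1. All names below (`sk_boolean`, `sk_is_ascii_raw`, `sk_is_ascii_flag`, `sk_count_ascii`) are hypothetical stand-ins, and the sketch assumes call sites only consult the result's truthiness.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-valued enum, standing in for a boolean return type. */
typedef enum { SK_FALSE = 0, SK_TRUE } sk_boolean;

/* Old style: raw bit test on the flag word, yields 0 or 64. */
int sk_is_ascii_raw(int levels) {
    return levels & 64;
}

/* New style: normalized result, yields SK_FALSE (0) or SK_TRUE (1). */
sk_boolean sk_is_ascii_flag(int levels) {
    return (levels & 64) ? SK_TRUE : SK_FALSE;
}

/* Typical call site: the result is used only as a condition, so both
 * styles select the same branch. */
int sk_count_ascii(const int *levels, int n, int use_flag) {
    assert(levels != NULL || n == 0);
    int count = 0;
    for (int i = 0; i < n; i++) {
        int hit = use_flag ? (int)sk_is_ascii_flag(levels[i])
                           : sk_is_ascii_raw(levels[i]);
        if (hit)
            count++;
    }
    return count;
}
```

Under that assumption the two styles are interchangeable. They would differ only where the value is compared against 1 or TRUE, stored in a narrow field, or used in arithmetic.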

@aitap
Contributor Author

aitap commented Aug 29, 2024 via email

@MichaelChirico MichaelChirico marked this pull request as draft September 12, 2024 03:55
@MichaelChirico MichaelChirico changed the title Use Rf_isCharASCII for IS_ASCII instead of testing LEVELS on R >= 4.5 Use Rf_charIsASCII for IS_ASCII instead of testing LEVELS on R >= 4.5 Dec 9, 2024
@MichaelChirico MichaelChirico marked this pull request as ready for review March 4, 2025 17:11
@@ -42,7 +42,11 @@
/* we mean the encoding bits, not CE_NATIVE in a UTF-8 locale */
#define IS_UTF8(x) (getCharCE(x) == CE_UTF8)
#define IS_LATIN(x) (getCharCE(x) == CE_LATIN1)
#define IS_ASCII(x) (LEVELS(x) & 64) // API expected in R >= 4.5
#if R_VERSION < R_Version(4, 5, 0) || R_SVN_REVISION < 86789
Member

Keeping this here even though there's another (4,5,0) check above at L21, because:

  • The R_SVN_REVISIONs are different
  • The IS_UTF8() and IS_LATIN() macros are also right here
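The version gate discussed here can be sketched in isolation. The block below is not the PR's code: every `SK_`/`sk_` name is a hypothetical stand-in so that it compiles without R's headers, and the `#else` branch is an assumption based on the PR title. `SK_VERSION` uses the same arithmetic as the `R_Version()` macro in `Rversion.h`. The revision clause presumably covers development builds of 4.5.0 that predate the new function.

```c
#include <assert.h>
#include <stddef.h>

/* Same arithmetic as R_Version(v, p, s) in R's Rversion.h. */
#define SK_VERSION(v, p, s) (((v) * 65536) + ((p) * 256) + (s))

/* Hypothetical build settings; a real build takes these from Rversion.h. */
#ifndef SK_R_VERSION
#define SK_R_VERSION SK_VERSION(4, 4, 1)
#endif
#ifndef SK_SVN_REVISION
#define SK_SVN_REVISION 86000
#endif

/* Hypothetical stand-in for a CHARSXP: only what the sketch needs. */
typedef struct {
    int levels;         /* stand-in for the bits read by LEVELS() */
    const char *bytes;  /* string payload, unused by the check itself */
} sk_charsxp;

/* Hypothetical stand-in for the accessor exported by newer R. */
int sk_char_is_ascii(const sk_charsxp *x) {
    assert(x != NULL);
    return (x->levels & 64) != 0;
}

#if SK_R_VERSION < SK_VERSION(4, 5, 0) || SK_SVN_REVISION < 86789
/* Older R, or a development build before the cutoff: test the bit directly. */
#  define SK_IS_ASCII(x) ((x)->levels & 64)
#  define SK_USING_FALLBACK 1
#else
/* Newer R: go through the exported function instead of the internals. */
#  define SK_IS_ASCII(x) sk_char_is_ascii(x)
#  define SK_USING_FALLBACK 0
#endif
```

With the default settings above the fallback branch is selected; compiling with `-DSK_R_VERSION=263424 -DSK_SVN_REVISION=86789` would select the function-based branch instead.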

Member

@MichaelChirico MichaelChirico left a comment

I think this is ready to go now, too, right? Please merge if you agree. Thanks!

codecov bot commented Mar 4, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.59%. Comparing base (93a5305) to head (090dc37).
Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6422   +/-   ##
=======================================
  Coverage   98.59%   98.59%           
=======================================
  Files          79       79           
  Lines       14661    14661           
=======================================
  Hits        14455    14455           
  Misses        206      206           


@aitap aitap merged commit 6a8634e into master Mar 5, 2025
9 of 11 checks passed
@aitap aitap deleted the R_4_5_isCharASCII branch March 5, 2025 10:19

3 participants