Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(encoding): improve base32 encode/decode performance #6479

Merged
merged 3 commits into from
Mar 21, 2025

Conversation

BlackAsLight
Copy link
Contributor

This pull request brings the performance added in #6471 to the stable base32 functions, but in a non-breaking manner. The only difference a user would find is a difference in error message if providing an invalid encoded string.

@BlackAsLight BlackAsLight requested a review from kt3k as a code owner March 13, 2025 06:47
Copy link

codecov bot commented Mar 13, 2025

Codecov Report

Attention: Patch coverage is 92.30769% with 2 lines in your changes missing coverage. Please review.

Project coverage is 95.29%. Comparing base (5fa127e) to head (9bbfb81).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
encoding/base32.ts 92.30% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6479      +/-   ##
==========================================
- Coverage   95.32%   95.29%   -0.03%     
==========================================
  Files         577      576       -1     
  Lines       43417    43256     -161     
  Branches     6485     6467      -18     
==========================================
- Hits        41386    41222     -164     
- Misses       1992     1995       +3     
  Partials       39       39              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@@ -37,9 +37,9 @@ Deno.test({
name: "decodeBase32() throws on bad length",
fn() {
assertThrows(
() => decodeBase32("OOOO=="),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of this input seems changed by this PR. Is that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The way the new code detects errors is different where the old input would be valid. The old input had the wrong length based off the encoding + padding but the new code doesn't care about the padding as long as it isn't too much padding. Four encoded characters is valid here while three is invalid

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the new code doesn't care about the padding as long as it isn't too much padding.

I'm not sure if we should do that change. I checked the standard libraries of some other languages, but they seem checking padding length. Probably we should keep that check?

$ cat enc.py 
import base64
base64.b32decode("OOOO==")
$ python3 enc.py
Traceback (most recent call last):
  File "/Users/kt3k/denoland/std/enc.py", line 2, in <module>
    base64.b32decode("OOOO==")
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/base64.py", line 254, in b32decode
    return _b32decode(_b32alphabet, s, casefold, map01)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/base64.py", line 210, in _b32decode
    raise binascii.Error('Incorrect padding')
binascii.Error: Incorrect padding
$ cat main.go 
package main

import (
  "encoding/base32"
  "fmt"
)

func main() {
  enc := base32.NewEncoding("ABCDEFGHIJKLMNOPQRSTUVWXYZ234567")
  res, e := enc.DecodeString("OOOO==")
  if e != nil {
    fmt.Printf("%s", e)
  } else {
    fmt.Printf("%s", res)
  }
}
$ go run main.go 
illegal base32 data at input byte 6

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was more written this way for convince. The code for it is here: https://github.com/denoland/std/blob/main/encoding/_common32.ts#L107-L119 where it really is just seeing how much padding it needs to slice off before doing the actual decoding.

I can understand wanting to be in align with other std's but I also don't see a benefit in being strict here with the padding when decoding when you know what the missing bytes are.

The change to base64 operates in the same manner here. Where base64url may allow omitting padding, but base64 requires it, the decoder doesn't care if padding is short

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this kind of change needs more broader consensus from the community and core team, and I'm not in favor of landing this as part of perf change. Can you add back the simple check of length like the below and create an issue for discussing that?:

  if (len % 8 > 0) {
    throw new Error(
      `Cannot decode base32 string as the length must be a multiple of 8: received length ${len}`,
    );
  }

The change to base64 operates in the same manner here. Where base64url may allow omitting padding, but base64 requires it, the decoder doesn't care if padding is short

Base64 decoder seems ignoring missing padding before the recent rewrite (#6461). That might be a good reason to remove padding length check from decodeBase32. Anyway I think we need to discuss this topic independently somewhere else.

Copy link
Member

@kt3k kt3k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Nice work!

@kt3k kt3k merged commit 922c2f4 into denoland:main Mar 21, 2025
18 checks passed
@BlackAsLight BlackAsLight deleted the encoding_stable_base32 branch March 21, 2025 06:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants