perf(encoding): improve base32 encode/decode performance #6479

BlackAsLight · 2025-03-13T06:47:35Z

This pull request brings the performance added in #6471 to the stable base32 functions, but in a non-breaking manner. The only difference a user would find is a difference in error message if providing an invalid encoded string.

codecov · 2025-03-13T06:50:47Z

Codecov Report

Attention: Patch coverage is 92.30769% with 2 lines in your changes missing coverage. Please review.

Project coverage is 95.29%. Comparing base (5fa127e) to head (9bbfb81).
Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
encoding/base32.ts	92.30%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6479      +/-   ##
==========================================
- Coverage   95.32%   95.29%   -0.03%     
==========================================
  Files         577      576       -1     
  Lines       43417    43256     -161     
  Branches     6485     6467      -18     
==========================================
- Hits        41386    41222     -164     
- Misses       1992     1995       +3     
  Partials       39       39

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

kt3k · 2025-03-17T05:24:32Z

encoding/base32_test.ts

@@ -37,9 +37,9 @@ Deno.test({
  name: "decodeBase32() throws on bad length",
  fn() {
    assertThrows(
-      () => decodeBase32("OOOO=="),


The result of this input seems changed by this PR. Is that intentional?

Yes. The way the new code detects errors is different where the old input would be valid. The old input had the wrong length based off the encoding + padding but the new code doesn't care about the padding as long as it isn't too much padding. Four encoded characters is valid here while three is invalid

the new code doesn't care about the padding as long as it isn't too much padding.

I'm not sure if we should do that change. I checked the standard libraries of some other languages, but they seem checking padding length. Probably we should keep that check?

$ cat enc.py import base64 base64.b32decode("OOOO==") $ python3 enc.py Traceback (most recent call last): File "/Users/kt3k/denoland/std/enc.py", line 2, in <module> base64.b32decode("OOOO==") File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/base64.py", line 254, in b32decode return _b32decode(_b32alphabet, s, casefold, map01) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/base64.py", line 210, in _b32decode raise binascii.Error('Incorrect padding') binascii.Error: Incorrect padding

$ cat main.go package main import ( "encoding/base32" "fmt" ) func main() { enc := base32.NewEncoding("ABCDEFGHIJKLMNOPQRSTUVWXYZ234567") res, e := enc.DecodeString("OOOO==") if e != nil { fmt.Printf("%s", e) } else { fmt.Printf("%s", res) } } $ go run main.go illegal base32 data at input byte 6

It was more written this way for convince. The code for it is here: https://github.com/denoland/std/blob/main/encoding/_common32.ts#L107-L119 where it really is just seeing how much padding it needs to slice off before doing the actual decoding.

I can understand wanting to be in align with other std's but I also don't see a benefit in being strict here with the padding when decoding when you know what the missing bytes are.

The change to base64 operates in the same manner here. Where base64url may allow omitting padding, but base64 requires it, the decoder doesn't care if padding is short

I think this kind of change needs more broader consensus from the community and core team, and I'm not in favor of landing this as part of perf change. Can you add back the simple check of length like the below and create an issue for discussing that?:

if (len % 8 > 0) { throw new Error( `Cannot decode base32 string as the length must be a multiple of 8: received length ${len}`, ); }

The change to base64 operates in the same manner here. Where base64url may allow omitting padding, but base64 requires it, the decoder doesn't care if padding is short

Base64 decoder seems ignoring missing padding before the recent rewrite (#6461). That might be a good reason to remove padding length check from decodeBase32. Anyway I think we need to discuss this topic independently somewhere else.

kt3k

LGTM. Nice work!

perf(encoding): brought to stable base32

c38164f

BlackAsLight requested a review from kt3k as a code owner March 13, 2025 06:47

github-actions bot added the encoding label Mar 13, 2025

kt3k reviewed Mar 17, 2025

View reviewed changes

add(encoding): length check to decodeBase32

5959f8a

BlackAsLight mentioned this pull request Mar 20, 2025

Should decodeBase32 be strict on requiring padding? #6498

Open

Merge branch 'main' into encoding_stable_base32

9bbfb81

kt3k approved these changes Mar 21, 2025

View reviewed changes

kt3k merged commit 922c2f4 into denoland:main Mar 21, 2025
18 checks passed

BlackAsLight deleted the encoding_stable_base32 branch March 21, 2025 06:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(encoding): improve base32 encode/decode performance #6479

perf(encoding): improve base32 encode/decode performance #6479

BlackAsLight commented Mar 13, 2025

codecov bot commented Mar 13, 2025 •

edited

Loading

kt3k Mar 17, 2025

BlackAsLight Mar 17, 2025

kt3k Mar 17, 2025

BlackAsLight Mar 17, 2025

kt3k Mar 17, 2025

kt3k left a comment

perf(encoding): improve base32 encode/decode performance #6479

perf(encoding): improve base32 encode/decode performance #6479

Conversation

BlackAsLight commented Mar 13, 2025

codecov bot commented Mar 13, 2025 • edited Loading

Codecov Report

kt3k Mar 17, 2025

Choose a reason for hiding this comment

BlackAsLight Mar 17, 2025

Choose a reason for hiding this comment

kt3k Mar 17, 2025

Choose a reason for hiding this comment

BlackAsLight Mar 17, 2025

Choose a reason for hiding this comment

kt3k Mar 17, 2025

Choose a reason for hiding this comment

kt3k left a comment

Choose a reason for hiding this comment

codecov bot commented Mar 13, 2025 •

edited

Loading