Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON API: Fix for non-UTF-8 data #13225

Merged
merged 1 commit into from
Aug 14, 2019

Conversation

mdawaffe
Copy link
Member

json_encode() can only encode UTF-8 strings. Anything else will cause json_encode() to return false.

Core addresses this with wp_json_encode(), which converts all data to UTF-8 before calling json_encode().

WordPress.com addresses this with similar recursive pre-processing. (WordPress.com's solution here is different than Core's because it predates Core's use of json_encode() for anything interesting.)

This could have been fixed a long time ago :) I suspect it hasn't come up often because much of the data we return in JSON API requests comes from cached data, which is cleaned up by WordPress.com's pre-processing. At any rate, it needs fixing.

Changes proposed in this Pull Request:

Use wp_json_encode() instead of json_encode() so that oddly broken data doesn't completely break JSON API requests.

Is this a new feature or does it add/remove features to an existing part of Jetpack?

No. Bug fix.

Testing instructions:

  1. On a test site, wipe your database and reinstall with defined( 'DB_CHARSET', 'latin1' ). For those using the docker container:
    1. yarn docker:uninstall
    2. Set DB_CHARSET to 'latin1' in wp-config.php.
    3. yarn docker:install
  2. yarn docker:wp db cli
  3. UPDATE wp_posts SET post_content = UNHEX( '48656C6C6FF7' ) WHERE ID = 1; The last byte (F7) is a byte that cannot exist in UTF-8 encoded data.
  4. Connect Jetpack
  5. curl --silent --dump-header /dev/stderr YOUR_SITE?rest_route=/wp/v2/posts/1
  6. See that .content.rendered contains Hello?. The last byte of the post's contents has been converted to an ASCII ?. This is the expected output: json_encode() cannot encoded non-UTF-8 strings.
  7. curl --silent --dump-header /dev/stderr 'https://public-api.wordpress.com/rest/v1.1/sites/YOUR_SITE/posts/1

Prior to this patch, see an error:

{"error":"jetpack_response_error","message":"A malformed response was received from the Jetpack site."}

After this patch, see the correct response: a JSON blob with .content containing Hello?.

Note that you will probably want to switch back to define( 'DB_CHARSET', 'utf8' ) and uninstall/reinstall again after testing.

Proposed changelog entry for your changes:

  • Fix API responses which contain malformed (non-UTF-8) data.

`json_encode()` can only encode UTF-8 strings. Anything else will cause
`json_encode()` to return `false`.

Core addresses this with `wp_json_encode()`, which converts all data to
UTF-8 before calling `json_encode()`.

WordPress.com addresses this with similar recursive pre-processing.

Let's use `wp_json_encode()` instead of `json_encode()` so that oddly
broken data doesn't completely break JSON API requests.
@mdawaffe mdawaffe added [Type] Bug When a feature is broken and / or not performing as intended [Feature] WPCOM API [Status] Needs Review To request a review from fellow Jetpack developers. Label will be renamed soon. labels Aug 14, 2019
@mdawaffe mdawaffe added this to the 7.7 milestone Aug 14, 2019
@mdawaffe mdawaffe requested a review from a team August 14, 2019 03:03
@mdawaffe mdawaffe self-assigned this Aug 14, 2019
@matticbot
Copy link
Contributor

Caution: This PR has changes that must be merged to WordPress.com
Hello mdawaffe! These changes need to be synced to WordPress.com - If you 're an a11n, please commandeer, review, and approve D31505-code before merging this PR. Thank you!

@jetpackbot
Copy link
Collaborator

Thank you for the great PR description!

When this PR is ready for review, please apply the [Status] Needs Review label. If you are an a11n, please have someone from your team review the code if possible. The Jetpack team will also review this PR and merge it to be included in the next Jetpack release.

Scheduled Jetpack release: September 3, 2019.
Scheduled code freeze: August 27, 2019

Generated by 🚫 dangerJS against c21cc75

@mdawaffe mdawaffe merged commit 4a15164 into master Aug 14, 2019
@mdawaffe mdawaffe deleted the fix/json-api-responses-for-non-utf8-data branch August 14, 2019 23:14
@matticbot matticbot added [Status] Needs Changelog and removed [Status] Needs Review To request a review from fellow Jetpack developers. Label will be renamed soon. labels Aug 14, 2019
jeherve added a commit that referenced this pull request Aug 23, 2019
jeherve added a commit that referenced this pull request Aug 27, 2019
* 7.7 changelog: initial set of changes

* Changelog: add #13154

* Changelog: add #13134

* Changelog: add #12699 and many others

* Changelog: add #13127

* Changelog: add #13167

* Changelog: add #13225

* Changelog: add #13179

* Changelog: add #13173

* Changelog: add #13232

* Changelog: add #13137

* Changelog: add #13172

* Changelog: add #13182

* Changelog: add #13200

* Changelog: add #13244

* Changelog: add #13267

* Changelog: add #13204

* changelog: add #13205

* Changelog: add #13183

* Changelog: add #13278

* Changelog: add #13162

* Changelog: add #13268

* Changelog: add #13286

* Changelog: add #13273

* Changelog: add #12474

* Changelog: add #13085

* Changelog: add #13266

* Changelog: add #13306

* Changelog: add #13311

* Changelog: add #13302

* Changelog: add #13196

* Changelog: add #12733

* Changelog: add #13261

* Changelog: add #13322

* Changelog: add #13333

* Changelog: add #13335
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] WPCOM API Touches WP.com Files [Type] Bug When a feature is broken and / or not performing as intended
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants