-
Notifications
You must be signed in to change notification settings - Fork 809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSON API: Fix for non-UTF-8 data #13225
Conversation
`json_encode()` can only encode UTF-8 strings. Anything else will cause `json_encode()` to return `false`. Core addresses this with `wp_json_encode()`, which converts all data to UTF-8 before calling `json_encode()`. WordPress.com addresses this with similar recursive pre-processing. Let's use `wp_json_encode()` instead of `json_encode()` so that oddly broken data doesn't completely break JSON API requests.
Caution: This PR has changes that must be merged to WordPress.com |
Thank you for the great PR description! When this PR is ready for review, please apply the Scheduled Jetpack release: September 3, 2019. |
* 7.7 changelog: initial set of changes * Changelog: add #13154 * Changelog: add #13134 * Changelog: add #12699 and many others * Changelog: add #13127 * Changelog: add #13167 * Changelog: add #13225 * Changelog: add #13179 * Changelog: add #13173 * Changelog: add #13232 * Changelog: add #13137 * Changelog: add #13172 * Changelog: add #13182 * Changelog: add #13200 * Changelog: add #13244 * Changelog: add #13267 * Changelog: add #13204 * changelog: add #13205 * Changelog: add #13183 * Changelog: add #13278 * Changelog: add #13162 * Changelog: add #13268 * Changelog: add #13286 * Changelog: add #13273 * Changelog: add #12474 * Changelog: add #13085 * Changelog: add #13266 * Changelog: add #13306 * Changelog: add #13311 * Changelog: add #13302 * Changelog: add #13196 * Changelog: add #12733 * Changelog: add #13261 * Changelog: add #13322 * Changelog: add #13333 * Changelog: add #13335
json_encode()
can only encode UTF-8 strings. Anything else will causejson_encode()
to returnfalse
.Core addresses this with
wp_json_encode()
, which converts all data to UTF-8 before callingjson_encode()
.WordPress.com addresses this with similar recursive pre-processing. (WordPress.com's solution here is different than Core's because it predates Core's use of
json_encode()
for anything interesting.)This could have been fixed a long time ago :) I suspect it hasn't come up often because much of the data we return in JSON API requests comes from cached data, which is cleaned up by WordPress.com's pre-processing. At any rate, it needs fixing.
Changes proposed in this Pull Request:
Use
wp_json_encode()
instead ofjson_encode()
so that oddly broken data doesn't completely break JSON API requests.Is this a new feature or does it add/remove features to an existing part of Jetpack?
No. Bug fix.
Testing instructions:
defined( 'DB_CHARSET', 'latin1' )
. For those using the docker container:yarn docker:uninstall
DB_CHARSET
to'latin1'
in wp-config.php.yarn docker:install
yarn docker:wp db cli
UPDATE wp_posts SET post_content = UNHEX( '48656C6C6FF7' ) WHERE ID = 1;
The last byte (F7
) is a byte that cannot exist in UTF-8 encoded data.curl --silent --dump-header /dev/stderr YOUR_SITE?rest_route=/wp/v2/posts/1
.content.rendered
containsHello?
. The last byte of the post's contents has been converted to an ASCII?
. This is the expected output:json_encode()
cannot encoded non-UTF-8 strings.curl --silent --dump-header /dev/stderr 'https://public-api.wordpress.com/rest/v1.1/sites/YOUR_SITE/posts/1
Prior to this patch, see an error:
After this patch, see the correct response: a JSON blob with
.content
containingHello?
.Note that you will probably want to switch back to
define( 'DB_CHARSET', 'utf8' )
and uninstall/reinstall again after testing.Proposed changelog entry for your changes: