Research: Impact of additional WebP images on upload #289
Comments
I think enabling WebP should be an option in Settings/Media, with a warning to staff that the total number of images will increase by x%. That would be great and keeps accessibility in mind. |
Glad to see this research project! Regarding the question "What's the average size of a media library from a WordPress site?" -- Averages can hide important details, so I think it's important to dig into this question deeper. Since sites with larger libraries will be the ones impacted most, looking at "averages" doesn't go far enough to help us understand what the impacts will be on any "above average" site. Once we understand what the "average" looks like, we need to know how many "above average" sites there are, how large their libraries are, and how much they will be impacted by ~doubling the number of images on the server. Said another way, what does the distribution curve actually look like? |
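To illustrate why the distribution matters more than the average, here is a minimal sketch in Python. The sample sizes are made up for illustration and are not data from this research; `summarize_library_sizes` is a hypothetical helper name.

```python
import statistics


def summarize_library_sizes(sizes_gb):
    """Return the mean, median, and upper percentiles of media library sizes (GB)."""
    ordered = sorted(sizes_gb)
    # quantiles(n=100) returns the 99 cut points P1..P99.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p90": cuts[89],
        "p99": cuts[98],
        "max": ordered[-1],
    }


# ASSUMPTION: invented sample of 100 sites, mostly small with a long tail.
sample_sizes_gb = [0.2] * 60 + [1.0] * 25 + [5.0] * 10 + [40.0] * 4 + [90.0]
summary = summarize_library_sizes(sample_sizes_gb)

for name, value in summary.items():
    print(f"{name}: {value:.2f} GB")
```

In a skewed sample like this one, the mean sits well above the median, and the upper percentiles describe the sites that would feel a doubling of image files the most.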
A point of semantics, but “Impact of additional WebP images on upload” is different from “Impact of additional WebP images on existing uploads”. The CPU impact, and the resulting processing delay, from generating thumbnails in alternative image formats (WebP, AVIF) does matter; it’s just a bit of a different issue. Years ago I had a client who posted breaking news (think BuzzFeed) that included very high resolution images (yes, think paparazzi images, although I had another client who posted sneaker/shoe news with the same problem), and the delay of 1 minute or more processing those images on upload caused them both great frustration. It would often cause them to miss out on being “first” to publish and hence first to ping Google News for indexing. So the processing time that delays getting to the publish button does matter and should be considered. However, it seemed to me the bigger concern in last week’s Slack chat was about the file system impact of retroactively generating alternative format thumbnails for all existing media library images. That impact is less about CPU/time and all about file system size (GBs, file/inode counts). Just want to be crystal clear that there are two very different impacts here, one “on upload” and the other “on thumbnail regeneration”. |
@jb510 CPU impact and processing delay are surely a point of discussion, but there are many server configurations to test with, and it is impossible to take all of them into consideration. We account for the additional load on the server by allowing more retries to complete the upload. It is more of a hosting and server configuration discussion than a question of the impact of additional WebP image uploads on storage space. We are gathering data on how this will impact users in terms of hosting space, limits, and the associated costs. |
That's not impossible. It's a very simple measurement. Does processing WebP on upload increase wait time on upload by 10%? 50%? 1000%? It has an easily measurable effect and should be calculated and considered. I'm not talking about the time spent uploading the original; that's bandwidth dependent. I'm just talking about the processing time to generate JPEG image sub-sizes vs. JPEG and WebP sub-sizes. There is a strong argument that none of that should be happening in a blocking way anyway; it ought to be moved to on-demand/just-in-time generation. But this effect should be measured and considered, given how it currently works in relation to the user experience of uploading files to a post.
File system impact is important for us to understand as well, but again, more so in the context of regenerating thumbs on existing large libraries. The incremental extra space taken by each upload over time is easily manageable. Even if each upload takes double the space, the impact is predictable over time. The failure mode is what happens to an existing library that is 90GB when WP goes to regenerate all those thumbnails at once, doubling the size of the entire media library. Both are very different, but very important, impacts to measure. |
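The contrast between gradual growth and a one-time regeneration can be sketched with simple arithmetic. This is a rough illustration only: the 90GB figure and the worst-case doubling come from the comment above, while the monthly upload volume and both function names are assumptions.

```python
def size_after_regeneration(library_gb, extra_ratio=1.0):
    """One-time jump: every existing image gains its extra WebP files at once.

    extra_ratio=1.0 models the worst-case doubling described in the comment.
    """
    return library_gb * (1.0 + extra_ratio)


def size_after_months(library_gb, monthly_upload_gb, months, extra_ratio=1.0):
    """Gradual growth: only new uploads carry the extra WebP files."""
    return library_gb + monthly_upload_gb * (1.0 + extra_ratio) * months


existing_gb = 90.0        # library size mentioned in the comment
monthly_upload_gb = 1.0   # ASSUMPTION: hypothetical upload volume

regenerated = size_after_regeneration(existing_gb)
one_year_incremental = size_after_months(existing_gb, monthly_upload_gb, months=12)

print(f"After regenerating everything at once: {regenerated:.0f} GB")
print(f"After 12 months of new uploads only:   {one_year_incremental:.0f} GB")
```

Under these assumptions the incremental path grows predictably month by month, while regeneration adds the whole increase in a single step, which is the failure mode the comment describes.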
I do agree with @jb510. This is a highly important measurement that impacts all sites. I did some tests on a fresh local environment with the Twenty Twenty-Two theme and noticed an increase in the overall time taken to upload and process the images. I ran the same tests on a similar setup in an InstaWP environment and the results were almost the same. With WebP images disabled, for a sample JPEG image, the total time taken for the AJAX call was around 3.1 seconds, and this rose to an average of 7 seconds when WebP images were enabled. Similarly, for another example JPEG image, the time was 4.2 seconds on average when WebP images were disabled, and it rose to 8.1 seconds when WebP images were enabled. Screenshot attached. As we can see here, there is an overall increase of more than 80% in the total time taken, and that total includes the image upload time too. If we dig further, measure the upload time, and remove it from this scenario, the increase in processing time will be more than 100%. This was a test case with a minimal setup and a minimum number of registered image sizes, and the difference is still quite significant. In a real-world scenario where many more image sizes are involved, generating the WebP images alone will take significant time, which will affect the editorial flow and needs to be taken into consideration and accounted for accordingly. The better way forward would be to offload this process into some sort of cron job so that the editorial flow is not impacted. We might need to investigate this direction a bit more. |
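The percentages behind these timings can be checked with a short calculation. The four timings are the ones reported in the comment; the upload time used in the last step was not measured there and is purely an assumption for illustration.

```python
def percent_increase(before_s, after_s):
    """Relative increase from before_s to after_s, in percent."""
    return (after_s - before_s) / before_s * 100.0


def processing_only_increase(before_s, after_s, upload_s):
    """Same comparison after subtracting a fixed upload time from both totals."""
    return percent_increase(before_s - upload_s, after_s - upload_s)


# (WebP disabled, WebP enabled) total AJAX times in seconds, as reported.
reported = [(3.1, 7.0), (4.2, 8.1)]

for before, after in reported:
    print(f"{before}s -> {after}s: +{percent_increase(before, after):.0f}% total time")

# ASSUMPTION: a hypothetical 1.0 s of pure upload time in each total.
assumed_upload_s = 1.0
for before, after in reported:
    pct = processing_only_increase(before, after, assumed_upload_s)
    print(f"{before}s -> {after}s: +{pct:.0f}% processing time (assumed upload time)")
```

The reported totals work out to increases of roughly 126% and 93%. Because the upload time is the same with or without WebP, subtracting any positive upload time from both totals makes the relative increase in processing time larger still.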
@jb510 Interesting point! This issue is more related to the fact that even though image processing is happening in "the background" when you upload an image in the editor or media library, you can't really use the image until sub-size generation completes. In fact, if you drag an image into Gutenberg today and publish the post before processing completes, the resulting image is broken (see WordPress/gutenberg#39223). :( Overall this isn't great, and might be something we can improve as part of the work in #24 with background image processing. For now this ticket is focused on the storage requirements of the additional images. The tradeoff for the slower uploads is faster load times, a reasonable tradeoff for most users. For your paparazzi photographer, resizing the image locally to the desired size and uploading will probably get the image published faster in either case. |
Just a minor point of clarification about this situation. The inability to "use" the image in the editor is less of an issue here than the "wait time", where the author is effectively blocked from continuing with what they were doing (writing) while they stare at a progress bar for an increasingly long time. As others have pointed out, that UX is awful in its current state and would get 2x worse when generating WebPs. As for missing sub-sizes, Gutenberg blocks ought to be better written to handle use cases where sub-sizes are permanently or temporarily missing. That doesn't seem a directly relevant issue here, though. Bringing it back to the file system issues, I feel the "best" fix for BOTH these issues would be generating those sub-sizes on demand in the background if and when they are first requested, rather than pre-rendering every possible size to disk just in case at the time of upload. A second-best fix would be "shiny uploads", where it's completely non-blocking to the user. |
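The on-demand idea can be sketched as "generate on first request, then reuse the file". This is a language-neutral illustration written in Python, not WordPress code: `get_or_create_subsize`, the file naming scheme, and the stand-in renderer are all hypothetical.

```python
import tempfile
from pathlib import Path


def get_or_create_subsize(original, width, fmt, render):
    """Return the path of a sub-size, rendering it only if it does not exist yet.

    `render` is any callable (original, width, fmt) -> bytes; it stands in for
    a real image library, which this sketch deliberately does not depend on.
    """
    target = original.with_name(f"{original.stem}-{width}w.{fmt}")
    if not target.exists():
        target.write_bytes(render(original, width, fmt))
    return target


# Demo with a fake renderer that only records how often it is called.
render_calls = []


def fake_render(original, width, fmt):
    render_calls.append((original.name, width, fmt))
    return b"fake image bytes"


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "photo.jpg"
    source.write_bytes(b"original upload")

    first = get_or_create_subsize(source, 300, "webp", fake_render)
    second = get_or_create_subsize(source, 300, "webp", fake_render)
    files_on_disk = sorted(p.name for p in Path(tmp).iterdir())

print(files_on_disk, "renders:", len(render_calls))
```

Only the sizes and formats that are actually requested end up on disk, and the second request reuses the stored file. A real implementation would also need to handle concurrent first requests and cache invalidation, which this sketch ignores.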
Re:
Also relevant is considering what the impacts might be on the plans themselves. Even hosts that offer "unlimited" storage have to contend with the reality of hardware, and are no doubt deciding which plans get to be "unlimited" based on certain assumptions about how WordPress media library works today, and how big a site is likely to get. A 70% increase in that size is likely to change the calculation and may move the needle, either by making unlimited plans more expensive, or by moving "unlimited" further up the pricing tier. |
@tiffanybridge good point - this is an important question to answer in the research here, how much will hosts actually be impacted by the increased storage requirements? I created a survey and shared in the hosting community channel in part to try to answer this question. |
Enabling WebP images on upload would make the most sense on sites that have lower image storage usage in the uploads folder. If a site has 100GB+ of images in the uploads directory and a high media library count, then WebP image conversion on upload should not be enabled by default and should be opt-in. |
I'm doing maintenance for client sites and have some of them hosted on my own server. I'm seeing a couple of issues here, if webp creation/usage would be enabled by default for existing images.
So I would need to find a way to disable automatic creation of WebP files if this increase in disk usage and/or inodes is not solved. Of course I understand that smaller sites have a great impact on performance and even sustainability, but for clients it can/will have a financial impact, and for hosts it has a negative sustainability impact, as they need more disk storage or even more servers, using more power, etc. Although the overall balance may be in favor of enabling WebP, the negative impact should not be ignored. So it definitely needs to be an opt-in solution for existing sites, and new sites should be able to opt out until the whole internet is ready for WebP and we can remove JPG. |
Research complete; closing this issue. |
This issue is for research and analysis related to the concern that the new “Enabling WebP by default” feature creates too many files. There is a lot of concern that doubling the number of image files will result in increased hosting costs, sites running out of disk space (or “inodes”), or failed backups.
We need to conduct research to get data on this topic, to answer the following questions:
Once we have data on the above points, we can review and discuss further how the additional storage needed affects end users.
NOTE: This issue is for tracking research. To share more general feedback on the WebP by default proposal, please comment on our follow-up post here.