Cache temp files created from base64 data #3197

Merged
merged 23 commits into from
Feb 15, 2023

abidlabs
Member

@abidlabs abidlabs commented Feb 15, 2023

This PR slightly changes the way that we generate temp files from base64 input data by giving them a deterministic path based on the contents of the data. In other words, if a user submits the same input twice, it will produce the same temp file. If the temp file already exists, then the conversion from base64 to binary (which is very time-consuming) can be skipped and the existing file can be returned directly.
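To make the idea concrete, here is a minimal sketch of content-addressed temp files. It is not the code from this PR: the function name `base64_to_cached_file`, the `b64_cache_demo` directory, and the choice of SHA-1 are all assumptions made for illustration. The key point is that the hash is computed over the base64 string itself, so a cache hit never needs to decode the payload.

```python
import base64
import hashlib
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def base64_to_cached_file(
    b64_data: str, suffix: str = "", cache_dir: Optional[str] = None
) -> Tuple[str, bool]:
    """Return (path, cache_hit) for the file holding the decoded payload.

    Hypothetical helper: the path is derived from a hash of the base64
    text, so identical inputs always map to the same file.
    """
    root = Path(cache_dir or tempfile.gettempdir()) / "b64_cache_demo"
    root.mkdir(parents=True, exist_ok=True)

    # Hash the encoded text directly; no decoding is needed to build the path.
    digest = hashlib.sha1(b64_data.encode("utf-8")).hexdigest()
    path = root / f"{digest}{suffix}"

    if path.exists():
        # Same input seen before: skip the base64-to-binary conversion.
        return str(path), True

    # First time we see this input: decode once and store the result.
    path.write_bytes(base64.b64decode(b64_data))
    return str(path), False
```

Calling the helper twice with the same payload should return the same path, with the second call reporting a cache hit. A production version would likely also write to a temporary name and rename it into place, so that a concurrent request never reads a half-written file.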

Fixes: #3189

Probably the easiest way to test this would be to run a demo like this:

import gradio as gr

gr.Interface(lambda x:x, gr.Video(source="webcam"), gr.Video()).launch()
  • Record a video that is at least a few seconds long
  • Click the submit button (note the inference time)
  • Click the submit button again with the same video (note the inference time, should be much faster on this branch)

@gradio-pr-bot
Collaborator

All the demos for this PR have been deployed at https://huggingface.co/spaces/gradio-pr-deploys/pr-3197-all-demos

@abidlabs abidlabs marked this pull request as ready for review February 15, 2023 04:36
@abidlabs abidlabs changed the title Cache temp files created from base64 Cache temp files created from base64 data Feb 15, 2023
Collaborator

@freddyaboulton freddyaboulton left a comment

Thanks for the fix @abidlabs !

Resolved review threads: CHANGELOG.md (outdated), gradio/components.py, test/test_processing_utils.py
@aliabid94
Collaborator

aliabid94 commented Feb 15, 2023

As I understand it, on the frontend side, we still have to upload any file. The only step we save is on the backend, where we don't have to convert the base64 stream back to a binary file.

While this PR does improve things, a better approach would be to cache on the frontend side. Right now, say you have a live interface with a Video input and a Slider input: once you attach a video, it gets re-sent every single time you move the slider. If the frontend can keep track of previously uploaded files and their file names on the backend (which should be very feasible with PR #3191), then we only send a file to the backend if it's a new file, which saves a lot more processing and network time.
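A rough sketch of that client-side bookkeeping, written in Python purely for illustration (the real frontend is not Python, and nothing here reflects how PR #3191 is implemented). The class `UploadCache` and the `upload_fn` callback are hypothetical names: the cache remembers which server-side file name each payload was stored under, and only calls the upload function for content it has not seen before.

```python
import hashlib
from typing import Callable, Dict


class UploadCache:
    """Hypothetical client-side cache: content hash -> server file name."""

    def __init__(self, upload_fn: Callable[[bytes], str]) -> None:
        # upload_fn is an assumed callback that sends the bytes to the
        # backend and returns the file name the backend stored them under.
        self._upload_fn = upload_fn
        self._server_names: Dict[str, str] = {}

    def get_server_name(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key not in self._server_names:
            # New content: upload once and remember the server-side name.
            self._server_names[key] = self._upload_fn(data)
        # Known content: reuse the name, no network transfer needed.
        return self._server_names[key]
```

In the slider scenario above, every slider move would call `get_server_name` with the same video bytes, but only the first call would trigger an upload; later events would just pass the remembered file name to the backend.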

I think the implementation above would remove the need for this PR.

@abidlabs
Member Author

Agreed @aliabid94 that would be the optimal solution. However, which components are we planning on doing this large file upload caching with? I thought we were only applying it to the File and UploadButton components -- is it reasonable to apply the same approach to all of the other components affected by this PR: Video, Audio, Model3D?

@abidlabs
Member Author

Synced with @aliabid94 and we'll merge this in while @aliabid94 works on a proper fix via the upload logic in #3191

@abidlabs
Member Author

Thanks for the review and feedback guys!

@abidlabs abidlabs merged commit 752ec0e into gradio-app:main Feb 15, 2023
Development

Successfully merging this pull request may close these issues.

Cache Data on the frontend side
4 participants