
Use concrete Future types in public API #2970

Closed

Conversation

daxpedda
Contributor

Checklist

  • Run cargo clippy.
  • Run RUSTFLAGS=--cfg=web_sys_unstable_apis cargo clippy --target wasm32-unknown-unknown if applicable.
  • Add change to CHANGELOG.md. See simple instructions inside file.

Connections
None.

Description
I am trying to implement a render state that can be polled to finish request_adapter and request_device instead of using await. This has come up because of rust-windowing/winit#2051: we now have to call request_adapter and request_device inside the event loop, which isn't allowed to block.
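
To make the motivation concrete, here is a minimal sketch of such a render state, assuming the future currently has to be stored as a boxed trait object because its concrete type can't be named. The RenderState enum, the poll_init method, and the use of the futures crate's noop_waker are illustrative choices, not part of this PR:

    use std::future::Future;
    use std::pin::Pin;
    use std::task::{Context, Poll};

    use futures::task::noop_waker;

    // Hypothetical render state driven from inside the winit event loop.
    enum RenderState {
        // With an impl Future return type, storing the future requires type erasure.
        Initializing(Pin<Box<dyn Future<Output = Option<wgpu::Adapter>>>>),
        Ready(wgpu::Adapter),
    }

    impl RenderState {
        // Poll once per event-loop iteration instead of blocking on the future.
        // (The "no adapter found" case is ignored for brevity.)
        fn poll_init(&mut self) {
            if let RenderState::Initializing(future) = self {
                let waker = noop_waker();
                let mut cx = Context::from_waker(&waker);
                if let Poll::Ready(Some(adapter)) = future.as_mut().poll(&mut cx) {
                    *self = RenderState::Ready(adapter);
                }
            }
        }
    }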

Testing
Actually quite well tested by the existing infrastructure.

@daxpedda force-pushed the concrete-future-type branch from 52f67ac to 0da63f0 on August 17, 2022 08:55
@jimblandy
Member

I don't think I understand this change. What problem is solved by replacing the impl Future return type with the concrete types?

@daxpedda
Contributor Author

daxpedda commented Aug 20, 2022

Currently, one can't save the returned Future and poll it later, or wrap it in another type, without an allocation, e.g. by boxing it as a Box<dyn Future>.

This is a very common pattern in Rust async code: either things are implemented with -> impl Future, or a handwritten Future is provided as a more optimized version that avoids having to treat the future as a trait object, which usually means an allocation if you want to store it somewhere.
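
For comparison, here is a rough sketch of the difference for someone who wants to store the future in a struct. RequestAdapterFuture is a hypothetical name standing in for whatever concrete type this PR would expose, so the second struct is illustrative only and would not compile against today's API:

    // Today: the impl Future return type can only be stored type-erased, which allocates.
    struct Erased {
        pending: std::pin::Pin<Box<dyn std::future::Future<Output = Option<wgpu::Adapter>>>>,
    }

    // With a public, nameable future type (hypothetical name), no Box is needed.
    struct Concrete {
        pending: wgpu::RequestAdapterFuture,
    }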

@jimblandy
Member

Currently, one can't save the returned Future and poll it later, or wrap it in another type, without an allocation, e.g. by boxing it as a Box<dyn Future>.

Okay, I see. Can you say more about winit's requirements, such that something like this isn't adequate?

    let adapter = pollster::block_on(instance.request_adapter(&options)); // options: a RequestAdapterOptions
    let device = pollster::block_on(adapter.request_device(desc, None));

@daxpedda
Contributor Author

Since rust-windowing/winit#1922, we have known that we should not create a window until Event::NewEvents(StartCause::Init). This affects wgpu initialization because it requires a window, so initialization of wgpu has to happen inside the loop.

    let adapter = pollster::block_on(instance.request_adapter(&options)); // options: a RequestAdapterOptions
    let device = pollster::block_on(adapter.request_device(desc, None));

This was perfectly adequate before, because it happened outside the loop. Now that it has to happen inside the loop, we can't just block anymore; blocking inside the event loop isn't allowed, as was already noted in this context in rust-windowing/winit#1992 (comment).

There are alternative solutions, like doing the initialization on a thread; on the web you have to do it via a spawned Future. This adds a bit of complexity, but it might also be easier for some users.
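
As a sketch of the spawned-Future approach on the web (not part of this PR; the oneshot channel is just one way to hand the result back to the event loop):

    // Runs the async initialization on the browser's microtask queue and sends the
    // result back to the event loop through a oneshot channel.
    #[cfg(target_arch = "wasm32")]
    fn spawn_adapter_request(
        instance: wgpu::Instance,
        sender: futures::channel::oneshot::Sender<Option<wgpu::Adapter>>,
    ) {
        wasm_bindgen_futures::spawn_local(async move {
            let adapter = instance
                .request_adapter(&wgpu::RequestAdapterOptions::default())
                .await;
            let _ = sender.send(adapter);
        });
    }

The event-loop side can then check the corresponding Receiver with try_recv on each iteration.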

@jimblandy
Member

In every situation where you're using winit, those futures will always be immediately available, and the call to block_on will never actually block. These future types in the direct backend are Ready:

    type RequestAdapterFuture = Ready<Option<Self::AdapterId>>;
    #[allow(clippy::type_complexity)]
    type RequestDeviceFuture =
        Ready<Result<(Self::DeviceId, Self::QueueId), crate::RequestDeviceError>>;

(There's actually no reason for these methods to be async on native in the first place.)
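
For illustration, that immediacy means the value can be pulled out of the future with a single poll on native, e.g. via the futures crate's now_or_never. This is a sketch, not a wgpu API, and it relies on the future resolving on its first poll, which the Ready types above imply:

    // On native the adapter future resolves immediately, so one poll is enough.
    #[cfg(not(target_arch = "wasm32"))]
    fn request_adapter_sync(instance: &wgpu::Instance) -> Option<wgpu::Adapter> {
        use futures::FutureExt; // provides now_or_never

        instance
            .request_adapter(&wgpu::RequestAdapterOptions::default())
            .now_or_never()
            .expect("native adapter futures resolve on the first poll")
    }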

@daxpedda
Contributor Author

In every situation where you're using winit, those futures will always be immediately available, and the call to block_on will never actually block.

I'm using winit on wasm32-unknown-unknown in the browser as well.

These future types in the direct backend are Ready:

    type RequestAdapterFuture = Ready<Option<Self::AdapterId>>;
    #[allow(clippy::type_complexity)]
    type RequestDeviceFuture =
        Ready<Result<(Self::DeviceId, Self::QueueId), crate::RequestDeviceError>>;

(There's actually no reason for these methods to be async on native in the first place.)

For native that would work; it just doesn't work on the web because of the closure, which would need to be boxed since it takes a moved context. That's why the custom Futures were necessary.

@jimblandy
Member

Okay, I see now: winit::event_loop::EventLoop::run claims to take over the current thread, but on the web it actually throws a JS exception, so that control can return to the content event loop. The closure passed to run gets called from JavaScript event handlers, promise handler functions, and so on. I didn't expect that.

So the purpose of this change is to avoid calling Box::pin?

@daxpedda
Contributor Author

So the purpose of this change is to avoid calling Box::pin?

It's to avoid Box; you might need Box::pin down the line, but I never tried going that far.

@jimblandy
Member

My judgment is that it isn't worth the complexity of this change to avoid boxing a value, usually once in the execution of the program. wgpu and Naga are going to allocate all over the place, and we should accept an increase in lines of code only when the change makes an appreciable impact. But this change doesn't meet that threshold.

If another maintainer has a different opinion, feel free to re-open this.

@jimblandy jimblandy closed this Aug 22, 2022
@daxpedda
Contributor Author

wgpu and Naga are going to allocate all over the place, and we should accept an increase in lines of code only when the change makes an appreciable impact. But this change doesn't meet that threshold.

I have some ideas on how to make this PR better. If the problem is added complexity, I can reduce the amount of code and keep the logic local to where it was. I will update this PR so you can take a look and see if that changes your opinion.

@jimblandy
Member

I would take a change to have request_adapter and request_device return a boxed trait object (Box<dyn Future>).

@daxpedda
Contributor Author

I believe that's actually even worse: it would force an allocation even in cases where none is needed.

As far as I understand, the problem is that this PR just makes the code worse with no appreciable gain. So my suggestion would have been to figure out a way to not make the code worse and not introduce additional complexity; would that be acceptable?

@jimblandy
Member

I believe that's actually even worse: it would force an allocation even in cases where none is needed.

We're talking about two boxes, over (typically) the course of an entire execution. It's just not worth your time.

@daxpedda
Contributor Author

daxpedda commented Aug 22, 2022

I'm a bit confused: boxing them outright does literally nothing useful, at least nothing I can come up with. If a user needs a concrete type, they can easily wrap the future in a Box themselves.

It's just not worth your time.

Clearly it's worth the time to me; otherwise I wouldn't have made this PR. I completely agree, though, that saving two allocations at this point is completely irrelevant and doesn't improve performance in any measurable way.

But I think I got the message: this "improvement" isn't worth the added complexity, no matter how much I reduce that complexity. You know where to find me if you change your mind.

@jimblandy
Member

I'm sorry - it's not my place to say what's worth someone's time. You're correct that I don't think this change is a net win; I'd need to see more impact. Improvements in time, memory, and ergonomics are all on the table.

@daxpedda
Contributor Author

daxpedda commented Aug 25, 2022

I think we agree that there are no improvements in time or memory; personally, I made this change more for user experience than anything else.

Personally, when I write or review code, I look out for unnecessary allocations or reference counting. Every time I see a Box or Rc I ask myself if it's really necessary there. So when I encounter some that seem to have been avoidable upstream, I make a PR. Most of the time they do affect performance; not this time, though. But it does reduce my mental load.

There is also the case for increased complexity for the user by making it more difficult to name the type when they need to. This has also been discussed in the official API guidelines here: rust-lang/api-guidelines#60.


I think the crux of the problem is weighing the pros and cons here. The pro is very small, if present at all. The con, added code complexity, seems to be big enough for you to overshadow the pro. I did try to reduce the code complexity a bit, but found that the difference was negligible.

I guess rust-lang/rust#63063 would basically reduce the added code complexity to zero.


I want to point out that I do appreciate the skepticism and scrutiny here. I don't want the wgpu project to add unnecessary code complexity; I want to keep the project's code quality high and its maintainers focused on actual problems.
So thank you for taking the time to discuss this and for not being afraid to say no when necessary 👍.


Another idea that I just had was adding a native-only extension to allow for sync alternatives to these two functions. Would that be in scope?

Pros

  • No need for users to pull in another dependency like pollster (it has a very small impact though).
  • Clearer API, because on native these calls aren't really asynchronous: they resolve immediately and don't have any actual "waiting time" involved.

Cons
Again, increased code complexity. I believe, though, that this can be done fairly non-invasively.
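
A minimal sketch of what such a native-only extension could look like, assuming the wgpu signatures from the time of this PR (request_adapter returning a future of Option<Adapter>); the trait and method names are hypothetical:

    // Hypothetical native-only convenience; not an existing wgpu API.
    #[cfg(not(target_arch = "wasm32"))]
    pub trait InstanceBlockingExt {
        fn request_adapter_blocking(
            &self,
            options: &wgpu::RequestAdapterOptions,
        ) -> Option<wgpu::Adapter>;
    }

    #[cfg(not(target_arch = "wasm32"))]
    impl InstanceBlockingExt for wgpu::Instance {
        fn request_adapter_blocking(
            &self,
            options: &wgpu::RequestAdapterOptions,
        ) -> Option<wgpu::Adapter> {
            // On native the returned future resolves immediately, so this never actually waits.
            pollster::block_on(self.request_adapter(options))
        }
    }

A matching request_device_blocking would work the same way, and the cfg gate keeps the blocking variants out of web builds entirely.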

@jimblandy
Member

jimblandy commented Aug 25, 2022

That's actually where I'm leaning as well - simply have request_adapter_blocking and request_adapter_from_event_loop, where the former returns the adapter synchronously, and the latter would take a closure. The former would not be available in web builds, and the latter would not be available in native builds. (Yes, it seems like you could - but that introduces other problems that I don't want to get into in this comment.)

The catch is that some of the other wgpu maintainers are not on board with this proposal. They want wgpu to have an API that permits one to run exactly the same code on both native and the web. This is a perfectly reasonable goal, and I share it - except that it isn't actually possible. Even if the wgpu API itself is uniform, you end up having platform-conditional code around it. Trying to get rid of a bump in the carpet by stepping on it just moves the bump somewhere else. The other catch is - embarrassing confession - I have never actually written wgpu code to run on the web. So I'm concerned that there's some aspect of the situation I'm missing that they're seeing more clearly.

So basically:

  1. I agree with this idea. I think it's easy to explain, understand, implement, and use - more than the current API.

  2. To get people on board, it needs to be presented with a clear and fair-minded (steelmanned) exploration of the alternatives.

If someone were to call this bikeshedding (an unwarranted focus on a minor issue, simply because it's something everyone can understand and feels qualified to have an opinion about), I might feel like that criticism hits its mark. In my defense, I'd say that this API is literally the first thing that anyone trying wgpu needs to call, and I suspect it leaves an impression.

@Diggsey
Contributor

Diggsey commented Aug 25, 2022

I think at least one of the methods should be supported on both. That way if you need portability you can use the async function, and if you don't you can use the blocking function, but you never need to use both. Given that the async version already works on both, I don't see why it would be an issue to keep that on both.

@daxpedda
Contributor Author

I believe I'm missing a bit of context to fully understand every point you are making @jimblandy.

That's actually where I'm leaning as well - simply have request_adapter_blocking and request_adapter_from_event_loop, where the former returns the adapter synchronously, and the latter would take a closure.

I'm not sure how exactly you imagine request_adapter_from_event_loop working, but I feel this would be incredibly invasive and unnecessary, and I'm not sure what the user benefit would be. wgpu is not an app framework; it has to be paired with a lot of other things to actually make it work, winit for example. It seems to me that this wouldn't support that kind of use-case anymore and would be trying to make wgpu do things it isn't really made for.

Maybe I also completely misunderstood what this API is supposed to do.

The catch is that some of the other wgpu maintainers are not on board with this proposal. They want wgpu to have an API that permits one to run exactly the same code on both native and the web.

I agree with the other wgpu maintainers here. Having the exact same API for all platforms including web is one of the best parts of the Rust ecosystem experience I had so far.

Even if the wgpu API itself is uniform, you end up having platform-conditional code around it.

From personal experience I know that this isn't always necessary. In fact, most of the programs I've written with wgpu contain little to no platform-conditional code. For example, when you use winit with wgpu, you only have to append the canvas to the document - that's it. This is very non-invasive because it's just something added for a specific platform and doesn't change anything about the rest of the flow.
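
For reference, that single web-specific step looks roughly like this with the winit 0.27-era API (WindowExtWebSys::canvas returning the canvas directly; exact names may differ between versions):

    // Appends the winit-managed canvas to the document body; only needed on the web.
    #[cfg(target_arch = "wasm32")]
    fn attach_canvas(window: &winit::window::Window) {
        use winit::platform::web::WindowExtWebSys;

        let canvas = window.canvas();
        web_sys::window()
            .and_then(|win| win.document())
            .and_then(|doc| doc.body())
            .and_then(|body| body.append_child(&canvas).ok())
            .expect("couldn't append the canvas to the document body");
    }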

The catch is - embarrassing confession - I have never actually written wgpu code to run on the web. So I'm concerned that there's some aspect of the situation I'm missing, that they're seeing more clearly.

You should try it; with the addition of the WebGL backend it has become quite straightforward today!


I hope I didn't misunderstand anything and have explained my point of view clearly!

On how to move forward, I was really thinking of what @Diggsey was proposing: just adding a native-only method and leaving the async version as it is, because it's the one that is cross-platform.
