
Write timestamp lands on wrong side of barriers #2521

Open
fintelia opened this issue Mar 3, 2022 · 3 comments
Labels
area: performance How fast things go type: enhancement New feature or request

Comments

@fintelia
Contributor

fintelia commented Mar 3, 2022

Description
I've been experimenting with wgpu's write_timestamp functionality to measure the execution time of compute shaders. Unfortunately, the lack of control of when they happen in relation to pipeline barriers seems to be preventing them from being of much use.

Repro steps
Execute two compute dispatches, and measure how long the second one takes:

cpass.dispatch(...);
cpass.write_timestamp(...);
cpass.dispatch(...);
cpass.write_timestamp(...);

Expected vs observed behavior
I would expect that the difference between the timestamps would reflect the execution time of the second dispatch. Instead, I observed that the measurement would sometimes be an order of magnitude larger, presumably including much of the time executing the first dispatch.
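For context on what "the difference between the timestamps" means numerically: resolved timestamp queries return raw u64 ticks, which are scaled to nanoseconds by the queue's timestamp period (wgpu's `Queue::get_timestamp_period()`). A minimal sketch of that conversion, using hypothetical tick values in place of real query-set results:

```rust
// Sketch: converting two raw timestamp-query ticks into milliseconds.
// `period_ns` is the nanoseconds-per-tick value reported by
// Queue::get_timestamp_period(); the tick values below are hypothetical
// stand-ins for what resolving a QuerySet would return.
fn elapsed_ms(start_ticks: u64, end_ticks: u64, period_ns: f32) -> f64 {
    // Wrapping subtraction guards against counter wraparound.
    let delta_ticks = end_ticks.wrapping_sub(start_ticks);
    (delta_ticks as f64 * period_ns as f64) / 1_000_000.0
}

fn main() {
    let start = 1_000_000u64;
    let end = 1_250_000u64; // 250_000 ticks later
    let period = 40.0; // hypothetical: 40 ns per tick
    println!("{:.3} ms", elapsed_ms(start, end, period)); // prints "10.000 ms"
}
```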

I assume this is related to vkCmdWriteTimestamp happening too early. Partial trace:

47 vkCmdBindPipeline(pipeline.generate.displacements)
48 vkCmdBindDescriptorSets(0, { bindgroup.generate.displacements })
49 vkCmdResetQueryPool(Query Pool 250)
50 vkCmdWriteTimestamp(Query Pool 250, 6)
51 vkCmdPipelineBarrier({ buffer.nodes })
52 vkCmdPipelineBarrier({ texture.tiles.displacements })
53 vkCmdDispatch(9, 9, 1)
54 vkCmdResetQueryPool(Query Pool 250)
55 vkCmdWriteTimestamp(Query Pool 250, 7)

Platform
I'm using Linux with an AMD GPU + mesa drivers, though I imagine this applies more broadly.

@cwfitzgerald
Member

Yeah, so this is an inherent problem with the current API: because barriers are recorded as part of the dispatch, any barrier that exists will always be included in the timestamp. The timestamp API is actually under a lot of debate within WebGPU, and I don't know what its current status is. No matter what the upstream API ends up being, we should expose an API that lets you get this information.

@Wumpf
Member

Wumpf commented Nov 18, 2023

There are now timestamps at the borders of a pass, which are somewhat better defined. Also, taking timestamps within passes is a wgpu extension, not a feature of WebGPU itself.

I have no idea what we can do about this problem in general though: it would mean scheduling barriers earlier than we would without the write_timestamp call, which completely changes the performance characteristics of the very thing we want to measure 🤔

I'm leaning "won't fix". The timestamp isn't lying per se: it measures how much time was spent kicking off the dispatch, but the cost of finishing it falls on kicking off the next dispatch, because that takes as long as it takes for all input resources to become available, no matter what they are or why they're unavailable. It warrants a lengthy comment on write_timestamp though, of course!

@cwfitzgerald thoughts?
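For readers hitting this today: the pass-border timestamps mentioned above are requested through the pass descriptor rather than via write_timestamp calls inside the pass. A hedged descriptor-setup sketch, assuming the `timestamp_writes` API as it appeared around wgpu 0.18 (field names may differ across versions, and `Features::TIMESTAMP_QUERY` must be enabled at device creation):

```rust
// Sketch: timing a whole compute pass via pass-border timestamps,
// assuming the wgpu 0.18-era `timestamp_writes` API. Because the
// timestamps land at the pass borders, barrier placement inside the
// pass no longer skews an individual measurement window.
fn timed_pass(device: &wgpu::Device, encoder: &mut wgpu::CommandEncoder) {
    let query_set = device.create_query_set(&wgpu::QuerySetDescriptor {
        label: Some("pass timing"),
        ty: wgpu::QueryType::Timestamp,
        count: 2, // one slot for pass begin, one for pass end
    });
    let _cpass = encoder.begin_compute_pass(&wgpu::ComputePassDescriptor {
        label: Some("timed pass"),
        timestamp_writes: Some(wgpu::ComputePassTimestampWrites {
            query_set: &query_set,
            beginning_of_pass_write_index: Some(0),
            end_of_pass_write_index: Some(1),
        }),
    });
    // ... set the pipeline and bind groups, dispatch, end the pass; then
    // copy the two u64 ticks out with CommandEncoder::resolve_query_set
    // and scale them by Queue::get_timestamp_period().
}
```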

@teoxoy
Member

teoxoy commented Nov 18, 2023

Related:
