Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frame Pacing Issues on Vulkan GTX 1650 #1882

Closed
D0liphin opened this issue Aug 30, 2021 · 7 comments
Closed

Frame Pacing Issues on Vulkan GTX 1650 #1882

D0liphin opened this issue Aug 30, 2021 · 7 comments
Labels
area: ecosystem Help the connected projects grow and prosper help required We need community help to make this happen.

Comments

@D0liphin
Copy link

D0liphin commented Aug 30, 2021

The time between renders seems to be irregular, but averages fine.

#![feature(label_break_value)]
use winit::{
    dpi::{PhysicalSize, Size},
    event::*,
    event_loop::{ControlFlow, EventLoop},
    window::{Window, WindowAttributes, WindowBuilder},
};

fn render(surface: &wgpu::Surface, device: &wgpu::Device, queue: &wgpu::Queue) {}

fn main() {
    let event_loop = EventLoop::new();
    let window = WindowBuilder::new().build(&event_loop).unwrap();

    let instance = wgpu::Instance::new(wgpu::Backends::PRIMARY);
    let surface = unsafe { instance.create_surface(&window) };
    let request_adapter_options = wgpu::RequestAdapterOptions {
        compatible_surface: Some(&surface),
        ..Default::default()
    };
    let adapter =
        futures::executor::block_on(instance.request_adapter(&request_adapter_options))
        .unwrap();
    println!("{:#?}", adapter.get_info());
    let (device, queue) = futures::executor::block_on(
        adapter.request_device(&wgpu::DeviceDescriptor::default(), None),
    )
    .unwrap();

    let surface_configuration = wgpu::SurfaceConfiguration {
        usage: wgpu::TextureUsages::RENDER_ATTACHMENT,
        format: surface.get_preferred_format(&adapter).unwrap(),
        width: window.inner_size().width,
        height: window.inner_size().height,
        present_mode: wgpu::PresentMode::Fifo,
    };
    surface.configure(&device, &surface_configuration);

    device.create_render_pipeline(&wgpu::RenderPipelineDescriptor {
        label: None,
        layout: None,
        vertex: wgpu::VertexState {
            module: &device.create_shader_module(&wgpu::ShaderModuleDescriptor {
                label: None,
                source: wgpu::ShaderSource::Wgsl(
                    "[[stage(vertex)]] fn main() {}".to_owned().into(),
                ),
            }),
            entry_point: "main",
            buffers: &[],
        },
        fragment: None,
        primitive: wgpu::PrimitiveState::default(),
        depth_stencil: None,
        multisample: wgpu::MultisampleState::default(),
    });

    event_loop.run(move |event, _, _| match event {
        Event::RedrawRequested(_) => {
            use std::{io::Write, time::Instant};
            let earlier = Instant::now();
            'render: {
                let surface_texture = match surface.get_current_frame() {
                    Ok(frame) => frame.output,
                    Err(e) => break 'render (),
                };
                let mut command_encoder =
                    device.create_command_encoder(&wgpu::CommandEncoderDescriptor::default());

                {
                    let surface_texture_view = surface_texture
                        .texture
                        .create_view(&wgpu::TextureViewDescriptor::default());
                    let _ = command_encoder.begin_render_pass(&wgpu::RenderPassDescriptor {
                        label: None,
                        color_attachments: &[wgpu::RenderPassColorAttachment {
                            view: &surface_texture_view,
                            resolve_target: None,
                            ops: wgpu::Operations {
                                load: wgpu::LoadOp::Clear(wgpu::Color::WHITE),
                                store: true,
                            },
                        }],
                        depth_stencil_attachment: None,
                    });
                }

                queue.submit(std::iter::once(command_encoder.finish()));
            }
            print!(
                "{:?} ",
                (Instant::now().duration_since(earlier).as_nanos() as f64 / 10_000.0).round()
                    / 100.0
            );
            std::io::stdout().flush().unwrap_or(());
        }
        Event::MainEventsCleared => window.request_redraw(),
        _ => {}
    })
}

Running this produces:

AdapterInfo {
    name: "GeForce GTX 1650",
    vendor: 4318,
    device: 8584,
    device_type: DiscreteGpu,
    backend: Vulkan,
}
34.39 0.23 0.1 48.5 0.2 0.25 32.89 16.79 0.09 33.19 17.14 16.11 0.05 33.3 16.57 17.04 0.17 32.76 16.67 16.66 0.04 33.27 16.62 16.66 0.05 33.27 16.63 16.66 0.05 50.18 0.39 0.06 32.67 16.6 16.67 0.04 33.26 16.64 16.64 0.06 33.26 16.64 16.66 0.04 33.32 33.24 0.15 0.03 33.21 16.64 16.69 0.04 33.21 16.65 16.72 0.05 33.2 16.76 16.59 0.04 49.86 0.19 0.04 33.2 16.66 16.65 0.03 33.27 16.67 16.64 0.04 33.29 16.66 16.63 0.04 33.6 16.52 16.41 0.05 33.31 16.64 16.6 0.05 33.29 16.66 16.66 0.05 33.23 16.65 16.64 0.05 33.32 16.6 16.65 0.05 33.33 16.66 16.62 0.05 33.29 16.62 16.66 0.05 33.26 16.33 17.07 0.18 32.99 16.67 16.63 0.03 33.3 16.64 16.72 0.03 33.3 16.62 16.7 0.06 33.87 32.93 0.18 0.04 33.51 16.49 16.73 0.08 33.11 16.58 16.6 0.05 33.69 16.84 0.06 0.04 49.66 0.15 0.03 33.22 16.42 16.58 0.05 33.1 16.64 16.64 0.05 33.32 16.04 16.75 0.05 50.2 0.16 0.04 32.88 16.8 16.6 0.06 32.75 16.78 16.74 0.04 33.47 0.06 0.03 33.0 17.52 15.72 0.05 33.64 16.09 17.2 0.06 33.13 16.17 16.55 0.04 33.54 16.65 16.48 0.05 49.79 0.15 0.03 32.8 16.39 16.75 0.05 33.25 16.66 16.6 0.05 33.31 16.6 16.73 0.03 49.9 16.58 0.05 0.08 33.23 16.7 16.63 0.05 33.28 16.68 16.65 0.03 33.36 0.2 32.76 33.32 0.18 0.04 32.74 16.68 16.76 0.13 33.16 16.6 16.82 0.12 32.98 17.16 0.06 0.03 33.48 16.46 16.53 0.07 32.82 0.05

avg = 16.34ms (~61fps)
There is no concrete pattern. Though /((1102)*(1103|2)002)+/ matches most of it (when mapped with something like |t| (t / 16.67).round()).

I thought I'd run it on my laptop too, but I don't get the same issue...

AdapterInfo {
    name: "NVIDIA GeForce MX350",
    vendor: 4318,
    device: 7316,
    device_type: DiscreteGpu,
    backend: Vulkan,
}
0.91 0.19 0.07 0.21 0.33 0.07 0.22 0.25 0.05 0.62 0.05 0.25 0.95 0.08 0.32 0.08 0.29 0.04 0.26 0.08 0.26 0.11 0.94 0.11 0.57 0.11 1.11 0.14 0.28 0.08 0.26 0.08 0.26 0.09 0.26 0.08 0.27 0.09 0.3 0.07 0.64 0.46 0.06 1.32 0.07 0.38 0.05 0.27 0.08 0.26 0.08 0.26 0.09 0.25 0.09 0.28 0.09 0.31 0.07 0.26 0.1 0.29 0.11 0.27 0.53 0.06 0.24 1.37 0.08 0.37 0.07 0.29 0.06 0.27 0.08 0.26 0.08 0.27 0.08 0.94 0.13 0.71 0.08 0.83 0.04 0.36 0.07 1.66 0.1 0.28 0.08 0.26 0.12 0.23 0.07 0.29 0.07 0.26 0.08 0.26 0.08 0.28 0.08 0.27 0.08 0.26 0.08 0.26 0.08 0.56 0.06 0.25 1.49 0.08 0.38 0.07 0.25 0.09 0.24 0.1 0.24 0.08 0.26 0.08 0.28 0.08 0.25 0.09 0.24 0.09 0.27 0.07 0.27 0.48 0.05 0.25 0.1 0.96 0.08 0.93 0.13 0.46 0.1 0.61 0.09 1.0 0.07 0.29 0.06 0.27 0.08 0.28 0.07 0.27 0.08 0.58 0.06 0.19 0.24 0.08 1.54 0.08 0.35 0.09 0.29 0.07 0.26 0.07 0.27 0.07 0.26 0.08 0.26 0.08 0.27 0.07 0.26 0.08 0.28 0.08 0.57 0.05 0.22 0.91 0.08 0.26 0.08 0.25 0.08 0.25 0.09 0.26 0.1 0.28 0.07 0.26 0.08 0.93 0.19 0.42 0.11 0.4 0.12 1.0 0.04 0.3 0.05 1.63 0.07 0.26 0.08 0.27 0.08 0.25 0.09 0.28 0.08 0.26 0.09 0.25 0.09 0.25 0.1 0.25 0.09 0.25 0.09 0.6 0.06 0.2 1.56 0.04 0.37 0.07 0.29 0.07 0.28 0.11 0.23 0.08 0.27 0.09 0.26 

switching to PowerPreference::HighPerformance or switching to PresentMode::Mailbox makes no obvious difference.

@kvark kvark added area: ecosystem Help the connected projects grow and prosper help required We need community help to make this happen. labels Aug 31, 2021
@l0calh05t
Copy link

Isn't this pretty much the submit time ≠ present time issue, which can be especially bad with triple buffering? KhronosGroup/Vulkan-Docs#1364 aims to solve this. Assuming you are running v-synced, a solution that works right now may be to moving average the ∂ts to suppress the jitter (although it may create some irregular timing if a frame actually takes long)

@cwfitzgerald
Copy link
Member

There's something funkier than this going on I think. Something is causing the CPU to hang in various places longer than it should.

I have attached a tracy trace of my program running on nvidia. Normally get_next_texture (i.e. acquire) will block until a new swapchain frame is ready (in this case, about 16ms [this is not my refresh rate, my gpu just happens to run at ~60fps for this scene]). Every so often, present will take about double the amount of time (~38ms). After a the long block, acquire will be instant for the next frame.

image

On AMD, this manifests the other way around. It manifests as an acquire being nearly instant, then the next one being taking double time (53ms -> 100ms).

image

I've been trying to figure out what on earth is going on, but no luck so far.

@inoutch
Copy link

inoutch commented Nov 9, 2021

Is this relevant to this issue?
If so, it seems that it will be solved by updating winit.

@cwfitzgerald
Copy link
Member

These examples all use ControlFlow poll. I am currently basically not seeing the issue when I double submit, but I think it still pops up occasionally.

@CatCode79
Copy link
Contributor

CatCode79 commented Nov 10, 2021

I tested the porting of the code on wgpu 0.11: no possibility to reproduce it, but I have rather large values ​​in the output, is this normal?

#Output:

AdapterInfo {
    name: "AMD Radeon R9 200 Series",
    vendor: 4098,
    device: 26520,
    device_type: DiscreteGpu,
    backend: Vulkan,
}

5.67 2.76 0.55 1010.81 1003.52 1005.96 1008.7 1008.18 1010.38 1011.15 999.11 1009.51 1008.01 1014.51 1011.29 1006.05 1013.02 1010.83 1011.01 1003.47 1011.26 1005.36 1011.37 1008.29 1007.46 1008.52 1007.04 1006.01 1005.31 1010.76 1014.48 1008.78 1008.42 1010.12 1011.21 1002.89 1003.39 1008.51 1008.34 1009.32 1010.12 1006.83 1009.24 1006.68 1013.72 1004.81 1009.32 1011.09 1012.06 1014.0 1007.54 1010.01 1010.32 1014.02 1014.1 1007.9 1008.12 1002.3 1005.79 1006.74 1005.97 1000.18 1000.55 1013.15 1009.0 1007.39 1001.02 1015.1 1007.28 1009.34 1011.68 1002.05 1009.2 1012.28 1009.16 1010.9 1009.89 1012.29 1011.87 1008.08 1013.45

You can found my test here.

@CatCode79
Copy link
Contributor

Forgive me, in porting the test from version 10 to version 11 I forgot to add the present () call after the commit *headpalm*, now the values ​​are stable between 14 and 16 milliseconds (and no longer a second or more).

Even testing on version 10 I have the same stable behavior on my machine.

@cwfitzgerald
Copy link
Member

See #2869 for what's needed to solve this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: ecosystem Help the connected projects grow and prosper help required We need community help to make this happen.
Projects
None yet
Development

No branches or pull requests

6 participants