Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Atomic layout updates #2072

Merged
merged 39 commits into from
Jun 30, 2018
Merged

Atomic layout updates #2072

merged 39 commits into from
Jun 30, 2018

Conversation

RyanDwyer
Copy link
Member

@RyanDwyer RyanDwyer commented May 30, 2018

This is a very early concept/WIP of atomic layout updates. It's so early that my code won't even compile. You should consider it a proposal with some example code attached. I'm submitting it here so everyone can see the direction I'm going in and give feedback as the feature is developed.

The new data structures I'm proposing are:

  • struct sway_transaction
  • struct sway_transaction_instruction
  • struct sway_container_state

The basic idea here is that when we want to arrange layout, we create a struct sway_transaction. We add instructions to the transaction - one instruction per container involved in the arrangement. Then we commit the transaction which sends all the configure events. Then when all the configures are done, or a time has passed, we apply the transaction which updates the tree accordingly.

Each instruction has a pointer to a container, a pending state which it'll apply to the container, and the configure serial.

I was originally going to have a current and pending state in each container, but realised that would be tedious to update the codebase to change container->x to container->current.x, so at the moment there's just pending and the current stuff uses the normal properties.

When arranging layout, it builds the new pending state from the container's existing pending state. This is so a second transaction running at the same time won't revert what the first transaction is trying to change. This means if no transactions are in progress, every container's pending state must be an accurate reflection of the current state.

As for how windows are arranged by commands and so on, if you only want to do one arrange, you'll be able to call arrange_and_commit(container). This creates a new transaction and commits it for you. If you need to do two arranges in the one transaction, such as moving a view from one visible workspace to another visible workspace, you can take the more manual approach:

struct sway_transaction *transaction = transaction_create();
arrange_windows(old_workspace, transaction);
arrange_windows(new_workspace, transaction);
transaction_commit(transaction);

You may notice this means there's only two entry points for the arrange functions, rather than one for each container type as we had before.

Fixes #1650

Fixes #220

@RyanDwyer
Copy link
Member Author

RyanDwyer commented Jun 3, 2018

I've updated the PR so it's now a basic working implementation. Atomic updates are implemented for the resize command only. Almost everything else is broken because everything else writes to the current properties/state, which get overwritten with the pending state when the transaction is applied.

The PR is ready to stash the last good surface texture away, but we need to have a good discussion on whether that's actually what we want to do, because apparently reading textures back from the GPU is expensive. So in lieu of a solution, if the transaction stalls for a view then you'll see the other surfaces resize before their borders and everything else does.

Ideas mentioned in chat for preserving textures are:

  • Copy the texture as originally planned (bad due to being expensive), or
  • Have two wlr_textures for each surface, and swap them when we want to preserve one, or
  • Expose the pixel data to sway so we can make a new texture from it, or
  • Don't damage/rerender the area being updated until everything is ready (doesn't work if there's a layer above or below that needs to be updated)

Anyway, when you resize a split, here's what's happening:

  • The resize command calls container_recursive_resize, which now sets the new dimensions in the container's pending state rather than directly to the container's width and height properties.
  • The resize command calls arrange_children_of. This function will be removed in favour of arrange_and_commit, but to make rebasing easier it exists as a wrapper around arrange_and_commit.
  • arrange_and_commit does the following:
    • Creates a new transaction
    • Arranges the children as it did previously, but with the following differences:
      • Everything is saved into the pending properties rather than the swayc's main properties.
      • view_autoconfigure no longer calls view_configure and instead sets the view_{x,y,width,height} in the pending state.
    • Adds each processed child to the transaction.
    • Adds a single damage box to the transaction - the same box as the topmost parent's geometry (note the topmost parent does not move or change size, so only one damage box is needed and not a before and after).
    • Commits the transaction with transaction_commit.
  • transaction_commit iterates through the instructions/containers/configures in the transaction.
    • For views, view_configure is called and the serial is stored. The view's current surface texture is copied into view->saved_texture.
    • For C_CONTAINERs and anything else, the instruction is marked as ready instantly.
  • transaction_commit prepares a timer (currently set at 200ms). If any views have not acknowledged the configure by then, the transaction will be applied anyway.
  • When rendering, the main surface of each view will use the view->saved_texture if it exists.
  • As each view commits, it calls transaction_notify_view_ready with the view and the serial from that commit.
  • transaction_notify_view_ready marks off the view in the transaction and applies the transaction if all are ready.
  • transaction_apply copies the pending state into the main swayc properties and removes the view->saved_texture.
  • transaction_apply applies the damages that is stored in the transaction.
  • The transaction is then destroyed.

A good test is to change the TIMEOUT_MS from 200 to 2000 (2 seconds). Then download emersion's hello-wayland program (https://github.com/emersion/hello-wayland) and rewrite the configure handler so it ignores subsequent configures (ie. making it a misbehaving client):

static void xdg_surface_handle_configure(void *data,
        struct xdg_surface *xdg_surface, uint32_t serial) {
    static bool done_first = false;
    if (!done_first) {
        xdg_surface_ack_configure(xdg_surface, serial);
        done_first = true;
    }
}

Then open an xdg_shell terminal, use that terminal to launch hello-wayland, then use the resize command. You'll see it stall for 2 seconds before applying. You'll also see the surface resize before the borders do, due to the lack of texture preservation as mentioned in the second paragraph.

Moving forward from here:

  • We'll need to change the entire codebase to write into the pending state instead of the current properties. A good way to do this would be to remove the current properties and have a swayc->current state. That way we know we'll have updated all references because it won't compile otherwise.
  • We need to handle tree modification operations, such as creating containers, reparenting containers and deleting containers. To do this I think we'll need to store new children lists in the pending state, and that makes for ugly code like container->pending.parent->pending.children. I don't know of a better way to do this.
  • We'll need to handle cases where transactions queue up, like when you repeatedly resize a split.

@emersion
Copy link
Member

emersion commented Jun 3, 2018

Copy the texture as originally planned

This seems pretty hard, there's no direct way to do it. We could:

  • Download the texture data and re-upload it. Very expensive. I'd advise against it.
  • Render the texture to a pbuffer/FBO and use glCopyTexImage2D. A little bit better, but still expensive.

Have two wlr_textures for each surface, and swap them when we want to preserve one

I'd rather "steal" the texture from the wlr_surface and force the wlr_surface to re-create a new texture next time the client commits. I don't think this strategy works for DMA-BUFs, as the client can update its texture without making the compositor import it again?

Expose the pixel data to sway so we can make a new texture from it

The pixel data is in the GPU, so it's tricky to just read it.

@ddevault
Copy link
Contributor

ddevault commented Jun 5, 2018

Read through your details about how this works, seems good to me.

Can we worry about "what to do with old textures" later?

@RyanDwyer
Copy link
Member Author

RyanDwyer commented Jun 6, 2018

I'm making good progress on this.

I decided to make it so the main swayc/view properties are the pending state rather than the current state. This is because so much of the code works with these properties, and they need to work with the pending state. The only exception is rendering code which works with the current state. So the good news is everyone can continue to use container->x and so on everywhere except in the rendering code.

BTW, I consider the main/pending properties to be the "live" state, and the current state to be what the renderer displays which might be a few frames behind. So IPC events are sent when the live/pending state is changed.

I discovered we called the arrange_foo functions in both high level functions and low level functions. I wouldn't be surprised if moving a view could trigger several arranges of the same containers. I've moved these out to be high level only. So arrange functions are typically called from command handlers or from a view's map/unmap handler.

I've made floating, fullscreen, and floating + fullscreen all work with transactions.

What's left now is:

  • Handling tree operations.
  • Saving the texture.
  • Making the title and marks textures atomic.

@emersion
Copy link
Member

emersion commented Jun 7, 2018

Texture discussion: swaywm/wlroots#1046

RyanDwyer added 6 commits June 9, 2018 10:08
* The arrange_foo functions are now replaced with arrange_and_commit, or
with manually created transactions and arrange_windows x2.
* The arrange functions are now only called from the highest level
functions rather than from both high level and low level functions.
* Due to the previous point, view_set_fullscreen_raw and
view_set_fullscreen are both merged into one function again.
* Floating and fullscreen are now working with transactions.
* Also fix parts of the rendering where it was rendering the pending
state instead of current.
@RyanDwyer
Copy link
Member Author

Update on this: It now preserves the buffer/texture using the wlr_buffer refcounting. I am now waiting on a way to access the buffer after the view has unmapped (see swaywm/wlroots#1065).

@emersion
Copy link
Member

I am now waiting on a way to access the buffer after the view has unmapped (see swaywm/wlroots#1065).

Note that this should already work with xdg-shell views.

This implements atomic layout updates for when views map, reparent or
unmap.
@RyanDwyer
Copy link
Member Author

RyanDwyer commented Jun 23, 2018

This is now ready for testing. I'm going to start using this as my daily driver.

My latest commit implements atomic layouts for containers that are mapping, reparenting or unmapping.

How it works is we still have the regular tree, but the properties you're currently familiar with may not reflect what is actually rendered. Each container has a current state, and that state contains the properties that are rendered as well as another children list. When rendering, the con->current.children list is used rather than con->children. In most cases we can continue working with the tree as normal, and if a transaction is applied to a section of the tree then it'll eventually get updated in the rendered/current state.

Another key point is that destroying containers is now split into two steps: destroying and freeing. Destroying removes it from the tree (but keeps it in the current.children list of its old parent). When the container is no longer involved in a transaction (ie. removed from on-screen) then it'll be freed via container_free.

For views that are unmapping and destroying, a similar system is used. An unmapped view will still contain a reference to its swayc until it's no longer in a transaction. For this reason we can't use view->swayc to determine if a view is mapped, so a better check is view->surface. view->surface is set to NULL immediately at the end of each implementation's unmap handler.

The destroy callback in the view implementations has been renamed to free, to keep the terms consistent across the project.

To test what happens during a transaction, you can set TRANSACTION_DEBUG to true at the top of transaction.c. This makes it always wait for the timeout to expire before applying the new layout, even if the views are ready. You'll also want to bump up the TIMEOUT_MS to something like 1000 (1 second). Doing both of these allows you to observe what's rendered while a transaction is in progress, and test what happens if you interact with things while transactions are in progress. For example, launch a view then launch another one before the first one has appeared on screen.

If you use TRANSACTION_DEBUG and open several views, you will notice that no view is shown as focused while the transaction is happening. This is because the focus works with the pending tree, and your focus is on the view that hasn't appeared on screen yet. I think this is desirable behaviour, because it allows you to hammer out commands (eg. splitting and opening) without having to wait, and the rendered state will catch up.

Possible issues or other points of discussion:

  • When a transaction is happening, I'm still sending frame done to everything despite rendering a saved buffer. I'm not sure what the consequences of this are. I haven't seen any adverse effects and mpv seems to be fine with it.
  • container_at uses the pending state of the tree, not what's shown on screen, but with a timeout of 200ms I don't think we need to worry about this.
  • In theory, transactions that complete out of order may mess up the rendered state. I'm not sure how likely this is in practice though as there's usually only one transaction running at a time. So my strategy here is to wait and see if it's a problem, and if so I'll look at a solution. If it does happen, it'd be recoverable by doing another arrange operation (eg. switch workspace and back). Forcing transactions to complete in order is now implemented.

@RyanDwyer RyanDwyer changed the title WIP: Atomic layout updates Atomic layout updates Jun 23, 2018
@RyanDwyer RyanDwyer requested review from ddevault and emersion June 23, 2018 06:30
@ghost
Copy link

ghost commented Jun 23, 2018

Works well, haven't found faults yet. One thing is that when toggling fullscreen any clients using the layer shell don't get damaged until they flip, but I'm guessing this would be fixed by proper layer shell client management (which would fix other bugs too probably, such as the damage from layer shell clients not affecting fullscreen clients).

@ghost
Copy link

ghost commented Jun 23, 2018

Actually I can crash it easily. Create 2 windows, make either of them stacking, create a few tabs in the stacking container and then close all of them one by one. Closing the last one will crash the compositor.

There was no `current` child because the container was destroyed. This
makes it fall back to looking in the parent's current children list.
@ghost
Copy link

ghost commented Jun 23, 2018

Running mpv on one workspace and quickly opening and closing another mpv instance on the same workspace crashes the compositor. Works with any video and works even if mpv hasn't had a chance to open a window yet.

@ddevault
Copy link
Contributor

This works pretty well on a cursory test, but it occurs to me that we're not going to be able to stress test this without something like interactive resize support.

@ddevault
Copy link
Contributor

ddevault commented Jun 23, 2018

https://sr.ht/p99v.mkv

Edit: actually it's not much better on master: https://sr.ht/Cr1W.mkv

struct sway_container *con = instruction->container;
if (con->type == C_VIEW && !con->destroying &&
(con->current.view_width != instruction->state.view_width ||
con->current.view_height != instruction->state.view_height)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These parens aren't necessary, || has a higher precedence than &&

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing these parens also makes the code unreadable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nah

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol I'm dumb

@ddevault
Copy link
Contributor

Codewise this looks pretty good. I'm a little concerned about thrashing the heap all the time but I'm not sure what alternative to offer. Need to test it more but I'm gonna wait until the current round of bugs is fixed.

wl_list_remove(&xwayland_view->commit.link);
view_unmap(&xwayland_view->view);
view->surface = NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be done in view_unmap?

/**
* Implement the actual destroy logic, without reaping.
*/
struct sway_container *container_destroy_noreaping(struct sway_container *con) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be static

@@ -976,3 +986,12 @@ bool container_is_floating(struct sway_container *container) {
}
return container->parent == workspace->sway_workspace->floating;
}

struct wlr_box *container_get_box(struct sway_container *container) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function allocates a box, and callers don't free it.

Copy link
Member

@emersion emersion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work, I like it.

A few comments below. Will post more comments tomorrow.

Instead, damage each container when applying the transaction.
Rather than allocate a structure and expect callers to free it, take a
pointer to an existing struct as an argument.

This function is no longer called anywhere though.
To do this properly, the transaction queue will only be processed if it
can be completely processed.
common/list.c Outdated
@@ -62,6 +62,13 @@ void list_cat(list_t *list, list_t *source) {
}
}

void list_empty(list_t *list) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be much less subtle and just set list->length to 0, list_add does reallocs but list_del never bothers with reducing capacity either so meh.

wlr_texture_destroy(cont->title_focused);
wlr_texture_destroy(cont->title_focused_inactive);
wlr_texture_destroy(cont->title_unfocused);
wlr_texture_destroy(cont->title_urgent);

for (int i = 0; i < server.destroying_containers->length; ++i) {

This comment was marked as resolved.

@emersion
Copy link
Member

Do we even need to store damage? Can we just damage all containers in the transaction when applied?

for (int i = 0; i < view->swayc->instructions->length; ++i) {
struct sway_transaction_instruction *instruction =
view->swayc->instructions->items[i];
if (!instruction->ready && instruction->state.view_width == width &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if a client receives a configure with size WxH, and commits a W'xH' surface?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it wouldn't match any instruction in the list, the transaction would time out and then the transaction would apply. I don't think there's any other option. Keep in mind this function is only used by xwayland views.

common/list.c Outdated
list->capacity = 10;
list->length = 0;
free(list->items);
list->items = malloc(sizeof(void*) * list->capacity);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: missing space between void and *

@emersion
Copy link
Member

emersion commented Jun 27, 2018

Do we even need to store damage? Can we just damage all containers in the transaction when applied?

My bad. I failed to see that you've ACKed this one.

struct wlr_box box = {
.x = view->swayc->current.view_x - output->swayc->current.swayc_x,
.y = view->swayc->current.view_y - output->swayc->current.swayc_y,
.width = view->swayc->current.view_width,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the view width/height will render incorrectly if the surface size is different. This happens with unresizable windows for instance. We should crop surfaces instead.

@emersion
Copy link
Member

Thinking again about subsurfaces: this is tricky because right now subsurfaces will be hidden while applying a transaction. Subsurfaces are part of the view content (ie. most of the view can be subsurfaces). You can try weston-subsurfaces for instance.

We were arranging a parent which may have been deleted by the reaper,
which meant the `current` children list of the surviving parent had a
dangling pointer.

Instead, we now reap the workspace.
@emersion
Copy link
Member

After discussion on IRC, @RyanDwyer convinced me the destroy/free wording is a good idea.

11:29 <ryan-au> Let's talk about terms like destroy and free
11:29 <emersion> ryan-au: also wanted to say it's very cool how atomic layout simplifies damage tracking
11:30 <emersion> ryan-au: https://github.com/swaywm/wlroots/blob/master/CONTRIBUTING.md#constructiondestruction-functions
11:30 <ryan-au> Yeah, I think we need to extend the naming convention
11:30 <ryan-au> I like the destroy term, and used it for transactions and even in some non-sway projects of mine
11:31 <ryan-au> But the problem is destroying is now a two step process due to transactions
11:31 <emersion> just find another word
11:32 <ryan-au> The only other word that comes to mind at the moment is remove
11:33 <emersion> finalize?
11:33 <emersion> prepare_destroy?
11:33 <ryan-au> Finalise can be used in the context of setting up
11:33 <emersion> true
11:34 <ryan-au> ok, so are you happen if I rename container_destroy to container_prepare_destroy, and container_free to container_destroy?
11:34 <ryan-au> *happy
11:34 <emersion> yeah, works for me
11:34 <ryan-au> It means any time you want to destroy a container you have to call container_prepare_destroy instead of container_destroy
11:35 <emersion> hmm
11:35 <ryan-au> Also container_workspace_destroy would become container_workspace_prepare_destroy
11:36 <emersion> i don't like it, but i don't like the current name either
11:36 <emersion> neither*
11:39 <ryan-au> If you consider the *_free function to be a component of destroying, then using the destroy+free terms makes sense
11:40 <ryan-au> Any time you call *_destroy, there's no backing out and it'll eventually get freed
11:40 <emersion> you have a point
11:41 <emersion> okay, i agree with you

A flash of background was happening for two reasons:

1) We were using the xsurface's dimensions to check if the surface is
ready, but these are pending dimensions.
2) In my particular setup, the default geometry of the xsurface does not
intersect any output, which prevented it from receiving a frame done
event. This made the transaction time out and the client would only
redraw once it's been rendered.
@emersion
Copy link
Member

Follow-up issues:

  • Handle subsurfaces
  • Views that aren't visible shouldn't block the transaction

@ddevault ddevault merged commit d8c61c9 into swaywm:master Jun 30, 2018
@ddevault
Copy link
Contributor

Thanks! \o/

@RyanDwyer RyanDwyer deleted the atomic branch July 19, 2018 23:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants