Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Initial x11rb-async implementation #790

Merged
merged 10 commits into from
Feb 6, 2023
Merged

Conversation

notgull
Copy link
Collaborator

@notgull notgull commented Jan 14, 2023

I don't remember exactly where, but I remember @psychon saying that they were thinking of creating an x11rb-async crate that provides an asynchronous API over x11rb. This PR creates an x11rb-async crate that aims to do just that.

Motivation: I saw this code and it made me angry because it could be an asynchronous task instead of taking up an entire thread.

Things this PR has so far:

  • RequestConnection and Connection traits that mirror their equivalents in x11rb.
  • A BlockingConnection connection that is implemented by running function calls on a separate thread.
  • About half of a RustConnection that wraps around x11rb::RustConnection but registers the Stream into an async reactor. Generally trying to reduce the amount of duplicated code.
  • Cookie, VoidCookie et al

Things I need to do before I mark this as ready:

  • Finish the RustConnection. This may require more finesse, including exposing greater fidelity in the x11rb implementation.
  • Are we find with how we do async traits? Right now this is basically equivalent to the output of async-trait, but on stable Rust we can use GATs and on nightly Rust we have access to real-life async traits (although the latter wouldn't let us use dyn).

Things to do in the future:

  • Add a wrapper around XCBConnection.

Closes #318

@codecov
Copy link

codecov bot commented Jan 14, 2023

Codecov Report

Base: 13.69% // Head: 12.46% // Decreases project coverage by -1.24% ⚠️

Coverage data is based on head (f8bf805) compared to base (dacfba5).
Patch coverage: 1.66% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #790      +/-   ##
==========================================
- Coverage   13.69%   12.46%   -1.24%     
==========================================
  Files         142      180      +38     
  Lines      120124   133177   +13053     
==========================================
+ Hits        16454    16602     +148     
- Misses     103670   116575   +12905     
Flag Coverage Δ
tests 12.46% <1.66%> (-1.24%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
x11rb-async/examples/xclock_utc.rs 0.00% <0.00%> (ø)
x11rb-async/src/blocking.rs 0.00% <0.00%> (ø)
x11rb-async/src/connection.rs 0.00% <0.00%> (ø)
x11rb-async/src/cookie.rs 0.00% <0.00%> (ø)
x11rb-async/src/protocol/bigreq.rs 0.00% <0.00%> (ø)
x11rb-async/src/protocol/composite.rs 0.00% <0.00%> (ø)
x11rb-async/src/protocol/damage.rs 0.00% <0.00%> (ø)
x11rb-async/src/protocol/dbe.rs 0.00% <0.00%> (ø)
x11rb-async/src/protocol/dpms.rs 0.00% <0.00%> (ø)
x11rb-async/src/protocol/dri2.rs 0.00% <0.00%> (ø)
... and 38 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@psychon
Copy link
Owner

psychon commented Jan 15, 2023

I don't remember exactly where, but I remember @psychon saying that they were thinking of creating an x11rb-async crate that provides an asynchronous API over x11rb.

Looks like whatever I did back then is still in some branches: https://github.com/psychon/x11rb/tree/async-experiments https://github.com/psychon/x11rb/tree/async-experiments2

I don't really remember details, but I do remember that I did not use the code generator, but instead had request sending implemented based on the traits that are now in x11rb-protocol (VoidRequest etc). Oh and that I tried really hard for this code to work "everywhere" (tokio, async-std, ...) so ended up implementing my own AsyncMutex. That might not have been the best idea...

Copy link
Owner

@psychon psychon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave this a look and ended up being confused about the locking strategy. Sorry, I don't have time to look at this more carefully right now.

}

fn discard_reply(&self, sequence: SequenceNumber, kind: RequestKind, mode: DiscardMode) {
// Doesn't block.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well.... depends on your definition of "block". This does not perform I/O, but it does lock a mutex (which is supposed not to be held for long). I guess this is fine, right?

(Same comment for other "Doesn't block"-comments)

Edit: Actually, this is generic code. I could implement the trait with something that just sleeps for 5 minutes. But I guess it's not worth caring about such malicious implementations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's not like anything unsound can happen if this blocks. It's just that, unless there's a buggy/malicious implementation on our hands (which we shouldn't assume), it shouldn't block for a protracted period of time.

Also, it's fine to lock std::sync::Mutexes in async code as long as you don't hold their guards across await points, see this point in the tokio documentation.

x11rb-async/src/blocking.rs Show resolved Hide resolved
let sequence = cookie.sequence_number();
mem::forget(cookie);
sequence
};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't read the new cookies.rs yet, but x11rb's Cookie has an into_sequence_number() for the above sequence.

Edit: ....that is only pub(crate), but could still be good enough for this use here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we might want to expose that method as pub.

}

/// Eat the cookie and return the connection.
fn consume(self) -> (&'conn C, SequenceNumber) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, if you make this pub(crate), you could use this for my into_sequence_number() comment from elsewhere. Dunno if that makes sense.


/// Multiple IPs.
Multiple(std::vec::IntoIter<IpAddr>),
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This smells like smallvec and also a bit like premature optimisation. Why not always use a Vec? That one memory allocation for the Single case will be practically unmeasurable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, although I'd rather use tinyvec if we go in this direction

/// Connect to the X11 server.
pub async fn connect(dpy_name: Option<&str>) -> Result<(Self, usize), ConnectError> {
let (stream, screen) = nb_connect::connect(dpy_name).await?;
Ok((Self::connect_to_stream(stream, screen).await?, screen))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No .Xauthority "fun"? I guess that deserves at least a TODO comment somewhere.

(Yay, I guess that means the Xauth code needs to be made public somewhere... do we need another crate besides x11rb-protocol for such shared code?

{
Box::pin(async move {
// Gain exclusive access to the write end of the stream.
let _guard = self.write_lock.lock().await;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this "two mutexes"-approach really work? 🤔

From libxcb's source code, I learnt that there was "once upon a time" a deadlock between libX11 and the X11 server: The X11 server stops reading from a client when it has stuff to sent, so the client must always read and write at the same time. Only writing without reading could produce a deadlock.

With your "two mutex" approach... well, there could be only a thread/task trying to write a request, so this task also has to read a request. I guess this means it also has to lock the read mutex? But there could be another task in wait_for_reply() that already does the waiting, so in this task waiting for the mutex would produce unwanted blocking.

Oh and it could happen that the write thread is blocked for a while and in the mean time the wait_for_reply() finishes and so the reader that was present at the beginning of send_request goes away and the write thread has to also "start" reading.

For my async experiments I solved this by spawning an extra task that does all the reading and is always "blocked" in read(). That way, nothing else has to worry about reading since it is always already in progress. Of course, this makes the synchronisation between things more complicated and also "proper shutdown" becomes more complicated....

) -> Fut<'_, Option<Self::Buf>, ConnectionError> {
Box::pin(async move {
// Gain exclusive access to the read end of the stream.
let _guard = self.read_lock.lock().await;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the Error was already received from the server (so sits in some queue somewhere), but another task sits in wait_for_event() and thus this mutex is locked? That would produce unwanted blocking.

let _guard = self.write_lock.lock().await;

self.do_io(PollMode::ReadAndWritable, |conn| {
conn.inner.extension_information(name)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, now I feel like I misunderstood the "two mutexes" approach. Why doesn't this need the read lock?

I thought that this had something to do with https://docs.rs/async-io/latest/async_io/struct.Async.html#concurrent-io, but now... I am no longer sure and I am more confused.

// Try to gain access to the read end of the stream.
let _guard = match self.read_lock.try_lock() {
Some(guard) => guard,
None => return Ok(None),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this incorrectly return None when an event sits in a queue / was already received, but the mutex is locked because something waits for the reply to a request?

@psychon
Copy link
Owner

psychon commented Jan 15, 2023

Oh and: How does x11rb-async relate to breadx? My impression was that x11rb-async is not necessary since breadx does the same thing already?

@notgull
Copy link
Collaborator Author

notgull commented Jan 15, 2023

Thanks for the review!

Oh and that I tried really hard for this code to work "everywhere" (tokio, async-std, ...) so ended up implementing my own AsyncMutex. That might not have been the best idea...

As someone who tried the "runtime independent" idea, you don't want to do the runtime independent idea. It usually ends up in sprawling code that is extraordinarily difficult to maintain (see this future, which appears in my reoccuring nightmares). async-io runs just fine under tokio, and async-lock is runtime-independent for all intents and purposes.

Regarding the locking strategy: my initial goal was to write this crate as a wrapper around x11rb that used I/O error propagation and sync wrappers around async code to be able to reuse the existing RustConnection code. This is the strategy used by async-rustls, among others. The locks were my way of trying to prevent deadlocks from happening due to reading/writing at the same time. However, there are some subtleties with RustConnection that I don't think I can work around with this strategy, so I need to rethink my approach.

@psychon
Copy link
Owner

psychon commented Jan 15, 2023

However, there are some subtleties with RustConnection that I don't think I can work around with this strategy, so I need to rethink my approach.

Since you have much more async-expertise than I do:

How about providing a future to the user that must be spawned to the executor and that does all the reading and pushes X11 packets (and possibly FDs) into an AsyncQueue? That way, all other code does not have to care too much about the reading part and (most importantly) the interaction with writing goes away.

Downside is that reading is still complicated. Since I know the sync world of things: I would use mutexes and condition variables to make this work correctly. Dunno if there is a better async alternative. In one of my async experiments, Cookies contained the read end of a oneshot channel and the write end was in the InnerConnection. That way, the reading task can cause a wakeup in the cookie if needed. But this is still complicated logic...

The sync version of the above would be "start a thread that does all the reading".

@notgull
Copy link
Collaborator Author

notgull commented Jan 16, 2023

Having a "driving" future is probably the best way of doing this, this is what postgres and hyper do as well.

Looking into this, it may be impossible to implement as a wrapper around x11rb. There would need to be some kind of "allocate a new sequence number and request but don't actually send it" function.

@yshui
Copy link

yshui commented Jan 17, 2023

Thank you so much for this! ❤️

Would this close #318 ?

@notgull notgull marked this pull request as ready for review January 17, 2023 19:33
@notgull
Copy link
Collaborator Author

notgull commented Jan 17, 2023

I think this is ready now. I've added an xclock example that mirrors the one in the main package, but uses an async runtime rather than poll(). All the TODOs should be filled out now.

Also: We may want to add 1.64 for IntoFuture in #538, since Cookie should probably implement it for replies.

Would this close #318 ?

Yes, I think it would. This would provide an async way to use x11rb.

Oh and: How does x11rb-async relate to breadx? My impression was that x11rb-async is not necessary since breadx does the same thing already?

I think having a way to use x11rb's API in an async area would be useful in the case that I linked above. x11rb is used already in these cases, so having an alternative async API would make it easier to replace threads with tasks.

@psychon
Copy link
Owner

psychon commented Jan 17, 2023

Also: We may want to add 1.64 for IntoFuture in #538, since Cookie should probably implement it for replies.

Couldn't Cookie simply implement Future? Although I guess that would require manually implementing Future, which is not the nicest thing to code... 🤔

@notgull
Copy link
Collaborator Author

notgull commented Jan 17, 2023

Couldn't Cookie simply implement Future? Although I guess that would require manually implementing Future, which is not the nicest thing to code... thinking

This would require including a slot for the wait_for_reply future in Cookie, which wouldn't be the most ergonomic thing in the world.

@notgull notgull requested a review from psychon January 24, 2023 17:18
@psychon
Copy link
Owner

psychon commented Jan 31, 2023

Sorry for being so slow. I don't have a lot of time currently. And I still do not. 🙈

I looked at everything except generator/ and x11rb-async and I am confused by the changes in x11rb/. Adding spaces at the end of the line in Cargo.toml seems like an error and the changes in x11rb/src seem weird. Somehow I remembered that the "integrate with x11rb"-solution did not work out. What is the state of things here? Is x11rb-async meant to be used as a separate connection or is it somehow doing async I/O on an already existing x11rb connection or...?

Also: How does this handle futures being dropped before they are polled to completion. Specifically sending large requests "only half" is a great way to cause problems IIRC.

@notgull
Copy link
Collaborator Author

notgull commented Jan 31, 2023

Adding spaces at the end of the line in Cargo.toml seems like an error

My bad, I'll fix that.

Somehow I remembered that the "integrate with x11rb"-solution did not work out. What is the state of things here? Is x11rb-async meant to be used as a separate connection or is it somehow doing async I/O on an already existing x11rb connection or...?

I wanted to build directly on top of the x11rb::RustConnection to deduplicate code between x11rb and x11rb-async. It uses some tricks to make it so it can do async I/O on top of the x11rb base. I needed to expose some extra functionality for the x11rb::RustConnection to make this work. If you don't want this extra API, I can remove it and try to build x11rb-async without it.

Also: How does this handle futures being dropped before they are polled to completion. Specifically sending large requests "only half" is a great way to cause problems IIRC.

There is a writing boolean in the structure that it set to true when we are writing something and sets it back to false when it's done. It a future is dropped mid-write, it stays set to true. If a new writing operation happens and it sees that writing is set to true, it panics before starting.

Copy link
Owner

@psychon psychon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I ran out of time again. I looked through everything except rust_connection. Here's a proposal for how to continue: Split this PR up!

The first version only uses the blocking pool (and IIRC I only had one comment that affects this code).

The async RustConnection is then later added in a second PR (and in that PR I will again take forever since I am trying to get rid of the x11rb-async -> x11rb dependency; having this dependency feels like a failure of x11rb-protocol / some stuff is still missing)

What do you think? Does that seem feasible? Sorry for the necessary history-rewriting. I promise that I can react to such a first PR more quickly since I basically already approved it.

x11rb-async/src/blocking.rs Outdated Show resolved Hide resolved
x11rb-async/src/rust_connection/max_request_bytes.rs Outdated Show resolved Hide resolved
x11rb-async/src/rust_connection/nb_connect.rs Outdated Show resolved Hide resolved
/// Check if this request caused an X11 error.
pub async fn check(self) -> Result<(), ReplyError> {
let (conn, seq) = self.consume();
match conn.check_for_raw_error(seq).await? {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Async code makes my head spin...

This one can "leak" if the future for check() is not polled to completion and the error was not yet received. This cookie is already consumed, so its Drop impl won't run. When the error is later received, it will sit indefinitely in the "receive queue".

However, I guess one can just ignore this. It's quite unlikely that anyone ever hits this and even if it happens, nothing really bad happens. Just 32 bytes are kept around until the connection is dropped.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved the consume() to after the await, which should ensure the drop happens.

@notgull
Copy link
Collaborator Author

notgull commented Feb 6, 2023

Sorry, I ran out of time again. I looked through everything except rust_connection. Here's a proposal for how to continue: Split this PR up!

That's probably for the best; I'll add the equivalent to RustConnection (and maybe XCBConnection) as future PRs.

and in that PR I will again take forever since I am trying to get rid of the x11rb-async -> x11rb dependency; having this dependency feels like a failure of x11rb-protocol / some stuff is still missing

In the current state this probably won't happen; BlockingConnection uses x11rb::Connection to drive its functionality. However I can rewrite RustConnection to not rely on x11rb.

@mergify mergify bot merged commit 15153e5 into master Feb 6, 2023
@mergify mergify bot deleted the notgull/x11rb-async branch February 6, 2023 16:18
@psychon
Copy link
Owner

psychon commented Mar 11, 2023

I'll add the equivalent to RustConnection (and maybe XCBConnection) as future PRs.

What's the state of this? Would you mind if I gave this a try myself?

@notgull
Copy link
Collaborator Author

notgull commented Mar 11, 2023

I'll add the equivalent to RustConnection (and maybe XCBConnection) as future PRs.

What's the state of this? Would you mind if I gave this a try myself?

I got distracted. I can give it a go this weekend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Async/await pattern
3 participants