Deadlock detection and resolution? #26

alvestrand · 2018-01-08T19:56:18Z

If two processes execute

await navigator.locks.acquire('A', async lock => {
await navigator.locks.acquire('B')
})

and

await navigator.locks.aquire('B', async lock => {
await navigator.locks.acquire('A')
})

standard processing theory says that they will sooner or later end up in a sad state (deadlock).

can we detect this?
can we resolve this?

On the detection side, it worries me slightly that the lock snapshot state doesn't identify holders in any way; there's no way to differentiate between "X holds A and Y holds B" and "X holds A and B, Y waits for A and B".

inexorabletash · 2018-01-08T20:13:12Z

The naive answer is of course: that's a bug in the web app, which should be coded to avoid deadlocks, e.g. by always requesting multiple locks in a strict order. (See #20). But yeah, what further assistance can we provide...

As noted, even a web app that is defending against its own bugs can't tell by state inspection what is holding the locks/making the requests; logging the query() snapshot to the server may not provide enough to debug. Ideas that come to mind: (1) allowing apps to specify an arbitrary identifier with each request which is reflected in the snapshot, (2) including a the user-agent assigned id in the snapshot. I don't think we expose the latter outside of SWs at the moment, so it wouldn't correlate with much, although it would at least show that A and B are held by/requested by different contexts, thus clearly showing the deadlock.

alvestrand · 2018-01-08T20:33:57Z

I think it's inevitable that locks will be used inside libraries.

A side effect of "a lock is a name" is that the system has no idea what locks can exist - so some library can always use an internal lock that would fit in lexical order in a surprising way - so there's no way for the browser to predict what locks are going to exist, which defeats most schemes for avoiding deadlock.

What I'd like to see is:
a) a part of the spec that talks about the danger of deadlocks, and that acquisition in some hierarchical order and not holding them longer than required are both Good Practices - won't prevent them, but will at least warn the users that they should google the word "deadlock" when their app stops suddenly and mysteriously :-)
b) a part of the spec that allows diagnostics. Including a context ID in the lock listing (I'd go for an internally generated one - don't burden the simple use cases with managing an ID space, and don't let the user shoot themselves in the foot by having two contexts with the same name) seems like a good start.

inexorabletash · 2018-01-08T20:49:10Z

re: spec/deadlocks - absolutely, everything you listed is awesome. I'm signing you up to review it when written. :)

re: diagnostics - In theory, including the context's environment id would be straightforward (one more thing plumbed through from request). @domenic - is this a reasonable use of that data?

(In Blink's implementation, it looks like we currently mint the client ID as a UUID in the "service worker host" i.e. a SW-specific bit of code associated with each client (a.k.a. agent), rather than actually having something that multiple services could easily share. I.e. some refactoring would be needed, but that's not a spec problem.)

domenic · 2018-01-08T20:51:54Z

I don't know much about environment ID; I think it was added for service workers. @jungkees or maybe @annevk would be able to help understand that better. Although I guess you figured most of it out by digging through the Blink code :)

For #26

inexorabletash · 2018-01-08T23:38:56Z

Took a stab at adding clientId to the snapshot in the proto-spec.

I have an implementation/tests in blink (up for review shortly) that adds the property, although it mints the ids on its own (like serviceworker).

inexorabletash · 2018-01-11T23:25:29Z

Closing this for now - I added a section to the explainer (README) about deadlocks; everything in the explainer should end up in the eventual spec so that'll be the seed.

Thanks @alvestrand !

The proposed API includes a query() method which returns a snapshot of the state of the lock manager for an origin - which locks are held, and which locks are requested. This can be used by developers to debug their web applications, either locally or in the field. This CL introduces a clientId field giving a unique identifier for each execution context, giving more context about the system. For example, this can be used to determine that a deadlock has been introduced. NOTE: The identifier should be the "environment's id" in spec language, and thus the same clientId exposed in Service Workers for client enumeration and fetch events, but currently these id namespaces are not unified. Proposal: https://github.com/inexorabletash/web-locks/ Discussion: w3c/web-locks#26 Bug: 161072 Change-Id: I56df39acd7ca12d6c0062c3efcd21d97c5992770 Reviewed-on: https://chromium-review.googlesource.com/855576 Reviewed-by: Daniel Cheng <dcheng@chromium.org> Reviewed-by: Victor Costan <pwnall@chromium.org> Commit-Queue: Joshua Bell <jsbell@chromium.org> Cr-Commit-Position: refs/heads/master@{#528797}

inexorabletash added question spec labels Jan 8, 2018

inexorabletash added a commit that referenced this issue Jan 8, 2018

Add clientId property to snapshot state

602fda4

For #26

inexorabletash closed this as completed Jan 11, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deadlock detection and resolution? #26

Deadlock detection and resolution? #26

alvestrand commented Jan 8, 2018

inexorabletash commented Jan 8, 2018

alvestrand commented Jan 8, 2018

inexorabletash commented Jan 8, 2018

domenic commented Jan 8, 2018

inexorabletash commented Jan 8, 2018

inexorabletash commented Jan 11, 2018

Deadlock detection and resolution? #26

Deadlock detection and resolution? #26

Comments

alvestrand commented Jan 8, 2018

inexorabletash commented Jan 8, 2018

alvestrand commented Jan 8, 2018

inexorabletash commented Jan 8, 2018

domenic commented Jan 8, 2018

inexorabletash commented Jan 8, 2018

inexorabletash commented Jan 11, 2018