
Add thread locking to TEDAPI #148

Open · Nexarian wants to merge 3 commits into main from add_api_lock_to_tedapi

Conversation

@Nexarian (Contributor) commented Feb 23, 2025

Summary

  • Using a dictionary with an API lock is not thread safe. Given that this is frequently called from the proxy, a more thread-safe option is appropriate, especially because I am in the process of modifying the proxy to call multiple gateways.
  • Most of the "changes" are simply existing lines re-indented under the `with` statement that the API lock uses (see the sketch below for the general pattern).
  • Update some typing.
  • Switch to enumerate instead of manually incrementing indices.
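For context, a minimal sketch of the `with`-based locking pattern (illustrative names only, not the exact code in this PR):

```python
import threading

class TEDAPI:
    def __init__(self):
        # One lock guarding calls to the gateway API.
        self.api_lock = threading.Lock()

    def get_status(self):
        # The body of each API call is simply re-indented under the
        # "with" block, so the lock is always released, even on errors.
        with self.api_lock:
            return self._send_request("status")

    def _send_request(self, endpoint):
        # Placeholder for the real request logic.
        raise NotImplementedError
```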

Testing

  • Using my Raspberry Pi. This has been working well for about a week.

@Nexarian force-pushed the add_api_lock_to_tedapi branch 2 times, most recently from 24dd7f5 to f925a6c on February 24, 2025 at 00:50
@jasonacox (Owner) commented

Thanks @Nexarian - I'll start testing. Out of curiosity, did you see any thread locking issues as a result of the current design? It would be interesting to correlate this to issues people see (if they do).

@Nexarian (Contributor, Author) commented

The origin of this fix was the race condition we were seeing with the animation. That now looks very clearly like something to do with JavaScript contention and browsers, but I tried this as part of debugging it.

Dictionaries are not thread-safe, and any thread safety you see is incidental to the implementation and Python version, so this is a correct fix, though not the only one. You could argue that exponential backoff and jitter aren't necessary here, and my use of a contextmanager that forces a with-indent might be over-engineered. I'll let you all think about it.

But as best I can tell, this works with no ill side effects and makes the code cleaner. The `with` block guarantees the lock is always released, so we don't have to think about it manually anymore.
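For reference, the contextmanager-with-backoff idea being described might look roughly like this; a sketch only, with made-up names, not the PR's actual implementation:

```python
import random
import threading
import time
from contextlib import contextmanager

@contextmanager
def api_lock_with_backoff(lock: threading.Lock, attempts: int = 5,
                          base_delay: float = 0.1):
    """Acquire the lock with exponential backoff and jitter, then release it."""
    for attempt in range(attempts):
        if lock.acquire(blocking=False):
            try:
                yield
            finally:
                lock.release()
            return
        # Back off exponentially, with jitter to avoid thundering herds.
        time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    raise TimeoutError("could not acquire API lock")
```

Callers then wrap the request body in `with api_lock_with_backoff(self.api_lock): ...`, which is the forced with-indent mentioned above.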

@Nexarian (Contributor, Author) commented

I just realized this has a problem: I said dictionaries aren't thread safe and shouldn't be used for this, yet I'm still doing exactly that here. Using defaultdict doesn't make it thread-safe. I think we need another way to store the locks... I'll think about it.
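To make the concern concrete, this is the classic check-then-act race that an unguarded dictionary of locks can hit (illustrative code, not from this PR); defaultdict hides the check-and-insert but doesn't document it as atomic:

```python
import threading

locks = {}  # endpoint name -> threading.Lock

def get_lock(endpoint):
    # Two threads can both see the key as missing, both create a lock,
    # and end up using *different* lock objects for the same endpoint.
    if endpoint not in locks:
        locks[endpoint] = threading.Lock()
    return locks[endpoint]
```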

@mccahan (Contributor) commented Feb 24, 2025

> The origin of this fix was the race condition we were seeing with the animation. That now looks very clearly like something to do with JavaScript contention and browsers, but I tried this as part of debugging it.

This is relevant to my interests. I was having difficulty with the animation JS absolutely hammering the endpoint, because it fires the HTTP request every X seconds regardless of whether one is already running, and the minified code was tedious to edit.

I've started rebuilding my own version of it here, without depending on the existing (minified) code: https://github.com/mccahan/power-flow . It uses TanStack Query to make sure we're only running one HTTP request to pypowerwall at a time, and it is a bit more tolerant of some error situations.

@Nexarian (Contributor, Author) commented

@mccahan 😮 That's awesome! Did you run the code through some sort of de-minifier or are you rebuilding from scratch?

@mccahan (Contributor) commented Feb 24, 2025

Started from scratch. The existing code seems to be a whole technician setup app and everything, so it didn't seem worth sorting through.

I'm torn on which way to go with it API-wise. Because of the issues I've had with timeouts and stalls, I have a local copy of pypowerwall modified to add a websocket endpoint so it can broadcast whenever it gets new load/battery numbers, to try to get around some of that. This PR might change that plan a bit, though.

@Nexarian (Contributor, Author) commented

@mccahan Feel free to test it and let me know! I still need to fix the dictionary thing, but overall it should be an improvement in locking capability.

@jasonacox (Owner) commented

> I just realized this has a problem: I said dictionaries aren't thread safe and shouldn't be used for this, yet I'm still doing exactly that here. Using defaultdict doesn't make it thread-safe. I think we need another way to store the locks... I'll think about it.

Ok, no rush. Let me know when it is ready to test. 🙏

@Nexarian (Contributor, Author) commented

@jasonacox Should be ready now. There's perhaps more cleanup/hygiene to do here, but the functional aspect should be fine.

I was able to add a "thread safe" dictionary with minimal additional refactoring.
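The ApiLockPool in this branch may differ in detail, but the usual way to make such a pool thread safe is to guard the dictionary with its own lock; a minimal sketch:

```python
import threading

class ApiLockPool:
    """Hand out one lock per key, creating new locks under a guard lock."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def get(self, key):
        with self._guard:
            # Check-and-create is now atomic with respect to other callers.
            if key not in self._locks:
                self._locks[key] = threading.Lock()
            return self._locks[key]
```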

- Using a dictionary with an API lock is not thread safe. Given that this is frequently called from the proxy, a more thread-safe option is appropriate, especially because I am in the process of modifying the proxy to call multiple gateways.
- Use a dictionary (ApiLockPool) that IS thread safe.
- Update some typing.
- Switch to enumerate instead of manually incrementing indices.
@Nexarian force-pushed the add_api_lock_to_tedapi branch from f43a5d7 to c1f639a on February 27, 2025 at 04:34
@mccahan (Contributor) commented Feb 27, 2025

@Nexarian I haven't tested the version from 2 hours ago, but with the previous version I was still having an issue with request spamming. I didn't do a lot of investigation, though I do wonder if having a separate lock for every kind of request is an issue.

This is a work-in-progress, but I do have part of my suggestion on the matter here: https://github.com/jasonacox/pypowerwall/compare/main...mccahan:pypowerwall:feature/queue-tedapi-requests?expand=1

  • Uses a queue and a worker thread to make sure we're running one query at a time (see the sketch after this list for the general shape)
  • Moves a lot of the duplicated lock/cache code into that worker thread (which should be thread safe, because the cache belongs to that thread)
  • Since it's a queue, if multiple requests are waiting on the same query, they are all served the same result when the first one resolves (within cache lifetimes)
  • Should make it a bit easier to handle multiple Powerwalls if that's your goal, since you could have a separate queue thread per PW
  • Refactors a lot of the protobuf handling into separate "Message" classes to clean up some of the mess in that massive file
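A rough sketch of that queue/worker shape (the names and the simplified result hand-off are illustrative, not the code in the linked branch):

```python
import queue
import threading

class RequestWorker:
    """Serialize gateway queries onto one thread and hand results back."""

    def __init__(self, fetch):
        self._fetch = fetch                  # callable that talks to the gateway
        self._queue = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def request(self, key):
        done = threading.Event()
        box = {}
        self._queue.put((key, box, done))
        done.wait()
        return box["result"]

    def _run(self):
        while True:
            key, box, done = self._queue.get()
            # One query at a time; duplicate keys waiting in the queue could
            # be detected here and handed the same cached response.
            box["result"] = self._fetch(key)
            done.set()
```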

@Nexarian (Contributor, Author) commented

@mccahan The scope of this PR is very narrow: the way API locks were implemented was not correct or thread safe, and that's all I intend to fix here. I had mentioned in passing that I found this while investigating the behavior of the Tesla animation, but this PR wasn't directly intended to address anything there.

I don't fully understand what "issues with request spamming" means, and this PR wouldn't be intended to fix that anyway. I recommend opening a separate issue describing the exact symptoms you're seeing.

I'm also not entirely convinced that making pypowerwall/proxies query the inverter strictly serially is necessary. The code already has provisions to back off when we're getting timeouts from the gateways, and without more concrete data about how they scale, I believe this to be premature optimization.

Nexarian added 2 commits March 5, 2025 00:23
Rather than store the lock in a dictionary, which at best relies on the Global Interpreter Lock to fetch a value (and dictionary reads are not guaranteed to be thread safe), store the lock as an attribute on the function, which should never have access issues. This also removes a "global" parameter from the object.
@Nexarian (Contributor, Author) commented Mar 5, 2025

@jasonacox I updated the way the threading.Lock is stored. I think this is better... It uses a decorator on the functions that need API locking. I originally wanted to use an array, but this is more "automatic." After a lot of thought, dictionaries aren't great for storing these: they rely on the Global Interpreter Lock for reads and aren't guaranteed thread safe anyway.

This way, the lock is defined once, has thread-safe access, is defined generically, and is easy to reach from the wrapped function.
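Roughly, the decorator approach reads like this (a sketch of the idea; the actual decorator in the PR may differ):

```python
import threading
from functools import wraps

def uses_api_lock(func):
    """Create one lock per decorated function and hold it for each call."""
    func_lock = threading.Lock()

    @wraps(func)
    def wrapper(*args, **kwargs):
        # No shared dictionary to look the lock up in: it lives here,
        # created exactly once when the function is decorated.
        with func_lock:
            return func(*args, **kwargs)

    wrapper.api_lock = func_lock  # also exposed as a function attribute
    return wrapper

class TEDAPI:
    @uses_api_lock
    def get_config(self):
        ...  # real request logic goes here
```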
