-
Notifications
You must be signed in to change notification settings - Fork 302
fleetctl: check systemd active state via client API #1667
fleetctl: check systemd active state via client API #1667
Conversation
839d97f
to
0cb4369
Compare
3b60e93
to
875d938
Compare
I don't think it's ever really sane to fall back to D-Bus, is it? |
@jonboulle Good question. |
bdc94e8
to
fa5aa3a
Compare
@dongsupark could you link to an example of test failures in that case? |
@jonboulle |
ffecd1e
to
ee5585d
Compare
Finally I fixed a bug, and force-pushed. It turns out, checking states via client API has been already working correctly, and "fleetctl list-units" has actually shown correct output. The only broken thing was the way of using loop around Please have a look. Thanks. |
4712b14
to
88a0a30
Compare
@@ -1067,78 +1072,87 @@ func assertUnitState(name string, js job.JobState, out io.Writer) (ret bool) { | |||
// tryWaitForSystemdActiveState tries to wait for systemd units to reach an | |||
// active state, making use of the systemd's DBUS interface. First it takes one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This docstring needs updating now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonboulle Done. Thanks for spotting it.
88a0a30
to
6a64a4e
Compare
|
||
// getSingleUnitState returns a single uState of type suState, which consists | ||
// of necessary systemd states, only for a given unit name. | ||
func getSingleUnitState(unitName string) (suState, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit concerned about the efficiency of this - UnitStates()
is a relatively expensive call, and we're only concerned about a single item. It's actually a little worse because we're potentially doing this in multiple goroutines simultaneously, per waitForUnitStates
.
Two suggestions:
- introduce a
UnitState.Get
call (similar to how, for units themselves, we have bothUnit.Get
andUnit.List
)
or, - use a single
UnitState.List
call and share the results across all goroutines (in waitForUnitStates)
thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonboulle
First of all, getSingleUnitState()
is being called only by waitForSystemdActiveState(), not by waitForUnitStates(). So it's not true that getSingleUnitState()
is called from multiple goroutines, because waitForSystemdActiveState()
simply iterates over units in a sequential way.
On the other hand, your concern is still valid, because it's basically not efficient to call cAPI.UnitStates()
for each unit. I think I'm going to choose the 2nd option, which is calling only a single UnitState.List
to share its results for all units. Let me see.
Thanks for pointing it out! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dongsupark ah! good point, sorry about that. But now you've pointed that out - wouldn't it make more sense to unify the behaviour of the two so that they're both parallelised? :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonboulle
I pushed a new commit to call cAPI.UnitStates()
only once. Please have a look.
I actually tried to also introduce cAPI.UnitState()
, to allow this code to fetch a single unit status. However, that's definitely a lot of work all over the place in fleet. (schema, etcdRegistry, rpcRegistry, etc) So I'll do that in the future, creating another PR.
As for using goroutines also for waitForSystemdActiveState(), I'll try to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonboulle Force-pushed the commit, after changing the logic from sequential iteration to goroutines.
d7da0fc
to
d919ee8
Compare
d919ee8
to
32a88e2
Compare
func checkSystemdActiveState(apiStates []*schema.UnitState, name string, maxAttempts int, wg *sync.WaitGroup, errchan chan error) { | ||
defer wg.Done() | ||
|
||
if maxAttempts < 1 { | ||
for { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not particularly fond of the duplication of these loop bodies - can we instead do something more clever with the loop conditional to collapse them into a single for
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonboulle Good point. I just split out the common part into assertFetchSystemdActiveState()
, and let checkSystemdActiveState()
go into a single loop. Such duplication was just a result of copy-and-paste.
Looks pretty good overall |
32a88e2
to
7ce7460
Compare
To be as idiomatic as possible, do so: * make tryWaitForSystemdActiveState() return error instead of int * pass getBlockAttempts(cCmd) to tryWaitForSystemdActiveState() to handle such cases of --no-block or --block-attempts. * make assertSystemdActiveState(), waitForSystemdActiveState() return just error, not with 'found', as it's unnecessary. * remove unnecessary checking against conn == nil. * compare strings with '!=' instead of strings.Compare() != 0
Now that tryWaitForSystemdActiveState() returns error instead of int, tryWaitForUnitStates() should also return error instead of int, to be as idiomatic as possible.
It's not appropriate to check systemd active states with DBUS connection, as that will return correct states only on the same machine. So we should introduce a check using cAPI.UnitStates(), to check states on other machines.
7ce7460
to
413f6e3
Compare
@jonboulle Now I suppose everything was addressed. I'll merge it soon. |
413f6e3
to
988fd84
Compare
As it's basically not efficient to call cAPI.UnitStates() for each unit, let's call it only once outside goroutines for processing units, i.e. calling in waitForSystemdActiveState(), not in getSingleUnitState(). To do that, we need to share the result apiStates across all units. Also make use of goroutines in waitForSystemdActiveState(), instead of sequential iteration over each unit. Note that we still need to keep calling UnitStates() when assertSystemdActiveState() failed, because in the next attempt the old apiState is not necessarily going to be valid. Ideally in that case, we should be able to fetch the state only for a single unit. However, we cannot do that for now, because cAPI.UnitState() is not available. In the future we would need to implement cAPI.UnitState() and all dependendent parts all over the tree in fleet, e.g. schema, etcdRegistry, rpcRegistry, etc. to replace UnitStates() in this place with the new method UnitState(). In practice, calling UnitStates() here is not as badly inefficient as it looks, because it will be anyway rarely called only when the assertion failed.
It's not appropriate to check systemd active states with DBUS connection, as that will return correct states only on the same machine. So we should also introduce a check using
cAPI.UnitStates()
, to check states on other machines.