Reasons for asynchronous Pillar refresh and behavior of modules/saltutil.sync_all #67312
Replies: 8 comments
-
looks like you are indeed correct. the
Looks like this is currently possible for I do want to help you dive into your actual issue: "For some reason we observe from time to time that the pillar or grain data is not accurate and as expected. " Can you elaborate on this a bit more? Can you include a use case or logs when this occurs? |
Beta Was this translation helpful? Give feedback.
-
Hi, thanks very much for the details! I am still working to isolate the issue further. Imho it would indeed help to have the refresh_pillar synchronously available. It's a bit challenging due to the amount of available saltenv, high amount of calls and number of minions we have here to find the relevant logs. One piece I still don't get is e.g. which component is finally really processing/rendering the pillar. Is it the master and the minion (and both ideally compute the (eventually) same data)? I am asking this to get a better catch on what to look for. We are only at the beginning of our analysis at the moment. Basically the issue is that from time to time we see that minions either lack pillar information. At the same time we also see that the minion synchronizes environments it should not know about at all. That may be a different issue or not. Most of the time everything works as expected, sometimes not. I know I am very unspecific at the moment, sorry for that. |
Beta Was this translation helpful? Give feedback.
-
I believe we found the causes for our issues. It is my understanding now that the salt-master only compiles the pillar using grain and configuration information of the minion. Basically our issue was a combination of pillar merge strategies not fitting our use case. Still, I think it would be helpful to have the refresh_pillar in some way synchronous. It would already help if a event would be fired on the master bus so a runner could catch it. This event could then be used by a runner (using The event would need to be fired either in What would be the chances a PR in that direction would be approved? |
Beta Was this translation helpful? Give feedback.
-
without diving into the code I believe that would be the area. ping @saltstack/team-core might be able to show you exactly where. I will go ahead and approve this as a feature request :) |
Beta Was this translation helpful? Give feedback.
-
you do not need the runner. What you should do is exactly what the refresh_modules does when it asyncronous is False. https://github.com/saltstack/salt/blob/2018.3/salt/modules/saltutil.py#L979
You need to make sure to listen to the minion event bus, then you fire an event to refresh pillar, exactly like it does now, and then you listen for the event for when the pillars are completed refresh. Make sure to initialize the event listener first, because if it returns to quickly you might miss the event if you try to initialize the listener after the refresh event has been fired. |
Beta Was this translation helpful? Give feedback.
-
You will also need to add fireing the event, probably from the _gather_pillar() function in salt.minion, the same way that the minion_mods loader fires an event when notify is set to True. https://github.com/saltstack/salt/blob/2018.3/salt/loader.py#L263 |
Beta Was this translation helpful? Give feedback.
-
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue. |
Beta Was this translation helpful? Give feedback.
-
Thank you for updating this issue. It is no longer marked as stale. |
Beta Was this translation helpful? Give feedback.
-
Hi,
I am currently attempting to understand issues I see in regards to not-up-to date pillar in combination with the use of saltenv/pillarenv. Currently I am not in the position to file a valid defect due to lack of understanding of how and why the code behaves (by purpose).
Our setup is that we have all runners, states, modules, pillar and grain data in a git repository and highly rely on the saltenv/pillarenv feature to segregate different versions of the code base. As we highly rely on the saltenv feature (salt master configuration
pillarenv_from_saltenv: True
) we perform in front of any highstate a call saltutil.sync_all(refresh=True, saltenv=some_value) to ensure the highstate is using the expected custom modules, grains, pillar etc, available in the mentioned saltenv.For some reason we observe from time to time that the pillar or grain data is not accurate and as expected. When reviewing the code of the saltutil.sync_all method I got some questions. The method includes this code
This means the
refresh_modules
call happens asynchronous and the method immediately returns before starting therefresh_pillar
. Is this good? Wouldn't it make sense to be sure the modules and grains are refreshed before refreshing the pillar?I am not sure what the methods are exactly doing but I can imagine a situation in which Jinja code is rendering pillar data using custom module code or grains. If this module code is not confirmed to be current, pillar data might be not accurate.
Is the refresh_modules() call really needed here? Isn't
refresh_modules
a part ofrefresh_pillar
already?My other question is: Why is there no way to refresh pillar in a synchronous way? Currently we workaround this by adding a time out of some seconds but consider this only a hack which - even worse - still does not reliably work for some reasons we do not yet understand. This might be also caused by Jinja magic performing delaying calls but we are not yet sure about it.
Any input which may increase the understanding is highly appreciated.
The currently used Salt version is 2017.7.5
Beta Was this translation helpful? Give feedback.
All reactions