[BUG] Salt fails to render pillar but master logs are OK #59339

Closed
BrianSidebotham opened this issue Jan 21, 2021 · 4 comments · Fixed by #61189

@BrianSidebotham
Contributor

Description
When using state.sls to apply state to a minion from the master, the response says:

Pillar failed to render with the following messages:

Rendering SLS 'roles.atlassian-postgresdb' failed.

Please see master log for details.

However, the pillar data is rendered correctly on the master (as the master logs show), so the minion appears to be reporting a pillar render failure incorrectly.

The logs from the master and minion are below. They offer no real clues; the next step would be to read the salt.state code, which I have not yet done.

Master logs (in debug)

2021-01-21 16:20:55,037 [salt.template    :31  ][PROFILE ][14235] Time (in seconds) to render '/srv/salt/pillar/roles/atlassian-postgresdb.sls' using 'jinja' renderer: 6.497445106506348
2021-01-21 16:20:55,038 [salt.template    :122 ][DEBUG   ][14235] Rendered data from file: /srv/salt/pillar/roles/atlassian-postgresdb.sls:

SOME VALID YAML RENDERING (CHECKED WITH YAMLLINT)

2021-01-21 16:20:55,046 [salt.loaded.int.render.yaml:85  ][DEBUG   ][14235] Results of YAML rendering: 
OrderedDict(...)
2021-01-21 16:20:55,048 [salt.template    :31  ][PROFILE ][14235] Time (in seconds) to render '/srv/salt/pillar/roles/atlassian-postgresdb.sls' using 'yaml' renderer: 0.008826017379760742
2021-01-21 16:20:55,053 [salt.utils.event :752 ][DEBUG   ][14235] Sending event: tag = minion/refresh/server1.ex.ample.com; data = {'Minion data cache refresh': 'server1.ex.ample.com', '_stamp': '2021-01-21T16:20:55.053677'}
2021-01-21 16:20:55,057 [salt.crypt       :215 ][DEBUG   ][14235] salt.crypt.get_rsa_pub_key: Loading public key
2021-01-21 16:20:55,313 [salt.utils.job   :88  ][INFO    ][14231] Got return from server1.ex.ample.com for job 20210121162047724717
2021-01-21 16:20:55,314 [salt.utils.event :752 ][DEBUG   ][14231] Sending event: tag = salt/job/20210121162047724717/ret/server1.ex.ample.com; data = {'cmd': '_return', 'id': 'server1.ex.ample.com', 'success': False, 'return': ['Pillar failed to render with the following messages:', "Rendering SLS 'roles.atlassian-postgresdb' failed. Please see master log for details."], 'retcode': 5, 'jid': '20210121162047724717', 'fun': 'state.sls', 'fun_args': ['postgres'], 'out': 'highstate', '_stamp': '2021-01-21T16:20:55.314247'}
2021-01-21 16:20:55,317 [salt.client      :1164][DEBUG   ][19826] jid 20210121162047724717 return from server1.ex.ample.com
2021-01-21 16:20:55,317 [salt.client      :1605][DEBUG   ][19826] return event: {'server1.ex.ample.com': {'ret': ['Pillar failed to render with the following messages:', "Rendering SLS 'roles.atlassian-postgresdb' failed. Please see master log for details."], 'out': 'highstate', 'retcode': 5, 'jid': '20210121162047724717'}}

Minion logs (in debug)

2021-01-21 16:20:55,088 [salt.state       :770 ][DEBUG   ][4852] Finished gathering pillar data for state run
2021-01-21 16:20:55,089 [salt.state       :1144][INFO    ][4852] Loading fresh modules for state activity
2021-01-21 16:20:55,296 [salt.utils.lazy  :102 ][DEBUG   ][4852] LazyLoaded jinja.render
2021-01-21 16:20:55,297 [salt.utils.lazy  :102 ][DEBUG   ][4852] LazyLoaded yaml.render
2021-01-21 16:20:55,302 [salt.minion      :896 ][DEBUG   ][4852] Minion return retry timer set to 10 seconds (randomized)
2021-01-21 16:20:55,303 [salt.minion      :2150][INFO    ][4852] Returning information for job: 20210121162047724717
2021-01-21 16:20:55,304 [salt.transport.zeromq:165 ][DEBUG   ][4852] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'server1.ex.ample.com', 'tcp://10.3.250.22:4506', 'aes')
2021-01-21 16:20:55,304 [salt.crypt       :495 ][DEBUG   ][4852] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'server1.ex.ample.com', 'tcp://10.3.250.22:4506')
2021-01-21 16:20:55,304 [salt.transport.zeromq:264 ][DEBUG   ][4852] Connecting the Minion to the Master URI (for the return server): tcp://10.3.250.22:4506
2021-01-21 16:20:55,305 [salt.transport.zeromq:1302][DEBUG   ][4852] Trying to connect to: tcp://10.3.250.22:4506
2021-01-21 16:20:55,325 [salt.transport.zeromq:291 ][DEBUG   ][4852] Closing AsyncZeroMQReqChannel instance
2021-01-21 16:20:55,326 [salt.minion      :2017][DEBUG   ][4852] minion return: {'success': False, 'return': ['Pillar failed to render with the following messages:', "Rendering SLS 'roles.atlassian-postgresdb' failed. Please see master log for details."], 'retcode': 5, 'jid': '20210121162047724717', 'fun': 'state.sls', 'fun_args': ['postgres']}
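To line up master and minion entries like the ones above, it can help to split each line into fields and compare timestamps. The following is a minimal sketch, not part of Salt: the regular expression is inferred only from the layout of the lines quoted in this issue, the helper names `parse_line` and `jid_to_datetime` are hypothetical, and decoding the JID assumes the default timestamp-based JID format (date, time, and microseconds).

```python
import re
from datetime import datetime

# Layout inferred from the excerpts in this issue (an assumption, not a spec):
# "<date> <time>,<ms> [<logger>   :<line>][<LEVEL>  ][<pid>] <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<logger>[^\s:\]]+)\s*:(?P<lineno>\d+)\s*\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<pid>\d+)\] "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None  # continuation lines such as rendered YAML land here
    record = match.groupdict()
    record["ts"] = datetime.strptime(record["ts"], "%Y-%m-%d %H:%M:%S,%f")
    record["lineno"] = int(record["lineno"])
    record["pid"] = int(record["pid"])
    return record


def jid_to_datetime(jid):
    """Decode a JID, assuming it is a YYYYMMDDhhmmss + microseconds timestamp."""
    return datetime.strptime(jid, "%Y%m%d%H%M%S%f")


# Line copied from the master log above.
master_return = parse_line(
    "2021-01-21 16:20:55,313 [salt.utils.job   :88  ][INFO    ][14231] "
    "Got return from server1.ex.ample.com for job 20210121162047724717"
)
jid = master_return["message"].rsplit(" ", 1)[-1]
elapsed = (master_return["ts"] - jid_to_datetime(jid)).total_seconds()
print(f"job {jid} returned after {elapsed:.3f}s")
```

For the lines quoted here this gives roughly 7.6 seconds between job creation and the master logging the return, of which the Jinja render of the pillar file accounts for about 6.5 seconds according to the PROFILE entry.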

Setup
If the already-rendered pillar data is used in place of the Jinja SLS file that produces the YAML, the minion has no problem. The pillar is only reported as failing to render when the YAML rendering involves a lot of HashiCorp Vault activity.

Steps to Reproduce the behavior
This happens on a server with a lot of pillar and state data defined. Running state.sls on the sls file results in this odd behaviour.

Expected behavior
A failure message in either the minion or master log that points to the reason for failure.
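Since neither log explains the failure, the only signal available in this report is the job return itself: `success` is `False`, `retcode` is 5, and the message list begins with "Pillar failed to render". A minimal sketch for flagging such returns when post-processing job data follows; the helper names are hypothetical, and the matching relies only on the message text quoted in this issue, which may differ in other Salt versions.

```python
import re

# Strings taken from the return payload quoted in this issue.
PILLAR_FAILURE_PREFIX = "Pillar failed to render"
FAILED_SLS = re.compile(r"Rendering SLS '([^']+)' failed")


def is_pillar_render_failure(ret):
    """Return True if a job return looks like the pillar failure shown above."""
    messages = ret.get("return")
    if ret.get("success", True) or not isinstance(messages, list):
        return False
    return any(
        isinstance(msg, str) and msg.startswith(PILLAR_FAILURE_PREFIX)
        for msg in messages
    )


def failed_sls_names(ret):
    """Collect the SLS names mentioned in 'Rendering SLS ... failed' messages."""
    names = []
    for msg in ret.get("return") or []:
        if isinstance(msg, str):
            names.extend(FAILED_SLS.findall(msg))
    return names


# Payload copied from the minion log in this issue.
minion_return = {
    "success": False,
    "return": [
        "Pillar failed to render with the following messages:",
        "Rendering SLS 'roles.atlassian-postgresdb' failed. "
        "Please see master log for details.",
    ],
    "retcode": 5,
    "jid": "20210121162047724717",
    "fun": "state.sls",
    "fun_args": ["postgres"],
}

if is_pillar_render_failure(minion_return):
    print("pillar render failure in:", failed_sls_names(minion_return))
```

This only identifies which SLS file the minion blames; it does not reveal why the render was reported as failed, which is the gap this issue describes.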

Versions Report

salt --versions-report

Minion report:

(1:44)# salt-minion --versions-report
Salt Version:
          Salt: 3002.2
 
Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: Not Installed
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.11.1
       libgit2: Not Installed
      M2Crypto: 0.35.2
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: Not Installed
  pycryptodome: Not Installed
        pygit2: Not Installed
        Python: 3.6.8 (default, Nov 16 2020, 16:55:22)
  python-gnupg: Not Installed
        PyYAML: 3.13
         PyZMQ: 17.0.0
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.1.4
 
System Versions:
          dist: centos 7 Core
        locale: UTF-8
       machine: x86_64
       release: 3.10.0-1160.11.1.el7.x86_64
        system: Linux
       version: CentOS Linux 7 Core

Master report:

(1:129)# salt --versions-report
Salt Version:
          Salt: 3002.2
 
Dependency Versions:
          cffi: Not Installed
      cherrypy: unknown
      dateutil: Not Installed
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.11.1
       libgit2: Not Installed
      M2Crypto: 0.35.2
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: Not Installed
  pycryptodome: Not Installed
        pygit2: Not Installed
        Python: 3.6.8 (default, Nov 16 2020, 16:55:22)
  python-gnupg: Not Installed
        PyYAML: 3.13
         PyZMQ: 17.0.0
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.1.4
 
System Versions:
          dist: centos 7 Core
        locale: UTF-8
       machine: x86_64
       release: 3.10.0-1160.11.1.el7.x86_64
        system: Linux
       version: CentOS Linux 7 Core
@BrianSidebotham BrianSidebotham added the Bug broken, incorrect, or confusing behavior label Jan 21, 2021
@garethgreenaway garethgreenaway added Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged and removed needs-triage labels Jan 25, 2021
@garethgreenaway garethgreenaway added this to the Blocked milestone Jan 25, 2021
@garethgreenaway
Contributor

@BrianSidebotham Thanks for the report. To clarify, are you storing pillar data in Vault? Or using it for something else? Thanks.

@BrianSidebotham
Contributor Author

BrianSidebotham commented Jan 26, 2021

@garethgreenaway Secrets are stored in Vault, and the pillar contains salt['vault.read_secret']() calls that render those secrets. We could not use sdb:// for Vault because of a bug in 3001 with Vault's sdb driver, although that may well be fixed in 3002.

Sorry, I cannot provide a more complete example that reproduces the error.

@sagetherage sagetherage added needs-triage and removed Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged labels May 4, 2021
@sagetherage sagetherage removed this from the Blocked milestone May 4, 2021
@garethgreenaway
Contributor

@BrianSidebotham Apologies for the delay on this one. I did notice some issues with Salt and later versions of Vault. What version of Vault were you using when you ran into this?

@BrianSidebotham
Contributor Author

@garethgreenaway Vault was at v1.6.0 👍

@sagetherage sagetherage added this to the Approved milestone May 11, 2021
@sagetherage sagetherage added severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around Vault labels Jun 15, 2021