Provisioning Beaker machines w/count > 1 causes XML RPC errors #1693

Closed
Dannyb48 opened this issue Mar 16, 2020 · 2 comments · Fixed by #1694

@Dannyb48
Contributor

Describe the bug
On March 11th, Carbon was provisioning Beaker systems with a count of 2, and both creating and destroying them worked fine. Since then, provisioning has been failing on destroy. The error from Beaker is the following:

xmlrpclib.Fault: <Fault 1: "<class 'sqlalchemy.exc.OperationalError'>:(OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') 'INSERT INTO job_activity (id, job_id) VALUES (%s, %s)' (92678998L, 4138086L)">

The issue seems to be cosmetic, though: when I look up the Beaker job ID, the job was indeed cancelled. So it looks like some type of race condition.

I was able to reproduce this outside of Carbon with the following PinFile and a count greater than 1. If I use the same PinFile and specify a count of 1, there are no issues:

---
beaker-test:
  topology:
    resource_groups:
    - resource_definitions:
      - job_group: ci-ops-central
        recipesets:
        - arch: x86_64
          count: 2
          distro: RHEL-7.5
          hostrequires:
          - op: '='
            tag: pool
            value: ci-ops-central-qe
          - op: '>'
            tag: memory
            value: 15000
          - op: <
            tag: memory
            value: 400000
          - op: '>'
            tag: cpu_count
            value: 2
          - op: <
            tag: cpu_count
            value: 13
          name: carbon-beaker-node
          ssh_key_file:
          - demo.pub
          variant: Server
        role: bkr_server
        ssh_keys_path: /home/dbaez/projects/carbon-py3/carbon_include_scenario_example/keys
        whiteboard: Danny_Test
      resource_group_name: carbon
      resource_group_type: beaker
    topology_name: carbon

STACKTRACE

Traceback (most recent call last):
  File "/home/dbaez/.ansible/tmp/ansible-tmp-1584371936.9-84960481172744/AnsiballZ_bkr_server.py", line 102, in <module>
    _ansiballz_main()
  File "/home/dbaez/.ansible/tmp/ansible-tmp-1584371936.9-84960481172744/AnsiballZ_bkr_server.py", line 94, in _ansiballz_main
    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)
  File "/home/dbaez/.ansible/tmp/ansible-tmp-1584371936.9-84960481172744/AnsiballZ_bkr_server.py", line 40, in invoke_module
    runpy.run_module(mod_name='ansible.modules.bkr_server', init_globals=None, run_name='__main__', alter_sys=False)
  File "/usr/lib64/python2.7/runpy.py", line 192, in run_module
    fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/tmp/ansible_bkr_server_payload_rzY4Xe/ansible_bkr_server_payload.zip/ansible/modules/bkr_server.py", line 333, in <module>
  File "/tmp/ansible_bkr_server_payload_rzY4Xe/ansible_bkr_server_payload.zip/ansible/modules/bkr_server.py", line 329, in main
  File "/tmp/ansible_bkr_server_payload_rzY4Xe/ansible_bkr_server_payload.zip/ansible/modules/bkr_server.py", line 268, in cancel_jobs
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1243, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1602, in __request
    verbose=self.__verbose
  File "/home/dbaez/.virtualenvs/linchpin/lib/python2.7/site-packages/bkr/common/xmlrpc2.py", line 478, in request
    result = transport_class.request(self, *args, **kwargs)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1283, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/home/dbaez/.virtualenvs/linchpin/lib/python2.7/site-packages/bkr/common/xmlrpc2.py", line 386, in _single_request
    return self.parse_response(response)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1493, in parse_response
    return u.close()
  File "/usr/lib64/python2.7/xmlrpclib.py", line 800, in close
    raise Fault(**self._stack[0])
xmlrpclib.Fault: <Fault 1: "<class 'sqlalchemy.exc.OperationalError'>:(OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') 'INSERT INTO job_activity (id, job_id) VALUES (%s, %s)' (92678998L, 4138086L)">
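
The fault string above says "try restarting transaction", so one client-side workaround is to retry the call when that specific fault comes back. The following is only a minimal sketch under stated assumptions, written for Python 3 (`xmlrpc.client` rather than `xmlrpclib`); it is not the fix that was merged for this issue. The helper names and the `retries` and `delay` parameters are hypothetical and are not part of linchpin or the Beaker client.

```python
import time
import xmlrpc.client


def is_deadlock_fault(fault):
    """Return True if the XML-RPC fault text mentions a database deadlock."""
    return "Deadlock found" in str(fault.faultString)


def call_with_deadlock_retry(func, *args, retries=3, delay=1.0):
    """Call func(*args), retrying only when the server reports a deadlock.

    Hypothetical helper: `retries` is the total number of attempts and
    `delay` is the base wait in seconds, multiplied by the attempt number.
    """
    for attempt in range(1, retries + 1):
        try:
            return func(*args)
        except xmlrpc.client.Fault as fault:
            # Re-raise anything that is not a deadlock, or the final failure.
            if not is_deadlock_fault(fault) or attempt == retries:
                raise
            time.sleep(delay * attempt)
```

A caller would wrap the cancel call, for example `call_with_deadlock_retry(cancel, job_id)`, where `cancel` stands for whatever function issues the XML-RPC request. Retrying treats the symptom only; avoiding duplicate requests for the same job removes the cause.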

To Reproduce
Steps to reproduce the behavior:

  1. Create a PinFile to provision a Beaker resource using a count > 1
  2. Run linchpin -vvvv up
  3. Run linchpin -vvvv destroy
  4. See error
@Dannyb48 Dannyb48 self-assigned this Mar 16, 2020
@samvarankashyap
Contributor

@Dannyb48
I assume you are using the develop branch of linchpin.
Were you able to reproduce the bug in previous versions too, such as 1.9.2?
I think the error is due to some unknown change on the Beaker server.
Also, I see the above is running Python 2.7; it would be great to try it on Python 3.x.

Dannyb48 added a commit to Dannyb48/linchpin that referenced this issue Mar 16, 2020
This should fix CentOS-PaaS-SIG#1693 where two resources belong to the same
job are filtered to avoid xmlrpc condition
Dannyb48 added a commit to Dannyb48/linchpin that referenced this issue Mar 16, 2020
This should fix CentOS-PaaS-SIG#1693 where two resources belong to the same
job are filtered to avoid xmlrpc condition
@Dannyb48
Contributor Author

@samvarankashyap

I agree, I think this is an unknown change on the Beaker server side. Yes, I am using the latest develop branch.

Good catch, it looks like it's not reproducible on Python 3. However, I have identified and tested a fix for Python 2. I just finished testing it with Python 3 as well, and it worked fine. With the fix, whatever changed on the Beaker server side should no longer break Python 2 either.

I just submitted it. I'll let you guys review to see if you want to merge it in.

Dannyb48 added a commit to Dannyb48/linchpin that referenced this issue Mar 17, 2020
This should fix CentOS-PaaS-SIG#1693 where two resources belong to the same
job are filtered to avoid xmlrpc condition
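
The commit message describes filtering resources that belong to the same job. With a count of 2, both provisioned systems share one Beaker job, so cancelling per resource would send the same cancel request twice. Below is a minimal sketch of that filtering idea, not the code from the referenced commits. It assumes each resource is a dict whose "id" key holds the job ID; that shape and both function names are hypothetical.

```python
def unique_job_ids(resources):
    """Return each distinct job ID once, preserving first-seen order.

    Assumes (hypothetically) that every resource is a dict with an "id"
    key holding its Beaker job ID, for example "J:4138086".
    """
    seen = set()
    ordered = []
    for resource in resources:
        job_id = resource["id"]
        if job_id not in seen:
            seen.add(job_id)
            ordered.append(job_id)
    return ordered


def cancel_jobs_once(resources, cancel):
    """Call cancel(job_id) exactly once per distinct job."""
    for job_id in unique_job_ids(resources):
        cancel(job_id)
```

Here `cancel` is any callable that issues the cancel request for one job ID. Deduplicating before the loop means a single request per job regardless of how many systems that job provisioned.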
@samvarankashyap samvarankashyap added this to the v2.0.0 milestone Mar 17, 2020