Crowdstrike makes unable to sync project from self hosted gitlab via ssh clone #11518

Closed
3 tasks done
craph opened this issue Jan 11, 2022 · 10 comments

Comments

@craph
Contributor

craph commented Jan 11, 2022

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that AWX is open source software provided for free and that I am not entitled to status updates or other assurances.

Summary

After installing AWX with the AWX operator and importing the old database from AWX 17.x, I'm unable to sync my projects from a self-hosted GitLab server via SSH clone.

I migrated from a local Docker installation of AWX 17.x to Kubernetes, running AWX 19.2.0.

AWX version

19.2.0

Installation method

kubernetes

Modifications

no

Ansible version

No response

Operating system

centos 7

Web browser

Chrome

Steps to reproduce

Create a new project with a source control URL like ssh://git@your_git_server:your_custom_port/some_group/my_project.git and add a source control credential. In my case it's a user with a private SSH key.
The private key doesn't have a passphrase.
Then save the project and try to sync it.

Expected results

The sync should work.

Actual results

The sync keeps running indefinitely and never finishes.
In AWX 17.x the same project syncs correctly.

Where can I see the log of the sync process to investigate further?

Additional information

The only log I can find is job_lifecycle.log in the awx-tasks container:

{"type": "projectupdate", "task_id": 2791, "state": "running_playbook", "template_name": "my-project", "guid": "1e632d939b7f4b649d51641d18d66264", "time": "2022-01-11T07:31:28.279033+00:00"}
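To follow a single job through job_lifecycle.log, a simple filter may be enough. The following is a minimal sketch, assuming every line of the file is a JSON object shaped like the entry above. The function name `lifecycle_for_task` and the second sample entry are made up for illustration; only the first sample entry comes from this report.

```shell
# Print "time state" for every lifecycle entry of one task.
# Assumes one JSON object per line, formatted like the entry quoted above.
# Usage: lifecycle_for_task TASK_ID LOG_FILE
lifecycle_for_task() {
  grep -F "\"task_id\": $1," "$2" |
    sed -E 's/.*"state": "([^"]*)".*"time": "([^"]*)".*/\2 \1/'
}

# Demo on a temporary file; the second entry is a hypothetical example.
log_file="$(mktemp)"
cat > "$log_file" <<'EOF'
{"type": "projectupdate", "task_id": 2791, "state": "running_playbook", "template_name": "my-project", "guid": "1e632d939b7f4b649d51641d18d66264", "time": "2022-01-11T07:31:28.279033+00:00"}
{"type": "projectupdate", "task_id": 2792, "state": "waiting", "template_name": "other-project", "guid": "00000000000000000000000000000000", "time": "2022-01-11T07:32:00.000000+00:00"}
EOF

result="$(lifecycle_for_task 2791 "$log_file")"
echo "$result"
rm -f "$log_file"
```

A job whose last entry stays at running_playbook, as here, has started its playbook but has not reported completion.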

How can I see the output of the job that syncs my project, so I can identify the issue?

I haven't specified a default Execution Environment. Should I define one?

I have CrowdStrike version 6.32.12905 installed on the servers that host my k8s cluster.

@craph
Contributor Author

craph commented Jan 11, 2022

Could this be related to this issue?

@craph
Contributor Author

craph commented Jan 11, 2022

I did some more tests. If I use an HTTPS URL with a login and password to clone the repository from my self-hosted GitLab server, I get the following error:

TASK [update project using git] ************************************************
fatal: [localhost]: FAILED! => {"changed": false, "cmd": "/usr/bin/git ls-remote 'https://$encrypted$:$encrypted$@my_gitlab_server/my-group/my-project.git' -h refs/heads/HEAD", "msg": "fatal: unable to access 'https://my_gitlab_server/my-group/my-project.git/': SSL certificate problem: unable to get local issuer certificate", "rc": 128, "stderr": "fatal: unable to access 'https://my_gitlab_server/my-group/my-project.git/': SSL certificate problem: unable to get local issuer certificate\n", "stderr_lines": ["fatal: unable to access 'https://my_gitlab_server/my-group/my-project.git/': SSL certificate problem: unable to get local issuer certificate"], "stdout": "", "stdout_lines": []}

Why is it not possible to have a project that uses an SSH credential, as it was in AWX 17.x? @shanemcd @AlanCoding @ryanpetrello

When I use an SSH credential and a URL like ssh://git@your_git_server:your_custom_port/some_group/my_project.git, the job output is empty and the job keeps running indefinitely.

@craph
Contributor Author

craph commented Jan 11, 2022

On my k8s worker nodes, if I stop the CrowdStrike service I am able to clone and sync my project with the SSH credential. After re-enabling CrowdStrike, the issue appears again.

But on another server, where AWX 17.0.1 runs on local Docker with CrowdStrike installed on the Docker host, all projects with an SSH credential sync correctly without any issue.

What changed between AWX 17.0.1 and 19.x in how project updates are processed (playbook processing?)

@lyutian

lyutian commented Jan 12, 2022

I ran into the project sync issue with a self-hosted git server too.
Unlike your case, my git source is a plain git server set up as described on the official git website, and I have only tried SSH.

My AWX version:
19.5.0
My source URL in the project configuration:
git@x.x.x.x:/git-server/my_project.git
The failed job details show:

{
  "cmd": "/usr/bin/git ls-remote '' -h refs/heads/master",
  "rc": 128,
  "stdout": "",
  "stderr": "Warning: Permanently added 'x.x.x.x' (ECDSA) to the list of known hosts.\r\ngit@x.x.x.x: Permission denied (publickey,password).\r\nfatal: Could not read from remote repository.\n\nPlease make sure you have the correct access rights\nand the repository exists.\n",
  "msg": "Warning:********@x.x.x.x: Permission denied (publickey,password).\r\nfatal: Could not read from remote repository.\n\nPlease make sure you have the correct access rights\nand the repository exists.",
  "invocation": {}
...

The "cmd": "/usr/bin/git ls-remote '' -h refs/heads/master" part looks wrong: the source URL is missing.
But I can't find a way to get verbose job output for more information.

@lyutian

lyutian commented Jan 12, 2022

It seems that 18.0.0 and later changed the Ansible runtime from a virtual env (/var/lib/awx/venv/) to a container, which is the new 'execution environment' feature.

This change does provide a better way to run playbooks, but it also seems to make troubleshooting more difficult.

When AWX used a virtual env, I could troubleshoot by logging in to the awx container (docker exec tools_awx_1 bash), entering the venv, and running git clone git@x.x.x.x/git-server/my_project.git to find the problem directly. But now I can't even find where the execution environment container is...

Where is the 'execution environment' hidden? 😭

@craph
Contributor Author

craph commented Jan 12, 2022

@lyutian thank you for your reply.
How did you install AWX 19.5.0? Is it a local Docker install or on K8S?

In my case I don't get any output when I run a project sync on AWX 19.2.0 on my K8S cluster. The issue seems to be CrowdStrike, because when I disable it the sync works. BUT on another server with AWX 17.0.1 installed on Docker and CrowdStrike installed too, I don't have the issue.

That's why I'm asking what changed between AWX 17.0.1 and AWX 19.x.

Moreover, in your case the error message seems related to credentials. Are you using the correct credentials?

@craph craph changed the title Unable to sync project from self hosted gitlab via ssh clone Crowdstrike makes unable to sync project from self hosted gitlab via ssh clone Jan 12, 2022
@craph
Contributor Author

craph commented Jan 12, 2022

In the K8S cluster, how can I debug what happens when I try to sync a project? Does it spawn new pods, or something else? Where can I find logs to investigate?

@craph
Contributor Author

craph commented Jan 12, 2022

After more investigation: when I run a project sync, I can see in the awx-ee container's process list that the ssh-add process is stuck, like this:

runner     693     1  0 09:52 ?        00:00:00 ssh-add /tmp/pdd_wrapper_2806_l1dsfrw9/awx_2806_epu_1i_e/artifacts/2806/ssh_key_data
runner     694     0  0 09:53 pts/1    00:00:00 /bin/sh -c TERM=xterm-256color; export TERM; [ -x /bin/bash ] && ([ -x /usr/bin/script ] && /usr/bin/script -q -c "/bin/bash" /dev/null || exec /bin/bash) || exec /bin/sh
runner     701   694  0 09:53 pts/1    00:00:00 /bin/sh -c TERM=xterm-256color; export TERM; [ -x /bin/bash ] && ([ -x /usr/bin/script ] && /usr/bin/script -q -c "/bin/bash" /dev/null || exec /bin/bash) || exec /bin/sh
runner     702   701  0 09:53 pts/1    00:00:00 /usr/bin/script -q -c /bin/bash /dev/null
runner     704   702  0 09:53 pts/3    00:00:00 /bin/bash
runner     720     1  0 09:54 ?        00:00:00 ssh-add /tmp/pdd_wrapper_2807_nekpd5hi/awx_2807_xng791no/artifacts/2807/ssh_key_data
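To pick the waiting ssh-add processes out of a listing like the one above, a small filter can help. This is only a sketch: it assumes the usual `ps -ef` column order (UID, PID, PPID, C, STIME, TTY, TIME, CMD), and the function name `list_ssh_add` is made up for illustration.

```shell
# Print "PID key-path" for every ssh-add process in `ps -ef` output read from stdin.
# Field 8 is the command name and field 9 its first argument.
list_ssh_add() {
  awk '$8 == "ssh-add" { print $2, $9 }'
}

# Demo on two lines copied from the listing above.
sample='runner     693     1  0 09:52 ?        00:00:00 ssh-add /tmp/pdd_wrapper_2806_l1dsfrw9/awx_2806_epu_1i_e/artifacts/2806/ssh_key_data
runner     704   702  0 09:53 pts/3    00:00:00 /bin/bash'

stuck="$(printf '%s\n' "$sample" | list_ssh_add)"
echo "$stuck"
```

Inside the container the same function could presumably be fed live data with `ps -ef | list_ssh_add`.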

If I write the output of date into the pipe file as a test, like this:

date > /tmp/pdd_wrapper_2807_nekpd5hi/awx_2807_xng791no/artifacts/2807/ssh_key_data

The project sync fails (because that isn't the correct key, of course), but the job is no longer blocked.

After that, if I retry the same thing with the correct private key, the job succeeds.

echo "my-private-key" > /tmp/pdd_wrapper_2809_yd16dsgu/awx_2809_l15igf5h/artifacts/2809/ssh_key_dat

Info: the file is a pipe:

bash-4.4$ ls -l /tmp/pdd_wrapper_2807_nekpd5hi/awx_2807_xng791no/artifacts/2807/ssh_key_data
prw------- 1 runner root 0 Jan 12 09:54 /tmp/pdd_wrapper_2807_nekpd5hi/awx_2807_xng791no/artifacts/2807/ssh_key_data
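The behaviour described above is consistent with ordinary named-pipe (FIFO) semantics: a process that opens a FIFO for reading blocks until some other process opens it for writing. The sketch below reproduces that in isolation. It is an illustration under assumptions, not AWX's actual mechanism: `cat` stands in for ssh-add, and the directory is a temporary one rather than the real artifacts path.

```shell
# Demonstrate that a FIFO reader blocks until a writer shows up.
# `cat` stands in for ssh-add; the paths are temporary, not AWX's.
work_dir="$(mktemp -d)"
fifo="$work_dir/ssh_key_data"
mkfifo -m 600 "$fifo"

# Start the reader in the background; it blocks while opening the FIFO.
cat "$fifo" > "$work_dir/received" &
reader_pid=$!

sleep 1
if kill -0 "$reader_pid" 2>/dev/null; then
  reader_blocked=yes   # still waiting: nobody has written yet
else
  reader_blocked=no
fi

# Writing to the FIFO unblocks the reader, as `date > ssh_key_data` did above.
echo "placeholder-data" > "$fifo"
wait "$reader_pid"

received="$(cat "$work_dir/received")"
echo "reader blocked before write: $reader_blocked"
echo "reader received: $received"
rm -rf "$work_dir"
```

If this reading is right, the stuck ssh-add was waiting for a writer that never delivered the key, which would explain why any write to the pipe, even the output of date, let the job move on.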

@auroraqin

Hi @craph, I hit the same issue on my AWX 19.5.1. May I ask how you ran the project sync in the container directly?

@craph
Contributor Author

craph commented Oct 18, 2022

After upgrading the CrowdStrike version, the problem is solved.

@craph craph closed this as completed Oct 18, 2022