Hail 118 #299

Merged: 96 commits, Jun 14, 2023
Commits
0bd6a49
[infra] create subnetworks manually (#13016)
danking May 11, 2023
2da08af
[batch] Allow private job network traffic to internal IPs (#13036)
daniel-goldstein May 11, 2023
3b04c08
[qob] maybe retry GSFS reads (#13013)
danking May 11, 2023
554b6b3
[infra][gke] disable gce metadata enable gke metadata (#13017)
danking May 11, 2023
4a9d538
[query] xfail should always be strict (#12961)
danking May 11, 2023
d42d89e
[benchmark] Make benchmarks run in Google Artifact Repository (#13039)
ehigham May 11, 2023
e553ebc
[services] reliably retry all requests (#13029)
danking May 12, 2023
2fa23a8
[query] Better message for LocalLDPrune only taking diploid calls (#1…
chrisvittal May 12, 2023
c9de811
[ruff] maybe eliminate PLW2901 (#12990)
danking May 12, 2023
7d537ac
[batch] no wget (#13033)
danking May 12, 2023
8dfc21f
[qob] only use highmem for one test (#13040)
danking May 12, 2023
0c9ac88
[infra] fix workload_identity_config (#13041)
danking May 12, 2023
ff658b0
[qob] actually do a naive_coalsece (#13042)
danking May 12, 2023
8362afa
[query] minor performance improvement (#13044)
danking May 12, 2023
ae7cdee
[ci] Create a test MySQL server in test and dev namespaces (#13030)
daniel-goldstein May 12, 2023
94f410e
[batch] Give cloud-specific API implementations more control over con…
daniel-goldstein May 12, 2023
559469b
Bump types-chardet from 5.0.4.5 to 5.0.4.6 in /hail/python/dev (#13034)
dependabot[bot] May 12, 2023
b1e509f
[docs] fix minor doc bug (#13019)
danking May 12, 2023
859f0cc
[query] eliminate irrelevant META-INF.services (#13037)
danking May 13, 2023
3171e25
Correct the 0.2.116 release date [very minor] (#13047)
jmarshall May 15, 2023
277ee08
[hail-ubuntu] use focal-20221019 (#12378)
danking May 15, 2023
c6bae18
[query] Split up matrix randomness test (#13048)
daniel-goldstein May 15, 2023
0aaf389
[hailctl] Fix validation for remote_tmpdir to allow azure https schem…
daniel-goldstein May 15, 2023
532e688
[query] Expand ReadValue beyond hail EType deserialization (#12948)
chrisvittal May 16, 2023
a2b070f
Azure-redeploy-fixes-upstream (#13058)
violetbrina May 16, 2023
004eab2
[hailctl] fix init_notebook.py requirements to be compatible with Hai…
danking May 16, 2023
211bff4
[batch] Populate aggregated billing project users table (#12817)
jigold May 16, 2023
6ad5f0e
[batch] Fix dedup billing project user resources audit (#13069)
jigold May 16, 2023
3a55e70
[build.yaml] Remove failing dedup_bp_user_resources migration (#13070)
jigold May 17, 2023
e4eae2c
[batch] Use same size VMs in dev and PR as we use in prod (#12974)
daniel-goldstein May 17, 2023
1883a71
[vds] Fix bug in vds.filter_intervals with reference block max len (#…
tpoterba May 17, 2023
4a449ed
[batch] Dont use the authorization header for worker tokens to the dr…
daniel-goldstein May 17, 2023
d7bbe32
[nd] docs should use \ge not \gte (#13078)
danking May 18, 2023
ad59047
[query] warn about force=True in hl.import_vcf (#13079)
danking May 18, 2023
a2caf57
[batch] Dont schedule on instances that dont match the current instan…
daniel-goldstein May 18, 2023
0f62ef6
[transient-errors] will it ever end? (#13043)
danking May 18, 2023
df352de
[qob] immediateFlush=true for QoB logger (#13067)
danking May 18, 2023
cf04eca
[batch] Fix Python to add correct deduped resource id (#13082)
jigold May 19, 2023
646fc76
[batch] Exponential backoff of resource usage monitoring (#13088)
jigold May 19, 2023
d8e61bc
[docker] Add rsync to hail-ubuntu image (#13087)
jigold May 20, 2023
c626b49
[qob] remove resource leak in readNoCompression (#13065)
danking May 20, 2023
31acf40
[release] 0.2.117 (#13085)
danking May 22, 2023
9627a0b
[qob] novel transient error (#13075)
danking May 22, 2023
2a7ef11
[test-dataproc] add subnet (#13090)
danking May 23, 2023
8e96d8e
[query] suggest --only-binary=:all: if pip fails to build from source…
danking May 23, 2023
e90cf2b
[qob] retry stream is already closed (#13104)
danking May 23, 2023
dc384d0
[query/ggplot] Updates API reference to include faceting (#12569)
iris-garden May 24, 2023
e365a2f
[infra] apply a PDB to kube-dns (#13105)
danking May 24, 2023
3755ede
[batch] Add resource usage plots for JVM containers (#13098)
jigold May 24, 2023
8294fd4
[aiotools] Properly close azure storage clients (#13101)
jigold May 24, 2023
b0c8dea
[batch] Fix attempt resources after update trigger to use deduped res…
jigold May 25, 2023
7f088a0
[hailctl] Remove check_for_update since pip search no longer works (#…
daniel-goldstein May 25, 2023
3d7c110
[query] permit any compatible patch version (#13111)
danking May 25, 2023
a79e34d
Bump tornado from 6.3.1 to 6.3.2 in /hail/python/dev (#13120)
dependabot[bot] May 26, 2023
2b85ba5
[deploy] assert sufficient space is available at PyPI in deploy (#13118)
danking May 26, 2023
2203c6d
[auth] Add hail_identity to REST get users (#12889)
illusional May 30, 2023
06a4198
[vds] Compute max reference block length in the VDS combiner (#13081)
tpoterba May 30, 2023
8387521
[build.yaml] retry copy_images.sh (#13130)
danking May 31, 2023
67cebf5
More parallelism timeouts strict local backend heap size (#13128)
danking Jun 1, 2023
c4cb2bb
[query] Lower MatrixMultiWrite (#12892)
tpoterba Jun 2, 2023
3edb8a7
[qob] Add retry to one partition fast path (#13126)
chrisvittal Jun 3, 2023
61d7e73
[ci] Only alert on failed deploys once (#13137)
jigold Jun 5, 2023
216fcbd
[compiler] mark ApplySeeded as lowerable (#12621)
patrick-schultz Jun 5, 2023
8a075e9
[query] a few more timeout relaxations or test splits (#13145)
danking Jun 7, 2023
e1c8b44
[query] split tests, faster test dataset, longer timeout for big test…
danking Jun 7, 2023
386f151
[batch] Remove unnecessary get_compute_client method on CloudWorkerAP…
daniel-goldstein Jun 8, 2023
106ede4
CHANGELOG Allow subnet to be passed through to gcloud in hailctl (#13…
tlangs Jun 8, 2023
07ed0b1
[query] spectra_and_moments_1 also takes a long time in spark?? (#13153)
danking Jun 8, 2023
1a26393
[batch] set min pool size to 1 (#13150)
danking Jun 8, 2023
d647b6a
Feature/sas token final (#13140)
gregsmi Jun 8, 2023
bf814b6
[query/ggplot] Shows legend group for single-value columns (#13113)
iris-garden Jun 9, 2023
271a72a
[hailctl] Fix wrong temp-bucket setting in hailctl dataproc start (#1…
daniel-goldstein Jun 9, 2023
22a1984
[ci][batch] mitigate some instability (#13155)
danking Jun 9, 2023
df00ebc
[qob] tightly scope retries in ServiceBackend (#13152)
danking Jun 9, 2023
60bf091
[auth] mirror the batch settings for non-deploy (#13164)
danking Jun 9, 2023
39dc3c0
[qob] more debugging information on UnexpectedEOFError in Azure (#13160)
danking Jun 9, 2023
d24f578
[infra] Add resource group to azure terraform state remote config (#1…
daniel-goldstein Jun 9, 2023
c90b4ad
[query] fix requester pays in all Query backends (#13089)
danking Jun 9, 2023
e6a6dc8
[query] better error message in union_cols (#13144)
danking Jun 9, 2023
ab1de78
added import_csv (#12047)
Aleisha02 Jun 9, 2023
0f495fd
[fs] rmtree empty dir, rmtree empty subdir, rmtree link, raise conten…
danking Jun 10, 2023
39bde98
[ci][gear][auth][query][hailtop] upgrade requests and cryptography (#…
danking Jun 10, 2023
80334af
[ci] Automatically recreate expired root cert in dev and test namespa…
daniel-goldstein Jun 10, 2023
9059870
[terraform] remote terraform state (#12991)
danking Jun 10, 2023
87b8d0a
[release] updates changelog for 0.2.118 (#13127)
iris-garden Jun 10, 2023
86a01dd
[docs] Expose hailtop.fs and hailtop.batch in query docs (#13114)
jigold Jun 10, 2023
e8b4543
[query] Zstdandard (Zstd) compression (#12981)
chrisvittal Jun 10, 2023
e638043
[qob] noisier ServiceBackend startup (#13157)
danking Jun 10, 2023
8495463
[ci] Grant create temporary tables privilege to test dbs (#13165)
jigold Jun 10, 2023
305b355
Revert "[query] Zstdandard (Zstd) compression" (#13172)
daniel-goldstein Jun 12, 2023
750f336
[query] skip comments when parsing spark-defaults.conf (#13171)
danking Jun 12, 2023
62f606c
[query] Improve _same to make exactly one pass over the data (#13151)
danking Jun 13, 2023
c5093fc
Merge commit '2a7ef11' into hail-117-surgery-ii
illusional Jun 13, 2023
3e1a740
[deploy] Add WHEEL to environment for assert_pypi_has_room.py (#13176)
daniel-goldstein Jun 13, 2023
a4ca239
[query] split yet another test (#13169)
danking Jun 13, 2023
71d6bd4
Merge commit 'a4ca239' into hail-118
illusional Jun 13, 2023
20 changes: 20 additions & 0 deletions Makefile
@@ -81,6 +81,19 @@ check-pip-requirements:
ci \
memory

.PHONY: check-linux-pip-requirements
check-linux-pip-requirements:
./check_linux_pip_requirements.sh \
hail/python/hailtop \
hail/python \
hail/python/dev \
gear \
web_common \
auth \
batch \
ci \
memory

.PHONY: install-dev-requirements
install-dev-requirements:
python3 -m pip install \
@@ -146,6 +159,13 @@ base-image: hail-ubuntu-image docker/Dockerfile.base
./docker-build.sh . docker/Dockerfile.base.out $(BASE_IMAGE)
echo $(BASE_IMAGE) > $@

hail-run-image: base-image hail/Dockerfile.hail-run hail/python/pinned-requirements.txt hail/python/dev/pinned-requirements.txt docker/core-site.xml
$(eval BASE_IMAGE := $(DOCKER_PREFIX)/hail-run:$(TOKEN))
$(MAKE) -C hail wheel
python3 ci/jinja2_render.py '{"base_image":{"image":"'$$(cat base-image)'"}}' hail/Dockerfile.hail-run hail/Dockerfile.hail-run.out
./docker-build.sh . hail/Dockerfile.hail-run.out $(BASE_IMAGE)
echo $(BASE_IMAGE) > $@

private-repo-hailgenetics-hail-image: hail-ubuntu-image docker/hailgenetics/hail/Dockerfile $(shell git ls-files hail/src/main hail/python)
$(eval PRIVATE_REPO_HAILGENETICS_HAIL_IMAGE := $(DOCKER_PREFIX)/hailgenetics/hail:$(TOKEN))
$(MAKE) -C hail wheel
12 changes: 6 additions & 6 deletions auth/auth/auth.py
@@ -513,12 +513,12 @@ async def post_create_user(request, userdata): # pylint: disable=unused-argumen
@auth.rest_authenticated_developers_only
async def rest_get_users(request, userdata): # pylint: disable=unused-argument
db: Database = request.app['db']
users = await db.select_and_fetchall(
'''
SELECT id, username, login_id, state, is_developer, is_service_account FROM users;
_query = '''
SELECT id, username, login_id, state, is_developer, is_service_account, hail_identity
FROM users;
'''
)
return json_response([user async for user in users])
users = [x async for x in db.select_and_fetchall(_query)]
return json_response(users)
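
The new `rest_get_users` body collects rows with an async comprehension over the result of `db.select_and_fetchall`. A minimal sketch of that pattern follows, assuming the call returns an async iterator of row dicts. `fake_select_and_fetchall` and the sample rows are hypothetical stand-ins, not part of Hail's `Database` class.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List


async def fake_select_and_fetchall(query: str) -> AsyncIterator[Dict[str, Any]]:
    """Hypothetical stand-in for db.select_and_fetchall: yields one row at a time."""
    rows = [
        {'id': 1, 'username': 'alice', 'hail_identity': 'alice@example.invalid'},
        {'id': 2, 'username': 'bob', 'hail_identity': 'bob@example.invalid'},
    ]
    for row in rows:
        await asyncio.sleep(0)  # simulate waiting on the database
        yield row


async def list_users() -> List[Dict[str, Any]]:
    query = 'SELECT id, username, hail_identity FROM users;'
    # The async comprehension drains the iterator into a plain list,
    # which can then be serialized in one step.
    return [row async for row in fake_select_and_fetchall(query)]


users = asyncio.run(list_users())
print([u['hail_identity'] for u in users])
```

Because the result is an ordinary list, it can be passed straight to a JSON serializer without further awaiting.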


@routes.get('/api/v1alpha/users/{user}')
@@ -529,7 +529,7 @@ async def rest_get_user(request, userdata): # pylint: disable=unused-argument

user = await db.select_and_fetchone(
'''
SELECT id, username, login_id, state, is_developer, is_service_account FROM users
SELECT id, username, login_id, state, is_developer, is_service_account, hail_identity FROM users
WHERE username = %s;
''',
(username,),
17 changes: 3 additions & 14 deletions auth/deployment.yaml
@@ -126,11 +126,7 @@ spec:
selector:
matchLabels:
app: auth
{% if deploy %}
replicas: 3
{% else %}
replicas: 1
{% endif %}
replicas: 5
template:
metadata:
labels:
@@ -253,6 +249,7 @@ spec:
secret:
optional: false
secretName: ssl-config-auth
{% if deploy %}
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
@@ -263,13 +260,8 @@ spec:
apiVersion: apps/v1
kind: Deployment
name: auth
{% if deploy %}
minReplicas: 3
maxReplicas: 10
{% else %}
minReplicas: 1
maxReplicas: 3
{% endif %}
metrics:
- type: Resource
resource:
@@ -281,14 +273,11 @@ kind: PodDisruptionBudget
metadata:
name: auth
spec:
{% if deploy %}
minAvailable: 2
{% else %}
minAvailable: 0
{% endif %}
selector:
matchLabels:
app: auth
{% endif %}
---
apiVersion: v1
kind: Service
19 changes: 12 additions & 7 deletions auth/pinned-requirements.txt
@@ -4,12 +4,12 @@
#
# pip-compile --output-file=hail/auth/pinned-requirements.txt hail/auth/requirements.txt
#
cachetools==5.3.0
cachetools==5.3.1
# via
# -c hail/auth/../gear/pinned-requirements.txt
# -c hail/auth/../hail/python/pinned-requirements.txt
# google-auth
certifi==2022.12.7
certifi==2023.5.7
# via
# -c hail/auth/../gear/pinned-requirements.txt
# -c hail/auth/../hail/python/dev/pinned-requirements.txt
@@ -22,7 +22,7 @@ charset-normalizer==3.1.0
# -c hail/auth/../hail/python/pinned-requirements.txt
# -c hail/auth/../web_common/pinned-requirements.txt
# requests
google-auth==2.17.3
google-auth==2.19.1
# via
# -c hail/auth/../gear/pinned-requirements.txt
# -c hail/auth/../hail/python/pinned-requirements.txt
@@ -37,7 +37,9 @@ idna==3.4
# -c hail/auth/../web_common/pinned-requirements.txt
# requests
oauthlib==3.2.2
# via requests-oauthlib
# via
# -c hail/auth/../hail/python/pinned-requirements.txt
# requests-oauthlib
pyasn1==0.5.0
# via
# -c hail/auth/../gear/pinned-requirements.txt
@@ -49,14 +51,16 @@ pyasn1-modules==0.3.0
# -c hail/auth/../gear/pinned-requirements.txt
# -c hail/auth/../hail/python/pinned-requirements.txt
# google-auth
requests==2.28.2
requests==2.31.0
# via
# -c hail/auth/../gear/pinned-requirements.txt
# -c hail/auth/../hail/python/dev/pinned-requirements.txt
# -c hail/auth/../hail/python/pinned-requirements.txt
# requests-oauthlib
requests-oauthlib==1.3.1
# via google-auth-oauthlib
# via
# -c hail/auth/../hail/python/pinned-requirements.txt
# google-auth-oauthlib
rsa==4.9
# via
# -c hail/auth/../gear/pinned-requirements.txt
@@ -68,9 +72,10 @@ six==1.16.0
# -c hail/auth/../hail/python/dev/pinned-requirements.txt
# -c hail/auth/../hail/python/pinned-requirements.txt
# google-auth
urllib3==1.26.15
urllib3==1.26.16
# via
# -c hail/auth/../gear/pinned-requirements.txt
# -c hail/auth/../hail/python/dev/pinned-requirements.txt
# -c hail/auth/../hail/python/pinned-requirements.txt
# google-auth
# requests
11 changes: 5 additions & 6 deletions batch/Dockerfile.worker
@@ -6,9 +6,8 @@ RUN hail-apt-get-install \
iptables \
openjdk-8-jre-headless \
liblapack3 \
libyajl-dev \
wget \
xfsprogs
xfsprogs \
libyajl-dev

RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1

@@ -22,8 +21,8 @@ RUN echo "APT::Acquire::Retries \"5\";" > /etc/apt/apt.conf.d/80-retries && \

{% elif global.cloud == "azure" %}
# https://github.com/Azure/azure-storage-fuse/issues/603
RUN hail-apt-get-install ca-certificates pkg-config libfuse-dev cmake libcurl4-gnutls-dev libgnutls28-dev uuid-dev libgcrypt20-dev wget && \
wget https://packages.microsoft.com/config/ubuntu/20.04/packages-microsoft-prod.deb && \
RUN hail-apt-get-install ca-certificates pkg-config libfuse-dev cmake libcurl4-gnutls-dev libgnutls28-dev uuid-dev libgcrypt20-dev && \
curl -LO https://packages.microsoft.com/config/ubuntu/20.04/packages-microsoft-prod.deb && \
dpkg -i packages-microsoft-prod.deb && \
apt-get update && \
hail-apt-get-install blobfuse
@@ -47,7 +46,7 @@ ENV PYSPARK_PYTHON python3

COPY docker/core-site.xml ${SPARK_HOME}/conf/core-site.xml

RUN wget https://github.com/jvm-profiling-tools/async-profiler/releases/download/v2.9/async-profiler-2.9-linux-x64.tar.gz -qO- | tar -zxvf -
RUN curl -L https://github.com/jvm-profiling-tools/async-profiler/releases/download/v2.9/async-profiler-2.9-linux-x64.tar.gz | tar -zxvf -

# Build crun in separate build step
FROM base AS crun_builder
49 changes: 26 additions & 23 deletions batch/batch/cloud/azure/worker/worker_api.py
@@ -5,12 +5,11 @@

import aiohttp

from gear.cloud_config import get_azure_config
from hailtop import httpx
from hailtop.aiocloud import aioazure
from hailtop.utils import check_exec_output, request_retry_transient_errors, time_msecs
from hailtop.utils import check_exec_output, retry_transient_errors, time_msecs

from ....worker.worker_api import CloudWorkerAPI
from ....worker.worker_api import CloudWorkerAPI, ContainerRegistryCredentials
from ..instance_config import AzureSlimInstanceConfig
from .credentials import AzureUserCredentials
from .disk import AzureDisk
@@ -40,20 +39,24 @@ def create_disk(self, instance_name: str, disk_name: str, size_in_gb: int, mount
def get_cloud_async_fs(self) -> aioazure.AzureAsyncFS:
return aioazure.AzureAsyncFS(credentials=self.azure_credentials)

def get_compute_client(self) -> aioazure.AzureComputeClient:
azure_config = get_azure_config()
return aioazure.AzureComputeClient(azure_config.subscription_id, azure_config.resource_group)

def user_credentials(self, credentials: Dict[str, str]) -> AzureUserCredentials:
return AzureUserCredentials(credentials)

async def worker_access_token(self, session: httpx.ClientSession) -> Dict[str, str]:
async def worker_container_registry_credentials(self, session: httpx.ClientSession) -> ContainerRegistryCredentials:
# https://docs.microsoft.com/en-us/azure/container-registry/container-registry-authentication?tabs=azure-cli#az-acr-login-with---expose-token
return {
'username': '00000000-0000-0000-0000-000000000000',
'password': await self.acr_refresh_token.token(session),
}

async def user_container_registry_credentials(
self, user_credentials: AzureUserCredentials
) -> ContainerRegistryCredentials:
return {
'username': user_credentials.username,
'password': user_credentials.password,
}

def instance_config_from_config_dict(self, config_dict: Dict[str, str]) -> AzureSlimInstanceConfig:
return AzureSlimInstanceConfig.from_dict(config_dict)

@@ -112,6 +115,9 @@ async def unmount_cloudfuse(self, mount_base_path_data: str):
os.remove(self._blobfuse_credential_files[mount_base_path_data])
del self._blobfuse_credential_files[mount_base_path_data]

async def close(self):
pass

def __str__(self):
return f'subscription_id={self.subscription_id} resource_group={self.resource_group}'

@@ -137,18 +143,16 @@ class AadAccessToken(LazyShortLivedToken):
async def _fetch(self, session: httpx.ClientSession) -> Tuple[str, int]:
# https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/how-to-use-vm-token#get-a-token-using-http
params = {'api-version': '2018-02-01', 'resource': 'https://management.azure.com/'}
async with await request_retry_transient_errors(
session,
'GET',
resp_json = await retry_transient_errors(
session.get_read_json,
'http://169.254.169.254/metadata/identity/oauth2/token',
headers={'Metadata': 'true'},
params=params,
timeout=aiohttp.ClientTimeout(total=60), # type: ignore
) as resp:
resp_json = await resp.json()
access_token: str = resp_json['access_token']
expiration_time_ms = int(resp_json['expires_on']) * 1000
return access_token, expiration_time_ms
)
access_token: str = resp_json['access_token']
expiration_time_ms = int(resp_json['expires_on']) * 1000
return access_token, expiration_time_ms
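
The `_fetch` change above replaces an explicit `async with` request with a call of the form `retry_transient_errors(f, *args, **kwargs)`. The sketch below illustrates that calling convention only. It is not Hail's implementation: `retry_transient`, `TransientError`, the backoff parameters, and `flaky_get_read_json` are all hypothetical names chosen for the example.

```python
import asyncio
import random
from typing import Any, Awaitable, Callable, Dict


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


async def retry_transient(
    f: Callable[..., Awaitable[Any]],
    *args: Any,
    max_attempts: int = 5,
    base_delay_s: float = 0.01,
    **kwargs: Any,
) -> Any:
    """Call f(*args, **kwargs), retrying with jittered exponential backoff."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return await f(*args, **kwargs)
        except TransientError:
            if attempt >= max_attempts:
                raise
            # Double the delay ceiling on each attempt; jitter avoids lockstep retries.
            ceiling = base_delay_s * 2 ** (attempt - 1)
            await asyncio.sleep(random.uniform(0, ceiling))


calls = {'n': 0}


async def flaky_get_read_json(url: str) -> Dict[str, str]:
    """Hypothetical request function: fails twice, then returns a token payload."""
    calls['n'] += 1
    if calls['n'] < 3:
        raise TransientError('temporary failure')
    return {'access_token': 'abc', 'expires_on': '1700000000'}


resp_json = asyncio.run(retry_transient(flaky_get_read_json, 'http://example.invalid/token'))
# Same conversion as in the diff: seconds since the epoch to milliseconds.
expiration_time_ms = int(resp_json['expires_on']) * 1000
print(calls['n'], expiration_time_ms)
```

Passing the request function and its arguments separately lets the helper re-issue the whole request on each attempt and hand back the parsed body, so the caller no longer manages a response context.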


class AcrRefreshToken(LazyShortLivedToken):
@@ -164,14 +168,13 @@ async def _fetch(self, session: httpx.ClientSession) -> Tuple[str, int]:
'service': self.acr_url,
'access_token': await self.aad_access_token.token(session),
}
async with await request_retry_transient_errors(
session,
'POST',
resp_json = await retry_transient_errors(
session.post_read_json,
f'https://{self.acr_url}/oauth2/exchange',
headers={'Content-Type': 'application/x-www-form-urlencoded'},
data=data,
timeout=aiohttp.ClientTimeout(total=60), # type: ignore
) as resp:
refresh_token: str = (await resp.json())['refresh_token']
expiration_time_ms = time_msecs() + 60 * 60 * 1000 # token expires in 3 hours so we refresh after 1 hour
return refresh_token, expiration_time_ms
)
refresh_token: str = resp_json['refresh_token']
expiration_time_ms = time_msecs() + 60 * 60 * 1000 # token expires in 3 hours so we refresh after 1 hour
return refresh_token, expiration_time_ms
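
Both token classes in this file return a `(token, expiration_time_ms)` pair from `_fetch`, and callers obtain the value through `token(session)`. The sketch below shows how such a pair could drive a simple cache that refetches only after the recorded expiration. `CachedToken` is a hypothetical illustration, not the `LazyShortLivedToken` base class, and it omits the HTTP session argument.

```python
import asyncio
import time
from typing import Optional, Tuple


def time_msecs() -> int:
    """Current wall-clock time in milliseconds."""
    return int(time.time() * 1000)


class CachedToken:
    """Hypothetical cache: calls _fetch only when no unexpired token is held."""

    def __init__(self) -> None:
        self._token: Optional[str] = None
        self._expiration_time_ms: Optional[int] = None
        self.fetch_count = 0

    async def _fetch(self) -> Tuple[str, int]:
        # Stand-in for a network call; refresh one hour from now, as in the diff.
        self.fetch_count += 1
        return f'token-{self.fetch_count}', time_msecs() + 60 * 60 * 1000

    async def token(self) -> str:
        expired = self._expiration_time_ms is None or time_msecs() >= self._expiration_time_ms
        if self._token is None or expired:
            self._token, self._expiration_time_ms = await self._fetch()
        return self._token


async def demo() -> Tuple[str, str, int]:
    cached = CachedToken()
    first = await cached.token()
    second = await cached.token()  # served from the cache, no second fetch
    return first, second, cached.fetch_count


print(asyncio.run(demo()))
```

Setting the refresh time well before the real expiry, as the comment in the diff describes, leaves a margin so a token handed out late in its window is still valid when it is used.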
6 changes: 5 additions & 1 deletion batch/batch/cloud/gcp/driver/create_instance.py
@@ -268,10 +268,14 @@ def scheduling() -> dict:
iptables --append FORWARD --destination $INTERNAL_GATEWAY_IP --jump ACCEPT
# And this worker
iptables --append FORWARD --destination $IP_ADDRESS --jump ACCEPT
# Forbid outgoing requests to cluster-internal IP addresses
# Allow traffic going to the internet
INTERNET_INTERFACE=$(ip link list | grep ens | awk -F": " '{{ print $2 }}')
iptables --append FORWARD --out-interface $INTERNET_INTERFACE ! --destination 10.128.0.0/16 --jump ACCEPT

# [private]
# Allow all traffic from the private job network
iptables --append FORWARD --source 172.20.0.0/16 --jump ACCEPT

{make_global_config_str}

# retry once
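
The two FORWARD rules in this hunk can be read as a predicate on source and destination addresses. The sketch below models only those two rules with Python's `ipaddress` module. It ignores the `--out-interface` match, the gateway and worker rules above them, and the chain's default policy, and the sample source address `172.21.0.5` is a hypothetical non-private job address.

```python
import ipaddress

# CIDR ranges copied from the iptables rules in the hunk above.
CLUSTER_INTERNAL = ipaddress.ip_network('10.128.0.0/16')
PRIVATE_JOB_NETWORK = ipaddress.ip_network('172.20.0.0/16')


def forward_accepted(source: str, destination: str) -> bool:
    """Simplified model: True if either of the two rules would ACCEPT the packet."""
    src = ipaddress.ip_address(source)
    dst = ipaddress.ip_address(destination)
    # Rule: --source 172.20.0.0/16 --jump ACCEPT
    if src in PRIVATE_JOB_NETWORK:
        return True
    # Rule: ! --destination 10.128.0.0/16 --jump ACCEPT
    return dst not in CLUSTER_INTERNAL


print(forward_accepted('172.20.0.5', '10.128.0.7'))  # private job to an internal IP
print(forward_accepted('172.21.0.5', '10.128.0.7'))  # other source to an internal IP
print(forward_accepted('172.21.0.5', '8.8.8.8'))     # other source to the internet
```

Under this reading, traffic from the private job network is accepted for any destination, while other traffic is accepted by these rules only when the destination lies outside `10.128.0.0/16`.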
8 changes: 0 additions & 8 deletions batch/batch/cloud/gcp/worker/credentials.py
@@ -13,14 +13,6 @@ def __init__(self, data: Dict[str, str]):
def cloud_env_name(self) -> str:
return 'GOOGLE_APPLICATION_CREDENTIALS'

@property
def username(self):
return '_json_key'

@property
def password(self) -> str:
return self._key

@property
def mount_path(self):
return '/gsa-key/key.json'
28 changes: 16 additions & 12 deletions batch/batch/cloud/gcp/worker/worker_api.py
@@ -6,9 +6,9 @@

from hailtop import httpx
from hailtop.aiocloud import aiogoogle
from hailtop.utils import check_exec_output, request_retry_transient_errors
from hailtop.utils import check_exec_output, retry_transient_errors

from ....worker.worker_api import CloudWorkerAPI
from ....worker.worker_api import CloudWorkerAPI, ContainerRegistryCredentials
from ..instance_config import GCPSlimInstanceConfig
from .credentials import GCPUserCredentials
from .disk import GCPDisk
@@ -46,22 +46,23 @@ def create_disk(self, instance_name: str, disk_name: str, size_in_gb: int, mount
def get_cloud_async_fs(self) -> aiogoogle.GoogleStorageAsyncFS:
return aiogoogle.GoogleStorageAsyncFS(session=self._google_session)

def get_compute_client(self) -> aiogoogle.GoogleComputeClient:
return self._compute_client

def user_credentials(self, credentials: Dict[str, str]) -> GCPUserCredentials:
return GCPUserCredentials(credentials)

async def worker_access_token(self, session: httpx.ClientSession) -> Dict[str, str]:
async with await request_retry_transient_errors(
session,
'POST',
async def worker_container_registry_credentials(self, session: httpx.ClientSession) -> ContainerRegistryCredentials:
token_dict = await retry_transient_errors(
session.post_read_json,
'http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token',
headers={'Metadata-Flavor': 'Google'},
timeout=aiohttp.ClientTimeout(total=60), # type: ignore
) as resp:
access_token = (await resp.json())['access_token']
return {'username': 'oauth2accesstoken', 'password': access_token}
)
access_token = token_dict['access_token']
return {'username': 'oauth2accesstoken', 'password': access_token}

async def user_container_registry_credentials(
self, user_credentials: GCPUserCredentials
) -> ContainerRegistryCredentials:
return {'username': '_json_key', 'password': user_credentials.key}

def instance_config_from_config_dict(self, config_dict: Dict[str, str]) -> GCPSlimInstanceConfig:
return GCPSlimInstanceConfig.from_dict(config_dict)
@@ -118,5 +119,8 @@ async def unmount_cloudfuse(self, mount_base_path_data: str):
os.remove(self._gcsfuse_credential_files[mount_base_path_data])
del self._gcsfuse_credential_files[mount_base_path_data]

async def close(self):
await self._compute_client.close()

def __str__(self):
return f'project={self.project} zone={self.zone}'
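
Both cloud implementations now return a `ContainerRegistryCredentials` value with `username` and `password` keys. The sketch below assumes that type is a `TypedDict` with exactly those two string fields (its definition is not shown in this diff) and shows one common way such a pair is consumed: encoding it into the `auths` section of a Docker `config.json`. `docker_config_json` is a hypothetical helper, not a function from the Hail worker.

```python
import base64
import json
from typing import TypedDict


class ContainerRegistryCredentials(TypedDict):
    """Assumed shape, inferred from the dict literals returned in the diff."""
    username: str
    password: str


def docker_config_json(registry: str, creds: ContainerRegistryCredentials) -> str:
    """Hypothetical helper: render credentials as a Docker config.json document."""
    pair = f"{creds['username']}:{creds['password']}"
    # Docker stores "username:password" base64-encoded under auths.<registry>.auth.
    auth = base64.b64encode(pair.encode('utf-8')).decode('ascii')
    return json.dumps({'auths': {registry: {'auth': auth}}})


worker_creds: ContainerRegistryCredentials = {
    'username': 'oauth2accesstoken',
    'password': 'example-access-token',
}
config = docker_config_json('gcr.io', worker_creds)
print(config)
```

Returning the same two-key shape from both the worker and user credential methods means a consumer like this needs no cloud-specific branches.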