Merge upstream changes #98

Merged
merged 49 commits into from
Mar 23, 2021
Commits
556fedf
[batch] Worker cleanup (#10155)
jigold Mar 9, 2021
7a45835
[query] Add `source_file_field` to `import_table` (#10164)
tpoterba Mar 9, 2021
31459ac
[ci] add authorize sha and action items table to user page (#10142)
daniel-goldstein Mar 9, 2021
f2b980f
[ci] add CI dropdown with link to user page (#10163)
daniel-goldstein Mar 9, 2021
e395f6d
[batch] add more logs and do not wait for asyncgens (#10136)
danking Mar 9, 2021
db45839
[query-service] maybe fix event loop not initialized (#10153)
danking Mar 9, 2021
7fa37a6
[prometheus] add prometheus to track SLIs (#10165)
daniel-goldstein Mar 9, 2021
831abfb
[query] apply nest-asyncio as early as possible (#10158)
danking Mar 9, 2021
09dd086
[grafana] set pod fsGroup to grafana user (#10162)
daniel-goldstein Mar 9, 2021
18042cc
fix linting errors (#10171)
daniel-goldstein Mar 10, 2021
c456b7e
[query] Remove verbose print (#10167)
tpoterba Mar 10, 2021
bcb0f3b
[ci] update assignees and reviewers on PR github update (#10168)
daniel-goldstein Mar 10, 2021
dd29d2b
[query-service] fix receive logic (#10159)
danking Mar 10, 2021
251d681
CHANGELOG: Fixed incorrect error message when incorrect type specifie…
johnc1231 Mar 10, 2021
434ba4a
[linting] add curlylint check for any service that renders jinja2 (#1…
daniel-goldstein Mar 11, 2021
723233f
[website] fix website (#10173)
danking Mar 11, 2021
d4689bf
[ci] change mention for deploy failure (#10178)
daniel-goldstein Mar 11, 2021
f5e497a
[gateway] move ukbb routing into gateway (#10179)
daniel-goldstein Mar 11, 2021
fee24a8
[query] Fix filter intervals (keep=False) memory leak (#10182)
tpoterba Mar 11, 2021
e47e71b
[query-service] remove service backend tests (#10180)
danking Mar 11, 2021
42847e4
[website] pass response body as kwarg (#10176)
daniel-goldstein Mar 11, 2021
1ef7018
Release 0.2.64 (#10183)
johnc1231 Mar 11, 2021
72dc5ee
[nginx] ensure nginx configs dont overwrite each other in build.yaml …
daniel-goldstein Mar 11, 2021
2d8ba29
[query-service] teach query service to read MTs and Ts created by Spa…
danking Mar 11, 2021
fbf6233
[website] dont jinja render any of the batch docs (#10190)
daniel-goldstein Mar 15, 2021
9de3415
[googlestoragefs] ignore the directory check entirely (#10185)
danking Mar 15, 2021
d463980
[ci] fix focus on slash and search job page for PRs (#10194)
daniel-goldstein Mar 15, 2021
9df985a
[query] Improve file compatibility error (#10191)
tpoterba Mar 16, 2021
1443fbc
Call init_service from init based on HAIL_QUERY_BACKEND value. (#10189)
lgruen Mar 16, 2021
321047b
[query] NDArray Sum (#10187)
johnc1231 Mar 16, 2021
79cc76b
[website] fix resource path for non-html files in the docs (#10196)
daniel-goldstein Mar 16, 2021
383069d
[query] Remove tcode from primitive orderings (#10193)
tpoterba Mar 16, 2021
0fcc1e9
[query] BlockMatrix map (#10195)
johnc1231 Mar 17, 2021
5d89225
[query] Remove all uses of .tcode[Boolean] (#10198)
chrisvittal Mar 17, 2021
022d02f
[ci] make test hello speak https (#10192)
daniel-goldstein Mar 17, 2021
466f3c3
[query] blanczos_pca dont do extra loading work (#10201)
johnc1231 Mar 18, 2021
8e1a9f1
Add query graceful shutdown for rolling updates (#10106)
illusional Mar 18, 2021
9df5c2e
[auth] add more options for obtaining session id for dev credentials …
daniel-goldstein Mar 19, 2021
a5304d8
[query] Default to Spark 3 (#10054)
johnc1231 Mar 19, 2021
f475cb6
[batch] Add more info to UI pages (#10070)
jigold Mar 22, 2021
0521e09
Bump jinja2 from 2.10.1 to 2.11.3 in /docker (#10209)
dependabot[bot] Mar 22, 2021
c76da66
[docker][hail] update to latest pytest (#10177)
danking Mar 22, 2021
a88fcd5
[gateway] Cut out router and router-resolver from gateway internal ro…
daniel-goldstein Mar 22, 2021
1a4aebc
[datasets] add pan-ukb datasets (#10186)
pwc2 Mar 22, 2021
c14bce4
[query] Add json warn context to `parse_json` (#10160)
tpoterba Mar 23, 2021
357844b
[query] fix tmp_dir default in init(), which doesn't work for the ser…
lgruen Mar 23, 2021
8aee9a5
[gitignore]ignore website and doc files (#10214)
CDiaz96 Mar 23, 2021
d457460
Merge remote-tracking branch 'upstream/main' into upstream
lgruen Mar 23, 2021
3ad57ea
Remove duplicate on_shutdown in query service
illusional Mar 23, 2021
5 changes: 5 additions & 0 deletions .gitignore
@@ -27,3 +27,8 @@ GTAGS
*.dylib
*/hail.jar
infra/.terraform.lock.hcl
hail/python/hail/docs/experimental/hail.experimental.DB.rst
hail/python/hailtop/batch/docs/api/
web_common/web_common/static/css/
website/docs.tar.gz
website/website/static/css/
47 changes: 29 additions & 18 deletions auth/auth/auth.py
@@ -8,7 +8,6 @@
import google.auth.transport.requests
import google.oauth2.id_token
import google_auth_oauthlib.flow
from hailtop.auth import async_get_userinfo
from hailtop.config import get_deploy_config
from hailtop.tls import internal_server_ssl_context
from hailtop.hail_logging import AccessLogger
@@ -526,18 +525,7 @@ async def rest_logout(request, userdata):
return web.Response(status=200)


@routes.get('/api/v1alpha/userinfo')
async def userinfo(request):
if 'Authorization' not in request.headers:
log.info('Authorization not in request.headers')
raise web.HTTPUnauthorized()

auth_header = request.headers['Authorization']
session_id = maybe_parse_bearer_header(auth_header)
if not session_id:
log.info('Bearer not in Authorization header')
raise web.HTTPUnauthorized()

async def get_userinfo(request, session_id):
# b64 encoding of 32-byte session ID is 44 bytes
if len(session_id) != 44:
log.info('Session id != 44 bytes')
@@ -554,18 +542,41 @@ async def userinfo(request):
if len(users) != 1:
log.info(f'Unknown session id: {session_id}')
raise web.HTTPUnauthorized()
user = users[0]
return users[0]


@routes.get('/api/v1alpha/userinfo')
async def userinfo(request):
if 'Authorization' not in request.headers:
log.info('Authorization not in request.headers')
raise web.HTTPUnauthorized()

auth_header = request.headers['Authorization']
session_id = maybe_parse_bearer_header(auth_header)
if not session_id:
log.info('Bearer not in Authorization header')
raise web.HTTPUnauthorized()

return web.json_response(await get_userinfo(request, session_id))


return web.json_response(user)
async def get_session_id(request):
if 'X-Hail-Internal-Authorization' in request.headers:
return maybe_parse_bearer_header(request.headers['X-Hail-Internal-Authorization'])

if 'Authorization' in request.headers:
return maybe_parse_bearer_header(request.headers['Authorization'])

session = await aiohttp_session.get_session(request)
return session.get('session_id')


@routes.get('/api/v1alpha/verify_dev_credentials')
async def verify_dev_credentials(request):
session = await aiohttp_session.get_session(request)
session_id = session.get('session_id')
session_id = await get_session_id(request)
if not session_id:
raise web.HTTPUnauthorized()
userdata = await async_get_userinfo(session_id=session_id)
userdata = await get_userinfo(request, session_id)
is_developer = userdata is not None and userdata['is_developer'] == 1
if not is_developer:
raise web.HTTPUnauthorized()
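Reading the `get_session_id` hunk above: the session ID is looked up in three places, in a fixed order. The sketch below mirrors that precedence using plain dictionaries in place of the aiohttp request and session. `parse_bearer` and `pick_session_id` are hypothetical stand-ins written for illustration, not the helpers in `auth.py`, and the real `maybe_parse_bearer_header` may parse the header differently.

```python
from typing import Mapping, Optional


def parse_bearer(value: str) -> Optional[str]:
    """Hypothetical stand-in for maybe_parse_bearer_header: return the token
    from a 'Bearer <token>' header value, or None if the prefix is missing."""
    prefix = 'Bearer '
    if value.startswith(prefix):
        return value[len(prefix):]
    return None


def pick_session_id(headers: Mapping[str, str],
                    cookie_session: Mapping[str, str]) -> Optional[str]:
    """Mirror the lookup order in the diff: internal header first, then the
    Authorization header, then the cookie-backed session."""
    if 'X-Hail-Internal-Authorization' in headers:
        return parse_bearer(headers['X-Hail-Internal-Authorization'])
    if 'Authorization' in headers:
        return parse_bearer(headers['Authorization'])
    return cookie_session.get('session_id')
```

As in the diff, a header that is present but does not parse returns `None` directly instead of falling through to the next source, so `verify_dev_credentials` would answer with 401 in that case.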
2 changes: 1 addition & 1 deletion batch/Dockerfile.worker
@@ -10,7 +10,7 @@ RUN hail-apt-get-install \
COPY docker/hail-ubuntu/pip.conf /root/.config/pip/pip.conf
COPY docker/hail-ubuntu/hail-pip-install /bin/hail-pip-install
COPY docker/requirements.txt .
RUN hail-pip-install -r requirements.txt pyspark==2.4.0
RUN hail-pip-install -r requirements.txt pyspark==3.1.1

Juicy, I like this!


ENV SPARK_HOME /usr/local/lib/python3.7/site-packages/pyspark
ENV PATH "$PATH:$SPARK_HOME/sbin:$SPARK_HOME/bin"
3 changes: 3 additions & 0 deletions batch/batch/batch.py
@@ -41,6 +41,7 @@ def _time_msecs_str(t):

d = {
'id': record['id'],
'user': record['user'],
'billing_project': record['billing_project'],
'token': record['token'],
'state': state,
@@ -85,6 +86,8 @@ def job_record_to_dict(record, name):
'batch_id': record['batch_id'],
'job_id': record['job_id'],
'name': name,
'user': record['user'],
'billing_project': record['billing_project'],
'state': record['state'],
'exit_code': exit_code,
'duration': duration
27 changes: 16 additions & 11 deletions batch/batch/front_end/front_end.py
@@ -222,7 +222,8 @@ async def _query_batch_jobs(request, batch_id):
where_args.extend(args)

sql = f'''
SELECT jobs.*, batches.format_version, job_attributes.value AS name, SUM(`usage` * rate) AS cost
SELECT jobs.*, batches.user, batches.billing_project, batches.format_version,
job_attributes.value AS name, SUM(`usage` * rate) AS cost
FROM jobs
INNER JOIN batches ON jobs.batch_id = batches.id
LEFT JOIN job_attributes
@@ -1150,7 +1151,7 @@ async def _get_job(app, batch_id, job_id):
db: Database = app['db']

record = await db.select_and_fetchone('''
SELECT jobs.*, ip_address, format_version, SUM(`usage` * rate) AS cost
SELECT jobs.*, user, billing_project, ip_address, format_version, SUM(`usage` * rate) AS cost
FROM jobs
INNER JOIN batches
ON jobs.batch_id = batches.id
@@ -1252,28 +1253,31 @@ async def ui_get_job(request, userdata, batch_id):
app = request.app
job_id = int(request.match_info['job_id'])

job_status, attempts, job_log = await asyncio.gather(_get_job(app, batch_id, job_id),
_get_attempts(app, batch_id, job_id),
_get_job_log(app, batch_id, job_id))
job, attempts, job_log = await asyncio.gather(_get_job(app, batch_id, job_id),
_get_attempts(app, batch_id, job_id),
_get_job_log(app, batch_id, job_id))

job_status_status = job_status['status']
job['duration'] = humanize_timedelta_msecs(job['duration'])
job['cost'] = cost_str(job['cost'])

job_status = job['status']
container_status_spec = dictfix.NoneOr({
'name': str,
'timing': {'pulling': dictfix.NoneOr({'duration': dictfix.NoneOr(Number)}),
'running': dictfix.NoneOr({'duration': dictfix.NoneOr(Number)})},
'container_status': {'out_of_memory': False},
'state': str})
job_status_status_spec = {
job_status_spec = {
'container_statuses': {'input': container_status_spec,
'main': container_status_spec,
'output': container_status_spec}}
job_status_status = dictfix.dictfix(job_status_status, job_status_status_spec)
container_statuses = job_status_status['container_statuses']
job_status = dictfix.dictfix(job_status, job_status_spec)
container_statuses = job_status['container_statuses']
step_statuses = [container_statuses['input'],
container_statuses['main'],
container_statuses['output']]

job_specification = job_status['spec']
job_specification = job['spec']
if 'process' in job_specification:
process_specification = job_specification['process']
process_type = process_specification['type']
@@ -1289,11 +1293,12 @@ async def ui_get_job(request, userdata, batch_id):
page_context = {
'batch_id': batch_id,
'job_id': job_id,
'job': job,
'job_log': job_log,
'attempts': attempts,
'step_statuses': step_statuses,
'job_specification': job_specification,
'job_status_str': json.dumps(job_status, indent=2)
'job_status_str': json.dumps(job, indent=2)
}
return await render_template('batch', request, userdata, 'job.html', page_context)

23 changes: 22 additions & 1 deletion batch/batch/front_end/templates/batch.html
@@ -4,7 +4,28 @@
<script src="{{ base_path }}/common_static/focus_on_keyup.js"></script>
{% endblock %}
{% block content %}

<h1>Batch {{ batch['id'] }}</h1>

<h2>Properties</h2>
<ul>
<li>User: {{ batch['user'] }}</li>
<li>Billing Project: {{ batch['billing_project'] }}</li>
<li>Time Created: {% if 'time_created' in batch and batch['time_created'] is not none %}{{ batch['time_created'] }}{% endif %}</li>
<li>Time Closed: {% if 'time_closed' in batch and batch['time_closed'] is not none %}{{ batch['time_closed'] }}{% endif %}</li>
<li>Time Completed: {% if 'time_completed' in batch and batch['time_completed'] is not none %}{{ batch['time_completed'] }}{% endif %}</li>
<li>Total Jobs: {{ batch['n_jobs'] }}</li>
<ul>
<li>Pending Jobs: {{ batch['n_jobs'] - batch['n_completed'] }}</li>
<li>Succeeded Jobs: {{ batch['n_succeeded'] }}</li>
<li>Failed Jobs: {{ batch['n_failed'] }}</li>
<li>Cancelled Jobs: {{ batch['n_cancelled'] }}</li>
</ul>
<li>Duration: {% if 'duration' in batch and batch['duration'] is not none %}{{ batch['duration'] }}{% endif %}</li>
<li>Cost: {% if 'cost' in batch and batch['cost'] is not none %}{{ batch['cost'] }}{% endif %}</li>
</ul>

<h2>Attributes</h2>
{% if 'attributes' in batch %}
{% for name, value in batch['attributes'].items() %}
<p>{{ name }}: {{ value }}</p>
@@ -64,7 +85,7 @@ <h2>Jobs</h2>
<tbody>
{% for job in batch['jobs'] %}
<tr>
<td class="numeric-cell">
<td class="numeric-cell" onClick="document.location.href='{{ base_path }}/batches/{{ job['batch_id'] }}/jobs/{{ job['job_id'] }}';">
<a href="{{ base_path }}/batches/{{ job['batch_id'] }}/jobs/{{ job['job_id'] }}">{{ job['job_id'] }}</a>
</td>
<td>
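The new Properties list in `batch.html` derives the pending count from two counters and prints optional fields only when the key is present and its value is not `none`. A minimal Python sketch of the same logic follows; `batch_properties` is a hypothetical helper for illustration and is not part of the PR.

```python
from typing import Any, Dict


def batch_properties(batch: Dict[str, Any]) -> Dict[str, str]:
    """Compute the values the Properties list displays for a batch record."""
    def optional(key: str) -> str:
        # Same guard as the template: key present and value not None.
        value = batch.get(key)
        return '' if value is None else str(value)

    return {
        # Pending is everything not yet completed, as in the template.
        'pending': str(batch['n_jobs'] - batch['n_completed']),
        'duration': optional('duration'),
        'cost': optional('cost'),
    }
```

Because pending is computed as `n_jobs - n_completed`, jobs that are still running are presumably counted as pending on this page.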
8 changes: 7 additions & 1 deletion batch/batch/front_end/templates/batches.html
@@ -52,6 +52,8 @@ <h1>Batches</h1>
<thead>
<tr>
<th>ID</th>
<th>User</th>
<th>Billing Project</th>
<th>Name</th>
<th>Submitted</th>
<th>Completed</th>
@@ -68,7 +70,11 @@ <h1>Batches</h1>
<tbody>
{% for batch in batches %}
<tr>
<td class="numeric-cell"><a href="{{ base_path }}/batches/{{ batch['id'] }}">{{ batch['id'] }}</a></td>
<td class="numeric-cell" onClick="document.location.href='{{ base_path }}/batches/{{ batch['id'] }}';">
<a href="{{ base_path }}/batches/{{ batch['id'] }}">{{ batch['id'] }}</a>
</td>
<td>{{ batch['user'] }}</td>
<td>{{ batch['billing_project'] }}</td>
<td>
{% if 'attributes' in batch and 'name' in batch['attributes'] and batch['attributes']['name'] is not none %}
{{ batch['attributes']['name'] }}
12 changes: 12 additions & 0 deletions batch/batch/front_end/templates/job.html
@@ -3,6 +3,18 @@
{% block content %}
<h1>Batch {{ batch_id }} Job {{ job_id }}</h1>

<h2>Properties</h2>
<ul>
<li><a href="{{ base_path }}/batches/{{ batch_id }}">Batch ID: {{ batch_id }}</a></li>
<li>Job ID: {{ job_id }}</li>
<li>User: {{ job['user'] }} </li>
<li>Billing Project: {{ job['billing_project'] }}</li>
<li>State: {{ job['state'] }}</li>
<li>Exit Code: {% if 'exit_code' in job and job['exit_code'] is not none %}{{ job['exit_code'] }}{% endif %}</li>
<li>Duration: {% if 'duration' in job and job['duration'] is not none %}{{ job['duration'] }}{% endif %}</li>
<li>Cost: {% if 'cost' in job and job['cost'] is not none %}{{ job['cost'] }}{% endif %}</li>
</ul>

<h2>Attempts</h2>
{% if attempts %}
<table class="data-table">
1 change: 1 addition & 0 deletions batch/test/test_dag.py
@@ -156,6 +156,7 @@ def test():
callback_body.pop('duration')
assert (callback_body == {
'id': b.id,
'user': 'test',
'billing_project': 'test',
'token': token,
'state': 'success',
3 changes: 0 additions & 3 deletions benchmark-service/Dockerfile.test
@@ -1,6 +1,3 @@
FROM {{ service_base_image.image }}

COPY benchmark-service/test/ /test/
RUN python3 -m pip install --no-cache-dir \
pytest-instafail==0.4.1 \
pytest-asyncio==0.10.0
3 changes: 1 addition & 2 deletions benchmark-service/test/test_update_commits.py
@@ -9,14 +9,13 @@
from hailtop.httpx import client_session
import hailtop.utils as utils

pytestmark = pytest.mark.asyncio

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

sha = 'd626f793ad700c45a878d192652a0378818bbd8b'


@pytest.mark.asyncio
async def test_update_commits():
deploy_config = get_deploy_config()
headers = service_auth_headers(deploy_config, 'benchmark')
24 changes: 3 additions & 21 deletions build.yaml
@@ -684,26 +684,6 @@ steps:
to: /cluster-tests.tar.gz
dependsOn:
- hail_build_image
- kind: runImage
name: build_hail_spark3
image:
valueFrom: hail_build_image.image
resources:
memory: "7.5G"
cpu: "4"
script: |
set -ex
cd /
rm -rf repo
mkdir repo
cd repo
{{ code.checkout_script }}
cd hail
time retry ./gradlew --version
export SPARK_VERSION="3.0.1" SCALA_VERSION="2.12.12"
time retry make jars python-version-info wheel
dependsOn:
- hail_build_image
- kind: buildImage
name: batch_worker_image
dockerFile: batch/Dockerfile.worker
@@ -2830,6 +2810,8 @@ steps:
mkdir -p ./ci/test ./hail/python
cp /repo/hail/ci/test/resources/build.yaml ./
cp -R /repo/hail/ci/test/resources ./ci/test/
cp /repo/hail/tls/Dockerfile ./ci/test/resources/Dockerfile.certs
cp /repo/hail/tls/create_certs.py ./ci/test/resources/
cp /repo/hail/pylintrc ./
cp /repo/hail/setup.cfg ./
cp -R /repo/hail/docker ./
@@ -3289,7 +3271,7 @@ steps:
script: |
set -ex
gcloud auth activate-service-account --key-file=/secrets/ci-deploy-0-1--hail-is-hail.json
SPARK_VERSION=2.4.5
SPARK_VERSION=3.1.1
BRANCH=0.2
SHA="{{ code.sha }}"
GS_JAR=gs://hail-common/builds/${BRANCH}/jars/hail-${BRANCH}-${SHA}-Spark-${SPARK_VERSION}.jar
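The `SPARK_VERSION` bump in this hunk changes the name of the jar referenced by `GS_JAR`, since the version is embedded in the path. A small sketch of the same template in Python; `jar_path` is a hypothetical helper, and the SHA used below is a placeholder, not a real commit.

```python
def jar_path(branch: str, sha: str, spark_version: str) -> str:
    """Build the jar location using the same template as the GS_JAR line."""
    return (f'gs://hail-common/builds/{branch}/jars/'
            f'hail-{branch}-{sha}-Spark-{spark_version}.jar')


# Placeholder SHA for illustration only; the real value comes from code.sha.
example = jar_path('0.2', 'abc123', '3.1.1')
```

With the same branch and SHA, the 2.4.5 and 3.1.1 settings resolve to different object names, so anything that reads the old path would need to follow the new version string.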
1 change: 0 additions & 1 deletion ci/Dockerfile.test
@@ -4,4 +4,3 @@ COPY hail/python/setup-hailtop.py /hailtop/setup.py
COPY hail/python/hailtop /hailtop/hailtop/
RUN hail-pip-install /hailtop && rm -rf /hailtop
COPY ci/test/ /test/
RUN hail-pip-install pytest-instafail==0.4.1 pytest-asyncio==0.10.0
30 changes: 30 additions & 0 deletions ci/test/resources/build.yaml
@@ -52,6 +52,36 @@ steps:
publishAs: service-base
dependsOn:
- base_image
- kind: buildImage
name: create_certs_image
dockerFile: ci/test/resources/Dockerfile.certs
contextPath: ci/test/resources
publishAs: test_hello_create_certs_image
dependsOn:
- service_base_image
- kind: runImage
name: create_certs
image:
valueFrom: create_certs_image.image
script: |
set -ex
python3 create_certs.py \
{{ default_ns.name }} \
config.yaml \
/ssl-config-hail-root/hail-root-key.pem \
/ssl-config-hail-root/hail-root-cert.pem
serviceAccount:
name: admin
namespace:
valueFrom: default_ns.name
secrets:
- name: ssl-config-hail-root
namespace:
valueFrom: default_ns.name
mountPath: /ssl-config-hail-root
dependsOn:
- default_ns
- create_certs_image
- kind: buildImage
name: hello_image
dockerFile: ci/test/resources/Dockerfile
4 changes: 4 additions & 0 deletions ci/test/resources/config.yaml
@@ -0,0 +1,4 @@
principals:
- name: hello
domain: hello
kind: json