forked from hail-is/hail
Upstream 0.2.132 release #344
Merged
Conversation
Incorporates grafana which we did not have at the time of previous revision and removes the nginx config that is no longer used. We now use Envoy as our load balancer. I'll make a separate dev doc explaining the gateways.
A long-standing FIXME in the LocalBackend was to stop relying on HadoopFS, which we use with the SparkBackend for compatibility with Dataproc and hdfs URLs. By default, HadoopFS doesn't understand gs URLs; users need to install the gcs-hadoop-connector (preinstalled in Dataproc) to communicate with Google Cloud Storage, and Spark handles supplying credentials to the connector. Issue hail-is#13904 is caused by failing to properly supply the gcs-hadoop-connector with credentials in the LocalBackend: in the absence of config, the connector hangs while trying to fetch a token from a non-existent metadata server. The LocalBackend was designed to be a testing ground for lowered and compiled code that would eventually run on Batch, where we use RouterFS. I propose a pragmatic fix for hail-is#13904 that ditches HadoopFS for all but local filesystem access in the LocalBackend instead of identifying and fixing the root cause. In doing so, I made a couple of changes to how RouterFS is configured: in the absence of the `HAIL_CLOUD` environment variable, RouterFS can handle gs and az URLs iff credentials are not supplied; if the user supplies credentials, we use `HAIL_CLOUD` to decide which cloud to route to. Fixes hail-is#13904
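The routing rule described here might be sketched like this (a hypothetical illustration, not the real RouterFS code; all names are invented):

```python
from typing import Optional

# Hypothetical sketch of scheme-based filesystem routing, not the actual
# RouterFS implementation. With no credentials, gs and az URLs are both
# routable; with credentials, HAIL_CLOUD decides which cloud is allowed.
CLOUD_SCHEMES = {"gs": "gcp", "az": "azure"}

def pick_filesystem(url: str, hail_cloud: Optional[str], has_credentials: bool) -> str:
    """Return the name of the filesystem that should handle `url`."""
    scheme = url.split("://", 1)[0] if "://" in url else "file"
    if scheme == "file":
        return "local"
    if scheme not in CLOUD_SCHEMES:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    cloud = CLOUD_SCHEMES[scheme]
    if has_credentials and hail_cloud != cloud:
        raise ValueError(
            f"credentials supplied but HAIL_CLOUD={hail_cloud!r} does not route {scheme!r} URLs"
        )
    return cloud
```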
…ail-is#14451) Before we can simplify the binding structure, we need to stop duplicating it all over the place. This PR rewrites `FreeVariables` so that it no longer needs special logic for particular nodes, hard-coding the binding structure (redundantly). To do this, it takes advantage of the new `Bindings`, which operates on a `GenericBindingEnv` interface. It adds a new implementation of this interface specifically for computing free variables, then simply does a generic traversal of the IR using this custom binding environment. While I find the new implementation far simpler and more obviously correct than the old one, I do expect it to simplify further once I'm able to start modifying the core binding structure.
…ail-is#14461) We can always revert this if there are issues, but the polling that CI does to refresh its understanding of PRs and batches is pretty light, and I think every minute is still quite reasonable.
…ail-is#14458) I missed that hail-is#14443 only got one of several `LOCK IN SHARE MODE` lines in the billing projects query. This gets the rest in the two `query_billing_projects` functions that do read-only selects.
Fixes hail-is#13971 CHANGELOG: Hail now supports and primarily tests against Dataproc 2.2.5, Spark 3.5.0, and Java 11. We strongly recommend updating to Spark 3.5.0 and Java 11. You should also update your GCS connector *after installing Hail*: `curl https://broad.io/install-gcs-connector | python3`. Do not try to update before installing Hail 0.2.131. https://cloud.google.com/dataproc/docs/concepts/versioning/dataproc-release-2.2 --------- Co-authored-by: Edmund Higham <edhigham@gmail.com>
…s#14277) Later versions of IDEA can't run tests with our current testng and scalatest versions. Update to latest and fix fallout.
… of 5" (hail-is#14476) Reverts hail-is#14461. We're hitting [github rate limits](https://console.cloud.google.com/logs/query;query=resource.type%3D%22k8s_container%22%0Aresource.labels.namespace_name%3D%22default%22%0Aresource.labels.container_name%3D%22ci%22%0A--%20severity%3DERROR%20OR%20WARNING;pinnedLogId=2024-04-18T15:42:12.920462831Z%2Fvdlhscspn377olu0;cursorTimestamp=2024-04-18T15:42:14.785330871Z;duration=P1D?project=hail-vdc) which is preventing actions like dev deploys. The limit is apparently [5,000 requests/hour](https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28#primary-rate-limit-for-authenticated-users). This feels excessive to me, and I suspect we must be using the API poorly, but I want to just revert this before investigating further so CI doesn't get stuck.
Fixes: hail-is#14462 Restores the ability to reuse spark contexts across different hail sessions, e.g. using multiple databricks notebook sessions.
In QoB, the Query Driver and all Query Worker jobs reside in the same batch. Currently, the Query Driver submits worker jobs and then waits for them to finish before collecting their results from GCS. It waits by polling the status of the batch until the number of completed jobs reaches `n_total_jobs - 1` (to account for itself). This is both awkward and prevents a couple of potentially valuable pieces of functionality: you cannot run multiple concurrent Query Drivers, and you cannot take advantage of Batch's `cancel_after_n_failures` functionality because you do not want to cancel the Query Driver itself. The introduction of job groups addresses both of these problems, as the Query Driver can create a job group for each stage of worker jobs and then await its completion. This PR creates a job group for the Query Driver, which then creates a nested job group per stage of workers. Utilizing `cancel_after_n_failures` to fail a stage early is future work.
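The difference between the two wait strategies can be sketched roughly like this (the `poll` callables are hypothetical stand-ins for Batch API status requests, not actual client code):

```python
def driver_wait_old(poll_batch_status, n_total_jobs: int) -> None:
    # Old behavior: poll the whole batch, subtracting the driver itself
    # from the job total. Awkward, and breaks with concurrent drivers.
    while True:
        if poll_batch_status()["n_completed"] >= n_total_jobs - 1:
            return

def driver_wait_new(poll_job_group_status) -> None:
    # With job groups: the driver awaits completion of the worker stage's
    # job group directly, never having to count itself out.
    while True:
        if poll_job_group_status()["complete"]:
            return
```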
Use generic environment propagation in PruneDeadFields as much as possible. This allowed the deletion of many cases in `rebuild`.
For Hail Batch on Terra Azure, the production artifact is a Helm chart containing the necessary kubernetes resources to run a Hail Batch deployment in a Terra k8s cluster. This deployment contains slightly modified containers of the batch front-end, batch driver, and a mysql database. This chart is currently built manually using the targets in `batch/terra-chart/Makefile`. As this process is not currently automatically tested, it's very prone to bit rot. This PR is an amalgamation of fixes that I needed to make to get `main` to build in the current Terra. A non-exhaustive list of the changes: - After changing from gradle to mill, some Dockerfiles and make targets needed to change to account for the new location of the JAR. - I removed some redundancy in invocations of `docker build` by relying on the generic targets that we now have in the top-level Makefile. - Terra changed how they handle identity management for the kubernetes deployment, from `aadpodidentity` to `workloadIdentity`. This changes the chart to work with the new inputs they provide. Ultimately, Terra should have a CI system that we can push charts to and receive feedback on whether it passed our test suite in their test environment.
`curl` is already installed in `hail_ubuntu_image`, on which `curl_image` is based.
`orjson` 3.9.15 fixed the rare segfault that we saw in `3.9.11`. Besides just updating to latest patch and minor versions: - Removed a redundant requirement of `orjson` in `gear/requirements.txt` -- it inherits `orjson` from hailtop - Bokeh `3.4` made a breaking change w.r.t. the `circle` method on figures. I have restricted the bounds for `bokeh` to avoid this breaking change but will follow up with a PR that changes our usage of bokeh to follow the deprecation/upgrade advice and undo the bounds restriction
setup.py complains awfully when you invoke it directly. Apparently this sort of thing is unacceptable, and the recommended workflow for building wheels is to use `build`. See the link below for more information. https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html
[uv](https://github.com/astral-sh/uv) is a new package resolver by the same folks who make `ruff`. It's touted as being really fast, which honestly it is, but mostly it's appealing to me because it supports generating lockfiles for platforms and Python versions other than the system you run it on, which allows us to delete all this dockerizing of `pip-compile` in order to generate lockfiles for linux. It's a really green project, so I'm open to pushback on incorporating it, but it seemed like a worthwhile simplification. I also quite like that it allows for additional strategies in generating lockfiles. By default, it behaves as you'd expect, locking packages to the highest version within the acceptable bounds. But you can also configure it to generate the *lowest* acceptable pins, so we could actually verify whether the lower bounds in our requirements files are acceptable or not.
…ex` (hail-is#14497) hail-is#14085 fixed a change in semantics whereby `parallelizeAndComputeWithIndex` no longer failed early. It also noted that the `SparkTaskContext.partitionId` could be wrong when used with call caching, but left the fix to another change. This change takes inspiration from `SparkContext.runJob` and its optional `partitions` argument. By supplying this, we can limit the number of partitions that we compute (useful for call caching) without doing dodgy things that break the `partitionId`. This change also reduces code duplication while preserving semantics: `parallelizeAndComputeWithIndex` still fails early for the local and spark backends.
This function is an unused relic of the first version of our billing system, which just computed the cost of a slot on a Batch VM per mcpu-millisecond based on some hard-coded prices. We long ago migrated to a system where we track GCP SKUs and price changes through their API and properly itemize usage by SKU in the database.
This is a long overdue description of how our K8s ingress / config is set up. I've tried to keep it as concise as possible, so ask for elaboration if there are unexplained gaps.
As a step towards corralling the specification of the binding structure of the IR into one place, this rewrites `Bindings` to use only a single method of the `GenericBindingEnv` interface, `newBlock`, which therefore captures all possibilities of how a node can modify its parent's environment in a child. Later work refactors this to return an object encoding this modification instead of returning a modified environment, which gives the caller complete flexibility in how to maintain an environment appropriately for its use case. This PR leaves in the old `Bindings` implementation, with an assertion that they agree. The PR stacked above this, hail-is#14495, deletes the old implementation. This way CI asserts that this refactoring hasn't changed any behavior.
Follows up on hail-is#14475 by deleting the old implementation of `Bindings` in favor of the new one, which has now been checked by CI to not change the behavior.
Refactors `Bindings` to return an object encoding the change to the environment (any new bindings, whether the agg/scan env is promoted, etc). This allows the deletion of `SegregatedBindingEnv`. Follow up work will use this to replace the other specializations of `GenericBindingEnv`, and to greatly simplify compiler passes, such as `NormalizeNames` and `PruneDeadFields`, which currently need to redundantly encode the binding structure of every node.
The memory that machines use is calculated based on the number of cores the machine has, with a fixed ratio of GB of RAM per core. The code assumed a ratio of 3.75 GB of RAM per core, but this assumption does not hold outside of the n1 machine family. This PR changes the functions that calculate memory from the number of cores to account for this by passing the machine_type as the main parameter rather than the worker type. Then, logic is added to use the appropriate ratio of 4 GB per core for the g2 family of machines. --------- Co-authored-by: Daniel Goldstein <danielgold95@gmail.com> Co-authored-by: Sophie Parsa <parsa@wm9c6-e4e.broadinstitute.org>
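A minimal sketch of the idea (the function name and the treatment of unknown families are assumptions; only the n1 and g2 ratios come from the description above):

```python
# GB of RAM per core, keyed by machine family. The n1 and g2 ratios are
# the ones discussed in the PR; other families would need their own entries.
MEMORY_PER_CORE_GB = {
    "n1": 3.75,
    "g2": 4.0,
}

def memory_gb(machine_type: str, cores: int) -> float:
    """Derive memory from the machine type's family instead of assuming n1."""
    family = machine_type.split("-", 1)[0]
    if family not in MEMORY_PER_CORE_GB:
        raise ValueError(f"unknown machine family: {family!r}")
    return MEMORY_PER_CORE_GB[family] * cores
```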
…ts (hail-is#14508) These docstrings weren't updated since the GCP multi-regional egress changes. It is no longer free to egress from/to multi-regional buckets, so the best thing to do would be to put your storage and compute into the same region.
…is#14520) We have the type, we should use it, saves a round trip especially for QoB.
At time of writing dependabot doesn't have a great way to bulk update dependencies across unrelated lockfiles in the repo, which often require manual intervention because we assert that we're always using the same package version everywhere. It's also just a lot of noise and hogs CI time. We've decided to move to a periodic bulk-update process + using repo Security alerts.
This PR extends a previous PR that added g2-standard-4 machines into batch by adding the full machine family. The code adds a new Accelerator Resource that takes the number of gpus. --------- Co-authored-by: Sophie Parsa <parsa@wm9c6-e4e.broadinstitute.org>
…ail-is#14560) CHANGELOG: The gvcf import stage of the VDS combiner now preserves the GT of reference blocks. Some datasets have haploid calls on sex chromosomes, and the fact that the reference was haploid should be preserved.
…s#14571) Fixes: hail-is#14559 `hl.nd.array`s constructed from stream pipelines can cause out-of-memory exceptions owing to a limitation in the python CSE algorithm, which does not eliminate partially redundant expressions in if-expressions. Explicitly `let`-binding the input collection prevents it from being evaluated twice: once for the flattened data stream and once for the original shape.
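The double-evaluation problem can be illustrated generically in plain Python (a hypothetical sketch; this is not the hail IR or the actual fix):

```python
# A side-effect counter makes the number of evaluations observable.
def make_stream(counter):
    counter["evals"] += 1
    return list(range(5))

def array_without_let(counter):
    # Without a let-binding, the producing expression is evaluated twice:
    # once for the flattened data and once to compute the shape.
    data = make_stream(counter)
    shape = (len(make_stream(counter)),)
    return shape, data

def array_with_let(counter):
    # Explicitly binding the collection evaluates it exactly once.
    xs = make_stream(counter)
    return (len(xs),), xs
```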
Hail's benchmarks were kind of their own thing and a little neglected. This change moves the benchmarks into the `hail/python` folder and updates them to use pytest with a custom plugin/set of pytest hooks. Now, benchmarks can be run from the command line like any pytest suite. This change removes the `benchmark-hail` (or `hailbench`) utility. Benchmarks are marked by `pytest.mark.benchmark` (via the `@benchmark` decorator). By convention, benchmarks are python tests whose names are prefixed by `benchmark_` and are located in files with the same prefix. Nothing enforces this, however, so you could name your benchmarks `test_*` and put them in files named `test_*.py`. Benchmarks may import and use any test code or utilities defined in `test/`. The results of each benchmark are output as json lines (`.jsonl`) to the file specified by the `--output` pytest arg, or to stdout. The folder structure should be familiar, resembling our `test/` directory. I believe this is flexible enough to add `hailtop` benchmarks should we so wish:
```
pytest.ini - hoisted from `test/` to include benchmark marks
benchmark/
- conftest.py for custom pytest command line args
- hail/
  - conftest.py for custom plugin that runs hail benchmarks
  - benchmark_*.py hail query benchmark code
- tools/
  - shared utilities, including the `@benchmark` decorator
```
Supporting pytest fixtures required writing a custom plugin to run benchmarks, as using off-the-shelf solutions like `pytest-benchmark` would forbid method-level fixtures like `tmp_path` etc. The plugin is designed to run "macro-benchmarks" (i.e. long-running tests) and fully supports pytest parameterisation. For each benchmark, the plugin initialises hail and then repeats (for a number of iterations defined by the pytest mark) acquiring fixtures, timing invocation, and tearing down fixtures, finally stopping hail. It is therefore unsuitable for microbenchmarks, of which we currently have none in python. If we add them, we'd need to tweak this to support them, perhaps with an inner loop or something. The process of submitting benchmarks to batch is greatly simplified: the old `Makefile` infrastructure for building wheels and docker images etc. has been replaced with the script `benchmark_in_batch.py`. Benchmark images are now based off the `hail-dev` image built in CI (or via the `hail-dev-image` make target). Furthermore, you can control the number of "replicate" jobs created for each benchmark at the benchmark level using the `@benchmark(batch_jobs=N)` decorator. Limitations/shortcomings: - Output is currently jsonl only. Some more human-friendly output might be nice on a per-iteration basis. - Old `benchmark-hail` utilities are broken. I'll restore these in subsequent changes. --------- Co-authored-by: Daniel Goldstein <danielgold95@gmail.com> Co-authored-by: Christopher Vittal <chris@vittal.dev>
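The per-benchmark timing loop might look roughly like this (a hypothetical sketch; the real decorator also applies `pytest.mark.benchmark` and reacquires fixtures between iterations):

```python
import functools
import time

def benchmark(iterations: int = 3):
    """Run the wrapped benchmark `iterations` times, timing each run."""
    def wrap(fn):
        @functools.wraps(fn)
        def run(*args, **kwargs):
            timings = []
            for _ in range(iterations):
                start = time.monotonic()
                fn(*args, **kwargs)
                timings.append(time.monotonic() - start)
            return timings
        return run
    return wrap

@benchmark(iterations=2)
def benchmark_sum_range():
    return sum(range(100_000))
```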
Fixes paths missed by hail-is#14565
This revamps the `batch.front_end` web pages to be a bit less cluttered and more usable. I brought in tailwindcss instead of using our existing sass. I mainly find it easier to work with but am fine porting it to sass if anyone feels strongly against changing css tools. In order to avoid overhauling all our HTML in one PR, I added a flag to the jinja context `use_tailwind` that is currently only on for the `batch` service. `layout.html` then uses this flag to determine which CSS and header it should use. If this gets to a state we're happy to merge, it should be a lot less work to bring over auth and CI and delete that flag.
…4583) `make_variants_matrix_table` and `make_variant_stream` shared a function for processing entries, as did `make_reference_matrix_table` and `make_reference_stream`. This deduplicates that functionality by lifting the internal functions out to top-level definitions.
…file uploads (hail-is#14576) We use `nest_asyncio` to allow synchronous blocking on a coroutine execution inside an async context. For example, if a user is using the synchronous interface of hail inside a jupyter cell (which runs an event loop). `nest_asyncio` achieves this by patching the `asyncio` event loop to make it reentrant, and with that there are footguns. This allows us to do things like create 100 tasks that all concurrently invoke `run_until_complete`, each of which will add stack frames to the event loop stack that can pile up and trigger a cryptic `RecursionError`. But internally we never *need* to make concurrent calls to `run_until_complete`, and more broadly we should never have `async / sync / async` inside hail code. This change exposes an asynchronous `validate_file` so that asynchronous methods in `hailtop` can use it directly instead of inserting a synchronous layer (`async_to_blocking`) between them.
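The shape of the fix can be sketched as follows (function bodies are invented; only the async/sync layering is the point):

```python
import asyncio

async def validate_file(path: str) -> bool:
    # Stand-in for the real asynchronous validation in hailtop.
    return not path.endswith("/")

def validate_file_blocking(path: str) -> bool:
    # Synchronous wrapper, for synchronous callers only.
    return asyncio.run(validate_file(path))

async def upload(path: str) -> str:
    # Correct: await the async helper directly. Blocking on it here
    # (async -> sync -> async) would need a reentrant, nest_asyncio-patched
    # loop and can pile up frames until a cryptic RecursionError.
    if not await validate_file(path):
        raise ValueError(f"invalid path: {path}")
    return "uploaded"
```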
These queries are responsible for getting the paginated list of jobs in the batch UI. I believe this missing ORDER BY went unnoticed because the queries were naturally returned in order based on the jobs table's primary key, but we've recently seen a couple of page loads with jumbled or missing jobs. It seems something around the query stats changed and the database adjusted its plan slightly. Regardless, we want this ORDER BY to be explicit.
Specify which benchmark to run by nodeid instead of by expression --------- Co-authored-by: Christopher Vittal <chris@vittal.dev>
Adds tests for sparse block matrices in python, which are currently failing in backends with bm lowering.
The original UI redesign was intended to make it easier to navigate and focus on important information. I received very positive feedback on some aspects like this, for example having the different job steps in separate tabs. There was also feedback though that there was a lot of dead space. This is an attempt to find a good middle ground where the UI is not overwhelmingly cluttered but is more compact and makes better use of screen real estate
We seem to be running into spark/yarn scheduling limitations causing stages to often fail with a large number of partitions. Here, we implement a very simple chunking strategy to run spark jobs with a limited number of partitions at a time. The maximum parallelism is controlled by a new `spark_max_stage_parallelism` feature flag, which defaults to MAXINT until we can figure out a good default. Also, this change corrects a small error in logic for partition indices for call caching. The `resultHandler` argument of [`runJob`] is called with the job's partition index, not the index of the partition within the RDD. So we need to index into the `partitions` sequence when populating the results buffer. CHANGELOG: Add 'spark_max_stage_parallelism' flag to allow users to run pipelines with a large number of partitions in chunks. By default, hail still attempts to run all partitions in a stage at once. [`runJob`]: https://spark.apache.org/docs/latest/api/scala/org/apache/spark/SparkContext.html#runJob[T,U](rdd:org.apache.spark.rdd.RDD[T],func:(org.apache.spark.TaskContext,Iterator[T])=%3EU,partitions:Seq[Int],resultHandler:(Int,U)=%3EUnit)(implicitevidence$11:scala.reflect.ClassTag[U]):Unit
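A schematic of the chunking logic (hypothetical names; `run_job` stands in for `SparkContext.runJob`, which hands the result handler the index of the partition within the sequence it was given):

```python
def run_in_chunks(run_job, partitions, spark_max_stage_parallelism: int):
    """Run `partitions` at most `spark_max_stage_parallelism` at a time."""
    results = [None] * len(partitions)
    chunk = max(1, spark_max_stage_parallelism)
    for start in range(0, len(partitions), chunk):
        subset = partitions[start:start + chunk]

        def handler(i, value, start=start):
            # `i` indexes into `subset`, so offset by the chunk start to
            # place the result by its index within `partitions`.
            results[start + i] = value

        run_job(subset, handler)
    return results
```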
…hail-is#14589) `'/api/v1alpha/batches/create-fast'` is a Batch API endpoint that allows you to create a batch, submit jobs, and start them running all in one HTTP request. It makes a few transactions within the endpoint and must be idempotent. One of the failure modes it must watch for is when a Batch Update to add some new jobs and/or job groups to a batch has already successfully been committed, but the request is retried for X reason. In that scenario, we want to just return a 200 to the client. We do this already, but there was an oversight when creating job groups where those branches did not include the full response payload, which results in intermittent errors on the client when it can't retrieve `resp['last_job_group_id']`. This fixes that so we always return the same JSON.
…il-is#14579) I believe hail-is#14547 introduced a bug that broke IR function deserialization in QoB by changing `value_parameter_names` from `Array[String]` to `Array[Name]`. This fixes the issue by pushing the introduction of the `Name` wrapper object to after deserialization. Another option is to incorporate the `{ str: String }` structure of `Name` into the python -> scala payload, but I'm not sure I see a use case for that and we can always do that later (there is no issue of backwards compatibility with this communication between python and scala). My main concern here is that clearly this isn't tested. I'd appreciate guidance on the current advised practice for testing this behavior.
These make targets have been more trouble than they are worth. Since both the requirements files and the pinned requirements files that they generate are checked into the repo, updates to these files through PRs don't play well with how make tracks timestamps, since my local git may update them in different orders. In general, I've never had a need to only update a single requirements file and since these are technically dependencies of say, the Batch image, building the batch image can incidentally trigger a re-generation of these files when I never actually changed the dependencies. This PR removes those targets and does a normal refresh of the requirements, as well as updates the documentation on how we handle pip dependencies.
Recent upstream changes include a rewrite of several of the web page templates. Hence this merge drops the functionality locally added by PRs #270, #272, and #311 to make the command <PRE> resizeable and to add job state quick links. We may re-add these improvements later by reimplementing them in the new template code.
illusional approved these changes on Jul 10, 2024