forked from hail-is/hail
Upstream 2023 03 01 1 #271
Merged
Conversation
* [query] IBD implemented in terms of block matrices
* cleanup
* fix
* get debugging info
* more debugging
* fix temp file location
* address comments
* minor fix
* fix?
* checkpoints

…is#12678)
* [query][scala-fs] more debugging information when memory fails
* wording
* log off the fast path

* [batch] Add environment variable for batch id in worker
* address comments?

* [batch] Profile more services not just batch-driver
* fixes
* only add the googlecloudprofiler logger filter if invoked
* fix HAIL_SHOULD_PROFILE
* sort
* add cloud env variable to auth
* fix
`gsutil` is one of the most user hostile tools I have ever used. Here are some examples of why. I think what I now have committed is the only way to achieve the behavior we want without assuming anything about which objects are present at the target.

```
$ gsutil -m cp -r baz gs://danking/baz/
$ gsutil ls gs://danking/baz/
gs://danking/baz/baz/
```

```
$ gsutil rm -rf gs://danking/baz
$ gsutil -m cp -r baz/ gs://danking/baz/
$ gsutil ls gs://danking/baz/
gs://danking/baz/baz/
```

```
$ gsutil rm -rf gs://danking/baz
$ gsutil -m cp -r baz gs://danking/baz
$ gsutil ls gs://danking/baz/
gs://danking/baz/1
gs://danking/baz/2
```

```
$ gsutil rm -rf gs://danking/baz
$ touch foo
$ gsutil cp foo gs://danking/baz/foo
$ gsutil -m cp -r baz gs://danking/baz
$ gsutil ls gs://danking/baz
gs://danking/baz/foo
gs://danking/baz/baz/
```

```
$ gsutil rm -rf gs://danking/baz
$ gsutil cp foo gs://danking/baz/foo
$ gsutil -m cp -r baz/\* gs://danking/baz/
$ gsutil ls gs://danking/baz
gs://danking/baz/1
gs://danking/baz/2
gs://danking/baz/foo
```
…12688)
* [query] improve error message when rng_nonce is unparseable
* Update ExecuteContext.scala
Efficiency can wait.
hail-is#12597)
* [compiler] Refactor compiled functions to take a HailTaskContext instead of partitionIndex

Partition index is now unnecessary due to the completion of the randomness redesign. HailTaskContext will be used in a subsequent PR to add task-level cleanup to permit aggressive caching in generated code.

* fixes
* oops writeIRs
* fix combine
* fix other issues in combine
* fix
* fix combop nonsense
* bleh
* fix
* fix
* bump
* [batch] Mitigate too many resources with same prices
* add equality checks
* fix
…hail-is#12600)
* [query] Make caching in PartitionNativeIntervalReader more aggressive

Add finalizers to HailTaskContext to clean up open indices.

* remove log
* comment

…ail-is#12667)
* [query] Add RouterFS to scala to permit local file system IO in QoB
* erasure
* scala is stupid
* override open/create cached
* fix cast
* [compiler] Iterative `DistinctlyKeyed` Analysis

Use iterative tree traversals to prevent exceeding stack size for large IRs.

* traverse all ir nodes
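The commit above replaces recursive IR traversal with an iterative one. A minimal sketch of that technique, using a hypothetical `Node` class rather than Hail's real IR types: an explicit work stack replaces the call stack, so very deep trees cannot overflow it.

```python
# Hypothetical simplified IR node; Hail's real IR classes differ.
class Node:
    def __init__(self, children=()):
        self.children = list(children)

def count_nodes(root):
    """Visit every node without recursion, using an explicit stack."""
    count = 0
    stack = [root]
    while stack:
        node = stack.pop()
        count += 1
        stack.extend(node.children)  # defer children instead of recursing
    return count
```

A naively recursive version would hit Python's default recursion limit (about 1000 frames) on a chain of 100,000 nodes; the iterative version handles it without issue.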
…il-is#12295)
* [lowering] Rewrite maximal independent set to be its own IR

And use this new IR to enable lowered execution of maximal_independent_set

* fix
* updates
* Update Copy.scala
* Update __init__.py
* Requireness.scala rule
* fix bindings
* fix
* remove bad error
* fix requireness
* custom element binding for tiebreaker arguments
* Add Array[Long] to hail value conversion
* Make maximalIndependentSet return an IndexedSeq

Functions.unwrapReturn expects scala IndexedSeq not java Array

* fix GraphSuite
* Add (bad) unwrap rule for arrays of structs
* xfails
* better uwrapReturn for arrays of structs
* fix bad merge
* correct Requiredness rules
* ChildEnvWithoutBindings rule

---------

Co-authored-by: Dan King <daniel.zidan.king@gmail.com>
…#12700)

Bumps [com.github.samtools:htsjdk](https://github.com/samtools/htsjdk) from 3.0.2 to 3.0.4.
- [Release notes](https://github.com/samtools/htsjdk/releases)
- [Commits](samtools/htsjdk@3.0.2...3.0.4)

---
updated-dependencies:
- dependency-name: com.github.samtools:htsjdk
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* [query] add `hl.utils.genomic_range_table`

CHANGELOG: In Query on Batch, `hl.balding_nichols_model` is slightly faster. Also added `hl.utils.genomic_range_table` to quickly create a table keyed by locus.

It has grated on me for a while that `hl.balding_nichols_model` requires a whole pass to verify it is sorted even though it is plainly so. This change introduces the necessary infrastructure to convince Hail of that fact.

* pylint
* missing import
* fix tests
* add to __init__.py
* better genomic range test tables
Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.2.2 to 2.2.3.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](pallets/werkzeug@2.2.2...2.2.3)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [de.undercouch.download](https://github.com/michel-kraemer/gradle-download-task) from 5.3.0 to 5.3.1.
- [Release notes](https://github.com/michel-kraemer/gradle-download-task/releases)
- [Commits](michel-kraemer/gradle-download-task@5.3.0...5.3.1)

---
updated-dependencies:
- dependency-name: de.undercouch.download
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* [query] Upgrade spark to 3.3.0 and dataproc to 2.1
* lint
* lint
* update zstd-jni
* trust pyspark on patch versions
* create specific exceptions to appease pylint

* [query] lower logistic SKAT

CHANGELOG: `hl.skat(..., logistic=True)` now supported in the Batch backend.

* minor doc fix
* remove wrong sentence from docs
* covariates are required now
* support max_iter = 0 in logreg_fit
* fix syntax
* wip docs
* docs iwp
* final take on the math
* docs fixes
…l-is#12709)
* [batch] Mitigate test failures by extending batch client timeout
* fix
Not all IOExceptions have a cause.
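The one-line commit message above refers to JVM exceptions whose `getCause` can be null. A sketch of the same defensive pattern translated to Python, where an exception's `__cause__` may likewise be absent (the `describe` helper is illustrative, not Hail's code):

```python
def describe(exc: BaseException) -> str:
    """Render an exception, mentioning its cause only when one exists."""
    cause = exc.__cause__
    if cause is None:
        # Not every exception has a cause; never assume one is present.
        return str(exc)
    return f"{exc} (caused by {type(cause).__name__})"
```

The point of the fix is simply that code inspecting a cause must handle its absence instead of dereferencing it unconditionally.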
* [batch] Support job logs that are not UTF-8 compatible
* redundant
* add test for non-utf-8 log
* fix
* fix test
* linting
* address comments
* lint
* [query] improve the memory client

Currently, the memory client buffers the entire output in memory, which is likely to cause OOMs. For reasons that are not entirely clear to me, sometimes these OOMs get muffled by our system and instead lead to non-termination. I vaguely remember this happening before with `using`. I suspect there is something somewhat subtle wrong with that method, but I am not certain.

Anyway, there are four big changes here:

1. Do not buffer the entire request body in memory when writing to memory.
2. Because of (1) we have to pull retry behavior all the way up to the top level, where we know how to recreate the body.
3. Because of (2) it is easier to provide a `write(url)(writerFunction)` style API, which I do here.
4. Again, because of (2), and because I want to preserve the file-object-like interface, I added a somewhat funky anonymous class which uses a second thread to facilitate the movement of data written into the OutputStream returned by `create` into the OutputStream of the HTTP connection.

Point (4) probably bears more explanation. The root issue is the bad Apache HTTP Client interface. Instead of `request` returning an OutputStream, it takes an "entity". An entity knows how to write itself into the OutputStream of an HTTP request. This works fine if the "writer" code is passed as a function (as in my new `write` method), but that does not work if the control flow looks like:

```
f = create(...)
f.write(...)
f.close()
```

We avoid this limited API by initiating the request in a second thread which will eventually block waiting to receive data from a PipedInputStream. That PipedInputStream produces the data written to a PipedOutputStream. The `create` call returns a positioned OutputStream which just writes data into the PipedOutputStream and handles cleaning up the thread when it is closed. In a multi-core system, network requests should proceed in parallel to the client code. In a single-core system, the written data will buffer until `close` is called, which will definitely yield control to the other thread.

* restore the retry
* fix
* fix
* fix override
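The pipe-plus-pump-thread pattern in point (4) can be sketched in a self-contained way. Here the HTTP connection is replaced by an in-memory sink so the example runs standalone; everything else (the `create` name, the `Handle` wrapper) mirrors the description above but is illustrative, not Hail's actual Scala code:

```python
import io
import os
import threading

def create(sink: io.BytesIO):
    """Return a file-like handle; a second thread pumps written bytes
    from the read end of a pipe into `sink` (the stand-in for the HTTP
    connection's output stream)."""
    read_fd, write_fd = os.pipe()

    def pump():
        # Blocks waiting for data, exactly like the thread that performs
        # the HTTP request in the pattern described above.
        with os.fdopen(read_fd, "rb") as reader:
            for chunk in iter(lambda: reader.read(8192), b""):
                sink.write(chunk)

    thread = threading.Thread(target=pump)
    thread.start()
    writer = os.fdopen(write_fd, "wb")

    class Handle:
        def write(self, data: bytes):
            writer.write(data)

        def close(self):
            # Closing the write end signals EOF to the pump thread,
            # then we wait for it to finish draining.
            writer.close()
            thread.join()

    return Handle()
```

As the message notes, on a multi-core machine the pump proceeds in parallel with the caller; on one core the written data buffers in the pipe until `close` yields control.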
Flags now use the same user configuration machinery we use for Batch and QoB. I am not certain this is the right choice. Feedback very welcome.

The configuration_of function lets us uniformly treat any configuration by checking, in order: explicit argument, envvar, config file, or a fallback. I added a bit of code to allow us to support the envvars which do not conform to the new envvar scheme. I also removed a few flags that are no longer used.

I kind of think these flags should actually be under a new section like "query_compiler" or something. @tpoterba, thoughts?
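The lookup order described above can be sketched as follows. The function name matches the comment, but this signature, the dict parameters, and the `HAIL_`-prefixed env-var naming are assumptions for illustration, not Hail's actual API:

```python
def configuration_of(name, explicit, env, config_file, fallback):
    """Resolve a setting by precedence: explicit argument, then
    environment variable, then config file, then fallback."""
    if explicit is not None:
        return explicit
    env_value = env.get(f"HAIL_{name.upper()}")  # assumed naming scheme
    if env_value is not None:
        return env_value
    return config_file.get(name, fallback)
```

Each layer only applies when every higher-priority layer is absent, which is what makes the treatment of any configuration uniform.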
* [hailtop] Keep strong references to tasks on the event loop
* isort
* linting

…fo (hail-is#12713)
* [batch] Refactor resource billing checks with additional debugging info
* delint

* [tests] Use token in remote_tmpdir path
* fix missing curly brace

…and not with a default feature flag (hail-is#12719)
* [qob] Don't use local sort to shuffle...
* fix python
* use empty dict when flags is None

---------

Co-authored-by: Daniel Goldstein <danielgold95@gmail.com>

* [batch] Allow python jobs to have file resources too
* add test

* [qob] Enable liftovers for Query-on-Batch
* reorganize python side of things
* linting
* dedup some stuff on the local and spark backends
* dedup local spark and py4j backend functionality
* linting
* address some comments
* cache orderings
* simplify
* fix

* [batch] Make driver parameters configurable in the UI
* delint
* address comments
* address comment
This is the main merge decision about the memory service that's debatable:

```scala
val (open, write) = if (n <= 50) {
  (fs.openCachedNoCompression _, fs.writeCached _)
} else {
  ((x: String) => fs.openNoCompression(x), fs.writePDOS _)
}
```

Merged to:

```scala
val (open, write) = ((x: String) => fs.openNoCompression(x), fs.writePDOS _)
```

Conflicts:
	batch/batch/driver/instance_collection/pool.py
	batch/batch/driver/main.py
	batch/batch/driver/templates/pool.html
	batch/batch/inst_coll_config.py
	ci/ci/ci.py
	hail/src/main/scala/is/hail/backend/service/ServiceBackend.scala
	hail/src/main/scala/is/hail/backend/service/Worker.scala
That merge looks right if we still want to disable the use of the … For now, I think we should just go ahead with the version we had so far.
lgruen approved these changes on Mar 1, 2023.
Based on https://hail.zulipchat.com/#narrow/stream/300487-Hail-Batch-Dev/topic/resource.20table.20query.20woes/near/338629272, the merge to 9e0081c and future commits (none of which are available now) need to be merged separately.
@lgruen, this is the main merge about the memory service where we could potentially reverse the hack, but I don't know much about this; I just merged to the most relevant branch:

Our version was merged to: