Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding google-nucleus to the REQUIRED_PACKAGES list #416

Merged
merged 2 commits into from
Dec 7, 2018

Conversation

samanvp
Copy link
Member

@samanvp samanvp commented Nov 22, 2018

In PR #346 we added a temporary solution to install Nucleus using its
wheel file. Nucleus now has pip install so we can remove that hack.

Also when running integration tests I was prompted to replace gcloud container builds submit with gcloud builds submit. My first attempt to run the integration tests after that change failed due to build timeout, so I added the --timeout "30" flag.

@coveralls
Copy link

coveralls commented Nov 22, 2018

Pull Request Test Coverage Report for Build 1468

  • 1 of 1 (100.0%) changed or added relevant line in 1 file are covered.
  • 4 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+1.4%) to 89.013%

Files with Coverage Reduction New Missed Lines %
gcp_variant_transforms/beam_io/vcfio_test.py 2 98.69%
gcp_variant_transforms/beam_io/vcf_parser.py 2 85.95%
Totals Coverage Status
Change from base Build 1467: 1.4%
Covered Lines: 6546
Relevant Lines: 7354

💛 - Coveralls

Copy link
Contributor

@arostamianfar arostamianfar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Thanks, Saman! Finally we don't need that hack anymore :)

setup.py Outdated
@@ -28,6 +26,7 @@
'google-api-python-client>=1.6',
'intervaltree>=2.1.0,<2.2.0',
'pyvcf<0.7.0',
'google-nucleus>=0.2.0',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, let's actually pin this to 0.2.0 as they may release a new version, which may break our unit tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

vcf_line = ('20 1234 rs123;rs2 C A,T 50 '
# rs123;rs2 -> rs123 it seems nuclues does not parse IDs correctly.
# quality=50 -> 50.0 nucleus converts quality values to float.
# TODO(samanvp): convert all quality values to float.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@allieychen FYI
I actually noticed that we have the same problem for the BQ -> VCF pipeline. The VCF quality values are typically integers, but the spec has float. So, while technically correct, our output VCF files have #.0 for the quality field.
I think a better solution is to add some logic to our parser to output these as ints if the value is actually an integer (an easy way to do this is: return quality if int(quality) != quality else int(quality) (with some special casing when quality is None). Let's do this in another PR though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! Thanks!

@@ -207,8 +207,9 @@ if [[ -z "${skip_build}" ]]; then
# TODO(bashir2): This will pick and include all directories in the image,
# including local build and library dirs that do not need to be included.
# Update this to include only the required files/directories.
gcloud container builds submit --config "${build_file}" \
gcloud builds submit --config "${build_file}" \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to clarify: is this the new way of using cloud build?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, when I run the deploy_and_run_tests.sh I got the following error message:

ERROR: (gcloud.container.builds.submit) This command has been replaced with `gcloud builds`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for fixing this! Could you please update the comment in cloudbuild.yaml as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

deploy_and_run_tests.sh Show resolved Hide resolved
Copy link
Member Author

@samanvp samanvp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integration tests are failing without a meaningful error message:

  File "/tmp/tmp.XPnhTCIT3g/local/lib/python2.7/site-packages/gcp_variant_transforms/testing/integration/run_tests_common.py", line 134, in _handle_failure
    response['error']['message']))
gcp_variant_transforms.testing.integration.run_tests_common.TestCaseFailure: Test case platinum-na12877-hg38-10k-lines (Operation projects/gcp-variant-transforms-test/operations/14235103094015335786) failed: Execution failed: action 1: unexpected exit status 1 was not ignored

It seems this is independent of the platinum-na12877-hg38-10k-lines test case, first another test was failing when I removed its yaml file this test failed with a similar error message.

I am still investigating...

@@ -207,8 +207,9 @@ if [[ -z "${skip_build}" ]]; then
# TODO(bashir2): This will pick and include all directories in the image,
# including local build and library dirs that do not need to be included.
# Update this to include only the required files/directories.
gcloud container builds submit --config "${build_file}" \
gcloud builds submit --config "${build_file}" \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, when I run the deploy_and_run_tests.sh I got the following error message:

ERROR: (gcloud.container.builds.submit) This command has been replaced with `gcloud builds`.

deploy_and_run_tests.sh Show resolved Hide resolved
setup.py Outdated
@@ -28,6 +26,7 @@
'google-api-python-client>=1.6',
'intervaltree>=2.1.0,<2.2.0',
'pyvcf<0.7.0',
'google-nucleus>=0.2.0',
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Contributor

@arostamianfar arostamianfar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just one nit about the version...
RE integration test failure: it looks like a temporary failure and may go away if you retry. Also, to debug, you need to run gcloud alpha genomics operations describe projects/gcp-variant-transforms-test/operations/14235103094015335786 and look at the logs.

@@ -207,8 +207,9 @@ if [[ -z "${skip_build}" ]]; then
# TODO(bashir2): This will pick and include all directories in the image,
# including local build and library dirs that do not need to be included.
# Update this to include only the required files/directories.
gcloud container builds submit --config "${build_file}" \
gcloud builds submit --config "${build_file}" \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for fixing this! Could you please update the comment in cloudbuild.yaml as well?

setup.py Outdated
@@ -28,6 +26,7 @@
'google-api-python-client>=1.6',
'intervaltree>=2.1.0,<2.2.0',
'pyvcf<0.7.0',
'google-nucleus>=0.2.0,<0.2.1',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does google-nuclues==0.2.0 not work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It actually does!
Since there was no other instance of == I assumed it does not work.

Copy link
Member Author

@samanvp samanvp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suggested by Thomas, adding 'protobuf>=3.6.1' to the list of our requirements solved the problem of failing integration tests. I am not sure whether it's better to pin to version 3.6.1 or we should rely on the latest version?

Please take another look before I submit this PR.

@@ -207,8 +207,9 @@ if [[ -z "${skip_build}" ]]; then
# TODO(bashir2): This will pick and include all directories in the image,
# including local build and library dirs that do not need to be included.
# Update this to include only the required files/directories.
gcloud container builds submit --config "${build_file}" \
gcloud builds submit --config "${build_file}" \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

setup.py Outdated
@@ -28,6 +26,7 @@
'google-api-python-client>=1.6',
'intervaltree>=2.1.0,<2.2.0',
'pyvcf<0.7.0',
'google-nucleus>=0.2.0,<0.2.1',
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It actually does!
Since there was no other instance of == I assumed it does not work.

In PR googlegenomics#346 we added a temporary solution to install Nucleus using its
wheel file. Nucleus now has pip install so we can remove that hack.
Copy link
Contributor

@allieychen allieychen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Saman, LGTM!

@samanvp samanvp merged commit 60f36ef into googlegenomics:master Dec 7, 2018
@samanvp samanvp deleted the nucleus branch September 13, 2019 17:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants