Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding the second VcfParser class: NucleusParser #346

Merged
merged 9 commits into from
Sep 18, 2018

Conversation

samanvp
Copy link
Member

@samanvp samanvp commented Aug 24, 2018

After defining the VCFParser interface and PyVcfParser in a previous PR, here we add
NucleusParser. Following points must be highlighted:

  • This new parser is hidden behind a flag (use_nucleus=False).
  • Since we are not still able to create a Wheel file that have the
    from_string() method, we use a hack to make Nucleus work: write the whole
    file to a local temp file and send it as input to parser.
  • Currently we don't have any unit tests or integration test. We
    manually cheked the sanity of Nucleus output by comparing its output
    table with PyVcf output. This must be done in the next step.

Copy link
Contributor

@arostamianfar arostamianfar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Saman. I added some comments. Please also add unit tests just for the proto->class conversion for now. We can add more once the from_string method is available.

import vcf

from apache_beam.coders import coders
from apache_beam.io.filesystems import FileSystems
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please only import the filesystems module (this is required by the styleguide [1], but not followed by Beam, which is why there is sometimes a discrepancy in our code).
[1] https://github.com/google/styleguide/blob/gh-pages/pyguide.md#22-imports

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

break
return temp_file_name

def _store_to_temp_local_file(self, header_lines):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this for our final design that we will have from_string(), I though I will just leave it here until we drop the current hack and will have our final implementation.

I will remove it now and add later ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have the from_string() I added this back.

# This member will be properly initiated in _init_with_header().
self._vcf_reader = None
# These members will be properly initiated in _extract_header_fields().
self._is_info_repeated = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please create a dict of info and format that point to VcfInfo and VcfFormatInfo instead; that is more future-proof (e.g. if we want to also check the description or type for another purpose, we could just reuse those dicts later on without introducing more specialized dicts). You can keep the is_repeated arg in _convert_list_value as it's a nice abstraction, but you can make a helper function that given a number string would return true/false if the field is repeated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.
I also added two other methods so finding is_repeated would be more clear.


def _init_with_header(self, header_lines):
# This header line is needed by Nucleus.
header_lines = ['##fileformat=VCFv4.2'] + header_lines
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please only add this if the first like is not ##fileformat.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point!


def _init_with_header(self, header_lines):
# This header line is needed by Nucleus.
header_lines = ['##fileformat=VCFv4.2'] + header_lines
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this a constant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the question is, how about the const string ##fileformat which we use in the if condition? It seems a bit too redundant if we define that as another constant?

Let me know what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of this suggestion: define FILE_FORMAT_HEADER_TEMPLATE='##fileformat=VCF{VERSION}'.
You can then do startswith(FILE_FORMAT_HEADER_TEMPLATE.format(VERSION='') ...if it doesn't match, you can add FILE_FORMAT_HEADER_TEMPLATE.format('v4.0'). I think it's ok to hard-code the 'v.4.0' for simplicity.

self._file_name, data_line, str(e))
return MalformedVcfRecord(self._file_name, data_line, str(e))

def _convert_to_variant(self, record):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/record/variant_proto (we used 'record' before b/c that was the PyVCF naming).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

return MalformedVcfRecord(self._file_name, data_line, str(e))

def _convert_to_variant(self, record):
# type: (nucleus_proto.Variant) -> Variant
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type looks wrong as there is no "nucleus_proto" module?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my 'proper' implementation (using the from_string() method), we need:

from nucleus.protos import variants_pb2 as nucleus_proto

to initialize the Nucleus object (their signature has changed from the current version).
Here we don't need to import that module.

info=self._get_variant_info(record),
calls=self._get_variant_calls(record))

def _get_variant_info(self, record):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/record/variant_proto

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

if not sample.genotype:
call.genotype.append(MISSING_GENOTYPE_VALUE)
else:
for v in sample.genotype:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just do call.genotype = sample.genotype (or if it's really not a list, just cast to a list like list(sample.genotype))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to convert it to list otherwise it fails.

@@ -186,7 +186,8 @@ def __init__(self,
compression_type=CompressionTypes.AUTO, # type: str
buffer_size=DEFAULT_VCF_READ_BUFFER_SIZE, # type: int
validate=True, # type: bool
allow_malformed_records=False # type: bool
allow_malformed_records=False, # type: bool
use_nucleus=False # type: bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's more future proof to make an enum for this like VcfParserType and use that here (in case we decide to add a third parser). There is even a 'simpler' way of just using the class string and let python actually construct it (so you dont even need an if statement) but that may be less readable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we need a more complicated approach given that we are pretty sure there is not going to be another parser anytime soon?

If you think still we better use the enum let me know, I will do it in the next round of comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should. It's very reasonable to have other types of parsers in the future (e.g. we're thinking about a bgzip parser; which may end up being a layer on top of Nuclues). In general, it's a good software engineering practice to not use bools for things that can possibly have more than two values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Member Author

@samanvp samanvp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except for a strange pylint message (which I am looking into it) this PR is ready for review.

Also I have started working on unit tests in a separate PR, if you prefer I can add them to this one.

Thanks.

import vcf

from apache_beam.coders import coders
from apache_beam.io.filesystems import FileSystems
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

self._temp_local_file = self._write_to_local_file(file_name)

def _write_to_local_file(self, remote_file_name):
(temp_file, temp_file_name) = tempfile.mkstemp(text=True)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

break
return temp_file_name

def _store_to_temp_local_file(self, header_lines):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this for our final design that we will have from_string(), I though I will just leave it here until we drop the current hack and will have our final implementation.

I will remove it now and add later ...


def _init_with_header(self, header_lines):
# This header line is needed by Nucleus.
header_lines = ['##fileformat=VCFv4.2'] + header_lines
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point!


def _init_with_header(self, header_lines):
# This header line is needed by Nucleus.
header_lines = ['##fileformat=VCFv4.2'] + header_lines
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the question is, how about the const string ##fileformat which we use in the if condition? It seems a bit too redundant if we define that as another constant?

Let me know what do you think?

return None
if len(output_list) == 1:
return output_list[0]
raise ValueError('a not repeated field has more than 1 value')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe, regardless of what the Nucleus does, we need to do the right thing and fail (or at least report this problem) instead of just dropping unexpected values.
If you think failing is not a good idea, we could change this to a warning. Let me know what you think.

# This member will be properly initiated in _init_with_header().
self._vcf_reader = None
# These members will be properly initiated in _extract_header_fields().
self._is_info_repeated = {}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.
I also added two other methods so finding is_repeated would be more clear.

break
return temp_file_name

def _store_to_temp_local_file(self, header_lines):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have the from_string() I added this back.

alternate_bases=variant_proto.alternate_bases,
names=map(str, variant_proto.names),
alternate_bases=list(variant_proto.alternate_bases),
names=variant_proto.names[0].split(';') if variant_proto.names else [],
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should report this as a bug to Nucleus team.

if phaseset_from_format or call_proto.is_phased:
call.phaseset = (phaseset_from_format if phaseset_from_format
else DEFAULT_PHASESET_VALUE)
if call_proto.is_phased:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified this if statement, I think this is more straightforward. Let me know what you think.

Copy link
Contributor

@arostamianfar arostamianfar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Saman. This looks much better! Please also submit unit tests together with this PR as we want to avoid having untested code in our repo at any point; may also be easier to catch/fix strange errors.
P.S. let's sync offline about how to do setup.py properly so that the unit tests actually pass.

import os
import tempfile

from nucleus.io.python import vcf_reader as nucleus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/nucleus/nucleus_vcf_reader so that it doesn't conflict with import nucleus itself (in case someone wants to do that in the future).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

import tempfile

from nucleus.io.python import vcf_reader as nucleus
from nucleus.protos import variants_pb2 as nucleus_proto
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use variants_pb2 here. The one above is fine as you like to distinguish between pyvcf and nucleus parsers, but you should leave this one as is (e.g. it's easier for searching).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@@ -41,7 +46,7 @@
DEFAULT_PHASESET_VALUE = '*' # Default phaseset value if call is phased, but
# no 'PS' is present.
MISSING_GENOTYPE_VALUE = -1 # Genotype to use when '.' is used in GT field.

FIRST_HEADER_LINE = '##fileformat=VCFv4.2\n'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: do all header_lines passed via _init_with_header include a \n?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, originally I did not have \n and I noticed header was missing AC info field. When I further looked at the problem I found out the top two lines are concatenated as one line (and Nucleus actually did not show any warning or error message).

@@ -41,7 +46,7 @@
DEFAULT_PHASESET_VALUE = '*' # Default phaseset value if call is phased, but
# no 'PS' is present.
MISSING_GENOTYPE_VALUE = -1 # Genotype to use when '.' is used in GT field.

FIRST_HEADER_LINE = '##fileformat=VCFv4.2\n'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use 4.0 as the 'least common denominator'. Not sure if it matters for the parser though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

return MalformedVcfRecord(self._file_name, data_line, str(e))

def _convert_to_variant(self, variant_proto):
# type: (nucleus.protos.variants_pb2.Variant) -> Variant
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just variants_pb2.Variant is fine here :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

# This is a temporary solution until from_string will be fixed.
record = next(self._vcf_reader)
return self._convert_to_variant(record)
except (LookupError, ValueError) as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't see LookupError in their code. Looks like they only throw ValueError in their from_string signature.

if record.reference_bases != MISSING_FIELD_VALUE
else None),
alternate_bases=(
map(str, record.alternate_bases) if record.alternate_bases else []),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is strange. We should investigate this further later. OK for this PR.

alternate_bases=(
map(str, record.alternate_bases) if record.alternate_bases else []),
names=map(str, record.names) if record.names else [],
quality=record.quality,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right; it seems like they use -1 instead of null for quality field. We should eventually revisit this and special case on that. OK to leave as TODO for this PR.

return None
if len(output_list) == 1:
return output_list[0]
raise ValueError('a not repeated field has more than 1 value')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only throw an error if allow_malformed_records is true. But let's first see what Nucleus does. OK to just add a TODO for this PR though.


def _get_variant_calls(self, record):
calls = []
for sample in record.calls:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this context, it's better to use consistent names within each class rather than across classes. There are some terminologies that are PyVCF specific that we shouldn't leak into Nucleus ;) (e.g. when/if we remove PyVCF).

@samanvp samanvp force-pushed the nucleus_parser branch 2 times, most recently from 5282426 to b845c24 Compare September 5, 2018 20:51
Copy link
Member Author

@samanvp samanvp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added all modifications that you asked for. I also duplicated two of the unit tests so we have a minimum level of tests for Nucleus. Please let me know if you think this is enough.

import os
import tempfile

from nucleus.io.python import vcf_reader as nucleus
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

import tempfile

from nucleus.io.python import vcf_reader as nucleus
from nucleus.protos import variants_pb2 as nucleus_proto
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@@ -41,7 +46,7 @@
DEFAULT_PHASESET_VALUE = '*' # Default phaseset value if call is phased, but
# no 'PS' is present.
MISSING_GENOTYPE_VALUE = -1 # Genotype to use when '.' is used in GT field.

FIRST_HEADER_LINE = '##fileformat=VCFv4.2\n'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, originally I did not have \n and I noticed header was missing AC info field. When I further looked at the problem I found out the top two lines are concatenated as one line (and Nucleus actually did not show any warning or error message).

@@ -41,7 +46,7 @@
DEFAULT_PHASESET_VALUE = '*' # Default phaseset value if call is phased, but
# no 'PS' is present.
MISSING_GENOTYPE_VALUE = -1 # Genotype to use when '.' is used in GT field.

FIRST_HEADER_LINE = '##fileformat=VCFv4.2\n'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

return MalformedVcfRecord(self._file_name, data_line, str(e))

def _convert_to_variant(self, variant_proto):
# type: (nucleus.protos.variants_pb2.Variant) -> Variant
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

else:
call.genotype = list(call_proto.genotype)

if call_proto.is_phased:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is your reasoning for changing it?

Nucleus comment on is_phased field implies that the value of PS field is only relevant when is_phased == True.
I changed this to the old logic.

@@ -186,7 +186,8 @@ def __init__(self,
compression_type=CompressionTypes.AUTO, # type: str
buffer_size=DEFAULT_VCF_READ_BUFFER_SIZE, # type: int
validate=True, # type: bool
allow_malformed_records=False # type: bool
allow_malformed_records=False, # type: bool
use_nucleus=False # type: bool
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

# This is a temporary solution until from_string will be fixed.
record = next(self._vcf_reader)
return self._convert_to_variant(record)
except (LookupError, ValueError) as e:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, last time I quickly checked this test file and I saw they are checking both of those exception. Now that I checked the test_vcf_from_string_raises_on_bad_input I saw they only check the ValueError for from_string() method.

@samanvp samanvp force-pushed the nucleus_parser branch 3 times, most recently from df52894 to d8d17f3 Compare September 10, 2018 18:42
@coveralls
Copy link

coveralls commented Sep 10, 2018

Pull Request Test Coverage Report for Build 1264

  • 90 of 221 (40.72%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-1.7%) to 88.102%

Changes Missing Coverage Covered Lines Changed/Added Lines %
gcp_variant_transforms/beam_io/vcfio.py 9 12 75.0%
gcp_variant_transforms/beam_io/vcf_parser_test.py 33 36 91.67%
gcp_variant_transforms/beam_io/vcfio_test.py 28 59 47.46%
gcp_variant_transforms/beam_io/vcf_parser.py 20 114 17.54%
Totals Coverage Status
Change from base Build 1247: -1.7%
Covered Lines: 5887
Relevant Lines: 6682

💛 - Coveralls

Copy link
Contributor

@arostamianfar arostamianfar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Saman! Looks great! Just two nits (one is optional).

try:
variant_proto = self._vcf_reader.from_string(data_line)
return self._convert_to_variant(variant_proto)
except (ValueError) as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please remove the surrounding brackets around ValueError

self._representative_header_lines,
buffer_size=self._buffer_size,
skip_header_lines=0)
if self._vcf_parser_type == VcfParserType.PYVCF:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Since the signature of the classes is exactly the same, you can do:

vcf_parser_class = None
if self._vcf_parser_type == VcfParserType.PYVCF:
  vcf_parser_class = vcf_parser.PyVcfParser
elif self._vcf_parser_type == VcfParserType.NUCLEUS:
  vcf_parser_class = vcf_parser.NucleusParser
else:
  raise Error
record_iterator = vcf_parser_class(
    file_name,
    range_tracker,
    ....)

This has the advantage of not duplicating the code for the arguments, but we may need to change it back if at any point the signature of the subclasses is not the same. So, up to you if you want to change it :)

Copy link
Contributor

@arostamianfar arostamianfar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Please double check that integration tests pass prior to submitting.

as the second class that implements the VcfParser interface.

Currently we are trying to solve a few issues related to the nucleus Wheel file.
After defining the VCFParser interface and PyVcfParser in the PR we add
NucleusParser. Following points must be considered:
 * This new parser is hidden behind a flag (use_nucleus=False).
 * Since we are not still able to create a Wheel file that have the
 from_string() method, we use a hack to make Nucleus work: write the whole
 file to a local temp file and send it as input to parser.
 * Currently we don't have any unit tests or integration test. We
 manually cheked the sanity of Nucleus output by comparing its output
 table with PyVcf output. This must be done in the next step.
After we managed to build a Wheel file based on the latest code base.
We also make a few minor changes to make NucleusParser work with Dataflow.
and also move unit tests of Variant and VariantCall classes to vcf_parser_test.py
In order to use NucleusParser with a DataflowRuner the following flag is
still needed:
--extra_package /dirctory/containing/Nucleus-0.1.0-py2-none-any.whl
@samanvp samanvp merged commit befbca5 into googlegenomics:master Sep 18, 2018
@samanvp samanvp deleted the nucleus_parser branch September 18, 2018 17:45
@samanvp samanvp restored the nucleus_parser branch September 19, 2018 18:22
samanvp added a commit to samanvp/gcp-variant-transforms that referenced this pull request Nov 22, 2018
In PR googlegenomics#346 we added a temporary solution to install Nucleus using its
wheel file. Now that Nucleus has pip install we can remove that hack.
samanvp added a commit to samanvp/gcp-variant-transforms that referenced this pull request Nov 22, 2018
In PR googlegenomics#346 we added a temporary solution to install Nucleus using its
wheel file. Now that Nucleus has pip install we can remove that hack.
samanvp added a commit to samanvp/gcp-variant-transforms that referenced this pull request Nov 22, 2018
In PR googlegenomics#346 we added a temporary solution to install Nucleus using its
wheel file. Nucleus now has pip install so we can remove that hack.
samanvp added a commit to samanvp/gcp-variant-transforms that referenced this pull request Nov 23, 2018
In PR googlegenomics#346 we added a temporary solution to install Nucleus using its
wheel file. Nucleus now has pip install so we can remove that hack.
samanvp added a commit to samanvp/gcp-variant-transforms that referenced this pull request Dec 4, 2018
In PR googlegenomics#346 we added a temporary solution to install Nucleus using its
wheel file. Nucleus now has pip install so we can remove that hack.
samanvp added a commit that referenced this pull request Dec 7, 2018
* Adding google-nucleus to the REQUIRED_PACKAGES list

In PR #346 we added a temporary solution to install Nucleus using its
wheel file. Nucleus now has pip install so we can remove that hack.

* Review comments
@samanvp samanvp deleted the nucleus_parser branch September 13, 2019 17:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants