Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make VariantContextWriterBuilder warn when indexing-on-the-fly is enabled for stream output #1328

Merged
merged 5 commits into from
Jun 17, 2019
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -27,10 +27,7 @@

import htsjdk.samtools.Defaults;
import htsjdk.samtools.SAMSequenceDictionary;
import htsjdk.samtools.util.BlockCompressedOutputStream;
import htsjdk.samtools.util.IOUtil;
import htsjdk.samtools.util.Md5CalculatingOutputStream;
import htsjdk.samtools.util.RuntimeIOException;
import htsjdk.samtools.util.*;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's better to avoid wildcard imports as it makes it harder to see what the code is doing when you're not using an IDE. You can change the IDE settings so it doesn't create wildcard imports.

import htsjdk.tribble.index.IndexCreator;
import htsjdk.tribble.index.tabix.TabixFormat;
import htsjdk.tribble.index.tabix.TabixIndexCreator;
Expand Down Expand Up @@ -107,6 +104,7 @@ public class VariantContextWriterBuilder {
public static final EnumSet<Options> DEFAULT_OPTIONS = EnumSet.of(Options.INDEX_ON_THE_FLY);
public static final EnumSet<Options> NO_OPTIONS = EnumSet.noneOf(Options.class);
private static final OpenOption[] EMPTY_OPEN_OPTION_ARRAY = new OpenOption[0];
private final static Log log = Log.getInstance(VariantContextWriter.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incorrect modifier order

Suggested change
private final static Log log = Log.getInstance(VariantContextWriter.class);
private static final Log log = Log.getInstance(VariantContextWriter.class);


public enum OutputType {
UNSPECIFIED,
Expand Down Expand Up @@ -188,12 +186,10 @@ public VariantContextWriterBuilder setOutputFile(final String outFile) {
/**
* Set the output file type for the next <code>VariantContextWriter</code> created by this builder.
*
* @param outType the type of file the <code>VariantContextWriter</code> will write to
* @param outType the type of file or stream the <code>VariantContextWriter</code> will write to
* @return this <code>VariantContextWriterBuilder</code>
*/
public VariantContextWriterBuilder setOutputFileType(final OutputType outType) {
if (!FILE_TYPES.contains(outType))
throw new IllegalArgumentException("Must choose a file type, not other output types.");

if (this.outPath == null || this.outStream != null)
throw new IllegalArgumentException("Cannot set a file type if the output is not to a file.");
Expand Down Expand Up @@ -476,14 +472,18 @@ else if (STREAM_TYPES.contains(this.outType))
writer = createBCFWriter(outPath, outStreamFromFile);
break;
case VCF_STREAM:
if (options.contains(Options.INDEX_ON_THE_FLY))
throw new IllegalArgumentException("VCF index creation not supported for stream output.");
if (options.contains(Options.INDEX_ON_THE_FLY)) {
pshapiro4broad marked this conversation as resolved.
Show resolved Hide resolved
log.warn("VCF index creation not supported for stream output, index will not be created");
options.remove(Options.INDEX_ON_THE_FLY);
}

writer = createVCFWriter(null, outStreamFromFile);
break;
case BCF_STREAM:
if (options.contains(Options.INDEX_ON_THE_FLY))
throw new IllegalArgumentException("BCF index creation not supported for stream output.");
if (options.contains(Options.INDEX_ON_THE_FLY)) {
log.warn("BCF index creation not supported for stream output, index will not be created");
options.remove(Options.INDEX_ON_THE_FLY);
}

writer = createBCFWriter(null, outStream);
break;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -347,6 +347,25 @@ public void testIndexingOnTheFly() {
}
}

@Test
public void testIndexingOnTheFlyForPathStream() throws IOException {
final VariantContextWriterBuilder builder = new VariantContextWriterBuilder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent should be 4 spaces, not 2. I realize the rest of the file is using 2, but IMO we should move to a common convention for all files, starting with new/updated code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can reindent the file pretty easily if you would like? It seems orthogonal to this branch however.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just fix the lines that you're adding/modifying.

.setReferenceDictionary(dictionary)
.setOption(Options.INDEX_ON_THE_FLY);

try (FileSystem fs = Jimfs.newFileSystem("test", Configuration.unix())) {
final Path vcfPath = IOUtil.getPath("/dev/null");
final Path vcfIdxPath = Tribble.indexPath(vcfPath);
// We explicitly setOutputFileType here to mimic gatk behavior where it pretends files that have no extension
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete sentence, or extra that?

try (final VariantContextWriter writer = builder.setOutputPath(vcfPath).setOutputFileType(OutputType.VCF_STREAM).build()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this also work if you do not set the output type to a stream?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it won't. In that case it will end up in the same place where we are as of right now, where an output stream location will end up "UNSPECIFIED" and throw an exception unless the user explicitly defines what type of output it is.

//deliberately empty
}

Assert.assertFalse(Files.exists(vcfIdxPath),
String.format("VCF index should not have been created %s / %s", vcfPath, vcfIdxPath));
}
}

@Test
public void testIndexingOnTheFlyForPath() throws IOException {
final VariantContextWriterBuilder builder = new VariantContextWriterBuilder()
Expand Down