-
Notifications
You must be signed in to change notification settings - Fork 311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing the 'index 0' bug in features2adam #308
Fixing the 'index 0' bug in features2adam #308
Conversation
This fixes Issue #307 |
Hmmm, no dice on the tests...? |
Weird, I ... don't understand what's happening. Running it again locally. It was all working fine 10 minutes ago :-( |
Gremlins. |
@fnothaft can you do me a favor when you get a chance? Grab the branch and try building it yourself? It's still working for me, I don't quite understand what the problem is :-/ |
@tdanford sure! Will do now... |
@tdanford ran OK for me as well... Jenkins, retest this please. |
Jenkins, test this please. |
I'm at a loss. |
And this is why we can't have nice things. |
Jenkins, retest this please. |
The test failure seems to be mis-sorted values. Not sure why the server is reading them in a different order, but we should probably not use the Parquet printer for this, and instead sort them for stability of the test. |
@carlyeks @fnothaft What release do we want this PR to go into, 0.12.1 or 0.13? If 0.12.1, can one of you help me understand why this latest rewrite of the Features2ADAMSuite is still failing? I'm using the ParquetLister class, but I'm getting a Thrift error (!) when I try to read the Parquet file header. |
This PR will fix Issue #322 as well. |
Doing a Features2ADAMSuite:
2014-07-23 08:28:25 WARN Utils:70 - Your hostname, wm1a5-fef resolves to a loopback address: 127.0.0.1; using 192.168.1.112 instead (on interface en0)
2014-07-23 08:28:25 WARN Utils:70 - Set SPARK_LOCAL_IP if you need to bind to another address
2014-07-23 08:28:27.168 java[73274:1703] Unable to load realm info from SCDynamicStore
2014-07-23 08:28:27 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
- can convert a simple BED file *** FAILED ***
java.io.IOException: can not read class parquet.format.PageHeader: null
at parquet.format.Util.read(Util.java:50)
at parquet.format.Util.readPageHeader(Util.java:26)
at org.bdgenomics.adam.rdd.ParquetColumnChunk.readPageHeader(ParquetCommon.scala:190)
at org.bdgenomics.adam.rdd.ParquetColumnChunk.readAllPages(ParquetCommon.scala:246)
at org.bdgenomics.adam.parquet_reimpl.ParquetPartition$$anonfun$2.apply(ParquetPartition.scala:111)
at org.bdgenomics.adam.parquet_reimpl.ParquetPartition$$anonfun$2.apply(ParquetPartition.scala:111)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
...
Cause: parquet.org.apache.thrift.transport.TTransportException:
at parquet.org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at parquet.org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at parquet.org.apache.thrift.protocol.TCompactProtocol.readBinary(TCompactProtocol.java:641)
at parquet.org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:102)
at parquet.org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:112)
at parquet.org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:138)
at parquet.org.apache.thrift.protocol.TProtocolUtil.skip(TProtocolUtil.java:60)
at parquet.format.PageHeader.read(PageHeader.java:851)
at parquet.format.Util.read(Util.java:47)
at parquet.format.Util.readPageHeader(Util.java:26)
... |
For some reason, we'd neglected to include an ADAMFeatureField class in the adam.projections package! This commit fixes that.
@fnothaft Carl and I talked, and while neither of us can get this test (Features2ADAMSuite) to work locally or on Jenkins, it's clear that the bug in Features2ADAM itself has been fixed by these commits. I've marked the test as "ignore" for now, and I've opened a ticket (Issue #323) to fix it. As soon as Jenkins successfully builds this, I think it's ready to be reviewed-and-merged. |
@tdanford that sounds good here, I will wait for Jenkins to build it and willl then review. |
Looks like this is still failing... |
Jenkins isn't building the right revisions (again). The general build 118 shows the right revision, but the test matrix isn't building that one... |
Jenkins, retest this please. |
Features2ADAM fails because it doesn't include an 'index' field in its second command-line argument, and therefore Args4j can't parse the Features2ADAMArgs object off a command line. This commit adds the missing 'index' argument to the Features2ADAMArgs class. There is a test for this additional feature -- however, that test is marked as IGNORE because we can't get it to work either locally or on Jenkins. This should be addressed, and I'll open up a ticket for it. In the meantime, however, we've verified that manually that this patch works as advertised. Also modified PrintADAM to use the scala.Console.out rather than System.out. This seems to allow us to (in tests) redirect the output stream using the Console.withOut command.
Formatting issue -- just pushed up an update. |
Jenkins, retest this please. |
Fixing the 'index 0' bug in features2adam
Fixes the bug where the error message "Argument with index 0 is used more than once" appears and prevents us from running features2adam from the command line.
Also, added a test for Features2ADAM, and updated PrintADAM (in support of the test) to use scala.Console.out rather than System.out, which appears to play better with the scala.Console.withOut method used in tests.