Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Machine-readable output of tests #1284

Closed
wants to merge 2 commits into from
Closed

Machine-readable output of tests #1284

wants to merge 2 commits into from

Conversation

hauleth
Copy link

@hauleth hauleth commented Sep 17, 2015


### Structure

Output is stream of lines containing JSON objects separated by new line. Each
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make sense to agree on something like http://dataprotocols.org/ndjson/ or http://jsonlines.org/ for that or is that up to TAP-J to define?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could also make usage of \1E ASCII code which refers to "Record Separator". This is still under consideration, but 1 record per line is simple enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd take '\n' or '\r\n', it's a pretty common format in the NoSQL-world (e.g. Elasticsearch bulk uses it) and JSON can be written newline-free very well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But is inconsistent between platform as Windows still uses \r\n as a line separator and I believe that println! macro respect that. Although this needs reconsideration as I chosen new line as it was simple enough to implement in first draft of specs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That inconsistency doesn't matter in that case as '{"test": "foo"}\r' is the same JSON value as '{"test": "foo"}'. Even if your IO isn't prepared for that, this is not an issue.

@killercup
Copy link
Member

I really like this proposal. Thanks for writing this!

@alexcrichton alexcrichton added the T-dev-tools Relevant to the development tools team, which will review and decide on the RFC. label Sep 17, 2015
@alexcrichton
Copy link
Member

cc @rust-lang/tools, definitely our territory!

Thanks for the RFC @hauleth! Some thoughts of mine:

  • Perhaps there could be a record for when a test starts, as well as when it finishes? For interactive displays it could perhaps be nice to see what tests are in progress.
  • We probably want to start off pretty conservative and not have many fields required by default unless necessary. For example could we drop build and rustc from the suite record?
  • For a test record, the duration field could be dropped if we had start/stop records.
  • Could this spec how the libtest library will be changed? E.g. it's good that this is preserving backwards compatibility, but we'll probably want a new flag to test binaries which generates this form of output.
  • Does this output also affect rustdoc --test? I think it probably should and we'd just want to forward flags to the "test binary" basically.

I personally always find it a good exercise to implement features like this ahead of time to get a good feeling about what's needed to implement the current functionality we have today. Along those lines it'd certainly help weed out what's needed for maintaining the same output in terms of benchmarks and tests, but it may also be a pretty significant chunk of work!

@hauleth
Copy link
Author

hauleth commented Sep 17, 2015

About first point. I assumed that suite object will be the beginning of tests so additional object would be unneeded.

About second. It is completely possible to do so. I've just added that to be sure that we are running valid binary.

About third. Of course it could be solution, but I don't see reason behind providing timestamp of beginning and ending of test, duration will do as well (maybe it's only me, but I doesn't care when my test started, I want to know how long it has been running).

About last two it would be recommended to do so. It could simplify calls.

About implementing that: I had that in mind, but currently I hadn't enough time to dig in rustc to find where and what so I what I loved in Ruby and power it up to work nice with Rust and provide functionality I need in my work. So it is very opinionated.

@jgraham
Copy link

jgraham commented Sep 17, 2015

For comparison we designed a similar format for logging browser test results at Mozilla. You probably don't need everything in that format, and may want some different things, but it's another point in the possible design space to consider. One of the driving considerations there was when you have tests from a third party source (e.g. because you are implementing some specification) there can be tests that you expect to fail but which you don't wish to edit. If that isn't a case you care about here you are likely to make differnt design choices.

@emberian
Copy link
Member

In general I'm a big fan of anything TAP based, and this JSON encoding removes most of problems with TAP. Human-readable output should definitely be the default, though. Perhaps an environment variable such as RUST_MACHINE_OUTPUT or a universally understood command-line to enable machine output.

These tools are primarily for people, not machines. Changing Cargo to consume the machine-readable format is not an issue.

Looking forward to seeing something like this develop!

@diwic
Copy link

diwic commented Sep 18, 2015

cargo test should continue to have human readable format, and machine output could be available with cargo test --json or so.

@erickt
Copy link

erickt commented Sep 19, 2015

This is a great start! I think it could also be useful to capture the rustc command line arguments so that we can observe optimization levels, enabled feature flags, and etc.

For reference, a bunch of other test libraries produce the xUnit XML File Format.

@tomjakubowski
Copy link
Contributor

In general I'm a big fan of anything TAP based, and this JSON encoding removes most of problems with TAP. Human-readable output should definitely be the default, though. Perhaps an environment variable such as RUST_MACHINE_OUTPUT or a universally understood command-line to enable machine output.

I don't see why it should be an environment variable; environment variables are much harder to reason about than command line flags (it's a bit like dynamic scope vs. lexical scope), and it's not like we need to adjust some behavior deep within a stack of executing programs.

A flag on the compiled test runner (--format=tap, bikeshedding welcome) combined with a flag on cargo test itself seems like the right fit to me.

@alexcrichton
Copy link
Member

@hauleth

About first point. I assumed that suite object will be the beginning of tests so additional object would be unneeded.

Ah yeah I meant in addition to the suite object there'd also be an object for "this test has started to run". That gives consumers a notion of what tests are currently running, e.g. those you've seen start records for and haven't seen end records for. Additionally if a consumer of the output wants to provide a progress bar this would be useful information perhaps.

About third. Of course it could be solution, but I don't see reason behind providing timestamp of beginning and ending of test, duration will do as well (maybe it's only me, but I doesn't care when my test started, I want to know how long it has been running).

Ah just in the sense that we don't have a lot of timestamp support in the standard library just yet, so it'd be difficult to implement this in-tree (e.g. whereas it'd be pretty easy to do it out-of-tree), so for ease of implementation we may want to avoid this for now.

About implementing that: I had that in mind, but currently I hadn't enough time to dig in rustc to find where and what so I what I loved in Ruby and power it up to work nice with Rust and provide functionality I need in my work. So it is very opinionated.

Ah yeah no worries! You're not on the hook to implement this or anything like that, just some musings from me!

@withoutboats withoutboats mentioned this pull request Sep 26, 2015
@yoshuawuyts
Copy link
Member

What are the limitations on TAP that warrant the invention of another format? To my understanding this format would be TAP-J based, meaning it's something new.

Though being slightly inconvenient to parse, I think that using TAP yields more benefits in terms of tooling, adoption, familiarity and interoperability than creating something new.

It's quite annoying if every language comes with their own custom way of formatting test output. E.g. I don't think Rust should fall into the same trap Golang fell into with their test output:

$ go test -v
=== RUN TestPrintSomething
Say hi
--- PASS: TestPrintSomething (0.00 seconds)
    v_test.go:10: Say bye
PASS
ok      so/v    0.002s

@hauleth
Copy link
Author

hauleth commented Sep 29, 2015

@yoshuawuyts the only problem with TAP is that this is primitive format, that doesn't provide way to message a lot of things (like type of test, performance, etc.) in uniform way. RusTAP is created to fit into Rust testing framework as it has some quirks that original TAP-J doesn't cover (i.e. benches).

@IanConnolly
Copy link

I'd be happy to help with moving this along, as I've been wanting this myself recently.

@brson
Copy link
Contributor

brson commented Nov 4, 2015

My quick comments:

  • Never heard of TAP-J. Need to consider it.
  • Doesn't consider cargo integration
    • people do testing through cargo
    • cargo has other types of tests and I want to be to analyze their results
  • Printing to stdout has problems
    • cargo also prints to stdout
    • test cases can print to stdout when the test runner is not capturing

@hauleth
Copy link
Author

hauleth commented Jan 8, 2016

Closing as I want to rewrite it for TAP13 protocol (wider usage, and already existing tools).

@hauleth hauleth closed this Jan 8, 2016
@tj
Copy link

tj commented Jan 9, 2016

IMO the problem with Go's is that you can't replace the output generation. I know the Go team has an issue open for considering JSON output, but to me It would be so much nicer if you could just:

import (
  _ "my/fancy/test/output" // register a replacement hook
)

Then you don't have to fight about JSON, TAP, etc, just use whatever you like. Maybe Rust could go that route instead of a base format?

@andrew-d
Copy link

andrew-d commented Jan 9, 2016

👍 for that - being able to write a "test output plugin" (or whatever it ends up being called) seems like the best way to handle this.

@jtepe
Copy link

jtepe commented Jan 9, 2016

Then there still has to be a sensible default or fallback for those that do not provide such a plugin.

@jtepe
Copy link

jtepe commented Jan 9, 2016

However, I am also in favor of this plugin approach.

@hauleth
Copy link
Author

hauleth commented Jan 9, 2016

I am in the progress of rewriting libtest to allow writing reporters in sensible way. About TAP - TAP version 13 allow YAML test description after each test which seems IMHO good approach. As YAML is superset of JSON it solves both problems.

Łukasz Jan Niemier

Dnia 9 sty 2016 o godz. 09:55 Jonas Tepe notifications@github.com napisał(a):

However, I am also in favor of this plugin approach.


Reply to this email directly or view it on GitHub.

@kamalmarhubi
Copy link
Contributor

@hauleth

Closing as I want to rewrite it for TAP13 protocol (wider usage, and already existing tools).

Do you need any help with this? I'd very much like testing to get better in Rust!

@milgner
Copy link

milgner commented Oct 9, 2016

Is this still being worked on? I think Rust would greatly benefit from this as it makes integration of tests into CI systems much more elegant. TAP 13 seems like a good format, too, even if it doesn't explicitly contain a differentiation between regular tests and benchmarks. But I guess a # benchmark directive could be added to address this?

@dnsco
Copy link

dnsco commented Sep 3, 2017

Can you reopen this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-dev-tools Relevant to the development tools team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.