Add support for external frontends, take 2 #1600

EliRibble · 2019-07-12T04:05:55Z

This is based on #1210 by collincross.

There are many requests for customizations to Ninja's build output cataloged in #746. This patch set attempts to address the underlying issue by spawning a separate frontend process and feeding serialized build status updates to it. The frontend can be extended without complicating Ninja's core logic, or can be completely swapped out for a custom frontend like implemented in #1020. The output can also be serialized to disk for later parsing, which will work well for buildbots.

The interface between Ninja and the frontend is documented in FRONTEND.md in 7th patch in the series, "Add frontend option for build status output"

This is an attempt to get external frontends landed in a way that is in harmony with ninja's policy of being conceptually simple. This means that the ninja binary will not depend on any implementation of protobufs. This PR will adds a new binary that has the ability to take a '--frontend' argument that passes protobuf messages and allows arbitrary logic for showing progress and messages to the user.

Add a tiny protobuf runtime in proto.cc and proto.h, and a python script that takes a binary proto descriptor (produced by protoc --descriptor_set_out) and produces a .pb.h file that implements a subset of the protobuf C++ Message interface. Original code from colincross.

EliRibble · 2019-07-12T04:13:35Z

This PR is going to need some more work before it has fully achieved its purpose. From the original PR, #1210, there was the following post from nico:

I thought a bit about this. As said on the mail thread, I'm not a fan of putting protobuf in ninja itself (even if it's a reimplementation). At the same time, I agree that lots of people want to tweak ninja's output in many ways, and this is a good approach to attack that problem.

What do you think about adding a second binary, ninja_protoout or ninja_buildcore or what have you, somewhere below contrib/, make that link to the ninja lib, put the proto code there, and make the ninja lib have an interface for selecting the output behavior (and core ninja would provide a class that works like the current output and the new binary would instead use the proto behavior)? That way, ninja itself stays the way it is, and people who want very custom output can use the new binary for that?

I have not yet done that work. Before I do I'd like to actually understand the requirements. This is what I believe needs to be done:

The ninja binary cannot have compiled into it any implementation of protobuf communication. For this PR, that means proto.h|proto.c.
The current code in this PR creates a StatusSerializer or a StatusPrinter in real_main based on the presence of a --frontend parameter. Instead we'll need do work to break up real_main into code that can be shared and create two different source files, one that will be used by ninja and will always provide a StatusPrinter and another that will be used by ninja_buildcore and can provide StatusPrinter or StatusSerializer.
There may be packaging work to add ninja_buildcore to releases.

Open questions:

Do I need to create a contrib directory in the root of the repository and put all protobuf-related source into it? If not, what does "somewhere below contrib/" mean?

Is there a preference in sharing the logic in ninja.cc between keeping most of the shared code within ninja.cc or moving shared code to a new source file?

Make BuildStatus an abstract interface, and move the current implementation to StatusPrinter, to make way for a serialized Status output. Original code by colincross.

Edges are nominally ordered by order in the build manifest, but in fact are ordered by memory address. In most cases the memory address will be monontonically increasing. Since serialized build output will need unique IDs, add a monotonically increasing ID to edges, and use that for sorting instead of memory address. Original code by colincross.

Send all output after manifest parsing is finished to the Status interface, so that when status frontends are added they can handle build messages. Original code from collincross.

The times that end up in the build log currently originate in the status printer, and are propagated back out to the Builder. Move the edge times into the Builder instead, and move the overall start time into NinjaMain so it doesn't get reset during manifest rebuilds. Original source from colincross.

Store the number of running edges instead of trying to compute it from the started and finshed edge counts, which may be different for starting and ending status messages. Allows removing the status parameter to PrintStatus and the EdgeStatus enum. Original code by colincross.

Add a --frontend option that takes a command that will handle printing build status. The command will be executed with the read of a pipe on fd 3, and each build event will cause a serialized message to be sent on the pipe. The frontend/ directory includes a description of the interface between Ninja and a frontend, and an implementation of the interface in python that mimics the native interface in ninja, which can be run with: ninja --frontend=frontend/native.py Original code by colincross.

jhasse · 2019-07-17T12:52:55Z

Thanks! I've also done a rebase of this some some time ago: https://github.com/jhasse/ninja/tree/serialize Never managed to create a PR from it though. What else did you change beside rebasing?

From the original PR, #1210, there was the following post from nico:

In the post you've quoted nico is asking questions. I don't think that the final word about this has been spoking, so I wouldn't start on any work yet.

EliRibble · 2019-07-22T21:49:43Z

I haven't changed anything yet aside from rebasing. That's because I agree, the final word has not been spoken on how this should be done.

jhasse · 2019-08-04T08:51:49Z

src/status.h

+  virtual void BuildLoadDyndeps() = 0;
+  virtual void BuildStarted() = 0;
+  virtual void BuildFinished() = 0;
+


No need for protected, deleting Status* without a virtual dtor will result in a warning.

but even compilers that do may not be able to issue a diagnostic for this in every situation due to AST transformations losing the context of the delete operation.

Can you give an example?

some times,

How do you know that?

See GCC's or Clang's compiler manual.

I never said that I'm "against" it, just that there is no need.

src/status.h

+  virtual void BuildStarted();
+  virtual void BuildFinished();
+
+  virtual ~StatusPrinter() { }


src/status.h

+
+#include <map>
+#include <string>
+using namespace std;


jhasse · 2019-08-04T08:52:39Z

src/ninja.cc

@@ -1299,6 +1298,8 @@ NORETURN void real_main(int argc, char** argv) {
    exit((ninja.*options.tool->func)(&options, argc, argv));
  }

+  Status* status = new StatusPrinter(config);


Better to avoid auto_ptr. I would keep the raw pointer for now.

jhasse · 2019-08-04T08:53:41Z

src/status.cc

@@ -250,3 +251,24 @@ void StatusPrinter::PrintStatus(Edge* edge, EdgeStatus status) {
                 force_full_command ? LinePrinter::FULL : LinePrinter::ELIDE);
 }

+void StatusPrinter::Warning(const char* msg, ...) {


#ifdef #else with two code paths isn't a good idea, because only one of them will be tested by most people.

???
I did:

because only one of them will be tested by most people.

jhasse · 2019-08-04T09:16:11Z

Let's not discuss Protobuf vs JSON here please.

The library approach is something to consider, requires more work though (might be worth it).

jhasse · 2019-08-04T09:47:03Z

If the user wanted to have the output formatted differently, then the JSON can be piped to another program, in the conventional way program output is handled on posix like systems.

That change doesn't require JSON, it could also work with protobuf.

The reason for using fd 3 is that jobs in the console pool can then still directly use fds 0, 1 and 2.

jhasse · 2019-08-04T10:10:48Z

ninja --serialized-output | protoc --decode_raw | your_plain_text_tool

And now please stop with the opinion pieces about protobuf. I think everyone is aware of the advantages and disadvantages, if not see https://www.google.com/search?q=protobuf+vs+json

While I don't know the syntax for it, i'm fairly sure it's possible to pipe-operator alternative file descriptors, isn't it?

As far as I know it isn't possible.

Sarcasm · 2019-08-04T15:55:42Z

I'm really sorry to talk about this too, but as a user I would find JSON more accessible.
For example, if I want to make a Ninja UI inside Emacs, using JSON I would just have to write the UI code.
Using protobuf would just be painful, bigger barrier to entry for what reasons?

I don't think googling "json vs protobuf" is of any help, every tools has pros and cons.
ninja is quite simple to use, its lack of external dependencies is quite convenient for building and distribution.
Adding protobuf for json reporting while a "json line" protocol could do the same job without the protobuf dependency seems weird to me.

I feel like it would be better to explain what are the protobuf benefits over json line for ninja users.
IMHO, you could put "convenience" on the side of JSON.

jhasse · 2019-08-07T17:01:06Z

@jonesmz

Sorry, I definitely should have phrased that differently in a not-so-dismissive way.

The reason why I asked to stop the anti-protobuf arguments was NOT because I don't value them. It's because I already know them. The same goes with the other points you've mentioned. I'm well aware and I never said that I don't agree with you - this PR won't be merged any time soon, so no need to panic or object.

@Sarcasm

I feel like it would be better to explain what are the protobuf benefits over json line for ninja users.

I think the general pro-protobuf arguments apply.

colincross · 2019-08-08T17:29:03Z

I'm the original author of these changes from #1210. Not sure why this needed to be split into a new pull request, and I'd prefer that you leave me as the author in Git when making minor changes to my patches.

As for JSON vs. protobuf, I would much rather have used JSON, but JSON is actually terrible at passing arbitrary data. The data must be valid UTF8, which precludes some arbitrary binary sequences that might be found in the output of rules unless you base64 encode everything, which is both inefficient and loses the human-readability advantages of JSON. I originally wrote this with UBJSON instead of protobuf, but that loses most of the advantages of JSON without providing the advantages of protobuf (library support in multiple languages, easy cross-version compatibility). I ended up having to write rules on how to rev the schema without breaking compatibility with existing frontend, and there were no good tools to enforce the schema. So I went with protobuf instead.

I'm happy to continue working on this PR if there is a path forward, but it is merged as-is to the fork of ninja used in the Android build system and has been working great for us there. It has enabled us to store multiple output streams with different levels of logging, collect the error messages into a separate file, and produce an accurate build trace that can be loaded in chrome://tracing. Recently we have turned on local build output using a status table instead of a single status line that shows the description of the 8 longest currently running edges. There are more features planned including streamed build progress from our CI and colorized logs output from CI.

Sarcasm · 2019-08-08T21:53:33Z

JSON is actually terrible at passing arbitrary data. The data must be valid UTF8, which precludes some arbitrary binary sequences that might be found in the output of rules unless you base64 encode everything, which is both inefficient and loses the human-readability advantages of JSON.

I agree it's technically possible to have arbitrary byte sequences, that contains binary data, and ninja should support it, but I don't think everything needs to be base64 encoded.

Looking at the src/frontend.proto, I see the following string fields (correct me if I'm wrong):

EdgeStarted#inputs
EdgeStarted#outputs
EdgeStarted#desc
EdgeStarted#command
EdgeFinished#output
Message#message

I'm not sure all theses can contain binary data, but if ninja had to support if for all these fields, it should be possible to just add something like <field>_format with a value of base64 or something like that, e.g.:

{
  "type": "EdgeStarted",
  "inputs": ["Zm9vLGNwcA=="],
  "inputs_type": "base64",
  "outputs": ["foo.o"],     # outputs_format can be skipped if it's utf-8
}

JSON has way more support than protobuf and is supported by default by a lots of programming languages or environments (Python standard library, Emacs Lisp, frameworks like Qt, ...).
While ninja can content itself to write a light version of the protobuf serialization and add the new features it needs bit by bit, I expect the protobuf consumers will often try to support all of protobuf unless the ninja subset is stable and publicly documented.

The C++ Module folks have a similar issue to deal with non-utf8 text and it looks like JSON is not a showstopper for that http://www.open-std.org/pipermail/tooling/2019-March/000553.html, I expect most real-life files will be human readable, so it seems a good compromise.

I'm not sure I want to write this, but really my Emacs integration example is interesting.
With a protobuf API, I would have to build either:

a protobuf deserializer in Emacs Lisp (because no one exists), I suspect it would take more code and time than to make the ninja UI itself
or depend on an external tool that to do the conversion to an intermediate format

While with JSON, there is built-in support, with efficient built-in support.
If it's just a simple integration just for my comfort, I can even skip the handling of non-utf-8 data.
I could easily test it with fake data that I write by hand.

Another thing, regarding the frontend that is invoked by ninja.
I also expected more an API like @jonesmz feel like it would be easier to have an API like this:

ninja --serialized-output

With output going to stdout or stderr.
And if user want confidence that the data is not mangled with other outputs, make it configurable, e.g.:

ninja --serialized-output --serialized-output-fd=3

That way, it seems easier to integrate in a tool like Emacs, because if I wanted to integrate the ninja build inside my Emacs session, with the current design, it looks like I would to write a frontend script, that forward the data to Emacs, so I would need 3 processes, while Emacs could maybe just invoke ninja with the rights params and consume the additional reporting stream.

Sorry to bring Emacs up, but I feel like editor integration is a reasonable example of a custom frontend, and right now the API does not seem easy to support that.

The other idea of a frontend I have in mind, is to make a python frontend that would make a D-Bus progress notification, so I can start the build in a workspace, go to another workspace, e.g. in my browser, and still track compilation progress.

EliRibble · 2019-08-09T06:44:05Z

@colincross

I'm the original author of these changes from #1210. Not sure why this needed to be split into a new pull request, and I'd prefer that you leave me as the author in Git when making minor changes to my patches.

My apologies, I didn't realize that was something that could be done in Github. I wanted to try to move your ideas forward without requiring you to do additional work. I tried to point out as much as I could that the work is almost entirely yours. If there's something you'd like me to do to make this right please let me know.

EliRibble · 2019-08-09T06:57:50Z

As far as the discussion here goes. I tend to agree with the idea that making ninja a library rather than making it start up a separate process to act as its frontend makes sense. I'll going to start looking in to what it would take to do that. I also think this PR could be modified to remove the (controversial) protobuf work to get the benefit of some of the changes while the discussion continues. I'll try to make that happen.

jhasse · 2019-08-09T08:28:50Z

My apologies, I didn't realize that was something that could be done in Github.

Not really GitHub, but normally Git will leave the original author of the commit when you rebase. It will look like this in GitHub: https://github.com/jhasse/ninja/commits/serialize "colincross authored and jhasse committed"

EliRibble · 2019-08-09T17:44:23Z

Created subset PR: #1631

XVilka · 2020-07-28T10:05:13Z

There are now some conflicts that should be resolved. Please rebase.

kaspar030 · 2020-12-18T13:47:31Z

Another thing, regarding the frontend that is invoked by ninja.
I also expected more an API like @jonesmz feel like it would be easier to have an API like this:
ninja --serialized-output
[...]

That way, it seems easier to integrate in a tool like Emacs, because if I wanted to integrate the ninja build inside my Emacs session, with the current design, it looks like I would to write a frontend script, that forward the data to Emacs, so I would need 3 processes, while Emacs could maybe just invoke ninja with the rights params and consume the additional reporting stream.

Just as a datapoint, I agree with this.
I'm currently writing a frontend that "transparently" calls out to ninja. Currently I cannot now what ninja did. if it exited without error, I can just assume that everything has been built, and if it fails, ... I guess the need for the functionality in general is clear. :)

So with the solution this PR proposes, my frontend would call ninja with another tool (which is a "frontend" in ninja's terminology) as parameter which ninja executes, which would somehow communicate back to my frontend. That sounds clumsy, error prone and not trivial to get right in a portable way.

Just having ninja's output in a structured way that's easy to parse would be much easier to integrate.

I don't care if it is json or protobufs.

amilcarlucas · 2022-03-15T18:48:59Z

ping

Eli Ribble added 3 commits July 11, 2019 21:18

Turn BuildStatus into an interface

83fc85f

Make BuildStatus an abstract interface, and move the current implementation to StatusPrinter, to make way for a serialized Status output. Original code by colincross.

Put builder output through status interface

05fb273

Send all output after manifest parsing is finished to the Status interface, so that when status frontends are added they can handle build messages. Original code from collincross.

EliRibble force-pushed the frontend branch 2 times, most recently from ce100e5 to f19f3f6 Compare July 12, 2019 04:22

Eli Ribble added 2 commits July 11, 2019 22:55

EliRibble force-pushed the frontend branch 2 times, most recently from 60023a3 to 22e94eb Compare July 12, 2019 05:59

EliRibble force-pushed the frontend branch from 22e94eb to 8f02cd9 Compare July 12, 2019 06:07

This was referenced Aug 2, 2019

Extend format status output - Allow to change also order of description in NINJA_STATUS #1045

Open

Elision with escape seq #1494

Closed

Remove 'using namespace std' from header files, properly namespace all std symbols #1614

Merged

This comment was marked as abuse.

Sign in to view

jhasse mentioned this pull request Aug 4, 2019

Proposed/still running #629

Closed

jonesmz reviewed Aug 4, 2019

View reviewed changes

This comment was marked as abuse.

Sign in to view

EliRibble mentioned this pull request Aug 9, 2019

Status interface #1631

Closed

jhasse mentioned this pull request Mar 23, 2020

Consider adding a second, "heavy" binary #1499

Open

mathstuf mentioned this pull request Mar 23, 2020

CreateProcess error message should be improved #1380

Closed

EliRibble mentioned this pull request Apr 17, 2020

Shipping a static splinter library and binary frontend splinter-build/splinter#34

Open

rotu mentioned this pull request Sep 1, 2020

Colcon does not show failed build output if using Ninja generator colcon/colcon-cmake#67

Closed

jhasse added this to the 2.0.0 milestone Apr 14, 2024

jhasse added the frontend label Apr 14, 2024

Add support for external frontends, take 2 #1600

Are you sure you want to change the base?

Add support for external frontends, take 2 #1600

Conversation

EliRibble commented Jul 12, 2019

EliRibble commented Jul 12, 2019

jhasse commented Jul 17, 2019

EliRibble commented Jul 22, 2019

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

Choose a reason for hiding this comment

This comment was marked as abuse.

Choose a reason for hiding this comment

Choose a reason for hiding this comment

This comment was marked as abuse.

Choose a reason for hiding this comment

This comment was marked as abuse.

Choose a reason for hiding this comment

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

Choose a reason for hiding this comment

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

Choose a reason for hiding this comment

This comment was marked as abuse.

This comment was marked as abuse.

Choose a reason for hiding this comment

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

jhasse commented Aug 4, 2019

This comment was marked as abuse.

jhasse commented Aug 4, 2019

This comment was marked as abuse.

jhasse commented Aug 4, 2019

Sarcasm commented Aug 4, 2019

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

jhasse commented Aug 7, 2019

This comment was marked as abuse.

colincross commented Aug 8, 2019

This comment was marked as abuse.

Sarcasm commented Aug 8, 2019

EliRibble commented Aug 9, 2019

EliRibble commented Aug 9, 2019

jhasse commented Aug 9, 2019

EliRibble commented Aug 9, 2019

XVilka commented Jul 28, 2020

kaspar030 commented Dec 18, 2020

amilcarlucas commented Mar 15, 2022