[WIP] Implement Cobertura output option from #9275 #9295

patliu85 · 2020-03-13T15:58:46Z

Problem

Implement Cobertura output option from #9275

Solution

Add new option (--scoverage-report-output-as-cobertura) for triggering CoberturaXmlWriter(...) method call in scoverage-report-generator_2.12-0.0.3.jar to generate Cobertura output

Result

Generate Cobertura output (.pants.d/test/junit/_runs/*/*/all/scoverage/reports/xml/cobertura.xml)
when --scoverage-enable-scoverage --scoverage-report-output-as-cobertura options are provided in pants command.

…tsbuild#9291) Unions are supposed to be an `is-a` relationship. Previously, to add a new field to a pre-existing target, you would do `UnionRule(PythonLibrary, TypeChecked)`. This is not an `is-a` relationship, though; `TypeChecked(Field)` is not a `PythonLibrary(Target)`. Instead, we add a sentinel class automatically to every `Target` called `PluginField`. So, to add a new field, plugin authors simply register `UnionRule(PythonLibrary.PluginField, TypeChecked)`.

### Problem Laziness can be great, particularly for better performance. We need laziness for certain `Fields` like `Sources` that are very expensive to hydrate and often not even needed. But, laziness has its downsides (see Haskell). Laziness generally makes code more difficult to work with and reason with. For example, by having `PrimitiveField` do its hydration and validation lazily, every `PrimitiveField` must now import and use the decorater `@memoized_property`. Further, it becomes harder for us to refer to a `Field`'s owning `Address` when we encounter validation errors; if lazy, we must then store `Address` on the `Field` instance because `Field`s have no way of referring back to their owning `Target` (this is by design). Finally, lazy validation of `PrimitiveField`s means that it is much harder for users to detect when they have invalid BUILD files. ### Solution Change `PrimitiveField` to eagerly validate and hydrate in the constructor. If the hydration is particularly expensive, then the user should use `AsyncField` to get the engine's caching + the laziness of `AsyncField`. #### Rejected alternative: `LazyField` `PrimitiveField` would have two subclasses: `EagerField` and `LazyField`. We would still have `AsyncField`. `LazyField` would be for when you need light-weight validation that you want to do lazily. This was rejected as a premature abstraction. For example, the performance cost of validating that every `timeout` field is `> 0` is likely very negligible. Perhaps, we find that it is indeed costly so we add `LazyField` down the road, but for now, it's premature.

Upgrade rust to 1.42

Previously many serial awaits were being performed where there were no data dependencies dictating this be so.

…ion (pantsbuild#9300) ### Problem With the V2 Target API, by design, fields live independently of targets. This means that they cannot access any of the information from their parent target or be traced back to their original owner. For example, `my_tgt.Get(Compatibility)` will return `Compatibility` as a distinct type. This is an issue during validation of fields. When validating a field, we need to refer to the original owning target to be able to generate a helpful error message that says which target the error comes from. Beyond generating useful error messages, some `AsyncField`s need to use the `Address` to hydrate, such as `Sources`. ### Solution Change the constructor for `Field` to take `Address`. However, we don't actually always store an `Address` field on every `Field` instance. `PrimitiveField`s have no use for the `Address` after they are validated in the constructor. It would be a non-trivial cost in space to store an `Address` on every single field ever encountered. So, `PrimitiveField`s use the `Address` during the constructor, and then throw away the value after. In contrast to `PrimitiveField`s, `AsyncField`s are validated lazily, so they do need to preserve the `Address` as an attribute. ### Result Field validation can now have helpful error messages. We can also now use `Address` for `AsyncField`s, which allows us to implement `Sources`. ### Remaining followup In a followup, we will provide standard exception types to make it much easier for field/plugin authors to create high-quality error messages.

…API (pantsbuild#9301) This sets up the 3 core target types for Python that we are originally planning to support. We can now parse and validate all 3 of these targets. This adds several new common `PrimitiveField` superclasses like `StringField` to cut down on boilerplate. ### Remaining followup * Add support for hydrating the `Sources` field. * Add tests for the custom Python sources types when doing this. * Add support for consuming the `Dependencies` field. * It's unclear what this looks like yet. * Allow going from `HydratedStruct -> Target` so that these may actually be used.

Add a test for using the repo root as a source root.

…ntsbuild#9225) ### Problem V2 dependencies rule didn't support listing 3rdparty dependencies. Turns out the `--dependencies-type=3rdparty` flag was used by at least one pants consumer. ### Solution Add support for listing external dependencies. ### Result Running ``` ./pants dependencies2 --dependencies2-transitive --dependencies2-type=3rdparty src/python/pants/rules/core:: ``` now prints all external deps of the target. ``` ~/c/pants ❯❯❯ ./pants dependencies2 --dependencies2-transitive --dependencies2-type=3rdparty src/python/pants/rules/core:: Scrubbed PYTHONPATH=/Users/tansypants/code/pants/src/python: from the environment. Markdown==2.1.1 PyYAML==5.1.2 Pygments==2.3.1 ansicolors==1.1.8 cffi==1.13.2 contextlib2==0.5.5 dataclasses==0.6 docutils==0.14 fasteners==0.15.0 packaging==16.8 pathspec==0.5.9 pex==2.1.5 psutil==5.6.3 py_zipkin==0.18.4 pyopenssl==17.3.0 pystache==0.5.3 python-Levenshtein==0.12.0 pywatchman==1.4.1 requests[security]>=2.20.1 setproctitle==1.1.10 setuptools==44.0.0 toml==0.10.0 twitter.common.collections<0.4,>=0.3.11 twitter.common.confluence<0.4,>=0.3.11 twitter.common.dirutil<0.4,>=0.3.11 typing-extensions==3.7.4 wheel==0.33.6 www-authenticate==0.9.2 ``` This is currently only working for python 3rd party deps. As far as I can tell we don't have V2 jvm support, so that remains a TODO.

### Problem We were passing all sources from the entire dep closure to the pytest EPR. This included, for example, the requirements.txt file. So modifying requirements.txt, even by just adding a comment, would cause pytest to be re-run, even if the change had no effect on the actual set of requirements. ### Solution Only pass python sources and resources to pytest. Only pass python sources to the coverage config. ### Result Whitespace changes to requirements.txt no longer invalidate the pytest EPR.

### Problem The engine requires that every value is immutable and hashable. (Technically, it only enforces hashability, but mutable things are fundamentally not safe to hash). However, many of the `raw_value`s we stored on `Field` are not hashable, such as `List`. The idea of `raw_value` was to preserve exactly what the user typed in a `BUILD` file, but that's fundamentally not safe to do because `BUILD` files can use mutable data types. ### Solution For `PrimitiveField`, discard the `raw_value` after initialization. We were only keeping it for the sake of debugging. It would be neat to continue storing, but it's not safe to do so with the engine so the best option is to simply discard the value. This also improves the space usage of `PrimitiveField`. For `AsyncField`, due to lazy hydration, we must preserve the raw value. But, we need to use these fields with the engine, so we introduce the attribute `sanitized_raw_value` and the abstractmethod `sanitize_raw_value()`. This allows us to do things like convert lists into tuples. `sanitize_raw_values()` also means that we can now get some light-weight validation to be eager, which is likely a good thing as it allows us to catch invalid BUILD files eagerly. For example, we can eagerly fail if a user tries doing `sources=[0, 1, 2]`. ### Result `Target` and `Field` are now safe to use with the engine.* *`Field` authors can still mess things up. They must use immutable and hashable types for `PrimitiveField.value` and `AsyncField.sanitized_raw_value`. There is nothing enforcing a `Field` author to use a safe type like a tuple over a list, other than convention.** ** Maybe we could make a MyPy plugin to check that the return type of a function is immutable? This would need to work ergonomically, such as working with native tuples and not forcing us to subclass `Tuple`.

### Problem pantsbuild#9181 caused a significant performance regression for operations in larger repos. It was also noticeable in the pantsbuild/pants repo: ``` # for 5 runs of `./pants filter :: | wc -l`: # before pantsbuild#9181 0m33.455s # after pantsbuild#9181 1m23.784s ``` Profiling showed that all of the time increase was in the `__contains__` method, which in pantsbuild#9166 had gone from `O(1)` to `O(n)`. The benchmark done there to justify this likely used 1) too small a set, 2) with too simple an `__eq__` impl ... to identify the difference. ### Solution Switch the backing datastore to a python `dict`. Since pants requires python>=3.6, `dict` guarantees order. `*OrderedSet` `__eq__` and `__hash__` receive custom but straightforward implementations. Additionally, mixing-in of `Sequence[T]` and indexing/slicing methods were removed, since they cannot be implemented efficiently with `dict` as it stands, and were as-of-yet unused. ### Result Performance is restored to before the change; code is slightly simpler.

This support exists in V1 and is useful for immediate local development feedback.

…rts (pantsbuild#9290) ### Problem Currently `Revision.lenient("1") > Revision.lenient("1.0.1")` See also pantsbuild#9289 ### Solution Update `__lt__` to fix error. ### Result Fixes above as `Revision.lenient("1") < Revision.lenient("1.0.1")`

) This fleshes out how we use `AsyncField`s, which are much more complex than `PrimitiveField`s. ## Result ```python sources = await Get[SourcesResult](SourcesRequest, tgt.get(Sources).request) print(sources.snapshot.files) ``` This also works: ```python if tgt.has_fields([PythonLibrarySources]): sources1 = await Get[Sources](SourcesRequest, tgt.get(PythonLibrarySources).request) sources2 = await Get[Sources](SourcesRequest, tgt.get(Sources).request) assert sources1 == sources2 ``` `PythonSources` and its subclasses will validate that all resulting files end in `*.py` (new behavior). `PythonLibrarySources` and `PythonTestsSources` will use the previous default globs. `PythonBinarySources` will enforce that `sources` is 0 or 1 files (previous behavior). ## Solution ### Ensuring support for subclassed `AsyncField`s With the Target API, we allow new targets to subclass `Field`s for custom behavior. For example, `PythonLibrarySources` might use the default globs of `*.py` whereas `PythonTestSources` might use the default globs of `test_*.py`. To allow these custom subclasses of `Field`s, we added support in pantsbuild#9286 for substituting in the subclass with the original parent class. For example, `my_python_library.get(Sources) == my_python_library.get(PythonSources) == my_python_library.get(PythonLibrarySources)`. This works great with `PrimitiveField` but is tricky to implement with `AsyncField` due to the engine not supporting subclasses. Originally, I tried achieving this extensibility through a union, which would allow the engine to have multiple ways to get a common result type like `SourcesResult`. But, this created a problem that there became multiple paths in the rule graph to compute the same product, e.g. `Sources->SourcesResult`, `PythonSources->SourcesResult`, etc. Instead, each `AsyncField` should define a simple `Request` dataclass that simply wraps the underlying `AsyncField`. This allows us to have only one path from `SourcesRequest -> SourcesResult`, but still give custom behavior in the underlying `SourcesRequest`. Within the hydration rule, the rule will call standardized extension points provided by the underlying field. **This means that the onus is on the `AsyncField` author to expose certain entry points for customizing the field's behavior.** For example, `Sources` defines the entry points of `default_globs` and `validate_snapshot()`. `Dependencies` might provide entry points like `inject_dependencies()` and `validate_dependencies()` (not necessarily, only possibilities). While this approach has lots of boilerplate and less extensibility than `PrimitiveField`s, it solves the graph ambiguity and still allows for subclassing an `AsyncField`. ### Fixing `__eq__` for `Field`s The previous naive implementation resulted in `Field`s only comparing their classvar `alias`, rather than their actual underlying values. This meant that the engine would cache values when it should not have. This tweaks how we use dataclasses to ensure that the engine works correctly with `AsyncField`s.

hrfuller

Have you been able to publish a version fo scoverage to your local maven repo and use it to test against? There is a special invocation of pants to publish to your local maven repo with a named version.

hrfuller · 2020-03-16T21:56:43Z

src/python/pants/java/jar/jar_dependency.py

@@ -151,7 +153,8 @@ def __init__(
        base_path = base_path or "."
        if os.path.isabs(base_path):
            base_path = os.path.relpath(base_path, get_buildroot())
-        self.base_path = base_path
+        self.base_path = base_path,


I think that bare comma is making base_path a tuple. Is that intended?

hrfuller · 2020-03-16T21:58:09Z

tests/python/pants_test/backend/jvm/tasks/test_junit_run_integration.py

@@ -144,6 +144,11 @@ def test_junit_run_with_coverage_succeeds_cobertura_merged(self):
            args=["--batch-size=2"],
        )

+    def test_junit_run_with_coverage_succeeds_cobertura_output(self):


If you are testing a change in the output of the pants you should probably assert something about the output.

…antsbuild#9274) This patch changes the cache-based Distribution lookup to use the lowest-version matching cached distribution, rather than the first inserted matching cached distribution. It also adds via to Location to make debugging where a location came from easier. ### Problem Currently, when JVM distributions are located, and there are cached entries that match, the first matching cached entry from the ordering that `dict.values()` uses will be picked. This behavior may be not-deterministic because that order is python implementation dependent, and cache insertion order dependent. ### Solution Always sort the cache entries by version. ### Result The lowest matching cache entry will always be used first. Additionally, this patch adds a `via` property to locations because I found setting up test fixtures tricky when I didn't know what source the `Location` objects I was seeing were coming from.

We don't want to invalidate sources based on arbitrary changes to requirements.txt, but we do need to depend on it so we know to recompute requirements. Previously this was achieved by synthesizing a files() target and adding a dep on it, in the python_requirements macro. However when we gather sources to operate on we have to include all files(), meaning we were invalidating the sources even on a whitespace change to requirements.txt. This change adds a new, internal, target type specially for the synthetic requirements.txt change, so that we can maintain a dependency, but rules can still filter out the actual file. Context: pantsbuild#6405

…ld#9314) The main idiom we'll use for Target API code is: ```python relevant_tgts = [tgt for tgt in tgts if tgt.has_field(PythonSources)] # do something with these tgts ``` `has_field(PythonSources)` is simpler to write than `has_fields([PythonSources])`. Because we will use this so often, this is nice sugar to add. -- This also documents the 3 main methods exposed by `Target`.

### Problem Scoverage supports exporting cobertura formats which would allow us to merge with cobertura coverage for java targets. Pants doesn't expose this option at the moment. ### Solution Call `CoberturaXmlWriter` method to generate Cobertura output when `--scoverage-enable-scoverage --scoverage-report-output-as-cobertura` options are provided in `pants` command ### Result Generate Cobertura output (`.pants.d/test/junit/_runs/*/*/all/scoverage/reports/xml/cobertura.xml`)

…runner;0.0.8["ci skip"]

…til-zip;0.0.12["ci skip"]

…_javactool_for_testing;0.0.7["ci skip"]

…runner-withretry;0.0.19["ci skip"]

…age-report-generator_2.12;0.0.3["ci skip"]

…ootstrapper_2.12;0.0.14["ci skip"]

…;0.0.21["ci skip"]

…runner;1.0.29["ci skip"]

…ol;0.0.17["ci skip"]

…pendency-update-checker;0.0.5["ci skip"]

… into pliu/test_cobertura_output

patliu85 · 2020-03-17T22:36:09Z

Cancelling this PR in favor of #9325

Test cobertura output in pantsbuild#9275

b518658

patliu85 changed the title ~~[WIP] Test cobertura output in #9275~~ [WIP] Test cobertura output from #9275 Mar 13, 2020

Eric-Arellano and others added 15 commits March 13, 2020 12:21

Bump rust to 1.42 (pantsbuild#9297)

130842d

Upgrade rust to 1.42

Improve setup_pytest_for_target concurrency. (pantsbuild#9283)

44861b5

Previously many serial awaits were being performed where there were no data dependencies dictating this be so.

Move pants.source's tests into src/python. (pantsbuild#9298)

cef8154

Add a test for using the repo root as a source root.

Prep for 1.27.0.dev0 (pantsbuild#9304)

83651bb

Add support for pytest-cov console reports. (pantsbuild#9302)

cf15e11

This support exists in V1 and is useful for immediate local development feedback.

hrfuller reviewed Mar 16, 2020

View reviewed changes

baroquebobcat and others added 12 commits March 16, 2020 17:22

pants build committing publish data for push of org.pantsbuild#pants-…

b132925

…runner;0.0.8["ci skip"]

pants build committing publish data for push of org.pantsbuild#buck-u…

aeab72b

…til-zip;0.0.12["ci skip"]

pants build committing publish data for push of org.pantsbuild#custom…

c81458e

…_javactool_for_testing;0.0.7["ci skip"]

pants build committing publish data for push of org.pantsbuild#junit-…

8f00240

…runner-withretry;0.0.19["ci skip"]

pants build committing publish data for push of org.pantsbuild#scover…

db53cbd

…age-report-generator_2.12;0.0.3["ci skip"]

pants build committing publish data for push of org.pantsbuild#zinc-b…

760b01e

…ootstrapper_2.12;0.0.14["ci skip"]

pants build committing publish data for push of org.pantsbuild#args4j…

3d14f95

…;0.0.21["ci skip"]

pants build committing publish data for push of org.pantsbuild#junit-…

4242f9b

…runner;1.0.29["ci skip"]

stuhood and others added 5 commits March 17, 2020 11:30

pants build committing publish data for push of org.pantsbuild#jar-to…

df18b7b

…ol;0.0.17["ci skip"]

pants build committing publish data for push of org.pantsbuild#ivy-de…

68824fa

…pendency-update-checker;0.0.5["ci skip"]

Test cobertura output in pantsbuild#9275

21aff63

Implement Cobertura output option

9fc4ded

Merge branch 'pliu/test_cobertura_output' of github.com:twitter/pants…

ed4af3e

… into pliu/test_cobertura_output

patliu85 changed the title ~~[WIP] Test cobertura output from #9275~~ [WIP] Implement Cobertura output option from #9275 Mar 17, 2020

patliu85 closed this Mar 17, 2020

gshuflin mentioned this pull request Apr 17, 2020

Remove "Executing process" text from V2 UI #9568

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Implement Cobertura output option from #9275 #9295

[WIP] Implement Cobertura output option from #9275 #9295

patliu85 commented Mar 13, 2020 •

edited

Loading

hrfuller left a comment

hrfuller Mar 16, 2020

hrfuller Mar 16, 2020

patliu85 commented Mar 17, 2020

[WIP] Implement Cobertura output option from #9275 #9295

[WIP] Implement Cobertura output option from #9275 #9295

Conversation

patliu85 commented Mar 13, 2020 • edited Loading

Problem

Solution

Result

hrfuller left a comment

Choose a reason for hiding this comment

hrfuller Mar 16, 2020

Choose a reason for hiding this comment

hrfuller Mar 16, 2020

Choose a reason for hiding this comment

patliu85 commented Mar 17, 2020

patliu85 commented Mar 13, 2020 •

edited

Loading