-
-
Notifications
You must be signed in to change notification settings - Fork 646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Implement Cobertura output option from #9275 #9295
Conversation
…tsbuild#9291) Unions are supposed to be an `is-a` relationship. Previously, to add a new field to a pre-existing target, you would do `UnionRule(PythonLibrary, TypeChecked)`. This is not an `is-a` relationship, though; `TypeChecked(Field)` is not a `PythonLibrary(Target)`. Instead, we add a sentinel class automatically to every `Target` called `PluginField`. So, to add a new field, plugin authors simply register `UnionRule(PythonLibrary.PluginField, TypeChecked)`.
### Problem Laziness can be great, particularly for better performance. We need laziness for certain `Fields` like `Sources` that are very expensive to hydrate and often not even needed. But, laziness has its downsides (see Haskell). Laziness generally makes code more difficult to work with and reason with. For example, by having `PrimitiveField` do its hydration and validation lazily, every `PrimitiveField` must now import and use the decorater `@memoized_property`. Further, it becomes harder for us to refer to a `Field`'s owning `Address` when we encounter validation errors; if lazy, we must then store `Address` on the `Field` instance because `Field`s have no way of referring back to their owning `Target` (this is by design). Finally, lazy validation of `PrimitiveField`s means that it is much harder for users to detect when they have invalid BUILD files. ### Solution Change `PrimitiveField` to eagerly validate and hydrate in the constructor. If the hydration is particularly expensive, then the user should use `AsyncField` to get the engine's caching + the laziness of `AsyncField`. #### Rejected alternative: `LazyField` `PrimitiveField` would have two subclasses: `EagerField` and `LazyField`. We would still have `AsyncField`. `LazyField` would be for when you need light-weight validation that you want to do lazily. This was rejected as a premature abstraction. For example, the performance cost of validating that every `timeout` field is `> 0` is likely very negligible. Perhaps, we find that it is indeed costly so we add `LazyField` down the road, but for now, it's premature.
Upgrade rust to 1.42
Previously many serial awaits were being performed where there were no data dependencies dictating this be so.
…ion (pantsbuild#9300) ### Problem With the V2 Target API, by design, fields live independently of targets. This means that they cannot access any of the information from their parent target or be traced back to their original owner. For example, `my_tgt.Get(Compatibility)` will return `Compatibility` as a distinct type. This is an issue during validation of fields. When validating a field, we need to refer to the original owning target to be able to generate a helpful error message that says which target the error comes from. Beyond generating useful error messages, some `AsyncField`s need to use the `Address` to hydrate, such as `Sources`. ### Solution Change the constructor for `Field` to take `Address`. However, we don't actually always store an `Address` field on every `Field` instance. `PrimitiveField`s have no use for the `Address` after they are validated in the constructor. It would be a non-trivial cost in space to store an `Address` on every single field ever encountered. So, `PrimitiveField`s use the `Address` during the constructor, and then throw away the value after. In contrast to `PrimitiveField`s, `AsyncField`s are validated lazily, so they do need to preserve the `Address` as an attribute. ### Result Field validation can now have helpful error messages. We can also now use `Address` for `AsyncField`s, which allows us to implement `Sources`. ### Remaining followup In a followup, we will provide standard exception types to make it much easier for field/plugin authors to create high-quality error messages.
…API (pantsbuild#9301) This sets up the 3 core target types for Python that we are originally planning to support. We can now parse and validate all 3 of these targets. This adds several new common `PrimitiveField` superclasses like `StringField` to cut down on boilerplate. ### Remaining followup * Add support for hydrating the `Sources` field. * Add tests for the custom Python sources types when doing this. * Add support for consuming the `Dependencies` field. * It's unclear what this looks like yet. * Allow going from `HydratedStruct -> Target` so that these may actually be used.
Add a test for using the repo root as a source root.
…ntsbuild#9225) ### Problem V2 dependencies rule didn't support listing 3rdparty dependencies. Turns out the `--dependencies-type=3rdparty` flag was used by at least one pants consumer. ### Solution Add support for listing external dependencies. ### Result Running ``` ./pants dependencies2 --dependencies2-transitive --dependencies2-type=3rdparty src/python/pants/rules/core:: ``` now prints all external deps of the target. ``` ~/c/pants ❯❯❯ ./pants dependencies2 --dependencies2-transitive --dependencies2-type=3rdparty src/python/pants/rules/core:: Scrubbed PYTHONPATH=/Users/tansypants/code/pants/src/python: from the environment. Markdown==2.1.1 PyYAML==5.1.2 Pygments==2.3.1 ansicolors==1.1.8 cffi==1.13.2 contextlib2==0.5.5 dataclasses==0.6 docutils==0.14 fasteners==0.15.0 packaging==16.8 pathspec==0.5.9 pex==2.1.5 psutil==5.6.3 py_zipkin==0.18.4 pyopenssl==17.3.0 pystache==0.5.3 python-Levenshtein==0.12.0 pywatchman==1.4.1 requests[security]>=2.20.1 setproctitle==1.1.10 setuptools==44.0.0 toml==0.10.0 twitter.common.collections<0.4,>=0.3.11 twitter.common.confluence<0.4,>=0.3.11 twitter.common.dirutil<0.4,>=0.3.11 typing-extensions==3.7.4 wheel==0.33.6 www-authenticate==0.9.2 ``` This is currently only working for python 3rd party deps. As far as I can tell we don't have V2 jvm support, so that remains a TODO.
### Problem We were passing all sources from the entire dep closure to the pytest EPR. This included, for example, the requirements.txt file. So modifying requirements.txt, even by just adding a comment, would cause pytest to be re-run, even if the change had no effect on the actual set of requirements. ### Solution Only pass python sources and resources to pytest. Only pass python sources to the coverage config. ### Result Whitespace changes to requirements.txt no longer invalidate the pytest EPR.
### Problem The engine requires that every value is immutable and hashable. (Technically, it only enforces hashability, but mutable things are fundamentally not safe to hash). However, many of the `raw_value`s we stored on `Field` are not hashable, such as `List`. The idea of `raw_value` was to preserve exactly what the user typed in a `BUILD` file, but that's fundamentally not safe to do because `BUILD` files can use mutable data types. ### Solution For `PrimitiveField`, discard the `raw_value` after initialization. We were only keeping it for the sake of debugging. It would be neat to continue storing, but it's not safe to do so with the engine so the best option is to simply discard the value. This also improves the space usage of `PrimitiveField`. For `AsyncField`, due to lazy hydration, we must preserve the raw value. But, we need to use these fields with the engine, so we introduce the attribute `sanitized_raw_value` and the abstractmethod `sanitize_raw_value()`. This allows us to do things like convert lists into tuples. `sanitize_raw_values()` also means that we can now get some light-weight validation to be eager, which is likely a good thing as it allows us to catch invalid BUILD files eagerly. For example, we can eagerly fail if a user tries doing `sources=[0, 1, 2]`. ### Result `Target` and `Field` are now safe to use with the engine.* *`Field` authors can still mess things up. They must use immutable and hashable types for `PrimitiveField.value` and `AsyncField.sanitized_raw_value`. There is nothing enforcing a `Field` author to use a safe type like a tuple over a list, other than convention.** ** Maybe we could make a MyPy plugin to check that the return type of a function is immutable? This would need to work ergonomically, such as working with native tuples and not forcing us to subclass `Tuple`.
### Problem pantsbuild#9181 caused a significant performance regression for operations in larger repos. It was also noticeable in the pantsbuild/pants repo: ``` # for 5 runs of `./pants filter :: | wc -l`: # before pantsbuild#9181 0m33.455s # after pantsbuild#9181 1m23.784s ``` Profiling showed that all of the time increase was in the `__contains__` method, which in pantsbuild#9166 had gone from `O(1)` to `O(n)`. The benchmark done there to justify this likely used 1) too small a set, 2) with too simple an `__eq__` impl ... to identify the difference. ### Solution Switch the backing datastore to a python `dict`. Since pants requires python>=3.6, `dict` guarantees order. `*OrderedSet` `__eq__` and `__hash__` receive custom but straightforward implementations. Additionally, mixing-in of `Sequence[T]` and indexing/slicing methods were removed, since they cannot be implemented efficiently with `dict` as it stands, and were as-of-yet unused. ### Result Performance is restored to before the change; code is slightly simpler.
This support exists in V1 and is useful for immediate local development feedback.
…rts (pantsbuild#9290) ### Problem Currently `Revision.lenient("1") > Revision.lenient("1.0.1")` See also pantsbuild#9289 ### Solution Update `__lt__` to fix error. ### Result Fixes above as `Revision.lenient("1") < Revision.lenient("1.0.1")`
) This fleshes out how we use `AsyncField`s, which are much more complex than `PrimitiveField`s. ## Result ```python sources = await Get[SourcesResult](SourcesRequest, tgt.get(Sources).request) print(sources.snapshot.files) ``` This also works: ```python if tgt.has_fields([PythonLibrarySources]): sources1 = await Get[Sources](SourcesRequest, tgt.get(PythonLibrarySources).request) sources2 = await Get[Sources](SourcesRequest, tgt.get(Sources).request) assert sources1 == sources2 ``` `PythonSources` and its subclasses will validate that all resulting files end in `*.py` (new behavior). `PythonLibrarySources` and `PythonTestsSources` will use the previous default globs. `PythonBinarySources` will enforce that `sources` is 0 or 1 files (previous behavior). ## Solution ### Ensuring support for subclassed `AsyncField`s With the Target API, we allow new targets to subclass `Field`s for custom behavior. For example, `PythonLibrarySources` might use the default globs of `*.py` whereas `PythonTestSources` might use the default globs of `test_*.py`. To allow these custom subclasses of `Field`s, we added support in pantsbuild#9286 for substituting in the subclass with the original parent class. For example, `my_python_library.get(Sources) == my_python_library.get(PythonSources) == my_python_library.get(PythonLibrarySources)`. This works great with `PrimitiveField` but is tricky to implement with `AsyncField` due to the engine not supporting subclasses. Originally, I tried achieving this extensibility through a union, which would allow the engine to have multiple ways to get a common result type like `SourcesResult`. But, this created a problem that there became multiple paths in the rule graph to compute the same product, e.g. `Sources->SourcesResult`, `PythonSources->SourcesResult`, etc. Instead, each `AsyncField` should define a simple `Request` dataclass that simply wraps the underlying `AsyncField`. This allows us to have only one path from `SourcesRequest -> SourcesResult`, but still give custom behavior in the underlying `SourcesRequest`. Within the hydration rule, the rule will call standardized extension points provided by the underlying field. **This means that the onus is on the `AsyncField` author to expose certain entry points for customizing the field's behavior.** For example, `Sources` defines the entry points of `default_globs` and `validate_snapshot()`. `Dependencies` might provide entry points like `inject_dependencies()` and `validate_dependencies()` (not necessarily, only possibilities). While this approach has lots of boilerplate and less extensibility than `PrimitiveField`s, it solves the graph ambiguity and still allows for subclassing an `AsyncField`. ### Fixing `__eq__` for `Field`s The previous naive implementation resulted in `Field`s only comparing their classvar `alias`, rather than their actual underlying values. This meant that the engine would cache values when it should not have. This tweaks how we use dataclasses to ensure that the engine works correctly with `AsyncField`s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you been able to publish a version fo scoverage to your local maven repo and use it to test against? There is a special invocation of pants to publish to your local maven repo with a named version.
@@ -151,7 +153,8 @@ def __init__( | |||
base_path = base_path or "." | |||
if os.path.isabs(base_path): | |||
base_path = os.path.relpath(base_path, get_buildroot()) | |||
self.base_path = base_path | |||
self.base_path = base_path, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that bare comma is making base_path a tuple. Is that intended?
@@ -144,6 +144,11 @@ def test_junit_run_with_coverage_succeeds_cobertura_merged(self): | |||
args=["--batch-size=2"], | |||
) | |||
|
|||
def test_junit_run_with_coverage_succeeds_cobertura_output(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are testing a change in the output of the pants you should probably assert something about the output.
…antsbuild#9274) This patch changes the cache-based Distribution lookup to use the lowest-version matching cached distribution, rather than the first inserted matching cached distribution. It also adds via to Location to make debugging where a location came from easier. ### Problem Currently, when JVM distributions are located, and there are cached entries that match, the first matching cached entry from the ordering that `dict.values()` uses will be picked. This behavior may be not-deterministic because that order is python implementation dependent, and cache insertion order dependent. ### Solution Always sort the cache entries by version. ### Result The lowest matching cache entry will always be used first. Additionally, this patch adds a `via` property to locations because I found setting up test fixtures tricky when I didn't know what source the `Location` objects I was seeing were coming from.
We don't want to invalidate sources based on arbitrary changes to requirements.txt, but we do need to depend on it so we know to recompute requirements. Previously this was achieved by synthesizing a files() target and adding a dep on it, in the python_requirements macro. However when we gather sources to operate on we have to include all files(), meaning we were invalidating the sources even on a whitespace change to requirements.txt. This change adds a new, internal, target type specially for the synthetic requirements.txt change, so that we can maintain a dependency, but rules can still filter out the actual file. Context: pantsbuild#6405
…ld#9314) The main idiom we'll use for Target API code is: ```python relevant_tgts = [tgt for tgt in tgts if tgt.has_field(PythonSources)] # do something with these tgts ``` `has_field(PythonSources)` is simpler to write than `has_fields([PythonSources])`. Because we will use this so often, this is nice sugar to add. -- This also documents the 3 main methods exposed by `Target`.
### Problem Scoverage supports exporting cobertura formats which would allow us to merge with cobertura coverage for java targets. Pants doesn't expose this option at the moment. ### Solution Call `CoberturaXmlWriter` method to generate Cobertura output when `--scoverage-enable-scoverage --scoverage-report-output-as-cobertura` options are provided in `pants` command ### Result Generate Cobertura output (`.pants.d/test/junit/_runs/*/*/all/scoverage/reports/xml/cobertura.xml`)
…runner;0.0.8["ci skip"]
…til-zip;0.0.12["ci skip"]
…_javactool_for_testing;0.0.7["ci skip"]
…runner-withretry;0.0.19["ci skip"]
…age-report-generator_2.12;0.0.3["ci skip"]
…ootstrapper_2.12;0.0.14["ci skip"]
…;0.0.21["ci skip"]
…runner;1.0.29["ci skip"]
…ol;0.0.17["ci skip"]
…pendency-update-checker;0.0.5["ci skip"]
… into pliu/test_cobertura_output
Cancelling this PR in favor of #9325 |
Problem
Implement Cobertura output option from #9275
Solution
Add new option (
--scoverage-report-output-as-cobertura
) for triggeringCoberturaXmlWriter(...)
method call inscoverage-report-generator_2.12-0.0.3.jar
to generate Cobertura outputResult
Generate Cobertura output (
.pants.d/test/junit/_runs/*/*/all/scoverage/reports/xml/cobertura.xml
)when
--scoverage-enable-scoverage --scoverage-report-output-as-cobertura
options are provided in pants command.