Allow tags to be applied to targets from a single source #7302

codealchemy · 2019-03-02T00:19:05Z

Now that task targets can be filtered by tags (prior work from #6902), having a way of defining a single source for which tags to apply to which targets would be helpful to avoid needing to set the same tag across multiple BUILD files.

There are other use cases as well, such as grouping all targets with a particular suffix (or prefix) under a single tag (as noted for integration targets in pants being tagged with integration below), or others (fsqio, for instance, uses tags for rule enforcement on the build graph)

The text was updated successfully, but these errors were encountered:

codealchemy · 2019-03-02T00:19:26Z

Proposed Solution

Using #6902 (comment) as a starting point, a (global) subsystem would fit well for this purpose - taking a single option (the path to the source file) and providing a helper function for identifying which tags to apply to which targets. This would act as a supplement to any tags defined in the BUILD files themselves, and would not be coupled to tag filtering so that it could apply to a broader selection of use cases.

Plugging this into the LegacyPythonCallbacksParser would allow us to apply tags to targets during target initialization, though injecting an instance of a subsystem there has proven to be a challenge due to the problem of calling a global instance of a subsystem before the Scheduler has been created (i.e. it fails on an UninitializedSubsystemError), which led me to the following TODO in global options:

pants/src/python/pants/option/global_options.py

Lines 46 to 47 in dea6270

    
             TODO: These options should move to a Subsystem once we add support for "bootstrap" Subsystems (ie, 
        
             allowing Subsystems to be consumed before the Scheduler has been created).

Not sure if there's a related issue for that, or if that may be resolved through anything currently in flight.

Which leads me here - looking for feedback on the problem / approach defined ☝️- in the meantime, I'm looking into other ways of (lazily) initializing the subsystem. There may be alternative approaches that would make sense for this problem as well (from what I can tell, possibly exposing a global option for the source file / path that could be fed to a service object).

For reference, WIP branch for the general approach I've taken so far here

stuhood · 2019-03-02T00:28:27Z

Is it the case that these options need to be applied "across" scopes (ie, to multiple Tasks at once)? The example that you referenced here: #6902 (comment) ... looks like the configuration is still "per Task" or per scope.

If this is per scope, rather than using a global subsystem, it feels like you should be able to use the Subsystem that you added in #7275 to add regex support. That would invert the model from the linked comment... so the config (as it would be specified in pants.ini) would look more like:

[target-filter]
target_restrictions: {
    'blacklist': [<target regexes>], # Targets ignored by all tasks that support this subsystem
    ..,
  }

[target-filter.lint.scalastyle]
target_restrictions: {
    'blacklist': [<target regexes>], # Targets ignored by just the scalastyle task
    'whitelist': # etc
  }

benjyw · 2019-03-02T03:34:14Z

@stuhood I think this question is orthogonal to the specific use-case of #7275 .

That change's subsystem provides the answer to "which target tags should cause me to skip the target". Whereas this change provides a way of tagging targets in a centralized way, without having to laboriously edit every relevant BUILD file.

True, #7275 is the motivating use-case, but there could easily be others. For example, a single mapping that says "all targets whose name ends with _integration should have the tag integration" could replace all our tags = {'integration'} boilerplate.

stuhood · 2019-03-02T16:38:53Z

@benjyw : Ok, I think I see what you're getting at. Expanding the description with more of what we're trying to accomplish would be helpful then I think, because the goal was not clear to me. It sounds like it's approximately: "there is a global subsystem that can add tags to targets based on regexes (?)".

Describing one or two usecases for this would be good too, because the _integration usecase can be satisfied in multiple ways.

benjyw · 2019-03-02T18:56:40Z

True, specific use-cases can be satisfied in multiple ways. But I think it's better if we don't have to solve each case ad-hoc.

The general design here is: there can be K ways of adding tags to a target (e.g., inline, or in a central .json file, or ), and M uses of target tags (e.g., blacklisting unlintable targets, not running integration tests, or ).

Then we do K+M work instead of K*M work, yet get the added benefit of uniformity.

Kubernetes does something similar. You can apply arbitrary key-value labels to pods, and then select on those labels for various actions. https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/

benjyw · 2019-03-02T18:57:47Z

@codealchemy can you edit your posts on this issue to clarify?

benjyw · 2019-03-02T19:03:10Z

But back to the technical question: We'd like to have a subsystem that provides a .json file mapping tags to target addresses (or target address regexes). So we're trying to figure out how to get that into LegacyPythonCallbacksParser's ctor, because that's created before we have full options.

I'm hoping that we can just create it lazily, since it's not needed in the bootstrap sequence itself.

stuhood · 2019-03-03T02:32:37Z

So, including these in the parser is one approach. I think that if you make lookup of the Subsystem instance lazy, that would work.

A few approaches

Make lookup of the global subsystem lazy (as you mentioned), and then inject at parse time.
Inject tags one level up in the @rules that hydrate "structs": https://github.com/pantsbuild/pants/blob/master/src/python/pants/engine/build_files.py#L138-L187 .. as I said, the naming is terrible. But: a struct "always" represents the un-instantiated placeholder for a target, and so you could include additional kwargs there.
Inject tags when we're actually instantiating v1 Target objects here: https://github.com/pantsbuild/pants/blob/master/src/python/pants/engine/legacy/graph.py#L147-L172

Of these options, the first and second ones are cleanest, and more forwards compatible (they work across both v1 and v2). The third option (injecting in BuildGraph) will only work for v1, because v2 does not create a BuildGraph.

Unfortunately, it looks like there are some minor blockers to taking the second approach (related to #6880: we don't actually inject the OptionsBootstrapper in the codepath that hydrates targets for v1). So I think either one or three would be paths forward for now (possibly with a TODO mentioning consuming the tags via optionable_rule).

codealchemy · 2019-03-04T23:55:00Z

@stuhood it looks like making the lookup lazy runs into the same issue in option 1 and 3 (seen running ./pants2 test tests/python/pants_test/build_graph:build_graph):

Ex. test error from option 1

self = <pants_test.build_graph.test_build_graph.BuildGraphTest testMethod=test_transitive_subgraph_of_addresses_bfs_predicate>

    def test_transitive_subgraph_of_addresses_bfs_predicate(self):
      root = self.inject_graph('a', {
        'a': ['b', 'c'],
        'b': ['d', 'e'],
>       'c': [], 'd': [], 'e': [],
      })

.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants_test/build_graph/test_build_graph.py:257:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants_test/build_graph/test_build_graph.py:44: in inject_graph
    self.build_graph.inject_address_closure(root_address)
.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/legacy/graph.py:206: in inject_address_closure
    self.inject_addresses_closure([address])
.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/legacy/graph.py:214: in inject_addresses_closure
    for _ in self._inject_specs(specs):
.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/legacy/graph.py:271: in _inject_specs
    subjects)
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py:35: in __exit__
    self.gen.throw(type, value, traceback)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <pants.engine.legacy.graph.LegacyBuildGraph object at 0x1042aba90>

    @contextmanager
    def _resolve_context(self):
      try:
        yield
      except Exception as e:
        raise AddressLookupError(
>         'Build graph construction failed: {} {}'.format(type(e).__name__, str(e))
        )
E       AddressLookupError: Build graph construction failed: ExecutionError 1 Exception encountered:
E       Computing Select(Specs(dependencies<Exactly(tuple)>=(SingleAddress(directory=u'a', name=u'a'),), matcher<Exactly(SpecsMatcher)>=SpecsMatcher(tags<Exactly(tuple)>=(), exclude_patterns<Exactly(tuple)>=())), Exactly(TransitiveHydratedTargets))
E         Computing Task(transitive_hydrated_targets, Specs(dependencies<Exactly(tuple)>=(SingleAddress(directory=u'a', name=u'a'),), matcher<Exactly(SpecsMatcher)>=SpecsMatcher(tags<Exactly(tuple)>=(), exclude_patterns<Exactly(tuple)>=())), Exactly(TransitiveHydratedTargets), true)
E           Computing Task(addresses_from_address_families, Specs(dependencies<Exactly(tuple)>=(SingleAddress(directory=u'a', name=u'a'),), matcher<Exactly(SpecsMatcher)>=SpecsMatcher(tags<Exactly(tuple)>=(), exclude_patterns<Exactly(tuple)>=())), Exactly(BuildFileAddresses), true)
E             Computing Task(parse_address_family, Dir(path=a), Exactly(AddressFamily), true)
E               Throw(Failed to parse a/BUILD:
E       Subsystem "TargetTagDefinitions" not initialized for scope "target-tag-definitions". Is subsystem missing from subsystem_dependencies() in a task? )
E                 Traceback (most recent call last):
E                   File "/Users/schmitt/Workspace/pants/.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/native.py", line 422, in extern_generator_send
E                     res = c.from_value(func[0]).send(c.from_value(arg[0]))
E                   File "/Users/schmitt/Workspace/pants/.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/build_files.py", line 61, in parse_address_family
E                     address_mapper.parser))
E                   File "/Users/schmitt/Workspace/pants/.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/mapper.py", line 55, in parse
E                     raise MappingError('Failed to parse {}:\n{}'.format(filepath, e))
E                 MappingError: Failed to parse a/BUILD:
E                 Subsystem "TargetTagDefinitions" not initialized for scope "target-tag-definitions". Is subsystem missing from subsystem_dependencies() in a task?

.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/legacy/graph.py:238: AddressLookupError

Ex. test error from option 3

self = <pants_test.build_graph.test_build_graph.BuildGraphTest testMethod=test_transitive_subgraph_of_addresses_bfs_predicate>

    def test_transitive_subgraph_of_addresses_bfs_predicate(self):
      root = self.inject_graph('a', {
        'a': ['b', 'c'],
        'b': ['d', 'e'],
>       'c': [], 'd': [], 'e': [],
      })

.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants_test/build_graph/test_build_graph.py:257:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants_test/build_graph/test_build_graph.py:44: in inject_graph
    self.build_graph.inject_address_closure(root_address)
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:211: in inject_address_closure
    self.inject_addresses_closure([address])
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:219: in inject_addresses_closure
    for _ in self._inject_specs(specs):
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:278: in _inject_specs
    self._index(thts.closure)
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:100: in _index
    new_targets.append(self._index_target(target_adaptor))
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:133: in _index_target
    target = self._instantiate_target(target_adaptor)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <pants.engine.legacy.graph.LegacyBuildGraph object at 0x112407b10>
target_adaptor = TargetAdaptor(address=a)

    def _instantiate_target(self, target_adaptor):
      """Given a TargetAdaptor struct previously parsed from a BUILD file, instantiate a Target.

        TODO: This assumes that the SymbolTable used for parsing matches the SymbolTable passed
        to this graph. Would be good to make that more explicit, but it might be better to nuke
        the Target subclassing pattern instead, and lean further into the "configuration composition"
        model explored in the `exp` package.
        """
      target_cls = self._target_types[target_adaptor.type_alias]
      try:
        # Pop dependencies, which were already consumed during construction.
        kwargs = target_adaptor.kwargs()
        kwargs.pop('dependencies')

        # Append any globally defined tags to the target
        kwargs.setdefault('tags', set())
        kwargs['tags'].update(self._target_tag_definitions().tags_for(target_adaptor.address.spec))

        # Instantiate.
        if issubclass(target_cls, AppBase):
          return self._instantiate_app(target_cls, kwargs)
        elif target_cls is RemoteSources:
          return self._instantiate_remote_sources(kwargs)
        return target_cls(build_graph=self, **kwargs)
      except TargetDefinitionException:
        raise
      except Exception as e:
        raise TargetDefinitionException(
            target_adaptor.address,
>           'Failed to instantiate Target with type {}: {}'.format(target_cls, e))
E       TargetDefinitionException: Invalid target BuildFileAddress(a/BUILD, a): Failed to instantiate Target with type <class 'pants.build_graph.target.Target'>: Subsystem "TargetTagDefinitions" not initialized for scope "target-tag-definitions". Is subsystem missing from subsystem_dependencies() in a task?

.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:177: TargetDefinitionException

For reference, same working branch (codealchemy@3f7c108 and codealchemy@495b2a5, respectively)

IIUC it looks like the build graph is parsed there before we have subsystems as well, is that right?

stuhood · 2019-03-05T00:23:33Z

@codealchemy : It occurs to me that there is an existing subsystem that you can either:

model after
just use!

pants/src/python/pants/build_graph/target.py

Lines 88 to 101 in 3a81e86

    
           class Arguments(Subsystem): 
        
             """Options relating to handling target arguments.""" 
        
             class UnknownArgumentError(TargetDefinitionException): 
        
               """An unknown keyword argument was supplied to Target.""" 
        
             options_scope = 'target-arguments' 
        
             @classmethod 
        
             def register_options(cls, register): 
        
               register('--ignored', advanced=True, type=dict, 
        
                        help='Map of target name to a list of keyword arguments that should be ignored if a ' 
        
                             'target receives them unexpectedly. Typically used to allow usage of arguments ' 
        
                             'in BUILD files that are not yet available in the current version of pants.')

stuhood · 2019-03-05T00:30:00Z

But actually, it's possible that that subsystem is only consumed within BuildGraph parsing... I think that to guarantee that a Subsystem is loaded, as early as possible, you'd need to include it here:

pants/src/python/pants/init/options_initializer.py

Lines 85 to 92 in 3a81e86

    
           # Gather the optionables that are not scoped to any other.  All known scopes are reachable 
        
           # via these optionables' known_scope_infos() methods. 
        
           top_level_optionables = ( 
        
             {GlobalOptionsRegistrar} | 
        
             GlobalSubsystems.get() | 
        
             build_configuration.optionables() | 
        
             set(Goal.get_optionables()) 
        
           )

benjyw · 2019-03-05T01:43:48Z

The new subsystem was added to GlobalSubsystems but still wasn't available in the parser. Are we sure Arguments actually works...? :)

Problem: It can be cumbersome to apply the same tags to many targets as it requires editing every associated BUILD file. Having a single source of tags would make it more straightforward to configure tags by which to filter targets (#6902), and can be applied to other use cases such as grouping targets with a particular suffix (ex. _integration in pants) under a single tag. Solution: Add a subsystem within Target responsible for fetching, parsing, and matching tags configured in a JSON file to targets. The subsystem matches against the target's address (though could be expanded to a regex to address additional use cases). Closes #7302

stuhood mentioned this issue Mar 2, 2019

Generic incremental strategy (white/blacklist) for fmt/lint #6902

Closed

benjyw assigned codealchemy Mar 2, 2019

codealchemy mentioned this issue Mar 5, 2019

Add subsystem for appending tags to targets from a single source #7315

Merged

stuhood mentioned this issue Mar 5, 2019

Support options consumption during legacy BuildGraph hydration #7316

Closed

benjyw closed this as completed in #7315 Mar 7, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow tags to be applied to targets from a single source #7302

Allow tags to be applied to targets from a single source #7302

codealchemy commented Mar 2, 2019 •

edited

Loading

codealchemy commented Mar 2, 2019 •

edited

Loading

stuhood commented Mar 2, 2019

benjyw commented Mar 2, 2019

stuhood commented Mar 2, 2019

benjyw commented Mar 2, 2019

benjyw commented Mar 2, 2019

benjyw commented Mar 2, 2019

stuhood commented Mar 3, 2019

codealchemy commented Mar 4, 2019

stuhood commented Mar 5, 2019

stuhood commented Mar 5, 2019

benjyw commented Mar 5, 2019

Allow tags to be applied to targets from a single source #7302

Allow tags to be applied to targets from a single source #7302

Comments

codealchemy commented Mar 2, 2019 • edited Loading

codealchemy commented Mar 2, 2019 • edited Loading

Proposed Solution

stuhood commented Mar 2, 2019

benjyw commented Mar 2, 2019

stuhood commented Mar 2, 2019

benjyw commented Mar 2, 2019

benjyw commented Mar 2, 2019

benjyw commented Mar 2, 2019

stuhood commented Mar 3, 2019

codealchemy commented Mar 4, 2019

stuhood commented Mar 5, 2019

stuhood commented Mar 5, 2019

benjyw commented Mar 5, 2019

codealchemy commented Mar 2, 2019 •

edited

Loading

codealchemy commented Mar 2, 2019 •

edited

Loading