Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand engine readme #5759

Merged
merged 8 commits into from
May 13, 2018
Merged

Conversation

cosmicexplorer
Copy link
Contributor

@cosmicexplorer cosmicexplorer commented Apr 27, 2018

Problem

It's not clear how to make rule definitions available to the scheduler, and there's no discussion of the type checking added to datatype() fields in #5723.

Solution

  • Add a section on datatype fields, typing, and how they interact with the engine.
  • Add a section on making rules available to the scheduler in and out of unit tests.
  • Add a section describing types of rules after discussion on this PR.

@cosmicexplorer
Copy link
Contributor Author

The bit about being able to expose rules from backends needs #5747 to be merged before it is true -- I'm working on passing CI there, not a problem, just making a note.

Copy link
Contributor

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The README changes look great to me, thanks for putting them together!

I'm slightly skeptical of the TypeCheckError change... Can you explain the motivation?

@@ -47,8 +47,8 @@ def datatype(field_decls, superclass_name=None, **kwargs):

class DataType(namedtuple_cls):
@classmethod
def make_type_error(cls, msg):
return TypeCheckError(cls.__name__, msg)
def make_type_error(cls, msg, *args, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a weird change. Why are we now accepting arguments that we'll ignore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is that you can pass in e.g. another exception instance as the next positional argument, so that its stacktrace can be preserved. *args and **kwargs are eventually forwarded to the Exception class constructor, unless I'm mistaken, so I don't see how we're ignoring these? I may be missing something quite obvious.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aah, sorry, I was totally misreading the TypeCheckError constructor!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of that might be because I don't know how normal humans format python code, part of that might be because this chaining of exceptions is currently a little ad-hoc/awkward. See the TODO in one of the other exception classes in that file -- there could be a method to make this pattern more concise.

@cosmicexplorer
Copy link
Contributor Author

This needs to be updated a bit after some recent changes to datatype(). I'll rework this and comment again when it is updated.

@cosmicexplorer cosmicexplorer changed the title Expand engine readme [WIP] Expand engine readme May 4, 2018
Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!


1. an `@rule`, which has a single product type and selects its inputs as described above.
2. a `SingletonRule`, which matches a product type with a value so the type can then be `Select`ed in an `@rule`.
3. a `RootRule`, which declares a type that can be used as a *subject*, which means it can be provided as an input to a `product_request()` or a `Get` statement.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the reason RootRule exists is only for the former case: Scheduler.product_request, et al. Those represent the root of a request. In the absence of RootRules, any subject type that was only involved in a request "at runtime" (ie via product_request), would show up as an an unused or impossible path in the rulegraph, because no @rule could provide that type. RootRule allows for plugging that hole in rulegraph verification by saying: "yea, no other @rule provides this type, but trust me: someone might request one at runtime".

Another potential name for RootRule might be "ParamRule", or something similar... because it's essentially saying that the type represents a sort of "public API entrypoint" via a product_request. I'm fairly confident that RootRule is a necessary API, but the name is not at all final... Suggestions welcome!


You do not need to use RootRule if any Get requests exist in the graph, because the Get requests are statically verified in the graph. We know before runtime that they might be requested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pilfered most of that comment into a few sentences in 5032fe9.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ParamRule would be a great name, and thinking it over, the only problem is that RootRule is also a good name, for different reasons.

@@ -181,3 +225,108 @@ in the context of scala and mixed scala & java builds. Twitter spiked on a proj
a target-level scheduling system scoped to just the jvm compilation tasks. This bore fruit and
served as further impetus to get a "tuple-engine" designed and constructed to bring the benefits
seen in the jvm compilers to the wider pants world of tasks.

## Datatypes in Depth
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section... feels like too much. But maybe it's not obvious to me yet that people will need to be convinced to add types to things... in the presence of @rules? The idea behind @rules is really to force people to use types, whether or not we motivate them to in some other way.

Maybe if you trimmed it down a bit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me. Would it be better just removed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that incorporating a small amount of "types are good, use them!" into the section that already describes datatype would be good? But probably not much more than a paragraph.

Sorry... it's great writing, but it doesn't feel like the right place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made that edit in af466ad.

The published version is currently broken because of a transitive dep bump.
@cosmicexplorer cosmicexplorer changed the title [WIP] Expand engine readme Expand engine readme May 12, 2018
@cosmicexplorer
Copy link
Contributor Author

Un-WIPing this after the most recent commits in response to PR comments.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

print(x[0]) # 'a string'

# datatype objects can be easily inspected:
print(x) # 'FormattedInt(content=a string)'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a separate change, it looks like we should be using the repr of fields, rather than str. That would result in quotes around string fields in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the output of the __str__() method on datatypes (which also displays any type constraints on fields) -- the __repr__() method uses the repr of each field. Does that sound appropriate, or is there a further change to be made?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(if print() uses repr, then i may have taken a shortcut in producing that example code/output -- i really hope not because i hate seeing that)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like that's exactly what happened. I can make a quick followup to fix that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #5818 for small followup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants