Warning: These are raw notes, and still need to be cleaned up. Read at your own peril!
Expression trees
Bing scenario: for ad hoc queries, shipping code by compiling and sending assemblies would be too heavyweight. Also, it's not just about execution but about intense analysis, which feeds into decisions such as where the query ends up running. Introspection is key.
They currently work around various limitations. Async calls may be part of the query, and there's a lot of code manipulation to handle that.
There are other important scenarios: capturing code for auto-differentiation (ML) or for translation to GPU executable code.
Would allowing more nodes remove a useful restriction that people depend on today? For instance, most new nodes we add may not be sensible in EF. We've always had this problem: it has always been possible to write a query that compiles but fails at runtime.
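To make the "fails at runtime" problem concrete, here is a minimal sketch. `ToyTranslator` is a hypothetical stand-in for a provider's translator, not EF's actual implementation; it accepts a handful of node kinds and rejects everything else. Both lambdas compile without complaint, and only the walk at runtime finds the unsupported node.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for a query provider's translator (not a real EF type).
// It accepts parameters, constants, and a few operators, and rejects the rest.
public sealed class ToyTranslator : ExpressionVisitor
{
    public static void Validate(LambdaExpression query) =>
        new ToyTranslator().Visit(query.Body);

    public override Expression Visit(Expression node)
    {
        if (node == null) return null;
        switch (node.NodeType)
        {
            case ExpressionType.Parameter:
            case ExpressionType.Constant:
            case ExpressionType.Equal:
            case ExpressionType.GreaterThan:
            case ExpressionType.AndAlso:
                // Supported: let the base visitor recurse into the children.
                return base.Visit(node);
            default:
                throw new NotSupportedException(
                    $"Node type {node.NodeType} is not supported by this provider.");
        }
    }
}

public static class RuntimeFailureDemo
{
    // Both lambdas are accepted by the compiler as expression trees.
    public static readonly Expression<Func<int, bool>> Supported =
        x => x > 5 && x == 7;
    public static readonly Expression<Func<string, bool>> Unsupported =
        s => s.ToUpper() == "A"; // contains a method call node

    // Returns false when the toy translator rejects the tree at runtime.
    public static bool IsSupported(LambdaExpression query)
    {
        try { ToyTranslator.Validate(query); return true; }
        catch (NotSupportedException) { return false; }
    }
}
```

Under these assumptions, `IsSupported(Supported)` succeeds while `IsSupported(Unsupported)` trips over the `Call` node, even though nothing at compile time distinguished the two.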
Partial solutions:
- provide alternative factories with a pluggable builder pattern, driven by the target type (just like we now do for task-like types)
- offer analyzers that are domain specific.
Provider model: It's probably not that unrealistic to implement your own factory. Plugging it through the language probably requires some extra work, but it seems doable. This would allow static checking of shape, and would give analyzers something to latch onto for further restriction of nodes.
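As a rough illustration of the provider idea, the sketch below shows what a custom factory and the calls a compiler might emit against it could look like. Everything here is hypothetical: `SqlQuery<TDelegate>`, `SqlQueryBuilder`, and the way the compiler would discover the factory do not exist today, and nodes are represented as plain strings only to keep the example short.

```csharp
using System;
using System.Globalization;

// HYPOTHETICAL: a domain-specific "tree" type the compiler could target.
public sealed class SqlQuery<TDelegate>
{
    public SqlQuery(string text) { Text = text; }
    public string Text { get; }
}

// HYPOTHETICAL factory: its methods mirror the names of the
// System.Linq.Expressions.Expression factories, but only a subset exists.
public static class SqlQueryBuilder
{
    public static string Parameter(Type type, string name) => name;
    public static string Constant(object value, Type type) =>
        Convert.ToString(value, CultureInfo.InvariantCulture);
    public static string GreaterThan(string left, string right) => $"({left} > {right})";
    public static string LessThan(string left, string right) => $"({left} < {right})";
    public static string AndAlso(string left, string right) => $"({left} AND {right})";
    public static SqlQuery<TDelegate> Lambda<TDelegate>(string body, params string[] parameters) =>
        new SqlQuery<TDelegate>(body);
    // Deliberately no Call(...) factory: a lambda containing a method call
    // would have nothing to bind to, so it could be rejected at compile time.
}

public static class BuilderDemo
{
    // What a compiler might emit (assumption) for:
    //   SqlQuery<Func<int, bool>> q = x => x > 5 && x < 10;
    public static SqlQuery<Func<int, bool>> Build()
    {
        string x = SqlQueryBuilder.Parameter(typeof(int), "x");
        return SqlQueryBuilder.Lambda<Func<int, bool>>(
            SqlQueryBuilder.AndAlso(
                SqlQueryBuilder.GreaterThan(x, SqlQueryBuilder.Constant(5, typeof(int))),
                SqlQueryBuilder.LessThan(x, SqlQueryBuilder.Constant(10, typeof(int)))),
            x);
    }
}
```

The point of the sketch is the missing `Call` method: because the set of factory methods defines the allowed shape, an unsupported construct would become a binding error rather than a runtime failure.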
The expression lambda rewriter today is nicely isolated in the Roslyn codebase and only ~1100 lines.
There's a further step we could take, which is to freeze current expression trees in time. Then whoever wants to understand new language features needs to use a new expression type.
Expression trees have been significantly expanded since the language tied into them. There is an extension story, where a node can be reducible: it then offers to translate itself into other nodes. This helps older providers keep working, and it means you only have to implement the irreducible nodes rather than all of them.
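The reduction mechanism already exists in System.Linq.Expressions and can be sketched as follows. `SquareExpression` is a hypothetical node invented for illustration; the overridden members (`NodeType`, `Type`, `CanReduce`, `Reduce`) are the existing extension points on `Expression`.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical extension node meaning "operand squared".
public sealed class SquareExpression : Expression
{
    public SquareExpression(Expression operand) { Operand = operand; }

    public Expression Operand { get; }

    // Extension nodes identify themselves with ExpressionType.Extension.
    public override ExpressionType NodeType => ExpressionType.Extension;
    public override Type Type => Operand.Type;
    public override bool CanReduce => true;

    // Translate into standard nodes: evaluate the operand once into a
    // temporary, then multiply the temporary by itself.
    public override Expression Reduce()
    {
        ParameterExpression tmp = Expression.Variable(Operand.Type, "tmp");
        return Expression.Block(
            new[] { tmp },
            Expression.Assign(tmp, Operand),
            Expression.Multiply(tmp, tmp));
    }
}

public static class ReduceDemo
{
    // The lambda compiler knows nothing about SquareExpression; it reduces
    // the node to the standard Block/Assign/Multiply nodes before compiling.
    public static int SquareOf(int value)
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        var lambda = Expression.Lambda<Func<int, int>>(new SquareExpression(x), x);
        return lambda.Compile()(value);
    }
}
```

Because the default `VisitChildren` on a reducible node visits its reduced form, an `ExpressionVisitor` written before the node existed sees only standard nodes, which is how older providers can keep working.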
We don't want to overdo reduction. We probably wouldn't reduce async lambdas to trees representing the state machine. That would overly commit us to some very specific implementation choices, and probably also leave performance opportunities on the table.
It's a big question whether we want to commit to expression trees following the language in perpetuity. But we don't need to commit to that, as long as we feel an upgrade is useful.
This is not just about adding new nodes, but also expanding existing ones with new expressiveness. We'd have to be careful to keep generating old factory method calls for existing situations.
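For reference, the factory-call interface looks roughly like this for a simple lambda. The class and method names in the sketch are illustrative; the `Expression.*` factories are the existing API, and the second method approximates what the compiler generates for the first.

```csharp
using System;
using System.Linq.Expressions;

public static class FactoryCallsDemo
{
    // What the user writes: the compiler converts the lambda into a tree.
    public static Expression<Func<int, int>> FromLambda() => x => x + 1;

    // Roughly the factory calls the compiler generates for the lambda above.
    public static Expression<Func<int, int>> FromFactories()
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        return Expression.Lambda<Func<int, int>>(
            Expression.Add(x, Expression.Constant(1, typeof(int))),
            x);
    }
}
```

Both methods produce trees with the same shape, so any change in which factory the compiler calls for an existing construct would be observable to providers that inspect the tree.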
github.com/bartdesmet/ExpressionFutures/tree
Inherits and shadows System.Linq.Expressions.Expression.
Supports dynamic, async. It needed some hacks because it is a separate library; if it goes into the BCL it would have access to the right things.
Supports null-conditional, discard.
Statements are generally supported up to C# 6.0.
Since this is a shadow library over System.Linq.Expressions, there's an updated Roslyn compiler that also understands it.
What we would like to do:
- Use factory methods as the well-defined interface between compiler and API
- Feel good about incrementally improving without having to do all of the language at once
- Add provider model/builder pattern for other factories
- Get to a prioritized list of specific language constructs to start supporting
- Evolve compiler and API together for specific constructs
- Check in with existing providers such as EF to ensure the compat story is good and their scenarios are taken care of
For now, let's mull it over for a couple of months, pursue information, crisp up the plan. Only after a while will we be able to free up compiler resources.