Support `module.sc` files in subfolders #3213
What are the semantics when
|
@lefou I haven't thought about that yet. I'm not sure what the best thing to do is. I'm guessing (2) or (3), just because that's what other build tools seem to do |
We should remember our single-source-(file)-of-truth concept and make split-up projects explicit. Instead of looking up (guessing) sub-projects, we should explicitly list them in the parent project. I think we previously handled kind-of sub-projects with

I would strongly recommend not guessing sub-project relations solely based on the existence of files. Otherwise we end up with a chaotic and non-reproducible setup (which you can experience with Gradle, for example). |
With an explicit sub-project configuration, it would be possible to aggregate multiple stand-alone projects (e.g. by pointing to a location outside the current project or some git submodules). As long as sub-projects don't use any resources of the aggregating project, their build results (in

So, simply aggregating multiple Mill projects should not require a rebuild of each single project. But it would make reuse of external projects easy and lightweight. |
I think there are a few orthogonal things here.

Aggregating standalone projects

This can be done with or without explicit references between subfolder

The challenge with aggregation is the things that are "global". These include:
There's no general solution for combining these global configs in different standalone projects when aggregating, but maybe some compromise is sufficient to be useful. Bazel has similar issues, and draws an arbitrary line of what works in aggregated projects that works well enough in practice (e.g.

These are actually similar to the problems involved with treating sub-directories as standalone projects when running

Explicit References Between Subprojects

This is something that different build tools handle differently:
So clearly all different approaches can work. I think my current inclination is to go the Bazel/SBT route rather than the Gradle/Maven route. For two reasons:
Existing support for foreign modules

Existing "foreign modules" work, but they cannot serve the purpose of these nested
In effect, the current way foreign modules and

My goal for this effort is to solve both these issues:
I only started looking into this a few days ago, so it's still pretty rough. Hopefully it'll become more concrete as the implementation progresses |
@lihaoyi Thanks for your thorough explanations. I think a link from the sub-project to the root or parent project would be the best solution.
The only downside of the sub-project idea as a whole is the lookup which we need to do with every Mill invocation. Maybe we can settle on a single enabler option in the root module, so we only look for sub-projects if it is enabled. ( |
@lolgab I've fixed the resolution logic. Took a bit of surgery, but in the end managed to do it without too much invasive refactoring.

@lefou The semantics of

I think it's probably not worth keeping binary compatibility for this PR. We are planning to break compatibility anyway at some point in the next few months to release 0.12.x, and this PR definitely needs to go into 0.12.x, so this PR can be the first of those breaking changes. We don't need to merge it yet, until we start having further compat-breaking PRs targeting 0.12.x (e.g. Scala 3.x), at which point we can cut a |
I wonder how likely it is that this PR breaks compatibility. It now needs explicit enablement via a

But being able to split up builds without considering any other breaking change is a huge benefit for some of the larger projects I witnessed. Maybe we could release this useful feature in a

The only breaking change I do recognize is the removed foreign module support. We could cut a new release in which we mark that feature as deprecated or encourage users to reach out to us, so we have a better feeling for how many projects are affected. The safest approach would be to cut a binary compatible

As a reminder: changing Mill versions isn't a problem. Changing ABI is what is costly for the Mill ecosystem, as all plugins need to be prepared and re-released for it. |
It is technically possible to maintain compatibility, but I don't think it's worth it. It would involve duplicating large swathes of code: basically the entire resolution logic, the MillBootstrapRootModule codegen logic, and all associated tests (assuming we actually want to ensure both code paths work!). That is a ton of copy-paste that will really muck up the codebase.

The thing about breaking bincompat with 0.12.x is that we're going to need to do it anyway: Scala 3 support will involve breaking bincompat, bumping uPickle to 4.x will involve breaking bincompat, as will many other changes in the 0.12.x discussion #3209. Given that we're going to need a minimum of 2 releases anyway (0.12.0-RC1 and 0.12.0 final), it costs us nothing to include this PR as part of 0.12.0-RC1, and saves us a whole lot of trouble trying to maintain compatibility in the codebase.

The options are basically:
Of these options:
This is probably the point where we should consider cutting a

We don't need to do it immediately, but we should plan to do it in the near future (e.g. 31st July?), so we have some time to release 0.12.0-RC1 by Q4 and 0.12.0 final by EOY |
I'm hitting a bincompat issue here with

But if I get rid of the alias, that breaks binary compatibility. Not sure whether there's some way around that. |
For now I just re-enabled |
You can't get rid of a |
If we consider running |
The terminology is a bit mixed up, but Root could mean the root of the file, rather than the root of the entire project. Similarly, submodule could mean modules nested in the same file rather than in subfolders. But I admit it is ambiguous, and I don't have a solution that doesn't add a bunch of verbosity.

You're right about running Mill in subfolders. Not sure what the solution there is; maybe these should be treated differently than normal tasks and always be available regardless of what folder you are in. But the details of that design can probably wait till later, since we'll be breaking compat in 0.13.0 in any case, so we will have a chance to tweak things |
I think something is broken with traversal when using the new nested module feature:
I get that against this project: https://github.com/finos/morphir/tree/build-changes-plus-cli-introduction |
This PR implements support for per-subfolder `build.sc` files, named `module.sc`. This allows large builds to be split up into multiple `build.sc` files, each relevant to the sub-folder that they are placed in, rather than having a multi-thousand-line `build.sc` in the root configuring Mill for the entire codebase.

Semantics
- The `build.sc` in the project root and all nested `module.sc` files within the Mill directory tree are recursively discovered (TODO: ignoring files listed in `.millignore`) and compiled together.
- We ignore any subfolders that have their own `build.sc` file, indicating that they are the root of their own project and not part of the enclosing folder.
- Each `foo/bar/qux/build.sc` file is compiled into a `millbuild.foo.bar.qux` `package object`, with the `build.sc` and `module.sc` files being compiled into a `millbuild` `package object` (rather than a plain `object` in the status quo).
- An `object blah extends Module` within each `foo/bar/qux/build.sc` file can be referenced in code via `foo.bar.qux.blah`, or referenced from the command line via `foo.bar.qux.blah`.
- The base modules of `module.sc` files do not have the `MainModule` tasks: `init`, `version`, `clean`, etc. Only the base module of the root `build.sc` file has those.

Design
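To ground the discussion in this section, here is a hedged sketch of the layout it keeps returning to: a `module.sc` in `foo/bar/` containing an `object qux` with a target `baz`. The file contents, the target bodies, and the `app` module in the root `build.sc` are hypothetical, and any opt-in flag or boilerplate the final implementation may require is omitted.

```scala
// build.sc (project root; hypothetical contents)
import mill._

object app extends Module {
  def greeting = T { "hello from the root build" }
}

// foo/bar/module.sc (hypothetical contents)
import mill._

object qux extends Module {
  def baz = T { "hello from foo/bar" }
}
```

With a uniform hierarchy, the nested target would be addressed as `foo.bar.qux.baz`, both from Scala code and from the command line (for example `mill foo.bar.qux.baz`), alongside `app.greeting` from the root file.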
Uniform vs Non-uniform hierarchy
One design decision here is whether a `module.sc` file in a subfolder `foo/bar/` containing `object qux { def baz }` would have its targets referenced via `foo.bar.qux.baz` syntax, or via some alternative, e.g. `foo/bar/qux.baz`.

A non-uniform hierarchy `foo/bar/qux.baz` would be similar to how Bazel treats folders vs. targets non-uniformly (`foo/bar:qux-baz`), and also similar to how external modules in Mill are handled, e.g. `mill.idea.GenIdea/idea`, as well as existing foreign modules. However, it introduces significant user-facing complexity:

- `foo/bar/qux.baz` vs `foo/bar.qux.baz` or `foo/bar/qux/baz`?
- `module.sc` files rather than just the top-level one e.g. `__.compile`?
- `module.sc` modules and targets in Scala code as well?

Bazel has significant complexity to handle these cases, e.g. query via `...` vs `:all` vs `*`. It works, but it does complicate the user-facing semantics.

The alternative of a uniform hierarchy also has downsides:

- `foo.bar.qux.baz` to the `build.sc` or `module.sc` file in which it is defined?
- `build.sc` and in a nested `module.sc`, what happens?

I decided to go with a uniform hierarchy where everything, both in the top-level `build.sc` and in nested `module.sc` files, ends up squashed together in a single uniform `foo.bar.qux.baz` hierarchy.

Package Objects
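The language feature this section builds on can be shown in plain Scala, independent of Mill. A minimal sketch, assuming hypothetical names (`PackageObjectDemo` and the string returned by `baz`); it is not the code Mill generates, only an illustration of how a `package object` makes definitions reachable through the package path:

```scala
// Stand-in for what a nested foo/bar/qux file could be compiled into.
package millbuild.foo.bar {
  package object qux {
    def baz: String = "output of foo.bar.qux.baz"
  }
}

// Stand-in for the root build: nested definitions are reachable
// through ordinary package paths.
package millbuild {
  object PackageObjectDemo {
    def main(args: Array[String]): Unit = {
      // Fully qualified reference.
      println(millbuild.foo.bar.qux.baz)
      // Relative reference from inside the millbuild package.
      println(foo.bar.qux.baz)
    }
  }
}
```

Both references resolve to the same definition, which is the property that lets a folder path and a module path share one dotted syntax.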
The goal here is to try and make modules defined in nested `module.sc` files "look the same" as modules defined in the root `build.sc`. There are two possible approaches:

1. Splice the source code of the various nested `module.sc` files into the top-level `object build`. This is possible, but very complex and error-prone. Especially when it comes to reporting proper error locations in stack traces (filename/line number), this will likely require a custom compiler plugin similar to the `LineNumberPlugin` we have today.
2. Convert the `object`s into `package object`s, such that the module tree defined in the root `build.sc` becomes synonymous with the JVM package tree. While the `package object`s will cause the compiler to synthesize `object package { ... }` wrappers, that is mostly hidden from the end user.

I decided to go with (2) because it seemed much simpler, making use of existing language features rather than trying to force the behavior we want using compiler hackery. Although `package object`s may go away at some point in Scala 3, they should be straightforward to replace with explicit `export foo.*` statements when that time comes.

Existing Foreign Modules
Mill already supports existing `foo.sc` files which support targets and modules being defined within them, but does not support referencing them from the command line.

I have removed the ability to define targets and modules in random `foo.sc` files. We should encourage people to put things in `module.sc`, since that would allow the user to co-locate the build logic within the folder containing the files it is related to, rather than as a bunch of loose `foo.sc` scripts. Removing support for modules/targets in `foo.sc` files greatly simplifies the desugaring of these scripts, and since we are already making a breaking change by overhauling how per-folder `module.sc` files work, we might as well bundle this additional breakage together (rather than making multiple breaking changes in series).

`build.sc`/`module.sc` file discovery

For this implementation, I chose to make `module.sc` files discovered automatically by traversing the filesystem: we recursively walk the subfolders of the root `build.sc` project and look for any files named `module.sc`. We only traverse folders with `module.sc` files to avoid having to traverse the entire filesystem structure every time. Empty `module.sc` files can be used as necessary to allow `module.sc` files to be placed deeper in the folder tree.

This matches the behavior of Bazel and SBT in discovering their `BUILD`/`build.sbt` files, and notably goes against Maven/Gradle, which require submodules/subprojects to be declared in the top-level build config.

This design has the following characteristics:
- In future, if we wish to allow `mill` invocations from within a subfolder, the distinction between `build.sc` and `module.sc` allows us to easily find the "enclosing" project root.
- It ensures that any folders containing `build.sc`/`module.sc` files that accidentally get included within a Mill build do not end up getting picked up and confusing the top-level build, because we automatically skip any subfolders containing `build.sc`.
- Similarly, it ensures that adding a `build.sc` file "enclosing" an existing project does not affect Mill invocations in the inner project, because we only walk to the nearest enclosing `build.sc` file to find the project root.
- We do not automatically traverse deeply into sub-folders to discover `module.sc` files, which means that it should be almost impossible to accidentally pick up `module.sc` files that happen to be on the filesystem but you did not intend to include in the build.

This mechanism should do the right thing 99.9% of the time. For the last 0.1% where it doesn't do the right thing, we can add a `.millignore`/`.config/millignore` file to support ignoring things we don't want picked up, but I expect that should be a very rare edge case.

Task Resolution
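Resolution operates on whatever the discovery step found, so it helps to have that step in concrete form first. A minimal sketch of the walk described in the file discovery section, assuming a hypothetical `ModuleDiscovery.discover` helper written against `java.nio.file`; it is not Mill's actual implementation and ignores any `.millignore` handling:

```scala
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

// Hypothetical helper, for illustration only.
object ModuleDiscovery {
  // Direct subfolders of `dir`, sorted by name for deterministic output.
  private def subfolders(dir: Path): List[Path] = {
    val stream = Files.list(dir)
    try stream.iterator().asScala.filter(Files.isDirectory(_)).toList.sortBy(_.toString)
    finally stream.close()
  }

  // Collect module.sc files below `dir`, descending only into subfolders that
  // themselves contain a module.sc and do not contain their own build.sc.
  def discover(dir: Path): List[Path] =
    subfolders(dir).flatMap { sub =>
      val moduleFile = sub.resolve("module.sc")
      val isOwnRoot = Files.exists(sub.resolve("build.sc"))
      if (isOwnRoot || !Files.exists(moduleFile)) Nil
      else moduleFile :: discover(sub)
    }
}
```

Calling `ModuleDiscovery.discover(root)` on the folder holding the root `build.sc` would return `a/module.sc` and `a/b/module.sc`, but not `c/d/module.sc` when `c/` has no `module.sc` of its own. That is why the description mentions empty `module.sc` files as a way to reach deeper folders.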
I have aimed to keep the logic in `resolve/` mostly intact. The core change is replacing `rootModule: BaseModule` with `baseModules: BaseModuleTree`, which provides enough metadata to allow `resolveDirectChildren` and `resolveTransitiveChildren` to find `BaseModule`s in sub-folders in addition to `Module` `object`s nested within the parent `Module`. Other than that, the logic should be basically unchanged, which hopefully should mitigate the risk of introducing new bugs.

Compatibility
This change is not binary compatible, and the change in the `.sc` file desugaring is invasive enough that we should consider it a major breaking change. This will need to go into Mill 0.12.x.

Out-Of-Scope/TODO
- Running `mill` within a subfolder of the enclosing project. Shouldn't be hard to add given this design, but the PR is complicated enough as is that I'd like to leave it for a follow-up.
- Error reporting when a module is duplicated in an enclosing `object` and in a nested `module.sc` file. Again, probably not hard to add, but can happen in a follow-up.

Pull request: #3213