Proposal: Modules #8302

Closed
alrz opened this issue Feb 2, 2016 · 46 comments

Comments

@alrz
Contributor

alrz commented Feb 2, 2016

While namespaces are very useful for organizing large codebases and preventing name collisions, they are not really good at isolating types within a program. At best you have to set up separate projects with assembly references to each other, which sometimes seems like overkill and complicates the project structure.

Modules would be like namespaces (implicitly partial and public, and declared at the assembly level):

module Module1 {
  public interface Interface1 { }
}

To use types declared in other modules, you must write using directives at the top of the module indicating the dependency (types declared in a module can only be referenced from other modules).

module Module2 {
  // module dependency -- required
  using Module1;

  public interface Interface2 : Interface1 { }
}

One can write the whole program in modules as individually isolated components with well defined dependencies.

@HaloFour

HaloFour commented Feb 2, 2016

The only reason that I can gather that you'd want to do any of this would be to avoid having to invoke the compiler more than once in order to create more than one assembly. But then would come the feature requests to add all of the different compiler arguments into the source directly so that each of these "modules" could have their own build properties. Seems quite unnecessary and we already have a plethora of build systems from which to choose.

The CLR has a very different concept of modules, including those that are separate from the assembly containing the manifest. That is different conceptually from what you describe above and I don't think that there would be much to gain by exposing that in the compiler either.

@bondsbw

bondsbw commented Feb 2, 2016

@alrz How would internal work? Would those types only be exposed within the module and not outside it (even in the same project)?

I can see something like this being useful for project-agnostic dependency management (similar to NDepend's concept of namespace-defined components (PDF)), but only if modules must not be allowed to form a circular dependency chain.

@alrz
Contributor Author

alrz commented Feb 2, 2016

@HaloFour Exporting modules is more of a further consideration. However, I'd prefer to modularize my program instead of creating a jungle of projects just to make some types reusable. You can think of this like "one class per file" in VB6; wasn't it annoying? Yes, it's better to have one file per class, but sometimes it's not necessary and complicates things rather than being useful. No, the compiler would not support different options for each module, because the point is that they are close enough to be in the same project. But my real motivation for this is to levelize types in modules and to organize code. I'm not really fond of the module keyword either, since these are a special kind of namespace after all. But it'd be interesting to see how a module system could be useful in C#.

@bondsbw Access modifiers can be a tricky part; I'm still not sure whether modules themselves should have access modifiers. The question is whether internal for a type in an exported module really makes sense. If so, then InternalsVisibleToAttribute would work, I guess. Otherwise it'll work as if they were in a namespace.

@HaloFour

HaloFour commented Feb 2, 2016

@alrz

Given proper types and namespaces I don't really see the point. You already have a decent level of control as to what you can import into a particular scope. I don't understand what actual problem you're trying to solve. Why does C# or the CLR need yet another namespacing mechanism?

Assemblies are already composed of modules. Every assembly must have at least one and they can be in separate files. But that doesn't really buy you much. The manifest still has to flatten all of the metadata across the modules and accessibility modifiers still work as if there is a single module. You'd need something new and completely different for any other flavor of "modules" to work and I don't think it would be worth that much.

How much of what you want to do can be handled through assembly aliases? By default the C# compiler stuffs all references under the default global alias, but you're free to give each reference its own alias and you have to import them separately.

@alrz
Contributor Author

alrz commented Feb 2, 2016

@HaloFour OK, here's the thing. Namespaces are useful only to prevent name collisions. You can still use them to organize code, but then it would not be clear which namespaces are using the types from which other namespaces. Explicit indication of this can be really helpful in large codebases. Today using is not required, and you can just use the fully qualified name. But if we make it required, the dependency information flows through the whole project, and by prohibiting circular dependencies we get organized code with well-defined dependencies between its various parts.
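
As a rough illustration of what prohibiting circular dependencies would involve, here is a minimal sketch in today's C# rather than the proposed syntax. The `ModuleGraph` class and the dictionary that maps each module name to the modules named in its using directives are hypothetical; the proposal does not say how a compiler would perform this check.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: each key is a module name, each value lists the
// modules that module declares with using directives.
public static class ModuleGraph
{
    // Returns one dependency cycle as a list of module names
    // (first and last entries are the same module), or null if there is none.
    public static List<string> FindCycle(IReadOnlyDictionary<string, string[]> dependencies)
    {
        var done = new HashSet<string>(); // modules that are fully explored
        var path = new List<string>();    // current depth-first path

        foreach (var module in dependencies.Keys)
        {
            var cycle = Visit(module, dependencies, done, path);
            if (cycle != null) return cycle;
        }
        return null;
    }

    private static List<string> Visit(
        string module,
        IReadOnlyDictionary<string, string[]> dependencies,
        HashSet<string> done,
        List<string> path)
    {
        if (done.Contains(module)) return null;

        int index = path.IndexOf(module);
        if (index >= 0)
        {
            // The module is already on the current path, so the tail of the path is a cycle.
            var cycle = path.GetRange(index, path.Count - index);
            cycle.Add(module);
            return cycle;
        }

        path.Add(module);
        if (dependencies.TryGetValue(module, out var uses))
        {
            foreach (var dependency in uses)
            {
                var cycle = Visit(dependency, dependencies, done, path);
                if (cycle != null) return cycle;
            }
        }
        path.RemoveAt(path.Count - 1);
        done.Add(module);
        return null;
    }
}
```

For a graph in which M2 uses M1 and M3 uses M2, `FindCycle` returns null; if M1 also used M3, it would return a list that starts and ends with the same module name.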

@HaloFour

HaloFour commented Feb 2, 2016

@alrz

So is it more a question of seeing/limiting the dependencies of a particular class? Even if you could limit that via code how would you handle the inevitability that one "module" will expose types from another "module" often in ways that the current class never has to directly import? This is true across assemblies today.

@alrz
Contributor Author

alrz commented Feb 2, 2016

@HaloFour You mean something like this?

module M1 {
  class A { }
}
module M2 {
  using M1;
  class B { public A F() { ... } }
}
module M3 {
  using M2;
  // A is not in the scope but B is
}

If M3 uses other types from M1 directly, it must write using M1; otherwise I don't see why it's a problem.

@HaloFour

HaloFour commented Feb 2, 2016

@alrz So you'd only need to import the module if you directly declared a variable of a type in that module? Seems like it would be pretty leaky then.

@dsaf

dsaf commented Feb 2, 2016

...at best you have to set up separate projects with assembly references to each other, which sometimes seems like overkill and complicates the project structure.

...instead of creating a jungle of projects just to make some types reusable...

Maybe the problem is the difficulty of dealing with projects on an IDE level rather than language constructs?

@alrz
Contributor Author

alrz commented Feb 2, 2016

@HaloFour Got it. What if we required M3 to mention every dependency of M2, just like generic constraints, which aren't inherited?

@dsaf Exporting modules might not be the perfect solution, but the basic idea is to make dependencies between "modules", or whatever we call them, more visible and to prevent circular dependencies.

@HaloFour

HaloFour commented Feb 2, 2016

@alrz Requiring explicit module dependencies would create another problem: explosive expansion of required declarations. Imagine if the entire CLR was split into lots of little modules. Many of those modules would take dependencies on one another just by their nature. Any module that takes a dependency on any of those modules would then have to declare all of those dependent modules, and anything depending on that module would have to declare the dependency on that module plus every other module. Cartesian product dependencies. Honestly, leaky dependencies would probably be preferable, but they would largely defeat the purpose of a module system that tracks dependencies.
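
To make the expansion concrete, here is a minimal sketch that computes what a module would have to declare if dependencies were not inherited, namely everything reachable from its direct dependencies. The `RequiredDeclarations` class and the module names used below are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;

public static class RequiredDeclarations
{
    // Returns every module that `module` would have to declare:
    // its direct dependencies plus everything reachable through them.
    public static SortedSet<string> For(
        string module,
        IReadOnlyDictionary<string, string[]> direct)
    {
        var required = new SortedSet<string>();
        var pending = new Stack<string>();
        pending.Push(module);

        while (pending.Count > 0)
        {
            var current = pending.Pop();
            if (!direct.TryGetValue(current, out var uses)) continue;

            foreach (var dependency in uses)
            {
                // Add returns false for modules that were already recorded,
                // so each module is expanded only once.
                if (required.Add(dependency)) pending.Push(dependency);
            }
        }
        return required;
    }
}
```

Take a hypothetical graph in which App uses IO and Linq, IO uses Text, and both Text and Linq use Collections. App names only two modules directly, but `For("App", graph)` returns four: Collections, IO, Linq and Text. Every module that depends on App would inherit that whole list as well.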

@alrz
Contributor Author

alrz commented Feb 2, 2016

@HaloFour You mean the BCL, right? I'll also note that dotnet.github.io claims that .NET Core is modular, but there isn't really any enforcement in the language and dependencies are managed through NuGet!

@HaloFour

HaloFour commented Feb 2, 2016

@alrz Yeah, CLR/BCL. The corelib kinda straddles both and covers so much functionality that it is ripe for modularization, which CoreCLR/CoreFX is doing by breaking what used to be large assemblies into lots of little assemblies that can be deployed individually. NuGet is one package manager, but the dependencies are established through run-of-the-mill assembly dependency metadata in the assembly manifest.

@svick
Contributor

svick commented Feb 3, 2016

@alrz

it would not be clear which namespaces are using the types from which other namespaces

If this is what you want, then I think an IDE feature would make more sense than a language feature. Using Roslyn, it shouldn't be hard to create a VS extension that tells what namespaces/assemblies/types are used by a given file/type/namespace/project. No language change necessary.
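
A minimal sketch of the kind of analysis such a tool could start from, assuming the Microsoft.CodeAnalysis.CSharp NuGet package (Roslyn) is referenced. The `NamespaceUsage` class is hypothetical, and the sketch references only the core library, so types from other assemblies would not resolve; a real extension would work from the project's actual references.

```csharp
// Requires the Microsoft.CodeAnalysis.CSharp NuGet package (Roslyn).
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class NamespaceUsage
{
    // Returns the namespaces of the types that the given source text refers to,
    // whether the type was reached through a using directive or a fully qualified name.
    public static SortedSet<string> ReferencedNamespaces(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);

        // Assumption: the core library alone is enough for this sketch.
        var compilation = CSharpCompilation.Create(
            "NamespaceUsageSketch",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

        SemanticModel model = compilation.GetSemanticModel(tree);
        var namespaces = new SortedSet<string>();

        foreach (var name in tree.GetRoot().DescendantNodes().OfType<SimpleNameSyntax>())
        {
            // Only names that bind to a type are of interest here.
            if (model.GetSymbolInfo(name).Symbol is INamedTypeSymbol type
                && type.ContainingNamespace != null
                && !type.ContainingNamespace.IsGlobalNamespace)
            {
                namespaces.Add(type.ContainingNamespace.ToDisplayString());
            }
        }
        return namespaces;
    }
}
```

Because the lookup goes through the semantic model instead of the list of using directives, a fully qualified name such as `System.Text.StringBuilder` is reported under its namespace even when the file has no matching using directive.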

@alrz
Contributor Author

alrz commented Feb 3, 2016

@svick Namespaces in C# are meant to be used for organizing code, and you can use using to bring types into scope, but when you use fully qualified names all that goodness goes away, so all you get from namespaces is the prevention of name collisions, which is why they're called namespaces. In my opinion, isolating types and managing dependencies is something that the language should enforce, not just package managers (which only work at assembly boundaries) or analyzers (which don't enforce anything). The least you would get from modules is that they encourage you to actually think about code organization and levelize the dependencies (within a program, or across assemblies).

@HaloFour

HaloFour commented Feb 3, 2016

Why would some new "module" organization be better than assemblies? And how exactly would they be implemented or enforced? You still have the leaky/Cartesian problem, which I doubt can be resolved, particularly after the fact given the size and scope of the BCL. And the entire concept of "exporting" a module is foreign to the BCL.

I don't think that C# needs to be Rust.

Update: Modules in Rust can be referenced by FQN without use also, so this isn't Rust either.

@alrz
Contributor Author

alrz commented Feb 3, 2016

@HaloFour What does this have to do with Rust again?

@HaloFour

HaloFour commented Feb 3, 2016

@alrz Nothing apparently. You've referenced that language in other proposals so I thought you were taking inspiration from it.

@alrz
Contributor Author

alrz commented Feb 4, 2016

@HaloFour Only #8127, for which I ended up with Swift syntax, more or less. #7875 and #7671 exist in Java, #7626 is from C++, and this one is from TypeScript/F#/Jigsaw. But you're right, I'm desperate for Rust features anyway.

Why would some new "module" organization be better than assemblies?

Because you don't always need a separate assembly to modularize your program. F# has a nice top-down dependency order across files; with Jigsaw you have to declare modules' dependencies separately, and AFAIK they are meant to address JAR shortcomings. From JW:

In short, JARs are a good attempt at modularization, but they don't fulfill all the requirements for a truly modular environment.

Same would apply to assemblies. "entire concept of "exporting" a module is foreign to the BCL." It doesn't have anything to do with BCL itself, because, as I said it "hints the compiler to generate a separate assembly file for the module." This part came from TS. For this to work BCL doesn't have to change because there is no limitation to use types outside modules.

So nothing to do with Rust if you hate it. 😄

@HaloFour

HaloFour commented Feb 4, 2016

@alrz IIRC Jigsaw more addresses the fact that the J2*E requirements are incredibly monolithic, basically an all-or-none proposal. But the JVM lacks assemblies so you're stuck with nothing but packages which aren't great for modularization (not to mention a completely broken accessibility model.)

I have no ill will towards Rust (I honestly don't know that much about it), I'm just not a big fan of taking features/syntax from other languages for the sake of making C# more like that language.

Having the compiler create multiple assemblies from a single invocation doesn't sound like a good idea.

@alrz
Contributor Author

alrz commented Feb 4, 2016

@HaloFour I don't know much about Jigsaw either because it's not released yet, but I think modularity is something that should be encouraged in the language. It is the means to get the program structure under control as self-contained reusable units, and in the future the BCL itself could be modularized and take advantage of it. I don't see what is wrong with that. "Having the compiler create multiple assemblies from a single invocation doesn't sound like a good idea." I know that non-source files as input for the compiler, or having attributes change the static semantics of the language, is a no-go, as mentioned in other threads, but I'm not quite sure about this one. However, I do believe that having modules as a means to control dependencies doesn't seem that unreasonable. "I'm just not a big fan of taking features/syntax from other languages" Isn't this the foundation of every new language that exists? You don't have to reinvent the wheel!

@HaloFour

HaloFour commented Feb 4, 2016

@alrz IIRC Jigsaw has nothing to do with Java as a language, rather it involves breaking what is considered J2SE and J2EE into smaller JARs or collections of class files. Right now a J2SE implementation must provide every single class as documented or it can't be considered J2SE. This has never really been a problem with the CLR (e.g. you want data you opt into referencing System.Data.dll) and it has only become more modular over time as bits of mscorlib.dll and System.dll were spun off into their own DLLs with type forwarding. CoreCLR takes that a lot further, but it's still all assembly based.

@alrz
Contributor Author

alrz commented Feb 4, 2016

@HaloFour Yes, it's all assembly based, so I have to create an assembly for each of my so-called modules to have separate boundaries within the program? It seems a bit ironic: I need to create external dependencies just to manage them? FYI, Jigsaw will affect Java with some new syntax as it is currently proposed.

@HaloFour

HaloFour commented Feb 4, 2016

@alrz

FYI, Jigsaw will affect Java with some new syntax as it is currently proposed.

No, it introduces a new manifest file called the module declaration which specifies inter-jar dependencies and exported packages. It's a separate syntax located in a separate module-info.java source file which is kept at the root of the JAR. That syntax doesn't affect Java source outside of that special file. This brings Jigsaw JARs closer to on par with what CLR assemblies have always supported, namely explicitly declared dependencies and assembly type associations (to avoid class collisions, a real problem in Java). Ya gotta remember that Java programs are literally just a bunch of class files thrown together into a directory hierarchy without any known interrelationships or dependencies on one another. The CLR equivalent to the recommended Java deployment route today would literally be to ILMerge everything except the .NET Framework into one big .EXE or .DLL.

I doubt Jigsaw will fix protected, though.

@HaloFour

HaloFour commented Feb 4, 2016

@alrz

Yes, if you want to break your "program" into smaller reusable parts with explicit dependencies and isolation then the appropriate tool is to use assemblies. That's always been the case. I don't think it is at all appropriate to try to turn Roslyn into a build system and have it "modularize" your "program" into a slew of separate assemblies based on some kind of embedded syntax.

@alrz
Contributor Author

alrz commented Feb 4, 2016

@HaloFour I wasn't talking about exported modules, though. I'll remove that from the opening post, since I agree that it would be a mess in the long term, and if we truly want reusability, assemblies are the way to go.

@dsaf

dsaf commented Feb 5, 2016

@alrz Maybe this proposal can provide additional benefits by piggybacking on native .NET modules? For example, it would be interesting to be able to use F# and C# within one assembly/project.

http://blogs.msdn.com/b/junfeng/archive/2004/07/15/183813.aspx
http://blogs.msdn.com/b/junfeng/archive/2005/02/12/371683.aspx

@alrz
Contributor Author

alrz commented Feb 5, 2016

@dsaf Even if it could, someone would have to take responsibility for handing separate source files to those compilers and linking the output, which doesn't seem reasonable (and is perhaps unrelated to the syntax proposed here). I have no idea what you had in mind and what it would get you, but other than that, there is no reason to expose CLR modules to the language.

@HaloFour

HaloFour commented Feb 5, 2016

The compiler has always supported .NET module files, there's just never been support in the IDE for projects of that type. I imagine it would work best in a form of parent/child project structure where each of the child projects is compiled to a separate module and then the parent project links them together into an assembly. An additional ILMerge process would be neat, too, so that you wouldn't have to distribute all of the separate module files. Then you could officially have multiple language assemblies.

However, you could still use the embedded .NET module metadata to support this proposal. Currently the compiler spits out a single module per output file but I don't think that there is a reason it couldn't output multiple modules with arbitrary names within the same assembly binary. Internally the type system would still be flattened out but the compiler could enforce module boundaries similarly to how it enforces assembly boundaries with assembly aliasing, or using whatever syntax it desired. If you wanted additional metadata stored regarding inter-module dependencies you could use module-targeting attributes, which already exist.

One suggestion I'd make would be to take a page from #595 and allow the module declaration to apply to the entire current source file so that you can avoid a level of indentation:

module LINQ;
namespace System.Collections.Generic;

public static class Enumerable {
   ...
}

@alrz
Contributor Author

alrz commented Feb 5, 2016

Head. Desk.

@HaloFour

HaloFour commented Feb 5, 2016

Actually I take that back. It seems that each additional CLR module would require its own file. I thought that they could be packed into a single file along with the metadata.

@alrz
Contributor Author

alrz commented Feb 5, 2016

@HaloFour They do, because that's the compiler output, but eventually they will be packed into a single file with link.

@HaloFour

HaloFour commented Feb 5, 2016

@alrz You mean al.exe? To my understanding that will produce a new binary containing the assembly manifest for the modules but it only links to the module files which would have to be distributed along with the assembly binary.

@alrz
Contributor Author

alrz commented Feb 6, 2016

@HaloFour It's not impossible to merge them together, though. From CLR via C#:

you can run ILDasm.exe on each of the modules to obtain an IL source code file. Then you can run ILAsm.exe and pass it all of the IL source code files. ILAsm.exe will produce a single file containing all of the types.

@HaloFour

HaloFour commented Feb 6, 2016

@alrz

Yes, via ilmerge or ildasm/ilasm you can merge all of the types into a single assembly, but you lose the module information in doing so. Even if you leave all of the .module directives in each of the IL files it will only use the first that it finds.

So there goes one way of implementing something like this. There are others. You could decorate each type with an attribute to specify their module:

[Module("LINQ")]
public static class Enumerable { }

or have assembly-targeted attributes which list each type and their respective module:

[assembly:  Module("LINQ", typeof(System.Collections.Generic.Enumerable))]
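
A minimal sketch of the per-type variant, assuming a hypothetical `ModuleAttribute` (no such attribute exists in the BCL for this purpose). It only shows that module membership recorded this way survives compilation and can be read back with reflection; `EnumerableSketch`, `ListSketch` and `ModuleMetadata` are illustrative names, and nothing here enforces module boundaries.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical attribute that records which module a type belongs to.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct | AttributeTargets.Interface |
                AttributeTargets.Enum | AttributeTargets.Delegate, Inherited = false)]
public sealed class ModuleAttribute : Attribute
{
    public ModuleAttribute(string name) { Name = name; }
    public string Name { get; }
}

[Module("LINQ")]
public static class EnumerableSketch { }

[Module("Collections")]
public sealed class ListSketch { }

public static class ModuleMetadata
{
    // Groups the full names of an assembly's types by the module name recorded on them.
    // Types without the attribute are ignored.
    public static Dictionary<string, List<string>> TypesByModule(System.Reflection.Assembly assembly)
    {
        var result = new Dictionary<string, List<string>>();
        foreach (var type in assembly.GetTypes())
        {
            foreach (ModuleAttribute attribute in type.GetCustomAttributes(typeof(ModuleAttribute), false))
            {
                if (!result.TryGetValue(attribute.Name, out var types))
                {
                    result[attribute.Name] = types = new List<string>();
                }
                types.Add(type.FullName);
            }
        }
        return result;
    }
}
```

Calling `ModuleMetadata.TypesByModule(typeof(EnumerableSketch).Assembly)` yields one entry per module name, which a compiler or analyzer could in principle consult when checking references against declared dependencies.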

@alrz
Contributor Author

alrz commented Feb 6, 2016

@HaloFour Since it presumably wouldn't use those tools, the compiler wouldn't need to annotate the types and could simply emit the modules right away as CLR modules, but that means each of those languages would have to support this feature. However, I'm not clear about how the compilers should cooperate to produce a single assembly.

@HaloFour

HaloFour commented Feb 6, 2016

@alrz IIRC CLR modules can't be referenced directly since they have no manifest. If the modules aren't meant to survive beyond the final assembly there really isn't a point to using CLR modules, and if your goal is to produce all of these modules and then merge them into a single assembly then there's no reason not to just compile all of the source into a single assembly to begin with. The module boundaries could be enforced internally by having the compiler keep track of which module each type belongs to. If you wanted the module metadata to be retained in the final assembly you'd probably need to use attributes as I mentioned as anything else would be effectively erased by the compilation.

As for supporting modules, IIRC all of the major compilers already support them but they are treated as a single output. You'd need to invoke the compiler(s) multiple times and then have a different process to link/merge the modules. But, again, since you lose the module metadata I don't see much point in going through all of that trouble.

@alrz
Contributor Author

alrz commented Feb 6, 2016

@HaloFour Yup, as I said to @dsaf, that wouldn't really be helpful and is out of scope for this proposal. Multi-language projects would be well structured as long as you have a separate assembly for each language, e.g. functional core, imperative shell, so effectively you would have separate boundaries.

The module boundaries could be enforced internally by having the compiler keep track of which module each type belongs to. If you wanted the module metadata to be retained in the final assembly you'd probably need to use attributes as I mentioned

That was my intention at the very beginning, and yes then attributes would be required.

@Opiumtm

Opiumtm commented Oct 12, 2016

@alrz

Namespaces in C# are meant to be used for organizing code, and you can use using to bring types into scope, but when you use fully qualified names all that goodness goes away, so all you get from namespaces is the prevention of name collisions, which is why they're called namespaces.
In my opinion, isolating types and managing dependencies is something that the language should enforce
To use types declared in other modules, you must write using directives at the top of the module indicating the dependency (types declared in a module can only be referenced from other modules).

Stop here and now! You're proposing things that don't make any developer's life easier, but potentially make it harder.

You propose a feature that would put an additional burden on developers' backs. It looks more like Adolf Hitler's regime, which puts you in a concentration camp if you're not marching together and not shouting "Sieg heil" loudly enough.

types declared in a module can only be referenced from other modules

So if someone publishes a genuinely useful library on NuGet that uses "module" to "strictly manage dependencies", developers would be forced to follow this module paradigm even if they have no other reason to use modules at all.

All you "pure code idealists" should understand one important thing. Code (like politics) is not always pure or pleasant to behold. If a senior developer is writing dirty and unpleasant code, he most probably has serious reasons to do so: a requirement to quickly create a working prototype or a personal utility app, for example. If I wrote all my small but useful utility console apps following a strict pure-code codex, I would hate my development job. Pure and beautiful code makes sense mostly when you write production-grade code in a team. But when you develop small utility pieces of code to make your own life easier, it's utterly irrelevant whether the code is pure and beautiful.

For example, I recently had the task of quickly analyzing a large array of data from a database once, just to get some analytics that would help me develop production code. There was essentially no reason to write pure and beautiful code.

When you clean the toilet, you do not wear a tuxedo.

So do not force developers to wear a tuxedo when they clean the toilets.

@alrz
Contributor Author

alrz commented Oct 13, 2016

@Opiumtm The sentiment here is not as radical as you think. You already need to lay out assembly boundaries with well-defined dependencies. I was proposing a mechanism to make it easier to define these boundaries without going through the complexity of dependency management outside of the language, i.e. package managers.

@Opiumtm

Opiumtm commented Oct 14, 2016

@alrz No, you are proposing rather radical things:

types declared in a module can only be referenced from other modules

@alrz
Contributor Author

alrz commented Oct 14, 2016

@Opiumtm If you want to still be able to use modules from code that is not modularized yet, sure, I agree, that should be possible. The proposal probably needs some polishing, but the chance of this being implemented approaches zero, so I'll leave it at that.

@iam3yal

iam3yal commented Oct 16, 2016

@Opiumtm

Stop here and now! You're proposing things that don't make any developer's life easier, but potentially make it harder.

Proposals aren't meant to be final. :)

Just because you don't like it or think it's too radical, you can't tell people to stop doing things or proposing things; it's for the design team to decide whether it makes sense.

So if someone publishes a genuinely useful library on NuGet that uses "module" to "strictly manage dependencies", developers would be forced to follow this module paradigm even if they have no other reason to use modules at all.

That's a good point, but like I said above, proposals aren't meant to be final.

Pure and beautiful code makes sense mostly when you write production-grade code in a team. But when you develop small utility pieces of code to make your own life easier, it's utterly irrelevant whether the code is pure and beautiful.

I can't see the difference between 10 lines of code and 1000k lines of code in terms of quality, but for me, writing small programs is an opportunity to learn something along the way, which in my opinion has the same value, if not more, than the goal of the program itself.

You still need to maintain this bad/ugly code and it will probably grow over time, so personally, I think that the quantity or size of things should be independent of quality.

Just my 2c.

@gafter
Member

gafter commented Mar 24, 2017

We are now taking language feature discussion in other repositories:

Features that are under active design or development, or which are "championed" by someone on the language design team, have already been moved either as issues or as checked-in design documents. For example, the proposal in this repo "Proposal: Partial interface implementation a.k.a. Traits" (issue 16139 and a few other issues that request the same thing) are now tracked by the language team at issue 52 in https://github.com/dotnet/csharplang/issues, and there is a draft spec at https://github.com/dotnet/csharplang/blob/master/proposals/default-interface-methods.md and further discussion at issue 288 in https://github.com/dotnet/csharplang/issues. Prototyping of the compiler portion of language features is still tracked here; see, for example, https://github.com/dotnet/roslyn/tree/features/DefaultInterfaceImplementation and issue 17952.

In order to facilitate that transition, we have started closing language design discussions from the roslyn repo with a note briefly explaining why. When we are aware of an existing discussion for the feature already in the new repo, we are adding a link to that. But we're not adding new issues to the new repos for existing discussions in this repo that the language design team does not currently envision taking on. Our intent is to eventually close the language design issues in the Roslyn repo and encourage discussion in one of the new repos instead.

Our intent is not to shut down discussion on language design - you can still continue discussion on the closed issues if you want - but rather we would like to encourage people to move discussion to where we are more likely to be paying attention (the new repo), or to abandon discussions that are no longer of interest to you.

If you happen to notice that one of the closed issues has a relevant issue in the new repo, and we have not added a link to the new issue, we would appreciate you providing a link from the old to the new discussion. That way people who are still interested in the discussion can start paying attention to the new issue.

Also, we'd welcome any ideas you might have on how we could better manage the transition. Comments and discussion about closing and/or moving issues should be directed to #18002. Comments and discussion about this issue can take place here or on an issue in the relevant repo.

I am not moving this particular issue because I don't have confidence that the LDM would likely consider doing this.

@gafter closed this as completed Mar 24, 2017

@honza77

honza77 commented Oct 14, 2018

Did anybody consider something like Haskell or ML style modules? I am not sure how they differ from VB.NET modules.

@gafter
Member

gafter commented Oct 15, 2018

Language proposals are now taken on language-specific repositories. For C#, in https://github.com/dotnet/csharplang . For VB, in https://github.com/dotnet/vblang . If you are interested in continuing this proposal/discussion, please repost it on the appropriate repo.

10 participants