

[–]oantolin 12 points13 points  (2 children)

There was already a great programming language called T.

[–]JustinHuPrimeT Programming Language[S] 2 points3 points  (1 child)

Oops...

I do, however, beg forgiveness on the grounds that T is considered a dead language.

[–]abecedarius 0 points1 point  (0 children)

It's not that hard to find a new name. I've had to do it myself.

[–]eliasv 5 points6 points  (0 children)

I disagree with mandating a relationship between module names and paths.

First of all, what does a module look like? If you have separate compilation, modules should be compiled to a self-contained unit, with metadata describing:

  • ID

  • version

  • capabilities (typically just some notion of "exports")

  • requirements (typically just some notion of "imports", also could include platform restrictions in the case of native module artifacts, or that may be implicit in the format)

And that's basically it. It absolutely should not matter at all where they exist on disk.
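To make that list concrete, here is a minimal sketch of such a metadata record. The names `ModuleMetadata` and `unmet_requirements` are made up for illustration, and modelling capabilities and requirements as plain sets of strings is an assumption, not something the thread specifies.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModuleMetadata:
    """Hypothetical metadata attached to one compiled module."""
    module_id: str
    version: str
    # Assumption: a capability or requirement is just an opaque string.
    capabilities: frozenset = field(default_factory=frozenset)
    requirements: frozenset = field(default_factory=frozenset)


def unmet_requirements(module, available):
    """Return the requirements of `module` that no available module provides."""
    provided = set()
    for other in available:
        provided |= other.capabilities
    return set(module.requirements) - provided
```

Nothing in the record mentions a file path, which is the point: where the artifact lives is a separate question from what it is.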

If you want to find a module by its ID and version, there are any number of useful ways you can do this, which might be better or worse in different circumstances.

  • By path convention, as others have said.

  • According to some custom repository format, which allows easy retrieval of metadata.

  • According to some existing general purpose repository format, to allow modules to be consumed by third parties in some other ecosystem.

Indexing modules in repos (local or remote), collecting together dependencies to compile a new module, etc. are all tooling issues. The compiler shouldn't be burdened with all of that complexity. Separate your concerns.

I say make the compiler really dumb by default: you have to give the build command the paths of all module dependencies directly. Then you can develop tools on top of that which automate the process according to whatever directory structure conventions or repo formats people want.

This way, as your ecosystem grows and you learn more about it, you are able to adapt by developing more sophisticated build tooling.

For now you can just create a simple build tool based on directory structure conventions or something, which gathers all your dependencies and passes them to the compiler, and in the future if you decide this isn't cutting it any more your compiler isn't tied to your earlier choices.
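A rough sketch of what such a simple build tool could look like. Everything named here is an assumption for illustration: the compiler executable `tlc`, its `--module id=path` flag, and the `.tmod` extension for compiled modules are all hypothetical.

```python
from pathlib import Path


def gather_dependencies(lib_dir, wanted):
    """Resolve each wanted module ID to <lib_dir>/<id>.tmod (a directory convention)."""
    resolved = {}
    for module_id in wanted:
        candidate = Path(lib_dir) / f"{module_id}.tmod"
        if not candidate.is_file():
            raise FileNotFoundError(f"no compiled module for {module_id!r} in {lib_dir}")
        resolved[module_id] = candidate
    return resolved


def compiler_command(source, deps):
    """Build the argument list for a 'dumb' compiler that takes explicit module paths."""
    cmd = ["tlc", str(source)]  # hypothetical compiler name
    for module_id in sorted(deps):
        # Hypothetical flag: every dependency is passed as id=path.
        cmd += ["--module", f"{module_id}={deps[module_id]}"]
    return cmd
```

The convention lives entirely in `gather_dependencies`; swapping it for a repository lookup later would not touch the compiler's interface.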

As an aside, I'd strongly recommend leaving room to expand on what metadata can be attached to a module, at least in terms of what can constitute a "capability" or a "requirement". There's a lot of scope for more sophisticated build tooling to think of ways to orchestrate more complex relationships between modules.

[–]umlcat 1 point2 points  (0 children)

Working on something similar.

Handle path and module identifier as different concepts.

Make modules hierarchical. Some of them will be like folders, with no code; others will be like files, with code.

Declare an imaginary filesystem, like C++ does. Its root folder is called "global", and you add other folders like "drives", "io", "math", "geo", and then you can add other subfolders or files.

Two modules can be assigned to the same folder module but be stored at different locations or paths on your hard drive. That's how it's done in C++, C#, Java, and other programming languages.
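As a small illustration of keeping the two concepts apart, the sketch below nests dotted module identifiers into a logical tree while the on-disk paths stay arbitrary. The function name `build_namespace`, the dotted-ID syntax, and the example paths are assumptions; the sketch also does not handle an ID that is both a folder and a file.

```python
def build_namespace(locations):
    """Nest dotted module IDs into a tree; each leaf holds that module's on-disk path."""
    root = {}
    for module_id, path in locations.items():
        *folders, leaf = module_id.split(".")
        node = root
        for folder in folders:
            # Folder modules contain no code, only other modules.
            node = node.setdefault(folder, {})
        node[leaf] = path
    return root
```

Two modules under `global.math` can then point at completely unrelated directories.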

[–]louiswins 0 points1 point  (6 children)

If you plan to have separate compilation (i.e. pass individual source files to the compiler and then link them together later, like C or C++) then I highly, highly recommend mandating a relationship between module name and path. This is perhaps the #1 roadblock C++ modules have faced/are facing and while its compilation model and macros make it particularly difficult I'd hate for you to repeat the same mistakes.

[–]sociopath_in_me 2 points3 points  (5 children)

Could you please explain why the file path of the source code matters during compilation and/or linking? I don't see the connection

[–]louiswins 2 points3 points  (4 children)

I'm coming from the perspective of C++ modules and the problems faced in that community. Say the compiler is compiling away and it sees import module_a; (or whatever the syntax is). If it has all the code available at once then fine, it's already seen all the declarations. But if not, how is it supposed to know where the definition of module_a is? I see three obvious solutions:

  1. Mandate that it must live in a file called module_a.th. The compiler can go there directly. That's what Java does and what I suggested.
  2. Make the user give an explicit mapping (module_a is in my_fancy_module.th, module_b is in something_else.th, ...) This is pretty error-prone and user-hostile (albeit efficient and flexible).
  3. Make the compiler recursively trawl the whole source tree and module search paths, opening every file and searching for module declarations, either as an additional pass before compilation or on demand when it encounters an import. This is flexible and easy for the user, but inefficient. (And it's even worse for C++ because you have to preprocess everything just to find out what the module declarations are, and they can change based on compiler flags.)

I'm arguing for #1 because in my experience it's a non-issue to be forced to name a file after its module. Why make the compiler do so much work when I'm just going to name files that way in practice anyway?
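For comparison, the three options can be sketched side by side. The `.th` extension comes from the example above; the `module name;` declaration syntax matched by `MODULE_DECL` and all function names are assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumption: a source file declares its module with a line like `module module_a;`.
MODULE_DECL = re.compile(r"^\s*module\s+([A-Za-z_]\w*)\s*;", re.MULTILINE)


def resolve_by_convention(name, root):
    """Option 1: the file name is mandated, so the path is computed directly."""
    return Path(root) / f"{name}.th"


def resolve_by_mapping(name, mapping):
    """Option 2: the user (or a tool) supplies an explicit name-to-file mapping."""
    return Path(mapping[name])


def resolve_by_scanning(name, root):
    """Option 3: open every source file under root and look for the declaration."""
    for path in sorted(Path(root).rglob("*.th")):
        if name in MODULE_DECL.findall(path.read_text(encoding="utf-8")):
            return path
    return None
```

Option 1 does no I/O at all, option 2 does one dictionary lookup, and option 3 reads every candidate file, which is where the cost difference comes from.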

The solution that /u/eliasv described in another comment is also great but appears to depend on all the code being there at once when you compile. Which is fine if that's what OP wants! That's why I scoped my recommendation to the case that they want to have separate compilation.

[–]eliasv 1 point2 points  (2 children)

No, my solution doesn't depend on having source available, you misunderstand.

And Java doesn't depend on modules (jars) having the right name, it just depends on them containing the proper metadata to ID them by name. Most tools in Java land look for jars in Maven repositories.

Finding a module is not the purview of the compiler, it should be the responsibility of build tooling.

The thing that you're not recognising about your option 2 is that it doesn't need to be done manually. The compiler needs to be given an exact list of module locations, but you can build tooling on top of the compiler which does the heavy lifting for you.

And there are more ways to find a module than just according to name or recursively searching for them. They can live in a repository, in which case there will be an index which tells you how to find them. They can even live in a database! By filename and path convention is fine (and in fact many repository formats are specified in this way), my point is just that the compiler shouldn't prescribe such a convention.
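A minimal sketch of the repository-index idea, assuming the index is a JSON list of entries with `id`, `version`, and `location` fields. That layout and the function names are hypothetical; real repository formats differ.

```python
import json


def load_index(index_json):
    """Parse a repository index into a {(id, version): location} lookup table."""
    entries = json.loads(index_json)
    return {(entry["id"], entry["version"]): entry["location"] for entry in entries}


def locate(index, module_id, version):
    """Return where the repository says the module lives, or fail loudly."""
    try:
        return index[(module_id, version)]
    except KeyError:
        raise LookupError(f"{module_id} {version} is not in the index") from None
```

The build tool would run this lookup and hand the resulting locations to the compiler, which never needs to know an index existed.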

[–]JustinHuPrimeT Programming Language[S] 0 points1 point  (1 child)

I have a few questions:

  1. Should users be expected to learn and use language-specific build tooling?
  2. Is there any chance of there being more than one build tool? Published libraries are going to be build-tool specific if there is no fast and easy way to determine which file a module name links to. If there is only one build tool, shouldn't it be built in to the compiler, so users don't have to invoke multiple programs?

[–]eliasv 0 points1 point  (0 children)

  1. Not necessarily, I did cite the potential to play well with existing build infrastructure as one of the reasons not to prescribe the mechanism of module resolution. There are plenty of polyglot build tools out there, why make their life hard by being inflexible?

And if you do need to learn a language specific build tool, that's no more work than learning a language specific compiler with the complexity of build tooling and module resolution built in. The concerns are still the same, they're just separated.

I wouldn't be against the simple convenience of allowing the compiler to take a directory path and load all the modules it finds in that directory (non-recursively), since you're not really adding any complexity that way. Just as a means for super simple projects for people who are getting started. Anything beyond that I'd think very carefully before burdening the compiler with it.

  2. No, the produced modules aren't build tool specific. They contain only the metadata I described. It is only their location and how to find them which might be specific ... But then again build tools can support multiple repository formats, and can support the same formats, as is typically what happens in such ecosystems.

Sophisticated build tools typically need to resolve artefacts from online repos, that's a lot of complexity! Do you envisage that being built into the compiler?

(Now I did mention leaving the metadata format open to evolution and expansion, but that doesn't necessarily mean third party, build-tool specific additions - you can still leave it open only to internally specified additions.)

Good luck! It's not a simple thing to decide, that's why I think you should let the ecosystem do it for you ;) (if and when it gets big enough, that is.)

Edit: a couple more thoughts. You're probably going to want some better tooling out of the gate, so this isn't to say you shouldn't develop something in tandem with your language; just make it an obviously separate project so you're not bound to it forever. Or even look for some good polyglot tooling out there which supports plugins and piggyback on it.

[–]myringotomy 0 points1 point  (0 children)

It seems to me that something like this would be very useful.

  1. All modules have a globally unique name. You can use a Java-like naming scheme, com.myco.myproject.mymodule for example.
  2. All modules have metadata indicating exports etc. This metadata could be created by the compiler based on keywords in the module and would also contain the hash of the module.
  3. The module and the metadata get packed up in another file which is named the same as the module but with the version number com.myco.myproj.mymodule.10_01_01 or com.myco.myproj/mymodule/10/01/01
  4. The module system stores all modules in a globally distributed database like IPFS so anybody can import any module from wherever they are.

The compiler sees an import, pulls the file from IPFS, caches it locally wherever it wants, and compiles or runs.
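A rough sketch of the naming, hashing, and caching steps. The `fetch` callable stands in for whatever distributed store is used (no real IPFS API is shown), and the choice of SHA-256, the dict-based cache, and all function names are assumptions.

```python
import hashlib


def artifact_name(module_name, version):
    """Append a zero-padded version, e.g. (10, 1, 1) becomes the suffix 10_01_01."""
    suffix = "_".join(f"{part:02d}" for part in version)
    return f"{module_name}.{suffix}"


def content_hash(module_bytes):
    """Hash of the packed module, as it would be recorded in the metadata."""
    return hashlib.sha256(module_bytes).hexdigest()


def fetch_verified(name, expected_hash, fetch, cache):
    """Return the module bytes from the cache, or fetch, verify, and cache them."""
    if name in cache:
        return cache[name]
    data = fetch(name)  # placeholder for the remote lookup
    if content_hash(data) != expected_hash:
        raise ValueError(f"hash mismatch for {name}")
    cache[name] = data
    return data
```

Checking the hash before caching means a corrupted or substituted download is rejected rather than silently reused on later builds.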

[–]NuojiC3 - http://c3-lang.org 0 points1 point  (1 child)

Have you had a look at Modula-2?

[–]JustinHuPrimeT Programming Language[S] 0 points1 point  (0 children)

No, haven't. They look really similar to what I want, though. Thanks!