Splitting different project concerns into separate repos #147

Open
stuartpb opened this issue Feb 7, 2017 · 26 comments
Comments

@stuartpb
Member

stuartpb commented Feb 7, 2017

Now that this project has its own org and isn't just a single repo in a more general org, I'm thinking about splitting its parts into their own repos. For instance:

  • Putting documentation / schema into an "opws-schema" repo, per one of the thoughts in Getting serious about schemas #146.
  • Putting test infrastructure into an "opws-validation" repo? (or maybe letting it live alongside the schemas)
  • Putting policy, practices, stuff like that (Roadmap #100) into an "opws-policy" repo or whatever (opws-procedure? opws-guidelines? opws-constitution? opws-bureaucracy? opws-metaproject?)
  • Putting API runtime stuff into its own repo
  • Putting tools into their own repos

Advantages to this:

  • You don't have cross-concerns mixed in with your issues and commit history and things like that. The commits in the profiles repo would only be about changes to profiles, without having commits for tooling etc. mixed in.
  • Having separate spaces for different concerns lets discussion, you know, grow to fit the space available. There's definitely a bit of a feeling of chaos with the way that everything lives here.

Disadvantages:

  • It makes it so broader concerns require more elaborate git clone structures. For instance, testing would require the tests to check out schemas and (if separate) the testing framework.
  • You don't get cross-pollination as easily. Discussion of profiling, at least right now, is intricately linked to discussion of the next extensions to the schema: having to call out to a separate repo to link these discussions in has a bit of a cooling effect.
  • They'll never be truly separate. Circle CI's always going to require the package.json and circle.yaml etc. to live in this repo, so long as they're dictating the tests for this repo. I mean, these commits can be singled out with something like a "[meta]" prefix, but, yeah.
@stuartpb
Member Author

stuartpb commented Feb 7, 2017

I'm putting this on 0.1.0 because I want to have a clear idea about what's next by the time we hit that point. Right now I've got kind of a gut feeling that splitting everything into separate repos wouldn't be quite the rosy sunbeam it sounds like at first glance, even though that Disadvantages list is kind of weaksauce (my response to each of its points is "I can live with that", more so than my response to the Advantages items is "I don't know that I really care about that").

@stuartpb
Member Author

stuartpb commented Feb 7, 2017

throwing the labels for the concerns this would change onto it, because hey why not

@stuartpb
Member Author

stuartpb commented Feb 7, 2017

I like the idea of making the validator its own project, giving it space to really explore straight-up new features (#122, #123, and more), but, yeah, the infrastructure to have this pull that in or vice-versa, eesh.

I mean, there's git submodules, but, I mean, submodules are yucky and everybody knows it. Do they feel like a good fit here, though?

@stuartpb
Member Author

stuartpb commented Feb 7, 2017

Like... is there a list of "the fundamental traits of submodules you'll need to consider if you're thinking about using them as a solution" to hit for quick reference? Because I'm going over a list like that in my head, and, yeah, like, it really seems like they're not honestly that bad of a fit here, but I'm also seeing the footguns that'd pop up if they got embraced as a panacea:

  • They're a fixed commit reference that has to be updated manually: Not great, but not bad, either. Like, for instance, say the validator starts changing the command line syntax to invoke it: the one commit here that'd update package.json and/or circle.yaml would also be the commit that updates that reference. So submodules make sense.
    • On the other hand, it'd be awful for anything that would be structured around having the data be a submodule. And the alternative, where everything that's less fluid is a submodule of everything that's more fluid...
  • I'm not sure there's a way to only check out a subset of submodules, which would be important if, say, API runtime stuff has to be a submodule of this in order to make deploying a new API slug on every profile update work smoothly - the validation tests shouldn't require the API stuff to work correctly, nor vice versa.
  • How do submodules work re: being recursive? It'd make sense for the validation project to have schemas as a submodule, and for tooling purposes it'd make sense to have the validation repo be a submodule of this one, but...
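For reference on the last two bullets: `git submodule update` accepts explicit paths (so only a subset gets checked out) and a `--recursive` flag (so nested submodules are initialized too). A minimal sketch of building those invocations follows; the submodule paths "validation" and "api" are hypothetical names for illustration, not a layout this repo actually has.

```python
from typing import List, Sequence


def submodule_update_command(paths: Sequence[str] = (), recursive: bool = False) -> List[str]:
    """Build a `git submodule update --init` invocation as an argument list.

    With no paths, git initializes every submodule registered in .gitmodules;
    with paths, only the named submodules are checked out.
    """
    cmd = ["git", "submodule", "update", "--init"]
    if recursive:
        # Also initialize submodules nested inside the selected submodules.
        cmd.append("--recursive")
    if paths:
        # "--" separates options from the list of submodule paths.
        cmd.append("--")
        cmd.extend(paths)
    return cmd


# Hypothetical layout: "validation" and "api" are submodule paths in this repo.
# Validation tests would only need the first one (plus whatever it nests).
validation_only = submodule_update_command(["validation"], recursive=True)
print(" ".join(validation_only))
```

Either way, each submodule is still checked out at the fixed commit recorded in the parent repo, so the manual-update trait from the first bullet is unchanged.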

@stuartpb
Member Author

stuartpb commented Feb 7, 2017

chiming in to note that I have a whole bin/ directory in my old Cloud9 workspace for this project I've never committed because I don't want to weave transient issue script stuff into the project history (since it's just stuff like "what's-going-to-get-affected-by-this-refactor"). So that's an example of the kind of thing that'd be served better in an everything-gets-its-own-repo framework.

@stuartpb
Member Author

stuartpb commented Feb 8, 2017

Right now, I'm really leaning toward stuff like tests just incorporating the test framework etc. with a git clone command in a script rather than coming up with any submodule magic.
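A minimal sketch of what that script step could look like, assuming a shallow clone is enough for running tests. The repository URL, destination directory, and function names here are hypothetical placeholders, not anything that exists in this project.

```python
import subprocess
from typing import List, Optional


def clone_command(repo_url: str, dest: str, ref: Optional[str] = None) -> List[str]:
    """Build a shallow `git clone` invocation for a test dependency."""
    cmd = ["git", "clone", "--depth", "1"]
    if ref is not None:
        # --branch accepts a branch or tag name, which pins what gets tested.
        cmd += ["--branch", ref]
    cmd += [repo_url, dest]
    return cmd


def fetch_dependency(repo_url: str, dest: str, ref: Optional[str] = None) -> None:
    """Run the clone; raises CalledProcessError if git exits non-zero."""
    subprocess.run(clone_command(repo_url, dest, ref), check=True)


# Hypothetical URL and directory, shown without actually touching the network.
example = clone_command("https://example.com/opws-validation.git", "deps/validation")
print(" ".join(example))
```

Unlike a submodule, nothing in this repo records which commit was cloned unless a ref is passed, so the tested version floats with the other repo's default branch.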

@stuartpb
Member Author

stuartpb commented Feb 9, 2017

Another thing that'd be nice about having a separate repo for the schema: it'd be a dedicated place for all labels to be about the schema, so it wouldn't require as much qualification to have labels for concerns of the schema, like "cookies" or "subdomains" or "legacies" or "password reset".

@stuartpb
Member Author

stuartpb commented Feb 9, 2017

Note that, if this project is migrated to separate repos, I think this repo should be kept intact, to sustain legacy links like issues and such, and a new repo should be created for the profiles or data (since, remember, whatever repo succeeds this will contain legacies as well as profiles, per #43). The new repo for profiles can keep the same history from this one - embarrassing and messy as it is - but, after the hop happens, this repo should get a gravestone commit that directs to the new home (the way I think node's legacy repo does it).

@stuartpb
Member Author

stuartpb commented Feb 9, 2017

Also, I kind of like having the validator live alongside the schemas, as validation is kind of another layer that has to live beyond schemas (for instance, WILDCARD profiles have tests for consistency between the distinguish object, the filename, and every url, checks that would be out-of-scope for a schema).

This was referenced Feb 9, 2017
@stuartpb
Member Author

stuartpb commented Feb 9, 2017

Also, I'm thinking templates will be split out of CONTRIBUTING.md and linked to, as files that live alongside the schema and validation, and get tested on any schema update to ensure that the template content is still valid.

@stuartpb
Member Author

I just remembered what the plural for "schema" is, so I've started the repo that I'll be developing #146 in: https://github.com/opws/opws-schemata

@stuartpb
Member Author

Re: keeping this repo at this name for legacy reasons, but copying the history to a new repo - wouldn't that break issue references in the commits? It seems like the better strategy would be to just rename this repo and depend on GitHub's redirect mechanisms.

@stuartpb
Member Author

I think the future name for this repo will be opws-dataset.

@stuartpb
Member Author

stuartpb commented Feb 17, 2017

So, basically, here's the timeline:

  • Split the non-dataset components of this repo (validation, schema, docs) out to separate repo(s), removing them from here.
  • Rename this repo.
  • Migrate open non-dataset-centric issues to new repo(s) by opening corresponding issues in the new repo, linking to the old one in the new one, attaching a "migrated" label to the old issues (and maybe a "legacy" label to the new ones), and closing (and maybe locking) the old issues (after linking to the new equivalent).
  • Migrate the "Schema Structure" section of the Style Guide on the wiki to a "Style Guide" in a wiki (or docs?) for the schema repo.
  • Integrate the repos, i.e. so a commit here checks out the schema / validator and runs validation in CI.
  • Require review for PR merges to master in GitHub branch protection settings.
  • Ship v0.1.0.
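The issue-migration step in the timeline above has enough moving parts that it may help to see it as data. This is only a sketch of the bookkeeping, under the assumption that issues are handled one at a time; the `Issue` type, the dictionary keys, and the example URLs are all hypothetical and do not model any particular API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Issue:
    number: int
    title: str
    body: str
    url: str
    labels: List[str] = field(default_factory=list)


def plan_migration(old: Issue) -> Dict[str, dict]:
    """Describe the two changes needed to migrate one issue."""
    new_issue = {
        "title": old.title,
        # Link back to the old issue from the new one.
        "body": f"Migrated from {old.url}\n\n{old.body}",
        "labels": old.labels + ["legacy"],
    }
    old_update = {
        "labels": old.labels + ["migrated"],
        "state": "closed",
    }
    return {"create_in_new_repo": new_issue, "update_in_old_repo": old_update}


def closing_comment(new_issue_url: str) -> str:
    """Comment to leave on the old issue once the new issue's URL is known."""
    return f"Migrated to {new_issue_url}"


# Hypothetical issue, purely for illustration.
old = Issue(1, "Example title", "Example body", "https://example.com/old/issues/1", ["schema"])
plan = plan_migration(old)
print(plan["create_in_new_repo"]["body"])
print(closing_comment("https://example.com/new/issues/1"))
```

The closing comment is kept separate because the new issue's URL only exists after it has been created, which fixes the order: create the new issue, comment on the old one, then label and close it.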

@stuartpb
Member Author

One of the reasons this'll work smoothly is that the repos for the rest of the components will be starting from scratch (i.e. I'll be throwing away most of the test infrastructure that's present now, and the stuff in the new repos will be somewhere between partially-copy-pasted and all-new). If I had multiple components that were moving on with their histories into separate repos, this'd get hairy (and I'd have to do some trickery like http://stackoverflow.com/questions/25224258/how-can-i-make-a-a-subdirectory-in-my-github-project-into-a-new-repository/25224259#25224259).
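For a sense of what that kind of trickery involves, here is a sketch using `git subtree split`, which is one commonly used way to extract a subdirectory along with its history. This is a swapped-in technique for illustration and not necessarily the recipe in the linked answer; the directory, branch, and remote names are hypothetical.

```python
from typing import List


def subtree_split_command(prefix: str, branch: str) -> List[str]:
    """Build a `git subtree split` call that rewrites the history of one
    subdirectory onto a new local branch, with that directory as the root."""
    return ["git", "subtree", "split", f"--prefix={prefix}", f"--branch={branch}"]


def push_split_command(remote_url: str, branch: str, target: str = "master") -> List[str]:
    """Build a `git push` that publishes the split branch to a new repository."""
    return ["git", "push", remote_url, f"{branch}:{target}"]


# Hypothetical names: a "tests" directory moving to its own repository.
split = subtree_split_command("tests", "split-tests")
push = push_split_command("https://example.com/opws-validation.git", "split-tests")
print(" ".join(split))
print(" ".join(push))
```

Each component that keeps its history would need its own split and push, which is where this gets hairy once several components are involved.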

@stuartpb
Member Author

stuartpb commented Mar 3, 2017

I'm working right now on splitting the docs out to the schemata; everything else is ready.

@stuartpb
Member Author

stuartpb commented Mar 3, 2017

Things are going to get a little bumpy after the switch, because there's one more change I want to make to the schema before shipping v0.1.0, but I'm confident it'll be weatherable.

@stuartpb
Member Author

stuartpb commented Mar 3, 2017

Also, "shipping v0.1.0" is going to take the form of a commit that adds a SCHEMA_VERSION file to the root of master here with the contents v0.1.
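A sketch of how tooling might read that file, assuming it holds a single line of the form `v0.1` at the repo root. The function names and the strictness of the parsing are assumptions for illustration, not a description of the actual validator.

```python
import re
from pathlib import Path
from typing import Tuple


def parse_schema_version(text: str) -> Tuple[int, ...]:
    """Parse contents like "v0.1" into a tuple of integers such as (0, 1)."""
    match = re.fullmatch(r"v(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized schema version: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def read_schema_version(repo_root: str) -> Tuple[int, ...]:
    """Read and parse the SCHEMA_VERSION file at the root of a checkout."""
    return parse_schema_version(Path(repo_root, "SCHEMA_VERSION").read_text())


# Trailing newlines are tolerated, since most editors add one.
version = parse_schema_version("v0.1\n")
print(version)
```

Returning a tuple means versions compare numerically, so (0, 10) sorts after (0, 9) where the raw strings would not.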

@stuartpb
Member Author

stuartpb commented Mar 3, 2017

Well, #300 just shipped v0.1.0 as far as I'm concerned, so it'd seem the timeline laid out in #147 (comment) is a little bit scrambled: I did the "split" (which, after having written the validator and schema, was mostly just an entire evening to rewrite the documentation, per neue-#111), then I did the integration (I mean, why would I split stuff out to other repos and not do that integration?), then I renamed this repo (though I honestly kind of wanted to do the rename before the split, but I felt like it had to be post-split for the name to really ring true), then I did a bunch of stuff to acclimate to the new name (#296), then I shipped v0.1.0, and now I'm migrating the issues to separate repos (though I kind of did some of that after creating the opws-guidelines repo, which I did before cutting v0.1).

As for requiring review... I'll look into it.

@stuartpb
Member Author

stuartpb commented Mar 3, 2017

Yeah, pull requests are already effectively blocked on review, since I'm the only one who can merge changes anyway. If I'm still the primary contributor, and I intend to commit unilaterally, adding a review requirement doesn't really get me anything except making the merge button harder to use, so, for now, I'm not enabling reviews.

If this project gets contributor traction, then maybe.

@stuartpb stuartpb modified the milestones: v0.1.0, v0.2.0 Nov 12, 2017
@stuartpb
Member Author

Kicking this milestone since most of what was pertinent for 0.1.0 was already done, and all that's left is re-filing issues in relevant repositories (and maybe coming up with a repo for API stuff).

@stuartpb
Member Author

Reminder that the project wiki exists, and should be used as a basis for some of this splitup:

  • The "Style Guide" and "Profiling Guide" should be used as the basis for documents in opws/opws-guidelines
  • The "History" page, I'm thinking, should be moved to a new "opws/opws-news" repository (which might then become news.opws.org?) Something that makes it easy to keep track of "wait, what the heck, when did this whole massive cross-repository change happen"

@stuartpb
Member Author

There's also the question of "what about using a wiki going forward?" And, tbh, I'm against it. I prefer having issues and pull requests to ponder changes and merge bubbles / commit messages to note that they happened - this is especially useful for compiling the aforementioned history. Changes to guidelines can be just as important as changes to schemata.

@stuartpb
Member Author

stuartpb commented Nov 13, 2017

One thing I'm realizing right now is that a "news" repo would be an excellent place for "roadmap" issues like "helper library for using profiles" - the issue could stand for "post some news when this is ready". ("news" is also an improvement on "roadmap" in general - there are many times during this project where I had a whole plan roadmapped out, then went veering wildly away from it almost immediately. Better to report on what actions have been taken with a future in mind than on what plans exist absent any buy-in from reality.)

@stuartpb
Member Author

> chiming in to note that I have a whole bin/ directory in my old Cloud9 workspace for this project I've never committed because I don't want to weave transient issue script stuff into the project history (since it's just stuff like "what's-going-to-get-affected-by-this-refactor").

Note that I am now committing these scripts at https://github.com/stuartpb/opws-checklisters

@stuartpb
Member Author

#308 describes the most significant task remaining before this issue will be closed.
