Reducing the size of the repository #170

Open
egekorkan opened this issue Sep 28, 2021 · 8 comments

Comments

@egekorkan
Contributor

The testing repository has now reached 1 GB, and it is only going to get worse over time. Below is a scan of the repository showing how much space each folder takes. The red part above is the events folder. The blue part below is the .git folder, which contains the history of previous commits. I guess this folder is so big because of PDFs in previous commits. There are also multiple PDF and PPTX files of more than 10 MB in the events folder.

[image: scan of the repository showing the size of each folder]
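A scan like the one described above can be approximated with a short script. This is only a minimal sketch using the Python standard library; the function names `folder_sizes` and `large_files` are made up for illustration, and the 10 MB default simply mirrors the figure mentioned in the comment.

```python
import os


def folder_sizes(root):
    """Return {top-level entry name: total size in bytes} for entries in root."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            # os.walk does not descend into symlinked directories by default.
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    if not os.path.islink(path):
                        total += os.path.getsize(path)
            sizes[entry.name] = total
        elif entry.is_file(follow_symlinks=False):
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


def large_files(root, limit=10 * 1024 * 1024):
    """Return (size, relative path) pairs for files above limit, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            size = os.path.getsize(path)
            if size > limit:
                found.append((size, os.path.relpath(path, root)))
    return sorted(found, reverse=True)


if __name__ == "__main__":
    # Scan the current directory; hidden folders such as .git are included.
    for name, size in sorted(folder_sizes(".").items(), key=lambda kv: -kv[1]):
        print(f"{size / 1e6:10.1f} MB  {name}")
    for size, path in large_files("."):
        print(f"large file: {size / 1e6:.1f} MB  {path}")
```

Run from the root of a clone, this lists the top-level folders by size and then every file above the limit.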

@egekorkan changed the title from "Reducing the" to "Reducing the size of the repository" Sep 28, 2021
@egekorkan
Contributor Author

There are also a lot of stale branches. See https://github.com/w3c/wot-testing/branches/stale . I recommend deleting them.

@mmccool
Contributor

mmccool commented Sep 28, 2021

Yeah, I noticed this recently too when moving to a new machine where I had to clone a fresh copy of this repo... We could certainly do some cleanup, e.g. deleting stale branches. But at some point (e.g. for the next plugfest; not this one) we should just set up a new repo and archive this one.

@mmccool
Contributor

mmccool commented Sep 28, 2021

Seems there is nothing to lose by deleting stale branches so I'll do that now. (DONE)

@egekorkan
Contributor Author

We can set up a new repo, of course, but this means that we will need to think about moving the issues, and any links to the original repo will need to be updated. Even then, I would say we cannot keep on doing this. From what I see, the PDF and PPTX files are the main problem. We should simply not allow their submission and only allow linking to them. Regarding images, they should be uploaded to the user content (like the screenshot above).
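One way to enforce a "link, do not commit" rule would be a small check run in CI or as a pre-commit hook. The sketch below is an assumption about how such a check could look, not an existing part of this repository: the name `blocked_files` and the exact suffix list are hypothetical.

```python
import sys
from pathlib import PurePosixPath

# Assumed list of binary formats to reject; adjust to whatever the group agrees on.
BLOCKED_SUFFIXES = {".pdf", ".ppt", ".pptx"}


def blocked_files(paths, blocked=BLOCKED_SUFFIXES):
    """Return the paths whose file extension is in the blocked set."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() in blocked]


if __name__ == "__main__":
    # Paths are passed as arguments, e.g. the names of the files in a commit.
    offenders = blocked_files(sys.argv[1:])
    for path in offenders:
        print(f"not allowed, please link to it instead: {path}")
    sys.exit(1 if offenders else 0)
```

The list of changed files could come, for example, from `git diff --cached --name-only` in a pre-commit hook; a non-zero exit status then rejects the commit.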

@danielpeintner
Contributor

Question: I am a bit unsure which problem we are trying to solve.

I agree that the repo is becoming large (and even larger with the history of binary files, for which a diff is not really possible). However, I have not experienced a big downside either. Yes, cloning takes longer, but as far as I can tell it is still acceptable.

I thought there would be a way to prune the history, let's say for content that is older than 2 years or so, but it seems there isn't.
A full "clear" is possible, though:
https://gist.github.com/stephenhardy/5470814
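Before clearing anything, it may help to check the guess made earlier in the thread that old commits contain large binaries. The following is a rough sketch that combines the standard `git rev-list --objects --all` and `git cat-file --batch-check` commands; the helper names `parse_batch_check` and `largest_blobs` are hypothetical, and git is assumed to be available on the PATH.

```python
import subprocess

# Output format requested from git: "<type> <sha> <size> <path>"
BATCH_FORMAT = "%(objecttype) %(objectname) %(objectsize) %(rest)"


def parse_batch_check(output):
    """Return (size, path) pairs for blobs, largest first."""
    blobs = []
    for line in output.splitlines():
        parts = line.split(" ", 3)
        # Skip commits, trees, and anything malformed.
        if len(parts) < 3 or parts[0] != "blob":
            continue
        path = parts[3] if len(parts) == 4 else ""
        blobs.append((int(parts[2]), path))
    return sorted(blobs, reverse=True)


def largest_blobs(repo_dir, top=10):
    """List the largest blobs reachable from any ref, including old commits."""
    objects = subprocess.run(
        ["git", "rev-list", "--objects", "--all"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    sizes = subprocess.run(
        ["git", "cat-file", f"--batch-check={BATCH_FORMAT}"],
        cwd=repo_dir, input=objects, check=True, capture_output=True, text=True,
    ).stdout
    return parse_batch_check(sizes)[:top]


if __name__ == "__main__":
    for size, path in largest_blobs("."):
        print(f"{size / 1e6:8.1f} MB  {path}")
```

Files that show up here but no longer exist in the working tree are taking space only in the .git folder.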

@mmccool
Contributor

mmccool commented Sep 29, 2021

My suggestion is that we create a new repo ONLY for things needed for CI, and keep that one small (no PDFs or PPTs) and keep this one for big things and archival.

@mmccool
Contributor

mmccool commented Sep 29, 2021

From Ege: we could also set up a separate archival system for really old things.

@FadySalama
Contributor

We should really think about this, as it makes sense to archive older things.
