-
-
Notifications
You must be signed in to change notification settings - Fork 503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Matsumoto's matroid database and AllMatroids() driver function #37140
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall. The one other comment to consider would be to have an explicit link/reference to the database in AllMatroids
.
Based on suggestions by tscrim
Documentation preview for this PR (built with commit e4e2d24; changes) is ready! 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. LGTM.
…driver function <!-- ^^^^^ Please provide a concise, informative and self-explanatory title. Don't put issue numbers in there, do this in the PR body below. For example, instead of "Fixes sagemath#1234" use "Introduce new method to calculate 1+1" --> <!-- Describe your changes here in detail --> This enables the generation of all matroids of certain number of elements (and, optionally, of certain rank or type). There exist restrictions based on the underlying database (Yoshitake Matsumoto's Database of Matroids: https://www- imai.is.s.u-tokyo.ac.jp/~ymatsu/matroid/index.html). Note that the database files consume ~75MB of space, which drops to ~1.5MB upon a standard compression. <!-- Why is this change required? What problem does it solve? --> <!-- If this PR resolves an open issue, please link to it here. For example "Fixes sagemath#12345". --> <!-- If your change requires a documentation PR, please link it appropriately. --> Partially addresses sagemath#35464 (the type can be set to `paving`). ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. --> <!-- If your change requires a documentation PR, please link it appropriately --> <!-- If you're unsure about any of these, don't hesitate to ask. We're here to help! --> <!-- Feel free to remove irrelevant items. --> - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [x] I have created tests covering the changes. - [x] I have updated the documentation accordingly. URL: sagemath#37140 Reported by: gmou3 Reviewer(s): Travis Scrimshaw
This doesn't work on an installed sagemath because the data files are not being installed. You need to add them to |
Thank you. Can you please check the PR I made for this (#37231)? This issue did not appear for me so I can't witness any functional difference. |
This PR almost doubles the size of
This is not the way databases are handled in sagemath. The normal way would be to make an optional spkg containing the database. The nicer/easier way would be to create a pypi package similar to https://github.com/sagemath/conway-polynomials (btw, in conway-polynomials the actual data files are xz compressed and transparently decompressed as needed) Also, as @antonio-rojas pointed out, this broke |
I am not familiar with your proposed method of resolving this but I could check it out. Would compressing the files down to ~1.5MB be satisfactory? Then a decompression (and possibly a deletion of the decompressed file) would need to occur when the function is called. |
Have a look at https://github.com/sagemath/conway-polynomials. It's an independent python package so it needs some boilerplate but not too much (you could base your package on this, which is quite modern). The advantage of this is that the database can be used completely independent of sage, can have its own release schedule, etc. This would be a package very easy to install using pip or create a distro package (it is for conway-polynomials). What I think would be reasonable is that your package exports a function analogous to the Then your
I still think this is better handled in a separate package. It may seem small, but you have to bear in mind that sage uses a several databases, some standard and some optional, and if we included every database in the source code, it would all add to a lot, not just space but maintaining work, etc. A separate package is easier to maintain and update independently of sagemath (as long as the interface doesn't change) and can be used without sagemath (this might be useful for someone, I don't know).
That's not necessary. The way you use the files, python makes it very easy to switch to reading the compressed file, decompressing on the fly. See the function Most probably you can copy this function I hope this helps. Maybe @orlitzky has something useful to add (he wrote the conway_polynomials python package). |
The |
I also think a separate package is a good idea. A big benefit is that it de-couples the download & upgrade procedures of the thing that changes rarely (the database) from the thing that changes often (the code in sagelib). Users might install the database package once, and then upgrade sagelib twenty times before having to update to the database package. It's nice in that example to not have to re-download and re-install the big database those twenty times. The database link in the original post isn't working for me right now, but another benefit is that a separate package can have a separate license if those restrictions would cause problems for sage itself. |
@vbraun @mkoeppe what's the way to proceed with this PR? Is it ok to add 75M to the source code just like that? No trigger warnings? No sage-devel discussion? This also broke sagemath-standard for a The issue is some files not installed, so in-place editable builds work. I guess testing only covers editable builds? A better way for this is being worked out in #37231, but IMHO this PR should be reverted for the time being. |
I think it was a simple oversight in the review of this PR. I agree that it needs an urgent fix. To me it does not make much of a difference whether to revert the PR or to merge the current version of #37231 (as a hotfix only; not the final state, which should be adding an optional spkg).
So -- thanks to the added database's extraordinary compression ratio -- at least in terms of repo size there's not a problem that would warrant a force push. (I don't know if we should be concerned about other potential adverse effects, perhaps on the speed of some git operations.) |
We have no policy about adding database-type files, much less anything that is documented in our dev guide. As such, there is no mistake/oversight with the review. (Side note: We might want to be careful about the definition of a database. What if I write a bunch of Python functions, one for each object?) The broken builds in certain situations is a problem, but we don’t have any standard testing for this. This is why we have beta releases. A straight reversion of this PR would also set a bad precedent IMO; it should be handled according to our standard practices of a followup PR. We can also have @vbraun wait for the next release until the issue is settled as we are all trying to figure out the best way forward. |
@tscrim It's not like we're trying to charge you a fine for this.
Well, adding 75 MB of source code in one PR would certainly also raise some objections. |
You’re still trying to lay the blame at my feet for this, which is unfair. |
@tornaria That's right. Two things that we might want to do: |
It's true that your feet should not be blamed for any of this! |
This is a minimal PR to allow building with python 3.12, given the circumstances surrounding sagemath#36181. I've tested this works with system python 3.12 by: ``` ./bootstrap ./configure --disable-doc --disable-editable --enable-system-site- packages env 'MAKE=make -j36' make ./sage -tp 36 --all ./sage -tp 36 --all --long ``` It only gave the expected failures in `src/sage/matroids/database_collections.py` from sagemath#37140. Since scipy 1.12 works ok after sagemath#37123 and it's just a one-liner to change the required version, I included it here. No attempt is made at upgrading anything in sage-the-distro. ### 📝 Checklist - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. EDIT: rebased on top of sagemath#36983 to address reviewer suggestion. URL: sagemath#37270 Reported by: Gonzalo Tornaría Reviewer(s): Aliaksei Urbanski, Gonzalo Tornaría, Tobias Diez
Remove matroid database files (`ext_data/matroids`) and use newly created `matroid-database` pypi package instead. Links to `matroid-database`: [pypi](https://pypi.org/project/matroid- database/), [github](https://github.com/gmou3/matroid-database) Outdated: > Compress (via `xz -e`) the matroid database files (~75MB to <1MB). The files are now opened with `lzma.open()` and the file paths are located through `sage.ext_data` and not relative to `SAGE_SRC` or `SAGE_EXTCODE`. These changes attempt to address the issues raised by @tornaria. > > Also, the addition of the matroid database files inside the folder `ext_data` needs to be noted in the file `setup.cfg.m4`. This issue was brought to my attention by @antonio-rojas. > > These follow the merging of sagemath#37140. > > I also took the initiative of reordering alphabetically the list of paths inside `setup.cfg.m4` and of removing the line `ext_data/images/*` which seemed redundant. <!-- Why is this change required? What problem does it solve? --> <!-- If this PR resolves an open issue, please link to it here. For example "Fixes sagemath#12345". --> <!-- If your change requires a documentation PR, please link it appropriately. --> ### 📝 Checklist - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. URL: sagemath#37231 Reported by: gmou3 Reviewer(s): gmou3, Gonzalo Tornaría, Matthias Köppe
Remove matroid database files (`ext_data/matroids`) and use newly created `matroid-database` pypi package instead. Links to `matroid-database`: [pypi](https://pypi.org/project/matroid- database/), [github](https://github.com/gmou3/matroid-database) Outdated: > Compress (via `xz -e`) the matroid database files (~75MB to <1MB). The files are now opened with `lzma.open()` and the file paths are located through `sage.ext_data` and not relative to `SAGE_SRC` or `SAGE_EXTCODE`. These changes attempt to address the issues raised by @tornaria. > > Also, the addition of the matroid database files inside the folder `ext_data` needs to be noted in the file `setup.cfg.m4`. This issue was brought to my attention by @antonio-rojas. > > These follow the merging of sagemath#37140. > > I also took the initiative of reordering alphabetically the list of paths inside `setup.cfg.m4` and of removing the line `ext_data/images/*` which seemed redundant. <!-- Why is this change required? What problem does it solve? --> <!-- If this PR resolves an open issue, please link to it here. For example "Fixes sagemath#12345". --> <!-- If your change requires a documentation PR, please link it appropriately. --> ### 📝 Checklist - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. URL: sagemath#37231 Reported by: gmou3 Reviewer(s): gmou3, Gonzalo Tornaría, Matthias Köppe
Remove matroid database files (`ext_data/matroids`) and use newly created `matroid-database` pypi package instead. Links to `matroid-database`: [pypi](https://pypi.org/project/matroid- database/), [github](https://github.com/gmou3/matroid-database) Outdated: > Compress (via `xz -e`) the matroid database files (~75MB to <1MB). The files are now opened with `lzma.open()` and the file paths are located through `sage.ext_data` and not relative to `SAGE_SRC` or `SAGE_EXTCODE`. These changes attempt to address the issues raised by @tornaria. > > Also, the addition of the matroid database files inside the folder `ext_data` needs to be noted in the file `setup.cfg.m4`. This issue was brought to my attention by @antonio-rojas. > > These follow the merging of sagemath#37140. > > I also took the initiative of reordering alphabetically the list of paths inside `setup.cfg.m4` and of removing the line `ext_data/images/*` which seemed redundant. <!-- Why is this change required? What problem does it solve? --> <!-- If this PR resolves an open issue, please link to it here. For example "Fixes sagemath#12345". --> <!-- If your change requires a documentation PR, please link it appropriately. --> ### 📝 Checklist - [x] The title is concise, informative, and self-explanatory. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. URL: sagemath#37231 Reported by: gmou3 Reviewer(s): gmou3, Gonzalo Tornaría, Matthias Köppe
This enables the generation of all matroids of certain number of elements (and, optionally, of certain rank or type). There exist restrictions based on the underlying database (Yoshitake Matsumoto's Database of Matroids: https://www-imai.is.s.u-tokyo.ac.jp/~ymatsu/matroid/index.html). Note that the database files consume ~75MB of space, which drops to ~1.5MB upon a standard compression.
Partially addresses #35464 (the type can be set to
paving
).📝 Checklist