-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Skip is_dirty check from matlab tools #48
Conversation
This seems generally fine. @jeanconn - do you understand the root cause of the index being touched? I ask because I am unable to reproduce the problem in my Windows VM in ska3. But I do note that my This is more-or-less repeating your earlier point that this might be an opportunity to fix some underlying problems in the MATLAB Ska core installation. |
I thought the root cause of the index being touched is that gitpython is configured such that it is doing the same thing as "git status" and using optional locks and thus does modify the index when is_dirty is used. I found git status did the same thing in #44 (comment) unless --no-optional-locks was configured but I was strictly looking at the svn status not a shasum. As such, I didn't think this was caused by broken git -- I thought this was expected behavior. I'll try to find a non-svn way to show the behavior in windows. |
This is just with my windows miniconda git
It looks like it is "stable" at that shasum after getting status in this directory, so I suppose for an overall underlying reason for an issue that probably speaks to @jskrist's thought that maybe doing something else before checking in the SVN would make sense "If it is, perhaps @matthewdahmer could make sure the aca model is always run before he commits the thermal models to the SVN repository to resolve this issue.". I don't know what state information the index file stores or what state the chandra_models-3.49 repo is in and it is a binary file so it isn't really trivial to ask the file. If I just fetch a clean chandra_models repo from git and run "git status" on that, there are no changes to the index. I just attached the index of chandra_models-3.49 zipped in this comment. |
So I think we want to fix git in the matlab installation such that "git clone" works, but I don't think that fix would fix this. |
I'm wondering now if the index file stores a hash of the location of the repo in it. That path would point to the creators path, then get updated when a status is run on the repo? That would explain the behavior I've seen so far. |
I don't see a full path in there just looking for strings in the index (before or after git status). |
From various googlings, looking at https://learn.microsoft.com/en-us/archive/msdn-magazine/2017/august/devops-git-internals-architecture-and-index-files and the text "Next are five 4-byte values (device, inode, mode, user id and group id) of file-attribute metadata related to the host OS." it would not surprise me if git wrote back some data related to user and host into the file with git status and then maybe just doesn't report it in a useful fashion with "git ls-files -s" which is what I'm looking at for git's display of the index. |
@jeanconn - one thing that strikes me is your git version of https://github.com/sot/skare3/blob/2023.1/pkg_defs/ska3-core/meta.yaml I'm not sure if this version issue is at all related to index problems, but it seems worth understanding. |
Bingo, this looks quite promising since I'm able to reproduce it on Mac and Windows.
And as @jeanconn noted this appears to be stable after the first update.
YES! It looks like a simple |
And I've been going back and forth about whether we should still merge this PR even when the process issue gets fixed. On the one hand @jeanconn noted that the SVN checksum should provide visibility into any issue in the chandra_models repo. On the other hand, if the repo really is dirty, then running any flight tools using the dirty repo is extremely bad (hence the original idea of always raising an exception). Thoughts? |
Regarding "I'm not sure if this version issue is at all related to index problems, but it seems worth understanding." Sorry, I should have used more words. When I said "This is just with my windows miniconda git", I meant, that was with a git in git-bash unrelated to the matlab python environment. So the index issue was reproducible without needing to have the same git as in the matlab python environment (which you can get from svn). |
Regarding "YES! It looks like a simple git status before the SVN checkin should be sufficient?", I don't think we've learned that? IF it looked like we got the same checksum after git status, I'd think that was a possibility, but I haven't seen that we have a matching stable checksum yet. So if the issue is "full path / inode" or "host user" that is the thing saved in the index, then running git status before svn checkin is still going to solve the problem only for the user who does that status and check in. |
We've got a little bit of the problem now that we're training the fot users to ignore the SVN warning they are seeing. So that isn't great. Overall I don't see that it is likely that a fot user will have a custom chandra models without being explicit about it. But if it is a concern, like I said in the other thread, "git status --no-optional-locks" didn't write back to the index for me in my first experiments, but there was no equivalent to that flag for "is_dirty" and is_dirty did not respect the environment variable to do that. So another option would be to do this check using subprocess calling "git status" and parsing its output for example. I don't think it is quite worth it. (update - and I can't find --no-optional-locks in the git-bash git, so maybe it isn't available on windows git anyway?) Another option would be to update the matlab tools do more than warn on the thermal models svn being modified. |
I agree with @jeanconn that this the index file will likely get updated for each individual host/user, even if we run a git status before committing it to SVN. One total "hack" we could potentially do is to:
wrap that series of steps into a helper function and comment it well. We could also rely on SVN to warn users that the chandra models is modified, instead of doing that check via python, but as @jeanconn mentioned, we are currently training people to ignore this warning, so some retraining would be necessary. |
In our meeting today @taldcroft had the good question -- "but does the PR code work to get the version without updating the index". I hadn't checked the returned version so just went back in to do that and it looks OK.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this looks good now. The only thing missing is documentation in the doc string of how the get_data
function changes behavior slightly if THERMAL_MODELS_DIR_FOR_MATLAB_TOOLS_SW
is defined.
I was thinking this for docs in get_data. Does this read OK to you @taldcroft ?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With a lien on adding the doc update in the comment.
Looks good @jeanconn |
Description
Skip is_dirty check from matlab tools. The gitpython is_dirty appears to use a lock that touches files in the index. This conflicts with SVN management of the chandra_models repo (and marks the repo as modified for matlab users). That behavior is undesireable. This PR skips the is_dirty check if THERMAL_MODELS_FOR_MATLAB_TOOLS_SW is set.
Fixes #44
Interface impacts
Testing
Unit tests
Independent check of unit tests by [REVIEWER NAME]
Functional tests
Made a new unit test for functional testing.
I directly confirmed in matlab testing that using ska_helpers 0.10.3 and doing 'ska_helpers.chandra_models.get_data("chandra_models/xija/aca/aca_spec.json")'
marks the chandra_models-3.49 directory with an svn red exclamation mark, and doing the same operation with this branch in the pyexec python path does not (leaves it with the svn green check).