Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shebang and executable paths incorrect in gitbash #1302

Open
amorphousWaste opened this issue Apr 28, 2022 · 24 comments · May be fixed by #1475
Open

shebang and executable paths incorrect in gitbash #1302

amorphousWaste opened this issue Apr 28, 2022 · 24 comments · May be fixed by #1475
Labels
bug os:windows Windows-specific shell Shell related issues

Comments

@amorphousWaste
Copy link

When creating a rez environment with gitbash on Windows, the shebang and executable paths use a single slash which gets escaped and causes the paths to be incorrect.

Environment

  • OS "windows 10"
  • Rez version "2.109.0"
  • Rez python version "3.7"

To Reproduce

  1. Open a gitbash shell on Windows
  2. Run a rez-env

Expected behavior
Taken into a rez environment.

Actual behavior
Command not found error

@amorphousWaste amorphousWaste changed the title sheband and executable paths incorrect in gitbash shebang and executable paths incorrect in gitbash Apr 28, 2022
@hughetop
Copy link

I just noticed the gitbash plugin, but I can't seem to get it to work. No matter what I do I get:
C:\Users\hughetop\AppData\Local\Temp\rez_context_b1y1cklv\rez-shell.sh: line 5: c:\program: No such file or directory
The space in "program files" is messing things up, and I can't figure out what the syntax needs to be. Doubling up the backslashes like in the pull request doesn't seem to fix it. Trying to escape the space makes windows think it's another directory. Using the "/c/program\ files/..." syntax doesn't work either. This has been wracking my brain for months, but I really want to get gitbash working so I don't have to use cmd or powershell anymore.

@JeanChristopheMorinPerso JeanChristopheMorinPerso added the os:windows Windows-specific label Apr 29, 2022
@amorphousWaste
Copy link
Author

Good point about the spaces. I think the solution might be to use DOS short paths:
Eg. /c/Program Files/Git/bin/bash.exe -> /c/PROGRA~1/Git/bin/bash.exe. I'm doing some testing now.

@instinct-vfx
Copy link
Contributor

We had a bit of a discussion about this on Slack already. This needs to be fixed for good as this is only the default but other paths might also need escaping. We need proper cross platform escaping to really fix this in my eyes.

@instinct-vfx
Copy link
Contributor

(Adding quotes works in gitbash but not in rez. I think that they are stripped by rez when sanitizing?)

@hughetop
Copy link

Interesting, I tried using that DOS shortpath and it got a different error. So my find_executable is just:
return r'C:\PROGRA~1\Git\bin\bash.exe'.replace('\\', '\\\\')
and it'll recognize gitbash, but it doesn't seem to start bash properly. The bash commands aren't available.

$ rez-env --shell gitbash python-3.9
bash: mkdir: command not found

You are now in a rez-configured environment.

resolved by hughetop@********, on Fri Apr 29 12:52:12 2022, using Rez v2.103.4

requested packages:
python
~platform==windows           (implicit)
~arch==AMD64                 (implicit)
~os==windows-10.0.19042.SP0  (implicit)

resolved packages:
python-3.9.10  \\some\network\path\packages\ext\python\3.9.10
>
hughetop@******** MINGW64 /c/Users/hughetop
$ ls
bash: ls: command not found

It seems to be because the path is using ; as pathsep, not :

$ echo $PATH
/c/rez/Scripts/rez:C:\Program Files\Python39;C:\Program Files\Python39\Scripts\;C:\Program Files\Git\cmd;/mingw64/bin;/usr/bin;/c/Users/hughetop/bin;C:\rez\Scripts\rez

There were other unrelated syspaths in there but I removed them to shorten the example. Whereas if I exit rez and echo $PATH:

$ echo $PATH
/c/rez/Scripts/rez:/c/Users/hughetop/bin:/mingw64/bin:/usr/local/bin:/usr/bin:/bin:/mingw64/bin:/usr/bin:/c/Users/hughetop/bin:/c/Program Files/Python39/Scripts:/c/Program Files/Python39:...

It's closer than I've gotten before, but I still can't get it to handle the paths correctly.

@instinct-vfx
Copy link
Contributor

Ah good point. Had not thought about pathsep

@amorphousWaste
Copy link
Author

amorphousWaste commented Apr 29, 2022

Just to give some quick background, the gitbash plugin is based on work I did to get rez and a linux bootstrap to work in a Windows environment. As part of getting gitbash to work, I made some changes to the local rez install in rez/resolved_context.py:

executor_path = executor.env.PATH.value()
executor_path = re.sub(r'([a-zA-Z]):\\', r'/\g<1>/', executor_path)
executor_path = executor_path.replace('\\', '/')
executor_path = executor_path.replace(';', ':')
executor.env.PATH = executor_path

Essentially the PATH variable needs to be POSIX but PYTHONPATH needs to be DOS. It's very weird and was a huge pain in the rear.

@JeanChristopheMorinPerso
Copy link
Member

JeanChristopheMorinPerso commented Apr 29, 2022

PYTHONPATH needs to be DOS style because the python process is a windows process. PATH needs to be POSIX because of bash. Note that is just from stuff I've read in the past.

This path replacement is very fragile and is one reason why I was against it (and I'm still is).

@JeanChristopheMorinPerso
Copy link
Member

Some reading: https://www.msys2.org/docs/filesystem-paths/

@nerdvegas
Copy link
Contributor

So, git-bash is an example of a shell that requires path normalization - when in a bash shell, (eg) C:\foo must be referred to as /c/foo in cases such as PATH (actually it seems to work with win paths when I test it, but utils like which give weird results). Before normalization was added, paths like C:\pkgs\foo\1.2.3/bin were being constructed, and the mix of differing slashes is just odd. Recently, this issue has come up also - #1321. Ironically, it might mean that we need to enforce forward slashes on win, which we would do as part of path normalization.

So I don't see how we can avoid normalization - if we do, we're simply making package commands unintentionally platform and/or shell- specific (and the whole idea is not to be!). However, PYTHONPATH is a case where an env-var is going to be ingested by a win proc that knows/cares nothing about the shell it was launched from, so normalizing that path is absolutely a problem.

Is this not just a case then of changing the default value of the new setting https://github.com/AcademySoftwareFoundation/rez/wiki/Configuring-Rez#pathed_env_vars to ["PATH"] instead of ["*PATH"]?

In any case, I don't think fixing this is as simple as adding the skip_root_normalization setting as per #1303 (I'm also not clear on why/how that PR fixes the issue described here - please explain this in the PR!).

Currently I'm a little confused about where we are on this as there seem to be multiple overlapping things going on:

  • This issue says there's an escaping issue causing shebang not to work in git-bash, yet the unit tests were passing already? Why are they passing for git-bash?
  • Someone else has mentioned a problem with space in path ala C:\Program Files, is that the actual root of the problem? We should add a unit test for this.
  • We've highlighted cases where normalization should/should not occur (ie the PATH/PYTHONPATH example), is that a question of changing default pathed_env_vars value, or is there a more fundamental problem here? Also, since the unit tests didn't pick this up, we should update them so they do - we've identified a problem, we should add a test to fail on that problem rather than just jumping in and fixing it, lest the fix regress in a later version.

@nerdvegas
Copy link
Contributor

Thinking on this further, I think there's some more things to take into consideration:

  • Currently, {root} or this.root in a package commands is giving the shell-normalized path. But I can see this being a problem for (eg) env.PYTHONPATH.append("{root}/python"), since as mentioned previously, PYTHONPATH should be in the form expect by the platform, not the shell. We need tests to pick this up. I bet they aren't now though because I didn't want python to be a requirement of the unit tests, since this would mean needing to provide a rezified python. Not sure what to do about this.
  • We might need to have the concept of platform path normalization vs shell path normalization (consider that we may have to fwd-slash on win wrt CMake breaks due to windows penchant for escape characters as path separators #1321). Perhaps we need to have separate pathed_env_vars config settings - one global, one per shell (all optional). The shell-specific setting would apply shell path norm, but the global would apply platform norm (and shell would be higher priority).

@hughetop
Copy link

hughetop commented Jun 14, 2022

For what it's worth, I've decided to look into Cmder which extends cmd with bash-like commands, rather than try to get git-bash to work. I've made a cmder-rez plugin, which I can share if anyone is interested.

Edit: The reason behind this decision is that we feel it's safer to use a windows-native shell. So rather than try to get git-bash to work with rez, it makes more sense to extend cmd. Here is the cmder-rez plugin for anyone curious.

@nerdvegas
Copy link
Contributor

Pinging this thread as would be good to get movement.

@JeanChristopheMorinPerso - you mention that "This path replacement is very fragile and is one reason why I was against it (and I'm still is)." - but the git-bash shell requires this. Is there something else you had in mind?

@amorphousWaste - clarification still needed on #1303 as to specifically what problem it fixes and how. Is it actually the space in path (ala Program Files) causing the problem?

It seems to me that PATH might be the only var that needs to be shell-normalised. For all other cases (eg PYTHONPATH), it's platform normalization that should happen, and it seems likely that this could simply default to pathlib.Path.as_posix behaviour.

What I have in mind is something like this (config settings):

# apply platform-specific path normalization
platform_pathed_env_vars = [
  "*PATH"
]

# apply shell-specific path normalization
# note that most shells simply apply platform normalization (git-bash is a notable exception)
shell_pathed_env_vars = [
  "PATH"
]

@JeanChristopheMorinPerso
Copy link
Member

@nerdvegas I completely forgot to explain what I was thinking about. Sorry for that.

I think we could probably give https://www.msys2.org/docs/filesystem-paths/ a read (and I encourage everyone to do it). That's the first thing I'd do. I have a feeling it's a "solved" problem. I'd removed the pathed_env_vars setting all together at the global level and rely on cygpath to convert paths for us. It even has a --path that is specifically designed for converting path lists in environment variables. It knows about the mapping between windows drive and posix mounts and it probably handles corner cases that we haven't thought about or we simply don't know they even exist.

It seems to me that PATH might be the only var that needs to be shell-normalised.

Unfortunately not. What about rez packages that contain bash scripts that don't call any windows executable that would stil require custom environment variables that would contain paths?

We should probably still need to take into considerations that not just PATH might need sanitization, which means we could have a per shell setting, or maybe something like this:

def commands():
    # Instead of "env.MY_CUSTOM_PATH_ENV = 'value here', we could add a new
    # 'set' method on the variables where a user could explicitly mark an environment variable as
    # safe to convert/sanitize. This would cover cases where a package contains bash scripts
    # that rely on custom environment variables that are not path lists and or
    # that does not contains the name "PATH".
    env.MY_CUSTOM_ENV_VAR.set('{root}/something', convert=True)

    # Same for append and prepend.
    env.MY_CUSTOM_ENV_VAR_OTHER.append('asd', convert=True)
    env.MY_CUSTOM_ENV_VAR_OTHER_2.prepend('asd', convert=True)

Going explicit like this would allow packages to control how environment variables should be handled. Though, I'm not sure how doable this is as I haven't played much with the internals of rex. And also note that I haven't put a huge amount of thoughts into this convert=True thing. But there is one big advantage with this approach and it's that a studio wouldn't endup with rotten/bloated rez configs filled with environment variable names that are all defined in packages that are no more used. I'm thinking of the cases where studios are afraid of removing something because they don;'t know what depends on that.

I also think the responsibility of sanitizing paths should be delegated to shell plugins. This means bash, powershell and others wouldn't be affected by any logic required by git-bash.

This would make the implementation less fragile IMO. I'd particularly be happy with a solution that uses cygpath. I'd also be happy if PATH is automatically taken care of by the git-bash plugin (if handled with cygpath).

Lastly we could reach out to the cygwin and or msys2 folks for guidance on how this could be handled.

@nerdvegas
Copy link
Contributor

nerdvegas commented Jun 19, 2022

Cheers for the link jc I just checked it out.

Generally I think we're on the right path (pun intended). A few things:

  • Agreed wrt default cygpath-like behaviour for handling paths. However, cygpath itself is yet another dependency, and we either would need to compile from source or rely on a system installation - neither of which I would want to do. I'm not aware of any python equivalent implementations either. We'll have to figure out what to do here, maybe just our own simplistic impl will be good enough for the moment.
  • We still need pathed_env_vars though. Cygpath itself has an analogy (MSYS2_ARG_CONV_EXCL), because we can't treat everything as a path (I imagine our existing *PATH default will suffice).
  • Agreed on the idea of convert, I have been thinking about this also, However I would call this shell=<bool>, that seems intuitive.
  • Sanitising paths already is delegated to shell plugins (the impl is there anyway), but yes the setting needs to be per-shell. Impl will need reworking also - specifically stuff like this.root is automatically normalized atm, and I don't think we should be doing that. What do we call the shell-specific setting? Perhaps we can just call it shell_pathed_env_vars.

One more thing. Perhaps instead of this shell=True approach, we could do something more like this (which would also match an existing convention): https://github.com/AcademySoftwareFoundation/rez/wiki/Package-Commands#literal-strings. We also really should make a distinction between platform path norm (ala cygpath) and shell norm (ala gitbash). So for eg:

def commands():
    env.PATHISH.append(as_path("{root}/bin"))
    env.FOO_PLUGIN_PATH.append(as_shell_path("{root}/foo"))

What I specifically like about this approach is that it can be used to deal with other edge cases. For example, your commands might source a script which sets a path into an env-var, which you then need to shell-normalize. With just the convert arg we were talking about, we wouldn't be able to deal with that.

@amorphousWaste
Copy link
Author

I created a new PR because I messed up the commits on the old one here: #1364

This should fix the issues with gitbash as well as addressing the concerns with pathing and using the config for that. Pathing has been moved to the individual shell plugins with different options depending on whether the plugin needs to use a shell path or differently formatted path. The config is not used at all. Please let me know if these are the sought after changes or if more work is needed.

@nerdvegas
Copy link
Contributor

nerdvegas commented Sep 9, 2022

Apologies for not looking into this yet, have not had the time recently.

I'm not clear on where we're at - my last comment in this thread was left hanging and there was no further discussion. To reiterate and tl;dr from memory:

  • we should make a distinction between pathed vars and shell-pathed vars
  • pathed vars should be a global setting; shell-pathed vars should be a per-shell setting
  • pathed vars should be normalized as per cygpath behaviour (posix paths are like /a/b/c; win paths are like C:/a/b/c )
  • path normalization shouldn't be forced - if both settings are empty/null, no normalization occurs
    • however one exception is automatically generated vars such as this.root, that we know are paths. Currently they're getting shell-pathing applied, which in incorrect - it should just be the platform (cygpath- like) normalization.

I want to be clear on agreed approach before reviewing code.
Thx
A

@hughetop
Copy link

hughetop commented Sep 9, 2022

It makes sense to me, I've just decided to move away from gitbash so it's not as relevant for me anymore.

@Jawabiscuit
Copy link

Jawabiscuit commented Sep 12, 2022

Hi @nerdvegas, thanks for looking into it again. I've been testing the PR and from a gitbash user's perspective the behavior has been largely addressed with the current changes except for a couple minor things which I've relayed to @amorphousWaste.

Briefly, what I'm experiencing is that the new posix paths that the gitbash plugin is normalizing to '/a/b/c' with ':' pathseps are working well except for PYTHONPATH where the windows python binary expects 'C:/a/b/c' with ';' pathseps. I have enabled ';' pathseps for PYTHONPATH in the config so all that is left is to replicate cygpath -m behavior for paths used for PYTHONPATH. There maybe other path list variables I'm neglecting because AFAIK I've been testing with packages that are setting CMAKE_MODULE_PATH, PATH, and PYTHONPATH for the most part.

My experience with cmake also has been much better with the current changes. The only other minor thing I'm experiencing that is breaking some tests and inconsistent with cygpath -u is posix paths currently look like '/A/b/c', the drive letter being capitalized. Other than that, the only other issue I'm encountering resembling paths seems to be with absolute paths to network shares. It seems unrelated and probably outside the scope of this PR. I may file a separate issue for that if a search doesn't turn up anything.

@amorphousWaste
Copy link
Author

OK, I think this PR should now address all the issues with gitbash pathing. It can be configured in the config.py file and implemented on a per-shell basis or not at all.

@JeanChristopheMorinPerso
Copy link
Member

@amorphousWaste please keep the discussions related to the PR in the PR. It's hard to keep track of comments related to PRs when they are made in issues.

@bfloch
Copy link
Contributor

bfloch commented Dec 1, 2022

I never brought it to finish but I worked on this a while back:
#737

It adds missing semantics for variables to the configuration, so that platform cases can be handled centrally for each environment variable.

It can also handle some of the problems mentioned, like CMAKE behaving non-standard in Windows (uses semicolon as separator but expects posixpath-like paths) etc.

    "CMAKE_MODULE_PATH": {
        "parent_variables": "clear",

        "pathsep": ";",
        "modify_op": "posixpath",
    },

It also solves the issue of certain variables needed to be inherited or cleared from a calling environment, where rez variables land (before or after inherited variables), can make platform dependent choices, and most importantly gives that control back to the user so we don't have to hard-code anything while we can still provide sane defaults.

Maybe worth revisiting the idea because I remember that we pulled back and forth to fix windows env var issues, but the problem is that certain clients have certain expectation towards certain environment variables, and they don't always coincide with what the operating system expects.

It would actually support mentioned functions to normalize/append in a particular way, but would be configurable and over-writable. Centrally because having the logic of how to add to environment variables in individual packages is a maintenance nightmare and should not differ for a single variable/environment anyway. I liked this approach better than extending the append and prepend functionality.

@JeanChristopheMorinPerso
Copy link
Member

Hey @bfloch , thanks for remembering us of that issue! I'll have to carve some time out to read it carefully (with #739 too).

Centrally because having the logic of how to add to environment variables in individual packages is a maintenance nightmare and should not differ for a single variable/environment anyway. I liked this approach better than extending the append and prepend functionality.

Extending append or prepend or adding new functions would allow to make packages portable. It would allow to mark a string as a "shell" string, which would basically instruct the shell plugins to convert them as it sees fit. It's good to note that the packages wouldn't be aware of shell implementations. They would only mark a variable as "need to be converted properly".

@JeanChristopheMorinPerso JeanChristopheMorinPerso added the shell Shell related issues label Mar 11, 2023
@Jawabiscuit Jawabiscuit linked a pull request Apr 21, 2023 that will close this issue
39 tasks
@bfloch
Copy link
Contributor

bfloch commented Jul 21, 2023

Re-reading your comment - I think it is the same idea.

The difference is the level: I don't believe that every environment variable is responsibility of the shell as not every variable follows the shell concatenation and normalization rules.

My proposal spawned off the CMAKE_MODULE_PATH case which independent of platform/shell behaves by their rules.
Similarly I believe we have variables that behave under certain rules that either follow a platform (or shell).

What I wanted to achieve is similar: Have the packages only define their intent: Add to the variable - and let a global config handle the rest of understanding what it means. The global default for un-specified variables means that the semantics do not change much.

Maybe it is worth adding something along "ShellDependent" to #739 but I think that the approach of a config is not that bad.

Since environment variables are by definition a shared resource - the nice thing is that they do generally apply to all packages.
In the unlikely event that packages need to override the rules there is already a mechanism which is config overrides in packages. For package portability - I think having the choice of what the rules for normalization and concatenation are left to the config is actually helpful. If at any point we have generic packages we would be careful to ship them with reasonable defaults in the config - while system maintainers might choose to change them.

Hope you find some time to re-read both PRs. I might check if I can find some time to revisit them if the approach is deemed sound by the TSC.

Another aspect it addresses is to define how to handle parent process variables - which is outside of the scope of this issue, but also worth exploring.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug os:windows Windows-specific shell Shell related issues
Projects
None yet
7 participants