can't redefine assignment builtins, 'builtin', or 'command' as functions #654

Open
andychu opened this issue Mar 11, 2020 · 7 comments

Comments

@andychu
Contributor

andychu commented Mar 11, 2020

from #620

Overridable/protected builtins

Also I'm interested in any such differences that matter for ble.sh, other than break / continue / return / exit. Those are the only builtins that are turned into keywords as far as I remember.

ble.sh overrides the builtins bind, exit, read and trap. I found that trap cannot be overridden as well in Oil. I checked which Bash builtin equivalent can be overridden in Oil with the following script:

for b in $(enable | awk '$0=$2'); do
  echo "=== $b ==="
  bin/osh -c "function $b { echo hello; }; $b"
done

I found that, in addition to the control flow keywords (exit, break, continue and return), the following cannot be overridden: the declare family (declare, export, local, readonly, typeset) and unset; the parameter manipulation builtins (set and shift); and the other builtins builtin, eval, exec, times, trap, . and :. I haven't tested Oil-specific builtins.

It is interesting to observe that one can override command even though one cannot override builtin, and that one can override source even though one cannot override . (dot). Is the list of overridable and protected builtins documented somewhere?

Originally posted by @akinomyoga in #620 (comment)

@andychu andychu changed the title from "some builtins can't be redeifned as functions" to "some builtins can't be redefined as functions" Mar 11, 2020
@Crestwave
Contributor

It is interesting to observe that one can override command even though one cannot override builtin, and that one can override source even though one cannot override . (dot). Is the list of overridable and protected builtins documented somewhere?

Well, all of those that can't be overridden seem to be POSIX special builtins, except for builtin and some of the "declare family", as you said, so it makes sense that they're lumped together.

On another note, I just realized that the rule for special builtins means that the famous fork bomb, which defines a function named :, is not POSIX-compatible. So I guess that's one more for Oil's "safety" (until this gets fixed, anyway) ¯\_(ツ)_/¯

@andychu
Contributor Author

andychu commented Mar 15, 2020

Yes I haven't had time to look into this more closely, but I believe the rules in Oil are basically:

  • POSIX special builtins can't be redefined (bash only enforces this under set -o posix, but other shells such as dash enforce it by default)
  • assignment builtins are special and can't be overridden.
    • this is a side effect of their special parsing and evaluation. For example local $x is evaluated differently than echo $x when x='a=b c'.
  • builtin and command are sort of special cases, but I might be able to change this

If this is blocking anything specific I'll take a closer look

@andychu
Contributor Author

andychu commented Apr 6, 2020

builtin declare may not be needed in Oil scripts because it doesn't allow redefinition of declare, but Bash allows redefinition. Also, is there a reason not to support it? type -t declare says it is a builtin, and builtins are supposed to be accessible through builtin. Or maybe declare can be changed to a keyword in Oil (though I'm not sure about the impact on existing Bash scripts).

@akinomyoga Let's talk about these related issues here.

I don't have a problem with the behavior of redefining assignment builtins and allowing builtin declare, etc. but it is tricky to implement.

The underlying reason is that, in all shells, assignment builtins differ from regular builtins with regard to word splitting.

Consider:

$ x='foo bar'
$ declare a=$x; echo [$a]
[foo bar]

If assignment builtins were evaluated like regular builtins, then a=foo and there's an empty variable bar. But instead we get a single variable a=foo bar.
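
A minimal sketch of that contrast, run under bash explicitly (as elsewhere in this thread). The helper show_args is a hypothetical stand-in for a regular command and is not part of any shell; it only prints each argument it receives on its own line.

```shell
# Run under bash explicitly, since declare is not a POSIX builtin.
static_out=$(bash -c '
x="foo bar"

# Literal assignment builtin: a=$x is not word-split.
declare a=$x
printf "[%s]\n" "$a"

# show_args is a hypothetical helper standing in for a regular command.
show_args() { printf "<%s>\n" "$@"; }

# Regular command: the unquoted $x is split into two arguments.
show_args a=$x
')
printf '%s\n' "$static_out"
```

The first line of output shows the single value stored by declare; the following lines show the separate words that an ordinary command receives for the same unquoted a=$x.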

So what we have to do is look at the "first" word dynamically while evaluating to determine if it's an assignment builtin:

cmd='declare'
$cmd a=$x

Oil actually implements a more consistent zsh-like behavior. As far as I remember the behavior of bash didn't make much sense. There were lots of corners that were unattended to.
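
To see what a given shell does in the dynamic case, the snippet above can be wrapped into a small probe. This is only a sketch for experimentation: it prints whatever the shell under test produces, and the result is expected to differ between shells, so no particular output is claimed here. Replacing bash with another shell binary is left to the reader.

```shell
dynamic_out=$(bash -c '
x="foo bar"
cmd=declare

# The command name is only known after expansion, so the shell has to
# decide at run time whether a=$x is an assignment argument.
$cmd a=$x
printf "[%s]\n" "$a"
')
printf '%s\n' "$dynamic_out"
```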


So in summary I think it's possible but not straightforward.

@andychu
Contributor Author

andychu commented Apr 6, 2020

Another reason is that export is the only assignment builtin in POSIX shell (meaning it has special word splitting behavior), and it's a special builtin, which means it can't be overridden with a function. bash implements this behavior with -o posix.

So the declare/typeset/readonly/local extensions all follow export. They are assignment builtins, and they are special builtins. Like I said, it may be possible to relax this -- I'm not sure how hard it is -- but there is a good reason for the restriction.

$ dash -c 'export() { echo hi; }; export'
dash: 1: Syntax error: Bad function name

$ bash -c 'export() { echo hi; }; export'
hi

$ bash -o posix -c 'export() { echo hi; }; export'
bash: `export': is a special builtin
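
The three manual runs above could be folded into a small probe, sketched here under the assumption that bash is installed. The function can_redefine is a hypothetical helper written for this illustration, not an existing tool; it runs the same kind of one-liner as above and checks whether the redefined function actually ran.

```shell
# can_redefine NAME SHELL [ARGS...]
# Hypothetical helper: succeeds if SHELL lets a function replace NAME.
# Do not pass printf as NAME, since the function body calls printf.
can_redefine() {
  name=$1
  shift
  out=$("$@" -c "$name() { printf 'redefined\n'; }; $name" 2>/dev/null)
  [ "$out" = "redefined" ]
}

for mode in "bash" "bash -o posix"; do
  # $mode is intentionally unquoted so "bash -o posix" splits into words.
  if can_redefine export $mode; then
    echo "$mode: export can be redefined"
  else
    echo "$mode: export is protected"
  fi
done
```

Other shells (for example dash or osh) can be probed the same way by passing their command line after the builtin name.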

@andychu
Contributor Author

andychu commented Apr 7, 2020

Somewhat related: I ran into a manifestation of the same difference between bash and zsh while testing #696 to implement $_.

In zsh assignment builtins are more principled and declare a=(1 2) sets $_ to declare, since a=(1 2) isn't really a "word". Bash sets it to a for some reason.

I guess the point is that what the last word is is unclear in the case of assignment builtins, but it's clear for regular builtins. And this is one reason we have to parse assignment builtins up front, to recognize the non-words like a=(1 2).

So there are really 3 kinds of builtins: special, assignment, and "normal" ones.

@akinomyoga
Collaborator

@akinomyoga Let's talk about these related issues here.

I don't have a problem with the behavior of redefining assignment builtins and allowing builtin declare, etc. but it is tricky to implement.

@andychu Thank you for your reply! I'm sorry for my confusing writing. I was not thinking about making assignment builtins redefinable; I was only thinking about supporting the builtin declare form. Since ble.sh doesn't redefine declare or the other assignment builtins, Oil doesn't need to make them redefinable for ble.sh.

[ Note: ble.sh used builtin declare in places where it wanted the original builtin even when users had redefined it in Bash. But I have changed builtin declare to declare in the master branch of ble.sh. I realized that it was an old workaround, introduced before ble.sh started forbidding users from redefining assignment builtins (by removing the overriding functions with unset after user commands are executed). The workaround is no longer needed. ]

Nevertheless, I thought it would be natural to support the builtin declare form as long as declare is a builtin, even if it is not redefinable in Oil. But, yes, I agree that syntax such as =() is a tricky part. Bash doesn't support the special treatment of array () for assignment builtins when they are used with builtin.

For the syntax analysis, Bash actually gives special treatment to the literal words declare, readonly, typeset, local and export, and doesn't dynamically check whether the word has been redefined. For example, even when declare is redefined by users, declare a=() is still allowed syntactically.
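
A small sketch of that syntactic point, assuming bash is available: it compares the exit status of a snippet where declare has been redefined as a do-nothing function with one where a=() follows a regular command. Only the exit statuses are recorded; error messages are discarded.

```shell
# Literal word declare, redefined as a function: a=() still parses.
redefined_status=0
bash -c 'function declare { :; }; declare a=()' 2>/dev/null || redefined_status=$?

# Regular command: a=() is rejected by the parser.
regular_status=0
bash -c 'echo a=()' 2>/dev/null || regular_status=$?

printf 'redefined declare: %s\nregular command: %s\n' \
  "$redefined_status" "$regular_status"
```

A zero status for the first run and a non-zero status for the second indicates that the parser keys on the literal word rather than on what the name currently refers to.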

In zsh assignment builtins are more principled and declare a=(1 2) sets $_ to declare, since a=(1 2) isn't really a "word". Bash sets it to a for some reason.

We can observe how Bash treats the words in assignment builtins in the following more explicit way:

$ function declare { printf '[%s]\n' "$@"; }
$ declare a=1 b=(1 2 3) c
[a=1]
[b]
[c]

@andychu
Contributor Author

andychu commented Oct 12, 2021

I'm going through issues labeled "divergence", and reviewing them... I realize I didn't fully respond to this.

I would say the special cases of builtin, declare, and assignment builtins are likely to persist because of the parsing issue with =():

But, yes, I agree that the syntax such as =() is a tricky part. Bash doesn't support the special treatment of array () for assignment builtins when used with builtin.

For example, even when declare is redefined by users, declare a=() is still allowed syntactically.

I think OSH behavior is very well defined and avoids these confusing (and undocumented) issues! But yes it's slightly incompatible.

I ran into some similar things with extended globs recently! For example, shopt -s extglob changes the valid characters in a function name !!

Like

shopt -s extglob
myfunc@(*.cc|*.h) { echo hi; }  # this would be a syntax error without shopt -s extglob

$ 'myfunc@(*.cc|*.h)'  # now I can call this function!!!
hi

So IMO bash has a deep confusion between parsing and execution... the parsing of =() and @(*.cc) are similar in that respect!


(Removed the "divergence" label because that's for things that we know we should fix. This is under "compatibility" which may not be fixed. Comments welcome!)

@andychu andychu changed the title from "some builtins can't be redefined as functions" to "can't redefine assignment builtins, 'builtin', or 'command' as functions" Oct 12, 2021
3 participants