Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix self-hosted test failures #662

Open
14 of 19 tasks
kanaka opened this issue Aug 8, 2024 · 32 comments
Open
14 of 19 tasks

Fix self-hosted test failures #662

kanaka opened this issue Aug 8, 2024 · 32 comments

Comments

@kanaka
Copy link
Owner

kanaka commented Aug 8, 2024

It's been a while since I've done a comprehensive self-host test run. I've add a input variable to workflow so that self-hosted mode can be added for manually triggered workflows. The CI run is at: https://github.com/kanaka/mal/actions/runs/10309343464/job/28538806985

Some of these implementations might just need a NO_SELF_HOST setting (such as xslt), but a lot of the failures look like bugs rather than timeouts or memory exhaustion.

Once all these have been fixed or NO_SELF_HOST is added, we should make self-hosted test step enabled by default (it only adds a few minutes overall to the full CI workflow).

UPDATE: self-host mode has been made the default for workflows and the remaining broken implementations already had or have been updated with NO_SELF_HOST (erlang, jq, nasm, xslt, vbs).

Here is the the full list of self-hosted failures with very brief statement of the issue (see comments for more details):

  • awk
  • c.2
  • cpp
  • common-lisp
  • dart
  • erlang - memory leak (e.g. (sumdown 40000))
  • io
  • java
  • jq - empty lists inside functions/closures are failing
  • kotlin
  • nasm - Needs more heap. With increased heap second quasiquote call fails
  • ps
  • powershell
  • python.2
  • sml
  • vala
  • vbs (vbscript) - startup/invocation error
  • xslt - memory/recursion limit
  • swift5
@kanaka
Copy link
Owner Author

kanaka commented Aug 14, 2024

awk fixed by e62567f and d1afe8d

@kanaka
Copy link
Owner Author

kanaka commented Aug 14, 2024

python.2 fixed in 0d32585

@kanaka
Copy link
Owner Author

kanaka commented Aug 15, 2024

Added common-lisp. Now that the underlying common-lisp issue with hash-maps is fixed, it reveals a new self-hosted failure (step4 issue with how nil is handled).

@kanaka
Copy link
Owner Author

kanaka commented Aug 15, 2024

More recent CI run (with awk and python.2 fixed), is here: https://github.com/kanaka/mal/actions/runs/10393189805/job/28780104927

Some of the failures have some common patterns:

  • c.2, cpp, dart, swift5 all fail in step4 with the same type of error. Appears that the result of the list command is being evaluated again when it shouldn't be. For example:
    $ make DOCKERIZE=1 MAL_IMPL=cpp repl^step4
    mal-user> (list 1)
    Uncaught exception: "1" is not applicable
    
  • java and kotlin appear to have a similar symptom:
    $ make DOCKERIZE=1 MAL_IMPL=kotlin repl^step4
    mal-user> (do (prn 101) 7)
    101
    101
    101
    101
    7
    
    And important to note that using Ctrl-D results in multiple exits from the prompt (more levels after running the broken do call).

Other failure summaries:

  • common-lisp: in self-hosted mode, nil and () are being conflated (which is interesting because in traditional Lisps, those are often represented the same way):
    make DOCKERIZE=1 MAL_IMPL=common-lisp repl^mal^step4
    mal-user> nil
    ()
    
  • jq: immediately exits (with 0 exit code) when trying to launch step2
  • nasm: complains about Error: Run out of memory for Cons objects. Increase heap_cons_limit. Doubling the cons and array memory array results steps 0 - 6 working. step7 breaks like this:
    $ make DOCKERIZE=1 MAL_IMPL=nasm repl^mal^step7
    mal-user> (quasiquote ())
    ()
    mal-user> (quasiquote ())
    Error: list does not begin with a function
    Uncaught exception: (#<unknown value>)
      in mal EVAL: (quasiquote ())
    ``
    
  • ps has problems starting at step8:
    $ make DOCKERIZE=1 MAL_IMPL=ps repl^mal^step8
    mal-user> (defmacro! unless (fn* (pred a b) `(if ~pred ~b ~a)))
    <(fn* [& args] (EVAL (nth ast 2) (new-env env (nth ast 1) args)))>
    mal-user> (unless 7 8 9)
    Uncaught exception: typecheck: get at 8638
      in mal EVAL: (unless 7 8 9)
    
  • powershell also has an application error in step4 but it manifests differently than symptom effecting cpp, c.2, etc:
    $ make DOCKERIZE=1 MAL_IMPL=ps repl^mal^step4
    mal-user> (list? (list))
    Uncaught exception: can only apply functions
      in mal EVAL: (list? (list))
    
  • sml in mlton mode appears to have reader/parsing errors that aren't specific enough (just "SyntaxError").
  • vala: macros don't appear to be evaluating their results in self-hosted mode
  • xslt: fails with Error: Too many nested template or function calls. The stylesheet may be looping. . Unless there is a way to increase the call stack size, this one will probably just need to be marked NO_SELF_HOST

@kanaka
Copy link
Owner Author

kanaka commented Aug 15, 2024

@asarhaddon @wasamasa @dubek Anybody want to help me track down some fun self-hosted failures?

@kanaka
Copy link
Owner Author

kanaka commented Aug 18, 2024

c.2, cpp, dart, and swift5 fixed in 7b23c7e

@kanaka
Copy link
Owner Author

kanaka commented Aug 19, 2024

java and kotlin fixed in b31180c (last position of do was being EVAL'd twice).

@kanaka
Copy link
Owner Author

kanaka commented Aug 19, 2024

common-lisp fixed in 40aca13 list? was defined to return true for nil. Added a test to catch that too.

@kanaka
Copy link
Owner Author

kanaka commented Aug 19, 2024

Powershell fixed in 78000d6. However, we probably want to mark powershell as NO_SELF_HOST because the self-hosted tests take 26 minutes to run in GHA. The next longest pole in the tent is rpython which takes less than 10 minutes for everything (most of it is build time).

@dubek
Copy link
Collaborator

dubek commented Aug 20, 2024

I looked at debug logs from jq running repl^mal^step2. It starts processing the expressions from step2_eval.mal, and in the middle of the big (def! EVAL ...) expression I see a halt command which tells it to stop. Not sure what's going on yet.

@dubek
Copy link
Collaborator

dubek commented Aug 20, 2024

Okay, found it: jq chokes on (); the following patch circumvents the jq bug:

diff --git a/impls/mal/step2_eval.mal b/impls/mal/step2_eval.mal
index 4d40f942..f5242a8c 100644
--- a/impls/mal/step2_eval.mal
+++ b/impls/mal/step2_eval.mal
@@ -24,7 +24,7 @@

       (list? ast)
       (if (empty? ast)
-        ()
+        (list)
         (let* [a0  (first ast)
                f    (EVAL a0 env)
                args (rest ast)]

Need to add a tests for () somewhere, and then fix it for jq. Not tonight...

@dubek
Copy link
Collaborator

dubek commented Aug 20, 2024

But then I run into a new issue in mal step3:

mal-user> (+ 1 2)
Uncaught exception: 'env-find-str' not found

more fun with jq awaits...

@kanaka
Copy link
Owner Author

kanaka commented Aug 20, 2024

ps/postscript fixed in a160866 Missing a "pop" in the macro? function 😧

@kanaka
Copy link
Owner Author

kanaka commented Aug 20, 2024

I have a fix for the sml mlton self-host issue: 4719e58

However, I haven't merged it yet because the test I added to catch the issue in non-self-hosted mode revealed a problem in xslt:

$ make DOCKERIZE=1 repl^xslt^step9
user=> (read-string "(+ 1 2")      ;; note the missing paren in the string
(+ 1)
user=> (+ 1 2
Error: EOF while reading list

For some reason, the read-string core function is returning the partial parsed result rather than conveying the parsing error. I suspect the fix isn't too difficult, but it will requiring groking the xslt implementation at least somewhat. @asarhaddon since you've been in the xslt code before, any chance you could look at this?

@asarhaddon
Copy link
Contributor

Do not expect much from me in the xslt field. A few years ago, I have given up merging eval and evalast.

@kanaka
Copy link
Owner Author

kanaka commented Aug 21, 2024

Vala is fixed in 39563fd. However, there is a rare memory corruption I discovered at: #663

@kanaka
Copy link
Owner Author

kanaka commented Aug 22, 2024

  • sml fixed in e9dd8c5
  • xslt non-self-issue issue revealed by new tests is fixed in: a2973ab

@kanaka
Copy link
Owner Author

kanaka commented Aug 22, 2024

For xslt, there is apparently a way to set the recursion depth in saxon via -maxDepth:5000.

kanaka added a commit that referenced this issue Aug 22, 2024
xslt may never be able to do this but we would definitely like jq, nasm,
and possibly vbs to be fixed for self-hosted tests. Ticket tracking
fixes: #662
kanaka added a commit that referenced this issue Aug 22, 2024
xslt may never be able to do this but we would definitely like jq, nasm,
and possibly vbs to be fixed for self-hosted tests. Ticket tracking
fixes: #662
@dubek
Copy link
Collaborator

dubek commented Aug 22, 2024

Another jq update:

dubek@FISHLAPTOP:~/mal/impls/jq (master *) $ ./run
user> ()
()
user> (do ())
()
user> (if true ())
()
user> (def! f (fn* [] ()))
Cannot iterate over null (null)

So () is supported (it's part of our tests) but not inside the body of functions... Weird, still looking.

@kanaka kanaka closed this as completed in 3307f07 Aug 22, 2024
@kanaka
Copy link
Owner Author

kanaka commented Aug 22, 2024

I just added a commit that enables self-hosted mode by default and added NO_SELF_HOST=1 to IMPLS.yaml for jq, nasm, and xslt. This ticket is still applicable, the failures just won't show up in default workflow.

@kanaka kanaka reopened this Aug 22, 2024
@kanaka
Copy link
Owner Author

kanaka commented Aug 22, 2024

I've added erlang and vbs to the list since those currently have NO_SELF_HOST=1 but the former ought to work and the latter is reported to work (but there is something wrong with the invocation it appears).

  • erlang fails like this:
$ make DOCKERIZE=1 MAL_IMPL=erlang test^mal^step4
...
TEST: '(def! DO (fn* (a) 7))' -> ['',] -> SUCCESS (result ignored)
TEST: '(DO 3)' -> ['',7]
Running: env STEP=step0_repl MAL_IMPL=vbs RAW=1 ../../runtest.py  --deferrable --optional --start-timeout 60 --test-timeout 180 --no-pty --debug-file ../../test-mal-vbs.debug ../tests/step0_repl.mal -- ../mal/run
Did not receive one of following prompt(s): ['[^\\s()<>]+> ']
    Got      : 'Input Error: Unknown option "/mnt/d/a/mal/mal/impls/vbs/../vbs/stepA_mal.vbs" specified.\n'

@kanaka
Copy link
Owner Author

kanaka commented Aug 22, 2024

So () is supported (it's part of our tests) but not inside the body of functions... Weird, still looking.

That is weird. That will be a good one to add to the tests once you figure it out.

@kanaka
Copy link
Owner Author

kanaka commented Aug 23, 2024

Erlang has a memory leak (even in non-self-hosted mode). Seems like memory is never reclaimed:

$ make repl^erlang^stepA
user=> (def! sumdown (fn* (N) (if (> N 0) (+ N (sumdown  (- N 1))) 0)))
user=> (sumdown 10000)  ;; watch memory grow in top
user=> (sumdown 10000)
user=> (sumdown 40000)  ;; a few of these will probably exhaust memory and kill the process

@OldLiu001
Copy link
Contributor

OldLiu001 commented Aug 24, 2024

@kanaka Thanks for providing log, vbs fails to start issue has been fixed in #666 .

@dubek
Copy link
Collaborator

dubek commented Aug 26, 2024

Next chapter on jq:

This patch (whitespace changes ignored) fixes one bug:

diff --git a/impls/jq/utils.jq b/impls/jq/utils.jq
index 7b0876b7..0ad1ec81 100644
--- a/impls/jq/utils.jq
+++ b/impls/jq/utils.jq
@@ -1,6 +1,6 @@
 def _debug(ex):
     . as $top
-    | ex
+    | (ex + "\n")
     | debug
     | $top;

@@ -45,6 +45,9 @@ def find_free_references(keys):
         | if .kind == "symbol" then
             if keys | contains([$dot.value]) then [] else [$dot.value] end
         else if "list" == $dot.kind then
+            if $dot.value|length == 0 then
+                []
+            else
                 # if - scan args
                 # def! - scan body
                 # let* - add keys sequentially, scan body
@@ -69,6 +72,7 @@ def find_free_references(keys):
                   else
                     [ $dot.values[1:][] | _refs ]
                   end
+                end
         else if "vector" == $dot.kind then
             ($dot.value | map(_refs) | reduce .[] as $x ([]; . + $x))
         else if "hashmap" == $dot.kind then

And now running mal step2 no longer exits, but the mal step2 tests fail:

$ make MAL_IMPL=jq test^mal^step2
make -C impls/mal step2_eval.mal
make[1]: Entering directory '/home/dubek/mal/impls/mal'
make[1]: 'step2_eval.mal' is up to date.
make[1]: Leaving directory '/home/dubek/mal/impls/mal'
(call STEP_TEST_FILES,mal,step2): impls/tests/step2_eval.mal
----------------------------------------------
Testing test^mal^step2; step file: impls/mal/step2_eval.mal, test file: tests/step2_eval.mal
Running: env STEP=step2_eval MAL_IMPL=jq RAW=1 ../../runtest.py  --deferrable --optional --start-timeout 60 --test-timeout 120  ../tests/step2_eval.mal -- ../mal/run
Testing evaluation of arithmetic operations
TEST: '(+ 1 2)' -> ['',3] -> FAIL (line 3):
    Expected : '.*\n3'
    Got      : '(+ 1 2)\nUncaught exception: array ([{"kind":"l...) and string ("\\n") cannot be added '
TEST: '(+ 5 (* 2 3))' -> ['',11] -> FAIL (line 6):
    Expected : '.*\n11'
    Got      : '(+ 5 (* 2 3))\nUncaught exception: array ([{"kind":"l...) and string ("\\n") cannot be added '
TEST: '(- (+ 5 (* 2 3)) 3)' -> ['',8] -> FAIL (line 9):
    Expected : '.*\n8'
    Got      : '(- (+ 5 (* 2 3)) 3)\nUncaught exception: array ([{"kind":"l...) and string ("\\n") cannot be added '
...
...

So there's another thing lurking there.

@dubek
Copy link
Collaborator

dubek commented Aug 26, 2024

Silly me, my change with the + "\n" at the debug function causes that second problem. Mal step2 tests pass now, I'll keep testing the other steps.

@dubek
Copy link
Collaborator

dubek commented Aug 26, 2024

OK first step towards jq is in PR #668.

Then I discovered another issue in jq: if we introduce this test:

;; Test out-of-order declaration
(def! mul2add3 (fn* (n) (let* (p (mul2 n)) (+ p 3))))
(def! mul2 (fn* (n) (* 2 n)))
(mul2add3 5)
;=>13

Then on the last line instead of 13 we get:

JqMAL Exception :: 'mul2' not found

I assume that let* has a "frozen" environment of that point, so later def! mul2 doesn't modify that env. I need to look into it.

But even after changing the order of functions in env.mal to circumvent this issue, jq times out in mal step6. Not sure if it will be able to self-host at all.

@kanaka
Copy link
Owner Author

kanaka commented Aug 26, 2024

@dubek yeah, I did a little playing around and I think the problem is that jq envs are not actually mutable. They are more like Clojure maps in that a copy is created when a value is added or remove and the new version is the one that is used. This actually is a legitimate way of treating environments but it's not the intended path for mal. I think we've essentially been relying on self-hosted tests to test this mutable property of mal envs. So we probably need a test like this in step4 tests:

;; Testing that env is actually mutable and that lexical scope binds names (not values)
(def! a 12)
(def! fx (fn* () a))
(fx)
;=>12
(def! a 2000)
(fx)
;=>2000

The above will fail in jq. The final test will return 12 instead of 2000.

But I'm also confident that there is a way to solve this in jq because the jq implementation supports atom types which definitely seem to be mutable in jq. It probably won't be easy to convert env to work like atoms (the handling of atom types doesn't appear to be confined to the atom related functions unfortunately). Maybe you can find an easier workaround that doesn't require a full rewrite of env handling. But if not, we'll probably need to understand fully how the jq implementation is doing atom types and use that mechanism for env handling. 🏋️

@wasamasa
Copy link
Collaborator

The above comment looks as if it could close #645

@kanaka
Copy link
Owner Author

kanaka commented Aug 28, 2024

@wasamasa thanks. I realized I had a branch with that and another env test so I've turned that into a PR (#677) and closed #645

@alimpfard
Copy link
Contributor

alimpfard commented Oct 21, 2024

I just read through this issue (looking at jq in particular), here are some comments:

@dubek yeah, I did a little playing around and I think the problem is that jq envs are not actually mutable. They are more like Clojure maps in that a copy is created when a value is added or remove and the new version is the one that is used. This actually is a legitimate way of treating environments but it's not the intended path for mal.

Yes, jq does not have mutable state at all, jqmal tries to hide this by late-binding free variables in functions, but modifying an already-bound variable is not really possible in general without a lot of pain-

But I'm also confident that there is a way to solve this in jq because the jq implementation supports atom types which definitely seem to be mutable in jq. It probably won't be easy to convert env to work like atoms (the handling of atom types doesn't appear to be confined to the atom related functions unfortunately). Maybe you can find an easier workaround that doesn't require a full rewrite of env handling. But if not, we'll probably need to understand fully how the jq implementation is doing atom types and use that mechanism for env handling. 🏋️

Atoms have unique identities, it's relatively simple to do the same with bindings (stamp the env entry with an identifier and do a little dance where looking up is equivalent to:

def env_get2($name; $current_env; $fallback_env):
    $name | env_get($current_env) as $v | $v.identity | env_get_by_id($fallback_env) // $v.entry

That said, I don't think the guide actually requires this name-binding behaviour?
if this is indeed the intended purpose, I don't see much point in the distinction between atoms and variables?

@kanaka
Copy link
Owner Author

kanaka commented Oct 29, 2024

Add io self-hosted failue in step 0: https://github.com/kanaka/mal/actions/runs/11524734437/job/32085546455?pr=699

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants