Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[eval] https tests #11638

Draft
wants to merge 1 commit into
base: development
Choose a base branch
from
Draft

[eval] https tests #11638

wants to merge 1 commit into from

Conversation

kLabz
Copy link
Contributor

@kLabz kLabz commented Apr 18, 2024

There's something wrong with mbedtls (or more likely how we use it), but I couldn't fix this yet.. :/

Also, hl on mac seems to have a similar issue 🤔

@kLabz kLabz added bug platform-eval Everything related to the Haxe 4 eval macro interpreter help wanted labels Apr 18, 2024
@kLabz kLabz changed the title [eval] https downloads failing on all platforms [eval] https downloads failing Apr 18, 2024
@kLabz
Copy link
Contributor Author

kLabz commented Apr 19, 2024

@Apprentice-Alchemist do you have an idea what we're doing wrong there that could cause this Unix.Unix_error(Unix.EBADF, "recv", "") error?

CAMLprim value ml_mbedtls_ssl_read(value ssl, value buf, value pos, value len) {
CAMLparam4(ssl, buf, pos, len);
CAMLreturn(Val_int(mbedtls_ssl_read(SslContext_val(ssl), String_val(buf) + Int_val(pos), Int_val(len))));
}

public override function readBytes(buf:haxe.io.Bytes, pos:Int, len:Int):Int {
if (pos < 0 || len < 0 || ((pos + len) : UInt) > (buf.length : UInt))
throw haxe.io.Error.OutsideBounds;
socket.handshake();
var r = @:privateAccess socket.ssl.read(buf, pos, len);
if (r == -1)
throw haxe.io.Error.Blocked;
else if (r <= 0)
throw new haxe.io.Eof();
return r;
}

haxe/std/sys/Http.hx

Lines 444 to 453 in db842bf

api.prepare(size);
try {
while (size > 0) {
var len = sock.input.readBytes(buf, 0, if (size > bufsize) bufsize else size);
api.writeBytes(buf, 0, len);
size -= len;
}
} catch (e:haxe.io.Eof) {
throw "Transfer aborted";
}

Edit:
Seems like I can dodge the issue with some GC runs

@Apprentice-Alchemist
Copy link
Contributor

Hmm. If I try haxe.Http.requestUrl("https://raw.githubusercontent.com/HaxeFoundation/haxe/development/tests/unit/res1.txt") it works, if I try with https://build.haxe.org/builds/haxe/linux64/haxe_latest.tar.gz I get a segfault.

As for that EBADF issue, the cause would be this function being called with a bad socket

let socket_receive socket bytes =
Unix.recv socket bytes 0 (Bytes.length bytes) []
in

aka the issue would be somehwere here
static int bio_read_cb(void* ctx, unsigned char* buf, size_t len) {
CAMLparam0();
CAMLlocal3(r, s, vctx);
vctx = (value)ctx;
s = caml_alloc_string(len);
r = caml_callback2(Field(vctx, 2), Field(vctx, 0), s);
memcpy(buf, String_val(s), len);
CAMLreturn(Int_val(r));
}
CAMLprim value ml_mbedtls_ssl_set_bio(value ssl, value p_bio, value f_send, value f_recv) {
CAMLparam4(ssl, p_bio, f_send, f_recv);
CAMLlocal1(ctx);
ctx = caml_alloc(3, 0);
Store_field(ctx, 0, p_bio);
Store_field(ctx, 1, f_send);
Store_field(ctx, 2, f_recv);
mbedtls_ssl_set_bio(SslContext_val(ssl), (void*)ctx, bio_write_cb, bio_read_cb, NULL);
CAMLreturn(Val_unit);
}

I don't know enough about Ocaml FFI to immediately see anything wrong with the bindings.

@kLabz
Copy link
Contributor Author

kLabz commented Apr 19, 2024

Yes the file needs to be big enough to trigger that issue. The GC workaround makes me think we're allocating a bit too much there, and eval GC doesn't like while loops ? 🤔

@Simn
Copy link
Member

Simn commented Apr 22, 2024

I'm not very familiar with OCaml FFI either, but I always thought that only CAMLprim value kind of functions were supposed to have all this CAMLparam0 and such cruft. If that isn't the case then at the very least it looks suspicious that there's a param0 thing on a function that clearly has 3 parameters.

@Apprentice-Alchemist
Copy link
Contributor

That param0 thing is needed for that function to be able to declare value locals, but since it doesn't take any value parameters it doesn't need to use CAMLparam3.
https://ocaml.org/manual/5.1/intfc.html#sec503

@Simn
Copy link
Member

Simn commented Apr 22, 2024

Hmm, but it does have this vctx = (value)ctx; line which makes me wonder if ctx shouldn't be a value in the first place.

@Apprentice-Alchemist
Copy link
Contributor

Thinking about it some more, the real problem is probably that ctx is stored somewhere the GC can't see.

@Simn
Copy link
Member

Simn commented Apr 22, 2024

Yes I think that's what I'm trying to get at as well.

@Simn
Copy link
Member

Simn commented Apr 26, 2024

Update please!

@kLabz kLabz force-pushed the eval_ssl branch 2 times, most recently from 51b28f6 to 6bda9cc Compare April 29, 2024 20:34
@kLabz
Copy link
Contributor Author

kLabz commented Apr 29, 2024

Well.. locally it's worse than before (the workaround isn't working anymore), even if CI seems fine so far 🤔
There's something wrong with the mbedtls 3 impl I think, will investigate tomorrow.

@Apprentice-Alchemist
Copy link
Contributor

Works fine on my machine, with or without the workaround.
(Arch Linux, compiler linked with mbedtls 3)

@kLabz
Copy link
Contributor Author

kLabz commented Apr 30, 2024

Works fine on my machine, with or without the workaround. (Arch Linux, compiler linked with mbedtls 3)

You do enable https tests by commenting out this line, right?

#if !github false && #end // comment out line to run https tests locally

I'm on Arch Linux too and I get this with or without the workaround:

src/unit/TestMainNow.hx:6: Generated at: 2024-04-30 11:49:31
src/unit/TestMain.hx:31: START
malloc(): invalid size (unsorted)
[3]    2210161 IOT instruction (core dumped)  haxe compile-macro.hxml

(unless I use nightlies which are not compiled with mbedtls 3, then it's fine)

@Apprentice-Alchemist
Copy link
Contributor

Yes, I enabled https tests.

Does the core dump give a useful backtrace?

@kLabz
Copy link
Contributor Author

kLabz commented Apr 30, 2024

Doesn't give anything :/
Can't repro my issue on CI, but also can't make it work locally unless I force mbedtls 2..

Also, a similar fix might be needed for mac+hl for these tests to pass

@Apprentice-Alchemist
Copy link
Contributor

Regarding the HL+macOS failure, I don't think it's caused by GC issues or stuff like that.

I did a little bit of debugging using CI (https://github.com/Apprentice-Alchemist/hashlink/actions/runs/8907771743/job/24462210850)
The error is caused by mbedtls failing in TLS 1.3 stuff:
https://github.com/Mbed-TLS/mbedtls/blob/2ca6c285a0dd3f33982dd57299012dacab1ff206/library/ssl_tls13_generic.c#L1684-L1693

Not yet sure what the root cause is, but I suspect it might be due to some compile option getting enable by Homebrew somehow?

@Apprentice-Alchemist
Copy link
Contributor

Ah, figured it out.

Starting in MbedTLS 3.6 MBEDTLS_SSL_PROTO_TLS1_3 is enabled by default.
The TLS 1.3 code uses PSA Crypto, which requires psa_crypto_init to be called before using any mbedtls functions.

@kLabz
Copy link
Contributor Author

kLabz commented May 2, 2024

Ah nice, thanks! Would be nice to have a draft PR with your branch so that it could be done at some point (by you or someone else), and I could disable the test here in the meantime.

As for my local issue, as long as it's only happening for me I guess it's ok... :/

@Simn
Copy link
Member

Simn commented Jul 25, 2024

I have merged #11655, please update the branch!

@kLabz
Copy link
Contributor Author

kLabz commented Jul 29, 2024

I'm still getting malloc(): invalid size (unsorted) locally, which

  • is really annoying, since I do work with eval target (and it's worse than before for me since I can't work around the issue anymore)
  • tells me there's still something wrong even if CI isn't catching it

@kLabz kLabz changed the title [eval] https downloads failing [eval] https tests Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug help wanted platform-eval Everything related to the Haxe 4 eval macro interpreter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants