Resultify AVM #496
Conversation
let hand_mem = hand_mem.clone();
async move {
  if let OpcodeFn::Io(func) = i.opcode.fun {
    //eprintln!("{} {:?}", i.opcode._name, i.args);
These comments are for future debugging?
yes - I didn't remove it since it's been there for a while (and it could prove useful in the future)
avm/src/vm/protos/mod.rs
Outdated
@@ -1 +1,3 @@
// Dummy mod. Will be auto generated by protoc-rust
Hmmm, I'll think about how to avoid this and make it transparent for us
sounds good. I brought it back in the next commit, but I keep forgetting the 1 extra step after git add .
Undo these changes
/// Special events in alan found in standard library modules, @std.
/// The IDs for built-in events are negative to avoid collision with positive, custom event IDs.
/// The first hexadecimal byte of the ID in an integer is between 80 and FF
/// The remaining 7 bytes can be used for ASCII-like values
pub enum BuiltInEvents {
  /// Alan application start
  /// '"start"' in ASCII or 2273 7461 7274 22(80)
-  START,
+  START = -9213673853036498142,
What's going on here?
with this, we can use BuiltInEvents::START as i64
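The constant can be sanity-checked against the byte layout the doc comment describes. A small sketch (assuming the low 7 bytes hold the ASCII of '"start"' and the high byte is 0x80, i.e. little-endian byte order, matching the to_ne_bytes usage elsewhere in this diff):

```rust
/// Recomputes the START event ID from its documented byte layout:
/// the ASCII bytes of '"start"' in the low 7 bytes and 0x80 in the high byte,
/// which forces the i64 to be negative.
fn start_event_id() -> i64 {
    let bytes: [u8; 8] = [b'"', b's', b't', b'a', b'r', b't', b'"', 0x80];
    i64::from_le_bytes(bytes)
}
```

This reproduces the -9213673853036498142 literal in the diff, and being negative it can never collide with a positive user-defined event ID.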
  }
  Ok(())
} else {
This whole branch is known (to us) to never happen because of the way the opcode fragments work. It would be nice to assert this with the type system during fragment generation - perhaps another enum for the "fragment type" wrapping the vec, only allowing the particular opcode function enum we care about within each of those outer enums, so we could avoid this looping check here?
Not something for this PR, but it does bother me that we have to define an error path here that will never be triggered and that we have to pay a price doing these useless checks.
I think there's a way to unify the function types of the different opcode kinds, so then we can entirely get rid of these checks and only use the additional information for building up the instruction fragments. I imagine it'd be a struct with 2 fields, and running them would be much faster.
The new opcode defs would look like:
pub enum OpcodeKind {
CPU,
IO,
UnpredCPU,
}
pub struct Opcode {
name: &'static str,
kind: OpcodeKind,
exec: fn(&[i64], Arc<HandlerMemory>) -> Box<Pin<VMResult<Arc<HandlerMemory>>>>,
}
so execution can be done with purely an iterator (simplified):
handler.instructions
.iter()
.map(|fragment| {
fragment
.iter()
.map(|(op, instrs)| exec(&instrs, hmem.fork()))
.collect::<UnorderedFutures<VMResult<Vec<Arc<HandlerMemory>>>>>()
})
.collect::<OrderedFutures<VMResult<Vec<Arc<HandlerMemory>>>>>()
So the CPU-bound opcodes can return an event payload; they don't return the HandlerMemory. However, I think that weirdness is used only by the emitop opcode, so if we can refactor things to let event emission be done specially by that opcode and not require checking the return value, that'd be great. But what I'd prefer is that CPU opcodes still have a special return type, it's just that it's now void, and we don't have to do any conditional checking on every single CPU opcode call, which is probably a significant perf impact.
avm/src/vm/http.rs
Outdated
@@ -57,7 +57,7 @@ macro_rules! make_server {
  let port_num = http.port;
  let addr = std::net::SocketAddr::from(([0, 0, 0, 0], port_num));
  let make_svc = hyper::service::make_service_fn(|_conn| async {
-    Ok::<_, std::convert::Infallible>(hyper::service::service_fn($listener))
+    Ok::<_, $crate::vm::VMError>(hyper::service::service_fn($listener))
Why this change?
the service function could return a VMError since it has to work with the memory of the event that handles each request, so I allow the hyper service to propagate such errors - although we could handle the error directly in the service function?
So for the web server I would prefer that it is infallible, and if it encounters any VMError it returns a 500 Internal Server Error to the end user with some sort of short message indicating an unexpected failure within the AVM itself? Simply crashing out will make it harder to track down what's going wrong, I think.
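The suggestion above can be sketched without pulling in hyper itself. This is a hypothetical, simplified stand-in (the `VMError`, `Response`, and handler names here are illustrative, not the project's actual types): the fallible handler is wrapped so the service layer stays infallible and any VMError becomes a 500 response instead of a crash.

```rust
/// Illustrative stand-ins for the VM error type and an HTTP response.
#[derive(Debug)]
enum VMError {
    Other(String),
}

struct Response {
    status: u16,
    body: String,
}

/// A request handler that can fail with a VMError (simplified: keyed on path).
fn fallible_handler(path: &str) -> Result<Response, VMError> {
    if path == "/boom" {
        Err(VMError::Other("handler memory unavailable".to_string()))
    } else {
        Ok(Response { status: 200, body: "ok".to_string() })
    }
}

/// The infallible wrapper: never propagates VMError to the server layer,
/// mapping any failure to a 500 with a short diagnostic message.
fn infallible_handler(path: &str) -> Response {
    fallible_handler(path).unwrap_or_else(|e| Response {
        status: 500,
        body: format!("unexpected failure within the AVM: {:?}", e),
    })
}
```

With this shape, the `make_service_fn` closure could keep `std::convert::Infallible` as its error type while still surfacing VM failures to the client.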
@@ -136,15 +144,19 @@ impl HandlerMemory {
  event_id: i64,
  curr_addr: i64,
  curr_hand_mem: &Arc<HandlerMemory>,
-) -> Option<Arc<HandlerMemory>> {
-  let pls = Program::global().event_pls.get(&event_id).unwrap().clone();
+) -> VMResult<Option<Arc<HandlerMemory>>> {
We're wrapping another layer around HandlerMemory here. You know when we want to improve performance we're going to be undoing a lot of this and going back to "we have thoroughly tested/audited our code so we don't have to pay a runtime penalty for this safety" :/
that makes sense, although many of these checks are happening anyway when we call unwrap or expect. This is just to allow us to prevent these panics and recover from them if the daemon is running.
let s_bytes = f
  .block
  .iter()
  .skip(1)
  // TODO: `IntoIter::new` will be deprecated once `rust-lang/rust#65819` is merged
  .flat_map(|(_, chars)| IntoIter::new(chars.to_ne_bytes()))
  .collect::<Vec<u8>>();
Why switch from the for in loop to this if it depends on discouraged usage of a built-in type?
the for in loop from before had a non-constant number of allocations - this iterator implementation has 1 allocation total, guaranteed (iter() and skip don't allocate, IntoIter::new recasts the array from chars.to_ne_bytes to an iterator without allocating, and collect makes the single allocation for the output Vec)
(side note: impl IntoIterator for [T; N] will be stabilized soon, but in the current edition it'll actually call the IntoIterator impl for slices, which would result in dropping the array while returning an iterator over a slice of it - a compiler error. In edition 2021, this will delegate to array::IntoIter, which won't be deprecated)
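The single-allocation chain discussed above can be sketched in edition-2021 style, where arrays are IntoIterator by value and `std::array::IntoIter::new` is no longer needed. `block` here is a hypothetical stand-in for the string fractal's (index, chars) pairs; the first entry is skipped as in the original code.

```rust
/// Flattens the i64 "chars" entries of a block into their native-endian bytes.
/// The only heap allocation is the Vec produced by collect; skip and flat_map
/// are lazy adapters, and [u8; 8] iterates by value in edition 2021.
fn flatten_chars(block: &[(i64, i64)]) -> Vec<u8> {
    block
        .iter()
        .skip(1) // skip the leading length/metadata entry
        .flat_map(|(_, chars)| chars.to_ne_bytes())
        .collect()
}
```

A for-in loop pushing bytes one at a time could reallocate the Vec several times as it grows; this form lets collect size the single allocation from the iterator.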
cpu!(i8f64 => fn(args, hand_mem) {
-  let out = hand_mem.read_fixed(args[0]) as f64;
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  let out = hand_mem.read_fixed(args[0])? as f64;
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
});
cpu!(i16f64 => fn(args, hand_mem) {
-  let out = hand_mem.read_fixed(args[0]) as f64;
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  let out = hand_mem.read_fixed(args[0])? as f64;
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
});
cpu!(i32f64 => fn(args, hand_mem) {
-  let out = hand_mem.read_fixed(args[0]) as f64;
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  let out = hand_mem.read_fixed(args[0])? as f64;
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
});
cpu!(i64f64 => fn(args, hand_mem) {
-  let out = hand_mem.read_fixed(args[0]) as f64;
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  let out = hand_mem.read_fixed(args[0])? as f64;
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
});
cpu!(f32f64 => fn(args, hand_mem) {
-  let out = f32::from_ne_bytes((hand_mem.read_fixed(args[0]) as i32).to_ne_bytes());
-  hand_mem.write_fixed(args[2], i32::from_ne_bytes(out.to_ne_bytes()) as i64);
-  None
+  let out = f32::from_ne_bytes((hand_mem.read_fixed(args[0])? as i32).to_ne_bytes());
+  hand_mem.write_fixed(args[2], i32::from_ne_bytes(out.to_ne_bytes()) as i64)?;
+  Ok(None)
});
cpu!(strf64 => fn(args, hand_mem) {
-  let s = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[0]));
+  let s = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[0])?)?;
  let out: f64 = s.parse().unwrap();
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
});
cpu!(boolf64 => fn(args, hand_mem) {
-  let out = hand_mem.read_fixed(args[0]) as f64;
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  let out = hand_mem.read_fixed(args[0])? as f64;
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
These opcodes sound very confused and unsure of themselves "read fixed argument 0? I guess? Then write it as a bool to argument 2? Ok, nothing else."
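The from_ne_bytes/to_ne_bytes pairing in these opcodes is a bit-cast: the f64's bit pattern is stored unchanged in an i64 memory cell, and the same cast recovers it exactly on the way out. A minimal round-trip sketch (the helper names here are illustrative, not from the codebase):

```rust
/// Bit-casts an f64 into an i64 memory cell, as the conversion opcodes do:
/// the 8 bytes are reinterpreted, not numerically converted.
fn f64_to_cell(x: f64) -> i64 {
    i64::from_ne_bytes(x.to_ne_bytes())
}

/// Recovers the original f64 from the i64 cell.
fn cell_to_f64(cell: i64) -> f64 {
    f64::from_ne_bytes(cell.to_ne_bytes())
}
```

Note that `f64_to_cell(42.5)` is a large-looking integer (the IEEE 754 bit pattern), not 42, which is why the opcodes that want a numeric conversion use `as f64` first and only bit-cast for storage.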
-let b_regex = Regex::new(&b_str).unwrap();
  let a_str = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[0])?)?;
  let b_str = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[1])?)?;
+let b_regex = Regex::new(&b_str).map_err(|regex_err| VMError::Other(format!("Bad regex construction: {}", regex_err)))?;
This is good, but makes me wonder what we can/should do at compile time for this. A bad regex should be detectable during the compilation phase.
unfortunately not :/ regex does some heap allocations that can't be done at compile time* - see https://docs.rs/regex/1.4.5/src/regex/exec.rs.html
* allocations in const fns are planned, per rust-lang/const-eval#20
Oh, I was thinking something like "Should we have a fake rust file that we inject the users' regex into and test if it parses correctly, and just emit an error during compilation when the regex string is static".
But maybe the correct answer here is: "the regex API in Alan needs to be changed to be a Result and we return an Error to the user if it fails to build correctly"
ah I see - we could also ship ripgrep with our compiler binary, and when we detect regex usage we
- create an empty file
- run rg on that empty file using the user's regex
- continue compiling if the exit code is 0 or throw a compiler error if the exit code is non-zero
We could also shave regex off our dependency tree and save some space in the binary size by just shipping ripgrep in the avm regardless of whether the compiler is there, and every time the user wants to do a regex query we do the same but on a file containing the input.
In all seriousness though, we might be able to pre-compile the Regex in the compiler by calling regex as a wasm module, store the object as a Buffer, and include it in the aga/agc output, but that would take a solid amount of time to get working...
actually it's not worth it to store the memory representation of regex::Regex (to get past the platform differences you'd have to include a wasm runtime - which could be absolutely minimal - to compile the wasm memory to native memory, but the issue lies in Rust's lack of a stable ABI), but we could still run it from wasm to validate the input
I think all of this makes it clear that we need to change the regex-related APIs in Alan to be Result-based so it can provide the failure at runtime. At least the user will need to have defined some code to handle what happens if it can't parse the regex, though that is almost certainly going to be a logic bug. :/
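The map_err pattern from the diff generalizes to any fallible constructor. A dependency-free sketch of the same shape, with a hypothetical stand-in VMError and i64 parsing in place of the external regex crate (the names here are illustrative, mirroring but not copied from the codebase):

```rust
/// Illustrative stand-in for the VM's error type.
#[derive(Debug, PartialEq)]
enum VMError {
    Other(String),
}

/// A fallible "pattern" constructor: the library error is mapped into the
/// VM's error type with map_err, so the caller can propagate it with `?`
/// instead of panicking on unwrap.
fn parse_pattern(src: &str) -> Result<i64, VMError> {
    src.parse::<i64>()
        .map_err(|e| VMError::Other(format!("Bad pattern construction: {}", e)))
}
```

A Result-based Alan API would surface exactly this Err value to user code, rather than crashing the runtime on a bad pattern.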
-run_file(&fp, false).await;
+if let Err(ee) = run_file(&fp, false).await {
+  eprintln!("{}", ee);
+  std::process::exit(2);
Why exit code 2 and not 1?
that was what we were using in the majority of cases where we did eprintln!() and then exit()
let event_emit = EventEmit {
  id: i64::from(BuiltInEvents::HTTPCONN),
  payload: Some(event),
};
-let event_tx = EVENT_TX.get().unwrap();
+let event_tx = EVENT_TX.get().ok_or(VMError::ShutDown)?;
Hmmm interesting, does this work as unwrap_or?
no, since ok_or still returns the Result, whereas unwrap_or is the same as getOr in Alan (if it's an Err, returns the given default value instead)
Got it
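The distinction can be shown in a few lines. ok_or turns an Option into a Result (so `?` can propagate the error, as in the diff above), while unwrap_or consumes the Option and substitutes a default value:

```rust
/// Option -> Result conversion: None becomes the supplied error,
/// so the caller can use `?` to propagate it.
fn to_result(opt: Option<i64>) -> Result<i64, &'static str> {
    opt.ok_or("shut down")
}

/// Option -> value with a fallback: None becomes the default,
/// and no error ever surfaces.
fn with_default(opt: Option<i64>) -> i64 {
    opt.unwrap_or(0)
}
```

So `EVENT_TX.get().ok_or(VMError::ShutDown)?` short-circuits the enclosing function with Err(VMError::ShutDown) when the sender is gone, rather than silently continuing with a default.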
match http_listener(req).await {
  Ok(res) => Ok(res),
  Err(_) => {
    // TODO: log the error?
the chances of going through this branch are quite small, right? Since it would represent an important error that's difficult to track, it might be interesting to log it
yes, but I'll let the daemon error-catching PR handle that
LGTM once all tests are passing (only the AVM style right now?) and we verify there is no change in behavior/functionality for the AVM
@cdmistman it appears that there was another accidental behavior change here, the double-
As mentioned offline, this is from observing the results from the JS runtime, indicating that there isn't a change in behavior, but rather a race condition in our http tests.
fixes #472