Resultify AVM #496
Conversation
let hand_mem = hand_mem.clone();
async move {
  if let OpcodeFn::Io(func) = i.opcode.fun {
    //eprintln!("{} {:?}", i.opcode._name, i.args);
These comments are for future debugging?
yes - I didn't remove it since it's been there for a while (and it could prove useful in the future)
avm/src/vm/protos/mod.rs
Outdated
@@ -1 +1,3 @@
// Dummy mod. Will be auto generated by protoc-rust
Hmmm, I'll think about how to avoid this and make it transparent for us
sounds good. I brought it back in the next commit, but I keep forgetting the 1 extra step after git add .
Undo these changes
/// Special events in alan found in standard library modules, @std.
/// The IDs for built-in events are negative to avoid collision with positive, custom event IDs.
/// The first hexadecimal byte of the ID in an integer is between 80 and FF
/// The remaining 7 bytes can be used for ASCII-like values
pub enum BuiltInEvents {
  /// Alan application start
  /// '"start"' in ASCII or 2273 7461 7274 22(80)
-  START,
+  START = -9213673853036498142,
What's going on here?
with this, we can use BuiltInEvents::START as i64
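The constant can be sanity-checked against the byte layout the doc comment describes. A small sketch (assuming the low 7 bytes hold the ASCII of '"start"' and the high byte is 0x80, i.e. little-endian byte order, matching the to_ne_bytes usage elsewhere in this diff):

```rust
/// Recomputes the START event ID from its documented byte layout:
/// the ASCII bytes of '"start"' in the low 7 bytes and 0x80 in the high byte,
/// which forces the i64 to be negative.
fn start_event_id() -> i64 {
    let bytes: [u8; 8] = [b'"', b's', b't', b'a', b'r', b't', b'"', 0x80];
    i64::from_le_bytes(bytes)
}
```

This reproduces the -9213673853036498142 literal in the diff, and being negative it can never collide with a positive user-defined event ID.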
  }
  Ok(())
} else {
This whole branch is known (to us) to never happen because of the way the opcode fragments work. It would be nice to assert this with the type system during fragment generation - perhaps another enum for the "fragment type" wrapping the vec, only allowing the particular opcode function enum we care about within each of those outer enums, so we could avoid this looping check here?
Not something for this PR, but it does bother me that we have to define an error path here that will never be triggered and that we have to pay a price doing these useless checks.
I think there's a way to unify the function types of the different opcode kinds, so then we can entirely get rid of these checks and only use the additional information for building up the instruction fragments. I imagine it'd be a struct with 2 fields, and running them would be much faster.
The new opcode defs would look like:
pub enum OpcodeKind {
CPU,
IO,
UnpredCPU,
}
pub struct Opcode {
name: &'static str,
kind: OpcodeKind,
exec: fn(&[i64], Arc<HandlerMemory>) -> Box<Pin<VMResult<Arc<HandlerMemory>>>>,
}
so execution can be done with purely an iterator (simplified):
handler.instructions
.iter()
.map(|fragment| {
fragment
.iter()
.map(|(op, instrs)| exec(&instrs, hmem.fork()))
.collect::<UnorderedFutures<VMResult<Vec<Arc<HandlerMemory>>>>>()
})
.collect::<OrderedFutures<VMResult<Vec<Arc<HandlerMemory>>>>>()
So the CPU-bound opcodes can return an event payload; they don't return the HandlerMemory. However, I think that weirdness is used only by the emitop opcode, so if we can refactor things to let event emission be done specially by that opcode and not require checking the return value, that'd be great. But what I'd prefer is that CPU opcodes still have a special return type, it's just that it's now void, and we don't have to do any conditional checking on every single CPU opcode call, which is probably a significant perf impact.
avm/src/vm/http.rs
Outdated
@@ -57,7 +57,7 @@ macro_rules! make_server {
  let port_num = http.port;
  let addr = std::net::SocketAddr::from(([0, 0, 0, 0], port_num));
  let make_svc = hyper::service::make_service_fn(|_conn| async {
-    Ok::<_, std::convert::Infallible>(hyper::service::service_fn($listener))
+    Ok::<_, $crate::vm::VMError>(hyper::service::service_fn($listener))
Why this change?
the service function could return a VMError since it has to work with the memory of the event that handles each request, so I allow the hyper service to propagate such errors - although we could handle the error directly in the service function?
So for the web server I would prefer that it is infallible, and if it encounters any VMError it returns a 500 Internal Server Error to the end user with some sort of short message indicating an unexpected failure within the AVM itself? Simply crashing out will make it harder to track down what's going wrong, I think.
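The suggestion above can be sketched without pulling in hyper itself. This is a hypothetical, simplified stand-in (the `VMError`, `Response`, and handler names here are illustrative, not the project's actual types): the fallible handler is wrapped so the service layer stays infallible and any VMError becomes a 500 response instead of a crash.

```rust
/// Illustrative stand-ins for the VM error type and an HTTP response.
#[derive(Debug)]
enum VMError {
    Other(String),
}

struct Response {
    status: u16,
    body: String,
}

/// A request handler that can fail with a VMError (simplified: keyed on path).
fn fallible_handler(path: &str) -> Result<Response, VMError> {
    if path == "/boom" {
        Err(VMError::Other("handler memory unavailable".to_string()))
    } else {
        Ok(Response { status: 200, body: "ok".to_string() })
    }
}

/// The infallible wrapper: never propagates VMError to the server layer,
/// mapping any failure to a 500 with a short diagnostic message.
fn infallible_handler(path: &str) -> Response {
    fallible_handler(path).unwrap_or_else(|e| Response {
        status: 500,
        body: format!("unexpected failure within the AVM: {:?}", e),
    })
}
```

With this shape, the `make_service_fn` closure could keep `std::convert::Infallible` as its error type while still surfacing VM failures to the client.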
@@ -136,15 +144,19 @@ impl HandlerMemory {
  event_id: i64,
  curr_addr: i64,
  curr_hand_mem: &Arc<HandlerMemory>,
-) -> Option<Arc<HandlerMemory>> {
-  let pls = Program::global().event_pls.get(&event_id).unwrap().clone();
+) -> VMResult<Option<Arc<HandlerMemory>>> {
We're wrapping another layer around HandlerMemory here. You know when we want to improve performance we're going to be undoing a lot of this and going back to "we have thoroughly tested/audited our code so we don't have to pay a runtime penalty for this safety" :/
that makes sense, although many of these checks are happening anyway when we call unwrap or expect. This is just to allow us to prevent these panics and recover from them if the daemon is running.
let s_bytes = f
  .block
  .iter()
  .skip(1)
  // TODO: `IntoIter::new` will be deprecated once `rust-lang/rust#65819` is merged
  .flat_map(|(_, chars)| IntoIter::new(chars.to_ne_bytes()))
  .collect::<Vec<u8>>();
Why switch from the for in loop to this if it depends on discouraged usage of a built-in type?
the for in loop from before had a non-constant number of allocations - this iterator implementation has 1 allocation total, guaranteed (iter() and skip don't allocate, IntoIter::new recasts the array from chars.to_ne_bytes to an iterator without allocating, and collect makes the single allocation for the output Vec)
(side note: impl IntoIterator for [T; N] will be stabilized soon, but in the current edition it'll actually call the IntoIterator impl for slices, which would result in dropping the array while returning an iterator over a slice of it - a compiler error. In edition 2021, this will delegate to array::IntoIter, which won't be deprecated)
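The single-allocation chain discussed above can be sketched in edition-2021 style, where arrays are IntoIterator by value and `std::array::IntoIter::new` is no longer needed. `block` here is a hypothetical stand-in for the string fractal's (index, chars) pairs; the first entry is skipped as in the original code.

```rust
/// Flattens the i64 "chars" entries of a block into their native-endian bytes.
/// The only heap allocation is the Vec produced by collect; skip and flat_map
/// are lazy adapters, and [u8; 8] iterates by value in edition 2021.
fn flatten_chars(block: &[(i64, i64)]) -> Vec<u8> {
    block
        .iter()
        .skip(1) // skip the leading length/metadata entry
        .flat_map(|(_, chars)| chars.to_ne_bytes())
        .collect()
}
```

A for-in loop pushing bytes one at a time could reallocate the Vec several times as it grows; this form lets collect size the single allocation from the iterator.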
cpu!(i8f64 => fn(args, hand_mem) {
-  let out = hand_mem.read_fixed(args[0]) as f64;
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  let out = hand_mem.read_fixed(args[0])? as f64;
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
});
cpu!(i16f64 => fn(args, hand_mem) {
-  let out = hand_mem.read_fixed(args[0]) as f64;
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  let out = hand_mem.read_fixed(args[0])? as f64;
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
});
cpu!(i32f64 => fn(args, hand_mem) {
-  let out = hand_mem.read_fixed(args[0]) as f64;
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  let out = hand_mem.read_fixed(args[0])? as f64;
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
});
cpu!(i64f64 => fn(args, hand_mem) {
-  let out = hand_mem.read_fixed(args[0]) as f64;
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  let out = hand_mem.read_fixed(args[0])? as f64;
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
});
cpu!(f32f64 => fn(args, hand_mem) {
-  let out = f32::from_ne_bytes((hand_mem.read_fixed(args[0]) as i32).to_ne_bytes());
-  hand_mem.write_fixed(args[2], i32::from_ne_bytes(out.to_ne_bytes()) as i64);
-  None
+  let out = f32::from_ne_bytes((hand_mem.read_fixed(args[0])? as i32).to_ne_bytes());
+  hand_mem.write_fixed(args[2], i32::from_ne_bytes(out.to_ne_bytes()) as i64)?;
+  Ok(None)
});
cpu!(strf64 => fn(args, hand_mem) {
-  let s = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[0]));
+  let s = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[0])?)?;
  let out: f64 = s.parse().unwrap();
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
});
cpu!(boolf64 => fn(args, hand_mem) {
-  let out = hand_mem.read_fixed(args[0]) as f64;
-  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()));
-  None
+  let out = hand_mem.read_fixed(args[0])? as f64;
+  hand_mem.write_fixed(args[2], i64::from_ne_bytes(out.to_ne_bytes()))?;
+  Ok(None)
These opcodes sound very confused and unsure of themselves "read fixed argument 0? I guess? Then write it as a bool to argument 2? Ok, nothing else."
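The from_ne_bytes/to_ne_bytes pairing in these opcodes is a bit-cast: the f64's bit pattern is stored unchanged in an i64 memory cell, and the same cast recovers it exactly on the way out. A minimal round-trip sketch (the helper names here are illustrative, not from the codebase):

```rust
/// Bit-casts an f64 into an i64 memory cell, as the conversion opcodes do:
/// the 8 bytes are reinterpreted, not numerically converted.
fn f64_to_cell(x: f64) -> i64 {
    i64::from_ne_bytes(x.to_ne_bytes())
}

/// Recovers the original f64 from the i64 cell.
fn cell_to_f64(cell: i64) -> f64 {
    f64::from_ne_bytes(cell.to_ne_bytes())
}
```

Note that `f64_to_cell(42.5)` is a large-looking integer (the IEEE 754 bit pattern), not 42, which is why the opcodes that want a numeric conversion use `as f64` first and only bit-cast for storage.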
-let b_regex = Regex::new(&b_str).unwrap();
  let a_str = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[0])?)?;
  let b_str = HandlerMemory::fractal_to_string(hand_mem.read_fractal(args[1])?)?;
+let b_regex = Regex::new(&b_str).map_err(|regex_err| VMError::Other(format!("Bad regex construction: {}", regex_err)))?;
This is good, but makes me wonder what we can/should do at compile time for this. A bad regex should be detectable during the compilation phase.
unfortunately not :/ regex does some heap allocations that can't be done at compile time* - see https://docs.rs/regex/1.4.5/src/regex/exec.rs.html
* allocations in const fns are planned, per rust-lang/const-eval#20
Oh, I was thinking something like "Should we have a fake rust file that we inject the users' regex into and test if it parses correctly, and just emit an error during compilation when the regex string is static".
But maybe the correct answer here is: "the regex API in Alan needs to be changed to be a Result and we return an Error to the user if it fails to build correctly"
ah I see - we could also ship ripgrep with our compiler binary, and when we detect regex usage we
- create an empty file
- run rg on that empty file using the user's regex
- continue compiling if the exit code is 0 or throw a compiler error if the exit code is non-zero
We could also shave regex off our dependency tree and save some space in the binary size by just shipping ripgrep in the avm regardless of whether the compiler is there, and every time the user wants to do a regex query we do the same but on a file containing the input.
In all seriousness though, we might be able to pre-compile the Regex in the compiler by calling regex as a wasm module, store the object as a Buffer, and include it in the aga/agc output, but that would take a solid amount of time to get working...
actually it's not worth it to store the memory representation of regex::Regex (to get past the platform differences you'd have to include a wasm runtime - which could be absolutely minimal - to compile the wasm memory to native memory, but the issue lies in Rust's lack of a stable ABI), but we could still run it from wasm to validate the input
I think all of this makes it clear that we need to change the regex-related APIs in Alan to be Result-based so it can provide the failure at runtime. At least the user will need to have defined some code to handle what happens if it can't parse the regex, though that is almost certainly going to be a logic bug. :/
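The map_err pattern from the diff generalizes to any fallible constructor. A dependency-free sketch of the same shape, with a hypothetical stand-in VMError and i64 parsing in place of the external regex crate (the names here are illustrative, mirroring but not copied from the codebase):

```rust
/// Illustrative stand-in for the VM's error type.
#[derive(Debug, PartialEq)]
enum VMError {
    Other(String),
}

/// A fallible "pattern" constructor: the library error is mapped into the
/// VM's error type with map_err, so the caller can propagate it with `?`
/// instead of panicking on unwrap.
fn parse_pattern(src: &str) -> Result<i64, VMError> {
    src.parse::<i64>()
        .map_err(|e| VMError::Other(format!("Bad pattern construction: {}", e)))
}
```

A Result-based Alan API would surface exactly this Err value to user code, rather than crashing the runtime on a bad pattern.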
-run_file(&fp, false).await;
+if let Err(ee) = run_file(&fp, false).await {
+  eprintln!("{}", ee);
+  std::process::exit(2);
Why exit code 2 and not 1?
that was what we were using in the majority of cases where we did eprintln!() and then exit()
let event_emit = EventEmit {
  id: i64::from(BuiltInEvents::HTTPCONN),
  payload: Some(event),
};
-let event_tx = EVENT_TX.get().unwrap();
+let event_tx = EVENT_TX.get().ok_or(VMError::ShutDown)?;
Hmmm interesting, does this work as unwrap_or?
no, since ok_or still returns the Result, whereas unwrap_or is the same as getOr in Alan (if it's an Err, returns the given default value instead)
Got it
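The distinction can be shown in a few lines. ok_or turns an Option into a Result (so `?` can propagate the error, as in the diff above), while unwrap_or consumes the Option and substitutes a default value:

```rust
/// Option -> Result conversion: None becomes the supplied error,
/// so the caller can use `?` to propagate it.
fn to_result(opt: Option<i64>) -> Result<i64, &'static str> {
    opt.ok_or("shut down")
}

/// Option -> value with a fallback: None becomes the default,
/// and no error ever surfaces.
fn with_default(opt: Option<i64>) -> i64 {
    opt.unwrap_or(0)
}
```

So `EVENT_TX.get().ok_or(VMError::ShutDown)?` short-circuits the enclosing function with Err(VMError::ShutDown) when the sender is gone, rather than silently continuing with a default.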
match http_listener(req).await {
  Ok(res) => Ok(res),
  Err(_) => {
    // TODO: log the error?
the chances of going through this branch are quite small, right? Since it would represent an important error that's difficult to track, it might be interesting to log it
yes, but I'll let the daemon error-catching PR handle that
LGTM once all tests are passing (only the AVM style right now?) and we verify there is no change in behavior/functionality for the AVM
@cdmistman it appears that there was another accidental behavior change here, the double-
As mentioned offline, this is from observing the results from the JS runtime, indicating that there isn't a change in behavior, but rather a race condition in our http tests.
fixes #472