engine: introduce executor for richer input #2607

bassosimone · 2023-10-24T08:56:33Z

In ooni/ooni.org#1295, I suggested to take the following JSON as the smallest executable unit:

{
  "input": "https://dns.google/dns-query",
  "options": {"HTTP3Enabled": true},
  "test_name": "dnscheck"
}

This issue is about writing (well, most likely, refactoring) code to make it possible. A good (but minor) question to ask is whether the structure should be as indicated above, or whether we should instead use:

{
  "inputs": ["https://dns.google/dns-query"],
  "options": {"HTTP3Enabled": true},
  "test_name": "dnscheck"
}

The latter would probably map more cleanly with how we actually execute experiments. However, it would probably be more tidy from the logical model point of view to have an executor that takes in input a list of the former structure rather than one that takes in input a single instance of the latter structure. As a minor, but useful improvement, using the former structure would allow us to get rid of the annoying issue (originating from Measurement Kit) that to execute an input less test you need to supply it as input []string{""}, otherwise it would not run.

The text was updated successfully, but these errors were encountered:

While starting to work on richer input, I realised that it would be very handy to reserve the name "target" to describe a richer input target as the tuple containing options and some input. However, using such a naming would clash with the definition of MeasurementTarget, which is a string definition that is only there to say semantically that the Input field of a measurement is the thing that we should really measure. Therefore, for making semantic space for targets, I am going to refactor MeasurementTarget to be MeasurementInput. No functional change. Part of ooni/probe#2607

While starting to work on richer input, I realised that it would be very handy to reserve the name "target" to describe a richer input target as the tuple containing options and some input. However, using such a naming would ~clash with the definition of MeasurementTarget, which is a string definition that is only there to say semantically that the Input field of a measurement is the "thing" that we measure, as well as to apply a specific JSON serialization policy that is consistent with the historical behavior of OONI Probe. Therefore, for making semantic space for a richer input target, I am going to refactor MeasurementTarget to be MeasurementInput. No functional change. Part of ooni/probe#2607.

This diff removes an unused argument from a testing hook. Removing this argument simplifies the implementation of the richer input patches and does not change the behavior in any way since the argument wasn't used. Part of ooni/probe#2607.

This diff refactors *engine.experiment to make the report field goroutine safe. It also moves at the bottom of experiment.go code that was intermixed with *engine.experiment methods. Part of ooni/probe#2607

We originally developed asynchronous experiments as a mean to support websteps, a nettest that returned more than one measurement per invocation of its Measure method. Since then, we removed websteps. Therefore, this code is currently technically unused. Additionally, this code further complicates implementing richer input, because it is another way of performing measurements. As such, in the interest of switfly moving forward with richer input _and_ of simplifying the engine, we are now removing this unused functionality from the tree. Part of ooni/probe#2607

This diff refactors `*engine.experiment` to make the report field goroutine safe. It also moves at the bottom of experiment.go code that was intermixed with `*engine.experiment` methods. Part of ooni/probe#2607

We originally developed asynchronous experiments as a mean to support websteps, a nettest that returned more than one measurement per invocation of its Measure method. Since then, we removed websteps. Therefore, this code is currently technically unused. Additionally, this code further complicates implementing richer input, because it is another way of performing measurements. As such, in the interest of switfly moving forward with richer input _and_ of simplifying the engine, we are now removing this unused functionality from the tree. While there, better the documentation of OnProgress and undeprecate functions to open/close reports since it's clear that, for now, using a submitter in cmd/ooniprobe is a bit of a stretch given our current goals. So, it feels best to avoid deprecating until we have nice plan for replacement. To have more confidence that the job we did is ~correct, we use the following table to cross compare the operations that we previously performed for running sync experiments (i.e., all experiments) in an async way to the code we're using after this diff to run experiments. (Note that the diff itself is such that you can easily see the deleted and the added code inside of the `experiment.go` file.) In both cases, we're looking at the operations performed starting from `MeasureWithContext`. | Operation | Before | After | | --------------------------- | ------ | ----- | | MaybeLookupLocationContext | yes | yes | | use e.session.byteCounter | yes | yes | | use e.byteCounter | yes | yes | | newMeasurement | yes | yes | | save start time | yes | yes | | initialize experiment args | yes | yes | | call measurer.Run | yes | yes | | save end time | yes | yes | | compute measurement runtime | yes | yes | | scrub measurement | yes | yes | | return measurement | yes | yes | Part of ooni/probe#2607 Depends on #1612

The current structure of the `*registry.Factory` registry is that we register an experiment like this: ```Go func init() { AllExperiments["dnscheck"] = &Factory{ build: func(config interface{}) model.ExperimentMeasurer { return dnscheck.NewExperimentMeasurer( *config.(*dnscheck.Config), ) }, config: &dnscheck.Config{}, enabledByDefault: true, inputPolicy: model.InputOrStaticDefault, } } ``` Then, when we're setting options, we're modifying the `config` directly with code like this: ```Go // SetOptionAny sets an option given any value. func (b *Factory) SetOptionAny(key string, value any) error { field, err := b.fieldbyname(b.config, key) if err != nil { return err } switch field.Kind() { case reflect.Int64: return b.setOptionInt(field, value) case reflect.Bool: return b.setOptionBool(field, value) case reflect.String: return b.setOptionString(field, value) default: return fmt.Errorf("%w: %T", ErrUnsupportedOptionType, value) } } ``` Finally, we pass the modified config to a new experiment when we're creating it: ```Go func (b *Factory) NewExperimentMeasurer() model.ExperimentMeasurer { return b.build(b.config) } ``` This means that, if we run two back experiments that both require options, we're going to always reuse the same option structure. This feels wrong. We should instead construct a new factory each time, so we start from empty options: ```Go func init() { AllExperiments["dnscheck"] = func() *Factory { return &Factory{ build: func(config interface{}) model.ExperimentMeasurer { return dnscheck.NewExperimentMeasurer( *config.(*dnscheck.Config), ) }, config: &dnscheck.Config{}, enabledByDefault: true, inputPolicy: model.InputOrStaticDefault, } } } ``` This diff applies this very simple mechanical change. Note that so far we were ~good with the current codebase because we don't use options much and generally we invoke each experiment just once per run. This is going to change with OONI Run v2 and richer input. Therefore, it makes sense to tackle this issue now in the context of ooni/probe#2607.

This diff completes the set of preliminary richer input diffs. We build the TargetLoader using the ExperimentBuilder, which in turn uses a registry.Factory under the hood. This means that we can load targets for each experiment. Part of ooni/probe#2607

This diff completes the set of preliminary richer input diffs. We build the TargetLoader using the ExperimentBuilder, which in turn uses a registry.Factory under the hood. This means that we can load targets for each experiment, because the actual implementation of the TargetLoader can be experiment dependent. Part of ooni/probe#2607

This diff simplifies test code in pkg/oonimkall in preparation for further richer-input related changes. Part of ooni/probe#2607

This diff: 1. modifies `./internal/registry` and its `dnscheck.go` file such that `dnscheck` has its own private target loader; 2. modifies `./internal/targetloading` to allow reusing the code to read static input from CLI and from files; 3. modifies `./internal/model` to pass optional richer input to experiments; 4. modifies `./internal/engine` to pass richer input to experiments; 5. modifies `./internal/experiment/dnscheck` to read and use the richer input it is passed; 6. implements a custom target loader for `./internal/experiment/dnscheck` returning some static entries; 7. modifies `./internal/oonirun` to make sure richer-input experiments honor their options by moving option loading before the loading of targets. Once this diff is merged, `miniooni` and `ooniprobe` will run `dnscheck` with richer input. Obviously, we are still lacking an API to fetch richer input, so for now this is static richer input, suitable as a starting point to land more diffs. Part of ooni/probe#2607.

This diff simplifies test code in `pkg/oonimkall` in preparation for further richer-input related changes. Part of ooni/probe#2607. While there (a) realize we don't need anymore to init the example experiment name in its factory constructor, so normalize its construction to be equal to all the other experiments; (b) realize that the `TestTaskRunnerRun` does not require the network and is ~short, so there's no point to skip it when in short mode; (c) avoid using the deprecated `legacy/mockable` package and instead use the `mocks` package.

@DecFox

This commit adds a living design document for richer input. The intent for the final document is to explain the problem we wanted to solve, the alternatives we considered, and how we specifically implemented it. For now, the intent is to collect rough notes about what we have done so far and what to do next, such that @DecFox and I can synchronize on it and continue working on it, possibly also with input from @ainghazal. Part of ooni/probe#2607

bassosimone · 2024-06-07T11:04:42Z

We introduced the executor, even though we didn't name it explicitly. See ooni/probe-cli#1621 for details. Compare to what we originally thought, here's what ended up being different:

Rather than having a global executor, each experiment package implements its own executor using support libraries that we refactored for this purpose. This design is more flexible because we reduce the pressure on the common framework.
An ExperimentMeasurer that is richer input aware executes the smallest possible unit, but the framework around it is still executing on the OONI-Run-v2-like data structure because that is more convenient as it fits better with how the code is currently written.
The way in which we implemented the changes is that the common framework is ready and we have dnscheck to show us that it is possible to have richer-input experiments. We'll gradually migrate also other experiments.

The latter design choice, in particular, reduced friction for integrating with miniooni and ooniprobe a lot.

bassosimone added enhancement improving existing code or new feature priority/high ooni/probe-engine funder/drl2022-2024 labels Oct 24, 2023

bassosimone self-assigned this Oct 24, 2023

This was referenced Oct 24, 2023

miniooni: refactor to use the richer input executor #2609

Closed

ooniprobe: refactor to use the richer input executor #2610

Closed

bassosimone changed the title ~~engine: introduce runner for richer input~~ engine: introduce executor for richer input Oct 24, 2023

bassosimone mentioned this issue Nov 21, 2023

Expand OONI’s testing model to support richer testing input ooni/ooni.org#1291

Closed

26 tasks

bassosimone mentioned this issue Jun 4, 2024

refactor: MeasurementTarget -> MeasurementInput ooni/probe-cli#1609

Merged

bassosimone mentioned this issue Jun 4, 2024

cleanup(oonirun): remove unused testing-hook argument ooni/probe-cli#1610

Merged

bassosimone mentioned this issue Jun 5, 2024

refactor(experiment): make report goroutine safe ooni/probe-cli#1612

Merged

bassosimone mentioned this issue Jun 5, 2024

cleanup: remove asynchronous experiments ooni/probe-cli#1613

Merged

bassosimone mentioned this issue Jun 5, 2024

fix(registry): instantiate factory for each invocation ooni/probe-cli#1614

Merged

bassosimone mentioned this issue Jun 6, 2024

refactor: construct TargetLoader using ExperimentBuilder ooni/probe-cli#1617

Merged

bassosimone mentioned this issue Jun 6, 2024

feat(dnscheck): implement richer input ooni/probe-cli#1618

Merged

bassosimone added a commit to ooni/probe-cli that referenced this issue Jun 7, 2024

cleanup(pkg/oonimkall): simplify test code

8b710f6

This diff simplifies test code in pkg/oonimkall in preparation for further richer-input related changes. Part of ooni/probe#2607

bassosimone mentioned this issue Jun 7, 2024

cleanup(pkg/oonimkall): simplify test code ooni/probe-cli#1619

Merged

bassosimone mentioned this issue Jun 7, 2024

chore(richer input): add living design document ooni/probe-cli#1621

Merged

bassosimone closed this as completed Jun 7, 2024

bassosimone mentioned this issue Jun 7, 2024

engine: unify code implementing Submitter and Experiment.OpenReport #2608

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

engine: introduce executor for richer input #2607

engine: introduce executor for richer input #2607

bassosimone commented Oct 24, 2023

bassosimone commented Jun 7, 2024

engine: introduce executor for richer input #2607

engine: introduce executor for richer input #2607

Comments

bassosimone commented Oct 24, 2023

bassosimone commented Jun 7, 2024