Enable differential privacy for FixedPointBoundedL2VecSum. #1440

MxmUrw · 2023-06-02T12:14:39Z

This comprises the following changes:

- Add a noise_param parameter to the fixedvec type(s) in the various vdaf enums. This describes the amount of noise to be added.
- Wire up the new postprocess() function of prio to be called in both leader and helper. The noising itself happens in the implementation of the fixedvec type in prio; see the relevant PR (divviup/libprio-rs#578, "adding differential privacy to FixedPointBoundedL2VecSum").
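
To make the first change concrete, here is a minimal sketch of how a noise parameter could ride along on a fixedvec enum variant. Everything in it is an assumption for illustration: the `NoiseParam` type, its `Gaussian` variant, and the simplified `VdafInstance` enum are hypothetical stand-ins, not the actual janus or prio definitions.

```rust
/// Hypothetical noise parameter. The real type is defined in prio and may
/// look quite different.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum NoiseParam {
    /// No noise is added. This mirrors the "default value (no noise)" case.
    None,
    /// Illustrative only: Gaussian noise with the given standard deviation.
    Gaussian { std_dev: f64 },
}

impl Default for NoiseParam {
    fn default() -> Self {
        NoiseParam::None
    }
}

/// Hypothetical, heavily simplified VDAF selection enum.
#[derive(Clone, Debug, PartialEq)]
pub enum VdafInstance {
    Prio3Count,
    Prio3FixedPointBoundedL2VecSum {
        length: usize,
        noise_param: NoiseParam,
    },
}

impl VdafInstance {
    /// Returns the configured noise parameter, or `NoiseParam::None` for
    /// variants that carry no such parameter.
    pub fn noise_param(&self) -> NoiseParam {
        match self {
            VdafInstance::Prio3FixedPointBoundedL2VecSum { noise_param, .. } => *noise_param,
            _ => NoiseParam::None,
        }
    }
}
```

With a shape like this, code that does not care about noise (for example a collector) can construct the variant with `NoiseParam::default()`.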

Notes:

We currently call the postprocess() function of prio in two separate locations for the leader and for the helper, as it seems that their respective aggregation mechanisms use different code paths. I can imagine that there could be a better way to hook up our noising function. Is there?
We have a new noise_param argument for our vdaf. Should we add such an argument to tools/collect? Currently, a default value (no noise) is used when instantiating our vdaf.
We added an experimental feature for janus/aggregator and janus/aggregator_core which depends on the experimental feature for prio.
We added a $FEATURES argument to the dockerfile which allows additional janus features to be enabled, for example from our external docker-compose.yml.

Tasks:

Switch prio dependency back to crates.io, after our other pr is merged.


tgeoghegan

I think we can't really review this until the corresponding libprio-rs change is done, but we can have some useful discussion in the meantime.

> We currently call the postprocess() function of prio in two separate locations for the leader and for the helper, as it seems that their respective aggregation mechanisms use different code paths. I can imagine that there could be a better way to hook up our noising function. Is there?

Both helper and leader paths ultimately call aggregator::aggregate_share::compute_aggregate_share to get an <A as vdaf::Aggregator>::AggregateShare. I think that would be the place to apply noise, for two reasons: you'd call it once for either role, and in that scope you know how many contributions went into the aggregate share. My intuition is that some DP schemes will want that information to tune the noise they apply.
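
A rough sketch of that idea, under stated assumptions: the function below is not the real compute_aggregate_share, shares are modeled as plain `Vec<i64>` rather than field-element vectors, and the `NoiseStrategy` trait, `NoNoise`, and `ConstantOffset` are hypothetical names. It only shows the shape of "sum everything, then perturb once, with the report count in scope".

```rust
/// Hypothetical noise source. A real implementation would sample from a
/// differential-privacy distribution instead of returning fixed values.
pub trait NoiseStrategy {
    /// Noise for coordinate `index`, given the total number of reports.
    fn noise(&self, index: usize, report_count: u64) -> i64;
}

/// Adds nothing; corresponds to running without differential privacy.
pub struct NoNoise;

impl NoiseStrategy for NoNoise {
    fn noise(&self, _index: usize, _report_count: u64) -> i64 {
        0
    }
}

/// Deterministic stand-in used only to make the effect visible.
pub struct ConstantOffset(pub i64);

impl NoiseStrategy for ConstantOffset {
    fn noise(&self, _index: usize, _report_count: u64) -> i64 {
        self.0
    }
}

/// Sums the per-batch-aggregation shares and report counts, then applies
/// noise exactly once to the final share. Returns `None` if there is
/// nothing to aggregate or the share lengths disagree.
pub fn compute_aggregate_share<N: NoiseStrategy>(
    batch_aggregations: &[(Vec<i64>, u64)],
    strategy: &N,
) -> Option<(Vec<i64>, u64)> {
    let (first, rest) = batch_aggregations.split_first()?;
    let mut total = first.0.clone();
    let mut report_count = first.1;
    for (share, count) in rest {
        if share.len() != total.len() {
            return None;
        }
        for (t, s) in total.iter_mut().zip(share) {
            *t += *s;
        }
        report_count += *count;
    }
    // Single noising step, shared by the leader and helper code paths.
    for (i, t) in total.iter_mut().enumerate() {
        *t += strategy.noise(i, report_count);
    }
    Some((total, report_count))
}
```

Because the noise is added after the loop, the number of batch aggregations or aggregation jobs that fed into the share does not change how often noise is applied.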

> We have a new noise_param argument for our vdaf. Should we add such an argument to tools/collect? Currently, a default value (no noise) is used when instantiating our vdaf.

IIUC the noise parameter is meaningless for the collector, which is why over in divviup/libprio-rs#578, I argued for changing the interface so that only the vdaf::Aggregator trait deals with DP parameters. But I think we'll have to revisit the question of what the DAP collector needs to do (or not) once we have agreed on the libprio-rs level change.

> We added an experimental feature for janus/aggregator and janus/aggregator_core which depends on the experimental feature for prio.

While it's a bit icky semantically, I'd prefer to put this behind the existing fpvec_bounded_l2 feature to minimize feature combinations.

> We added a $FEATURES argument to the dockerfile which allows additional janus features to be enabled, for example from our external docker-compose.yml.

Yes, this is a good idea!

branlwyd

> We have a new noise_param argument for our vdaf. Should we add such an argument to tools/collect? Currently, a default value (no noise) is used when instantiating our vdaf.

IMO, feel free to extend tooling to be aware of the new VDAF parameter, gated behind an appropriate feature flag.

branlwyd · 2023-06-12T18:25:57Z

aggregator/src/aggregator/aggregation_job_continue.rs

@@ -199,6 +199,11 @@ impl VdafOps {
            }
        }

+        // Postprocess the aggregated shares. This allows, e.g., for central differential privacy,
+        // but the implementation is experimental.
+        #[cfg(feature = "experimental")]


I think we (eventually?) want to drop the "experimental" flag here -- I would strongly prefer that VDAF-handling be uniform, even if the VDAFs we use in Divvi Up happen to have a no-op postprocess implementation.

Do you have thoughts on whether it will be appropriate to drop this specific feature gate in the future, perhaps once the related libprio-rs changes land?
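
For illustration, uniform handling could look roughly like the sketch below: a trait method with a no-op default, so the call site needs no feature gate. The `Postprocess` trait, `CountVdaf`, `NoisyVecSum`, and `finish` are hypothetical names for this sketch, not prio's actual interface, and the fixed `offset` stands in for real noise.

```rust
/// Hypothetical trait; not the actual prio API.
pub trait Postprocess {
    type AggregateShare;

    /// Default implementation: leave the aggregate share untouched.
    fn postprocess(&self, _share: &mut Self::AggregateShare) -> Result<(), String> {
        Ok(())
    }
}

/// A VDAF that relies on the no-op default.
pub struct CountVdaf;

impl Postprocess for CountVdaf {
    type AggregateShare = u64;
}

/// A VDAF that overrides the hook. The fixed offset is a deterministic
/// placeholder for sampled noise.
pub struct NoisyVecSum {
    pub offset: i64,
}

impl Postprocess for NoisyVecSum {
    type AggregateShare = Vec<i64>;

    fn postprocess(&self, share: &mut Vec<i64>) -> Result<(), String> {
        for x in share.iter_mut() {
            *x += self.offset;
        }
        Ok(())
    }
}

/// The call site is identical for every VDAF; no `#[cfg]` is needed.
pub fn finish<V: Postprocess>(
    vdaf: &V,
    mut share: V::AggregateShare,
) -> Result<V::AggregateShare, String> {
    vdaf.postprocess(&mut share)?;
    Ok(share)
}
```

Here `finish(&CountVdaf, 7)` returns the share unchanged, while `finish(&NoisyVecSum { offset: 2 }, vec![1, 2])` returns the perturbed vector.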

branlwyd · 2023-06-12T18:35:27Z

aggregator/src/aggregator/aggregation_job_continue.rs

+        // Postprocess the aggregated shares. This allows, e.g., for central differential privacy,
+        // but the implementation is experimental.
+        #[cfg(feature = "experimental")]
+        accumulator.postprocess(&vdaf).unwrap();


Question about postprocess semantics: this code runs every time that an aggregation job is stepped, which ultimately causes postprocess to be called on every batch aggregation related to the aggregation job. If multiple aggregation jobs write into the same batch, this might lead to postprocess being called on the same batch aggregation multiple times. This processing happens before the share is merged into the existing batch aggregation.

Is this expected from the POV of the VDAF that currently uses this? More generally, do these semantics make sense for postprocess for all VDAFs? (I would naively expect that we'd want postprocess to be called once per batch aggregation, or perhaps once per collection, likely as part of the collection process. The reasoning here is that the grouping of reports into aggregation jobs is implementation-specific, and I suspect many postprocess implementations would be easier to write if they did not have to account for being called on a given batch aggregation an arbitrary number of times. But, again, that might be a naive perception.)

(If there's prior discussion/documentation of the postprocess semantics, apologies, I couldn't find it easily on the related libprio-rs PR or issue.)
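
To illustrate why the call count matters, here is a toy model. It assumes, purely for illustration, that each postprocess call adds one independent noise draw with variance sigma_sq; the variance of a sum of independent draws is the sum of their variances, so k calls on the same batch yield k * sigma_sq. The `BatchAggregation` struct and its methods are hypothetical and only count noise draws; they do not sample anything.

```rust
/// Toy model of a batch aggregation that tracks how many independent
/// noise draws have been folded into it.
#[derive(Debug, Default, PartialEq)]
pub struct BatchAggregation {
    pub sum: i64,
    pub noise_draws: u32,
}

impl BatchAggregation {
    /// Merges one aggregation job's share. If `postprocess_per_job` is
    /// true, the job's share was noised before merging (the behavior
    /// being questioned here).
    pub fn merge_job(&mut self, job_sum: i64, postprocess_per_job: bool) {
        self.sum += job_sum;
        if postprocess_per_job {
            self.noise_draws += 1;
        }
    }

    /// Noises the whole batch once, for example at collection time.
    pub fn postprocess_once(&mut self) {
        self.noise_draws += 1;
    }

    /// Total noise variance, given the variance of a single draw.
    pub fn noise_variance(&self, sigma_sq: f64) -> f64 {
        f64::from(self.noise_draws) * sigma_sq
    }
}
```

In this model, three aggregation jobs writing into one batch with per-job postprocessing give three times the variance of a single postprocess at collection time, and that factor depends on how the implementation happens to group reports into jobs.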

branlwyd · 2023-06-12T18:42:55Z

aggregator/src/aggregator/aggregation_job_driver.rs

+        // Postprocess the aggregated shares. This allows, e.g., for central differential privacy,
+        // but the implementation is experimental.
+        #[cfg(feature = "experimental")]
+        accumulator.postprocess(&vdaf)?;


Could postprocess be part of accumulator.flush_to_datastore? (This would cause it to be rerun if a technical issue forces a retry of the transaction containing flush_to_datastore, so both performance and correctness might be good reasons not to do this. If correctness is the concern, see my other comment about the expected semantics of postprocess.)
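
A small sketch of the retry concern, with assumptions labeled: `run_tx` is a hypothetical retry helper rather than the janus datastore API, the share is modeled as a single `i64`, and `noise` is a fixed value so the effect is deterministic. It contrasts mutating the share inside a retried transaction body with mutating it once beforehand.

```rust
/// Hypothetical retry helper: runs `body` until it succeeds or
/// `max_attempts` is exhausted. The attempt index is passed in so the
/// examples can simulate a failure on the first attempt.
pub fn run_tx<T, F>(max_attempts: u32, mut body: F) -> Result<T, String>
where
    F: FnMut(u32) -> Result<T, String>,
{
    let mut last = Err("no attempts made".to_string());
    for attempt in 0..max_attempts {
        last = body(attempt);
        if last.is_ok() {
            break;
        }
    }
    last
}

/// Postprocess inside the transaction body: a retry applies it again to
/// the already-noised in-memory share.
pub fn flush_with_postprocess_inside(
    share: &mut i64,
    noise: i64,
    fail_first_attempt: bool,
) -> Result<i64, String> {
    run_tx(2, |attempt| {
        *share += noise;
        if attempt == 0 && fail_first_attempt {
            Err("simulated serialization failure".to_string())
        } else {
            Ok(*share)
        }
    })
}

/// Postprocess before the transaction: retries only repeat the write.
pub fn postprocess_then_flush(
    share: &mut i64,
    noise: i64,
    fail_first_attempt: bool,
) -> Result<i64, String> {
    *share += noise;
    let value = *share;
    run_tx(2, |attempt| {
        if attempt == 0 && fail_first_attempt {
            Err("simulated serialization failure".to_string())
        } else {
            Ok(value)
        }
    })
}
```

Starting from a share of 10 with a noise value of 1 and one simulated failure, the first variant writes 12 and the second writes 11. Whether the real accumulator behaves like the first variant depends on whether postprocess mutates state that survives across retries.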

MxmUrw · 2023-09-12T14:37:00Z

Closing in favor of the up-to-date PR: #1892.

MxmUrw requested a review from a team as a code owner June 2, 2023 12:14

MxmUrw changed the title ~~Enable differential privacy for the fixedvec type.~~ Enable differential privacy for FixedPointBoundedL2VecSum. Jun 2, 2023

tgeoghegan reviewed Jun 9, 2023


tgeoghegan requested a review from a team June 9, 2023 23:08

branlwyd reviewed Jun 12, 2023


MxmUrw mentioned this pull request Aug 2, 2023

Remove Prio3FixedPoint64BitBoundedL2VecSum #1658

Closed

MxmUrw mentioned this pull request Aug 31, 2023

Discussion: How to integrate differential privacy into janus? #1865

Closed

MxmUrw closed this Sep 12, 2023
