[DAP-17/18] Move TaskConfig/VdafConfig from taskprov into base DAP as TaskConfiguration#4624
[DAP-17/18] Move TaskConfig/VdafConfig from taskprov into base DAP as TaskConfiguration#4624jcjones wants to merge 4 commits into
Conversation
… TaskConfiguration DAP-17+ defines TaskConfiguration and VDAF config encodings in the base protocol rather than the taskprov extension. This commit: - Creates `messages/src/task.rs` with a `TaskConfiguration` - `min_batch_size` is now `u64` (was `u32`) - VDAF `max_measurement`/`max_weight` fields are now `u64` (was `u32`) - `task_start`/`task_duration` are removed as direct fields, replaced by a `task_interval` `TaskExtension` - `batch_config` is now an explicit opaque field - `TaskExtensionType` gains a `TaskInterval` variant - `VdafConfig` and `TaskExtensionType` preserve unknown codepoints via `Unknown` variants, so a peer advertising a newer revision's VDAF type or task extension stays interoperable
Co-authored-by: David Cook <dcook@divviup.org>
| let task_end = task_config | ||
| .task_start() | ||
| .add_duration(task_config.task_duration())?; | ||
| let (task_start, task_end) = match task_config.task_interval()? { |
There was a problem hiding this comment.
It's kinda funky that we transform an Interval into two Times like this. It looks like we do this because AggregatorTask::new expects task start and end as separate values, but I would think it's no longer possible to have a task have a start but no end, and vice versa. Should we revisit the data model here so that these values are an Interval all the way through?
There was a problem hiding this comment.
That's a pretty sizable refactor down into the database, but I agree with you. How about I take that on as a follow-up?
| /// This field is used to distinguish tasks with otherwise equivalent DAP task parameters. | ||
| taskprov_task_info: Option<Vec<u8>>, | ||
| /// The `task_info` byte string from a `TaskConfiguration` struct. `None` for API-provisioned | ||
| /// tasks, which have no `TaskConfiguration`; used to distinguish tasks with otherwise |
There was a problem hiding this comment.
Separately from this PR, we should consider allowing API-provisioned tasks to set a task info string. That would require plumbing this up into divviup-api, so it's a minor undertaking, but I don't see why we wouldn't allow this.
| if task_info.is_empty() { | ||
| return Err(Error::InvalidParameter("task_info must not be empty")); | ||
| } |
There was a problem hiding this comment.
Is this right? We observe elsewhere that tasks provisioned via aggregator API could have empty task info. I just went and looked and sure enough, draft-ietf-ppm-dap-18 has opaque task_info<1..2^8-1>; which means empty task_info is illegal.
| task_start: Time, | ||
| task_duration: Duration, |
There was a problem hiding this comment.
I think this echoes a comment I left elsewhere, but why not have a single Interval parameter?
There was a problem hiding this comment.
I wanted to match what the database had. See above. I'm opening a follow-on: #4625
| /// `Unknown(0x0001)` compares and hashes equal to `TaskInterval`. | ||
| #[derive(Clone, Copy, Debug)] | ||
| #[non_exhaustive] | ||
| pub enum TaskExtensionType { |
There was a problem hiding this comment.
Elsewhere, like HpkeKemId, we used #[repr(u16)]. I think we should be consistent here.
There was a problem hiding this comment.
On second thought, I went with the ExtensionType strategy here rather than the HpkeKemId one to be forward-compatibile -- repr will hard-fail decoding anything it doesn't reocgnize, but the task-extension TLV framing is supposed to let receivers round-trip extensions they don't understand, right? So I went with this pattern.
There was a problem hiding this comment.
On HpkeKemId we're using the num_enum derive macros, and those seem to care about having a #[repr(u_)] macro on the enum. I think we can achieve forwards-compatibility either way, the difference is whether we write the match blocks or the macro does.
| batch_mode: batch_mode::Code, | ||
| batch_config: Vec<u8>, |
There was a problem hiding this comment.
Since batch_config is intended to model extra parameters that may be required by a future batch mode, it might be nicer to model these two as a single Rust enum (similar to VdafConfig). If we ever support a batch mode with a non-empty batch_config, we could decode the embedded opaque data at the same time we decode the whole TaskConfiguration. For the current state of the code base, this means we wouldn't have to sprinkle empty byte arrays into our initialization (which is a bit of a layering violation).
| /// `task_start` and `task_duration`, inserting it at the correct sorted position in the | ||
| /// given extensions. | ||
| #[allow(clippy::too_many_arguments)] | ||
| pub fn new_with_task_interval( |
There was a problem hiding this comment.
I think introducing a builder for TaskConfiguration may be a bit nicer for our test code. The two constructors we have now share a lot of their signatures.
| /// Searches extensions for a `task_interval` extension and decodes it. | ||
| pub fn task_interval(&self) -> Result<Option<Interval>, Error> { |
There was a problem hiding this comment.
Similarly, we may want to consider representing task extensions with an enum, and only storing the raw encoded form of an extension if we don't have support for the extension.
| InvalidParameter(&'static str), | ||
| /// A codec error occurred while processing a message. | ||
| #[error("codec error: {0}")] | ||
| Codec(#[from] prio::codec::CodecError), |
There was a problem hiding this comment.
If we get rid of encoded byte arrays for inner fields of TaskConfiguration, we could potentially drop this error variant.
| /// `Unknown(0x0001)` compares and hashes equal to `TaskInterval`. | ||
| #[derive(Clone, Copy, Debug)] | ||
| #[non_exhaustive] | ||
| pub enum TaskExtensionType { |
There was a problem hiding this comment.
On HpkeKemId we're using the num_enum derive macros, and those seem to care about having a #[repr(u_)] macro on the enum. I think we can achieve forwards-compatibility either way, the difference is whether we write the match blocks or the macro does.
DAP-17+ defines TaskConfiguration and VDAF config encodings in the base protocol rather than the taskprov extension. This PR:
messages/src/taskprov.rstomessages/src/task.rswith minor renamesmin_batch_sizeis nowu64(wasu32)max_measurement/max_weightfields are nowu64(wasu32)task_start/task_durationare removed as direct fields, replaced by atask_intervalTaskExtensionbatch_configis now an explicit opaque fieldTaskExtensionTypegains aTaskIntervalvariantVdafConfigandTaskExtensionTypepreserve unknown codepoints viaUnknownvariants, so a peer advertising a newer revision's VDAF type or task extension stays interoperable