
Feature: move partition #1326

Merged
Changes from 17 commits. All 59 commits:
924722e  init (zhiqiangxu, Jun 25, 2023)
da95dbb  avoid recompute current_deadline (zhiqiangxu, Jun 25, 2023)
e502709  treat empty bitfield as all (zhiqiangxu, Jun 26, 2023)
eaf5370  rm useless quote (zhiqiangxu, Jun 26, 2023)
2fa9693  add verify (zhiqiangxu, Jun 27, 2023)
8725420  combine option1 & option2 (zhiqiangxu, Jun 28, 2023)
d534373  fix (zhiqiangxu, Jun 28, 2023)
36caece  fix (zhiqiangxu, Jun 28, 2023)
badb691  nit (zhiqiangxu, Jun 28, 2023)
fd39aac  mod error (zhiqiangxu, Jun 28, 2023)
daab781  nit (zhiqiangxu, Jun 28, 2023)
0271369  fmt (zhiqiangxu, Jun 28, 2023)
c046cc6  fix ci (zhiqiangxu, Jun 28, 2023)
3eac91e  fix bug (zhiqiangxu, Jul 3, 2023)
181dc2e  add test (zhiqiangxu, Jul 4, 2023)
8eb2243  add more test (zhiqiangxu, Jul 5, 2023)
394ea44  Merge remote-tracking branch 'origin/master' into feature/move_partit… (zhiqiangxu, Jul 11, 2023)
d1fdb0f  partial fix for review (zhiqiangxu, Jul 18, 2023)
0804565  Merge remote-tracking branch 'origin/master' into feature/move_partit… (zhiqiangxu, Jul 19, 2023)
0353a28  adjust test (zhiqiangxu, Jul 19, 2023)
bc9e6a1  use .context_code (zhiqiangxu, Jul 19, 2023)
c1d88a2  fix for test (zhiqiangxu, Jul 19, 2023)
37d7d03  disallow empty partitions (zhiqiangxu, Jul 19, 2023)
89bcb74  refactor deadline_available_for_move (zhiqiangxu, Jul 19, 2023)
79ab15d  fix for clippy (zhiqiangxu, Jul 19, 2023)
39d6447  minor opt (zhiqiangxu, Jul 19, 2023)
0bd0611  only verify_windowed_post once (zhiqiangxu, Jul 20, 2023)
c01b54c  mod error msg (zhiqiangxu, Jul 24, 2023)
ccdb45f  1. verify_window_post batch by batch (zhiqiangxu, Jul 25, 2023)
b775fa8  fix ci (zhiqiangxu, Jul 25, 2023)
a5290cc  mod check for epoch (zhiqiangxu, Jul 25, 2023)
bf13558  partial review fix (zhiqiangxu, Aug 2, 2023)
9195aa1  Merge remote-tracking branch 'origin/master' into feature/move_partit… (zhiqiangxu, Aug 2, 2023)
70fd5fe  adjust test (zhiqiangxu, Aug 2, 2023)
511e170  refactor with Partition::adjust_for_move (zhiqiangxu, Aug 2, 2023)
d16077c  share the language with FIP (zhiqiangxu, Aug 30, 2023)
9ed53b8  deadline_available_for_move => ensure_deadline_available_for_move (zhiqiangxu, Aug 30, 2023)
15d5ac1  add some doc comment (zhiqiangxu, Aug 30, 2023)
5744223  more renaming (zhiqiangxu, Aug 30, 2023)
e8f0a22  more renaming (zhiqiangxu, Aug 30, 2023)
194b7cd  Merge remote-tracking branch 'origin/master' into feature/move_partit… (zhiqiangxu, Aug 30, 2023)
5e400d2  rename + merge master (zhiqiangxu, Aug 30, 2023)
c1159c5  mod wording (zhiqiangxu, Aug 30, 2023)
cb5e3c4  fix test (zhiqiangxu, Aug 30, 2023)
faa4873  renaming in test (zhiqiangxu, Aug 31, 2023)
7bc1156  apply alex's idea of not re-quantizing at all. (zhiqiangxu, Sep 19, 2023)
f7b7bd7  1. forbid moving when there're early terminations (zhiqiangxu, Sep 22, 2023)
3b25fdc  rm anyhow::Ok (zhiqiangxu, Sep 22, 2023)
f3ddb23  Merge branch 'master' into feature/move_partition_verify (zhiqiangxu, Sep 23, 2023)
9c8517a  minor optimization by observing that partition `faulty_power` should … (zhiqiangxu, Sep 23, 2023)
3c6a374  adjust find_sectors_by_expiration for not re-quantizing (zhiqiangxu, Sep 25, 2023)
271e3c0  add test (zhiqiangxu, Sep 26, 2023)
d2202ab  fix for review (zhiqiangxu, Sep 26, 2023)
d276241  add a comment about not re-quantizing when moving expirations_epochs (zhiqiangxu, Sep 26, 2023)
b2dc66c  minor optimization (zhiqiangxu, Sep 27, 2023)
fcc4cc0  avoid scanning the same range twice (zhiqiangxu, Sep 28, 2023)
7619065  1. review fix (zhiqiangxu, Sep 28, 2023)
6e91261  fix comment (zhiqiangxu, Sep 28, 2023)
9964257  use with_context_code (zhiqiangxu, Sep 29, 2023)
20 changes: 20 additions & 0 deletions actors/miner/src/deadlines.rs
@@ -128,6 +128,26 @@ pub fn deadline_available_for_compaction(
)
}

// the distance between from_deadline and to_deadline clockwise in deadline unit.

Member: style: why have you chosen raw comments rather than doc-comments for all the functions?

Suggested change:
- // the distance between from_deadline and to_deadline clockwise in deadline unit.
+ /// the distance between from_deadline and to_deadline clockwise in deadline unit.

Author (zhiqiangxu): Done, I was following the original style of compact_partitions :)

fn deadline_distance(policy: &Policy, from_deadline: u64, to_deadline: u64) -> u64 {
if to_deadline >= from_deadline {
to_deadline - from_deadline
} else {
policy.wpost_period_deadlines - from_deadline + to_deadline
}
}

// only allow moving to a nearer deadline from current one
pub fn deadline_available_for_move(
policy: &Policy,
from_deadline: u64,
to_deadline: u64,
current_deadline: u64,
) -> bool {
deadline_distance(policy, current_deadline, to_deadline)
< deadline_distance(policy, current_deadline, from_deadline)
}

Member: These look correct, but please add tests to demonstrate all the cases.
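A minimal sketch of the kind of tests requested here (not part of the diff; it assumes `Policy::default()` provides the mainnet parameters, in particular `wpost_period_deadlines = 48`, and uses the two functions above):

#[cfg(test)]
mod move_deadline_tests {
    use super::*;
    use fil_actors_runtime::runtime::Policy;

    #[test]
    fn distance_wraps_clockwise() {
        let policy = Policy::default();
        assert_eq!(deadline_distance(&policy, 10, 10), 0);
        assert_eq!(deadline_distance(&policy, 10, 12), 2);
        // Moving "backwards" counts the long way around the clock.
        assert_eq!(deadline_distance(&policy, 10, 8), policy.wpost_period_deadlines - 2);
    }

    #[test]
    fn only_nearer_deadlines_are_available_for_move() {
        let policy = Policy::default();
        // With current deadline 0, deadline 2 is nearer than deadline 5,
        // so moving 5 -> 2 is allowed but 2 -> 5 is not.
        assert!(deadline_available_for_move(&policy, 5, 2, 0));
        assert!(!deadline_available_for_move(&policy, 2, 5, 0));
    }
}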

// Determine current period start and deadline index directly from current epoch and
// the offset implied by the proving period. This works correctly even for the state
// of a miner actor without an active deadline cron
297 changes: 297 additions & 0 deletions actors/miner/src/lib.rs
@@ -128,6 +128,7 @@ pub enum Method {
ChangeBeneficiary = 30,
GetBeneficiary = 31,
ExtendSectorExpiration2 = 32,
MovePartitions = 33,
// Method numbers derived from FRC-0042 standards
ChangeWorkerAddressExported = frc42_dispatch::method_hash!("ChangeWorkerAddress"),
ChangePeerIDExported = frc42_dispatch::method_hash!("ChangePeerID"),
@@ -2995,6 +2996,301 @@ impl Actor {
Ok(())
}

fn move_partitions(
rt: &impl Runtime,
mut params: MovePartitionsParams,
) -> Result<(), ActorError> {
if params.from_deadline == params.to_deadline {
return Err(actor_error!(illegal_argument, "from_deadline == to_deadline"));
}
let policy = rt.policy();
if params.from_deadline >= policy.wpost_period_deadlines
|| params.to_deadline >= policy.wpost_period_deadlines
{
return Err(actor_error!(
illegal_argument,
"invalid param, from_deadline: {}, to_deadline: {}",
params.from_deadline,
params.to_deadline
));
}

rt.transaction(|state: &mut State, rt| {
let info = get_miner_info(rt.store(), state)?;

rt.validate_immediate_caller_is(&[info.worker, info.owner])?;

let store = rt.store();
let current_deadline = state.deadline_info(policy, rt.curr_epoch());
let mut deadlines =
state.load_deadlines(store).map_err(|e| e.wrap("failed to load deadlines"))?;

// Only try to do synchronous Window PoSt verification if from_deadline doesn't satisfy deadline_available_for_compaction.
if !deadline_available_for_compaction(
policy,
current_deadline.period_start,
params.from_deadline,
rt.curr_epoch(),
) {
// The heavy path: attempt synchronous Window PoSt verification.
// from_deadline must still be within its dispute window, otherwise its optimistic
// PoSt snapshot can no longer be verified synchronously.
if !deadline_available_for_optimistic_post_dispute(
policy,
current_deadline.period_start,
params.from_deadline,
rt.curr_epoch(),
) {
return Err(actor_error!(
forbidden,
"cannot move from deadline {} when !deadline_available_for_compaction && !deadline_available_for_optimistic_post_dispute",
params.from_deadline
));
}


let dl_from = deadlines
.load_deadline(policy, rt.store(), params.from_deadline)
.map_err(|e| {
e.downcast_default(ExitCode::USR_ILLEGAL_STATE, "failed to load deadline")
})?;

let proofs_snapshot =
dl_from.optimistic_proofs_snapshot_amt(store).map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
"failed to load proofs snapshot",
)
})?;

let partitions_snapshot =
dl_from.partitions_snapshot_amt(store).map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
"failed to load partitions snapshot",
)
})?;

// Load the sectors snapshot needed for proof verification.
let sectors =
Sectors::load(rt.store(), &dl_from.sectors_snapshot).map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
"failed to load sectors array",
)
})?;

proofs_snapshot
.for_each(|_, window_proof| {
let mut all_sectors =
Vec::<BitField>::with_capacity(window_proof.partitions.len() as usize);
let mut all_ignored =
Vec::<BitField>::with_capacity(window_proof.partitions.len() as usize);

for partition_idx in window_proof.partitions.iter() {
let partition = partitions_snapshot
.get(partition_idx)
.map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
"failed to load partitions snapshot for proof",
)
})?
.ok_or_else(|| {
actor_error!(
illegal_state,
"failed to load partitions snapshot for proof"
)
})?;
all_sectors.push(partition.sectors.clone());
all_ignored.push(partition.terminated.clone());
// fail early since remove_partitions will fail when there are faults anyway.
if !partition.faults.is_empty() {
return Err(anyhow::anyhow!(
"unable to do synchronous Window PoSt verification while there are faults in from deadline {}",
params.from_deadline
));
}

Member: Use actor_error!, not anyhow. FORBIDDEN might be an appropriate exit code.

Author (zhiqiangxu, Jul 19, 2023): If I use actor_error!, it reports: expected `Result<(), Error>`, found `ActorError`.

Member: I mean return Err(actor_error!(...)).
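A minimal sketch of the suggested fix (not in this revision of the diff): since the closure returns an anyhow::Result, the ActorError must be converted with .into(), the same pattern this closure already uses below for the invalid-post case:

// Sketch only, assuming the FORBIDDEN exit code the reviewer suggests.
return Err(actor_error!(
    forbidden,
    "unable to do synchronous Window PoSt verification while there are faults in from deadline {}",
    params.from_deadline
)
.into());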
}

// Load sector infos for proof, substituting a known-good sector for known-faulty sectors.
// Note: this is slightly sub-optimal, loading info for the recovering sectors again after they were already
// loaded above.
let sector_infos = sectors
.load_for_proof(
&BitField::union(&all_sectors),
&BitField::union(&all_ignored),
)
.map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
"failed to load sectors for post verification",
)
})?;

// Find the proving period start for the deadline in question.
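// If the current deadline index hasn't reached from_deadline yet, from_deadline's
// most recent challenge window fell in the previous proving period.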
let mut pp_start = current_deadline.period_start;
if current_deadline.index < params.from_deadline {
pp_start -= policy.wpost_proving_period
}
let target_deadline =
new_deadline_info(policy, pp_start, params.from_deadline, rt.curr_epoch());

if !verify_windowed_post(
rt,
target_deadline.challenge,
&sector_infos,
window_proof.proofs.clone(),
)
.map_err(|e| e.wrap("window post failed"))?
{
return Err(actor_error!(
illegal_argument,
"invalid post was submitted"
)
.into());
}
Ok(())
})
.map_err(|e| {
e.downcast_default(ExitCode::USR_ILLEGAL_STATE, "while verifying window post")
})?;
}

if !deadline_available_for_compaction(
policy,
current_deadline.period_start,
params.to_deadline,
rt.curr_epoch(),
) {
return Err(actor_error!(
forbidden,
"cannot move to deadline {} during its challenge window, \
or the prior challenge window,
or before {} epochs have passed since its last challenge window ended",
params.to_deadline,
policy.wpost_dispute_window
));
}

if !deadline_available_for_move(
policy,
params.from_deadline,
params.to_deadline,
current_deadline.index,
) {
return Err(actor_error!(
forbidden,
"cannot move to a deadline which is further from current deadline"
));
}

let from_quant = state.quant_spec_for_deadline(policy, params.from_deadline);
let to_quant = state.quant_spec_for_deadline(policy, params.to_deadline);

let mut from_deadline =
deadlines.load_deadline(policy, store, params.from_deadline).map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
format!("failed to load deadline {}", params.from_deadline),
)
})?;
let mut to_deadline =
deadlines.load_deadline(policy, store, params.to_deadline).map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
format!("failed to load deadline {}", params.to_deadline),
)
})?;

let partitions = &mut params.partitions;
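// An empty partitions bitfield selects every partition in from_deadline.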
if partitions.is_empty() {
let partitions_amt = from_deadline.partitions_amt(rt.store()).map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
format!("failed to load partitions for deadline {}", params.from_deadline),
)
})?;

for partition_idx in 0..partitions_amt.count() {
partitions.set(partition_idx);
}
}

let (live, dead, removed_power) =
from_deadline.remove_partitions(store, partitions, from_quant).map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
format!(
"failed to remove partitions from deadline {}",
params.from_deadline
),
)
})?;

state.delete_sectors(store, &dead).map_err(|e| {
e.downcast_default(ExitCode::USR_ILLEGAL_STATE, "failed to delete dead sectors")
})?;
Member: We don't want to do this if keeping the partition intact (including its reference to terminated sectors) while moving. I can see below that's not what you're doing, but please leave this comment until we resolve that.

Author (zhiqiangxu, Jul 21, 2023): I'm looking into the "moving partitions intact" approach, but Deadline.expirations_epochs seems quite complicated to maintain while moving. I guess that's the main reason the current compact_partitions was implemented by removing and then re-adding sectors.

@anorth Not sure whether it's worth loading sector infos just to calculate an accurate epoch for Deadline.expirations_epochs. Does it matter if the epoch is a period later? If so, we can simply use the source epoch to calculate the target epoch with a different quant.

Member: Hmm, I see. I think we should avoid loading sector info if at all possible.

If we say that a partition with faulty sectors can't be moved, then I think we only need to worry about on-time expirations. In that case I suggest rounding up, i.e. find the next epoch quantized to the new deadline that is not sooner than the epoch in the queue. Then the SP might need to maintain a sector up to one proving period longer than committed.

Member: Don't worry about removing anything from the old deadline's queue. It will have redundant entries, but they're harmless.

Author (zhiqiangxu, Jul 21, 2023): What about other sectors: faulty, unproven, terminated? I was considering allowing all sectors for this "moving partitions intact" approach.

Also, I found there is a test case which requires the epoch to be exactly quant.quantize_up(sector.expiration); should we modify this check?

I've temporarily pushed the latest commit here in case you want to try the test cases like ok_to_move_partitions_from_safe_epoch.

Member: I would like to consider allowing any partition to be moved too, but I don't think it's worth the work/risk of changing how remove_partitions works at the moment. Let's require non-faulty, fully proven sectors for now.

Yes, I think we should probably remove that test case. The quantisation is mentioned elsewhere too, but I think we can probably remove it and update the associated code to handle arbitrary quantisation. This can be an independent change ahead of what you're doing here – for now just comment out that state assertion.

Member: Please do push that commit to this branch - I think it's the right approach.

Member: > Also, I found there is a test case which requires the epoch to be exactly quant.quantize_up(sector.expiration); should we modify this check?

I looked into this and I think we can just make a simple change to make it epoch >= target.

Author (zhiqiangxu, Jul 25, 2023): I changed it to be a bit more restrictive: epoch >= target && (epoch - target) % quant.unit == 0
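To make that final check concrete, a small illustrative sketch (hypothetical stand-alone helper, not the PR's code; the quantize-up step mirrors what QuantSpec::quantize_up does, using plain integers):

// Illustration only: the relaxed expiration-queue check described above,
// with a plain-integer stand-in for QuantSpec { unit, offset }.
fn entry_matches_expiration(epoch: i64, expiration: i64, unit: i64, offset: i64) -> bool {
    // quantize_up(expiration): smallest epoch >= expiration congruent to offset mod unit.
    let rem = (expiration - offset).rem_euclid(unit);
    let target = if rem == 0 { expiration } else { expiration + (unit - rem) };
    // Old assertion: epoch == target.
    // Relaxed form: the entry may sit a whole number of quantization units after
    // the quantized expiration, since sectors moved without re-quantizing can
    // expire up to one proving period later than committed.
    epoch >= target && (epoch - target) % unit == 0
}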


let sectors = state.load_sector_infos(store, &live).map_err(|e| {
e.downcast_default(ExitCode::USR_ILLEGAL_STATE, "failed to load moved sectors")
})?;
let proven = true;
let added_power = to_deadline
.add_sectors(
store,
info.window_post_partition_sectors,
proven,
&sectors,
info.sector_size,
to_quant,
)
.map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
"failed to add back moved sectors",
)
})?;

if removed_power != added_power {
return Err(actor_error!(
illegal_state,
"power changed when compacting partitions: was {:?}, is now {:?}",
removed_power,
added_power
));
}

deadlines
.update_deadline(policy, store, params.from_deadline, &from_deadline)
.map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
format!("failed to update deadline {}", params.from_deadline),
)
})?;
deadlines.update_deadline(policy, store, params.to_deadline, &to_deadline).map_err(
|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
format!("failed to update deadline {}", params.to_deadline),
)
},
)?;

state.save_deadlines(store, deadlines).map_err(|e| {
e.downcast_default(
ExitCode::USR_ILLEGAL_STATE,
format!(
"failed to save deadline when move_partitions from {} to {}",
params.from_deadline, params.to_deadline
),
)
})?;

Ok(())
})?;

Ok(())
}
/// Compacts sector number allocations to reduce the size of the allocated sector
/// number bitfield.
///
@@ -5113,6 +5409,7 @@ impl ActorCode for Actor {
ConfirmSectorProofsValid => confirm_sector_proofs_valid,
ChangeMultiaddrs|ChangeMultiaddrsExported => change_multiaddresses,
CompactPartitions => compact_partitions,
MovePartitions => move_partitions,
CompactSectorNumbers => compact_sector_numbers,
ConfirmChangeWorkerAddress|ConfirmChangeWorkerAddressExported => confirm_change_worker_address,
RepayDebt|RepayDebtExported => repay_debt,
7 changes: 7 additions & 0 deletions actors/miner/src/types.rs
@@ -243,6 +243,13 @@ pub struct CompactPartitionsParams {
pub partitions: BitField,
}

#[derive(Serialize_tuple, Deserialize_tuple)]
pub struct MovePartitionsParams {
pub from_deadline: u64,
pub to_deadline: u64,
pub partitions: BitField,
}
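A hedged usage sketch (illustrative only; it assumes BitField::try_from_bits from fvm_ipld_bitfield, and the deadline numbers are arbitrary). Note that move_partitions above treats an empty partitions bitfield as selecting every partition in from_deadline:

use fvm_ipld_bitfield::BitField;

// Move partitions 0 and 2 of deadline 10 into deadline 3 (which must be
// nearer to the current deadline than deadline 10, per the checks above).
let params = MovePartitionsParams {
    from_deadline: 10,
    to_deadline: 3,
    partitions: BitField::try_from_bits([0u64, 2]).unwrap(),
};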

#[derive(Serialize_tuple, Deserialize_tuple)]
pub struct CompactSectorNumbersParams {
pub mask_sector_numbers: BitField,