Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace DeferredCall and DynamicDeferredCall with a more general-purpose implementation #3382

Merged
merged 25 commits into from
Mar 10, 2023

Conversation

hudson-ayers
Copy link
Contributor

@hudson-ayers hudson-ayers commented Jan 23, 2023

Pull Request Overview

This PR replaces both the existing DeferredCall implementation (which was only usable for chip peripherals) and the existing DynamicDeferredCall implementation (which was used by capsules, kernel components, and even some chip peripherals) with a single, new DeferredCall. The new DeferredCall has higher size/cycle overhead than the original DeferredCall, but can be used anywhere in Tock. The new DeferredCall has lower size overhead than DynamicDeferredCall , and a much less verbose interface.

TL;DR SUMMARY:
  • Imix flash: -524 bytes
  • Imix (.bss + .data): +108 bytes
  • Total code diff: -1300 LOC (not counting lines added from duplicating the time TRD)

Pros of this change

  • We no longer have two different mechanisms for deferred procedure calls, which was always a rough edge for newcomers to Tock
  • The new interface is much less verbose than the (more commonly used) DynamicDeferredCall -- this change has a net delta of 1300 LOC removed!
  • The new DeferredCall checks whether more DeferredCalls have been created than there is space for, and checks that exactly as many DeferredCalls have been registered as were created. This check happens at the beginning of the kernel loop, rather than as DeferredCalls are created. This has two major advantages: first, DeferredCall errors will not lead to hard-to-debug hangs (before, a common bug was for too many DynamicDeferredCalls to be created, but the panic message could not be printed because the check happened at creation, before the debug writer had been setup). Second, it is now much more difficult to forget to register a DeferredCall -- as a result, there is no longer a need for capsules to verify that a DeferredCall has been initialized and registered before using it.
  • The new interface has lower code size overhead than DynamicDeferredCall, and only marginally higher overhead than our old DeferredCall. For Imix, this change reduces code size by 524 bytes.
  • The new interface does not use atomics, and as such we no longer require the #[core_intrinsics] nightly feature (notably, I think we could have made this change to the old DeferredCall as well).
  • The new approach is completely compatible with out-of-tree capsules and peripherals using deferred calls (unlike the old DeferredCall).

Cons of this change

  • Higher flash/RAM use and more cycles used compared to old DeferredCall, so for a board with no use for DynamicDeferredCall and no need for out-of-tree peripherals, this change can have a negative impact. In practice, any real Tock system will use enough DynamicDeferredCall (the console requires it if you use MuxUart!) that this is unlikely to have a negative impact for any systems.
  • Pay the RAM cost of all 32 DeferredCalls whether they are used or not (fixed 256 byte cost). Overall, Imix RAM use went up by 108 bytes (there were some RAM savings thanks to the new DeferredCall being smaller in size than the old DynamicDeferredCall, which show up directly in .bss thanks to many of these types being allocated as fields on global static objects via static_init).
  • Maximum of 32 DeferredCalls supported. We could increase this limit to 64 or 128 at some reasonably small cost, but it seems unlikely that any Tock boards today will require that many. For example, Imix uses only 11 and is already near its code size limit.
  • Less clear mapping of DeferredCalls to the kernel scheduler compared to the old DynamicDeferredCall, since the use of globals means there is a level of indirection that cannot be directly traced through a DeferredCallManager or similar.

Acknowledgements

This PR involved significant collaboration with @lschuermann , who helped port many of the boards and chips over, wrote the original DynamicDeferredCall implementation, and collaborated with me on a bunch of different designs before we arrived at this one.

@kupiakos authored the first draft of the neat DynDefCallRef type, which helped to limit the code size overhead of this approach relative to simply using trait objects, which carry a bunch of information in vtables that we do not need for this.

Testing Strategy

This pull request was tested by running blink and imix on Imix. We will probably need to test across a few more boards, as it is easy to miss spots where we need to call DeferredCallClient::register() in main.rs. Fortunately, any board where we have forgotten to do this will panic at the start of the kernel loop with a useful message.

TODO or Help Wanted

Help wanted: Soundness review of how this type uses a (non-public) static mut Cell<usize> instead of AtomicUsize.

Documentation Updated

  • Updated the relevant files in /docs (I think, may not have gotten everything).

Formatting

  • Ran make prepush.

@github-actions github-actions bot added component kernel nrf Change pertains to the nRF5x family of MCUs. sam4l Change pertains to the SAM4L MCU. stm32 Change pertains to the stm32 family of MCUSs WG-OpenTitan In the purview of the OpenTitan working group. labels Jan 23, 2023
@hudson-ayers hudson-ayers added the P-Significant This is a substancial change that requires review from all core developers. label Jan 25, 2023
Copy link
Member

@lschuermann lschuermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just done another pass over all changes to boards, components and chips in an effort to ensure that

  • all register calls use the fully-qualified module path, to avoid a call to .register() and imports of DeferredCallClient without additional context.
  • all implementations of .init(), .register_circular_deps(), etc. register all required deferred calls.
  • each board which retrieves a peripheral struct that has a required initialization routine calls this function.

Some additional noteworthy consequences of this proposed architecture which I've observed from this code review are:

  • deregistration of deferred calls no longer trivially possible.
  • constructors of peripherals can no longer be const
    Both of these points don't seem like blockers to me.

Open questions:

  • should every peripheral struct have an initialization method? Should we unify their names? This makes code review simpler. Sometimes, chips which use a common base-crate and peripheral struct, and then extend that with other peripherals will even go as far as to implicitly rename the peripheral initialization method.
    • Can we somehow enforce that this method is called? It's easy to forget and causes headaches. Maybe with an additional struct that is #[must_use]?

chips/earlgrey/src/aes.rs Outdated Show resolved Hide resolved
doc/reference/trd2-hil-design.md Outdated Show resolved Hide resolved
@hudson-ayers
Copy link
Contributor Author

  • deregistration of deferred calls no longer trivially possible.

We could pretty easily support deregistration, but I am not sure it is worth it -- it is not currently used anywhere, and any capsule could manually implement it with a flag at the cost of a word of memory. I think average capsules not having to consider the possibility of a deferred call somehow being deregistered is preferable.

Open questions:

  • should every peripheral struct have an initialization method? Should we unify their names? This makes code review simpler. Sometimes, chips which use a common base-crate and peripheral struct, and then extend that with other peripherals will even go as far as to implicitly rename the peripheral initialization method.

  • Can we somehow enforce that this method is called? It's easy to forget and causes headaches. Maybe with an additional struct that is #[must_use]?

Hmm, this is a neat idea -- we could make initialize() a method on the InterruptService trait to enforce that each struct has one, and that it has a unified name. But I don't think that #[must_use] is the right tool here unfortunately -- it is used for Result-like structs that must be checked when returned from a method, but does not require that a given struct must be instantiated somewhere (i.e. instead you only get an error when the type is created but not used).

Either way, I don't think that this PR needs to block on either of these, though they might make nice improvements.

alistair23
alistair23 previously approved these changes Feb 1, 2023
Copy link
Contributor

@alistair23 alistair23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't check every use, but overall this seems like a big improvement!

@hudson-ayers
Copy link
Contributor Author

Note: The current diff on the PR looks like much less than -1300 LOC, that is because the current TRD rules required me to duplicate the entire time TRD in order to adjust the syntax of its code examples

@hudson-ayers
Copy link
Contributor Author

Rebased to resolve conflicts

alistair23
alistair23 previously approved these changes Feb 9, 2023
kupiakos
kupiakos previously approved these changes Feb 24, 2023
Copy link
Contributor

@kupiakos kupiakos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed for soundness, nothing stands out to me as concerning.

kernel/src/deferred_call.rs Show resolved Hide resolved
kernel/src/deferred_call.rs Outdated Show resolved Hide resolved
kernel/src/deferred_call.rs Outdated Show resolved Hide resolved
kernel/src/deferred_call.rs Show resolved Hide resolved
hudson-ayers and others added 19 commits March 1, 2023 08:48
This switches to using the fully-qualified module path to call the
DeferredCallClient::register trait function. It makes it obvious that
a deferred call is registered and avoid an otherwise context-less
import of DeferredCallClient.

Signed-off-by: Leon Schuermann <leon@is.currently.online>
Signed-off-by: Leon Schuermann <leon@is.currently.online>
Signed-off-by: Leon Schuermann <leon@is.currently.online>
@hudson-ayers
Copy link
Contributor Author

rebased

Copy link
Contributor

@brghena brghena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big improvement, to code clarity if nothing else. Great work.

@hudson-ayers hudson-ayers added the last-call Final review period for a pull request. label Mar 6, 2023
@hudson-ayers
Copy link
Contributor Author

bors r+

@bors
Copy link
Contributor

bors bot commented Mar 10, 2023

@bors bors bot merged commit b2f7bcd into master Mar 10, 2023
@bors bors bot deleted the new-deferred-call branch March 10, 2023 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component kernel last-call Final review period for a pull request. nrf Change pertains to the nRF5x family of MCUs. P-Significant This is a substancial change that requires review from all core developers. sam4l Change pertains to the SAM4L MCU. stm32 Change pertains to the stm32 family of MCUSs WG-OpenTitan In the purview of the OpenTitan working group.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants