Investigate MICROPY_STACKLESS #3362

Open
dhalbert opened this issue Aug 31, 2020 · 5 comments

Comments

@dhalbert
Collaborator

Now that we are using PYSTACK, investigate whether turning on MICROPY_STACKLESS is a significant performance improvement.

From py/mpconfig.h:

// Avoid using C stack when making Python function calls. C stack still
// may be used if there's no free heap.
#ifndef MICROPY_STACKLESS
#define MICROPY_STACKLESS (0)
#endif

Suggested by Damien to Scott.

@bill88t

bill88t commented Dec 29, 2022

After a heavy amount of testing, I have come to the conclusion that, at least on rp2, stackless is stable and offers quite a lot of benefits in comparison to the current pystack.

Building with

#ifdef CIRCUITPY_PYSTACK_SIZE
#undef CIRCUITPY_PYSTACK_SIZE
#endif
#define CIRCUITPY_PYSTACK_SIZE 0

#ifdef MICROPY_ENABLE_PYSTACK
#undef MICROPY_ENABLE_PYSTACK
#endif
#define MICROPY_ENABLE_PYSTACK (0)

#ifdef MICROPY_STACKLESS
#undef MICROPY_STACKLESS
#endif
#define MICROPY_STACKLESS (1)

in ports/raspberrypi/boards/raspberry_pi_pico_w/mpconfigboard.h results in a build that has no preallocated pystack and no fixed recursion limit.

What does all of this mean?

  1. You can finally use meaningful recursion on CircuitPython.
    A simple test script:
    import time
    def test(num):
      print(str(num))
      time.sleep(0.01)
      test(num+1)
    test(0)
    
    This reaches a recursion depth of about 1.9k, which is honestly quite astounding.
    However, there is currently no recursion limit to stop it from nuking everything.
    Should you wish to try it, be warned that on Linux it will crash your USB port, as the board will hang entirely.
  2. More flexible memory management.
    RAM is no longer split up front between the heap and a fixed-size pystack, so either use can take what it needs.
    PSRAM boards should be a lot more capable of heavy workloads with stackless.
  3. There seems to be no downside to all of this.
    Each port should be individually validated to work with this change.
    I will be doing my testing, but I wouldn't call it enough.

The PR for these changes will be #7396.

@RetiredWizard

I'll add that I've been running PyDOS "stackless" on both CircuitPython and MicroPython for about a year and a half and haven't seen any issues. That being said, other than running neopixels, SPI SD cards/displays and various I2C devices, I'm not really stressing the high-speed timing of the microcontrollers or GPIO outputs, and I'd classify both PyDOS and ljinux as not being core use cases for CircuitPython 😁

I have built and tested on multiple development boards using the ESP32 family, nRF52840, RP2040, SAMD51, stm32L4+, mimxrt10xx microcontrollers and even the Raspberry Pi Zero bare bones.

@RetiredWizard

RetiredWizard commented Dec 29, 2022

#ifdef CIRCUITPY_PYSTACK_SIZE
#undef CIRCUITPY_PYSTACK_SIZE
#endif
#define CIRCUITPY_PYSTACK_SIZE 0

Just looking ahead to a PR (someday post 8.0.0 😁) that implements this: if you define MICROPY_ENABLE_PYSTACK as (0), I would expect CIRCUITPY_PYSTACK_SIZE not to be allocated at all. If that's true, it probably doesn't hurt to leave the parameter set, in case someone wants to override the setting and re-enable PYSTACK.

For all of my testing I just modified py/circuitpy_mpconfig.h, changing "#define MICROPY_ENABLE_PYSTACK" from "(1)" to "(0)", and py/mpconfig.h, changing "#define MICROPY_STACKLESS" from "(0)" to "(1)". Actually, I didn't start setting MICROPY_STACKLESS until I started working with the ESP chips. On the RP2040 boards I started with, simply disabling PYSTACK solved my issues.

@RetiredWizard

To document a little of the Discord discussion by @anecdata and @Neradoc:

  • MICROPY_STACKLESS causes all allocations to take place on the heap, meaning that even functions that are originally designed to avoid heap allocation can trigger a garbage collection, or contribute to memory fragmentation.
    // Avoid using C stack when making Python function calls. C stack still
    // may be used if there's no free heap.

  • MICROPY_ENABLE_PYSTACK enables a "stack" separate from the C stack to handle Python local allocations. It's called "scoped allocation" in some of the related MicroPython discussions.
    // Whether to enable a separate allocator for the Python stack.
    // If enabled then the code must call mp_pystack_init before mp_init.

"MICROPY_STACKLESS causes all allocations to take place on the heap" seems to have large implications for fragmentation.

I do struggle with fragmentation with my builds but can't really compare the stackless builds to the standard builds in that regard since my application doesn't really run at all unless I use a stackless build.
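
One rough way to compare builds on this point would be to measure how much free heap a chain of nested calls consumes. The sketch below assumes `gc.mem_free()` is available, as it is on MicroPython and CircuitPython; on CPython it is not, so the helper returns `None` there. The function names are made up for illustration. If the description above is right, a stackless build should show the figure growing with depth, while a pystack build should show little or no change, but that expectation has not been verified here.

```python
import gc


def free_heap():
    """Return free heap bytes, or None where gc.mem_free() does not exist."""
    gc.collect()
    # gc.mem_free() exists on MicroPython/CircuitPython but not on CPython.
    mem_free = getattr(gc, "mem_free", None)
    return mem_free() if mem_free else None


def heap_cost_of_recursion(depth):
    """Return how many heap bytes are in use by `depth` nested calls."""
    before = free_heap()

    def descend(level):
        # Sample the heap only at the deepest point of the call chain.
        if level >= depth:
            return free_heap()
        return descend(level + 1)

    at_depth = descend(0)
    if before is None or at_depth is None:
        return None
    return before - at_depth


print(heap_cost_of_recursion(100))
```

This only measures total heap use, not fragmentation itself, so it is a starting point for comparing builds rather than a full answer.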

@bill88t

bill88t commented Jan 3, 2023

I have performed A LOT more testing on esp and nrf, and everything works a-ok.
The only things that still need to be taken care of are the recursion limit and the exception chain.

On esp with 8 MB of PSRAM I reached recursion level 128k before crashing.
Keep in mind, though, that past a depth of around 1k, pressing Ctrl + C will still hard crash the board.
This is because of the exception chain being absolutely massive.

Should the recursion limit be implemented, and the exception chain get auto-trimmed, we are good.
I will push the tested platforms to the pr.


4 participants