add pragma for data segment variable allocations #453

Closed
ct-clmsn opened this issue Mar 19, 2022 · 2 comments

ct-clmsn commented Mar 19, 2022

Abstract

Add a pragma that gives variables data segment allocation with C-style static keyword semantics.

Motivation

A variable in C and C++ prefixed with the 'static' keyword is added into the data segment of the program (Rust provides similar functionality). In high performance computing, numerical analysis, scientific computing, and embedded-device applications this feature can provide additional performance benefits; when a program is loaded, memory is allocated for the variable. The memory allocation is resident for the duration of the program. The static pragma would only be applicable to variables defined outside of the scope of a type definition. Ordinal types that are not enums, complex numbers, real number types, char, pointers/addresses, and arrays would be permitted to use this pragma (and potentially other types that are not currently allocated on the heap or dynamically allocated at program runtime by the Nim code generator/compiler). String literals may be outside the scope of this feature request depending on how string literals are implemented - if string literals are implemented as fixed size arrays of bytes this feature request would be applicable, if string literals are allocated on the heap then this feature request is not applicable.

The author would prefer the pragma {.allocStatic.}, but {.allocDataSegment.} may be more appropriate, as it would not lead users to conflate the pragma with the Nim keyword static.

Description

The author cannot find a mechanism that currently provides this functionality in Nim. The pragma would offer the following syntax:

var x : T {.allocStatic.} where T is one of the types enumerated in the 'Motivation' section.

Examples

var x : T ... {.allocStatic.}

var y : array[10, T] {.allocStatic.}

Before

Not applicable

After

Please review examples section above

Backward incompatibility

This request does not impact type definitions and only impacts variables with types that are currently allocated on the stack. The author does not currently believe there would be backward compatibility issues. There may be a conflict with how reference counting currently works, and potentially with garbage collection (GC), but since this request covers only variables allocated on the stack, that should mitigate or marginalize any impact on the GC and reference counting implementations. The intent of this request is to provide basic functionality.

mratsim (Collaborator) commented Mar 21, 2022

Somewhat of a duplicate of the addressable consts, progmem and rom discussion: #257 (comment), except that your use case is already possible today since your variables are not consts.

If you want more specificity, you can use codegenDecl as of today: https://nim-lang.org/docs/manual.html#implementation-specific-pragmas-codegendecl-pragma

For instance

var a {.codegenDecl: "static $# $#".}: int

will generate a C declaration along the lines of static int a (in the emitted C, Nim's int appears as the NI typedef).

A variable in C and C++ prefixed with the 'static' keyword is added into the data segment of the program (Rust provides similar functionality). In high performance computing, numerical analysis, scientific computing, and embedded-device applications this feature can provide additional performance benefits; when a program is loaded, memory is allocated for the variable. The memory allocation is resident for the duration of the program. The static pragma would only be applicable to variables defined outside of the scope of a type definition.

I'm confused about what you want here:

  • The data segment of the program is NOT memory allocated at start; it holds already initialized values, and it adds to the binary size. If you want a 1GB array in your data segment, the binary will be at least 1GB. Nim global variables use the BSS segment by default, which is allocated when the program is loaded, and that is what you describe.
  • For HPC, numerical analysis, and scientific computing, the data segment is usually a non-starter because of the size of what is handled there, unless it is a few kilobytes of precomputed tables, for example for interpolation or fast exponentiation or logarithm. In that case, you can today use let + a precomputation function that will be run at program start (saves space), or const if you want it precomputed within the binary (saves time), or codegenDecl if you want the data in a specific place like PROGMEM for embedded.
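
The three options in the bullets above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: buildSquares, startupTable, embeddedTable and buffer are hypothetical names, and which segment each value actually lands in depends on the C compiler and linker behind Nim.

```nim
const N = 8

proc buildSquares(): array[N, int] =
  # Hypothetical precomputation: a small table of squares.
  for i in 0 ..< N:
    result[i] = i * i

# let: the table is computed once at program start,
# so only the code that builds it is stored in the binary.
let startupTable = buildSquares()

# const: the compiler evaluates buildSquares() itself,
# so the finished table is embedded in the binary.
const embeddedTable = buildSquares()

# Plain global without an initializer: zero-initialized, and the kind of
# variable described above as going to the BSS segment by default.
var buffer: array[1024, uint8]

echo startupTable[3], " ", embeddedTable[3], " ", buffer.len
```

The let form trades a little startup time for a smaller binary, while the const form does the opposite.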

String literals may be outside the scope of this feature request depending on how string literals are implemented - if string literals are implemented as fixed size arrays of bytes this feature request would be applicable, if string literals are allocated on the heap then this feature request is not applicable.

String literals are fixed-size arrays of chars.

Additionally, while not specifically mentioned in your request, if you want local variables to be stored not on the stack but in a specific location of the program memory and initialized at program startup, you can use the {.global.} pragma: https://nim-lang.org/docs/manual.html#pragmas-global-pragma
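
As a small sketch of the {.global.} pragma mentioned above (nextId and counter are hypothetical names used only for illustration), a local variable marked this way keeps its value between calls because it no longer lives on the stack:

```nim
proc nextId(): int =
  # counter is stored in global memory rather than in the stack frame,
  # so its value persists across calls to nextId.
  var counter {.global.} = 0
  inc counter
  result = counter

echo nextId()
echo nextId()
```

This is roughly the Nim counterpart of a function-local static variable in C.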

ct-clmsn (Author) commented Mar 21, 2022

@mratsim - My understanding of the static keyword is consistent with the definition you provided; the phrasing is from prior experience working with different C compiler implementations and their associated optimizers.

Occasionally, the static keyword is handled differently by a vendor or implementer; sometimes compilers optimize for a smaller executable size on disk, which means the static memory is left to be allocated during program initialization. It is sort of a 'late binding' situation. Users sometimes have to wrangle compiler flags to avoid this storage/disk optimization.

As to the second comment, predefining static memory for typed communication buffers (uint8, int*, double, etc) can yield performance benefits in distributed communication settings. As you highlighted, the memory is available when the program on disk is loaded into memory (as the compiler created a physical allocation in the program for the buffer).

Thank you for the response, commentary, and for helping steer the clarification of this request!

The pragma you provided is what I was hoping to achieve:

var a {.codegenDecl: "static $# $#".}: int

As a new user, the $# syntax in the codegenDecl pragma was not something that seemed possible. Your clarification of the language's handling of globally annotated variables was also helpful. Thanks! Consider this request closed!
