Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: new annotation to convert array of structs (AoS) to struct of arrays (SoA) #64926

Closed
mappu opened this issue Jan 2, 2024 · 5 comments

Comments

@mappu
Copy link

mappu commented Jan 2, 2024

Proposal Details

Hi,

Here's a really common code example, of an array of structs (AoS):

type User struct {
	Name string
	XPos int64
	YPos int64
}

var World []User

However if the access pattern is to iterate over the same property, for each user, then for modern CPU caches, sometimes it can be more efficient to write code as a struct of arrays (SoA):

var World struct {
	UserName []string
	UserXPos []int64
	UserYPos []int64
}

This can sometimes significantly improve performance. However, it's more cumbersome to write this way, and all call sites need to change. In a large codebase, it's hard to experiment with this idea.

I propose a new annotation that can automatically convert AoS declarations to SoA:

var World []Users //go: soa

This would perform a compiler-internal transformation to use an SoA representation in memory.

Callers would still access fields as if AoS was used and the annotation were not present.

Any Go implementations not understanding the annotation would continue to treat the fields as AoS. This would affect older versions of the Go toolchain as well as TinyGo / gccgo.

One effect is each individual array element (e.g. World[3]) can no longer have its address taken to get a *User.

  • This is already the case for some literals and map values, which are non addressable.
  • As a useful addition, I would like to propose extending the copy builtin to support copying an AoS-ified struct to an ordinary one.

The benefit of this proposal is to easily support transitioning to AoS in a large existing codebase.

Future work, the compiler could recognize some patterns (e.g. short accumulation loop) and use SIMD automatically.


This is based on previous work in other programming languages:

Supporting discussion:

@mappu mappu added the Proposal label Jan 2, 2024
@gopherbot gopherbot added this to the Proposal milestone Jan 2, 2024
@apparentlymart
Copy link

apparentlymart commented Jan 2, 2024

I think this is an interesting idea, but I don't think there's any significant precedent so far for a comment to materially influence the layout of a type in memory. I do see the benefit that it could therefore be backward compatible with older Go toolchain versions and other implementations, but historically that hasn't been a significant design constraint, and recent changes to the official toolchain have moved in the direction of making it easier to upgrade rather than easier to keep running older Go versions.

That aside, this feels to me as one example of a more general idea of choosing between different machine representations of a single type at the Go level. I expect that a first decision to be made here is whether Go is the kind of language that aims to offer that sort of control over implementation details that aren't visible at the Go level of abstraction; if the answer to that is yes then I would prefer to see it specified in a way that suggests how other similar concerns could be met in the future.

For example, Rust has the idea of a repr attribute attached to a type declaration, which has evolved to meet a number of different use-cases where the Rust-level type could be usefully lowered to more than one machine-level representation. Go doesn't have something analogous to Rust's attribute syntax though, so that analogy isn't directly actionable. 🤔

@egtann
Copy link

egtann commented Jan 2, 2024

As another programming language example/prior art, Zig uses this layout extensively in its compiler to great effect. Zig has a syntax that makes looping through them very nice: https://zig.news/andrewrk/multi-object-for-loops-data-oriented-design-41ob.

@zephyrtronium
Copy link
Contributor

A //go:soa comment is not sufficient. SoA arrays are structured in a fundamentally different manner from AoS ones. If you were to declare var World []User //go:soa and then World to a function taking a []Users, that function would do the wrong thing. If you try to declare var nonsense []string //go:soa or var nonsense int //go:soa, that needs to fail. Addressability is a property of types, not values, so we can't just say, "Elements of slices and arrays are addressable, unless the variable declaration has a particular magic comment next to it."

SoA needs to be formally marked as a different kind of type. That means (or, arguably, regardless) this needs to be a language change proposal (with corresponding template). Ideally you could also precisely describe how this interacts with reflection – do we need a new reflect.Kind (probably yes)? do we need to add any other new reflection APIs (probably no)?

@ajwerner
Copy link

ajwerner commented Jan 3, 2024

Can you help justify this as a language change? I feel like this is the sort of thing which needs some experience. go generate is a good starting point for something like this. What I imagine would be useful is a tool which can be used to generate a go file with a struct corresponding to an SoA for a structure type. That wouldn't require any sort of language change and it would be immediately useful.

Like imagine:

//go:generate go run github.com/hypothetical/soatool --type User

type User struct {
	Name string
	XPos int64
	YPos int64
}

This could generate some file user_soa.go:

type Users struct {
   Name []string
   XPos []int64
   YPos []int64 
}

func (u *Users) Append(u User) { /* ... */ }
func (u *Users) Get(i int) User { /* ... */
func (u *Users) Len() int { /* ... */ }

Then with the new ranging over functions stuff, you can imagine a method on Users which reconstructs individual user structs on the fly for iteration if you wanted them in the composed form.

@ianlancetaylor
Copy link
Member

Thanks for the proposal, but as @zephyrtronium points out this is simply impossible as written. I'm going to close this proposal. Please comment if you disagree.

@ianlancetaylor ianlancetaylor closed this as not planned Won't fix, can't repro, duplicate, stale Feb 6, 2024
@golang golang locked and limited conversation to collaborators Feb 5, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants