runtime: redesign scavenging algorithm
Currently the runtime's scavenging algorithm involves running from the
top of the heap address space to the bottom (or as far as it gets) once
per GC cycle. Once it treads some ground, it doesn't tread it again
until the next GC cycle.

This works just fine for the background scavenger, for heap-growth
scavenging, and for debug.FreeOSMemory. However, it breaks down in the
face of a memory limit for small heaps in the tens of MiB. Basically,
because the scavenger never retreads old ground, it's completely
oblivious to new memory it could scavenge, memory it really *should*
scavenge in the face of a memory limit.

Also, every time some thread goes to scavenge in the runtime, it
reserves what could be a considerable amount of address space, hiding it
from other scavengers.

This change modifies and simplifies the implementation overall. It's
less code, and its complexities are much better encapsulated. The
current implementation iterates optimistically over the address space
looking for memory to scavenge, keeping track of what it last saw. The
new implementation does the same, but instead of directly iterating over
pages, it iterates over chunks. It maintains an index of chunks (as a
bitmap over the address space) that indicate which chunks may contain
scavenge work. The page allocator populates this index, while scavengers
consume it and iterate over it optimistically.
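
To make the mechanism concrete, here is a minimal, self-contained
sketch of the chunk-index idea in Go. It is illustrative only, not the
runtime's code: chunkIndex, newChunkIndex, and the fixed heap size are
invented for the sketch, and the real index derives its size from the
address space and tracks more per-chunk state.

	package main

	import (
		"fmt"
		"sync/atomic"
	)

	// chunkIndex is a lock-free bitmap with one bit per chunk: set when
	// a chunk may contain scavengable pages, cleared once a scavenger
	// has checked the chunk.
	type chunkIndex struct {
		bits []atomic.Uint64
	}

	func newChunkIndex(nChunks int) *chunkIndex {
		return &chunkIndex{bits: make([]atomic.Uint64, (nChunks+63)/64)}
	}

	// mark records that chunk ci may contain scavenge work.
	func (idx *chunkIndex) mark(ci int) {
		w, mask := &idx.bits[ci/64], uint64(1)<<(ci%64)
		for {
			old := w.Load()
			if old&mask != 0 || w.CompareAndSwap(old, old|mask) {
				return
			}
		}
	}

	// clear drops chunk ci from the index once it has been checked.
	func (idx *chunkIndex) clear(ci int) {
		w, mask := &idx.bits[ci/64], uint64(1)<<(ci%64)
		for {
			old := w.Load()
			if old&mask == 0 || w.CompareAndSwap(old, old&^mask) {
				return
			}
		}
	}

	// find returns the highest marked chunk, or -1 if the index is
	// empty. Scanning from the top mirrors the scavenger's preference
	// for freeing memory high in the address space first.
	func (idx *chunkIndex) find() int {
		for i := len(idx.bits) - 1; i >= 0; i-- {
			v := idx.bits[i].Load()
			for bit := 63; bit >= 0; bit-- {
				if v&(uint64(1)<<uint(bit)) != 0 {
					return i*64 + bit
				}
			}
		}
		return -1
	}

	func main() {
		idx := newChunkIndex(1024)
		idx.mark(5)
		idx.mark(700)
		fmt.Println(idx.find()) // 700: highest candidate first
	}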

This has two key benefits:
1. Scavenging is much simpler: find a candidate chunk and check it,
   essentially just using the scavengeOne fast path (see the sketch
   after this list). There's no need for the complexity of iterating
   beyond one chunk, because the index is lock-free and already
   maintains that information.
2. If pages are freed to the page allocator (always guaranteed to be
   unscavenged), the page allocator immediately notifies all scavengers
   of the new source of work, avoiding the hiding issues of the old
   implementation.
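
Building on the sketch above, the consumption side reduces to a small
loop. checkChunk here is an assumed stand-in for the real scavengeOne
fast path, not the runtime's API:

	// scavenge drains the index: find a candidate chunk, check it,
	// clear it. checkChunk stands in for scanning a chunk's bitmap and
	// returning its free-and-unscavenged pages to the OS.
	func scavenge(idx *chunkIndex, checkChunk func(ci int)) {
		for {
			ci := idx.find()
			if ci < 0 {
				return // index empty: no known scavenge work
			}
			checkChunk(ci) // in the runtime: the scavengeOne fast path
			idx.clear(ci)  // done until the allocator re-marks this chunk
		}
	}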

One downside of the new implementation, however, is that it's
potentially more expensive to find pages to scavenge. In the past, if
a single page became free high up in the address space, the runtime's
scavengers would ignore it. Now that they no longer do, one or more
scavengers may need to iterate potentially across the whole heap to
find the next source of work. For the background scavenger, this just
means a potentially less reactive scavenger -- overall it should still
use the same amount of CPU. It means worse overheads for memory limit
scavenging, though there's no existing baseline for that to regress
against yet.

In practice, this hopefully shouldn't be too bad, since the chunk index
is extremely compact. For a 48-bit address space, the index is only 8
MiB in size at worst, but even just one physical page in the index is
able to support up to 128 GiB heaps, provided they aren't terribly
sparse. On 32-bit platforms, the index is only 128 bytes in size.
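
For a concrete check of those numbers, a short calculation. The 4 MiB
chunk size matches the runtime's pallocChunkBytes; the 4 KiB physical
page size is an assumption for the worked example:

	package main

	import "fmt"

	func main() {
		const chunkBytes = 4 << 20 // pallocChunkBytes: 4 MiB per chunk

		// 48-bit address space, one bit per chunk:
		// 2^48 / 2^22 = 2^26 chunks -> 2^26 / 8 bytes = 8 MiB of index.
		fmt.Println(uint64(1)<<48/chunkBytes/8/(1<<20), "MiB") // 8

		// One 4 KiB physical page of index holds 4096*8 = 32768 bits,
		// i.e. 32768 chunks * 4 MiB = 128 GiB of heap covered.
		fmt.Println(4096*8*chunkBytes/(1<<30), "GiB") // 128

		// 32-bit address space: 2^32 / 2^22 = 1024 chunks = 128 bytes.
		fmt.Println(uint64(1)<<32/chunkBytes/8, "bytes") // 128
	}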

For #48409.

Change-Id: I72b7e74365046b18c64a6417224c5d85511194fb
Reviewed-on: https://go-review.googlesource.com/c/go/+/399474
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
mknyszek committed May 3, 2022
1 parent b4d8114 commit 91f8630
Showing 9 changed files with 565 additions and 382 deletions.
47 changes: 40 additions & 7 deletions src/runtime/export_test.go
@@ -968,7 +968,6 @@ func NewPageAlloc(chunks, scav map[ChunkIdx][]BitRange) *PageAlloc {
p.init(new(mutex), testSysStat)
lockInit(p.mheapLock, lockRankMheap)
p.test = true

for i, init := range chunks {
addr := chunkBase(chunkIdx(i))

@@ -1007,6 +1006,18 @@ func NewPageAlloc(chunks, scav map[ChunkIdx][]BitRange) *PageAlloc {
}
}

// Make sure the scavenge index is updated.
//
// This is an inefficient way to do it, but it's also the simplest way.
minPages := physPageSize / pageSize
if minPages < 1 {
minPages = 1
}
_, npages := chunk.findScavengeCandidate(pallocChunkPages-1, minPages, minPages)
if npages != 0 {
p.scav.index.mark(addr, addr+pallocChunkBytes)
}

// Update heap metadata for the allocRange calls above.
systemstack(func() {
lock(p.mheapLock)
@@ -1015,12 +1026,6 @@ func NewPageAlloc(chunks, scav map[ChunkIdx][]BitRange) *PageAlloc {
})
}

systemstack(func() {
lock(p.mheapLock)
p.scavengeStartGen()
unlock(p.mheapLock)
})

return (*PageAlloc)(p)
}

@@ -1035,13 +1040,16 @@ func FreePageAlloc(pp *PageAlloc) {
for l := 0; l < summaryLevels; l++ {
sysFreeOS(unsafe.Pointer(&p.summary[l][0]), uintptr(cap(p.summary[l]))*pallocSumBytes)
}
// Only necessary on 64-bit. This is a global on 32-bit.
sysFreeOS(unsafe.Pointer(&p.scav.index.chunks[0]), uintptr(cap(p.scav.index.chunks)))
} else {
resSize := uintptr(0)
for _, s := range p.summary {
resSize += uintptr(cap(s)) * pallocSumBytes
}
sysFreeOS(unsafe.Pointer(&p.summary[0][0]), alignUp(resSize, physPageSize))
}

// Subtract back out whatever we mapped for the summaries.
// sysUsed adds to p.sysStat and memstats.mappedReady no matter what
// (and in anger should actually be accounted for), and there's no other
@@ -1550,3 +1558,28 @@ func (s *Scavenger) Stop() {
s.Wake()
<-s.done
}

type ScavengeIndex struct {
i scavengeIndex
}

func NewScavengeIndex(min, max ChunkIdx) *ScavengeIndex {
s := new(ScavengeIndex)
s.i.chunks = make([]atomic.Uint8, uintptr(1<<heapAddrBits/pallocChunkBytes/8))
s.i.min.Store(int32(min / 8))
s.i.max.Store(int32(max / 8))
return s
}

func (s *ScavengeIndex) Find() (ChunkIdx, uint) {
ci, off := s.i.find()
return ChunkIdx(ci), off
}

func (s *ScavengeIndex) Mark(base, limit uintptr) {
s.i.mark(base, limit)
}

func (s *ScavengeIndex) Clear(ci ChunkIdx) {
s.i.clear(chunkIdx(ci))
}
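
Illustrative only, not part of this commit: a test in package
runtime_test might drive these wrappers roughly as below. The 4 MiB
chunk size and the base-address arithmetic are assumptions for the
sketch; the runtime's own tests compute chunk base addresses via its
exported helpers instead.

	package runtime_test

	import (
		"testing"

		. "runtime"
	)

	func TestScavengeIndexSketch(t *testing.T) {
		const chunkBytes = 4 << 20 // assumed: pallocChunkBytes
		min, max := ChunkIdx(0x80), ChunkIdx(0x100)
		s := NewScavengeIndex(min, max)

		// The page allocator marks a chunk's address range as having
		// potential scavenge work...
		base := uintptr(min) * chunkBytes
		s.Mark(base, base+chunkBytes)

		// ...and a scavenger finds, processes, and clears the candidate.
		ci, _ := s.Find()
		t.Logf("next candidate chunk: %d", ci)
		s.Clear(ci)
	}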