When running go-ethereum as an archive node, the total amount of data stored in leveldb is extremely large. In this case it contains roughly 5.4 TB of data, in ~3M ldb files at the default size of 2 MB each. (Edit: 5.4 TB is the correct number)
When merely opening the database, doing some reads, and closing it again, the close procedure takes upwards of 18 minutes to complete. The bigger problem, however, is the memory consumption while this is happening.
This appears to be due to the iterators used during table compaction.
What is happening here is that the merged_iterator calls First on each of its sub-iterators. I have not checked explicitly, but I assume the sub-iterators (indexed_iterator) are one per file, so roughly 3M of them.
For each one, First() loads the block header. This consumes ~20 GB of memory, which would suggest that each block header is ~6-7 KB (20 GB across ~3M files; does that sound reasonable?).
Now, from a higher-level perspective, I can understand that the merged iterator does need to load the (first) keys of its sub-iterators, but loading the full block header for every table, via indexedIterator.setData, seems problematic from a memory-consumption perspective.
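To make the shape of the problem concrete, here is a minimal, self-contained sketch of the eager-positioning pattern described above. It is not goleveldb's actual code; tableIter and mergedIter are simplified stand-ins for indexed_iterator and merged_iterator, and the keys slice stands in for the decoded per-table block. The point it illustrates is that First on the merged iterator must position every child at once, so all ~3M per-table blocks end up resident simultaneously.

```go
package main

import (
	"bytes"
	"fmt"
)

// tableIter models an indexed_iterator: positioning it requires
// decoding a per-table block into memory (setData in goleveldb).
type tableIter struct {
	keys   [][]byte // stand-in for the decoded block (~6-7 KB each in practice)
	pos    int
	loaded bool
}

func (t *tableIter) First() bool {
	t.loaded = true // the real iterator allocates and keeps the block here
	t.pos = 0
	return len(t.keys) > 0
}

func (t *tableIter) Key() []byte { return t.keys[t.pos] }

// mergedIter models merged_iterator: its First eagerly positions every
// child, because it must compare all current keys to find the smallest.
type mergedIter struct {
	children []*tableIter
}

func (m *mergedIter) First() ([]byte, bool) {
	var min []byte
	for _, c := range m.children {
		if !c.First() { // every child loads its block, all at once
			continue
		}
		if min == nil || bytes.Compare(c.Key(), min) < 0 {
			min = c.Key()
		}
	}
	return min, min != nil
}

func main() {
	m := &mergedIter{children: []*tableIter{
		{keys: [][]byte{[]byte("b")}},
		{keys: [][]byte{[]byte("a")}},
		{keys: [][]byte{[]byte("c")}},
	}}
	k, _ := m.First()
	loaded := 0
	for _, c := range m.children {
		if c.loaded {
			loaded++
		}
	}
	// With ~3M children instead of 3, the loaded blocks alone reach ~20 GB.
	fmt.Printf("first key=%s, children loaded=%d/%d\n", k, loaded, len(m.children))
}
```

With 3 children the cost is invisible; the issue is that the same all-children-loaded invariant holds at 3M children.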
I've tried a couple of ways to hack around it, but neither seemed very feasible:

1. Making the merged iterator load keys lazily, only when needed. This doesn't work, since Next later on wants to loop over all keys anyway.
2. Avoiding loading data in the indexed_iterator. This doesn't quite fly either, since First then calls Next, and Next needs to actually load the data to figure out the first key. So the total effect is the same, unless a more thorough rewrite is made.
At this point, I'm curious whether there's some other way to get around this. For example, would it be possible to run smaller compactions which do not cover the entire keyspace? Or to compact some "random" subset of the indexed iterators, a few at a time?
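One way the "smaller compactions" idea could be driven from the application side is via goleveldb's DB.CompactRange(util.Range{Start, Limit}), which only needs to touch tables overlapping the given range. Below is a self-contained sketch of the range-splitting half of that idea; splitKeyspace is a hypothetical helper (not part of goleveldb), and the CompactRange call is shown only in a comment since it requires a live database.

```go
package main

import "fmt"

// splitKeyspace cuts the one-byte key-prefix space [0x00, 0xFF] into n
// contiguous [start, limit) slices. A nil limit on the last slice means
// "to the end of the keyspace", matching util.Range semantics.
func splitKeyspace(n int) [][2][]byte {
	ranges := make([][2][]byte, 0, n)
	step := 256 / n
	for i := 0; i < n; i++ {
		start := []byte{byte(i * step)}
		var limit []byte
		if i != n-1 {
			limit = []byte{byte((i + 1) * step)}
		}
		ranges = append(ranges, [2][]byte{start, limit})
	}
	return ranges
}

func main() {
	for _, r := range splitKeyspace(4) {
		// In goleveldb this would be something like:
		//   db.CompactRange(util.Range{Start: r[0], Limit: r[1]})
		// so only the tables overlapping each slice are opened per call,
		// instead of all ~3M at once.
		end := "end"
		if r[1] != nil {
			end = fmt.Sprintf("%#x", r[1][0])
		}
		fmt.Printf("compact [%#x, %s)\n", r[0][0], end)
	}
}
```

Whether this actually bounds peak memory depends on how many tables overlap each slice, so the slice count would need to be tuned against the table distribution.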
Glad for any advice you can give -- I'm aware this isn't your full-time job, so thanks for even reading this :)