Skip to content
This repository has been archived by the owner on Aug 13, 2019. It is now read-only.

refactor block size calculation #637

Conversation

krasi-georgiev
Copy link
Contributor

@krasi-georgiev krasi-georgiev commented Jun 21, 2019

It now includes the size of the meta file itself as well for a more
correct block size.
It fixes a bug where the size didn't change when calling block.Delete.
Adds a dedicated test to ensure correct block sizes.
To allow opening the db in a read only mode it ensures the meta file is
not rewritten when the block size hasn't change .

Signed-off-by: Krasi Georgiev 8903888+krasi-georgiev@users.noreply.github.com

@krasi-georgiev krasi-georgiev force-pushed the calculate-block-size-refactor branch 2 times, most recently from 9022985 to 26a8eca Compare June 21, 2019 12:17
It now includes the size of the meta file itself as well for a more
correct block size.
It fixes a bug where the size didn't change when calling block.Delete.
Adds a dedicated test to ensure correct block sizes.
To allow opening the db in a read only mode it ensures the meta file is
not rewritten when the block size hasn't change .

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@krasi-georgiev krasi-georgiev force-pushed the calculate-block-size-refactor branch from 26a8eca to 64101d9 Compare June 21, 2019 12:19
Copy link
Contributor

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some nits, looks good otherwise.

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@krasi-georgiev krasi-georgiev force-pushed the calculate-block-size-refactor branch from d78542a to 321002e Compare June 24, 2019 10:00
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@bwplotka
Copy link
Contributor

Taking look now (:

Copy link
Contributor

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How it looks from compatibility standpoint? We are changing meta.json so old meta.json NumBytes will be ignored. Is that ok?

@krasi-georgiev
Copy link
Contributor Author

krasi-georgiev commented Jun 24, 2019

I don't see any problem as when opening the db it will re-write the meta to the new version because of this check
https://github.com/prometheus/tsdb/pull/637/files#diff-4c74d030dafa9036e73f63926059db29R363

@krasi-georgiev
Copy link
Contributor Author

just tested it and works as expected, all meta files are rewritten to included the correct block size information for each piece.

Copy link
Contributor

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I checked and spoke offline with @krasi-georgiev and it seems like NumBytes was added for size retention and that was the main use case. I like that we count it more precisely now.

However I find it quite odd that it's not populated on compaction, only on OpenBlock and rewritten back in meta.json. Especially for read only mode, we now base on fact that OpenBlock has to be opened by write mode TSDB first before we access the meta.json etc.. That's quite weird. Plus making sure compatibilty for meta.json is kept (what if some other system already base on NumBytes)?

So my question is: Do we really need to bake num of bytes in meta.json even though we calculate precise number in OpenBlock anyway? (even just to check if we have to rewritte)!

Also if we would stop writing this to meta.json we can totally remove this interesting calculation on how many bytes meta.json will have IF we would write number of those bytes etc.. it's quite complex (:

What do you think guys?

@brian-brazil
Copy link
Contributor

By chance today's blog post uses numBytes. Populating it correctly on compaction seems like it would make more sense.

@codesome
Copy link
Contributor

So my question is: Do we really need to bake num of bytes in meta.json even though we calculate precise number in OpenBlock anyway? (even just to check if we have to rewritte)!

This makes me think that we don't need any size info in the meta at all for size based retention.

@bwplotka
Copy link
Contributor

This makes me think that we don't need any size info in the meta at all for size based retention.

Indeed. However if we already have docs for it I wonder what's the use case we described for NumBytes there. Can you link the post @brian-brazil here?

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@krasi-georgiev
Copy link
Contributor Author

krasi-georgiev commented Jun 24, 2019

How it looks from compatibility standpoint? We are changing meta.json so old meta.json NumBytes will be ignored. Is that ok?

I added a test to ensure that blocks with missing size info get recalculated and the meta file is rewritten at open.

By chance today's blog post uses numBytes. Populating it correctly on compaction seems like it would make more sense.

I already considered this option(writing the meta size info at compaction when creating a block), but there would be blocks that would never be touched by compaction(e.g. blocks that are too big already).

This makes me think that we don't need any size info in the meta at all for size based retention.

I considered this option as well, but using the meta simplifies the implementation quite a bit.

btw I think everyone should focus on block.Size() rather than the actual meta format. We can probably add some metric so that Brian's blog post uses that instead of relying on the meta format.

@bwplotka
Copy link
Contributor

bwplotka commented Jun 24, 2019

Yes but the meta json format takes the most lines of this PR and the most questions.

How keeping this size in meta helps in implementation? we do .Size of everything on every open block anyway in this PR (:

Looking on article: https://www.robustperception.io/how-much-disk-space-do-prometheus-blocks-use

It seems like use case is to tell block sizes. I would say log line with calculated block size on compaction would be enough for this use case. Or metric as Krasi suggest.

I would vote for removing any num size from stats and just calculating on open block.

@krasi-georgiev
Copy link
Contributor Author

I would vote for removing any num size from stars and just calculating on open block.

That would probably work as well, but I thought that having it in the actual meta would simplify debugging or some other use cases where people would want to know the block size just by looking at the block meta.

@codesome
Copy link
Contributor

codesome commented Jun 24, 2019

having it in the actual meta would simplify debugging or some other use cases where people would want to know the block size just by looking at the block meta.

I thought of this when I reviewed, but then a simple du command would also be enough.

@krasi-georgiev
Copy link
Contributor Author

I thought of this, but then a simple du command would also be enough.

yep, ok I will refactor to NOT use the meta file and lets see how it looks than.

@bwplotka
Copy link
Contributor

Agree with @codesome - it's really easy to see this using linux tooling / object storage meta or whatever else. If any, we can discuss it in another PRs, right? (:

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@krasi-georgiev
Copy link
Contributor Author

new version ready.

Copy link
Contributor

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor suggestions, but LGTM otherwise. Thanks!

CHANGELOG.md Outdated
@@ -1,5 +1,7 @@
## master / unreleased
- [FEATURE] Provide option to compress WAL records using Snappy. [#609](https://github.com/prometheus/tsdb/pull/609)
- [BUGFIX] Re-calculate block size when calling `block.Delete`.
- [CHANGE] `BlockStats` no longer holds size information and this is kept in memory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- [CHANGE] `BlockStats` no longer holds size information and this is kept in memory.
- [CHANGE] `BlockStats` no longer holds (not precise) size information. This is now dynamically calculate while opening a block in memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the size is recalculated NOT just when opening a block, but also when calling block.Delete so I prefer to keep this a bit more generic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. Can we have then first fix (not precise) and then This is now dynamically calculated and kept in memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean by `(not precise)'? If that is the missing meta size that it can be something like:

- [CHANGE] `BlockStats` no longer holds size information. This is now dynamically calculated and kept in memory. It also includes the meta file size which was not included before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect!

@@ -285,6 +284,11 @@ type Block struct {
tombstones TombstoneReader

logger log.Logger

numBytesChunks int64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint64 and separate struct maybe for those? Not a blocker.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all Size,Write interfaces I have seen so far use int or int64 so I prefer to keep it as is.
Also for db.opts.MaxBytes we also use int64 where -1 means that the size retention is disabled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fine by me then

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining

@@ -437,7 +428,9 @@ func (pb *Block) GetSymbolTableSize() uint64 {

func (pb *Block) setCompactionFailed() error {
pb.meta.Compaction.Failed = true
return writeMetaFile(pb.logger, pb.dir, &pb.meta)
n, err := writeMetaFile(pb.logger, pb.dir, &pb.meta)
pb.numBytesMeta = n
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight inconsistency with below code. We update size only on no error, here we do it no matter what. Can we do it on no error here as well? (: this will avoid potential confusion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

level.Error(c.logger).Log(
"msg", "Failed to write 'Deletable' to meta file after compaction",
"ulid", b.meta.ULID,
)
}
b.numBytesMeta = n
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we set numBytes to new block, but we ignore the other sizes later on?

Copy link
Contributor Author

@krasi-georgiev krasi-georgiev Jun 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually this is setting the numBytesMeta for the existing blocks. Existing blocks are already open so need to set the size information when it changes.

Later on it is creating new blocks and these would have their size calculated when they are open at db.reload().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, LGTM.

Copy link
Contributor

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, super minor nit only (:

block.go Outdated
return writeMetaFile(pb.logger, pb.dir, &pb.meta)
n, err := writeMetaFile(pb.logger, pb.dir, &pb.meta)
if err != nil {
pb.numBytesMeta = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really -> I would just not set anything. (:

level.Error(c.logger).Log(
"msg", "Failed to write 'Deletable' to meta file after compaction",
"ulid", b.meta.ULID,
)
}
b.numBytesMeta = n
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, LGTM.

Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
Signed-off-by: Krasi Georgiev <8903888+krasi-georgiev@users.noreply.github.com>
@krasi-georgiev krasi-georgiev force-pushed the calculate-block-size-refactor branch from 9e0d32c to c7e140f Compare June 24, 2019 15:23
@krasi-georgiev
Copy link
Contributor Author

krasi-georgiev commented Jun 24, 2019

@brian-brazil This one is needed to unblock the db read only PR so will merge with the current implementation, but feel free to try and refactor at some point if you have a different idea.
Once we have the read only db working and the tests for it it should be easier to refactor and make sure it doesn't brake the read only db mode.

@krasi-georgiev krasi-georgiev merged commit 8d86e92 into prometheus-junkyard:master Jun 24, 2019
@krasi-georgiev krasi-georgiev deleted the calculate-block-size-refactor branch June 24, 2019 15:42
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants