Convert brw_stats sizes to bytes #86

joehandzik · 2017-05-25T20:33:33Z

This will work through IO sizes of Gigabytes

Fixes #50

Signed-Off-By: Joe Handzik joseph.t.handzik@hpe.com

joehandzik · 2017-05-25T20:40:45Z

I'd like this to be a conversation starter, rather than a sure-fire code change. I'm not sure that this makes the user's life any better, since now you see entries like:

lustre_disk_io_total{operations="read", ost="XXX", size="1000"} 1
lustre_disk_io_total{operations="read", ost="XXX", size="2000"} 1
lustre_disk_io_total{operations="read", ost="XXX", size="4000"} 1
lustre_disk_io_total{operations="read", ost="XXX", size="8000"} 1

Which I don't think is meaningfully better than:

lustre_disk_io_total{operations="read", ost="XXX", size="1K"} 1
lustre_disk_io_total{operations="read", ost="XXX", size="2K"} 1
lustre_disk_io_total{operations="read", ost="XXX", size="4K"} 1
lustre_disk_io_total{operations="read", ost="XXX", size="8K"} 1

Thoughts, @mjtrangoni or @knweiss?

knweiss · 2017-05-26T10:54:05Z

@joehandzik Using bytes will e.g. help Grafana to display a numerically sorted graph legend (i.e. sorted by size). Right now, Grafana has no idea how to sort the size labels as it doesn't understand the various units used. (Also, the lustre_disk_io_total metric output of lustre_exporter is sorted alphabetically and not numerically by size - see my example in #44. Of course, that's not wrong but also not very nice.)

Please use 1024 instead of 1000 as the base.

roclark · 2017-05-26T14:54:08Z

@knweiss good tip on sorting based on numeric labels in Grafana, I haven't played around with that feature too much yet. I can see value in that feature.

One concern I have is differentiating the sizes in Grafana. Does it auto-configure numeric labels similar to axis values? I just have concerns about these numbers taking up a lot of real-estate in Grafana. For example, just 1GB is 1073741824. Seeing a size similar to this next to the intended value is difficult for me to quickly decipher differences. Also, the tool-tip would take up a much larger space, right? Especially if you add other values to the label, such as instance, OST, etc.

Could be I'm off-base here since I haven't played around with this specific functionality in Grafana too much yet, so please correct me if I am wrong.

roclark · 2017-05-26T14:57:20Z

sources/procfs.go

+		new_s = strings.TrimSuffix(s, "G")
+		multiplier = 1000000000
+	default:
+		//this should only happen for integers


Or if for some reason the file has another label... though we aren't there yet, imagine once we get into the terabyte range, we will have to parse for that. Though this should work for all current scenarios, just note that it could get some unexpected value and it would be tossed here. Not that that's bad, just that your comment might be misleading.

Comment clarified as suggested.

roclark · 2017-05-26T15:01:05Z

sources/procfs.go

+func convertToBytes(s string) string {
+	new_s := ""
+	multiplier := 1
+	switch finalChar := s[len(s)-1:]; finalChar {


Again, there shouldn't be an issue with this, but might be safe to force this to an uppercase value. Might add a bit of safety just in case these values get changed for some odd reason in future updates. Though I certainly don't feel very strongly about this, so take it or leave it as you desire.

Forced capitalization as suggested.

roclark · 2017-05-26T15:03:10Z

sources/procfs.go

+func convertToBytes(s string) string {
+	new_s := ""
+	multiplier := 1
+	switch finalChar := s[len(s)-1:]; finalChar {


Another thought, are we sure the length of s will always be at least 1? I'm pretty sure we can be confident it is at least 1 as we shouldn't make it to this function if it isn't, but might not hurt to double check.

Check added as suggested.

roclark · 2017-05-26T15:22:35Z

sources/procfs.go

+		return s
+	}
+	byteVal = byteVal * multiplier
+	return strconv.Itoa(byteVal)


Probably want to use the longer string/int parsing method we have used throughout this file as this breaks for sizes much greater than 1GB. For example, check out this sample code for 10GB. It returns an unexpected number. Playing around with smaller values works as expected though.

Sorry this didn't make it into the actual review!

This was the most egregious mistake I made! The code was actually broken. Accounted for, though I'm not super crazy about a lot of the casting choices I made. I could force everything to float64 and minimize the casting, but that's the only simplification I see. Penny for your thoughts?

No worries my friend! Hmm... any reason for float64 as opposed to just int64 (or... uint64)? I like that idea though, just force it all to that value just to be safe. Heck, what if there's some bug where Lustre has the metrics as bytes in the code but it's actually in the GB size, then we will still run into this issue. It's a stretch... but I'd rather be safe. Regardless of that though, I like the idea of forcing float64(/int64/uint64) on everything.

Ok, I can force float64. It's float64 because that's what math.Pow() returns (I guess I could immediately cast to uint64, but that seems...limiting).

Sold! Works for me!

Ok, after looking at this again and looking at some of the float-to-string and string-to-float methods, I'm not sure that's any better. I could convert the math.Pow() operations to uint64 immediately, but that just makes the code messier. Any serious opposition to leaving the PR as-is currently?

Nope, I'm fine with how it is. Just thinking it might've been easier to keep it all int, but based on what you said, I think this makes a lot more sense this way.

roclark

Looking a lot better! Only thing that still stands out to me (and I'm open to discussion) is the float vs. int discussion.

Edit: Dismissing this particular review based on discussions above.

roclark

Looks good - ready to merge whenever you are.

joehandzik · 2017-05-26T18:04:58Z

@roclark Let's wait and get a confirmation from @knweiss that this is adequate. I'm ok with waiting until next week to merge this PR.

roclark · 2017-05-26T18:08:36Z

Works for me @joehandzik

mjtrangoni · 2017-05-26T18:37:53Z

sources/procfs.go

+	}
+	new_s := ""
+	multiplier := float64(1)
+	switch finalChar := strings.ToUpper(s[len(s)-1:]); finalChar {


@joehandzik just a question. Are they originally upper or lower case? Why do you have to force to the upper case?

They are currently uppercase in every situation that we've seen. But...we've been down this road before with string parsing in another project that I help maintain. Any time you can eliminate case sensitivity bugs, you sleep just a little better. :)

knweiss · 2017-05-29T14:24:50Z

Some thoughts:

Is the "G" unit support even necessary? AFAICS the largest unit Lustre uses is "M":

static void display_brw_stats(struct seq_file *seq, char *name, char *units,
        struct obd_histogram *read, struct obd_histogram *write, int scale)
{
[...]
                if (!scale)
                        seq_printf(seq, "%u", i);
                else if (i < 10)
                        seq_printf(seq, "%u", scale << i);
                else if (i < 20)
                        seq_printf(seq, "%uK", scale << (i-10));
                else
                        seq_printf(seq, "%uM", scale << (i-20));

                seq_printf(seq, ":\t\t%10lu %3lu %3lu   | %4lu %3lu %3lu\n",
                           r, pct(r, read_tot), pct(read_cum, read_tot),
                           w, pct(w, write_tot), pct(write_cum, write_tot));

I wonder: As Prometheus uses float64 metric values why do we bother converting the product to integer? Couldn't the return value of convertToBytes() simply be a float64 string representation?
convertToBytes() is a nice candidate for a unit test. ;-)
- Should convertToBytes() also return an error?

FWIW (still uses an integer string representation):

var (
        unitToMultiplierMap = map[int]uint64{
                'K': 1024,
                'M': 1024 * 1024,
        }
        errEmpty    = errors.New("can't parse empty string")
        errTooShort = errors.New("string too short")
)

func convertToBytes(s string) (string, error) {
        var byteVal uint64
        var err error
        if s == "" {
                return s, errEmpty
        }
        mul, present := unitToMultiplierMap[int(s[len(s)-1])]
        if !present {
                return s, nil
        }
        if len(s) < 2 {
                return s, errTooShort
        }
        byteVal, err = strconv.ParseUint(s[0:len(s)-1], 10, 64)
        if err != nil {
                return s, err
        }
        return strconv.FormatUint(byteVal*mul, 10), nil
}

func _convertToBytes(s string) string {
        s, _ = convertToBytes(s)
        return s
}

roclark · 2017-05-30T15:20:12Z

@knweiss:

Is the "G" unit support even necessary? AFAICS the largest unit Lustre uses is "M"

Good point. An argument can be made about future-proofing this, though it does seem a little excessive if the Lustre source doesn't even handle it.

I wonder: As Prometheus uses float64 metric values why do we bother converting the product to integer? Couldn't the return value of convertToBytes() simply be a float64 string representation?

I was originally thinking about it the other way and asking @joehandzik if we could replace the float64 with a uint64 since that's what we use later, and he rightfully talked me out of it. Your thought is probably even more straightforward. So... can we replace the uint64 in convertToBytes() with a float64 and use that throughout Joe?

convertToBytes() is a nice candidate for a unit test. ;-)

Agreed!

Should convertToBytes() also return an error?

Good idea, though I might argue against the errTooShort error in your code above as I've seen single-character sizes before (e.g. 8 bytes). Since this is a byte value, it doesn't have a unit.

Thanks as always for the feedback @knweiss - much appreciated! Let me know if you have any questions on my thoughts.

knweiss · 2017-05-30T15:33:39Z

@roclark AFAICS the code handles single-character sizes correctly as it returns early if no unit character is present:

        if !present {
                return s, nil
        }

You'll only get the errTooShort error if there's a unit character without a size (e.g. s == 'M').

roclark · 2017-05-30T15:37:01Z

@knweiss ah, yes, you are correct! I think I looked right through those particular lines...

joehandzik · 2017-05-30T16:30:33Z

@knweiss I'm hesitant to add any error catching at all beyond detecting that we don't know how to interpret something and pass it through. If we see bad output, I'd rather bail on the parsing instead of throwing an error that would halt the exporter. Agreed that a unit test is worthwhile.

As for using float64...I'm fine with that, but it'll need to be a change that is part of a larger refactor. The rest of the code expects uint64 today, so I can't change this here without triggering a bunch of build warning about mismatched types. I'd rather translate to a uint64 in this code today, and schedule in a separate change to normalize everything to a float64 later on (probably sometime this week, just after this gets merged).

joehandzik · 2017-05-30T17:13:33Z

@knweiss @roclark Unit test added, lemme know if there are any cases that I missed.

roclark · 2017-05-30T18:53:23Z

Unit tests look good to me. One might argue you could test a lower-case unit... but I'm not too worried about that considering Lustre doesn't currently (assuming they don't randomly change it later) use lower-case units.

joehandzik · 2017-05-30T19:22:00Z

@roclark That was totally worthwhile, actually. There was a bug in my logic for uppercasing.

roclark · 2017-05-30T19:31:30Z

Glad I mentioned it then! Nothing else stands out to me at the moment.

This will work through IO sizes of Gigabytes Fixes #50 Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>

joehandzik · 2017-05-31T16:13:17Z

@roclark Rebase done, everything should build, check, and merge cleanly now. I moved the convertToBytes() method to proc_common.go, FYI.

joehandzik requested a review from roclark May 25, 2017 20:33

joehandzik force-pushed the wip-fix-issue-50 branch from a897f78 to bbdc997 Compare May 25, 2017 20:34

roclark suggested changes May 26, 2017

View reviewed changes

roclark reviewed May 26, 2017

View reviewed changes

joehandzik force-pushed the wip-fix-issue-50 branch from bbdc997 to 24b86a0 Compare May 26, 2017 17:29

roclark reviewed May 26, 2017

View reviewed changes

roclark approved these changes May 26, 2017

View reviewed changes

mjtrangoni reviewed May 26, 2017

View reviewed changes

joehandzik force-pushed the wip-fix-issue-50 branch 4 times, most recently from 3e3591c to 42d8f9f Compare May 30, 2017 17:11

roclark approved these changes May 30, 2017

View reviewed changes

joehandzik force-pushed the wip-fix-issue-50 branch from 42d8f9f to d1abd88 Compare May 30, 2017 19:21

Convert brw_stats sizes to bytes

0c76943

This will work through IO sizes of Gigabytes Fixes #50 Signed-Off-By: Joe Handzik <joseph.t.handzik@hpe.com>

joehandzik force-pushed the wip-fix-issue-50 branch from d1abd88 to 26bafb8 Compare May 31, 2017 16:03

joehandzik force-pushed the wip-fix-issue-50 branch from 26bafb8 to 0c76943 Compare May 31, 2017 16:11

roclark merged commit e89e98b into master May 31, 2017

joehandzik deleted the wip-fix-issue-50 branch May 31, 2017 19:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convert brw_stats sizes to bytes #86

Convert brw_stats sizes to bytes #86

joehandzik commented May 25, 2017

joehandzik commented May 25, 2017

knweiss commented May 26, 2017

roclark commented May 26, 2017

roclark May 26, 2017

joehandzik May 26, 2017

roclark May 26, 2017

joehandzik May 26, 2017

roclark May 26, 2017

joehandzik May 26, 2017

roclark May 26, 2017 •

edited

Loading

joehandzik May 26, 2017

roclark May 26, 2017

joehandzik May 26, 2017

joehandzik May 26, 2017

roclark May 26, 2017

joehandzik May 26, 2017

roclark May 26, 2017

roclark left a comment •

edited

Loading

roclark left a comment

joehandzik commented May 26, 2017

roclark commented May 26, 2017

mjtrangoni May 26, 2017

joehandzik May 26, 2017

knweiss commented May 29, 2017

roclark commented May 30, 2017

knweiss commented May 30, 2017

roclark commented May 30, 2017

joehandzik commented May 30, 2017

joehandzik commented May 30, 2017

roclark commented May 30, 2017

joehandzik commented May 30, 2017

roclark commented May 30, 2017

joehandzik commented May 31, 2017

Convert brw_stats sizes to bytes #86

Convert brw_stats sizes to bytes #86

Conversation

joehandzik commented May 25, 2017

joehandzik commented May 25, 2017

knweiss commented May 26, 2017

roclark commented May 26, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

roclark May 26, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

roclark left a comment • edited Loading

Choose a reason for hiding this comment

roclark left a comment

Choose a reason for hiding this comment

joehandzik commented May 26, 2017

roclark commented May 26, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

knweiss commented May 29, 2017

roclark commented May 30, 2017

knweiss commented May 30, 2017

roclark commented May 30, 2017

joehandzik commented May 30, 2017

joehandzik commented May 30, 2017

roclark commented May 30, 2017

joehandzik commented May 30, 2017

roclark commented May 30, 2017

joehandzik commented May 31, 2017

roclark May 26, 2017 •

edited

Loading

roclark left a comment •

edited

Loading