Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add a TCP listener #56

Merged
merged 2 commits into from
May 6, 2015
Merged

add a TCP listener #56

merged 2 commits into from
May 6, 2015

Conversation

teepark
Copy link
Contributor

@teepark teepark commented Feb 6, 2015

Because individual sends via TCP can end up split across packet
boundaries, the parsing code had to be reworked to handle that case.
It now takes io.ReadCloser rather than []byte.

It is used this way in the UDP case as well.

@mreiferson
Copy link
Contributor

Hi @teepark thanks for this PR.

It looks like it needs a rebase after a recent merge and there's a bit of code to review but otherwise at a high level I'm 👍 with this.

@markrechler
Copy link
Contributor

@teepark, wanted to check-in and see if you wanted a stab at rebasing this against master.

Have not gotten to test the TCP functionality yet but it seems with these changes the first set of values don't get sent out.

/statsdaemon --address=:8125 --flush-interval=60 --percent-threshold=50 --percent-threshold=95 --percent-threshold=99 --debug=true
2015/04/27 19:00:03 listening on :8125
2015/04/27 19:00:30 ERROR: unrecognized type code "gapps.v_proxy.dev_dev.v-proxy-1.mem.heap_idle_bytes:2400256|gapps.v_proxy.dev_dev.v-proxy-1.mem.heap_in_use_bytes:2842624|gapps.v_proxy.dev_dev.v-proxy-1.mem.heap_released_bytes:0|gapps.v_proxy.dev_dev.v-proxy-1.mem.gc_pause_usec_100:3644|gapps.v_proxy.dev_dev.v-proxy-1.mem.gc_pause_usec_99:3644|gapps.v_proxy.dev_dev.v-proxy-1.mem.gc_pause_usec_95:1630|gapps.v_proxy.dev_dev.v-proxy-1.mem.next_gc_bytes:3414400|gapps.v_proxy.dev_dev.v-proxy-1.mem.gc_runs:105|capps.v_proxy.dev_dev.proxy.start:27|c"
2015/04/27 19:01:03 DEBUG: apps.v_proxy.dev_dev.proxy.start 1262 1430161263

@teepark
Copy link
Contributor Author

teepark commented Apr 27, 2015

hi @markrechler, I won't have a chance to for little while.

on that failure, shouldn't there be newlines between the values?

@markrechler
Copy link
Contributor

There are:

2015/04/27 19:58:15 line from buf: "apps.v_proxy.dev_dev.proxy.start:57|c", "apps.v_proxy.dev_dev.status_code.301:57|c\napps.v_proxy.dev_dev.proxy.success:57|c\n"

Added some debugging:

line, rest := lineFrom(buf)
log.Printf("line from buf: %q, %q", line, rest)

Looks like rest keeps the newlines. Will dig in some more later, probably something with the parsing as it only happens for the first set of values, works fine after that first error.

@markrechler
Copy link
Contributor

Only seems to affect gauges:

2015/04/27 20:16:18 buffer: "apps.v_proxy.dev_dev.proxy.start:58|c\napps.v_proxy.dev_dev.status_code.301:58|c\napps.v_proxy.dev_dev.proxy.success:58|c\n"
2015/04/27 20:16:19 buffer: "apps.v_proxy.dev_dev.v-proxy-1.mem.heap_objects:19532|g"

@markrechler
Copy link
Contributor

This is making more sense now. We queue up counter metrics in the library we use and send gauges one at a time.

This is related to line parsing. In master we are able to take in one value at a time, or a few values at a time seperated by newlines (https://github.com/etsy/statsd/blob/master/docs/metric_types.md#multi-metric-packets).

@teepark
Copy link
Contributor Author

teepark commented Apr 29, 2015

I do actually have a chance to look at this. This branch is also capable of reading multi-metric packets, and that's actually the paradigm that it uses for long-lived TCP connections: it always pulls in each metric by reading up to a newline. You don't have a test case that illustrates this branch's problem with gauges do you?

@markrechler
Copy link
Contributor

Was actually just looking at this, if you submit a metric or two via
echo -ne "deploys.test.myservice:1|c" | nc -w 1 -u localhost 8125

This manifested in the form of gauges for us because we were sending them as one-offs, we send other stuff as multi-metric off the bat.

Adding another condition to lineFrom

if len(input) > 0 {
        return input, []byte{}
    }

Sends all metrics with the caveat that Next will never return a false.

@teepark
Copy link
Contributor Author

teepark commented Apr 29, 2015

I'm merging changes with current master right now, give me a minute to get that pushed.

I mostly mean in statsdaemon_test.go -- I'd love a real failing test :)

@markrechler
Copy link
Contributor

Will see if I can come up with anything, the buffer effectively becomes "deploys.test.myservice:2|cdeploys.test.myservice:1|c" so it's a bit funkier to test for.

@teepark
Copy link
Contributor Author

teepark commented Apr 29, 2015

with newlines though right?
"deploys.test.myservice:2|c\ndeploys.test.myservice:1|c"

I can stick that in the tests

@markrechler
Copy link
Contributor

It has newlines when sending multi-metric, or single with newline, but when sending one-offs w/out newlines, the buffer turns into essentially an invalid mult-metric.

Breaks:

echo -ne "deploys.test.myservice:2|c" | nc -w 1 -u localhost 8125
echo -ne "deploys.test.myservice:1|c" | nc -w 1 -u localhost 8125

Works:

echo -e "deploys.test.myservice:2|c" | nc -w 1 -u localhost 8125
echo -e "deploys.test.myservice:1|c" | nc -w 1 -u localhost 8125

Works:

echo -e "deploys.test.myservice:2|c\ndeploys.test.myservice:1|c" | nc -w 1 -u localhost 8125

@markrechler
Copy link
Contributor

To give some more context:
https://github.com/etsy/statsd/blob/master/stats.js#L211
statsd assumes a newline indicates more than one metric

and with the current/master parsing:
https://github.com/bitly/statsdaemon/blob/master/statsdaemon.go#L374

Even without a newline, there is still potentially a valid line

@teepark
Copy link
Contributor Author

teepark commented Apr 29, 2015

the code in those two links are equivalent. if '\n' doesn't appear in the string then bytes.Split(n []byte{'\n'}) returns a [][]byte of length 1.

@markrechler
Copy link
Contributor

Right, was getting caught up in the wrong part of buffer handling, apologies, so one-off metrics work fine with the tcp-handler. UDP being treated as a stream breaks the parsing since one-offs just keep getting bunched together in the buffer. It might be better to have separate UDP parsing that treats each packet as a complete message.

@teepark
Copy link
Contributor Author

teepark commented Apr 29, 2015

I get it now, sorry I've been so dense. Yeah the parser here is set up to read from any io.Reader including what comes out of net.ListenUDP. But in that case I don't want to be joining together all reads into a single buffer.

OK, I'll build a test case around this and then work from there.

@teepark
Copy link
Contributor Author

teepark commented Apr 29, 2015

ok, give that a shot now. flipping the boolean here causes the test to pass or fail, so it seems to be doing what it should.

)

if strings.HasPrefix(typeCode, "c") {
split = bytes.SplitN([]byte(typeCode), []byte("|@"), 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timers also support sampling rates, may want to generalize this.

Test wise, could add this to TestParseLineTimer:

    d = []byte("glork:320|ms|@0.1")
    packet = parseLine(d)
    assert.NotEqual(t, packet, nil)
    assert.Equal(t, "glork", packet.Bucket)
    assert.Equal(t, uint64(320), packet.Value.(uint64))
    assert.Equal(t, "ms", packet.Modifier)
    assert.Equal(t, float32(0.1), packet.Sampling)

@markrechler
Copy link
Contributor

Thanks for adding the UDP test. Preliminary testing against a real data set is looking good.

var (
serviceAddress = flag.String("address", ":8125", "UDP service address")
tcpServiceAddress = flag.String("tcpaddr", "", "TCP service address, if set")
maxUdpPacketSize = flag.Int64("max-udp-packet-size", 1472, "Maximum UDP packet size")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove the maxUdpPacketSize flag since it's no longer being used.

@markrechler
Copy link
Contributor

Sorry for the delayed reply, everything looks good. @teepark could you rebase/squash down the commits to prep for a fast-forward merge whenever you get a chance?

@ploxiln
Copy link
Contributor

ploxiln commented May 5, 2015

I think this suffered from a messy rebase. try rebase -i master, and in the rebase commit list, delete the lines for commits that you didn't make

teepark added 2 commits May 5, 2015 14:06
Because individual sends via TCP can end up split across packet
boundaries, the parsing code had to be reworked to handle that case. It
now takes io.ReadCloser rather than []byte.

It is used this way in the UDP case as well, although there it disallows
single metrics spanning multiple read()s.
markrechler added a commit that referenced this pull request May 6, 2015
@markrechler markrechler merged commit c19b47a into bitly:master May 6, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants