This repository has been archived by the owner on Nov 8, 2022. It is now read-only.

Requesting large number of metrics leads to task hang #662

Closed
marcin-krolik opened this issue Jan 8, 2016 · 10 comments

@marcin-krolik
Collaborator

I found that requesting a large number of metrics (around 500 is enough) from any collector plugin causes the task to hang.
The issue was also checked with lower numbers of metrics (50, 100, 200, 300), and there was no error.

Reproduction:

  1. Load the file publisher
$ $SNAP_PATH/bin/snapctl plugin load build/plugin/snap-publisher-file
  2. Load the collector plugin
$ $SNAP_PATH/bin/snapctl plugin load <plugin_name>
  3. Prepare a task manifest with more than 500 metrics (example attached)
  4. Create the task
$ $SNAP_PATH/bin/snapctl task create -t examples/task/task.json
  5. Try to watch the task
$ $SNAP_PATH/bin/snapctl task watch <id>
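
For anyone without the attached example, a manifest with an arbitrary number of metrics can be generated with a short script. This is only a sketch: it assumes the version 1 JSON task manifest layout used by snap's example tasks (`workflow.collect.metrics` as a map of namespaces, a `file` publisher under `publish`), and the `/intel/dummy/metricN` namespaces, the 1s interval, and the output path are hypothetical placeholders that must be replaced with what your collector plugin actually exposes.

```python
import json


def build_task_manifest(metric_count,
                        namespace_prefix="/intel/dummy/metric",
                        interval="1s",
                        publish_file="/tmp/published"):
    """Return a task manifest (as a dict) requesting `metric_count` metrics.

    Assumption: the namespaces are hypothetical (prefix + index); substitute
    the namespaces your collector plugin really provides.
    """
    # One entry per requested metric: /intel/dummy/metric0, metric1, ...
    metrics = {f"{namespace_prefix}{i}": {} for i in range(metric_count)}
    return {
        "version": 1,
        "schedule": {"type": "simple", "interval": interval},
        "workflow": {
            "collect": {
                "metrics": metrics,
                # Publish everything through the file publisher.
                "publish": [
                    {"plugin_name": "file", "config": {"file": publish_file}}
                ],
            }
        },
    }


def write_task_manifest(path, metric_count):
    """Write a manifest with `metric_count` metrics to `path` as JSON."""
    with open(path, "w") as f:
        json.dump(build_task_manifest(metric_count), f, indent=2)
```

Calling `write_task_manifest("task.json", 501)` and passing the resulting file to `snapctl task create -t` should give a task just above the threshold; generating 300 and 501 side by side makes it easy to compare a working task with a hanging one.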

Fails:
The task watcher starts but no output shows up.
The file publisher produces only one iteration of results; after that, no new content is published.
The started task cannot be stopped with

$SNAP_PATH/bin/snapctl task stop <id>

The task stays in the task list with state "Running".

The publisher and collector plugin heartbeats stay alive, and both plugins can still be unloaded.

files_and_logs.zip

@jcooklin jcooklin self-assigned this Jan 8, 2016
@lynxbat lynxbat added the tracked label Jan 8, 2016
@thomastaylor312
Contributor

Just as a note, I have seen this too with a plugin that gets a metric ton of data once a day.

jcooklin added a commit to jcooklin/snap that referenced this issue Jan 9, 2016
@jcooklin
Collaborator

jcooklin commented Jan 9, 2016

@marcin-krolik @thomastaylor312: #664 should fix this issue. When you have time please test it.

@thomastaylor312
Contributor

I won't be back in a place where I can test it until Monday. I'll do that first thing.

@marcin-krolik
Collaborator Author

Hi @jcooklin, I created a PR for the processes plugin:
intelsdi-x/snap-plugin-collector-processes#1
I also reproduced this bug with a dummy plugin that generates an arbitrary number of metrics. There seems to be a cutoff at exactly 500 metrics that triggers this hang.

@jcooklin
Collaborator

Hi @marcin-krolik, did you test your processes plugin against #664?

@thomastaylor312
Contributor

@jcooklin I am still seeing the issue even with #664. It runs once and then doesn't run again. It also doesn't ever publish the metrics.

@jcooklin
Collaborator

I successfully tested against @marcin-krolik's processes plugin here.

jrcookli@jp22-4:/tmp$ snapctl task list
ID                                    NAME                                       STATE    HIT  MISS  FAIL  CREATED            LAST FAILURE
0e969c18-266d-42a7-9377-d162b449225b  Task-0e969c18-266d-42a7-9377-d162b449225b  Running  127  0     0     10:17AM 1-11-2016
jrcookli@jp22-4:/tmp$ head -2  snap-publisher-file.stdout
2016/01/11 10:17:06 time="2016-01-11T10:17:06-08:00" level=info msg="Publishing started"
2016/01/11 10:17:06 time="2016-01-11T10:17:06-08:00" level=info msg="publishing 10025 metrics to map[file:{/tmp/published_processes}]"

Notice that the task has run 127 times, collecting about 10K metrics (10025) on each interval.
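
The same check (is the HIT counter still growing?) can be scripted when retesting. The sketch below is not part of snap; it simply assumes the whitespace-separated `snapctl task list` column order shown above (ID, NAME, STATE, HIT, ...) and task names without spaces. Both helper names are made up for illustration.

```python
def parse_hit_counts(task_list_output):
    """Map task ID -> HIT count from the text printed by `snapctl task list`.

    Assumption: columns are ID, NAME, STATE, HIT, MISS, FAIL, ... separated
    by whitespace, as in the sample output in this thread.
    """
    hits = {}
    for line in task_list_output.splitlines():
        fields = line.split()
        # Skip the header row, shell prompts, and lines too short to be a task.
        if len(fields) < 6 or not fields[3].isdigit():
            continue
        hits[fields[0]] = int(fields[3])
    return hits


def stalled_tasks(before, after):
    """Return IDs of tasks whose HIT count did not grow between two snapshots."""
    return sorted(
        task_id
        for task_id, hit in after.items()
        if task_id in before and hit <= before[task_id]
    )
```

Take two snapshots of the `snapctl task list` output a few collection intervals apart, parse each with `parse_hit_counts`, and pass the results to `stalled_tasks`; a hung task like the one described in this issue would show up in the returned list, while a healthy one would not.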

@lynxbat
Contributor

lynxbat commented Jan 11, 2016

@jcooklin we are linking to a private github above.

@thomastaylor312
Contributor

I looked at this offline with @jcooklin. Looks like I had used the wrong code. #664 is working like it should.

@marcin-krolik
Collaborator Author

Hi,
I successfully retested the issue with the processes plugin and with the dummy plugin as well.

Thanks!

jcooklin added a commit that referenced this issue Jan 14, 2016
Fix #662: Task hangs on a large number of metrics