Skip to content
This repository has been archived by the owner on Nov 8, 2022. It is now read-only.

SDI-1394 a metric can be changed from static to dynamic by setting na… #977

Merged
merged 4 commits into from
Jul 1, 2016

Conversation

candysmurf
Copy link
Contributor

Fixes #937

Summary of changes:

  • checks the name along with the value

Testing done:

  • unit test

@intelsdi-x/snap-maintainers

@lmroz
Copy link
Contributor

lmroz commented Jun 9, 2016

This is bad fix.
Plugins exposing dynamic metrics change the Value field from * to it's realization:
https://github.com/intelsdi-x/snap/blob/master/plugin/collector/snap-collector-mock1/mock/mock.go#L58

Then this is used to strip such namespace element from metric name:
https://github.com/intelsdi-x/snap-plugin-publisher-influxdb/blob/master/influx/influx.go#L163

@candysmurf
Copy link
Contributor Author

@imroz, thanks for reviewing it. From my understanding, IsDynamic only applies to the metric types, not metric data themselves. I don't see two issues you linked. Thoughts?

@lmroz
Copy link
Contributor

lmroz commented Jun 9, 2016

@candysmurf Dynamic metric as exposed by GetMetricTypes is defined as a set of metrics that may change during runtime. So every named element (formely just *) contains some kind of data. In mock plugin this is name of a host that we pretend to measure.

So example scenario is:

  • plugin exposes namespace /org/[host]/power_consumption
  • user requests metric /org/*/power_consumption
  • plugin returns 2 metric types /org/host1/power_consumption = 10W, /org/host2/power_consumption = 15W

Now, publisher is going to put such rows in database:

| MetricName             | Host  | Value |
|------------------------|-------|-------|
| /org/power_consumption | host1 | 10W   |
| /org/power_consumption | host2 | 15W   |

It does this by inspecting namespace elements using .IsDynamic(), then striping dynamic ones from metric namespace and putting them as separate column in DB.

But notice * is present in namespace only before CollectMetrics, so after your fix such data will be generated in DB:

| MetricName                   | Value |
|------------------------------|-------|
| /org/host1/power_consumption | 10W   |
| /org/host2/power_consumption | 15W   |

@candysmurf
Copy link
Contributor Author

@lmroz, thanks for your examples. However, I tried with Snap master where it does not have my change. It does not do what you expected.

IMHO, in the publishers, we know if it's a dynamic or static metric and the position of wildcard "*" The wildcard may be in many places beyond your examples. We could have a scenario where stripping some of them and keeping a few of them. In another word, publishers should do diligent based on use cases. For example if you like to store "/org/power_consumption" instead of "/org/host1/power_consumption".

In summary, IsDynamic should only check for metric types that defined in the task manifest. It has no other purpose. If it does, it's wrong itself. Please share your thoughts.

@marcin-krolik
Copy link
Collaborator

@candysmurf Emily, Lukasz is right. This fix will break handling of dynamic metrics further in the pipeline. You can't leave "*" in dynamic metric after collection. You have to change the Value to particular instance. Please take a closer look at mock plugin and how it is handled there

@candysmurf
Copy link
Contributor Author

@marcin-krolik, @lmroz: I cannot agree that the IsDynamic function is useful in collectors. For a publisher plugin, if we have to depend on IsDynamic to decide what to do for the incoming metric data, does it mean that we'll need to populate the name for each dynamic element when we're collecting the data? Is there any better and scalable way to achieve this when we're trying to address this bug. Share your thoughts.

@candysmurf
Copy link
Contributor Author

candysmurf commented Jun 10, 2016

@lmroz , @marcin-krolik, @jcooklin, @geauxvirtual , @intelsdi-x/plugin-maintainers:

I propose to remove IsDynamic function entirely. It's flawed. To use your example, when we create a namespace, it should be in the following format:

/org/host/host1/power_consumption
/org/host/host2/power_consumption

A publisher can strip off both "/host/host1" and "/host/host2" to create tags. Another even more obvious approach could be

/org/host=host1/power_consumption
/org/host=host2/power_consumption

or

/org/host*host1/power_consumption
/org/host*host2/power_consumption

So that you definitely know which one to remove and create tags.
Please share your thoughts.

@elemoine
Copy link
Contributor

I am failing to understand what issue this PR is attempting to fix. I did read #937, but I don't understand why a plugin would want to give names to namespace elements (e.g. ns[0].Name = "org"). The approach stating that elements with names are dynamic elements makes sense to me. Plugins should just be aware of that.

@marcin-krolik
Copy link
Collaborator

@candysmurf Problem with your PR is that it tries to address reported bug incorrectly. I had the same confusion when I reported #949 .

IsDynamic() is useful especially in publisher or processors. Metric which leaves collector cannot have wildcard (* character) as Value in any of NamespaceElement. On the other hand we need to have some sort of mechanism to recognize metrics which have "dynamic" origin (metric with wildcard was send to CollectMetrics()). This is achieved by substituting NamespaceElement.Value with particular "instance" like host1, or 10.1.1.2 etc., but leaving NamespaceElement.Name as it was. This way we can still identify metric as dynamic, because IsDynamic() checks for non-empy value of Name.

In short, you were asked to collect such metric
/intel/*/foo

This metric was created with namespace as

metric.Namepace_ = core.NewNamespace("intel").AddDynamicElement("host_ip", "IP address of the host").AddStaticElement("foo")

In CollectMetric() you need to handle this metric. You have two ways of doing that

  1. with static namespace
  2. with dynamic namespace

If you will choose 1 like:

metric.Namespace_: core.NewNamespace("intel", "10.1.1.2", "foo").

then you will loose information of dynamic origin of the metric. Publisher will have no way to get rid of unwanted namespace element which is "10.1.1.2"

If you go with 2, follow the example from mock collector then you don not loose dynamic origin of the metric, because second NamespaceElement still have Name set, so it can be recognized as dynamic in publishers/processors.

Hope it helps.

@candysmurf
Copy link
Contributor Author

candysmurf commented Jun 10, 2016

@elemoine, @marcin-krolik : Thanks for your reply. I understood that my PR will not work for some of publishers and processors as there are dependencies on "IsDynamic" checking and can only check for the name field of the struct.

Please read issue #937, users can set the "name" for static elements. Purely checking if the name is not empty to assume that the element is dynamic itself is flawed.

Note that if collectors didn't set the name during the data collection, even we keep the current IsDynamic checking for publishers or processors, it'll not achieve the result you expected.

I have my proposal in above comment to decouple the dependency on "IsDynamic" Please share your thoughts and if you still like to keep the coupling between Snap, Collectors, Processors, Publishers through "IsDynamic".

@marcin-krolik
Copy link
Collaborator

@candysmurf I'm aware of #937. At the same time I don't think either removal of IsDynamic or introducing another breaking change in namespace creation (which would require to reimplement all plugins which offer dynamic metrics) will help solve this issue.

First solution which comes to my mind could based on introduction of interface instead of core.NamespaceElement with two different structs implementing such interface. One being static, another dynamic. For static setting Name would not be possible. It would not break plugins and IsDynamic would still serve it's purpose.

@candysmurf
Copy link
Contributor Author

@marcin-krolik. It's a good suggestion to use interface. Let's think along this line.

I'm afraid that many plugins will still need to be changed as any dynamic element will need to provide the name field in order for IsDynamic to work appropriately. The amount of refactoring may be not any less.

Anyway, thanks for your suggestion, I like it!

@jcooklin
Copy link
Collaborator

jcooklin commented Jun 11, 2016

It's a good suggestion to use interface

An interface cannot be decoded which will cause some headaches

I would consider addressing this issue with plugin authoring documentation. I added a comment to #937 suggesting the same.

@lmroz
Copy link
Contributor

lmroz commented Jun 13, 2016

I see two options here:

  • add to namespace validation a check if namespace element has a Name then have to have * as a Value. Perhaps this check after CollectMetrics should also be done to make sure no element changed it's definition (btw, maybe some info should be stripped before collection, like Description and added later to avoid pushing such data back and forth - especially in case of distributed workflow)
  • allow namespace element to be Named or Dynamic or both (but Dynamic implies Named). That would also solve problem when list of targets is known (for example, number of cpus), but it's presented in different way in metric list.

@candysmurf
Copy link
Contributor Author

candysmurf commented Jun 13, 2016

@marcin-krolik, interface idea is brilliant and idiomatic to me. I was able to use custom marshal and unmarshal to address the polymorphic element slice. Unfortunately, GOB was not happy with the following error. e.g.

Failures:

  * /Users/egu/.gvm/pkgsets/go1.6.2/snap-latest/src/github.com/intelsdi-x/snap/scheduler/workflow_test.go
  Line 144:
  Expected: nil
  Actual:   'gob: wrong type (core.Namespace) for received field MetricType.Namespace_'

I tried to stick to the original variable and method names as much as possible. I'm not sure if Joel meant the same thing.

@imroz, I didn't get what you suggest. Please elaborate. thanks.

@candysmurf candysmurf force-pushed the is-dynamic branch 2 times, most recently from 7754a41 to 899a42f Compare June 13, 2016 18:38

func init() {
gob.Register(&StaticElement{})
gob.Register(&DynamicElement{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we want to have register going on in multiple places.

We also need to use gob.RegisterName() to avoid import path issues in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, where do you suggest to put them?

@lmroz
Copy link
Contributor

lmroz commented Jun 14, 2016

First solution I proposed is simply adding piece of code like this:

for _, elem := range ns {
    if elem.Name != "" && elem.Value != "*" {
        return meaningful error
    }
}

in these places:
https://github.com/intelsdi-x/snap/blob/master/control/metrics.go#L340
https://github.com/intelsdi-x/snap/blob/master/control/control.go#L1063

Second one is IMO better. We would have 3 types of NamespaceElement

  • Static Anonymous Element - this is element that have Value other than * and Name empty, currently created with AddStaticElement
  • Dynamic Named Element - element with Value = * and non-empty Name, currently created with AddDynamicElement
  • Static Named Element - the one that caused "problem", has Value other than * and non-empty Name, this would allow asking plugin about concrete metric (not all metrics) and simultaneously enable publishers to treat this element as data and strip it from namespace. Additionaly user would have been informed in a consistent way about available metrics (that requires additional code to list such metric only once, not per each instance).

Anyway, validation like in first solution has to be added to disallow unwanted combination of empty Name and Value = *

@candysmurf
Copy link
Contributor Author

candysmurf commented Jun 14, 2016

@lmroz, thanks for the details.

The first option will not require a lot of changes if only two places that we need to patch. Please confirm this.

Agreed, the second one is better and I believe it's doable technically. It requires a lot of changes inside Snap.

I already spent some effort similar to the second approach. I still need to tackle GRPC part and change a lots "[]core.Namespace" to []core.NamespaceElement" and test. Changes will be extensive. Before I put all my effort into it, I would like to get your inputs. Basically, we have following suggestions:

  1. Document it only (simplest, no code change)
  2. Patch a few places (Lukas's option 1, I have no knowledge if it's only a few patchable places.)
  3. Utilize Interface, a lot of code changes
  4. Symbolic separation between dynamic and static elements (Emily's simple and dirty)

@lmroz
Copy link
Contributor

lmroz commented Jun 14, 2016

99.9% sure it's only these 2 places (AFAIK we don't touch binary data that goes out of processor), but remember that in CollectMetrics you have to check in metric catalog if namespace element was static and then collector assigned a Name (or the opposite).

@candysmurf
Copy link
Contributor Author

candysmurf commented Jun 14, 2016

@lmroz, thanks for the confirmation and two links.

Basically, two scenarios we'll throw an error.

  Name != "" && Value != "*"
  Name == "" && Value == "*"

These two checks will be done for every metric collected. Do you have any concern regarding

  1. performance
  2. throw errors

If not, this probably is the simplest fixable solution although it's not ideal. Please share your thoughts.

@candysmurf candysmurf force-pushed the is-dynamic branch 2 times, most recently from 20a31c0 to 149833e Compare June 16, 2016 03:58
@candysmurf
Copy link
Contributor Author

@lmroz, @marcin-krolik: Luaksz's proposal #1 is implemented. Please review and share your thought if Snap really wants to enforce the validation in control.go. I didn't throw errors instead logged them.

@elemoine
Copy link
Contributor

If I may chime in… This validation implies a loop over every collected metric, and for each metric a copy of every namespace element. As @jcooklin commented the issue can be addressed with documentation. For those reasons I personally would not add validation. My 2 cents.

/intel/cassandra/node/zeus/type/Cache/scope/KeyCache/name/Requests/OneMinuteRate
/intel/cassandra/node/zeus/type/ClientRequest/scope/CASRead/name/Unavailables/FiveMinuteRate
/intel/cassandra/node/apollo/type/Cache/scope/RowCache/name/Hits/OneMinuteRate
/intel/cassandra/node/apollo/type/ClientRequest/scope/RangeSlice/name/Latency/OneMinuteRate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor but isn't the convention to use the snake notation in metric namespaces?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elemoine, what's the snake notation? is it [1][2][3][4]?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry_for_not_being_more_explicit :)

https://en.m.wikipedia.org/wiki/Snake_case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elemoine: thanks for the clarification! Agreed that snake_case could be easier to read comparing to CamelCase. In my example, I copied them from the Cassandra collector. Those camel cases were from MX4J APIs.

Are you suggesting that Snap should standardize cases in namespaces to Snake Case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not have a standardization on this for namespaces. I prefer whatever is closest to the resources. In this case @candysmurf has then coupled well with the Cass collector which is good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should hopefully be consistent inside a plugin. But it is not something we can control.

@elemoine
Copy link
Contributor

@candysmurf I've added a number of comments in the code. Feel free to ignore me if my comments are not valid/relevant :)

@candysmurf candysmurf force-pushed the is-dynamic branch 4 times, most recently from b5e2009 to 4cdb353 Compare June 20, 2016 20:59
@candysmurf
Copy link
Contributor Author

candysmurf commented Jun 20, 2016

@elemoine, thanks for your feedback! Please take another look and let me know if any other suggestions. thank you.

@candysmurf
Copy link
Contributor Author

candysmurf commented Jun 28, 2016

I got an idea to enhance collectors so that they're TimeDB friendly. The good thing is that it'll not have a dependency on "IsDynamic()". There will be no modification to publishers and minimum changes to collectors. I'm implemented it in both Cassandra and Redfish collectors. They'll be ready for review soon.

With that said, setting the name for a dynamic or static element is irrelevant. Developers can have the freedom to use the Name field.

@candysmurf
Copy link
Contributor Author

candysmurf commented Jun 30, 2016

@jcooklin, Updated the doc. Please review this PR and let me know if anything you like me to change. thank you.

@candysmurf candysmurf force-pushed the is-dynamic branch 4 times, most recently from 3b23bcc to fd9468a Compare June 30, 2016 03:04
@candysmurf
Copy link
Contributor Author

@jcooklin , Well done! Thanks for the collaboration!

return fmt.Errorf("A static element %s should not define name %s for namespace %s.", value, name, ns)
}

func errorMetricDynamicElemnetHasNoName(value, ns string) error {
Copy link
Contributor

@pittma pittma Jun 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a typo here. should be:

func errorMetricDynamicElementHasNoName(value, ns string) error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corrected. Thanks!

@pittma
Copy link
Contributor

pittma commented Jul 1, 2016

👍

@pittma pittma merged commit 47b789c into intelsdi-x:master Jul 1, 2016
@andrzej-k andrzej-k mentioned this pull request May 31, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants