FaaS metrics - Not clear which attributes should be added to each metric #86

Open
joaopgrassi opened this issue Jun 6, 2023 · 4 comments
Assignees
Labels
invalid This doesn't seem right question Further information is requested

Comments

@joaopgrassi
Member

joaopgrassi commented Jun 6, 2023

The current FaaS metric semantic conventions list several metrics and then list the attributes.

I'm working on moving the metrics to YAML, and it is not clear which attributes should be added to each metric.

Before the attribute table, it is stated:

Below is a table of the attributes to be included on FaaS metric events

then after, there's this statement:

Outgoing FaaS invocations are identified using the faas.invoked_* attributes above. faas.trigger SHOULD be included in all metric events while faas.invoked_* attributes apply on outgoing FaaS invocation events only.

It says the faas.invoked_* attributes should be included on "outgoing FaaS invocation events only", but it also says "Outgoing FaaS invocations are identified using the faas.invoked_*", so the definition is circular. The metric table does not list which attributes belong to each metric, so it's unclear which metrics are "outgoing" and which are not.
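
Read literally, the quoted rule amounts to the attribute selection below. This is only a minimal Python sketch, not part of the semantic conventions: the function name `select_metric_attributes` and the `outgoing` flag are assumptions made for illustration, and the attribute values are placeholders. It also shows the gap: the caller still has to know whether a given metric event is "outgoing".

```python
def select_metric_attributes(candidate: dict, outgoing: bool) -> dict:
    """Pick the attributes to attach to one FaaS metric event.

    Hypothetical helper: keeps every attribute (including faas.trigger)
    and drops faas.invoked_* unless the event is an outgoing invocation.
    """
    selected = {}
    for key, value in candidate.items():
        if key.startswith("faas.invoked_") and not outgoing:
            continue  # faas.invoked_* applies to outgoing invocations only
        selected[key] = value
    return selected


# Placeholder values, for illustration only.
attrs = {"faas.trigger": "http", "faas.invoked_name": "my-function"}
incoming_attrs = select_metric_attributes(attrs, outgoing=False)
outgoing_attrs = select_metric_attributes(attrs, outgoing=True)
```

With these inputs the incoming event keeps only `faas.trigger`, while the outgoing event keeps both attributes.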

In the FaaS span semconv we have more info:

Incoming invocations

This section describes incoming FaaS invocations as they are reported by the FaaS instance itself.
For incoming FaaS spans, the span kind MUST be Server.

Outgoing invocations

This section describes outgoing FaaS invocations as they are reported by a client calling a FaaS instance.
For outgoing FaaS spans, the span kind MUST be Client.

I looked at the existing metrics and after reading all, my "gut feeling" is:

  • Outgoing faas.invoke_duration : the client invoking the function knows how long it took
  • Incoming faas.init_duration, faas.coldstarts, faas.errors: It seems to me these are "server" information, meaning the FaaS itself knows this info about itself?
  • ? faas.invocations: Is this the client counting how many times it invoked the function? Or is the function counting how many times it was invoked? 🤔
  • ? faas.timeouts: Is this the client reporting how many times invocations timed out? Or is it the server/function itself while doing other things?
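
The tentative classification above can be written down as data, which makes the open questions explicit. This is a sketch of my reading only, not an agreed mapping; the names `METRIC_DIRECTION` and `unresolved_metrics` are made up for illustration, and `None` marks the metrics I could not classify.

```python
from typing import Dict, List, Optional

# Tentative direction per metric; None means "still unclear".
METRIC_DIRECTION: Dict[str, Optional[str]] = {
    "faas.invoke_duration": "outgoing",  # the client knows how long the call took
    "faas.init_duration": "incoming",    # reported by the function itself
    "faas.coldstarts": "incoming",
    "faas.errors": "incoming",
    "faas.invocations": None,            # client-side or function-side count?
    "faas.timeouts": None,               # client-side or function-side?
}


def unresolved_metrics(directions: Dict[str, Optional[str]]) -> List[str]:
    """Return the metric names that have no direction assigned yet."""
    return sorted(name for name, direction in directions.items() if direction is None)


open_questions = unresolved_metrics(METRIC_DIRECTION)
# open_questions == ["faas.invocations", "faas.timeouts"]
```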

CC @skonto since you were the original author, could you shed some light on this?

@joaopgrassi joaopgrassi added invalid This doesn't seem right question Further information is requested labels Jun 6, 2023
@Oberon00
Member

Oberon00 commented Jun 6, 2023

faas.invoke_duration : the client invoking the function knows how long it took

Purely from the naming I would agree, but then the important duration of the server-side execution (renamed to "invocation" in open-telemetry/opentelemetry-specification#3209 to reduce confusion) is missing. OTOH, it does seem like a bad idea to have server and client execution/invocation duration under the same metric key, and one of them is definitely missing.

@Oberon00
Member

Oberon00 commented Jun 6, 2023

It may be sensible to drop "outgoing FaaS" entirely from metrics. AFAIK this concept is only really supported by AWS Lambda, which has an Invoke REST API as the lowest-level public API to invoke a Lambda. Other cloud vendors seem not to offer such an API, or offer it only in a very limited way for debugging (GCP). On other providers (Azure, GCF), you would make an ordinary HTTP request to invoke the function.

@lmolkova
Contributor

lmolkova commented Jul 18, 2023

faas.timeouts, faas.errors

These seem relevant to the status attribute discussion in open-telemetry/opentelemetry-specification#3243: the timeout/error would then be reported as a status code and not as an individual metric.

faas.invocations

It seems it can be derived from the faas.invoke_duration histogram; is this metric necessary at all?
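
To illustrate how such counts could fall out of a single duration histogram, here is a toy Python sketch. It is not the OpenTelemetry API: the `DurationHistogram` class, the `status` argument, and the values "ok", "error", and "timeout" are all assumptions made for illustration, since the status attribute is still under discussion.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class DurationHistogram:
    """Toy stand-in for a histogram instrument with one status attribute."""

    bounds: Tuple[float, ...]                          # upper bucket bounds, in ms
    buckets: Counter = field(default_factory=Counter)  # keyed by (status, bucket index)

    def record(self, duration_ms: float, status: str) -> None:
        # Bucket index = number of bounds the value exceeds.
        index = sum(1 for bound in self.bounds if duration_ms > bound)
        self.buckets[(status, index)] += 1

    def count(self, status: Optional[str] = None) -> int:
        """Number of recorded values, optionally restricted to one status."""
        return sum(
            n for (s, _), n in self.buckets.items() if status is None or s == status
        )


histogram = DurationHistogram(bounds=(10.0, 100.0, 1000.0))
histogram.record(5.0, "ok")
histogram.record(250.0, "ok")
histogram.record(40.0, "error")
histogram.record(3000.0, "timeout")

total_invocations = histogram.count()        # all recorded invocations
successful = histogram.count("ok")           # what faas.invocations would report
timeouts = histogram.count("timeout")        # what faas.timeouts would report
```

Under these assumptions the separate counters carry no information that the histogram does not already hold, provided every invocation (successful or not) is recorded.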

@joaopgrassi
Member Author

faas.invocations

It seems it can be derived from the faas.invoke_duration histogram; is this metric necessary at all?

Good question. faas.invocations says: "Number of successful invocations". If faas.invoke_duration also counts the successful ones (which I think it does/should), then I think you are right: it can be derived.


4 participants