-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
detectors/gcp: support Cloud Run jobs #465
detectors/gcp: support Cloud Run jobs #465
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/gcbrun
Note that Cloud Run jobs do not add |
@damemi not running gcbrun...? |
/gcbrun |
hm that seemed to work |
thanks |
cc @dashpole |
I don't see those set in https://cloud.google.com/run/docs/container-contract#jobs-env-vars. Are the docs incorrect? |
@dashpole I think doc is lacking. I was tested on real environment. |
@dashpole PTAL |
8db87c0
to
5cf62cc
Compare
/gcbrun |
@dashpole gentle ping |
Yes, sorry. We are discussing as a team to make sure this is the right approach. |
@dashpole Ah, I see. I waiting your discuss result. |
Hey @zchee, if you make a job with multiple tasks, what is the value in CLOUD_RUN_EXECUTION? Does each task end up with the same job name and execution? If so, we may need some additional information to prevent collisions between executions. |
@dashpole I'll investigate it. Just in case, in the case of the ref: |
If you can, try both, and also look at what CLOUD_RUN_TASK_INDEX and CLOUD_RUN_TASK_ATTEMPT are set to. Really appreciate the help |
@dashpole okay, will check both situations today. Also, I'll report Cloud Run jobs docs lacking that's 2 envs via Mercari's Google TAM (I work for Mercari, Inc). |
@dashpole replied from Google support, said "seems to serverless team forgot update document", So I think would be better to waiting to update docs. Maybe your concern is also cleared. Also, not yet checked real results (sorry, I was busy a bit) but will check. |
@dashpole I found otel side hardcoded |
Note that the old cloud run (which that code references) is deprecated in favor of this code base. |
@dashpole updated document |
@dashpole gentle ping I want to hear your opinions. |
What were CLOUD_RUN_TASK_INDEX and CLOUD_RUN_TASK_ATTEMPT set to? Also, what is in d.metadata.InstanceID() (faas.id)? Basically, we want to make sure that faas.name, faas.version, and faas.id uniquely identify an instance of a cloud run job. Otherwise, if two things are running in parallel with the same faas.* attributes, there could be conflicts. |
@dashpole I'll reply details later. |
Sorry for the late reply. Use https://github.com/GoogleCloudPlatform/golang-samples/tree/main/run/jobs diff --git a/run/jobs/main.go b/run/jobs/main.go
index 05e3195..a8a6d0a 100644
--- a/run/jobs/main.go
+++ b/run/jobs/main.go
@@ -21,6 +21,8 @@ import (
"os"
"strconv"
"time"
+
+ "cloud.google.com/go/compute/metadata"
)
type Config struct {
@@ -99,6 +101,12 @@ func main() {
}
log.Printf("Completed Task #%s, Attempt #%s", config.taskNum, config.attemptNum)
+
+ instanceID, err := metadata.InstanceID()
+ if err != nil {
+ log.Fatal(err)
+ }
+ log.Printf("instanceID: %s\n", instanceID)
}
// Throw an error based on fail rate $ gcloud alpha run jobs create --tasks=5 --parallelism=3 --region=asia-northeast1 jobs
# execute...
|
@zchee sorry for the delay in this. The spec changes just merged and from the discussion in open-telemetry/opentelemetry-specification#3378 this will be slightly different than we talked about here. Instead, we will have separate GCP-specific resource attributes for Cloud Run execution and Job ID (open-telemetry/opentelemetry-go#4079 will add these to the Go semconv package). I think we should probably just make these available via their own functions, and maybe add 2 new functions |
make sense. I'll implements. |
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
e852ece
to
0a7fa91
Compare
Cloud Functions gen2 also reserved K_CONFIGURATION environment variable Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
0a7fa91
to
68b1cff
Compare
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating this @zchee. We just need to make a few changes to match the spec for Cloud Run Jobs that was approved upstream.
They did not like the idea of using execution+task
as faas.version
, so those will be their own individual attributes instead. That means we need separate functions for each of them.
To know that we need to call those functions, Cloud Run Jobs have to be treated like their own environment, separate from Services. So a new detector function for that as well.
imo, faas.name
is fine to be shared between Jobs and Services. These are semantically providing the same info, even though it is technically different environment variables.
To summarize, we need the following new functions:
onCloudRunJob()
- detect if we are running on Cloud Run JobsFaasJobExecution()
- return the Execution ID of the JobFaasJobTaskIndex()
- return the Task Index for the execution
Our Collector resource detection processor will be updated to write to the new gcp.cloud_run.*
attributes when running on Cloud Run Jobs, and our exporter's monitored resource mapping will be updated to parse execution+task
into a unique task ID.
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
@damemi All of done. Also, thanks for polite details reply. BTW, I'm concerned about the inconsistencies in the notation of Cloud Run jobs. Like
For now, I have updated this PR according to otel spec's changes. But certainly not unified between the source code and GCP documentationd. such as |
Should I include that to this PR? |
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
yes please do, this will prevent the same headache in the future :) for the naming inconsistencies, I don't really have a good answer. I think the usage depends on the context, ie when referring to the product "Cloud Run jobs" makes sense, when referring to multiple jobs the plural makes sense vs a singular job. Either way, I would always prefer the |
detectors/gcp/faas.go
Outdated
func (d *Detector) FaaSVersion() (string, error) { | ||
if version, found := d.os.LookupEnv(faasRevisionEnv); found { | ||
return version, nil | ||
} | ||
return "", errEnvVarNotFound | ||
} | ||
|
||
// FaaSID returns the instance id of the cloud run instance or cloud function. | ||
// FaaSJobExecution returns the execution id of the Cloud Run jobs. | ||
func (d *Detector) FaaSJobExecution() (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on second thought, the new functions should probably match the resource attribute they are updating. So CloudRunJobExecution()
and CloudRunJobTaskIndex()
. Then it's:
FaaSName()
->faas.name
FaaSVersion()
->faas.version
CloudRunJobExecution()
->gcp.cloud_run.job.execution
CloudRunJobTaskIndex()
->gcp.cloud_run.job.task_index
sorry to ask you to change this again. does that sound good to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something shorter like JobExecution()
and JobTaskIndex()
are probably fine too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose the first one cuz mapped to otel spec is readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good to me!
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
Signed-off-by: Koichi Shiraishi <zchee.io@gmail.com>
@damemi PTAL. |
/gcbrun |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @zchee for working on this!
Support Cloud Run jobs using
CLOUD_RUN_JOB
andCLOUD_RUN_EXECUTION
.