Creation of Replay Pipeline for Splunk export #11
Conversation
Thx for the PR. Comments below.
Could you add a sentence in README with steps on how to deploy specifically the replay pipeline as needed?
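For reference, the README steps being requested might look something like the sketch below. The resource address `google_dataflow_job.splunk_dataflow_replay` is an assumption for illustration; substitute the actual resource name defined in replay.tf.

```shell
# Hypothetical README snippet: deploy only the replay pipeline on demand.
# The -target address below is an assumed name, not confirmed from this repo.
terraform plan -target=google_dataflow_job.splunk_dataflow_replay
terraform apply -target=google_dataflow_job.splunk_dataflow_replay

# Once the dead-letter backlog is drained, tear the replay job back down:
terraform destroy -target=google_dataflow_job.splunk_dataflow_replay
```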
variables.tf
Outdated
```hcl
@@ -124,3 +124,9 @@ variable "dataflow_job_udf_function_name" {
  description = "[Optional Dataflow UDF] Name of JavaScript function to be called (default: '')"
  default     = ""
}

variable "dataflow_replay_template_version" {
```
I like the idea of specifying a version number vs the GCS path. Should we make the other parameter `dataflow_template_path` consistent, though?
I think the version is all that's required, because the template path itself won't really change (as far as I know). Should we address that in this PR?
Looks like we should make the path static, but then consolidate the path + version variables together. How does that sound to you?
Yes, that sounds good. Up to you if you prefer to address it in this PR or a follow-up one.
Addressed it in this PR, let me know what you think!
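As a hedged sketch of the consolidation discussed above: keep the path prefix static and derive the full GCS paths from a single version variable. The bucket layout and template names follow Google's public `dataflow-templates` bucket, but the local names here are illustrative, not necessarily what this PR uses.

```hcl
# Single version input shared by both jobs; the path prefix stays static.
variable "dataflow_template_version" {
  type        = string
  description = "Dataflow template version for the main and replay jobs."
  default     = "latest"
}

locals {
  # Illustrative locals: Google-hosted template paths derived from one version.
  dataflow_splunk_template_gcs_path     = "gs://dataflow-templates/${var.dataflow_template_version}/Cloud_PubSub_to_Splunk"
  dataflow_deadletter_template_gcs_path = "gs://dataflow-templates/${var.dataflow_template_version}/Cloud_PubSub_to_Cloud_PubSub"
}
```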
```hcl
name              = local.dataflow_replay_job_name
template_gcs_path = local.dataflow_deadletter_template_gcs_path
temp_gcs_location = "gs://${local.dataflow_temporary_gcs_bucket_name}/${local.dataflow_temporary_gcs_bucket_path}"
machine_type      = var.dataflow_job_machine_type
```
The machine type and count specified are specific to Pub/Sub to Splunk template sizing. I wonder if we should leave those out for Pub/Sub to Pub/Sub template, and rely on default since it's an ephemeral pipeline? I'm fine either way.
I remember when we did this with Tempus: if the machine type for the replay pipeline was too small, it took quite some time (depending on the number of logs) to burn down the backlog with the Pub/Sub to Pub/Sub template. Perhaps we can rely on the default, but give the opportunity for customization?
SGTM
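One way to sketch the "rely on the default but allow customization" idea: an optional variable that falls back to `null`, which lets the Dataflow service pick its default machine type. The variable and resource names here are hypothetical, not taken from this PR.

```hcl
# Hypothetical override variable: empty string means use the Dataflow default.
variable "dataflow_replay_job_machine_type" {
  type        = string
  description = "Machine type for the replay job ('' = Dataflow service default)."
  default     = ""
}

resource "google_dataflow_job" "replay" {
  name              = local.dataflow_replay_job_name
  template_gcs_path = local.dataflow_deadletter_template_gcs_path
  temp_gcs_location = "gs://${local.dataflow_temporary_gcs_bucket_name}/${local.dataflow_temporary_gcs_bucket_path}"

  # Fall back to the service default when no machine type is supplied.
  machine_type = var.dataflow_replay_job_machine_type != "" ? var.dataflow_replay_job_machine_type : null
}
```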
…subnetwork in replay pipeline
variables.tf
Outdated
```hcl
  description = "Dataflow template version for the replay job."
  default     = "latest"
}

variable "dataflow_template_path" {
```
@npredey Should we remove this input variable, now that you have a similarly-named local variable?
```hcl
@@ -72,6 +72,12 @@ variable "splunk_hec_token" {

# Dataflow job parameters

variable "dataflow_template_version" {
  type        = string
  description = "Dataflow template version for the replay job."
```
I think this version variable applies to both templates/jobs, so just a minor update to the description (and the associated README table of parameters).
Sorry for the delay in this iteration, @npredey. Looks good to me; one last question about `dataflow_template_path`: how about deleting that input variable in favor of standardizing on `dataflow_template_version`? If so, let's update the table in README. Will probably need to update one line in pipeline.tf to use the local path variable, similar to replay.tf.
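If the variables are consolidated as suggested, the README parameter table might gain a row along these lines (the description and default are taken from this thread; the exact table columns in the repo's README may differ):

```
| Parameter                   | Description                                                      | Default  |
|-----------------------------|------------------------------------------------------------------|----------|
| `dataflow_template_version` | Dataflow template version used by both the main and replay jobs  | `latest` |
```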
…le for dataflow template path in pipeline.tf
@rarsan I standardized the input variable for template path, and switched to use the local template path variable like in …
… & template paths; Updated README
Looks good @npredey! So I pulled & tested this PR but it failed, as it turns out a …
@rarsan I sent an invite out to you for my repo, which should grant you the proper access. Let me know if that works!