-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Neuron observability #121
Neuron observability #121
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@freschri This is AWSome work. We need some strategic and tactical changes as per the comments below to align this well. Love the neuron pattern.
...-eks-observability-accelerators/single-new-eks-inferentia-neuron-opensource-observability.md
Outdated
Show resolved
Hide resolved
...-eks-observability-accelerators/single-new-eks-inferentia-neuron-opensource-observability.md
Outdated
Show resolved
Hide resolved
...-eks-observability-accelerators/single-new-eks-inferentia-neuron-opensource-observability.md
Outdated
Show resolved
Hide resolved
...-eks-observability-accelerators/single-new-eks-inferentia-neuron-opensource-observability.md
Outdated
Show resolved
Hide resolved
...-eks-observability-accelerators/single-new-eks-inferentia-neuron-opensource-observability.md
Outdated
Show resolved
Hide resolved
lib/single-new-eks-opensource-observability-pattern/neuron/neuron-monitor-addon.ts
Outdated
Show resolved
Hide resolved
lib/single-new-eks-opensource-observability-pattern/neuron/neuron-monitor.yaml
Outdated
Show resolved
Hide resolved
...-eks-observability-accelerators/single-new-eks-inferentia-neuron-opensource-observability.md
Outdated
Show resolved
Hide resolved
...-eks-observability-accelerators/single-new-eks-inferentia-neuron-opensource-observability.md
Outdated
Show resolved
Hide resolved
@freschri Any updates to this PR? |
I addressed all points apart from the addons. One of them depends on another PR in Blueprints (Neuron Device Plugin Addon). My suggestion was on another channel to merge this PR as is, if no other concern, and raise 2 issues to port the addons to Blueprints. That can be done as an exercise by someone with limited experience on Blueprints/Patterns/Observability. Please let me know |
Thats a good opportunity mentor, if you have some one in your region. If i have someone, i will let you know. But this is not a burning one, please take it slow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@freschri Couple of minor ones. Once you fix these, i can run e2e.
lib/single-new-eks-fargate-opensource-observability-pattern/index.ts
Outdated
Show resolved
Hide resolved
/do-e2e-test single-new-eks-opensource-observability deploy |
/do-e2e-test single-new-eks-opensource-observability deploy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the CDK version.
"aws-cdk-lib": "2.132.0",
"aws-cdk": "2.132.0"
/do-e2e-test single-new-eks-inferentia-opensource-observability deploy |
/do-e2e-test single-new-eks-opensource-observability deploy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more feedback
removed neuron reference in kustom as requested
# Conflicts: # lib/single-new-eks-fargate-opensource-observability-pattern/index.ts # package.json
@elamaran11 updated blueprints version by merging main into here. please trigger e2e thanks |
/do-e2e-test single-new-eks-inferentia-opensource-observability deploy |
@elamaran11 e2e passes. shall we merge? |
Nope lets wait, I might to crash e2e and do few other checks. |
/do-e2e-test single-new-eks-inferentia-opensource-observability destroy |
The E2E Failure is related to subnet ENI which is very interim. It worked on retry, |
/do-e2e-test single-new-eks-opensource-observability deploy |
Running E2E on OSS Pattern. |
/do-e2e-test single-new-eks-opensource-observability destroy |
/do-e2e-test single-new-eks-inferentia-opensource-observability deploy |
/do-e2e-test single-new-eks-inferentia-opensource-observability destroy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Single New EKS Cluster Open Source Observability Accelerator for Neuron-based clusters
depends on aws-observability/aws-observability-accelerator#32
@elamaran11 please check if we want to keep/remove cdk.json from gitignore (we need to be able to add entries that are validated in compilation otherwise github compilation fails on push)