Feature: store tool artifact on host (#18)
* feat(cli): define flag for store on host artifact & implement cli part

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(cli): refactor output handling

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(*): extract constant usage

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(*): extract constant usage

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(*): extract constant usage

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(watchdog): eliminate code duplication

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* feat(cli): pass additional args to dumper

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* feat(dumper): implement storing dump on host

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(cli): convert var to field

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(tests): add test cases for store on host feature

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(test): eliminate test code duplication

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(tests): eliminate code duplication

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* fix(dumper): resolve linting issue

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* fix(tests): fix unit test

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* fix(cli): rename flag shorthand

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(*): update cli docs

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(tests): update targets to run integration tests

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(*): format imports

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* fix(dumper): fix dumper flags usage

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(*): move arch mapping in tests to shell scripts

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* fix(tests): make test logging with t.Log & fix test case usage

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* fix(docs): fix typos in docs

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* feat(watchdog): starts operators only on output flags

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(tests): split integration tests to two stages

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* fix(*): fix flags generation

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* fix(dumper): fix output file location for store-on-host

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(tests): use parallel in setup method

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(test): run tests in parallel in ci mode

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>

* chore(flags): extract helper type to build cli args & update usages

Signed-off-by: ArtemTrofimushkin <artemtrofimushkin@gmail.com>
ArtemTrofimushkin authored Feb 22, 2022
1 parent 81c18cd commit 2ec917f
Showing 37 changed files with 677 additions and 498 deletions.
36 changes: 19 additions & 17 deletions Makefile
@@ -2,6 +2,7 @@ GREEN := $(shell tput -Txterm setaf 2)
YELLOW := $(shell tput -Txterm setaf 3)
WHITE := $(shell tput -Txterm setaf 7)
RESET := $(shell tput -Txterm sgr0)
ARCH := $(shell uname -m)

.PHONY: all
all: help
@@ -41,13 +42,14 @@ test: test-unit test-integration
test-unit:
TEST_RUN_ARGS="$(TEST_RUN_ARGS)" TEST_DIR="$(TEST_DIR)" ./hacks/run-unit-tests.sh

.PHONY: test-integration-prepare
test-integration-prepare:
./hacks/prepare-integration-tests.sh "$(ARCH)"

.PHONY: test-integration
test-integration:
./hacks/run-integration-tests.sh amd64

.PHONY: test-integration-arm64
test-integration-arm64:
./hacks/run-integration-tests.sh arm64
./hacks/prepare-integration-tests.sh "$(ARCH)"
./hacks/run-integration-tests.sh

.PHONY: prepare
prepare: tidy lint doc build-cli build-dumper test
@@ -59,15 +61,15 @@ help:
@echo ' ${YELLOW}make${RESET} ${GREEN}<target>${RESET}'
@echo ''
@echo 'Targets:'
@echo " ${YELLOW}cover ${RESET} Open html coverage report in browser"
@echo " ${YELLOW}doc ${RESET} Run doc generation"
@echo " ${YELLOW}lint ${RESET} Run linters via golangci-lint"
@echo " ${YELLOW}tidy ${RESET} Run tidy for go module to remove unused dependencies"
@echo " ${YELLOW}build-cli ${RESET} Build cli component of shovel"
@echo " ${YELLOW}build-dumper ${RESET} Build dumper component of shovel"
@echo " ${YELLOW}setup ${RESET} Setup local environment. Create kind cluster"
@echo " ${YELLOW}test ${RESET} Run all available tests"
@echo " ${YELLOW}test-unit ${RESET} Run all unit tests"
@echo " ${YELLOW}test-integration ${RESET} Run all integration tests (for amd64 arch)"
@echo " ${YELLOW}test-integration-arm64 ${RESET} Run all integration tests (for arm64 arch)"
@echo " ${YELLOW}prepare ${RESET} Run all available checks and generators"
@echo " ${YELLOW}cover ${RESET} Open html coverage report in browser"
@echo " ${YELLOW}doc ${RESET} Run doc generation"
@echo " ${YELLOW}lint ${RESET} Run linters via golangci-lint"
@echo " ${YELLOW}tidy ${RESET} Run tidy for go module to remove unused dependencies"
@echo " ${YELLOW}build-cli ${RESET} Build cli component of shovel"
@echo " ${YELLOW}build-dumper ${RESET} Build dumper component of shovel"
@echo " ${YELLOW}setup ${RESET} Setup local environment. Create kind cluster"
@echo " ${YELLOW}test ${RESET} Run all available tests"
@echo " ${YELLOW}test-unit ${RESET} Run all unit tests"
@echo " ${YELLOW}test-integration ${RESET} Run all integration tests"
@echo " ${YELLOW}test-integration-prepare ${RESET} Prepare integration tests (build dumper and load to kind cluster)"
@echo " ${YELLOW}prepare ${RESET} Run all available checks and generators"
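
The Makefile now passes the raw `uname -m` value to the prepare script, and the commit message says the architecture mapping moved into the shell scripts, which are not shown here. As a minimal sketch of what such a mapping might look like, assuming the usual machine names and a hypothetical helper called `normalizeArch` (not part of this commit):

```go
package main

import "fmt"

// normalizeArch maps a `uname -m` style machine name to the amd64/arm64
// names that the removed Makefile targets passed explicitly.
// Hypothetical helper for illustration; the commit does this in shell.
func normalizeArch(machine string) (string, error) {
	switch machine {
	case "x86_64", "amd64":
		return "amd64", nil
	case "aarch64", "arm64":
		return "arm64", nil
	default:
		return "", fmt.Errorf("unsupported architecture: %q", machine)
	}
}

func main() {
	for _, machine := range []string{"x86_64", "aarch64", "riscv64"} {
		arch, err := normalizeArch(machine)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", machine, arch)
	}
}
```

Rejecting unknown machine names explicitly, rather than falling back to a default, would make an unsupported host fail early in the prepare stage.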
32 changes: 22 additions & 10 deletions cli/cmd/command_builder.go
@@ -2,27 +2,33 @@ package cmd

import (
"fmt"
"github.com/dodopizza/kubectl-shovel/internal/flags"
"github.com/dodopizza/kubectl-shovel/internal/globals"

"github.com/spf13/cobra"
"github.com/spf13/pflag"

"k8s.io/cli-runtime/pkg/genericclioptions"

"github.com/dodopizza/kubectl-shovel/internal/flags"
"github.com/dodopizza/kubectl-shovel/internal/globals"
"github.com/dodopizza/kubectl-shovel/internal/kubernetes"
)

// CommonOptions contains generic arguments for cli
// CommonOptions contains shared arguments for cli commands
type CommonOptions struct {
Container string
Image string
Pod string
Output string
Container string
Image string
Pod string
Output string
StoreOutputOnHost bool

kube *genericclioptions.ConfigFlags
kubeConfig *genericclioptions.ConfigFlags
}

// CommandBuilder represents logic for building and running tools
type CommandBuilder struct {
CommonOptions *CommonOptions
tool flags.DotnetTool
kube *kubernetes.Client
}

// GetFlags return FlagSet that describes generic options
@@ -55,9 +61,15 @@ func (options *CommonOptions) GetFlags(tool string) *pflag.FlagSet {
fmt.Sprintf("./output.%s", tool),
"Output file",
)
fs.BoolVarP(
&options.StoreOutputOnHost,
"store-output-on-host",
"t",
options.StoreOutputOnHost,
"Flag, indicating that output should be stored on host /tmp folder")

options.kube = genericclioptions.NewConfigFlags(false)
options.kube.AddFlags(fs)
options.kubeConfig = genericclioptions.NewConfigFlags(false)
options.kubeConfig.AddFlags(fs)

return fs
}
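
The hunk above binds a boolean flag with a long name and a one-letter shorthand to `CommonOptions.StoreOutputOnHost` via pflag's `BoolVarP`. A rough, dependency-free sketch of the same idea follows. It swaps in the standard library `flag` package, which has no shorthand support, so the long name and the alias are registered separately against the same field. The `options` type and `parseOptions` function are hypothetical names used only for this example.

```go
package main

import (
	"flag"
	"fmt"
)

// options mirrors only the field relevant here; the real CommonOptions
// in the diff carries more fields.
type options struct {
	StoreOutputOnHost bool
}

// parseOptions registers a long flag and a one-letter alias that both
// write to the same field, then parses the given arguments.
func parseOptions(args []string) (*options, error) {
	opts := &options{}
	fs := flag.NewFlagSet("shovel-sketch", flag.ContinueOnError)

	const usage = "store output in the host /tmp folder"
	fs.BoolVar(&opts.StoreOutputOnHost, "store-output-on-host", false, usage)
	fs.BoolVar(&opts.StoreOutputOnHost, "t", false, usage+" (shorthand)")

	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return opts, nil
}

func main() {
	for _, args := range [][]string{{}, {"-t"}, {"--store-output-on-host"}} {
		opts, err := parseOptions(args)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%v -> StoreOutputOnHost=%v\n", args, opts.StoreOutputOnHost)
	}
}
```

Because the flag is a plain boolean that defaults to false, omitting it keeps the existing behavior of copying the result to the local `Output` path.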
8 changes: 5 additions & 3 deletions cli/cmd/commands.go
@@ -2,8 +2,10 @@ package cmd

import (
"fmt"
"github.com/dodopizza/kubectl-shovel/internal/flags"

"github.com/spf13/cobra"

"github.com/dodopizza/kubectl-shovel/internal/flags"
)

var (
@@ -20,7 +22,7 @@ func NewGCDumpCommand() *cobra.Command {
builder := NewCommandBuilder(flags.NewDotnetGCDump)
return builder.Build(
"Get dotnet-gcdump results",
"This subcommand will run dotnet-gcdump tool for running in k8s appplication.\n"+
"This subcommand will run dotnet-gcdump tool for running in k8s application.\n"+
"Result will be saved locally so you'll be able to analyze it with appropriate tools.\n"+
"You can find more info about dotnet-gcdump tool by the following links:\n\n"+
"\t* https://devblogs.microsoft.com/dotnet/collecting-and-analyzing-memory-dumps/\n"+
@@ -34,7 +36,7 @@ func NewTraceCommand() *cobra.Command {
builder := NewCommandBuilder(flags.NewDotnetTrace)
return builder.Build(
"Get dotnet-trace results",
"This subcommand will capture runtime events with dotnet-trace tool for running in k8s appplication.\n"+
"This subcommand will capture runtime events with dotnet-trace tool for running in k8s application.\n"+
"Result will be saved locally in nettrace format so you'll be able to convert it and analyze with appropriate tools.\n"+
"You can find more info about dotnet-trace tool by the following links:\n\n"+
"\t* https://github.com/dotnet/diagnostics/blob/master/documentation/dotnet-trace-instructions.md\n"+
100 changes: 72 additions & 28 deletions cli/cmd/launch.go
@@ -3,30 +3,75 @@ package cmd
import (
"context"
"fmt"
"github.com/dodopizza/kubectl-shovel/internal/events"
"github.com/dodopizza/kubectl-shovel/internal/globals"
"strings"

"github.com/pkg/errors"

"github.com/dodopizza/kubectl-shovel/internal/events"
"github.com/dodopizza/kubectl-shovel/internal/flags"
"github.com/dodopizza/kubectl-shovel/internal/globals"
"github.com/dodopizza/kubectl-shovel/internal/kubernetes"
"github.com/dodopizza/kubectl-shovel/internal/watchdog"
)

func (cb *CommandBuilder) args(info *kubernetes.ContainerInfo) []string {
args := []string{"--container-id", info.ID, "--container-runtime", info.Runtime}
args = append(args, cb.tool.ToolName())
args = append(args, cb.tool.FormatArgs()...)
return args
func (cb *CommandBuilder) newKubeClient() error {
kube, err := kubernetes.NewClient(cb.CommonOptions.kubeConfig)
if err != nil {
return err
}

cb.kube = kube
return nil
}

func (cb *CommandBuilder) args(pod *kubernetes.PodInfo, container *kubernetes.ContainerInfo) []string {
args := flags.NewArgs().
Append("container-id", container.ID).
Append("container-runtime", container.Runtime)

if cb.CommonOptions.StoreOutputOnHost {
args.AppendKey("store-output-on-host")
}

return args.
Append("container-name", container.Name).
Append("pod-name", pod.Name).
Append("pod-namespace", pod.Namespace).
AppendCommand(cb.tool.ToolName()).
AppendFrom(cb.tool).
Get()
}

func (cb *CommandBuilder) copyOutput(pod *kubernetes.PodInfo, output string) error {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()

pinger := watchdog.NewPinger(cb.kube, pod.Name)
go func() {
if err := pinger.Run(ctx); err != nil {
fmt.Println(err)
}
}()

fmt.Println("Retrieve output from diagnostics job")
if err := cb.kube.CopyFromPod(pod.Name, output, cb.CommonOptions.Output); err != nil {
return errors.Wrap(err, "Error while retrieving diagnostics job output")
}
fmt.Printf("Result successfully written to %s\n", cb.CommonOptions.Output)
return nil
}

func (cb *CommandBuilder) storeOutputOnHost(pod *kubernetes.PodInfo, output string) error {
fmt.Printf("Output located on host: %s, at path: %s\n", pod.Node, output)
return nil
}

func (cb *CommandBuilder) launch() error {
k8s, err := kubernetes.NewClient(cb.CommonOptions.kube)
if err != nil {
if err := cb.newKubeClient(); err != nil {
return errors.Wrap(err, "Failed to init kubernetes client")
}

targetPod, err := k8s.GetPodInfo(cb.CommonOptions.Pod)
targetPod, err := cb.kube.GetPodInfo(cb.CommonOptions.Pod)
if err != nil {
return errors.Wrap(err, "Failed to get info about target pod")
}
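
The `args` method in the hunk above replaces manual slice appends with a chained helper from `internal/flags`, whose implementation is not part of the lines shown here. The following is a minimal sketch of how such a fluent builder might work. The `Args` type, its method bodies, and the `--key value` output layout are assumptions inferred from the call sites and from the removed code, and the sample values (`abc123`, `containerd`, `my-pod`, `trace`) are placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// Args is a hypothetical stand-in for the helper returned by flags.NewArgs.
type Args struct {
	items []string
}

// NewArgs returns an empty builder.
func NewArgs() *Args { return &Args{} }

// Append adds a "--key value" pair as two separate elements.
func (a *Args) Append(key, value string) *Args {
	a.items = append(a.items, "--"+key, value)
	return a
}

// AppendKey adds a bare "--key" switch without a value.
func (a *Args) AppendKey(key string) *Args {
	a.items = append(a.items, "--"+key)
	return a
}

// AppendCommand adds a positional word such as a tool name.
func (a *Args) AppendCommand(name string) *Args {
	a.items = append(a.items, name)
	return a
}

// Get returns a copy of the accumulated arguments.
func (a *Args) Get() []string {
	return append([]string(nil), a.items...)
}

// buildArgs follows the shape of the args method in the diff,
// with placeholder values instead of real pod and container info.
func buildArgs(storeOnHost bool) []string {
	args := NewArgs().
		Append("container-id", "abc123").
		Append("container-runtime", "containerd")

	if storeOnHost {
		args.AppendKey("store-output-on-host")
	}

	return args.
		Append("pod-name", "my-pod").
		AppendCommand("trace").
		Get()
}

func main() {
	fmt.Println(strings.Join(buildArgs(true), " "))
}
```

Returning the receiver from every method is what allows the conditional `AppendKey` call in the middle of the chain without breaking it into a separate slice operation.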
@@ -42,34 +87,29 @@ func (cb *CommandBuilder) launch() error {
}

jobSpec := kubernetes.
NewJobRunSpec(cb.args(targetContainer), cb.CommonOptions.Image, targetPod).
NewJobRunSpec(cb.args(targetPod, targetContainer), cb.CommonOptions.Image, targetPod).
WithContainerFSVolume(targetContainer)

if targetPod.ContainsMountedTmp(targetContainerName) {
jobSpec.WithContainerMountsVolume(targetContainer)
}

if cb.CommonOptions.StoreOutputOnHost {
jobSpec.WithHostTmpVolume()
}

fmt.Printf("Spawning diagnostics job with command:\n%s\n", strings.Join(jobSpec.Args, " "))
if err := k8s.RunJob(jobSpec); err != nil {
if err := cb.kube.RunJob(jobSpec); err != nil {
return errors.Wrap(err, "Failed to spawn diagnostics job")
}

fmt.Println("Waiting for a diagnostics job to start")
jobPod, err := k8s.WaitPod(jobSpec.Selectors)
jobPod, err := cb.kube.WaitPod(jobSpec.Selectors)
if err != nil {
return errors.Wrap(err, "Failed to wait diagnostics job execution")
}

op := watchdog.NewOperator(k8s, jobPod.Name)
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
go func() {
if err := op.Run(ctx); err != nil {
fmt.Println(err)
}
}()

jobPodLogs, err := k8s.ReadPodLogs(jobPod.Name, globals.PluginName)
jobPodLogs, err := cb.kube.ReadPodLogs(jobPod.Name, globals.PluginName)
if err != nil {
return errors.Wrap(err, "Failed to read logs from diagnostics job targetPod")
}
@@ -81,13 +121,17 @@ func (cb *CommandBuilder) launch() error {
return err
}

fmt.Println("Retrieve output from diagnostics job")
if err := k8s.CopyFromPod(jobPod.Name, output, cb.CommonOptions.Output); err != nil {
return errors.Wrap(err, "Error while retrieving diagnostics job output")
// dealing with output
outputHandler := cb.copyOutput
if cb.CommonOptions.StoreOutputOnHost {
outputHandler = cb.storeOutputOnHost
}
if err := outputHandler(jobPod, output); err != nil {
return err
}

fmt.Printf("Result successfully written to %s\nCleanup diagnostics job", cb.CommonOptions.Output)
if err := k8s.DeleteJob(jobSpec.Name); err != nil {
fmt.Println("Cleanup diagnostics job")
if err := cb.kube.DeleteJob(jobSpec.Name); err != nil {
return errors.Wrap(err, "Error while deleting job")
}
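
The last hunk picks the output handler by assigning one of two methods with the same signature to a variable and calling it once, so cleanup runs only if the handler succeeds. A simplified sketch of that pattern is shown here. The `runner` type and the `finish` method are hypothetical, and the handlers return a message instead of talking to a cluster so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// runner is a hypothetical stand-in for CommandBuilder.
type runner struct {
	storeOnHost bool   // mirrors CommonOptions.StoreOutputOnHost
	localOutput string // mirrors CommonOptions.Output
}

// copyOutput stands in for copying the result from the job pod to a local file.
func (r *runner) copyOutput(node, path string) (string, error) {
	if path == "" {
		return "", errors.New("diagnostics job reported no output path")
	}
	return fmt.Sprintf("copied %s to %s", path, r.localOutput), nil
}

// storeOutputOnHost only reports where the file was left on the node.
func (r *runner) storeOutputOnHost(node, path string) (string, error) {
	return fmt.Sprintf("output located on host: %s, at path: %s", node, path), nil
}

// finish selects a handler via a method value and runs the cleanup step
// only when the handler succeeds, as in the diff.
func (r *runner) finish(node, path string) (string, error) {
	handler := r.copyOutput
	if r.storeOnHost {
		handler = r.storeOutputOnHost
	}

	msg, err := handler(node, path)
	if err != nil {
		return "", err
	}
	return msg + "; cleanup diagnostics job", nil
}

func main() {
	for _, r := range []*runner{
		{storeOnHost: false, localOutput: "./output.trace"},
		{storeOnHost: true, localOutput: "./output.trace"},
	} {
		msg, err := r.finish("node-1", "/tmp/output.trace")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(msg)
	}
}
```

Keeping both handlers on the same signature means the rest of `launch` does not need to know which mode is active.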

1 change: 1 addition & 0 deletions cli/docs/kubectl-shovel_dump.md
@@ -56,6 +56,7 @@ Also use `-n`/`--namespace` if your pod is not in current context's namespace:
-p, --process-id int The process ID to collect the trace from (default 1)
--request-timeout string The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
-s, --server string The address and port of the Kubernetes API server
-t, --store-output-on-host Flag, indicating that output should be stored on host /tmp folder
--tls-server-name string Server name to use for server certificate validation. If it is not provided, the hostname used to contact the server is used
--token string Bearer token for authentication to the API server
--type type The kinds of information that are collected from process. Supported types:
3 changes: 2 additions & 1 deletion cli/docs/kubectl-shovel_gcdump.md
@@ -4,7 +4,7 @@ Get dotnet-gcdump results

### Synopsis

This subcommand will run dotnet-gcdump tool for running in k8s appplication.
This subcommand will run dotnet-gcdump tool for running in k8s application.
Result will be saved locally so you'll be able to analyze it with appropriate tools.
You can find more info about dotnet-gcdump tool by the following links:

@@ -54,6 +54,7 @@ Also use `-n`/`--namespace` if your pod is not in current context's namespace:
-p, --process-id int The process ID to collect the trace from (default 1)
--request-timeout string The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
-s, --server string The address and port of the Kubernetes API server
-t, --store-output-on-host Flag, indicating that output should be stored on host /tmp folder
--timeout timeout Give up on collecting the GC dump if it takes longer than this many seconds.
Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
Will be rounded to seconds. If no unit provided defaults to seconds.
3 changes: 2 additions & 1 deletion cli/docs/kubectl-shovel_trace.md
@@ -4,7 +4,7 @@ Get dotnet-trace results

### Synopsis

This subcommand will capture runtime events with dotnet-trace tool for running in k8s appplication.
This subcommand will capture runtime events with dotnet-trace tool for running in k8s application.
Result will be saved locally in nettrace format so you'll be able to convert it and analyze with appropriate tools.
You can find more info about dotnet-trace tool by the following links:

@@ -92,6 +92,7 @@ Or convert any other format to speedscope format with:
https://docs.microsoft.com/en-us/dotnet/core/diagnostics/well-known-event-providers
--request-timeout string The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
-s, --server string The address and port of the Kubernetes API server
-t, --store-output-on-host Flag, indicating that output should be stored on host /tmp folder
--tls-server-name string Server name to use for server certificate validation. If it is not provided, the hostname used to contact the server is used
--token string Bearer token for authentication to the API server
--user string The name of the kubeconfig user to use