Cloud Foundry Buildpack for Collector (open-telemetry#1404)
* Cloud Foundry Buildpack for the Splunk Collector

This is a deployment of the OpenTelemetry Collector in the form of a
Cloud Foundry buildpack. The buildpack supplies the Collector to the
app it is applied to. When the app is deployed, it can run and configure
the Collector as a sidecar (as described in the README), allowing
the Collector to observe the given app as well as the whole environment's
metrics. The buildpack also installs the Smart Agent bundle to make sure the
Smart Agent receiver works as expected.
crobert-1 authored Apr 19, 2022
1 parent 7d7c281 commit c5d9954
Showing 8 changed files with 381 additions and 1 deletion.
44 changes: 44 additions & 0 deletions .github/workflows/cloudfoundry_buildpack.yml
@@ -0,0 +1,44 @@
name: Cloud Foundry Buildpack

# This workflow is triggered by any change in deployments/cloudfoundry/buildpack/.
# 1. Run buildpack test.

on:
pull_request:
paths:
- 'deployments/cloudfoundry/buildpack/**'

permissions:
contents: write

defaults:
run:
working-directory: 'deployments/cloudfoundry/buildpack'

jobs:

test:
name: Test buildpack supplies required dependencies
runs-on: ubuntu-latest
steps:
- name: Check out the codebase.
uses: actions/checkout@v3

- name: Setup script's input argument directories
shell: bash
run: |
sudo mkdir /tmp/cf_build_dir
sudo mkdir /tmp/cf_cache_dir
sudo mkdir /tmp/cf_deps_dir
- name: Run buildpack supply script
shell: bash
run: |
sudo ./bin/supply /tmp/cf_build_dir /tmp/cf_cache_dir /tmp/cf_deps_dir 0
- name: Delete created files
shell: bash
run: |
sudo rm -rf /tmp/cf_build_dir
sudo rm -rf /tmp/cf_cache_dir
sudo rm -rf /tmp/cf_deps_dir
2 changes: 1 addition & 1 deletion .github/workflows/lychee.yml
@@ -21,7 +21,7 @@ jobs:
id: lychee
uses: lycheeverse/lychee-action@v1.4.1
with:
- args: --accept 200,429 --exclude "my.host" --exclude "file://*" --exclude "api.*.signalfx.com" --exclude "ingest.*.signalfx.com" --exclude "splunk.jfrog.io.*basearch" --exclude "localhost:*" --exclude "127.*:*" --exclude "splunk_gateway_url" -v -n './*.md' './**/*.md'
+ args: --accept 200,429 --exclude "my.host" --exclude "file://*" --exclude "api.*.signalfx.com" --exclude "ingest.*.signalfx.com" --exclude "splunk.jfrog.io.*basearch" --exclude "localhost:*" --exclude "127.*:*" --exclude "splunk_gateway_url" --exclude ".*.cf-app.com" -v -n './*.md' './**/*.md'
- name: Fail if there were link errors
run: exit ${{ steps.lychee.outputs.exit_code }}
- name: Create Issue From File
6 changes: 6 additions & 0 deletions deployments/cloudfoundry/README.md
@@ -0,0 +1,6 @@
# Splunk OpenTelemetry Collector Pivotal Cloud Foundry (PCF) Integrations

### Cloud Foundry Buildpack

This integration can be used to install and run the Collector as a sidecar to your app.
In this configuration, the Collector runs in the same container as the app.
177 changes: 177 additions & 0 deletions deployments/cloudfoundry/buildpack/README.md
@@ -0,0 +1,177 @@
# Splunk OpenTelemetry Collector Pivotal Cloud Foundry (PCF) Buildpack

A [Cloud Foundry buildpack](https://docs.pivotal.io/application-service/2-11/buildpacks/) to install
the Splunk OpenTelemetry Collector for use with PCF apps.

The buildpack's default functionality, as described in this document, is to deploy the OpenTelemetry Collector
as a sidecar for the app that's being deployed. The Collector observes the app as a
[nozzle](https://docs.pivotal.io/tiledev/2-10/nozzle.html#nozzle) to
the [Loggregator Firehose](https://docs.cloudfoundry.org/loggregator/architecture.html),
one of the architectures Cloud Foundry uses to emit logs and metrics. As long as the Collector is running,
it observes every app and deployment that emits metrics and logs to the Loggregator Firehose.

## Installation
- Clone this repository.
- Change to this directory.
- Run the following command:
```sh
# Add buildpack for Splunk OpenTelemetry Collector
$ cf create-buildpack otel_collector_buildpack . 99 --enable
```
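
To confirm the buildpack was added, you can list the installed buildpacks and check that `otel_collector_buildpack` appears as enabled:
```sh
# List buildpacks known to the Cloud Foundry deployment
$ cf buildpacks | grep otel_collector_buildpack
```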

### Dependencies in Cloud Foundry Environment

- `wget`
- `jq`
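
These tools need to be available where `bin/supply` runs, since the script uses `wget` to download releases and `jq` to resolve `latest` version tags. As a rough spot check (assuming the running app's rootfs matches the staging environment), you can look for them from an already-pushed app:

```sh
# Hypothetical spot check; <app-name> is a placeholder for an existing app
$ cf ssh <app-name> -c "which wget jq"
```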

### Using PCF Buildpack With an Application
This section covers basic cf CLI (Cloud Foundry Command Line Interface) commands to use the buildpack.
```sh
# Basic setup, see the configuration section for more envvars that can be set
$ cf set-env <app-name> OTEL_CONFIG <config_file_name>
$ cf set-env <app-name> OTELCOL <desired_collector_executable_name>

$ cd <my-application-directory>/

# How to run an application without a manifest.yml file:
# Note: This will provide the Collector for the app, but the Collector will not be running.
$ cf push <app-name> -b otel_collector_buildpack -b <main_buildpack>

# How to run an application with a manifest.yml file:
# Option 1)
$ cf push
# Option 2)
$ cf push <app-name> -b otel_collector_buildpack -b <main_buildpack> -f manifest.yml
```

Note: This buildpack requires another buildpack to be supplied after it; it cannot be the last
buildpack for an app. The `manifest.yml` file also needs to provide the
command that runs the Splunk OpenTelemetry Collector as a sidecar for the application.

## Configuration

The following only applies if you are using the `otelconfig.yaml` config
provided by the buildpack. If you provide a custom configuration file for the Splunk OpenTelemetry Collector
in your application (and refer to it in the sidecar configuration), these environment variables might not
work unless you have preserved the references to them in the config file.
For proper functionality, the `OTEL_CONFIG` environment variable must point to
the configuration file, whether you use the default or a custom version.

Set the following environment variables with `cf set-env` as applicable to configure this buildpack, or
include them in the `manifest.yml` file, as shown in the [included example](#sidecar-configuration). A short
`cf set-env` sketch follows the lists below.

Required:
- `RLP_GATEWAY_ENDPOINT` - The URL of the RLP gateway that acts as the proxy for the firehose,
e.g. `https://log-stream.sys.TAS_ENVIRONMENT_NAME.cf-app.com`
- `UAA_ENDPOINT` - The URL of the UAA provider,
e.g. `https://uaa.sys.TAS_ENVIRONMENT_NAME.cf-app.com`
- `UAA_USERNAME` - Name of the UAA user.
- `UAA_PASSWORD` - Password for the UAA user.
- `SPLUNK_ACCESS_TOKEN` - Your Splunk organization access token.
- The Splunk Observability Suite requires an endpoint, so one of the two options below must be specified:
1) - `SPLUNK_INGEST_URL` - The ingest base URL for Splunk. This option takes precedence over SPLUNK_REALM.
- `SPLUNK_API_URL` - The API server base URL for Splunk. This option takes precedence over SPLUNK_REALM.
2) - `SPLUNK_REALM` - The Splunk realm in which your organization resides. Used to derive SPLUNK_INGEST_URL
and SPLUNK_API_URL.

Optional:
- `OS` - Operating system that Cloud Foundry is running on. Must match the format of the OpenTelemetry Collector executable name.
Default: `linux_amd64`. This is the only officially supported OS, so any changes to this variable are
at the user's own risk.
- `OTEL_CONFIG` - Local name of Splunk OpenTelemetry config file. Default: `otelconfig.yaml`
- `OTEL_VERSION` - Executable version of the Splunk OpenTelemetry Collector to use. The buildpack depends on features present in version
v0.48.0+. Default: `latest`. Example valid value: `v0.48.0`.
Note that if left at the default value, the latest version is resolved during staging, so later variable
references point to an actual version number rather than the word "latest".
- `OTEL_BINARY` - Splunk OpenTelemetry Collector executable file name. Default: `otelcol_$OS-v$OTEL_VERSION`
- `OTEL_BINARY_DOWNLOAD_URL` - URL to download the Splunk OpenTelemetry Collector from. This takes precedence over other
version variables. Only the Splunk distribution is supported.
Default: `https://github.com/signalfx/splunk-otel-collector/releases/download/${OTEL_VERSION}/otelcol_${OS}`
- `RLP_GATEWAY_SHARD_ID` - Metrics are load balanced between receivers that use the same shard ID.
Change this value only if multiple receivers must each receive all metrics instead of
having the metrics balanced between them. Default: `opentelemetry`
- `RLP_GATEWAY_TLS_INSECURE` - Whether to skip TLS verify for the RLP gateway endpoint. Default: `false`
- `UAA_TLS_INSECURE` - Whether to skip TLS verify for the UAA endpoint. Default: `false`
- `SMART_AGENT_VERSION` - Version of the Smart Agent bundle that should be downloaded. This is a dependency of
the Collector's `signalfx` receiver. Default: `latest`. Example valid value: `v5.19.1`.
Note that if left at the default value, the latest version is resolved during staging, so later variable
references point to an actual version number rather than the word "latest".
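
As a sketch, the required variables can also be set individually with `cf set-env` before pushing; the app name and values below are placeholders:

```sh
# Firehose / UAA access (placeholder values)
$ cf set-env <app-name> RLP_GATEWAY_ENDPOINT "https://log-stream.sys.TAS_ENVIRONMENT_NAME.cf-app.com"
$ cf set-env <app-name> UAA_ENDPOINT "https://uaa.sys.TAS_ENVIRONMENT_NAME.cf-app.com"
$ cf set-env <app-name> UAA_USERNAME "..."
$ cf set-env <app-name> UAA_PASSWORD "..."

# Splunk Observability access; SPLUNK_REALM derives the ingest and API URLs
$ cf set-env <app-name> SPLUNK_ACCESS_TOKEN "..."
$ cf set-env <app-name> SPLUNK_REALM "us0"

# Environment changes take effect on the next push or restage
$ cf restage <app-name>
```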

## Sidecar Configuration

The recommended method for running the Collector is to run it as a sidecar using
the Cloud Foundry [sidecar
functionality](https://docs.cloudfoundry.org/devguide/sidecars.html).
Additional information can be found [in the v3 API
docs](http://v3-apidocs.cloudfoundry.org/version/release-candidate/#sidecars).

Here is an example application `manifest.yml` file that would run the Collector as
a sidecar:

```yaml
---
applications:
- name: test-app
buildpacks:
- otel_collector_buildpack
- go_buildpack
instances: 1
memory: 256M
random-route: true
env:
RLP_GATEWAY_ENDPOINT: "https://log-stream.sys.TAS_ENVIRONMENT_NAME.cf-app.com"
UAA_ENDPOINT: "https://uaa.sys.TAS_ENVIRONMENT_NAME.cf-app.com"
UAA_USERNAME: "..."
UAA_PASSWORD: "..."
SPLUNK_ACCESS_TOKEN: "..."
SPLUNK_REALM: "..."
sidecars:
- name: otel-collector
process_types:
- web
command: "$HOME/.otelcollector/otelcol_${OS:-linux_amd64}-v${OTEL_VERSION:-0.48.0} --config=$HOME/.otelcollector/${OTEL_CONFIG:-otelconfig.yaml}"
memory: 100MB
```
If using a `manifest.yml` file, you can push your app with the following command:
```sh
# If you are using cf CLI v7
$ cf push
# If you are using cf CLI v6
$ cf v3-push <app-name>
```
This deploys the app with the proper buildpacks and runs the Splunk OpenTelemetry Collector in the
sidecar configuration.

## Troubleshooting

* If the app is running but the Splunk OpenTelemetry Collector is not, it may be that the sidecar configuration is not
being picked up properly from the manifest file. Try running the following commands:

```sh
# This will apply the manifest file to an existing app
$ cf v3-apply-manifest -f manifest.yml
# This will re-load the app with the sidecar configuration included
$ cf push
```

* Another possibility is that the sidecar was not allocated memory. The `memory` option
is required for a sidecar process to run. Once a memory allocation has been added to the sidecar,
re-run the commands above to apply the manifest and push the application again. Example YAML that will
fail (no `memory` specified):
```yaml
sidecars:
- name: otel-collector
process_types:
- web
command: "$HOME/.otelcollector/otelcol_${OS:-linux_amd64}-v${OTEL_VERSION:-0.48.0} --config=$HOME/.otelcollector/${OTEL_CONFIG:-otelconfig.yaml}"
```
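
For reference, a corrected definition differs only by the added `memory` allocation (mirroring the example manifest above):

```yaml
sidecars:
- name: otel-collector
  process_types:
  - web
  command: "$HOME/.otelcollector/otelcol_${OS:-linux_amd64}-v${OTEL_VERSION:-0.48.0} --config=$HOME/.otelcollector/${OTEL_CONFIG:-otelconfig.yaml}"
  memory: 100MB
```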

### Useful CF CLI debugging commands
```sh
$ cf apps # Checks status of app
$ cf logs <app-name> --recent # View the app's logs
$ cf env <app-name> # Show all environment variables for the app.
$ cf events <app-name> # View the app's events
```
4 changes: 4 additions & 0 deletions deployments/cloudfoundry/buildpack/bin/detect
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
#!/bin/sh

# Detect script: always report that this buildpack applies.
echo "OpenTelemetry Collector"
exit 0
5 changes: 5 additions & 0 deletions deployments/cloudfoundry/buildpack/bin/release
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
#!/bin/sh

# Emit an empty release configuration; the command that runs the Collector
# sidecar is provided by the app's manifest.yml rather than by this buildpack.
cat <<EOF
---
EOF
113 changes: 113 additions & 0 deletions deployments/cloudfoundry/buildpack/bin/supply
Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
#!/bin/bash

set -euo pipefail

BUILD_DIR=$1
CACHE_DIR=$2
DEPS_DIR=$3
IDX=$4

BUILDPACK_DIR="$(readlink -f ${BASH_SOURCE%/*})"
TARGET_DIR="$BUILD_DIR/.otelcollector"

OTEL_CONFIG="${OTEL_CONFIG:-otelconfig.yaml}"
# Version must be >=0.48.0, the first release that shipped with the Cloud Foundry receiver enabled
OTEL_VERSION="${OTEL_VERSION:-latest}"
# Cloud Foundry only supports Linux stacks currently, but if that changes this variable can be modified
# for the proper stack.
OS="${OS:-linux_amd64}"
OTEL_BASE_URL="https://github.com/signalfx/splunk-otel-collector/releases"

# Get proper version number if we're getting latest release
if [ $OTEL_VERSION = "latest" ] && [ -z "${OTEL_BINARY_DOWNLOAD_URL:-}" ]; then
OTEL_VERSION=$( wget -qO- --header="Accept: application/json" "$OTEL_BASE_URL/latest" | jq -r '.tag_name' )
if [ -z "$OTEL_VERSION" ]; then
echo "Failed to get tag_name for latest release from $OTEL_BASE_URL/latest" >&2
exit 1
fi
fi

echo "-----> Installing Otel Collector ${OTEL_VERSION}"
echo " BUILD_DIR: $BUILD_DIR"
echo " CACHE_DIR: $CACHE_DIR"
echo " DEPS_DIR: $DEPS_DIR"
echo " BUILDPACK_INDEX: $IDX"
echo " BUILDPACK_DIR: $BUILDPACK_DIR"
echo " TARGET_DIR: $TARGET_DIR"

mkdir -p $TARGET_DIR

# Copy over default OpenTelemetry configuration file for sidecar configuration use
cp "$BUILDPACK_DIR/../$OTEL_CONFIG" "$TARGET_DIR/"

# File name for given release version
OTEL_BINARY="${OTEL_BINARY:-otelcol_${OS}-${OTEL_VERSION}}"
CACHED_OTEL_BINARY="$CACHE_DIR/$OTEL_BINARY"

if [[ -f "$CACHED_OTEL_BINARY" ]]; then
echo "-----> Using cached Otel Collector install: $CACHED_OTEL_BINARY"
cp $CACHED_OTEL_BINARY $TARGET_DIR
else
OTEL_BINARY_DOWNLOAD_URL="${OTEL_BINARY_DOWNLOAD_URL:-${OTEL_BASE_URL}/download/${OTEL_VERSION}/otelcol_${OS}}"
echo "-----> Downloading OpenTelemetry Collector $OTEL_VERSION ($OTEL_BINARY_DOWNLOAD_URL)"
wget -nv -O "$TARGET_DIR/$OTEL_BINARY" $OTEL_BINARY_DOWNLOAD_URL
fi

# Cache the Collector
cp "$TARGET_DIR/$OTEL_BINARY" $CACHE_DIR

# Make the Collector executable and config accessible to the app
chmod 755 "$TARGET_DIR/$OTEL_BINARY" "$TARGET_DIR/$OTEL_CONFIG"

# Define Collector's default environment variables by adding bash script to profile.d directory in root of app
mkdir -p $DEPS_DIR/$IDX/profile.d
echo "export RLP_GATEWAY_SHARD_ID=\"opentelemetry\"
export RLP_GATEWAY_TLS_INSECURE=false
export UAA_TLS_INSECURE=false
export OTEL_VERSION=$OTEL_VERSION" >> $DEPS_DIR/$IDX/profile.d/otel_config_env_vars.sh

echo "-----> Successfully installed OpenTelemetry Collector $OTEL_VERSION and set default environment variables"

if [ $OS != linux_amd64 ]; then
echo "-----> OS doesn't support Smart Agent dependencies, signalfx receiver won't be supported."
exit 0
fi

# Need to run SignalFx patch interpreter for the SignalFx receiver to work properly.
# Get proper version number if we're getting latest release
SMART_AGENT_VERSION="${SMART_AGENT_VERSION:-latest}"
SMART_AGENT_BASE_URL="https://github.com/signalfx/signalfx-agent/releases"

if [ $SMART_AGENT_VERSION = "latest" ]; then
SMART_AGENT_VERSION=$( wget -qO- --header="Accept: application/json" "${SMART_AGENT_BASE_URL}/latest" | jq -r '.tag_name' )
if [ -z "$SMART_AGENT_VERSION" ]; then
echo "Failed to get tag_name for latest release from $SMART_AGENT_BASE_URL/latest" >&2
exit 1
fi
fi

SMART_AGENT=signalfx-agent-${SMART_AGENT_VERSION#v}.tar.gz
SMART_AGENT_DOWNLOAD_URL=$SMART_AGENT_BASE_URL/download/${SMART_AGENT_VERSION}/$SMART_AGENT
SA_TARGET_DIR="$BUILD_DIR/.signalfx"
mkdir -p $SA_TARGET_DIR

echo "-----> Downloading SignalFx Smart Agent $SMART_AGENT_VERSION ($SMART_AGENT_DOWNLOAD_URL)"
wget -nv -O "$SA_TARGET_DIR/$SMART_AGENT" $SMART_AGENT_DOWNLOAD_URL
tar -xf "$SA_TARGET_DIR/$SMART_AGENT" -C "$SA_TARGET_DIR"
mv "${SA_TARGET_DIR}/signalfx-agent" "${SA_TARGET_DIR}/agent-bundle"

# Absolute path of interpreter in smart agent dir is set in dependent binaries
# requiring the interpreter location not to change.
SPLUNK_BUNDLE_DIR="/home/vcap/app/.signalfx/agent-bundle"
SPLUNK_COLLECTD_DIR="${SPLUNK_BUNDLE_DIR}/run/collectd"

${SA_TARGET_DIR}/agent-bundle/bin/patch-interpreter ${SPLUNK_BUNDLE_DIR}
rm -f $SPLUNK_BUNDLE_DIR/bin/signalfx-agent \
$SPLUNK_BUNDLE_DIR/bin/agent-status \
$SA_TARGET_DIR/$SMART_AGENT;

echo "export SPLUNK_BUNDLE_DIR=${SPLUNK_BUNDLE_DIR}
export SPLUNK_COLLECTD_DIR=${SPLUNK_COLLECTD_DIR}
export PATH=${PATH}:${SPLUNK_BUNDLE_DIR}/bin" >> $DEPS_DIR/$IDX/profile.d/signalfx_agent_env_vars.sh

echo "-----> Successfully installed SignalFx Smart Agent bundle"
31 changes: 31 additions & 0 deletions deployments/cloudfoundry/buildpack/otelconfig.yaml
@@ -0,0 +1,31 @@
receivers:
cloudfoundry:
rlp_gateway:
endpoint: ${RLP_GATEWAY_ENDPOINT}
shard_id: ${RLP_GATEWAY_SHARD_ID}
tls:
insecure_skip_verify: ${RLP_GATEWAY_TLS_INSECURE}
uaa:
endpoint: ${UAA_ENDPOINT}
username: ${UAA_USERNAME}
password: ${UAA_PASSWORD}
tls:
insecure_skip_verify: ${UAA_TLS_INSECURE}

processors:
resourcedetection:
system:

exporters:
signalfx:
access_token: ${SPLUNK_ACCESS_TOKEN}
realm: ${SPLUNK_REALM}
api_url: ${SPLUNK_API_URL}
ingest_url: ${SPLUNK_INGEST_URL}

service:
pipelines:
metrics:
receivers: [cloudfoundry]
processors: [resourcedetection]
exporters: [signalfx]
