Skip to content

Commit

Permalink
feat: introduce "exit spans" and mark S3 spans as exit spans (#2552)
Browse files Browse the repository at this point in the history
Per https://github.com/elastic/apm/blob/main/specs/agents/tracing-spans.md#exit-spans
the result is that HTTP child spans of S3 spans are no longer captured.

Also:
- Add http context to the S3 spans, now that the http spans are gone some
  of this info might be helpful.
- Update TAV versions for aws-sdk to something more reasonable.

Closes: #2125
  • Loading branch information
trentm authored Jan 31, 2022
1 parent dbaddc5 commit 520a901
Show file tree
Hide file tree
Showing 13 changed files with 95 additions and 25 deletions.
4 changes: 2 additions & 2 deletions .tav.yml
Original file line number Diff line number Diff line change
Expand Up @@ -524,8 +524,8 @@ aws-sdk:
# Maintenance note: This should be updated periodically using:
# ./dev-utils/aws-sdk-tav-versions.sh
#
# Test v2.858.0, every N=31 of 159 releases, and current latest.
versions: '2.858.0 || 2.889.0 || 2.920.0 || 2.951.0 || 2.982.0 || 2.1013.0 || 2.1016.0 || >2.1016.0 <3'
# Test v2.858.0, every N=41 of 209 releases, and current latest.
versions: '2.858.0 || 2.899.0 || 2.940.0 || 2.981.0 || 2.1022.0 || 2.1063.0 || 2.1066.0 || >2.1066.0 <3'
commands:
- node test/instrumentation/modules/aws-sdk/aws4-retries.test.js
- node test/instrumentation/modules/aws-sdk/s3.test.js
Expand Down
5 changes: 5 additions & 0 deletions CHANGELOG.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,11 @@ Notes:
* Add `transaction.name` to captured APM errors. This will allow the Kibana APM
app to correlate error groups and transaction groups. ({issues}2456[#2456])
* Mark S3 spans (from 'aws-sdk' instrumentation) as exit spans (per
https://github.com/elastic/apm/blob/main/specs/agents/tracing-spans.md#exit-spans).
The result is that HTTP child spans of S3 spans are no longer captured.
({issues}2125[#2125])
[float]
===== Bug fixes
Expand Down
6 changes: 6 additions & 0 deletions docs/agent-api.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -643,6 +643,12 @@ Sub-millisecond precision can be achieved using decimals.
If not provided,
the current time will be used

** `exitSpan` +{type-boolean}+ Make an "exit span".
Exit spans represent outgoing communication. They are used to create a node
in the {kibana-ref}/service-maps.html[Service Map] and a downstream service
in the {kibana-ref}/dependencies.html[Dependencies Table]. The provided subtype
will be used as the downstream service name.

Start and return a new custom span associated with the current active transaction.
This is the same as getting the current transaction with `apm.currentTransaction` and,
if a transaction was found,
Expand Down
6 changes: 6 additions & 0 deletions docs/transaction-api.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,12 @@ Sub-millisecond precision can be achieved using decimals.
If not provided,
the current time will be used

** `exitSpan` +{type-boolean}+ Make an "exit span".
Exit spans represent outgoing communication. They are used to create a node
in the {kibana-ref}/service-maps.html[Service Map] and a downstream service
in the {kibana-ref}/dependencies.html[Dependencies Table]. The provided subtype
will be used as the downstream service name.

Start and return a new custom span associated with this transaction.
When a span is started it will measure the time until <<span-end,`span.end()`>> is called.

Expand Down
1 change: 1 addition & 0 deletions index.d.ts
Original file line number Diff line number Diff line change
Expand Up @@ -347,6 +347,7 @@ declare namespace apm {
export interface SpanOptions {
startTime?: number;
childOf?: Transaction | Span | string;
exitSpan?: boolean;
}

type CaptureBody = 'off' | 'errors' | 'transactions' | 'all';
Expand Down
2 changes: 1 addition & 1 deletion lib/agent.js
Original file line number Diff line number Diff line change
Expand Up @@ -171,7 +171,7 @@ Agent.prototype.setTransactionOutcome = function (outcome) {
return this._instrumentation.setTransactionOutcome.apply(this._instrumentation, arguments)
}

Agent.prototype.startSpan = function (name, type, subtype, action, { childOf } = {}) {
Agent.prototype.startSpan = function (name, type, subtype, action, { startTime, childOf, exitSpan } = {}) {
return this._instrumentation.startSpan.apply(this._instrumentation, arguments)
}

Expand Down
12 changes: 6 additions & 6 deletions lib/instrumentation/http-shared.js
Original file line number Diff line number Diff line change
Expand Up @@ -146,12 +146,12 @@ exports.traceOutgoingRequest = function (agent, moduleName, method) {

if (!options.headers) options.headers = {}

// Attempt to use the span context as a traceparent header.
// If the transaction is unsampled the span will not exist,
// however a traceparent header must still be propagated
// to indicate requested services should not be sampled.
// Use the transaction context as the parent, in this case.
var parent = span || ins.currTransaction()
// W3C trace-context propagation.
// There are a number of reasons why `span` might be null: child of an
// exit span, `transactionMaxSpans` was hit, unsampled transaction, etc.
// If so, then fallback to the current run context's span or transaction,
// if any.
var parent = span || parentRunContext.currSpan() || parentRunContext.currTransaction()
if (parent && parent._context) {
const headerValue = parent._context.toTraceParentString()
const traceStateValue = parent._context.toTraceStateString()
Expand Down
2 changes: 1 addition & 1 deletion lib/instrumentation/index.js
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ Instrumentation.prototype.currTransaction = function () {
if (!this._started) {
return null
}
return this._runCtxMgr.active().currTransaction() || null
return this._runCtxMgr.active().currTransaction()
}

Instrumentation.prototype.currSpan = function () {
Expand Down
30 changes: 26 additions & 4 deletions lib/instrumentation/modules/aws-sdk/s3.js
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,7 @@ function instrumentationS3 (orig, origArguments, request, AWS, agent, { version,

const ins = agent._instrumentation

const span = ins.createSpan(name, TYPE, SUBTYPE, opName)
const span = ins.createSpan(name, TYPE, SUBTYPE, opName, { exitSpan: true })
if (!span) {
return orig.apply(request, origArguments)
}
Expand All @@ -69,12 +69,13 @@ function instrumentationS3 (orig, origArguments, request, AWS, agent, { version,
// This can be asynchronously determined -- e.g. if it differs from the
// configured service endpoint region -- so this won't be set until
// 'complete'.
const region = request.httpRequest && request.httpRequest.region
const httpRequest = request.httpRequest
const region = httpRequest && httpRequest.region

// Destination context.
// '.httpRequest.endpoint' might differ from '.service.endpoint' if
// the bucket is in a different region.
const endpoint = request.httpRequest && request.httpRequest.endpoint
const endpoint = httpRequest && httpRequest.endpoint
const destContext = {
service: {
name: SUBTYPE,
Expand All @@ -94,13 +95,34 @@ function instrumentationS3 (orig, origArguments, request, AWS, agent, { version,
span.setDestinationContext(destContext)

if (response) {
// https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/HttpResponse.html
const httpResponse = response.httpResponse
let statusCode
if (httpResponse) {
statusCode = httpResponse.statusCode

// Set HTTP context. Some context not being set, though it is available:
// - method: Not that helpful.
// - url: Mostly redundant with context.destination.address.
// - response.headers: A lot of added size for uncertain utility. The
// inclusion of Amazon's request ID headers might be worth it.
const httpContext = {
status_code: statusCode
}
const encodedBodySize = Buffer.isBuffer(httpResponse.body) && httpResponse.body.byteLength
if (encodedBodySize) {
// I'm not actually sure if this might be decoded_body_size.
httpContext.response = { encoded_body_size: encodedBodySize }
}
span.setHttpContext(httpContext)
}

// Follow the spec for HTTP client span outcome.
// https://github.com/elastic/apm/blob/main/specs/agents/tracing-instrumentation-http.md#outcome
//
// For example, a S3 GetObject conditional request (e.g. using the
// IfNoneMatch param) will respond with response.error=NotModifed and
// statusCode=304. This is a *successful* outcome.
const statusCode = response.httpResponse && response.httpResponse.statusCode
if (statusCode) {
span._setOutcomeFromHttpStatusCode(statusCode)
} else {
Expand Down
5 changes: 4 additions & 1 deletion lib/instrumentation/span.js
Original file line number Diff line number Diff line change
Expand Up @@ -17,18 +17,21 @@ module.exports = Span
util.inherits(Span, GenericSpan)

function Span (transaction, name, ...args) {
const defaultChildOf = transaction._agent._instrumentation.currSpan() || transaction
const opts = typeof args[args.length - 1] === 'object'
? (args.pop() || {})
: {}

if (!opts.childOf) {
const defaultChildOf = transaction._agent._instrumentation.currSpan() || transaction
opts.childOf = defaultChildOf
opts.timer = defaultChildOf._timer
} else if (opts.childOf._timer) {
opts.timer = opts.childOf._timer
}

this._exitSpan = !!opts.exitSpan
delete opts.exitSpan

GenericSpan.call(this, transaction._agent, ...args, opts)

this._db = null
Expand Down
23 changes: 19 additions & 4 deletions lib/instrumentation/transaction.js
Original file line number Diff line number Diff line change
Expand Up @@ -116,8 +116,8 @@ Transaction.prototype.setCloudContext = function (cloudContext) {
}

// Create a span on this transaction and make it the current span.
Transaction.prototype.startSpan = function (...spanArgs) {
const span = this.createSpan(...spanArgs)
Transaction.prototype.startSpan = function (...args) {
const span = this.createSpan(...args)
if (span) {
this._agent._instrumentation.supersedeWithSpanRunContext(span)
}
Expand All @@ -129,7 +129,7 @@ Transaction.prototype.startSpan = function (...spanArgs) {
// This does *not* replace the current run context to make this span the
// "current" one. This allows instrumentations to avoid impacting the run
// context of the calling code. Compare to `startSpan`.
Transaction.prototype.createSpan = function (...spanArgs) {
Transaction.prototype.createSpan = function (...args) {
if (!this.sampled) {
return null
}
Expand All @@ -142,8 +142,23 @@ Transaction.prototype.createSpan = function (...spanArgs) {
return null
}

// Exit spans must not have child spans. The spec allows child spans "that
// have the same type and subtype", but this agent currently has no use for
// this case.
// https://github.com/elastic/apm/blob/master/specs/agents/tracing-spans.md#child-spans-of-exit-spans
const opts = typeof args[args.length - 1] === 'object'
? (args.pop() || {})
: {}
opts.childOf = opts.childOf || this._agent._instrumentation.currSpan() || this
const childOf = opts.childOf
if (childOf instanceof Span && childOf._exitSpan) {
this._agent.logger.trace({ exitSpanId: childOf.id, newSpanArgs: args },
'createSpan: drop child span of exit span')
return null
}

this._builtSpans++
return new Span(this, ...spanArgs)
return new Span(this, ...args, opts)
}

// Note that this only returns a complete result when called *during* the call
Expand Down
18 changes: 12 additions & 6 deletions test/instrumentation/modules/aws-sdk/s3.test.js
Original file line number Diff line number Diff line change
Expand Up @@ -61,14 +61,9 @@ tape.test('simple S3 usage scenario', function (t) {
const tx = events.shift().transaction
const errors = events.filter(e => e.error).map(e => e.error)

// Currently HTTP spans under each S3 span are included. Eventually
// those will be excluded. Filter those out for now.
// https://github.com/elastic/apm-agent-nodejs/issues/2125
// Compare some common fields across all spans.
const spans = events.filter(e => e.span)
.map(e => e.span)
.filter(e => e.subtype !== 'http')

// Compare some common fields across all spans.
t.equal(spans.filter(s => s.trace_id === tx.trace_id).length,
spans.length, 'all spans have the same trace_id')
t.equal(spans.filter(s => s.transaction_id === tx.id).length,
Expand Down Expand Up @@ -98,6 +93,7 @@ tape.test('simple S3 usage scenario', function (t) {
subtype: 's3',
action: 'ListBuckets',
context: {
http: { status_code: 200, response: { encoded_body_size: 205 } },
destination: {
address: LOCALSTACK_HOST,
port: 4566,
Expand All @@ -114,6 +110,7 @@ tape.test('simple S3 usage scenario', function (t) {
subtype: 's3',
action: 'CreateBucket',
context: {
http: { status_code: 200, response: { encoded_body_size: 177 } },
destination: {
address: LOCALSTACK_HOST,
port: 4566,
Expand All @@ -134,6 +131,7 @@ tape.test('simple S3 usage scenario', function (t) {
subtype: 's3',
action: 'HeadBucket',
context: {
http: { status_code: 200 },
destination: {
address: LOCALSTACK_HOST,
port: 4566,
Expand All @@ -154,6 +152,7 @@ tape.test('simple S3 usage scenario', function (t) {
subtype: 's3',
action: 'PutObject',
context: {
http: { status_code: 200 },
destination: {
address: LOCALSTACK_HOST,
port: 4566,
Expand All @@ -174,6 +173,7 @@ tape.test('simple S3 usage scenario', function (t) {
subtype: 's3',
action: 'HeadObject',
context: {
http: { status_code: 200 },
destination: {
address: LOCALSTACK_HOST,
port: 4566,
Expand All @@ -194,6 +194,7 @@ tape.test('simple S3 usage scenario', function (t) {
subtype: 's3',
action: 'GetObject',
context: {
http: { status_code: 200, response: { encoded_body_size: 8 } },
destination: {
address: LOCALSTACK_HOST,
port: 4566,
Expand All @@ -214,6 +215,7 @@ tape.test('simple S3 usage scenario', function (t) {
subtype: 's3',
action: 'GetObject',
context: {
http: { status_code: 304 },
destination: {
address: LOCALSTACK_HOST,
port: 4566,
Expand All @@ -234,6 +236,7 @@ tape.test('simple S3 usage scenario', function (t) {
subtype: 's3',
action: 'GetObject',
context: {
http: { status_code: 200, response: { encoded_body_size: 8 } },
destination: {
address: LOCALSTACK_HOST,
port: 4566,
Expand All @@ -255,6 +258,7 @@ tape.test('simple S3 usage scenario', function (t) {
subtype: 's3',
action: 'GetObject',
context: {
http: { status_code: 404, response: { encoded_body_size: 207 } },
destination: {
address: LOCALSTACK_HOST,
port: 4566,
Expand All @@ -279,6 +283,7 @@ tape.test('simple S3 usage scenario', function (t) {
subtype: 's3',
action: 'DeleteObject',
context: {
http: { status_code: 204 },
destination: {
address: LOCALSTACK_HOST,
port: 4566,
Expand All @@ -299,6 +304,7 @@ tape.test('simple S3 usage scenario', function (t) {
subtype: 's3',
action: 'DeleteBucket',
context: {
http: { status_code: 204 },
destination: {
address: LOCALSTACK_HOST,
port: 4566,
Expand Down
6 changes: 6 additions & 0 deletions test/types/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -106,6 +106,7 @@ apm.startSpan('foo', 'type', { childOf: 'baz' })
apm.startSpan('foo', 'type', 'subtype', { childOf: 'baz' })
apm.startSpan('foo', 'type', 'subtype', 'action', { childOf: 'baz' })
apm.startSpan('foo', 'type', 'subtype', 'action', { startTime: 42 })
apm.startSpan('foo', 'type', 'subtype', { exitSpan: true })

apm.setLabel('foo', 'bar')
apm.setLabel('foo', 1)
Expand Down Expand Up @@ -180,10 +181,13 @@ apm.logger.fatal('')
trans.startSpan('foo', 'type', { childOf: 'baz' })
trans.startSpan('foo', 'type', 'subtype', { childOf: 'baz' })
trans.startSpan('foo', 'type', 'subtype', 'action', { childOf: 'baz' })
trans.startSpan('foo', 'type', 'subtype', { exitSpan: true })

function ensureParentId (id: string) {}
ensureParentId(trans.ensureParentId())

trans.setOutcome('failure')

trans.end()
trans.end('foo')
trans.end('foo', 42)
Expand All @@ -205,6 +209,8 @@ apm.logger.fatal('')

span.addLabels({ s: 'bar', n: 42, b: false })

span.setOutcome('failure')

span.end()
span.end(42)
}
Expand Down

0 comments on commit 520a901

Please sign in to comment.