Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Test for the health of all code base URLs #2127

Merged
merged 4 commits into from
Jul 11, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .github/workflows/ci_cron.yml
Original file line number Diff line number Diff line change
Expand Up @@ -117,6 +117,7 @@ jobs:
env:
VERBOSE_TEST_OUTPUT: true
DB_PORT: ${{ job.services.mysql.ports[3306] }}
CI_CRON: true


multiverse:
Expand Down
2 changes: 2 additions & 0 deletions .rubocop.yml
Original file line number Diff line number Diff line change
Expand Up @@ -1248,6 +1248,7 @@ Style/MethodCallWithArgsParentheses:
AllowedMethods:
- add_dependency
- add_development_dependency
- catch
- expect
- fail
- gem
Expand All @@ -1261,6 +1262,7 @@ Style/MethodCallWithArgsParentheses:
- source
- stub
- stub_const
- throw
- use
AllowedPatterns: [^assert, ^refute]

Expand Down
2 changes: 1 addition & 1 deletion lib/new_relic/agent/configuration/environment_source.rb
Original file line number Diff line number Diff line change
Expand Up @@ -101,7 +101,7 @@ def set_key_by_type(config_key, environment_key)
end
else
::NewRelic::Agent.logger.info("#{environment_key} does not have a corresponding configuration setting (#{config_key} does not exist).")
::NewRelic::Agent.logger.info('Run `rake newrelic:config:docs` or visit https://newrelic.com/docs/ruby/ruby-agent-configuration to see a list of available configuration settings.')
::NewRelic::Agent.logger.info('Run `rake newrelic:config:docs` or visit https://docs.newrelic.com/docs/apm/agents/ruby-agent/configuration/ruby-agent-configuration to see a list of available configuration settings.')
self[config_key] = value
end
end
Expand Down
2 changes: 1 addition & 1 deletion lib/new_relic/agent/instrumentation/memcache.rb
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
# each with slightly different APIs and semantics.
# See:
# http://www.deveiate.org/code/Ruby-MemCache/ (Gem: Ruby-MemCache)
# http://seattlerb.rubyforge.org/memcache-client/ (Gem: memcache-client)
# https://github.com/mperham/memcache-client (Gem: memcache-client)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't know he wrote memcache-client!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd be interested in knowing more about the history there. I think that repo is a fork of another fork of the original RubyForge code (which Mike also has commit access to). For whatever reason, it's what's linked to from the Rubygems.org page's "Homepage" link, so I'm using it here.

# https://github.com/mperham/dalli (Gem: dalli)

require_relative 'memcache/helper'
Expand Down
2 changes: 1 addition & 1 deletion lib/new_relic/agent/instrumentation/queue_time.rb
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
module NewRelic
module Agent
module Instrumentation
# https://newrelic.com/docs/features/tracking-front-end-time
# https://docs.newrelic.com/docs/features/tracking-front-end-time
# Record queue time metrics based on any of three headers
# which can be set on the request.
module QueueTime
Expand Down
2 changes: 1 addition & 1 deletion lib/new_relic/agent/instrumentation/sequel.rb
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ def supported_sequel_version?
else
NewRelic::Agent.logger.info('Detected Sequel version %s.' % [Sequel::VERSION])
NewRelic::Agent.logger.info('Please see additional documentation: ' +
'https://newrelic.com/docs/ruby/sequel-instrumentation')
'https://docs.newrelic.com/docs/apm/agents/ruby-agent/frameworks/sequel-instrumentation/')
end

Sequel.synchronize { Sequel::DATABASES.dup }.each do |db|
Expand Down
2 changes: 1 addition & 1 deletion lib/new_relic/agent/pipe_service.rb
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ def initialize(channel_id)
if @pipe && @pipe.parent_pid != $$
@pipe.after_fork_in_child
else
NewRelic::Agent.logger.error('No communication channel to parent process, please see https://newrelic.com/docs/ruby/resque-instrumentation for more information.')
NewRelic::Agent.logger.error('No communication channel to parent process, please see https://docs.newrelic.com/docs/apm/agents/ruby-agent/background-jobs/resque-instrumentation/ for more information.')
end
end

Expand Down
26 changes: 26 additions & 0 deletions test/helpers/misc.rb
Original file line number Diff line number Diff line change
Expand Up @@ -124,3 +124,29 @@ def skip_unless_minitest5_or_above

skip 'This test requires MiniTest v5+'
end

def skip_unless_ci_cron
return if ENV['CI_CRON']

skip 'This test only runs as part of the CI cron workflow'
end
kaylareopelle marked this conversation as resolved.
Show resolved Hide resolved

def agent_root
@agent_root ||= File.expand_path('../../..', __FILE__).freeze
end

def newest_ruby
@newest_ruby ||= begin
hash = YAML.load_file(File.join(agent_root, '.github/workflows/ci_cron.yml'))
version_string = hash['jobs']['unit_tests']['strategy']['matrix']['ruby-version'].sort do |a, b|
Gem::Version.new(a) <=> Gem::Version.new(b)
end.last
Gem::Version.new(version_string)
end
end

def skip_unless_newest_ruby
return if Gem::Version.new(RUBY_VERSION) >= newest_ruby

skip 'This test only runs on the latest CI cron Ruby version'
end
138 changes: 138 additions & 0 deletions test/new_relic/healthy_urls_test.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,138 @@
# This file is distributed under New Relic's license terms.
# See https://github.com/newrelic/newrelic-ruby-agent/blob/main/LICENSE for complete details.
# frozen_string_literal: true

# The unit test class defined herein will verify the health of all URLs found
# in the project source code.
#
# To run the URL health tests by themselves:
# TEST=test/new_relic/healthy_urls_test bundle exec rake test
#
# A file will be scanned for URLs if the file's basename is found in the
# FILENAMES array OR the file's extension is found in the EXTENSIONS array
# unless the file's absolute path matches the IGNORED_FILE_PATTERN regex.
#
# NOTE that CHANGELOG.md is handled with special logic so that only the most
# recent 2 versions mentioned in the changelog are scannned for URLs.
#
# See TIMEOUT for the number of seconds permitted for a GET request to a given
# URL to be completed.
#
# Enable DEBUG for additional verbosity

require_relative '../test_helper'

class HealthyUrlsTest < Minitest::Test
ROOT = File.expand_path('../../..', __FILE__).freeze
FILENAMES = %w[
baselines
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I'm familiar with this file. Where is it and what does it do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's at test/performance/script/baselines. I haven't checked, but possibly it's used to power the perf tests' setting of a baseline from which a subsequent set of results is compared to.

Brewfile
Capfile
Dockerfile
Envfile
Gemfile
Guardfile
install_mysql55
LICENSE
mega-runner
newrelic
newrelic_cmd
nrdebug
Comment on lines +38 to +40
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, are these executables as well as files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, these ones are really just Ruby scripts so they're eligible for having URLs in the source code. For now I went with an allowlist approach of filenames and extensions for specifying what to scan for URLs. In the future we may want to reverse this and have a blocklist of things like image files or perhaps even take a quick peek at the file's encoding and make a judgment call from that.

Rakefile
run_tests
runner
Thorfile
].freeze
EXTENSIONS = %w[
css
erb
gemspec
haml
html
js
json
md
proto
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make changes to the proto files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Not only can we, but we have, and with a URL in fact! There is currently only 1 file at infinite_tracing/lib/proto/infinite_tracing.proto. That file has a URL in the source code comments at the top.

rake
readme
rb
sh
txt
thor
tt
yml
].freeze
FILE_PATTERN = /(?:^(?:#{FILENAMES.join('|')})$)|\.(?:#{EXTENSIONS.join('|')})$/.freeze
IGNORED_FILE_PATTERN = %r{/(?:coverage|test)/}.freeze
URL_PATTERN = %r{(https?://.*?)[^a-zA-Z0-9/\.\-_#]}.freeze
IGNORED_URL_PATTERN = %r{(?:\{|\(|\$|169\.254|\.\.\.|metadata\.google)}
TIMEOUT = 5
DEBUG = false

def test_all_urls
skip_unless_ci_cron
skip_unless_newest_ruby
load_httparty

urls = gather_urls
errors = urls.each_with_object({}) do |(url, _files), hash|
error = verify_url(url)
hash[url] = error if error
end

msg = "#{errors.keys.size} URLs were unreachable!\n\n"
msg += errors.map { |url, error| " #{url} - #{error}\n files: #{urls[url].join(',')}" }.join("\n")

assert_empty errors, msg
end

private

def load_httparty
require 'httparty'
rescue
skip 'Skipping URL health tests in this context, as HTTParty is not available'
end

def real_url?(url)
return false if url.match?(IGNORED_URL_PATTERN)

true
end

def gather_urls
Dir.glob(File.join(ROOT, '**', '*')).each_with_object({}) do |file, urls|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Can default the urls keys to be an empty array, ready to be << onto!

Example from docs: https://rubyapi.org/3.2/o/hash
h = Hash.new { |hash, key| "Default value for #{key}" }

Change we can do:
each_with_object(Hash.new { |h, k| h[k] = [] }).

Which then would mean that urls[url] ||= [] could be omitted in line 155

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @olleolleolle! Implemented with #2132.

Using Hash.new([]) in conjunction with += [] from Ulysee's blog post looks pretty slick too, but the Ruby docs example approach is more readable and predictable in its behavior, so we'll go with it.

LETTERS = ('a'..'z').to_a

hash1 = LETTERS.each_with_object(Hash.new { |h, k| h[k] = [] }) { |l, h| h[l] << true }
hash2 = LETTERS.each_with_object(Hash.new([])) { |l, h| h[l] += [true] }
hash1 == hash2

next unless File.file?(file) && File.basename(file).match?(FILE_PATTERN) && !file.match?(IGNORED_FILE_PATTERN)

changelog_entries_seen = 0
File.open(file).each do |line|
changelog_entries_seen += 1 if File.basename(file).eql?('CHANGELOG.md') && line.start_with?('##')
break if changelog_entries_seen > 2
next unless line =~ URL_PATTERN

url = Regexp.last_match(1).sub(%r{(?:/|\.)$}, '')
if real_url?(url)
urls[url] ||= []
urls[url] << file
end
end
end
end

def verify_url(url)
puts "Testing '#{url}'..." if DEBUG
res = HTTParty.get(url, timeout: TIMEOUT)
if res.success?
puts ' OK.' if DEBUG
return
end

msg = "HTTP #{res.code}: #{res.message}"
puts " FAILED. #{msg}" if DEBUG
msg
rescue StandardError => e
msg = "#{e.class}: #{e.message}"
puts " FAILED. #{msg}" if DEBUG
msg
end
end