Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialize model callbacks on boot #37

Merged
merged 6 commits into from
Aug 16, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 8 additions & 10 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -342,16 +342,14 @@ end

##### Models

```ruby
class ApplicationRecord < ActiveRecord::Base
include DfE::Analytics::Entities
end
```

If everything has worked, you should see jobs flowing into your queues on each
web request and model update. While you’re setting things up consider setting
the config options `async: false` and `log_only: true` to take ActiveJob and
BigQuery (respectively) out of the loop.
All models in your app will automatically send callbacks if their tables are
listed in `analytics.yml`. This is a change from versions < v1.4 where it was
necessary to manually mix in `DfE::Analytics::Entities`. This did not support
sending events on `has_and_belongs_to_many` tables.

While you’re setting things up consider setting the config options `async:
false` and `log_only: true` to take ActiveJob and BigQuery (respectively) out
of the loop.

### Adding specs

Expand Down
30 changes: 28 additions & 2 deletions lib/dfe/analytics.rb
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,8 @@

module DfE
module Analytics
class ConfigurationError < StandardError; end

def self.events_client
@events_client ||= begin
require 'google/cloud/bigquery'
Expand All @@ -28,7 +30,7 @@ def self.events_client
bigquery_api_json_key
].select { |val| config.send(val).nil? }

raise "DfE::Analytics: missing required config values: #{missing_config.join(', ')}" if missing_config.any?
raise(ConfigurationError, "DfE::Analytics: missing required config values: #{missing_config.join(', ')}") if missing_config.any?

Google::Cloud::Bigquery.new(
project: config.bigquery_project_id,
Expand Down Expand Up @@ -74,6 +76,19 @@ def self.configure
config.queue ||= :default
end

def self.initialize!
DfE::Analytics::Fields.check!

entities_for_analytics.each do |entity|
model = model_for_entity(entity)
if model.include?(DfE::Analytics::Entities) && !@shown_deprecation_warning
Rails.logger.info("DEPRECATION WARNING: DfE::Analytics::Entities was manually included in a model (#{model.name}), but it's included automatically since v1.4. You're running v#{DfE::Analytics::VERSION}. To silence this warning, remove the include from model definitions in app/models.")
else
model.include(DfE::Analytics::Entities)
end
end
end

def self.enabled?
config.enable_analytics.call
end
Expand Down Expand Up @@ -103,7 +118,7 @@ def self.async?
end

def self.entities_for_analytics
allowlist.keys & all_entities_in_application
allowlist.keys
end

def self.all_entities_in_application
Expand Down Expand Up @@ -140,6 +155,17 @@ def self.entity_model_mapping
# these back to table_names which are equivalent to dfe-analytics
# "entities".
@entity_model_mapping ||= begin
# Gems like devise put helper methods into controllers, and they add
# those methods via the routes file.
#
# Rails.configuration.eager_load = true, which is enabled by default in
# production and not in development, will cause routes to be loaded
# before controllers; a direct call to Rails.application.eager_load! will
# not. To avoid this specific conflict with devise and possibly other
# gems/engines, proactively load the routes unless
# configuration.eager_load is set.
Rails.application.reload_routes! unless Rails.configuration.eager_load
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we only need to do this in development env? Or do we not need to check that because in non-dev the routes will be loaded at this point anyway? I don't quite remember what we said, but might be worth explaining why this is ok to do in production if there's anything interesting to say.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will beef up this comment to explain what default configs there are in prod/dev


Rails.application.eager_load!

ActiveRecord::Base.descendants
Expand Down
47 changes: 47 additions & 0 deletions lib/dfe/analytics/fields.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,53 @@ module Analytics
# Tools to check and update configuration for model fields sent via
# DfE::Analytics
module Fields
def self.check!
errors = []

if surplus_fields.any?
errors << <<~HEREDOC
Database field removed! Please remove it from analytics.yml and then run
bundle exec rails dfe:analytics:regenerate_blocklist
Removed fields:
#{surplus_fields.to_yaml}
HEREDOC
end

if unlisted_fields.any?
errors << <<~HEREDOC
New database field detected! You need to decide whether or not to send it
to BigQuery. To send, add it to config/analytics.yml. To ignore, run:
bundle exec rails dfe:analytics:regenerate_blocklist
New fields:
#{unlisted_fields.to_yaml}
HEREDOC
end

if conflicting_fields.any?
errors << <<~HEREDOC
Conflict detected between analytics.yml and analytics_blocklist.yml!
The following fields exist in both files. To remove from the blocklist, run:
bundle exec rails dfe:analytics:regenerate_blocklist
Conflicting fields:
#{conflicting_fields.to_yaml}
HEREDOC
end

configuration_errors = errors.join("\n\n----------------\n\n")

raise(ConfigurationError, configuration_errors) if errors.any?
end

def self.blocklist
DfE::Analytics.blocklist
end
Expand Down
6 changes: 6 additions & 0 deletions lib/dfe/analytics/railtie.rb
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,12 @@ class Railtie < Rails::Railtie
app.config.middleware.use DfE::Analytics::Middleware::RequestIdentity
end

config.after_initialize do
# internal gem tests will sometimes suppress this so they can test the
# init process
DfE::Analytics.initialize! unless ENV['SUPPRESS_DFE_ANALYTICS_INIT']
end

rake_tasks do
path = File.expand_path(__dir__)
Dir.glob("#{path}/tasks/**/*.rake").each { |f| load f }
Expand Down
45 changes: 1 addition & 44 deletions lib/dfe/analytics/tasks/fields.rake
Original file line number Diff line number Diff line change
Expand Up @@ -2,50 +2,7 @@ namespace :dfe do
namespace :analytics do
desc 'Generate a new field blocklist containing all the fields not listed for sending to Bigquery'
task check: :environment do
surplus_fields = DfE::Analytics::Fields.surplus_fields
unlisted_fields = DfE::Analytics::Fields.unlisted_fields
conflicting_fields = DfE::Analytics::Fields.conflicting_fields

surplus_fields_failure_message = <<~HEREDOC
Database field removed! Please remove it from analytics.yml and then run
bundle exec rails dfe:analytics:regenerate_blocklist
Removed fields:
#{surplus_fields.to_yaml}
HEREDOC

unlisted_fields_failure_message = <<~HEREDOC
New database field detected! You need to decide whether or not to send it
to BigQuery. To send, add it to config/analytics.yml. To ignore, run:
bundle exec rails dfe:analytics:regenerate_blocklist
New fields:
#{unlisted_fields.to_yaml}
HEREDOC

conflicting_fields_failure_message = <<~HEREDOC
Conflict detected between analytics.yml and analytics_blocklist.yml!
The following fields exist in both files. To remove from the blocklist, run:
bundle exec rails dfe:analytics:regenerate_blocklist
Conflicting fields:
#{conflicting_fields.to_yaml}
HEREDOC

puts unlisted_fields_failure_message if unlisted_fields.any?

puts surplus_fields_failure_message if surplus_fields.any?

puts conflicting_fields_failure_message if conflicting_fields.any?

raise if surplus_fields.any? || unlisted_fields.any? || conflicting_fields.any?
DfE::Analytics::Fields.check!
end

desc 'Generate a new field blocklist containing all the fields not listed for sending to Bigquery'
Expand Down
21 changes: 12 additions & 9 deletions spec/dfe/analytics/entities_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -10,8 +10,6 @@
t.string :last_name
t.string :first_name
end

model { include DfE::Analytics::Entities }
end

before do
Expand All @@ -25,11 +23,16 @@
allow(DfE::Analytics).to receive(:allowlist_pii).and_return({
Candidate.table_name.to_sym => pii_fields
})

# autogenerate a compliant blocklist
allow(DfE::Analytics).to receive(:blocklist).and_return(DfE::Analytics::Fields.generate_blocklist)

DfE::Analytics.initialize!
end

describe 'create_entity events' do
context 'when fields are specified in the analytics file' do
let(:interesting_fields) { [:id] }
let(:interesting_fields) { ['id'] }

it 'includes attributes specified in the settings file' do
Candidate.create(id: 123)
Expand Down Expand Up @@ -81,8 +84,8 @@
end

context 'and the specified fields are listed as PII' do
let(:interesting_fields) { [:email_address] }
let(:pii_fields) { [:email_address] }
let(:interesting_fields) { ['email_address'] }
let(:pii_fields) { ['email_address'] }

it 'hashes those fields' do
Candidate.create(email_address: 'adrienne@example.com')
Expand All @@ -98,8 +101,8 @@
end

context 'and other fields are listed as PII' do
let(:interesting_fields) { [:id] }
let(:pii_fields) { [:email_address] }
let(:interesting_fields) { ['id'] }
let(:pii_fields) { ['email_address'] }

it 'does not include the fields only listed as PII' do
Candidate.create(id: 123, email_address: 'adrienne@example.com')
Expand Down Expand Up @@ -128,7 +131,7 @@

describe 'update_entity events' do
context 'when fields are specified in the analytics file' do
let(:interesting_fields) { %i[email_address first_name] }
let(:interesting_fields) { %w[email_address first_name] }

it 'sends update events for fields we care about' do
entity = Candidate.create(email_address: 'foo@bar.com', first_name: 'Jason')
Expand Down Expand Up @@ -184,7 +187,7 @@
end

describe 'delete_entity events' do
let(:interesting_fields) { [:email_address] }
let(:interesting_fields) { ['email_address'] }

it 'sends events when objects are deleted' do
entity = Candidate.create(email_address: 'boo@example.com')
Expand Down
18 changes: 18 additions & 0 deletions spec/dfe/analytics/fields_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,12 @@
expect(fields).not_to include('email_address')
expect(fields).not_to include('id')
end

describe '.check!' do
it 'raises an error' do
expect { DfE::Analytics::Fields.check! }.to raise_error(DfE::Analytics::ConfigurationError, /New database field detected/)
end
end
end

describe '.conflicting_fields' do
Expand All @@ -47,6 +53,12 @@
conflicts = described_class.conflicting_fields
expect(conflicts[Candidate.table_name.to_sym]).to eq(%w[email_address first_name])
end

describe '.check!' do
it 'raises an error' do
expect { DfE::Analytics::Fields.check! }.to raise_error(DfE::Analytics::ConfigurationError, /Conflict detected/)
end
end
end

context 'when there are no conflicts' do
Expand Down Expand Up @@ -86,6 +98,12 @@
expect(fields).to eq ['some_removed_field']
end
end

describe '.check!' do
it 'raises an error' do
expect { DfE::Analytics::Fields.check! }.to raise_error(DfE::Analytics::ConfigurationError, /Database field removed/)
end
end
end
end
end
9 changes: 7 additions & 2 deletions spec/dfe/analytics/load_entities_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,12 @@
Candidate.table_name.to_sym => []
})

# autogenerate a compliant blocklist
allow(DfE::Analytics).to receive(:blocklist).and_return(DfE::Analytics::Fields.generate_blocklist)

allow(DfE::Analytics::SendEvents).to receive(:perform_later)

DfE::Analytics.initialize!
end

around do |ex|
Expand All @@ -32,7 +37,7 @@

described_class.new(entity_name: Candidate.table_name).run

expect(DfE::Analytics::SendEvents).to have_received(:perform_later) do |payload|
expect(DfE::Analytics::SendEvents).to have_received(:perform_later).twice do |payload|
schema = DfE::Analytics::EventSchema.new.as_json
schema_validator = JSONSchemaValidator.new(schema, payload.first)

Expand All @@ -50,6 +55,6 @@

described_class.new(entity_name: Candidate.table_name, batch_size: 2).run

expect(DfE::Analytics::SendEvents).to have_received(:perform_later).once
expect(DfE::Analytics::SendEvents).to have_received(:perform_later).exactly(3).times
end
end
Loading