Failures that happen in after do not trigger a retry #110

Closed
schneems opened this issue Aug 10, 2020 · 1 comment

@schneems (Contributor):

From: https://app.circleci.com/pipelines/github/heroku/hatchet/160/workflows/37a65f89-4da7-4c54-bdb0-09eac28cc942/jobs/474

    An error occurred in an `after(:context)` hook.
    Failure/Error: @app.run_multi("ls") { |out| expect(out).to include("Gemfile") }
      expected "" to include "Gemfile"
    # ./spec/hatchet/app_spec.rb:217:in `block (4 levels) in <top (required)>'
    # ./lib/hatchet/app.rb:240:in `block in run_multi'

    RSpec::Retry: 2nd try ./spec/hatchet/app_spec.rb:174
    .

    Finished in 22.84 seconds (files took 2.11 seconds to load)
    2 examples, 0 failures, 1 error occurred outside of examples

For whatever reason, sometimes `heroku run <command>` comes back empty. I wish I knew why so I could fix it, but until then we're managing the situation by relying on rspec-retry. The problem is that the failure in this test doesn't surface inside the test context, where rspec-retry could catch it, but instead in the after(:all) step:

    before(:all) do
      skip("Must set HATCHET_EXPENSIVE_MODE") unless ENV["HATCHET_EXPENSIVE_MODE"]

      @app = Hatchet::GitApp.new("default_ruby", run_multi: true)
      @app.deploy
    end

    after(:all) do
      @app.teardown! if @app # <====== ERROR fires here because it's where we're waiting on threads to join
    end

    it "test one" do
      @app.run_multi("ls") { |out| expect(out).to include("Gemfile") }  # <== This is where the thread is spawned that the error comes from
      expect(@app.platform_api.formation.list(@app.name).detect {|ps| ps["type"] == "web"}["size"].downcase).to_not eq("free")
    end
@schneems (Contributor, Author):

Here are some options:

  1. Have run_multi commands automatically retry (up to Hatchet's default retry value) when they see an exception. The downside is that when tests fail for "valid" reasons rather than flappiness, they now take much longer to fail.
  2. Explore whether the retry behavior can be applied retroactively to a test whose after block fails. This would be tricky to generalize: for an after(:all) you wouldn't know which example the failure came from, and would have to re-run the entire group. This would be very invasive and difficult.
  3. Add or document some kind of manual join option to ensure the run passed or failed inside the test itself (where it can safely be re-run). The downside is that it introduces a complex manual step the developer has to remember.

Option 3 is probably the easiest to implement and buys us the most transparency into what's happening. It could look like this if run_multi returned the thread instead of true:

    it "test one" do
      background_tasks = []
      background_tasks << @app.run_multi("ls") { |out| expect(out).to include("Gemfile") }
      expect(@app.platform_api.formation.list(@app.name).detect {|ps| ps["type"] == "web"}["size"].downcase).to_not eq("free")

      background_tasks.map(&:join)
    end

But that's pretty ugly.

  4. Introduce a retry: flag on run_multi so the developer can opt in to retries manually. The downside is that it still requires manual intervention.

  5. Document the behavior and discourage it: if someone wants to write a test this way, they can use the non-concurrent run in multiple processes.
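Option 4 could be sketched roughly like this. To be clear, no retry: flag exists in Hatchet today, and run_with_retry below is an illustrative stand-in with a simulated flaky command, not Hatchet's API:

```ruby
# Hypothetical sketch of option 4: retry the flaky command inside the
# run_multi machinery. Names here (run_with_retry, retries:) are
# illustrative assumptions, not Hatchet's actual interface.
def run_with_retry(retries: 3)
  attempts = 0
  begin
    attempts += 1
    out = yield                       # run the (possibly flaky) command
    raise "empty output" if out.empty?
    out
  rescue RuntimeError
    retry if attempts < retries       # re-run instead of failing the test
    raise
  end
end

# Simulate `heroku run ls` coming back empty once, then succeeding:
responses = ["", "Gemfile Rakefile"]
result = run_with_retry { responses.shift }
puts result # => "Gemfile Rakefile"
```

Note that this retries the whole block, so any expectations inside it would also be re-run on each attempt.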
