This repository has been archived by the owner on Mar 28, 2023. It is now read-only.

Stop processors when fatal errors occur #252

Merged
merged 7 commits into from
Oct 8, 2019

@abaiken abaiken commented Sep 26, 2019

Closes #251

DB ERROR HANDLING TESTS:

  1. Bring up local dev.
  2. Upload a report.
  3. Stop the db: docker stop yupana_db_1
  4. Once the db has stopped, upload another report (or wait for the next query for new reports/slices to fail) and watch the processors shut down.
  5. Restart the db with docker start yupana_db_1, then restart the server. Watch it process the first report immediately and pull the second one off the queue.
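
The behaviour these steps exercise can be sketched as follows. This is a minimal illustration, not the code merged in this PR: `FatalError`, `query_db`, `run_processor`, and `simulate` are hypothetical names, and the database is replaced by an in-memory flag so the sketch runs without Docker. The idea is that every processor loop checks a shared stop signal, and the first loop to hit a fatal DB error sets it.

```python
import asyncio


class FatalError(Exception):
    """Hypothetical marker for errors that should stop every processor."""


async def query_db(db_up: dict) -> str:
    """Stand-in for a database query; raises while the 'db' is marked down."""
    await asyncio.sleep(0)
    if not db_up["value"]:
        raise FatalError("database unreachable")
    return "report"


async def run_processor(name: str, db_up: dict, stop: asyncio.Event, log: list) -> None:
    """Poll for work until a fatal error occurs or another processor signals stop."""
    while not stop.is_set():
        try:
            await query_db(db_up)
        except FatalError as err:
            log.append(f"{name}: fatal error ({err}), stopping all processors")
            stop.set()  # the other loops see this on their next iteration
            break
        await asyncio.sleep(0.01)
    log.append(f"{name}: shut down")


async def simulate() -> list:
    """Run two processors, take the 'db' down, and return what they logged."""
    stop = asyncio.Event()
    db_up = {"value": True}
    log: list = []
    tasks = [
        asyncio.create_task(run_processor(name, db_up, stop, log))
        for name in ("report_processor", "report_slice_processor")
    ]
    await asyncio.sleep(0.03)  # processors run normally for a moment
    db_up["value"] = False     # stands in for: docker stop yupana_db_1
    await asyncio.gather(*tasks)
    return log


if __name__ == "__main__":
    for line in asyncio.run(simulate()):
        print(line)
```

A shared `asyncio.Event` keeps the processors decoupled: none of them needs a reference to the others, and each one exits at its next loop check.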

KAFKA ERROR HANDLING TESTS:

  1. Bring up local development.
  2. Kill the kafka container (docker stop docker_kafka_1) before the server starts and watch it shut off the processors.
  3. Upload a large report (for example, a 20k host report from sat), wait until it is sending hosts to HBI, then kill kafka.
    (This will take a while, but it will eventually stop due to a kafka timeout error and trigger the processors to shut down.)

The Kafka LOG errors are a bug in aiokafka (see aio-libs/aiokafka#496).
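
The timeout path in step 3 of the Kafka tests can be illustrated with a small sketch. It does not use aiokafka or any yupana code: `send_hosts`, `send_or_shutdown`, and `demo` are hypothetical stand-ins, and the broker is simulated with a sleep. It only shows the general pattern of bounding a send with a timeout and treating the timeout as a signal to shut the processors down.

```python
import asyncio


async def send_hosts(delay: float) -> str:
    """Stand-in for a Kafka producer send; `delay` simulates broker latency."""
    await asyncio.sleep(delay)
    return "sent"


async def send_or_shutdown(delay: float, timeout: float, stop: asyncio.Event) -> bool:
    """Return True if the send completed; on timeout, signal shutdown and return False."""
    try:
        await asyncio.wait_for(send_hosts(delay), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        stop.set()  # assumed shared flag that the processor loops would check
        return False


async def demo() -> tuple:
    """Exercise one healthy send and one send against a 'killed' broker."""
    stop = asyncio.Event()
    healthy = await send_or_shutdown(delay=0.0, timeout=0.5, stop=stop)
    stopped_before = stop.is_set()
    killed = await send_or_shutdown(delay=1.0, timeout=0.05, stop=stop)
    return healthy, stopped_before, killed, stop.is_set()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The timeout values here are arbitrary. In the manual test above, the real timeout is long enough that the shutdown "will take a while" to appear.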

@abaiken abaiken added this to the QPC 2019 - Sprint 19 milestone Sep 26, 2019
@abaiken abaiken requested a review from a team September 26, 2019 22:19
codecov-io commented Sep 27, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@81ffd5c).
The diff coverage is 69.59%.

@@            Coverage Diff            @@
##             master     #252   +/-   ##
=========================================
  Coverage          ?   94.73%           
=========================================
  Files             ?       29           
  Lines             ?     2452           
  Branches          ?      286           
=========================================
  Hits              ?     2323           
  Misses            ?       84           
  Partials          ?       45

Impacted Files                               Coverage Δ
yupana/processor/garbage_collection.py       100% <100%> (ø)
yupana/processor/report_processor.py         94.24% <100%> (ø)
yupana/processor/report_slice_processor.py   92.12% <100%> (ø)
yupana/processor/report_consumer.py          68.42% <60.46%> (ø)
yupana/processor/abstract_processor.py       95.85% <71.42%> (ø)
yupana/processor/processor_utils.py          75.67% <75.67%> (ø)
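
As a quick consistency check on the totals in the coverage diff above, the reported 94.73% can be reproduced from the Hits and Lines counts. The snippet uses only the numbers shown in the report. That the displayed figure is truncated rather than rounded is an inference from the arithmetic, not something the report states.

```python
import math

# Totals copied from the coverage diff in this report.
hits, misses, partials, lines = 2323, 84, 45, 2452

# Every line is counted as exactly one of hit, miss, or partial.
assert hits + misses + partials == lines

ratio = hits / lines * 100                   # coverage as a percentage
rounded = round(ratio, 2)                    # 94.74
truncated = math.floor(ratio * 100) / 100    # 94.73, matching the report

print(f"rounded={rounded} truncated={truncated}")
```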

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 81ffd5c...d7585bf.

@abaiken abaiken changed the title from "WIP: Stop processors when fatal errors occur" to "Stop processors when fatal errors occur" Sep 30, 2019
@myersCody myersCody self-requested a review October 7, 2019 17:38
@myersCody
I retested the critical error using the methods used in the description and it worked for me! 👍

@kholdaway kholdaway left a comment

Looks good

@abaiken abaiken merged commit 7e99788 into master Oct 8, 2019
@abaiken abaiken deleted the issues/251 branch October 8, 2019 16:48
abaiken added a commit that referenced this pull request Oct 11, 2019
* Remove report archives from postgresDB after specified time  (#233)

* initial changes to add a garbage collector.

* Scheduled monthly dependency update for September (#237)

* Update requests-mock from 1.6.0 to 1.7.0
* Update requests-mock from 1.6.0 to 1.7.0
* Update sphinx from 2.1.2 to 2.2.0
* Update sphinx from 2.1.2 to 2.2.0
* Update lazy-object-proxy from 1.4.1 to 1.4.2
* Update port-for from 0.3.1 to 0.4
* Update virtualenv from 16.7.2 to 16.7.4
* Update zipp from 0.5.2 to 0.6.0

* Add Grafana & Prometheus to local development (#239)

* add prometheus & grafana to local dev

* Add host/report metrics (per hour/minute/day/source/etc) (#244)

* Add host metrics to Grafana

* Add metric to capture kafka errors (#245)

* Add metric to capture kafka errors

* Add grafana charts and prometheus metrics to capture db errors (#246)

* Change our base image from CentOS to UBI  (#247)

* Initial changes to test moving from a centos image to ubi

* Stop processors when fatal errors occur (#252)

* Initial changes for adding a shutdown for all processors when fatal errors occur

* Added Jenkinsfile for e2e smoke tests (#255)
abaiken added a commit that referenced this pull request Oct 31, 2019
* Remove report archives from postgresDB after specified time  (#233)

* initial changes to add a garbage collector.

* Scheduled monthly dependency update for September (#237)

* Update requests-mock from 1.6.0 to 1.7.0
* Update requests-mock from 1.6.0 to 1.7.0
* Update sphinx from 2.1.2 to 2.2.0
* Update sphinx from 2.1.2 to 2.2.0
* Update lazy-object-proxy from 1.4.1 to 1.4.2
* Update port-for from 0.3.1 to 0.4
* Update virtualenv from 16.7.2 to 16.7.4
* Update zipp from 0.5.2 to 0.6.0

* Add Grafana & Prometheus to local development (#239)

* add prometheus & grafana to local dev

* Add host/report metrics (per hour/minute/day/source/etc) (#244)

* Add host metrics to Grafana

* Add metric to capture kafka errors (#245)

* Add metric to capture kafka errors

* Add grafana charts and prometheus metrics to capture db errors (#246)

* Change our base image from CentOS to UBI  (#247)

* Initial changes to test moving from a centos image to ubi

* Stop processors when fatal errors occur (#252)

* Initial changes for adding a shutdown for all processors when fatal errors occur

* Added Jenkinsfile for e2e smoke tests (#255)

* Updated the services defined for the yupana smoke tests (#257)

* Updated the services defined for the yupana smoke tests
Linked issue: Stop processors when Kafka/DB errors are encountered