Sfn triggered twice even if only test invoke cronjob once #43

Closed
rivernews opened this issue Oct 1, 2022 · 1 comment

Comments

@rivernews
Owner

rivernews commented Oct 1, 2022

It turns out the cronjob may have been invoked only once.

But the metadata S3 trigger was called twice:

image

The two calls are around 40 seconds apart, which is quite a while. What fired that metadata trigger? We should be able to see which metadata.json triggered each call by looking more closely at their CloudWatch logs:

  • 23:36:13 first 09429cf6-8fc4-47b9-9cef-e3681a55434c triggered by daily-headlines/2021-08-20T22:34:47Z/metadata.json
  • 23:36:57 second b16ec1d8-0868-4adc-9f63-63489ad36d69 triggered by daily-headlines/2021-08-20T23:13:25Z/metadata.json
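
To make this kind of lookup quicker, the metadata trigger could log the object key of every S3 record it receives. Below is a minimal Go sketch, assuming the trigger gets the standard S3 event notification JSON. The names `s3Event` and `triggeringKeys` and the bucket `example-bucket` are hypothetical and are not taken from this project's code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// s3Event mirrors only the fields of an S3 event notification used here.
type s3Event struct {
	Records []struct {
		EventName string `json:"eventName"`
		S3        struct {
			Object struct {
				Key string `json:"key"`
			} `json:"object"`
		} `json:"s3"`
	} `json:"Records"`
}

// triggeringKeys returns the decoded object key of every record in the payload.
func triggeringKeys(payload []byte) ([]string, error) {
	var ev s3Event
	if err := json.Unmarshal(payload, &ev); err != nil {
		return nil, fmt.Errorf("parse S3 event: %w", err)
	}
	keys := make([]string, 0, len(ev.Records))
	for _, r := range ev.Records {
		// S3 URL-encodes object keys in notifications (":" arrives as "%3A").
		key, err := url.QueryUnescape(r.S3.Object.Key)
		if err != nil {
			return nil, fmt.Errorf("decode key %q: %w", r.S3.Object.Key, err)
		}
		keys = append(keys, key)
	}
	return keys, nil
}

func main() {
	// Sample payload; "example-bucket" is a placeholder, not the real bucket.
	payload := []byte(`{"Records":[{"eventName":"ObjectCreated:Put","s3":{"bucket":{"name":"example-bucket"},"object":{"key":"daily-headlines/2021-08-20T22%3A34%3A47Z/metadata.json"}}}]}`)
	keys, err := triggeringKeys(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, k := range keys {
		fmt.Println("triggered by", k)
	}
}
```

Logging the decoded key next to the Lambda request ID would let the two invocations above be matched to their metadata.json files directly from the logs.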

Recall that our cronjob logged "Saved landing page metadata to s3://**/daily-headlines/2021-08-20T22:34:47Z/metadata.json", so the first call is expected.

The question is why the second one was triggered. We can look at its events to figure out why:

  • Landing page S3 key **/daily-headlines/2021-08-20T23:13:25Z/landing.html, uuid 18acaf47-bba9-4468-9486-0adaa35ca457.
  • Its LANDING_METADATA_DONE event is recorded at 2022-09-30T23:36:55Z. This is the outcome of a cronjob run.
    • Cronjob log: Test invoke 7c2bce03-2c84-4126-8746-04938ddf103f finished at --:36:11 -> generated metadata.json -> metadata trigger at 23:36:13. Makes sense.
    • At --:36:46 the cronjob was invoked again (9eaf1695-aeb2-4338-8871-43391530929f). Why? This was not our manual Test invoke. The only explanation is that the cronjob's rate(40min) schedule came due and happened to fire here.
  • Its LANDING_METADATA_REQUESTED event is recorded at 2022-09-30T22:44:52Z, which always coincides with LANDING_PAGE_FETCHED. It only means the landing page trigger was invoked, i.e. the landing page was fetched. That is irrelevant here, so we can exclude it.
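
To confirm rather than infer that the second invoke came from the schedule, the cronjob could inspect its incoming event. Below is a minimal Go sketch, assuming the schedule is an EventBridge (CloudWatch Events) rule. Such rules deliver an event whose `source` is `aws.events` and whose `detail-type` is `Scheduled Event`. The names `invocation` and `isScheduled` and the manual test payload are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// invocation holds the envelope fields that identify a scheduled event.
type invocation struct {
	Source     string `json:"source"`
	DetailType string `json:"detail-type"`
}

// isScheduled reports whether the payload looks like an EventBridge
// scheduled-rule event. Anything else, including payloads that are not
// JSON objects, is treated as a manual or other invoke.
func isScheduled(payload []byte) bool {
	var inv invocation
	if err := json.Unmarshal(payload, &inv); err != nil {
		return false
	}
	return inv.Source == "aws.events" && inv.DetailType == "Scheduled Event"
}

func main() {
	// Trimmed-down shape of a scheduled-rule event.
	scheduled := []byte(`{"detail-type":"Scheduled Event","source":"aws.events","detail":{}}`)
	// Hypothetical payload that a manual Test invoke might send.
	manual := []byte(`{"test":"manual"}`)

	fmt.Println("scheduled payload:", isScheduled(scheduled))
	fmt.Println("manual payload:", isScheduled(manual))
}
```

Logging this flag together with the request ID would show at a glance whether an invoke such as 9eaf1695-aeb2-4338-8871-43391530929f was fired by the schedule or by hand.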
@rivernews
Owner Author

If we're not ready, we shouldn't enable the cronjob. Disable the cronjob rate for now.

rivernews added a commit that referenced this issue Oct 1, 2022
rivernews added a commit that referenced this issue Oct 2, 2022
* temp store all

* remove go_poc

* upgrade so project runs on M1

* Try S3 notification

* Fix prefix to include newssite alias

* Fix aws lambda PathError issue

* Save to metadata.json complete

* add untitled stories in metadata.json

* rename stories function to landing_metadata

* rename batch stories fetch tf to metadata

* Improved metadata access s3 event

* Metadata.json trigger computing env

* read parse metadata.json

* fetch a story POC
#24

* Sfn map parallelism POC
#24

* randomize requests

* Refactor to allow individual tf modules
address #25 (comment)

* scaffold table

* draft table design

* create table

* Draining mechanism draft - identify all TODOs
#25 (comment)

* Draft for put landing page; identified TODOs
Issue: #25

* Complete tf surgery; Identify all TODOs in golang
For #25

* fix compile error; progress in metadata cronjob add query

* Ready to test

* Fix db field first char not lowercase
Tracked by #25 (comment)

* Fix permission of db index, S3 pull
Tracked by #25 (comment)

* All tests complete
Tracked by #25 (comment)

* Move landing PutItem out to s3 trigger lambda; ready for S3 batch move

* create reusable lambda module; optimize package size
#25 (comment)

* Fix golang build path

* Refactor to use our custom lambda module

* add landing s3 trigger

* rm golang module stories that are renamed

* Fix env var

* Fix permission for PutItem move from landing to s3 trigger

* Fix metadata s3 trigger not fired

* Fix s3 trigger not working - S3 notification can only have one resource

* Make it easier to test

* prod grade setting enabled

* In Sfn pin lambda version, so rolling deploy works better for lambda

* Display sfn map result / target stories count info in finalizer

* stop landing s3 trigger from sending slack logs
Fixes #40

* Let Sfn pin lambda version
Fixes #39

* improve log for metadata trigger

* improve cronjob log

* log cronjob event for better understanding of how it gets triggered

* Disable cronjob to better debug
Fixes #43

* workaround to scale up our Sfn pipeline
Fix #44

* improve log for landing S3 trigger

* re-enable prod config plus cronjob