Reduce noise from 'ignored properties' warnings #383

Closed
MeltyBot opened this issue May 18, 2022 · 3 comments


@MeltyBot
Contributor

Migrated from GitLab: https://gitlab.com/meltano/sdk/-/issues/386

Originally created by @aaronsteers on 2022-05-18 03:42:17


The SDK prints a separate warning for every field that the API returns but that is not mapped in the stream's schema. We can do a couple of things to make the logs less noisy.

Consolidate warnings into a single message per stream

Instead of:

2022-05-18T03:26:54.596749Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'closed_by' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.597507Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'milestone' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.598140Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'author' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.598749Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'type' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.599366Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'assignee' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.600108Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'issue_type' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.600725Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'time_stats' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.601362Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'task_completion_status' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.604317Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'blocking_issues_count' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.604790Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property '_links' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.605257Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'references' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.605731Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'severity' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.606134Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'moved_to_id' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.606522Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'service_desk_reply_to' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.606932Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'epic_iid' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.607318Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'epic' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.607710Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'iteration' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.608199Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'health_status' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
2022-05-18T03:26:54.608593Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=Property 'project_path' was present in the 'issues' stream but not found in catalog schema. Ignoring. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr

We could send a single message:

2022-05-18T03:26:54.608593Z [info     ] time=2022-05-18 03:26:54 name=tap-gitlab level=WARNING message=The following properties were present in the 'issues' stream but not found in catalog schema (ignoring): closed_by, milestone, author, type, assignee, issue_type, time_stats, task_completion_status, blocking_issues_count, _links, references, severity, moved_to_id, service_desk_reply_to, epic_iid, epic, iteration, health_status, project_path. cmd_type=extractor job_id=gitlab-to-postgres name=tap-gitlab run_id=8308d041-475f-4404-be2c-e4012413e93a stdio=stderr
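
A rough sketch of how the consolidation could work, assuming the SDK can see each record's keys and the set of property names declared in the catalog schema. The `IgnoredPropertyTracker` class and its method names are hypothetical and not part of the SDK; the idea is only to collect names per stream and emit one warning when the stream finishes.

```python
import logging
from collections import defaultdict

logger = logging.getLogger("tap-gitlab")


class IgnoredPropertyTracker:
    """Hypothetical helper: collect unmapped property names and warn once per stream."""

    def __init__(self):
        # stream name -> property names in first-seen order (dicts keep insertion order)
        self._seen = defaultdict(dict)

    def record(self, stream_name, record, schema_properties):
        """Remember every top-level key of `record` that the schema does not declare."""
        for key in record:
            if key not in schema_properties:
                self._seen[stream_name][key] = None

    def message(self, stream_name):
        """Build the consolidated warning text, or None if nothing was ignored."""
        names = list(self._seen.get(stream_name, {}))
        if not names:
            return None
        return (
            f"The following properties were present in the '{stream_name}' stream "
            f"but not found in catalog schema (ignoring): {', '.join(names)}."
        )

    def flush(self, stream_name):
        """Emit a single warning for the stream, then reset its state."""
        text = self.message(stream_name)
        if text is not None:
            logger.warning(text)
        self._seen.pop(stream_name, None)
```

Calling `record()` for each row and `flush()` once at the end of the stream sync would give one log line per stream instead of one per property.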

Pytest failures

We could add a (perhaps optional) pytest check that fails hard during development if any extra properties are unmapped and not explicitly excluded.
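
As an illustration only, such a check might look something like the following. The helper `find_unmapped_properties`, the sample schema, and the sample records are all made up for this sketch; a real test would pull a handful of records from the tap and use the stream's actual schema.

```python
def find_unmapped_properties(records, schema, ignored=()):
    """Return sorted top-level keys seen in records but missing from schema and ignore list."""
    declared = set(schema.get("properties", {}))
    allowed = declared | set(ignored)
    unmapped = set()
    for record in records:
        unmapped.update(key for key in record if key not in allowed)
    return sorted(unmapped)


# Hypothetical sample data standing in for real tap output.
SAMPLE_SCHEMA = {"properties": {"id": {"type": "integer"}, "title": {"type": "string"}}}
SAMPLE_RECORDS = [{"id": 1, "title": "Bug", "_links": {}}]
IGNORED_PROPERTIES = ["_links"]


def test_no_unmapped_properties():
    """Fail hard if a property is neither mapped in the schema nor explicitly ignored."""
    unmapped = find_unmapped_properties(SAMPLE_RECORDS, SAMPLE_SCHEMA, IGNORED_PROPERTIES)
    assert not unmapped, f"Unmapped properties not in schema or ignore list: {unmapped}"
```

With `_links` removed from the ignore list, the test would fail and name the offending property, which is the development-time signal described above.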

We could optionally pair this with a new stream class property, ignored_properties, which would be functionally identical to the developer adding this to post_process():

    for ignored in IGNORED_PROPERTIES:
        result.pop(ignored, None)
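
For context, here is a minimal sketch of how that could sit on a stream class. `IssuesStream` below is a plain stand-in class, not the SDK's real base class, and `ignored_properties` is the proposed (not yet existing) property; the `post_process(row, context)` shape is assumed for illustration.

```python
class IssuesStream:
    """Stand-in for an SDK stream class, used only to illustrate the proposal."""

    name = "issues"
    # Proposed class property from this issue (hypothetical).
    ignored_properties = ["_links", "time_stats"]

    def post_process(self, row, context=None):
        """Drop explicitly ignored properties before the record is emitted."""
        for ignored in self.ignored_properties:
            # pop() with a default avoids a KeyError when the API omits the field
            row.pop(ignored, None)
        return row
```

Properties listed this way would be removed silently, so they would never reach the schema check and never trigger a warning.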
@MeltyBot
Contributor Author

@edgarrmondragon
Collaborator

Copied from GitLab:


  1. We should probably get rid of any root logger usage
  2. The tap (e.g. tap-github) should allow more granular control through named loggers such as tap-github.streams.stargazers, tap-github.metrics, etc. This would allow users to set different log levels for different streams and features.
  3. Module-level loggers should be kept where it makes sense as low-level SDK features: singer catalog, configuration parsing, etc.
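
To make point 2 concrete, here is a small sketch using only Python's standard `logging` module. The logger names follow the examples in the list above and are illustrative; how the SDK would actually name its loggers is an assumption.

```python
import logging

# Dotted names form a hierarchy: children inherit the level of the nearest
# ancestor that has one set explicitly.
tap_logger = logging.getLogger("tap-github")
stargazers_logger = logging.getLogger("tap-github.streams.stargazers")
metrics_logger = logging.getLogger("tap-github.metrics")

# Default level for everything under the tap.
tap_logger.setLevel(logging.INFO)

# Quiet one noisy stream without touching metrics or other streams.
stargazers_logger.setLevel(logging.ERROR)
```

With this setup, warnings from the stargazers stream are dropped while the metrics logger still inherits INFO from `tap-github`.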

@stale

stale bot commented Jul 18, 2023

This has been marked as stale because it is unassigned, and has not had recent activity. It will be closed after 21 days if no further activity occurs. If this should never go stale, please add the evergreen label, or request that it be added.

@stale stale bot added the stale label Jul 18, 2023
@stale stale bot closed this as completed Aug 8, 2023