-
Notifications
You must be signed in to change notification settings - Fork 14.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: db connection analytics #13346
feat: db connection analytics #13346
Conversation
e11c306
to
eabd614
Compare
with closing(engine.raw_connection()) as conn: | ||
if not engine.dialect.do_ping(conn): | ||
raise DBAPIError(None, None, None) | ||
with closing(engine.raw_connection()) as conn: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
engine wasn't defined if database was None, so I put it under this block.
Codecov Report
@@ Coverage Diff @@
## db-connection-logs #13346 +/- ##
=====================================================
Coverage ? 54.61%
=====================================================
Files ? 478
Lines ? 15982
Branches ? 4123
=====================================================
Hits ? 8729
Misses ? 7253
Partials ? 0
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, my only concern is that in theory we might fail to rollback a transaction. I also left a comment on using log
instead of log_context
.
with event_logger.log_context(action=f"db_connection_failed.{database.db_engine_spec.__name__}"): | ||
db.session.rollback() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This worries me a bit — if there's an error in the context manager we won't rollback the transaction.
I think it would be better to just log, instead of using a context manager:
try:
event_logger.log(action=f"db_connection_failed.{database.db_engine_spec.__name__}")
finally:
db.session.rollback()
To ensure that rollback
is always called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe log
should be pessimistic / guaranteed to never raise(?) so that we don't have to use caution every time we call it? That would mean wrapping the content of log
in a try
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@betodealmeida and I were talking about this earlier, and most statements before the yield are pretty safe, but to be sure, I moved what I could below the yield statement. I don't think there's much risk of this failing now:
start_time = time.time()
payload_override = {}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lmk if you think this works as a solution @betodealmeida @mistercrunch
@@ -69,8 +64,9 @@ def run(self) -> Model: | |||
security_manager.add_permission_view_menu("database_access", database.perm) | |||
db.session.commit() | |||
except DAOCreateFailedError as ex: | |||
logger.exception(ex.exception) | |||
raise DatabaseCreateFailedError() | |||
with event_logger.log_context(action=f"db_creation_failed.{ex.exception}"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to just use log
in places like this. The main difference between log
and log_context
is that the former, in addition to being a context manager, logs the duration automatically, together with some dashboard/chart metadata that is irrelevant here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true... it also logs the user_id and referrer, so maybe it's still useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to either 1) refactor log
to do some of the magic in log_context
maybe add a bool flag enrich_with_request_context=False
, or 2) can we just call the context manager with the with
block? Would that just work? or 3) refactor log_context
to NOT be be context manager and create a new one say log_context_manager
(or better name) that's used only when we want duration / with block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One problem is that adding the flag enrich_with_request_context
would be a breaking change (our event logger, for example, would have to be updated).
I think the best way here is:
- Create a new method in
AbstractEventLogger
calledlog_with_context
, that does all the job thatlog_context
does but is a function, not a context manager. The method callslog
at the end, with all the extracted context. - Change
AbstractEventLogger.log_context
to uselog_with_context
for the enrichment, removing the shared code between them. - In this PR, call the new
log_with_context
. Since it delegates the actual logging tolog
it will work with all existing event loggers, and is not a breaking change..
Eventually I think we should have the event logger itself be a context manager by implementing __enter__
and __exit__
methods, so we could call:
with event_logger(action="foo"):
do_something()
Since we can do this by adding methods to AbstractEventLogger
it would also not be a breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick clarification that enrich_with_request_context
would be false by default to protect backwards compatibity. But maybe function composition is better than a parameter here, I'm ok with with either pattern.
There's a need to really clarify/improve the method in AbstractEventLogger
with good/clear names here between functions we can call, content manager(s) and decorator(s)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we're merging this branch into @hughhhh's feature branch rather than master, @betodealmeida said that he could do the log_context refactor in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm working on the refactor right now.
@mistercrunch you're right, I forgot that log
takes a **kwargs
that would swallow the new argument.
I feel like we may want to add Eventually I'd like to add the I think you can just pass as a keyword as in I made a related comment here: #13223 (comment) |
stats_logger.incr( | ||
f"db_connection_failed.{database.db_engine_spec.__name__}" | ||
) | ||
with event_logger.log_context(action=f"db_connection_failed", engine=database.db_engine_spec.__name__): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mistercrunch I added in the engine here as a kwarg. Also, just as an FYI, the strings that we're passing in these two files aren't exactly 1:1. Here we have to use the engine spec name, and in the test_connection file we pass the engine+driver combo. But the information is essentially the same.. any downstream integrations can still see the engine from the string that is passed in.
f6d8ea1
to
4fe1895
Compare
rerun tests |
fb68366
to
d4588f6
Compare
4538e2f
to
bf0ff7b
Compare
bf0ff7b
to
0ceb8f7
Compare
@@ -69,8 +68,9 @@ def run(self) -> Model: | |||
security_manager.add_permission_view_menu("database_access", database.perm) | |||
db.session.commit() | |||
except DAOCreateFailedError as ex: | |||
logger.exception(ex.exception) | |||
raise DatabaseCreateFailedError() | |||
with event_logger.log_context(action=f"db_creation_failed.{ex.exception}"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hughhhh I saw this earlier today, that we're currently logging the entire exception. Can we change this on line 71 and 72 to log just the class? I originally thought since we were already logging it, it would be fine to pass to the event logger, but I think we should actually clean this up as well.
@hughhhh is taking this effort over in https://github.com/apache/superset/pull/13468/files |
SUMMARY
adds analytics logs to the db connection success/failures
TEST PLAN
unit tests
ADDITIONAL INFORMATION