refactor: SQLConnector connection lifecycle handling #1393

qbatten · 2023-02-06T18:56:26Z

Feature scope

Other

Description

This is a sub-issue broken out of #1352.

This comment explains this sub-issue in detail.

The idea here is to make the SQLConnector correctly handle it's database connection, and take on sole ownership of that job. In other words:

The connection itself is protected
Streams & other classes/functions/etc are not allowed to open/close/access a connection (subclasses of SQLConnector are still allowed & expected to do so).
The SQLConnector itself opens and closes connections correctly and doesn't leave connections hanging open

Some more detailed next steps here are:

Make a SQLConnector (protected) method, _connect that can be used as a context manager, and use that every single time a connection is opened. (An alternative is just to make sure that any connection opened gets closed as soon as it's done, but I think that's not as nice & is more prone to error.)
Only the SQLConnector can use the connection. No other objects should access or be aware of the SQLConnector's connection; and anything that other objects ask the SQLConnector to do that involve interaction w the database should be accessible via a SQLConnector method that correctly opens and closes the connection for that db interaction.

To fully realize this goal, we need to make breaking changes. In particular:

Stop accepting a _connection arg on instantiation
Stop storing and accessing _connection altogether
Remove connection and create_sqlalchemy_connection. It's a public method that returns a connection. That encourages bad behavior.
Remove create_sqlalchemy_engine. There shouldn't be a public method for creating or getting a sqlalchemy engine; also there should just be one engine per SQLConnector.

The text was updated successfully, but these errors were encountered:

qbatten · 2023-02-06T19:09:09Z

There is a mostly-not-breaking way to do this, and I'm going to push a draft PR of that soon. It does...sorta break things, but only for subclasses.

The breaking change is as follows: we currently allow someone to pass _connection. In order to move towards the goal of this issue, that must be removed, it's just incompatible with this change.

tayloramurphy · 2023-02-06T20:20:20Z

Thanks for the issue / PR! I'll have the engineering team take a look at the PR.

BuzzCutNorman · 2023-02-07T19:08:05Z

If you remove create_sqlalchemy_engine would there, be a new way for developers to set engine arguments? I have overloaded it to allow some user configuration options in this portion of my tap-mssql.

qbatten · 2023-02-07T19:16:36Z

@BuzzCutNorman you should be able to do that exact thing, by overriding the _engine method instead.

E.g.

    def _engine(self) -> Engine:
        """Return the engine object.
        Returns:
            The SQLAlchemy Engine that's attached to this SQLConnector instance.
        """

        if not self._cached_engine:
            eng_prefix = "ep."
            eng_config = {f"{eng_prefix}url":self.sqlalchemy_url,f"{eng_prefix}echo":"False"}

            if self.config.get('sqlalchemy_eng_params'):
                for key, value in self.config['sqlalchemy_eng_params'].items():
                    eng_config.update({f"{eng_prefix}{key}": value})
           self._cached_engine = sqlalchemy.engine_from_config(eng_config, prefix=eng_prefix)
        return cast(Engine, self._cached_engine)

I just copied/pasted/edited that in the github comment text box so the spacing might be off lol. Hope that makes sense though. Basically, instead of create_sqlalchemy_engine, we're just creating and caching a single engine in _engine. So you'd replace the code you linked to with my snippet above. And you just gotta make sure to hold onto the (new!) if not self._cached_engine... and self._cached_engine = sqlalchemy.engine_from_config...{or however the engine is being made} so that the caching behavior is preserved.

Here's the method in the PR that I'm talking about ("_engine")

BuzzCutNorman · 2023-02-07T19:21:11Z

Thanks, for the explanation and example. much appreciated.

qbatten · 2023-02-14T16:30:40Z

Solved by #1394

qbatten added kind/Feature New feature or request valuestream/SDK labels Feb 6, 2023

qbatten mentioned this issue Feb 6, 2023

feat: Refactor SQLConnector connection handling #1394

Merged

tayloramurphy added the Community-Contributed PR label Feb 6, 2023

aaronsteers added this to Office Hours Feb 8, 2023

aaronsteers moved this to Up Next in Office Hours Feb 8, 2023

aaronsteers moved this from Up Next to Discussed in Office Hours Feb 8, 2023

qbatten closed this as completed Feb 14, 2023

edgarrmondragon mentioned this issue Feb 22, 2024

feat: Make SQLConnector _connect function public #2262

Open

raulbonet mentioned this issue Mar 24, 2024

feat: SQLConnector: managing transactions #2338

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: SQLConnector connection lifecycle handling #1393

refactor: SQLConnector connection lifecycle handling #1393

qbatten commented Feb 6, 2023

qbatten commented Feb 6, 2023

tayloramurphy commented Feb 6, 2023 •

edited

Loading

BuzzCutNorman commented Feb 7, 2023 •

edited

Loading

qbatten commented Feb 7, 2023 •

edited

Loading

BuzzCutNorman commented Feb 7, 2023

qbatten commented Feb 14, 2023

refactor: SQLConnector connection lifecycle handling #1393

refactor: SQLConnector connection lifecycle handling #1393

Comments

qbatten commented Feb 6, 2023

Feature scope

Description

qbatten commented Feb 6, 2023

tayloramurphy commented Feb 6, 2023 • edited Loading

BuzzCutNorman commented Feb 7, 2023 • edited Loading

qbatten commented Feb 7, 2023 • edited Loading

BuzzCutNorman commented Feb 7, 2023

qbatten commented Feb 14, 2023

tayloramurphy commented Feb 6, 2023 •

edited

Loading

BuzzCutNorman commented Feb 7, 2023 •

edited

Loading

qbatten commented Feb 7, 2023 •

edited

Loading