Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update redshift-connector requirement from 2.0.913 to 2.0.915 #601

Merged

Conversation

soksamnanglim
Copy link
Contributor

@soksamnanglim soksamnanglim commented Sep 12, 2023

resolves #556 , resolves #456 , resolves #347

Problem

Updates the version to new release of redshift-connector. Here's the relevant changelog for 2.0.914

here are the relevant PRs that improve error messaging in the drive

Solution

The new release:

  • maps socket timeout errors to OperationalError instead of InterfaceError
  • maps bind parameter limitation errors to DataError Prepared statement exceeds bind parameter limit 32767. instead of struct.error 'h' format requires -32768 <= number <= 32767
  • maps server side socket closing errors to InterfaceError BrokenPipe: server socket closed. instead of struct.error unpack_from requires a buffer of at least 5 bytes for unpacking 5 bytes at offset 0 (actual buffer size is 0)

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • Run manual tests (planning on doing this soon)

@cla-bot
Copy link

cla-bot bot commented Sep 12, 2023

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Soksamnang Lim.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cla-bot
Copy link

cla-bot bot commented Sep 12, 2023

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Soksamnang Lim.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@soksamnanglim soksamnanglim marked this pull request as ready for review September 12, 2023 19:08
@soksamnanglim soksamnanglim requested a review from a team as a code owner September 12, 2023 19:08
@soksamnanglim soksamnanglim marked this pull request as draft September 12, 2023 21:23
@mikealfare mikealfare removed the request for review from VersusFacit September 12, 2023 22:34
@mikealfare mikealfare self-assigned this Sep 12, 2023
@soksamnanglim
Copy link
Contributor Author

converted this PR to a draft, as I still need to conduct manual testing, would love your recommendations for smoke tests!

@cla-bot cla-bot bot added the cla:yes label Sep 13, 2023
@dataders dataders closed this Sep 13, 2023
@dataders dataders reopened this Sep 13, 2023
@dataders dataders marked this pull request as ready for review September 13, 2023 19:38
@cla-bot cla-bot bot added the cla:yes label Sep 13, 2023
@mikealfare
Copy link
Contributor

mikealfare commented Sep 13, 2023

All of the integration tests failed or raised an error with an exception similar to the following:

Click me
[gw1] linux -- Python 3.8.18 /home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/bin/python

self = <logbook.compat.LoggingCompatRecord object at 0x7f1799a9de20>

    @cached_property
    def message(self):
        """The formatted message."""
        if not (self.args or self.kwargs):
            return self.msg
        try:
            try:
>               return self._format_message(self.msg, *self.args,
                                            **self.kwargs)
E                                           TypeError: _format_message() keywords must be strings

/home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/lib/python3.8/site-packages/logbook/base.py:554: TypeError

During handling of the above exception, another exception occurred:

self = <pytest_logbook.LogbookManager object at 0x7f17c89e34f0>
item = <Function test_redshift_simple_seed_with_column_override_redshift>

    @pytest.hookimpl(hookwrapper=True)
    def pytest_runtest_call(self, item):
        """Enable logbook capturing during test call."""
        with self.capturing(item, 'call'):
>           yield

/home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/lib/python3.8/site-packages/pytest_logbook.py:97: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/contextlib.py:120: in __exit__
    next(self.gen)
/home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/lib/python3.8/site-packages/pytest_logbook.py:74: in capturing
    '\n'.join(handler.formatted_records))
/home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/lib/python3.8/site-packages/logbook/handlers.py:1031: in formatted_records
    self._formatted_records = [self.format(r) for r in self.records]
/home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/lib/python3.8/site-packages/logbook/handlers.py:1031: in <listcomp>
    self._formatted_records = [self.format(r) for r in self.records]
/home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/lib/python3.8/site-packages/logbook/handlers.py:195: in format
    return self.formatter(record, self)
/home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/lib/python3.8/site-packages/logbook/handlers.py:387: in __call__
    line = self.format_record(record, handler)
/home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/lib/python3.8/site-packages/logbook/handlers.py:371: in format_record
    return self._formatter.format(record=record, handler=handler)
/home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/lib/python3.8/site-packages/logbook/helpers.py:283: in __get__
    value = self.func(obj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <logbook.compat.LoggingCompatRecord object at 0x7f1799a9de20>

    @cached_property
    def message(self):
        """The formatted message."""
        if not (self.args or self.kwargs):
            return self.msg
        try:
            try:
                return self._format_message(self.msg, *self.args,
                                            **self.kwargs)
            except UnicodeDecodeError:
                # Assume an unicode message but mixed-up args
                msg = self.msg.encode('utf-8', 'replace')
                return self._format_message(msg, *self.args, **self.kwargs)
            except (UnicodeEncodeError, AttributeError):
                # we catch AttributeError since if msg is bytes,
                # it won't have the 'format' method
                if (sys.exc_info()[0] is AttributeError
                        and (PY2 or not isinstance(self.msg, bytes))):
                    # this is not the case we thought it is...
                    raise
                # Assume encoded message with unicode args.
                # The assumption of utf8 as input encoding is just a guess,
                # but this codepath is unlikely (if the message is a constant
                # string in the caller's source file)
                msg = self.msg.decode('utf-8', 'replace')
                return self._format_message(msg, *self.args, **self.kwargs)
    
        except Exception:
            # this obviously will not give a proper error message if the
            # information was not pulled and the log record no longer has
            # access to the frame.  But there is not much we can do about
            # that.
            e = sys.exc_info()[1]
            errormsg = ('Could not format message with provided '
                        'arguments: {err}\n  msg={msg!r}\n  '
                        'args={args!r} \n  kwargs={kwargs!r}.\n'
                        'Happened in file {file}, line {lineno}').format(
                err=e, msg=self.msg, args=self.args,
                kwargs=self.kwargs, file=self.filename,
                lineno=self.lineno
            )
            if PY2:
                errormsg = errormsg.encode('utf-8')
>           raise TypeError(errormsg)
E           TypeError: Could not format message with provided arguments: _format_message() keywords must be strings
E             msg='connection.redshift_types=%s'
E             args=() 
E             kwargs=defaultdict(<function <lambda> at 0x7f17c8279f70>, {<RedshiftOID.ABSTIME: 702>: (1, <function abstime_recv at 0x7f17c8270f70>), <RedshiftOID.BOOLEAN: 16>: (1, <function bool_recv at 0x7f17c8270700>), <RedshiftOID.NAME: 19>: (1, <function text_recv at 0x7f17c8270670>), <RedshiftOID.BIGINT: 20>: (1, <function int8_recv at 0x7f17c8270820>), <RedshiftOID.SMALLINT: 21>: (1, <function int2_recv at 0x7f17c82708b0>), <RedshiftOID.INT2VECTOR: 22>: (0, <function vector_in at 0x7f17c8270940>), <RedshiftOID.INTEGER: 23>: (1, <function int4_recv at 0x7f17c82709d0>), <RedshiftOID.REGPROC: 24>: (1, <function oid_recv at 0x7f17c8270af0>), <RedshiftOID.TEXT: 25>: (1, <function text_recv at 0x7f17c8270670>), <RedshiftOID.OID: 26>: (1, <function oid_recv at 0x7f17c8270af0>), <RedshiftOID.XID: 28>: (0, <function int_in at 0x7f17c8270a60>), <RedshiftOID.JSON: 114>: (0, <function json_in at 0x7f17c8270b80>), <RedshiftOID.REAL: 700>: (1, <function float4_recv at 0x7f17c8270c10>), <RedshiftOID.FLOAT: 701>: (1, <function float8_recv at 0x7f17c8270ca0>), <RedshiftOID.UNKNOWN: 705>: (1, <function text_recv at 0x7f17c8270670>), <RedshiftOID.SMALLINT_ARRAY: 1005>: (1, <function array_recv_binary at 0x7f17c8279ca0>), <RedshiftOID.INTEGER_ARRAY: 1007>: (1, <function array_recv_binary at 0x7f17c8279ca0>), <RedshiftOID.TEXT_ARRAY: 1009>: (1, <function array_recv_binary at 0x7f17c8279ca0>), <RedshiftOID.CHAR_ARRAY: 1002>: (1, <function array_recv_binary at 0x7f17c8279ca0>), <RedshiftOID.OID_ARRAY: 1028>: (1, <function array_recv_binary at 0x7f17c8279ca0>), <RedshiftOID.ACLITEM: 1033>: (1, <function text_recv at 0x7f17c8270670>), <RedshiftOID.ACLITEM_ARRAY: 1034>: (1, <function array_recv_binary at 0x7f17c8279ca0>), <RedshiftOID.VARCHAR_ARRAY: 1015>: (1, <function array_recv_binary at 0x7f17c8279ca0>), <RedshiftOID.REAL_ARRAY: 1021>: (1, <function array_recv_binary at 0x7f17c8279ca0>), <RedshiftOID.CHAR: 18>: (1, <function text_recv at 0x7f17c8270670>), <RedshiftOID.BPCHAR: 1042>: (1, <function text_recv at 0x7f17c8270670>), <RedshiftOID.STRING: 1043>: (1, <function text_recv at 0x7f17c8270670>), <RedshiftOID.DATE: 1082>: (1, <function date_recv_binary at 0x7f17c8279790>), <RedshiftOID.TIME: 1083>: (1, <function time_recv_binary at 0x7f17c82795e0>), <RedshiftOID.TIMESTAMP: 1114>: (1, <function timestamp_recv_integer at 0x7f17c8270ee0>), <RedshiftOID.TIMESTAMPTZ: 1184>: (1, <function timestamptz_recv_integer at 0x7f17c8279160>), <RedshiftOID.TIMETZ: 1266>: (1, <function timetz_recv_binary at 0x7f17c8279550>), <RedshiftOID.INTERVAL: 1186>: (1, <function interval_recv_integer at 0x7f17c82794c0>), <RedshiftOID.NUMERIC: 1700>: (1, <function numeric_in_binary at 0x7f17c8279280>), <RedshiftOID.GEOGRAPHY: 3001>: (1, <function geographyhex_recv at 0x7f17c8279dc0>), <RedshiftOID.GEOMETRY: 3000>: (0, <function text_recv at 0x7f17c8270670>), <RedshiftOID.GEOMETRYHEX: 3999>: (0, <function geometryhex_recv at 0x7f17c8279e50>), <RedshiftOID.SUPER: 4000>: (0, <function text_recv at 0x7f17c8270670>), <RedshiftOID.VARBYTE: 6551>: (0, <function text_recv at 0x7f17c8270670>)}).
E           Happened in file /home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/lib/python3.8/site-packages/redshift_connector/core.py, line 851

/home/runner/work/dbt-redshift/dbt-redshift/.tox/integration-redshift/lib/python3.8/site-packages/logbook/base.py:590: TypeError

Have you tried running this locally?

@dataders
Copy link
Contributor

piggybacking on top of @mikealfare. see this excerpt of the test log. the error is raised by logbook a dependency of redshift_connector by a call to _logger.debug() on redshift_connector/core.py#L851 which was introduce as a part of this latest release (commit 858abad)

_logger.debug("connection.redshift_types=%s", self.redshift_types)

The error message is complaining that self.redshift_types is not a string but instead (unsurprisingly) a struct of all the redshift_types.

@soksamnanglim
Copy link
Contributor Author

soksamnanglim commented Sep 13, 2023

Hi @mikealfare, thanks for testing! I ran the tests in a venv using tox and all tests passed (must have been all unit tests). What should I do to reproduce this locally? How do I run the integration tests?

Thanks for the briefing @dataders, I'll take a look 🙂

@mikealfare
Copy link
Contributor

@soksamnanglim Yeah, just running tox runs the unit tests (default environment). You should be able to do any of the following, depending on your preference:

make integration

# py38 below can be one of {py38,py39,py310,py311}
tox -e py38-redshift

# you can throw a -v or -vv at the end below to get more verbose output
python -m pytest tests/functional

@dataders
Copy link
Contributor

my hunch is that self.redshift_types needs to be explicitly cast to string, i.e. with str()

_logger.debug("connection.redshift_types=%s", str(self.redshift_types))

@soksamnanglim
Copy link
Contributor Author

The following is a paraphrase from Brooke:
redshift-connector doesn't have actually have a dependency on logbook (we use python std logging). Maybe this type of log statement is incompatible with logbook and causes this error? It seems logbook is a dependency indev-requirements.txt, is there a logbook setting we could potentially change to fix this?

Rs connector tested logging, however we didn't do any compatibility testing with logbook.

@mikealfare
Copy link
Contributor

Agreed that this is not from pytest-logbook, and that pytest-logbook is used in dbt-redshift and not redshift-connector. However, the error appears to be thrown by the line that @dataders identified.

It looks like the arguments in .debug(msg, *args, **kwargs) need to be "stringifyable" as they get passed into msg.format(*args, **kwargs): https://docs.python.org/3/library/logging.html#logging.Logger.debug. Do we know that the dictionary self.redshift_types has keys (RedshiftOID.*) and values (Tuple[{FC_BINARY, FC_TEXT},redshift_connector.utils.*]) that have str defined?

@dataders
Copy link
Contributor

@soksamnanglim can you get this PR's tests running with the changes that you said you had made? The tox output currently shows the same log message error that IIRC, I don't understand

@soksamnanglim soksamnanglim changed the title Update redshift-connector requirement from 2.0.913 to 2.0.914 Update redshift-connector requirement from 2.0.913 to 2.0.915 Oct 16, 2023
@soksamnanglim
Copy link
Contributor Author

Hi dbt team!

The python driver release came out today, so I updated this CR to bump the version to 2.0.915.
All integration tests pass except for this one:
TestRedshiftMaterializedViewsBasic.test_table_replaces_materialized_view

@dataders
Copy link
Contributor

yo @soksamnanglim -- thanks for the follow-up! The end is in sight.

Two questions:

  1. Did you adjust something to have the redshift_connector.core DEBUG logs added? I don't remember seeing them before
  2. Are you seeing a local failure of TestRedshiftMaterializedViewsBasic.test_table_replaces_materialized_view as well?

@soksamnanglim
Copy link
Contributor Author

  1. The 2.0.915 release only included the logging fix and lints to the codebase. Perhaps the logging improvements in 2.0.914 are responsible for the debug logs.
  2. Yes, however, the entire test suite for TestRedshiftMaterializedViewsBasic and TestRedshiftMaterializedViewChangesApply/Continue/Fail fail locally.

@soksamnanglim
Copy link
Contributor Author

Update! I ran integration tests with py38 locally and all the materialized view tests pass with version 2.0.915.
dbt-redshift commit used was 4213f7d9676d8aefa819bd8b393d40743699819c

Hopefully we can get this resolved soon 🙂

@colin-rogers-dbt colin-rogers-dbt merged commit fe30bb1 into dbt-labs:main Oct 23, 2023
colin-rogers-dbt pushed a commit that referenced this pull request Oct 25, 2023
* Update redshift-connector requirement form 2.0.913 to 2.0.914

* Ran changie to update changelog

* Update redshift-connector requirement from 2.0.913 to 2.0.915

---------

Co-authored-by: Anders <anders.swanson@dbtlabs.com>
Co-authored-by: colin-rogers-dbt <111200756+colin-rogers-dbt@users.noreply.github.com>
(cherry picked from commit fe30bb1)
@colin-rogers-dbt colin-rogers-dbt mentioned this pull request Oct 25, 2023
4 tasks
colin-rogers-dbt pushed a commit that referenced this pull request Oct 25, 2023
* Update redshift-connector requirement form 2.0.913 to 2.0.914

* Ran changie to update changelog

* Update redshift-connector requirement from 2.0.913 to 2.0.915

---------

Co-authored-by: Anders <anders.swanson@dbtlabs.com>
Co-authored-by: colin-rogers-dbt <111200756+colin-rogers-dbt@users.noreply.github.com>
(cherry picked from commit fe30bb1)
@colin-rogers-dbt colin-rogers-dbt mentioned this pull request Oct 25, 2023
4 tasks
colin-rogers-dbt added a commit that referenced this pull request Oct 27, 2023
…640)

* Update redshift-connector requirement form 2.0.913 to 2.0.914

* Ran changie to update changelog

* Update redshift-connector requirement from 2.0.913 to 2.0.915

---------

Co-authored-by: Anders <anders.swanson@dbtlabs.com>
Co-authored-by: colin-rogers-dbt <111200756+colin-rogers-dbt@users.noreply.github.com>
(cherry picked from commit fe30bb1)

Co-authored-by: soksamnanglim <68088894+soksamnanglim@users.noreply.github.com>
mikealfare added a commit that referenced this pull request Oct 27, 2023
…639)

* Update redshift-connector requirement form 2.0.913 to 2.0.914

* Ran changie to update changelog

* Update redshift-connector requirement from 2.0.913 to 2.0.915

---------

Co-authored-by: Anders <anders.swanson@dbtlabs.com>
Co-authored-by: colin-rogers-dbt <111200756+colin-rogers-dbt@users.noreply.github.com>
(cherry picked from commit fe30bb1)

Co-authored-by: soksamnanglim <68088894+soksamnanglim@users.noreply.github.com>
Co-authored-by: Mike Alfare <13974384+mikealfare@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment