Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise DbtProfileError on invalid default credential #40

Merged
merged 4 commits into from
Oct 22, 2021
Merged

Conversation

jeremyyeo
Copy link
Contributor

@jeremyyeo jeremyyeo commented Oct 14, 2021

Resolves #27

Description

Resolves the issue of dbt unable to install packages if a bigquery profile is invalid as per the issue linked.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-bigquery next" section.

@jeremyyeo jeremyyeo requested a review from jtcohen6 October 14, 2021 08:33
@cla-bot cla-bot bot added the cla:yes label Oct 14, 2021
@jeremyyeo
Copy link
Contributor Author

Hey @jtcohen6, thanks for the pointers on the issue. Have a couple of beginner questions if you don't mind :P


Logging stuff

Trying to understand logging things to stdout a bit more and using logger.info() as you suggested in a Slack thread. If I were to replace my changes in this PR here with a bunch of logger.info()'s:

logger.info(">>>>> OUTSIDE")
@lru_cache()
def get_bigquery_defaults(scopes=None) -> Tuple[Any, Optional[str]]:
    """
    Returns (credentials, project_id)

    project_id is returned available from the environment; otherwise None
    """
    # Cached, because the underlying implementation shells out, taking ~1s
    logger.info(">>>>> INSIDE")
    try:
        logger.info(">>>>> INNER")
        credentials, _ = google.auth.default(scopes=scopes)
        return credentials, _
    except google.auth.exceptions.DefaultCredentialsError as e:
        raise DbtProfileError(INVALID_PROFILE_MESSAGE.format(error_string=e))

On an invalid bq project in profiles.yml:

Running with dbt=1.0.0-b1
>>>>> OUTSIDE
>>>>> INSIDE
>>>>> INNER
No profile "bigquery" found, continuing with no target
Installing dbt-labs/dbt_utils@0.7.3
  Installed from version 0.7.3
  Up to date!

But on a valid project, the output is as follows where I don't see the other logger.info's:

Running with dbt=1.0.0-b1
>>>>> OUTSIDE
Installing dbt-labs/dbt_utils@0.7.3
  Installed from version 0.7.3
  Up to date!

Guessing it has nothing to do with where I placed logger.info but something else... basically was trying to print some other objects with logger.info(credentials), logger.info(_) but no luck in getting them to print.


Adding tests

How would I go about adding a test to this? I can see an oauth-no-project mock profile:

'oauth-no-project': {

And a couple of dbt_run functions in various files:

def run_dbt(self, args=None, expect_pass=True, profiles_dir=True):

Should I copy test_debug.py to a test_deps.py and replace 'debug' with 'deps':

And that's about what I grokked... (tbh, I don't really know how to run the bq tests locally 🤦 ).


Thanks again @jtcohen6 🙌

@jeremyyeo
Copy link
Contributor Author

Added a deps test but almost certain it doesn't actually test what we want it to test...

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeremyyeo Thanks for the contribution! This change looks good to me.

It's going to be tricky to test this in CI, since we don't use gcloud-based connection methods for any integration tests. The ideal would be a unit test that ensures
get_bigquery_defaults will return either successful credentials (as a tuple), or raise a DbtProfileError... but I think we may have to rely on local testing for this one.

Speaking of which, have you been able to actually reproduce the error reported in #27, when running dbt deps without this change?

I just tried (using dbt-bigquery==0.21.0, after running gcloud config unset project), and it looks to me like dbt deps succeeds while dbt run fails with a different version of the error:

ERROR: Database Error
  Project was not passed and could not be determined from the environment.

In any case, it is documented that google.auth.default will return DefaultCredentialsError if no credentials are found, or if the default credentials are invalid—so I think catching and wrapping that exception as a DbtProfileError is technically correct and an uncontroversial improvement over the status quo.

@jeremyyeo
Copy link
Contributor Author

Note I didn't use the gcloud utility for the following.

On dbt==0.21.0 dbt-bigquery==0.21.0, I found that dbt deps installs packages with:

(A) A nonsense project config:

# ~/.dbt/profiles.yml
bigquery:
  outputs:
    dev:
      type: bigquery
      method: service-account
      project: anonsensicalprojectthatdoesntexist
      dataset: sandbox_dataset
      threads: 1
      keyfile: /path_to_my_keyfile/gcp_service_keyfile.json
      timeout_seconds: 300
      priority: interactive
      retries: 1
  target: dev

Screen Shot 2021-10-19 at 9 55 22 PM

dbt run throws an expected:

Runtime Error
  404 GET https://bigquery.googleapis.com/bigquery/v2/projects/anonsensicalprojectthatdoesntexist/datasets?maxResults=10000&prettyPrint=false: Not found: Project anonsensicalprojectthatdoesntexist

(B) Setting a target that doesn't exists:

dbt deps -t null

Screen Shot 2021-10-19 at 9 55 57 PM

dbt run throws an expected:

Encountered an error while reading profiles:
  ERROR Runtime Error
  The profile 'bigquery' does not have a target named 'null'. The valid target names for this profile are:
   - dev

But not with:

(C) Commenting out the project config or just having the config not exists:

# ~/.dbt/profiles.yml
bigquery:
  outputs:
    dev:
      type: bigquery
      method: service-account
      dataset: sandbox_dataset
      threads: 1
      keyfile: /path_to_my_keyfile/gcp_service_keyfile.json
      timeout_seconds: 300
      priority: interactive
      retries: 1
  target: dev

Screen Shot 2021-10-19 at 9 59 15 PM

dbt run throws similar message.


Pretty strange... or maybe expected... perhaps in parsing the profiles.yml we should error if a key (project) does not exists?

Should we still merge this (after removing my non test 😂 ) to handle case (C) and as you pointed out that it is documented that DefaultCredentialsError is raised and then in another PR do the key checking?

@jtcohen6
Copy link
Contributor

Pretty strange... or maybe expected... perhaps in parsing the profiles.yml we should error if a key (project) does not exists?

Actually expected! This was a feature introduced in v0.19 that allows you to fall back to your "default project," configured via gcloud, if you don't explicitly specify project in profiles.yml (dbt-labs/dbt-core#2908). Of course, if you don't specify project and don't have a valid default, we raise an error... but if you're just running dbt deps, which doesn't require a valid profile, we shouldn't!

Hence, issue #27.

Should we still merge this (after removing my non test 😂 ) to handle case (C) and as you pointed out that it is documented that DefaultCredentialsError is raised and then in another PR do the key checking?

Yes! & no need for another PR to do the key checking, given the background above.

Can you confirm that, with the setup in case (C) and using the changes in this PR, you're able to run dbt deps locally without error?

@jeremyyeo
Copy link
Contributor Author

Yup can confirm they work:

test

Thanks :)

@jeremyyeo jeremyyeo marked this pull request as ready for review October 22, 2021 04:22
@jtcohen6
Copy link
Contributor

@jeremyyeo Love it!

I'm going to rebase this branch against main to get test fixes, add a changelog entry, and then get this merged in :)

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the contribution, JerYeo! :)

@jtcohen6 jtcohen6 merged commit 76f658f into main Oct 22, 2021
@jtcohen6 jtcohen6 deleted the fix/issue-27 branch October 22, 2021 08:41
siephen pushed a commit to AgencyPMG/dbt-bigquery that referenced this pull request May 16, 2022
* raise profile error

* add deps test

* remove test

* Add changelog entry

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dbt deps on BigQuery requires either gcloud auth or "project" set
2 participants