[Bug]: "Can not create a folder inside a [SOURCE]" when writing table into object storage #195

Closed
1 task done
mxmarg opened this issue Jul 7, 2023 · 6 comments · Fixed by #219
Labels
bug Something isn't working

Comments

@mxmarg
Contributor

mxmarg commented Jul 7, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

When attempting to create a materialized table in a folder that does not yet exist, dbt-dremio throws the following error message:
dbt.adapters.dremio.api.rest.error.DremioBadRequestException: Bad request:: (400 Client Error: Bad Request for url: http://<DREMIO_ENDPOINT>/api/v3/catalog): ({"errorMessage":"Can not create a folder inside a [SOURCE].","moreInfo":""})

The current workaround is to create the data source folder outside of dbt.

Expected Behavior

Since Dremio is able to write Iceberg tables while creating new directories, this error should not occur.
Likely, Dremio is using the catalog REST API method to create a folder in the Semantic Layer, which throws an error, since it is not possible to create Semantic Layer folders on an object storage in Dremio.

Steps To Reproduce

{{ config(
    materialized="table",
    database="DEBUG_SPACE",
    schema="Debug",
    object_storage_source="source_name",
    object_storage_path="container_name.NOT_YET_EXISTING_FOLDER"
) }}
SELECT 1

Environment

- OS: macOS
- dbt-dremio: 1.5.0
- Dremio Software: 24.1.1
- Dremio Cloud: N/A

Relevant log output

ddl_dbt_model git:(main) ✗ dbt run -t sandbox -s models/Debug
10:59:54  Running with dbt=1.5.2
10:59:54  Registered adapter: dremio=1.5.0
10:59:54  Found 151 models, 5 tests, 0 snapshots, 0 analyses, 349 macros, 0 operations, 0 seed files, 38 sources, 0 exposures, 0 metrics, 0 groups
10:59:54  
10:59:56  
10:59:56  Finished running  in 0 hours 0 minutes and 1.59 seconds (1.59s).
10:59:56  Encountered an error:
Bad request:: (400 Client Error: Bad Request for url: http://<DREMIO_ENDPOINT>/api/v3/catalog): ({"errorMessage":"Can not create a folder inside a [SOURCE].","moreInfo":""})
10:59:56  Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/cli/requires.py", line 86, in wrapper
    result, success = func(*args, **kwargs)
                      ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/cli/requires.py", line 71, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/cli/requires.py", line 142, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/cli/requires.py", line 168, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/cli/requires.py", line 215, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/cli/requires.py", line 250, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/cli/main.py", line 565, in run
    results = task.run()
              ^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/task/runnable.py", line 443, in run
    result = self.execute_with_hooks(selected_uids)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/task/runnable.py", line 405, in execute_with_hooks
    self.before_run(adapter, selected_uids)
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/task/run.py", line 446, in before_run
    self.create_schemas(adapter, required_schemas)
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/task/runnable.py", line 562, in create_schemas
    create_future.result()
  File "/opt/homebrew/Cellar/python@3.11/3.11.3/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.3/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/opt/homebrew/Cellar/python@3.11/3.11.3/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/utils.py", line 464, in connected
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/task/runnable.py", line 526, in create_schema
    adapter.create_schema(relation)
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/adapters/dremio/impl.py", line 66, in create_schema
    self.connections.create_catalog(database, schema)
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/adapters/dremio/connections.py", line 220, in create_catalog
    self._create_folders(database, schema, api_parameters)
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/adapters/dremio/connections.py", line 244, in _create_folders
    create_catalog_api(api_parameters, folder_json)
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/adapters/dremio/api/rest/endpoints.py", line 216, in create_catalog_api
    return _post(
           ^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/adapters/dremio/api/rest/endpoints.py", line 65, in _post
    return _check_error(response, details)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/dbt/adapters/dremio/api/rest/endpoints.py", line 118, in _check_error
    raise DremioBadRequestException("Bad request:" + details, error, response)
dbt.adapters.dremio.api.rest.error.DremioBadRequestException: Bad request:: (400 Client Error: Bad Request for url: http://<DREMIO_ENDPOINT>/api/v3/catalog): ({"errorMessage":"Can not create a folder inside a [SOURCE].","moreInfo":""})
@mxmarg mxmarg added the bug Something isn't working label Jul 7, 2023
@donatobarone

Interested as well, but if you don't mind, @mxmarg, I have a question. I have been trying to use the materialized=table configuration, and the table is created in Dremio as expected, in the location where it is supposed to be, but the dbt run command never finishes; it seems to hang. Have you ever experienced this?

@mxmarg
Contributor Author

mxmarg commented Jul 7, 2023

I have not experienced this particular behaviour, no. Is it reproducible? If yes, you could probably open a separate bug.
The issue I described occurs during preparation, probably because dbt tries to run the following API call against a data source path, which is not allowed: https://docs.dremio.com/software/api/catalog/folder/#creating-a-folder
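
For reference, the request body behind that API call can be sketched as follows. This is a minimal illustration, assuming the folder payload shape from the linked Catalog API documentation (`entityType` and `path`) and assuming the adapter maps `object_storage_source` and `object_storage_path` onto the path; the helper name `make_folder_payload` is hypothetical and not part of dbt-dremio.

```python
import json


def make_folder_payload(path_parts):
    """Build the JSON body for a folder-creation POST to /api/v3/catalog.

    Hypothetical helper, for illustration only.
    """
    return {"entityType": "folder", "path": list(path_parts)}


# Path taken from the repro config: a folder below an object storage source.
payload = make_folder_payload(
    ["source_name", "container_name", "NOT_YET_EXISTING_FOLDER"]
)
print(json.dumps(payload))
```

Because the first path element is a source rather than a space, this is the kind of request that Dremio answers with the 400 error shown in the log.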

@Conq1

Conq1 commented Jul 9, 2023

Interested as well, but if you don't mind, @mxmarg, I have a question. I have been trying to use the materialized=table configuration, and the table is created in Dremio as expected, in the location where it is supposed to be, but the dbt run command never finishes; it seems to hang. Have you ever experienced this?

I hope someone can clean up these threads, but to me it sounds like your issue is this one: #176
Basically, it retrieves the table you just created, which makes it look like it hangs.

@fabrice-etanchaud

fabrice-etanchaud commented Jul 13, 2023

Hi all, yes, the adapter should not try to create schemas in writable sources, as Dremio will do it itself.
The problem is that Dremio segregates tables and views, but for dbt they all live in common schemas.
All non-existing dbt 'schemas' will be created beforehand at dbt start. I am not sure the adapter has a way to tell dbt not to issue a create_schema for missing source directories. Would it be possible to override the dbt code in the adapter in order to issue schema creation only for views?

There already seem to be some safeguards in the code:

def create_catalog(self, database, schema):
    thread_connection = self.get_thread_connection()
    connection = self.open(thread_connection)
    credentials = connection.credentials
    api_parameters = connection.handle.get_parameters()


    if database == ("@" + credentials.UID):
        logger.debug("Database is default: creating folders only")
    else:
        self._create_space(database, api_parameters)
    if database != credentials.datalake:
        self._create_folders(database, schema, api_parameters)
    return

But at that stage one cannot guess if the database is a writable source or a space.

Maybe the simplest way is to ignore this exception in the _create_space and _create_folders routines?

@mxmarg
Contributor Author

mxmarg commented Feb 1, 2024

After testing the new dbt-dremio release 1.5.1, I was no longer able to reproduce this error.

@mxmarg mxmarg closed this as completed Feb 1, 2024
@mxmarg
Contributor Author

mxmarg commented Feb 2, 2024

Sorry for the confusion, but I must have missed something in the repro, as the bug is still valid.

As per @fabrice-etanchaud's suggestion, the command works if we add additional error handling like the following:

def _create_folders(self, database, schema, api_parameters):
    temp_path_list = [database]
    for folder in schema.split("."):
        temp_path_list.append(folder)
        folder_json = self._make_new_folder_json(temp_path_list)
        try:
            create_catalog_api(api_parameters, folder_json)
        except DremioAlreadyExistsException:
            logger.debug(f"Folder {folder} already exists.")
        except DremioBadRequestException as e:
            # Dremio creates missing directories in object storage itself,
            # so this specific rejection can be ignored.
            if "Can not create a folder inside a [SOURCE]" in e.message:
                logger.debug(f"Ignoring {e}")
            else:
                # Re-raise unrelated bad requests unchanged.
                raise
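
To try the idea outside of dbt, here is a self-contained sketch of the same error-handling pattern. The exception classes are stand-ins for the adapter's classes of the same name (including the assumed `message` attribute), and `create_folders`, `fake_source_api`, and the `create_folder` callable are hypothetical names used only for this illustration.

```python
import logging

logger = logging.getLogger(__name__)


class DremioAlreadyExistsException(Exception):
    """Stand-in for the adapter's exception of the same name."""


class DremioBadRequestException(Exception):
    """Stand-in for the adapter's exception; `message` is assumed here."""

    def __init__(self, message):
        super().__init__(message)
        self.message = message


SOURCE_FOLDER_ERROR = "Can not create a folder inside a [SOURCE]"


def create_folders(database, schema, create_folder):
    """Create each folder level, tolerating the two expected failures.

    `create_folder` is a hypothetical callable standing in for the REST call.
    Returns the list of paths that were actually created.
    """
    created = []
    path = [database]
    for folder in schema.split("."):
        path.append(folder)
        try:
            create_folder(list(path))
            created.append(list(path))
        except DremioAlreadyExistsException:
            logger.debug("Folder %s already exists.", folder)
        except DremioBadRequestException as e:
            if SOURCE_FOLDER_ERROR not in e.message:
                raise  # unrelated bad request: keep the original traceback
            logger.debug("Ignoring %s", e)
    return created


def fake_source_api(path):
    """Simulate Dremio rejecting folder creation inside a source."""
    raise DremioBadRequestException(SOURCE_FOLDER_ERROR + ".")


result = create_folders("source_name", "container_name.new_folder", fake_source_api)
print(result)
```

With the simulated source, every folder request is rejected and ignored, so nothing is created and no exception escapes; any other bad request would still propagate.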

@mxmarg mxmarg reopened this Feb 2, 2024
ravjotbrar added a commit that referenced this issue Mar 5, 2024
… table (#219)

### Summary

See linked issue

### Description

The existing code logic verifies whether the database property matches
the object storage source. If it does, the code should avoid attempting
schema creation via the REST API. However, when the object storage
source is provided via a configuration block within a model, this
condition fails because it only reads the object_storage_source value
from the profiles.yml file. By modifying the condition to check the
materialization type, the logic becomes more robust. This change ensures
that schema creation is only attempted when the relation is a view.
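
The condition described above can be pictured roughly as follows. This is a sketch and not the code merged in #219; the function name `should_create_schema`, the `relation_type` argument, and the string values are assumptions made for illustration.

```python
def should_create_schema(relation_type: str) -> bool:
    """Return True only for views.

    Views live in spaces, whose folders are created through the REST API.
    Tables are written to object storage, where Dremio creates missing
    directories itself, so no schema creation is attempted for them.
    """
    return relation_type == "view"


for relation_type in ("view", "table"):
    if should_create_schema(relation_type):
        action = "create schema via REST"
    else:
        action = "skip schema creation"
    print(f"{relation_type}: {action}")
```

Keying the decision on the relation type avoids depending on where `object_storage_source` was configured, which is the failure mode the description mentions.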

### Test Results

Ran all tests

### Changelog

-   [x] Added a summary of what this PR accomplishes to CHANGELOG.md

### Related Issue

#195