Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to import duckdb file #3

Open
arjunanandkumar opened this issue Feb 10, 2025 · 11 comments
Open

Unable to import duckdb file #3

arjunanandkumar opened this issue Feb 10, 2025 · 11 comments

Comments

@arjunanandkumar
Copy link

Environment

Docker

How did you encounter the bug?

I think this might be down to my lack of experience with duckdb, but I am trying to import a .duckdb database that contains some sensitive data from a pipeline. I can query the dataset fine via CLI, but I cannot import it:

Image

If I give the table a name something like "orders", it says successful in the import menu, but I see this:

Image

Via the CLI, I can query just fine, for example:

 select * from noos_pipeline_dataset.orders limit 10;

What am I doing incorrectly?

What did you expect?

I expected that the database could be imported.

Actual Result

Fails to import the database.

Browser

Chrome

Browser Version

132.0.6834.160

Version

latest

@caioricciuti
Copy link
Owner

Got it you are not doing anything wrong, the thing is since it's a duckdb file it will load the hole database... have you try not setting a table name? Just upload the db and look on the data explorer?

@arjunanandkumar
Copy link
Author

I forgot to mention that I tried that earlier, and I got this:

Image

@caioricciuti
Copy link
Owner

Ahh ok, so... I've tried here...

Image

so I used test (I'll fix that on the next release, when it's .duckdb file it should use the file name and that's it).

For me it worked:
I see test on the data explorer...

Image

Is this file you are trying to use open? I mean I don't see any errors on my end. So if the database you are using is public I can try to see if it's something Duck-UI is failing or if it's the database that's corrupted.

@arjunanandkumar
Copy link
Author

arjunanandkumar commented Feb 10, 2025

No unfortunately it is not, it contains some sensitive details, which is why I didn't include it here. But I can share a bit:

CREATE TABLE noos_pipeline_dataset.orders(id BIGINT, master_customer_id BIGINT, "type" VARCHAR, address_type VARCHAR, address_id VARCHAR, address__street VARCHAR, address__house_number BIGINT, address__floor VARCHAR, address__door VARCHAR, address__postcode VARCHAR, address__city VARCHAR, billing_system VARCHAR, source VARCHAR, access_form VARCHAR, customer__first_name VARCHAR, customer__last_name VARCHAR, customer__email VARCHAR, customer__cpr_number VARCHAR, customer__phone_number VARCHAR, customer__full_name VARCHAR, status VARCHAR, is_cancelled BOOLEAN, vendor VARCHAR, vendor_id VARCHAR, is_test BOOLEAN, sales_id VARCHAR, credit_classification VARCHAR, requested_start_date TIMESTAMP WITH TIME ZONE, completed_at TIMESTAMP WITH TIME ZONE, created_at TIMESTAMP WITH TIME ZONE, updated_at TIMESTAMP WITH TIME ZONE, _dlt_load_id VARCHAR NOT NULL, _dlt_id VARCHAR NOT NULL, address__building_letter VARCHAR, account_id__billing_system VARCHAR, account_id__code VARCHAR, delivery_address_id VARCHAR, delivery_address__city VARCHAR, delivery_address__door VARCHAR, delivery_address__floor VARCHAR, delivery_address__street VARCHAR, delivery_address__postcode VARCHAR, delivery_address__house_number BIGINT, delivery_address__building_letter VARCHAR, available_at TIMESTAMP WITH TIME ZONE, batch_import_id BIGINT, batch_import_ref VARCHAR, assignee_id BIGINT);

I think I somewhat understand what the issue is, because if i just do: duckdb noos_pipeline.duckdb, and do .tables, I see:

Image

But I can't directly select from orders, I need to give it a fully qualified name:

Image

I think it's just a question of being allowed to use . in the table name.

@caioricciuti
Copy link
Owner

Ahh, I don't quite totally get it but it's something the us in duckUI can do something?

@arjunanandkumar
Copy link
Author

Would it be possible for you to allow having dots in the table name? Then I can just enter noos_pipeline_dataset.orders as the table name.

@caioricciuti
Copy link
Owner

the issue if I allow . is that when it imports, the function breaks...

Image

as you can see here what it does is to attach the db to the wasm connection

if (fileType === "duckdb") {
              await connection.query(
                `ATTACH DATABASE '${fileName}' AS ${tableName}`
              );
              await get().fetchDatabasesAndTablesInfo();
              return;
            }

@arjunanandkumar
Copy link
Author

Now that I read the error again, I think the issue happens after the import - it does successfully attach the file, but then tries to read tables without the schema name:

Failed to load schema: Catalog Error: Table with name orders does not exist! Did you mean "noos_pipeline.noos_pipeline_dataset.orders"?

This is how the schemas look:

`
SELECT schema_name
FROM information_schema.schemata;

Gives:

information_schema │
│ main │
│ pg_catalog │
│ information_schema │
│ main │
│ noos_pipeline_dataset │
│ pg_catalog │
│ information_schema │
│ main │
│ pg_catalog │
│ information_schema │
│ main │
│ pg_catalog
`

And noos_pipeline_dataset is the schema where I have the tables.

@caioricciuti
Copy link
Owner

you are right! I'm looking into the code to see how we can manage this kind of use case...

@caioricciuti
Copy link
Owner

Hey @arjunanandkumar,

if you try to CREATE SCHEMA noos_pipeline_dataset; and then try to upload, it works?

@arjunanandkumar
Copy link
Author

Hey @arjunanandkumar,

if you try to CREATE SCHEMA noos_pipeline_dataset; and then try to upload, it works?

Hmm, I just tried this out and it is slightly better:

Failed to load schema: Catalog Error: Table with name orders does not exist! Did you mean "noos_pipeline.noos_pipeline_dataset.orders"?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants