Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ambiguous error message when type checking fails makes it almost impossible to figure out why an import failed, especially when using triggers. #3270

Closed
geweldon opened this issue Apr 21, 2022 · 36 comments

Comments

@geweldon
Copy link

I ran into an issue attempting to load a csv file, which unusually hadn't happened previously despite identical steps taken

After doing some investigation, it seems that new tables created with the below schema definition are not allowing me to import the attached csv. What is strange is that this previously worked fine, and I have located a particular branch on one of the test repos I had set up, where despite an identical schema the csv still imports.

Repo: geweldon/Source3
branch with error: main
branch that still somehow works: history5

//Schema
SBQQ__ProductOption__c @ working
CREATE TABLE SBQQ__ProductOption__c (
Id varchar(18),
OwnerId varchar(18),
SBQQ__AppliedImmediatelyContext__c varchar(255),
SBQQ__AppliedImmediately__c tinyint DEFAULT "0",
SBQQ__Bundled__c tinyint DEFAULT "0",
SBQQ__ComponentCodePosition__c decimal(6,0),
SBQQ__ComponentCode__c varchar(60),
SBQQ__ComponentDescriptionPosition__c decimal(6,0),
SBQQ__ComponentDescription__c varchar(255),
SBQQ__ConfiguredSKU__c varchar(18),
SBQQ__DefaultPricingTable__c varchar(255),
SBQQ__DiscountAmount__c decimal(14,2),
SBQQ__DiscountSchedule__c varchar(18),
SBQQ__Discount__c decimal(8,2),
SBQQ__DiscountedByPackage__c tinyint DEFAULT "0",
SBQQ__ExistingQuantity__c decimal(12,2),
SBQQ__Feature__c varchar(18),
SBQQ__MaxQuantity__c decimal(12,2),
SBQQ__MinQuantity__c decimal(12,2),
SBQQ__Number__c decimal(5,0),
SBQQ__OptionalSKU__c varchar(18),
SBQQ__QuantityEditable__c tinyint DEFAULT "0",
SBQQ__Quantity__c decimal(12,2),
SBQQ__QuoteLineVisibility__c varchar(255),
SBQQ__RenewalProductOption__c varchar(18),
SBQQ__Required__c tinyint DEFAULT "0",
SBQQ__Selected__c tinyint DEFAULT "0",
SBQQ__SubscriptionScope__c varchar(255),
SBQQ__System__c tinyint DEFAULT "0",
SBQQ__Type__c varchar(255) DEFAULT "Component",
SBQQ__UnitPrice__c decimal(14,2),
SBQQ__UpliftedByPackage__c tinyint DEFAULT "0",
CPQ_Custom_1__c tinyint DEFAULT "1",
CPQ_Custom_2__c tinyint DEFAULT "0",
Custom_Picklist__c varchar(255),
Example_Record__c tinyint DEFAULT "0",
Pre_Existing__c tinyint DEFAULT "0",
ProdRecordSeedExtID__c varchar(255),
ATGExtId__c varchar(255),
ATGSourceId__c varchar(255) NOT NULL,
PRIMARY KEY (ATGSourceId__c),
KEY Id (Id),
KEY OwnerId (OwnerId),
KEY SBQQ__ConfiguredSKU__c (SBQQ__ConfiguredSKU__c),
KEY SBQQ__DiscountSchedule__c (SBQQ__DiscountSchedule__c),
KEY SBQQ__Feature__c (SBQQ__Feature__c),
KEY SBQQ__OptionalSKU__c (SBQQ__OptionalSKU__c),
KEY SBQQ__RenewalProductOption__c (SBQQ__RenewalProductOption__c)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

partner_org___dep_train_3-backup-sbqq__productoption__c-20220413205903.csv

@geweldon
Copy link
Author

I should also note that the problem persists using dolthub's web import

@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

We'll get on this immediately. Attempting repro now.

@geweldon
Copy link
Author

Really appreciate it :)

@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

What's the exact dolt table import command you are trying to run?

@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

Is it this?

Source3 $ dolt table import -u SBQQ__ProductOption__c partner_org___dep_train_3-backup-sbqq__productoption__c-20220413205903.csv 
An error occurred moving data:

cause: branch not found

@geweldon
Copy link
Author

Yes

@geweldon
Copy link
Author

Same issue occurred in geweldon/ATGSource as well, though that one didn’t produce the branch error.

@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

OK. Repro:

dolt $ dolt clone geweldon/Source3
cloning https://doltremoteapi.dolthub.com/geweldon/Source3
364 of 364 chunks complete. 0 chunks being downloaded currently.
dolt $ cd Source3
Source3 $ wget https://github.com/dolthub/dolt/files/8533639/partner_org___dep_train_3-backup-sbqq__productoption__c-20220413205903.csv
Source3 $ dolt table import -u SBQQ__ProductOption__c partner_org___dep_train_3-backup-sbqq__productoption__c-20220413205903.csv 
An error occurred moving data:

cause: branch not found

@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

What error did that one make?

@geweldon
Copy link
Author

749[01:51 PM]root:ATGSource √$>dolt table import -u SBQQ__ProductOption__c partner_org___dep_train_3-backup-sbqq__productoption__c-20220413205903.csv

A bad row was encountered while moving data.
string is too large for column
These can be ignored using '--continue'

@geweldon
Copy link
Author

attempting to use the --continue option showed that it was running into that error on all the rows

@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

I'm concerned that the true/false values are not being cast to tinyints.

So this might be two bugs:

  1. Why branch not found?
  2. "true"/"false" strings not cast to tinyints on dolt table import

@geweldon
Copy link
Author

geweldon commented Apr 21, 2022

That makes sense.

With regards to the branch not found, I have a feeling that's probably a result of the triggers on the table. I had triggers that pulled from views that queried different branches. I'm going to guess that I deleted a branch and didn't clean up the query.

Things were getting a bit messy thus the fresh repo, ATGSource. Kind of taking an iterative approach to spinning things up.

@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

If you run this:

Source3 $ sed s/\"true\"/1/g partner_org___dep_train_3-backup-sbqq__productoption__c-20220413205903.csv | sed s/\"false\"/0/g > output.csv
Source3 $ dolt table import -u SBQQ__ProductOption__c output.csv 

Does that work?

Triggers to modify other branches! Mindblown!

@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

Or try mine:

output.csv

@geweldon
Copy link
Author

image

@geweldon
Copy link
Author

Sadly not....

Ehhhh, less modifying other branches with triggers, and more pulling values from other branches with triggers

@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

What about this?

output-2.csv

That gets rid of the empty strings as well.

@geweldon
Copy link
Author

506[02:44 PM]root:ATGSource √$>dolt table import -u SBQQ__ProductOption__c output-2.csv

A bad row was encountered while moving data.
string is too large for column
These can be ignored using '--continue'

@zachmu
Copy link
Member

zachmu commented Apr 21, 2022

Grant, the branch not found error we're seeing when we clone is a result of one of your table triggers using a view that looks like this:

Select main.ATGSourceId__c as SrcId, main.Id as prod_id, Staging.Id as staging_id, QA.Id as qa_id, Dev.Id as dev_id From (Select ATGSourceId__c, Id From SBQQ__CustomAction__c AS OF 'main') AS main Join (Select ATGSourceId__c, Id From SBQQ__CustomAction__c AS OF 'Staging') AS Staging ON main.ATGSourceId__c = Staging.ATGSourceId__c Join (Select ATGSourceId__c, Id From SBQQ__CustomAction__c AS OF 'QA') AS QA ON main.ATGSourceId__c = QA.ATGSourceId__c Join (Select ATGSourceId__c, Id From SBQQ__CustomAction__c AS OF 'Dev') AS Dev ON main.ATGSourceId__c = Dev.ATGSourceId__c

Those branches (Staging etc) don't exist on dolthub, and so not on our clones.

What error are you seeing in the original report?

Can you push your other branches to dolthub so we can debug this on our end better?

@geweldon
Copy link
Author

I'm attaching here the table export of the Source3 branch that is still somehow allowing table imports of the original csv file
hist5.csv
.

@geweldon
Copy link
Author

Yes, I'll update the repo with the branches.

@geweldon
Copy link
Author

image

@geweldon
Copy link
Author

The branches exist in dolthub. I had to manually check them out in the cloned repo for the error to resolve

@geweldon
Copy link
Author

image

@zachmu
Copy link
Member

zachmu commented Apr 21, 2022

I made the same mistake, forgot that remote branches aren't shown on clones by default.

We'll get that error message fixed up in the next release, it should tell you the column and the value that failed. In the meantime we'll figure out which value has the problem and let you know.

@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

@geweldon This is a really cool use of branches.

@geweldon
Copy link
Author

Well thank you :). For what it’s worth, I switched to using subqueries on the branches in the select statement rather than as table joins for the views. Resolved the issue of items only being viewable if they existed in each branch, and is syntactically cleaner

@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

@jcor11599 is working on this. He'll get you an update soon.

@geweldon
Copy link
Author

image

image

@geweldon
Copy link
Author

geweldon commented Apr 21, 2022

@timsehn @jcor11599 I think I found the issue. The issue wasn't with the csv itself, but with the values being replaced on insert with the triggers.

CREATE TRIGGER bfi_lookup6_SBQQ__ProductOption__c BEFORE INSERT ON SBQQ__ProductOption__c FOR EACH ROW SET NEW.SBQQ__Feature__c = IFNULL( (Select SrcId From uuid_SBQQ__ProductFeature__c Where NEW.SBQQ__Feature__c IN (prod_id, dev_id, qa_id, staging_id)), NEW.SBQQ__Feature__c );

CREATE TRIGGER bfu_lookup6_SBQQ__ProductOption__c BEFORE UPDATE ON SBQQ__ProductOption__c FOR EACH ROW SET NEW.SBQQ__Feature__c = IFNULL( (Select SrcId From uuid_SBQQ__ProductFeature__c Where NEW.SBQQ__Feature__c IN (prod_id, dev_id, qa_id, staging_id)), NEW.SBQQ__Feature__c );

@geweldon
Copy link
Author

SBQQ__Feature__c varchar(18), vs ATGSourceId__c varchar(255)'.

So the issue was my lookup mapping trigger was trying to put UUIDs into fields too small.

@geweldon
Copy link
Author

image

@timsehn timsehn changed the title Error importing a csv into table. Works in one branch but not another, despite identical table schema and csv file Ambiguous error message when type checking fails makes it almost impossible to figure out why an import failed, especially when using triggers. Apr 21, 2022
@timsehn
Copy link
Contributor

timsehn commented Apr 21, 2022

Glad you found the problem. We still need to update the error message to make this an easier process.

Your use of tiggers + branches has the whole office discussing possible use cases. We honestly never thought of this. Also, we think using triggers to modify other branches would break. Like, we think only one working set can be modified at a time. But we're not certain. Thanks for sharing.

@geweldon
Copy link
Author

I would have to agree with that. I did a lot of pondering on the flow here, and the only way I could see updates to other branches being sane is if there was a way to designate which branches can be externally updated and how. I think in general though it's much safer to keep the flow within a branch.

@jycor
Copy link
Contributor

jycor commented Apr 22, 2022

I have improved the error message you were getting to make future debugging easier.
These changes have been merged to main, and will be usable next release (happening later today).

@jycor jycor closed this as completed Apr 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants