Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: handle missing timestamp value for load data #11093

Merged
merged 7 commits into from
Jul 12, 2019

Conversation

imtbkcat
Copy link

@imtbkcat imtbkcat commented Jul 5, 2019

What problem does this PR solve?

fix #10504
LOAD DATA has different behavior with MySQL when handling missing timestamp/datetime value.

For example:
If we create a table:

create table t_default(id int, t timestamp null);
load data local infile "~/1.csv" into table t_default;
// 1.csv
12

The result for MySQL and TiDB are same:

mysql> select * from t_default;
+------+------+
| id   | t    |
+------+------+
|   12 | NULL |
+------+------+
1 row in set (0.00 sec)

But if t is NOT NULL, things are different.

create table t_default(id int, t timestamp not null default current_timestamp);
load data local infile "~/1.csv" into table t_default;

TiDB:

mysql> select * from t_default;
+------+---------------------+
| id   | t                   |
+------+---------------------+
|   12 | 0000-00-00 00:00:00 |
+------+---------------------+
1 row in set (0.00 sec)

MySQL:

mysql> select * from test.t_default;
+------+---------------------+
| id   | t                   |
+------+---------------------+
|   12 | 2019-07-05 13:27:05 |
+------+---------------------+
1 row in set (0.00 sec)

What is changed and how it works?

Add check for missing timestamp columns when insert load data.
If these columns have NOT NULL flag, set them to CURRENT_TIMESTAMP.

Check List

Tests

  • Unit test

Side effects

  • Increased code complexity

Related changes

  • Need to cherry-pick to the release branch

@codecov
Copy link

codecov bot commented Jul 5, 2019

Codecov Report

Merging #11093 into master will decrease coverage by 0.1194%.
The diff coverage is 20%.

@@               Coverage Diff               @@
##             master    #11093        +/-   ##
===============================================
- Coverage   81.1334%   81.014%   -0.1195%     
===============================================
  Files           419       419                
  Lines         90801     89324      -1477     
===============================================
- Hits          73670     72365      -1305     
+ Misses        11877     11720       -157     
+ Partials       5254      5239        -15

@codecov
Copy link

codecov bot commented Jul 5, 2019

Codecov Report

Merging #11093 into master will increase coverage by 0.0314%.
The diff coverage is 100%.

@@               Coverage Diff               @@
##             master    #11093        +/-   ##
===============================================
+ Coverage   81.2235%   81.255%   +0.0314%     
===============================================
  Files           423       423                
  Lines         90230     90056       -174     
===============================================
- Hits          73288     73175       -113     
+ Misses        11634     11579        -55     
+ Partials       5308      5302         -6

Copy link
Contributor

@crazycs520 crazycs520 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add unit test?

@imtbkcat
Copy link
Author

imtbkcat commented Jul 5, 2019

@crazycs520 test case will be added right now.

@imtbkcat
Copy link
Author

imtbkcat commented Jul 5, 2019

/run-all-tests

for i := 0; i < len(e.row); i++ {
if i >= len(cols) {
// If some columns is missing and their type is time and has not null flag, they should be set as current time.
if totalCols[i].Tp == mysql.TypeTimestamp || totalCols[i].Tp == mysql.TypeDate || totalCols[i].Tp == mysql.TypeDatetime {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to use IsTypeTime.

@imtbkcat
Copy link
Author

imtbkcat commented Jul 5, 2019

PTAL @crazycs520

Copy link
Member

@jackysp jackysp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@imtbkcat imtbkcat added component/DDL-need-LGT3 status/LGT1 Indicates that a PR has LGTM 1. and removed component/DDL-need-LGT3 labels Jul 8, 2019
@jackysp
Copy link
Member

jackysp commented Jul 8, 2019

PTAL @crazycs520

@imtbkcat
Copy link
Author

/run-all-tests

@lysu lysu requested a review from cfzjywxk July 12, 2019 06:54
Copy link
Contributor

@lysu lysu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lysu lysu removed the request for review from cfzjywxk July 12, 2019 06:56
@lysu lysu added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jul 12, 2019
@lysu lysu merged commit e0fc847 into pingcap:master Jul 12, 2019
imtbkcat pushed a commit to imtbkcat/tidb that referenced this pull request Jul 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/execution SIG execution status/LGT2 Indicates that a PR has LGTM 2. type/compatibility
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Load the imported data into the database, timestamp column, TiDB and MySQL behavior inconsistent
4 participants