executor: fix escape for select into outfile #19661

Merged: 22 commits into pingcap:master, Sep 9, 2020

ichn-hu
Contributor

@ichn-hu ichn-hu commented Sep 1, 2020

What problem does this PR solve?

Issue Number: close #19517

Problem Summary:

The SelectInto executor does not handle escaping when it writes field values.

What is changed and how it works?

Proposal: xxx

What's Changed:

Added escape handling, plus some more test cases.

How it Works:

Scan each field and add the escape character where it is needed.
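
A minimal sketch of that scan, for illustration only. The helper name `escapeField` and its parameters are hypothetical and are not taken from this PR; the set of bytes that need escaping is passed in by the caller rather than derived from the SELECT ... INTO OUTFILE options.

```go
package main

import (
	"bytes"
	"fmt"
)

// escapeField is a hypothetical helper, not TiDB's actual code.
// It copies field into a new buffer and writes esc before every byte
// that is listed in special, and before esc itself.
func escapeField(field []byte, esc byte, special []byte) []byte {
	var buf bytes.Buffer
	for _, b := range field {
		if b == esc || bytes.IndexByte(special, b) >= 0 {
			buf.WriteByte(esc)
		}
		buf.WriteByte(b)
	}
	return buf.Bytes()
}

func main() {
	// Escape the enclosing quote character, using backslash as the escape.
	out := escapeField([]byte(`d","e",`), '\\', []byte{'"'})
	fmt.Println(string(out)) // prints d\",\"e\",
}
```

The extra pass over every byte of every field is consistent with the "Consumes more CPU" side effect noted in the check list.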

Related changes

  • Needs to be cherry-picked to release-4.0

Check List

Tests

  • Unit test

Side effects

  • Performance regression
    • Consumes more CPU

Release note

  • N/A

@ichn-hu ichn-hu requested a review from a team as a code owner September 1, 2020 09:47
@ichn-hu ichn-hu requested review from lzmhhh123 and removed request for a team September 1, 2020 09:47
@github-actions github-actions bot added the sig/execution SIG execution label Sep 1, 2020
@ichn-hu
Contributor Author

ichn-hu commented Sep 1, 2020

/uncc @lzmhhh123

@ti-srebot ti-srebot removed the request for review from lzmhhh123 September 1, 2020 09:47
@ichn-hu
Contributor Author

ichn-hu commented Sep 1, 2020

/cc @qw4990 , @SunRunAway

@ichn-hu
Contributor Author

ichn-hu commented Sep 2, 2020

/run-all-tests

@ichn-hu
Contributor Author

ichn-hu commented Sep 2, 2020

/run-all-tests

@ichn-hu
Contributor Author

ichn-hu commented Sep 2, 2020

/run-unit-test

@qw4990 qw4990 added the type/bugfix This PR fixes a bug. label Sep 3, 2020
Contributor

@qw4990 qw4990 left a comment

LGTM

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 3, 2020
@SunRunAway SunRunAway requested review from lzmhhh123 and fzhedu and removed request for SunRunAway September 3, 2020 08:42
@breezewish
Member

What about other special characters like \r\n?

@ichn-hu
Contributor Author

ichn-hu commented Sep 4, 2020

What about other special characters like \r\n?

What do you mean? If they are used as the line terminator, they are appended to each line; if they appear inside a string, they will probably be enclosed (otherwise, we can't do anything).

Tested in MySQL:

mysql> insert into tx values ('\r\n', 123);
mysql> select * from tx into outfile '/tmp/test-r-n.txt' fields enclosed by '' lines terminated by '\r\n';

ichn-arch-pc# cat test-r-n.txt 
a       1
\N      \N
\
        123
ichn-arch-pc# hexdump -c test-r-n.txt 
0000000   a  \t   1  \r  \n   \   N  \t   \   N  \r  \n   \  \r  \n  \t
0000010   1   2   3  \r  \n                                            
0000015
ichn-arch-pc# cat test-r-n-enclosed.txt 
"a"     "1"
\N      \N
"\
"       "123"
ichn-arch-pc# hexdump -c test-r-n-enclosed.txt
0000000   "   a   "  \t   "   1   "  \r  \n   \   N  \t   \   N  \r  \n
0000010   "   \  \r  \n   "  \t   "   1   2   3   "  \r  \n            
000001d

TiDB:

mysql root@localhost:test> create table tx (a varchar(30), b int);
mysql root@localhost:test> insert into tx values ('\r\n', 123);
mysql root@localhost:test> select * from tx into outfile 'tidb-r-n.txt' fields enclosed by '' lines terminated by '\r\n';
mysql root@localhost:test> select * from tx into outfile 'tidb-r-n-enclosed.txt' fields enclosed by '"' lines terminated by '\r\n';


╭─ichn@ichn-arch-pc ~/go/src/github.com/pingcap/tidb ‹fix-select-into*› 
╰─$ cat tidb-r-n.txt

        123
╭─ichn@ichn-arch-pc ~/go/src/github.com/pingcap/tidb ‹fix-select-into*› 
╰─$ hexdump -c tidb-r-n.txt
0000000  \r  \n  \t   1   2   3  \r  \n                                
0000008
╭─ichn@ichn-arch-pc ~/go/src/github.com/pingcap/tidb ‹fix-select-into*› 
╰─$ cat tidb-r-n-enclosed.txt 
"
"       "123"
╭─ichn@ichn-arch-pc ~/go/src/github.com/pingcap/tidb ‹fix-select-into*› 
╰─$ hexdump -c tidb-r-n-enclosed.txt 
0000000   "  \r  \n   "  \t   "   1   2   3   "  \r  \n                
000000c

@ichn-hu
Contributor Author

ichn-hu commented Sep 4, 2020

/cc @breeswish

@breezewish
Member

@ichn-hu Thank you for pointing that out! Could you further verify that the results of the following statements are identical in MySQL and TiDB?

CREATE TABLE `tx` (
  `a` varbinary(20) DEFAULT NULL
);

insert into tx values ('d","e",');
insert into tx values (unhex("00"));
insert into tx values ("\r\n\b\Z\t");
insert into tx values (null);

select * from tx into outfile '~/1.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';

@ichn-hu
Contributor Author

ichn-hu commented Sep 8, 2020

/run-all-tests

@ichn-hu
Contributor Author

ichn-hu commented Sep 8, 2020

/run-all-tests

@ichn-hu
Contributor Author

ichn-hu commented Sep 8, 2020

/run-sqllogic-test-1

@ichn-hu
Contributor Author

ichn-hu commented Sep 8, 2020

/run-all-tests

@ichn-hu
Contributor Author

ichn-hu commented Sep 8, 2020

/run-unit-test

@ichn-hu
Contributor Author

ichn-hu commented Sep 8, 2020

/run-check-dev

@SunRunAway
Contributor

/run-all-tests

@ichn-hu
Contributor Author

ichn-hu commented Sep 8, 2020

/run-all-tests

@ichn-hu
Contributor Author

ichn-hu commented Sep 8, 2020

/run-unit-test

@ichn-hu
Contributor Author

ichn-hu commented Sep 8, 2020

/run-unit-test

@ichn-hu
Contributor Author

ichn-hu commented Sep 8, 2020

/run-check-dev

@ichn-hu
Contributor Author

ichn-hu commented Sep 8, 2020

/run-all-tests

@ichn-hu
Contributor Author

ichn-hu commented Sep 9, 2020

/run-all-tests

@ichn-hu
Contributor Author

ichn-hu commented Sep 9, 2020

/run-all-tests

@ichn-hu
Contributor Author

ichn-hu commented Sep 9, 2020

/run-all-tests

@SunRunAway SunRunAway merged commit 7b6a5cd into pingcap:master Sep 9, 2020
ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Sep 9, 2020
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Contributor

cherry pick to release-4.0 in PR #19905

Labels
component/expression sig/execution SIG execution sig/sql-infra SIG: SQL Infra status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/bugfix This PR fixes a bug.
Development

Successfully merging this pull request may close these issues.

SELECT INTO does not quote correctly
6 participants