Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug][Multi-Catalog] Filter temporary dir which is generated by Flink #33718

Open
2 of 3 tasks
wangyjx opened this issue Apr 16, 2024 · 0 comments
Open
2 of 3 tasks

[Bug][Multi-Catalog] Filter temporary dir which is generated by Flink #33718

wangyjx opened this issue Apr 16, 2024 · 0 comments

Comments

@wangyjx
Copy link

wangyjx commented Apr 16, 2024

Search before asking

  • I had searched in the issues and found no similar issues.

Version

2.0.4

What's Wrong?

looks like it is very similar to this issue : #26194
Error:

When query hive table, encounter error as following.
SQL 错误 [1105] [HY000]: errCode = 2, detailMessage = (10.xx.xx.xx)[CANCELLED][CANCELLED][INTERNAL_ERROR]failed to init reader for file hdfs://10.xx.xx.xx:9000/user/hive/warehouse/xxx/tableA/partition=xdyy/.staging_1704802668632/task-1/part-5a1d9e1d-f2a7-46e8-81ef-9ac3b5a9434a-task-1-file-0, err: [INTERNAL_ERROR]Init OrcReader failed. reason = File size too small

0#  doris::vectorized::OrcReader::_create_file_reader() at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
1#  doris::vectorized::OrcReader::init_reader(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const*, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::variant<doris::ColumnValueRange<(doris::PrimitiveType)3>, doris::ColumnValueRange<(doris::PrimitiveType)4>, doris::ColumnValueRange<(doris::PrimitiveType)5>, doris::ColumnValueRange<(doris::PrimitiveType)6>, doris::ColumnValueRange<(doris::PrimitiveType)7>, doris::ColumnValueRange<(doris::PrimitiveType)15>, doris::ColumnValueRange<(doris::PrimitiveType)10>, doris::ColumnValueRange<(doris::PrimitiveType)23>, doris::ColumnValueRange<(doris::PrimitiveType)11>, doris::ColumnValueRange<(doris::PrimitiveType)25>, doris::ColumnValueRange<(doris::PrimitiveType)12>, doris::ColumnValueRange<(doris::PrimitiveType)26>, doris::ColumnValueRange<(doris::PrimitiveType)20>, doris::ColumnValueRange<(doris::PrimitiveType)2>, doris::ColumnValueRange<(doris::PrimitiveType)19>, doris::ColumnValueRange<(doris::PrimitiveType)28>, doris::ColumnValueRange<(doris::PrimitiveType)29>, doris::ColumnValueRange<(doris::PrimitiveType)30> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::variant<doris::ColumnValueRange<(doris::PrimitiveType)3>, doris::ColumnValueRange<(doris::PrimitiveType)4>, doris::ColumnValueRange<(doris::PrimitiveType)5>, doris::ColumnValueRange<(doris::PrimitiveType)6>, doris::ColumnValueRange<(doris::PrimitiveType)7>, doris::ColumnValueRange<(doris::PrimitiveType)15>, doris::ColumnValueRange<(doris::PrimitiveType)10>, doris::ColumnValueRange<(doris::PrimitiveType)23>, doris::ColumnValueRange<(doris::PrimitiveType)11>, doris::ColumnValueRange<(doris::PrimitiveType)25>, doris::ColumnValueRange<(doris::PrimitiveType)12>, doris::ColumnValueRange<(doris::PrimitiveType)26>, doris::ColumnValueRange<(doris::PrimitiveType)20>, doris::ColumnValueRange<(doris::PrimitiveType)2>, doris::ColumnValueRange<(doris::PrimitiveType)19>, doris::ColumnValueRange<(doris::PrimitiveType)28>, doris::ColumnValueRange<(doris::PrimitiveType)29>, doris::ColumnValueRange<(doris::PrimitiveType)30> > > > >*, std::vector<std::shared_ptr<doris::vectorized::VExprContext>, std::allocator<std::shared_ptr<doris::vectorized::VExprContext> > > const&, bool, doris::TupleDescriptor const*, doris::RowDescriptor const*, std::vector<std::shared_ptr<doris::vectorized::VExprContext>, std::allocator<std::shared_ptr<doris::vectorized::VExprContext> > > const*, std::unordered_map<int, std::vector<std::shared_ptr<doris::vectorized::VExprContext>, std::allocator<std::shared_ptr<doris::vectorized::VExprContext> > >, std::hash<int>, std::equal_to<int>, std::allocator<std::pair<int const, std::vector<std::shared_ptr<doris::vectorized::VExprContext>, std::allocator<std::shared_ptr<doris::vectorized::VExprContext> > > > > > const*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:442
2#  doris::vectorized::VFileScanner::_get_next_reader() at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/scan/vfile_scanner.cpp:785
3#  doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:442
4#  doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/scan/vscanner.cpp:0
5#  doris::vectorized::ScannerScheduler::_scanner_scan(doris::vectorized::ScannerScheduler*, doris::vectorized::ScannerContext*, std::shared_ptr<doris::vectorized::VScanner>) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:354
6#  std::_Function_handler<void (), doris::vectorized::ScannerScheduler::_schedule_scanners(doris::vectorized::ScannerContext*)::$_1::operator()() const::{lambda()#4}>::_M_invoke(std::_Any_data const&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
7#  doris::WorkThreadPool<true>::work_thread(int) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/atomic_base.h:646
8#  execute_native_thread_routine at /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:85
9#  start_thread
10# __clone


0#  doris::vectorized::VFileScanner::_get_next_reader() at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
1#  doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:442
2#  doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /home/zcp/repo_center/doris_release/doris/be/src/vec/exec/scan/vscanner.cpp:0
3#  doris::vectorized::ScannerScheduler::_scanner_scan(doris::vectorized::ScannerScheduler*, doris::vectorized::ScannerContext*, std::shared_ptr<doris::vectorized::VScanner>) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:354
4#  std::_Function_handler<void (), doris::vectorized::ScannerScheduler::_schedule_scanners(doris::vectorized::ScannerContext*)::$_1::operator()() const::{lambda()#4}>::_M_invoke(std::_Any_data const&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
5#  doris::WorkThreadPool<true>::work_thread(int) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/atomic_base.h:646
6#  execute_native_thread_routine at /data/gcc-11.1.0/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/unique_ptr.h:85
7#  start_thread
8#  __clone

What You Expected?

Should be able to ignore the temporary directories and files.

How to Reproduce?

No response

Anything Else?

After some investigation, it seems only happened to tables of which contains ".staging-xxx" sub directories that were generated by Flink job.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@wangyjx wangyjx changed the title [Bug][Multi-Catalog] Filter _temporary dir which is generated by Flink [Bug][Multi-Catalog] Filter temporary dir which is generated by Flink Jun 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant