POC for ORC chunked reader #14759

Closed
wants to merge 75 commits into from

Commits on Dec 18, 2023

  1. Add default constructor

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Dec 18, 2023
    451817f
  2. Change order of initialization

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Dec 18, 2023
    9bb8bea

Commits on Dec 21, 2023

  1. Implementing chunk reader

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Dec 21, 2023
    f39ec89
  2. 4f4563d
  3. Fix compile errors

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Dec 21, 2023
    18e88c4
  4. Tests pass

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Dec 21, 2023
    2d24847

Commits on Dec 30, 2023

  1. Minor changes

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Dec 30, 2023
    27fc3c1
  2. 40c7653

Commits on Dec 31, 2023

  1. Cleanup

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Dec 31, 2023
    1fe0590

Commits on Jan 1, 2024

  1. Rewrite function

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 1, 2024
    4e40847

Commits on Jan 2, 2024

  1. Remove (unused) chunking code

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 2, 2024
    451323f
  2. Update copyright year

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 2, 2024
    7153fcf
  3. Remove unused chunking code

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 2, 2024
    ad0ea2b
  4. Break dependency

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 2, 2024
    b7a98c3
  5. Reorder variables

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 2, 2024
    2f039d8
  6. Cleanup

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 2, 2024
    1df55a6
  7. Cleanup

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 2, 2024
    62afe56
  8. Change namespace

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 2, 2024
    3fb08f7
  9. Update copyright year

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 2, 2024
    46cb03d
  10. Update cpp/include/cudf/io/orc.hpp

    Co-authored-by: Vukasin Milovanovic <vmilovanovic@nvidia.com>
    ttnghia and vuule committed Jan 2, 2024
    e8d482c
  11. Remove redundant namespace import

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 2, 2024
    322ad6c
  12. Remove prefix namespace

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 2, 2024
    4212960
  13. 84e09c7
  14. Merge branch 'refactor_orc_namespace' into refactor_orc_reader

    # Conflicts:
    #	cpp/src/io/orc/reader_impl.cu
    #	cpp/src/io/orc/reader_impl.hpp
    ttnghia committed Jan 2, 2024
    05bc865
  15. Adopt changes

    ttnghia committed Jan 2, 2024
    173235b

Commits on Jan 3, 2024

  1. Merge branch 'branch-24.02' into refactor_orc_reader

    # Conflicts:
    #	cpp/src/io/functions.cpp
    #	cpp/src/io/orc/reader_impl.cu
    #	cpp/src/io/orc/reader_impl.hpp
    ttnghia committed Jan 3, 2024
    b891d32

Commits on Jan 7, 2024

  1. ea13634

Commits on Jan 9, 2024

  1. Remove namespace import

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 9, 2024
    0d971f1
  2. 087357b
  3. 68f7d2d

Commits on Jan 10, 2024

  1. Rename variable

    ttnghia committed Jan 10, 2024
    549bace
  2. Update docs

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 10, 2024
    b557546
  3. Return metadata even if there is no column

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 10, 2024
    1b14b96
  4. Rename function

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 10, 2024
    aa167b7
  5. 0f682cd
  6. Revert "Return metadata even if there is no column"

    This reverts commit 1b14b96.
    
    # Conflicts:
    #	cpp/src/io/orc/reader_impl.cu
    ttnghia committed Jan 10, 2024
    8e7075f
  7. 6013b9a
  8. Remove redundant declaration

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 10, 2024
    0f10df8
  9. Remove unused function

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 10, 2024
    56886d9

Commits on Jan 11, 2024

  1. Some more cleanup

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 11, 2024
    bf68f57
  2. 0a42078

Commits on Jan 12, 2024

  1. Revert "Remove (unused) chunking code"

    This reverts commit 451323f.
    
    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    
    # Conflicts:
    #	cpp/src/io/orc/reader_impl.cu
    #	cpp/src/io/orc/reader_impl.hpp
    #	cpp/src/io/orc/reader_impl_chunking.hpp
    ttnghia committed Jan 12, 2024
    470f86e
  2. Reorganize variables

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 12, 2024
    4801441
  3. Rename variable

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 12, 2024
    4bd3933

Commits on Jan 13, 2024

  1. Misc

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 13, 2024
    3cf47fb
  2. Remove unused variables

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 13, 2024
    acaf936
  3. 8d0b824
  4. 34a5f41
  5. Revert "Remove unused variables"

    This reverts commit acaf936.
    ttnghia committed Jan 13, 2024
    a552e8b

Commits on Jan 14, 2024

  1. Add chunked reader interface

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 14, 2024
    afa506b
  2. Fix copyright header

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 14, 2024
    4d0a435

Commits on Jan 18, 2024

  1. Fix comments

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 18, 2024
    843ae4b
  2. Implementing chunking code

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 18, 2024
    dcccaf4
  3. Merge branch 'branch-24.02' into orc_chunked_reader

    # Conflicts:
    #	cpp/CMakeLists.txt
    #	cpp/src/io/orc/reader_impl.cu
    #	cpp/src/io/orc/reader_impl.hpp
    #	cpp/src/io/orc/reader_impl_chunking.hpp
    #	cpp/src/io/orc/reader_impl_preprocess.cu
    ttnghia committed Jan 18, 2024
    f9ce987
  4. Add peak memory stat

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 18, 2024
    e834cf7
  5. Adding statistic variables

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 18, 2024
    f86b131

Commits on Jan 19, 2024

  1. Implementing intermediate variables

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 19, 2024
    3ef4c76
  2. Remove temporary variable

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 19, 2024
    d51258f
  3. Implementing temporary variables

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 19, 2024
    23de92e
  4. Finalize stripe size computation, except string size

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 19, 2024
    cbaa4c1
  5. Update header copyright year

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 19, 2024
    650ae56

Commits on Jan 20, 2024

  1. 8bdd287
  2. 6c542cf
  3. Add debug info, and fix vector init

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 20, 2024
    cb60236
  4. Applying row range

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 20, 2024
    5c62e0b
  5. Implement splits

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 20, 2024
    129b85a
  6. Fix sync

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 20, 2024
    b78836f
  7. Fix debug

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 20, 2024
    6054399

Commits on Jan 21, 2024

  1. Rename variable

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 21, 2024
    d3364d4
  2. Add chunk reader tests

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 21, 2024
    e6599c9

Commits on Jan 22, 2024

  1. Rename variable

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 22, 2024
    2acaa23
  2. Fix style

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 22, 2024
    83d2cc6
  3. e6e5b8f
  4. Fix parameter

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 22, 2024
    c61b530
  5. Output chunks as sliced of the table read from the entire file

    Signed-off-by: Nghia Truong <nghiat@nvidia.com>
    ttnghia committed Jan 22, 2024
    eeed482