JSON tree traversal #4

Open
wants to merge 70 commits into
base: fea-json-tree-gpu

Conversation

karthikeyann
Owner

Creating this PR for reviewing PR rapidsai#11610 (JSON tree traversal).

karthikeyann and others added 12 commits September 19, 2022 15:12
This PR removes all excluded filename patterns from the `isort` configuration. We should run `isort` on all files, and if exclusions are needed, those should be handled with action comments like `# isort: skip` on a case-by-case basis (this is sometimes needed for `setup.py` to control import order with Cython / setuptools / etc.). See: https://pycqa.github.io/isort/docs/configuration/action_comments.html

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Matthew Roeschke (https://github.com/mroeschke)
  - H. Thomson Comer (https://github.com/thomcom)

URL: rapidsai#11680
Since libcudf doesn't keep track of StructDtype key names, round-tripping through outer_explode loses information. We know the correct dtype, since it is the element_type of the exploded list column, so attach that type metadata before handing back the return value.

Exploding a list series should be equivalent to unwrapping one level of list from the dtype, so that

    x = cudf.Series([[{'a': 'b'}]])
    x.explode().dtype == x.dtype.element_type

Previously this was not the case, since we would lose the names resulting in

    x.explode().dtype == StructDtype({'0': dtype('O')})

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Matthew Roeschke (https://github.com/mroeschke)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: rapidsai#11687
elstehle and others added 27 commits September 22, 2022 06:11
…pidsai#11682)

This PR adds the option to take an explicit nested schema, allowing users to specify the target data types of the leaf columns in the nested JSON reader. This PR adds the corresponding interface and implementation to libcudf.

In addition, the PR makes existing JSON reader tests parametrised tests and enables those tests for dual execution of (1) the existing JSON reader and (2) the new nested JSON reader.

Authors:
  - Elias Stehle (https://github.com/elstehle)
  - Vukasin Milovanovic (https://github.com/vuule)
  - Yunsong Wang (https://github.com/PointKernel)
  - Karthikeyan (https://github.com/karthikeyann)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - David Wendt (https://github.com/davidwendt)
  - Karthikeyan (https://github.com/karthikeyann)

URL: rapidsai#11682
Closes rapidsai#10941 

This PR refactors the CSV reader benchmarks with nvbench and reduces the number of test cases by isolating data type, IO type, column selection, and row selection.
 
Example output of the new benchmarks:
<details>
  <summary>Benchmark results</summary>
## csv_read_data_type

### [0] Quadro RTX 8000

| data_type | Samples |  CPU Time  | Noise |  GPU Time  | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|-----------|---------|------------|-------|------------|-------|------------------|-------------------|-------------------|
|  INTEGRAL |      5x |    1.140 s | 0.09% |    1.140 s | 0.09% |        235553841 |         1.202 GiB |       668.564 MiB |
|     FLOAT |      5x |    1.262 s | 0.04% |    1.262 s | 0.04% |        212718321 |         1.041 GiB |       713.885 MiB |
|   DECIMAL |      5x | 272.787 ms | 0.03% | 272.784 ms | 0.03% |        984060406 |       396.279 MiB |       167.951 MiB |
| TIMESTAMP |      7x |    1.681 s | 0.47% |    1.681 s | 0.47% |        159723724 |         2.281 GiB |       814.268 MiB |
|  DURATION |      7x |    2.121 s | 0.50% |    2.121 s | 0.50% |        126587514 |         2.588 GiB |       971.320 MiB |
|    STRING |     19x | 496.713 ms | 0.50% | 496.710 ms | 0.50% |        540426462 |       859.526 MiB |       277.082 MiB |

## csv_read_io

### [0] Quadro RTX 8000

|     io      | Samples | CPU Time | Noise | GPU Time | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|-------------|---------|----------|-------|----------|-------|------------------|-------------------|-------------------|
|    FILEPATH |      9x |  1.185 s | 0.49% |  1.185 s | 0.49% |        226466264 |         1.445 GiB |       618.876 MiB |
| HOST_BUFFER |      5x |  1.170 s | 0.14% |  1.170 s | 0.14% |        229459856 |         1.445 GiB |       618.876 MiB |

## csv_read_column_selection

### [0] Quadro RTX 8000

| column_selection | row_selection | Samples | CPU Time | Noise | GPU Time | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|------------------|---------------|---------|----------|-------|----------|-------|------------------|-------------------|-------------------|
|              ALL |           ALL |      5x |  1.246 s | 0.18% |  1.246 s | 0.18% |        215514992 |         1.582 GiB |       653.520 MiB |
|        ALTERNATE |           ALL |      5x |  1.128 s | 0.08% |  1.128 s | 0.08% |        119009844 |         1.116 GiB |       648.908 MiB |
|       FIRST_HALF |           ALL |      5x |  1.143 s | 0.07% |  1.143 s | 0.07% |        117443933 |         1.121 GiB |       653.520 MiB |
|      SECOND_HALF |           ALL |      5x |  1.152 s | 0.16% |  1.152 s | 0.16% |        116478469 |         1.121 GiB |       653.520 MiB |

## csv_read_row_selection

### [0] Quadro RTX 8000

| column_selection | row_selection | num_chunks | Samples | CPU Time | Noise | GPU Time | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|------------------|---------------|------------|---------|----------|-------|----------|-------|------------------|-------------------|-------------------|
|              ALL |    BYTE_RANGE |          1 |      5x |  1.244 s | 0.16% |  1.244 s | 0.16% |        215763257 |         1.582 GiB |       653.520 MiB |
|              ALL |    BYTE_RANGE |          8 |      5x |  1.170 s | 0.04% |  1.170 s | 0.04% |        229339594 |       202.596 MiB |       653.520 MiB |
|              ALL |         NROWS |          1 |      5x |  1.244 s | 0.12% |  1.244 s | 0.12% |        215808401 |         1.582 GiB |       653.520 MiB |
|              ALL |         NROWS |          8 |      4x |  4.560 s |  inf% |  4.560 s |  inf% |         58870122 |       320.771 MiB |       653.520 MiB |
|              ALL |    SKIPFOOTER |          1 |      5x |  1.245 s | 0.10% |  1.245 s | 0.10% |        215660012 |         1.582 GiB |       653.520 MiB |
|              ALL |    SKIPFOOTER |          8 |      3x |  7.443 s |  inf% |  7.443 s |  inf% |         36065528 |         1.269 GiB |       653.520 MiB |

</details>
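
The throughput column in the tables above can be cross-checked with a little arithmetic. The sketch below assumes `bytes_per_second` is the input data size divided by the measured GPU time (the commit message does not state this), and uses three rows copied from the `csv_read_data_type` table; the helper name `implied_bytes` is made up for illustration.

```python
# (data_type, GPU time in seconds, bytes_per_second) copied from csv_read_data_type
rows = [
    ("INTEGRAL", 1.140, 235553841),
    ("FLOAT", 1.262, 212718321),
    ("DECIMAL", 0.272784, 984060406),
]

MIB = 1024 ** 2


def implied_bytes(gpu_time_s: float, bytes_per_second: int) -> float:
    """Data size implied by a throughput figure, assuming size = rate * time."""
    return gpu_time_s * bytes_per_second


# Map each data type to the data size (in MiB) implied by its row
implied_mib = {name: implied_bytes(t, bps) / MIB for name, t, bps in rows}

for name, mib in implied_mib.items():
    print(f"{name:>9}: {mib:.2f} MiB")
```

Under that assumption, all three rows imply roughly the same input size (about 256 MiB), which is consistent with the benchmark holding the data size fixed while varying the data type.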

Authors:
  - Yunsong Wang (https://github.com/PointKernel)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: rapidsai#11678
Issue rapidsai#10941

<details>
  <summary>Example benchmark Results</summary>

## parquet_write_encode

### [0] Quadro RTX 8000

| data_type | cardinality | run_length | Samples |  CPU Time  | Noise |  GPU Time  | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|-----------|-------------|------------|---------|------------|-------|------------|-------|------------------|-------------------|-------------------|
|  INTEGRAL |           0 |          1 |     13x |    1.100 s | 2.13% |    1.100 s | 2.13% |        487932662 |         2.146 GiB |       506.565 MiB |
|  INTEGRAL |        1000 |          1 |     37x | 371.301 ms | 3.51% | 371.290 ms | 3.51% |       1445960381 |         2.770 GiB |       165.810 MiB |
|  INTEGRAL |           0 |         32 |      6x |  95.736 ms | 0.33% |  95.727 ms | 0.33% |       5608374209 |         2.770 GiB |        27.592 MiB |
|  INTEGRAL |        1000 |         32 |    183x |  67.311 ms | 2.25% |  67.304 ms | 2.25% |       7976843257 |         2.770 GiB |        14.369 MiB |
|     FLOAT |           0 |          1 |     13x |    1.094 s | 1.57% |    1.094 s | 1.57% |        490681898 |         1.100 GiB |       510.303 MiB |
|     FLOAT |        1000 |          1 |     53x | 245.566 ms | 1.58% | 245.553 ms | 1.58% |       2186376800 |         1.765 GiB |       110.206 MiB |
|     FLOAT |           0 |         32 |    159x |  74.142 ms | 2.54% |  74.134 ms | 2.54% |       7241937929 |         1.765 GiB |        23.587 MiB |
|     FLOAT |        1000 |         32 |    266x |  45.006 ms | 3.69% |  44.999 ms | 3.69% |      11930657028 |         1.765 GiB |         9.888 MiB |
|   DECIMAL |           0 |          1 |     33x | 426.241 ms | 1.20% | 426.228 ms | 1.20% |       1259587153 |         1.039 GiB |       141.641 MiB |
|   DECIMAL |        1000 |          1 |    111x | 109.277 ms | 3.73% | 109.266 ms | 3.73% |       4913426291 |         1.145 GiB |        44.820 MiB |
|   DECIMAL |           0 |         32 |    309x |  37.947 ms | 3.60% |  37.940 ms | 3.60% |      14150565744 |         1.145 GiB |         8.327 MiB |
|   DECIMAL |        1000 |         32 |    371x |  32.174 ms | 4.67% |  32.167 ms | 4.67% |      16690275220 |         1.145 GiB |         6.669 MiB |
| TIMESTAMP |           0 |          1 |     14x |    1.047 s | 2.11% |    1.047 s | 2.11% |        512870450 |         1.178 GiB |       462.140 MiB |
| TIMESTAMP |        1000 |          1 |     60x | 208.567 ms | 2.25% | 208.555 ms | 2.25% |       2574239221 |         1.474 GiB |        92.808 MiB |
| TIMESTAMP |           0 |         32 |    162x |  71.909 ms | 1.82% |  71.901 ms | 1.82% |       7466791943 |         1.474 GiB |        20.855 MiB |
| TIMESTAMP |        1000 |         32 |    296x |  40.141 ms | 3.10% |  40.134 ms | 3.10% |      13376977353 |         1.474 GiB |         8.718 MiB |
|  DURATION |           0 |          1 |     14x |    1.010 s | 2.36% |    1.010 s | 2.36% |        531706626 |         1.150 GiB |       436.918 MiB |
|  DURATION |        1000 |          1 |     59x | 208.890 ms | 2.81% | 208.877 ms | 2.81% |       2570271173 |         1.474 GiB |        92.663 MiB |
|  DURATION |           0 |         32 |    166x |  69.930 ms | 1.94% |  69.922 ms | 1.94% |       7678100086 |         1.474 GiB |        19.551 MiB |
|  DURATION |        1000 |         32 |    295x |  39.998 ms | 3.72% |  39.991 ms | 3.72% |      13424718570 |         1.474 GiB |         8.541 MiB |
|    STRING |           0 |          1 |      5x |    1.281 s | 0.45% |    1.281 s | 0.45% |        418985121 |         1.342 GiB |       597.486 MiB |
|    STRING |        1000 |          1 |    100x | 123.906 ms | 3.22% | 123.895 ms | 3.22% |       4333268264 |       677.964 MiB |        46.473 MiB |
|    STRING |           0 |         32 |      5x |    1.283 s | 0.22% |    1.283 s | 0.22% |        418593329 |         1.342 GiB |       597.486 MiB |
|    STRING |        1000 |         32 |     96x |  36.813 ms | 4.16% |  36.806 ms | 4.16% |      14586612568 |       677.964 MiB |         8.504 MiB |
|      LIST |           0 |          1 |      5x |    1.552 s | 0.09% |    1.552 s | 0.09% |        345842800 |         1.695 GiB |       526.626 MiB |
|      LIST |        1000 |          1 |      5x | 697.747 ms | 0.23% | 697.734 ms | 0.23% |        769449441 |         2.911 GiB |       175.888 MiB |
|      LIST |           0 |         32 |     42x | 336.564 ms | 1.01% | 336.555 ms | 1.01% |       1595194403 |         2.911 GiB |        38.433 MiB |
|      LIST |        1000 |         32 |     45x | 316.764 ms | 0.68% | 316.757 ms | 0.68% |       1694897420 |         2.911 GiB |        25.115 MiB |
|    STRUCT |           0 |          1 |      5x |    1.236 s | 0.16% |    1.236 s | 0.16% |        434277368 |         1.283 GiB |       569.525 MiB |
|    STRUCT |        1000 |          1 |      5x | 225.491 ms | 0.36% | 225.478 ms | 0.36% |       2381034954 |         1.324 GiB |        90.699 MiB |
|    STRUCT |           0 |         32 |      5x | 903.626 ms | 0.21% | 903.615 ms | 0.21% |        594136463 |         1.477 GiB |       409.290 MiB |
|    STRUCT |        1000 |         32 |    182x |  67.608 ms | 2.69% |  67.601 ms | 2.69% |       7941800457 |         1.324 GiB |        15.399 MiB |

## parquet_write_io_compression

### [0] Quadro RTX 8000

|     io      | compression | cardinality | run_length | Samples |  CPU Time  | Noise |  GPU Time  | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|-------------|-------------|-------------|------------|---------|------------|-------|------------|-------|------------------|-------------------|-------------------|
|    FILEPATH |      SNAPPY |           0 |          1 |      4x |    3.939 s |  inf% |    3.939 s |  inf% |        136302643 |         1.643 GiB |       521.113 MiB |
|    FILEPATH |      SNAPPY |        1000 |          1 |      5x |    1.941 s | 0.49% |    1.941 s | 0.49% |        276656089 |         2.727 GiB |       170.914 MiB |
|    FILEPATH |      SNAPPY |           0 |         32 |      5x |    1.329 s | 0.45% |    1.329 s | 0.45% |        403934692 |         2.722 GiB |        50.835 MiB |
|    FILEPATH |      SNAPPY |        1000 |         32 |     12x |    1.275 s | 0.51% |    1.275 s | 0.51% |        421015682 |         2.727 GiB |        24.365 MiB |
|    FILEPATH |        NONE |           0 |          1 |      7x |    2.378 s | 0.77% |    2.378 s | 0.77% |        225765543 |         1.643 GiB |       529.611 MiB |
|    FILEPATH |        NONE |        1000 |          1 |      7x |    1.262 s | 0.49% |    1.262 s | 0.49% |        425263712 |         2.727 GiB |       180.315 MiB |
|    FILEPATH |        NONE |           0 |         32 |      5x |    1.116 s | 0.30% |    1.116 s | 0.30% |        480884592 |         2.722 GiB |        58.968 MiB |
|    FILEPATH |        NONE |        1000 |         32 |      8x |    1.014 s | 0.50% |    1.014 s | 0.50% |        529606276 |         2.727 GiB |        32.308 MiB |
| HOST_BUFFER |      SNAPPY |           0 |          1 |      4x |    4.181 s |  inf% |    4.181 s |  inf% |        128399871 |         1.643 GiB |       521.112 MiB |
| HOST_BUFFER |      SNAPPY |        1000 |          1 |      6x |    2.026 s | 0.48% |    2.026 s | 0.48% |        264969784 |         2.727 GiB |       170.914 MiB |
| HOST_BUFFER |      SNAPPY |           0 |         32 |      5x |    1.363 s | 0.41% |    1.363 s | 0.41% |        393913005 |         2.722 GiB |        50.835 MiB |
| HOST_BUFFER |      SNAPPY |        1000 |         32 |      5x |    1.277 s | 0.43% |    1.277 s | 0.43% |        420459944 |         2.727 GiB |        24.364 MiB |
| HOST_BUFFER |        NONE |           0 |          1 |      5x |    2.649 s | 0.42% |    2.649 s | 0.42% |        202649168 |         1.643 GiB |       529.611 MiB |
| HOST_BUFFER |        NONE |        1000 |          1 |      5x |    1.332 s | 0.41% |    1.332 s | 0.41% |        403090403 |         2.727 GiB |       180.315 MiB |
| HOST_BUFFER |        NONE |           0 |         32 |      5x |    1.151 s | 0.46% |    1.151 s | 0.46% |        466449565 |         2.722 GiB |        58.968 MiB |
| HOST_BUFFER |        NONE |        1000 |         32 |     13x |    1.039 s | 0.50% |    1.039 s | 0.50% |        516732638 |         2.727 GiB |        32.308 MiB |
|        VOID |      SNAPPY |           0 |          1 |      5x |    3.559 s | 0.62% |    3.559 s | 0.62% |        150867866 |         1.643 GiB |       521.113 MiB |
|        VOID |      SNAPPY |        1000 |          1 |      7x |    1.817 s | 0.47% |    1.817 s | 0.47% |        295405582 |         2.727 GiB |       170.914 MiB |
|        VOID |      SNAPPY |           0 |         32 |      5x |    1.299 s | 0.04% |    1.299 s | 0.04% |        413272964 |         2.722 GiB |        50.836 MiB |
|        VOID |      SNAPPY |        1000 |         32 |      5x |    1.264 s | 0.28% |    1.264 s | 0.28% |        424605071 |         2.727 GiB |        24.364 MiB |
|        VOID |        NONE |           0 |          1 |      5x |    2.003 s | 0.50% |    2.003 s | 0.50% |        268012332 |         1.643 GiB |       529.611 MiB |
|        VOID |        NONE |        1000 |          1 |      5x |    1.127 s | 0.45% |    1.127 s | 0.45% |        476312808 |         2.727 GiB |       180.315 MiB |
|        VOID |        NONE |           0 |         32 |      5x |    1.081 s | 0.47% |    1.081 s | 0.47% |        496747581 |         2.722 GiB |        58.968 MiB |
|        VOID |        NONE |        1000 |         32 |      5x | 999.381 ms | 0.48% | 999.378 ms | 0.48% |        537205288 |         2.727 GiB |        32.308 MiB |

## parquet_write_options

### [0] Quadro RTX 8000

|     statistics      | compression |      file_path      | Samples | CPU Time | Noise | GPU Time | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|---------------------|-------------|---------------------|---------|----------|-------|----------|-------|------------------|-------------------|-------------------|
|     STATISTICS_NONE |      SNAPPY | unused_path.parquet |      3x |  5.961 s |  inf% |  5.961 s |  inf% |         90067884 |         2.427 GiB |       122.010 MiB |
|     STATISTICS_NONE |      SNAPPY |                     |      3x |  5.962 s |  inf% |  5.962 s |  inf% |         90054559 |         2.427 GiB |       121.968 MiB |
|     STATISTICS_NONE |        NONE | unused_path.parquet |      4x |  4.253 s |  inf% |  4.253 s |  inf% |        126221980 |         2.427 GiB |       141.623 MiB |
|     STATISTICS_NONE |        NONE |                     |      4x |  4.249 s |  inf% |  4.249 s |  inf% |        126356682 |         2.427 GiB |       141.623 MiB |
| STATISTICS_ROWGROUP |      SNAPPY | unused_path.parquet |      3x |  6.011 s |  inf% |  6.011 s |  inf% |         89314511 |         2.427 GiB |       122.055 MiB |
| STATISTICS_ROWGROUP |      SNAPPY |                     |      3x |  5.983 s |  inf% |  5.983 s |  inf% |         89740066 |         2.427 GiB |       122.022 MiB |
| STATISTICS_ROWGROUP |        NONE | unused_path.parquet |      4x |  4.282 s |  inf% |  4.282 s |  inf% |        125372100 |         2.427 GiB |       141.626 MiB |
| STATISTICS_ROWGROUP |        NONE |                     |      4x |  4.287 s |  inf% |  4.287 s |  inf% |        125241731 |         2.427 GiB |       141.626 MiB |
|     STATISTICS_PAGE |      SNAPPY | unused_path.parquet |      3x |  5.976 s |  inf% |  5.976 s |  inf% |         89837494 |         2.427 GiB |       122.090 MiB |
|     STATISTICS_PAGE |      SNAPPY |                     |      3x |  5.979 s |  inf% |  5.979 s |  inf% |         89788086 |         2.427 GiB |       121.977 MiB |
|     STATISTICS_PAGE |        NONE | unused_path.parquet |      4x |  4.290 s |  inf% |  4.290 s |  inf% |        125138510 |         2.427 GiB |       141.633 MiB |
|     STATISTICS_PAGE |        NONE |                     |      4x |  4.292 s |  inf% |  4.292 s |  inf% |        125087291 |         2.427 GiB |       141.633 MiB |

## parquet_write_num_cols

### [0] Quadro RTX 8000

| num_cols | Samples |  CPU Time  | Noise |  GPU Time  | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|----------|---------|------------|-------|------------|-------|------------------|-------------------|-------------------|
|        8 |      5x | 217.270 ms | 0.13% | 217.262 ms | 0.13% |       2471073081 |         2.648 GiB |       114.635 MiB |
|     1024 |      5x | 339.592 ms | 0.25% | 339.582 ms | 0.25% |       1580974198 |         2.649 GiB |       145.293 MiB |

## parquet_chunked_write

### [0] Quadro RTX 8000

| num_cols | num_chunks | Samples |  CPU Time  | Noise |  GPU Time  | Noise | bytes_per_second | peak_memory_usage | encoded_file_size |
|----------|------------|---------|------------|-------|------------|-------|------------------|-------------------|-------------------|
|        8 |          8 |      5x | 239.509 ms | 0.05% | 239.501 ms | 0.05% |       2241622403 |       338.950 MiB |       115.038 MiB |
|     1024 |          8 |      5x | 441.931 ms | 0.46% | 441.921 ms | 0.46% |       1214856630 |       339.430 MiB |       158.714 MiB |
|        8 |         64 |      5x | 458.133 ms | 0.10% | 458.125 ms | 0.10% |       1171887455 |        42.372 MiB |       117.129 MiB |
|     1024 |         64 |     12x |    1.284 s | 0.80% |    1.284 s | 0.80% |        418236962 |        42.828 MiB |       214.851 MiB |

</details>

Authors:
  - Yunsong Wang (https://github.com/PointKernel)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Tobias Ribizel (https://github.com/upsj)

URL: rapidsai#11623
Issue rapidsai#9313
The root cause is that the sum value was encoded as an unsigned int. The ORC specification shows that the value should be encoded as signed.
Because both encode and decode were assuming unsigned encoding, the existing C++ test (OrcStatisticsTest, Basic) was passing even without this fix. Added a Python test that uses a different decode method, so it fails without the fix.
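
To illustrate why a matched encoder/decoder pair can hide this kind of bug, here is a minimal sketch. It assumes the signed encoding in question is protobuf's zigzag `sint64` varint; all function names are hypothetical and this is not the libcudf implementation.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("varint payload must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a base-128 varint into a non-negative integer."""
    result = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            break
    return result


def zigzag_encode(value: int) -> int:
    """Map a signed 64-bit integer to an unsigned one (assumed sint64 scheme)."""
    return ((value << 1) ^ (value >> 63)) & 0xFFFFFFFFFFFFFFFF


def zigzag_decode(value: int) -> int:
    """Inverse of zigzag_encode."""
    return (value >> 1) ^ -(value & 1)


column_sum = 6

# A writer and reader that both skip zigzag still agree with each other,
# which is why a test using the same code path for both sides passes.
wrong_bytes = encode_varint(column_sum)
assert decode_varint(wrong_bytes) == column_sum

# A reader that applies zigzag decoding sees a different value.
misread = zigzag_decode(decode_varint(wrong_bytes))

# A writer that applies zigzag round-trips through that reader.
right_bytes = encode_varint(zigzag_encode(column_sum))
roundtrip = zigzag_decode(decode_varint(right_bytes))
```

Here the independent reader interprets the unsigned-encoded sum of 6 as 3, which is the kind of mismatch a test with a different decode method can catch.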

Authors:
  - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
  - Tobias Ribizel (https://github.com/upsj)
  - David Wendt (https://github.com/davidwendt)
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#11740
…sai#11735)

Updates the instructions for building the libcudf documentation files in DOCUMENTATION.md.
The `cmake --build . --target docs_cudf` command invokes the appropriate make tool, as set up when CMake was configured for building libcudf.

Closes rapidsai#11719

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Tobias Ribizel (https://github.com/upsj)
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#11735
Disables a `ContiguousSplitUntypedTest` that simply creates a very large (over 3GB) column to test that the output buffer size does not overflow. The gtest ends up requiring 25GB of device memory when used with the arena allocator, as mentioned in rapidsai#11249. Very large columns like this should not be part of the unit tests for libcudf.
This PR disables the test so it remains available for testing under specific conditions outside of CI.

Closes rapidsai#11249

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#11706
Co-authored-by: Elias Stehle <3958403+elstehle@users.noreply.github.com>
This PR fixes an issue where the `strings_udf` conda package for python 3.9 is missing, due to the way `strings_udf` is plumbed through CI.

Authors:
  - https://github.com/brandon-b-miller

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: rapidsai#11730
…memory usage to benchmarks (rapidsai#11732)

This PR reduces memory requirements in the new nested JSON parser and adds `bytes_per_second` and `peak_memory_usage` to the benchmarks.

Authors:
  - Elias Stehle (https://github.com/elstehle)

Approvers:
  - Tobias Ribizel (https://github.com/upsj)
  - Karthikeyan (https://github.com/karthikeyann)
  - Yunsong Wang (https://github.com/PointKernel)

URL: rapidsai#11732
…1745)

This PR adds the ability to construct a `ListColumn` when `size` is `None`. Previously, omitting `size` raised an error:

```python
In [1]: from cudf.core.column import build_list_column
In [2]: from cudf.core.column import as_column
In [3]: build_list_column(indices = as_column([0, 3]), elements = as_column([0, 2, 4]))
...
TypeError: an integer is required
```
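
One way a missing `size` can be derived is from the offsets buffer itself. The sketch below is plain Python with a hypothetical helper (`infer_list_size`), not cudf's actual implementation; it relies only on the fact that an offsets buffer with n + 1 entries describes n lists.

```python
from typing import Optional, Sequence


def infer_list_size(offsets: Sequence[int], size: Optional[int] = None) -> int:
    """Return the number of list rows described by an offsets buffer.

    Hypothetical helper: if the caller passes an explicit size it is used
    as-is; otherwise the size is derived as len(offsets) - 1.
    """
    if size is not None:
        return size
    if len(offsets) == 0:
        return 0
    return len(offsets) - 1


offsets = [0, 3]       # one list spanning elements[0:3]
elements = [0, 2, 4]

size = infer_list_size(offsets)
# Materialize the rows on the host to show what the offsets describe
rows = [elements[offsets[i]:offsets[i + 1]] for i in range(size)]
```

With the offsets and elements from the example above, this yields a single row containing all three elements.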

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Ashwin Srinath (https://github.com/shwina)

URL: rapidsai#11745
…ai#11755)

The `sort=True` default change in dask/dask#9486 was not meant to propagate to the DataFrameGroupby and SeriesGroupby classes just yet. This PR adds the necessary `sort=None` defaults needed to avoid CI failures.
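
A `sort=None` default acts as a sentinel meaning "the caller did not choose". The following is a minimal sketch of that pattern with hypothetical names (`resolve_sort`, `GroupBySketch`); it does not reproduce the dask or dask_cudf classes.

```python
from typing import Optional


def resolve_sort(sort: Optional[bool], backend_default: bool) -> bool:
    """Resolve a sort=None sentinel to a concrete value (hypothetical helper)."""
    return backend_default if sort is None else sort


class GroupBySketch:
    """Hypothetical stand-in for a groupby wrapper, for illustration only."""

    def __init__(self, keys, sort: Optional[bool] = None):
        self.keys = keys
        # None stays distinguishable from an explicit True or False
        self.sort = sort

    def effective_sort(self, backend_default: bool = False) -> bool:
        return resolve_sort(self.sort, backend_default)


unspecified = GroupBySketch(["a"]).effective_sort()
explicit_true = GroupBySketch(["a"], sort=True).effective_sort()
explicit_false = GroupBySketch(["a"], sort=False).effective_sort(backend_default=True)
```

Keeping `None` as the default lets a wrapper defer to whatever the backend decides, so an upstream change of default does not silently alter the wrapper's behavior.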

Authors:
  - Richard (Rick) Zamora (https://github.com/rjzamora)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Ashwin Srinath (https://github.com/shwina)

URL: rapidsai#11755
…ai#11726)

Currently many of our tests are only stream-safe because libcudf runs everything on the default stream. This PR updates tests to ensure that any function that launches a kernel and supports passing streams will act on cudf's default stream even when it is _not_ CUDA's default stream.

There are other aspects required for stream-safety that are not addressed in this PR. For instance, some of our tests make use of `thrust::device_vector`, and its initialization is implicitly always on the default stream. I'll work on that in a separate PR since that also requires some discussion with the team on what the expectations of a stream-based libcudf API could look like for consumers that make use of thrust (i.e. do we start requiring device syncs for such consumers?). There are also numerous tests that fail when swapping in an alternate default stream, indicating other potential dependencies on streams. I'll work through those remaining issues separately as well to limit the scope of this PR.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Robert Maynard (https://github.com/robertmaynard)
  - Nghia Truong (https://github.com/ttnghia)

URL: rapidsai#11726
karthikeyann pushed a commit that referenced this pull request Jun 10, 2023
This implements stacktrace and adds a stacktrace string into any exception thrown by cudf. By doing so, the exception carries information about where it originated, allowing the downstream application to trace back with much less effort.

Closes rapidsai#12422.

### Example:
```
#0: cudf/cpp/build/libcudf.so : std::unique_ptr<cudf::column, std::default_delete<cudf::column> > cudf::detail::sorted_order<false>(cudf::table_view, std::vector<cudf::order, std::allocator<cudf::order> > const&, std::vector<cudf::null_order, std::allocator<cudf::null_order> > const&, rmm::cuda_stream_view, rmm::mr::device_memory_resource*)+0x446
#1: cudf/cpp/build/libcudf.so : cudf::detail::sorted_order(cudf::table_view const&, std::vector<cudf::order, std::allocator<cudf::order> > const&, std::vector<cudf::null_order, std::allocator<cudf::null_order> > const&, rmm::cuda_stream_view, rmm::mr::device_memory_resource*)+0x113
#2: cudf/cpp/build/libcudf.so : std::unique_ptr<cudf::column, std::default_delete<cudf::column> > cudf::detail::segmented_sorted_order_common<(cudf::detail::sort_method)1>(cudf::table_view const&, cudf::column_view const&, std::vector<cudf::order, std::allocator<cudf::order> > const&, std::vector<cudf::null_order, std::allocator<cudf::null_order> > const&, rmm::cuda_stream_view, rmm::mr::device_memory_resource*)+0x66e
#3: cudf/cpp/build/libcudf.so : cudf::detail::segmented_sort_by_key(cudf::table_view const&, cudf::table_view const&, cudf::column_view const&, std::vector<cudf::order, std::allocator<cudf::order> > const&, std::vector<cudf::null_order, std::allocator<cudf::null_order> > const&, rmm::cuda_stream_view, rmm::mr::device_memory_resource*)+0x88
#4: cudf/cpp/build/libcudf.so : cudf::segmented_sort_by_key(cudf::table_view const&, cudf::table_view const&, cudf::column_view const&, std::vector<cudf::order, std::allocator<cudf::order> > const&, std::vector<cudf::null_order, std::allocator<cudf::null_order> > const&, rmm::mr::device_memory_resource*)+0xb9
#5: cudf/cpp/build/gtests/SORT_TEST : ()+0xe3027
#6: cudf/cpp/build/lib/libgtest.so.1.13.0 : void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x8f
#7: cudf/cpp/build/lib/libgtest.so.1.13.0 : testing::Test::Run()+0xd6
#8: cudf/cpp/build/lib/libgtest.so.1.13.0 : testing::TestInfo::Run()+0x195
#9: cudf/cpp/build/lib/libgtest.so.1.13.0 : testing::TestSuite::Run()+0x109
#10: cudf/cpp/build/lib/libgtest.so.1.13.0 : testing::internal::UnitTestImpl::RunAllTests()+0x44f
#11: cudf/cpp/build/lib/libgtest.so.1.13.0 : bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x87
#12: cudf/cpp/build/lib/libgtest.so.1.13.0 : testing::UnitTest::Run()+0x95
#13: cudf/cpp/build/gtests/SORT_TEST : ()+0xdb08c
#14: /lib/x86_64-linux-gnu/libc.so.6 : ()+0x29d90
#15: /lib/x86_64-linux-gnu/libc.so.6 : __libc_start_main()+0x80
#16: cudf/cpp/build/gtests/SORT_TEST : ()+0xdf3d5
```

### Usage

To retrieve a stacktrace with fully human-readable symbols, some compile options must be adjusted. To make this adjustment convenient, a new cmake option (`CUDF_BUILD_STACKTRACE_DEBUG`) has been added. Set this option to `ON` before building cudf and the feature will be ready to use.

Whenever a cudf-type exception is thrown, a downstream application can retrieve the stored stacktrace and do whatever it wants with it. For example:
```
try {
  // cudf API calls
} catch (cudf::logic_error const& e) {
  std::cout << e.what() << std::endl;
  std::cout << e.stacktrace() << std::endl;
  throw;  // rethrow the original exception object without copying or slicing it
}
// similar with catching other exception types
```

### Follow-up work

The next step would be patching `rmm` to attach a stacktrace to `rmm::` exceptions. Doing so will allow debugging various memory exceptions thrown from libcudf using their stacktrace.


### Note:
 * This feature doesn't require libcudf to be built in Debug mode.
 * The flag `CUDF_BUILD_STACKTRACE_DEBUG` should not be turned on in production as it may affect code optimization. Instead, libcudf compiled with that flag turned on should be used only when needed, such as when debugging exceptions thrown by cudf.
 * This flag removes the current optimization flag from compilation (such as `-O2` or `-O3`, if in Release mode) and replaces it with `-Og` (optimize for debugging).
 * If this option is not set to `ON`, the stacktrace will not be available. This avoids expensive stacktrace retrieval when the thrown exception is expected.

Authors:
  - Nghia Truong (https://github.com/ttnghia)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Robert Maynard (https://github.com/robertmaynard)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Jason Lowe (https://github.com/jlowe)

URL: rapidsai#13298
karthikeyann pushed a commit that referenced this pull request Sep 24, 2023
Pin conda packages to `aws-sdk-cpp<1.11`. The recent upgrade to version `1.11.*` has caused several issues with cleanup (more details on the changes can be read at [this link](https://github.com/aws/aws-sdk-cpp#version-111-is-now-available)), causing Distributed and Dask-CUDA processes to segfault. The stack for one of those crashes looks like the following:

```
(gdb) bt
#0  0x00007f5125359a0c in Aws::Utils::Logging::s_aws_logger_redirect_get_log_level(aws_logger*, unsigned int) () from /opt/conda/envs/dask/lib/python3.9/site-packages/pyarrow/../../.././libaws-cpp-sdk-core.so
#1  0x00007f5124968f83 in aws_event_loop_thread () from /opt/conda/envs/dask/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-io.so.1.0.0
#2  0x00007f5124ad9359 in thread_fn () from /opt/conda/envs/dask/lib/python3.9/site-packages/pyarrow/../../../././libaws-c-common.so.1
#3  0x00007f519958f6db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4  0x00007f5198b1361f in clone () from /lib/x86_64-linux-gnu/libc.so.6
```

Such segfaults now manifest frequently in CI, and in some cases are reproducible with a hit rate of ~30%. Given the approaching release time, the safest option is probably to pin to an older version of the package until we pinpoint the exact cause of the issue and a patched build is released upstream.

`aws-sdk-cpp` is statically linked in the `pyarrow` pip package, which prevents us from using the same pinning technique there. cuDF is currently pinned to `pyarrow=12.0.1`, which seems to be built against `aws-sdk-cpp=1.10.*`, as per [recent build logs](https://github.com/apache/arrow/actions/runs/6276453828/job/17046177335?pr=37792#step:6:1372).

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#14173
karthikeyann pushed a commit that referenced this pull request Nov 10, 2023
…#4)

Fixes: rapidsai#14148

This PR resolves a `RuntimeError` by raising an explicit exception when a `pd.PeriodIndex` is passed to the column constructor, since `cudf.PeriodIndex` is not implemented yet.