[BUG] Segmentation fault: H5Pget_vol_cap_flags when using the Passthru VOL #2417

Closed
yzanhua opened this issue Jan 24, 2023 · 5 comments
@yzanhua

yzanhua commented Jan 24, 2023

Describe the bug
We came across a segmentation fault when running the vol-tests with the Passthru VOL provided in HDF5 1.14.0. The same segmentation fault also occurs when other VOLs are used. (Using the native VOL does not cause any problem.)

The test program provided below (test.c) mimics the behavior of the vol-tests and can reproduce the problem.

test.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>
#include "hdf5.h"

#define CHECK_ERR(A)                                                          \
  {                                                                           \
      if ((A) < 0) {                                                          \
          printf ("Error at line %d: code %lld\n", __LINE__, (long long)(A)); \
          goto err_out;                                                       \
      }                                                                       \
  }

int main (int argc, char **argv) {
  herr_t err = 0;
  int rank;
  hid_t fapl_id = H5I_INVALID_HID, connector_id;
  uint64_t vol_cap_flags;

  MPI_Init (&argc, &argv);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);

  // get Passthru VOL's ID
  connector_id = H5VL_pass_through_register();
  CHECK_ERR(connector_id);

  // create a file access property list
  fapl_id = H5Pcreate (H5P_FILE_ACCESS);
  CHECK_ERR (fapl_id);

  // Set the underlying VOL of fapl_id
  err = H5Pset_vol(fapl_id, connector_id, NULL);
  CHECK_ERR(err);

  // get vol_cap_flags
  err = H5Pget_vol_cap_flags(fapl_id, &vol_cap_flags);  // seg fault happens inside this line

  // codes below are not able to run due to the seg fault above
  CHECK_ERR (err);

err_out:;
  if (fapl_id > 0) H5Pclose (fapl_id);
  MPI_Finalize ();
  return err;
}
Makefile:
HDF5=/home/HDF5/1.14.0

all:
  mpicc test.c -g -o test \
  -I${HDF5}/include \
  -L${HDF5}/lib -lhdf5 -lhdf5_hl

run:
  LD_LIBRARY_PATH=${HDF5}/lib:${LD_LIBRARY_PATH} \
  HDF5_PLUGIN_PATH="${HDF5}/lib" \
  mpirun -n 1 ./test

Run `make` to compile.

Run `make run` to run the test program using the Passthru VOL.

gdb output:
#0  H5VL_pass_through_introspect_get_cap_flags (_info=0x0, cap_flags=0x7ffe90d132c8)
  at H5VLpassthru.c:2609
#1  0x00007fed72e78598 in H5VL_introspect_get_cap_flags (info=0x0, cls=0x1eec6c0,
  cap_flags=cap_flags@entry=0x7ffe90d132c8) at H5VLcallback.c:6442
#2  0x00007fed72e8321c in H5VL_get_cap_flags (
  connector_prop=connector_prop@entry=0x7ffe90d13270,
  cap_flags=cap_flags@entry=0x7ffe90d132c8) at H5VLint.c:2875
#3  0x00007fed72d22eef in H5Pget_vol_cap_flags (plist_id=792633534417207315,
  cap_flags=0x7ffe90d132c8) at H5Pfapl.c:6279
#4  0x0000000000400a6b in main (argc=1, argv=0x7ffe90d133c8) at test.c:37

Expected behavior

No segmentation fault is expected.

Platform (please complete the following information)

  • HDF5 version: 1.14.0
  • Compiler and version: gcc 8.5.0, MPICH 3.4.2
  • Any configure options you specified: --enable-parallel, --enable-build-mode=debug
  • MPI library and version (parallel HDF5): MPICH 3.4.2
@raylu-hdf
Contributor

To use the pass-through VOL correctly, you need to specify the underneath VOL, as in the following example. Otherwise, you'll run into problems. The library should do better than segfaulting, e.g. by returning an error message, but that's a separate issue. If you run into a similar problem with another VOL, please send it to us for debugging.

    H5VL_pass_through_info_t passthru_info;

    passthru_info.under_vol_id   = H5VL_NATIVE;
    passthru_info.under_vol_info = NULL;

    H5Pset_vol(fapl_id, connector_id, &passthru_info);
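
The backtrace in the report shows the introspection callback receiving `_info=0x0`, which is the NULL passed as the third argument of H5Pset_vol. As a rough illustration of what "returning an error instead of segfaulting" could look like, here is a minimal, self-contained sketch. It does not use HDF5 at all: `mock_passthru_info_t`, `mock_under_get_cap_flags`, and `mock_introspect_get_cap_flags` are hypothetical stand-ins invented for this example, and this is not the code from the actual fix.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a pass-through connector's info struct. */
typedef struct {
    int64_t under_vol_id;   /* ID of the underneath connector */
    void   *under_vol_info; /* that connector's own info; may be NULL */
} mock_passthru_info_t;

/* Hypothetical stand-in for asking the underneath connector for its flags. */
int mock_under_get_cap_flags(int64_t under_vol_id, uint64_t *cap_flags)
{
    if (under_vol_id < 0)
        return -1;      /* invalid underneath connector ID */
    *cap_flags = 0x1u;  /* arbitrary placeholder value for the sketch */
    return 0;
}

/* Guarded callback: reports an error when info is NULL instead of
 * dereferencing it, which is the step that crashes in the backtrace. */
int mock_introspect_get_cap_flags(const void *_info, uint64_t *cap_flags)
{
    const mock_passthru_info_t *info = (const mock_passthru_info_t *)_info;

    if (info == NULL || cap_flags == NULL) {
        fprintf(stderr, "pass-through info is NULL: underneath VOL not specified\n");
        return -1;
    }
    return mock_under_get_cap_flags(info->under_vol_id, cap_flags);
}
```

With this guard, calling `mock_introspect_get_cap_flags(NULL, &flags)` returns a negative value that the caller can check, while a populated info struct is forwarded to the underneath connector as before.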

@raylu-hdf raylu-hdf reopened this Feb 16, 2023
raylu-hdf added a commit to raylu-hdf/hdf5 that referenced this issue Feb 16, 2023
…cted places, make sure the underneath VOL ID is specified.
@yzanhua
Author

yzanhua commented Feb 16, 2023

@raylu-hdf Thank you for the answer! I'd like to confirm that my understanding is correct:

1. As HDF5 users, we should (always?) avoid passing NULL as the third argument of H5Pset_vol. If this is the case, then I'll create an issue about vol-tests' initialization in its repo.
2. As VOL developers (e.g. developing the log VOL in my case), we should make sure info is not NULL at file create/open. Otherwise, we might run into problems in places that are out of the VOL's "control" (e.g. H5Pget_vol_cap_flags).

I also have another question regarding the provided fix: whenever I use H5VL_NATIVE, the program fails immediately. I am using HDF5 1.14.0.

@yzanhua
Author

yzanhua commented Feb 16, 2023

The error message for H5VL_NATIVE is

test: H5Eint.c:667: H5E_printf_stack: Assertion `cls_id > 0' failed.

@raylu-hdf
Contributor

raylu-hdf commented Feb 16, 2023

1. As HDF5 users, we should (always?) avoid passing NULL as the third argument of H5Pset_vol. If this is the case, then I'll create an issue about vol-tests' initialization in its repo.

    Reply: The third parameter of H5Pset_vol can be NULL for some connectors, but the pass-through VOL requires the underneath VOL ID. We'll improve vol-tests to handle different connectors. Thanks for pointing out the potential problem there.

2. As VOL developers (e.g. developing the log VOL in my case), we should make sure info is not NULL at file create/open. Otherwise, we might run into problems in places that are out of the VOL's "control" (e.g. H5Pget_vol_cap_flags).

    Reply: I don't know much about the Log VOL, but if it's based on the pass-through VOL, you probably need to specify the underneath VOL ID.

I also have another question regarding the provided fix: whenever I use H5VL_NATIVE, the program fails immediately. I am using HDF5 1.14.0.

    Reply: Can I have your test program? When I tried H5VL_NATIVE, it worked for me.

derobins pushed a commit that referenced this issue Feb 17, 2023
…ces, make sure the underneath VOL ID is specified. (#2475)

* GitHub #2417: to avoid the pass-through VOL failing in unexpected places, make sure the underneath VOL ID is specified.

* Committing clang-format changes

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
@raylu-hdf
Contributor

My PR (#2475) has been merged into the develop branch. To avoid the pass-through VOL failing in unexpected places, it makes sure the underneath VOL ID is specified. This issue can be closed if the user has no further questions or issues.

@yzanhua yzanhua closed this as completed Mar 3, 2023
brtnfld pushed a commit to brtnfld/hdf5 that referenced this issue May 17, 2023
…cted places, make sure the underneath VOL ID is specified. (HDFGroup#2475)