
Commit d542faf
Deploy preview for PR 121 🛫
abarciauskas-bgse committed Dec 20, 2024
1 parent bcc7271 commit d542faf
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion pr-preview/pr-121/cloud-optimized-netcdf4-hdf5/index.html
@@ -563,7 +563,7 @@ <h2 class="anchored" data-anchor-id="consolidated-internal-file-metadata">Consol
<p>HDF5 file organization—data, metadata, and free space—depends on the file space management strategy. Details on these strategies are in <a href="https://support.hdfgroup.org/documentation/hdf5-docs/advanced_topics/FileSpaceManagement.html">HDF Support: File Space Management</a>.</p>
<p>Here are a few additional considerations for understanding and implementing the <code>H5F_FSPACE_STRATEGY_PAGE</code> strategy:</p>
<ul>
-<li><strong>Chunks vs.&nbsp;Pages:</strong> In HDF5, datasets can be chunked, meaning the dataset is divided into smaller blocks of data that can be individually compressed (see also <a href="https://support.hdfgroup.org/documentation/hdf5-docs/advanced_topics/chunking_in_hdf5.html">Chunking in HDF5</a>). Pages, on the other hand, represent the smallest unit HDF5 uses for reading and writing data. To optimize performance, chunk sizes should ideally align with the page size or be a multiple thereof. A chunk does not have to fit within a single page, however misalignment leads to chunks spanning multiple pages, which increases read latency. Entire pages are read into memory when accessing chunks or metadata. Only the relevant data (e.g., a specific chunk) is decompressed.</li>
+<li><strong>Chunks vs.&nbsp;Pages:</strong> In HDF5, datasets can be chunked, meaning the dataset is divided into smaller blocks of data that can be individually compressed (see also <a href="https://support.hdfgroup.org/documentation/hdf5-docs/advanced_topics/chunking_in_hdf5.html">Chunking in HDF5</a>). Pages, on the other hand, represent the smallest unit HDF5 uses for reading and writing data. To optimize performance, chunk sizes should ideally align with the page size or be a multiple thereof. Entire pages are read into memory when accessing chunks or metadata. Only the relevant data (e.g., a specific chunk) is decompressed.</li>
<li><strong>Page Size Considerations:</strong> The page size applies to both metadata and raw data. Therefore, the chosen page size should strike a balance: it must consolidate metadata efficiently while minimizing unused space in raw data chunks. Excess unused space can significantly increase file size. File size is typically not a concern for I/O performance when accessing parts of a file. However, increased file size can become a concern for storage costs.</li>
</ul>
</div>
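The guidance in the hunk above can be sketched with h5py, which exposes HDF5's file space management options as file-creation keywords. This is an illustrative example, not code from this commit: the file name, dataset shape, chunk shape, and the 8 MiB page size are assumed values, chosen so each chunk fits within a single page.

```python
import h5py

# A sketch, not code from this commit: create an HDF5 file with the paged
# aggregation strategy (H5F_FSPACE_STRATEGY_PAGE) and an explicit page size.
# Requires h5py >= 3.3 for fs_page_size; names, shapes, and sizes are
# illustrative assumptions.
with h5py.File(
    "cloud_optimized.h5",          # hypothetical output path
    "w",
    fs_strategy="page",            # paged aggregation
    fs_page_size=8 * 1024 * 1024,  # 8 MiB pages instead of the 4 KiB default
) as f:
    # A 1000 x 1000 float32 chunk is ~4 MB uncompressed, so each chunk fits
    # within a single 8 MiB page, keeping a chunk read to one page apiece.
    f.create_dataset(
        "data",
        shape=(10_000, 10_000),
        chunks=(1_000, 1_000),
        dtype="f4",
        compression="gzip",
    )
```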
2 changes: 1 addition & 1 deletion pr-preview/pr-121/search.json
@@ -1373,7 +1373,7 @@
"href": "cloud-optimized-netcdf4-hdf5/index.html#consolidated-internal-file-metadata",
"title": "Cloud-Optimized HDF/NetCDF",
"section": "Consolidated Internal File Metadata",
"text": "Consolidated Internal File Metadata\nConsolidated metadata is a key characteristic of cloud-optimized data and enables “lazy loading” (see the Lazy Loading block below). Client libraries use file metadata to understand what’s in the file and where it is stored. When metadata is scattered across a file (which is the default for HDF5 writing), client applications have to make multiple requests for metadata information.\nFor HDF5 files, to consolidate metadata, files should be written with the paged aggregation file space management strategy (see also H5F_FSPACE_STRATEGY_PAGE). When using this strategy, HDF5 will write data in pages where metadata is separated from raw data chunks. Note the page size should also be set, as the default size is 4096 bytes (or 4KB, source). Further, only files using paged aggregation can use the HDF5 page buffer cache – a low-level library cache (Jelenak 2022) – to reduce subsequent data access.\n\n\n\n\n\n\nLazy loading\n\n\n\nLazy loading is a common term for first loading only metadata, and deferring reading of data values until required by computation.\n\n\n\n\n\n\n\n\nHDF5 File Space Management Strategies\n\n\n\nHDF5 file organization—data, metadata, and free space—depends on the file space management strategy. Details on these strategies are in HDF Support: File Space Management.\nHere are a few additional considerations for understanding and implementing the H5F_FSPACE_STRATEGY_PAGE strategy:\n\nChunks vs. Pages: In HDF5, datasets can be chunked, meaning the dataset is divided into smaller blocks of data that can be individually compressed (see also Chunking in HDF5). Pages, on the other hand, represent the smallest unit HDF5 uses for reading and writing data. To optimize performance, chunk sizes should ideally align with the page size or be a multiple thereof. A chunk does not have to fit within a single page, however misalignment leads to chunks spanning multiple pages, which increases read latency. Entire pages are read into memory when accessing chunks or metadata. Only the relevant data (e.g., a specific chunk) is decompressed.\nPage Size Considerations: The page size applies to both metadata and raw data. Therefore, the chosen page size should strike a balance: it must consolidate metadata efficiently while minimizing unused space in raw data chunks. Excess unused space can significantly increase file size. File size is typically not a concern for I/O performance when accessing parts of a file. However, increased file size can become a concern for storage costs.",
"text": "Consolidated Internal File Metadata\nConsolidated metadata is a key characteristic of cloud-optimized data and enables “lazy loading” (see the Lazy Loading block below). Client libraries use file metadata to understand what’s in the file and where it is stored. When metadata is scattered across a file (which is the default for HDF5 writing), client applications have to make multiple requests for metadata information.\nFor HDF5 files, to consolidate metadata, files should be written with the paged aggregation file space management strategy (see also H5F_FSPACE_STRATEGY_PAGE). When using this strategy, HDF5 will write data in pages where metadata is separated from raw data chunks. Note the page size should also be set, as the default size is 4096 bytes (or 4KB, source). Further, only files using paged aggregation can use the HDF5 page buffer cache – a low-level library cache (Jelenak 2022) – to reduce subsequent data access.\n\n\n\n\n\n\nLazy loading\n\n\n\nLazy loading is a common term for first loading only metadata, and deferring reading of data values until required by computation.\n\n\n\n\n\n\n\n\nHDF5 File Space Management Strategies\n\n\n\nHDF5 file organization—data, metadata, and free space—depends on the file space management strategy. Details on these strategies are in HDF Support: File Space Management.\nHere are a few additional considerations for understanding and implementing the H5F_FSPACE_STRATEGY_PAGE strategy:\n\nChunks vs. Pages: In HDF5, datasets can be chunked, meaning the dataset is divided into smaller blocks of data that can be individually compressed (see also Chunking in HDF5). Pages, on the other hand, represent the smallest unit HDF5 uses for reading and writing data. To optimize performance, chunk sizes should ideally align with the page size or be a multiple thereof. Entire pages are read into memory when accessing chunks or metadata. Only the relevant data (e.g., a specific chunk) is decompressed.\nPage Size Considerations: The page size applies to both metadata and raw data. Therefore, the chosen page size should strike a balance: it must consolidate metadata efficiently while minimizing unused space in raw data chunks. Excess unused space can significantly increase file size. File size is typically not a concern for I/O performance when accessing parts of a file. However, increased file size can become a concern for storage costs.",
"crumbs": [
"Formats",
"Cloud-Optimized HDF/NetCDF",
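On the read side, the page buffer cache mentioned in the text above is only available for files written with paged aggregation. A minimal sketch, assuming h5py >= 3.3 (which exposes HDF5's H5Pset_page_buffer_size as page_buf_size) and the hypothetical file from the write sketch earlier; the cache size and metadata percentage are assumed values:

```python
import h5py

# A sketch under stated assumptions: open a paged-aggregation file with the
# HDF5 page buffer cache enabled. page_buf_size must be at least the file's
# page size; 64 MiB is an illustrative choice.
with h5py.File(
    "cloud_optimized.h5",            # hypothetical file from the write sketch
    "r",
    page_buf_size=64 * 1024 * 1024,  # total page buffer cache size
    min_meta_keep=10,                # min % of the buffer held for metadata pages
) as f:
    # Whole pages are read into the buffer, only the requested chunk is
    # decompressed, and repeated reads of the same pages hit the cache.
    block = f["data"][:1_000, :1_000]
```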
