Tips For Large Files
Here are some tips on how to traverse larger Parquet files.
By default, the application loads and displays only the first 1000 records in the Parquet file. Records must be loaded into memory before they can be displayed, which may not be possible for larger Parquet files with many records.
- This setting can be found in the top right corner of the UI.
- Adjust this according to how much your machine can handle. This number directly determines how much RAM the program uses to load the necessary records.
- Selecting too many records will cause a crash.
- Reducing the number of fields to display will help reduce the memory footprint.
By default, the application tries to load every field in the Parquet file into the UI. This may be fine for smaller files, but with larger files, selecting all the fields can be inefficient.
Reducing the number of fields to load will decrease load times and reduce memory usage, allowing more records to be loaded into memory for display.
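The trade-off between field count and record count can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope model only: `max_records`, the 512 MiB budget, and the 32 bytes per value are assumptions for illustration, not figures measured from the application.

```python
def max_records(ram_budget_bytes, field_count, avg_bytes_per_value):
    """Rough upper bound on the number of records that fit in a RAM budget.

    Assumes every record costs field_count * avg_bytes_per_value bytes,
    which ignores per-row and UI overhead.
    """
    if field_count <= 0 or avg_bytes_per_value <= 0:
        raise ValueError("field_count and avg_bytes_per_value must be positive")
    bytes_per_record = field_count * avg_bytes_per_value
    return ram_budget_bytes // bytes_per_record


budget = 512 * 1024 * 1024  # assumed budget: 512 MiB

all_fields = max_records(budget, field_count=200, avg_bytes_per_value=32)
few_fields = max_records(budget, field_count=20, avg_bytes_per_value=32)

print(all_fields)  # 83886
print(few_fields)  # 838860
```

Under this model, loading a tenth of the fields leaves room for roughly ten times as many records in the same budget.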
Using the Edit → Column Sizing → Columns and Contents option comes at a performance cost because the utility needs to go through all the cells and determine the minimum width necessary to show all contents. Using the Column Names option instead will provide a performance boost.
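The cost difference between the two sizing modes can be sketched as follows. This is an assumption-laden illustration, not the utility's implementation: `widths_from_names` and `widths_from_contents` are hypothetical helpers, and character counts stand in for real text measurement in pixels.

```python
def widths_from_names(columns):
    """Size each column from its header only: one step per column."""
    return {name: len(name) for name in columns}


def widths_from_contents(columns, rows):
    """Size each column from its header and every cell: rows x columns steps."""
    widths = {name: len(name) for name in columns}
    for row in rows:
        for name in columns:
            cell_text = str(row.get(name, ""))
            widths[name] = max(widths[name], len(cell_text))
    return widths


columns = ["id", "description"]
rows = [
    {"id": 1, "description": "short"},
    {"id": 123456, "description": "x"},
]

print(widths_from_names(columns))           # {'id': 2, 'description': 11}
print(widths_from_contents(columns, rows))  # {'id': 6, 'description': 11}
```

The content-based version visits every loaded cell, so its cost grows with both the record count and the field count, while the name-based version depends only on the number of columns.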