Python scripts for uploading MODIS images to SciDB. These scripts provide a way to load several MODIS HDF files into a 3-dimensional SciDB array by calling SciDB's data loading tools.
Loading MODIS data into SciDB is a three-step process:
- Export the HDF image file to SciDB's binary format. MODIS images are distributed as HDF files, whereas SciDB loads data from its own binary format.
- Load the binary image into a 1-dimensional SciDB array.
- Redimension the array from 1 to 3 dimensions.
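Steps 2 and 3 can be pictured as a handful of query strings. The sketch below is not taken from load2scidb.py. It assumes SciDB's load, redimension, insert and remove operators, a hypothetical 1D schema in which the three coordinates are stored as int64 attributes, a hypothetical chunk size of 1000000, and instance id -2 for the coordinator. The function name and its parameters are illustrative only, and the statements should be checked against the SciDB 14.3 documentation before use.

```python
def build_load_queries(bin_path, dest_array, tmp_array,
                       attrs=(("red", "int16"), ("nir", "int16"),
                              ("quality", "uint16")),
                       dims=("col_id", "row_id", "time_id")):
    """Return the statements for steps 2 and 3 as plain strings.

    Nothing is executed here; the strings are only assembled.
    """
    # Assumption: in the 1D array the coordinates are int64 attributes.
    fields = [(d, "int64") for d in dims] + list(attrs)
    schema = ", ".join("%s:%s" % f for f in fields)
    binfmt = ", ".join(t for _, t in fields)
    return [
        # Temporary 1D array (chunk size is a hypothetical value).
        "CREATE ARRAY %s <%s> [i=0:*,1000000,0]" % (tmp_array, schema),
        # Step 2: load the binary file (assumed: -2 is the coordinator).
        "load(%s, '%s', -2, '(%s)')" % (tmp_array, bin_path, binfmt),
        # Step 3: redimension into the shape of the destination array.
        "insert(redimension(%s, %s), %s)" % (tmp_array, dest_array, dest_array),
        # Clean up the temporary 1D array.
        "remove(%s)" % tmp_array,
    ]


if __name__ == "__main__":
    for query in build_load_queries("/home/scidb/toLoad/example.sdbbin",
                                    "MOD09Q1", "load_123456789"):
        print(query)
```

Each string could then be passed to SciDB's query client. How load2scidb.py actually issues its queries is not shown in this README.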
The script checkFolder.py monitors a folder for SciDB binary files. Each time it finds a new file, it calls load2scidb.py, which loads the data into a 3D SciDB array (steps 2 and 3). Loading data into a 3D array is not straightforward: the binary data is first loaded into a temporary 1D array, which is then redimensioned into the 3D array. These temporary 1D arrays are named following the pattern load_XXXXXXXXX, and the script deletes them once the redimension is done.
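The folder monitoring can be sketched as a simple polling loop. This is a minimal illustration, not the code of checkFolder.py: the function names, the polling interval and the callback are hypothetical, and the .sdbbin suffix is taken from the example file names in this README.

```python
import os
import time


def find_new_files(folder, seen, suffix=".sdbbin"):
    """Return paths in `folder` ending in `suffix` that are not in `seen`.

    Newly found paths are added to `seen`, so each file is reported once.
    """
    found = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if name.endswith(suffix) and os.path.isfile(path) and path not in seen:
            seen.add(path)
            found.append(path)
    return found


def watch(folder, handle, interval=5.0, rounds=None):
    """Poll `folder` and call `handle(path)` for every new binary file.

    `rounds=None` polls forever; a number limits the polling passes.
    """
    seen = set()
    done = 0
    while rounds is None or done < rounds:
        for path in find_new_files(folder, seen):
            handle(path)  # e.g. start the loading script for this file
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

A loop like this only sees complete files if they appear in the folder atomically, which is one reason to export elsewhere and then move the finished file into the monitored folder.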
The script hdfs2sdbbin.py exports MODIS data to SciDB binary format and writes it to a specific folder. To do so, it calls modis2scidb, the binary tool that exports HDF to SciDB binary.
Because exporting is independent of loading, the HDF-to-binary script can run on several servers simultaneously, while loading is done only by the SciDB coordinator instance.
Prerequisites:
- git.
- Python.
- SciDB 14.3. SciDB must be installed in the default location.
- These scripts must be installed on the SciDB coordinator instance, and they must be run by a user who is allowed to execute IQUERY.
- modis2scidb, the binary tool for exporting HDF to SciDB binary.
Files:
- LICENSE: License file.
- README.md: This file.
- addHdfs2bin.py: Script that exports an HDF file to SciDB's binary format.
- checkFolder.py: Script that checks a folder for SciDB's binary files.
- load2scidb.py: Script that loads a binary file to a SciDB database.
- install_pyhdf.sh: Script for installing pyhdf.
- run.py: Builds the path to the MODIS files and then calls addHdfs2bin.py.
Instructions:
- Download the scripts to the script-folder. Use:
git clone https://github.com/albhasan/modis2scidb.git
- Use the install_pyhdf.sh script to install pyhdf on the SciDB coordinator instance. For example:
sudo ./install_pyhdf.sh
- Create a destination array in SciDB. This is the dest-array.
- For MOD09Q1:
CREATE ARRAY MOD09Q1 <red:int16, nir:int16, quality:uint16> [col_id=48000:72000,1014,5,row_id=38400:62400,1014,5,time_id=0:9200,1,0];
- For MOD13Q1:
CREATE ARRAY MOD13Q1 <ndvi:int16, evi:int16, quality:uint16, red:int16, nir:int16, blue:int16, mir:int16, viewza:int16, sunza:int16, relaza:int16, cdoy:int16, reli:int16> [col_id=48000:72000,502,5,row_id=38400:62400,502,5,time_id=0:9200,1,0];
- Create a folder accessible by SciDB. This is the check-folder, from which data is loaded into SciDB.
- Run checkFolder.py pointing to the check-folder; the files found here will be uploaded to SciDB. For example:
python checkFolder.py /home/scidb/toLoad/ /home/scidb/modis2scidb/ MOD09Q1 &
- Run addHdfs2bin.py to export MODIS HDFs to binary files. Once the export has finished, the file can be moved to the check-folder. For example:
python addHdfs2bin.py /home/scidb/MODIS_ARC/MODIS/MOD09Q1.005/2000.02.18/MOD09Q1.A2000049.h10v08.005.2006268191328.hdf /home/scidb/MOD09Q1.A2000049.h10v08.005.2006268191328.sdbbin
mv /home/scidb/MOD09Q1.A2000049.h10v08.005.2006268191328.sdbbin /home/scidb/toLoad/MOD09Q1.A2000049.h10v08.005.2006268191328.sdbbin
- NOTE: Alternatively, you can use run.py to make calls to addHdfs2bin.py on many HDFs.
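Building the path to a MODIS file, as run.py is described as doing, might look like the sketch below. It is not the code of run.py. The folder layout (product.collection/YYYY.MM.DD/) and the file name pattern (product.AYYYYDDD.tile.collection.timestamp.hdf) are inferred from the single example path in this README, and the function names and parameters are hypothetical. Because the trailing production timestamp cannot be derived from the acquisition date, the sketch matches it with a wildcard.

```python
import datetime
import glob
import os


def modis_date(year, doy):
    """Convert a year and a day of year (1-based) to a calendar date."""
    return datetime.date(year, 1, 1) + datetime.timedelta(days=doy - 1)


def find_hdfs(base, product, collection, year, doy, tile):
    """List the HDF files of one product, acquisition day and tile.

    Assumed layout: <base>/<product>.<collection>/<YYYY.MM.DD>/<file>.hdf
    """
    day = modis_date(year, doy)
    folder = os.path.join(base, "%s.%s" % (product, collection),
                          day.strftime("%Y.%m.%d"))
    # The production timestamp is unknown in advance, hence the wildcard.
    pattern = "%s.A%04d%03d.%s.%s.*.hdf" % (product, year, doy, tile,
                                            collection)
    return sorted(glob.glob(os.path.join(folder, pattern)))


if __name__ == "__main__":
    # Day 49 of 2000 is 2000-02-18, matching the example folder name.
    print(modis_date(2000, 49))
    for path in find_hdfs("/home/scidb/MODIS_ARC/MODIS", "MOD09Q1", "005",
                          2000, 49, "h10v08"):
        print(path)
```

Each path returned this way could be handed to addHdfs2bin.py together with an output file name, as in the example command above.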