Python scripts for easy RELION STAR file manipulation. The base library file metadata.py is required by all scripts.
!!! Scripts are now RELION 3.1+ compatible !!!
Backward compatible with RELION <=3.0 format star files!
! only Relion <=3.0 format star files ! Add beamtilt class to the particles. Script adds rlnBeamTiltClass extracted from the micrograph name (in FoilHoleXXXX.mrc FEI format).
--i Input STAR filename.
--o Output STAR filename.
Adds or removes labels from star file.
--i Input STAR filename.
--o Output STAR filename.
--add Add new label to the star file.
--rm Remove label from the star file.
--lb Label to be added or removed. Use comma separated label values to add or remove multiple labels. Default: None
--val Value filled for added labels. Use comma separated values if adding multiple labels. Default: 0
--data Data table from star file to be used (Default: data_particles).
Example 1: Remove rlnCoordinateY from input.star
add_remove_label.py --i input.star --o output.star --lb rlnCoordinateY --rm
Example 2: Remove rlnCoordinateX,rlnCoordinateY from input.star
add_remove_label.py --i input.star --o output.star --lb rlnCoordinateX,rlnCoordinateY --rm
Example 3: Add rlnCoordinateX,rlnCoordinateY with default values 10,20 respectively
add_remove_label.py --i input.star --o output.star --lb rlnCoordinateX,rlnCoordinateY --add --val 10,20
Adds a label (col_lb) to Input1 and assigns values to it from Input2 where the label (comp_lb) of Input2 matches Input1.
--i1 Input1 STAR filename.
--i2 Input2 STAR filename.
--o Output STAR filename.
--data Data table from star file to be used, Default: data_particles
--col_lb Label of the new column assigned to Input1; Default: rlnDefocusU
--comp_lb Compare label used for Input1 and Input2 for value assignment. Default:rlnMicrographName
Example 1: Assign values of DefocusU from input2.star as a column to input1.star where the value of column rlnMicrographName matches in both inputs.
assign_column_star.py --i1 input1.star --i2 input2.star --o output.star --col_lb rlnDefocusU --comp_lb rlnMicrographName
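The matching step described above can be sketched as a dictionary lookup. This is only an illustration, not the script's code: it assumes each STAR data table has already been read into a list of dicts keyed by label, and `assign_column` is a hypothetical helper name. Leaving non-matching rows untouched is also an assumption.

```python
def assign_column(rows1, rows2, col_lb="rlnDefocusU", comp_lb="rlnMicrographName"):
    """Copy col_lb from rows2 into rows1 wherever comp_lb matches (sketch)."""
    # Build a lookup from the compare label to the value to be assigned.
    lookup = {row[comp_lb]: row[col_lb] for row in rows2}
    out = []
    for row in rows1:
        new_row = dict(row)  # do not modify the input rows
        if row[comp_lb] in lookup:
            new_row[col_lb] = lookup[row[comp_lb]]
        # Assumption: rows without a match are left unchanged.
        out.append(new_row)
    return out


input1 = [{"rlnMicrographName": "a.mrc"}, {"rlnMicrographName": "b.mrc"}]
input2 = [{"rlnMicrographName": "a.mrc", "rlnDefocusU": 12000.0}]
print(assign_column(input1, input2))
```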
Removes all columns other than the particle coordinates from the star file.
--i Input STAR filename with particles.
--o Output STAR filename.
! only Relion <=3.0 format star files ! Clusters beam-shifts extracted from xml files into beam-tilt classes.
--i Input XML directory
--o Output star file. If empty, no file is generated.
--o_shifts Output file with extracted beam-shifts and cluster numbers. If empty, no file is generated.
--clusters Number of clusters the beam-shifts should be divided in. (default: 1)
--elbow Number of max clusters used in Elbow method optimal cluster number determination. (default: 0)
--max_iter Expert option: Maximum number of iterations of the k-means algorithm for a single run. (default: 300)
--n_init Expert option: Number of times the k-means algorithm will be run with different centroid seeds. (default: 10)
Requires specific conda environment. Install:
conda create -n beamtiltclass-env
conda activate beamtiltclass-env
conda install scikit-learn
conda install matplotlib
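For readers unfamiliar with the options above, here is a dependency-free sketch of k-means on 2D beam-shift values. The scripts themselves presumably rely on scikit-learn (see the conda environment above); this is not their implementation. The function names are hypothetical, and the parameters only mirror the meaning of --clusters, --max_iter, --n_init and --elbow.

```python
import random


def _dist2(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def _assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda k: _dist2(p, centroids[k]))
            for p in points]


def kmeans(points, clusters=1, max_iter=300, n_init=10, seed=0):
    """Return (labels, inertia) of the best of n_init k-means runs (sketch)."""
    rng = random.Random(seed)
    best_labels, best_inertia = None, float("inf")
    for _ in range(n_init):
        centroids = rng.sample(points, clusters)  # random starting centroids
        for _ in range(max_iter):
            labels = _assign(points, centroids)
            updated = []
            for k in range(clusters):
                members = [p for p, l in zip(points, labels) if l == k]
                if members:
                    updated.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
                else:
                    updated.append(centroids[k])  # keep an empty cluster in place
            if updated == centroids:  # converged
                break
            centroids = updated
        labels = _assign(points, centroids)
        inertia = sum(_dist2(p, centroids[l]) for p, l in zip(points, labels))
        if inertia < best_inertia:
            best_labels, best_inertia = labels, inertia
    return best_labels, best_inertia


def elbow_inertias(points, max_clusters):
    """Inertia for 1..max_clusters clusters; the 'elbow' is where it flattens."""
    return [kmeans(points, clusters=k)[1] for k in range(1, max_clusters + 1)]


shifts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels, inertia = kmeans(shifts, clusters=2)
print(labels, inertia)
print(elbow_inertias(shifts, 3))
```

The elbow method plots the inertia against the number of clusters and picks the point where adding more clusters stops reducing it noticeably.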
! only Relion <=3.0 format star files ! Clusters beam-shifts extracted from serialem mdoc files into beam-tilt classes.
--i Input mdoc directory
--o Output star file. If empty, no file is generated.
--o_shifts Output file with extracted beam-shifts and cluster numbers. If empty, no file is generated.
--clusters Number of clusters the beam-shifts should be divided in. (default: 1)
--elbow Number of max clusters used in Elbow method optimal cluster number determination. (default: 0)
--max_iter Expert option: Maximum number of iterations of the k-means algorithm for a single run. (default: 300)
--n_init Expert option: Number of times the k-means algorithm will be run with different centroid seeds. (default: 10)
Requires specific conda environment. Install:
conda create -n beamtiltclass-env
conda activate beamtiltclass-env
conda install scikit-learn
conda install matplotlib
! only Relion >=3.1 format star files !
Creates optic groups according to acquisition position identifier (number) in the FoilHole_XXX_Data_YYY_ZZZ_AAA_BBB-CCC.mrc filename of the micrograph.
--i Input STAR filename.
--o Output STAR filename.
--word_count Position of the acquisition position identifier in the FoilHole_XXX_Data_YYY_ZZZ.mrc filename. Default: 4 (i.e. the 4th word).
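A minimal sketch of how a "word" can be picked from the micrograph filename and turned into optics group numbers. It assumes that words are separated by underscores and counted from 1, and that groups are numbered in order of first appearance. Both function names are hypothetical and the script's actual logic may differ.

```python
import os


def acquisition_position(micrograph_name, word_count=4):
    """Return the word_count-th (1-based) underscore-separated word of the name."""
    stem = os.path.splitext(os.path.basename(micrograph_name))[0]
    return stem.split("_")[word_count - 1]


def assign_optics_groups(micrograph_names, word_count=4):
    """Give every distinct position identifier its own 1-based group number."""
    groups = {}
    result = []
    for name in micrograph_names:
        key = acquisition_position(name, word_count)
        if key not in groups:
            groups[key] = len(groups) + 1  # assumption: numbered by first appearance
        result.append(groups[key])
    return result


names = [
    "Micrographs/FoilHole_111_Data_222_333_444_555-666.mrc",
    "Micrographs/FoilHole_112_Data_222_334_445_556-667.mrc",
    "Micrographs/FoilHole_113_Data_999_335_446_557-668.mrc",
]
print(assign_optics_groups(names))
```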
! only Relion >=3.1 format star files !
Clusters beam-shifts extracted from xml files into optic groups.
--i Input XML directory
--istar Input particles star file.
--o Output star file. If empty, no file is generated.
--o_shifts Output file with extracted beam-shifts and cluster numbers. If empty, no file is generated.
--clusters Number of clusters the beam-shifts should be divided in. (default: 1)
--elbow Number of max clusters used in Elbow method optimal cluster number determination. (default: 0)
--max_iter Expert option: Maximum number of iterations of the k-means algorithm for a single run. (default: 300)
--n_init Expert option: Number of times the k-means algorithm will be run with different centroid seeds. (default: 10)
Requires specific conda environment. Install:
conda create -n beamtiltclass-env
conda activate beamtiltclass-env
conda install scikit-learn
conda install matplotlib
! only Relion >=3.1 format star files !
Clusters beam-shifts extracted from serialem mdoc files into optic groups.
--i Input mdoc directory
--istar Input particles star file.
--o Output star file. If empty, no file is generated.
--o_shifts Output file with extracted beam-shifts and cluster numbers. If empty, no file is generated.
--clusters Number of clusters the beam-shifts should be divided in. (default: 1)
--elbow Number of max clusters used in Elbow method optimal cluster number determination. (default: 0)
--max_iter Expert option: Maximum number of iterations of the k-means algorithm for a single run. (default: 300)
--n_init Expert option: Number of times the k-means algorithm will be run with different centroid seeds. (default: 10)
Requires specific conda environment. Install:
conda create -n beamtiltclass-env
conda activate beamtiltclass-env
conda install scikit-learn
conda install matplotlib
Calculates the absolute apix for the optics groups according to https://www3.mrc-lmb.cam.ac.uk/relion/index.php/Pixel_size_issues
--i Input STAR filename
Generates a heatmap of particle orientations from a star file. Cartesian (hexbin, healpix, legacy) and mollweide (healpix, legacy) representations are generated. For symmetrical particles, first symmetry-expand the star file (relion_particle_symmetry_expand).
Requires:
- Matplotlib
- Numpy
- healpy
--i Input STAR filename with particles and orientations.
--o Output files prefix. Default: heatmap_orient
--show Only shows the resulting heatmap. Does not store any output file.
--format Output format. Available formats: png, svg, jpg, tif. Default: png
--cmap Color map used for the heatmap. Matplotlib colormap names accepted. Recommended: jet, inferno, viridis, turbo. (Default: turbo)
--mask_zero Mask zero values not to be represented by color. (i.e. zero values represented by white)
--grid_size Grid size of the hexbin grid. The higher number the finer sampling. Default: 50
--hlpx Create HealPix style maps
--hlpx_order HealPix sampling order used for plotting (2->15deg,3->7.5deg, 4->3.75). Default: 4 (3.75deg)
--no_graticule Do not plot graticule on HealPix maps.
--vmin Min values represented on color bar. Default: -1 (auto)
--vmax Max values represented on color bar. Default: -1 (auto)
--legacy Creates old (original) style heatmaps
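The angular steps quoted for --hlpx_order (15, 7.5 and 3.75 degrees) match 60 / 2^order degrees. A small sketch of that arithmetic, together with the standard HEALPix relations nside = 2^order and npix = 12 * nside^2, follows. `healpix_grid` is a hypothetical helper name and is not part of the script.

```python
def healpix_grid(order):
    """Return (nside, number of pixels, nominal angular step in degrees)."""
    nside = 2 ** order
    npix = 12 * nside ** 2   # total number of HEALPix pixels on the sphere
    step_deg = 60.0 / nside  # nominal step as quoted in the option help
    return nside, npix, step_deg


for order in (2, 3, 4):
    print(order, healpix_grid(order))
```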
Modify star file to be compatible with helix refinement
--i Input STAR filename with particles.
--o Output STAR filename.
Join two star files. Joining options: UNION, INTERSECT, EXCEPT.
--i1 Input1 STAR filename with particles.
--i2 Input2 STAR filename with particles.
--o Output STAR filename.
--data Data table from star file to be used, Default: data_particles
--lb Label used for intersect/except joining. e.g. rlnAngleTilt, rlnDefocusU...; Default: rlnMicrographName
--op Operator used for comparison. Allowed: "union", "intersect", "except"
Example 1: Include all lines from Input1 and Input2 in the Output star file.
join_star.py --i1 input1.star --i2 input2.star --o output.star
Example 2: Select all lines from Input1 where micrographs DO match micrographs in Input2.
join_star.py --i1 input1.star --i2 input2.star --o output.star --lb rlnMicrographName --op "intersect"
Example 3: Select all lines from Input1 where micrographs DO NOT match micrographs in Input2.
join_star.py --i1 input1.star --i2 input2.star --o output.star --lb rlnMicrographName --op "except"
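The three joining options can be pictured with plain Python lists. The sketch below follows the behaviour described in the examples above, assuming each data table is a list of dicts keyed by label. `join_rows` is a hypothetical name, not the script's function.

```python
def join_rows(rows1, rows2, op="union", lb="rlnMicrographName"):
    """Join two tables the way the examples above describe (sketch)."""
    if op == "union":
        # All lines from Input1 followed by all lines from Input2.
        return rows1 + rows2
    keys2 = {row[lb] for row in rows2}
    if op == "intersect":
        # Lines from Input1 whose label value also occurs in Input2.
        return [row for row in rows1 if row[lb] in keys2]
    if op == "except":
        # Lines from Input1 whose label value does not occur in Input2.
        return [row for row in rows1 if row[lb] not in keys2]
    raise ValueError("op must be 'union', 'intersect' or 'except'")


t1 = [{"rlnMicrographName": "a.mrc"}, {"rlnMicrographName": "b.mrc"}]
t2 = [{"rlnMicrographName": "b.mrc"}, {"rlnMicrographName": "c.mrc"}]
print(join_rows(t1, t2, "intersect"))
print(join_rows(t1, t2, "except"))
```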
Convert EMAN2 json type box files into RELION coordinate STAR file. Coordinates might be corrected for binning, when the particles were picked on binned micrographs. Boxes lying outside micrograph boundaries might be discarded.
--i Input directory with json files (EMAN2 info directory location.)
--o Output directory to store STAR files.
--suffix Star file suffix (e.g. "_box" will produce micname123_box.star). Default: "_box"
--boxsize Box size used to exclude boxes that violate micrograph boundaries. Default: 256
--binning Binning factor correction. Use when particles were picked on binned micrographs. Default: 1
--maxX Max size of the micrograph in pixels in X dimension (used to exclude boxes from micrograph edges). Default: 4096
--maxY Max size of the micrograph in pixels in Y dimension (used to exclude boxes from micrograph edges). Default: 4096
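A rough sketch of the binning correction and edge exclusion described above. It rests on several assumptions that the text does not confirm: that the EMAN2 info JSON stores boxes under a "boxes" key as [x, y, type] entries, that coordinates are box centres, and that --boxsize, --maxX and --maxY refer to the unbinned micrograph. `boxes_to_coords` is a hypothetical name.

```python
import json


def boxes_to_coords(json_text, boxsize=256, binning=1, max_x=4096, max_y=4096):
    """Scale box centres by the binning factor and drop boxes crossing the edges."""
    info = json.loads(json_text)
    half = boxsize / 2.0
    coords = []
    for box in info.get("boxes", []):  # assumption: entries look like [x, y, type]
        x = box[0] * binning
        y = box[1] * binning
        # Keep the box only if it lies fully inside the micrograph.
        if half <= x <= max_x - half and half <= y <= max_y - half:
            coords.append((x, y))
    return coords


example = '{"boxes": [[100, 100, "manual"], [1000, 1000, "manual"], [2040, 500, "manual"]]}'
print(boxes_to_coords(example, binning=2))
```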
Perform basic math operations on star file values.
--i Input STAR filename with particles.
--o Output STAR filename.
--data Data table from star file to be used, Default: data_particles
--lb Label used for math operation. e.g. rlnAngleTilt, rlnDefocusU...
--op Operator used for the math operation. Allowed: "+", "-", "*", "/","^","abs","=","mod","remainder". Use double quotes!!!
--val Value used for math operation.
--sellb Label used for selection. e.g. rlnAngleTilt, rlnDefocusU... Default: None
--selop Operator used for comparison. Allowed: "=", "!=", ">=", "<=", "<". Use double quotes!!!
--selval Value used for comparison. Used together with --selop parameter.
--rh Selection range Hi (upper bound). Default: Disabled
--rl Selection range Lo (lower bound). Default: Disabled
Example 1: Add 15 deg to rlnAngleTilt.
math_star.py --i input.star --o output.star --lb rlnAngleTilt --op "+" --val 15
Example 2: Multiply rlnOriginX by 2.
math_star.py --i input.star --o output.star --lb rlnOriginX --op "*" --val 2
Example 3: Compute remainder of rlnAngleRot where rlnGroupNumber is 2.
math_star.py --i input.star --o output.star --lb rlnAngleRot --op "remainder" --sellb rlnGroupNumber --selval 2
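The combination of a math operator with an optional selection can be sketched as follows. This is an illustration under assumptions, not the script's code: tables are lists of dicts, `math_rows` is a hypothetical name, the default selection operator "=" is assumed, and "mod" and "remainder" are left out because their exact semantics are not specified here.

```python
import operator

MATH_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "^": operator.pow,
    "abs": lambda current, _value: abs(current),  # ignores the supplied value
    "=": lambda _current, value: value,           # overwrite with the value
}

SELECT_OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    "<": operator.lt,
}


def math_rows(rows, lb, op, val=0.0, sellb=None, selop="=", selval=None):
    """Apply op to label lb, optionally only on rows passing the selection."""
    out = []
    for row in rows:
        new_row = dict(row)
        selected = sellb is None or SELECT_OPS[selop](row[sellb], selval)
        if selected:
            new_row[lb] = MATH_OPS[op](row[lb], val)
        out.append(new_row)
    return out


table = [
    {"rlnAngleTilt": 10.0, "rlnGroupNumber": 1},
    {"rlnAngleTilt": -20.0, "rlnGroupNumber": 2},
]
print(math_rows(table, "rlnAngleTilt", "+", 15))
print(math_rows(table, "rlnAngleTilt", "abs", sellb="rlnGroupNumber", selval=2))
```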
Base library required by all scripts.
Extracts coordinates from particles STAR file and saves as per micrograph box files.
--i Input STAR filename with particles.
--o Output directory where the box files will be stored.
--box_size Box size. Default: 256
Extracts coordinates from particles STAR file and saves as per micrograph coords star files.
--i Input STAR filename with particles.
--o Output directory where the coords files will be stored.
Plots values of defined label(s) from STAR file.
--i Input STAR filename. Multiple files allowed separated by comma or by space (then all must be enclosed in double quotes).
--data Data table from star file to be used (Default: data_particles).
--lbx Label used for X axis (Default: None). If not defined, X axis is per record in the data table (e.g. per particle)
--lby Labels used for plot (Y-axis values). Accepts multiple labels to plot (separated by comma, or by space (then all must be enclosed in double quotes)).
--hist_bins Number of bins for plotting a histogram. If set to >0 then histogram is plotted.
--scatter Sets scatter type of plot.
--threshold Draw a threshold line at the defined y value. Multiple values accepted, separated by comma (e.g. 0.5,0.143). (Default: none)
--thresholdx Draw a threshold line at the defined x value. Multiple values accepted, separated by comma (e.g. 0.5,0.143). (Default: none)
--multiplotY Create separate plot for each --lby in a grid (Default: 1,1 = single plot). Define in parameter number of rows and columns (e.g. --multiplotY "2,3")
--multiplotFile Create separate plot for each file in --i in a grid (Default: 1,1 = single plot). Define in parameter number of rows and columns (e.g. --multiplotFile "2,3")
Example 1: Creates a scatter plot of DefocusU values per particle
plot_star.py --i input.star --lby rlnDefocusU --scatter
Example 2: Creates a histogram (in 50 bins) of DefocusU values per particle
plot_star.py --i input.star --lby rlnDefocusU --hist_bins 50
Example 3: Creates a scatter plot of DefocusU and DefocusV values in single plot
plot_star.py --i input.star --lby rlnDefocusU,rlnDefocusV --scatter
Example 4: Creates a scatter plot of DefocusU dependent on DefocusV values in single plot
plot_star.py --i input.star --lby rlnDefocusU --lbx rlnDefocusV --scatter
Example 5: Plot rlnDefocusU, rlnDefocusV values in 2 separate plots (1 row, 2 plots)
plot_star.py --i micrographs_all_gctf_og.star --lby rlnDefocusU,rlnDefocusV --data data_micrographs --hist_bins 50 --multiplotY "1,2"
Note: If the grid has fewer tiles than the number of --lby labels, plotting wraps around and continues from the first tile.
Example 6: Plot rlnDefocusU, rlnDefocusV values of 2 datasets in 2 plots (histogram)
plot_star.py --i micrographs_all_gctf_og.star,micrographs_all_gctf_og_200.star --lby rlnDefocusU,rlnDefocusV --data data_micrographs --hist_bins 50 --multiplotFile 1,2
Example 7: Plot out FSC from PostProcess
plot_star.py --i PostProcess/job001/post.star --lby rlnFourierShellCorrelationCorrected,rlnFourierShellCorrelationUnmaskedMaps,rlnCorrectedFourierShellCorrelationPhaseRandomizedMaskedMaps --lbx rlnResolution --data data_fsc --threshold 0.143
Example 8: Plot (compare) FSC curves from 2 separate postprocess jobs with multiple thresholds set
plot_star.py --i "PostProcess/job001/post.star PostProcess/job002/post.star" --lby rlnFourierShellCorrelationCorrected --lbx rlnResolution --data data_fsc --threshold 0.143,0.3,0.5
Example 9: Plot FSC from all iterations of an auto-refine run
plot_star.py --i "$(ls Refine3D/job003/run_it*_half1_model.star)" --lby rlnGoldStandardFsc --lbx rlnResolution --data data_model_class_1
Creates a regular pattern of small boxes around the center coordinate of the particle.
--i Input STAR filename with particles.
--o Output directory where the coords files will be stored.
--orig_box Size of the box in pixels around the center coordinate of the original particles. (Default: 512)
--pattern_box Size of the box in the regular pattern. (Default: 128)
--overlap Overlap in percents between the neighboring boxes in pattern. (Default: 30)
--sph_mask If set then only boxes inside a circular mask touching the orig_box are included.
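One way to picture the pattern is a square grid of box centres whose spacing follows from the box size and the overlap. The sketch below is only a guess at the geometry, with all names hypothetical. It assumes that the grid is centred on the particle, that overlap is below 100 %, and that --sph_mask keeps a box only when all of its corners lie inside the circle inscribed in the original box.

```python
import math


def pattern_centers(cx, cy, orig_box=512, pattern_box=128, overlap=30, sph_mask=False):
    """Centres of a regular grid of small boxes around (cx, cy) (sketch)."""
    step = pattern_box * (1.0 - overlap / 100.0)       # spacing between neighbours
    n = int((orig_box - pattern_box) // step) + 1      # boxes fitting along one axis
    start = -(n - 1) * step / 2.0                      # centre the grid on the particle
    half = pattern_box / 2.0
    radius = orig_box / 2.0
    centers = []
    for i in range(n):
        for j in range(n):
            dx = start + i * step
            dy = start + j * step
            # Farthest corner of the small box must stay inside the circular mask.
            if sph_mask and math.hypot(abs(dx) + half, abs(dy) + half) > radius:
                continue
            centers.append((cx + dx, cy + dy))
    return centers


print(len(pattern_centers(1000, 1000)))
print(len(pattern_centers(1000, 1000, sph_mask=True)))
```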
Converts particle star from RELION 3.1 format to RELION 3.0 format.
--i Input STAR filename (RELION 3.1 format).
--o Output STAR filename.
!!! DEPRECATED USE: remove_preferred_orient_hlpx.py, which gives better results !!! Remove particles with overrepresented orientations. Average count of particles at each orientation is calculated. Then the count of particles that are n-times SD over the average is modified by retaining the particles with the highest rlnMaxValueProbDistribution.
--i Input STAR filename with particles and orientations.
--o Output star file. Default: output.star
--sd Number of standard deviations (n) above the average count used as the threshold for overrepresented orientations. Default: 3
Remove particles with overrepresented orientations by sorting them into HealPix based orientation bins. Average count of particles per orientation bin is calculated. Then the count of particles that are n-times SD over the average is modified by retaining the particles with the highest rlnMaxValueProbDistribution.
--i Input STAR filename with particles and orientations.
--o Output star file. Default: output.star
--hlpx_order HealPix sampling order used for sorting particles into orientation bins (2->15deg,3->7.5deg, 4->3.75). Default: 4 (3.75deg)
--sd Number of standard deviations (n) above the average count used as the threshold for overrepresented orientations. Default: 3
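The trimming idea can be sketched without HealPix by assuming that every particle already carries an orientation bin number. In the sketch below, the "bin" key and the function name are hypothetical. It also assumes that the statistics are taken over occupied bins only and that an overrepresented bin is cut down to the threshold count. The script may differ on any of these points.

```python
from collections import defaultdict
from statistics import mean, pstdev


def trim_overrepresented(particles, sd=3.0):
    """Cut bins with more than mean + sd * SD particles, keeping the best ones."""
    bins = defaultdict(list)
    for particle in particles:
        bins[particle["bin"]].append(particle)
    counts = [len(members) for members in bins.values()]
    limit = mean(counts) + sd * pstdev(counts)  # threshold on the per-bin count
    kept = []
    for members in bins.values():
        if len(members) > limit:
            # Retain the particles with the highest rlnMaxValueProbDistribution.
            members = sorted(
                members,
                key=lambda p: p["rlnMaxValueProbDistribution"],
                reverse=True,
            )[: int(limit)]
        kept.extend(members)
    return kept


# Nine sparsely populated bins and one heavily overrepresented bin.
data = [{"bin": b, "rlnMaxValueProbDistribution": 0.5} for b in range(9) for _ in range(2)]
data += [{"bin": 9, "rlnMaxValueProbDistribution": i / 50} for i in range(50)]
print(len(data), len(trim_overrepresented(data, sd=1)))
```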
Deprecated - needs to be rewritten
Perform rotation of particles according to given euler angles.
--i Input STAR filename with particles.
--o Output STAR filename.
--rot Rotation Euler angle. Default 0
--tilt Tilt Euler angle. Default 0
--psi Psi Euler angle. Default 0
--x Shift along X axis. Default 0
--y Shift along Y axis. Default 0
--z Shift along Z axis. Default 0
Example:
rotate_particles_star.py --i input.star --o output.star --rot 15 --tilt 20 --psi 150
Limit orientations of particles in STAR file. Select particles that are in the defined range of ROT, TILT, PSI angle.
--i Input STAR filename with particles.
--o Output STAR filename.
--rot_min Minimum rot angle.
--rot_max Maximum rot angle.
--tilt_min Minimum tilt angle.
--tilt_max Maximum tilt angle.
--psi_min Minimum psi angle.
--psi_max Maximum psi angle.
Select one orientation per particle from 3D classified symmetry expanded star files according to the greatest value of rlnMaxValueProbDistribution.
--i Input STAR filename with particles.
--o Output STAR filename.
Example: You created a C5 symmetry expanded star file that was 3D classified into 5 classes. You select the best looking class, which should in theory contain 1/5 of the particles from the symmetry expanded star. Because the classification is not perfect, multiple redundant (symmetry) copies of some of the particles are present in the selected class. To keep only a single (unique) copy of every particle you can use this script, which will choose the copy with the greatest value of rlnMaxValueProbDistribution.
select_maxprob_sym_copy_ptcls.py --i selected_class.star --o selected_class_unique.star
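The selection amounts to a "best per group" pass over the particles. The sketch below assumes that symmetry copies of one particle share the same rlnImageName and that the table is a list of dicts. `select_best_copy` is a hypothetical name and the script may identify copies differently.

```python
def select_best_copy(particles, key="rlnImageName"):
    """Keep, per particle, the copy with the highest rlnMaxValueProbDistribution."""
    best = {}
    for particle in particles:
        name = particle[key]
        current = best.get(name)
        if (current is None
                or particle["rlnMaxValueProbDistribution"]
                > current["rlnMaxValueProbDistribution"]):
            best[name] = particle
    return list(best.values())


copies = [
    {"rlnImageName": "1@p.mrcs", "rlnAngleRot": 0.0, "rlnMaxValueProbDistribution": 0.2},
    {"rlnImageName": "1@p.mrcs", "rlnAngleRot": 72.0, "rlnMaxValueProbDistribution": 0.7},
    {"rlnImageName": "2@p.mrcs", "rlnAngleRot": 144.0, "rlnMaxValueProbDistribution": 0.4},
]
print(select_best_copy(copies))
```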
Select random orientation from symmetry expanded star files. One orientation per particle.
--i Input STAR filename with particles.
--o Output STAR filename.
Select particles complying with selection rule on specified label.
--i Input STAR filename with particles.
--o Output STAR filename.
--data Data table from star file to be used, Default: data_particles
--lb Label used for selection. e.g. rlnAngleTilt, rlnDefocusU...
--op Operator used for comparison. Allowed: "=", "!=", ">=", "<=", "<". Use double quotes!!!
--val Value used for comparison. Used together with --op parameter.
--rh Range Hi (upper bound). If defined, --op and --val are disabled.
--rl Range Lo (lower bound). If defined, --op and --val are disabled.
--prctl_h Select particles above defined percentile of values (e.g. 25, 50, 75). Used together with --lb parameter
--prctl_l Select particles below defined percentile of values (e.g. 25, 50, 75). Used together with --lb parameter
Example 1: Select lines from input.star where the source micrograph does not equal mic123456789.mrc
select_values_star.py --i input.star --o output.star --lb rlnMicrographName --op "!=" --val mic123456789.mrc
Example 2: Select lines from input.star where tilt angles are less than 15 deg.
select_values_star.py --i input.star --o output.star --lb rlnAngleTilt --op "<" --val 15
Example 3: Select particles where rlnMaxValueProbDistribution values are above 75-th percentile.
select_values_star.py --i input.star --o output.star --lb rlnMaxValueProbDistribution --prctl_h 75
Example 4: Select particles where rlnMaxValueProbDistribution values are below 75-th percentile.
select_values_star.py --i input.star --o output.star --lb rlnMaxValueProbDistribution --prctl_l 75
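The percentile selection of Examples 3 and 4 can be sketched as below. The percentile here uses linear interpolation between sorted values and strict comparisons. Both are assumptions, as are the function names, and the script may compute it differently.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def select_by_percentile(rows, lb, prctl_h=None, prctl_l=None):
    """Keep rows above prctl_h and/or below prctl_l of the values of label lb."""
    values = [row[lb] for row in rows]
    selected = rows
    if prctl_h is not None:
        cut = percentile(values, prctl_h)
        selected = [row for row in selected if row[lb] > cut]
    if prctl_l is not None:
        cut = percentile(values, prctl_l)
        selected = [row for row in selected if row[lb] < cut]
    return selected


rows = [{"rlnMaxValueProbDistribution": float(v)} for v in (1, 2, 3, 4, 5)]
print(select_by_percentile(rows, "rlnMaxValueProbDistribution", prctl_h=75))
print(select_by_percentile(rows, "rlnMaxValueProbDistribution", prctl_l=75))
```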
Converts particle MRC stacks into separate MRC files and generates a micrograph star file for them. In comparison to split_stacks.py, it is easier to keep track of the original particle in the resulting star file.
--i Input STAR filename.
--o Output prefix.
Example:
# Let's assume you have following input.star file:
_rlnImageName
_rlnMicrographName
1@particles/mic123682.mrcs Micrographs/mic123682.mrc
6@particles/mic123682.mrcs Micrographs/mic123682.mrc
2@particles/mic777772.mrcs Micrographs/mic777772.mrc
# running the following command
split_particles_to_micrographs.py --i input.star --o splitParticles
# will create a directory named splitParticles and a file splitParticles.star which will look like this:
_rlnMicrographName
splitParticles/mic123682_1.mrc
splitParticles/mic123682_6.mrc
splitParticles/mic777772_2.mrc
# Note: All other labels except rlnImageName are preserved in the output file
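The naming scheme in the example above (stack index and stack name combined into one file name) can be reproduced with a few lines of Python. `split_name` is a hypothetical helper that only mirrors the example and is not the script's own function.

```python
import os


def split_name(image_name, out_prefix):
    """Map 'N@dir/stack.mrcs' to 'out_prefix/stack_N.mrc' as in the example."""
    index, stack = image_name.split("@", 1)
    stem = os.path.splitext(os.path.basename(stack))[0]
    # int() also handles zero-padded indices such as 000001.
    return "%s/%s_%d.mrc" % (out_prefix, stem, int(index))


for name in ("1@particles/mic123682.mrcs",
             "6@particles/mic123682.mrcs",
             "2@particles/mic777772.mrcs"):
    print(split_name(name, "splitParticles"))
```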
Splits MRC stacks listed in STAR file into separate files, and writes a new STAR file with split files info.
--i Input STAR filename.
--o_dir Output folder.
--o_pref Output image prefix.
Print basic statistics on numerical labels present in STAR file
--i Input STAR filename.
--lb Labels used for statistics (Default: ALL). Multiple labels can be used enclosed in double quotes. (e.g. "rlnAngleTilt rlnAngleRot")
--data Data table from star file to be used, Default: data_particles
Example 1: Print out statistics on particles - all labels:
stats_star.py --i input.star
Example 2: Print out statistics on data_model_class_1 - all labels
stats_star.py --i run_model.star --data data_model_class_1
Example 3: Print out statistics on rlnAngleTilt:
stats_star.py --i input.star --lb rlnAngleTilt
Example 4: Print out statistics on rlnAngleTilt, rlnAngleRot and rlnAnglePsi
stats_star.py --i input.star --lb "rlnAngleTilt rlnAngleRot rlnAnglePsi"
Perform transformation of euler angles to produce X-flipped reconstruction. The resulting map is the same as if "--invert_hand" is applied on the map in relion_image_handler.
--i Input STAR filename with particles.
--o Output STAR filename.
Perform transformation of euler angles to produce Y-flipped reconstruction.
--i Input STAR filename with particles.
--o Output STAR filename.