Releases: eqcorrscan/EQcorrscan

EQcorrscan 0.5.0

12 Dec 00:57
442fd76

This release represents a significant increase in efficiency for large-scale matched-filters in EQcorrscan. Lots of work has gone into reducing memory usage in the non-correlation components of the matched-filter workflow, streamlining the code, making better use of shared-memory multi-threaded parallelism and increasing CPU loads. In our testing we can now achieve and maintain >190% CPU efficiency (i.e. >95% hyperthreaded performance). We can also better load GPUs by making use of concurrent CPU and GPU processing of workflow steps. You should not need to change your code to make use of most of these speed-ups. Hopefully you will notice that you can run larger datasets faster than ever!

Changelog

  • core.match_filter.tribe
    • Significant re-write of detect logic to take advantage of parallel steps (see #544)
    • Significant re-structure of hidden functions.
  • core.match_filter.matched_filter
    • 5x speed up for MAD threshold calculation with parallel (threaded) MAD
      calculation (#531).
  • core.match_filter.detect
    • 1000x speedup for retrieving unique detections for all templates.
    • 30x speedup in handling detections (50x speedup in selecting detections,
      4x speedup in adding prepick time)
  • core.match_filter.template
    • new quick_group_templates function for 50x quicker template grouping.
    • Templates with nan channels will be considered equal to other templates with shared
      nan channels.
    • New grouping strategy to minimise nan-channels - templates are grouped by
      similar seed-ids. This should speed up both correlations and
      prep_data_for_correlation. See PR #457.
  • utils.pre_processing
    • _prep_data_for_correlation: 3x speedup for filling NaN-traces in templates
    • New function quick_trace_select for very efficient selection of traces
      by seed ID without wildcards (4x speedup).
    • process, dayproc and shortproc replaced by multi_process. Deprecation
      warning added.
    • multi_process implements multithreaded GIL-releasing parallelism of slow
      sections (detrending, resampling and filtering) of the processing workflow.
      Multiprocessing is no longer supported or needed for processing. See PR #540
      for benchmarks. The new approach is slightly faster overall, and significantly
      more memory efficient (uses c. 6x less memory than the old multiprocessing
      approach on a 12-core machine).
  • utils.correlate
    • 25 % speedup for _get_array_dicts with quicker access to properties.
  • utils.catalog_to_dd
    • _prepare_stream
      • Now more consistently slices templates to length = extract_len * samp_rate
        so that the user receives fewer warnings about insufficient data.
    • write_correlations
      • New option use_shared_memory to speed up correlation of many events by
        ca. 20 % by moving trace data into shared memory.
      • Add ability to weight correlations by raw correlation rather than just
        correlation squared.
  • utils.cluster.decluster_distance_time
    • Bug-fix: fix segmentation fault when declustering more than 46340 detections
      with hypocentral_separation.
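
The 46340 limit in the declustering bug-fix above is suggestive: it is the largest integer whose square still fits in a signed 32-bit integer. The release notes do not state the cause of the segmentation fault, so the following is only a plausible reading, assuming an n x n pairwise array was indexed with 32-bit integers. The helper name is hypothetical and is not part of EQcorrscan.

```python
import math

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer


def pairwise_size_fits_int32(n_detections: int) -> bool:
    """True if an n x n pairwise array can be indexed with a signed 32-bit int.

    Hypothetical helper for illustration only.
    """
    return n_detections * n_detections <= INT32_MAX


# Largest detection count whose square still fits in 32 bits.
largest_safe = math.isqrt(INT32_MAX)
print(largest_safe)

# One detection more and the n * n index would overflow.
print(pairwise_size_fits_int32(46340), pairwise_size_fits_int32(46341))
```

If that reading is right, any count above 46340 would produce an out-of-range index under 32-bit arithmetic, which is consistent with a crash rather than a Python exception.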

EQcorrscan 0.5.0rc0

12 Dec 00:17
442fd76
Pre-release

Release candidate for version 0.5.0

EQcorrscan 0.4.4

09 Aug 22:15
c822f57

EQcorrscan 0.4.4:

Changelog

  • core.match_filter
    • Bug-fix: peak-cores could be defined twice in _group_detect through kwargs.
      Fix: only update peak_cores if it isn't there already.
  • core.match_filter.tribe
    • Detect now allows passing of pre-processed data
  • core.match_filter.template
    • Remove duplicate detections from overlapping windows using ._uniq()
  • core.lag_calc._xcorr_interp
    • CC-interpolation replaced with resampling (more robust); old method
      deprecated. Use the new method by passing use_new_resamp_method=True as
      a keyword argument.
  • core.lag_calc:
    • Fixed bug where minimum CC defined via min_cc_from_mean_cc_factor was not
      set correctly for negative correlation sums.
  • utils.correlate
    • Fast Matched Filter now supported natively for version >= 1.4.0
    • Only full correlation stacks are returned now (e.g. where fewer than
      the full number of channels are in the stack at the end of the stack, zeros
      are returned).
  • utils.mag_calc.relative_magnitude
    • Fixed bug where S-picks / traces were used for relative-magnitude calculation
      against the user's choice.
    • Implemented full magnitude bias-correction for CC and SNR
  • utils.mag_calc.relative_amplitude:
    • Returns dicts for SNR measurements
  • utils.catalog_to_dd.write_correlations
    • Fixed bug in parallel execution.
    • Added parallel options for catalog-dt measurements and for stream preparation
      before cross-correlation-dt measurements.
    • Default parallelization of dt-computation is now across events (loads CPUs
      more efficiently), and there is a new option max_trace_workers to use
      the old parallelization strategy across traces.
    • Now includes an all_horiz option that correlates all matching horizontal
      channels, regardless of which of them the S-pick is linked to.
  • utils.clustering
    • Allow indirect comparison of event waveforms (i.e., events without
      matching traces can be compared indirectly via a third event)
    • Allow setting the clustering method, metric, and sort_order from
      scipy.cluster.hierarchy.linkage.
  • tribe, template, template_gen, archive_read, clustering: remove option to read
    from seishub (deprecated in obspy).
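
As a rough illustration of the duplicate-removal item above: when consecutive data windows overlap, the same event can be detected once in each window. The sketch below assumes a detection is identified by its template name and detection time. The Detection tuple and unique_detections helper are hypothetical and are not EQcorrscan's ._uniq() implementation.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """Hypothetical, simplified detection record."""
    template_name: str
    detect_time: float  # seconds since some reference epoch
    detect_val: float   # summed correlation


def unique_detections(detections: List[Detection]) -> List[Detection]:
    """Keep one detection per (template, time), preferring the largest |detect_val|."""
    best = {}
    for det in detections:
        key = (det.template_name, det.detect_time)
        if key not in best or abs(det.detect_val) > abs(best[key].detect_val):
            best[key] = det
    # Return in chronological order for readability.
    return sorted(best.values(), key=lambda d: (d.detect_time, d.template_name))


# Two overlapping windows that both contain the "t2" detection at 150 s.
window_1 = [Detection("t1", 100.0, 6.2), Detection("t2", 150.0, 5.1)]
window_2 = [Detection("t2", 150.0, 5.1), Detection("t1", 300.0, -7.0)]
unique = unique_detections(window_1 + window_2)
print(len(unique))
```

Exact equality on detect_time is a simplification; real detection times may need a tolerance.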

EQcorrscan 0.4.4 Release Candidate 0

09 Aug 21:18
c822f57
Pre-release

Release candidate 0 for release 0.4.4.

EQcorrscan Version 0.4.3

21 Apr 00:12
d3f2e97

Changelog

  • core.match_filter
    • match_filter:
      • Provide option of exporting the cross-correlation sums for additional later analysis.
  • core.match_filter.party.write
    • BUG-FIX: When format='tar' is selected, added a check for the .tgz file suffix before checking the filename against an existing file. Previously, supplying a filename without the '.tgz' suffix caused an existing file to be overwritten, contrary to the function's intent.
    • Add option overwrite=True to allow overwriting of existing files.
  • core.match_filter.party.read
    • BUG-FIX: Ensure wildcard reading works as expected: #453
  • core.match_filter.party.rethreshold:
    • added option to rethreshold based on absolute values to keep relevant detections with large negative detect_val.
  • core.lag_calc:
    • Added option to set minimum CC threshold individually for detections based on: min(detect_val / n_chans * min_cc_from_mean_cc_factor, min_cc).
    • Added the ability to save correlation data from lag_calc.
  • utils.mag_calc.calc_b_value:
    • Added useful information to doc-string regarding method and meaning of residuals
    • Changed the number of magnitudes used to an int (from a string!?)
  • utils.mag_calc.relative_magnitude:
    • Refactor so that min_cc is used regardless of whether weight_by_correlation is set. See issue #455.
  • utils.archive_read
    • Add support for wildcard-comparisons in the list of requested stations and channels.
    • New option arctype='SDS' to read from a SeisComP Data Structure (SDS). This option is also available in utils.clustering.extract_detections and in utils.archive_read._check_available_data.
  • utils.catalog_to_dd
    • Bug-fixes in #424:
      • only P and S phases are used now (previously spurious amplitude picks were included in correlations);
      • Checks for length are done prior to correlations and more helpful error outputs are provided.
      • Progress is not reported within dt.cc computation
    • write_station now supports writing elevations: #424.
  • utils.clustering
    • For cluster, distance_matrix and cross_chan_correlation, implemented full support for shift_len != 0. The latter two functions now return, in addition to the distance-matrix, a shift-matrix (both functions) and a shift-dictionary (for distance_matrix). New option for shifting streams as a whole or letting traces shift individually (allow_individual_trace_shifts=True).
  • utils.plotting
    • Function added (twoD_seismplot) for plotting seismicity (#365).
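
The per-detection minimum CC threshold listed under core.lag_calc above can be worked through numerically. This sketch evaluates the formula exactly as written in these notes. The function name and the example values are hypothetical, and it is not the lag_calc implementation.

```python
def detection_min_cc(detect_val, n_chans, min_cc_from_mean_cc_factor, min_cc):
    """Evaluate min(detect_val / n_chans * min_cc_from_mean_cc_factor, min_cc)."""
    mean_cc = detect_val / n_chans  # average correlation per channel
    return min(mean_cc * min_cc_from_mean_cc_factor, min_cc)


# Strong detection: mean CC 0.8, scaled to 0.4, capped by min_cc.
strong = detection_min_cc(8.0, 10, min_cc_from_mean_cc_factor=0.5, min_cc=0.3)

# Weak detection: mean CC 0.4, scaled to 0.2, which is below min_cc.
weak = detection_min_cc(4.0, 10, min_cc_from_mean_cc_factor=0.5, min_cc=0.3)

# Negative correlation sum: the formula as written gives a negative threshold.
negative = detection_min_cc(-6.0, 10, min_cc_from_mean_cc_factor=0.5, min_cc=0.3)

print(strong, weak, negative)
```

The negative case shows why the 0.4.4 notes mention a fix for negative correlation sums. How that fix is implemented is not shown here.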

EQcorrscan 0.4.3 Release Candidate 0

20 Apr 22:11
d3f2e97
Pre-release

EQcorrscan Version 0.4.2

13 Jul 23:30

A Python package for the detection and analysis of repeating and near-repeating seismicity.

Changelog

  • Add seed-ids to the _spike_test's message.
  • utils.correlate
    • Cross-correlation normalisation errors no longer raise an error
    • When "out-of-range" correlations occur a warning is given by the C-function
      with details of what channel, what template and where in the data vector
      the issue occurred for the user to check their data.
    • Out-of-range correlations are set to 0.0
    • After extensive testing these errors have always been related to data issues
      within regions where correlations should not be computed (spikes, step
      artifacts due to incorrectly padding data gaps).
    • USERS SHOULD BE CAREFUL TO CHECK THEIR DATA IF THEY SEE THESE WARNINGS
  • utils.mag_calc.amp_pick_event
    • Added option to output IASPEI standard amplitudes, with static amplification
      of 1 (rather than 2080 as per Wood Anderson specs).
    • Added filter_id and method_id to amplitudes to make these methods more
      traceable.
  • core.match_filter
    • Bug-fix - cope with data that are too short with ignore_bad_data=True.
      This flag is generally not advised, but when used, may attempt to trim all
      data to zero length. The expected behaviour is to remove bad data and run
      with the remaining data.
    • Party:
      • decluster now accepts a hypocentral_separation argument. This allows
        the inclusion of detections that occur close in time, but not in space.
        This is underwritten by a new findpeaks.decluster_dist_time function
        based on a new C-function.
    • Tribe:
      • Add monkey-patching for clients that do not have a get_waveforms_bulk
        method for use in .client_detect. See issue #394.
  • utils.pre_processing
    • Only templates that need to be reshaped are reshaped now - this can be a lot
      faster.
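
The hypocentral_separation option for Party.decluster described above can be pictured as a greedy declustering in which a weaker detection is dropped only if it is close to a stronger one in both time and space. This is a minimal pure-Python sketch of that idea, not the C-backed implementation. The Detection tuple, the Cartesian kilometre coordinates and the trig_int parameter name are assumptions made for illustration.

```python
import math
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    """Hypothetical, simplified detection record."""
    detect_time: float                        # seconds
    detect_val: float                         # summed correlation
    location_km: Tuple[float, float, float]   # assumed Cartesian x, y, z in km


def decluster_time_distance(
    detections: List[Detection], trig_int: float, hypocentral_separation: float
) -> List[Detection]:
    """Greedy declustering: strongest detections are considered first."""
    kept: List[Detection] = []
    for det in sorted(detections, key=lambda d: abs(d.detect_val), reverse=True):
        # A clash requires closeness in time AND in space.
        clash = any(
            abs(det.detect_time - other.detect_time) < trig_int
            and math.dist(det.location_km, other.location_km) < hypocentral_separation
            for other in kept
        )
        if not clash:
            kept.append(det)
    return sorted(kept, key=lambda d: d.detect_time)


detections = [
    Detection(10.0, 8.0, (0.0, 0.0, 5.0)),
    Detection(11.0, 6.0, (0.5, 0.0, 5.0)),   # close in time and space to the first
    Detection(11.5, 7.0, (40.0, 0.0, 5.0)),  # close in time, far away in space
    Detection(100.0, 5.0, (0.0, 0.0, 5.0)),  # far away in time
]
kept = decluster_time_distance(detections, trig_int=5.0, hypocentral_separation=10.0)
kept_times = [d.detect_time for d in kept]
print(kept_times)
```

The detection at 11.5 s survives despite being within trig_int of the strongest detection, because it is more than hypocentral_separation away from it.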

Version 0.4.2 Release Candidate 0

13 Jul 23:07
Pre-release

Pre-release for 0.4.2 for testing on conda-forge

EQcorrscan Version 0.4.1

18 Apr 03:44
1484d09

Changelog

  • core.match_filter
    • BUG-FIX: Empty families are no longer run through lag-calc when using
      Party.lag_calc(). Previously this resulted in a "No matching data" error,
      see #341.
  • core.template_gen
    • BUG-FIX: Fix bug where events were incorrectly associated with templates
      in Tribe().construct() if the given catalog contained events outside
      of the time-range of the stream. See issue #381 and PR #382.
  • utils.catalog_to_dd
    • Added ability to turn off parallel processing (this is turned off by
      default now) for write_correlations - parallel processing for moderate
      to large datasets was copying far too much data and using lots of memory.
      This is a short-term fix - ideally we will move filtering and resampling to
      C functions with shared-memory parallelism and GIL releasing.
      See PR #374.
    • Moved parallelism for _compute_dt_correlations to the C functions to
      reduce memory overhead. Using a generator to construct sub-catalogs rather
      than making a list of lists in memory. See issue #361.
  • utils.mag_calc:
    • amp_pick_event now works on a copy of the data by default
    • amp_pick_event uses the appropriate digital filter gain to correct the
      applied filter. See issue #376.
    • amp_pick_event rewritten for simplicity.
    • amp_pick_event now has simple synthetic tests for accuracy.
    • _sim_wa uses the full response information to correct to velocity;
      this includes FIR filters (previously not used), and ensures that the
      Wood-Anderson poles (with a single zero) are correctly applied to velocity
      waveforms.
    • calc_max_curv is now computed using the non-cumulative distribution.
  • Fixed a problem in _match_filter_plot: it now shows all new detections.
  • Add plotdir to eqcorrscan.core.lag_calc.lag_calc function to save the images.
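
The calc_max_curv item above refers to the maximum-curvature estimate of the magnitude of completeness, which in its common form is the magnitude bin holding the most events in the non-cumulative frequency-magnitude distribution. The sketch below illustrates that definition only. The function name, the bin_size default and the tie-breaking rule are assumptions, and it is not the calc_max_curv code.

```python
from collections import Counter
from typing import Iterable


def max_curvature_mc(magnitudes: Iterable[float], bin_size: float = 0.1) -> float:
    """Return the centre of the most populated magnitude bin."""
    # Integer bin indices avoid using floating-point values as dictionary keys.
    counts = Counter(round(m / bin_size) for m in magnitudes)
    if not counts:
        raise ValueError("need at least one magnitude")
    # Most events first; ties go to the smaller magnitude (an arbitrary choice).
    best_bin = min(counts, key=lambda b: (-counts[b], b))
    return round(best_bin * bin_size, 10)


magnitudes = [0.8, 1.0, 1.0, 1.1, 1.1, 1.1, 1.2, 1.2, 1.5, 2.0, 2.4]
mc = max_curvature_mc(magnitudes)
print(mc)
```

Using the cumulative distribution instead would always pick the smallest magnitude, since the cumulative count is largest there, which is why the non-cumulative counts are the ones to maximise.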

Version 0.4.1 Release Candidate 0

18 Apr 02:26
1484d09
Pre-release

Pre-release for 0.4.1