1. Using the Pre‐Trained FSP‐Net

The network has been trained on a simple 5-parameter model from Xspec, TBabs(simplcutx$\otimes$ezdiskbb).
Pre-trained weights are therefore provided if this is the desired model for parameter prediction.

There is a configuration file, config.yaml, which has several variables that can be changed. There are five sections within config.yaml:

  • global-variables - Variables shared across different scripts
  • spectrum-fit - Main autoencoder script for spectrum_fit.py
  • data-preprocessing - Data preprocessing script for data_preprocessing.py
  • synthesize-spectra - Synthetic spectra generation script for synthesize_spectra.py
  • network-optimizer - Network optimisation script for network_optimizer.py

These sections will be referenced throughout this README.
Some parameters found in each section may be set under global-variables.

All file paths can be absolute or relative.
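Since config.yaml is a standard YAML file, it can be inspected directly; a minimal sketch, assuming PyYAML is installed:

```python
import yaml

# Load config.yaml and list its top-level sections
with open('../config.yaml', 'r', encoding='utf-8') as file:
    config = yaml.safe_load(file)

# Should print the five sections listed above
print(list(config.keys()))
```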

Data Requirements

To generate parameters for a dataset, first make sure that the data has been properly preprocessed.

  • The network is designed for an input spectrum of size 240.
  • All spectra must be saved in a .pickle file as a numpy array of dimensions $n \times 240$ in a dictionary with the key spectra
  • The network is designed for pre-normalised data; however, it may still perform well on data that is not normalised, or it may require retraining. The preprocessing steps, sketched in code after this list, are:
    1. Divide spectrum and background by their respective exposure time
    2. Subtract background from the spectrum
    3. Bin the spectrum and divide by the energy of each bin
    4. Divide by the number of detectors
    5. Remove energies outside the range of 0.3 - 10 keV

If these requirements are not met, modules from data_preprocessing.py can be used to perform all the necessary preprocessing.
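For reference, below is a minimal sketch of the normalisation steps using numpy; the function and variable names are hypothetical, the dummy inputs stand in for real data, and the binning to 240 channels within 0.3 - 10 keV (steps 3 and 5) is assumed to have been done already:

```python
import pickle
import numpy as np

def normalise_spectrum(counts, background, exposure, bg_exposure,
                       bin_energies, n_detectors):
    """Hypothetical sketch; inputs are assumed to already be
    binned to 240 channels within 0.3 - 10 keV."""
    # 1. Divide spectrum and background by their respective exposure times
    rate = counts / exposure
    bg_rate = background / bg_exposure
    # 2. Subtract background from the spectrum
    net_rate = rate - bg_rate
    # 3. Divide by the energy of each bin
    flux = net_rate / bin_energies
    # 4. Divide by the number of detectors
    return flux / n_detectors

# Save an n x 240 array under the key 'spectra', as the network expects
spectra = np.stack([
    normalise_spectrum(np.random.poisson(100, 240), np.random.poisson(10, 240),
                       1e3, 1e3, np.linspace(0.3, 10, 240), 4)
    for _ in range(8)
])
with open('processed_spectra.pickle', 'wb') as file:
    pickle.dump({'spectra': spectra}, file)
```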

Data Preprocessing

If your data doesn't meet the requirements above, data_preprocessing.py can produce data in the required format.

  1. Set data paths in data-preprocessing:
    • spectra-directory: spectra fits files path
    • background-directory: path to the background fits files (combined with the background path stored in each spectrum's fits file)
    • processed-path: processed output file path
    • Other settings are optional to change
  2. Run python3 -m fspnet.data_preprocessing from the root directory, or import using from fspnet.data_preprocessing import preprocess; see the sketch after this list.
    The config file path can be passed as an optional argument; the default is ../config.yaml
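For example, the data-preprocessing section might look like this (the paths shown are placeholders):

```yaml
data-preprocessing:
  spectra-directory: '../data/spectra/'
  background-directory: '../data/backgrounds/'
  processed-path: '../data/processed_spectra.pickle'
```

A minimal sketch of running the preprocessing from Python, assuming preprocess accepts the config file path as its optional argument:

```python
from fspnet.data_preprocessing import preprocess

# Run preprocessing using the settings in config.yaml;
# the path argument is optional and defaults to ../config.yaml
preprocess('../config.yaml')
```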

Network Configuration

To configure the network, first, check the settings under spectrum-fit:

  • training options:
    • encoder-load: 1
    • encoder-name: 'Encoder V10'
    • network-configs-directory: '../network_configs/'; make sure this directory exists and contains the file Encoder V10.json
  • data options:
    • encoder-data-path: path to your data file containing the spectra that you want parameters for

All other options should be fine as defaults or aren't required.
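Put together, the relevant part of config.yaml might look like the following sketch; the nesting of training and data under spectrum-fit is assumed from the grouping above, and the data path is a placeholder:

```yaml
spectrum-fit:
  training:
    encoder-load: 1
    encoder-name: 'Encoder V10'
    network-configs-directory: '../network_configs/'
  data:
    encoder-data-path: '../data/processed_spectra.pickle'
```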

Fitting the Data

There are two methods to fit data:

  • Run spectrum_fit.py: Provides several analysis results; however, this makes it slower and requires more configuration settings. This approach is not recommended, but it is useful for seeing how the functions can be used
  • Import init from fspnet.spectrum_fit and call net.predict: Fast and integrates with your code; this is the recommended approach

For the first approach, simply run spectrum_fit.py and the results will be saved as a pickle file under the path set by parameter-predictions-path in the output section of spectrum-fit.
spectrum_fit.py can take the configuration file path as an optional argument; the default is ../config.yaml if yours differs from this.

For the second approach, init returns the autoencoder data loaders, the decoder data loaders, the decoder, and the autoencoder.
Then call data = net.predict(e_loaders[-1], path=predictions_path); this returns the predicted data as a dictionary and saves it to a pickle file.

Example code:

```python
from fspnet.spectrum_fit import init

# Initialise data loaders and networks
e_loaders, d_loaders, decoder, net = init()

# Generate predictions for the autoencoder validation data loader
# (e_loaders contains the training and validation data loaders)
data = net.predict(e_loaders[-1], path='optional/path/to/predictions')
```

The predictions generated by either method can be loaded from the file using pickle or, in the second method, obtained directly from the function's return value.
The predictions are a dictionary of Numpy arrays with the following keys (see the loading example after this list):

  • preds: Output predictions from the autoencoder, which correspond to the reconstructed spectra
  • latent: Latent space of the autoencoder, which corresponds to the predicted parameters
  • targets: Target latent space values, which correspond to the target parameters
  • inputs: Target outputs, which correspond to the input spectra
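For example, to load saved predictions and pull out the reconstructed spectra and predicted parameters (the file path is whatever was passed to net.predict or set in the configuration):

```python
import pickle

# Load the predictions saved by net.predict or spectrum_fit.py
with open('optional/path/to/predictions', 'rb') as file:
    data = pickle.load(file)

reconstructions = data['preds']  # reconstructed spectra
parameters = data['latent']      # predicted parameters
print(parameters.shape)          # (n, 5) for the 5-parameter model
```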