Similarity or Distance - Higher is closer or lower is closer? #73
Comments
Hi @Lucew, thanks for the question. We've never seen an application that required this to be so clearly differentiated, but I agree it would be useful to label as metadata. No one is actively maintaining this package at the moment, and I'll be on leave for a while, but I will add this to the list of new functionalities. For now, you will have to work this out yourself, either by understanding each algorithm or, more broadly, by checking how its output correlates with known correlation metrics or known distance metrics (in many cases, features will be strongly correlated or anticorrelated with them).
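The empirical check suggested above could be sketched roughly as follows. This is only an illustration, not part of pyspi: `guess_direction`, the toy data, and the choice of Pearson correlation as the reference similarity are all assumptions made for the example. It rank-correlates the pairwise values of an unknown measure with the reference and reads the sign.

```python
import numpy as np


def ordinal_ranks(values):
    # Ordinal ranks (ties broken by position); enough for a rough sign check.
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def guess_direction(measure_matrix, reference_similarity):
    """Hypothetical helper: label a pairwise measure by its rank agreement
    with a reference similarity matrix (positive -> similarity-like)."""
    measure_matrix = np.asarray(measure_matrix, dtype=float)
    reference_similarity = np.asarray(reference_similarity, dtype=float)
    # Use each unordered pair once and skip the diagonal.
    rows, cols = np.triu_indices(measure_matrix.shape[0], k=1)
    a = measure_matrix[rows, cols]
    b = reference_similarity[rows, cols]
    rho = np.corrcoef(ordinal_ranks(a), ordinal_ranks(b))[0, 1]
    return ("similarity" if rho > 0 else "distance"), rho


# Toy data: six z-scored random walks (an assumption for this sketch).
rng = np.random.default_rng(0)
data = rng.standard_normal((6, 200)).cumsum(axis=1)
data = (data - data.mean(axis=1, keepdims=True)) / data.std(axis=1, keepdims=True)

pearson = np.corrcoef(data)
euclidean = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=-1)

label_pearson, rho_pearson = guess_direction(pearson, pearson)
label_euclidean, rho_euclidean = guess_direction(euclidean, pearson)
print(label_pearson, label_euclidean)
```

A rank correlation near zero would mean the measure tracks neither direction of the reference, so the label should only be trusted when the magnitude is large. Measures that are informative in both tails (strongly anticorrelated pairs counting as "related") would need an absolute-value reference instead.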
@fkiraly Yes, that is precisely how I approach it.

Config Walker

import yaml


def iterate_config(config_path: str):
    # load the config we want to use for the pyspi run
    with open(config_path, 'r') as file:
        config = yaml.load(file, Loader=yaml.FullLoader)

    # go through the config and write single configs out of there
    for spi_type, spi_names in config.items():
        for spi_name, spi_settings in spi_names.items():
            # keep every setting except the list of configs
            keep_dict = {key: val for key, val in spi_settings.items() if key != 'configs'}

            # check whether we have configs, then iterate over those
            spi_configs = spi_settings.get('configs', None)
            if spi_configs is None:
                yield {spi_type: {spi_name: {**keep_dict, 'configs': None}}}
            else:
                for single_config in spi_configs:
                    # this needs to be a list otherwise it will fail!
                    # a fresh dict per yield keeps collected results independent
                    yield {spi_type: {spi_name: {**keep_dict, 'configs': [single_config]}}}

@benfulcher Once I'm done, and if there is time, I could post a preliminary list in this issue of which SPIs are distances and which are similarities.
@benfulcher, if no one is actively maintaining the package, would you consider a maintenance collaboration, with a refactor towards #72?
First of all, thanks for creating this awesome package.
I want to use it to evaluate the measures for finding related time series within industrial/building time series datasets.
For that, I have the following questions:
Is there a place where I can quickly gather whether the outcome of a measure is a similarity or a distance? In other words, whether something is more related if the number is higher (similarity) or more related if the metric is lower (distance).
The simplest example would be the differentiation between correlation and Euclidean distance of two time series. For related time series, one could expect the correlation to be high but the Euclidean distance to be low. When, for example, searching for the k most related time series throughout the different metrics implemented here, this information is of great interest.
I'm sorry if I missed this in the paper or elsewhere. If the information exists anywhere, please point me to it.
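To make the correlation versus Euclidean distance example concrete, here is a minimal sketch of a k-most-related search that takes the direction of each measure into account. The helper `top_k_related`, its `higher_is_closer` flag, and the toy values are hypothetical and not part of pyspi; the flag is exactly the piece of metadata this issue asks about.

```python
import numpy as np


def top_k_related(scores, k, higher_is_closer):
    """Indices of the k most related candidates for one query series."""
    scores = np.asarray(scores, dtype=float)
    # Flip the sign of similarities so that, in both cases, smaller means closer.
    keyed = -scores if higher_is_closer else scores
    return np.argsort(keyed, kind="stable")[:k].tolist()


# Toy values of two measures between one query series and four candidates.
correlation = [0.95, 0.10, 0.80, -0.30]  # similarity: higher is closer
euclidean = [0.4, 5.1, 1.2, 7.9]         # distance: lower is closer

print(top_k_related(correlation, 2, higher_is_closer=True))
print(top_k_related(euclidean, 2, higher_is_closer=False))
```

Without the flag, sorting both measures in the same direction would return the least related candidates for one of them.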