New way of automatically selecting a threshold for the estimation of the number of time points #173

Closed
eurunuela opened this issue Mar 12, 2020 · 13 comments
Labels: Discussion (Discussion of a concept or implementation. Need to stay always open.), Enhancement (New feature or request)

Comments

@eurunuela
Collaborator

eurunuela commented Mar 12, 2020

Detailed Description

The code that is currently meant to select and suggest a threshold automatically for estimating the number of time points does not work properly. It computes the first derivative of the trigger channel and applies a threshold to the positive values only.

Context / Motivation

Suggesting and automatically selecting a threshold would be a very nice and useful feature for the project.

Possible Implementation

After a brief meeting yesterday, we thought of the following options to make this work:

  1. Apply a threshold equal to 2 times the STD of the trigger channel. This would be a very easy way of selecting the threshold, as 2 times the STD would get rid of the "noisy baseline" of the trigger channel.
  2. The values in the trigger channel could be modelled as two Gaussian distributions that ideally would not overlap. The threshold would be the point that separates the two distributions. This is a well-known problem in statistics and is probably already implemented in Python. In my opinion, this would be the most elegant way of selecting the threshold.
  3. Similar to the first option, we could base the threshold on the Median Absolute Deviation (MAD), i.e. the threshold would be the median value of the trigger channel plus the MAD. I expect this option to behave similarly to the STD one.
  4. If we'd like to work with the first derivative, we should find both the starting and the ending point of each trigger (not only the starting one, i.e. the positive one). If we have both points in the first-derivative space and apply a threshold that zeroes out the remaining values, we could go back to the original space by integrating. This integration amounts to multiplying the thresholded derivative by a lower-triangular matrix of ones. Once back in the original space, which would now be clear of noise, we could select a threshold somewhere between the baseline (0) and the trigger amplitude.
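
To make option 4 more concrete, here is a minimal sketch of the derivative-threshold-integrate idea. The function names, the `deriv_thr` parameter and the toy trace are hypothetical and not part of phys2bids. It uses only the standard library to stay self-contained; with NumPy the same steps would be `np.diff` and `np.cumsum`.

```python
from itertools import accumulate


def clean_trigger(trigger, deriv_thr):
    """Zero out small jumps in the first derivative, then integrate back."""
    # First derivative; the leading 0.0 keeps the output the same length
    # as the input, so the result is relative to the first sample.
    deriv = [0.0] + [b - a for a, b in zip(trigger, trigger[1:])]
    # Keep both rising (positive) and falling (negative) edges.
    kept = [d if abs(d) >= deriv_thr else 0.0 for d in deriv]
    # A cumulative sum is the same as multiplying by a lower-triangular
    # matrix of ones.
    return list(accumulate(kept))


def count_rising_edges(signal, thr):
    """Count the places where the signal crosses thr from below."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a <= thr < b)


# Hypothetical toy trace: noisy baseline around 0, two triggers around 5.
noisy = [0.0, 0.1, -0.1, 5.0, 5.1, 0.0, 0.1, 5.0, 4.9, 0.1]
clean = clean_trigger(noisy, deriv_thr=2.0)
# Threshold halfway between the baseline and the largest amplitude.
thr = (min(clean) + max(clean)) / 2
n_triggers = count_rising_edges(clean, thr)
```

Note that this still needs a threshold in derivative space (`deriv_thr`), and that rising and falling jumps of slightly different size leave a small residual offset after integration.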

Feel free to suggest possible solutions!

@eurunuela added the Discussion and Enhancement labels Mar 12, 2020
@smoia
Member

smoia commented Mar 12, 2020

The current master doesn't contain any code that automatically selects/suggests a threshold to detect the number of timepoints. That code is contained in PR #153.
However, we could add automatic detection of the number of timepoints, or at least suggest a threshold to be used by the user. That would be a very nice feature to add to our code. In this case, we need to change how the peaks are detected, as it's done now in physio_obj.py.
As we discussed with @eurunuela and @vinferrer yesterday, I also think that option number two (leveraging the fact that our trigger channel has a bimodal distribution) is the most elegant and, albeit more computationally intensive, also the most precise.
Alternatively, we need a statistic that is sensitive to outliers (the triggers should be outliers), so a solution based on the mean/standard deviation would be adequate.
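
As a rough illustration of option two, the sketch below separates the samples into a low and a high group and returns the midpoint between the two group means. This is a simple 1-D two-means clustering standing in for a proper two-Gaussian mixture fit (which could be done with something like scikit-learn's `GaussianMixture`). The function name and the toy trace are hypothetical.

```python
from statistics import mean


def two_cluster_threshold(values, max_iter=100):
    """Split values into a low and a high group; return the midpoint."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("constant signal: no second cluster to separate")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        # Assign every sample to the nearer of the two current centres.
        low = [v for v in values if v <= mid]
        high = [v for v in values if v > mid]
        new_lo, new_hi = mean(low), mean(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # the centres stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


# Hypothetical toy trace: baseline around 0, triggers around 5.
trigger = [0.0, 0.1, -0.1, 0.05, 5.0, 5.1, 4.9, 0.0, 0.1, 5.0]
thr = two_cluster_threshold(trigger)
```

The midpoint rule implicitly assumes the two groups have similar spread; a mixture model would also account for different variances and group sizes.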

@rmarkello
Member

This is a great idea! 🙌

I would be interested in seeing how effective an "easy" solution is before going forward with something that's a bit more computationally intense. For example, does using 2x SD or MAD outliers or e.g., the 95th percentile of the distribution work as a threshold for >~80% of the cases? If so, I don't think it makes sense to dedicate that much time to option two (despite agreeing with you both that it seems like the coolest option).
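
A comparison like this could start from something as small as the sketch below, which computes the three "easy" thresholds on a synthetic trace and counts the rising edges each one detects. All names and the toy trace are hypothetical, the 2 x SD and median + MAD rules follow the wording of options 1 and 3 above, and the percentile uses a plain nearest-rank definition.

```python
from statistics import median, pstdev


def sd_threshold(values, k=2):
    """k times the (population) standard deviation, as in option 1."""
    return k * pstdev(values)


def mad_threshold(values):
    """Median plus the median absolute deviation, as in option 3."""
    med = median(values)
    return med + median([abs(v - med) for v in values])


def percentile_threshold(values, pct=95):
    """Nearest-rank percentile (pct is an integer between 1 and 100)."""
    ordered = sorted(values)
    rank = -(-pct * len(ordered) // 100)  # ceiling division
    return ordered[rank - 1]


def count_rising_edges(signal, thr):
    """Count the places where the signal crosses thr from below."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a <= thr < b)


# Toy trace: +/-0.1 baseline with a one-sample trigger of amplitude 5
# every 20 samples (10 triggers in total).
trace = [5.0 if i % 20 == 10 else (0.1 if i % 2 == 0 else -0.1)
         for i in range(200)]

thresholds = {
    "2 x SD": sd_threshold(trace),
    "median + MAD": mad_threshold(trace),
    "95th percentile": percentile_threshold(trace),
}
counts = {name: count_rising_edges(trace, thr)
          for name, thr in thresholds.items()}
```

On a clean synthetic trace like this one the rules are easy to satisfy; whether they hold up for most real recordings is exactly what would need to be tested on actual data.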

@eurunuela
Collaborator Author

Indeed, we agreed on testing the easy solutions first. We will work on it once #153 is merged.

@vinferrer
Collaborator

PR #153 merged, we can start working on this issue.

@vinferrer
Collaborator

I am gonna implement the STD solution and see how it works

@vinferrer
Collaborator

vinferrer commented Mar 23, 2020

Guys, do you think we should keep the check_trigger_amount function? The way I am counting the triggers now, if it works, renders this function useless.

@rmarkello
Member

It's a bit tough to say without seeing how you've implemented the alternative, but if you're no longer using the function at all then I see no reason to keep it, personally!

@vinferrer
Collaborator

So basically the function I am designing does the same as check_trigger_amount but without needing a threshold. Maybe we can keep both of them for the moment and activate mine with an optional argument in phys2bids.

@vinferrer
Collaborator

May I open a PR and we discuss it over the code?

@rmarkello
Member

That'd be great!

@vinferrer
Collaborator

#183 merged, should we pause this?

@smoia
Member

smoia commented Mar 27, 2020

I think we can close this. If we get bug reports or want to implement a new strategy, we can reopen it.

@vinferrer
Collaborator

good

4 participants