
Help wanted: How to get position data from the video #1

Open
zhenze-zhang opened this issue Sep 6, 2024 · 4 comments
@zhenze-zhang

Hi! I'm an undergraduate cognitive science student using TCOW to track objects in my experiment videos. I'm interested in extracting data for everything inside the contour of my target objects, not just their center positions. Is there a way to access this data from the model's output? Thank you!

@basilevh
Owner

Hello,
Thanks for your question! TCOW does not give you center positions; instead, it gives you a triplet of dense segmentation masks (target, occluder, container) for every frame of the video. As input, you need two separate files: the video itself and the query mask for the first frame, for example:

my_data/
  my_video.mp4
  my_video_0_query.png

Then you can run:

python eval/test.py --resume tcow --name tcow_demo --gpu_id 0 --data_path my_data/my_video.mp4 --num_queries 1 --extra_visuals 1

In logs/tcow/test_tcow_demo/visuals/, you will find a number of different visualizations. Note that the predicted masks are not contours but filled areas. Does this answer your question? If not, would you mind clarifying what you want to achieve?
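For reference, a query mask like the one above can also be produced programmatically. Here is a minimal sketch, under the assumption that the query mask is a single-channel PNG at the video's resolution in which white (255) marks the target object and black (0) marks the background; the region coordinates and the 480x854 resolution are hypothetical placeholders, and the exact format TCOW expects should be checked against the repo's examples:

```python
import numpy as np
from PIL import Image

# Hypothetical video resolution; match your actual frames.
H, W = 480, 854

# Start with an all-background (black) mask.
mask = np.zeros((H, W), dtype=np.uint8)

# Paint a hypothetical target-object region as foreground (white).
mask[100:200, 300:420] = 255

Image.fromarray(mask).save("my_video_0_query.png")
```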

@zhenze-zhang
Author


Hi!
Thank you for replying, and sorry for not explaining it clearly! For background: we are using an eye tracker to record subjects' gaze positions while they watch a video. The objects in the video that we are tracking with TCOW are our regions of interest (ROIs). I saw the filled areas it produced, but we want to see whether we can get a set of coordinates defining the ROI boundaries. This would let us determine whether a given gaze position [x, y] falls within a given ROI.

@basilevh
Owner

basilevh commented Sep 11, 2024

Is the query mask given, i.e., do you already know which object you are tracking? If not, you can use something like Segment Anything to turn a point query (for example, from eye tracking) into an object mask, and then run TCOW from there.

If the query mask is given as input, however, and you simply want to relate it to a sequence of eye-tracking coordinates, then my recommendation would be to write an additional script that looks at the video outputs and measures the strength of the predictions at those coordinates [x, y]. You can easily load video frames as numpy arrays in Python, and the ROI can then be defined as a thresholded output: for each pixel, whether the confidence is >0.5 for the target, occluder, or container channel, which map to green, red, and blue in the video respectively. If you want the ROI to be a rectangle instead of an arbitrary shape, you could also do some light image processing to compute the rectangle that encompasses the activated pixels.

For completeness: my script exports multiple output video files. Make sure you load the one that directly stores the segmentation against a black background (I forget the exact file names right now), not the one overlaid on top of the original video.
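The thresholding idea above can be sketched as follows. This is a hypothetical helper, assuming each output frame has already been decoded into an H×W×3 uint8 numpy array (from the video that stores the segmentation against a black background), with the green channel holding the target prediction as described:

```python
import numpy as np

def gaze_in_roi(frame_rgb, x, y, channel=1, thresh=127):
    """True if the prediction at gaze point (x, y) exceeds thresh.

    channel: 0 = red (occluder), 1 = green (target), 2 = blue (container),
    per the color mapping described above. thresh=127 approximates a 0.5
    confidence cutoff on uint8 pixel values.
    """
    return bool(frame_rgb[y, x, channel] > thresh)

def roi_bounding_box(frame_rgb, channel=1, thresh=127):
    """Axis-aligned rectangle (x0, y0, x1, y1) around activated pixels."""
    ys, xs = np.where(frame_rgb[..., channel] > thresh)
    if len(xs) == 0:
        return None  # no pixels above threshold in this channel
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Synthetic frame standing in for a decoded segmentation video frame.
frame = np.zeros((480, 854, 3), dtype=np.uint8)
frame[100:200, 300:420, 1] = 255  # target mask painted in green

print(gaze_in_roi(frame, 350, 150))  # gaze inside the target ROI -> True
print(gaze_in_roi(frame, 10, 10))    # gaze in the background -> False
print(roi_bounding_box(frame))       # (300, 100, 419, 199)
```

Decoding the frames themselves can be done with any video reader that yields RGB arrays (e.g. imageio or OpenCV); from there the per-frame check against your eye-tracking trace is just an indexing operation.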

Hope this helps and please let me know if not! ;)

@zhenze-zhang
Author

Thank you for your advice! It turned out we found another way to get the position data in MATLAB. Thanks so much for your help!
