import subprocess
import time
import os
import json
import re
import uuid
import numpy as np
import argparse
'''
The forum post that gave me the first clues:
https://community.theta360.guide/t/theta-z1-gps-track-in-video-files/7101/2
exiftool -a -G1 -n -handlertype /Users/jordivallverdu/Documents/360code/apps/motion_theta/R0013304_0.MP4
[Track1] Handler Type : vide
[Track2] Handler Type : camm
[Track2] Handler Type : url
[QuickTime] Handler Type : mdta
https://github.com/trek-view/360-camera-metadata/blob/master/0-standards/camm.md
https://developers.google.com/streetview/publish/camm-spec?hl=es-419#data-format
https://exiftool.org/forum/index.php?topic=5095.45
The following command uses exiftool to extract the metadata embedded in a video file.
The -ee option extracts embedded data (such as the camm track), and -V3 sets verbosity level 3 (exiftool accepts levels 0 to 5),
which prints each tag as it is processed together with a hex dump of its data.
This level of detail is useful for troubleshooting and for analysing how the metadata is laid out within the file.
exiftool -ee -V3 /Users/jordivallverdu/Documents/360code/apps/motion_theta/R0013304_0.MP4
exiftool -ee -V3 /Users/jordivallverdu/Documents/360code/apps/motion_theta/R0013304_0.MP4 > /Users/jordivallverdu/Documents/360code/apps/motion_theta/R0013304_0.txt
A bit of explanation of what's going on:
I want to stitch the images from the new Ricoh Theta Z1 format, which generates 3648x3648 videos at 1 or 2 fps and produces two videos,
for example R0013350_0.MP4 and R0013350_1.MP4, one for each lens.
The IMU information also had to be stored somewhere, and it is: the files use the camm format.
The goal is to extract and interpret the camera motion data (specifically, acceleration and angular velocity) embedded within a video file. This data, referred to as "camm" data, is stored in a binary format that required a careful approach to decode and understand. The journey involved multiple steps, which I'll outline below.
1. **Initial Data Extraction**: We first extracted the binary data corresponding to the "camm" track from the video file. This required understanding the structure of the video file and locating the appropriate bytes. We then wrote these bytes to a separate file for further analysis.
2. **Pattern Analysis**: We analyzed the binary data and identified some patterns, notably a repeating sequence of bytes at the beginning of each chunk of data. We hypothesized that these bytes could be serving as markers or identifiers for the data chunks.
3. **Data Interpretation**: Based on the identified patterns, we initially hypothesized that the data might be split into chunks of 16 bytes each, where each chunk corresponds to a single frame of the video. However, the number of chunks did not match the number of frames in the video, so we refined our hypothesis.
4. **Refining the Interpretation**: We noticed that the number of data chunks was roughly proportional to the video duration in seconds, suggesting that the camm data was sampled at a fixed frequency independent of the video frame rate. We then modified our approach to interpret the data accordingly.
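5. **Data Parsing**: We wrote a Python script to parse the binary data into readable form. This involved reading the data in chunks of 16 bytes each and converting each chunk to three floating-point numbers using the struct module (the floats take 12 bytes; per the CAMM spec linked above, the other 4 bytes are a 2-byte reserved field and a 2-byte packet type).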
6. **Exploring Other Tools**: We found that the ExifTool command-line utility could also extract the camm data from the video file in a readable format. We decided to use the output from ExifTool as a basis for further analysis.
7. **Refining the Parsing Script**: Based on the ExifTool output, we refined our Python script to parse the data more accurately. We now also extracted the sample time and duration for each data sample.
8. **Matching Samples to Frames**: Given the frame rate of the video, we wrote a function to associate each video frame with the closest camm data sample in time.
9. **Calculating Orientation**: Finally, we used the acceleration data to calculate the pitch and roll of the camera for each sample. We used the numpy library to perform these calculations.
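A minimal sketch of the packet decoding described in step 5. It is not the original parsing script (which is not included here); it assumes the layout given in the CAMM spec linked above: a 2-byte reserved field, a 2-byte little-endian packet type (2 = angular velocity, 3 = acceleration), then three 32-bit floats. The names parse_camm_packet and CAMM_TYPE_NAMES are made up for this example, and the sample bytes are synthetic.

```python
import struct

# Assumed packet types, taken from the CAMM spec (not from this script).
CAMM_TYPE_NAMES = {2: "angularvelocity", 3: "acceleration"}


def parse_camm_packet(packet):
    # Decode one 16-byte packet into (name, [x, y, z]); return None otherwise.
    if len(packet) < 16:
        return None
    # "<HH3f": little-endian, two uint16 (reserved, type), three float32.
    _reserved, packet_type, x, y, z = struct.unpack("<HH3f", packet[:16])
    name = CAMM_TYPE_NAMES.get(packet_type)
    if name is None:
        return None
    return name, [x, y, z]


# Synthetic packet built for illustration only (not real camera data).
example = struct.pack("<HH3f", 0, 3, 0.0, 0.0, 9.5)
print(parse_camm_packet(example))  # ('acceleration', [0.0, 0.0, 9.5])
```

Other CAMM packet types have different payload sizes, so a full parser would branch on the type before unpacking; this sketch simply ignores them.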
Through this iterative process, we built a tool to extract, parse, and interpret the camm data embedded in a video file. The final result is a Python script that takes a video file as input and produces a JSON file containing the acceleration, angular velocity, sample time, duration, and calculated pitch and roll for each sample, as well as a unique ID for each sample. This data can then be used for further analysis or visualization.
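As an example of the further analysis mentioned above, and of the observation in step 4 that the samples arrive at a fixed rate, the sketch below estimates the sampling frequency from the parsed records. It assumes only the 'sampletime' key that write_camm_to_json stores (taken to be in seconds); estimate_sample_rate is a hypothetical helper, and the records are made-up values.

```python
def estimate_sample_rate(samples):
    # Average number of samples per second between the first and last timestamp.
    times = sorted(
        s["sampletime"] for s in samples if s.get("sampletime") is not None
    )
    if len(times) < 2 or times[-1] == times[0]:
        return None  # not enough information to estimate a rate
    return (len(times) - 1) / (times[-1] - times[0])


# Made-up records: five samples spread over two seconds.
records = [{"sampletime": t} for t in (0.0, 0.5, 1.0, 1.5, 2.0)]
print(estimate_sample_rate(records))  # 2.0
```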
QUICK USAGE :
python parse_camm.py -v '/Users/jordivallverdu/Documents/360code/apps/motion_theta/video_samples/R0013304_0.MP4' -c
'''


# Function to calculate pitch and roll (in degrees) from one acceleration sample
def calculate_pitch_roll(acceleration):
    x, y, z = acceleration
    pitch = np.arctan2(x, np.sqrt(y**2 + z**2))
    roll = np.arctan2(y, np.sqrt(x**2 + z**2))
    return np.degrees(pitch), np.degrees(roll)
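

def write_camm_to_csv(camm_data, csv_path):
    # Imported locally because get_video_data has a boolean parameter named csv
    import csv

    # NOTE: values are written exactly as parsed, without unit conversion.
    # The CAMM spec defines angular velocity in rad/s and acceleration in m/s^2,
    # so convert the values first if a consumer relies on the units in these headers.
    headers = ["Time (s)", "Gyroscope X (deg/s)", "Gyroscope Y (deg/s)", "Gyroscope Z (deg/s)", "Accelerometer X (g)", "Accelerometer Y (g)", "Accelerometer Z (g)"]
    with open(csv_path, "w", newline='') as csvfile:
        csvwriter = csv.writer(csvfile)
        csvwriter.writerow(headers)
        for sample in camm_data:
            # Named sample_time to avoid shadowing the time module
            sample_time = sample["sampletime"]
            GyroscopeX, GyroscopeY, GyroscopeZ = sample["angularvelocity"]
            AccelerometerX, AccelerometerY, AccelerometerZ = sample["acceleration"]
            csvwriter.writerow([sample_time, GyroscopeX, GyroscopeY, GyroscopeZ, AccelerometerX, AccelerometerY, AccelerometerZ])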


def write_camm_to_json(filePath, jsonPath):
    with open(filePath, 'r') as f:
        lines = f.readlines()

    data = []
    current_sample = {}
    sample_time = None
    sample_duration = None

    for line in lines:
        if 'SampleTime' in line:
            # Raw strings avoid invalid-escape warnings for \d and \.
            sample_time = float(re.findall(r"\d+\.?\d*", line)[0])
        elif 'SampleDuration' in line:
            sample_duration = float(re.findall(r"\d+\.?\d*", line)[0])
        elif 'AngularVelocity' in line or 'Acceleration' in line:
            identifier, values = line.strip().split(' = ')
            # Remove "|" character and make the identifier lowercase
            identifier = identifier.replace("| ", "").lower()
            values = list(map(float, values.split(' ')))
            current_sample[identifier] = values
            if 'acceleration' in identifier:
                # Acceleration is expected after AngularVelocity, so at this point
                # the sample is considered complete
                # Calculate pitch and roll
                pitch, roll = calculate_pitch_roll(values)
                current_sample['pitch'] = pitch
                current_sample['roll'] = roll
                current_sample['uuid'] = str(uuid.uuid4())
                current_sample['sampletime'] = sample_time
                current_sample['sampleduration'] = sample_duration
                data.append(current_sample)
                current_sample = {}

    # Serialize the samples as JSON text
    data_json = json.dumps(data, indent=4)
    # Open the file for writing and write the JSON data
    with open(jsonPath, "w") as json_file:
        json_file.write(data_json)
    return data


# Function to get, for each video frame, the camm sample closest in time
def get_frame_samples(data, frame_rate, duration):
    frame_samples = []
    total_frames = int(duration * frame_rate)
    for frame in range(total_frames):
        frame_time = frame / frame_rate
        closest_sample = min(data, key=lambda x: abs(x['sampletime'] - frame_time))
        frame_samples.append(closest_sample)
    return frame_samples


def get_video_data(video_path, csv):
    # Get the file name without extension
    file_name_without_extension = os.path.splitext(os.path.basename(video_path))[0]
    # Get the directory of the video file and create the .txt file path
    directory = os.path.dirname(video_path)
    txt_file_path = os.path.join(directory, f"{file_name_without_extension}.txt")

    # Run exiftool and redirect its stdout into the .txt file.
    # No shell is involved here, so '>' must not be passed as an argument.
    with open(txt_file_path, 'w') as f:
        subprocess.run(['exiftool', '-ee', '-V3', video_path], stdout=f)

    # Get the frame rate and duration. -n prints plain numbers, so Duration is
    # always in seconds rather than the h:mm:ss form used for longer clips.
    result = subprocess.run(['exiftool', '-n', '-VideoFrameRate', '-Duration', video_path], capture_output=True, text=True)
    lines = result.stdout.split('\n')
    frame_rate = None
    duration = None
    for line in lines:
        if 'Video Frame Rate' in line:
            frame_rate = float(line.split(':')[-1].strip())
        elif 'Duration' in line:
            duration = float(line.split(':')[-1].split()[0].strip())

    json_file_path = os.path.join(directory, f"{file_name_without_extension}.json")
    camm_data = write_camm_to_json(txt_file_path, json_file_path)
    if csv:
        csv_file_path = os.path.join(directory, f"{file_name_without_extension}.csv")
        write_camm_to_csv(camm_data, csv_file_path)
    return frame_rate, duration, camm_data


def animate_model(camm_data):
    import open3d as o3d

    # Prepare Open3D visualizer
    vis = o3d.visualization.Visualizer()
    vis.create_window()
    # Load the 3D model
    mesh = o3d.io.read_triangle_mesh("theta_v.stl")
    vis.add_geometry(mesh)
    # Create a line set to display the edges of the mesh
    edges = o3d.geometry.LineSet.create_from_triangle_mesh(mesh)
    edges.paint_uniform_color([0, 0, 0])  # Black edges
    vis.add_geometry(edges)

    # Define a function to create rotation matrix
    def create_rotation_matrix(pitch, roll):
        pitch = np.deg2rad(pitch)
        roll = np.deg2rad(roll)
        Rx = np.array([[1, 0, 0], [0, np.cos(pitch), -np.sin(pitch)], [0, np.sin(pitch), np.cos(pitch)]])
        Ry = np.array([[np.cos(roll), 0, np.sin(roll)], [0, 1, 0], [-np.sin(roll), 0, np.cos(roll)]])
        return Rx @ Ry

    # Apply rotations to the model.
    # NOTE: rotate() transforms the geometry in place, so these rotations
    # accumulate from one sample to the next.
    for sample in camm_data:
        pitch, roll = sample['pitch'], sample['roll']
        R = create_rotation_matrix(pitch, roll)
        mesh.rotate(R, center=(0, 0, 0))  # Rotate around the origin
        edges.rotate(R, center=(0, 0, 0))  # Rotate around the origin
        vis.update_geometry(mesh)
        vis.update_geometry(edges)
        vis.poll_events()
        vis.update_renderer()
        time.sleep(0.1)  # Delay between each frame
    vis.destroy_window()


def plot_pitch_roll(camm_data):
    import matplotlib.pyplot as plt

    # Extract pitch and roll values
    pitch_values = []
    roll_values = []
    for sample in camm_data:
        pitch_values.append(sample['pitch'])
        roll_values.append(sample['roll'])

    # Plot pitch and roll values
    plt.figure(figsize=(14, 6))
    plt.subplot(1, 2, 1)
    plt.plot(pitch_values)
    plt.title('Pitch values over time')
    plt.xlabel('Time (frames)')
    plt.ylabel('Pitch (degrees)')
    plt.subplot(1, 2, 2)
    plt.plot(roll_values)
    plt.title('Roll values over time')
    plt.xlabel('Time (frames)')
    plt.ylabel('Roll (degrees)')
    plt.tight_layout()
    plt.show()


def plot_raw_data(camm_data):
    import matplotlib.pyplot as plt

    # Convert the per-sample lists into numpy arrays for easier column slicing
    acceleration_values = np.array([sample['acceleration'] for sample in camm_data])
    angular_velocity_values = np.array([sample['angularvelocity'] for sample in camm_data])
    # Extract pitch and roll values
    pitch_values = [sample['pitch'] for sample in camm_data]
    roll_values = [sample['roll'] for sample in camm_data]

    # One (title, y-label, series) entry per panel, in subplot order on a 3x3 grid
    panels = []
    for i, axis in enumerate('XYZ'):
        panels.append((f'Acceleration {axis} values over time', 'Acceleration (m/s^2)', acceleration_values[:, i]))
    for i, axis in enumerate('XYZ'):
        panels.append((f'Angular Velocity {axis} values over time', 'Angular Velocity (rad/s)', angular_velocity_values[:, i]))
    panels.append(('Pitch values over time', 'Pitch (degrees)', pitch_values))
    panels.append(('Roll values over time', 'Roll (degrees)', roll_values))

    # Plot acceleration, angular velocity, pitch, and roll values.
    # (An older 2x2 layout that sat after an unconditional return was unreachable and has been dropped.)
    plt.figure(figsize=(18, 12))
    for index, (title, ylabel, series) in enumerate(panels, start=1):
        plt.subplot(3, 3, index)
        plt.plot(series)
        plt.title(title)
        plt.xlabel('Time (frames)')
        plt.ylabel(ylabel)
    plt.tight_layout()
    plt.show()


def main(videoPath, outputPath, stitch_frames, format, csv, debug):
    frame_rate, duration, camm_data = get_video_data(videoPath, csv)
    frame_samples = get_frame_samples(camm_data, frame_rate, duration)
    if debug:
        # animate_model(camm_data)
        # plot_pitch_roll(camm_data)
        plot_raw_data(camm_data)
        for i, sample in enumerate(frame_samples):
            print(f"Frame {i+1}:")
            print(json.dumps(sample, indent=4))
            print()


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Convert camm data of a Ricoh Theta Z1 video to a JSON file')
    parser.add_argument('-v', '--videoPath', default='/home/jordi/Documents/dataset/calibration_mobile/VID_20230801_130453.mp4', type=str, help='Path to the video so that we can extract the frames')
    parser.add_argument('-o', '--outputPath', default='/home/jordi/Documents/dataset/calibration_mobile/atlas1', type=str, help='Output Path folder for atlas structure')
    parser.add_argument('-s', '--stitch_frames', action='store_true', help='Set this flag to extract frames')
    parser.add_argument('-f', '--format', type=str, default="jpg", choices=["png", "jpg", "jpeg"], help="Specify the image format for the extracted frames. Default is 'jpg'.")
    parser.add_argument('-d', '--debug', action='store_true', help='Plot the parsed camm data for debugging')
    parser.add_argument('-c', '--csv', action='store_true', help='Create a csv file with the camm data')
    args = parser.parse_args()
    main(args.videoPath, args.outputPath, args.stitch_frames, args.format, args.csv, args.debug)