Trail Maker Developer Docs

Video Input

High level flow

1. Extract metadata from video

Done using exiftool:
$ exiftool -api LargeFileSupport=1 -X IN_VIDEO.mp4 > VIDEO_META.xml
Note: for dual fisheye input, the front image (GBFR) is used for extraction.

2. Extract telemetry as GPX from video

Done using gopro-telemetry with the following options (set by the user in config):
stream: GPS5,
GPS5Fix: USER_DEFINED,
GPS5Precision: USER_DEFINED, // default is 100 -- user must set lower than this
WrongSpeed: USER_DEFINED,
preset: gpx
The output will be filtered based on the user options.
A complete telemetry JSON file (not filtered by user settings and containing all streams) is also written to the meta directory, in addition to the GPX file.
Note: for dual GoPro Fisheye input, this should be run on the front video (which holds the GPMF track).
For reference, the GPX file is in the following format:
<trkpt lat="28.7015115" lon="-13.9204121">
  <ele>243.304</ele>
  <time>2020-08-02T11:59:05.905Z</time>
  <fix>3d</fix>
  <hdop>107</hdop>
  <cmt>altitude system: MSLV; 2dSpeed: 1.067; 3dSpeed: 0.73</cmt>
</trkpt>

3. Extract frames at set rate for video type

The user can set the ffmpeg -r value. Depending on the video mode (set by the user), we extract frames as follows:

3A GoPro EAC .360

Quality 2-6
$ ffmpeg -i INPUT.360 -map 0:0 -r XXX -q:v QQQ trackN/img%d.jpg -map 0:5 -r XXX -q:v QQQ trackN/img%d.jpg
Where XXX is the framerate the user passes in the CLI and QQQ is the quality.
Note: the track numbers differ depending on the mode used:
  • Regular video = -map 0:0 and -map 0:5
  • Timewarp video = -map 0:0 and -map 0:4

3B Dual GoPro Fisheye

Quality 2-6
$ ffmpeg -i INPUT_FR.mp4 -r XXX -q:v QQQ FR/img%d.jpg
$ ffmpeg -i INPUT_BK.mp4 -r XXX -q:v QQQ BK/img%d.jpg
Where XXX is the framerate the user passes in the CLI and QQQ is the quality.

3C Equirectangular / HERO mp4

Quality 2-6
$ ffmpeg -i INPUT.mp4 -r XXX -q:v QQQ img%d.jpg

4. Process to equirectangular for EAC / Fisheyes only

Skip step if Equirectangular / HERO mp4 input

4A. MAX2Sphere flow (2 GoPro EAC Frames)

$ @SYSTEM_PATH/MAX2spherebatch -w XXXX -n 1 -m YYYY track%d/frame%4d.jpg
Note: the -w flag value (XXXX) depends on the ImageWidth reported in the XML metadata:
  • 4096, then -w = 5376
  • 2272, then -w = 3072
For the -m flag, YYYY equals the number of frames extracted.

4B. Fusion2Sphere flow (2 GoPro Fisheye Frames)

@SYSTEM_PATH/fusion2sphere -b 5 -w NNNN -f FR/img%d.jpg BK/img%d.jpg -o FINAL/img%d.jpg parameter-examples/PPPP.txt
Note: the PPPP value is determined by the ImageWidth of both frames, which should both be one of the following values depending on the video mode used (3K / 5.2K):
  • 1568, then
    • PPPP = video-3k-mode.txt
    • NNNN = 3072
  • 2704, then
    • PPPP = video-5_2k-mode.txt
    • NNNN = 5228

5. Insert equirectangular metadata to frames (equirectangular only)

Skip step if HERO mp4 input
Value type | Image metadata field injected | Example injected
Fixed | XMP-GPano:StitchingSoftware | Spherical Metadata Tool
Fixed | XMP-GPano:SourcePhotosCount | 2
Fixed | XMP-GPano:UsePanoramaViewer | TRUE
Fixed | XMP-GPano:ProjectionType | equirectangular
Same as ImageHeight value | XMP-GPano:CroppedAreaImageHeightPixels | 2688
Same as ImageWidth value | XMP-GPano:CroppedAreaImageWidthPixels | 5376
Same as ImageHeight value | XMP-GPano:FullPanoHeightPixels | 2688
Same as ImageWidth value | XMP-GPano:FullPanoWidthPixels | 5376
Fixed | XMP-GPano:CroppedAreaLeftPixels | 0
Fixed | XMP-GPano:CroppedAreaTopPixels | 0
Note: some spatial fields are always fixed (e.g. XMP-GPano:SourcePhotosCount, because GoPro 360 cameras only have 2 lenses), so their values are static.
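The tag set above can be derived from the frame dimensions alone. A minimal sketch (the helper function is hypothetical; Trail Maker's actual implementation may differ):

```python
def build_gpano_tags(image_width: int, image_height: int) -> dict:
    """Build the XMP-GPano tags injected into each equirectangular frame.

    The fixed fields never change; the pixel fields mirror the frame's
    ImageWidth / ImageHeight (the pano is never cropped, so cropped-area
    and full-pano dimensions are equal).
    """
    return {
        "XMP-GPano:StitchingSoftware": "Spherical Metadata Tool",
        "XMP-GPano:SourcePhotosCount": 2,  # GoPro 360 cameras have 2 lenses
        "XMP-GPano:UsePanoramaViewer": True,
        "XMP-GPano:ProjectionType": "equirectangular",
        "XMP-GPano:CroppedAreaImageHeightPixels": image_height,
        "XMP-GPano:CroppedAreaImageWidthPixels": image_width,
        "XMP-GPano:FullPanoHeightPixels": image_height,
        "XMP-GPano:FullPanoWidthPixels": image_width,
        "XMP-GPano:CroppedAreaLeftPixels": 0,
        "XMP-GPano:CroppedAreaTopPixels": 0,
    }
```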

6. Add photo times and add GPS data

6A Add photo times

First frame (all modes)
To assign first photo time, we use the first GPSDateTime value reported in telemetry and assign it to photo time fields as follows:
Video metadata field extracted | Example extracted | Image metadata field injected | Example injected
1.streams.gps5.date | 2020-08-02T11:59:05.905Z | DateTimeOriginal | 2020-08-02T11:59:05Z
1.streams.gps5.date | 2020-08-02T11:59:05.905Z | SubSecTimeOriginal | 905
1.streams.gps5.date | 2020-08-02T11:59:05.905Z | SubSecDateTimeOriginal | 2020-08-02T11:59:05.905Z
Example exiftool command to write these values:
$ exiftool "-DateTimeOriginal=2020:04:13 15:37:22Z" "-SubSecTimeOriginal=444" "-SubSecDateTimeOriginal=2020:04:13 15:37:22.444Z" FILENAME.jpg
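The mapping from one telemetry GPSDateTime to the three photo time fields can be sketched as follows (the helper function is illustrative, not the app's actual code; field formats follow the table above):

```python
from datetime import datetime

def photo_time_fields(gps_date: str) -> dict:
    """Split a telemetry GPSDateTime (e.g. 2020-08-02T11:59:05.905Z) into
    the three photo time fields injected in step 6A."""
    dt = datetime.strptime(gps_date, "%Y-%m-%dT%H:%M:%S.%fZ")
    return {
        "DateTimeOriginal": dt.strftime("%Y-%m-%dT%H:%M:%SZ"),  # seconds precision
        "SubSecTimeOriginal": f"{dt.microsecond // 1000:03d}",  # milliseconds only
        "SubSecDateTimeOriginal": gps_date,                     # full precision
    }
```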
Other frames (normal mode)
Now we need to assign time to other photos. To do this we simply order the photos in ascending numerical order (as we number them sequentially when extracting frames).
We always extract videos at a fixed frame rate based on transport type. Therefore, we really only need to know the video start time, to determine time of first photo. From there we can incrementally add time based on extraction rate (e.g. photo 2 is 0.2 seconds later than photo one where framerate is set at extraction as 5 FPS).
Extraction frame rate (FPS) | Photo spacing (sec)
0.1 | 10
0.5 | 2
1 | 1
2 | 0.5
5 | 0.2
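The incremental time assignment for normal mode can be sketched as follows (illustrative helper, assuming spacing = 1 / FPS as in the table above):

```python
from datetime import datetime, timedelta

def assign_photo_times(first_photo_time: datetime, frame_count: int, fps: float) -> list:
    """Assign a capture time to every extracted frame (normal video mode).

    Photo N is N * (1 / fps) seconds after the first photo, e.g. at
    5 FPS each photo is 0.2 seconds after the previous one.
    """
    spacing = timedelta(seconds=1 / fps)
    return [first_photo_time + i * spacing for i in range(frame_count)]
```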
Other frames (time lapse video mode) -- Fusion, MAX, and HERO
Time lapse video mode is used by GoPro Fusion Cameras.
As an example, if shooting at the 0.5 second setting, the video will bundle 30 frames into one second of footage (the equivalent of 15 seconds in real time).
The Time Lapse video calculation to determine frame spacing time is (30 * Time Lapse video mode value set) / FPS value, which gives:
Note: we always use the value 30 in the calculation because GoPro packs time lapse video into an output video at 30 frames per second (so this value is constant).
To give an example, let's say the first photo is assigned the first GPS time = 00:00:01.000 and we extract photos at 5 FPS for time lapse video mode 2 sec. In this case the second photo has time 00:00:01.000 + 12 secs.
Other frames (Time Warp mode) -- MAX and HERO
Note: if the user selects Time Lapse video mode, they cannot set Time Warp mode -- it is one or the other.
Time Warp is a GoPro mode that speeds up the video (e.g. when set at 5x, every second of video represents 5 seconds of real time).
We therefore explicitly ask the user if the video was shot in Time Warp mode and the settings used (there is no easy way to determine this automatically).
The Time Warp calculation to determine frame spacing time is Time Warp mode value / FPS value set, which gives:
To give an example, let's say the first photo is assigned the first GPS time = 00:00:01.000 and we extract photos at 5 FPS for Time Warp mode 30x. In this case the second photo has time 00:00:01.000 + 6 secs.
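Both spacing calculations can be sketched as follows (function names are illustrative; the constants follow the formulas and worked examples above):

```python
def timelapse_spacing_seconds(mode_seconds: float, fps: float) -> float:
    """Real-time seconds between extracted photos in Time Lapse video mode.

    GoPro packs time lapse video at 30 fps and each packed frame
    represents mode_seconds of real time, so each extracted photo covers
    (30 * mode_seconds) / fps seconds.
    """
    return (30 * mode_seconds) / fps

def timewarp_spacing_seconds(speed_factor: float, fps: float) -> float:
    """Real-time seconds between extracted photos in Time Warp mode
    (speed_factor is e.g. 30 for 30x)."""
    return speed_factor / fps
```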

6B Add GPS points

Now we can use the photo time and GPS positions / times to geotag the photos:
All frames (all modes)
exiftool -Geotag file.gpx "-Geotime<SubSecDateTimeOriginal" dir
This will write the following fields into the photos:
Image metadata field injected | Example injected
GPS:GPSDateStamp | 2020:04:13
GPS:GPSTimeStamp | 15:37:22.444
GPS:GPSLatitude | 51 deg 14' 54.51"
GPS:GPSLatitudeRef | North
GPS:GPSLongitude | 16 deg 33' 55.60"
GPS:GPSLongitudeRef | West
GPS:GPSAltitudeRef | Above Sea Level
GPS:GPSAltitude | 157.641 m

7. Set spacing

Note: the user can set spacing by distance and by time (using FPS).
However, be aware that frames are first extracted and spaced by time (FPS) before this step, so this process only considers the frames extracted from the video.

Frames every [x] meters

The user can decide how far apart photos should be spaced using distance.
Distance uses the gps_distance_meters_next value to the next photo.
It is calculated after time spacing (FPS) has been applied.
This option is useful when the user has lots of images close together on the map because they are moving slowly. For example, traveling at 1 meter/second and taking a photo every second gives a photo every meter; the user might want photos every 5 meters instead.
The user can enter a value between 0.5 and 20 meters. For example, a value of 5 means 1 photo every 5 meters; a value of 1 means 1 photo every meter.
It is not always possible to achieve exact spacing (e.g. if the user has a timelapse of images 10 meters apart and selects 5 meter spacing). In such cases, no photos will be removed as the spacing condition is already met.
The logic of the calculation is as follows:
  1. App analyses the first distance value between photo 1>2.
  2. If
    • distance < value entered: discard destination photo (2) and calculate a new connection 1>3
    • distance >= value entered: keep destination photo (2) and calculate the next connection 2>3
Example:
Photos are 1 meter apart. User enters a value of 5 meters.
  1. Meters from 1 to 2 calculated. Is 1 meter, so photo 2 discarded.
  2. Meters from 1 to 3 calculated. Is 2 meters, so photo 3 discarded.
  3. Meters from 1 to 4 calculated. Is 3 meters, so photo 4 discarded.
  4. Meters from 1 to 5 calculated. Is 4 meters, so photo 5 discarded.
  5. Meters from 1 to 6 calculated. Is 5 meters, so photo 6 kept.
  6. Meters from 6 to 7 calculated. Is 1 meter, so photo 7 discarded ...
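The discard walk can be sketched as follows (illustrative; it assumes a haversine great-circle distance, which may differ from the app's actual gps_distance_meters_next calculation):

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS points, in meters."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

def space_by_distance(points, min_meters):
    """Return indices of kept photos: keep the first photo, then keep a
    later photo only once its distance from the last kept photo reaches
    min_meters (the 1>2, 1>3 ... walk described above)."""
    if not points:
        return []
    kept = [0]
    for i in range(1, len(points)):
        last = points[kept[-1]]
        if haversine_m(last[0], last[1], points[i][0], points[i][1]) >= min_meters:
            kept.append(i)
    return kept
```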

8. Create photo GPX

Similar output to the video GPX, but only using the GPS points assigned to photos (not all video points, as there are more video GPS points than photos).
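A minimal sketch of rendering the kept photo points as a GPX track (structure mirrors the video GPX sample in step 2; the writer function itself is hypothetical):

```python
def photo_gpx(points):
    """Render a list of (lat, lon, ele, iso_time) tuples as a GPX document."""
    trkpts = "\n".join(
        f'    <trkpt lat="{lat}" lon="{lon}">\n'
        f"      <ele>{ele}</ele>\n"
        f"      <time>{time}</time>\n"
        f"    </trkpt>"
        for lat, lon, ele, time in points
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<gpx version="1.1" creator="Trail Maker">\n'
        "  <trk><trkseg>\n"
        f"{trkpts}\n"
        "  </trkseg></trk>\n"
        "</gpx>"
    )
```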

9. Create sequence json

The sequence JSON is a small document holding aggregated stats for the sequence, as follows:
{
  "Sequence": {
    "TM_UUID": "bac0c65d-3ad1-470e-abd1-e94d3a91b688",
    "Name": "My sequence",
    "Description": "Some desc",
    "Tags": ["tag1","tag2"],
    "TransportType": "Land,Hike",
    "ProjectionType": "equirectangular",
    "CameraMake": "GoPro",
    "CameraModel": "GoPro Max",
    "RawVersionExists": "true",
    "LogoVersionExists": "true",
    "Photo": {
      "0": {
        "Filename": "GSAJ6106.JPG",
        "GPSDateTime": "2021-09-04T07:24:07.744Z",
        "GPSLatitude": "51.2725456",
        "GPSLongitude": "-0.8459696",
        "GPSAltitude": "82.008",
        "CalculatedGPSNextPhotoTimeSeconds": "10.2",
        "CalculatedGPSNextPhotoDistanceMeters": "10.2",
        "CalculatedGPSNextPhotoElevationChangeMeters": "10.2",
        "CalculatedGPSNextPhotoPitchDegrees": "10.2",
        "CalculatedGPSNextPhotoAzimuthHeadingDegrees": "10.2",
        "CalculatedGPSNextPhotoSpeedKilometersHour": "10.2"
      },
      ...
    }
  }
}

9A Sequence level data

Field in telemetry | Description of field values | Sample value
TM_UUID | Randomly generated v4 UUID | f4226a87-56fa-4cd0-b3ee-e6eb8f78c0b4
Name | Entered by user in UI | My sequence
Description | Entered by user in UI | A description
Tags | Entered by user in UI | tag1,tag2
TransportType | Entered by user in UI | Water,Boat
ProjectionType | Either equirectangular or hero. Taken from first photo metadata ProjectionType field. | hero
CameraMake | Taken from first photo metadata CameraMake field. | GoPro
CameraModel | Taken from first photo metadata CameraModel field. | GoPro MAX
RawVersionExists | Whether the user has a raw copy of the images (might be false if only a nadir copy exists) | true
LogoVersionExists | Whether the user has created a nadir copy of the images | false

9B Photo level data

Field in telemetry | Description of field values | Sample value
Filename | Photo filename | GSAJ6106.JPG
GPSDateTime | Reported in photo EXIF | 2021-09-04T07:24:07.744Z
GPSLatitude | Reported in photo EXIF | 51.2725456
GPSLongitude | Reported in photo EXIF | -0.8459696
GPSAltitude | Reported in photo EXIF | 82.008
CalculatedGPSNextPhotoTimeSeconds | Calculated using GPS time between this and the next photo. For the last position, is always 0. | 43.118
CalculatedGPSNextPhotoDistanceMeters | Calculated using GPS lat,lon position between this and the next photo. For the last position, is always 0. | 55.005
CalculatedGPSNextPhotoElevationChangeMeters | Calculated using GPS elevation between this and the next photo. For the last position, is always 0. | 1.293
CalculatedGPSNextPhotoPitchDegrees | Calculated using GPS altitude between this and the next photo. For the last position, is always 0. | 78.473
CalculatedGPSNextPhotoAzimuthHeadingDegrees | Calculated using GPS lat,lon position between this and the next photo. For the last position, is always 0. | 93.948
CalculatedGPSNextPhotoSpeedKilometersHour | Calculated using (CalculatedGPSNextPhotoDistanceMeters / CalculatedGPSNextPhotoTimeSeconds) * 3.6 | 3.4837
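The speed field follows directly from the documented formula (the function name is illustrative; the last-photo case returns 0 as described above):

```python
def next_photo_speed_kmh(distance_meters: float, time_seconds: float) -> float:
    """CalculatedGPSNextPhotoSpeedKilometersHour: m/s converted to km/h.

    Returns 0 for the last photo, which has no next point (time 0).
    """
    if time_seconds == 0:
        return 0.0
    return (distance_meters / time_seconds) * 3.6
```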

10. Insert final metadata to frames

Video metadata field extracted | Image metadata field injected | Description | Example injected
Trackn:DeviceName | IFD0:Model | Is camera model from video | GoPro Max
(none) | EXIF:UserComment | Always the same | Processed by Trek View Trail Maker
(none) | GPS:GPSSpeed | Is CalculatedGPSNextPhotoSpeedKilometersHour | 3.4837
(none) | GPS:GPSSpeedRef | Always K | K
(none) | GPS:GPSImgDirection | Is CalculatedGPSNextPhotoAzimuthHeadingDegrees | 93.948
(none) | GPS:GPSImgDirectionRef | Always M | M

11. Add logo (nadir / watermark)

Skip if option not selected by user.

11A. Add nadir (equirectangular)

User can set nadir overlay size and logo.
Example of nadir image height (equirectangular)
Note: the nadir convert-to-equirectangular/resize step only needs to run once, as all frames have the same dimensions, so the same output can be overlaid on each frame.

11B. Add watermark (HERO)

User can set watermark overlay size and logo.
Example of watermark image height (non-equirectangular)

12: Mapillary preprocessing

We must further process the images before upload by adding additional metadata for Mapillary.
Mapillary looks for a JSON object inside each of the ImageDescription metadata fields to process imagery.
The JSON object inside each image has the following values:
  • MAPLongitude (float): Longitude of the image
  • MAPLatitude (float): Latitude of the image
  • MAPAltitude (float): Altitude of the image
  • MAPCaptureTime (string): Capture time in UTC, specified in the format of %Y_%m_%d_%H_%M_%S_%f. For example, the standard time 2019-12-29T10:25:39.898Z should be formatted as 2019_12_29_10_25_39_898.
This is written with exiftool into each image like so:
1
exiftool XMP-tiffImageDescription:"{JSON_OBJECT_BODY_FOR_IMAGE}" FILENAME.jpg
Copied!
An example of a final object in the ImageDescription field will look like this:
{"MAPAltitude":319.248,"MAPLatitude":28.7151519999722,"MAPLongitude":-13.8921749,"MAPCaptureTime":"2020_08_14_11_43_46_000"}
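Building the ImageDescription payload can be sketched as follows (hypothetical helper; it assumes MAPCaptureTime keeps millisecond precision, matching the 2019_12_29_10_25_39_898 example above):

```python
import json
from datetime import datetime

def mapillary_description(lat: float, lon: float, alt: float, iso_time: str) -> str:
    """Build the JSON object written into ImageDescription for Mapillary.

    MAPCaptureTime uses %Y_%m_%d_%H_%M_%S_%f with millisecond precision,
    e.g. 2019-12-29T10:25:39.898Z -> 2019_12_29_10_25_39_898.
    """
    dt = datetime.strptime(iso_time, "%Y-%m-%dT%H:%M:%S.%fZ")
    capture = dt.strftime("%Y_%m_%d_%H_%M_%S_") + f"{dt.microsecond // 1000:03d}"
    return json.dumps({
        "MAPLatitude": lat,
        "MAPLongitude": lon,
        "MAPAltitude": alt,
        "MAPCaptureTime": capture,
    })
```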

13: Google Street View preprocessing

We must further process the images for Google Street View before upload by creating a set of video files for the sequence.
To do this, we create videos and accompanying JSON telemetry files.
For this we do the following:

13A Batch photos into 300

Each video should hold no more than 300 frames (at 5 FPS, a 1 minute runtime). For sequences with more than 300 frames we create multiple sets of videos and GPX files, each with up to 300 frames/GPX points.
This is important for later use, as Google Street View works better with shorter segments.
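The batching into 300-frame sets can be sketched as follows (illustrative helper; filenames follow the PHOTO_%06d.jpg pattern used below):

```python
def batch_photos(filenames, batch_size=300):
    """Split an ordered list of frame filenames into batches of at most
    batch_size frames, one batch per output video / GPX pair."""
    return [filenames[i:i + batch_size] for i in range(0, len(filenames), batch_size)]
```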

13B Create a video

Now create a video (or multiple videos if > 300 frames) with a frame rate of 5 FPS (-r 5).
ffmpeg -r 5 -i PHOTO_%06d.jpg -c:v libx264 -pix_fmt yuv420p OUTPUT.mp4
Then copy camera metadata from the first frame of the video (you will need to use first frame from each video if more than one created):
exiftool -TagsFromFile FIRSTFRAME.jpg "-all:all>all:all" OUTPUT.mp4
If equirectangular frames were used to create the video:
Then add the required spatial metadata (equirectangular sequences only) using the Spatial Media Metadata Injector. The following values should be added to the video:
New video metadata | Example
XMP-GSpherical:Spherical | true
XMP-GSpherical:Stitched | true
XMP-GSpherical:StitchingSoftware | Spherical Metadata Tool
XMP-GSpherical:ProjectionType | equirectangular

13C Create a Street View GPS timeline JSON file

Each video should also be accompanied by a custom JSON telemetry file used for Google Street View.

14: Upload

At this point the sequence upload starts to Explorer, including all files stored in the sequence directory.
The upload happens as follows:
  1. TM checks the username is correct
    1. It does this by sending a request to the /users endpoint with the API key and checking the username returned matches what the user entered
  2. TM sends a valid sequence.json to the Explorer API (authenticating with a valid API token)
  3. Explorer validates sequence.json and creates a local entry in the DB
  4. Explorer makes a request to S3 to create a bucket directory (sequences/EXPUSERID/EXPSEQUENCEID)
  5. Explorer sends a 3 hour presigned bucket URL back to Trail Maker
  6. Trail Maker starts the upload of files (meta, raw, nadir, gsv)
  7. Once the upload is complete (number of completed uploads = number of submitted uploads), Trail Maker closes the upload session (using the kill API endpoint) on Explorer
    1. The upload session is automatically closed after 3 hours if not closed, and all files are deleted
    2. The upload session is automatically closed after 3 failures

Technical challenges for large filesizes

It is likely our users will upload images of 20mb each (and potentially 2000 at a time). This poses a number of challenges.
We use the AWS SDK functions to ensure large file uploads survive network failures (resumable uploads: https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/welcome.html).
If the upload enters a failed state (is not completed within 3 hours), any uploaded content is deleted from Explorer and the S3 bucket.
If any upload errors, we attempt to retry 3 times, 1 minute after each failure.
If it fails 3 times, the upload is considered failed and the user is shown the error: "There was an error during upload. Please retry to create this Sequence. If this problem persists, please reduce the number of images in your Sequence."
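The retry behaviour can be sketched as follows (illustrative only; the actual uploader relies on the AWS SDK's resumable uploads, and the function names here are hypothetical):

```python
import time

def upload_with_retry(upload_fn, retries=3, delay_seconds=60):
    """Attempt an upload, retrying up to `retries` times with a pause
    after each failure; re-raises after the final failure."""
    for attempt in range(retries):
        try:
            return upload_fn()
        except Exception:
            if attempt == retries - 1:
                raise  # third failure: upload considered failed
            time.sleep(delay_seconds)
```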
To save bandwidth costs, files are uploaded directly to S3 (and do not touch webserver).

15: Done

The script will then output the final content as follows:
  • a directory SEQUENCENAME_DATETIME/ in the specified output directory
  • with a set of geotagged photos in a directory (raw/) and/or if nadir selected (nadir/)
  • with a directory called gsv/ with Google Street View videos and GPX files
  • one photo and one video GPX in a /meta directory (SEQUENCENAME_video.gpx and SEQUENCENAME_photo.gpx)
  • a file called SEQUENCENAME_sequence.json in /meta with aggregated sequence information -- used for upload

Image Input

High level flow

1. Extract metadata from photos

Done using exiftool:
$ exiftool -ee -G3 -X IN_PHOTO.jpg > PHOTO_META.xml

2. Fusion2Sphere flow (2 GoPro Fisheye Frames)

Skip if not Fusion dual fisheye input

2A Merge to equirectangular

@SYSTEM_PATH/fusion2sphere -b 5 -w NNNN -f FR/img%d.jpg BK/img%d.jpg -o FINAL/img%d.jpg parameter-examples/PPPP.txt
Note: the PPPP value is determined by the ImageWidth of both frames, which should always be the following for photo mode:
  • 3104, then
    • PPPP = photo-mode.txt
    • NNNN = 5760
Note: 3104 is the input width, not the output width (the output width is defined in photo-mode.txt). For 3104 input, the equirectangular output is 5760 wide.

2B Insert equirectangular metadata to frames (equirectangular only)

Skip step if HERO mp4 input
Value type | Image metadata field injected | Example injected
Fixed | XMP-GPano:StitchingSoftware | Spherical Metadata Tool
Fixed | XMP-GPano:SourcePhotosCount | 2
Fixed | XMP-GPano:UsePanoramaViewer | TRUE
Fixed | XMP-GPano:ProjectionType | equirectangular
Same as ImageHeight value | XMP-GPano:CroppedAreaImageHeightPixels | 2688
Same as ImageWidth value | XMP-GPano:CroppedAreaImageWidthPixels | 5376
Same as ImageHeight value | XMP-GPano:FullPanoHeightPixels | 2688
Same as ImageWidth value | XMP-GPano:FullPanoWidthPixels | 5376
Fixed | XMP-GPano:CroppedAreaLeftPixels | 0
Fixed | XMP-GPano:CroppedAreaTopPixels | 0

2C Add GPS times and other metadata

See video input for details.

3. Set image spacing

3A Frames every [x] meters

Same as for video

4. Create photo GPX

Same as for video

5. Create sequence JSON

Same as for video.

6. Add final metadata to images

Same as for video.

7. Add nadir/watermark (optional)

Skip if option not selected by user.
Same as for video.

8. Mapillary / Google pre-processing

Skip if option not selected by user.
Same as for video.

9. Upload

Same as for video.

10: Done

The script will then output the final content as follows:
  • a directory SEQUENCENAME_DATETIME/ in the specified output directory
  • with a set of geotagged photos in a directory (raw/) and/or if nadir selected (nadir/)
  • with a directory called gsv/ with Google Street View videos and GPX files
  • one photo gpx in a /meta directory (SEQUENCENAME_photo.gpx)
  • a file called SEQUENCENAME_sequence.json in /meta with aggregated sequence information