As an engineer analyzing robot logs, you’re likely asking questions such as:
Have we encountered this type of issue before?
How often does this event occur?
Can we identify similar events in past logs?
Can we set up alerts to catch this issue if it happens again?
Answering these questions is essential for diagnosing system behaviors, anticipating potential failures, and continuously improving your robot's performance.
However, answering them presents two key challenges:
(1) Data volume and format: Robots generate huge amounts of data—sometimes tens or hundreds of gigabytes per mission—stored in specialized formats like ROS bags or MCAP files. Parsing these logs to extract relevant segments is typically manual and labor-intensive. This gets even worse when dealing with log data from fleets of robots.
(2) Data diversity and pattern complexity: Robot logs are multimodal, containing a wide variety of sensor data. While simple issues can be flagged by looking for discrete values or conditions (e.g., vehicle_gps_lost = True), more complex events—such as repeated instability in drone flights or subtle gripper failures—demand more advanced methods.
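For the simple case, flagging a discrete condition is a one-liner once the log is in a dataframe. A minimal sketch (the field names and values here are hypothetical, chosen to mirror the vehicle_gps_lost example above):

```python
import pandas as pd

# Hypothetical decoded log: one row per message, with a discrete status flag.
log = pd.DataFrame({
    "t": [0.0, 0.1, 0.2, 0.3],
    "vehicle_gps_lost": [False, False, True, False],
})

# Simple issues can be caught with direct boolean indexing...
gps_loss_rows = log[log["vehicle_gps_lost"]]
```

Complex events like instability or gripper failures have no such flag, which is where pattern-matching methods come in.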
At Roboto, our goal is to address challenge (1) by making it easier to retrieve task-specific data at scale, allowing you to manage and analyze large datasets more effectively. In parallel, we’re expanding our tooling to help address challenge (2) as well—building features that identify and classify complex patterns in multimodal data.
In this post, we’ll walk through an example where we use our Python SDK to retrieve IMU data from drone racing logs and then leverage signal similarity search to identify and classify patterns.
As a systems engineer, you're tasked with investigating hard landings in your drone's latest flight campaign. To do this, you need to generate landing videos from the drone’s forward-facing camera and extract the corresponding IMU signals. These will then be reviewed by the controls team to assess performance.
While this sounds straightforward, the challenge is that your ROS bags contain raw IMU and image data—there are no annotations or flags indicating hard landing events. As a result, you'd typically need to:
1. Download dozens of raw flight logs, each several gigabytes in size.
2. Manually review flights to identify hard landings and record the timestamps.
3. Write custom scripts to extract images and IMU signals, then compile them into videos.
4. Upload these videos to a storage system for shared access.
This process is painstaking and could take several days for a large number of logs.
Instead of manually reviewing every flight, we're going to use Roboto to find and retrieve hard landings automatically:
1. Single Annotation: Use Roboto’s visualizer to annotate an exemplary hard landing event.
2. Data Retrieval: Use Roboto’s SDK to efficiently retrieve IMU signals from flights to search.
3. Similarity Search: Use similarity search to identify other hard landings in the IMU signals.
4. Artifact Generation: Generate clips of the relevant landing sequences.
This approach will streamline the entire process and save considerable time.
For this example, we ingested 25 drone flights from UZH-FPV Drone Racing. The flights cover various drone racing scenes, both indoor and outdoor, using forward and downward-facing cameras.
To reproduce the steps below: create your own account, then install the Python SDK and configure a personal access token. You can see the flight logs in the public collection on Roboto and run the notebook by following the README in the repository.
Our first task is to use Roboto’s visualizer to annotate an exemplary hard landing event. This will serve as a reference to find other hard landings afterwards.
1. Open a dataset in the visualizer. Create panels for the image topics, and plot the [x, y, z] values from /snappy_imu/linear_acceleration. Refer to our guide: Visualizing ROS Data.
2. Identify the drone's landing by looking for spikes in the linear acceleration signals, indicating contact with the ground.
3. Create an event around the landing, setting the Event Scope to "All Topics in this Panel." Refer to our guide: Create Events on Data.
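As a sanity check outside the visualizer, the same spike intuition from step 2 can be expressed as a crude threshold on the acceleration magnitude. A sketch with made-up values (real thresholds would depend on the vehicle and sample rate):

```python
import numpy as np

# Hypothetical linear-acceleration magnitudes (m/s^2); ~9.8 at rest,
# with a sharp spike where the drone contacts the ground.
accel = np.array([9.8, 9.9, 10.1, 35.0, 28.0, 10.0, 9.8])

# Flag samples exceeding a crude hard-landing threshold.
spike_idx = np.where(accel > 25.0)[0]
```

A fixed threshold is brittle, which is exactly why the rest of this post uses similarity search against an annotated example instead.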
We can use the Roboto SDK to access the public log collection and set up a query client.
Next, we call roboto_search.find_topics to retrieve all topics named /snappy_imu using RoboQL syntax. This search can be further refined by adding specific metadata, such as drone software version, or applying tags like "outdoor" to narrow the results.
import roboto

query_client = roboto.query.QueryClient(
    owner_org_id="og_najtcyyee2qa"  # Drone Racing (public)
)
roboto_search = roboto.RobotoSearch(query_client)

topics_to_search = roboto_search.find_topics(
    "topic.name = '/snappy_imu'"
)
Now that we've defined our search space (the /snappy_imu topics we just retrieved), we can initiate our similarity search using the event we created earlier.
1. Get the Event: Get the event using roboto.Event.from_id. The event points to the specific IMU signals that we annotated.
2. Extract Data from the Event: Extract the linear_acceleration and angular_velocity data from the IMU event into a pandas dataframe; this will serve as the query signal.
3. Find Similar Signals: Run the similarity search using the query signal with roboto.analytics.find_similar_signals.
import roboto.analytics

event = roboto.Event.from_id("ev_6funfjngoznn17x3")
query_signal = event.get_data_as_df(
    message_paths_include=["linear_acceleration", "angular_velocity"]
)

matches = roboto.analytics.find_similar_signals(
    query_signal,
    topics_to_search,
    max_matches_per_topic=1,
    normalize=True
)
With the code above, finding similar patterns in other flights becomes straightforward. We're able to precisely retrieve and search the IMU data, avoiding the need to download and process entire logs.
Finally, we can run some utility code to generate artifacts and visualize the results.
from match_visualization_utils import print_match_results

print_match_results(
    matches[:5],
    image_topic="/snappy_cam/stereo_l",
)
Hard Landings
The first column shows the distance scores of the closest matches. The second column shows the plots of the matched IMU subsequences, and the third column shows the corresponding image sequences for each match. The first row, with a distance score of zero, represents the query signal. As expected, the top matches correspond to hard landing sequences in other flights.
How does find_similar_signals work?
The find_similar_signals search is powered by Mueen's Algorithm for Similarity Search (MASS), which uses Euclidean distance profiles to identify matching subsequences in time-series data. We apply this method across multiple signals, then combine the results to find the closest match. This approach has also proven effective for detecting other types of events. Let's explore a few more cases.
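To make the idea of a distance profile concrete, here is a brute-force sketch: for every subsequence of the series with the same length as the query, compute the z-normalized Euclidean distance. (MASS computes the identical profile, just much faster via FFTs; this toy version is for illustration only.)

```python
import numpy as np

def distance_profile(query, series):
    """Z-normalized Euclidean distance between `query` and every
    equally long subsequence of `series`. Brute-force version of
    the profile MASS computes efficiently with FFTs."""
    m = len(query)
    q = (query - query.mean()) / query.std()
    dists = np.empty(len(series) - m + 1)
    for i in range(len(dists)):
        window = series[i : i + m]
        w = (window - window.mean()) / window.std()
        dists[i] = np.linalg.norm(q - w)
    return dists

# The best match is the subsequence with the smallest distance.
# The window starting at index 3 ([0, 2, 0]) has the same shape
# as the query after z-normalization, so its distance is 0.
query = np.array([0.0, 1.0, 0.0])
series = np.array([3.0, 1.0, 2.0, 0.0, 2.0, 0.0, 1.0])
best = int(np.argmin(distance_profile(query, series)))
```

Because each window is z-normalized, matches are found by shape rather than absolute amplitude, which is what the normalize=True flag above requests.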
Calibration Sequences
We looked for IMU patterns which correspond to the calibration sequence of the downward-facing camera before takeoff. The results look great!
Left Turns
Finally, we tried to find similarly aggressive left turns. The results qualitatively make sense.
Note that we only used angular_velocity to find turns, as it directly captures rotational motion. Dropping linear_acceleration makes sense because turns are primarily characterized by changes in angular velocity, while linear acceleration data may introduce noise unrelated to the rotational dynamics of the turn. It's generally a good idea to experiment with including different signals in the similarity search to better capture events.
Limitations
The approach using Euclidean distance profiles may struggle in a few cases. First, noise or signal distortions can cause the algorithm to misinterpret small variations as dissimilar, leading to missed matches. Second, temporal misalignment can prevent the algorithm from recognizing similar events that are slightly shifted in time, a problem better handled by methods like Dynamic Time Warping (DTW). Additionally, events with different semantic meanings—such as a takeoff versus manually lifting the drone—can have similar IMU signatures, potentially leading to false positives.
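To illustrate the temporal-misalignment point, here is a textbook dynamic-programming DTW distance (a sketch, not part of the Roboto SDK). For two signals containing the same spike shifted by one sample, the Euclidean distance is large while the DTW distance is zero, because DTW is free to warp the time axis when aligning samples:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(n*m) dynamic-programming DTW with absolute-difference
    cost. Tolerates temporal shifts and stretches that a pointwise
    Euclidean comparison penalizes heavily."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# The same unit spike, shifted by one sample.
x = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0, 0.0, 0.0])
```

The trade-off is cost: DTW is quadratic per comparison, whereas MASS-style distance profiles scale to long recordings, which matters when searching entire flight campaigns.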
Conclusion
In this post, we showed how Roboto’s SDK makes it easy to extract slices of your robotics data so you can focus on more advanced tasks, like finding events using similarity search.
By simplifying the retrieval process, we’ve demonstrated how you can streamline workflows that usually take hours into minutes. We're excited to continue adding new techniques for pattern matching in robotics data, so stay tuned for more features and examples soon. In the meantime, give it a try with your own data and let us know what you find!
This demo and the accompanying open source notebook use data from UZH-FPV Drone Racing, published by the Robotics and Perception Group (RPG) at the University of Zurich.