As an engineer analyzing robot logs, you’re likely asking questions such as:
Have we encountered this type of issue before?
How often does this event occur?
Can we identify similar events in past logs?
Can we set up alerts to catch this issue if it happens again?
Answering these questions is essential for diagnosing system behaviors, anticipating potential failures, and continuously improving your robot's performance.
However, answering them presents two key challenges:
(1) Data volume and format: Robots generate huge amounts of data—sometimes tens or hundreds of gigabytes per mission—stored in specialized formats like ROS bags or MCAP files. Parsing these logs to extract relevant segments is typically manual and labor-intensive. This gets even worse when dealing with log data from fleets of robots.
(2) Data diversity and pattern complexity: Robot logs are multimodal, containing a wide variety of sensor data. While simple issues can be flagged by looking for discrete values or conditions (e.g., vehicle_gps_lost = True), more complex events—such as repeated instability in drone flights or subtle gripper failures—demand more advanced methods.
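For the simple case, flagging a discrete condition is a one-liner once the log is in a dataframe. A minimal sketch (the field names and values here are hypothetical, chosen to mirror the vehicle_gps_lost example above):

```python
import pandas as pd

# Hypothetical decoded log: one row per message, with a discrete status flag.
log = pd.DataFrame({
    "t": [0.0, 0.1, 0.2, 0.3],
    "vehicle_gps_lost": [False, False, True, False],
})

# Simple issues can be caught with direct boolean indexing...
gps_loss_rows = log[log["vehicle_gps_lost"]]
```

Complex events like instability or gripper failures have no such flag, which is where pattern-matching methods come in.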
At Roboto, our goal is to address challenge (1) by making it easier to retrieve task-specific data at scale, allowing you to manage and analyze large datasets more effectively. In parallel, we’re expanding our tooling to help address challenge (2) as well—building features that identify and classify complex patterns in multimodal data.
In this post, we’ll walk through an example where we use our Python SDK to retrieve IMU data from drone racing logs and then leverage signal similarity search to identify and classify patterns.
As a systems engineer, you're tasked with investigating hard landings in your drone's latest flight campaign. To do this, you need to generate landing videos from the drone’s forward-facing camera and extract the corresponding IMU signals. These will then be reviewed by the controls team to assess performance.
While this sounds straightforward, the challenge is that your ROS bags contain raw IMU and image data—there are no annotations or flags indicating hard landing events. As a result, you'd typically need to:
1. Download dozens of raw flight logs, each several gigabytes in size.
2. Manually review flights to identify hard landings and record the timestamps.
3. Write custom scripts to extract images and IMU signals, then compile them into videos.
4. Upload these videos to a storage system for shared access.
This process is painstaking and could take several days for a large number of logs.
Instead of manually reviewing every flight, we're going to use Roboto to find and retrieve hard landings automatically:
1. Single Annotation: Use Roboto’s visualizer to annotate an exemplary hard landing event.
2. Data Retrieval: Use Roboto’s SDK to efficiently retrieve IMU signals from flights to search.
3. Similarity Search: Use similarity search to identify other hard landings in the IMU signals.
4. Artifact Generation: Generate clips of the relevant landing sequences.
This approach will streamline the entire process and save considerable time.
For this example, we ingested 25 drone flights from UZH-FPV Drone Racing. The flights cover various drone racing scenes, both indoor and outdoor, using forward and downward-facing cameras.
To reproduce the steps below: create your own account, then install the Python SDK and configure a personal access token. You can see the flight logs in the public collection on Roboto and run the notebook by following the README in the repository.
Our first task is to use Roboto’s visualizer to annotate an exemplary hard landing event. This will serve as a reference to find other hard landings afterwards.
1. Open a dataset in the visualizer. Create panels for the image topics, and plot the [x, y, z] values from /snappy_imu/linear_acceleration. Refer to our guide: Visualizing ROS Data.
2. Identify the drone's landing by looking for spikes in the linear acceleration signals, indicating contact with the ground.
3. Create an event around the landing, setting the Event Scope to "All Topics in this Panel." Refer to our guide: Create Events on Data.
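As a sanity check outside the visualizer, the same spike intuition from step 2 can be expressed as a crude threshold on the acceleration magnitude. A sketch with made-up values (real thresholds would depend on the vehicle and sample rate):

```python
import numpy as np

# Hypothetical linear-acceleration magnitudes (m/s^2); ~9.8 at rest,
# with a sharp spike where the drone contacts the ground.
accel = np.array([9.8, 9.9, 10.1, 35.0, 28.0, 10.0, 9.8])

# Flag samples exceeding a crude hard-landing threshold.
spike_idx = np.where(accel > 25.0)[0]
```

A fixed threshold is brittle, which is exactly why the rest of this post uses similarity search against an annotated example instead.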
We can use the Roboto SDK to access the public log collection and set up a query client.
Next, we call roboto_search.find_topics to retrieve all topics named /snappy_imu using RoboQL syntax. This search can be further refined by adding specific metadata, such as drone software version, or applying tags like "outdoor" to narrow the results.
import roboto

query_client = roboto.query.QueryClient(
    owner_org_id="og_najtcyyee2qa"  # Drone Racing (public)
)
roboto_search = roboto.RobotoSearch(query_client)

topics_to_search = roboto_search.find_topics(
    "topic.name = '/snappy_imu'"
)
Now that we've defined our search space (the /snappy_imu topics we just retrieved), we can initiate our similarity search using the event we created earlier.
1. Get the Event: Get the event using roboto.Event.from_id. The event points to the specific IMU signals that we annotated.
2. Extract Data from the Event: Extract the linear_acceleration and angular_velocity data from the IMU event into a pandas dataframe; this will serve as the query signal.
3. Find Similar Signals: Run the similarity search using the query signal with roboto.analytics.find_similar_signals.
import roboto.analytics

event = roboto.Event.from_id("ev_6funfjngoznn17x3")
query_signal = event.get_data_as_df(
    message_paths_include=["linear_acceleration", "angular_velocity"]
)

matches = roboto.analytics.find_similar_signals(
    query_signal,
    topics_to_search,
    max_matches_per_topic=1,
    normalize=True
)
With the code above, finding similar patterns in other flights becomes straightforward. We're able to precisely retrieve and search the IMU data, avoiding the need to download and process entire logs.
Finally, we can run some utility code to generate artifacts and visualize the results.
from match_visualization_utils import print_match_results

print_match_results(
    matches[:5],
    image_topic="/snappy_cam/stereo_l",
)
Hard Landings
The first column shows the distance scores of the closest matches. The second column shows the plots of the matched IMU subsequences, and the third column shows the corresponding image sequences for each match. The first row, with a distance score of zero, represents the query signal. As expected, the top matches correspond to hard landing sequences in other flights.
How does find_similar_signals work?
The find_similar_signals search is powered by Mueen's Algorithm for Similarity Search (MASS), which uses Euclidean distance profiles to identify matching subsequences in time-series data. We apply this method across multiple signals, then combine the results to find the closest match. This approach has also proven effective for detecting other types of events. Let's explore a few more cases.
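To make the idea of a distance profile concrete, here is a brute-force sketch: for every subsequence of the series with the same length as the query, compute the z-normalized Euclidean distance. (MASS computes the identical profile, just much faster via FFTs; this toy version is for illustration only.)

```python
import numpy as np

def distance_profile(query, series):
    """Z-normalized Euclidean distance between `query` and every
    equally long subsequence of `series`. Brute-force version of
    the profile MASS computes efficiently with FFTs."""
    m = len(query)
    q = (query - query.mean()) / query.std()
    dists = np.empty(len(series) - m + 1)
    for i in range(len(dists)):
        window = series[i : i + m]
        w = (window - window.mean()) / window.std()
        dists[i] = np.linalg.norm(q - w)
    return dists

# The best match is the subsequence with the smallest distance.
# The window starting at index 3 ([0, 2, 0]) has the same shape
# as the query after z-normalization, so its distance is 0.
query = np.array([0.0, 1.0, 0.0])
series = np.array([3.0, 1.0, 2.0, 0.0, 2.0, 0.0, 1.0])
best = int(np.argmin(distance_profile(query, series)))
```

Because each window is z-normalized, matches are found by shape rather than absolute amplitude, which is what the normalize=True flag above requests.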
Calibration Sequences
We looked for IMU patterns which correspond to the calibration sequence of the downward-facing camera before takeoff. The results look great!
Left Turns
Finally, we tried to find similarly aggressive left turns. The results qualitatively make sense.
Note that we only used angular_velocity to find turns, as it directly captures rotational motion. Dropping linear_acceleration makes sense because turns are primarily characterized by changes in angular velocity, while linear acceleration data may introduce noise unrelated to the rotational dynamics of the turn. It's generally a good idea to experiment with including different signals in the similarity search to better capture events.
Limitations
The approach using Euclidean distance profiles may struggle in a few cases. First, noise or signal distortions can cause the algorithm to misinterpret small variations as dissimilar, leading to missed matches. Second, temporal misalignment can prevent the algorithm from recognizing similar events that are slightly shifted in time, a problem better handled by methods like Dynamic Time Warping (DTW). Additionally, events with different semantic meanings—such as a takeoff versus manually lifting the drone—can have similar IMU signatures, potentially leading to false positives.
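To illustrate the temporal-misalignment point, here is a textbook dynamic-programming DTW distance (a sketch, not part of the Roboto SDK). For two signals containing the same spike shifted by one sample, the Euclidean distance is large while the DTW distance is zero, because DTW is free to warp the time axis when aligning samples:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(n*m) dynamic-programming DTW with absolute-difference
    cost. Tolerates temporal shifts and stretches that a pointwise
    Euclidean comparison penalizes heavily."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# The same unit spike, shifted by one sample.
x = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0, 0.0, 0.0])
```

The trade-off is cost: DTW is quadratic per comparison, whereas MASS-style distance profiles scale to long recordings, which matters when searching entire flight campaigns.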
Conclusion
In this post, we showed how Roboto’s SDK makes it easy to extract slices of your robotics data so you can focus on more advanced tasks, like finding events using similarity search.
By simplifying the retrieval process, we’ve demonstrated how you can streamline workflows that usually take hours into minutes. We're excited to continue adding new techniques for pattern matching in robotics data, so stay tuned for more features and examples soon. In the meantime, give it a try with your own data and let us know what you find!
This demo and the accompanying open source notebook use data from UZH-FPV Drone Racing, published by the Robotics and Perception Group (RPG) at the University of Zurich.