
Analyzing 1,000 PX4 Drone Logs Using the Roboto SDK

Discover how the Roboto SDK simplifies aggregating key metrics from your robot logs!

Written by

Yves Albers

Published

December 11, 2024

Introduction

In the world of robotics, capturing the right performance metrics can make or break a team’s success. Without timely access to these insights, organizations risk missing critical issues and making decisions based on incomplete data. Metrics are essential for understanding system health, achieving business objectives, and enabling diverse teams—from operations to engineering and leadership—to stay aligned and work effectively.

Most metrics can be derived from data in robot log files. However, generating meaningful metrics at scale presents several challenges:

Data Volume: Fleets of robots generate massive amounts of log data.

Data Format: Logs are often stored in specialized formats, requiring dedicated tools and expertise to unpack them.

Data Modality: Logs contain diverse, multimodal data, such as GPS signals and object detections, each requiring different analytical approaches.

Moreover, traditional metrics platforms can be cumbersome to use with robotics data. They don’t natively support specialized log formats like rosbags or ulogs, forcing teams to spend extra time extracting and post-processing the raw log data before it’s usable.

At Roboto, we eliminate the need for these complex ETL processes by enabling teams to go directly from their robot log files to actionable metrics. By simplifying data access, we help organizations create safer autonomous systems, accelerate their time-to-market, and empower every team member to make informed decisions.

In this post, we’ll demonstrate how our SDK can efficiently retrieve detailed metrics from robot logs, enabling dashboards to track performance, identify trends, and evaluate system health.

Preparation

For this post, we ingested 1,000 publicly available drone flights from PX4 Flight Review into a Roboto account. The flights cover various software versions, hardware configurations and locations.

To reproduce the results, create a Roboto account, install the Python SDK, and configure a personal access token. You can see the ingested flights in the public collection on Roboto and run the notebook by following the README in our repository.

You will also need to install tqdm and plotly in your environment by running:

pip install tqdm plotly

Background

Metrics capabilities in Roboto are powered by two core components:

RoboQL: Roboto’s query language enables precise searches across all data elements in the platform. For more information, check out the docs.
Topic Statistics: During log ingestion, Roboto calculates statistics such as min, max, mean, and median for each field—enabling efficient data aggregation.
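
To make the idea of topic statistics concrete, here is a minimal sketch of the kind of per-field summary described above, computed with Python's standard library. The `summarize_field` function and the sample values are hypothetical illustrations, not Roboto's ingestion code.

```python
from statistics import mean, median


def summarize_field(values):
    """Return summary statistics for one numeric field of a topic."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "count": len(values),
    }


# Hypothetical samples of the `load` field from a `cpuload` topic
load_samples = [0.25, 0.5, 0.75, 0.5]
load_stats = summarize_field(load_samples)
print(load_stats)
```

Summaries like these are small compared to the raw log, which is presumably why computing them once at ingestion makes later aggregation across many files cheap.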

You can view the statistical data for an ingested log file by expanding its topics. For example, we can expand the cpuload topic to see statistics for the load and ram_usage fields, and even open them up in the visualizer.

Inspecting one of the ingested PX4 logs in Roboto

Aggregation & Analysis

Now, let’s dive into some analysis! We’ll aggregate statistics across all 1,000 log files and explore a few examples. In these examples, we’ll assume various job roles within a company developing an autonomous drone. You can find the full notebook here.

(1) Altitude incursions, geofence violations and RC signal loss

As a Systems Engineer, your task is to report on key operational metrics: (a) altitude incursions, where the drone exceeded the maximum flight altitude of 250 m; (b) geofence violations, where the drone left the designated flight area; and (c) occurrences of remote control (RC) signal loss. These metrics are essential for assessing compliance with flight protocols and ensuring safety within the airspace.

Here’s how we could obtain those metrics using the Roboto SDK.

(a) Number of Altitude Incursions:

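# `roboto_search` is assumed to be a Roboto SDK search client that has already been initialized.
# Each query returns log files, so each count is the number of flights in which the condition occurred at least once.
query = 'topics[0].msgpaths[vehicle_air_data.baro_alt_meter].max > 250 AND created > "2024-01-01"'
results = roboto_search.find_files(query)
nr_altitude_incursion_events = len(list(results))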

(b) Number of Geofence Violations:

query = 'topics[0].msgpaths[vehicle_status.geofence_violated].true_count > 0 AND created > "2024-01-01"'
results = roboto_search.find_files(query)
nr_geofence_violations = len(list(results))

(c) Number of RC Signal Loss Events:

query = 'topics[0].msgpaths[vehicle_status.rc_signal_lost].true_count > 0 AND created > "2024-01-01"'
results = roboto_search.find_files(query)
nr_rc_signal_lost = len(list(results))
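
The three queries above share one pattern: a filter on a per-file statistic of a message path, plus an optional creation-date filter. As a minimal sketch, a small helper can assemble such strings. `build_stat_query` is a hypothetical convenience function, not part of the Roboto SDK, and it only reproduces the query syntax shown in the examples above.

```python
def build_stat_query(msgpath, stat, operator, threshold, created_after=None):
    """Assemble a RoboQL filter on a per-file message-path statistic."""
    query = f"topics[0].msgpaths[{msgpath}].{stat} {operator} {threshold}"
    if created_after is not None:
        # Optional filter on the creation date, as in the examples above
        query += f' AND created > "{created_after}"'
    return query


altitude_query = build_stat_query(
    "vehicle_air_data.baro_alt_meter", "max", ">", 250, "2024-01-01"
)
geofence_query = build_stat_query(
    "vehicle_status.geofence_violated", "true_count", ">", 0, "2024-01-01"
)
print(altitude_query)
print(geofence_query)
```

The resulting strings can then be passed to `roboto_search.find_files` exactly as in the snippets above.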

(2) CPU load statistics for master software branch

As an Avionics Engineer developing the next-generation autopilot, you need to analyze CPU load metrics from flights on the master branch. Key metrics include:

Min CPU Load
Max CPU Load
Mean CPU Load

Here's how we can obtain the metrics using the SDK:

from statistics import mean

query = 'topic.name = "cpuload" AND path="load" AND file.metadata.ver_sw_branch = "master"'
results = list(roboto_search.find_message_paths(query))

# Each result carries per-file statistics; multiply by 100 to report percentages
min_cpu_load_master_branch = min(m.min for m in results) * 100
max_cpu_load_master_branch = max(m.max for m in results) * 100
mean_cpu_load_master_branch = mean(m.mean for m in results) * 100
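
One caveat when aggregating per-file statistics: the minimum of per-file minima and the maximum of per-file maxima are exact, but an unweighted mean of per-file means treats a short flight the same as a long one. The sketch below contrasts it with a mean weighted by message count. The `FileStats` record and its `count` field are hypothetical; whether the SDK exposes a per-file message count under that name is an assumption.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class FileStats:
    """Hypothetical per-file statistics for a single field."""
    min: float
    max: float
    mean: float
    count: int  # number of messages in the file (assumed to be available)


# Made-up values for two flights of different length
per_file = [
    FileStats(min=0.1, max=0.5, mean=0.25, count=100),
    FileStats(min=0.25, max=1.0, mean=0.75, count=300),
]

fleet_min = min(s.min for s in per_file)
fleet_max = max(s.max for s in per_file)

# Unweighted: every file contributes equally
unweighted_mean = mean([s.mean for s in per_file])

# Weighted: every message contributes equally
total_count = sum(s.count for s in per_file)
weighted_mean = sum(s.mean * s.count for s in per_file) / total_count

print(fleet_min, fleet_max, unweighted_mean, weighted_mean)
```

Which of the two means is appropriate depends on the question: "what does a typical flight look like" favors the unweighted form, while "what is the load across all recorded samples" favors the weighted one.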

(3) Compare the mean accelerometer temperature between different hardware versions

As a Thermal Engineer, you need to investigate reports of overheating following a hardware change. Specifically, you need to retrieve the mean accelerometer temperature for hardware versions PX4_FMU_V6C and PX4_FMU_V3.

Here's how we can obtain the metrics using the SDK:

query1 = 'topic.name = "vehicle_imu_status_00" AND path="temperature_accel" AND file.metadata.ver_hw = "PX4_FMU_V6C"'
query2 = 'topic.name = "vehicle_imu_status_00" AND path="temperature_accel" AND file.metadata.ver_hw = "PX4_FMU_V3"'

results1 = list(roboto_search.find_message_paths(query1))
results2 = list(roboto_search.find_message_paths(query2))

# `mean` comes from Python's `statistics` module (`from statistics import mean`)
mean_accelerometer_temp_px4_fmu_v6c = mean([m.mean for m in results1])
mean_accelerometer_temp_px4_fmu_v3 = mean([m.mean for m in results2])
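
When more than two hardware versions need comparing, the same query can be parameterized instead of duplicated. The following is a sketch under assumptions: `mean_temp_by_hardware` is a hypothetical helper, the search function is passed in as an argument, and `fake_search` with its temperature values is a made-up stand-in so the example runs without a Roboto account.

```python
from statistics import mean
from types import SimpleNamespace

QUERY_TEMPLATE = (
    'topic.name = "vehicle_imu_status_00" AND path="temperature_accel" '
    'AND file.metadata.ver_hw = "{hw}"'
)


def mean_temp_by_hardware(hardware_versions, find_message_paths):
    """Map each hardware version to the mean of its per-file mean temperatures."""
    result = {}
    for hw in hardware_versions:
        entries = list(find_message_paths(QUERY_TEMPLATE.format(hw=hw)))
        if entries:  # skip versions without matching logs; mean([]) would raise
            result[hw] = mean([e.mean for e in entries])
    return result


def fake_search(query):
    """Stand-in for a real search call; returns made-up per-file means."""
    data = {"PX4_FMU_V6C": [40.0, 44.0], "PX4_FMU_V3": [30.0, 34.0]}
    return [
        SimpleNamespace(mean=value)
        for hw, values in data.items()
        if f'"{hw}"' in query
        for value in values
    ]


temps = mean_temp_by_hardware(["PX4_FMU_V6C", "PX4_FMU_V3"], fake_search)
print(temps)
```

With a real client, `roboto_search.find_message_paths` would be passed in place of `fake_search`.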

(4) Analyze flight distribution by country and popularity of PX4 hardware versions

Finally, we thought it would be interesting to look at the distribution of the 1,000 drone flights by country and identify the most popular PX4 hardware versions in use.

Let’s start with a breakdown of the top 20 PX4 hardware versions:

from collections import Counter

# Retrieve and count `ver_hw` occurrences
results = roboto_search.find_files("*")
ver_hw_counts = Counter(entry.metadata.get('ver_hw') for entry in results if entry.metadata.get('ver_hw'))

# Sort counts in descending order
sorted_ver_hw_counts = dict(sorted(ver_hw_counts.items(), key=lambda item: item[1], reverse=True))

# Extract hardware names and values
hardware_names = list(sorted_ver_hw_counts.keys())
hardware_values = list(sorted_ver_hw_counts.values())
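
If only the top N versions are needed, as in the top-20 breakdown, `Counter.most_common` returns the entries already sorted by count and truncated. The sketch below uses a hypothetical list of `ver_hw` values in place of real file metadata.

```python
from collections import Counter

# Hypothetical `ver_hw` values standing in for file metadata
ver_hw_values = [
    "PX4_FMU_V5", "PX4_SITL", "PX4_FMU_V5",
    "PX4_FMU_V6C", "PX4_SITL", "PX4_FMU_V5",
]

# most_common(n) returns (value, count) pairs, most frequent first
top_versions = Counter(ver_hw_values).most_common(2)

top_names = [name for name, _ in top_versions]
top_counts = [count for _, count in top_versions]
print(top_names, top_counts)
```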

It’s interesting to see PX4_SITL, which represents Software-in-the-Loop (SITL) simulation flights, emerging as the second-most frequent hardware version. This highlights the critical role of simulation in drone development, demonstrating how heavily developers rely on SITL to test their systems before transitioning to hardware.

Now let’s try to find the distribution of drone flights by country.

To achieve this, we’ll start by defining a helper function to run queries and normalize the coordinates given discrepancies in PX4 versions:

def get_coordinates(query):
    normalization_factors = {"lat": 1e7, "lon": 1e7}
    result_list = roboto_search.find_message_paths(query)
    return [
        entry.mean / normalization_factors.get(entry.path, 1)
        for entry in result_list
    ]
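
The factor of 1e7 reflects how older PX4 versions encode coordinates: `lat` and `lon` are integers in units of 1e-7 degrees, while the newer `latitude_deg` and `longitude_deg` fields are already in degrees. Below is a minimal sketch of that conversion on its own; `to_degrees` is a hypothetical helper and the coordinate values are made up for illustration.

```python
# Scale factors for the legacy integer fields; any other field is left unchanged
NORMALIZATION_FACTORS = {"lat": 1e7, "lon": 1e7}


def to_degrees(path, value):
    """Convert a GPS statistic to degrees based on the field it came from."""
    return value / NORMALIZATION_FACTORS.get(path, 1)


# The same hypothetical location in both encodings
legacy = to_degrees("lat", 473977420)            # integer, 1e-7 degrees
current = to_degrees("latitude_deg", 47.397742)  # already in degrees
print(legacy, current)
```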

Next, we can obtain the mean latitude and longitude for each flight that includes the vehicle_gps_position topic. Then, we’ll perform a reverse GPS lookup to determine the country for each set of coordinates.

# The field names in the `vehicle_gps_position` topic vary by PX4 version:
# they may be lat/lon or latitude_deg/longitude_deg, so we query both below.

# Define query strings to obtain latitude and longitude values
latitude_query = 'topic.name="vehicle_gps_position" AND (path="lat" OR path="latitude_deg")'
longitude_query = 'topic.name="vehicle_gps_position" AND (path="lon" OR path="longitude_deg")'

# Retrieve latitude and longitude values using helper function
latitude_list = get_coordinates(latitude_query)
longitude_list = get_coordinates(longitude_query)

# Look up countries from GPS coordinates (this may take some time).
# `get_countries_by_reverse_gps` is a helper that is not shown in this post.
country_dict = get_countries_by_reverse_gps(latitude_list, longitude_list)

# Sort and extract country names and values
sorted_country_dict = dict(sorted(country_dict.items(), key=lambda item: item[1], reverse=True))

One of the more striking insights is Switzerland’s position as the third most active country for PX4 Flight Review uploads. This is unexpected given its small size compared to larger nations like China, South Korea, India, and the United States. However, Switzerland’s ranking could be explained by PX4’s roots at ETH Zurich and the country’s dynamic robotics community.

In the previous analysis, we observed that SITL flights are the second most popular hardware type. Upon closer examination, we found that these simulations are geographically centered in Switzerland by default, which significantly boosts the country’s numbers.

When simulated flights are excluded from the breakdown, the rankings shift: Switzerland drops to fourth place. This adjustment offers a clearer view of where real-world drone activity is taking place.
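
Excluding simulated flights amounts to filtering on the hardware version before counting. The sketch below shows the idea on hypothetical (country, `ver_hw`) pairs; the data is made up, and it assumes each flight's country and hardware version are available together, which the snippets above do not show.

```python
from collections import Counter

# Hypothetical (country, ver_hw) pairs, one per flight
flights = [
    ("Switzerland", "PX4_SITL"),
    ("Switzerland", "PX4_SITL"),
    ("Switzerland", "PX4_FMU_V5"),
    ("United States", "PX4_FMU_V6C"),
    ("United States", "PX4_FMU_V5"),
    ("India", "PX4_FMU_V3"),
]

# Ranking over all flights, simulated ones included
all_flights = Counter(country for country, _ in flights)

# Ranking over real-world flights only
real_flights = Counter(
    country for country, hw in flights if hw != "PX4_SITL"
)

print(all_flights.most_common(1))
print(real_flights.most_common(1))
```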

Conclusion

In this post, we demonstrated how to easily generate aggregated metrics from 1,000 PX4 log files. To learn more about how you can leverage Roboto’s SDK with your own logs, be sure to check out our documentation and notebook!