
Transforming How Our Health Insights Engine Works


Agnieszka Górecka

Updated on August 21, 2024

We’ve replaced our old rule-based insight engine with a new one that analyzes and correlates hundreds of health and fitness data points in less than a second.


From the very beginning, Aidlab’s mission has been pretty clear: aggregating, organizing, and analyzing health data to help extend human lifespan by finding patterns, spotting correlations, and tracking key trends that could impact longevity. We started with the Aidlab mobile app, which, when paired with the Aidlab chest strap, allowed users to capture raw signals like electrocardiograms (ECG) and extract valuable high-level metrics. For instance, from raw ECG signals, we calculate the intervals between successive heartbeats (RR intervals), which helps us derive Heart Rate Variability—an important indicator of cardiovascular health and resilience to stress.
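To make the RR-to-HRV step concrete, here is a minimal sketch of one common time-domain HRV metric, RMSSD (root mean square of successive differences). This is an illustration of the general technique, not necessarily the exact metric or implementation Aidlab uses:

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences (RMSSD), a common
    time-domain HRV metric, computed from RR intervals in milliseconds."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# A short, fairly steady series of RR intervals (ms)
print(rmssd([800, 810, 790, 805, 795]))
```

Higher RMSSD values generally indicate greater beat-to-beat variability, which is typically associated with better recovery and stress resilience.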

We didn’t stop there. We introduced the concept of computed data, combining multiple data sets. For example, to get an accurate calorie burn rate, we take into account heart rate (along with max HR and resting HR), weight, current activity, and sex. Similarly, events like coughing can be detected by analyzing sound volume together with chest movements.
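As an illustration of how heart rate, weight, and sex can be combined into a calorie estimate, here is a sketch based on the published Keytel et al. (2005) regression. This is one well-known formula from the literature, not a description of Aidlab's actual calorie model:

```python
def kcal_per_min(heart_rate, weight_kg, age, is_male):
    """Estimate energy expenditure (kcal/min) from heart rate using the
    Keytel et al. (2005) regression. Illustrative only; this is not
    necessarily the formula Aidlab uses."""
    if is_male:
        kj_per_min = -55.0969 + 0.6309 * heart_rate + 0.1988 * weight_kg + 0.2017 * age
    else:
        kj_per_min = -20.4022 + 0.4472 * heart_rate - 0.1263 * weight_kg + 0.0740 * age
    return kj_per_min / 4.184  # convert kJ/min to kcal/min

print(round(kcal_per_min(140, 75, 30, True), 1))  # roughly 13 kcal/min
```

Note that such regressions are calibrated for aerobic exercise; resting and maximum heart rate, as mentioned above, are needed to decide when a formula like this applies.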

```mermaid
flowchart TD
    A["Raw ECG"] --> n1["RR Intervals"]
    n1 --> n2["Heart Rate Variability"] & n3["Heart Rate"]
    n3 --> E["Estimated Calorie Burn Rate"]
    n4["Weight Records"] --> E
    n6["Raw Respiration Data"] --> n7["Respiration Rate"]
    n7 --> n8["Estimated Stress Level"]
    n2 --> n8
    style n1 fill:#BBDEFB
    style n3 fill:#BBDEFB
    style n2 fill:#BBDEFB
    style E fill:#FFF9C4
    style n7 fill:#BBDEFB
    style n8 fill:#FFF9C4
```

Examples of relationships between different types of data: raw data is unhighlighted, high-level data is shown in blue, and computed metrics in yellow.

This laid the groundwork for Aidlab Teams, a service designed to organize and visualize complex data in ways that are sometimes too advanced for mobile platforms.

Data visualization in Aidlab Teams

While collecting and presenting physiological signals and events was important, we soon realized that understanding well-being is about more than just tracking numbers.

In 2021, we began working on a rule-based analysis engine to help users make sense of their data by delivering simple insights into their daily activities and habits. The engine was designed to be both motivational and informational, fitting perfectly with our mission to improve overall health.

Betting on Physical Activity as a Path to Longevity

With our limited resources and in line with the Pareto principle, we knew we had to focus on areas that would have the greatest impact on longevity with a reasonable engineering effort from the team. Physical activity stood out as a key factor—one that provided significant health benefits through relatively straightforward analysis and was supported by a wealth of reproducible studies.

Activity is a key element in improving overall health. It strengthens your body, makes it more resilient, and improves cardiovascular fitness. As a result, your resting heart rate decreases, signaling better heart health. Seeing the direct impact of your daily exercises on your body can help you make more informed decisions about your lifestyle.

For example, we analyzed users' average daily steps over the past months and compared them to previous periods. This simple trend, though basic, was able to highlight significant changes in daily routines for many users.
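A trend comparison like this can be sketched in a few lines. The helper below is hypothetical, not Aidlab's actual code, but it shows the idea of comparing mean daily steps across two periods:

```python
def step_trend(current_days, previous_days):
    """Compare mean daily steps across two periods and return the
    relative change in percent (hypothetical helper for illustration)."""
    cur = sum(current_days) / len(current_days)
    prev = sum(previous_days) / len(previous_days)
    return (cur - prev) / prev * 100

change = step_trend([4200, 5100, 3900], [8000, 7600, 8400])
print(f"{change:+.0f}% vs previous period")  # prints "-45% vs previous period"
```

Even this crude percentage was often enough to surface a meaningful change in routine.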

From Rule-Based Insights to Personalized Insights

However, we quickly identified the limitations of our analytical engine. To truly understand a user's health, more complex patterns had to be detected. For instance, consistently low step counts over seven consecutive days might suggest illness, a change in routine, or even a loss of motivation. Manually programming these kinds of rules became time-consuming and error-prone, especially when accounting for the complexity of individual behaviors.
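A hand-written rule of the kind described above might look like this sketch (the threshold and function name are made up for illustration), and its shortcomings are visible immediately:

```python
LOW_STEP_THRESHOLD = 3000  # arbitrary cutoff; tuning this per user is part of what made rules painful

def low_activity_streak(daily_steps, days=7):
    """Fire when the last `days` entries are all below the threshold.
    A rule like this ignores context (travel, illness, device off-wrist),
    which is why hand-coded rules multiplied and became hard to maintain."""
    recent = daily_steps[-days:]
    return len(recent) == days and all(s < LOW_STEP_THRESHOLD for s in recent)
```

Every edge case demands another condition, another threshold, another rule, and the rule still cannot explain *why* the streak happened.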

Throughout this journey, we’ve learned that focusing on a single data point isn’t enough—you need a system that can interpret multiple factors together to show why certain changes are happening and push you to take action when it matters most. That’s where AI Insights steps in.

Unlike rigid, rule-based systems that only report what’s happening, AI Insights goes further by detecting subtle shifts in your metrics and linking them to relevant factors, such as changes in your daily routine or variations in your activity patterns. This provides you with more meaningful, context-driven insights.

Examples of AI Insights

For example, if you typically start your mornings with a walk to the coffee shop but have recently been skipping it, our updated engine can now detect this deviation. It might suggest that something in your schedule or energy levels has changed—whether it's work stress, a lack of sleep, or even the early signs of burnout or illness. By identifying these changes early, AI Insights can send you personalized suggestions, like a reminder to take a short walk or to plan a break during the day. It's like having a health coach who knows when you need encouragement, before you even realize it yourself.

What’s even more exciting is how AI Insights keeps you accountable. Motivation isn’t just about setting goals—it’s about receiving the right nudges to keep you moving toward them. Whether it’s a subtle prompt to get moving or a celebration of hitting a milestone, AI Insights is there to keep you motivated and on track.

How AI Insights Works

We use LLMs to process and interpret complex health data in ways that our old system couldn't. However, anyone familiar with these models knows that their calculations can sometimes be inaccurate, as they are prone to hallucinations and not designed for precise numerical tasks. This is why we augment the LLM-based system with pre-computations.

Do LLMs Even Know How Many 'R's Are in 'Raspberry'?

LLMs are particularly good at recognizing patterns from structured inputs. While it’s true that raw health data like ECG or audio signals require pre-processing before being fed into an LLM, the patterns that emerge from these datasets (such as changes in activity levels or deviations in health metrics over time) are much like the contextual relationships LLMs excel at interpreting. Ultimately, we don’t rely on the LLM to handle raw data; instead, we provide it with structured, aggregated data that it can interpret and correlate across time and different metrics. I’ll explain this in more detail below.

LLMs also bring a lot of flexibility. The data we’re interpreting isn’t isolated; it’s deeply connected to broader context like daily routines, lifestyle habits, and stress levels. Traditional AI models often work in silos, optimized for specific data types (we’ve experienced this limitation ourselves), but LLMs let us analyze diverse datasets all at once, in a more holistic way.

It's also worth mentioning that we don't rely entirely on LLMs for everything. Some parts of our older rule-based system are still in place. We continue using those same mechanisms to capture and process the most relevant data. This includes keeping certain pre-calculations to ensure accuracy and reliability. For example, we often group data into time-windowed averages based on factors like user activity levels or sensor inputs. Once that's done, we structure, package, and feed the aggregated data points into the LLM.
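The time-windowed averaging mentioned above can be sketched with the standard library alone. This is a simplified illustration of the bucketing idea, not Aidlab's production pipeline:

```python
from collections import defaultdict
from datetime import datetime

def hourly_averages(samples):
    """Group (timestamp, value) samples into hourly buckets and
    average each bucket; a sketch of time-windowed aggregation."""
    buckets = defaultdict(list)
    for ts, value in samples:
        # Truncate the timestamp to the start of its hour
        buckets[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}

samples = [
    (datetime(2024, 8, 21, 9, 5), 72.0),
    (datetime(2024, 8, 21, 9, 40), 78.0),
    (datetime(2024, 8, 21, 10, 12), 90.0),
]
print(hourly_averages(samples))
```

The same pattern extends to daily, weekly, or activity-bounded windows; only the truncation key changes.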

A simplified flow is shown below:

```mermaid
flowchart TD
    A["Raw Health and Fitness Data"] --> n1["High-Level Data"]
    B["Time-windowed Aggregation"] --> C["Structured, Cached Data"]
    E["LLM Processes Data"] --> F["Output in Structured JSON Format"]
    n1 --> n2["Computed Data"] & B
    n2 --> B
    C --> E
    style n1 fill:#BBDEFB
    style B stroke-width:1px,stroke-dasharray: 3
    style n2 fill:#FFF9C4
```
  1. Preparing High-Level and Computed Data from Raw Data

    In this step, the goal is to extract the most relevant data for further processing. Raw data can't be sent directly to LLMs because they are not designed to process unstructured or high-frequency sensor data (e.g., >1 Hz), such as accelerometer data. Instead, we extract high-level and computed data, as mentioned before, using techniques such as averaging, filtering, and sampling. Some of these calculations are performed directly on the Aidlab device itself to reduce data bandwidth; we didn't have much choice there (looking at you, Bluetooth). For instance, in the case of audio, we extract key parameters, referred to as sound features, which contain the most relevant characteristics.

  2. Time-windowed Aggregation

    The next step is time-windowed aggregation, where we further reduce data complexity by grouping the data into different time intervals. This allows us to capture trends over varying periods, such as steps taken over the past week, month, or year. For example, we may emphasize periods when physical exercises were performed or when Aidlab was used for health checkups. We also developed a decision-making mechanism that customizes the analysis for user-specific events, providing more relevant and actionable insights.

  3. Structure and Cache Data

    Once aggregated, the data is structured and formatted into predefined formats. This step ensures that the data is organized efficiently and is easy to cache.

  4. Data Processing and Generating Output

    Here, the data is fed into the model. The LLM interprets the data and generates output in a structured JSON format, following a predefined schema. We then validate the outputs against set rules and metrics to ensure trustworthy insights. The results are also cached and refreshed every hour, ensuring up-to-date analysis without overloading system resources.
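The validation step can be illustrated with a small sketch. The schema below is hypothetical (Aidlab's actual output schema is not public), but it shows the pattern of rejecting malformed model output before it reaches users:

```python
import json

# Hypothetical schema for a single insight; field names are illustrative only.
REQUIRED_KEYS = {"metric": str, "trend": str, "message": str, "confidence": float}
ALLOWED_TRENDS = {"up", "down", "stable"}

def validate_insight(raw_json):
    """Parse the model's JSON output and reject anything that does
    not match the expected shape or value ranges."""
    data = json.loads(raw_json)
    for key, expected_type in REQUIRED_KEYS.items():
        if not isinstance(data.get(key), expected_type):
            raise ValueError(f"bad or missing field: {key}")
    if data["trend"] not in ALLOWED_TRENDS:
        raise ValueError("unknown trend value")
    if not 0.0 <= data["confidence"] <= 1.0:
        raise ValueError("confidence out of range")
    return data
```

Constraining the model to a schema and validating every response is what makes LLM output safe to cache and serve.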

Final Thoughts

This change has turned our insight engine from a basic tool that showed isolated data into one that delivers deeper insights through cohort comparisons. For instance, we can now show how resting heart rate differs across age groups and genders, letting users compare their own data with broader population averages. This means we can provide users with interesting insights, like whether their resting heart rate is above or below the typical range for their demographic.
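A cohort comparison of this kind reduces to a range lookup. The reference ranges below are invented placeholders (real ones would come from aggregated population data), but the mechanism is the same:

```python
# Hypothetical resting heart rate reference ranges (bpm) by cohort;
# real values would be derived from aggregated population statistics.
RESTING_HR_RANGES = {
    ("male", "30-39"): (55, 70),
    ("female", "30-39"): (58, 73),
}

def compare_to_cohort(resting_hr, sex, age_band):
    """Place a user's resting heart rate relative to their cohort's range."""
    low, high = RESTING_HR_RANGES[(sex, age_band)]
    if resting_hr < low:
        return "below typical range"
    if resting_hr > high:
        return "above typical range"
    return "within typical range"

print(compare_to_cohort(62, "male", "30-39"))  # prints "within typical range"
```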

Finally, AI Insights are available via the Aidlab API, allowing developers and researchers to integrate them directly into their applications.

Note on the Privacy of Your Data

We’re constantly working to improve the user experience: the more the system learns, the better it gets at communicating with you, and with each update AI Insights becomes smarter, more intuitive, and more tailored to your needs. At the same time, we take privacy very seriously. We never use personally identifiable data for purposes unrelated to enhancing your experience. Some features, like the Timeline feature (used to accurately determine your routine), are opt-in only, letting you choose how much information you want to share.

