Circadify
Clinical Validation · 13 min read

How to Validate Camera-Based Vital Signs for Clinical Use

A detailed analysis of validation methodologies for camera-based rPPG vital signs, covering Bland-Altman analysis, regulatory frameworks, and what telehealth platforms should demand from clinical evidence.

telehealthvitals.com Research Team

The question of whether camera-based vital signs are accurate enough for clinical use keeps coming up in telehealth procurement conversations. It is a reasonable question. Telehealth platforms considering rPPG integration need to understand not just whether the technology works, but how to evaluate the evidence behind it. Validation methodology matters because a vendor can claim "clinical-grade accuracy" while using study designs that would not survive peer review.

This article breaks down how to validate camera vitals for clinical use, what the current evidence actually shows, and where the gaps remain.

"The agreement between rPPG-derived pulse rate and ECG reference showed a mean bias of 0.3 BPM with limits of agreement within clinically acceptable ranges across cardiovascular disease patients." — Clinical validation study published in PMC, 2025

Why validation methodology matters for telehealth platforms

When a telehealth platform evaluates an rPPG vendor, the first thing that usually arrives is a slide deck with accuracy numbers. Heart rate within 2 BPM. SpO2 within 2 percentage points. These numbers mean nothing without context.

The question is not "what accuracy did you achieve?" but "how did you measure it, on whom, and under what conditions?" A study conducted on 30 healthy college students sitting still in a well-lit lab tells you very little about how the technology will perform on a 72-year-old patient with dark skin connecting from a poorly lit living room over a spotty Wi-Fi connection.

Clinical validation for camera-based vitals requires the same rigor applied to any medical measurement device. The relevant standards and statistical methods come from decades of work in medical device evaluation, and they exist for good reason.

The reference standard problem

Every validation study compares rPPG measurements against a reference device. The choice of reference matters more than most people realize.

For heart rate, the gold standard is electrocardiography (ECG). A 3-lead or 12-lead ECG provides beat-to-beat timing with sub-millisecond precision. When an rPPG system reports a heart rate of 72 BPM and the ECG shows 73 BPM, you have a meaningful comparison.

For SpO2, the reference is a laboratory co-oximeter analyzing arterial blood samples, not a fingertip pulse oximeter. This distinction trips up many studies. Fingertip pulse oximeters themselves have an accuracy of roughly plus or minus 2 percentage points at saturations above 90%. Using one as your reference while claiming your rPPG system matches it to within 2 points is circular reasoning.

For respiratory rate, the gold standard is capnography or impedance pneumography. Manual counting by a trained observer is acceptable as a secondary reference, though inter-observer variability introduces its own error.

For blood pressure, the reference is an arterial line for beat-to-beat comparison or an automated oscillometric cuff calibrated according to the Association for the Advancement of Medical Instrumentation (AAMI) standards. Camera-based blood pressure estimation remains the least mature rPPG measurement, and validation studies here require particular scrutiny.

| Vital sign | Gold standard reference | Acceptable alternative | Common but insufficient |
|---|---|---|---|
| Heart rate | 12-lead ECG | 3-lead ECG, Holter monitor | Fingertip pulse oximeter HR |
| SpO2 | CO-oximetry (arterial blood) | FDA-cleared pulse oximeter (with stated uncertainty) | Consumer-grade finger oximeter |
| Respiratory rate | Capnography / impedance pneumography | Trained observer count (2+ observers) | Patient self-report |
| Blood pressure | Intra-arterial catheter | AAMI-validated oscillometric cuff | Single-reading home cuff |
| Heart rate variability | ECG-derived RR intervals | Research-grade PPG with validated HRV algorithms | Wrist-worn consumer wearable |

Bland-Altman analysis: the standard statistical approach

The Bland-Altman method is the accepted statistical approach for comparing two measurement methods. It was introduced by Martin Bland and Douglas Altman in their 1986 Lancet paper, and it remains the standard nearly four decades later.

The method works by plotting the difference between two measurements against their mean. For each paired measurement (rPPG reading and reference reading), you calculate the difference and the average of the two values. The resulting plot reveals three things: the mean bias (systematic over- or under-estimation), the limits of agreement (how much individual measurements vary), and whether the error depends on the magnitude of the measurement.

A 2025 clinical validation study of rPPG pulse rate monitoring in cardiovascular disease patients, published in PMC, used Bland-Altman analysis and found a mean absolute error of 1.061 BPM with an RMSE of 2.845 BPM. Most errors fell within plus or minus 5.76 BPM. For context, that level of agreement is comparable to what you would see comparing two different fingertip pulse oximeters against each other.

What to look for in a Bland-Altman analysis:

  • The mean bias should be close to zero. A consistent offset of 5 BPM in heart rate means the system is systematically wrong.
  • The limits of agreement (typically 1.96 standard deviations from the mean) should fall within clinically acceptable ranges.
  • The scatter should be uniform. If errors get larger at higher heart rates, the system has a proportional bias that needs separate reporting.
  • The sample size should be adequate. Bland and Altman themselves recommended at least 100 paired measurements for stable estimates of the limits of agreement.
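The core Bland-Altman computation is compact. A minimal sketch in Python (the function name and sample values are illustrative, not taken from any cited study):

```python
import numpy as np

def bland_altman(rppg, reference):
    """Bland-Altman statistics for two paired measurement series.

    Returns the mean bias and the 95% limits of agreement
    (bias +/- 1.96 * SD of the paired differences), plus the
    per-pair means and differences needed for the plot.
    """
    rppg = np.asarray(rppg, dtype=float)
    reference = np.asarray(reference, dtype=float)
    diffs = rppg - reference          # per-pair difference (rPPG minus reference)
    means = (rppg + reference) / 2.0  # per-pair mean, the x-axis of the plot
    bias = diffs.mean()
    sd = diffs.std(ddof=1)            # sample standard deviation
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, loa, means, diffs

# Illustrative paired readings (BPM): rPPG vs. ECG reference
bias, loa, means, diffs = bland_altman([72, 75, 80], [73, 74, 80])
```

For a publishable analysis you would also plot `diffs` against `means` and inspect the scatter for the proportional bias described above, rather than reporting the limits of agreement alone.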

Regulatory frameworks and what they require

Three regulatory pathways matter for camera-based vitals in telehealth contexts: FDA clearance in the United States, CE marking under the EU Medical Device Regulation, and the IEC 80601 family of standards that applies internationally.

FDA pathway

The FDA classifies pulse oximeters and physiological monitors under 21 CFR 870.2710 and related product codes. Camera-based vital sign systems can pursue 510(k) clearance by demonstrating substantial equivalence to a predicate device, or de novo classification if no suitable predicate exists.

In 2023, Oxehealth received de novo FDA clearance for its camera-based vital signs system, which monitors heart rate and respiratory rate using a near-infrared camera. FaceHeart has received multiple FDA clearances for its contactless video-based vital sign measurement SDK, covering heart rate, respiratory rate, blood pressure, SpO2, and HRV.

These clearances set a precedent for the product category, but each specific implementation still requires its own regulatory submission. A telehealth platform cannot assume that because one rPPG vendor has FDA clearance, all rPPG systems meet the same standard.

IEC 80601 standards

The IEC 80601 series covers particular requirements for basic safety and essential performance of medical electrical equipment. IEC 80601-2-61 covers pulse oximeter equipment, and its accuracy requirements (Arms of 4% or better for SpO2 in the 70-100% range) apply to any device claiming to measure oxygen saturation, regardless of the sensing modality.
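The Arms metric the standard refers to is the root mean square of the paired differences between the test device and the reference. A sketch of the computation (sample values are illustrative):

```python
import numpy as np

def spo2_arms(test_spo2, reference_sao2):
    """Root-mean-square accuracy (Arms) for SpO2: the RMS of the paired
    differences between the device under test and the reference.
    IEC 80601-2-61 requires Arms of 4% or better over the 70-100% range."""
    diffs = np.asarray(test_spo2, dtype=float) - np.asarray(reference_sao2, dtype=float)
    return float(np.sqrt(np.mean(diffs ** 2)))

# Illustrative paired saturations (%): camera estimate vs. reference
arms = spo2_arms([96, 94, 98], [97, 95, 97])
```

Note that Arms penalizes both bias and scatter, so a device with a consistent 3-point offset can fail the 4% threshold even if its readings are perfectly repeatable.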

For heart rate monitoring, IEC 60601-2-27 covers electrocardiographic monitoring equipment, and while rPPG systems are not ECGs, the performance expectations established in this standard inform what clinicians expect from any heart rate measurement device.

What telehealth platforms should ask vendors

When evaluating an rPPG vendor for clinical integration, ask these questions:

  • What regulatory clearances do you hold, and for which specific measurements?
  • What reference devices were used in your validation studies?
  • How many subjects were in your studies, and what was the demographic breakdown by age, sex, and skin tone?
  • Were studies conducted under controlled laboratory conditions or in realistic telehealth settings?
  • What are your Bland-Altman limits of agreement for each vital sign?
  • How does accuracy change under low-light, motion, and variable bandwidth conditions?

Population diversity in validation studies

One of the most significant weaknesses in the rPPG validation literature is the lack of demographic diversity in study populations. A comprehensive review published in Frontiers in Digital Health in 2025 noted that many rPPG studies rely on datasets with limited representation of darker skin tones, older adults, and people with cardiovascular or respiratory conditions.

This is not a theoretical concern. The physics of rPPG involves detecting subtle color changes in the skin caused by blood flow. Higher melanin concentration reduces the signal-to-noise ratio, and validation data must reflect this. The Fitzpatrick skin type scale (types I through VI) should be represented in any credible validation study, with performance reported separately for each group.
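Stratified reporting of this kind is simple to implement once per-subject errors are labeled. A sketch, assuming heart-rate errors tagged by Fitzpatrick type (the data and names are illustrative):

```python
import numpy as np
from collections import defaultdict

def mae_by_group(errors_bpm, fitzpatrick_types):
    """Mean absolute heart-rate error reported separately per Fitzpatrick
    skin type (I-VI). A single pooled accuracy number can mask degraded
    performance on under-represented groups; per-group reporting cannot."""
    grouped = defaultdict(list)
    for err, ftype in zip(errors_bpm, fitzpatrick_types):
        grouped[ftype].append(abs(err))
    return {ftype: float(np.mean(errs)) for ftype, errs in sorted(grouped.items())}

# Illustrative per-subject errors (BPM) with skin-type labels
report = mae_by_group([1, -2, 3], ["II", "II", "V"])
```

The same pattern applies to stratification by age band or clinical condition: report each stratum separately, and flag any stratum with too few subjects to estimate error stably.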

Age matters too. Older adults typically have reduced peripheral perfusion, thinner skin, and higher rates of arrhythmia, all of which affect rPPG signal quality. A system validated only on 20-to-40-year-old subjects cannot be assumed to work equally well on an 80-year-old patient.

A 2025 review in PMC covering deep learning approaches to rPPG heart rate measurement emphasized that real-world deployment requires validation across diverse populations, lighting conditions, and motion scenarios. The review found that while lab-based accuracy was high across multiple studies, performance degradation in uncontrolled environments remained a challenge.

Real-world conditions vs. laboratory performance

The gap between laboratory validation and real-world performance is probably the biggest unresolved issue in camera-based vitals. Laboratory studies control for lighting (typically 300+ lux, consistent color temperature), subject motion (typically seated and still), camera quality (typically a high-end webcam or DSLR), and network conditions (typically local processing, no compression).

A telehealth visit involves none of these controls. The patient might be on a phone in their car. The lighting might be a single overhead fluorescent. The video feed might be compressed to 360p by an overloaded connection. The patient might be coughing, gesturing, or looking away from the camera.

| Condition | Laboratory setting | Typical telehealth visit |
|---|---|---|
| Lighting | 300+ lux, consistent | Variable, often low, mixed sources |
| Subject motion | Seated, still, instructed | Natural movement, gesturing, coughing |
| Camera | High-quality webcam at fixed distance | Smartphone at variable distance and angle |
| Video quality | Uncompressed or lightly compressed | H.264/VP9, often 360p-720p |
| Network | Local processing | Variable latency, packet loss, bitrate drops |
| Subject state | Healthy, resting | Potentially ill, anxious, in discomfort |

Validation studies that only report lab performance are incomplete. The more useful evidence comes from studies that deliberately introduce these real-world conditions, or from post-deployment data collected during actual telehealth encounters.

Adaptive correction algorithms are one approach to closing this gap. A 2026 paper in npj Digital Medicine described a physiology-informed correction method for rPPG heart rate monitoring that improved accuracy under challenging conditions by using physiological priors to filter out noise from motion and lighting changes. The Bland-Altman analysis showed narrowed limits of agreement after correction compared to raw rPPG estimates.

Current state of evidence by vital sign

Not all camera-based vital signs are at the same level of maturity. Here is where things stand:

Heart rate is the most validated rPPG measurement. Multiple studies with hundreds of subjects have demonstrated accuracy within 2-3 BPM of ECG reference under controlled conditions. Real-world accuracy degrades somewhat but remains clinically useful for trending and screening. Two FDA-cleared systems exist.

Respiratory rate has solid evidence, though fewer large-scale studies than heart rate. The signal is extracted from the amplitude modulation of the PPG waveform, and accuracy is typically within 2-3 breaths per minute of capnography reference. FaceHeart holds FDA clearance specifically for contactless respiratory rate measurement.

SpO2 has moderate evidence. The challenge is that SpO2 estimation by camera requires multi-wavelength analysis (typically red and blue channels), and the accuracy depends heavily on camera sensor characteristics and ambient lighting. Most studies report accuracy within 2-3 percentage points of pulse oximetry reference in the 90-100% range, but performance drops at lower saturations where clinical decisions matter most.
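The multi-wavelength idea is commonly implemented as a "ratio of ratios": compare the pulsatile (AC) component to the steady (DC) component in two color channels. A hedged sketch, where the calibration constants are placeholders and would in practice be fitted per device against a reference (they are not validated values):

```python
import numpy as np

def ratio_of_ratios(red, blue):
    """Illustrative ratio-of-ratios feature for camera SpO2 estimation:
    (AC/DC in the red channel) divided by (AC/DC in the blue channel).
    Here the AC component is approximated by the standard deviation of
    the channel signal and DC by its mean."""
    red = np.asarray(red, dtype=float)
    blue = np.asarray(blue, dtype=float)
    r = (red.std(ddof=1) / red.mean()) / (blue.std(ddof=1) / blue.mean())
    return float(r)

def spo2_estimate(r, a=110.0, b=25.0):
    """Linear calibration SpO2 ~ a - b * R. The constants a and b are
    hypothetical placeholders; real systems fit them per camera sensor
    against a co-oximetry or cleared-oximeter reference."""
    return a - b * r
```

This sketch also makes the article's point concrete: because the feature depends on per-channel AC/DC ratios, sensor characteristics, white balance, and ambient lighting all shift R, which is why camera SpO2 calibration does not transfer between devices.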

Blood pressure remains early-stage. Camera-based BP estimation uses pulse transit time, pulse wave analysis, or morphological features of the PPG waveform. A 2022 review published in the journal ChatMED (Oxford Academic) found that while results were promising, most studies had small sample sizes and limited demographic diversity. BP estimation by rPPG should be considered investigational for clinical purposes.

Heart rate variability can be derived from rPPG when beat-to-beat timing is sufficiently accurate. Time-domain metrics like SDNN and RMSSD are extractable, though frequency-domain analysis requires longer measurement windows and higher signal quality. HRV is increasingly used as a biomarker for stress and autonomic function, but its clinical interpretation from rPPG data is still being standardized.
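The two time-domain metrics named above are standard calculations over beat-to-beat (RR) intervals. A minimal sketch with illustrative interval values:

```python
import numpy as np

def time_domain_hrv(rr_ms):
    """Two standard time-domain HRV metrics from RR intervals (ms):
      SDNN  - standard deviation of all RR intervals
      RMSSD - root mean square of successive RR differences
    Both are only as good as the beat-to-beat timing feeding them,
    which is the limiting factor for rPPG-derived HRV."""
    rr = np.asarray(rr_ms, dtype=float)
    sdnn = float(rr.std(ddof=1))
    succ = np.diff(rr)                      # successive differences
    rmssd = float(np.sqrt(np.mean(succ ** 2)))
    return sdnn, rmssd

# Illustrative RR series (ms)
sdnn, rmssd = time_domain_hrv([800, 810, 790, 805])
```

Frequency-domain metrics (LF/HF power) would additionally require an evenly resampled interval series and a measurement window of several minutes, which is why they demand higher signal quality than these time-domain summaries.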

Frequently asked questions

What accuracy should a telehealth platform require from an rPPG vendor?

For heart rate, look for Bland-Altman limits of agreement within plus or minus 5 BPM against ECG reference, with data from at least 100 subjects across diverse demographics. For SpO2, require Arms (root mean square accuracy) of 4% or better against co-oximetry or FDA-cleared pulse oximetry, matching the IEC 80601-2-61 standard. For respiratory rate, limits of agreement within plus or minus 3 breaths per minute against capnography is reasonable. Blood pressure estimation should be treated as supplementary information rather than a clinical measurement at this stage.

Do camera-based vitals need FDA clearance for use in telehealth?

It depends on how the measurements are used. If vital signs are presented to a clinician as part of a clinical decision workflow, the measurement system generally falls under FDA oversight as a medical device. If vitals are presented as wellness information without clinical claims, the regulatory requirements may be lower. Telehealth platforms should consult with regulatory counsel, as the FDA's position on software-as-a-medical-device continues to evolve.

How do lighting conditions affect rPPG accuracy?

Lighting is the single biggest environmental factor. Low light reduces the signal-to-noise ratio because the camera cannot detect subtle skin color changes. Flickering artificial light (especially from older fluorescent fixtures, which flicker at twice the mains frequency, 100 or 120 Hz) can introduce periodic noise that interferes with the pulse signal. Direct sunlight causes sensor saturation. The optimal range is 200 to 500 lux of consistent, diffuse lighting, which corresponds roughly to a well-lit room without direct sunlight.

Can rPPG work through video compression during a telehealth call?

Yes, but accuracy depends on the compression level. Modern video codecs like H.264 and VP9 preserve enough color information at 720p and above for most rPPG algorithms to function. At 360p or below, or with aggressive bitrate constraints (under 500 kbps), the subtle color variations that carry the rPPG signal can be lost. Some systems pre-extract the physiological signal on the client device before video compression, which sidesteps this issue entirely.

Where the field is heading

The validation landscape for camera-based vitals is maturing, but it is not there yet. The FDA clearances from Oxehealth and FaceHeart established that the product category is viable. The growing body of peer-reviewed literature provides increasingly robust evidence for heart rate and respiratory rate. SpO2 and blood pressure need more work.

For telehealth platforms evaluating this technology, the practical advice is straightforward: demand published validation data, check the study methodology against the criteria outlined above, and pilot the technology in your actual clinical environment before making claims to providers. The lab numbers will not match your deployment numbers. The question is whether the gap is clinically acceptable.

Solutions like Circadify are working to close this gap by building rPPG SDKs designed for real-world telehealth conditions, with on-device processing that avoids the video compression problem entirely.

For more on the technical architecture of integrating vitals into telehealth platforms, see our article on rPPG integration for telemedicine software. And for the business case behind adding vitals capture, read how telehealth companies build a vitals integration business case.

clinical validation · rPPG accuracy · telehealth vitals · Bland-Altman analysis