Data Extraction

ROOK extracts health data from two primary types of sources:

- API-Based Data Sources: Platforms with centralized APIs, such as Fitbit, Garmin, and Polar.
- Mobile-Based Data Sources: Platforms such as Android, Apple Health, and Health Connect, where data resides only on the user's device.

Before extraction, user authorization is required. Refer to the Data Authorization section for details.

API-Based Extractions

For API-based sources, ROOK employs a combination of polling and webhook integration to retrieve data.

User Authorization:
- Users authorize access via a web interface or app view, configured using the /authorizer endpoint.
- For sandbox testing, ROOK provides a pre-configured Connections Page. For production, custom views must be created.
Data Retrieval:
- ROOK periodically queries the API (polling) and listens to webhooks for real-time updates.
- Redundant mechanisms ensure consistent data retrieval, even when one method is temporarily unavailable.
Data Delivery:
- Extracted data is processed and delivered via:
  - ROOK Webhooks for real-time updates.
  - ROOK API for on-demand queries.

Pre-Existing Data:
- Upon authorization, ROOK retrieves up to 7 days of pre-existing data from supported sources. Refer to the Pre-Existing Data feature for details.
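
As a rough illustration of the 7-day window, the sketch below lists the calendar dates a backfill would cover. The function name `backfill_dates` is hypothetical, and the choice to count only the days strictly before the authorization date is an assumption; this section does not say whether the authorization day itself is included.

```python
from datetime import date, timedelta
from typing import List

# "Up to 7 days" for API-based sources, as stated in this section.
API_BACKFILL_DAYS = 7


def backfill_dates(authorized_on: date, max_days: int = API_BACKFILL_DAYS) -> List[date]:
    """Return the dates preceding the authorization date, oldest first.

    Assumption: the window excludes the authorization day itself.
    """
    return [authorized_on - timedelta(days=n) for n in range(max_days, 0, -1)]
```

A source may hold fewer days than the maximum, so the list is an upper bound on what can be retrieved.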
Custom Extraction Times:
- Default extraction times are 00:01 for physical summaries and 12:00 for sleep summaries (user's local time).
- Clients can request custom extraction times. ROOK uses the user's time zone to adjust scheduling. Learn more about the Time Zone Feature.
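
To make the time zone adjustment concrete, the following sketch converts a local extraction time into the corresponding UTC instant. It is not ROOK's implementation: the names `DEFAULT_EXTRACTION_TIMES` and `extraction_time_utc` are hypothetical, and only the default times (00:01 and 12:00) come from this section.

```python
from datetime import date, datetime, time, timezone, tzinfo
from typing import Optional

# Default local extraction times stated in this section.
DEFAULT_EXTRACTION_TIMES = {
    "physical": time(0, 1),
    "sleep": time(12, 0),
}


def extraction_time_utc(
    summary_type: str,
    day: date,
    user_tz: tzinfo,
    custom_time: Optional[time] = None,
) -> datetime:
    """Return the UTC instant at which a summary would be extracted on `day`.

    `custom_time`, when given, replaces the default local time.
    """
    local_time = custom_time if custom_time is not None else DEFAULT_EXTRACTION_TIMES[summary_type]
    # Attach the user's time zone to the local wall-clock time, then convert.
    local_dt = datetime.combine(day, local_time, tzinfo=user_tz)
    return local_dt.astimezone(timezone.utc)
```

Any `tzinfo` object works here, for example `zoneinfo.ZoneInfo("America/Mexico_City")` from the standard library, so the same default of 12:00 maps to a different UTC instant for each user.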
Retry Logic:
- If a summary is unavailable at the scheduled extraction time, ROOK retries extraction:
  - Day 1: 23 attempts, one every hour.
  - Next 30 Days: One attempt daily at the configured time.
- Successful extractions stop the retry process.
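
The retry schedule described above can be sketched as follows. This is an illustration under stated assumptions, not ROOK's code: the names `retry_schedule` and `run_retries` are hypothetical, the first hourly retry is assumed to occur one hour after the scheduled time, and the daily retries are assumed to fall on each following day at the same configured time.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Optional

HOURLY_ATTEMPTS = 23  # day 1: one attempt every hour
DAILY_ATTEMPTS = 30   # next 30 days: one attempt per day


def retry_schedule(scheduled: datetime) -> List[datetime]:
    """List every retry time that follows a missed extraction at `scheduled`."""
    hourly = [scheduled + timedelta(hours=h) for h in range(1, HOURLY_ATTEMPTS + 1)]
    daily = [scheduled + timedelta(days=d) for d in range(1, DAILY_ATTEMPTS + 1)]
    return hourly + daily


def run_retries(
    scheduled: datetime,
    try_extract: Callable[[datetime], bool],
) -> Optional[datetime]:
    """Walk the schedule and stop at the first successful attempt.

    Returns the time of the successful attempt, or None if all attempts fail.
    """
    for attempt_at in retry_schedule(scheduled):
        if try_extract(attempt_at):
            return attempt_at
    return None
```

Under these assumptions a summary that never becomes available is attempted 53 more times before the process gives up.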
Duplication Handling:
- ROOK evaluates and updates duplicate summaries, sending the most complete version with an incremented document_version. Learn more about the Duplication Feature.
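
A minimal sketch of this behavior is shown below. Only the `document_version` field name comes from this section; the `completeness` measure (a count of populated fields) and the function names are assumptions made for illustration, since ROOK's actual criteria for "most complete" are not described here.

```python
from typing import Any, Dict


def completeness(summary: Dict[str, Any]) -> int:
    """Count populated fields, ignoring the version counter (hypothetical metric)."""
    return sum(
        1 for key, value in summary.items()
        if key != "document_version" and value is not None
    )


def merge_duplicate(stored: Dict[str, Any], incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Keep the more complete of two duplicate summaries.

    When the incoming summary is more complete, it replaces the stored one
    and document_version is incremented; otherwise the stored one is kept.
    """
    if completeness(incoming) <= completeness(stored):
        return stored
    updated = dict(incoming)
    updated["document_version"] = stored.get("document_version", 1) + 1
    return updated
```

A client receiving these updates could compare `document_version` values to decide whether a delivered summary supersedes the copy it already holds.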

Mobile-Based Extractions

Mobile-based extractions rely on ROOK SDKs or the ROOK Extraction App to access health data directly from users' devices.

User Authorization:
- Authorization is initiated via SDK popups in the client app or through the ROOK Extraction App.
Data Retrieval:
- SDKs extract data every hour, respecting source-specific limitations such as:
  - App states (foreground or background).
  - Device settings (e.g., locked screens).
  - Request quotas or historical data limits.
Data Delivery:
- Extracted data is sent to ROOK servers for processing and delivered to clients via webhooks or the ROOK API.
- Certain metrics, such as step events, are available locally on the device and can be accessed directly via the SDK.

Pre-Existing Data:
- Mobile-based sources provide up to 29 days of pre-existing data, depending on the platform’s restrictions.
Limitations:
- Data availability depends on platform-specific constraints, such as request limits and device states. Refer to the Data Sources section for details.

For sandbox testing, ROOK provides a pre-configured Connections Page that enables quick integration.
Production environments require clients to build their own connections page using the /authorizer endpoint. Learn more in the Data Authorization section.

The ROOK Extraction App is a ready-to-use solution for collecting mobile-based data without building a custom app.
It supports user onboarding via QR codes and handles both authorization and extraction. Learn more in the ROOK Extraction App module.

Feature	API-Based Extractions	Mobile-Based Extractions
Use Case	API-based data sources	Data stored on mobile devices
Examples	Fitbit, Garmin, Polar, Oura	Apple Health, Health Connect, Android, iOS
Pre-Existing Data	Up to 7 days	Up to 29 days
Tools Provided by ROOK	Connections Page, `/authorizer`	SDKs, Extraction App

Both API and mobile-based extractions can be used to access a wide range of health data.
ROOK’s retry logic and duplication handling enhance data reliability.