Flipside data shares in Snowflake are continuously updated as blockchain data is produced. This page shows you how to monitor freshness and build incremental data pipelines using Snowflake SQL.
Looking for latency targets? See the general Data Freshness guide for detailed latency expectations by schema type and blockchain.

Quick freshness check

Use this SQL query to check how current your data is:
-- Check latest block timestamp for a chain
SELECT
    MAX(block_number) AS latest_block,
    MAX(block_timestamp) AS latest_timestamp,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ethereum.core.fact_blocks;
Interpretation:
  • Less than 1 hour behind: Normal for Core tables
  • 1-6 hours behind: Normal for DeFi/NFT tables
  • Greater than 24 hours behind: Potential issue, contact support

Monitoring freshness across schemas

Monitor data freshness across multiple table types:
-- Check freshness of different schemas
SELECT
    'Core: fact_blocks' AS table_name,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ethereum.core.fact_blocks

UNION ALL

SELECT
    'Core: fact_transactions' AS table_name,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ethereum.core.fact_transactions

UNION ALL

SELECT
    'DeFi: ez_dex_swaps' AS table_name,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ethereum.defi.ez_dex_swaps

UNION ALL

SELECT
    'NFT: ez_nft_sales' AS table_name,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ethereum.nft.ez_nft_sales

ORDER BY minutes_behind DESC;

Cross-chain freshness comparison

Compare freshness across multiple blockchains:
-- Compare freshness across your mounted shares
-- (database names below are examples; substitute the names you used when mounting each share)
SELECT
    'Ethereum' AS chain,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ETHEREUM_CORE.core.fact_blocks

UNION ALL

SELECT
    'Arbitrum' AS chain,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM ARBITRUM_CORE.core.fact_blocks

UNION ALL

SELECT
    'Solana' AS chain,
    MAX(block_timestamp) AS latest_data,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind
FROM SOLANA_CORE.core.fact_blocks

ORDER BY minutes_behind;

Understanding timestamp fields

All Flipside tables include three timestamp fields with different purposes:

block_timestamp (clustered)

The timestamp when the block was produced on the blockchain.
Clustered: ✅ Yes (primary clustering key)
Use for:
  • Time-series analysis
  • Historical queries
  • Filtering by when events occurred on-chain
  • Always use this for performance (see Query Optimization)
-- Analyze transactions from last week (fast - uses clustering)
SELECT *
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= CURRENT_DATE - 7;

_inserted_timestamp (not clustered)

The timestamp when the row was first inserted into Flipside’s database.
Clustered: ❌ No
Use for:
  • Identifying newly added data
  • Incremental data loads
  • Monitoring ingestion progress
-- Find data inserted in last hour
SELECT *
FROM ethereum.core.fact_transactions
WHERE _inserted_timestamp >= DATEADD(hour, -1, CURRENT_TIMESTAMP())
  AND block_timestamp >= CURRENT_DATE - 7;  -- Still include block_timestamp for performance
Performance tip: Always include a block_timestamp filter even when filtering on _inserted_timestamp to leverage clustering.

_modified_timestamp (not clustered)

The timestamp when the row was last modified (for updates or corrections).
Clustered: ❌ No
Use for:
  • Detecting data corrections
  • Change data capture (CDC)
  • Audit trails
-- Find rows modified recently (corrections/updates)
SELECT
    block_timestamp,
    tx_hash,
    _inserted_timestamp,
    _modified_timestamp
FROM ethereum.core.fact_transactions
WHERE _modified_timestamp >= DATEADD(hour, -24, CURRENT_TIMESTAMP())
  AND block_timestamp >= CURRENT_DATE - 30  -- Always filter by block_timestamp
ORDER BY _modified_timestamp DESC;

Incremental data loading patterns

For building data pipelines that process only new data:

Pattern 1: Using _inserted_timestamp

Best for tracking which rows are new in Flipside’s database:
-- Step 1: Create checkpoint table
CREATE OR REPLACE TABLE my_checkpoints (
    table_name VARCHAR,
    last_processed_timestamp TIMESTAMP_NTZ
);

-- Step 2: Initialize checkpoint (run once)
INSERT INTO my_checkpoints
VALUES ('fact_transactions', CURRENT_TIMESTAMP());

-- Step 3: Incremental load query
SELECT *
FROM ethereum.core.fact_transactions
WHERE _inserted_timestamp > (
    SELECT last_processed_timestamp
    FROM my_checkpoints
    WHERE table_name = 'fact_transactions'
)
  AND block_timestamp >= CURRENT_DATE - 30  -- Always include for performance
ORDER BY _inserted_timestamp;

-- Step 4: Update checkpoint after successful load
UPDATE my_checkpoints
SET last_processed_timestamp = CURRENT_TIMESTAMP()
WHERE table_name = 'fact_transactions';
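In a pipeline, the Step 3 query usually feeds a local target table rather than just returning rows; a minimal sketch, assuming a hypothetical local table my_transactions_copy with a matching subset of columns:
-- Hypothetical local target table; adjust the column list to your needs
INSERT INTO my_transactions_copy
SELECT tx_hash, block_number, block_timestamp, from_address, to_address
FROM ethereum.core.fact_transactions
WHERE _inserted_timestamp > (
    SELECT last_processed_timestamp
    FROM my_checkpoints
    WHERE table_name = 'fact_transactions'
)
  AND block_timestamp >= CURRENT_DATE - 30;  -- Keep the clustering filter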

Pattern 2: Using block_timestamp

Best for time-based incremental processing:
-- Step 1: Create checkpoint table
CREATE OR REPLACE TABLE my_checkpoints (
    table_name VARCHAR,
    last_processed_block_time TIMESTAMP_NTZ
);

-- Step 2: Initialize checkpoint (run once)
INSERT INTO my_checkpoints
SELECT
    'fact_transactions' AS table_name,
    MAX(block_timestamp) AS last_processed_block_time
FROM my_existing_local_table;

-- Step 3: Incremental load with buffer for late data
SELECT *
FROM ethereum.core.fact_transactions
WHERE block_timestamp > (
    SELECT last_processed_block_time
    FROM my_checkpoints
    WHERE table_name = 'fact_transactions'
)
  AND block_timestamp <= CURRENT_TIMESTAMP() - INTERVAL '15 minutes'  -- Buffer for late-arriving data
ORDER BY block_timestamp;

-- Step 4: Update checkpoint after successful load
UPDATE my_checkpoints
SET last_processed_block_time = (
    SELECT MAX(block_timestamp)
    FROM my_existing_local_table
)
WHERE table_name = 'fact_transactions';
Best practice: Use block_timestamp with a 10-15 minute buffer to account for late-arriving or reprocessed data due to chain reorgs.

Pattern 3: Upsert with MERGE

For maintaining an up-to-date replica with corrections:
-- Incremental upsert using MERGE
MERGE INTO my_local_transactions AS target
USING (
    SELECT *
    FROM ethereum.core.fact_transactions
    WHERE _modified_timestamp >= DATEADD(hour, -24, CURRENT_TIMESTAMP())
      AND block_timestamp >= CURRENT_DATE - 30
) AS source
ON target.tx_hash = source.tx_hash
WHEN MATCHED THEN
    UPDATE SET
        target.block_number = source.block_number,
        target.block_timestamp = source.block_timestamp,
        target.from_address = source.from_address,
        target.to_address = source.to_address,
        target.eth_value = source.eth_value,
        -- ... other columns
        target._modified_timestamp = source._modified_timestamp
WHEN NOT MATCHED THEN
    INSERT (tx_hash, block_number, block_timestamp, from_address, to_address, eth_value, _modified_timestamp)
    VALUES (source.tx_hash, source.block_number, source.block_timestamp, source.from_address, source.to_address, source.eth_value, source._modified_timestamp);

Data quality validation

Flipside runs 40,000+ automated tests continuously, but you can also validate data yourself:

Check for gaps in block data

-- Identify missing blocks in sequence
WITH block_sequence AS (
    SELECT
        block_number,
        LAG(block_number) OVER (ORDER BY block_number) AS prev_block
    FROM ethereum.core.fact_blocks
    WHERE block_timestamp >= CURRENT_DATE - 1
)
SELECT
    prev_block AS last_block_before_gap,
    block_number AS first_block_after_gap,
    block_number - prev_block AS gap_size
FROM block_sequence
WHERE block_number - prev_block > 1
ORDER BY gap_size DESC;

Validate transaction counts

-- Verify transaction counts match between fact_blocks and fact_transactions
WITH block_counts AS (
    SELECT
        block_number,
        tx_count AS reported_tx_count
    FROM ethereum.core.fact_blocks
    WHERE block_timestamp >= CURRENT_DATE - 1
),
actual_counts AS (
    SELECT
        block_number,
        COUNT(*) AS actual_tx_count
    FROM ethereum.core.fact_transactions
    WHERE block_timestamp >= CURRENT_DATE - 1
    GROUP BY block_number
)
SELECT
    b.block_number,
    b.reported_tx_count,
    COALESCE(a.actual_tx_count, 0) AS actual_tx_count,
    b.reported_tx_count - COALESCE(a.actual_tx_count, 0) AS difference
FROM block_counts b
LEFT JOIN actual_counts a ON b.block_number = a.block_number
WHERE b.reported_tx_count != COALESCE(a.actual_tx_count, 0)
ORDER BY ABS(b.reported_tx_count - COALESCE(a.actual_tx_count, 0)) DESC
LIMIT 20;

Automated monitoring with tasks

Set up Snowflake tasks to monitor freshness automatically:
-- Create a task to check freshness hourly
CREATE OR REPLACE TASK monitor_ethereum_freshness
  WAREHOUSE = compute_wh
  SCHEDULE = 'USING CRON 0 * * * * America/Los_Angeles'  -- Every hour
AS
INSERT INTO freshness_alerts
SELECT
    CURRENT_TIMESTAMP() AS check_time,
    'ethereum.core.fact_blocks' AS table_name,
    MAX(block_timestamp) AS latest_block_time,
    DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) AS minutes_behind,
    CASE
        WHEN DATEDIFF('minute', MAX(block_timestamp), CURRENT_TIMESTAMP()) > 120
        THEN 'ALERT: Data is stale'
        ELSE 'OK'
    END AS status
FROM ethereum.core.fact_blocks;

-- Resume the task to start monitoring
ALTER TASK monitor_ethereum_freshness RESUME;
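The task above writes to a freshness_alerts table, which must exist before the task runs; a minimal sketch of a compatible table (columns simply mirror the task's SELECT):
-- Run once, before creating the task
CREATE TABLE IF NOT EXISTS freshness_alerts (
    check_time TIMESTAMP_NTZ,
    table_name VARCHAR,
    latest_block_time TIMESTAMP_NTZ,
    minutes_behind NUMBER,
    status VARCHAR
);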

When to contact support

Contact [email protected] if:
  • Core tables are greater than 6 hours behind chain head
  • DeFi/NFT tables are greater than 24 hours behind
  • Data hasn’t updated in 48+ hours
  • Large gaps in block sequences
  • Missing transactions or events
  • Inconsistent transaction counts between tables
  • Previously available data is now missing
  • Sudden changes in data structure or schema
  • Tables or schemas disappeared from your share

Best practices for Snowflake

Build time buffers

Avoid querying the most recent 10-15 minutes of data; this leaves room for processing latency and potential chain reorgs.
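A minimal sketch of the buffer, reusing the ez_dex_swaps table from the earlier examples:
-- Exclude the most recent 15 minutes to avoid in-flight or reorged data
SELECT *
FROM ethereum.defi.ez_dex_swaps
WHERE block_timestamp >= CURRENT_DATE - 1
  AND block_timestamp <= DATEADD(minute, -15, CURRENT_TIMESTAMP());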

Use clustering

Always filter by block_timestamp first to leverage Flipside’s table clustering for maximum performance.

Monitor incrementally

Set up Snowflake tasks to automatically check data freshness for your critical pipelines.

Handle late data

Use block_timestamp buffers in ETL pipelines to catch late-arriving data from chain reorgs or reprocessing.

Next steps