Flipside’s blockchain data is optimized for performance through strategic clustering and indexing. Understanding how tables are organized will help you write faster, more cost-effective queries.

How Flipside clusters data

All Flipside tables are clustered on block_timestamp (or an equivalent time field). What this means:
  • Queries filtering by time are extremely fast
  • Data is physically organized by timestamp on disk
  • Snowflake can skip irrelevant micro-partitions
Always filter by time for optimal performance.
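You can verify pruning with EXPLAIN, whose plan output includes partitionsTotal and partitionsAssigned for each table scan (a quick sketch; actual partition counts vary by table):
-- Compare partitionsAssigned vs. partitionsTotal in the plan output;
-- a tight block_timestamp filter should assign only a small fraction.
EXPLAIN
SELECT tx_hash
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= '2024-01-01'
  AND block_timestamp < '2024-01-02';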

Understanding timestamp fields

Flipside tables include multiple timestamp fields:

block_timestamp

The timestamp when the block was produced on the blockchain. Use for:
  • Time-series analysis
  • Historical queries
  • Filtering by when events occurred on-chain
Clustered: ✅ Yes (primary clustering key)
-- ✅ Good: clustered query
SELECT *
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= '2024-01-01'
  AND block_timestamp < '2024-02-01';

_inserted_timestamp

The timestamp when the row was first inserted into Flipside’s database. Use for:
  • Identifying newly added data
  • Incremental data loads
  • Data pipeline orchestration
Clustered: ❌ No
-- Use for incremental loads
SELECT *
FROM ethereum.core.fact_transactions
WHERE _inserted_timestamp >= DATEADD(hour, -1, CURRENT_TIMESTAMP())
  AND block_timestamp >= CURRENT_DATE - 2; -- pair with the clustered key (see tip below)

_modified_timestamp

The timestamp when the row was last modified (for updates or corrections). Use for:
  • Detecting data corrections
  • Change data capture (CDC)
  • Audit trails
Clustered: ❌ No
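A sketch of a CDC-style query for recently corrected rows (the 30-day block_timestamp bound assumes corrections target recent blocks; widen it if yours may not):
-- Find rows modified in the last day
SELECT *
FROM ethereum.core.fact_transactions
WHERE _modified_timestamp >= DATEADD(day, -1, CURRENT_TIMESTAMP())
  AND block_timestamp >= CURRENT_DATE - 30; -- clustered filter; see tip below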
Performance tip: Always include a block_timestamp filter, even when filtering on _inserted_timestamp or _modified_timestamp. The metadata timestamps are not clustered, so the block_timestamp filter is what lets Snowflake prune micro-partitions.

Query optimization best practices

1. Always filter by time

SELECT
    block_number,
    tx_hash,
    from_address,
    to_address
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= CURRENT_DATE - 7
  AND to_address = '0x...'
LIMIT 1000;
Why it matters:
  • Without time filter: Scans entire table (billions of rows)
  • With time filter: Scans only relevant partitions (millions of rows)
  • Result: 10-100x faster queries, lower costs

2. Use specific time ranges

Narrow time ranges = faster queries.
-- ✅ Best: Specific range
WHERE block_timestamp >= '2024-01-01'
  AND block_timestamp < '2024-01-08'

-- ⚠️ OK: Last N days
WHERE block_timestamp >= CURRENT_DATE - 30

-- ❌ Avoid: Open-ended queries
WHERE block_timestamp >= '2020-01-01'

3. Limit result sets

Use LIMIT to constrain output:
-- For exploration
SELECT *
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= CURRENT_DATE - 1
LIMIT 1000;

-- For production queries
SELECT *
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= CURRENT_DATE - 30
  AND success = TRUE
ORDER BY block_timestamp DESC
LIMIT 10000;

4. Filter early, aggregate late

Push filters down before joins or aggregations:
WITH recent_txs AS (
    SELECT *
    FROM ethereum.core.fact_transactions
    WHERE block_timestamp >= CURRENT_DATE - 7
      AND success = TRUE
)
SELECT
    DATE_TRUNC('day', block_timestamp) AS date,
    COUNT(*) AS tx_count
FROM recent_txs
GROUP BY 1;

5. Use appropriate join strategies

When joining multiple tables, ensure both have time filters:
SELECT
    t.tx_hash,
    t.from_address,
    s.amount_usd
FROM ethereum.core.fact_transactions t
JOIN ethereum.defi.ez_dex_swaps s
    ON t.tx_hash = s.tx_hash
WHERE t.block_timestamp >= CURRENT_DATE - 7  -- Filter on both tables
  AND s.block_timestamp >= CURRENT_DATE - 7
LIMIT 10000;

6. Factor shared work into CTEs

For complex queries, pull repeated filtering into a CTE so it is computed once. Note that Snowflake has no explicit MATERIALIZED hint for CTEs (that is PostgreSQL syntax); its optimizer decides whether to materialize a CTE that is referenced more than once:
WITH base_data AS (
    SELECT
        block_timestamp,
        from_address,
        to_address,
        eth_value
    FROM ethereum.core.fact_transactions
    WHERE block_timestamp >= CURRENT_DATE - 30
      AND eth_value > 0
)
SELECT
    from_address,
    COUNT(*) AS tx_count,
    SUM(eth_value) AS total_eth
FROM base_data
GROUP BY 1
HAVING total_eth > 100
ORDER BY total_eth DESC;

Common query patterns

Time-series aggregations

-- Daily transaction counts
SELECT
    DATE_TRUNC('day', block_timestamp) AS date,
    COUNT(*) AS tx_count,
    COUNT(DISTINCT from_address) AS unique_senders
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= CURRENT_DATE - 30
GROUP BY 1
ORDER BY 1;

Address-specific queries

-- All activity for an address
SELECT
    block_timestamp,
    tx_hash,
    from_address,
    to_address,
    eth_value
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= CURRENT_DATE - 90
  AND (from_address = LOWER('0x...')
   OR to_address = LOWER('0x...'))
ORDER BY block_timestamp DESC;
Case sensitivity: Ethereum addresses are case-insensitive hex strings, and Flipside stores them in lowercase. Apply LOWER() to your address literals so they match the stored values.

Token transfer analysis

-- USDC transfers over $10K
SELECT
    block_timestamp,
    from_address,
    to_address,
    amount,
    amount_usd
FROM ethereum.core.ez_token_transfers
WHERE block_timestamp >= CURRENT_DATE - 7
  AND contract_address = LOWER('0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48') -- USDC
  AND amount_usd > 10000
ORDER BY amount_usd DESC
LIMIT 1000;

DEX swap analysis (Premium)

-- Top Uniswap swaps by USD value
SELECT
    block_timestamp,
    tx_hash,
    trader,
    token_in,
    token_out,
    amount_in_usd,
    amount_out_usd
FROM ethereum.defi.ez_dex_swaps
WHERE block_timestamp >= CURRENT_DATE - 1
  AND platform = 'uniswap-v3'
  AND amount_in_usd > 50000
ORDER BY amount_in_usd DESC
LIMIT 100;

Performance monitoring

Check query performance

Use Snowflake’s query profile to identify bottlenecks:
-- Check your query history
SELECT
    query_id,
    query_text,
    execution_time,
    bytes_scanned,
    rows_produced
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
WHERE query_text ILIKE '%ethereum%'
ORDER BY start_time DESC
LIMIT 10;

Estimate query cost

-- See bytes scanned (proxy for cost)
SELECT
    query_id,
    query_text,
    bytes_scanned / POWER(1024, 3) AS gb_scanned,
    execution_time / 1000 AS seconds
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
WHERE start_time >= DATEADD(hour, -1, CURRENT_TIMESTAMP())
ORDER BY bytes_scanned DESC;

Avoiding common pitfalls

Problem: Selecting all columns scans unnecessary data
Solution: Only select columns you need
-- ❌ Bad
SELECT * FROM ethereum.core.fact_transactions
WHERE block_timestamp >= CURRENT_DATE - 7;

-- ✅ Good
SELECT
    block_timestamp,
    tx_hash,
    from_address,
    to_address
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= CURRENT_DATE - 7;
Problem: Joining without time constraints scans entire tables
Solution: Add time filters to both sides of the join
-- ✅ Good
SELECT t.*, s.amount_usd
FROM ethereum.core.fact_transactions t
JOIN ethereum.defi.ez_dex_swaps s
    ON t.tx_hash = s.tx_hash
WHERE t.block_timestamp >= CURRENT_DATE - 7
  AND s.block_timestamp >= CURRENT_DATE - 7;
Problem: Querying years of data when you only need recent data
Solution: Use explicit, narrow time ranges
-- ❌ Bad: Scans years of data
WHERE block_timestamp >= '2020-01-01'

-- ✅ Good: Scans only what you need
WHERE block_timestamp >= CURRENT_DATE - 30
Problem: Returning millions of rows accidentally
Solution: Always use LIMIT when exploring
-- ✅ Good for exploration
SELECT *
FROM ethereum.core.fact_transactions
WHERE block_timestamp >= CURRENT_DATE - 1
LIMIT 1000;

Advanced optimization techniques

Using search optimization service

For frequent point lookups on specific addresses:
-- Enable search optimization (requires privileges; the ON clause needs a search method)
ALTER TABLE ethereum.core.fact_transactions
ADD SEARCH OPTIMIZATION ON EQUALITY(from_address, to_address);
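Once enabled, equality predicates on those columns can be served through the search access path even across wide time ranges; a sketch (the address is a placeholder):
-- Point lookup that benefits from search optimization
SELECT block_timestamp, tx_hash, to_address, eth_value
FROM ethereum.core.fact_transactions
WHERE from_address = LOWER('0x...') -- placeholder address
ORDER BY block_timestamp DESC
LIMIT 100;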
Contact Flipside if you need specific tables optimized for your use case.

Incremental processing patterns

For data pipelines, use incremental loads:
-- Track last processed timestamp
CREATE OR REPLACE TABLE my_checkpoint (
    table_name VARCHAR,
    last_processed_timestamp TIMESTAMP_NTZ
);

-- Incremental query
SELECT *
FROM ethereum.core.fact_transactions
WHERE block_timestamp > (
    SELECT last_processed_timestamp
    FROM my_checkpoint
    WHERE table_name = 'fact_transactions'
)
AND block_timestamp <= CURRENT_TIMESTAMP() - INTERVAL '10 minutes' -- Buffer for late-arriving data
ORDER BY block_timestamp;
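After a successful load, advance the checkpoint so the next run resumes where this one ended. A minimal sketch against the my_checkpoint table above (MERGE keeps the pattern idempotent; the 10-minute buffer mirrors the query's late-data cutoff):
-- Advance the checkpoint to the end of the processed window
MERGE INTO my_checkpoint c
USING (
    SELECT 'fact_transactions' AS table_name,
           CURRENT_TIMESTAMP() - INTERVAL '10 minutes' AS new_ts
) s
ON c.table_name = s.table_name
WHEN MATCHED THEN UPDATE SET last_processed_timestamp = s.new_ts
WHEN NOT MATCHED THEN INSERT (table_name, last_processed_timestamp)
    VALUES (s.table_name, s.new_ts);
-- In production, store MAX(block_timestamp) actually processed rather than wall-clock time.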

Next steps