How to Get 5-Minute Historical Data for OKX BTC Perpetual and Delivery Contracts for Backtesting Arbitrage Strategies

Backtesting arbitrage strategies requires high-quality, granular historical market data. For traders focusing on Bitcoin (BTC) derivatives, obtaining 5-minute interval data from major exchanges like OKX is essential for accurate simulation and strategy validation. This guide explains how to retrieve historical K-line data for both BTC perpetual and delivery contracts from OKX using Python, while addressing API limitations and ensuring data integrity.

Whether you're building a statistical arbitrage model between perpetual and quarterly futures or testing mean-reversion strategies, reliable time-series data is the foundation. Here’s a complete walkthrough.


Understanding OKX Market Data Structure

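OKX provides RESTful APIs under its Open API v5, allowing access to historical candlestick (K-line) data for various instruments. The two key contract types relevant here are perpetual swaps (for example, BTC-USDT-SWAP), which never expire, and delivery futures (for example, BTC-USD-250328), which settle on a fixed date encoded in the instrument ID.

Each data point includes the bar's opening timestamp in milliseconds (ts), the open, high, low, and close prices (o, h, l, c), the volume fields (vol, volCcy, volCcyQuote), and a confirm flag that marks whether the bar is complete.

The 5-minute granularity corresponds to bar=5m. The v5 API expects a bar string such as 1m, 5m, or 1H rather than a number of seconds.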
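
As a rough illustration of the record layout, the sketch below unpacks one raw candle row into a typed dictionary. It assumes the nine-field order used by the OKX v5 candlestick response; the helper name parse_candle and the sample row are hypothetical and only serve the example.

```python
from datetime import datetime, timezone

# Field order assumed from the OKX v5 candlestick response.
FIELDS = ["ts", "o", "h", "l", "c", "vol", "volCcy", "volCcyQuote", "confirm"]

def parse_candle(row):
    """Turn one raw candle (a list of strings) into a typed dict."""
    raw = dict(zip(FIELDS, row))
    # Prices and volumes arrive as strings, so cast them to float.
    candle = {key: float(raw[key]) for key in FIELDS[1:8]}
    # The timestamp is the bar's opening time in milliseconds since the epoch.
    candle["ts"] = datetime.fromtimestamp(int(raw["ts"]) / 1000, tz=timezone.utc)
    # "1" marks a completed bar, "0" a bar that is still forming.
    candle["confirmed"] = raw["confirm"] == "1"
    return candle

# Made-up sample row, for illustration only.
sample = ["1700000100000", "36500.1", "36520.0", "36490.5",
          "36510.2", "1200", "32.8", "1197000", "1"]
candle = parse_candle(sample)
print(candle["ts"].isoformat(), candle["c"], candle["confirmed"])
```

Filtering on the confirmed flag before backtesting keeps a half-formed latest bar out of the dataset.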


Overcoming API Limitations

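OKX caps the number of records returned per request (100 on the history-candles endpoint at the time of writing, so confirm the current limit in the API documentation). To collect long-term data (e.g., 3 months), you must paginate through results using the after parameter, which returns data older than the specified timestamp.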

This enables efficient backward traversal without gaps or duplicates.
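
To get a feel for how many paginated calls a download needs, the arithmetic can be sketched as below. The page_size value is an assumption here; substitute whatever per-request cap the endpoint documentation currently states.

```python
import math

def requests_needed(days, bar_minutes=5, page_size=100):
    """Return (number of bars, number of paginated requests) for a window."""
    bars = days * 24 * 60 // bar_minutes
    return bars, math.ceil(bars / page_size)

bars, pages = requests_needed(90)
print(bars, pages)  # 25920 bars, 260 requests
```

With a 0.2-second pause between calls, 260 requests spend about 52 seconds sleeping, so a 90-day pull of 5-minute bars completes in a minute or two.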


Complete Python Script for Data Collection

Below is a robust script that handles pagination, error management, rate limiting, and data formatting.

import requests
import pandas as pd
import time
import datetime

# OKX API endpoint for historical candles
base_url = "https://www.okx.com/api/v5/market/history-candles"

def fetch_okx_data(instrument_id, start_time, end_time, bar="5m"):
    """
    Fetch historical K-line data from OKX
    :param instrument_id: e.g., BTC-USDT-SWAP or BTC-USD-250328
    :param start_time: Start timestamp in milliseconds
    :param end_time: End timestamp in milliseconds
    :param bar: Bar size as an OKX bar string ("5m" = 5 minutes)
    :return: pandas DataFrame or None if failed
    """
    all_data = []
    current_time = end_time  # Start from end and move backward

    while current_time > start_time:
        # `after` asks for candles older than the given timestamp
        params = {"instId": instrument_id, "bar": bar, "after": current_time, "limit": 100}
        try:
            response = requests.get(base_url, params=params, timeout=10)
            response.raise_for_status()
            data = response.json()

            if data['code'] == '0':
                candles = data['data']
                if not candles:
                    break  # No more data available

                all_data.extend(candles)
                # Rows arrive newest first, so the last row is the oldest candle
                current_time = int(candles[-1][0])  # Already in ms
            else:
                print(f"API Error: {data['msg']}")
                break

            time.sleep(0.2)  # Stay within rate limits

        except requests.exceptions.RequestException as e:
            print(f"Request failed: {e}")
            return None

    # Convert to DataFrame
    if not all_data:
        return None

    # Each row carries nine string fields; the last one is the "confirm" flag
    columns = ['ts', 'o', 'h', 'l', 'c', 'vol', 'volCcy', 'volCcyQuote', 'confirm']
    df = pd.DataFrame(all_data, columns=columns)
    df['ts'] = pd.to_datetime(df['ts'].astype('int64'), unit='ms')
    df = df.astype({'o': float, 'h': float, 'l': float, 'c': float,
                    'vol': float, 'volCcy': float, 'volCcyQuote': float})
    df.set_index('ts', inplace=True)
    df.sort_index(inplace=True)  # Oldest first
    # The final page can reach past start_time, so trim the surplus rows
    df = df[df.index >= pd.to_datetime(start_time, unit='ms')]
    return df

# Example: Fetch last 90 days of BTC perpetual data
instrument_id = "BTC-USDT-SWAP"
end_time = int(datetime.datetime.now().timestamp() * 1000)
start_time = int((datetime.datetime.now() - datetime.timedelta(days=90)).timestamp() * 1000)

df = fetch_okx_data(instrument_id, start_time, end_time)

if df is not None:
    print(df.head())
    df.to_csv("btc_perpetual_5m.csv")
else:
    print("Failed to retrieve data.")

# For delivery contracts, change instrument_id:
# instrument_id = "BTC-USD-250328"

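Key Improvements in This Implementation

  1. Pagination: The after cursor walks backward from the end time until the start time is reached.
  2. Error Management: HTTP failures and non-zero API response codes are caught and reported.
  3. Rate Limiting: A short pause between requests keeps the script within OKX's limits.
  4. Data Formatting: Results are returned as a typed pandas DataFrame indexed by timestamp, oldest first.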


Frequently Asked Questions (FAQ)

Q1: Can I get the full history in a single request?

No. OKX caps the number of bars per request (100 on the history-candles endpoint at the time of writing). You must paginate using the after or before parameters to retrieve longer histories.

Q2: Why use the after parameter instead of looping by time?

Using after ensures you only get data that exists, avoiding gaps due to exchange downtime or missing candles. It’s more reliable than fixed time intervals.

Q3: What does BTC-USDT-SWAP mean?

It refers to the BTC/USDT perpetual contract. All perpetual swaps on OKX follow this naming convention: {Base}-{Quote}-SWAP.

Q4: How do I find delivery contract IDs?

Check the OKX API documentation or use their /api/v5/public/instruments endpoint filtered by instType=FUTURES.
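
A minimal sketch of that lookup follows. It uses only the standard library, and it assumes the response is a JSON object with a code field and a data list whose entries carry an instId. The helper names and the sample payload are hypothetical; the payload is made up so the filtering step can be shown offline.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

INSTRUMENTS_URL = "https://www.okx.com/api/v5/public/instruments"

def pick_delivery_ids(payload, prefix="BTC-USD-"):
    """Return the sorted instIds in an instruments response that match prefix."""
    if payload.get("code") != "0":
        raise RuntimeError(f"API error: {payload.get('msg')}")
    return sorted(row["instId"] for row in payload["data"]
                  if row["instId"].startswith(prefix))

def fetch_delivery_ids(prefix="BTC-USD-"):
    """Query the public instruments endpoint (performs a network call)."""
    query = urlencode({"instType": "FUTURES"})
    with urlopen(f"{INSTRUMENTS_URL}?{query}", timeout=10) as resp:
        return pick_delivery_ids(json.load(resp), prefix)

# Offline demonstration with a made-up payload.
sample_payload = {
    "code": "0",
    "msg": "",
    "data": [
        {"instId": "BTC-USD-250627"},
        {"instId": "BTC-USDT-250328"},
        {"instId": "ETH-USD-250328"},
        {"instId": "BTC-USD-250328"},
    ],
}
print(pick_delivery_ids(sample_payload))
```

The trailing hyphen in the prefix keeps USDT-margined contracts such as BTC-USDT-250328 out of the result.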

Q5: Is this suitable for high-frequency backtesting?

For strategies below 5-minute intervals (e.g., 1-minute or tick-level), consider augmenting with WebSocket streams or premium data providers. This method works best for medium-frequency strategies.

Q6: How often should I re-download the data?

If caching locally, update daily if your strategy uses recent data. Historical backtests can reuse stored CSVs unless new contracts or corrections are released.


Best Practices for Strategy Backtesting

  1. Validate Data Continuity: Check for missing timestamps in your 5-minute series.
  2. Use Adjusted Prices: For delivery contracts, apply roll adjustments when switching expiries.
  3. Include Trading Fees: Deduct taker/maker fees to reflect realistic P&L.
  4. Test Across Volatility Regimes: Include bull, bear, and sideways markets.
  5. Avoid Look-Ahead Bias: Ensure no future data leaks into signals.
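
Two of these checks are easy to sketch with pandas: finding missing 5-minute bars, and lagging a feature so that a signal at time t only sees data up to the previous bar. The column name c follows the script in this guide; the helper name missing_bars and the toy prices are illustrative assumptions.

```python
import pandas as pd

def missing_bars(df, freq="5min"):
    """Return timestamps a gap-free series would contain but df lacks."""
    expected = pd.date_range(df.index.min(), df.index.max(), freq=freq)
    return expected.difference(df.index)

# Toy series with the 00:10 bar deliberately left out.
idx = pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:05",
                      "2024-01-01 00:15", "2024-01-01 00:20"])
df = pd.DataFrame({"c": [100.0, 101.0, 103.0, 102.0]}, index=idx)

gaps = missing_bars(df)
print(list(gaps))

# Lag the return by one bar so the signal never uses the current bar's close.
df["ret"] = df["c"].pct_change()
df["signal"] = df["ret"].shift(1)
print(df)
```

Whether to forward-fill a gap or drop the affected window depends on the strategy; the sketch only reports where the gaps are.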


Final Notes

This approach enables systematic collection of high-resolution BTC futures data from OKX for quantitative analysis. By combining Python automation with proper API etiquette, traders can build reliable datasets for evaluating arbitrage opportunities between perpetual and delivery contracts.
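
Once both series are on disk, the spread between them can be computed by aligning the two close columns on their timestamps. The sketch below assumes each DataFrame is indexed by timestamp with a close column named c, as in the script in this guide; the function name compute_basis and the prices are made up for illustration. Note that BTC-USDT-SWAP is quoted in USDT and BTC-USD delivery contracts in USD, so treating the two prices as directly comparable is itself an assumption.

```python
import pandas as pd

def compute_basis(perp, delivery):
    """Align two close series on timestamp and return the delivery-minus-perp spread."""
    joined = pd.concat({"perp": perp["c"], "delivery": delivery["c"]}, axis=1).dropna()
    joined["basis"] = joined["delivery"] - joined["perp"]
    joined["basis_pct"] = joined["basis"] / joined["perp"] * 100
    return joined

# Toy inputs: the delivery series is missing the 00:10 bar.
perp = pd.DataFrame(
    {"c": [42000.0, 42010.0, 42020.0]},
    index=pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:05", "2024-01-01 00:10"]),
)
delivery = pd.DataFrame(
    {"c": [42300.0, 42320.0]},
    index=pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:05"]),
)

basis = compute_basis(perp, delivery)
print(basis)
```

Dropping rows where either leg is missing avoids computing a spread from a stale price on one side.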

Always monitor OKX’s rate limits and terms of service. For production systems, consider using authenticated endpoints (with API keys) for higher limits and enhanced stability.

With clean, well-structured historical data, you're one step closer to developing a profitable and resilient trading strategy.