Overview
The TensorOne API enforces rate limits to ensure fair usage and maintain service quality for all users. Limits vary by endpoint type, subscription plan, and API key permissions.
Rate Limit Tiers
Free Tier
- Read Operations: 100 requests/hour
- Write Operations: 20 requests/hour
- AI Services: 50 requests/hour
- Training Jobs: 5 jobs/day
Pro Tier
- Read Operations: 1,000 requests/hour
- Write Operations: 200 requests/hour
- AI Services: 500 requests/hour
- Training Jobs: 20 jobs/day
Enterprise Tier
- Read Operations: 10,000 requests/hour
- Write Operations: 2,000 requests/hour
- AI Services: 5,000 requests/hour
- Training Jobs: Unlimited
Every API response includes rate limit information:
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1640998800
X-RateLimit-Window: 3600
- X-RateLimit-Limit: Maximum requests allowed in the time window
- X-RateLimit-Remaining: Requests remaining in the current window
- X-RateLimit-Reset: Unix timestamp when the rate limit resets
- X-RateLimit-Window: Rate limit window in seconds
- Retry-After: Seconds to wait before making another request (sent only on 429 responses)
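For example, you can read these headers from any response to track your remaining budget (a minimal sketch using the requests library):

import requests

response = requests.get(
    "https://api.tensorone.ai/v2/clusters",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)

# Header values arrive as strings; convert before comparing
limit = int(response.headers.get("X-RateLimit-Limit", 0))
remaining = int(response.headers.get("X-RateLimit-Remaining", 0))
reset_at = int(response.headers.get("X-RateLimit-Reset", 0))

if limit and remaining < limit * 0.1:
    print(f"Warning: only {remaining} requests left until {reset_at}")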
Endpoint-Specific Limits
Account Management
GET /accounts/* - 500/hour
POST /accounts/* - 50/hour
PUT /accounts/* - 50/hour
DELETE /accounts/* - 10/hour
GPU Clusters
GET /clusters/* - 1000/hour
POST /clusters - 20/hour (creation)
PUT /clusters/* - 100/hour
DELETE /clusters/* - 20/hour
Serverless Endpoints
GET /endpoints/* - 1000/hour
POST /endpoints - 50/hour (creation)
POST /endpoints/*/execute - 500/hour (execution)
AI Services
POST /ai/chat - 200/hour
POST /ai/text-to-image - 100/hour
POST /ai/text-to-video - 20/hour
POST /ai/text-to-speech - 150/hour
Training Jobs
GET /training/* - 500/hour
POST /training/jobs - 10/hour
PUT /training/* - 50/hour
Rate Limit Strategies
1. Exponential Backoff
Implement exponential backoff when rate limited:
import random
import time

def make_request_with_backoff(func, max_retries=5):
    """Call func() and retry with exponential backoff on HTTP 429."""
    for attempt in range(max_retries):
        try:
            response = func()
            if response.status_code != 429:
                return response
            # Honor the server's Retry-After hint when it is present
            retry_after = response.headers.get("Retry-After")
            if retry_after is not None:
                time.sleep(int(retry_after))
                continue
        except Exception:
            # Re-raise on the final attempt; otherwise back off and retry
            if attempt == max_retries - 1:
                raise
        # Exponential backoff with jitter to avoid synchronized retries
        delay = (2 ** attempt) + random.uniform(0, 1)
        time.sleep(delay)
    raise Exception("Max retries exceeded")
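For example, wrapping a GET call (assuming the requests library):

import requests

response = make_request_with_backoff(
    lambda: requests.get(
        "https://api.tensorone.ai/v2/clusters",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
    )
)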
2. Request Batching
Batch multiple operations when possible:
# Instead of multiple single requests
curl -X POST "https://api.tensorone.ai/v2/endpoints/ep1/execute" -d '...'
curl -X POST "https://api.tensorone.ai/v2/endpoints/ep2/execute" -d '...'
# Use batch execution
curl -X POST "https://api.tensorone.ai/v2/endpoints/batch-execute" \
-d '{
"requests": [
{"endpoint_id": "ep1", "input": {...}},
{"endpoint_id": "ep2", "input": {...}}
]
}'
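The same batch call from Python, so one batch request replaces several individual executions against your request budget (a sketch; the example inputs are hypothetical, so substitute your endpoints' actual input schema):

import requests

payload = {
    "requests": [
        {"endpoint_id": "ep1", "input": {"prompt": "hello"}},  # hypothetical input
        {"endpoint_id": "ep2", "input": {"prompt": "world"}},  # hypothetical input
    ]
}
response = requests.post(
    "https://api.tensorone.ai/v2/endpoints/batch-execute",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
)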
3. Caching
Cache responses when appropriate:
import time
from functools import lru_cache

import requests

@lru_cache(maxsize=128)
def get_cluster_info(cluster_id, cache_time):
    # The cache_time argument changes each hour, so stale entries stop hitting
    response = requests.get(
        f"https://api.tensorone.ai/v2/clusters/{cluster_id}",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
    )
    return response.json()

# Pass the current hour so each entry is reused for at most 1 hour
current_hour = int(time.time() // 3600)
cluster_info = get_cluster_info("cluster_123", current_hour)
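Note that the hour-bucket argument expires entries at fixed clock-hour boundaries rather than one hour after each fetch, and old buckets linger in memory until lru_cache evicts them; if you need precise expiry, a TTL cache such as cachetools.TTLCache may be a better fit.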
Monitoring Rate Limits
Check Current Usage
curl -X GET "https://api.tensorone.ai/v2/auth/rate-limits" \
-H "Authorization: Bearer YOUR_API_KEY"
Response:
{
"limits": {
"read_operations": {
"limit": 1000,
"remaining": 847,
"reset_at": "2024-01-15T11:00:00Z"
},
"write_operations": {
"limit": 200,
"remaining": 195,
"reset_at": "2024-01-15T11:00:00Z"
},
"ai_services": {
"limit": 500,
"remaining": 423,
"reset_at": "2024-01-15T11:00:00Z"
}
}
}
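You can use this endpoint to throttle yourself before hitting a hard limit. A minimal sketch in Python, assuming the response shape above:

import requests

response = requests.get(
    "https://api.tensorone.ai/v2/auth/rate-limits",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)
limits = response.json()["limits"]

# Slow down before the write budget is exhausted
writes = limits["write_operations"]
if writes["remaining"] < writes["limit"] * 0.05:
    print(f"Write budget nearly spent; resets at {writes['reset_at']}")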
Usage Analytics
curl -X GET "https://api.tensorone.ai/v2/analytics/api-usage?period=24h" \
-H "Authorization: Bearer YOUR_API_KEY"
Rate Limit Errors
429 Too Many Requests
{
"error": "RATE_LIMIT_EXCEEDED",
"message": "Rate limit exceeded for endpoint",
"code": 429,
"details": {
"limit": 100,
"remaining": 0,
"reset_at": "2024-01-15T11:00:00Z",
"retry_after": 1800
}
}
Handling in Code
async function makeAPIRequest(url, options, retriesLeft = 3) {
  try {
    const response = await fetch(url, options);
    if (response.status === 429 && retriesLeft > 0) {
      // Retry-After arrives as a string of seconds; fall back to 60 if absent
      const retryAfter = parseInt(response.headers.get("Retry-After") ?? "60", 10);
      console.log(`Rate limited. Retrying in ${retryAfter} seconds`);
      // Wait, then retry with a bounded retry count to avoid infinite recursion
      await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));
      return makeAPIRequest(url, options, retriesLeft - 1);
    }
    return response;
  } catch (error) {
    console.error("API request failed:", error);
    throw error;
  }
}
Optimization Tips
1. Use Appropriate HTTP Methods
- Use HEAD requests to check resource existence
- Use PATCH instead of PUT for partial updates
- Implement conditional requests with If-Modified-Since (see the sketch below)
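For example, a conditional GET in Python (a sketch; If-Modified-Since handling is standard HTTP, but confirm the endpoint supports it):

import requests

response = requests.get(
    "https://api.tensorone.ai/v2/clusters/cluster_123",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        # Ask the server to send the body only if the resource changed since this time
        "If-Modified-Since": "Mon, 15 Jan 2024 10:00:00 GMT",
    },
)

if response.status_code == 304:
    print("Not modified; reuse the locally cached copy")
else:
    cluster = response.json()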
2. Optimize Polling
import time

# Instead of constant polling at a fixed interval
while True:
    status = get_job_status(job_id)  # placeholder for GET /training/jobs/{id}
    if status == 'completed':
        break
    time.sleep(5)  # Fixed 5s interval wastes rate limit

# Use progressive intervals
def wait_for_completion(job_id):
    intervals = [5, 10, 30, 60]  # Progressive backoff, capped at 60s
    interval_index = 0
    while True:
        status = get_job_status(job_id)
        if status == 'completed':
            return
        interval = intervals[min(interval_index, len(intervals) - 1)]
        time.sleep(interval)
        interval_index += 1
3. Request Prioritization
Use priority headers for critical requests:
curl -X POST "https://api.tensorone.ai/v2/endpoints/critical/execute" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "X-Priority: high" \
-d '...'
Increasing Rate Limits
Upgrade Your Plan
Higher tier plans come with increased rate limits:
- Pro Plan: roughly 10x the Free tier limits for read, write, and AI operations (training jobs go from 5/day to 20/day)
- Enterprise Plan: roughly 10x the Pro tier limits, with unlimited training jobs and custom limits available
Request Limit Increase
For specific use cases, contact support with:
- Expected request volume
- Use case description
- Timeline requirements
- Current plan tier
Temporary Limit Boosts
For events or migrations:
curl -X POST "https://api.tensorone.ai/v2/auth/temporary-boost" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"multiplier": 2,
"duration": "24h",
"reason": "Data migration"
}'
Consider using webhooks instead of polling to reduce API calls and stay within rate limits.
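For example, a minimal webhook receiver (a sketch using Flask; the event payload shape shown here is hypothetical, so check the webhooks documentation for the actual contract):

from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/tensorone", methods=["POST"])
def handle_event():
    event = request.get_json()
    # Hypothetical event fields; verify against the webhook docs
    if event.get("type") == "training_job.completed":
        print(f"Job {event.get('job_id')} finished")
    return "", 204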