Rate Limiting As A Service

A policy-driven platform for enforcing limits, quotas, and traffic control across APIs and service workloads — built in Go for speed, designed for any stack.

Fixed Window · Token Bucket · Sliding Window · Concurrency Limiter · Quota / Budget · Shadow Mode · Multi-Tenant
Get Started · Read the Design

Three Deployment Models, One Engine

Whether you embed, centralize, or sidecar — the same policy engine powers every decision.

📦 Embedded Go SDK

Import as a library. Sub-millisecond local decisions. No network hop. Perfect for Go services.

🌐 Centralized HTTP / gRPC

Language-agnostic decision service. Centralized governance, unified telemetry, one version to manage.

⚙️ Sidecar Local Proxy

Run alongside your app in Kubernetes. Local latency, central governance. Best of both worlds.

Built for Real Workloads

Everything you need to protect APIs, control telemetry, manage quotas, and enforce traffic policy at scale.

Seven Algorithms

Fixed window, sliding window (log & counter), token bucket, leaky bucket, concurrency limiter, and quota/budget limiter — all behind a single interface.

🛡️

Rich Action Model

Go beyond allow/deny: each policy can apply delay, sample, drop, downgrade, drop-low-priority, or shadow-only actions.

🎯

Fine-Grained Matching

Match on 20+ dimensions: org, tenant, service, endpoint, method, user, API key, region, tags, and more. Advanced match_expr expressions are also supported.

🔄

Safe Rollout

Shadow mode for dry-run evaluation. Gradual rollout percentages. Version history with one-click rollback.
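One common way to implement gradual rollout percentages and shadow mode is sketched below, assuming deterministic hash bucketing so that a given key is always inside or outside the rollout. The function names (inRollout, apply) and the Outcome type are hypothetical; how RLAAS selects traffic internally is not stated here.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// inRollout reports whether key falls into the first `percent` of 100 hash
// buckets for this policy. The same (policyID, key) pair always lands in the
// same bucket, so enrollment is stable across calls.
func inRollout(policyID, key string, percent uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(policyID))
	h.Write([]byte{0}) // separator so ("ab","c") and ("a","bc") hash differently
	h.Write([]byte(key))
	return h.Sum32()%100 < percent
}

// Outcome separates what the caller acts on from what the limiter decided.
type Outcome struct {
	Allowed   bool // what the caller should enforce
	WouldDeny bool // the limiter's verdict, recorded for analysis
}

// apply combines the limiter verdict with rollout enrollment and shadow mode.
// In shadow mode a denial is recorded but never enforced.
func apply(limiterAllowed, shadow, enrolled bool) Outcome {
	wouldDeny := enrolled && !limiterAllowed
	return Outcome{Allowed: !wouldDeny || shadow, WouldDeny: wouldDeny}
}

func main() {
	enrolled := inRollout("policy-1", "tenant-42", 25)
	fmt.Println("enrolled:", enrolled)
	fmt.Printf("shadow denial:   %+v\n", apply(false, true, true))
	fmt.Printf("enforced denial: %+v\n", apply(false, false, true))
}
```

Hashing the policy ID together with the key means different policies roll out to different subsets of traffic instead of always hitting the same tenants first.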

📊

Analytics & Audit

Built-in analytics summary with tag aggregation. Full audit trail and version history for every policy change.

🔒

Fail-Safe by Design

Configure fail-open or fail-closed per policy. Graceful degradation when backends are unavailable.
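The difference between fail-open and fail-closed can be sketched in a few lines. This is an assumption-laden illustration: FailMode, decide, and errBackendDown are hypothetical names, and the real per-policy configuration format is not shown in this page.

```go
package main

import (
	"errors"
	"fmt"
)

// FailMode is a hypothetical per-policy setting.
type FailMode int

const (
	FailOpen   FailMode = iota // allow traffic when the backend cannot answer
	FailClosed                 // reject traffic when the backend cannot answer
)

var errBackendDown = errors.New("counter backend unavailable")

// decide runs the limiter check and falls back to the policy's fail mode
// only when the check itself errors out.
func decide(check func() (bool, error), mode FailMode) bool {
	allowed, err := check()
	if err != nil {
		return mode == FailOpen
	}
	return allowed
}

func main() {
	down := func() (bool, error) { return false, errBackendDown }
	fmt.Println("fail-open:  ", decide(down, FailOpen))
	fmt.Println("fail-closed:", decide(down, FailClosed))
}
```

Fail-open suits availability-sensitive paths such as general API traffic, while fail-closed is the safer default for abuse-sensitive flows like login or OTP generation.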

🚀

High Performance

Lock-sharded in-memory counters (~6 ns/op). Async invalidation with bounded workers. Burst coalescing in sidecar sync.

🌍

Multi-Region Ready

Weighted regional allocation primitives. Overflow detection across regions. Built for global deployments.
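As a rough illustration of weighted regional allocation and overflow detection, the sketch below splits a global limit in proportion to per-region weights and then flags regions whose usage exceeds their share. Allocate and Overflowing are hypothetical helpers, and the proportional-split rule is an assumption about what "weighted allocation" means here.

```go
package main

import (
	"fmt"
	"sort"
)

// Allocate splits a global limit across regions in proportion to their
// non-negative weights. Units lost to integer rounding are handed out
// one at a time in region-name order so the shares always sum to global.
func Allocate(global int, weights map[string]int) map[string]int {
	total := 0
	regions := make([]string, 0, len(weights))
	for region, w := range weights {
		total += w
		regions = append(regions, region)
	}
	out := make(map[string]int, len(weights))
	if total <= 0 {
		return out
	}
	sort.Strings(regions)
	assigned := 0
	for _, region := range regions {
		out[region] = global * weights[region] / total
		assigned += out[region]
	}
	for i := 0; assigned < global; i++ {
		out[regions[i%len(regions)]]++
		assigned++
	}
	return out
}

// Overflowing returns, in name order, the regions whose usage exceeds their share.
func Overflowing(usage, alloc map[string]int) []string {
	var over []string
	for region, used := range usage {
		if used > alloc[region] {
			over = append(over, region)
		}
	}
	sort.Strings(over)
	return over
}

func main() {
	alloc := Allocate(1000, map[string]int{"us-east": 5, "eu-west": 3, "ap-south": 2})
	fmt.Println(alloc["us-east"], alloc["eu-west"], alloc["ap-south"])
	usage := map[string]int{"us-east": 420, "eu-west": 310, "ap-south": 90}
	fmt.Println("over allocation:", Overflowing(usage, alloc))
}
```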

📡

OTEL Integration

Processor primitives for batch log and span filtering. Worker pools with fail-open or fail-closed behavior. Control telemetry volume per policy.

Architecture at a Glance

RLAAS separates policy storage from counter storage and uses a canonical internal model so every deployment mode shares the same evaluation engine.

┌────────────────────────────────────────────────────────────────┐
│                            Clients                             │
│  Go SDK   │  HTTP/gRPC  │  Python  │  TypeScript  │ Java/.NET  │
└─────┬─────┴──────┬──────┴─────┬────┴──────┬───────┴────┬───────┘
      │            │            │           │            │
      ▼            ▼            ▼           ▼            ▼
┌────────────────────────────────────────────────────────────────┐
│                     RLAAS Decision Engine                      │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────────────┐  │
│  │ Matcher  │→ │Algorithm │→ │ Decision │→ │ Analytics /    │  │
│  │ (scope   │  │(fixed,   │  │ Builder  │  │ Audit Logger   │  │
│  │ + expr)  │  │ token,.. │  │          │  │                │  │
│  └──────────┘  └──────────┘  └──────────┘  └────────────────┘  │
│                     │                                          │
│         ┌───────────┴──────────┐                               │
│         ▼                      ▼                               │
│  ┌─────────────┐        ┌─────────────┐                        │
│  │Counter Store│        │Policy Store │                        │
│  │ Memory/Redis│        │ File/PG/ORA │                        │
│  └─────────────┘        └─────────────┘                        │
└────────────────────────────────────────────────────────────────┘
Design Principle: Treat rate limiting as a generic policy decision engine, not as a database-specific algorithm runner. Policies live in the database; counters live in fast stores.

Works Across Every Signal Type

From HTTP ingress to background jobs — one platform covers all your rate limiting needs.

🌐

HTTP / REST APIs

Per-IP, per-API-key, per-user, per-endpoint, per-org throttling for any REST service.

📨

gRPC Services

Per-method, per-service, per-tenant concurrency & rate limiting via interceptors.

📋

OpenTelemetry Signals

Control log/span/trace volume per org, service, severity, or attribute set.

📩

Event & Messaging

Per-topic, per-consumer-group, per-event-type limits for Kafka, Pub/Sub, SQS, NATS.

🔧

Background Jobs

Per-job-type, per-org, per-workflow-step throttling for batch and async workloads.

🚪

Auth & Abuse Prevention

Login attempts, OTP generation, password resets, device registration — protect every auth flow.

Supported Algorithms

Choose the right algorithm for each use case, or let the policy engine decide.

Algorithm              | Best For                                              | Trade-off
Fixed Window           | Simple org-wide limits, low-complexity quotas         | Boundary burst possible
Sliding Window Log     | Security-sensitive exact checks, low-volume           | Higher memory cost
Sliding Window Counter | APIs, OTEL signals, general distributed loads         | Approximation complexity
Token Bucket           | REST/gRPC throttling, burst control                   | Refill math & atomicity
Leaky Bucket           | Egress smoothing, outbound traffic shaping            | Less intuitive
Concurrency Limiter    | DB-heavy ops, file processing, dependency protection  | Requires acquire/release lifecycle
Quota / Budget         | SaaS plan enforcement, daily/monthly budgets          | Not for short-burst protection alone

Client SDKs for Every Stack

Native Go plus four HTTP client SDKs — integrate in minutes.

Go (Native)

Embedded library with direct engine access. Sub-millisecond decisions, zero network hop.

Python

Lightweight requests-based client. Full API coverage including analytics and audit.

TypeScript

Modern fetch-based client. Type-safe interfaces. Works in Node.js and edge runtimes.

Java

java.net.http.HttpClient with Jackson. Java 11+ compatible. Full CRUD support.

.NET

HttpClient with System.Text.Json. Async/await, CancellationToken. .NET 8 ready.

View SDK Documentation →

API at a Glance

Simple, RESTful endpoints. One POST /v1/check call to get a rate-limit decision.

Method | Endpoint                    | Description
POST   | /v1/check                   | Evaluate a rate-limit decision
POST   | /v1/acquire                 | Acquire a concurrency lease
POST   | /v1/release                 | Release a concurrency lease
GET    | /v1/policies                | List all policies
POST   | /v1/policies                | Create a new policy
GET    | /v1/policies/{id}           | Get a specific policy
PUT    | /v1/policies/{id}           | Update a policy
DELETE | /v1/policies/{id}           | Delete a policy
GET    | /v1/policies/{id}/audit     | Policy change audit trail
GET    | /v1/policies/{id}/versions  | Policy version history
POST   | /v1/policies/{id}/rollout   | Update rollout percentage
POST   | /v1/policies/{id}/rollback  | Rollback to a previous version
POST   | /v1/policies/validate       | Validate a policy definition
GET    | /v1/analytics/summary       | Decision analytics summary
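A call to POST /v1/check from Go might look roughly like the sketch below. Only the method and path come from the table above; the request and response field names (org, endpoint, allowed, action) are assumptions made for illustration, so consult the API reference for the real schema. A stub server stands in for an RLAAS deployment so the example runs offline.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// CheckRequest and CheckResponse use hypothetical field names.
type CheckRequest struct {
	Org      string `json:"org"`
	Endpoint string `json:"endpoint"`
}

type CheckResponse struct {
	Allowed bool   `json:"allowed"`
	Action  string `json:"action"`
}

// check posts one decision request to baseURL + "/v1/check".
func check(client *http.Client, baseURL string, req CheckRequest) (CheckResponse, error) {
	var out CheckResponse
	body, err := json.Marshal(req)
	if err != nil {
		return out, err
	}
	resp, err := client.Post(baseURL+"/v1/check", "application/json", bytes.NewReader(body))
	if err != nil {
		return out, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return out, fmt.Errorf("check: unexpected status %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&out)
	return out, err
}

// stubServer imitates a decision service: it denies the org "blocked-org"
// and allows everything else. It is a test double, not RLAAS behavior.
func stubServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/check", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		var req CheckRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		res := CheckResponse{Allowed: true, Action: "allow"}
		if req.Org == "blocked-org" {
			res = CheckResponse{Allowed: false, Action: "deny"}
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(res)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := stubServer()
	defer srv.Close()
	res, err := check(srv.Client(), srv.URL, CheckRequest{Org: "acme", Endpoint: "/orders"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", res)
}
```

Against a real deployment, srv.URL would be replaced by the service's base URL, and the caller would decide how to treat a transport error according to the policy's fail-open or fail-closed setting.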

Full API Reference →

Ready to Protect Your APIs?

RLAAS is ready for customer integration in controlled production environments. Start with the Quick Start guide.

Quick Start Guide · View on GitHub