RLAAS — Rate Limiting As A Service

A policy-driven platform for enforcing limits, quotas, and traffic control across APIs and service workloads — built in Go for speed, designed for any stack.

Fixed Window · Token Bucket · Sliding Window · Concurrency Limiter · Quota / Budget · Shadow Mode · Multi-Tenant

Architecture at a Glance

RLAAS separates policy storage from counter storage and uses a canonical internal model so every deployment mode shares the same evaluation engine.

┌────────────────────────────────────────────────────────────────┐
│                            Clients                             │
│  Go SDK   │  HTTP/gRPC  │  Python  │  TypeScript  │ Java/.NET  │
└─────┬─────┴──────┬──────┴─────┬────┴──────┬───────┴────┬───────┘
      │            │            │           │            │
      ▼            ▼            ▼           ▼            ▼
┌────────────────────────────────────────────────────────────────┐
│                     RLAAS Decision Engine                      │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────────────┐  │
│  │ Matcher  │→ │Algorithm │→ │ Decision │→ │  Analytics /   │  │
│  │ (scope   │  │(fixed,   │  │ Builder  │  │  Audit Logger  │  │
│  │ + expr)  │  │ token,.. │  │          │  │                │  │
│  └──────────┘  └──────────┘  └──────────┘  └────────────────┘  │
│                     │                                          │
│         ┌───────────┴──────────┐                               │
│         ▼                      ▼                               │
│  ┌─────────────┐        ┌─────────────┐                        │
│  │Counter Store│        │Policy Store │                        │
│  │ Memory/Redis│        │ File/PG/ORA │                        │
│  └─────────────┘        └─────────────┘                        │
└────────────────────────────────────────────────────────────────┘

Design Principle: Treat rate limiting as a generic policy decision engine, not as a database-specific algorithm runner. Policies in the database, counters in fast stores.
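One way to picture the split between policy storage and counter storage is as two small Go interfaces that the engine depends on. The following is a minimal sketch under stated assumptions, not the RLAAS source: the names PolicyStore, CounterStore, Policy, Engine, and Check, as well as the in-memory backends, are hypothetical and chosen only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Policy is a hypothetical, simplified policy record.
type Policy struct {
	ID    string
	Scope string // e.g. the tenant or API key the policy applies to
	Limit int64  // maximum number of requests allowed
}

// PolicyStore holds slow-changing policy definitions (file, SQL, ...).
type PolicyStore interface {
	Lookup(scope string) (Policy, error)
}

// CounterStore holds fast-changing counters (memory, Redis, ...).
type CounterStore interface {
	Incr(key string) (int64, error)
}

// ErrNoPolicy is returned when no policy matches the scope.
var ErrNoPolicy = errors.New("no policy for scope")

// memPolicies is an in-memory PolicyStore used only for this sketch.
type memPolicies map[string]Policy

func (m memPolicies) Lookup(scope string) (Policy, error) {
	p, ok := m[scope]
	if !ok {
		return Policy{}, ErrNoPolicy
	}
	return p, nil
}

// memCounters is an in-memory CounterStore used only for this sketch.
type memCounters struct {
	mu sync.Mutex
	n  map[string]int64
}

func newMemCounters() *memCounters { return &memCounters{n: map[string]int64{}} }

func (c *memCounters) Incr(key string) (int64, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n[key]++
	return c.n[key], nil
}

// Engine depends only on the two interfaces, so each backend can be
// swapped without touching the evaluation logic.
type Engine struct {
	Policies PolicyStore
	Counters CounterStore
}

// Check allows a request while the scope's counter stays within the limit.
// Window expiry is deliberately omitted to keep the sketch short.
func (e *Engine) Check(scope string) (bool, error) {
	p, err := e.Policies.Lookup(scope)
	if err != nil {
		return false, err
	}
	n, err := e.Counters.Incr(p.ID + ":" + scope)
	if err != nil {
		return false, err
	}
	return n <= p.Limit, nil
}

func main() {
	e := &Engine{
		Policies: memPolicies{"tenant-a": {ID: "p1", Scope: "tenant-a", Limit: 2}},
		Counters: newMemCounters(),
	}
	for i := 0; i < 3; i++ {
		ok, err := e.Check("tenant-a")
		fmt.Println(ok, err) // the third call exceeds the limit of 2
	}
}
```

Because Engine sees only the interfaces, a Redis-backed counter store or a PostgreSQL-backed policy store could be substituted in this sketch without changing Check.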

Supported Algorithms

Choose the right algorithm for each use case, or let the policy engine decide.

Algorithm	Best For	Trade-off
Fixed Window	Simple org-wide limits, low-complexity quotas	Boundary burst possible
Sliding Window Log	Security-sensitive exact checks, low-volume	Higher memory cost
Sliding Window Counter	APIs, OTEL signals, general distributed loads	Approximation complexity
Token Bucket	REST/gRPC throttling, burst control	Refill math & atomicity
Leaky Bucket	Egress smoothing, outbound traffic shaping	Less intuitive
Concurrency Limiter	DB-heavy ops, file processing, dependency protection	Requires acquire/release lifecycle
Quota / Budget	SaaS plan enforcement, daily/monthly budgets	Not for short-burst protection alone
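To make the "refill math & atomicity" trade-off listed for Token Bucket concrete, here is a minimal in-process sketch. It is not how RLAAS implements the algorithm; the TokenBucket type, its fields, and the AllowAt method are hypothetical names, and time is passed in as an elapsed duration so the behavior is deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// TokenBucket is an illustrative limiter; all names here are hypothetical.
type TokenBucket struct {
	mu       sync.Mutex    // makes refill-and-take one atomic step
	capacity float64       // maximum burst size
	rate     float64       // tokens added per second
	tokens   float64       // tokens currently available
	last     time.Duration // elapsed time at the previous refill
}

// NewTokenBucket returns a bucket that starts full.
func NewTokenBucket(capacity, ratePerSec float64) *TokenBucket {
	return &TokenBucket{capacity: capacity, rate: ratePerSec, tokens: capacity}
}

// AllowAt reports whether one request may proceed at the given elapsed time,
// measured from any fixed starting point (for example time.Since(start)).
func (b *TokenBucket) AllowAt(elapsed time.Duration) bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	// Refill in proportion to the time that has passed, capped at capacity.
	if elapsed > b.last {
		b.tokens += (elapsed - b.last).Seconds() * b.rate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = elapsed
	}

	// Spend one token if a whole token is available.
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	b := NewTokenBucket(2, 1) // burst of 2, refill of 1 token per second

	// The burst is exhausted by the third immediate request.
	fmt.Println(b.AllowAt(0), b.AllowAt(0), b.AllowAt(0))

	// One second later a single token has been refilled.
	fmt.Println(b.AllowAt(1 * time.Second))
}
```

In a distributed deployment the same refill-and-take step would have to run atomically in the shared counter store rather than behind a local mutex, which is where the atomicity cost in the table comes from.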

API at a Glance

Simple, RESTful endpoints: a single POST /v1/check call returns a rate-limit decision.

Method	Endpoint	Description
POST	/v1/check	Evaluate a rate-limit decision
POST	/v1/acquire	Acquire a concurrency lease
POST	/v1/release	Release a concurrency lease
GET	/v1/policies	List all policies
POST	/v1/policies	Create a new policy
GET	/v1/policies/{id}	Get a specific policy
PUT	/v1/policies/{id}	Update a policy
DELETE	/v1/policies/{id}	Delete a policy
GET	/v1/policies/{id}/audit	Policy change audit trail
GET	/v1/policies/{id}/versions	Policy version history
POST	/v1/policies/{id}/rollout	Update rollout percentage
POST	/v1/policies/{id}/rollback	Rollback to a previous version
POST	/v1/policies/validate	Validate a policy definition
GET	/v1/analytics/summary	Decision analytics summary
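As an illustration of calling the check endpoint from Go, the sketch below posts a JSON body to /v1/check and decodes the answer. Only the method and path come from the table above. The request and response fields (tenant, resource, allowed), the check helper, and the stub server are assumptions made for this example, since the real RLAAS payload schema is not shown here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// checkRequest and checkResponse are hypothetical payload shapes;
// the actual RLAAS schema may differ.
type checkRequest struct {
	Tenant   string `json:"tenant"`
	Resource string `json:"resource"`
}

type checkResponse struct {
	Allowed bool `json:"allowed"`
}

// check POSTs the request to {baseURL}/v1/check and decodes the decision.
func check(client *http.Client, baseURL string, req checkRequest) (checkResponse, error) {
	var out checkResponse
	body, err := json.Marshal(req)
	if err != nil {
		return out, err
	}
	resp, err := client.Post(baseURL+"/v1/check", "application/json", bytes.NewReader(body))
	if err != nil {
		return out, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return out, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	err = json.NewDecoder(resp.Body).Decode(&out)
	return out, err
}

// newStubServer stands in for an RLAAS server in this sketch:
// it allows every tenant except one named "blocked".
func newStubServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/check", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		var in checkRequest
		if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(checkResponse{Allowed: in.Tenant != "blocked"})
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	res, err := check(srv.Client(), srv.URL, checkRequest{Tenant: "acme", Resource: "/orders"})
	fmt.Println(res.Allowed, err)
}
```

Against a real deployment, srv.URL would be replaced by the service's base URL and the structs adjusted to the documented schema.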

Three Deployment Models, One Engine

📦 Embedded Go SDK

🌐 Centralized HTTP / gRPC

⚙️ Sidecar Local Proxy

Built for Real Workloads

Seven Algorithms

Rich Action Model

Fine-Grained Matching

Safe Rollout

Analytics & Audit

Fail-Safe by Design

High Performance

Multi-Region Ready

OTEL Integration

Works Across Every Signal Type

HTTP / REST APIs

gRPC Services

OpenTelemetry Signals

Event & Messaging

Background Jobs

Auth & Abuse Prevention

Client SDKs for Every Stack

Go (Native)

Python

TypeScript

Java

.NET
