Server & WordPress

Rate Limiting 101: Preventing DoS Attacks on Your Forms


In many environments, an unprotected Contact Form 7 endpoint begins to experience resource contention under sustained submission rates of just a few to a few dozen requests per second. That may sound harmless, but if a bot operator points a script at that endpoint, generating tens of thousands of requests per hour from a single machine is trivial. Each request passes through the WordPress application stack, triggers plugin hooks, writes to the database, and fires off mail delivery (including SMTP or external mail service integrations). Depending on the server’s processing capacity, it can take just minutes for the PHP-FPM pool to hit its ceiling (max_children), queuing subsequent requests, while MySQL develops I/O waits and lock contention under the write load, causing query processing to stall—and real visitors are left staring at 502 errors.

This isn’t a sophisticated DDoS launched by an advanced threat actor. It’s a single machine running a while true loop against a form endpoint. And the countermeasure—rate limiting—is one of the oldest and most effective tools in a systems engineer’s toolkit. Yet most WordPress installations don’t implement it at all.

This article covers what rate limiting is, how throttling works, why per-IP limits are necessary but insufficient, and the specific challenges that IPv6 addressing creates when implementing request limits in 2026.

The Problem: Your Form Is an Unrestricted API Endpoint

Every contact form on the internet is, architecturally, a public API endpoint. It accepts POST requests from any IP address, processes them through the application stack, and returns a response. Unlike a typical REST API, it has virtually no authentication, quota management, or request limiting.

Think about it. If you built a SaaS API that accepted unlimited unauthenticated requests from any client worldwide, your security team would shut it down before lunch. But that’s what a standard WordPress contact form is: a public endpoint with no admission control.

The consequences are predictable:

  • Resource exhaustion. Each form submission consumes CPU, memory, database I/O, and often network I/O (mail delivery). At sufficient volume, processing of legitimate requests is disrupted.
  • Mail reputation damage. Thousands of notification emails sent in a short period increase the risk of your SMTP IP being listed on blacklists such as Spamhaus or Barracuda.
  • Database bloat. Plugins that log submissions write rows to wp_posts and wp_postmeta. Ten thousand junk submissions per day can add tens of thousands to hundreds of thousands of meta rows.
  • Cascading failure. A PHP-FPM pool that hits its limits can cause the entire site—not just the form—to become unresponsive.

Rate limiting is the minimum viable defense. It’s not a complete solution, but it’s the foundation on which everything else is built.

Rate limiting is not a mechanism for stopping attacks. It is a mechanism for controlling the speed at which your system breaks.

Technical Deep Dive: How Rate Limiting Works

Core Concept

Rate limiting restricts the number of requests a client can make to an endpoint within a specified time window. When a client exceeds the limit, subsequent requests are rejected—typically with an HTTP 429 Too Many Requests response—until the window resets.

The concept is simple. The implementation details are where complexity lives.

Throttling vs. Hard Limits

These terms are often used interchangeably, but they describe different behaviors:

Hard rate limiting is binary. N requests per window are allowed. Request N+1 is immediately rejected. The client receives a 429 and must wait for the window to reset.

Throttling is graduated. Instead of a hard reject, the server delays responses or queues requests. The client continues to receive service, but at reduced speed. Useful when you want graceful degradation rather than a hard cutoff.

In practice, most implementations combine both: throttle first, then apply hard limits if the client persists.

Client sends request #1  -> 200 OK (immediate)
Client sends request #2  -> 200 OK (immediate)
Client sends request #3  -> 200 OK (immediate)
Client sends request #4  -> 200 OK (500ms delay — throttling)
Client sends request #5  -> 200 OK (2000ms delay — harder throttling)
Client sends request #6  -> 429 Too Many Requests (hard limit)

This graduated approach is friendly to legitimate users who accidentally double-click a submit button, while being aggressive against automated scripts that send requests as fast as the connection allows.
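
The graduated policy in the trace above can be sketched as a small mapping from request count to action. The thresholds here (3 free requests, delays of 0.5s and 2s, hard limit at request 6) mirror the trace and are illustrative, not prescriptive; window expiry and counter reset are omitted for brevity:

```python
import time
from collections import defaultdict

# Illustrative thresholds mirroring the trace above
FREE_REQUESTS = 3              # served immediately
THROTTLE_DELAYS = [0.5, 2.0]   # delays for requests 4 and 5

_counts: dict[str, int] = defaultdict(int)

def handle_request(client_id: str) -> tuple[int, float]:
    """Return (status_code, delay_seconds) for this request."""
    _counts[client_id] += 1
    n = _counts[client_id]

    if n <= FREE_REQUESTS:
        return 200, 0.0                          # immediate
    if n <= FREE_REQUESTS + len(THROTTLE_DELAYS):
        delay = THROTTLE_DELAYS[n - FREE_REQUESTS - 1]
        time.sleep(delay)                        # throttle: slow the response
        return 200, delay
    return 429, 0.0                              # hard limit reached

# Requests 1-3 return (200, 0.0); 4 returns (200, 0.5);
# 5 returns (200, 2.0); 6 returns (429, 0.0).
```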

Algorithms: Fixed Window, Sliding Window, Leaky Bucket, Token Bucket

There are several standard approaches to counting requests and enforcing limits. Each has trade-offs.

Fixed Window

Divide time into fixed intervals (e.g., 60-second windows). Count requests per client in each window. Reset the counter when the window expires.

Window: 12:00:00 - 12:00:59  |  Requests: 5/5 (limit reached)
Window: 12:01:00 - 12:01:59  |  Requests: 0/5 (counter reset)

Problem: Boundary bursts. A client can send 5 requests at 12:00:58 and another 5 at 12:01:01, processing 10 requests in 3 seconds while technically never exceeding the “5 per minute” limit.
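
A fixed-window counter is a few lines of code, which is exactly why the boundary-burst weakness is so common; this sketch (names illustrative) reproduces the problem from the example above:

```python
import math
from collections import defaultdict

WINDOW = 60   # seconds per fixed window
LIMIT = 5     # requests allowed per window

# One counter per (client, window index)
_counters: dict[tuple[str, int], int] = defaultdict(int)

def is_allowed(client_id: str, now: float) -> bool:
    """Fixed-window check: the counter resets when the window index changes."""
    window_index = math.floor(now / WINDOW)
    key = (client_id, window_index)
    if _counters[key] >= LIMIT:
        return False
    _counters[key] += 1
    return True

# Boundary burst: 5 requests at t=58s and 5 more at t=61s all pass,
# because they fall into two different windows.
```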

Sliding Window

Instead of fixed intervals, the window slides with each request. The server checks: “How many requests has this client sent in the past 60 seconds?” This eliminates boundary burst exploitation.

# Sliding window rate limiter (runnable sketch)
import time
from collections import defaultdict

requests = defaultdict(list)  # client_id -> list of request timestamps

def is_allowed(client_id, max_requests=5, window_seconds=60):
    now = time.time()
    # Drop timestamps that have fallen out of the window
    requests[client_id] = [
        t for t in requests[client_id] if t > now - window_seconds
    ]
    if len(requests[client_id]) >= max_requests:
        return False  # Rate limited
    requests[client_id].append(now)
    return True

Sliding windows are more accurate but require storing individual request timestamps, increasing per-client memory usage.
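
A common compromise, sometimes called a sliding-window counter, keeps only two fixed-window counters per client and interpolates between them, trading exactness for constant memory. A sketch (names and structure illustrative):

```python
import math

WINDOW = 60   # seconds
LIMIT = 5     # requests per sliding window

# Per-client: {window_index: count}; only the current and previous
# windows are ever consulted, so older entries can be evicted freely.
_counters: dict[str, dict[int, int]] = {}

def is_allowed_approx(client_id: str, now: float) -> bool:
    idx = math.floor(now / WINDOW)
    c = _counters.setdefault(client_id, {})
    prev = c.get(idx - 1, 0)
    curr = c.get(idx, 0)
    # Weight the previous window by how much of it still overlaps
    # the sliding 60-second window ending at `now`.
    overlap = 1.0 - (now % WINDOW) / WINDOW
    estimated = prev * overlap + curr
    if estimated >= LIMIT:
        return False
    c[idx] = curr + 1
    return True
```

The estimate assumes requests were evenly spread across the previous window, which is usually close enough for abuse control while storing just two integers per client.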

Leaky Bucket

This is the algorithm used by Nginx’s limit_req module.

Imagine a bucket with a hole in the bottom. Requests pour water into the bucket. Water leaks out at a constant rate (your configured processing speed). When the bucket is full, new water (requests) overflows and is rejected.

Leaky Bucket characteristics: Even if requests arrive in bursts, processing occurs at a constant rate. The burst parameter corresponds to the bucket’s capacity, allowing a certain number of requests to be queued and processed with delay. Adding the nodelay option processes burst requests immediately without queuing, but still consumes bucket capacity.

# Nginx limit_req behavior (Leaky Bucket)
rate=5r/m (1 request processed every 12 seconds) + burst=2 nodelay

Request #1  -> Processed immediately (bucket remaining: 2)
Request #2  -> Processed immediately (bucket remaining: 1) — burst consumed
Request #3  -> Processed immediately (bucket remaining: 0) — burst consumed
Request #4  -> 429 Rejected (bucket full, 1 slot recovers after 12 seconds)
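
The same mechanics can be sketched directly. The model below follows limit_req's default (delay) mode, where burst requests are queued and drained at the configured rate rather than served immediately; the class and its structure are illustrative, not Nginx's actual implementation. An interval of 12 seconds corresponds to rate=5r/m:

```python
class LeakyBucket:
    """Simplified leaky bucket: excess requests queue up to `burst`
    deep and drain one per `interval` seconds; beyond that, reject."""

    def __init__(self, interval: float, burst: int):
        self.interval = interval   # seconds per drained request (12 for 5r/m)
        self.burst = burst         # bucket capacity
        self.next_slot = 0.0       # time the next queued request will drain

    def check(self, now: float) -> tuple[bool, float]:
        """Return (allowed, delay). delay > 0 means the request is queued."""
        if now >= self.next_slot:
            # Bucket has drained: serve immediately
            self.next_slot = now + self.interval
            return True, 0.0
        queued = round((self.next_slot - now) / self.interval)
        if queued > self.burst:
            return False, 0.0      # overflow: reject (429)
        delay = self.next_slot - now
        self.next_slot += self.interval
        return True, delay
```

With interval=12.0 and burst=2, four simultaneous requests yield one immediate response, two queued responses (12s and 24s delays), and one rejection, matching the shape of the trace above.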

Token Bucket

Token Bucket is widely used in many API platforms (AWS API Gateway, Redis-based rate limiters, etc.).

Imagine a bucket that holds a fixed number of tokens. Each request consumes one token. Tokens are replenished at a constant rate (e.g., one token every 12 seconds for a 5-per-minute limit). If the bucket is empty, the request is rejected.

# Token Bucket rate limiter (runnable sketch)
import time

class TokenBucket:
    def __init__(self, capacity=5, refill_rate=5/60):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_rate = refill_rate  # tokens per second
        self.last_refill = time.time()

    def allow_request(self):
        now = time.time()
        elapsed = now - self.last_refill
        self.tokens = min(
            self.capacity,
            self.tokens + elapsed * self.refill_rate
        )
        self.last_refill = now

        if self.tokens >= 1:
            self.tokens -= 1
            return True  # Request allowed
        return False  # Rate limited

Token Bucket vs. Leaky Bucket: Token Bucket can process burst submissions immediately (as long as tokens remain). Leaky Bucket maintains a constant output rate. Nginx’s limit_req uses Leaky Bucket, but the burst + nodelay option combination can approximate Token Bucket behavior. Both require only a small amount of state per client (a counter and a timestamp), making them memory-efficient even at scale.

Why Per-IP Limits Are Necessary

The obvious question: what should you use as the client identifier?

For unauthenticated endpoints like contact forms, IP addresses remain widely used as the initial key for client identification. Other identification methods exist—cookies, fingerprints, tokens—but all depend on client-side state, making IP addresses the most practical first key for rate limiting.

Why per-IP rate limiting works:

  1. Automated scripts run from a single origin. The majority of low-to-mid-tier bot attacks originate from a single server or a small number of VPS instances. A limit of 5 per minute per IP effectively suppresses them.
  2. It protects shared resources. Even if it can’t stop a distributed attack, per-IP limits prevent a single client from monopolizing PHP workers.
  3. It costs legitimate users nothing. A legitimate human almost never submits a contact form more than once per minute. A limit of 5 per minute is invisible to real visitors.
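
Point 1 is easy to demonstrate: keyed per IP, even the simplest limiter collapses a single-origin burst. This sketch (names and thresholds illustrative) shows a 100-request burst from one address yielding only 5 accepted submissions, with other clients unaffected:

```python
MAX_PER_MINUTE = 5
WINDOW = 60.0

# key -> (window_start, count); the key is the raw client IP for IPv4
_state: dict[str, tuple[float, int]] = {}

def allow(ip: str, now: float) -> bool:
    """Fixed 60-second window per IP: enough to stop a single-origin script."""
    start, count = _state.get(ip, (now, 0))
    if now - start >= WINDOW:
        start, count = now, 0          # window expired: start fresh
    if count >= MAX_PER_MINUTE:
        return False                   # this IP is rate limited
    _state[ip] = (start, count + 1)
    return True
```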

Implementation at the web server layer is straightforward:

# Nginx: Leaky Bucket rate limiter for form endpoints
# Zone allocates 10MB of shared memory (~160,000 IPv4 addresses)
limit_req_zone $binary_remote_addr zone=formsubmit:10m rate=5r/m;

server {
    location ~* /wp-json/contact-form-7/ {
        limit_req zone=formsubmit burst=2 nodelay;
        limit_req_status 429;

        # Only pass to PHP-FPM if rate limit passes
        include fastcgi_params;
        fastcgi_pass unix:/run/php/php-fpm.sock;
    }
}

Or at the application layer in PHP:

// WordPress: application-level rate limiter using transients
function check_form_rate_limit(): bool {
    $ip = $_SERVER['REMOTE_ADDR']; // behind a proxy/CDN, resolve the real client IP first
    $key = 'rate_limit_' . md5($ip);
    $window = 60; // seconds
    $max_requests = 5;

    $data = get_transient($key);

    if ($data === false) {
        set_transient($key, ['count' => 1, 'start' => time()], $window);
        return true;
    }

    if ($data['count'] >= $max_requests) {
        return false; // Rate limited
    }

    $data['count']++;
    // Keep the TTL positive: a zero TTL would make the transient permanent
    set_transient($key, $data, max(1, $window - (time() - $data['start'])));
    return true;
}

Both approaches work. The Nginx approach is superior—it rejects requests before they reach PHP, saving the most expensive part of the stack. The PHP approach is a fallback for environments where you can’t configure the web server directly (shared hosting, managed WordPress platforms).

The IPv6 Problem: Why Per-IP Limits Break Down

This is where it gets hard.

Everything in the previous section assumed IPv4—a world where one IP address roughly corresponds to one client (or at least one NAT gateway). IPv6 fundamentally changes that calculation, and most rate limiting implementations haven’t caught up.

The Scale of IPv6 Address Space

The IPv6 prefix that ISPs assign to households typically falls in the /48 to /64 range (it varies by ISP; /56 allocations are also common). A /48 allocation gives the customer 2^80 addresses—effectively infinite IPs. Even a conservative /64 allocation (a single subnet) provides 2^64 addresses—approximately 1.8×10^19, or 18.4 quintillion.

A bot operator with a single /64 IPv6 block can assign a unique source address to every request and never reuse one. The per-IP rate limiter sees each request as a new client, and the limit never triggers.

Request 1:       from 2001:db8:1234:5678::1         -> Counter: 1/5
Request 2:       from 2001:db8:1234:5678::2         -> Counter: 1/5
Request 3:       from 2001:db8:1234:5678::3         -> Counter: 1/5
...
Request 1,000,000: from 2001:db8:1234:5678::f4240  -> Counter: 1/5

One million requests. Zero rate limit triggers. All from “different” IPs.
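
The failure mode is trivial to reproduce: keying counters on the full address hands every rotated source a fresh counter. This simulation (addresses drawn from the 2001:db8::/32 documentation range) uses 1,000 rotations, but the result is identical at any scale:

```python
from collections import Counter

counters: Counter[str] = Counter()

# Simulate a bot rotating through a /64: a unique source address per request
for i in range(1000):
    source = f"2001:db8:1234:5678::{i:x}"
    counters[source] += 1

# A 5-per-minute per-address limit never fires: every counter sits at 1,
# and the limiter's state grows by one entry per request.
assert max(counters.values()) == 1
assert len(counters) == 1000
```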

The Fix: Rate Limit by CIDR Block, Not Individual IP

The solution is to stop treating IPv6 addresses individually and aggregate by prefix. All addresses within a /64 block are typically controlled by a single entity, so the rate limiter should group them accordingly.

Key on the /64 prefix instead of the full 128-bit address:

import ipaddress

def get_rate_limit_key(ip_string: str) -> str:
    addr = ipaddress.ip_address(ip_string)

    if isinstance(addr, ipaddress.IPv6Address):
        # Mask to /64 — all addresses in this block share one counter
        network = ipaddress.IPv6Network(f"{addr}/64", strict=False)
        return str(network.network_address)
    else:
        # IPv4: use individual address
        return str(addr)

# Examples:
get_rate_limit_key("2001:db8:1234:5678::1")
# -> "2001:db8:1234:5678::"

get_rate_limit_key("2001:db8:1234:5678::ffff:9999:abcd")
# -> "2001:db8:1234:5678::"

get_rate_limit_key("203.0.113.42")
# -> "203.0.113.42"

Both IPv6 examples above map to the same key. The rate limiter now counts them together—correct behavior, since they’re from the same allocation.

Implementing CIDR-Aware Rate Limiting in Nginx

Nginx doesn’t natively support prefix-based IPv6 rate limiting. You need to extract the /64 prefix using map or Lua:

# Extract /64 prefix from IPv6 (first four hextets), pass IPv4 through
# unchanged. Note: this regex assumes the uncompressed textual form; if
# "::" compression falls within the first 64 bits, the default branch
# falls back to the full address.
map $remote_addr $rate_limit_key {
    "~^(?P<prefix>([0-9a-f]+:){3}[0-9a-f]+):"  $prefix;
    default  $remote_addr;
}

limit_req_zone $rate_limit_key zone=formsubmit:20m rate=5r/m;

For more precise control, OpenResty (Nginx + Lua) provides programmatic access:

-- OpenResty: CIDR-aware rate limiting
-- Caveat: assumes the uncompressed textual form; if "::" compression
-- shortens the first 64 bits, expand the address before extracting.
local function get_ipv6_prefix_64(addr)
    -- Extract first 4 groups (64 bits) of IPv6 address
    local groups = {}
    for group in addr:gmatch("([0-9a-fA-F]+)") do
        table.insert(groups, group)
        if #groups == 4 then break end
    end
    return table.concat(groups, ":") .. "::"
end

local client_ip = ngx.var.remote_addr
local key

if client_ip:find(":") then
    key = get_ipv6_prefix_64(client_ip)
else
    key = client_ip
end

The /64 Assumption Doesn’t Always Hold

This is where many people stumble. /64 is the standard subnet size defined in RFC 4291, but not all IPv6 allocations follow the rules:

  • Cloud providers may assign individual VMs a /128 address (a single IP). In this case, rate limiting by /64 groups unrelated customers together.
  • Large enterprises may use /48 allocations internally, where the first 48 bits identify the organization and bits 49–64 identify internal subnets.
  • Mobile carriers may assign /128s from a shared /48 pool, meaning completely unrelated users share a /64 prefix.

There is no perfect prefix length. /64 is the practical default. It catches the majority of IPv6 rotation attacks while minimizing false positives on shared infrastructure. However, if you have the data, consider adaptive prefix lengths: start at /128, widen to /64 when rotation is detected, and widen to /48 if rotation continues within the same /48 block.

# Adaptive CIDR-based rate limiting (conceptual, runnable sketch)
import ipaddress

# Illustrative thresholds per aggregation level
MAX_PER_IP = 5     # per /128
MAX_PER_64 = 20    # per /64
MAX_PER_48 = 100   # per /48

def adaptive_rate_limit(ip: str, store: dict) -> bool:
    prefixes_to_check = [
        ("/128", ipaddress.IPv6Network(f"{ip}/128", strict=False)),
        ("/64",  ipaddress.IPv6Network(f"{ip}/64",  strict=False)),
        ("/48",  ipaddress.IPv6Network(f"{ip}/48",  strict=False)),
    ]

    for label, network in prefixes_to_check:
        key = str(network.network_address) + label
        count = store.get(key, 0)

        if label == "/128" and count > MAX_PER_IP:
            return False  # Single IP exceeded limit
        if label == "/64" and count > MAX_PER_64:
            return False  # /64 block exceeded aggregate limit
        if label == "/48" and count > MAX_PER_48:
            return False  # /48 block exceeded aggregate limit

        store[key] = count + 1

    return True

This is more complex, but far more resilient. You’re monitoring for abuse at multiple zoom levels simultaneously.

The Solution: A Layered Approach to Form Rate Limiting

Rate limiting alone doesn’t solve form abuse. A bot operator with access to residential proxy networks can distribute requests across thousands of unique /64 blocks, keeping to just 1–2 submissions from each. That’s well below any reasonable rate limit. But rate limiting is the foundation that makes every other defense more effective.

Here’s the architecture that works in practice:

Layer 1: Web Server Rate Limiting (Nginx / Apache)

The coarsest filter. Reject bulk abuse before it reaches PHP. Configure the CIDR-aware limits described above. Set thresholds generous enough that legitimate users won’t normally hit them. 5–10 submissions per minute per /64 is a safe starting point.

Cost: Near zero. Nginx handles this in shared memory with no disk I/O.

Layer 2: Application-Layer Throttling (WordPress / PHP)

For requests that pass the web server filter, add a secondary check inside the application. This layer can be smarter—it has access to form metadata, session state, and submission content.

// Application-layer throttling with graduated response
function throttle_form_submission(string $ip): string {
    // get_rate_limit_key(): assumes a PHP port of the prefix-mapping helper shown earlier
    $key = 'throttle_' . get_rate_limit_key($ip);
    $count = (int) get_transient($key);

    if ($count === 0) {
        set_transient($key, 1, 300); // 5-minute window
        return 'allow';
    }

    if ($count < 3) {
        set_transient($key, $count + 1, 300);
        return 'allow';
    }

    if ($count < 5) {
        set_transient($key, $count + 1, 300);
        sleep(2); // Throttle: 2-second delay (note: sleep() holds a PHP-FPM worker)
        return 'throttle';
    }

    return 'block'; // Hard limit reached
}

Layer 3: Behavioral Verification

Rate limiting tells you how frequently a client is submitting, but nothing about whether the submission is legitimate. Layer behavioral signals on top: honeypot fields, timing analysis, proof-of-work challenges, interaction fingerprinting.

This is where the real spam filtering happens. Rate limiting keeps volume manageable. Behavioral verification keeps content clean.
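
As a flavor of what this layer checks, here is a minimal sketch combining two of the signals above: a hidden honeypot field that humans leave empty, and a minimum fill time that scripts rarely respect. The field name and the 3-second threshold are illustrative assumptions, not values from any particular plugin:

```python
MIN_FILL_SECONDS = 3.0  # humans rarely complete a form faster than this

def looks_automated(form: dict, render_time: float, submit_time: float) -> bool:
    """Return True if the submission shows bot-like behavior."""
    # Honeypot: the field is hidden via CSS, so humans never fill it
    if form.get("website_url_confirm", ""):   # illustrative field name
        return True
    # Timing: bots typically POST within milliseconds of fetching the page
    if submit_time - render_time < MIN_FILL_SECONDS:
        return True
    return False

# A human taking 20 seconds with the honeypot empty is not flagged;
# a script posting instantly, or filling the hidden field, is.
```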

Layer 4: Silent Failure

When rejecting a rate-limited or flagged submission, one approach is to return a 200 OK with a fake success message. Bots that see a success response tend to stop retrying and move on, while bots that see error responses adapt and try harder. However, 429 is the correct status code per RFC, and returning 200 may cause unintended behavior in WAF or CDN integrations—so evaluate this strategy based on your operational environment.

// Example: returning fake success to a rate-limited bot (silent rejection strategy)
if ($result === 'block') {
    wp_send_json([
        'status'  => 'mail_sent',
        'message' => 'Thank you for your message.'
    ]);
    exit; // Don't actually process the form submission
}

Integration with Samurai Honeypot for Forms

If you're running Contact Form 7, building CIDR-aware rate limiting, behavioral verification, proof of work, and silent failure from scratch isn't realistic, and that's understandable. It's a significant amount of infrastructure code for a problem that should already be solved.

Samurai Honeypot for Forms packages this layered approach into a single plugin. IPv6-aware rate limiting, behavioral entropy scoring, polymorphic honeypot rotation, silent rejection—all handled without external API calls or user-facing friction. For most WordPress sites, installing this plugin is far more efficient than building and maintaining a custom rate limiting stack.

Operational Recommendations

A checklist for system administrators and developers implementing rate limiting on form endpoints:

  1. Start at the edge. Cloudflare, Nginx, or Apache rate limiting is the cheapest defense. Configure this first.
  2. Understand that Nginx's limit_req operates on a Leaky Bucket model. It maintains a constant output rate. The burst parameter controls burst tolerance; nodelay controls whether burst requests are processed immediately.
  3. Aggregate IPv6 by /64. Per-IP limits on IPv6 are effectively no limits at all. Group by prefix.
  4. Log before you block. Run in "log only" mode for a week before enforcing limits. Analyze traffic patterns and set thresholds based on real data, not guesses.
  5. Consider your response code strategy. Depending on your operational policy, returning 200 instead of 429 can prevent leaking information about detection logic to attackers. Verify compatibility with your WAF/CDN first.
  6. Monitor with alerts. A spike in rate-limited requests means someone is probing your endpoint. That signal is valuable. Pipe it to your monitoring system.
  7. Don't rely on rate limiting alone. It stops volumetric abuse. It doesn't stop a bot sending one well-crafted submission per hour from a rotating IP pool. You need a behavioral analysis layer too.

Final Thoughts

Rate limiting isn't glamorous. It's not the kind of security measure that makes headlines or conference talks. But it's the difference between a WordPress site staying online during a bot attack and a service becoming unresponsive as thousands to tens of thousands of requests hit an unprotected endpoint in a short window.

Implementation is straightforward. The IPv6 challenge is real but solvable. And the cost of doing nothing—database bloat, blacklisted IPs, degraded performance, angry support tickets—is far higher than spending an afternoon configuring a limit_req_zone directive.

Protect your endpoints. Aggregate your IPv6 prefixes. Layer your defenses. Bots won't stop. Your infrastructure needs to be ready.
