TL;DR
- A 2,000% event spike was caused by a double-fire tracking bug. A useEffect hook without a dependency array fired a duplicate pageview event on every component render.
- Always verify the spike is real before changing code. Cross-reference server logs, session replays, and raw event data in HogQL to confirm the events actually exist in your pipeline.
- Isolating which event types spiked narrows the search. Breaking down event volume by name takes you from auditing your entire tracking surface to fixing one or two broken calls.
- Double-fire bugs hide in lifecycle hooks that run multiple times. React useEffect hooks, route change listeners, and feature flag callbacks can fire dozens of times per page view if misconfigured.
- Event volume alerts catch issues within hours, not billing cycles. Setting up threshold alerts in PostHog prevents a 14-hour bug from generating 2.1 million duplicate events and a surprise invoice.
The Slack Message That Started Everything
You wake up to a Slack notification at 7:14 AM. The message is short and has all the urgency you would expect. The CEO forwarded a screenshot from the PostHog dashboard showing event volume at 2,147% of the daily average. The chart looks like someone dropped a brick wall onto a step function.
This was not a theoretical exercise for us. A SaaS client running a React-based product management tool hit this exact scenario last quarter. They were on PostHog's Scale plan, where billing is event-based, and the spike had already been running for roughly 14 hours before anyone noticed. That window matters because PostHog counts every single event for billing purposes. You pay for what gets tracked, even when it is a bug.
Chart: Event volume compared to the 30-day daily average, sustained over 14 hours before the team identified the root cause and deployed a fix.
The panic in that Slack thread was understandable. Nobody wants to explain to their leadership team why the analytics bill just went up by 21x. But panic does not fix the problem. A systematic investigation does. We covered a related failure mode in our piece on why dashboards can look perfectly fine while the data underneath is compromised. The same principle applies here: the surface numbers tell you something is wrong, but they do not tell you why.
A 2,000% spike is not a growth story. It is a tracking bug, and every unchecked minute costs you money and corrupts your data.
What follows is the exact 5-step investigation process we ran from that first Slack message to a deployed fix, a billing reconciliation, and a set of alerts that catch the same issue if it ever happens again. You can use this framework the next time your PostHog charts go vertical.
"The fastest way to fix a tracking spike is to stop guessing and start isolating. Verify, segment, correlate, trace, and fix — in that order."
-- Jake McMahon, ProductQuant
Step 1: Verify the Spike Is Real
Your first job is to confirm the spike is not a PostHog reporting glitch, a timezone change, or a cached dashboard artifact. You do not touch any tracking code until you can prove the events actually exist in your raw data. Jumping straight to code changes wastes time when the problem might be something else entirely.
Check the Trending Graph in Raw Mode
Open the PostHog Trends view and switch the interval to hourly. A genuine spike will show a sharp inflection point at a specific timestamp. A reporting glitch tends to look like uniform inflation across the entire day or gaps in the data. In our case, the spike started at exactly 5:30 PM UTC the previous evening, which corresponded with a deployment window on the client's engineering calendar.
The insight: A real spike has a sharp inflection point at a specific time. If the inflation is uniform across the whole day, it is a reporting artifact — not a tracking bug.
Query Raw Events with HogQL
HogQL lets you run SQL-like queries directly against your event data. You want to count events per hour and compare against the trailing average. A simple query like the one below tells you whether the numbers are real:
SELECT
    toStartOfHour(timestamp) AS hour,
    count() AS event_count
FROM events
WHERE timestamp > now() - INTERVAL 7 DAY
GROUP BY hour
ORDER BY hour DESC
LIMIT 48
If the results show a single hour where event counts jumped from roughly 5,000 to 100,000+, the spike is real. You have actual events flooding your pipeline — that is a tracking problem, not a reporting one. The PostHog HogQL documentation covers the full query syntax if you want to build more detailed breakdowns.
The insight: If HogQL shows a 20x jump in a single hour, the events are real and coming from your app — not a PostHog dashboard glitch.
Cross-Reference Session Replays
Open PostHog Session Replay and filter for the spike window. If you see individual sessions generating 15 to 30 pageviews per session instead of the typical 3 to 5, the tracking layer itself is broken. Each real person is generating dozens of events that should only fire once.
The insight: When individual sessions show 15-30 pageviews instead of the normal 3-5, the tracking layer itself is broken — not the analytics pipeline.
At this point you have confirmed the spike is real. The events exist. They are being sent by real users. Now you need to figure out which events.
Step 2: Isolate Which Event Types Spiked
PostHog tracks many event types simultaneously. Pageviews, autocaptured clicks, custom events, feature flag evaluation calls, and group identification events all flow through the same pipeline. You need to know which specific event type is responsible for the spike before you start auditing your codebase. Searching every tracking call across your entire app is not efficient when you can narrow the search with data.
Run a breakdown by event name over the spike window. In PostHog's Trends view, add a breakdown for event name and set the date range to cover the spike period. The bar chart will immediately show you which event type dominates the volume.
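If you prefer querying over clicking, the same breakdown is a short HogQL query. A minimal sketch, using a relative 24-hour window as a stand-in for your actual spike window:
-- Sketch: event volume by event name over an assumed 24-hour spike window
SELECT
    event,
    count() AS event_count
FROM events
WHERE timestamp > now() - INTERVAL 24 HOUR
GROUP BY event
ORDER BY event_count DESC
LIMIT 10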
Here is what we found in our investigation:
- Pageview events accounted for 94% of the spike volume. Custom events like button clicks and form submissions remained at baseline levels throughout the same period.
- Autocaptured events showed a moderate increase of 180%. This was a secondary effect caused by the same root issue, not a separate bug in click tracking.
- Feature flag evaluation calls stayed flat. This ruled out a recently launched feature flag as the direct trigger, which saved us from chasing the wrong lead.
The pageview event was the smoking gun. Pageviews should fire exactly once per page load in a single-page application. If they are firing multiple times, something in your routing or component lifecycle is triggering the tracking call more than once per navigation action.
Understanding which event spiked also informs your data ownership strategy, because different teams own different event types. Pageviews usually belong to the frontend team, while custom business events might be owned by growth or product analytics.
The insight: A breakdown by event name tells you exactly which tracking call is broken — saving hours of blind codebase auditing.
Step 3: Check for Recent Tracking Code Changes
Once you know which event type spiked, you check what changed in the codebase around the time the spike started. The deployment log from 5:15 PM UTC showed a frontend release that touched the routing configuration and the analytics initialization module. The PR description mentioned refactoring the PostHog SDK setup to support lazy loading. That timing aligned almost exactly with the 5:30 PM UTC inflection point in the event data.
We pulled the diff and found the problem within the first 20 lines of changed code. The developer had moved the posthog.capture('$pageview') call into a React useEffect hook without adding a proper dependency array. Here is what the problematic code looked like:
// BROKEN: This fires on every render, not just page changes
useEffect(() => {
  posthog.capture('$pageview', {
    path: location.pathname,
    referrer: document.referrer
  });
}); // Missing dependency array, so this runs after every render
Without the empty dependency array [] or a proper route-change dependency, that useEffect runs after every single component render. A typical page in this app rendered 8 to 12 times per load due to state updates, API responses, and animation triggers. Each render fired another pageview event.
This is not a PostHog-specific problem. The same bug can happen with Google Analytics, Mixpanel, Amplitude, or any analytics SDK where you manually trigger events from a framework that re-renders components. The PostHog tracking documentation covers the correct patterns for different frameworks, including the recommended approach for React single-page applications.
The insight: Cross-referencing the spike timestamp with your deployment log surfaces the exact PR that introduced the bug — usually within minutes.
Step 4: Confirm the Root Cause with Session-Level Evidence
Before you declare victory and deploy a fix, you need session-level proof that the double-fire bug matches the spike pattern. You already know the code is wrong. Now you need to confirm it is producing the exact behavior you see in the data.
We opened 10 random session replays from the spike window and counted pageview events per session. The results were consistent:
- Sessions that lasted 2 minutes generated between 25 and 40 pageview events each.
- The average time between consecutive pageview events in the same session was 3 to 8 seconds, matching the typical render cycle of components on that page.
- The URL path in every duplicate pageview event was identical, confirming these were not real navigation events but duplicate fires from the same page state.
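Replay sampling is persuasive, but you can back it with aggregate numbers. A minimal HogQL sketch that counts pageviews per session over a relative window (swap in your spike window; this assumes your SDK sends $session_id, which posthog-js does by default):
-- Sketch: pageviews per session, worst offenders first
SELECT
    properties.$session_id AS session_id,
    count() AS pageview_count
FROM events
WHERE event = '$pageview'
  AND timestamp > now() - INTERVAL 24 HOUR
GROUP BY session_id
ORDER BY pageview_count DESC
LIMIT 20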
The evidence was conclusive. The missing dependency array in the useEffect hook was firing a pageview event on every render cycle, multiplying the event count by roughly 10x to 15x per user per session. Combined with a normal traffic day of roughly 2,000 sessions, this produced the 2,000% spike on the dashboard.
The insight: Session replays provide the definitive proof that connects your code bug to the data pattern — no more speculation.
If you are working through a similar investigation and need a structured approach to diagnosing data quality issues, our growth operating system framework includes a diagnostics module that maps directly to this kind of root cause analysis workflow.
Step 5: Calculate the Billing Impact and Deploy the Fix
The fix itself was straightforward. Adding an empty dependency array to the useEffect hook ensured the pageview event fired only once when the component mounted:
// FIXED: Empty dependency array fires once on mount
useEffect(() => {
  posthog.capture('$pageview', {
    path: location.pathname,
    referrer: document.referrer
  });
}, []); // Empty array = run once on mount only
For route changes in a single-page application, the better approach is to listen to the router's navigation events directly rather than relying on component lifecycle hooks. PostHog's React integration handles this automatically when you call posthog.init() with the capture_pageview option set to true and the api_host configured correctly.
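As a rough sketch of that initialization with posthog-js (the key is a placeholder and the api_host depends on your region, so treat both as assumptions):
// Sketch: let the SDK own pageview capture instead of a manual useEffect
import posthog from 'posthog-js';

posthog.init('<your_project_api_key>', { // placeholder key
  api_host: 'https://us.i.posthog.com', // assumption: US cloud region
  capture_pageview: true // SDK fires $pageview itself (true is the default)
});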
The billing impact was the harder conversation. The 14-hour window at 21x normal volume generated roughly 2.1 million extra events. On PostHog's Scale plan, that pushed the client over their monthly event allocation and into overage pricing. PostHog's support team was responsive when we explained the situation with evidence — they credited the duplicate event volume back to the account. You should always reach out to your analytics vendor's support team when a tracking bug causes billing overage. Most vendors have processes for this because it happens more often than you would think.
Chart: Extra events generated during the 14-hour spike window. PostHog credited these back after we provided session replay evidence and the code diff confirming the bug.
The deployment went out as a hotfix within 90 minutes of the initial Slack message. Event volume returned to baseline within the hour. The team then set up alerts to catch this scenario before it could compound for another billing cycle.
The insight: Most analytics vendors will credit duplicate-event overages if you bring session-level evidence and a code diff — but you have to ask.
We will run the investigation for you.
Our team has traced double-fire bugs across React, Vue, and Angular implementations for 40+ SaaS companies. If your event charts just went vertical, we can help you find the root cause and prevent it from happening again.
The Prevention Setup: Event Volume Alerts
Fixing the bug is only half the job. You need a system that alerts you when event volumes deviate from normal patterns so you catch the next issue in hours instead of days. PostHog has a built-in alerts system that can monitor any metric you can query, and event volume is one of the most important ones to watch.
Create an Alert for Daily Event Volume
Set up an alert that fires when your total daily event count exceeds 3x the trailing 7-day average. This threshold catches genuine spikes while avoiding false positives from normal traffic variation like product launches or marketing campaigns. In PostHog, you can create this alert from the Insights view by building a trend query for total events and adding an alert condition on the metric threshold.
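Before wiring the alert up in the UI, you can sanity-check where the threshold sits with a HogQL query that compares today's count against the trailing 7-day daily average. A minimal sketch:
-- Sketch: today's event count vs. the trailing 7-day daily average
SELECT
    countIf(timestamp >= toStartOfDay(now())) AS today_count,
    countIf(timestamp < toStartOfDay(now())) / 7 AS trailing_daily_avg
FROM events
WHERE timestamp >= toStartOfDay(now()) - INTERVAL 7 DAY
If today_count runs past 3x trailing_daily_avg, your alert would have fired.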
The PostHog alerts documentation walks through the configuration steps for both metric-based and formula-based alerts. For event volume monitoring, the metric-based approach is simpler and covers the most common failure modes.
The insight: A 3x trailing-average alert catches massive spikes like double-fire bugs while ignoring normal traffic variation from launches and campaigns.
Add Per-Event Breakdown Alerts
A total event count alert will catch the big picture. But if only one event type spikes while others stay normal, the total might not cross your threshold until significant damage is done. Add a secondary alert that monitors your top 3 event types individually. If pageviews jump by 500% while clicks stay flat, the per-event alert fires before the total does.
- Set the per-event alert threshold at 5x the trailing 7-day average for each monitored event type.
- Route alerts to a dedicated Slack channel or PagerDuty service so they do not get buried in general notifications.
- Include a link to the relevant PostHog insight in the alert message so the on-call engineer can start investigating immediately without navigating through dashboards.
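The same trailing-average check works per event type. A HogQL sketch that flags any event name currently running past 5x its 7-day daily average:
-- Sketch: event types whose volume today exceeds 5x their trailing average
SELECT
    event,
    countIf(timestamp >= toStartOfDay(now())) AS today_count,
    countIf(timestamp < toStartOfDay(now())) / 7 AS trailing_daily_avg
FROM events
WHERE timestamp >= toStartOfDay(now()) - INTERVAL 7 DAY
GROUP BY event
HAVING today_count > 5 * trailing_daily_avg
ORDER BY today_count DESC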
The insight: Per-event alerts catch single-type anomalies before they inflate the total volume enough to trigger the main alert.
Run a Weekly Event Volume Review
Alerts catch the emergencies. A weekly review catches the slow drifts that alerts miss. Block 15 minutes each week to review your event volume trends, check for gradual increases in per-event counts, and verify that new events being tracked match your product's recent releases. This practice surfaces issues like deprecated events still firing, new features generating unexpectedly high event volumes, and tracking calls that were added during experiments and never removed.
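A week-over-week comparison makes that 15-minute review concrete. A minimal HogQL sketch:
-- Sketch: per-event volume, this week vs. last week
SELECT
    event,
    countIf(timestamp >= now() - INTERVAL 7 DAY) AS this_week,
    countIf(timestamp < now() - INTERVAL 7 DAY) AS last_week
FROM events
WHERE timestamp >= now() - INTERVAL 14 DAY
GROUP BY event
ORDER BY this_week DESC
Events that appear in this_week with no last_week counterpart are new tracking calls worth verifying against your recent releases.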
The insight: A 15-minute weekly review catches slow drifts and abandoned tracking calls that alerts miss because they never cross a spike threshold.
We will configure your event volume monitoring end-to-end.
Get the total-volume alert, per-event breakdown alerts, and a weekly review template tailored to your PostHog setup — deployed in under an hour.
| Alert Type | Threshold | Catches | False Positive Risk |
|---|---|---|---|
| Total daily event count | 3x trailing 7-day average | Massive spikes like double-fire bugs | Low — only triggers on major deviations from baseline |
| Per-event type volume | 5x trailing 7-day average | Single event type anomalies masked by overall volume | Medium — some events naturally spike with campaigns |
| Weekly trend review | Manual inspection | Slow drifts, abandoned tracking calls, deprecated events | None — human judgment applies context |
If your team needs help designing an analytics monitoring strategy that covers both spike detection and gradual data quality drift, reach out to us and we will walk through what makes sense for your stack and team size.
The 5-Step Investigation Framework for Any Spike
Every event spike follows the same pattern regardless of the root cause. You can use this exact sequence the next time your charts go vertical. The steps are ordered because skipping ahead wastes time. You cannot fix what you have not verified, and you cannot isolate what you have not quantified.
1. Verify the spike is real. Check raw data with HogQL, confirm session replays show abnormal event counts, and rule out reporting artifacts before touching any code.
2. Isolate the event type. Break down event volume by name to identify which specific tracking call is misbehaving instead of auditing your entire codebase blindly.
3. Check recent changes. Cross-reference the spike start time with your deployment log, feature flag releases, and analytics SDK updates to find what changed.
4. Confirm with session evidence. Open individual session replays and count events per session to verify the code issue you found produces the exact data pattern you see.
5. Fix, calculate, and alert. Deploy the fix, quantify the billing impact, contact your vendor about crediting duplicate events, and set up alerts to catch the same pattern next time.
This process works for PostHog, Mixpanel, Amplitude, GA4, or any event-based analytics platform. The tools change. The investigation pattern does not. The PostHog troubleshooting guide covers additional diagnostic techniques specific to their SDK and pipeline, which is worth reading if you run PostHog in production.
FAQ
Why did PostHog charge us for events caused by a bug?
PostHog bills based on the number of events that pass through its pipeline, regardless of whether those events represent real user behavior or a tracking bug. The events are still processed, stored, and counted against your plan allocation. This is standard across all event-based analytics platforms because the infrastructure cost is the same whether the event is legitimate or duplicated. Contact PostHog support with evidence of the bug and they will often credit the excess volume as a one-time adjustment.
How do I prevent double-fire events in React?
Always include a dependency array in your useEffect hooks that fire analytics events. Use [] for mount-only execution or pass specific dependencies like [location.pathname] for route-change tracking. Even better, use PostHog's built-in React integration which handles pageview tracking automatically through the router without manual useEffect calls. Test your tracking implementation in a staging environment and verify event counts per session before deploying to production.
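A minimal sketch of the route-change pattern, assuming React Router's useLocation hook (adapt the import to your router):
// Sketch: one $pageview per route change, assuming React Router
import { useEffect } from 'react';
import { useLocation } from 'react-router-dom';
import posthog from 'posthog-js';

function PageviewTracker() {
  const location = useLocation();

  useEffect(() => {
    posthog.capture('$pageview', { path: location.pathname });
  }, [location.pathname]); // re-runs only when the route actually changes

  return null; // renders nothing; mount once inside your router
}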
What is a reasonable event volume per user session?
A typical session generates 10 to 50 events depending on your tracking configuration. A single pageview, 5 to 15 autocaptured clicks, 2 to 5 custom events, and a handful of feature flag evaluation calls is a normal range for a content-heavy SaaS product. If individual sessions are generating 100+ events, you likely have a tracking bug that needs investigation.
Can I delete the duplicate events from PostHog?
PostHog does not currently support deleting individual events from your project. Once events are ingested, they are permanent. This is by design because analytics platforms prioritize data immutability to maintain audit trails and consistent reporting. The practical approach is to credit the billing overage with PostHog support and then filter out the spike window from any analysis that uses event counts as a denominator for calculations like conversion rates or engagement metrics.
Should I set up alerts for every event type I track?
No. Monitoring every event type creates alert fatigue and makes it harder to spot the ones that matter. Focus your automated alerts on your top 3 to 5 highest-volume event types and your most critical business events like purchase completions or subscription changes. Use the weekly manual review process to catch anomalies in lower-volume events that do not warrant dedicated alert thresholds.
Audit Your Tracking Before the Next Spike Hits
A 30-minute review of your PostHog event volume trends and tracking code catches double-fire bugs before they generate millions of duplicate events and surprise invoices.