On May 28, I analyzed one of my websites' traffic over 20 minutes to figure out whether my current 2-core setup was holding up — or holding me back. This is what I found.
From access_20min.log.txt, I counted:
~13,400 total requests
Many from bots: Googlebot, Bingbot, IAS
Many hits on SSR-heavy pages: /swimmer/, /rankings/, /records/
That’s roughly:
670 requests per minute
11 requests per second
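A rough sketch of how to reproduce those counts, assuming access_20min.log.txt is the raw combined-format log for the 20-minute window; the grep patterns are only approximations of the bot and route matching.
# total requests in the sample (~13,400)
wc -l < access_20min.log.txt
# requests from the named crawlers
grep -ciE 'googlebot|bingbot|ias' access_20min.log.txt
# hits on the SSR-heavy routes
grep -cE '/(swimmer|rankings|records)/' access_20min.log.txt
# 13,400 requests over 20 minutes: 670 req/min, ~11 req/sec
echo "$((13400 / 20)) req/min, ~$((13400 / 1200)) req/sec"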
Under ideal conditions, a 2-core server can handle:
~20–50 requests/sec (API or SSR)
Assuming good caching, minimal disk I/O, and a responsive DB
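If you'd rather measure your own ceiling than trust that rule of thumb, a short ApacheBench run against one SSR route gives a baseline; the URL is a placeholder and the numbers are kept small on purpose so you don't load-test production by accident.
# 500 requests at concurrency 10; look at the "Requests per second" line in the output
ab -n 500 -c 10 https://example.com/rankings/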
But SwimStandards also:
Uses Next.js SSR + Feathers.js API
Runs MongoDB aggregation per request
Handles bot load 24/7
Shows socket hang up errors and TIME_WAIT spikes
You're operating at the edge of stability for a 2-core server.
Red flags:
socket hang up during peak periods
TIME_WAIT sometimes exceeds 300
MongoDB slow logs show query latency
High I/O due to bot hits on uncached pages
You want margin for real traffic surges
You want better SSR/API responsiveness
You want fewer 502/socket errors without blocking bots
What helps on the current box:
Move MongoDB to its own box (if not already)
Use pm2 scale feathers-dna 2 to add more backend workers (sketch below)
Enable Redis or file-based caching
Use Cloudflare or NGINX with caching to protect SSR routes
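For the pm2 and caching items above, a minimal sketch. It assumes the Feathers API has a Node entry point at src/index.js and keeps the feathers-dna process name (both are guesses; match them to your setup). pm2 scale only works on processes started in cluster mode, and the curl line is just a spot-check that the caching layer in front of SSR routes is doing its job.
# start the API in cluster mode so pm2 scale can add workers later
pm2 start src/index.js --name feathers-dna -i 1
# add a second backend worker without redeploying
pm2 scale feathers-dna 2
# persist the process list across reboots
pm2 save
# spot-check cache behavior on an SSR route (placeholder URL)
curl -sI https://example.com/rankings/ | grep -iE 'cache-control|cf-cache-status|age'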
The commands I used to gather these numbers:
# top client IPs over roughly the last 20 minutes of the Apache vhost log
sudo awk -v d="$(date -d '20 minutes ago' '+%d/%b/%Y:%H:%M')" '$0 ~ d,0' /var/log/apache2/other_vhosts_access.log | awk '{print $2}' | sort | uniq -c | sort -nr | head -20
# total requests in the 20-minute sample
wc -l access_20min.log.txt
# connection states on port 443 (watch for TIME_WAIT piling up)
sudo netstat -anp | grep :443 | awk '{print $6}' | sort | uniq -c
# top client IPs across the whole log
sudo awk '{print $2}' /var/log/apache2/other_vhosts_access.log | sort | uniq -c | sort -nr | head -20
# live count of connections on port 5052
watch -n 1 "netstat -anp | grep :5052 | wc -l"
# memory and load averages
free -m
uptime
# MongoDB operations slower than 800 ms, pulled from the structured log into slow.log
sudo grep -i "slow query" /var/log/mongodb/mongod.log | jq -c 'select(.attr.durationMillis > 800)' > slow.log
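One related knob, offered as an assumption about your setup rather than something pulled from the log: the grep above only finds what mongod already writes, so if you want the slow-op log threshold to line up with that 800 ms filter, you can inspect and adjust it from mongosh (these connect to the default local instance).
# show the current profiling level and slow-op threshold
mongosh --quiet --eval 'printjson(db.getProfilingStatus())'
# log (without profiling) operations slower than 800 ms
mongosh --quiet --eval 'db.setProfilingLevel(0, { slowms: 800 })'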
If your site handles real-time rankings and dynamic pages like mine, a 2-core box can survive — but it won’t thrive. The more traffic (bot or not), the harder it gets to avoid dropped connections, long delays, and random SSR failures.
For me, this test showed it’s time to move up — or keep scraping by with every tweak possible.
UPDATE: 09/04/2025
When all the numbers seem fine, try this:
Yep — that’s a common quick fix.
When a Node app (like your next-dna) gets into a “stuck” state (memory leak, hanging DB query, event loop blocked, etc.), doing:
pm2 restart <process-name-or-id>
will kill the existing process and spawn a fresh one.
A few clarifications that might help you:
pm2 restart vs pm2 reload
restart: full stop/start → clears memory, resets connections, resets workers. Useful when things are “stuck.”
reload: zero-downtime hot reload (round-robin replacement of workers if in cluster mode). Good for deploying code changes, but it won’t always fix stuck states if the issue persists across workers.
Cluster mode
With multiple instances, pm2 restart restarts all of them at once (brief downtime).
pm2 reload replaces them one at a time (no downtime).
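A concrete sketch of the difference. This assumes the app runs through a Node entry file (a hypothetical server.js; Next.js needs a custom server for pm2 cluster mode) and reuses the next-dna name:
# two workers in cluster mode (entry file is an assumption)
pm2 start server.js --name next-dna -i 2
# stop/start every worker at once: brief downtime, but clears a stuck state
pm2 restart next-dna
# replace workers one at a time: zero downtime, good for deploys
pm2 reload next-dna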
Why restart “fixes” it
You’re effectively killing the bad process tree, flushing memory leaks, dangling handles, or zombie requests. If it happens often, it’s a symptom (memory leak, unclosed DB/socket, blocking CPU loop) — restart just hides the root cause.
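One cheap way to tell a real leak from a one-off: log memory and restart counts over time instead of only restarting. The jq paths below assume pm2's jlist JSON layout (monit.memory in bytes, pm2_env.restart_time per process); double-check them against your pm2 version.
# quick glance at memory / CPU / restart counts
pm2 status
# machine-readable snapshot; run periodically (cron or watch) and compare
pm2 jlist | jq -r '.[] | "\(.name)  \(.monit.memory) bytes  restarts: \(.pm2_env.restart_time)"'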
When to automate
You can configure pm2 restart <process> --max-memory-restart 500M to auto-restart when memory exceeds a threshold (see the sketch below).
Or use health checks (pm2-runtime, uptime monitoring, or a separate watchdog).
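A sketch of setting the cap at start time instead, so it survives future restarts; the entry file and process name are assumptions again.
# restart automatically once the process grows past 500 MB
pm2 start server.js --name next-dna --max-memory-restart 500M
# persist the process list (and this setting) for pm2 resurrect / reboots
pm2 save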