Performance & Security Testing
How to run load tests with k6 and a methodology-driven penetration test against a Grit app. Both produce evidence — a before/after latency graph, a list of closed vulnerabilities — that's exactly what serious clients pay a premium for.
Performance testing with k6
Every fresh Grit project ships a complete k6 suite in tests/k6/ covering the six load-test types: smoke, average-load, stress, spike, soak, breakpoint. They share a single user journey via tests/k6/lib/common.js — edit the journey once, reshape the load profile per test.
Install k6
```bash
# macOS
brew install k6

# Linux (Debian/Ubuntu)
sudo gpg -k && sudo gpg --no-default-keyring \
  --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
  --keyserver hkp://keyserver.ubuntu.com:80 \
  --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" \
  | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update && sudo apt-get install k6

# Windows
winget install k6
```
The six test types — when to run each
| Type | Question it answers | When |
|---|---|---|
| smoke.js | Script + system handle minimal load? | Every PR (CI gate) |
| average-load.js | Normal expected traffic behaviour? | Every PR (CI gate) |
| stress.js | Failure mode at the limit? | Before launch |
| spike.js | Survive a sudden surge + recover? | Before launch |
| soak.js | Memory leaks / resource creep over hours? | Before major releases |
| breakpoint.js | Exact capacity — VU count at failure? | Capacity planning |
Run a test
```bash
# 1. Start the API (in another terminal)
go run .                          # single-app
# or: cd apps/api && go run cmd/server/main.go

# 2. Run any test
export BASE_URL=http://localhost:8080
k6 run tests/k6/smoke.js          # 30s: does it work?
k6 run tests/k6/average-load.js   # 9m: baseline at 100 VUs
k6 run tests/k6/stress.js         # 9m: 4× load
k6 run tests/k6/spike.js          # 4m: 50 → 1000 → 50
k6 run tests/k6/breakpoint.js     # 1h: slow ramp to 5000
k6 run tests/k6/soak.js           # 4h: overnight
```
Wire smoke + average-load into CI
k6 exits non-zero when a threshold breaches — that's how it gates the pipeline. Add a job to your workflow:
```yaml
name: perf
on: [pull_request]
jobs:
  k6:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build and start API
        run: |
          go build -o /tmp/server .
          /tmp/server &
          sleep 5
      - uses: grafana/setup-k6-action@v1
      - name: Smoke + average-load
        env:
          BASE_URL: http://localhost:8080
        run: |
          k6 run tests/k6/smoke.js
          k6 run tests/k6/average-load.js
```
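The gate only works if the scripts declare thresholds. A minimal sketch of such a declaration in k6, using the default SLO quoted in this guide. Where Grit's generated `lib/common.js` actually defines these values is an assumption, and the real file may be organised differently:

```javascript
// Sketch of k6 thresholds matching the default SLO (p95 <= 500 ms, errors <= 1%).
// Assumption: the real values live in tests/k6/lib/common.js and may be shaped differently.
export const options = {
  thresholds: {
    http_req_duration: ["p(95)<=500"], // built-in metric, in milliseconds
    http_req_failed: ["rate<=0.01"],   // built-in metric, fraction of failed requests
  },
};
```

If any listed expression is false at the end of the run, `k6 run` exits non-zero and the CI step fails.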
Reading the result
Open /pulse/ui/ in another tab during the run — Pulse shows you live request traces and DB query timings so you can see the bottleneck appear in real time. Look for:
- Smoke: must always pass. Failure = the script is broken, not the system.
- Average-load: p95 ≤ 500 ms and error-rate ≤ 1% is the default SLO. Loosen / tighten in lib/common.js.
- Stress: graceful degradation (latency climbs, errors stay low) vs collapse (error spike). The former is fine; the latter needs work.
- Spike: the recovery curve is the answer. p95 must return to baseline within ~1 min after the surge.
- Soak: a slowly tilting latency line over 4h = memory leak or unclosed connections. Check Pulse's runtime metrics for steady memory growth.
- Breakpoint: the VU count at which thresholds first breach is your real capacity. Plan launches at 50% of this.
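The SLO check and the 50% capacity rule above reduce to a little arithmetic. A minimal sketch, assuming latency samples in milliseconds and a nearest-rank percentile. k6 computes percentiles itself and may interpolate, so its numbers can differ slightly. All function names here are illustrative:

```javascript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Default SLO from this guide: p95 <= 500 ms and error rate <= 1%.
function meetsSlo(latenciesMs, failedRequests, totalRequests) {
  const errorRate = failedRequests / totalRequests;
  return percentile(latenciesMs, 95) <= 500 && errorRate <= 0.01;
}

// Launch budget: half of the VU count at which thresholds first breached.
function launchCapacity(breakpointVus) {
  return Math.floor(breakpointVus * 0.5);
}
```

For example, if the breakpoint test first breaches at 3000 VUs, `launchCapacity(3000)` gives a planning figure of 1500 VUs.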
Security testing — the pentest methodology
Only test what you're authorised to test. Techniques are identical for attack and defence; the line between "security professional" and "criminal" is the signed scope. Practise on your own apps, OWASP Juice Shop, PortSwigger labs, HackTheBox. Anything else is a crime in most countries, regardless of intent.
The five-phase methodology
Follow these in order, every time. The methodology (not the tools) is what makes a pentest thorough and repeatable.
- Scope & authorisation — signed scope + rules of engagement before anything technical.
- Recon & mapping — passive (OSINT, Shodan) + active (Nmap, Burp + ffuf).
- Vulnerability discovery — automated scan + manual testing of logic & access control.
- Exploitation — prove impact with the minimum necessary; chain low-sev findings into bigger ones.
- Reporting & remediation — CVSS-scored report with reproduction steps + fixes.
The toolkit
| Tool | For |
|---|---|
| Burp Suite (Community) | Intercepting proxy — see / replay / modify every request. The center of any web pentest. |
| Nmap | Port + service scanning — what's exposed. |
| ffuf / Gobuster | Content discovery — brute-force hidden directories & endpoints. |
| sqlmap | Automate SQL-injection detection + exploitation safely. |
| Nuclei | Template-based vuln scanning against a huge community library. |
| govulncheck + pnpm audit | Supply chain — already wired into .github/workflows/security.yml. |
Running a pentest against a Grit app — what to test
Map each OWASP Top 10:2025 category to a concrete test against the generated API. Cross-reference defences on the Security Guide page.
Broken Access Control / IDOR (A01)
```bash
# Login as user A
TOKEN_A=$(curl -s -X POST http://localhost:8080/api/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"email":"a@example.com","password":"password"}' | jq -r '.data.tokens.access_token')

# Create an invoice (JSON body, so send the JSON content type)
INV_ID=$(curl -s -X POST http://localhost:8080/api/invoices \
  -H "Authorization: Bearer $TOKEN_A" \
  -H 'Content-Type: application/json' \
  -d '{"amount":100}' | jq -r '.data.id')

# Now login as user B and try to read user A's invoice
TOKEN_B=$(curl -s -X POST http://localhost:8080/api/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"email":"b@example.com","password":"password"}' | jq -r '.data.tokens.access_token')

curl -s http://localhost:8080/api/invoices/$INV_ID \
  -H "Authorization: Bearer $TOKEN_B"
# Expected: 404 (NOT 403, NOT 200). authz.MustOwn returns 404 so existence
# of A's invoice doesn't leak to B.
```
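The reason 404 is the expected answer is easier to see in code. The sketch below is a language-neutral illustration of the idea, not the implementation of `authz.MustOwn`. The in-memory store and the `getInvoice` helper are hypothetical:

```javascript
// Hypothetical in-memory store standing in for the database.
const invoices = new Map([
  ["inv-1", { id: "inv-1", ownerId: "user-a", amount: 100 }],
]);

// Missing and not-owned records take the same branch, so the response
// gives no hint about which invoice IDs exist.
function getInvoice(id, requesterId) {
  const invoice = invoices.get(id);
  if (!invoice || invoice.ownerId !== requesterId) {
    return { status: 404, body: { error: "not found" } };
  }
  return { status: 200, body: invoice };
}
```

A 403 for someone else's invoice would confirm that the ID is valid, which is exactly the signal an attacker enumerating IDs is looking for.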
SQL Injection (A05)
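```bash
# Try classic payloads on any parameter that hits the DB.
# -G with --data-urlencode sends the payload as a URL-encoded query string
# (raw spaces and quotes in a URL are rejected or mangled by curl).
curl -G http://localhost:8080/api/users --data-urlencode "email=' OR '1'='1"
curl -G http://localhost:8080/api/users --data-urlencode "email=admin' --"
# Expected: 200 with a normal (empty) response. GORM parameterises so the
# input is interpreted as data, never SQL.

# Time-based blind probe; -w prints the total request time
curl -s -o /dev/null -w '%{time_total}s\n' \
  -G http://localhost:8080/api/users --data-urlencode "email=' OR pg_sleep(5)--"
# Expected: response in <100ms (no delay). The DB never sees the payload as SQL.
```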
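Why the payloads above are harmless against a parameterised query can be shown without a database. This is an illustration of the principle only, not GORM's code. The `$1` placeholder assumes PostgreSQL-style syntax, and both helper names are hypothetical:

```javascript
// Unsafe: the input is spliced into the SQL text, so quotes in it change the query.
function concatenated(email) {
  return { text: `SELECT * FROM users WHERE email = '${email}'`, values: [] };
}

// Safe: the SQL text is fixed; the input travels separately as a bound value.
function parameterised(email) {
  return { text: "SELECT * FROM users WHERE email = $1", values: [email] };
}

const payload = "' OR '1'='1";
console.log(concatenated(payload).text);
// SELECT * FROM users WHERE email = '' OR '1'='1'
console.log(parameterised(payload).text);
// SELECT * FROM users WHERE email = $1
```

In the second form the database receives the query shape and the value as separate items, so the payload can only ever be compared against the `email` column.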
XSS (A05)
```bash
# Try stored XSS via a field that's later rendered
curl -X POST http://localhost:8080/api/blogs \
  -H "Authorization: Bearer $TOKEN_ADMIN" \
  -H 'Content-Type: application/json' \
  -d '{"title":"<script>alert(1)</script>","content":"x"}'
# View the blog in the SPA. React escapes by default → the script tag
# renders as text. The CSP header blocks inline script as a 2nd layer.

# Check the response headers:
curl -I http://localhost:8080/ | grep -i content-security
# Expected: Content-Security-Policy: default-src 'self'; script-src 'self'; ...
```
SSRF (A01 — 2025)
```bash
# Try to make the server fetch the AWS metadata endpoint via any feature
# that fetches user-provided URLs (webhook delivery, image-from-URL, etc).
curl -X POST http://localhost:8080/api/webhooks/dispatch \
  -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"url":"http://169.254.169.254/latest/meta-data/iam/security-credentials/"}'
# Expected: 400 with "URL not allowed". internal/safefetch blocks the
# request at validation AND re-blocks at TCP-connect time if DNS rebinds.
```
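The kind of address check that sits behind such a block can be sketched as follows. This is not the code of `internal/safefetch`; the function names are hypothetical, and only IPv4 is covered. A real guard also has to handle IPv6 and must run against the resolved address at connect time, as noted above:

```javascript
// Parse a dotted-quad IPv4 string into four octets, or return null.
function parseIPv4(ip) {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => o >= 0 && o <= 255) ? octets : null;
}

// True for addresses a server-side fetch should refuse.
function isBlockedIPv4(ip) {
  const octets = parseIPv4(ip);
  if (!octets) return true; // fail closed on anything unparseable
  const [a, b] = octets;
  return (
    a === 0 ||                            // 0.0.0.0/8 "this network"
    a === 10 ||                           // 10.0.0.0/8 private
    a === 127 ||                          // 127.0.0.0/8 loopback
    (a === 169 && b === 254) ||           // 169.254.0.0/16 link-local (cloud metadata)
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12 private
    (a === 192 && b === 168)              // 192.168.0.0/16 private
  );
}
```

Checking only the hostname in the submitted URL is not enough: a name can resolve to a public address during validation and to 169.254.169.254 at connect time, which is why the second check matters.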
Authentication brute force (A07)
```bash
# Hammer the login endpoint
for i in $(seq 1 20); do
  curl -s -X POST http://localhost:8080/api/auth/login \
    -H 'Content-Type: application/json' \
    -d '{"email":"victim@example.com","password":"guess'$i'"}'
done
# Expected after ~5 attempts: 429 "Rate limited" (Sentinel's per-route limit).
# After more attempts at the same email: the account locks (AuthShield).
```
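The 429 behaviour the loop looks for can be modelled with a fixed-window counter. This is a sketch of one common limiting strategy, not Sentinel's actual algorithm. The limit of 5, the window length, and the class name are assumptions for illustration:

```javascript
// Fixed-window counter: at most `limit` attempts per key per window.
class FixedWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windows = new Map(); // key -> { start, count }
  }

  // Returns true if the attempt may proceed, false if it should get a 429.
  allow(key, nowMs = Date.now()) {
    const current = this.windows.get(key);
    if (!current || nowMs - current.start >= this.windowMs) {
      // First attempt for this key, or the previous window has expired.
      this.windows.set(key, { start: nowMs, count: 1 });
      return true;
    }
    current.count += 1;
    return current.count <= this.limit;
  }
}
```

Keying the limiter by client IP throttles a single source, while keying by target email protects one account from distributed guessing. The two expectations in the test above (rate limit, then lockout) correspond to those two views.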
Misconfiguration / verbose errors (A02)
```bash
# Probe for verbose error pages
curl http://localhost:8080/api/this-route-does-not-exist
curl http://localhost:8080/api/users/invalid-uuid
# Expected: generic error JSON, no stack trace, no DB driver names.

# Also check the security headers are set:
curl -I http://localhost:8080/api/health | grep -iE 'x-frame|x-content|content-security|strict-transport|referrer'
```
The audit report — what to deliver
A polished, CVSS-scored, evidence-backed report is what justifies the fee. Structure it for two readers — an executive who needs the bottom line on page 1, and an engineer who needs enough detail to reproduce each finding.
- Executive summary (1 page) — overall risk posture, finding counts by severity, top 3 business risks.
- Scope & methodology — what was tested, what wasn't, dates, approach (e.g. "authorised black-box web pentest per OWASP WSTG").
- Findings — one entry per vuln, sorted critical-first. Each finding needs: title, CVSS score + severity, affected component, plain-English risk description, reproduction steps, evidence (screenshots, request/response), and a specific fix.
- Remediation roadmap — prioritised list with SLAs that match the CVSS table (Critical: 7 days, High: 7 days, Medium: 30 days, Low: 90 days at the latest).
- Appendices — raw tool output, full logs.
CVSS scoring (FIRST.org)
| Score | Severity | SLA |
|---|---|---|
| 9.0–10.0 | Critical | 24h – 7 days |
| 7.0–8.9 | High | within 7 days |
| 4.0–6.9 | Medium | 14–30 days |
| 0.1–3.9 | Low | 60–90 days |
Adjust the raw CVSS by business context — a "Critical" on an air-gapped internal tool may be a real-world Low; a "Medium" on a public payment endpoint may be a real-world Critical. Document the adjustment and the reasoning. That documented judgment is the senior-level deliverable.
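The score-to-severity mapping and the documented adjustment can be kept together in the findings data. A minimal sketch, using the qualitative bands from the table above plus the specification's "None" rating for 0.0. The `adjustFinding` helper and its field names are hypothetical:

```javascript
// CVSS qualitative severity bands.
function severityFor(score) {
  if (!(score >= 0 && score <= 10)) {
    throw new RangeError("CVSS scores range from 0.0 to 10.0");
  }
  if (score === 0) return "None";
  if (score < 4.0) return "Low";
  if (score < 7.0) return "Medium";
  if (score < 9.0) return "High";
  return "Critical";
}

// Record a context adjustment next to the raw rating, never instead of it.
function adjustFinding(finding, adjustedSeverity, reason) {
  if (!reason || !reason.trim()) {
    throw new Error("an adjustment needs a documented reason");
  }
  return {
    ...finding,
    rawSeverity: severityFor(finding.cvss),
    adjustedSeverity,
    reason,
  };
}
```

Keeping both `rawSeverity` and `adjustedSeverity` in the report lets a reader see the standard score and the judgment applied to it.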
Continuous evidence — between pentests
A pentest is a snapshot. Between tests, three things keep the system defensible and prove it:
- Audit trails: Grit's middleware.LogSecurityEvent + the activity-log hash chain provide tamper-evident records of every authN/authZ event.
- Continuous scanning: .github/workflows/security.yml runs govulncheck + pnpm audit + CodeQL on every PR and weekly. Dependabot raises PRs the moment a CVE drops.
- Remediation tracking: every finding flows from discovery → ticket (severity + owner + SLA) → fix → re-test. That documented loop is what SOC 2 / ISO 27001 auditors ask to see.
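What makes a hash chain tamper-evident can be shown in a few lines. This is a generic sketch of the technique, not Grit's activity-log format. The entry shape, the all-zero genesis value, and the function names are assumptions:

```javascript
const { createHash } = require("node:crypto");

const GENESIS = "0".repeat(64); // assumed starting value for the first entry

// Each hash covers the previous hash plus the serialised event.
function entryHash(prevHash, event) {
  return createHash("sha256")
    .update(prevHash)
    .update(JSON.stringify(event))
    .digest("hex");
}

function append(chain, event) {
  const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS;
  chain.push({ event, prevHash, hash: entryHash(prevHash, event) });
  return chain;
}

// Recompute every link; any edited, removed, or reordered entry breaks the chain.
function verify(chain) {
  let prev = GENESIS;
  for (const entry of chain) {
    if (entry.prevHash !== prev || entry.hash !== entryHash(prev, entry.event)) {
      return false;
    }
    prev = entry.hash;
  }
  return true;
}
```

Because each entry's hash feeds into the next, changing one historical event invalidates every later link. Note that `JSON.stringify` depends on key order, so a production log would need a canonical serialisation.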
Resources
- PortSwigger Web Security Academy — the best free pentest labs anywhere.
- OWASP Juice Shop — the deliberately vulnerable practice app.
- OWASP Top 10:2025 — the canonical risk map.
- OWASP WSTG — 90+ web-app test cases mapped to the Top 10.
- k6 docs — load-testing reference.
- FIRST CVSS — official scoring spec + calculator.
