web3.path

PHASE 15 Prod · ~5 hours

Production Engineering

The bridge from "works on Hardhat" to "survives a Monday morning with 50k users and a flaky RPC". This phase is pure DevOps with Web3 quirks.

Goal — ship a full-stack dApp: verified contracts, monitored indexer, HA RPC, CI/CD pipeline, pager runbooks.

1. The deploy checklist

2. RPC strategy — your biggest single dependency

| Provider | Sweet spot |
| --- | --- |
| Alchemy | Most features (debug, trace, webhooks, NFT APIs) |
| Infura | Battle-tested; part of the Consensys stack |
| QuickNode | Wide chain support |
| Ankr / public endpoints | Dev/backup only |
| Self-hosted (Erigon / Geth / Reth) | Cheaper at scale; no rate limits |
Always run ≥2 providers behind a load balancer, and fail over on 5xx responses or rate limits. Ethers v6 ships FallbackProvider; alternatively, use a routing service (e.g., dRPC).
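The failover strategy above can be sketched as a small wrapper. This is not the ethers.js API — `RpcCall` and `withFailover` are illustrative names — it just shows the priority-ordered retry logic a fallback provider implements internally:

```typescript
// Minimal failover sketch: try each RPC call in priority order,
// moving on to the next provider when one throws.
type RpcCall<T> = () => T;

function withFailover<T>(calls: RpcCall<T>[]): T {
  let lastError: unknown;
  for (const call of calls) {
    try {
      return call(); // first healthy provider wins
    } catch (err) {
      lastError = err; // 5xx / rate limit → try the next provider
    }
  }
  throw lastError; // every provider failed: surface the last error
}
```

In production with ethers v6 you would instead pass multiple `JsonRpcProvider` instances to `FallbackProvider`, which adds quorum and stall-timeout handling on top of this basic pattern.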

3. Key management

// Gnosis Safe + Timelock upgrade flow
Safe → schedule(timelock, contract.upgradeTo(newImpl))
        │ 48h pass, users can exit
        ▼
Safe → execute(timelock, contract.upgradeTo(newImpl))
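The schedule/wait/execute flow above can be modeled in a few lines. This is a toy in-memory model — a real deployment uses OpenZeppelin's TimelockController behind a Gnosis Safe — with the clock injected so the 48 h delay is testable:

```typescript
// Toy timelock: schedule() records an operation with an earliest
// execution time (eta); execute() refuses to run before it.
const DELAY_SECONDS = 48 * 60 * 60; // the 48 h exit window

interface Operation {
  callData: string;
  eta: number; // earliest execution timestamp, seconds
}

class Timelock {
  private queue = new Map<string, Operation>();

  schedule(id: string, callData: string, now: number): number {
    const eta = now + DELAY_SECONDS;
    this.queue.set(id, { callData, eta });
    return eta; // users can exit until this timestamp
  }

  execute(id: string, now: number): string {
    const op = this.queue.get(id);
    if (!op) throw new Error("unknown operation");
    if (now < op.eta) throw new Error("timelock: too early");
    this.queue.delete(id);
    return op.callData; // would be forwarded to the proxy here
  }
}
```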

4. Upgradeability — only if you need it

Gotcha — upgradeable contracts have storage-layout constraints: reordering, removing, or retyping a state variable remaps existing storage, so a v2 may only append new variables. Use OpenZeppelin's hardhat-upgrades, which validates the layout before upgrading.
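To see why order matters: Solidity assigns storage slots in declaration order. The sketch below mimics that assignment (one word per variable, packing ignored) to show how a reorder silently remaps data already written by v1 — the names and layouts are illustrative:

```typescript
// Assign storage slots in declaration order, as Solidity does
// (simplified: every variable gets its own 32-byte slot).
function storageLayout(variables: string[]): Map<string, number> {
  const layout = new Map<string, number>();
  variables.forEach((name, slot) => layout.set(name, slot));
  return layout;
}

const v1 = storageLayout(["owner", "totalSupply"]);
// BAD upgrade: reordering — "owner" now reads slot 1, which holds
// the value v1 wrote as totalSupply.
const v2Bad = storageLayout(["totalSupply", "owner"]);
// GOOD upgrade: append-only — old slots keep their meaning.
const v2Good = storageLayout(["owner", "totalSupply", "paused"]);
```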

5. CI/CD

# .github/workflows/contracts.yml
name: contracts
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      - run: npm ci
      - run: npx hardhat test
      - run: npx hardhat coverage
      - name: Slither
        uses: crytic/slither-action@v0.4.0
        with: { fail-on: medium }
      - name: Gas report
        run: REPORT_GAS=true npx hardhat test

Ship a deploy workflow gated on manual approval and a version tag; have it write deployments/<chain>.json back to the repo as a PR.
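The deployments/<chain>.json record that workflow commits can be as simple as an append-only array. The field names below are a suggested shape, not a standard:

```typescript
// Append a deployment record to deployments/<chain>.json so every
// address, tx hash, and commit is tracked in the repo.
import { writeFileSync, readFileSync, existsSync } from "node:fs";

interface Deployment {
  contract: string;
  address: string;
  txHash: string;
  commit: string;
  deployedAt: string; // ISO timestamp
}

function recordDeployment(file: string, entry: Deployment): Deployment[] {
  const existing: Deployment[] = existsSync(file)
    ? JSON.parse(readFileSync(file, "utf8"))
    : [];
  existing.push(entry);
  writeFileSync(file, JSON.stringify(existing, null, 2) + "\n");
  return existing;
}
```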

6. Observability

| Signal | Why | Tool |
| --- | --- | --- |
| RPC latency & error rate | Detects provider issues | Prometheus + Grafana |
| Indexer lag (head − cursor) | Alerts on stuck ingest | Custom metric |
| Stuck tx in mempool | Relayer nonce jam | Tenderly / custom |
| Contract balances | Treasury drift, drains | Forta / custom |
| Event-rate anomalies | Attack detection | OpenZeppelin Defender Sentinels |
| Gas spikes | Execution budget awareness | Blocknative / EthGasStation |
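The indexer-lag signal is worth making concrete: lag is the chain head minus the indexer's cursor, and the alert fires when it exceeds a block budget. The threshold is illustrative — tune it to your chain's block time:

```typescript
// Indexer lag metric: head - cursor, with an alert threshold.
interface LagStatus {
  lagBlocks: number;
  alert: boolean;
}

function indexerLag(
  headBlock: number,    // latest block from the RPC
  cursorBlock: number,  // last block the indexer has ingested
  maxLagBlocks = 50,    // illustrative budget (~10 min on mainnet)
): LagStatus {
  const lagBlocks = headBlock - cursorBlock;
  return { lagBlocks, alert: lagBlocks > maxLagBlocks };
}
```

Export `lagBlocks` as a gauge to Prometheus and alert on the threshold there, so the page fires even if the indexer process itself is wedged.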

7. Docker + infra shape

              [ CDN / Cloudflare ]
                       │
          ┌────────────┼─────────────┐
          ▼            ▼             ▼
         UI           API     WebSocket gateway
      (static)      (Node)    (Node, ethers WSS)
                       │
                       ▼
  Postgres (managed, RDS/Neon)    Redis (cache, queues)
                       │
                       ▼
  Indexer workers (Node/Go) ── RPC LB ── Alchemy / Infura / self-hosted geth
                       │
                       ▼
  Relayer (KMS-signed) → tx submission

For 10–50k DAU, a single VPS + managed DB + Alchemy free tier is enough. Scale by adding worker replicas and a pull-oracle pattern for price fetch.

8. Deploying the frontend

9. Monitoring with OpenZeppelin Defender / Tenderly

10. Incident response

11. Project

Capstone — integrate everything: Phase 6 contracts, Phase 7 UI, Phase 8 indexer. Deploy to Base Sepolia. Wrap ownership in a Safe multisig. Add Grafana dashboards for RPC + indexer lag. Write a README that a fresh engineer can use to redeploy from scratch in < 30 minutes.

Quiz

Q. Your indexer silently fell behind by 4 hours because the RPC returned a 200 with an empty logs: [] array. How do you detect this kind of failure?
"Empty result" is indistinguishable from "nothing happened". Always compute lag against the current head and run a shadow check against a second provider.
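The shadow check can be reduced to comparing the same query against two independent providers. A sketch with the fetch results passed in (names and rules are illustrative; in practice you'd compare `eth_getLogs` counts for the same block range):

```typescript
// Flag a silently-lying RPC: an empty result while the peer sees
// events, or any disagreement between providers, is suspicious.
function shadowCheck(
  primaryLogCount: number,
  secondaryLogCount: number,
): { suspicious: boolean; reason?: string } {
  if (primaryLogCount === 0 && secondaryLogCount > 0) {
    return { suspicious: true, reason: "primary returned empty logs" };
  }
  if (primaryLogCount !== secondaryLogCount) {
    return { suspicious: true, reason: "providers disagree" };
  }
  return { suspicious: false };
}
```

Run this on a sample of block ranges rather than every request — the goal is to catch a systematically bad provider, not to double your RPC bill.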