GCP Infrastructure Plan

Beaver at 100K Users

Family wealth management platform on Google Cloud. Cloud Run, Alpaca Broker API, PostgreSQL, event-driven architecture.

100K Users
6 Cloud Run Services
~$1.4K GCP / month
10-15 Engineers
$500M+ AUM
Diagram legend: Clients · Edge / Security · Cloud Run Services · Pub/Sub · Data Stores · External Partners · Dead Letters / Alerts · Platform Services
Slide 2 of 11

Cloud Run vs Kubernetes

Why NOT start with Kubernetes, and why migrating later is cheap: about 2 weeks with the same image.

xychart-beta
    title "Monthly Cost: Cloud Run vs GKE Autopilot"
    x-axis ["MVP 0-1K", "Growth 1-10K", "Scale 10-100K", "100K+"]
    y-axis "USD / month" 0 --> 3500
    bar [14, 120, 1400, 1400]
    bar [274, 600, 2200, 2500]
    line [14, 120, 1400, 1400]
    line [274, 600, 2200, 2500]
                
■ Cloud Run ■ GKE Autopilot
Factor | Cloud Run | GKE Autopilot
Management fee | $0 | $74/mo fixed
Compute MVP | ~$0 (free tier) | ~$200/mo min
Compute 100K users | ~$450/mo | ~$800-1,200/mo
Setup time | 2 hours | 1-2 weeks
Scale to zero | Yes | No
Platform engineers | 0 needed | 1-2 ($30K/mo)
Deploy command | gcloud run deploy | helm upgrade + kubectl
Blue/green deploys | Automatic | Argo Rollouts config
YAML files | 0 | 20-30 per service

Advantages of Starting with Cloud Run

  • Speed: Code to production in 1 command, not 1 week of YAML
  • $0 cost at MVP: Free tier covers ~2M req/month. K8s charges from day 1
  • Zero ops: No clusters to upgrade, no node pools, no RBAC config
  • Same Docker image: When you migrate to GKE, the container doesn't change; only the deploy config does
  • Focus on product: Every hour on infra is an hour less on features for users

When to Migrate to Kubernetes

  • 15+ microservices — need service mesh for mTLS between all of them
  • GPU workloads — ML fraud detection, recommendation engine
  • Custom autoscaling — scale workers by Pub/Sub queue depth
  • 20+ engineers — justifies a dedicated platform team
  • Migration: ~2 weeks, zero downtime, same image + same data layer

Cumulative Savings to 100K Users

  • Year 1 (MVP): Cloud Run $168 vs K8s $3,288 — save $3,120
  • Year 2 (Growth): CR $1,440 vs K8s $7,200 — save $5,760
  • Year 3 (Scale): CR $16,800 vs K8s $26,400 — save $9,600
  • Total 3 years: ~$18,480 saved on infra
  • Not counting the $30K/mo platform engineers K8s requires
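The savings figures above follow from the chart's monthly estimates multiplied out over 12 months. A minimal sketch of that arithmetic, assuming each year is spent entirely in one phase (the `phase` type and function names are illustrative, not part of the plan):

```go
package main

import "fmt"

// phase holds the deck's monthly infrastructure estimates for one year.
type phase struct {
	name     string
	cloudRun int // USD per month on Cloud Run
	gke      int // USD per month on GKE Autopilot
}

// annualSavings returns the 12-month cost difference for one phase.
func annualSavings(p phase) int {
	return (p.gke - p.cloudRun) * 12
}

// totalSavings sums the annual savings over all phases.
func totalSavings(phases []phase) int {
	total := 0
	for _, p := range phases {
		total += annualSavings(p)
	}
	return total
}

// plan uses the monthly figures from the cost chart on this slide.
var plan = []phase{
	{name: "Year 1 (MVP)", cloudRun: 14, gke: 274},
	{name: "Year 2 (Growth)", cloudRun: 120, gke: 600},
	{name: "Year 3 (Scale)", cloudRun: 1400, gke: 2200},
}

func main() {
	for _, p := range plan {
		fmt.Printf("%-16s save $%d\n", p.name, annualSavings(p))
	}
	fmt.Printf("Total 3 years:   save $%d\n", totalSavings(plan))
}
```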
Slide 3 of 11

MVP Essentials — 0 to 10K Users

What you MUST have and what you DON'T. As a solo dev, every dollar and every hour counts.

✓ Required to Launch

Cloud Run $0-5/mo

Your API runs here. Free tier covers up to ~2M requests/month. One service is enough up to 10K users.

Cloud SQL PostgreSQL $8-50/mo

MVP: db-f1-micro ($8). Growth: db-custom-1-3840 HA ($50). Your data lives here.

Alpaca Broker API $0

UGMA/UTMA accounts, KYC, ACH, trading. They are your broker-dealer. Revenue share model.

Secret Manager $0.06/mo

Alpaca keys, JWT keys, DB password. Never hardcode secrets.
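Cloud Run can expose Secret Manager secrets to a service as environment variables, so the code never needs the values at build time. A minimal sketch of fail-fast loading at startup; the variable names (`ALPACA_API_KEY`, `JWT_SIGNING_KEY`, `DB_PASSWORD`) and the `Config` type are hypothetical:

```go
package main

import (
	"fmt"
	"os"
)

// Config holds the three secrets named on this slide.
// Field and variable names are illustrative assumptions.
type Config struct {
	AlpacaKey  string
	JWTKey     string
	DBPassword string
}

// loadConfig reads every secret through lookup and reports all missing
// names at once, so a misconfigured deploy fails at startup.
func loadConfig(lookup func(string) (string, bool)) (Config, error) {
	var missing []string
	get := func(name string) string {
		v, ok := lookup(name)
		if !ok || v == "" {
			missing = append(missing, name)
		}
		return v
	}
	cfg := Config{
		AlpacaKey:  get("ALPACA_API_KEY"),
		JWTKey:     get("JWT_SIGNING_KEY"),
		DBPassword: get("DB_PASSWORD"),
	}
	if len(missing) > 0 {
		return Config{}, fmt.Errorf("missing secrets: %v", missing)
	}
	return cfg, nil
}

func main() {
	if _, err := loadConfig(os.LookupEnv); err != nil {
		fmt.Println("startup aborted:", err)
		return
	}
	fmt.Println("all secrets present")
}
```

Passing the lookup function in keeps the loader testable without touching the real environment.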

GitHub Actions CI/CD $0

Test → Build → Deploy automatically on every push. CI/CD from day 1.

Artifact Registry $0.10/mo

Docker images. Required for Cloud Run.

Apple Developer Account $99/year

TestFlight + App Store. Required for iOS distribution.

Compliance Attorney $5K-10K one-time

Reviews terms, privacy policy, fintech disclosures. Do not launch without this.

✗ NOT Needed Until 10K+ Users

Kubernetes (GKE) SAVE $274+/mo

Cloud Run does everything you need. 1 service is enough up to 10K.

Memorystore Redis SAVE $15-35/mo

Sessions in PostgreSQL. Rate limiting in memory. Redis is an optimization, not a requirement.

VPC Connector SAVE $7/mo

Only needed for Redis private IP. No Redis, no connector.

Cloud Armor WAF SAVE $15+/mo

Rate limiting in app middleware. WAF is for when you have traffic worth attacking.
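One way app-level rate limiting could look is a fixed-window counter per client held in process memory. This is a sketch under stated assumptions: the `limiter` type is hypothetical, counts are per instance (not shared across Cloud Run instances), and behind a load balancer the client IP would have to come from a forwarded header rather than `RemoteAddr`.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"sync"
	"time"
)

// limiter is a fixed-window request counter per key, kept in memory.
type limiter struct {
	mu     sync.Mutex
	limit  int
	window time.Duration
	start  time.Time
	counts map[string]int
}

func newLimiter(limit int, window time.Duration) *limiter {
	return &limiter{limit: limit, window: window, counts: map[string]int{}}
}

// allow reports whether key may make one more request at time now.
func (l *limiter) allow(key string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if now.Sub(l.start) >= l.window {
		l.counts = map[string]int{} // new window: forget old counts
		l.start = now
	}
	if l.counts[key] >= l.limit {
		return false
	}
	l.counts[key]++
	return true
}

// middleware rejects over-limit callers with 429 Too Many Requests.
func (l *limiter) middleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		host, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			host = r.RemoteAddr
		}
		if !l.allow(host, time.Now()) {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	l := newLimiter(100, time.Minute) // illustrative budget: 100 req/min per client
	mux := http.NewServeMux()
	mux.HandleFunc("/auth/login", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	handler := l.middleware(mux)
	_ = handler // a real service would pass this to http.ListenAndServe
	fmt.Println(l.allow("203.0.113.7", time.Now()))
}
```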

Cloud KMS (CMEK) SAVE $1/mo

AES-256-GCM with key in Secret Manager is sufficient. CMEK is for SOC2 audits.

Terraform SAVE TIME

Direct gcloud CLI up to 10K. IaC when the team grows and config drift matters.

Pub/Sub SAVE $15/mo

In-process EventBus (Go channels) until you extract services. Same interface.
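A minimal sketch of what "same interface" could mean: services depend on a small `EventBus` interface, backed by Go channels now and by a Pub/Sub adapter later. The `Event`, `EventBus`, and `ChanBus` names are assumptions for illustration, and the in-process version offers no persistence or redelivery.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical envelope; topic names mirror the deck's topics.
type Event struct {
	Topic   string
	Payload []byte
}

// EventBus is the seam that a Pub/Sub-backed implementation could satisfy later.
type EventBus interface {
	Publish(e Event)
	Subscribe(topic string, handler func(Event))
}

// ChanBus fans events out to handlers through buffered Go channels.
type ChanBus struct {
	mu   sync.RWMutex
	subs map[string][]chan Event
	wg   sync.WaitGroup
}

func NewChanBus() *ChanBus {
	return &ChanBus{subs: map[string][]chan Event{}}
}

// Subscribe starts one goroutine per handler that drains its own channel.
func (b *ChanBus) Subscribe(topic string, handler func(Event)) {
	ch := make(chan Event, 64)
	b.mu.Lock()
	b.subs[topic] = append(b.subs[topic], ch)
	b.mu.Unlock()
	b.wg.Add(1)
	go func() {
		defer b.wg.Done()
		for e := range ch {
			handler(e)
		}
	}()
}

// Publish delivers e to every subscriber of its topic.
// It blocks if a subscriber's buffer is full.
func (b *ChanBus) Publish(e Event) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for _, ch := range b.subs[e.Topic] {
		ch <- e
	}
}

// Close stops delivery and waits for handlers to finish queued events.
func (b *ChanBus) Close() {
	b.mu.Lock()
	for _, chans := range b.subs {
		for _, ch := range chans {
			close(ch)
		}
	}
	b.subs = map[string][]chan Event{}
	b.mu.Unlock()
	b.wg.Wait()
}

func main() {
	cb := NewChanBus()
	var bus EventBus = cb
	bus.Subscribe("transfer.completed", func(e Event) {
		fmt.Printf("got %s: %s\n", e.Topic, e.Payload)
	})
	bus.Publish(Event{Topic: "transfer.completed", Payload: []byte(`{"amount":500}`)})
	cb.Close()
}
```

Because callers only see `EventBus`, extracting a service later would mean swapping the constructor, not the call sites.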

Read Replica SAVE $150/mo

A single PostgreSQL handles 10K users without issue. Add replica when dashboard queries impact latency.

~$14/mo · Infra MVP (0-1K users)
~$120/mo · Infra Growth (1-10K users)
~$274/mo · K8s minimum (0 users)
~$15K · Legal + compliance (one-time)
Slide 4 of 11

Full Architecture Overview

Everything connected — from iOS app to Alpaca trade execution.

flowchart TB
    IOS["iOS App"] & WEB["Web App"]
    GLB["Global HTTPS LB · TLS 1.3"]
    ARMOR["Cloud Armor WAF"]

    IOS & WEB --> GLB --> ARMOR

    subgraph VPC["GCP VPC — Private Network"]
        direction TB
        ARMOR --> API & WEBSERV

        API["beaver-api\n25+ endpoints\nmin:2 max:100"]
        WEBSERV["beaver-web"]

        API -->|"publish"| TOPICS
        WEBHOOK -->|"publish"| TOPICS

        TOPICS["Pub/Sub Topics\naccount · transfer · trade\nnotification · rebalance · recurring"]

        TOPICS -->|"subscribe"| WORKER & NOTIFIER
        TOPICS -.->|"failed"| DLQ["Dead Letter Topics\nP1 alert"]

        WORKER["beaver-worker\nTrades · Rebalance\nReconciliation"]
        NOTIFIER["beaver-notifier\nPush · Email · SMS"]
        SCHED["beaver-scheduler\nRecurring deposits\nDrift detection"]

        SCHED -->|"publish"| TOPICS

        API -->|"writes"| PRIMARY
        API -->|"reads"| REPLICA
        API <-->|"sessions · cache"| REDIS
        WORKER -->|"writes"| PRIMARY
        WORKER <-->|"locks"| REDIS
        SCHED -->|"reads"| REPLICA
        WEBHOOK -->|"dedup"| REDIS

        PRIMARY[("Cloud SQL Primary\nPostgreSQL 16 HA\n4vCPU · 16GB · 100GB")]
        REPLICA[("Read Replica\n2vCPU · 8GB")]
        REDIS[("Redis 7 HA 2GB")]
    end

    API <-->|"accounts · trading"| ALPACA
    API <-->|"bank link"| PLAID
    WORKER -->|"trade exec"| ALPACA

    ALPACA["Alpaca Broker API"] -->|"webhooks"| WEBHOOK
    PLAID["Plaid"] -->|"webhooks"| WEBHOOK
    P529["529 Provider"] -->|"webhooks"| WEBHOOK
    PIRA["IRA Custodian"] -->|"webhooks"| WEBHOOK
    WEBHOOK["beaver-webhook"]

    NOTIFIER --> APNS["APNs"] & SG["SendGrid"] & TW["Twilio"]

    API -.-> KMS["Cloud KMS"] & SECRETS["Secret Mgr"]
    SCHED --> GCS["Cloud Storage\n7yr WORM audit"]
    API -.-> OBS["Logging · Monitoring · Trace"]
    CICD["GitHub Actions\nCanary deploys"] -.-> API & WORKER & WEBHOOK & NOTIFIER

    style VPC fill:#f6f8fa,stroke:#d0d7de,stroke-width:2px
    style TOPICS fill:#F39C12,stroke:#D68910,color:#000
    style DLQ fill:#E74C3C,stroke:#C0392B,color:#fff
    style API fill:#34A853,stroke:#1E7E34,color:#fff
    style WEBSERV fill:#34A853,stroke:#1E7E34,color:#fff
    style WORKER fill:#34A853,stroke:#1E7E34,color:#fff
    style NOTIFIER fill:#34A853,stroke:#1E7E34,color:#fff
    style SCHED fill:#34A853,stroke:#1E7E34,color:#fff
    style WEBHOOK fill:#34A853,stroke:#1E7E34,color:#fff
    style PRIMARY fill:#8E44AD,stroke:#6C3483,color:#fff
    style REPLICA fill:#8E44AD,stroke:#6C3483,color:#fff
    style REDIS fill:#8E44AD,stroke:#6C3483,color:#fff
    style ALPACA fill:#5DADE2,stroke:#2E86C1,color:#fff
    style PLAID fill:#5DADE2,stroke:#2E86C1,color:#fff
    style P529 fill:#5DADE2,stroke:#2E86C1,color:#fff
    style PIRA fill:#5DADE2,stroke:#2E86C1,color:#fff
    style GLB fill:#E8913A,stroke:#B06D2B,color:#fff
    style ARMOR fill:#E8913A,stroke:#B06D2B,color:#fff
    style IOS fill:#4A90D9,stroke:#2C5F8A,color:#fff
    style WEB fill:#4A90D9,stroke:#2C5F8A,color:#fff
    style KMS fill:#566573,stroke:#2C3E50,color:#fff
    style SECRETS fill:#566573,stroke:#2C3E50,color:#fff
    style GCS fill:#566573,stroke:#2C3E50,color:#fff
    style OBS fill:#566573,stroke:#2C3E50,color:#fff
    style CICD fill:#2ECC71,stroke:#27AE60,color:#fff
    style APNS fill:#AF7AC5,stroke:#8E44AD,color:#fff
    style SG fill:#AF7AC5,stroke:#8E44AD,color:#fff
    style TW fill:#AF7AC5,stroke:#8E44AD,color:#fff
        
Slide 5 of 11

Cloud Run Services

6 independently deployable services. No Kubernetes needed.

beaver-api PRIMARY

Main HTTP API for the iOS app. 25+ endpoints. Handles auth, user management, fund creation, deposits.

  • min: 2 instances (prod)
  • max: 100 instances
  • 1 vCPU / 1GB RAM
  • Concurrency: 80 req/instance

beaver-worker ASYNC

Processes trades, portfolio rebalancing, reconciliation. CPU-intensive, isolated from API latency.

  • min: 1 instance
  • max: 50 instances
  • Triggered by Pub/Sub
  • Distributed locks via Redis

beaver-webhook CRITICAL

Ingests webhooks from Alpaca, Plaid, 529 provider, IRA custodian. Must respond <200ms.

  • min: 1 instance (always ready)
  • max: 30 instances
  • HMAC-SHA256 verification
  • Redis dedup (event_id)
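The two checks in the list above can be sketched with the standard library. Assumptions: the signature arrives hex-encoded (each partner defines its own header and encoding), and an in-memory set stands in for Redis, where the equivalent step would be a set-if-absent on the event ID with an expiry.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// sign returns the hex HMAC-SHA256 of body under secret.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify checks a hex signature against the body in constant time.
func verify(secret, body []byte, sigHex string) bool {
	got, err := hex.DecodeString(sigHex)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

// seenSet is an in-memory stand-in for the Redis dedup store.
type seenSet struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func newSeenSet() *seenSet {
	return &seenSet{seen: map[string]struct{}{}}
}

// firstTime records eventID and reports whether it had not been seen before.
func (s *seenSet) firstTime(eventID string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, dup := s.seen[eventID]; dup {
		return false
	}
	s.seen[eventID] = struct{}{}
	return true
}

func main() {
	secret := []byte("demo-secret") // hypothetical; a real secret would come from Secret Manager
	body := []byte(`{"event_id":"evt_1","status":"COMPLETE"}`)
	sig := sign(secret, body)

	dedup := newSeenSet()
	for i := 0; i < 2; i++ {
		switch {
		case !verify(secret, body, sig):
			fmt.Println("reject: bad signature")
		case !dedup.firstTime("evt_1"):
			fmt.Println("skip: duplicate delivery")
		default:
			fmt.Println("accept: publish to event bus")
		}
	}
}
```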

beaver-notifier EVENTS

Dispatches push notifications (APNs), email (SendGrid), SMS (Twilio). Different failure modes from API.

  • min: 0 (scales to zero)
  • max: 20 instances
  • Triggered by Pub/Sub
  • Retry with backoff
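"Retry with backoff" could be sketched as capped exponential backoff. The delays, attempt count, and helper names below are illustrative assumptions, and jitter is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient simulates a temporary provider failure in the demo.
var errTransient = errors.New("transient failure")

// backoffDelay returns base * 2^attempt, capped at max. attempt starts at 0.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

// noSleep is a drop-in for time.Sleep in dry runs and tests.
func noSleep(time.Duration) {}

// retry calls fn up to attempts times (attempts >= 1), sleeping between failures.
func retry(attempts int, base, max time.Duration, sleep func(time.Duration), fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			sleep(backoffDelay(i, base, max))
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(4, 100*time.Millisecond, time.Second, time.Sleep, func() error {
		calls++
		if calls < 3 {
			return errTransient // first two sends fail
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```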

beaver-scheduler CRON

Cloud Run Job. Recurring deposits, drift detection, daily reconciliation, audit log export.

  • Runs 5x/day on schedule
  • Reads from replica only
  • Publishes jobs to Pub/Sub
  • Exports audit to Cloud Storage

beaver-web

Landing page and future web app. Static serving via Cloud Run + Cloud CDN.

  • min: 0 (scales to zero)
  • max: 20 instances
  • Behind Cloud CDN
  • React / Next.js
Slide 6 of 11

Deposit Flow

User taps "Add Money" → ACH transfer → auto-invest in ETFs → push notification.

sequenceDiagram
    actor User as iOS User
    participant API as beaver-api
    participant ALP as Alpaca API
    participant PS as Pub/Sub
    participant WH as beaver-webhook
    participant WK as beaver-worker
    participant DB as Cloud SQL
    participant RD as Redis
    participant NT as beaver-notifier

    User->>API: POST /transactions/deposit ($500)
    API->>DB: Create pending transaction
    API->>ALP: POST /transfers (ACH INCOMING)
    ALP-->>API: transfer_id, QUEUED
    API->>PS: publish transfer.initiated
    API-->>User: 200 OK — deposit pending

    Note over ALP: ACH processing (1-3 business days)

    ALP->>WH: Webhook: transfer COMPLETE
    WH->>RD: Dedup check (event_id)
    WH->>DB: Log webhook
    WH->>PS: publish transfer.completed

    PS->>WK: subscribe: transfer.completed
    WK->>RD: Acquire distributed lock (fund_id)
    WK->>DB: Update transaction + create ledger entry

    Note over WK: Auto-invest per fund allocation

    WK->>ALP: Buy $300 VTI (60%)
    WK->>ALP: Buy $150 VXUS (30%)
    WK->>ALP: Buy $50 BND (10%)
    WK->>RD: Release lock
    WK->>PS: publish notification.send

    PS->>NT: subscribe: notification.send
    NT-->>User: Push: "$500 deposited and invested!"
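The auto-invest step in the diagram splits the deposit by the fund's allocation. A minimal sketch of that split in integer cents, assuming targets are stored in basis points that sum to 10,000; the `target` type and the choice to give rounding leftovers to the first holding are illustrative.

```go
package main

import "fmt"

// target is one line of a fund's allocation, in basis points (6000 = 60%).
type target struct {
	symbol string
	bps    int64
}

// split divides amountCents across targets using integer math.
// Any rounding remainder goes to the first target so the parts
// always sum to the original amount.
func split(amountCents int64, targets []target) map[string]int64 {
	out := make(map[string]int64, len(targets))
	var used int64
	for _, t := range targets {
		part := amountCents * t.bps / 10000
		out[t.symbol] = part
		used += part
	}
	if len(targets) > 0 {
		out[targets[0].symbol] += amountCents - used
	}
	return out
}

// allocation mirrors the 60/30/10 example in the diagram.
var allocation = []target{
	{symbol: "VTI", bps: 6000},
	{symbol: "VXUS", bps: 3000},
	{symbol: "BND", bps: 1000},
}

func main() {
	orders := split(50000, allocation) // $500.00 deposit
	for _, t := range allocation {
		fmt.Printf("buy %-4s $%d.%02d\n", t.symbol, orders[t.symbol]/100, orders[t.symbol]%100)
	}
}
```

Working in cents avoids floating-point drift in ledger entries.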
        
Slide 7 of 11

Database Architecture

Cloud SQL PostgreSQL 16 with HA, read replica, connection pooling, and schema separation.

flowchart TB
    subgraph SERVICES["Cloud Run Services"]
        API["beaver-api\nreads + writes"]
        WORKER["beaver-worker\nwrites only"]
        SCHED["beaver-scheduler\nreads only"]
    end

    subgraph POOL["PgBouncer Connection Pooling"]
        PG1["Sidecar\npool_mode=transaction\n10 conn/instance"]
        PG2["Sidecar"]
        PG3["Sidecar"]
    end

    subgraph CLOUDSQL["Cloud SQL PostgreSQL 16"]
        PRIMARY[("PRIMARY — HA\n4 vCPU · 16GB · 100GB\nCMEK · PITR 30d")]
        REPLICA[("READ REPLICA\n2 vCPU · 8GB\nlag less than 1s")]
        PRIMARY -->|"async replication"| REPLICA
    end

    subgraph SCHEMAS["Database Schemas"]
        PUB["public\nusers · funds · transactions\nledger_entries · beneficiaries"]
        PII["pii\nuser_ssn · beneficiary_ssn\nrestricted access · audited"]
        AUD["audit\naudit_log · webhook_log\nINSERT only"]
        AN["analytics\nmaterialized views\ndaily aggregations"]
    end

    API --> PG1 --> PRIMARY
    API -.-> REPLICA
    WORKER --> PG2 --> PRIMARY
    SCHED --> PG3 --> REPLICA

    PRIMARY --- PUB & PII & AUD & AN

    KMS["Cloud KMS"] -.->|"CMEK"| PRIMARY
    GCS["Cloud Storage 7yr WORM"] -.->|"nightly export"| AUD

    style PRIMARY fill:#8E44AD,stroke:#6C3483,color:#fff
    style REPLICA fill:#8E44AD,stroke:#6C3483,color:#fff
    style PII fill:#E74C3C,stroke:#C0392B,color:#fff
    style AUD fill:#F39C12,stroke:#D68910,color:#000
    style PUB fill:#34A853,stroke:#1E7E34,color:#fff
    style AN fill:#5DADE2,stroke:#2E86C1,color:#fff
    style KMS fill:#566573,stroke:#2C3E50,color:#fff
    style GCS fill:#566573,stroke:#2C3E50,color:#fff
        
Slide 8 of 11

Security & Compliance

SOC2 Type II controls mapped to GCP services. FINRA/SEC compliance ready.

Network Isolation

  • Cloud SQL + Redis on private IPs only
  • No public IPs except load balancer
  • VPC connector for Cloud Run egress
  • Cloud Armor WAF at edge

Encryption

  • TLS 1.3 automatic (Cloud Run)
  • mTLS via Cloud SQL Auth Proxy
  • CMEK via Cloud KMS (DB + Storage)
  • Envelope encryption for SSN/PII
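Envelope encryption for a field such as an SSN could look like the sketch below: a fresh data key (DEK) encrypts the value, and a key-encryption key (KEK) wraps the DEK. Here a local AES-256 key stands in for the KEK; in the architecture above the wrap and unwrap steps would be calls to Cloud KMS. All type and function names are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD; a 32-byte key selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and returns nonce || ciphertext.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal; it fails if the data or the key is wrong.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed data too short")
	}
	return gcm.Open(nil, sealed[:n], sealed[n:], nil)
}

// envelope is what would be stored: the wrapped data key plus the ciphertext.
type envelope struct {
	wrappedDEK []byte
	ciphertext []byte
}

// encryptField uses a fresh DEK per value and wraps it with the KEK.
func encryptField(kek, plaintext []byte) (envelope, error) {
	dek := make([]byte, 32)
	if _, err := rand.Read(dek); err != nil {
		return envelope{}, err
	}
	ct, err := seal(dek, plaintext)
	if err != nil {
		return envelope{}, err
	}
	wrapped, err := seal(kek, dek)
	if err != nil {
		return envelope{}, err
	}
	return envelope{wrappedDEK: wrapped, ciphertext: ct}, nil
}

// decryptField unwraps the DEK with the KEK, then decrypts the value.
func decryptField(kek []byte, e envelope) ([]byte, error) {
	dek, err := open(kek, e.wrappedDEK)
	if err != nil {
		return nil, err
	}
	return open(dek, e.ciphertext)
}

func main() {
	kek := make([]byte, 32) // stand-in KEK; all zeros is for the demo only
	env, err := encryptField(kek, []byte("123-45-6789"))
	if err != nil {
		panic(err)
	}
	pt, err := decryptField(kek, env)
	fmt.Println(string(pt), err)
}
```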

IAM — Least Privilege

  • Per-service service accounts
  • No roles/owner or roles/editor
  • Workload Identity Federation (CI/CD)
  • No long-lived service account keys

WAF Rules (Cloud Armor)

  • OFAC country blocking
  • 1000 req/min per IP global
  • 100 req/min on /auth/* (brute force)
  • OWASP Top 10 (SQLi, XSS)
  • Geo-restrict financial transactions

Compliance (FINRA/SEC)

  • Rule 17a-4: Cloud Storage Bucket Lock (7yr WORM)
  • Reg BI: Suitability versioned in DB
  • BSA/AML: CTR for cash transactions over $10K
  • UGMA/UTMA: Withdrawal justification

SOC2 Type II

  • CC6: IAM + JWT + MFA + ownership guards
  • CC7: Cloud Monitoring + alerts + PagerDuty
  • CC8: GitHub PR reviews + CI/CD pipeline
  • A1: Cloud SQL HA + Cloud Run auto-heal
Slide 9 of 11

CI/CD Pipeline

6 gates from push to production. Canary deploys with automatic rollback.

flowchart LR
    subgraph G1["Gate 1: Quality"]
        LINT["golangci-lint"] --> VET["go vet"] --> TEST["go test -race\ncoverage 80%+"]
    end
    subgraph G2["Gate 2: Security"]
        VULN["govulncheck"] --> TRIVY["Trivy scan\nCRITICAL + HIGH"]
    end
    subgraph G3["Gate 3: Build"]
        BUILD["Build 4 images\nin parallel"] --> AR["Push to\nArtifact Registry"]
    end
    subgraph G4["Gate 4: Staging"]
        MIG["Migrations"] --> DEP["Deploy all\nservices"] --> E2E["E2E tests"]
    end
    subgraph G5["Gate 5: Canary"]
        CAN["Canary deploy\n10% traffic"] --> SOAK["5-min soak\nmonitor errors"]
    end
    subgraph G6["Gate 6: Prod"]
        APR["Manual\napproval"] --> MIGP["Migrations"] --> ROLL["Full\nrollout"]
    end

    PUSH["git push\nmain"] --> G1 & G2
    G1 & G2 --> G3 --> G4 --> G5 --> G6

    style PUSH fill:#4A90D9,stroke:#2C5F8A,color:#fff
    style APR fill:#E74C3C,stroke:#C0392B,color:#fff
    style ROLL fill:#34A853,stroke:#1E7E34,color:#fff
    style E2E fill:#F39C12,stroke:#D68910,color:#000
    style CAN fill:#F39C12,stroke:#D68910,color:#000
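The soak step in Gate 5 implies a promote-or-rollback rule. One possible rule, sketched below, compares the canary's error rate with the stable revision's. The tolerance, the minimum request count, and every name in the block are assumptions, not values from this pipeline.

```go
package main

import "fmt"

// sample holds request and failure counts observed during the soak window.
type sample struct {
	requests int
	failures int
}

// errorRate returns failures/requests, or 0 when there was no traffic.
func (s sample) errorRate() float64 {
	if s.requests == 0 {
		return 0
	}
	return float64(s.failures) / float64(s.requests)
}

type decision string

const (
	promote  decision = "promote"
	rollback decision = "rollback"
	extend   decision = "extend soak"
)

// decide rolls back when the canary's error rate exceeds the baseline's
// by more than tolerance, and asks for a longer soak when the canary
// has seen too little traffic to judge.
func decide(baseline, canary sample, tolerance float64, minRequests int) decision {
	if canary.requests < minRequests {
		return extend
	}
	if canary.errorRate() > baseline.errorRate()+tolerance {
		return rollback
	}
	return promote
}

func main() {
	baseline := sample{requests: 9000, failures: 9} // stable revision, 90% of traffic
	canary := sample{requests: 1000, failures: 50}  // canary revision, 10% of traffic
	fmt.Println(decide(baseline, canary, 0.01, 100))
}
```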
        
Slide 10 of 11

Cost Breakdown

~$1,400/mo GCP infrastructure at 100K users (the itemized lines below sum to ~$1,233). Less than 1% of total cost; the team is ~97%.

Service | Spec | Monthly
Cloud SQL Primary + Replica | 4vCPU 16GB HA + 2vCPU 8GB | $550
Cloud Run (6 services) | auto-scaled, mixed min instances | $450
Memorystore Redis | Standard HA, 2GB | $70
VPC Connector | e2-medium x2 | $50
Cloud Armor + Global LB | WAF rules + HTTPS LB | $40
Logging + Monitoring + Trace | 50GB ingestion, custom metrics | $45
Pub/Sub | ~2M messages/day | $15
Cloud Storage | 50GB (audit, docs, backups) | $5
KMS + Secrets + DNS | 3 keys, 20 secrets, 1 zone | $7
Artifact Registry | ~5GB images | $1
TOTAL GCP | | ~$1,233
pie title Monthly GCP Cost Breakdown
    "Cloud SQL" : 550
    "Cloud Run" : 450
    "Redis" : 70
    "VPC + LB + Armor" : 90
    "Observability" : 45
    "Other" : 28
                

External Services

  • Plaid (100K connections): ~$3,000/mo
  • Twilio (100K SMS): ~$800/mo
  • SendGrid (500K emails): ~$90/mo
  • PagerDuty (5 seats): ~$100/mo
  • Alpaca: rev share (no direct cost)

Total Monthly

  • GCP infra: ~$1,400
  • External services: ~$4,000
  • Team (10-15 eng): ~$150K-250K
  • Infra = less than 1% of total cost
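The cost shares can be checked from the three totals above. A small sketch, taking the low and high ends of the team range as given:

```go
package main

import "fmt"

// share returns part as a percentage of total.
func share(part, total float64) float64 {
	return part / total * 100
}

func main() {
	const gcp, external = 1400.0, 4000.0 // monthly USD, from the list above
	for _, team := range []float64{150000, 250000} {
		total := gcp + external + team
		fmt.Printf("team $%.0f/mo: infra %.2f%%, team %.1f%%\n",
			team, share(gcp, total), share(team, total))
	}
}
```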
Slide 11 of 11

Growth Path

From $0/mo to $1,400/mo — scale infrastructure only when needed.

0 – 1K Users MVP

~$10-14/mo

  • 1 Cloud Run service (scales to zero)
  • Cloud SQL db-f1-micro
  • No Redis, no Terraform
  • Sessions in PostgreSQL
  • Direct gcloud deploy

1K – 10K Users GROWTH

~$100-125/mo

  • Add Memorystore Redis
  • Cloud SQL small HA
  • Add Cloud Armor WAF
  • Terraform (3-4 modules)
  • Still 1 Cloud Run service

10K – 100K Users SCALE

~$1,200-1,600/mo

  • 6 Cloud Run services
  • Cloud SQL 4vCPU HA + replica
  • Pub/Sub (replace EventBus)
  • Full Terraform (14 modules)
  • Canary deploys, SOC2

100K+ Users NEXT

$2,500+/mo

  • Evaluate GKE Autopilot (15+ services)
  • Multi-region (if compliance requires)
  • BigQuery data warehouse
  • CQRS / event sourcing
  • ML fraud detection
timeline
    title Beaver Infrastructure Evolution
    section Bootstrap
        Solo founder : Cloud Run free tier
                     : Neon PostgreSQL free
                     : $0/mo
    section MVP (1K users)
        Solo founder : Cloud SQL micro
                     : Single Cloud Run service
                     : ~$14/mo
    section Growth (10K users)
        5-8 engineers : Add Redis + Cloud Armor
                      : Terraform IaC
                      : ~$120/mo
    section Scale (100K users)
        10-15 engineers : 6 Cloud Run services
                        : Pub/Sub event bus
                        : HA + Read replica
                        : SOC2 Type II
                        : ~$1,400/mo
    section Next (500K+ users)
        20+ engineers : Evaluate GKE
                      : Multi-region
                      : BigQuery + ML
                      : $2,500+/mo