Go concurrency patterns you’ll actually use — Scaling Strategies — Practical Guide (Jun 12, 2026)

Level: Intermediate

As of June 2026, targeting Go 1.21+

Prerequisites

This article assumes you have a working knowledge of Go’s core concurrency constructs: goroutines, channels, and the sync package. You should be comfortable reading and writing idiomatic Go and understand basic concurrency concerns like race conditions and deadlocks.

We target Go 1.21 and later. The patterns discussed are stable and apply to production services that need to scale their workloads.

Introduction

Concurrency is a core feature of Go, but it is a broad topic, and not every concurrency pattern is useful or efficient in every scenario.

This article narrows in on concurrency patterns that enable scaling strategies: managing and scaling workload across multiple goroutines or workers efficiently, reducing bottlenecks, and ensuring controlled resource use.

Hands-on steps

1. Worker Pools — Controlled Concurrency for CPU/IO-bound Tasks

Worker pools are the go-to pattern to limit concurrency and prevent resource exhaustion when processing many jobs. They’re best when jobs are independent and roughly uniform in cost.

package main

import (
    "fmt"
    "sync"
    "time"
)

// worker reads jobs until the channel is closed and sends one result per job.
func worker(id int, jobs <-chan int, results chan<- int, wg *sync.WaitGroup) {
    defer wg.Done()
    for job := range jobs {
        fmt.Printf("worker %d started job %d\n", id, job)
        // Simulate work
        time.Sleep(time.Second)
        results <- job * 2
        fmt.Printf("worker %d finished job %d\n", id, job)
    }
}

func main() {
    jobs := make(chan int, 10)
    results := make(chan int, 10)

    var wg sync.WaitGroup
    numWorkers := 3

    for w := 1; w <= numWorkers; w++ {
        wg.Add(1)
        go worker(w, jobs, results, &wg)
    }

    for j := 1; j <= 5; j++ {
        jobs <- j
    }
    close(jobs)

    go func() {
        wg.Wait()
        close(results)
    }()

    for r := range results {
        fmt.Println("result:", r)
    }
}

When to choose Worker Pools: When you need to cap concurrency, for example to match the number of CPU cores or to stay within an API rate limit. If your code would otherwise start one goroutine per job, a fixed-size pool keeps memory use and scheduler load bounded.

2. Fan-Out, Fan-In — Parallelising Independent Jobs

This pattern is excellent when you have many independent jobs and want to parallelise them, but also need to recombine results.

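package main

import "fmt"

// square sends n squared on out.
func square(n int, out chan<- int) {
    out <- n * n
}

func main() {
    nums := []int{2, 4, 6, 8}
    // Buffered so that no sender blocks, even if the receiver is slow.
    out := make(chan int, len(nums))

    // Fan-out: one goroutine per input
    for _, n := range nums {
        go square(n, out)
    }

    // Fan-in: collect all results (arrival order is not guaranteed)
    for i := 0; i < len(nums); i++ {
        result := <-out
        fmt.Println("Square:", result)
    }
}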

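When to choose Fan-Out/Fan-In: When jobs are independent and you want maximum concurrency but don’t need shared state or complex coordination. Because this version starts one goroutine per job, combine it with a worker pool when the input is large.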

3. Rate Limiting with Token Bucket (time.Ticker / time.Timer)

To scale safely against external resources (APIs, databases), you’ll often need to throttle requests so that you neither exceed quotas nor overload the backend.

package main

import (
    "fmt"
    "time"
)

func main() {
    requests := make(chan int, 5)
    for i := 1; i <= 5; i++ {
        requests <- i
    }
    close(requests)

    ticker := time.NewTicker(200 * time.Millisecond) // 5 requests per second
    defer ticker.Stop()

    for req := range requests {
        <-ticker.C
        fmt.Println("Processing request", req, "at", time.Now())
    }
}

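When to choose Token Bucket via Ticker: When you need a simple rate limiter controlling retries, API calls, or batch processing. Note that a bare time.Ticker spaces requests evenly and allows no bursts; a true token bucket also permits short bursts up to a fixed capacity.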

4. Using Context for Cancellation & Scaling Up/Down

Cancellation and timeouts need to propagate to every goroutine you start, so that workers stop promptly when the pool scales down or the service shuts down. Passing a context.Context into worker pools and batch jobs enables graceful shutdown and prevents goroutine leaks.

// Requires the "context" and "fmt" imports.
// worker processes jobs until ctx is cancelled or the jobs channel is closed.
func worker(ctx context.Context, id int, jobs <-chan int) {
    for {
        select {
        case <-ctx.Done():
            fmt.Printf("worker %d stopping\n", id)
            return
        case job, ok := <-jobs:
            if !ok {
                fmt.Printf("worker %d no more jobs\n", id)
                return
            }
            fmt.Printf("worker %d processing job %d\n", id, job)
        }
    }
}

When to use context: For cancellation, timeout, and cleanup in coordinated scalable systems.

Common pitfalls

Unbounded goroutines: Launching one goroutine per job without a limit can exhaust memory or cause scheduler thrashing under load.
Channel leaks: Close a channel from the sending side once no more values will be sent; otherwise receivers ranging over it block forever and their goroutines leak.
Ignoring cancellation: Goroutines that don’t honour context.Context cancellation keep running after their caller has given up, and they accumulate over time.
Incorrect synchronization: Once jobs stop being isolated and start sharing state, unprotected access causes data races; guard shared state with a mutex or confine it to a single goroutine.
Overly complex coordination: Avoid building complex state machines out of channels. Prefer higher-level abstractions such as sync.WaitGroup or a third-party library.

Validation

Validate your concurrency pattern by:

Using go test -race: Detect shared memory races early.
Benchmarking with testing.B: Experiment with different concurrency levels to find optimal parallelism.
Profiling with pprof: Measure goroutine count, CPU, and memory usage to uncover bottlenecks.
Load-testing your system: Use tools like hey or wrk to validate behaviour under real-world concurrency.
Code review focusing on locking and channel closure: Make sure every code path closes the channels it owns and calls Done on its WaitGroup.

Checklist / TL;DR

Use worker pools to cap concurrency and avoid resource saturation.
Leverage fan-out/fan-in for parallel processing of independent jobs with recombination.
Rate-limit with time.Ticker or a token bucket to protect external resources.
Always pass a context.Context (by convention as the first parameter) to enable cancellation and timeout propagation.
Close channels explicitly from the sending side to signal completion.
Test with race detector and benchmark concurrency levels.
Profile regularly to ensure goroutine and memory usage behave as expected under load.