BH Engineering Team

Mastering Goroutines and WaitGroups in Go: Concurrent Programming Made Simple

Understanding Goroutines and WaitGroups in Go

One of Go's most powerful features is its built-in support for concurrency through goroutines. If you've been writing sequential Go programs, you're about to discover how to make your applications faster, more efficient, and capable of handling multiple tasks simultaneously.

Objectives

  • Understand what goroutines are and how they work
  • Learn about WaitGroups and why they're essential for concurrent programming
  • Explore practical examples of concurrent patterns
  • Understand common pitfalls and how to avoid them
  • Learn best practices for concurrent Go programming
  • Build real-world applications using goroutines

What are Goroutines?

Goroutines are lightweight threads managed by the Go runtime. Think of them as functions that can run concurrently with other functions. Unlike traditional threads, goroutines are incredibly cheap to create and manage.

Why do you think Go chose to implement its own concurrency model instead of using traditional threads?

Let's start with a simple example to see goroutines in action:

```go
package main

import (
	"fmt"
	"time"
)

func sayHello(name string) {
	for i := 0; i < 3; i++ {
		fmt.Printf("Hello, %s! (%d)\n", name, i+1)
		time.Sleep(100 * time.Millisecond)
	}
}

func main() {
	// Sequential execution
	fmt.Println("=== Sequential Execution ===")
	sayHello("Alice")
	sayHello("Bob")

	fmt.Println("\n=== Concurrent Execution ===")

	// Concurrent execution with goroutines
	go sayHello("Charlie")
	go sayHello("Diana")

	// Wait a bit to see the output
	time.Sleep(500 * time.Millisecond)
	fmt.Println("Main function finished")
}
```

Key characteristics of goroutines:

  • Lightweight: Goroutines start with only a few KB of stack space
  • Multiplexed: Thousands of goroutines can run on a few OS threads
  • Cooperative: Managed by the Go runtime scheduler
  • Easy to create: Just prefix any function call with go

The Problem with Basic Goroutines

The example above has a problem: we're using time.Sleep() to wait for the goroutines to finish. This is unreliable and inefficient. If the goroutines take longer than expected, main exits before they complete; if they finish early, the program wastes time sleeping.

Can you think of why using time.Sleep() to wait for goroutines is problematic in real applications?

Enter WaitGroups

WaitGroups provide a clean way to wait for a collection of goroutines to finish executing. Think of it as a counter that tracks how many goroutines are still running.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func worker(id int, wg *sync.WaitGroup) {
	// Notify the WaitGroup that this goroutine is done when the function returns
	defer wg.Done()

	fmt.Printf("Worker %d starting\n", id)

	// Simulate some work
	time.Sleep(time.Duration(id*100) * time.Millisecond)

	fmt.Printf("Worker %d finished\n", id)
}

func main() {
	var wg sync.WaitGroup

	// Start 5 worker goroutines
	for i := 1; i <= 5; i++ {
		wg.Add(1) // Increment the WaitGroup counter
		go worker(i, &wg)
	}

	// Wait for all goroutines to complete
	wg.Wait()
	fmt.Println("All workers completed!")
}
```

Key WaitGroup methods:

  • Add(delta int): Adds delta to the WaitGroup counter
  • Done(): Decrements the WaitGroup counter by one
  • Wait(): Blocks until the WaitGroup counter is zero

Labs on Goroutines and WaitGroups

Let's explore different concurrent programming patterns through hands-on examples.

1. Simple Parallel Tasks

Problem: Run multiple tasks at the same time and wait for all to complete.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func doWork(id int, wg *sync.WaitGroup) {
	defer wg.Done()

	fmt.Printf("Worker %d starting\n", id)

	// Simulate some work
	time.Sleep(time.Duration(id*100) * time.Millisecond)

	fmt.Printf("Worker %d finished\n", id)
}

func main() {
	var wg sync.WaitGroup

	// Start 5 workers
	for i := 1; i <= 5; i++ {
		wg.Add(1)
		go doWork(i, &wg)
	}

	// Wait for all workers to complete
	wg.Wait()
	fmt.Println("All work completed!")
}
```

2. Processing Multiple Items

Problem: Process a list of items concurrently.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func processItem(item string, wg *sync.WaitGroup) {
	defer wg.Done()

	fmt.Printf("Processing %s...\n", item)

	// Simulate processing time
	time.Sleep(200 * time.Millisecond)

	fmt.Printf("Finished processing %s\n", item)
}

func main() {
	items := []string{"task1", "task2", "task3", "task4", "task5"}

	var wg sync.WaitGroup

	fmt.Println("Starting to process items concurrently...")
	start := time.Now()

	// Process each item in its own goroutine
	for _, item := range items {
		wg.Add(1)
		go processItem(item, &wg)
	}

	// Wait for all processing to complete
	wg.Wait()
	fmt.Printf("All items processed in %v\n", time.Since(start))
}
```

3. Collecting Results with Shared Data

Problem: Run multiple goroutines and collect their results safely.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

var (
	results []string
	mu      sync.Mutex
)

func calculateAndStore(id int, wg *sync.WaitGroup) {
	defer wg.Done()

	// Simulate some calculation
	time.Sleep(time.Duration(id*50) * time.Millisecond)
	result := fmt.Sprintf("Result from worker %d", id)

	// Safely add to shared slice
	mu.Lock()
	results = append(results, result)
	mu.Unlock()

	fmt.Printf("Worker %d completed\n", id)
}

func main() {
	var wg sync.WaitGroup

	fmt.Println("Starting calculations...")

	// Start 5 workers
	for i := 1; i <= 5; i++ {
		wg.Add(1)
		go calculateAndStore(i, &wg)
	}

	// Wait for all workers to complete
	wg.Wait()

	// Print all results
	fmt.Println("\nAll results:")
	for i, result := range results {
		fmt.Printf("%d: %s\n", i+1, result)
	}
}
```

When to Use Goroutines and WaitGroups

Based on the examples above, here are some guidelines:

Use Goroutines when:

  • You have I/O-bound operations (file reads, network calls, database queries)
  • You need to process multiple independent tasks
  • You want to improve application responsiveness
  • You're building concurrent servers or services

Use WaitGroups when:

  • You need to wait for multiple goroutines to complete
  • You have a known number of goroutines to synchronize
  • You want clean, deterministic program termination
  • You're coordinating goroutines that work independently

How do you think shared data structures with mutexes compare to other synchronization patterns?

Real-World Example: Checking Multiple URLs

Let's build a simple example that checks multiple websites concurrently:

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"
)

func checkWebsite(url string, wg *sync.WaitGroup) {
	defer wg.Done()

	fmt.Printf("Checking %s...\n", url)
	start := time.Now()

	resp, err := http.Get(url)
	duration := time.Since(start)

	if err != nil {
		fmt.Printf("❌ %s - Error: %v (%.2fs)\n", url, err, duration.Seconds())
		return
	}
	defer resp.Body.Close()

	fmt.Printf("✅ %s - Status: %d (%.2fs)\n", url, resp.StatusCode, duration.Seconds())
}

func main() {
	urls := []string{
		"https://google.com",
		"https://github.com",
		"https://stackoverflow.com",
	}

	var wg sync.WaitGroup

	fmt.Printf("Checking %d websites concurrently...\n\n", len(urls))
	start := time.Now()

	// Check each URL in its own goroutine
	for _, url := range urls {
		wg.Add(1)
		go checkWebsite(url, &wg)
	}

	// Wait for all checks to complete
	wg.Wait()
	fmt.Printf("\nAll checks completed in %.2fs\n", time.Since(start).Seconds())
}
```

Common Pitfalls and Best Practices

Let's look at common mistakes and how to avoid them:

1. Forgetting to Call Done()

```go
// ❌ BAD: Forgetting to call Done()
func badWorker(wg *sync.WaitGroup) {
	// If this function returns early due to an error,
	// Done() is never called, causing a deadlock
	if someCondition {
		return // Oops! Forgot to call wg.Done()
	}
	wg.Done()
}

// ✅ GOOD: Always use defer
func goodWorker(wg *sync.WaitGroup) {
	defer wg.Done() // This ensures Done() is always called
	if someCondition {
		return // Safe to return early
	}
	// Do work...
}
```

2. Race Conditions with Shared Data

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var counter int
	var wg sync.WaitGroup
	var mu sync.Mutex // Add mutex for thread safety

	// Start 1000 goroutines that increment counter
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()

			// ❌ BAD: Race condition
			// counter++

			// ✅ GOOD: Protected by mutex
			mu.Lock()
			counter++
			mu.Unlock()
		}()
	}

	wg.Wait()
	fmt.Printf("Final counter value: %d\n", counter) // Should be 1000
}
```

3. WaitGroup Counter Mismatch

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup

	// ❌ BAD: Add() called after starting goroutine
	go func() {
		wg.Add(1) // This could execute after wg.Wait()
		defer wg.Done()
		fmt.Println("Working...")
	}()
	wg.Wait() // Might not wait for the goroutine

	// ✅ GOOD: Add() before starting goroutine
	wg.Add(1)
	go func() {
		defer wg.Done()
		fmt.Println("Working correctly...")
	}()
	wg.Wait()
}
```

Performance Considerations

Let's measure the performance benefits of concurrency:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func simulateWork(id int, duration time.Duration) {
	fmt.Printf("Worker %d starting (will take %v)\n", id, duration)
	time.Sleep(duration)
	fmt.Printf("Worker %d finished\n", id)
}

func sequentialExecution() {
	fmt.Println("=== Sequential Execution ===")
	start := time.Now()
	for i := 1; i <= 5; i++ {
		simulateWork(i, 200*time.Millisecond)
	}
	fmt.Printf("Sequential total time: %v\n\n", time.Since(start))
}

func concurrentExecution() {
	fmt.Println("=== Concurrent Execution ===")
	start := time.Now()

	var wg sync.WaitGroup
	for i := 1; i <= 5; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			simulateWork(id, 200*time.Millisecond)
		}(i)
	}
	wg.Wait()

	fmt.Printf("Concurrent total time: %v\n\n", time.Since(start))
}

func main() {
	sequentialExecution()
	concurrentExecution()
}
```

Advanced Pattern: Limiting Concurrent Goroutines

Sometimes you want to limit how many goroutines run at the same time:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func processWithLimit(items []string, maxWorkers int) {
	var wg sync.WaitGroup
	semaphore := make(chan struct{}, maxWorkers) // Limit concurrent workers

	for i, item := range items {
		wg.Add(1)
		go func(id int, item string) {
			defer wg.Done()

			// Acquire semaphore (blocks if maxWorkers already running)
			semaphore <- struct{}{}
			defer func() { <-semaphore }() // Release semaphore

			fmt.Printf("Worker %d processing %s\n", id, item)
			time.Sleep(500 * time.Millisecond) // Simulate work
			fmt.Printf("Worker %d finished %s\n", id, item)
		}(i+1, item)
	}

	wg.Wait()
}

func main() {
	items := []string{"task1", "task2", "task3", "task4", "task5", "task6"}

	fmt.Println("Processing with max 2 concurrent workers:")
	start := time.Now()
	processWithLimit(items, 2)
	fmt.Printf("Completed in %v\n", time.Since(start))
}
```

What's Next?

We've covered the fundamentals of goroutines and WaitGroups in Go. Understanding these concepts opens the door to:

  1. Building scalable applications - Handle thousands of concurrent operations
  2. Improving performance - Leverage multiple CPU cores effectively
  3. Writing responsive systems - Don't block on slow operations
  4. Understanding Go's concurrency model - Lightweight goroutines with proper synchronization

In the next lesson, we could explore:

  • Context package for cancellation and timeouts
  • Sync package primitives (Mutex, RWMutex, Once, Cond)
  • Advanced synchronization patterns
  • Performance optimization techniques

Remember: goroutines make concurrency simple; WaitGroups make it safe. Start with simple patterns and gradually build up to more complex concurrent architectures!

Summary

  • Goroutines: Lightweight threads managed by Go runtime, created with go keyword
  • WaitGroups: Synchronization primitive to wait for multiple goroutines to complete
  • Best practices: Always use defer wg.Done(), avoid race conditions, call Add() before starting goroutines
  • Use cases: I/O-bound operations, parallel processing, concurrent servers
  • Patterns: Parallel processing, batch processing, rate limiting, shared data structures
  • Performance: Significant speedup for I/O-bound and parallelizable tasks

This foundation will help you write concurrent Go programs that are both fast and correct!

Thanks for reading.