In this tutorial, we will learn about CPU-bound processes in Golang using four examples. We will first understand what CPU-bound processes are and how to implement them in Golang. We will then look at example code in the upcoming sections and see how the execution time varies depending on whether the work runs on a single core or is spread across several. Let's start the tutorial.
What is a CPU Bound Process?
A CPU-bound process is a workload whose speed is limited primarily by the computational capacity of the CPU. Such processes consume a substantial amount of CPU time and may keep the CPU busy for extended periods, with minimal waiting on external resources such as disk I/O or network communication. In Go, CPU-bound work can be split across multiple goroutines so that it makes efficient use of the available CPU cores.
Methods to Implement CPU Bound Process
There are broadly two methods to implement CPU bound processes. They are:
Sequential Computation – Sequential computation means no concurrency at all. Here, a sequence of CPU-intensive operations is performed in a single goroutine. This is the most straightforward way of creating CPU-bound loads in Golang.
Concurrent Computation – Concurrent computation means the loads are executed concurrently. It can be implemented using either WaitGroups or channels. Here, multiple goroutines are created to perform independent CPU-bound tasks concurrently, which lets the Go scheduler spread the work over the cores available on your machine.
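The two methods above can be sketched side by side. This is a minimal illustration rather than the tutorial's own code: sumSquares, runSequential and runConcurrent are hypothetical names introduced here, and the task is just arithmetic with no I/O.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares is a hypothetical CPU-bound task: pure arithmetic, no I/O.
func sumSquares(n int) int {
	total := 0
	for i := 1; i <= n; i++ {
		total += i * i
	}
	return total
}

// runSequential processes every input one after the other in the calling goroutine.
func runSequential(inputs []int) []int {
	results := make([]int, len(inputs))
	for i, n := range inputs {
		results[i] = sumSquares(n)
	}
	return results
}

// runConcurrent starts one goroutine per input. Each goroutine writes only
// to its own slice element, so no extra locking is needed.
func runConcurrent(inputs []int) []int {
	results := make([]int, len(inputs))
	var wg sync.WaitGroup
	for i, n := range inputs {
		wg.Add(1)
		go func(i, n int) {
			defer wg.Done()
			results[i] = sumSquares(n)
		}(i, n)
	}
	wg.Wait() // block until every goroutine has called Done
	return results
}

func main() {
	inputs := []int{1000, 2000, 3000}
	fmt.Println("sequential:", runSequential(inputs))
	fmt.Println("concurrent:", runConcurrent(inputs))
}
```

Both functions return the same results; only the way the work is scheduled differs.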
CPU Bound Processes in Golang [4 Best Examples]
Now that we understand what CPU-bound processes are, let's look at the example code below to see how they are implemented using the sequential and concurrent methods. In all the examples, we have created CPU-bound functions, meaning the speed of the code depends on the speed of the CPU.
Example-1 : Using Sequential Method
In this example, we have created five CPU-bound functions, each of which iterates over an empty for loop from 1 to 9999999. runtime.NumCPU() returns the number of logical CPU cores available on your machine. By default, the Go runtime is normally allowed to run goroutines on all of these cores (the GOMAXPROCS setting).
package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	cores := runtime.NumCPU()
	fmt.Println("\nNumber of cores available", cores, "\n")

	start := time.Now()
	salaryCount1()
	salaryCount2()
	salaryCount3()
	salaryCount4()
	salaryCount5()
	stop := time.Since(start)
	fmt.Println("Processing took: ", stop)
}

func salaryCount1() {
	fmt.Println("salaryCount1 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	fmt.Println("salaryCount1 function Done!!")
}

func salaryCount2() {
	fmt.Println("salaryCount2 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	fmt.Println("salaryCount2 function Done!!")
}

func salaryCount3() {
	fmt.Println("salaryCount3 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	fmt.Println("salaryCount3 function Done!!")
}

func salaryCount4() {
	fmt.Println("salaryCount4 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	fmt.Println("salaryCount4 function Done!!")
}

func salaryCount5() {
	fmt.Println("salaryCount5 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	fmt.Println("salaryCount5 function Done!!")
}
PS C:\Users\linuxnasa\OneDrive\Desktop\Go-Dump> go run .\hello-world.go

Number of cores available 8

salaryCount1 function Executing.....
salaryCount1 function Done!!
salaryCount2 function Executing.....
salaryCount2 function Done!!
salaryCount3 function Executing.....
salaryCount3 function Done!!
salaryCount4 function Executing.....
salaryCount4 function Done!!
salaryCount5 function Executing.....
salaryCount5 function Done!!
Processing took:  14.2424ms
In the above example, we see that the processing took around 14 ms. Because each function is executed one after the other in a single goroutine, only one core is doing the work at any moment, so having more cores does not reduce the processing time. To overcome this problem, we will convert this code into a concurrent program in the next two examples. Let's see how that works.
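Example-1 prints runtime.NumCPU(), but the number of cores the Go scheduler may actually use is governed by a separate setting, GOMAXPROCS. The sketch below reads both values; schedulerInfo is a hypothetical helper name introduced here. Calling runtime.GOMAXPROCS with an argument of 0 only queries the current value without changing it. On most machines the two numbers match by default, though that is an assumption about your environment (the GOMAXPROCS environment variable or a container CPU limit can make them differ).

```go
package main

import (
	"fmt"
	"runtime"
)

// schedulerInfo returns the number of logical CPUs and the current
// GOMAXPROCS value. Passing 0 to runtime.GOMAXPROCS leaves the setting unchanged.
func schedulerInfo() (cpus, maxProcs int) {
	return runtime.NumCPU(), runtime.GOMAXPROCS(0)
}

func main() {
	cpus, maxProcs := schedulerInfo()
	fmt.Println("Logical CPUs:", cpus)
	fmt.Println("GOMAXPROCS:  ", maxProcs)
}
```

A sequential program such as example-1 takes about the same time whatever these values are, since it never has more than one goroutine doing work.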
Example-2 : Using Waitgroup Implementation
In this example, we will convert the sequential code used in example-1 to concurrent code using a WaitGroup. We set the WaitGroup counter to 5 using wg.Add(5), since we start five CPU-intensive goroutines. Each goroutine calls wg.Done() at the end to signal that its execution is complete, and wg.Wait() in main blocks until all five have done so.
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

var wg = sync.WaitGroup{}

func main() {
	cores := runtime.NumCPU()
	fmt.Println("\nNumber of cores available", cores, "\n")

	wg.Add(5)
	start := time.Now()
	go salaryCount1()
	go salaryCount2()
	go salaryCount3()
	go salaryCount4()
	go salaryCount5()
	wg.Wait()
	stop := time.Since(start)
	fmt.Println("Processing took: ", stop)
}

func salaryCount1() {
	fmt.Println("salaryCount1 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	fmt.Println("salaryCount1 function Done!!")
	wg.Done()
}

func salaryCount2() {
	fmt.Println("salaryCount2 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	fmt.Println("salaryCount2 function Done!!")
	wg.Done()
}

func salaryCount3() {
	fmt.Println("salaryCount3 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	fmt.Println("salaryCount3 function Done!!")
	wg.Done()
}

func salaryCount4() {
	fmt.Println("salaryCount4 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	fmt.Println("salaryCount4 function Done!!")
	wg.Done()
}

func salaryCount5() {
	fmt.Println("salaryCount5 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	fmt.Println("salaryCount5 function Done!!")
	wg.Done()
}
PS C:\Users\linuxnasa\OneDrive\Desktop\Go-Dump> go run .\hello-world.go

Number of cores available 8

salaryCount5 function Executing.....
salaryCount2 function Executing.....
salaryCount1 function Executing.....
salaryCount3 function Executing.....
salaryCount4 function Executing.....
salaryCount5 function Done!!
salaryCount1 function Done!!
salaryCount3 function Done!!
salaryCount2 function Done!!
salaryCount4 function Done!!
Processing took:  4.2731ms
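In the above example, notice that the processing took around 4 ms, which is noticeably less than the roughly 14 ms in example-1. The machine has 8 cores, so the Go scheduler can run each of the five goroutines on its own core in parallel. The order of the output lines can change from run to run, because the goroutines are scheduled independently.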
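The five functions in example-2 differ only in their names, so the same idea can be written with one parameterized worker. The sketch below is one possible refactoring, not the tutorial's original code: salaryCount and runWorkers are hypothetical names, the WaitGroup is passed by pointer instead of being a package-level variable, and each worker records its iteration count so the work has a visible result.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// salaryCount is a single parameterized stand-in for salaryCount1..5.
// It stores the number of loop iterations it performed in counts[id].
func salaryCount(id int, wg *sync.WaitGroup, counts []int) {
	defer wg.Done() // runs when the function returns, even on an early return
	n := 0
	for i := 1; i < 10000000; i++ {
		n++
	}
	counts[id] = n // each worker writes only to its own element
}

// runWorkers launches the given number of goroutines and waits for all of them.
func runWorkers(workers int) []int {
	counts := make([]int, workers)
	var wg sync.WaitGroup
	wg.Add(workers)
	for id := 0; id < workers; id++ {
		go salaryCount(id, &wg, counts)
	}
	wg.Wait()
	return counts
}

func main() {
	start := time.Now()
	counts := runWorkers(5)
	fmt.Println("Iterations per worker:", counts)
	fmt.Println("Processing took:", time.Since(start))
}
```

Using defer wg.Done() at the top of the worker keeps the counter correct even if the function later gains an early return path.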
Example-3 : Using Channel Implementation
In this example, we have converted the code used in example-2 into a channel version. We have created a buffered channel of size five using ch := make(chan string, 5). Each goroutine sends its result into the channel when it finishes. For example, the salaryCount1 goroutine sends its message using ch <- "salaryCount1 function Done!!". The main function receives five messages, one per goroutine, and then closes the channel using close(ch).
package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	cores := runtime.NumCPU()
	fmt.Println("\nNumber of cores available:", cores, "\n")

	ch := make(chan string, 5)
	start := time.Now()
	go salaryCount1(ch)
	go salaryCount2(ch)
	go salaryCount3(ch)
	go salaryCount4(ch)
	go salaryCount5(ch)
	for i := 0; i < 5; i++ {
		fmt.Println(<-ch)
	}
	close(ch)
	stop := time.Since(start)
	fmt.Println("Processing took:", stop)
}

func salaryCount1(ch chan string) {
	fmt.Println("salaryCount1 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	ch <- "salaryCount1 function Done!!"
}

func salaryCount2(ch chan string) {
	fmt.Println("salaryCount2 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	ch <- "salaryCount2 function Done!!"
}

func salaryCount3(ch chan string) {
	fmt.Println("salaryCount3 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	ch <- "salaryCount3 function Done!!"
}

func salaryCount4(ch chan string) {
	fmt.Println("salaryCount4 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	ch <- "salaryCount4 function Done!!"
}

func salaryCount5(ch chan string) {
	fmt.Println("salaryCount5 function Executing.....")
	for i := 1; i < 10000000; i++ {
	}
	ch <- "salaryCount5 function Done!!"
}
PS C:\Users\linuxnasa\OneDrive\Desktop\Go-Dump> go run .\hello-world.go

Number of cores available: 8

salaryCount5 function Executing.....
salaryCount2 function Executing.....
salaryCount3 function Executing.....
salaryCount4 function Executing.....
salaryCount1 function Executing.....
salaryCount5 function Done!!
salaryCount3 function Done!!
salaryCount4 function Done!!
salaryCount1 function Done!!
salaryCount2 function Done!!
Processing took: 3.7191ms
In the above example, notice that the execution took around 3.7 ms, which is close to the WaitGroup version. As before, each goroutine can run on a separate core, so the five loops execute in parallel.
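Example-3 works because main knows in advance that exactly five messages will arrive. When the receiver should not need to know the count, a common pattern is to combine a WaitGroup with the channel: a small helper goroutine closes the channel once every sender has finished, and the receiver simply ranges over it. The sketch below assumes this pattern; collect is a hypothetical function name, and the messages are sorted only to make the result deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect starts the given number of worker goroutines, gathers one
// message from each over a channel, and returns the messages sorted.
func collect(workers int) []string {
	ch := make(chan string) // unbuffered: each send waits for the receiver
	var wg sync.WaitGroup

	for id := 1; id <= workers; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 1; i < 10000000; i++ {
			}
			ch <- fmt.Sprintf("salaryCount%d function Done!!", id)
		}(id)
	}

	// Close the channel after all senders are done, so the range loop ends.
	go func() {
		wg.Wait()
		close(ch)
	}()

	var msgs []string
	for msg := range ch {
		msgs = append(msgs, msg)
	}
	sort.Strings(msgs) // completion order varies between runs
	return msgs
}

func main() {
	for _, msg := range collect(5) {
		fmt.Println(msg)
	}
}
```

Only the sending side closes the channel here; closing it while a worker is still sending would cause a panic.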
Example-4 : Find prime number
In this example, we will count the prime numbers up to 100000. This code is implemented using a channel. We have created a function called findPrime(), which contains the logic to decide whether a given number is prime and returns true or false to its caller, countPrimesInRange(). The main function starts countPrimesInRange() as a goroutine ten times, once per sub-range (together the ten sub-ranges cover 2 to 100001), and each goroutine increments its counter every time findPrime() reports a prime. At the end of its for loop, each goroutine sends its count into the channel using resultCh <- count. The main function receives these counts and adds them up using totalPrimeCount += <-resultCh.
package main

import (
	"fmt"
	"time"
)

func findPrime(num int) bool {
	if num <= 1 {
		return false
	}
	for i := 2; i*i <= num; i++ {
		if num%i == 0 {
			return false
		}
	}
	return true
}

func countPrimesInRange(start, end int, resultCh chan int) {
	count := 0
	for i := start; i <= end; i++ {
		if findPrime(i) {
			count++
		}
	}
	resultCh <- count
}

func main() {
	start := time.Now()
	const numWorkers = 10
	totalNumbers := 100000
	numbersPerWorker := totalNumbers / numWorkers
	resultCh := make(chan int, numWorkers)

	for i := 0; i < numWorkers; i++ {
		startRange := i*numbersPerWorker + 2
		endRange := startRange + numbersPerWorker - 1
		go countPrimesInRange(startRange, endRange, resultCh)
	}

	totalPrimeCount := 0
	for i := 0; i < numWorkers; i++ {
		totalPrimeCount += <-resultCh
	}
	stop := time.Since(start)
	fmt.Println("Total prime numbers found:", totalPrimeCount)
	fmt.Println("Processing took:", stop)
}
PS C:\Users\linuxnasa\OneDrive\Desktop\Go-Dump> go run .\hello-world.go
Total prime numbers found: 9592
Processing took: 3.1948ms
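One way to gain confidence in a concurrent version is to compare it with a plain sequential count over the same range. The sketch below does that under a few assumptions: isPrime, countPrimes and countPrimesConcurrent are hypothetical names, the range is inclusive, and the split uses ceiling division so that ranges which do not divide evenly among the workers are still fully covered.

```go
package main

import "fmt"

// isPrime uses the same trial-division test as findPrime in example-4.
func isPrime(n int) bool {
	if n <= 1 {
		return false
	}
	for i := 2; i*i <= n; i++ {
		if n%i == 0 {
			return false
		}
	}
	return true
}

// countPrimes counts primes in the inclusive range [lo, hi] sequentially.
func countPrimes(lo, hi int) int {
	count := 0
	for i := lo; i <= hi; i++ {
		if isPrime(i) {
			count++
		}
	}
	return count
}

// countPrimesConcurrent splits [lo, hi] into at most `workers` chunks
// and counts each chunk in its own goroutine.
func countPrimesConcurrent(lo, hi, workers int) int {
	if workers < 1 {
		workers = 1
	}
	total := hi - lo + 1
	chunk := (total + workers - 1) / workers // ceiling division
	resultCh := make(chan int, workers)

	launched := 0
	for start := lo; start <= hi; start += chunk {
		end := start + chunk - 1
		if end > hi {
			end = hi // the last chunk may be shorter
		}
		launched++
		go func(s, e int) { resultCh <- countPrimes(s, e) }(start, end)
	}

	sum := 0
	for i := 0; i < launched; i++ {
		sum += <-resultCh
	}
	return sum
}

func main() {
	fmt.Println("sequential:", countPrimes(2, 100000))
	fmt.Println("concurrent:", countPrimesConcurrent(2, 100000, 10))
}
```

Both calls should report the same total as example-4, since they count the same set of primes.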
Summary
We saw multiple examples of how CPU-intensive workloads can be implemented, both sequentially and concurrently. CPU-bound workloads appear in many scenarios, for example image processing, data compression, cryptography, mathematical calculation and more.