Go Programming Example
Go v1
Recently, the Go language released version 1.0, which has a more stable API and aims to attract enterprise developers to use Go as part of their solutions.
I read the tutorial when Go was first released. At that time, Go was positioned as a new systems language, like C or C++. Now it has become more general: it can be used for any programming task that requires efficiency and concurrency.
As a language, Go has some pretty unusual features, such as postfix type declarations and its interface-oriented design. Other than that, Go feels like a compiled scripting language: it is easy to write, and it focuses on making concurrency easier through goroutines.
Install Go
On OS X you can use Homebrew:
brew install go
Concurrency
One main feature of Go is the goroutine:
package main
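A minimal sketch of the idea, not the original listing: putting the go keyword in front of a function call starts that call as a goroutine. The helper name runWorkers and the use of sync.WaitGroup to wait for the goroutines are my assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers (hypothetical helper) starts n goroutines with the go keyword
// and waits for all of them. Each goroutine writes only to its own slice
// element, so no locking is needed.
func runWorkers(n int) []string {
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			results[id] = fmt.Sprintf("hello from goroutine %d", id)
		}(i)
	}
	wg.Wait() // without this, main could return before the goroutines run
	return results
}

func main() {
	for _, line := range runWorkers(3) {
		fmt.Println(line)
	}
}
```

The WaitGroup matters here: when main returns, the program exits even if goroutines are still running.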
In Java, the equivalent is to wrap the code in a Runnable, hand it to a new Thread, and call start().
When a function is called with go, it runs concurrently as a goroutine. It does not block the main flow, and the runtime can spread goroutines across threads, so the program can make efficient use of multiple cores.
Other frameworks can make threading easy too, but Go provides language-level support along with another useful feature: channels.
In Java, a big problem when writing a multithreaded program is how to pass data between threads. One solution is to use shared memory. For example:
public class Counter{
In this example, two threads both increment the counter. Without the synchronized keyword, the counter will not reach 200000000 in the end: when threads 1 and 2 access the counter at the same time, one thread's update can be overwritten by the other, so two increments only raise the counter by 1. The synchronized keyword takes a lock so that only one thread at a time can update the counter. But even with lock protection, communication between threads is still a problem that reduces performance and makes parallel computation harder. Go therefore takes a different approach to this problem: the channel.
The channel version of the counter sends its data through a Go channel.
Two counter routines run as goroutines, each started with go counter(channel, closer).
In the main function, the for loop waits to read from the channel until it is closed.
What happens is that the scheduler switches between goroutines: main runs when there is data in the channel, and otherwise the other goroutines run, until the channel is closed.
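The description above can be sketched roughly as follows. This is a reconstruction under assumptions, not the original program: instead of the closer flag, it uses a sync.WaitGroup so the channel is closed only after both senders have finished, and runCounters is a hypothetical wrapper added so that the buffer size can be varied.

```go
package main

import (
	"fmt"
	"sync"
)

// counter sends the values 0..n-1 into channel, one at a time.
func counter(id, n int, channel chan<- int, wg *sync.WaitGroup) {
	defer wg.Done()
	for i := 0; i < n; i++ {
		fmt.Println("Thread", id, "send", i)
		channel <- i
	}
}

// runCounters (hypothetical wrapper) starts two counters and returns how
// many values the receiving loop read before the channel was closed.
func runCounters(n, buffer int) int {
	channel := make(chan int, buffer)
	var wg sync.WaitGroup
	for id := 1; id <= 2; id++ {
		wg.Add(1)
		go counter(id, n, channel, &wg)
	}
	// Close the channel once both senders are done, so the range loop ends.
	go func() {
		wg.Wait()
		close(channel)
	}()
	received := 0
	for range channel {
		fmt.Println("receiving")
		received++
	}
	return received
}

func main() {
	fmt.Println("received", runCounters(5, 0), "values")
}
```

The exact interleaving of the send and receiving lines depends on the scheduler, so it can differ from run to run; only the total number of received values is fixed.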
Example output when each routine counts to 5:
Thread 1 send 0
Thread 2 send 0
Thread 1 send 1
receiving
receiving
Thread 2 send 1
receiving
Thread 1 send 2
receiving
Thread 2 send 2
receiving
Thread 1 send 3
receiving
Thread 2 send 3
receiving
Thread 1 send 4
receiving
Thread 2 send 4
receiving
receiving
Moreover, we can increase the channel buffer so that senders do not have to wait for the receiver on every value, for example:
channel := make(chan int, 10)
Increasing the buffer to 10 changes the execution sequence:
Thread 1 send 0
Thread 2 send 0
Thread 1 send 1
Thread 2 send 1
receiving
Thread 1 send 2
Thread 2 send 2
receiving
Thread 1 send 3
Thread 2 send 3
receiving
Thread 1 send 4
Thread 2 send 4
receiving
receiving
receiving
receiving
receiving
receiving
receiving
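The effect of the buffer can also be seen in isolation. The sketch below, with a hypothetical helper named fillBuffer, counts how many sends a channel accepts while nobody is receiving. It uses select with a default case so that the program does not block once the buffer is full.

```go
package main

import "fmt"

// fillBuffer (hypothetical helper) sends into a channel that has no
// receiver and reports how many sends were accepted before one would block.
func fillBuffer(size int) int {
	channel := make(chan int, size)
	sent := 0
	for {
		select {
		case channel <- sent:
			sent++
		default:
			// The buffer is full: a plain send would block here.
			return sent
		}
	}
}

func main() {
	fmt.Println(fillBuffer(10)) // buffered: sends are accepted until the buffer fills
	fmt.Println(fillBuffer(0))  // unbuffered: a send needs a waiting receiver
}
```

This is why the buffered run above shows several sends in a row before the first receive: the senders can run ahead of the receiver until the buffer fills up.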
With the channel model, goroutines can simply wait for and receive data instead of exchanging it through shared memory, which makes parallel programming easier and more natural.
However, accessing a channel is still expensive. If we tweak the counter program above to send back only the result:
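// counter counts locally and sends only the final result on the channel.
// id and closer are not used in this version.
func counter(id int, channel chan int, closer bool) {
    x := 0
    for i := 0; i < 10000000; i++ {
        x++
    }
    channel <- x
}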
The execution times are very different:
//count to 200000000
//channel count:
real 0m32.650s
//result only:
real 0m0.359s
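For completeness, here is a sketch of how the receiving side might look in the result-only version. The function countWith and the simplified counter signature are my assumptions; the original main is not shown. Each goroutine touches the channel exactly once, so the channel cost no longer scales with the size of the count.

```go
package main

import "fmt"

// counter counts locally and sends only the final total.
func counter(n int, channel chan<- int) {
	x := 0
	for i := 0; i < n; i++ {
		x++
	}
	channel <- x
}

// countWith (hypothetical helper) runs the given number of counter
// goroutines and adds up the single value each of them sends.
func countWith(workers, n int) int {
	channel := make(chan int)
	for w := 0; w < workers; w++ {
		go counter(n, channel)
	}
	total := 0
	for w := 0; w < workers; w++ {
		total += <-channel // one receive per worker, not one per increment
	}
	return total
}

func main() {
	fmt.Println(countWith(2, 10000000))
}
```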
Interface Oriented
In Go, there are no classes. Go takes a lightweight approach in which data and functions are separate.
A function can be attached to a data type by declaring that type as its receiver.
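A small sketch of what a receiver looks like, using a hypothetical Point type that is not from the original post. The receiver is declared in parentheses before the method name. A pointer receiver lets the method modify the value in place.

```go
package main

import "fmt"

// Point is plain data; there is no class body to put methods in.
type Point struct {
	X, Y int
}

// Add has a value receiver: it works on a copy and returns a new Point.
func (p Point) Add(q Point) Point {
	return Point{p.X + q.X, p.Y + q.Y}
}

// Scale has a pointer receiver, so it can modify the Point in place.
func (p *Point) Scale(k int) {
	p.X *= k
	p.Y *= k
}

func main() {
	p := Point{1, 2}
	p.Scale(3)
	fmt.Println(p.Add(Point{1, 1}))
}
```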
An interface, on the other hand, is a collection of methods that describes the behavior of a type.
Go uses a duck-typing approach: as long as a type has all the methods defined in an interface, it can be treated as that interface type, without declaring that it implements it.
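As an illustration, with hypothetical Shape, Rect and Square types: neither struct mentions Shape anywhere, yet both can be passed where a Shape is expected because each has an Area method.

```go
package main

import "fmt"

// Shape describes behavior only: anything with an Area method qualifies.
type Shape interface {
	Area() float64
}

type Rect struct{ W, H float64 }
type Square struct{ Side float64 }

// Neither type declares that it implements Shape; having Area is enough.
func (r Rect) Area() float64   { return r.W * r.H }
func (s Square) Area() float64 { return s.Side * s.Side }

// totalArea accepts any mix of values that satisfy Shape.
func totalArea(shapes ...Shape) float64 {
	total := 0.0
	for _, s := range shapes {
		total += s.Area()
	}
	return total
}

func main() {
	fmt.Println(totalArea(Rect{2, 3}, Square{2}))
}
```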
Map Reduce
Map Reduce is a programming model that distributes a calculation and then collects the results.
The most famous example is Google. When Google builds its index of web pages, the calculation is too big for a single machine, so the task has to be distributed across many servers and run as much in parallel as possible. The result is the map reduce structure: each server receives some number of web pages, parses them and counts the words on each page, and the resulting partial index is collected and added to the main index.
The structure of Map Reduce in Go that I use here comes from Dan Bravender.
I use the type “Object” to refer to the empty interface, because Go 1.0 does not have generic types.
The first section sends the input to the mapper and receives worker_outputs, a collection of each worker’s result channel. The second section then sends worker_outputs to the reducer, which returns the result of the calculation.
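A rough sketch of a structure like the one described, written from the description above and not Dan Bravender's actual code. The function signatures, the one-goroutine-per-input layout, and the example lengthMapper and sumReducer are all my assumptions.

```go
package main

import "fmt"

// Object stands in for "any value", as an alias-like name for the empty interface.
type Object interface{}

// MapReduce starts one mapper goroutine per input, collects each worker's
// result channel, and hands the whole collection to the reducer.
func MapReduce(
	mapper func(Object, chan<- Object),
	reducer func([]chan Object) Object,
	inputs []Object,
) Object {
	// First section: send every input to a mapper.
	workerOutputs := make([]chan Object, len(inputs))
	for i, input := range inputs {
		out := make(chan Object, 1)
		workerOutputs[i] = out
		go mapper(input, out)
	}
	// Second section: let the reducer combine the workers' results.
	return reducer(workerOutputs)
}

// lengthMapper (example) maps a string to its length.
func lengthMapper(input Object, out chan<- Object) {
	out <- len(input.(string))
}

// sumReducer (example) reads one int from every worker and adds them up.
func sumReducer(outputs []chan Object) Object {
	total := 0
	for _, out := range outputs {
		total += (<-out).(int)
	}
	return total
}

func main() {
	words := []Object{"go", "map", "reduce"}
	fmt.Println(MapReduce(lengthMapper, sumReducer, words))
}
```

The type assertions such as input.(string) are the price of using the empty interface: the compiler cannot check the element types, so a wrong type only shows up at run time.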
For example, we can estimate PI from a number of random points. Assume a circle with radius 1, and pick a random point in the unit square. The fraction of points that land inside the circle approaches the area of the quarter circle, PI / 4, so PI is approximately 4 × (number of points in the circle) / (total points). The more points we collect, the more accurate the result will be.
We can distribute the calculation of 2000 points to 2000 workers, and PI will then be approximately 4 × result / total points.
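A self-contained sketch of this estimate, independent of any MapReduce helper. The names worker and estimatePi, the fixed per-worker seeds, and the choice of 1000 points per worker are assumptions for illustration. The unit square only covers a quarter of the circle, hence the factor 4 in the last step.

```go
package main

import (
	"fmt"
	"math/rand"
)

// worker draws random points in the unit square and sends back how many
// of them fall inside the circle of radius 1.
func worker(seed int64, points int, out chan<- int) {
	// Each worker gets its own generator, so nothing is shared between goroutines.
	rng := rand.New(rand.NewSource(seed))
	inside := 0
	for i := 0; i < points; i++ {
		x, y := rng.Float64(), rng.Float64()
		if x*x+y*y <= 1 {
			inside++
		}
	}
	out <- inside
}

// estimatePi distributes the sampling over several goroutines and combines
// their counts into one estimate.
func estimatePi(workers, pointsPerWorker int) float64 {
	out := make(chan int, workers)
	for w := 0; w < workers; w++ {
		go worker(int64(w+1), pointsPerWorker, out)
	}
	inside := 0
	for w := 0; w < workers; w++ {
		inside += <-out
	}
	// inside/total approximates PI/4, the area of the quarter circle.
	return 4 * float64(inside) / float64(workers*pointsPerWorker)
}

func main() {
	fmt.Printf("%.3f\n", estimatePi(2000, 1000))
}
```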
Unit Test
Go provides a testing package and a built-in command for testing:
go test
It executes the test cases in all *_test.go files.
Let’s write some tests for the MapReduce mapper and reducer we have.
The Go testing framework passes a pointer (*testing.T) to each test case.
There is no extra syntax like assert to learn. Just use plain == and != to check whether the results are correct. If an error occurs, call testcase.Error to report it; otherwise the case is marked as passed.
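A minimal sketch of such a test, using a hypothetical addAll function instead of the original mapper and reducer. In a real project the Test function would live in its own file, for example reducer_test.go, and be run with go test. Here the function under test, the test, and a small main share one listing only so that it stands on its own.

```go
package main

import (
	"fmt"
	"testing"
)

// addAll is a hypothetical reducer-like function under test.
func addAll(values []int) int {
	total := 0
	for _, v := range values {
		total += v
	}
	return total
}

// TestAddAll follows the naming rule go test looks for: a function whose
// name starts with Test and that takes a *testing.T.
func TestAddAll(testcase *testing.T) {
	if got := addAll([]int{1, 2, 3}); got != 6 {
		testcase.Error("addAll returned", got, "expected 6")
	}
	if got := addAll(nil); got != 0 {
		testcase.Error("addAll(nil) returned", got, "expected 0")
	}
}

// main is only here so the listing also builds as a standalone program.
func main() {
	fmt.Println(addAll([]int{1, 2, 3}))
}
```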
Conclusion
Go is a very interesting language. At first it may feel awkward because of its syntax and its lack of classical object-oriented support. But once you get to know the simplicity and concurrency of Go, it becomes a handy tool when you need fast computation. However, we’re still waiting for a more complete ecosystem for developers and for successful use cases of the Go language.