Monday, April 18, 2011

The Go language a year and a half later

[This note was originally written and published as a guest post on my friend's blog. I highly recommend a visit.]
On December 12, 2009, Gynvael wrote a blog post about his first impressions of Google's Go language. Over a year later I mentioned to the author that a similar post about Digital Mars D and Mozilla Rust would also be an interesting topic. That conversation reminded me of the original post and prompted me to refresh the material. After encountering the sentence "And, I look forward to the Windows version, since it's currently available only on *nix platforms." I decided to check on the current state of Go.



A couple of minutes spent on the official Go website quickly turned up a link discussing a port of the language to the MS Windows platform.


The port page links directly to a site hosting compiled releases of the language from which we download the gowin32_2011-03-07.1_installer.exe and use it to install Go on our system.


During his reconnaissance Gynvael ported one of his raytracers to the language. My goal was to compile and run his code successfully on the MS Windows platform, updating the source where necessary, and to obtain a result identical to that of his initial implementation.


t2.go starts with a series of preprocessor macro declarations. I checked whether a similar feature had recently been added to the language, but a quick look at the mailing list and bug tracker showed that this functionality is still intentionally avoided.
This forces us to follow the author's recommendation: run the file through the standard cpp preprocessor and use grep to drop the lines beginning with # from its output. For this purpose we will use a separate MinGW installation already present on my system.


Execute the following command in a MinGW shell:
$ cpp t2.go | grep -v '^#' > t2.out.go


In the following sections of this post I will assume that at this point the t2.out.go file was renamed to t2.go.


Now from a regular cmd.exe command line we try to execute the next step outlined in the program header:


C:\blog>6g t2.go
The name '6g' is not recognized as an internal or external command, operable program or batch file.


Nothing unusual.
We've downloaded the 32-bit version of Go (I didn't see other versions on the download page).
The language creators follow a naming scheme from the Plan 9 operating system, so the 6 in 6g stands for amd64 (x86-64) and the g indicates a Go language compiler.
In this case we need to use the 8g compiler and the 8l linker (where 8 stands for the x86 architecture).
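
The naming scheme can be summarised in a few lines of Go. This is only an illustrative sketch: the helper toolNames and the archLetter table are made up for this post, and the table covers just the three architecture letters used by the gc toolchain of that era (5 for ARM, 6 for amd64, 8 for 386).

```go
package main

import "fmt"

// archLetter maps a GOARCH value to the Plan 9 style architecture letter
// used in the names of the gc tools.
var archLetter = map[string]string{
	"arm":   "5",
	"amd64": "6",
	"386":   "8",
}

// toolNames is a hypothetical helper: it returns the compiler and linker
// names for the given architecture, e.g. "8g" and "8l" for 386.
func toolNames(goarch string) (string, string, bool) {
	letter, ok := archLetter[goarch]
	if !ok {
		return "", "", false
	}
	return letter + "g", letter + "l", true
}

func main() {
	for _, arch := range []string{"386", "amd64", "arm"} {
		compiler, linker, _ := toolNames(arch)
		fmt.Printf("%-5s -> compiler %s, linker %s\n", arch, compiler, linker)
	}
}
```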


Let's retry the compilation with the proper programs:
C:\blog>8g t2.go
t2.go:57: syntax error: unexpected semicolon or newline before {
t2.go:60: non-declaration statement outside function body
t2.go:62: non-declaration statement outside function body
t2.go:65: syntax error: unexpected semicolon or newline before {
t2.go:65: non-declaration statement outside function body
t2.go:66: non-declaration statement outside function body
t2.go:67: non-declaration statement outside function body
t2.go:68: non-declaration statement outside function body
t2.go:70: syntax error: unexpected semicolon or newline before {
t2.go:70: too many errors



The compilation problems listed above result from changes made to the language itself over a year of its development.
It's worth noting that the error t2.go:70: too many errors doesn't mean that we made a tremendous number of mistakes on line 70 :) The compiler is informing us that there are more errors than it displays, so reporting of further problems is suspended until the ones currently shown are fixed.
Most people experienced in software development know that fixing the first reported error often eliminates a large part of the listed problems or reveals their real cause.
Printing every single error often leads to a mistake most beginner programmers make - fixing the last visible problem first. The default behavior of the Go compiler should reduce the number of people Chasing the Wind. Let's observe this on the first set of errors:


t2.go:57: syntax error: unexpected semicolon or newline before {
t2.go:60: non-declaration statement outside function body
t2.go:62: non-declaration statement outside function body



The contents of lines 56, 57 and 58 are presented below:
56: func main()
57: {
58:   fmt.Printf("Simple RT by gynvael.coldwind//vx (http://gynvael.coldwind.pl)\n");



The compiler complains about an unexpected semicolon or newline before the opening brace token. Referring to the documentation we read:


One caveat. You should never put the opening brace of a control structure (if, for, switch, or select) on the next line. If you do, a semicolon will be inserted before the brace, which could cause unwanted effects. Write them like this:
if i < f() {
   g()
}

not like this
if i < f() // wrong!
{          // wrong!
   g()
}


This sounds like a similar problem to the one we encountered but there is no mention of an automatically inserted semicolon after a function declaration. In order to precisely determine the root cause of our problem we look up the documented changes between the language releases.


In the note from 2009-12-22 we read:


Since the last release there has been one large syntactic change to
the language, already discussed extensively on this list: semicolons
are now implied between statement-ending tokens and newline characters.


A broader discussion on the introduced changes regarding the rules of insertion for the semicolon can be found on the mailing list.

Below is an interesting excerpt from the discussion, with an additional justification of the changes:

Believe it or not, this removes rules. Robert played with
this in the parser and the language spec and was surprised
how much simpler things got. I am doing the same conversion
in the 6g compiler right now, and the dead code I'm cutting
away is one of the ugliest parts of the compiler. I'd
forgotten writing it, but boy is it ugly. And soon it will
be gone. -- Russ Cox


It's interesting that there is no mention of how the change affects function headers. One can even find examples in the mailing list thread that use exactly the same brace placement as Gynvael did, which could lead one to assume that it's still valid syntax.
In our case, the most interesting piece of information is a mention of a feature built into gofmt that automatically adjusts code written with the old syntax rules:
gofmt -oldparser -w *.go
Let's try to manually fix the error on line 57 before we try any automatic transformations on our code.
56: func main() {
57: fmt.Printf("Simple RT by gynvael.coldwind//vx (http://gynvael.coldwind.pl)\n");


C:\blog>8g t2.go
t2.go:94: syntax error: unexpected semicolon or newline before {
t2.go:97: syntax error: unexpected {
t2.go:114: syntax error: unexpected }
t2.go:116: non-declaration statement outside function body
t2.go:116: empty top-level declaration
t2.go:117: non-declaration statement outside function body
t2.go:117: too many errors


The previous listing went up to the error on line 70, including an identical message for line 65, which doesn't contain any errors. After the change the list starts on line 94, saving us the effort of going through a lot of code in search of nonexistent errors. The list would be much longer if the Go compiler didn't stop reporting errors.
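
The effect of the implied semicolon on a function header can be reproduced in isolation. The sketch below assumes a current Go release, whose standard library ships the go/parser and go/token packages (the 2011 release used in this post may differ); the helper parses and both snippets are made up for illustration.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// parses reports whether src is syntactically valid Go, along with the
// parser error if it is not.
func parses(src string) (bool, error) {
	fset := token.NewFileSet()
	_, err := parser.ParseFile(fset, "snippet.go", src, 0)
	return err == nil, err
}

// Brace on its own line: a semicolon is implied after ")" at the newline,
// so the parser sees a body-less declaration followed by a stray "{".
const braceOnNextLine = "package main\n\nfunc main()\n{\n}\n"

// Brace on the same line as the function header: accepted.
const braceOnSameLine = "package main\n\nfunc main() {\n}\n"

func main() {
	for _, src := range []string{braceOnNextLine, braceOnSameLine} {
		ok, err := parses(src)
		fmt.Printf("valid=%v err=%v\n", ok, err)
	}
}
```

Only the second snippet is accepted, which matches what we saw after moving the brace on line 57.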

Now we will roll back our change to line 57 and see how many problems gofmt will fix:
C:\blog>gofmt -oldparser -w t2.go
flag provided but not defined: -oldparser
usage: gofmt [flags] [path ...]
-tabwidth=8: tab width
-trace=false: print parse trace
-r="": rewrite rule (e.g., '+-[+-:len(+-)] -> +-[+-:]')
-tabindent=true: indent with tabs independent of -spaces
-s=false: simplify code
-l=false: list files whose formatting differs from gofmt's
-w=false: write result to (source) file instead of stdout
-ast=false: print AST (before rewrites)
-spaces=true: align with spaces instead of tabs
-comments=true: print comments

The error above, flag provided but not defined: -oldparser, indicates that the -oldparser flag we found mentioned in the release notes is not supported.
In the note from 2010-01-13 we see an update announcing the removal of the -oldprinter flag, but that obviously isn't the flag we are using.
In the whole document there is only one occurrence of the string oldparser, so we must assume that the removal of this flag was left undocumented. The only thing left to do is to call the tool without the flag, hoping that the behavior is on by default:

C:\blog>gofmt -w t2.go
t2.go:57:1: expected declaration, found '{'
..... a long list of recognized problems


It seems that we're out of luck. gofmt doesn't fix any of our errors by default.
We will not report the problem to the authors. Most probably we would be directed to an older release of the tool or advised to modify our code base manually or with a custom script (based on how this issue was handled in December 2009).
All that's left is to fix the opening braces after function declarations and those reported for other statements (for example the if statement).
I'm leaving this as an exercise for the reader (with a link to a patch for the lazy ones).

After fixing the last brace problem, the compiler will print out another interesting set of errors:
C:\blog>8g t2.go
t2.go:56: ".ggg....." not used
t2.go:56: ".g...rrr." not used
t2.go:56: ".g.g.r.r." not used
t2.go:56: ".ggg.rrr." not used
t2.go:56: "........." not used
t2.go:411: ".ggg....." not used
t2.go:411: ".g...rrr." not used
t2.go:411: ".g.g.r.r." not used
t2.go:411: ".ggg.rrr." not used
t2.go:411: "........." not used
t2.go:56: too many errors


The messages indicate that strings located on lines 56 and 411 are not used. The reported lines point to the beginning and end of the main function, so they are not helpful in locating the erroneous line. Luckily for us, the reported strings don't occur in many places, and a quick manual search points us to line 81:

81: SpherePosMap =
82: "........."
83: ".ggg....."
84: ".g...rrr."
85: ".g.g.r.r."
86: ".ggg.rrr."
87: ".........";


In light of the information we have gathered so far, we can assume that the cause of this problem is yet again the automatic insertion of semicolons.
Here is the basis for my reasoning:
If a semicolon is inserted at the end of line 82, then only the string consisting of periods will be stored in the variable SpherePosMap.
Next, the compiler will insert semicolons at the end of each of the following lines containing string literals, which aren't assigned to any variable and hence aren't stored anywhere - they are unused.
The compiler reports ".ggg....." as the first unused string, so the first line, which we assumed is stored in the SpherePosMap variable, is omitted from the output entirely. This explanation is plausible enough that we can try a modification based on concatenating the strings together (using the + operator).

81: SpherePosMap =
82: "........." +
83: ".ggg....." +
84: ".g...rrr." +
85: ".g.g.r.r." +
86: ".ggg.rrr." +
87: ".........";
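
The same fix can be tried outside the ray tracer. In the stand-alone sketch below the constant spherePosMap and the row width are stand-ins chosen for illustration; the point is only that a line ending with + cannot end a statement, so no semicolon is implied at the line break.

```go
package main

import "fmt"

// spherePosMap mirrors the fixed assignment: every line except the last
// ends with "+", so no semicolon is implied at the line break.
const spherePosMap = "........." +
	".ggg....." +
	".g...rrr." +
	".g.g.r.r." +
	".ggg.rrr." +
	"........."

func main() {
	const width = 9 // characters per row in the map above
	for i := 0; i < len(spherePosMap); i += width {
		fmt.Println(spherePosMap[i : i+width])
	}
}
```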


Recompilation confirms that the problem is resolved and we are presented with a new one:
C:\blog>8g t2.go
t2.go:348: img.Width undefined (type *image.RGBA has no field or method Width)
t2.go:349: img.Height undefined (type *image.RGBA has no field or method Height)
t2.go:357: img.Height undefined (type *image.RGBA has no field or method Height)
t2.go:360: img.Width undefined (type *image.RGBA has no field or method Width)
t2.go:400: img.Pixel undefined (type *image.RGBA has no field or method Pixel)


It looks like the interface of the image package has changed.
A release note from 2010-08-11 confirms this with the following description of changes to the package:

An image.Image now has a Bounds rectangle, where previously it ranged
from (0, 0) to (Width, Height). Loops that previously looked like:


for y := 0; y < img.Height(); y++ {
   for x := 0; x < img.Width(); x++ {
       // Do something with img.At(x, y)
   }
}

should instead be:

b := img.Bounds()
for y := b.Min.Y; y < b.Max.Y; y++ {
   for x := b.Min.X; x < b.Max.X; x++ {
       // Do something with img.At(x, y)
   }
}

* image: change image representation from slice-of-slices to linear buffer,
introduce Decode and RegisterFormat,
introduce Transparent and Opaque,
replace Width and Height by Bounds, add the Point and Rect types.


The described changes require specific modifications to the Render function in the lines reported by the compiler. We start by adding the b variable to the list of variables declared in the function, in order to avoid repeated calls to img.Bounds() directly in the code:

var b = img.Bounds();

We also replace occurrences of img.Width() and img.Height() with:

b.Max.X // instead of img.Width()
b.Max.Y // instead of img.Height()

Additionally, in line 361 we make a purely cosmetic change from:

x = 0

to:

x = b.Min.X
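
For reference, here is the Bounds-based loop from the release note as a complete program. It is a sketch written against the image package of a current Go release (image.NewRGBA taking a rectangle is an assumption that may not hold for the 2011 release used here); countPixels and sample are hypothetical helpers, not part of Gynvael's code.

```go
package main

import (
	"fmt"
	"image"
)

// countPixels visits every pixel of img using its Bounds rectangle
// instead of assuming that the image starts at (0, 0).
func countPixels(img image.Image) int {
	b := img.Bounds()
	n := 0
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			_ = img.At(x, y) // do something with the pixel
			n++
		}
	}
	return n
}

// sample returns a 4x3 image whose bounds do not start at the origin,
// which shows why b.Min matters.
func sample() image.Image {
	return image.NewRGBA(image.Rect(10, 20, 14, 23))
}

func main() {
	fmt.Println("pixels visited:", countPixels(sample()))
}
```

In our ray tracer the image does start at (0, 0), which is why using b.Min.X instead of 0 is only a cosmetic change.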

After finishing the listed changes we are left with only one problem:

t2.go:400: img.Pixel undefined (type *image.RGBA has no field or method Pixel)

A mention of a change to Pixel appears in the first sentence of the second part of changes to the image package:


* image: change image representation from slice-of-slices to linear buffer

Unfortunately, for this change the authors didn't provide examples of the modifications required in existing code. We locate the image.go source file on our disk, installed along with the compiler, and check the section responsible for handling the image.RGBA structure.



/* Location: C:\Go\src\pkg\image\image.go:28 */
// An RGBA is an in-memory image of RGBAColor values.
type RGBA struct {
   // Pix holds the image's pixels. The pixel at (x, y) is Pix[y*Stride+x].
   Pix    []RGBAColor
   Stride int
   // Rect is the image's bounds.
   Rect Rectangle
}


By reviewing the snippet we can deduce that the Pixel field was renamed to Pix and, according to the release note, its representation changed from a slice-of-slices (a multidimensional array?) to a linear buffer.
A comment placed inside the structure declaration contains an example of how to use the new format:


Pix[y*Stride+x]

Armed with this new information, we go back to our code base and make the following change in line 401, from:


img.Pixel[y][x] = cl;

to:


img.Pix[y*img.Stride+x] = cl;
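
The difference between the two layouts is easy to see on a toy example. The types below are hypothetical stand-ins written for this post (they only mimic the Pix and Stride fields quoted above and are not the real image.RGBA); the sketch compares slice-of-slices indexing with the Pix[y*Stride+x] formula.

```go
package main

import "fmt"

// color is a hypothetical stand-in for the pixel type.
type color struct{ R, G, B, A uint8 }

// linearImage mimics the quoted layout: one flat slice plus a stride.
type linearImage struct {
	Pix    []color
	Stride int // number of elements between vertically adjacent pixels
}

// newLinearImage allocates a w x h image in a single linear buffer.
func newLinearImage(w, h int) *linearImage {
	return &linearImage{Pix: make([]color, w*h), Stride: w}
}

// set stores c at (x, y) using the Pix[y*Stride+x] formula.
func (m *linearImage) set(x, y int, c color) {
	m.Pix[y*m.Stride+x] = c
}

func main() {
	const w, h = 4, 3
	red := color{R: 255, A: 255}

	// Old style: slice-of-slices, indexed as pixel[y][x].
	old := make([][]color, h)
	for y := range old {
		old[y] = make([]color, w)
	}
	old[2][1] = red

	// New style: linear buffer, indexed as Pix[y*Stride+x].
	img := newLinearImage(w, h)
	img.set(1, 2, red)

	// Both layouts hold the same pixel at (x=1, y=2).
	fmt.Println(old[2][1] == img.Pix[2*img.Stride+1])
}
```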

Once again we attempt a compilation of the code:


C:\blog>8g t2.go

This time without errors, so we move onward to the linking:


C:\blog>8l t2.8

The effect of the above commands is an executable 8.out.exe, which we gladly run :)


C:\blog>8.out.exe
Simple RT by gynvael.coldwind//vx (http://gynvael.coldwind.pl)
Creating scene...
Rendering...
[0] Thread start
[1] Thread start
[2] Thread start
[3] Thread start
[0] Thread finished
[1] Thread finished
[2] Thread finished
[3] Thread finished
Writing test.png image...
Done.


The output of the program in png format can be viewed here.
During the program's execution we can observe that only one of the two (in my case) available CPU cores is utilized.

The code after every necessary change required to compile and run it can be found here.

At this point, we can think about how to make the program take proper advantage of the available cores. The official documentation (section Concurrency->Parallelization) proves to be helpful again:
The current implementation of gc (6g, etc.) will not parallelize this code by default.

The examples on that page show how to achieve parallelization, and the page also states that the feature is planned to become automatic in the future.

Let's try to modify Gynvael's code again.
We will add an import of the runtime package at the beginning of the file and define a constant containing the number of CPU cores available in our machine:


import (
"os";
"fmt";
"math";
"image";
"runtime"; // Added for rendering parallelization
"image/png"
)
const NCPU = 2; // Number of cores


Next, at the beginning of the main() function we add a call to GOMAXPROCS, passing our constant as the input parameter:


func main() {
runtime.GOMAXPROCS(NCPU)
fmt.Printf("Simple RT by gynvael.coldwind//vx (http://gynvael.coldwind.pl)\n");


After recompiling the program and running it again we can observe that both cores are properly used. We gain about 4 seconds by using the second CPU (~20s instead of ~24s to render the image).
It's interesting that in the initial version, according to the trace printouts, the threads always finished their work in the same order - the new version shows much more diversity here.
The rendered file test.png is identical to the rendering from our previous pass (diff doesn't report any differences).
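
The pattern used here (set GOMAXPROCS, split the rows between workers, wait for all of them) can be sketched without the ray tracer. Everything below is an assumption made for illustration: renderRows is a dummy workload standing in for the real per-thread rendering, renderParallel is a hypothetical helper, and runtime.NumCPU and sync.WaitGroup are used as found in current Go releases, where GOMAXPROCS already defaults to the number of cores.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// renderRows is a dummy stand-in for the ray tracer's per-thread work:
// it fills rows [from, to) of out with a value computed per row.
func renderRows(out []int, from, to int) {
	for y := from; y < to; y++ {
		sum := 0
		for i := 0; i < 1000; i++ {
			sum += (y * i) % 7
		}
		out[y] = sum
	}
}

// renderParallel splits the rows between up to `workers` goroutines and
// waits for all of them to finish. Each goroutine writes a disjoint range.
func renderParallel(rows, workers int) []int {
	out := make([]int, rows)
	chunk := (rows + workers - 1) / workers
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		from, to := w*chunk, (w+1)*chunk
		if to > rows {
			to = rows
		}
		if from >= to {
			break
		}
		wg.Add(1)
		go func(from, to int) {
			defer wg.Done()
			renderRows(out, from, to)
		}(from, to)
	}
	wg.Wait()
	return out
}

func main() {
	ncpu := runtime.NumCPU() // instead of a hard-coded NCPU constant
	runtime.GOMAXPROCS(ncpu)
	out := renderParallel(480, 4)
	fmt.Println("cores:", ncpu, "rows rendered:", len(out))
}
```

As with test.png, the result does not depend on how many workers take part; only the order in which they finish may vary.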

The modified code can be found here.

It's worth noting that the same effect can be achieved by setting the GOMAXPROCS environment variable. However, this isn't the most convenient way on the MS Windows platform, hence we decided to modify the code.

At this point we could finish our fun with the code but let's reformat the code with gofmt as a final step:


C:\blog>gofmt t2.go > t2_p3.go

The output of diff -u shows changes in the indentation of several sections of the code, the removal of all semicolons (which became optional during the development of the language) and changed brace placement around local code blocks.
The code expanded to 627 lines from the 416 we had after our last modification and, in my opinion, gained greatly in readability.
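
The same kind of cleanup can be tried on a small snippet from within Go itself. This sketch assumes a current Go release, where the gofmt formatting rules are available as the go/format package (I have not checked whether the 2011 release offers an equivalent); oldStyle and reformat are names made up for this example.

```go
package main

import (
	"fmt"
	"go/format"
)

// oldStyle uses explicit semicolons and no indentation, similar in spirit
// to the source before the gofmt pass.
const oldStyle = `package main

import "fmt";

func main() {
fmt.Printf("Done.\n");
}
`

// reformat runs the gofmt formatting rules over src.
func reformat(src string) (string, error) {
	out, err := format.Source([]byte(src))
	return string(out), err
}

func main() {
	out, err := reformat(oldStyle)
	if err != nil {
		fmt.Println("format error:", err)
		return
	}
	fmt.Print(out)
}
```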

This step should also help with future updates to the code base required by potential changes to the language's syntax (if done often enough - before the migration support is removed from gofmt, as happened with -oldparser).

A final version of the code after all of the modifications can be found under this address.

I also provided links to the executable files compiled on my system (MS Windows Vista 32-bit):


Regardless of the lack of recent updates to the code base and a language still in development, the amount and difficulty of required changes to the code were surprisingly small.
The quality of the error reports generated by the Go compiler is also satisfying - they mostly point to the precise source of the problem without flooding us with irrelevant information.
A recently published post on the official blog of the language says that the language is entering a more stable phase, with official releases published once a month (up to now they were published once a week). The weekly releases will also keep their publication schedule, which is good news for people wanting to experiment with the bleeding edge version of the language. The authors also announced the introduction of the gofix tool, which will take care of the code modifications required by changes to the language, so most of the actions described in this post should become redundant in the future :)

PS.
While writing this note I stumbled upon a different implementation of the Go language compiler for the MS Windows platform, namely erGo, which is especially interesting since it's implemented in Go itself and supports debugging in Visual Studio 2008. Unfortunately I didn't have the opportunity to test this implementation.

PS2.
Thanks to neme for proofreading this post :)

PS3.
When this note was initially written gofix wasn't released yet. It's out now so go ahead and read about it on the official blog. If anyone tries the tool on the initial version of Gynvael's code then please share with us in the comments :)