Tuesday, 25 June 2013

Go: Unicode on the Windows command prompt (golang)

Go (golang.org) has pretty good Unicode support on the Windows command line. I've written about Unicode and cmd.exe before in the context of C++, C# and Java.

Versions: Windows 7 (64-bit); go1.1.1 windows/amd64; Java 1.7.0_21

Unicode in Go using cmd.exe

The command prompt still needs to be customized to use a TrueType font to overcome the limitations of raster fonts.

The following code prints the characters £я using their escape sequences in the source:

package main
import "fmt"
func main() {
        // \u00A3 is £ (pound sign); \u044F is я (Cyrillic small letter ya)
        fmt.Println("\u00A3\u044F")
}

Running the code:

>go run codepoints.go
£я

When output is redirected to a file, Go writes UTF-8, so this isn't lossy either.
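To see why the redirect is lossless, it helps to look at the bytes involved. The following is a small sketch (the helper name utf8Hex is made up for illustration) that dumps the UTF-8 bytes behind the two characters. These are the bytes you would expect to find in the redirected file, followed by a newline:

```go
package main

import "fmt"

// utf8Hex is a hypothetical helper for this post: it returns the bytes
// of s as space-separated hex. A Go string literal built from \u escapes
// holds the UTF-8 encoding of those code points.
func utf8Hex(s string) string {
	out := ""
	for i := 0; i < len(s); i++ {
		if i > 0 {
			out += " "
		}
		out += fmt.Sprintf("%02x", s[i])
	}
	return out
}

func main() {
	// Two code points, two bytes each in UTF-8.
	fmt.Println(utf8Hex("\u00A3\u044F")) // c2 a3 d1 8f
}
```

Both characters fall outside ASCII, so each takes two bytes; nothing is replaced with a question mark along the way.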

Java Unicode shiv for cmd.exe implemented in Go

The following code uses Go to act as an intermediary between Java and cmd.exe. This avoids any mucking about with WriteConsoleW.

The Java code still needs to switch to a lossless encoding as System.out will use an old "ANSI" encoding by default. The implementation uses UTF-8 as the intermediary character encoding:

import java.io.*;
class CodePoints {
  public static void main(String[] args) throws IOException {
    // set STDOUT encoding to UTF-8
    System.setOut(new PrintStream(System.out, true, "UTF-8"));
    // print some data (here, the same £я code points as the Go example)
    System.out.println("\u00A3\u044F");
  }
}

The Go code transcodes the received bytes from UTF-8 to its rune type before emitting them:

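package main

import (
        "fmt"
        "os/exec"
        "unicode/utf8"
)

// Transcoder decodes UTF-8 bytes and prints them rune by rune.
type Transcoder struct{}

func (w Transcoder) Write(b []byte) (n int, err error) {
        n = len(b)
        for len(b) > 0 {
                r, size := utf8.DecodeRune(b)
                fmt.Printf("%c", r)
                b = b[size:]
        }
        return n, err
}

func main() {
        cmd := exec.Command("java", "CodePoints")
        cmd.Stdout = new(Transcoder)
        // Run starts java and waits for it to exit.
        if err := cmd.Run(); err != nil {
                fmt.Println(err)
        }
}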

Otherwise, this is a simple exec command, equivalent to typing java CodePoints.
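One thing the Write method quietly assumes is that every chunk it receives ends on a rune boundary. As far as I can tell nothing guarantees that when output is copied from a pipe, so a multi-byte character could in principle be split across two calls. The following is a sketch of one way to cope, not part of the original demo: BufferedTranscoder and decodeChunks are made-up names, and it collects output in a strings.Builder rather than printing so the behaviour is easy to check.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// BufferedTranscoder is a hypothetical variant of Transcoder that holds
// back an incomplete trailing UTF-8 sequence until the next Write call.
type BufferedTranscoder struct {
	out     strings.Builder // decoded text; a real shim would print instead
	pending []byte          // leading bytes of a rune split across writes
}

func (w *BufferedTranscoder) Write(b []byte) (int, error) {
	n := len(b)
	buf := append(w.pending, b...)
	for len(buf) > 0 {
		if !utf8.FullRune(buf) {
			break // wait for the rest of this rune
		}
		r, size := utf8.DecodeRune(buf)
		w.out.WriteRune(r)
		buf = buf[size:]
	}
	// Copy the leftover bytes so they don't alias the caller's slice.
	w.pending = append([]byte(nil), buf...)
	return n, nil
}

// decodeChunks feeds each chunk to a fresh transcoder and returns the text.
func decodeChunks(chunks ...[]byte) string {
	var t BufferedTranscoder
	for _, c := range chunks {
		t.Write(c)
	}
	return t.out.String()
}

func main() {
	// £я is c2 a3 d1 8f in UTF-8; the second rune is split across two writes.
	fmt.Println(decodeChunks([]byte{0xC2, 0xA3, 0xD1}, []byte{0x8F}))
}
```

utf8.FullRune treats invalid bytes as complete, so garbage input still drains as replacement characters instead of piling up in the buffer. The rest of this post sticks with the simpler Transcoder.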

Compiling and running the code:

>javac CodePoints.java

>go run java.go

Note that this simple demo code doesn't handle STDERR or STDIN.
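For STDERR, a plausible extension is to give the transcoder a destination and attach one to each stream. This is only a sketch under a few assumptions: the dst field and the shimCommand helper are my own additions for illustration, and the Java side would also need to switch System.err to UTF-8 (with System.setErr) for the bytes to decode correctly. STDIN is left alone here.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"unicode/utf8"
)

// Transcoder decodes UTF-8 and prints the runes to dst.
// The dst field is an addition for this sketch.
type Transcoder struct {
	dst *os.File
}

func (w Transcoder) Write(b []byte) (int, error) {
	n := len(b)
	for len(b) > 0 {
		r, size := utf8.DecodeRune(b)
		fmt.Fprintf(w.dst, "%c", r)
		b = b[size:]
	}
	return n, nil
}

// shimCommand is a hypothetical helper: it builds a command whose
// STDOUT and STDERR both pass through a Transcoder.
func shimCommand(name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.Stdout = Transcoder{os.Stdout}
	cmd.Stderr = Transcoder{os.Stderr}
	return cmd
}

func main() {
	if err := shimCommand("java", "CodePoints").Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Keeping the two streams on separate Transcoder values means error text still goes to the console's error stream rather than being folded into standard output.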
