Go (golang.org) has pretty good Unicode support on the Windows command line. I've written about Unicode and cmd.exe before in the context of C++, C# and Java.
Versions: Windows 7 (64-bit); go1.1.1 windows/amd64; Java 1.7.0_21
Unicode in Go using cmd.exe
The command prompt still needs to be customized to use a TrueType font to overcome the limitations of raster fonts.
The following code prints the characters £я using their escape sequences in the source:
//codepoints.go
package main

import "fmt"

func main() {
	fmt.Printf("\u00A3\u044F\n")
}
Running the code:
>go run codepoints.go
£я
When output is redirected to a file, Go writes UTF-8, so this isn't lossy either.
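To make the redirect case concrete, here is a minimal sketch that prints the UTF-8 bytes Go holds for the same two characters. The helper name utf8Hex is made up for this illustration; these are the bytes you would expect to find in the redirected file, ahead of the newline.

```go
package main

import (
	"fmt"
	"strings"
)

// utf8Hex returns the UTF-8 bytes of s as space-separated hex pairs.
// Hypothetical helper for illustration; it is not part of the original code.
func utf8Hex(s string) string {
	parts := make([]string, 0, len(s))
	for i := 0; i < len(s); i++ {
		// Indexing a Go string yields its raw bytes, not its runes.
		parts = append(parts, fmt.Sprintf("%02x", s[i]))
	}
	return strings.Join(parts, " ")
}

func main() {
	fmt.Println(utf8Hex("\u00A3\u044F")) // prints: c2 a3 d1 8f
}
```

Each character takes two bytes (c2 a3 for £, d1 8f for я), which is what a hex dump of the redirected output should show.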
Java Unicode shiv for cmd.exe implemented in Go
The following code uses Go to act as an intermediary between Java and cmd.exe. This avoids any mucking about with WriteConsoleW.
The Java code still needs to switch to a lossless encoding, as System.out will use an old "ANSI" encoding by default.
The implementation uses UTF-8 as the intermediary character encoding:
import java.io.*;

class CodePoints {
  public static void main(String[] args) throws IOException {
    // set STDOUT encoding to UTF-8
    System.setOut(new PrintStream(System.out, true, "UTF-8"));
    // print some data
    System.out.println("\u00A3\u044F");
  }
}
The Go code transcodes the received bytes from UTF-8 to its rune type before emitting them:
//java.go
package main

import (
	"fmt"
	"os/exec"
	"unicode/utf8"
)

type Transcoder struct {
}

func (w Transcoder) Write(b []byte) (n int, err error) {
	n = len(b)
	for len(b) > 0 {
		r, size := utf8.DecodeRune(b)
		fmt.Printf("%c", r)
		b = b[size:]
	}
	return n, err
}

func main() {
	cmd := exec.Command("java", "CodePoints")
	cmd.Stdout = new(Transcoder)
	cmd.Start()
	cmd.Wait()
}
Otherwise, this is a simple exec command, equivalent to typing java CodePoints.
Compiling and running the code:
>javac CodePoints.java
>go run java.go
£я
Note that this simple demo code doesn't handle STDERR or STDIN.
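One more limitation worth sketching: the Transcoder decodes each Write call on its own, so if a multi-byte character ever arrived split across two calls, the halves would be decoded as invalid bytes. The following is a rough sketch of one way around that, not a tested replacement. The runeWriter type and the transcodeChunks helper are names invented here; the writer holds back an incomplete trailing sequence until the next call and takes its destination as an io.Writer so it can be exercised without Java.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"unicode/utf8"
)

// runeWriter is a hypothetical variant of Transcoder. It keeps an
// incomplete trailing UTF-8 sequence until the next Write supplies the rest.
type runeWriter struct {
	out     io.Writer
	pending []byte
}

func (w *runeWriter) Write(b []byte) (int, error) {
	buf := append(w.pending, b...)
	for len(buf) > 0 {
		if !utf8.FullRune(buf) {
			break // wait for the remaining bytes of this character
		}
		r, size := utf8.DecodeRune(buf)
		if _, err := fmt.Fprintf(w.out, "%c", r); err != nil {
			return 0, err
		}
		buf = buf[size:]
	}
	// Copy the leftover bytes so they do not alias the caller's slice.
	w.pending = append([]byte(nil), buf...)
	return len(b), nil
}

// transcodeChunks passes each chunk to a fresh runeWriter as a separate
// Write call and returns everything that was emitted.
func transcodeChunks(chunks ...[]byte) string {
	var out bytes.Buffer
	w := &runeWriter{out: &out}
	for _, c := range chunks {
		w.Write(c) // writes to a bytes.Buffer do not fail
	}
	return out.String()
}

func main() {
	// The UTF-8 bytes c2 a3 d1 8f 0a, split in the middle of both characters.
	got := transcodeChunks([]byte{0xC2}, []byte{0xA3, 0xD1}, []byte{0x8F, '\n'})
	fmt.Println(got == "\u00A3\u044F\n") // prints: true
}
```

In java.go this would be wired up as cmd.Stdout = &runeWriter{out: os.Stdout}. A second instance pointed at os.Stderr could be assigned to cmd.Stderr, on the assumption that the Java side also wraps System.err in a UTF-8 PrintStream. STDIN is still left alone.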