Let's take a JavaScript string: "€100". This is going to be sent from a browser input box and stored in a web server's database. The database is using the UTF-8 encoding and the constraint on the column is CHAR(4). Spot the problem?
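The arithmetic behind the question can be sketched as follows. This is a minimal illustration in Java rather than the browser-side JavaScript the post itself covers, and the class and method names (`Utf8Length`, `utf8Length`) are made up for the example. It also assumes the column limit is counted in bytes; whether CHAR(4) means four bytes or four characters depends on the database.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class Utf8Length {

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "€100", written with an escape so the source file encoding does not matter.
        String value = "\u20AC100";

        // Four UTF-16 code units: what a naive length check sees.
        System.out.println("chars: " + value.length());

        // Six UTF-8 bytes: the euro sign takes three bytes, each digit takes one.
        System.out.println("bytes: " + utf8Length(value));
    }
}
```

A validator that only compares the character count against the column size would accept this value even though it may not fit once encoded.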
Friday, 24 December 2010
JavaScript: validating UTF-8 string lengths in the browser
Sunday, 19 December 2010
JSP: what all the encoding declarations mean
When you see a JSP document, you might wonder why it specifies the UTF-8 encoding three or four times. This is a post about what those declarations mean.
Sunday, 21 November 2010
Comments policy
Comments are moderated and will not appear until I approve them.
- I don't live on the blog, so it may take me some time to see and respond to your comment.
- I won't publish comments with e-mail addresses in them.
- If you post a question and I don't respond, I just may not know the answer off the top of my head and may not feel like putting in the research to answer it. You'll have more luck on a dedicated Q&A site like stackoverflow.com.
- Comments that say little more than "Thanks!" are appreciated, but don't add much value for other readers. Don't expect them to show up.
- Spam gets deleted.
Corrections and constructive criticism are welcome.
Sunday, 19 September 2010
Java: System.console(), IDEs and testing
The method System.console() can return null if there is no console device present. This comes as a surprise to people when they run their code in an IDE. This post is about overcoming such problems.
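One common workaround, sketched here under my own assumptions rather than taken from the post, is to fall back to standard input when no console is available. The names `ConsoleFallback`, `prompt` and `readLineFrom` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.Console;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical helper class, for illustration only.
class ConsoleFallback {

    // Reads a single line from any Reader; kept separate so it can be tested
    // without a real console. Returns null at end of input.
    static String readLineFrom(Reader in) {
        try {
            return new BufferedReader(in).readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Prompts on the console when one exists, otherwise on System.out/System.in.
    static String prompt(String message) {
        Console console = System.console();
        if (console != null) {
            return console.readLine("%s", message);
        }
        // No console device (typical inside an IDE): use the standard streams.
        // Note: this wraps System.in in a fresh buffer on each call, which is
        // fine for a single prompt but can swallow input across repeated calls.
        System.out.print(message);
        System.out.flush();
        return readLineFrom(new InputStreamReader(System.in));
    }

    public static void main(String[] args) {
        String name = prompt("Name: ");
        System.out.println("Hello, " + name);
    }
}
```

The fallback loses console-only features such as password masking via readPassword, so it suits development and testing more than production prompts.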
Thursday, 16 September 2010
Java: "Content is not allowed in prolog" - causes of this XML processing error
"Content is not allowed in prolog" is an error generally emitted by the Java XML parsers when data is encountered before the <?xml... declaration. You may inspect the document in a text editor and think nothing is wrong, but you need to go down to the byte level to understand the problem. You probably have a character encoding bug.
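One way to go down to the byte level is to dump the first few bytes of the document in hex. The sketch below also checks for a UTF-8 byte order mark, which is only one possible culprit and an assumption on my part, not a cause confirmed by this excerpt. The class and method names are hypothetical.

```java
// Hypothetical helper class, for illustration only.
class PrologBytes {

    // True if the data begins with the UTF-8 byte order mark EF BB BF.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    // Hex dump of at most the first `count` bytes, separated by spaces.
    static String hexPrefix(byte[] data, int count) {
        StringBuilder sb = new StringBuilder();
        int limit = Math.min(count, data.length);
        for (int i = 0; i < limit; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A document that looks fine in an editor but starts with a BOM.
        byte[] doc = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', '?', 'x', 'm', 'l'};
        System.out.println(hexPrefix(doc, 5));
        System.out.println(startsWithUtf8Bom(doc));
    }
}
```

If the dump shows anything other than 3C 3F (the bytes for `<?`) at the start, or at the position following a byte order mark, that leading data is a good place to start investigating.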
Sunday, 1 August 2010
Java: a fluent I/O API (4/4)
This is the fourth post about my experiments with a fluent I/O API. This post covers conclusions and limitations of the implementation. You can find downloads and source repository details further down the page.
Java: a fluent I/O API (3/4)
This is the third post about my experiments with a fluent I/O API. This post covers how the API enhances exception handling.
Java: a fluent I/O API (2/4)
This is the second post about my experiments with a fluent I/O API. This post covers how to extend the API.
Java: a fluent I/O API (1/4)
I've been experimenting with fluent API design. You can find the sources in part 4.
I've often been frustrated with the verbosity of Java I/O. Handling close with decorators got better with the introduction of the Closeable interface, but there's still a bit of boilerplate. This post describes a new fluent API that wraps the existing I/O API.
Saturday, 17 April 2010
I18N: comparing character encoding in C, C#, Java, Python and Ruby
Don't assume that the character handling conventions you've learnt in one language/platform will automatically apply in others. I've selected a cross-section of popular languages to contrast the different ways character encoding is handled.
Tuesday, 12 January 2010
Scala: implementing a "did you mean..?" spelling corrector
I was looking at Scala again and decided to implement Peter Norvig's algorithm for suggesting spelling corrections. I suggest you go read How to Write a Spelling Corrector for the clever stuff.
This implementation is limited to the English alphabet. You'll need the big.txt file or a similar set of training data.