Wednesday, 13 May 2015

Rust: UTF-8 byte array to String

Rust has a string type that mandates UTF-8, but at some point you need to turn raw octets into structures you can treat as character data.

// Copies the borrowed bytes into an owned String.
// Panics if the bytes are not valid UTF-8.
pub fn utf8_to_string(bytes: &[u8]) -> String {
  let vector: Vec<u8> = bytes.to_vec();
  String::from_utf8(vector).unwrap()
}

#[cfg(test)]
mod test {
  use super::*;

  #[test]
  fn test_to_string() {
    let bytes: [u8; 7] = [0x55, 0x54, 0x46, 0x38, 0, 0, 0];
    let len: usize = 4;
    let actual = utf8_to_string(&bytes[0..len]);
    assert_eq!("UTF8", actual);
  }
}

The utf8_to_string function turns some octets into a String. Note the lack of a semicolon on the last line - the final expression is the function's return value. The ampersand in &[u8] means the function borrows the byte slice rather than taking ownership of it. Rust has strong opinions on ownership - expect compile errors if you do not understand this concept. Also note that String::from_utf8 returns a Result, and unwrap() panics if the bytes are not valid UTF-8.
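If a panic on bad input is not acceptable, the conversion can hand the problem back to the caller instead. The following is only a sketch: try_utf8_to_string and utf8_to_string_lossy are names made up for this illustration, while String::from_utf8, String::from_utf8_lossy and FromUtf8Error are standard library items.

```rust
use std::string::FromUtf8Error;

/// Hypothetical variant of `utf8_to_string` that returns an error
/// for invalid input instead of panicking.
pub fn try_utf8_to_string(bytes: &[u8]) -> Result<String, FromUtf8Error> {
    // Copy the borrowed bytes into an owned Vec, then validate them as UTF-8.
    String::from_utf8(bytes.to_vec())
}

/// Hypothetical lossy variant: each invalid sequence is replaced
/// with U+FFFD (the replacement character).
pub fn utf8_to_string_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

The first form suits callers that need to reject malformed data; the second suits logging or display, where a best-effort string is good enough.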

Rust comes packaged with the Cargo build system. I've inlined my unit tests to exercise the functions I write; cargo test runs them. assert_eq! is a macro. Note that the test takes a slice, &bytes[0..len], to drop the trailing zeroes.
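The test hard-codes the length of the useful data. When the length is not known up front, one option is to cut the slice at the first zero byte. This is a minimal sketch that assumes the buffer is zero-padded and that a zero byte never occurs inside the text; trim_at_nul is a hypothetical helper name, not something from the standard library.

```rust
/// Hypothetical helper: returns the part of `bytes` before the first
/// zero byte, or the whole slice if it contains no zero byte.
pub fn trim_at_nul(bytes: &[u8]) -> &[u8] {
    match bytes.iter().position(|&b| b == 0) {
        // Borrow only the prefix; no bytes are copied.
        Some(end) => &bytes[..end],
        None => bytes,
    }
}
```

The result is still a borrowed slice, so it can be passed straight to a function that takes &[u8], such as utf8_to_string.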

This post is just a code snippet written by someone getting started. No promises are made about code quality. Version: rustc 1.0.0-beta.4 (850151a75 2015-04-30) (built 2015-04-30)
