stringToUtf16()

Converts a JavaScript string to a UTF-16 TypedArray.

/*
Return the UTF-16 representation
of the string `str` as a TypedArray.
*/

function stringToUtf16(str) {
  const arr = new Uint16Array(str.length);
  for (let i = 0; i < str.length; i++) {
    arr[i] = str.charCodeAt(i);
  }
  return arr;
}

How it works

CSV files are a common way to exchange data between the web and native apps. Some tools, such as Adobe InDesign’s Data Merge feature, require the CSV file to be encoded as UTF-16.

JavaScript exposes strings as sequences of 16-bit UTF-16 code units (not code points: characters outside the Basic Multilingual Plane take two units), but creating a Blob from a string in order to download it encodes the text as UTF-8.
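To make the difference concrete, here is a small sketch comparing the UTF-16 code units of a string with its UTF-8 bytes. The sample string and variable names are only illustrative.

```javascript
// Illustrative sample: U+00E9 (é) fits in one UTF-16 code unit,
// U+1F600 (an emoji) needs a surrogate pair, i.e. two code units.
const sample = '\u00e9\u{1F600}';

// UTF-16 code units, as seen by .length and charCodeAt().
const codeUnits = [];
for (let i = 0; i < sample.length; i++) {
  codeUnits.push(sample.charCodeAt(i));
}

// UTF-8 bytes, as produced by TextEncoder (the same encoding a Blob
// applies to string parts).
const utf8Bytes = Array.from(new TextEncoder().encode(sample));

console.log(sample.length);                          // 3 code units
console.log(codeUnits.map((u) => u.toString(16)));   // e9, d83d, de00
console.log(utf8Bytes.length);                       // 6 bytes (2 + 4)
```

The same text is 6 bytes in both encodings here, but the byte values differ, which is why a UTF-8 download cannot stand in for a UTF-16 one.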

Instead of strings, we must use binary data. TextEncoder only supports UTF-8 by design, with legacy encodings being deferred to userland. The stringToUtf16() function returns the binary data as a Uint16Array, which can be passed to the Blob constructor instead of the string.
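A minimal sketch of that last step follows. The helper names csvToUtf16Blob() and downloadBlob(), the MIME type string, and the file name are assumptions made for illustration; downloadBlob() relies on browser APIs (document, URL.createObjectURL) and is only defined here, not called.

```javascript
// Same conversion as stringToUtf16(), repeated so this sketch is self-contained.
function toUtf16Array(str) {
  const arr = new Uint16Array(str.length);
  for (let i = 0; i < str.length; i++) {
    arr[i] = str.charCodeAt(i);
  }
  return arr;
}

// Hypothetical helper: wrap CSV text in a Blob holding UTF-16 data.
// Typed arrays are copied into the Blob byte for byte, with no re-encoding.
function csvToUtf16Blob(csv) {
  return new Blob([toUtf16Array(csv)], { type: 'text/csv;charset=utf-16' });
}

// Hypothetical helper (browser only): trigger a download of the Blob.
function downloadBlob(blob, fileName) {
  const url = URL.createObjectURL(blob);
  const link = document.createElement('a');
  link.href = url;
  link.download = fileName;
  link.click();
  URL.revokeObjectURL(url);
}

// Usage in a browser:
// downloadBlob(csvToUtf16Blob('name,city\nAda,London\n'), 'data.csv');
```

Each code unit takes two bytes, so the resulting Blob's size is twice the string's length.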

The same conversion as a more succinct, but probably slower, one-liner:

const sToUtf16 = s => new Uint16Array(s.split('').map(c => c.charCodeAt()));

You may sometimes want to include the byte order mark (BOM) as the first code unit:

/*
Return the UTF-16 representation
of the string `str` as a TypedArray,
including the Byte Order Mark.
*/

function stringToUtf16(str) {
  const arr = new Uint16Array(str.length + 1);
  arr[0] = 0xFEFF; // BOM
  for (let i = 0; i < str.length; i++) {
    arr[i + 1] = str.charCodeAt(i);
  }
  return arr;
}
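A Uint16Array stores its values in the platform's native byte order, which is little-endian on most current hardware, so the functions above produce UTF-16LE in practice. If the byte order needs to be explicit, one possible variant writes each code unit through a DataView. The function name stringToUtf16Bytes() and its parameters are hypothetical, a sketch rather than part of the original snippet.

```javascript
/*
Return the UTF-16 bytes of the string `str` as a Uint8Array,
with an explicit byte order and an optional Byte Order Mark.
*/
function stringToUtf16Bytes(str, littleEndian = true, withBom = true) {
  const offset = withBom ? 1 : 0; // number of code units reserved for the BOM
  const bytes = new Uint8Array((str.length + offset) * 2);
  const view = new DataView(bytes.buffer);
  if (withBom) {
    view.setUint16(0, 0xFEFF, littleEndian); // BOM in the chosen byte order
  }
  for (let i = 0; i < str.length; i++) {
    view.setUint16((i + offset) * 2, str.charCodeAt(i), littleEndian);
  }
  return bytes;
}

// 'A' as little-endian with BOM: FF FE 41 00
console.log(Array.from(stringToUtf16Bytes('A')));
// 'A' as big-endian with BOM: FE FF 00 41
console.log(Array.from(stringToUtf16Bytes('A', false)));
```

The returned Uint8Array can be passed to the Blob constructor in the same way as the Uint16Array.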