stringToUtf16()
Converts a JavaScript string to a UTF-16 TypedArray.
/*
Return the UTF-16 representation
of the string `str` as a TypedArray.
*/
function stringToUtf16(str) {
  const arr = new Uint16Array(str.length);
  for (let i = 0; i < str.length; i++) {
    arr[i] = str.charCodeAt(i);
  }
  return arr;
}
How it works
CSV files are a common way to exchange data between the web and native apps. Some tools, such as Adobe InDesign’s Data Merge feature, require the CSV file to be encoded as UTF-16.
JavaScript exposes strings as sequences of 16-bit code units, but making a Blob out of a string in order to download it encodes it as UTF-8.
Instead of strings, we must use binary data. TextEncoder only supports UTF-8 by design, with legacy encodings being deferred to userland. The stringToUtf16() function returns the binary data as a Uint16Array, which can be passed to the Blob constructor instead of the string.
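A minimal sketch of that hand-off, assuming a browser for the download step. The helpers makeUtf16CsvBlob() and downloadBlob(), the sample CSV text, and the MIME type string are illustrative assumptions, not part of the original snippet. Comparing the two Blob sizes shows the difference between passing the string and passing the typed array:

```javascript
// Same conversion as stringToUtf16() above: one 16-bit element per UTF-16 code unit.
function stringToUtf16(str) {
  const arr = new Uint16Array(str.length);
  for (let i = 0; i < str.length; i++) {
    arr[i] = str.charCodeAt(i);
  }
  return arr;
}

// Hypothetical helper: wrap the code units in a Blob.
// The MIME type is an assumption; adjust it to what the consuming app expects.
function makeUtf16CsvBlob(csv) {
  return new Blob([stringToUtf16(csv)], { type: 'text/csv;charset=utf-16' });
}

// Hypothetical helper (browser only): offer the Blob as a file download.
function downloadBlob(blob, filename) {
  const url = URL.createObjectURL(blob);
  const link = document.createElement('a');
  link.href = url;
  link.download = filename;
  link.click();
  // Release the object URL once the click has been dispatched.
  setTimeout(() => URL.revokeObjectURL(url), 0);
}

// 20 code units, two of them non-ASCII (ë and ö).
const csv = 'name,city\nZo\u00eb,Malm\u00f6\n';

const utf8Blob = new Blob([csv]);        // string part: encoded as UTF-8
const utf16Blob = makeUtf16CsvBlob(csv); // typed array part: bytes copied as-is

console.log(utf8Blob.size);  // 22 (18 one-byte characters + 2 two-byte characters)
console.log(utf16Blob.size); // 40 (20 code units * 2 bytes)
```

In a browser, calling downloadBlob(utf16Blob, 'data.csv') would then offer the UTF-16 file for download. Note that a Uint16Array stores its elements in the platform's byte order, which is little-endian on most current hardware.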
As a more succinct, but probably slower, one-liner:
const sToUtf16 = s => new Uint16Array(s.split('').map(c => c.charCodeAt()));
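Both versions work on UTF-16 code units rather than whole characters, so a character outside the Basic Multilingual Plane comes through as a surrogate pair, which is what UTF-16 requires. The short check below illustrates this; the emoji input is only an example. It also shows why the tempting Uint16Array.from(s, ...) form is not equivalent: it iterates the string by code point and would drop the second half of each pair.

```javascript
const sToUtf16 = s => new Uint16Array(s.split('').map(c => c.charCodeAt(0)));

// U+1F600 is stored in a JavaScript string as two code units.
const emoji = '\u{1F600}';
const units = sToUtf16(emoji);
console.log(emoji.length); // 2
console.log(Array.from(units, u => u.toString(16))); // [ 'd83d', 'de00' ]

// Caution: string iteration yields code points, so this loses the low surrogate.
const lossy = Uint16Array.from(emoji, c => c.charCodeAt(0));
console.log(lossy.length); // 1
```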
You may sometimes want to include the byte order mark (BOM) as the first code unit, so that the consuming application can detect the byte order:
/*
Return the UTF-16 representation
of the string `str` as a TypedArray,
including the Byte Order Mark.
*/
function stringToUtf16(str) {
  const arr = new Uint16Array(str.length + 1);
  arr[0] = 0xFEFF; // BOM
  for (let i = 0; i < str.length; i++) {
    arr[i + 1] = str.charCodeAt(i);
  }
  return arr;
}
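Because a Uint16Array uses the platform's byte order, the BOM written above always matches the bytes that follow it. If a specific byte order is required regardless of the host, one option is to write the bytes through a DataView instead. The following is a sketch under that assumption; the name stringToUtf16LeBytes() is hypothetical and the function is not part of the original snippet.

```javascript
// Hypothetical variant: UTF-16 little-endian bytes with a BOM,
// independent of the host's byte order.
function stringToUtf16LeBytes(str) {
  const buffer = new ArrayBuffer((str.length + 1) * 2);
  const view = new DataView(buffer);
  view.setUint16(0, 0xFEFF, true); // BOM; `true` selects little-endian
  for (let i = 0; i < str.length; i++) {
    view.setUint16((i + 1) * 2, str.charCodeAt(i), true);
  }
  return new Uint8Array(buffer);
}

const bytes = stringToUtf16LeBytes('A');
console.log(Array.from(bytes, b => b.toString(16))); // [ 'ff', 'fe', '41', '0' ]
```

The returned Uint8Array can be passed to the Blob constructor in the same way as the Uint16Array. Passing false instead of true to setUint16() would produce big-endian output.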