Unicode, Charsets, Strings, and Binaries

#Code BEAM V 2020

TALK LEVEL: BEGINNER / INTERMEDIATE / ADVANCED

Writing global software means our programs need to speak global human languages, but writing programs that work correctly with languages other than Western European ones is, at best, a confusing affair. UTF-8, Latin-1, Unicode?

What do these terms mean and how are they related to one another?

And what does Erlang do?

This talk demystifies the terminology around character encoding, explains how to retrofit your Erlang program for Unicode using Datometry HyperQ as a case study, and offers best practices to help you break the assumption that one byte equals one character.
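
To make the one-byte/one-character assumption concrete, here is a minimal sketch. It is written in Python purely for illustration and is not taken from the talk or from HyperQ; the helper name `describe` is hypothetical. It counts the same text as Unicode code points, as UTF-8 bytes, and (where possible) as Latin-1 bytes, and the three counts only agree for plain ASCII.

```python
def describe(text: str) -> dict:
    """Count one string three ways (hypothetical helper, for illustration only)."""
    # Latin-1 can only represent code points U+0000..U+00FF.
    fits_latin1 = all(ord(ch) < 256 for ch in text)
    return {
        "code_points": len(text),                  # Unicode code points
        "utf8_bytes": len(text.encode("utf-8")),   # 1 to 4 bytes per code point
        "latin1_bytes": len(text.encode("latin-1")) if fits_latin1 else None,
    }


def safe_prefix(data: bytes, limit: int) -> str:
    """Cut UTF-8 bytes to at most `limit` bytes without splitting a character."""
    chunk = data[:limit]
    # Drop trailing bytes until what remains decodes cleanly.
    while chunk:
        try:
            return chunk.decode("utf-8")
        except UnicodeDecodeError:
            chunk = chunk[:-1]
    return ""


if __name__ == "__main__":
    for sample in ["abc", "\u00e9", "\u65e5\u672c\u8a9e", "e\u0301"]:
        print(repr(sample), describe(sample))
```

Note the last sample: "e" followed by a combining acute accent renders as a single character but is two code points, so even counting code points does not always match what a reader sees. Slicing encoded bytes at an arbitrary offset, as code written under the one-byte assumption often does, can cut a character in half; `safe_prefix` shows one cautious way to avoid that.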