An overview of what goes into the computer processing of text:

I don’t believe there is a single place where it’s all properly written down. I have some explanation for that: while basic text layout is very important for UI, games, and other contexts, a lot of the “professional” needs around text layout are embedded in much more complicated systems such as Microsoft Word or a modern Web browser. […]

The hierarchy is: paragraph segmentation as the coarsest granularity, followed by rich text style and BiDi analysis, then itemization (coverage by font), then Unicode script, and shaping clusters as the finest.
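To make the nesting concrete, here is a minimal sketch of the two coarsest levels only: paragraph segmentation, then a very rough direction pass inside each paragraph. It is not the Unicode Bidirectional Algorithm. The function names (`paragraphs`, `direction`, `direction_runs`, `layout_runs`) are hypothetical, and treating every character that is not strongly right-to-left as left-to-right is a simplifying assumption. A real layout engine resolves neutral characters from context and then continues down to font itemization, script runs, and shaping clusters.

```python
import unicodedata
from itertools import groupby


def paragraphs(text):
    """Coarsest level: split the text into non-empty paragraphs on newlines."""
    return [p for p in text.split("\n") if p]


def direction(ch):
    """Crude stand-in for BiDi analysis (assumption, not UAX #9).

    Characters with a strong right-to-left class ("R" or "AL") are "rtl";
    everything else, including spaces and digits, is lumped into "ltr".
    """
    return "rtl" if unicodedata.bidirectional(ch) in ("R", "AL") else "ltr"


def direction_runs(paragraph):
    """Group consecutive characters that share a direction into runs."""
    return [(d, "".join(chars)) for d, chars in groupby(paragraph, key=direction)]


def layout_runs(text):
    """Paragraphs first, then direction runs nested inside each paragraph."""
    return [direction_runs(p) for p in paragraphs(text)]


if __name__ == "__main__":
    # Latin text followed by three Hebrew letters, then a second paragraph.
    sample = "abc \u05d0\u05d1\u05d2\nxyz"
    for para in layout_runs(sample):
        print(para)
```

Each finer level of the hierarchy would subdivide these runs further in the same way, so a run never crosses a boundary set by a coarser level.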


Ultimately, Unicode’s Emojigeddon boils down to a few essential questions: Are emojis a language? And if not, what exactly are they? Why are their regulation and evolution overseen by a bunch of language nerds and engineers? Typographers, linguists, and text-encoding experts including Unicode’s president generally agree that the character set does not rise to the standards of an emerging language.

“People have strategies for stringing them together, of course, and deriving greater meaning — everyone knows eggplant is an erection and people sext with the vegetables, but that does not make it a substitute for language,” Everson said.

But for others, emojis’ ubiquity makes the character set a meaningful mode of expression that transcends traditional linguistic barriers, vegetable sexting included, and is quite the opposite of a dumbed-down “cartoon.” Language or not, they argue, when millions of people zealously adopt a new, authentic way to communicate, it no longer matters whether Everson, Unicode, or any linguist, typographer, or academic agrees.

There is an internal debate at the Unicode Consortium, the body that encodes the characters of the world’s writing systems. Unicode’s job is to make sure that even obscure or dead languages can be represented digitally, by assigning each character a unique code. Unicode is also the body responsible for emoji, and the question is this: are emoji a language?

According to some long-standing contributors, emoji are diverting the consortium’s attention from lesser-known but still important scripts and characters: the ones needed by people who work with manuscripts or ancient texts, or by people whose script is little known or not widely used but is nonetheless part of their culture. Unicode ensures that these people, too, can write and use their own language on a computer.