An overview of what goes into the computer processing of text:
I don’t believe there is a single place where it’s all properly written down. I have some explanation for that: while basic text layout is very important for UI, games, and other contexts, a lot of the “professional” needs around text layout are embedded in much more complicated systems such as Microsoft Word or a modern Web browser. […]
The hierarchy is: paragraph segmentation as the coarsest granularity, followed by rich text style and BiDi analysis, then itemization (coverage by font), then Unicode script, and shaping clusters as the finest.
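To make the nesting concrete, here is a minimal sketch of three of those levels (paragraphs, then script runs, then clusters) using only the Python standard library. Everything in it is an illustrative assumption rather than how any real layout engine works: the function names are hypothetical, paragraphs are split on "\n" only, the script is guessed from the first word of the Unicode character name instead of the real Script property, and a cluster is approximated as a base character plus any following combining marks. Rich text style, BiDi analysis, and font coverage are left out entirely, since they need style data, the bidirectional algorithm, and actual fonts.

```python
import unicodedata

NEUTRAL = ("COMMON", "INHERITED")

def paragraphs(text):
    """Coarsest level: split on hard line breaks (simplified to '\n')."""
    return text.split("\n")

def rough_script(ch):
    """Crude stand-in for the Unicode Script property (an assumption)."""
    if unicodedata.category(ch).startswith("M"):
        return "INHERITED"          # combining marks follow their base
    if not ch.isalpha():
        return "COMMON"             # spaces, digits, punctuation
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"

def script_runs(paragraph):
    """Group a paragraph into (script, text) runs; neutrals join a neighbor."""
    runs = []
    for ch in paragraph:
        script = rough_script(ch)
        if not runs:
            runs.append([script, ch])
        elif script in NEUTRAL:
            runs[-1][1] += ch       # neutral character stays in current run
        elif runs[-1][0] in NEUTRAL:
            runs[-1][0] = script    # leading neutrals adopt the first real script
            runs[-1][1] += ch
        elif runs[-1][0] == script:
            runs[-1][1] += ch
        else:
            runs.append([script, ch])
    return [(script, text) for script, text in runs]

def clusters(run_text):
    """Finest level: a base character plus any following combining marks."""
    out = []
    for ch in run_text:
        if out and unicodedata.category(ch).startswith("M"):
            out[-1] += ch
        else:
            out.append(ch)
    return out

def segment(text):
    """Nest the levels: paragraphs -> script runs -> clusters."""
    return [
        [(script, clusters(run)) for script, run in script_runs(p)]
        for p in paragraphs(text)
    ]
```

Calling `segment("abc где\nxe\u0301")` gives two paragraphs: the first splits into a Latin run and a Cyrillic run, and in the second the combining acute accent stays attached to its base letter as one cluster. A real implementation would use the Script property from UAX #24 and grapheme cluster rules from UAX #29, and a shaper would decide the final cluster boundaries.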