In this video I continue exploring the question "Why would anyone write Assembly Language?" by analysing UTF-8 in the multistream Japanese Wikipedia XML dump. (If that link stops working, search for jawiki on the backup index page.)
Uncompressed, this file is around 12 GB and is stored in UTF-8. In this video, I compare Assembly Language, C (clang), C (gcc), Node.js, Ruby and Python 3.
Download working directory
An archive of everything I produced during this video: part2.tar.gz. Note: I included a download script so you can also grab the Japanese Wikipedia XML dump, which is not included in part2.tar.gz.