TVS Part 2: Why? Tiptoe into Unicode
Jeff Marrison
In this video I continue exploring the question "Why would anyone write Assembly Language?" by analysing UTF-8 in the multistream Japanese Wikipedia XML dump. (If that link stops working, search for jawiki on the backup index page.)
Uncompressed, this file is around 12 GB and is stored in UTF-8. In this video, I compare Assembly Language, C (clang), C (gcc), NodeJS, Ruby and Python 3.
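The analysis itself is shown in the video rather than on this page, so what follows is only a minimal Python 3 sketch of one kind of UTF-8 scan: counting code points by their encoded length. It assumes the input is valid UTF-8, and the names `utf8_length_histogram` and `histogram_for_file` are hypothetical; they are not taken from part2.tar.gz.

```python
from collections import Counter


def utf8_length_histogram(data: bytes) -> Counter:
    """Count code points by encoded length (1 to 4 bytes).

    Assumes valid UTF-8: only lead bytes are classified, and
    continuation bytes (0x80..0xBF) are skipped.
    """
    hist = Counter()
    for b in data:
        if b < 0x80:        # 0xxxxxxx: ASCII, 1 byte
            hist[1] += 1
        elif b < 0xC0:      # 10xxxxxx: continuation byte, not a new code point
            continue
        elif b < 0xE0:      # 110xxxxx: lead byte of a 2-byte sequence
            hist[2] += 1
        elif b < 0xF0:      # 1110xxxx: lead byte of a 3-byte sequence
            hist[3] += 1
        else:               # 11110xxx: lead byte of a 4-byte sequence
            hist[4] += 1
    return hist


def histogram_for_file(path: str, chunk_size: int = 1 << 20) -> Counter:
    """Stream a large file in chunks and sum the per-chunk histograms."""
    total = Counter()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            total += utf8_length_histogram(chunk)
    return total
```

Because each byte is classified on its own, a multi-byte sequence split across two chunks is still counted exactly once, so a file of this size can be streamed without loading it into memory. Most Japanese text falls in the 3-byte range, which is one reason a dump like this is a more demanding test than ASCII-heavy input.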
Download working directory
An archive of everything I produced during this video: part2.tar.gz. Note: I included a download script so you can also grab the Japanese Wikipedia XML (not included in my part2.tar.gz).
Learn more about the basics of assembly language
Tomasz Grysztar, author of flat assembler, has created an excellent series on Assembly Language for x86 on his YouTube channel.