So you don't have to deal with it until user data includes _any non-ASCII character_ (including emoji, weird spaces copied from other stuff, or loan words like café).
"Dealing with Unicode" is really just about dealing with it at the input/output boundaries (and even then libraries handle it most of the time). But without the clear delineation that Python 3 provides, when you _do_ hit some issue you probably insert a "fix" in the wrong place, leading to the classic Py2 "I just call decode 1000 times on the same string because I've lost track".
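To make the boundary idea concrete, here is a rough Python 3 sketch. The function names are made up for illustration, and UTF-8 is an assumption about what the outside world sends; the point is only that decode and encode each appear once, at the edges, and that mixing `str` with `bytes` fails loudly instead of silently.

```python
def read_name(raw: bytes) -> str:
    # Input boundary: decode exactly once (UTF-8 is an assumption here).
    return raw.decode("utf-8")


def greet(name: str) -> str:
    # Interior: only str, no encode/decode calls anywhere.
    # str.strip() also removes Unicode whitespace such as a no-break space.
    return "Hello, " + name.strip() + "!"


def write_out(text: str) -> bytes:
    # Output boundary: encode exactly once.
    return text.encode("utf-8")


# "café" followed by a no-break space, as it might arrive from a paste.
raw = "café\u00a0".encode("utf-8")
out = write_out(greet(read_name(raw)))

# Python 3 refuses to mix the two types, so a misplaced "fix" shows up
# immediately as a TypeError rather than as mojibake later on.
try:
    "Hello, " + raw
except TypeError:
    mixed_ok = False
else:
    mixed_ok = True
```

In Py2 the last concatenation would have worked by accident for ASCII input and blown up only once a non-ASCII byte came through.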
> So you don't have to deal with it until user data includes _any non-ascii character_ (including emoji, weird spaces copied from other stuff, or loan words like café)
The interesting text follows company-set naming schemes, which means it is all English and ASCII. The rest could be random bytes for all I have to care about.
Many formats like plain text or zip don't have a fixed encoding, and I am not going to start guessing which one it is for every file I have to read; there is no way to do that correctly. Dealing with that mess is explicitly something I want to avoid.
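For what it's worth, this workflow can be written in Python 3 without guessing any encoding, by staying in `bytes` and decoding only the parts known to be ASCII. A minimal sketch, where the `ACME_` prefix and the `key=value` line layout are hypothetical stand-ins for a company naming scheme:

```python
def pick_records(blob: bytes, prefix: bytes = b"ACME_") -> list[tuple[str, bytes]]:
    """Return (key, payload) pairs for lines whose key starts with prefix.

    Keys are decoded as ASCII because the (hypothetical) naming scheme
    guarantees it; payloads stay as opaque bytes and are never decoded.
    """
    records = []
    for line in blob.split(b"\n"):
        if not line.startswith(prefix):
            continue  # not a line we care about; its encoding is irrelevant
        key, _, payload = line.partition(b"=")
        records.append((key.decode("ascii"), payload))
    return records


# Payloads contain bytes that are not valid UTF-8; that is fine here.
blob = b"ACME_HOST=\xff\xfe\x00junk\nother=\xe9\nACME_PORT=8080\n"
records = pick_records(blob)
```

If a key ever turns out not to be ASCII, `decode("ascii")` raises `UnicodeDecodeError` at that one spot, which is where the naming-scheme assumption actually lives.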
> "Dealing with unicode" is really just about dealing with it at the input/output boundaries (and even then libraries handle it most of the time). But without the clear delineation that Python 3 provides, when you _do_ hit some issue you probably insert a "fix" in the wrong space. Leading to the classic Py2 "I just call decode 1000 times on the same string because I've lost track"