I would like to suggest a protocol upgrade that changes the Michelson string type from an ASCII-only representation to Unicode/UTF-8.
Currently, the Michelson string type supports only ASCII text. At a glance this seems like a reasonable approach, but it creates problems when integrating with off-chain tools or storing human-readable information on chain.
Strings Should be Human-Readable.
I have heard the argument that contract developers should put rich data off-chain and keep on chain only the minimal information needed to identify and retrieve the related off-chain resources.
In general, that is a good approach, but what if a developer still decides to keep all the data on chain? Strings are supposed to be human-readable, and human-readable does not mean English-only. What if a developer wants to put something like a token symbol, or the short name of the entity represented by the contract or a record, into contract storage? She must limit herself to ASCII characters or figure out how to “sanitize” those strings before putting them on chain.
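To make the “sanitize” step concrete, here is a minimal Python sketch of the kind of check an off-chain tool ends up doing today. It assumes the accepted character set is printable 7-bit ASCII plus newline; the exact set is defined by the Michelson specification, and the function names are hypothetical helpers, not part of any Tezos library.

```python
# Assumption: a string is acceptable if every character is printable
# 7-bit ASCII (0x20-0x7E) or a newline. Check the Michelson spec for
# the authoritative rule.
def is_michelson_safe(text: str) -> bool:
    return all(0x20 <= ord(ch) <= 0x7E or ch == "\n" for ch in text)


# Hypothetical lossy fallback: every non-ASCII character becomes "?".
# This is exactly the kind of information loss the proposal wants to avoid.
def to_ascii_fallback(text: str) -> str:
    return text.encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    for name in ["XTZ", "Z\u0142oty", "\u5186"]:
        print(repr(name), is_michelson_safe(name), repr(to_ascii_fallback(name)))
```

A name that fails the check either has to be rejected or stored in a degraded form, which is the cost being described here.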
Interop With Off-chain Tools.
I think it is safe to say that almost all software these days runs on Unicode/UTF-8. Supporting only ASCII strings on chain makes it hard to integrate with off-chain tools. One example is storing a URL on chain that points to an off-chain resource. In practice such addresses often contain non-ASCII characters (internationalized domain names or paths, for instance), so they cannot be stored as a Michelson string unless they are percent-encoded or otherwise converted first.
Look at the TZIP-16 standard. It encodes external URLs as bytes because of this limitation of the Michelson string type.
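The bytes workaround can be sketched in a few lines of Python. This only illustrates the general idea of UTF-8 encoding text and writing it as a `0x`-prefixed hex bytes literal; the helper names and the example URL are made up, and it is not meant to reproduce the exact TZIP-16 storage layout.

```python
# Hypothetical helper: UTF-8 encode text and render it as a hex
# bytes literal, the form in which non-ASCII text has to travel today.
def to_michelson_bytes(text: str) -> str:
    return "0x" + text.encode("utf-8").hex()


# Hypothetical inverse: strip the "0x" prefix and decode back to text.
def from_michelson_bytes(literal: str) -> str:
    if not literal.startswith("0x"):
        raise ValueError("expected a 0x-prefixed hex literal")
    return bytes.fromhex(literal[2:]).decode("utf-8")


if __name__ == "__main__":
    url = "https://example.com/caf\u00e9"  # made-up URL with a non-ASCII path
    literal = to_michelson_bytes(url)
    print(literal)
    print(from_michelson_bytes(literal))
```

The round trip is lossless, but the stored value is no longer readable in a block explorer or in contract storage without decoding, and every tool has to agree out of band that the bytes are UTF-8.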
All in all, although limiting strings to ASCII may have some benefits for the blockchain implementation, it makes the life of an application or contract developer more difficult.