IBM EBCDIC-based mainframe operating systems, such as z/OS, usually use UTF-16 for complete Unicode support.
While "Bush hid the facts" is the sentence most commonly presented on the Internet to induce the error, the bug can be triggered by many sentences with characters and spaces in a particular order so that the bytes match the UTF-16LE encoding of valid (if nonsensical) Chinese Unicode characters.
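The byte-level coincidence can be reproduced directly. The following is a minimal sketch in Python (the variable names are illustrative) that reinterprets the ASCII bytes of the sentence as UTF-16LE code units; it shows only the misreading itself, not the detection heuristic that chose the wrong encoding.

```python
# The 18 ASCII bytes of the commonly cited trigger sentence.
raw = "Bush hid the facts".encode("ascii")

# Reinterpret the same bytes as UTF-16LE: every pair of bytes
# becomes a single 16-bit code unit (low byte first).
misread = raw.decode("utf-16-le")

# Each ASCII pair lands in the CJK Unified Ideographs block (U+4E00..U+9FFF).
code_points = [f"U+{ord(ch):04X}" for ch in misread]
print(len(raw), len(misread))
print(code_points)
```

Because the sentence has an even number of bytes and every byte pair maps to an assigned ideograph, the misread text decodes without error.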
In 1980, IBM executives failed to heed Ben Riggins' strong suggestions that IBM should provide their own EBCDIC-based operating system and integrated-circuit microprocessor chip for use in the IBM Personal Computer as a CICS intelligent terminal (instead of the incompatible Intel chip, and immature ASCII-based Microsoft 1980 DOS).
It is internationalized; German, English, and other translations are available, and it supports sending and receiving acknowledged and non-acknowledged Unicode-encoded messages (it even understands UTF-8 messages in message types for which the ICQ protocol does not use UTF-8).
The next 1,920 characters, U+0080 to U+07FF (encompassing the remainder of almost all Latin alphabets, and also Greek, Cyrillic, Coptic, Armenian, Hebrew, Arabic, Syriac, Tāna and N'Ko), require 16 bits to encode in both UTF-8 and UTF-16, and 32 bits in UTF-32.
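These figures can be checked mechanically. The sketch below, in Python with illustrative variable names, counts the code points in the range and collects the encoded sizes (in bytes) of each one under the three encodings; the byte-order-specific codecs are used so that no byte order mark is counted.

```python
first, last = 0x0080, 0x07FF

# Number of code points in the range: 0x0800 - 0x0080 = 1920.
count = last - first + 1

# Set of (UTF-8, UTF-16, UTF-32) byte lengths seen across the whole range.
sizes = {
    (
        len(chr(cp).encode("utf-8")),
        len(chr(cp).encode("utf-16-le")),
        len(chr(cp).encode("utf-32-le")),
    )
    for cp in range(first, last + 1)
}

print(count)
print(sizes)
```

A single tuple in `sizes` means every code point in the range has the same cost: 2 bytes (16 bits) in UTF-8 and UTF-16, and 4 bytes (32 bits) in UTF-32.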
For instance, Internet Explorer 7 can be tricked into running JScript, in circumvention of its policy, by allowing the browser to guess that an HTML file was encoded in UTF-7.
Open-source-software advocate and hacker Eric S. Raymond writes in his Jargon File that EBCDIC was almost universally loathed by early hackers and programmers because of its multitude of different versions, none of which resembled the other versions, and that IBM produced it in direct competition with the already-established ASCII.
Prior to the availability of 8BITMIME implementations, mail user agents employed several techniques to cope with the seven-bit limitation, such as binary-to-text encodings (including ones provided by MIME) and UTF-7.
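As a rough illustration of these techniques, the Python sketch below encodes a short non-ASCII string three ways: UTF-7, Base64, and quoted-printable (the latter two being MIME content transfer encodings). The sample text and variable names are illustrative; real mail user agents also add headers and line wrapping, which are omitted here.

```python
import base64
import quopri

text = "Grüße"
utf8_bytes = text.encode("utf-8")

# UTF-7 represents non-ASCII characters as '+...-' runs of modified Base64.
utf7 = text.encode("utf-7")

# Binary-to-text encodings applied to the UTF-8 bytes.
b64 = base64.b64encode(utf8_bytes)
qp = quopri.encodestring(utf8_bytes)

# Every resulting byte fits in seven bits, so it survives a 7-bit transport.
seven_bit_safe = all(b < 0x80 for blob in (utf7, b64, qp) for b in blob)
print(utf7, b64, qp, seven_bit_safe)
```

All three forms are reversible, so the receiving agent can recover the original text exactly.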
• SMTPUTF8: Allow UTF-8 encoding in mailbox names and header fields, RFC 6531
VM2000 - EBCDIC-based hypervisor for S/390-compatible platform, capable of running multiple BS2000 and SINIX virtual machines
Other special characters and punctuation marks were added to the card code, involving as many as three punches per column (and in 1964 with the introduction of EBCDIC as many as six punches per column).
International email (IDN email or Intl email) is email whose header contains international characters encoded as UTF-8 (characters that do not exist in the ASCII character set).
JEF is a stateful EBCDIC charset used in Fujitsu mainframe systems called FACOM and some OASYS series personal word processors.
KEIS is a stateful EBCDIC charset used in Hitachi mainframe systems.
For example, instead of running "ssh legacy-machine", a user may have to run "LC_ALL=fr_FR luit ssh legacy-machine" to properly render French accented characters on a UTF-8 terminal.
The main purpose of luit is to allow "legacy" applications that use character sets other than UTF-8 to work with contemporary terminal emulators.
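The core idea can be sketched as a transcoding step between the application and the terminal. The Python fragment below is a conceptual illustration only, not luit's implementation; it assumes the legacy application emits ISO 8859-1, and the sample text and variable names are hypothetical.

```python
# Bytes as a legacy application using ISO 8859-1 would emit them (assumed encoding).
legacy_output = "café déjà vu".encode("iso-8859-1")

# Sent unchanged to a UTF-8 terminal, these bytes are not valid UTF-8.
try:
    legacy_output.decode("utf-8")
    valid_as_utf8 = True
except UnicodeDecodeError:
    valid_as_utf8 = False

# The filter's job: decode from the legacy charset, re-encode as UTF-8.
for_terminal = legacy_output.decode("iso-8859-1").encode("utf-8")

print(valid_as_utf8, for_terminal)
```

Input typed by the user would need the reverse conversion, from UTF-8 back to the legacy character set.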
Each such string consists of a length byte followed by that many UTF-8 characters.
In October 2004 Netatalk 2.0 was released, which brought major improvements, including: support for Apple Filing Protocol version 3.1 (providing long UTF-8 filenames, file sizes > 2 gigabytes, full Mac OS X compatibility), CUPS integration, Kerberos V support allowing true "single sign-on", reliable and persistent storage of file and directory IDs and countless bug fixes compared to previous versions.
It also provides the functions expected of a modern scripting language, including support for regular expressions, XML, Unicode (UTF-8), TCP/IP and UDP networking, matrix and array processing, advanced math, statistics and Bayesian statistical analysis, financial mathematics, and distributed computing support.
OpenXDF requires the use of an XML 1.0-compliant parser that supports UTF-8 and UTF-16.
As they are used for input and output devices, they generally contain text, a sequence of characters in a predetermined encoding, such as Latin-1 or UTF-8.
UTF-16 is used by the Qualcomm BREW operating systems; the .NET environments; and the Qt cross-platform graphical widget toolkit.
IBM iSeries systems designate code page CCSID 13488 for UCS-2 character encoding, CCSID 1200 for UTF-16 encoding, and CCSID 1208 for UTF-8 encoding.
In August 1992, this proposal was circulated by an IBM X/Open representative to interested parties.
Dave Prosser of Unix System Laboratories submitted a proposal for one that had faster implementation characteristics and introduced the improvement that 7-bit ASCII characters would only represent themselves; all multibyte sequences would include only bytes where the high bit was set.
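The property described here, that ASCII characters stand only for themselves and multibyte sequences never contain a byte below 0x80, also holds for UTF-8 as standardized today. The Python sketch below checks it against the present-day UTF-8 codec, not against the text of the original proposal; the function name is hypothetical.

```python
def high_bit_property_holds(text: str) -> bool:
    """Check the ASCII-transparency property for every character in text."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if ord(ch) < 0x80:
            # A 7-bit ASCII character must encode as exactly its own byte.
            if encoded != bytes([ord(ch)]):
                return False
        elif any(b < 0x80 for b in encoded):
            # A multibyte sequence must use only bytes with the high bit set.
            return False
    return True


print(high_bit_property_holds("path/to/é€😀"))
```

One practical consequence is that byte-oriented code searching for an ASCII delimiter such as "/" can never match inside the encoding of a non-ASCII character.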