ASCII and Unicode

Monday, 21 October 2013

ASCII - American Standard Code for Information Interchange
About the same time that IBM was developing EBCDIC, a group of engineers from the American Standards Association was developing another code for representing character data.  The result was ASCII - the American Standard Code for Information Interchange.
ASCII used 7 bits to represent each character, giving 2^7 = 128 possible symbols.
Here is the original ASCII table from 1963.  As in EBCDIC, the grey areas represent non-printable control characters (Esc, Del, Backspace, etc.), as well as special characters used in data transmission.

To find the ASCII code for a particular character:
  1. Locate the character in the table (for example, "J").
  2. Write down the three high-order bits on the left side of the table, directly across from the character (for "J", they are 100).
  3. Write down the four low-order bits at the top of the table, directly above the character (for "J", they are 1010).
     
Putting the high-order bits first, the 7-bit ASCII code for an upper-case "J" is therefore 1001010.
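
If you would rather check this without the table, here is a small Python sketch (not part of the original article) that does the same lookup using the built-in ord() function, which returns a character's numeric code:

    # Print the 7-bit ASCII code for a character, split into its
    # three high-order bits and four low-order bits.
    def ascii_bits(ch):
        code = ord(ch)                     # e.g. ord("J") == 74
        if code > 127:
            raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
        return format(code, "07b")         # zero-padded 7-bit binary string

    bits = ascii_bits("J")
    print(bits, "=", bits[:3], "+", bits[3:])   # 1001010 = 100 + 1010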
The arrangement of the characters in ASCII solved many of the data-processing problems that were caused by EBCDIC.  Curiously, IBM was in favor of adopting ASCII instead of EBCDIC for its computers.  However, the company had already manufactured hundreds of computers and peripheral devices (like card punch machines) that were based on EBCDIC.  Deciding that it would be too time-consuming and costly to change, IBM settled on EBCDIC.
With the huge success of the IBM mainframe computers in the 1960s and 1970s, ASCII virtually disappeared until the 1980s, when it was chosen as the standard character code for the IBM PC.  ASCII was extended to 8 bits, and quickly became the universally accepted code for storing textual data in PCs.

Unicode
By the end of the 1980s, engineers realized that much larger codes had become practical due to the rapid increases in the speed and storage capacity of personal computers.  By the end of 1991, engineers from Xerox, Apple, Microsoft, Sun Microsystems and others published the first volume of the Unicode standard, capable of representing all of the symbols used in any of the roughly 6,800 languages of the people on earth!
Today, Unicode is the universally accepted standard for representing text in computer systems.
Unicode is a variable-length code, meaning that different ranges of characters use different numbers of bits (the short sketch after this list illustrates the idea).  For example,
  • Unicode uses 8 bits (one byte) to code the 128 standard ASCII characters, which allows the Unicode and ASCII representations of these characters to be identical (for example, the upper-case "J" is coded as 01001010 in both systems).
  • Unicode uses 16 bits (2 bytes) to code most of the remaining characters, giving over 65,000 possible symbols.  This allows most "living" alphabets to be coded completely, including cursive scripts such as Arabic and ideographic scripts such as Chinese.  For example,
    • the symbol for the Eastern Arabic numeral 3 is ٣ (Unicode 0663, or 00000110 01100011).
    • the symbol for the Chinese word for "snow" is 雪 (Unicode 96EA, or 10010110 11101010).
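
As a quick check of these values (a minimal sketch, not part of the original post; it assumes Python 3 and the common UTF-8 encoding of Unicode), the following loop prints each example character's code point in hexadecimal and in 16-bit binary, together with how many bytes it occupies when encoded:

    # Print each example character's Unicode code point (hex and 16-bit binary)
    # and its encoded length in bytes, showing the variable-length behavior.
    for ch in ["J", "\u0663", "\u96ea"]:      # "J", Eastern Arabic 3, 雪 ("snow")
        code = ord(ch)
        utf8_len = len(ch.encode("utf-8"))
        print(f"{ch}  U+{code:04X}  {code:016b}  {utf8_len} byte(s) in UTF-8")

Note that in UTF-8 the two non-ASCII examples take two and three bytes respectively; in the UTF-16 encoding both take exactly 16 bits, which is the figure quoted in the list above.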
