
d-chips

All About SIM Cards


GSM 7 bit Default Alphabet

Back in our school days, we studied the alphabet before learning words and everything else. The same rule applies in the GSM world. GSM has its own alphabet: some characters share their codes with ASCII, others do not. The specification (3GPP TS 03.38) is freely available. So let's start learning!

The Primary Character Table

Open the 03.38 specification and go to section 6.2.1. As the title says, each character is defined using only 7 bits. To store a character in a byte, set the MSB (most significant bit) to zero and put the 7-bit value of the character in the remaining bits.

Let's take a look at the character table in that section.

To get the encoding of a character:

  1. Find the character in the table
  2. Read the b7, b6, and b5 values from the top row, above the character's cell
  3. Read the b4, b3, b2, and b1 values from the leftmost column, beside the character's cell
  4. Concatenate the values from steps 2 and 3; the result is the 7-bit value
  5. Alternatively, read the nibble values printed under b7, b6, b5 and beside b4, b3, b2, b1, and concatenate them to get the byte value

Example:

  • Character "2" has b7=0, b6=1, and b5=1 and b4=0, b3=0, b2=1, b1=0. Hence, the 7-bit value is 0110010b ('32').
  • Character "a" has b7=1, b6=1, and b5=0 and b4=0, b3=0, b2=0, b1=1. Hence, the 7-bit value is 1100001b ('61').
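The lookup steps and the two examples above can be sketched in a few lines of Python. This is only an illustration: the function name `septet_from_table` and the choice to pass the bits as strings are assumptions made here, not something taken from the specification.

```python
def septet_from_table(column_bits: str, row_bits: str) -> int:
    """Combine b7 b6 b5 (top row) and b4 b3 b2 b1 (leftmost column).

    Both arguments are strings of '0'/'1' characters, written in the
    same order as they are read from the table.
    """
    if len(column_bits) != 3 or len(row_bits) != 4:
        raise ValueError("expected 3 column bits and 4 row bits")
    # Concatenating the bits gives the 7-bit value; stored in a byte,
    # the MSB is simply left at zero.
    return int(column_bits + row_bits, 2)


two = septet_from_table("011", "0010")   # character "2"
small_a = septet_from_table("110", "0001")  # character "a"

print(f"{two:07b}b = '{two:02X}'")
print(f"{small_a:07b}b = '{small_a:02X}'")
```

Because the value never exceeds 7 bits, the resulting byte is always below '80'.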

Some special encodings in the table are:

  1. LF ('0A') is the Line Feed character
  2. CR ('0D') is the Carriage Return character
  3. The cell marked 1) ('1B') is the escape to the extension table, which is discussed in the following section
  4. SP means the space character

The Extension Table

If you are going to use a character that is defined in the extension table, you need to prefix it with the escape code '1B' (0011011b).

Example:

  • The left curly bracket "{" has the value '28' (0101000b). To use this character in 8-bit form, we send '1B 28'.
  • The vertical bar "|" has the value '40' (1000000b). To use this character in 8-bit form, we send '1B 40'.

These two examples explain why the number of remaining characters decreases by two whenever we use "^", "{", "}", "\", "[", "~", "]", "|", or "€" in an SMS (try sending an SMS with these characters and notice the reduction!).
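The escape mechanism and its effect on the character count can be sketched as follows. The names `EXTENSION`, `encode_extended`, and `septets_needed` are made up for this illustration, and the counting function assumes that every character not listed in `EXTENSION` lives in the primary table and costs exactly one septet.

```python
ESCAPE = 0x1B  # escape to the extension table

# Extension table codes for the nine characters listed above.
EXTENSION = {
    "^": 0x14,
    "{": 0x28,
    "}": 0x29,
    "\\": 0x2F,
    "[": 0x3C,
    "~": 0x3D,
    "]": 0x3E,
    "|": 0x40,
    "€": 0x65,
}


def encode_extended(ch: str) -> bytes:
    """Return the escape code followed by the extension table code."""
    return bytes([ESCAPE, EXTENSION[ch]])


def septets_needed(text: str) -> int:
    """Count septets, assuming all other characters are in the primary table."""
    return sum(2 if ch in EXTENSION else 1 for ch in text)


print(encode_extended("{").hex(" ").upper())
print(encode_extended("|").hex(" ").upper())
print(septets_needed("a{b}"))
```

Here "a{b}" needs six septets rather than four, because each bracket is sent as two codes.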

Some special encodings in the table are:

  1. The cell marked 3) ('0A') is the Page Break character
  2. The cell marked 1) ('1B') is the escape to a further extension table, which is currently not used

If the mobile station is not capable of showing a symbol from the extension table, it may show the character that has the same code in the primary table instead. In that case, Page Break becomes Line Feed, curly brackets become parentheses, backslash becomes slash, euro becomes a small e, and so on.
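The fallback works because each extension character shares its code with a primary table character. A rough sketch, limited to the pairs named above, might look like this. The function `display_escaped` and its `supports_extension` flag are hypothetical and only model the behaviour described; Page Break is represented here by the form feed character "\f" as an assumption.

```python
# Code that follows the escape '1B' -> (extension character, primary character)
SHARED_CODES = {
    0x0A: ("\f", "\n"),  # Page Break / Line Feed
    0x28: ("{", "("),
    0x29: ("}", ")"),
    0x2F: ("\\", "/"),
    0x65: ("€", "e"),
}


def display_escaped(code: int, supports_extension: bool) -> str:
    """Pick what a handset would show for the code following '1B'."""
    extension_char, primary_char = SHARED_CODES[code]
    return extension_char if supports_extension else primary_char


for code in (0x28, 0x2F, 0x65):
    capable = display_escaped(code, supports_extension=True)
    fallback = display_escaped(code, supports_extension=False)
    print(f"'1B {code:02X}': {capable} or {fallback}")
```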

Posted by David on Wednesday, June 27, 2012
