Concatenated SMS Messages and Character Counts

Figuring out the maximum character count for a standard SMS message is quite simple. However, the maximum character count for concatenated SMS messages is a bit more complicated. Throw character encodings into the mix, and everything can become very muddled.

Encodings

Languages which use a Latin-based alphabet (such as English, Spanish, French, etc.) usually use phones supporting the GSM character encoding. The GSM character encoding uses 7 bits to represent each character (similar to ASCII). This contrasts with non-Latin-based alphabet languages (such as Chinese, Arabic, Sinhala, Mongolian, etc.) which usually use phones supporting Unicode. The specific character encoding utilized by these phones is usually UTF-16 or UCS-2. Both UTF-16 and UCS-2 use 16 bits to represent each character (strictly speaking, UTF-16 needs two 16-bit units for characters outside the Basic Multilingual Plane, but that detail is ignored here). For the sake of simplicity, I will refer to the Latin-based alphabet and non-Latin-based alphabet languages in this post as “GSM” and “Unicode” languages respectively.

Standard SMS Messages

Standard SMS messages have a maximum payload of 140 bytes (1120 bits).

Since GSM phones use a 7-bit character encoding, this allows a maximum of 160 characters per standard SMS message:

1120 bits / (7 bits/character) = 160 characters

For Unicode phones, which use a 16-bit character encoding, this allows a maximum of 70 characters per standard SMS message:

1120 bits / (16 bits/character) = 70 characters
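The two divisions above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the constant and function names are my own and do not come from any SMS library.

```python
# Payload size of a single SMS message, as stated above.
SMS_PAYLOAD_BITS = 140 * 8  # 1120 bits


def chars_per_standard_sms(bits_per_char: int) -> int:
    """Return how many whole characters fit in one standard SMS payload."""
    return SMS_PAYLOAD_BITS // bits_per_char


print(chars_per_standard_sms(7))   # GSM 7-bit encoding: 160
print(chars_per_standard_sms(16))  # 16-bit Unicode encoding: 70
```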

Concatenated SMS Messages

Things get a little bit more complex with concatenated SMS messages. Concatenated SMS messages allow a phone to send messages longer than 160 GSM characters (or 70 Unicode characters). The sender creates their message as normal, but without the 140-byte limit. Behind the scenes, the phone detects the message length. If the message is less than or equal to 140 bytes, the phone sends a standard SMS message. However, if the message is greater than 140 bytes, the phone automatically divides the longer message into multiple, shorter SMS messages which are then transmitted to the recipient separately.

The recipient’s phone takes these multiple, shorter SMS messages and recombines them into the original message which was sent. Because the individual segments of the complete message need to be recombined in this way, this is referred to as ‘concatenated SMS’. In order to achieve this seamless delivery, additional information is added to each individual concatenated SMS message. This additional information, referred to as the user data header (UDH), provides identification and ordering information. For example, the UDH could relate the three individual concatenated SMS messages to each other, and indicate the order for recombination.

The UDH takes up 6 bytes (48 bits) of a normal SMS message payload. This reduces the space for actual message data in concatenated SMS messages:

1120 bits - 48 bits = 1072 bits

As a result, each individual concatenated SMS message can only contain 1072 bits of message data. This plays an important role in determining how many individual concatenated SMS messages will be sent based on the actual message data length.
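To make the 6-byte header more concrete, here is a sketch of how such a UDH might be assembled. The byte layout shown (a length byte, an element identifier, an element length, then a reference number, the total number of parts, and the part number) is my assumption of the commonly used 8-bit-reference form; this post only states that the header is 6 bytes and carries identification and ordering information. The function name `build_concat_udh` is hypothetical.

```python
def build_concat_udh(reference: int, total_parts: int, part_number: int) -> bytes:
    """Build a 6-byte concatenation header (assumed 8-bit reference layout)."""
    if not 0 <= reference <= 0xFF:
        raise ValueError("reference must fit in one byte")
    if not 1 <= part_number <= total_parts <= 0xFF:
        raise ValueError("part_number must be in 1..total_parts (max 255)")
    return bytes([
        0x05,         # number of header bytes that follow
        0x00,         # element identifier: concatenated message, 8-bit reference
        0x03,         # number of element data bytes that follow
        reference,    # identical in every part of the same message
        total_parts,  # how many parts make up the whole message
        part_number,  # 1-based position of this part
    ])


# Headers for a three-part message that shares reference number 0x2A.
headers = [build_concat_udh(0x2A, 3, n) for n in (1, 2, 3)]
print([h.hex() for h in headers])
```

The shared reference number is what lets the receiving phone relate the parts to each other, and the part number gives the order for recombination.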

[Figure: SMS payload diagram]

Because GSM phones use a 7-bit character encoding, each individual concatenated SMS message can hold 153 characters:

1072 bits / (7 bits/character) = 153 characters (rounded down)

(Note: 153 characters * 7 bits/character = 1071 bits. The one remaining bit can’t be used to represent a full character, so it is added as padding after the UDH so that the actual 7-bit encoded data begins on a septet boundary, at the 50th bit.)

Unicode phones use a 16-bit character encoding, so each individual concatenated SMS message can hold 67 characters:

1072 bits / (16 bits/character) = 67 characters
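The per-segment figures, and the single leftover bit mentioned in the note above, can be reproduced with a short sketch. The names below are my own, chosen for illustration.

```python
SMS_PAYLOAD_BITS = 140 * 8  # 1120 bits in a full SMS payload
UDH_BITS = 6 * 8            # 48 bits taken up by the UDH
SEGMENT_DATA_BITS = SMS_PAYLOAD_BITS - UDH_BITS  # 1072 bits left for text


def chars_per_segment(bits_per_char: int) -> int:
    """Whole characters that fit in one concatenated segment."""
    return SEGMENT_DATA_BITS // bits_per_char


def leftover_bits(bits_per_char: int) -> int:
    """Bits in the segment that cannot hold a whole character."""
    return SEGMENT_DATA_BITS % bits_per_char


print(chars_per_segment(7), leftover_bits(7))    # 153 1
print(chars_per_segment(16), leftover_bits(16))  # 67 0
```

The GSM case leaves exactly one bit unused, which is the padding bit described in the note; the 16-bit case divides evenly.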

Character Count Thresholds

The character limits for individual concatenated SMS messages result in thresholds at which an additional concatenated SMS message is required to send a larger overall message:

GSM encoding:

  • 1 standard SMS message = up to 160 characters
  • 2 concatenated SMS messages = up to 306 characters
  • 3 concatenated SMS messages = up to 459 characters
  • 4 concatenated SMS messages = up to 612 characters
  • 5 concatenated SMS messages = up to 765 characters
  • etc. (153 x number of individual concatenated SMS messages)

UTF-16 encoding:

  • 1 standard SMS message = up to 70 characters
  • 2 concatenated SMS messages = up to 134 characters
  • 3 concatenated SMS messages = up to 201 characters
  • 4 concatenated SMS messages = up to 268 characters
  • 5 concatenated SMS messages = up to 335 characters
  • etc. (67 x number of individual concatenated SMS messages)
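The thresholds above follow a simple rule: one message if the text fits the single-message limit, otherwise the length divided by the per-segment limit, rounded up. A sketch of that rule is below; the `LIMITS` table and the `segments_needed` function are my own names, and lengths are counted in characters as in the lists above.

```python
# (single-message limit, per-segment limit) in characters, from the lists above.
LIMITS = {
    "gsm": (160, 153),
    "unicode": (70, 67),
}


def segments_needed(char_count: int, encoding: str = "gsm") -> int:
    """Return how many SMS messages are needed for a text of this length."""
    single_limit, segment_limit = LIMITS[encoding]
    if char_count <= single_limit:
        return 1  # fits in one standard SMS, no UDH required
    # Integer ceiling division: every segment carries at most segment_limit chars.
    return (char_count + segment_limit - 1) // segment_limit


print(segments_needed(160, "gsm"))     # 1
print(segments_needed(161, "gsm"))     # 2
print(segments_needed(307, "gsm"))     # 3
print(segments_needed(71, "unicode"))  # 2
```

Note the jump at 161 GSM characters: the message goes from one SMS straight to two, because each of the two segments now has to give up space for the UDH.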

Implications

These thresholds are an important consideration for a number of reasons, including billing and programmatic interfacing with SMS gateways.

Generally, telephone companies count individual concatenated SMS messages separately even though they are being recombined at the phone into a single message. This means a GSM-encoded message containing 180 characters could potentially incur a charge for two SMS messages, even if the sender and recipient each see only a single message.

When interfacing with a telephone company’s SMS gateway programmatically, there may be limits on the number of individual concatenated SMS messages which can be sent as part of a single message. For example, Clickatell’s documentation states that messages sent through their API should not contain more than 3 concatenated SMS segments. This may require limiting the number of characters a user can enter in a web application or service which sends SMS messages via such an API.
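A web application could enforce such a limit by converting the segment cap into a character cap before submitting anything to the gateway. The sketch below shows one way to do that; the function names are hypothetical, and the cap of 3 is only the example figure quoted above, not a value read from any gateway API.

```python
# Character limits from the threshold lists earlier in this post.
SINGLE_LIMIT = {"gsm": 160, "unicode": 70}
SEGMENT_LIMIT = {"gsm": 153, "unicode": 67}


def max_chars(max_segments: int, encoding: str) -> int:
    """Largest text length that stays within max_segments messages."""
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")
    if max_segments == 1:
        return SINGLE_LIMIT[encoding]  # a lone message carries no UDH
    return max_segments * SEGMENT_LIMIT[encoding]


def within_limit(char_count: int, max_segments: int, encoding: str) -> bool:
    """True if a text of char_count characters respects the segment cap."""
    return char_count <= max_chars(max_segments, encoding)


print(max_chars(3, "gsm"))          # 459
print(max_chars(3, "unicode"))      # 201
print(within_limit(460, 3, "gsm"))  # False
```

The resulting number can be used directly as the maximum length of an input field, provided the encoding of the message is known in advance.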

While it may seem elementary, it is important to point out that SMS messages are always in one particular encoding, i.e. fully GSM or fully UTF-16. For example, a period character (“.”) takes up 7 bits in a GSM SMS message. The same character may exist in a Unicode SMS message, but there it takes up 16 bits, even though it represents the same character.
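The practical consequence is that a single non-GSM character switches the whole message to 16 bits per character. The sketch below illustrates this under a simplifying assumption of mine: only the basic GSM 7-bit alphabet is treated as GSM, and characters that real encoders handle through the GSM extension table are treated as non-GSM here. The names `GSM_BASIC`, `bits_per_char` and `payload_bits` are hypothetical.

```python
# Basic GSM 7-bit alphabet (assumption: extension-table characters are left out).
GSM_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)


def bits_per_char(message: str) -> int:
    """7 if every character is in the basic GSM alphabet, otherwise 16."""
    return 7 if all(ch in GSM_BASIC for ch in message) else 16


def payload_bits(message: str) -> int:
    """Bits of message data needed, with the whole message in one encoding."""
    if bits_per_char(message) == 7:
        return len(message) * 7
    # UTF-16 byte length also covers characters that need two 16-bit units.
    return len(message.encode("utf-16-be")) * 8


print(bits_per_char("Hello."), payload_bits("Hello."))
print(bits_per_char("Hello. 你好"), payload_bits("Hello. 你好"))
```

In the second call, the six Latin characters and the space each cost 16 bits instead of 7, purely because two Chinese characters appear in the same message.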
