A Lua implementation of the JS function charCodeAt

charCodeAt by Lua

@(Lua JavaScript charCodeAt)

I wanted a charCodeAt function in Lua that works exactly like the one in JavaScript, but Lua 5.1 has no built-in support for UTF-8 or Unicode.

1: How charCodeAt works in JavaScript

To open the console in Chrome, press F12 (Mac: Cmd+Option+J).

[
  '你'.charCodeAt(0),
  'ñ'.charCodeAt(0),
  'n'.charCodeAt(0)
]

This outputs [20320, 241, 110], the numeric Unicode values of the characters: '你' = 20320, 'ñ' = 241, 'n' = 110.

The charCodeAt() method returns the numeric Unicode value of the character at the given index, except for Unicode code points of 0x10000 and above: those are stored as two UTF-16 code units, and charCodeAt() returns one of those units rather than the code point.

Following alexander-yakushev's utf8.lua, the function utf8.charbytes tells us how many bytes a UTF-8 character takes, based on its first byte:

[https://github.com/alexander-yakushev/awesompd/blob/master/utf8.lua]

function utf8.charbytes (s, i)
  -- argument defaults
  i = i or 1
  local c = string.byte(s, i)

  -- determine bytes needed for character, based on RFC 3629
  if c > 0 and c <= 127 then
    -- UTF8-1 byte
    return 1
  elseif c >= 194 and c <= 223 then
    -- UTF8-2 byte
    return 2
  elseif c >= 224 and c <= 239 then
    -- UTF8-3 byte
    return 3
  elseif c >= 240 and c <= 244 then
    -- UTF8-4 byte
    return 4
  end
end
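As a usage illustration, the byte counts can drive a loop that splits a UTF-8 string into characters. This is a minimal sketch, not part of utf8.lua: `char_len` and `utf8_chars` are hypothetical helper names, `char_len` uses the same RFC 3629 ranges as utf8.charbytes (but treats a NUL byte as a 1-byte character), and the input is assumed to be valid UTF-8.

```lua
-- Byte length of the UTF-8 character whose lead byte is c (RFC 3629 ranges).
-- Hypothetical standalone helper; raises an error on an invalid lead byte.
local function char_len(c)
  if c <= 127 then return 1
  elseif c >= 194 and c <= 223 then return 2
  elseif c >= 224 and c <= 239 then return 3
  elseif c >= 240 and c <= 244 then return 4
  end
  error("invalid UTF-8 lead byte: " .. c)
end

-- Split a UTF-8 string into a table of characters (each a short byte string).
local function utf8_chars(s)
  local chars, i = {}, 1
  while i <= #s do
    local n = char_len(string.byte(s, i))
    chars[#chars + 1] = string.sub(s, i, i + n - 1)
    i = i + n
  end
  return chars
end

-- "n", "ñ", "你" written as decimal byte escapes so the file encoding does not matter
local chars = utf8_chars("n\195\177\228\189\160")
print(#chars)  --> 3
```

Note that `#s` counts bytes (6 here), while `#chars` counts characters.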

Unicode & UTF-8 conversion method

| Unicode code range (hex) | UTF-8 encoding (binary) | Example char |
|---|---|---|
| 0000 0000-0000 007F | 0xxxxxxx | n (alphabet) |
| 0000 0080-0000 07FF | 110xxxxx 10xxxxxx | ñ |
| 0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx | 你 (most CJK) |
| 0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | other chars |
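The bit layouts in the table can be turned into a decoder using plain arithmetic, since Lua 5.1 has no bitwise operators. The following is a sketch under assumptions: `utf8_codepoint` is a hypothetical name, and the input is assumed to be valid UTF-8 (continuation bytes are not checked).

```lua
-- Decode the UTF-8 character starting at byte index i (default 1) of s.
-- Returns the Unicode code point and the number of bytes consumed.
-- Hypothetical helper; assumes s is valid UTF-8.
local function utf8_codepoint(s, i)
  i = i or 1
  local b1 = string.byte(s, i)
  if b1 < 0x80 then
    -- 0xxxxxxx
    return b1, 1
  elseif b1 >= 0xC2 and b1 <= 0xDF then
    -- 110xxxxx 10xxxxxx
    local b2 = string.byte(s, i + 1)
    return (b1 - 0xC0) * 0x40 + (b2 - 0x80), 2
  elseif b1 >= 0xE0 and b1 <= 0xEF then
    -- 1110xxxx 10xxxxxx 10xxxxxx
    local b2, b3 = string.byte(s, i + 1, i + 2)
    return (b1 - 0xE0) * 0x1000 + (b2 - 0x80) * 0x40 + (b3 - 0x80), 3
  elseif b1 >= 0xF0 and b1 <= 0xF4 then
    -- 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    local b2, b3, b4 = string.byte(s, i + 1, i + 3)
    return (b1 - 0xF0) * 0x40000 + (b2 - 0x80) * 0x1000
           + (b3 - 0x80) * 0x40 + (b4 - 0x80), 4
  end
  error("invalid UTF-8 lead byte at position " .. i)
end

print(utf8_codepoint("\228\189\160"))  -- bytes of '你'  --> 20320   3
print(utf8_codepoint("\195\177"))      -- bytes of 'ñ'   --> 241     2
print(utf8_codepoint("n"))             --> 110     1
```

Subtracting the marker bits (for example 0xE0 from a 3-byte lead byte) and multiplying by powers of 0x40 does the same job as masking and shifting by 6 bits.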

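But 4-byte UTF-8 characters (such as emoji) need extra care, because they do not map to a single charCodeAt value.

Special method

JavaScript engines use UTF-16. For a character in the Basic Multilingual Plane, the UTF-16 code unit is the same as the Unicode code point. A character in a Supplementary Plane is instead stored as a pair of code units (a high surrogate H and a low surrogate L) computed from the code point c with the formula below. The Supplementary Plane characters we usually encounter are emoji such as 😝 (a 4-byte UTF-8 character).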

-- formula 1
H = Math.floor((c - 0x10000) / 0x400) + 0xD800
L = (c - 0x10000) % 0x400 + 0xDC00
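Formula 1 translates directly to Lua with `math.floor` and the `%` operator. A minimal sketch, where `to_surrogates` is a hypothetical name and the argument is assumed to be a code point in the range 0x10000 to 0x10FFFF:

```lua
-- Split a supplementary-plane code point into a UTF-16 surrogate pair,
-- following formula 1. Assumes 0x10000 <= c <= 0x10FFFF.
local function to_surrogates(c)
  local offset = c - 0x10000
  local high = math.floor(offset / 0x400) + 0xD800  -- H: top 10 bits
  local low = offset % 0x400 + 0xDC00               -- L: bottom 10 bits
  return high, low
end

local high, low = to_surrogates(0x1F61D)  -- U+1F61D is 😝
print(high, low)  --> 55357   56861
```

These are the values JavaScript reports for '😝'.charCodeAt(0) and '😝'.charCodeAt(1).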

code is here
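Putting the UTF-8 layout and formula 1 together, a charCodeAt for UTF-8 encoded Lua strings might look like the sketch below. It is not the author's code. The assumptions: the index is 0-based and counts UTF-16 code units as in JavaScript, the input is valid UTF-8, and an out-of-range index returns nil (JavaScript returns NaN).

```lua
-- Sketch of a JavaScript-like charCodeAt for a UTF-8 encoded Lua string.
-- index is 0-based and counts UTF-16 code units, as in JavaScript.
-- Returns nil when index is out of range. Assumes s is valid UTF-8.
local function charCodeAt(s, index)
  index = index or 0
  local i, unit = 1, 0  -- current byte position, current UTF-16 unit position
  while i <= #s do
    local b1 = string.byte(s, i)
    local c, n
    -- strip the length marker bits from the lead byte
    if b1 < 0x80 then
      c, n = b1, 1
    elseif b1 < 0xE0 then
      c, n = b1 - 0xC0, 2
    elseif b1 < 0xF0 then
      c, n = b1 - 0xE0, 3
    else
      c, n = b1 - 0xF0, 4
    end
    -- fold in the 6 payload bits of each continuation byte
    for k = i + 1, i + n - 1 do
      c = c * 0x40 + (string.byte(s, k) - 0x80)
    end
    if c < 0x10000 then
      -- Basic Multilingual Plane: one code unit equal to the code point
      if unit == index then return c end
      unit = unit + 1
    else
      -- Supplementary Plane: two code units, computed with formula 1
      local offset = c - 0x10000
      if unit == index then return math.floor(offset / 0x400) + 0xD800 end
      if unit + 1 == index then return offset % 0x400 + 0xDC00 end
      unit = unit + 2
    end
    i = i + n
  end
  return nil
end

print(charCodeAt("\228\189\160", 0))  -- bytes of '你'  --> 20320
print(charCodeAt("\195\177", 0))      -- bytes of 'ñ'   --> 241
print(charCodeAt("n", 0))             --> 110
```

The function rescans the string from the start on every call, which keeps it simple; for repeated lookups in a long string it would make sense to decode once into a table of code units.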

Feedback & Bug Report

Twitter: [@lilien1010]

Thank you for reading. If you have a better idea, please share it.
