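I have been studying your code recently and noticed that in ffmpeg's id3v1.c you added the function below to convert the artist name of an MP3 file from GBK to some other encoding. My understanding is that GBK ultimately has to be converted to UTF-8, otherwise Chinese text shows up garbled. But the conversion below is very simple and does not look like a real GBK-to-UTF-8 conversion. My question is: what format does this function actually convert GBK into? I would appreciate your help.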
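While playing audio in the media center recently, I found that Chinese artist names were displayed as garbled text after ffmpeg read the metadata. The main cause is that the artist name ffmpeg extracts is GBK-encoded, and it was handed straight to the Java layer through Android's NewStringUTF for display, which produced the wrong output. The patch below re-encodes the string as UTF-8 before passing it up to the Java layer.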
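To illustrate why raw GBK bytes break when handed to NewStringUTF, here is a minimal sketch of a UTF-8 well-formedness check. The name `is_valid_utf8` is made up for this example and is not part of FFmpeg or the JNI API. NewStringUTF actually expects Modified UTF-8, so treat this as an approximation of what it accepts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if buf[0..len) is well-formed UTF-8,
 * 0 otherwise. Rejects overlong forms, surrogates and values above
 * U+10FFFF. */
static int is_valid_utf8(const uint8_t *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        uint8_t c = buf[i];
        size_t need;        /* continuation bytes expected after c */
        uint32_t cp, min;   /* decoded code point, smallest legal value */

        if (c < 0x80) {                     /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {    /* 110xxxxx */
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {    /* 1110xxxx */
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {    /* 11110xxx */
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;                       /* stray continuation or bad lead */
        }

        if (len - i - 1 < need)
            return 0;                       /* sequence is truncated */

        for (size_t k = 1; k <= need; k++) {
            uint8_t cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;                   /* not a 10xxxxxx byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;

        i += need + 1;
    }
    return 1;
}
```

For example, the character 中 is the two bytes D6 D0 in GBK. D6 looks like a UTF-8 lead byte, but D0 is not a valid continuation byte, so the check fails. The same character in UTF-8 is E4 B8 AD, which passes. A few GBK byte pairs can happen to be valid UTF-8, so a check like this is only a heuristic for telling the two encodings apart.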
+static void convert_iso8859_to_string(const uint8_t *data, int size, char *s) {
+ int utf8len = 0;
+ int i;
+
+ for (i = 0; i < size; ++i) {
+ if (data[i] == '\0') {
+ size = i;
+ break;
+ } else if (data[i] < 0x80) {
+ ++utf8len;
+ } else {
+ utf8len += 2;
+ }
+ }
+
+ if (utf8len == size) {
+ // Only ASCII characters present.
+
+ memcpy(s, data, size);
+ s[size] = '\0';
+ return;
+ }
+
+ char *ptr = s;
+ for (i = 0; i < size; ++i) {
+ if (data[i] == '\0') {
+ break;
+ } else if (data[i] < 0x80) {
+ *ptr++ = data[i];
+ } else if (data[i] < 0xc0) {
+ *ptr++ = 0xc2;
+ *ptr++ = data[i];
+ } else {
+ *ptr++ = 0xc3;
+ *ptr++ = data[i] - 64;
+ }
+ }
+ *ptr = '\0';
+
+}
+
static void get_string(AVFormatContext *s, const char *key,
const uint8_t *buf, int buf_size)
{
int i, c;
char *q, str[512];
+ convert_iso8859_to_string(buf, buf_size, str);
+#if 0
q = str;
for(i = 0; i < buf_size; i++) {
c = buf[i];
@@ -191,6 +234,7 @@ static void get_string(AVFormatContext *s, const char *key,
*q++ = c;
}
*q = '\0';
+#endif
if (*str)
av_dict_set(&s->metadata, key, str, 0);
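As for what the function in the patch produces: as its name suggests, it treats every input byte as an ISO-8859-1 (Latin-1) character and writes that character out in UTF-8. It does not decode GBK. The sketch below expresses the same byte mapping with shifts and masks and adds an explicit output capacity. The name `latin1_to_utf8` and the `dst_cap` parameter are assumptions made for this illustration and do not appear in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of FFmpeg): encodes each input byte as the
 * Unicode code point with the same value (ISO-8859-1) in UTF-8.
 * Stops at the first NUL byte in src. Returns the number of bytes written,
 * excluding the terminating NUL, or -1 if dst is too small. */
static int latin1_to_utf8(const uint8_t *src, size_t src_len,
                          char *dst, size_t dst_cap)
{
    size_t o = 0;

    for (size_t i = 0; i < src_len && src[i] != '\0'; i++) {
        uint8_t c = src[i];
        size_t need = (c < 0x80) ? 1 : 2;

        /* Always keep one byte free for the terminating NUL. */
        if (dst_cap < 1 || o + need > dst_cap - 1)
            return -1;

        if (c < 0x80) {
            dst[o++] = (char)c;                    /* ASCII: copy as-is */
        } else {
            dst[o++] = (char)(0xC0 | (c >> 6));    /* 0xC2 or 0xC3 */
            dst[o++] = (char)(0x80 | (c & 0x3F));  /* low six bits */
        }
    }

    if (o >= dst_cap)
        return -1;
    dst[o] = '\0';
    return (int)o;
}
```

Feeding it the GBK bytes D6 D0 (中) gives C3 96 C3 90, which is the UTF-8 encoding of the two Latin-1 characters Ö and Ð. The result is valid UTF-8, so NewStringUTF accepts it, but the string still carries the original GBK byte values as individual code points. Presumably the Java layer then has to turn those code points back into bytes and decode them as GBK. A true GBK-to-UTF-8 conversion in C would need a lookup table or a library such as iconv, where available.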
UTF-8 encoding table
http://blog.csdn.net/qiaqia609/article/details/8069678
GBK encoding table
http://blog.csdn.net/qiaqia609/article/details/8069655
UTF-8 Chinese character encoding reference table
http://cache.baiducontent.com/c?m=9d78d513d98407fb4fece4741a16a671695797143ec0a11568a3e35cd424054e1d20a5f930236319ce802b3b58e85e5c9da06529614437b7ec99d515c0ffc97f6a957332211c864613d51bffcd17259621c45decaf1ce3bba66184aea589990b0d&p=9b3fc64ad4d015b708e29778065594&newp=8e6acc1487d512a05abd9b7d0b1da5231611d73f6590cf512496fe4b98&user=baidu&fm=sc&query=utf8+%B1%E0%C2%EB&qid=dcbcd21c000092cc&p1=5
Full-width character to Unicode code point table
http://blog.csdn.net/lvwx369/article/details/39294265
Unicode code point table
http://cache.baiducontent.com/c?m=9f65cb4a8c8507ed4fece76310478a215915d7743ca080462482d45f93130a1c187ba7e070670d0fd4cf7b6c51ad4f0be0f53570345724bcccc98b41daea963f2fff7d722f42914066934eb8ca30619a77d54eacf259b1b5e743e2b9a5a2c854228d0f5e2bdda6dc4d00659b3ea745&p=8b2a975686cc40ad07f1cf351564&newp=8a769a47999611a059ef8a24565692695c16ed623e9885&user=baidu&fm=sc&query=%CC%EC+unicode+ccec&qid=b21a779500071717&p1=1
Differences between GBK, GB2312, and ISO-8859-1
http://blog.csdn.net/jerry_bj/article/details/5714745