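I first got into web development in 1997, nine years ago now. I started by building HTML pages with FrontPage, then learned ASP+Access+IIS, and I have been doing web development ever since. Along the way I worked with ASP+SQLServer+IIS, JSP+Oracle+JRun (Resin/Tomcat), and PHP+Sybase (MySQL)+Apache, and I finally settled on PHP+MySQL+Apache+Linux (BSD), the stack usually called LAMP. There are many reasons for this, and plenty of people online already debate the merits of the various stacks and languages, so I will not repeat all that. Briefly, these are the main reasons I like LAMP:
1. It is a fully open, free platform;
2. It is simple to get started with, and resources are plentiful;
3. PHP, MySQL, and Apache integrate closely with the underlying Linux (BSD) system and with each other, which makes the stack very efficient;
4. All of the components are written in C/C++, so performance is dependable;
5. PHP's syntax is largely C-like, and it also borrows many design ideas from Java and C++;
6. Most important of all, extension modules for PHP can be written quite easily in C/C++, which makes PHP almost unlimited in how far it can be extended!
For these reasons I am very fond of PHP-based architectures, and the last point is the one that matters most. I promoted this platform at both Yahoo and mop, and I have some experience extending PHP in C. I would like to share it here, in the hope that it prompts others to contribute something better.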
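There are several ways to write a PHP extension module in C. In terms of the final result there are two: compiling the code directly into PHP, or building it as a shared-object (.so) extension that PHP loads. In terms of tooling there are also two: the phpize tool (available once PHP has been built) and the ext_skel tool (shipped with PHP). The approach we use most, and the most convenient one, is to write a .so extension module with ext_skel, so that is mainly what is covered here.
In the PHP source tree there is an ext directory (everything I say about PHP here refers to PHP on Linux, not on Windows). Inside ext is a tool called ext_skel, which makes it easy to develop a PHP extension module: it provides a standard set of development steps and a template. Below, we use as our example an extension module that converts between the utf8, gbk, and gb2312 encodings from within PHP. The module will ultimately expose the following function interfaces: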
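(1) string toplee_big52gbk(string s)
Converts the input string from BIG5 to GBK
(2) string toplee_gbk2big5(string s)
Converts the input string from GBK to BIG5
(3) string toplee_normalize_name(string s)
Normalizes the input string: full-width to half-width, trim, uppercase to lowercase
(4) string toplee_fan2jian(int code, string s)
Converts the input GBK Traditional Chinese string to Simplified Chinese
(5) string toplee_decode_utf(string s)
Converts a UTF-encoded string to UNICODE
(6) string toplee_decode_utf_gb(string s)
Converts a UTF-encoded string to GB
(7) string toplee_decode_utf_big5(string s)
Converts a UTF-encoded string to BIG5
(8) string toplee_encode_utf_gb(string s)
Converts the input GBK-encoded string to UTF encoding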
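To make the intent of toplee_normalize_name more concrete, here is a minimal sketch of that kind of normalization in plain C. It is not the article's NormalizeName() implementation, which is not shown; the name sketch_normalize_name is hypothetical. It assumes well-formed GBK input, and it only folds full-width digits, full-width Latin letters, and the full-width space, leaving every other double-byte character untouched.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the article's NormalizeName): normalizes a
 * NUL-terminated GBK string in place and returns its new length.
 * Assumption: bytes >= 0x80 always start a two-byte GBK character. */
size_t sketch_normalize_name(char *s)
{
    unsigned char *in = (unsigned char *)s;
    unsigned char *out = (unsigned char *)s;
    size_t len, lead = 0;

    while (*in != '\0') {
        if (*in < 0x80) {
            /* Single-byte ASCII: lowercase it. */
            *out++ = (unsigned char)tolower(*in++);
        } else if (in[1] == '\0') {
            /* Truncated double-byte character: copy the lone byte. */
            *out++ = *in++;
        } else if (in[0] == 0xA1 && in[1] == 0xA1) {
            /* Full-width space becomes an ASCII space. */
            *out++ = ' ';
            in += 2;
        } else if (in[0] == 0xA3 &&
                   ((in[1] >= 0xB0 && in[1] <= 0xB9) ||    /* 0-9 */
                    (in[1] >= 0xC1 && in[1] <= 0xDA) ||    /* A-Z */
                    (in[1] >= 0xE1 && in[1] <= 0xFA))) {   /* a-z */
            /* Full-width digit or letter: trail byte minus 0x80 is ASCII. */
            *out++ = (unsigned char)tolower(in[1] - 0x80);
            in += 2;
        } else {
            /* Any other double-byte character is copied unchanged. */
            *out++ = *in++;
            *out++ = *in++;
        }
    }
    *out = '\0';

    /* Trim: drop trailing spaces, then shift past leading spaces. */
    len = (size_t)(out - (unsigned char *)s);
    while (len > 0 && s[len - 1] == ' ')
        s[--len] = '\0';
    while (s[lead] == ' ')
        lead++;
    if (lead > 0) {
        len -= lead;
        memmove(s, s + lead, len + 1);
    }
    return len;
}
```

Because every replacement is no longer than the bytes it replaces, the conversion can run in place without a second buffer. A real implementation would probably also fold full-width punctuation, which this sketch deliberately skips.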
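First, change into the ext directory and run the following command:
# ./ext_skel --extname=toplee
PHP then automatically generates a directory named toplee under ext, containing the following files: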
.cvsignore
CREDITS
EXPERIMENTAL
config.m4
php_toplee.h
tests
toplee.c
toplee.php
The most useful of these are config.m4 and toplee.c.
Next, we edit config.m4:
# vi ./config.m4
Find the lines that look like this:
dnl PHP_ARG_WITH(toplee, for toplee support,
dnl Make sure that the comment is aligned:
dnl [  --with-toplee             Include toplee support])

dnl Otherwise use enable:

dnl PHP_ARG_ENABLE(toplee, whether to enable toplee support,
dnl Make sure that the comment is aligned:
dnl [  --enable-toplee           Enable toplee support])
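These lines tell the PHP build how our extension module toplee is to be enabled. We use the --with-toplee form, so we remove the dnl prefix from the PHP_ARG_WITH lines (the "Make sure that the comment is aligned" line stays commented out), giving the following: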
PHP_ARG_WITH(toplee, for toplee support,
dnl Make sure that the comment is aligned:
[  --with-toplee             Include toplee support])

dnl Otherwise use enable:

dnl PHP_ARG_ENABLE(toplee, whether to enable toplee support,
dnl Make sure that the comment is aligned:
dnl [  --enable-toplee           Enable toplee support])
The key task after that is writing toplee.c, the main file of our module. Even if you change nothing in it, you already have a complete PHP extension module. It contains code like the following:
PHP_FUNCTION(confirm_toplee_compiled)
{
    char *arg = NULL;
    int arg_len, len;
    char string[256];

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &arg, &arg_len) == FAILURE) {
        return;
    }

    len = sprintf(string, "Congratulations! You have successfully modified ext/%.78s/config.m4. Module %.78s is now compiled into PHP.", "toplee", arg);
    RETURN_STRINGL(string, len, 1);
}
If the new module is compiled in when we later build PHP, we can call confirm_toplee_compiled("toplee") from a PHP script, and it returns the string "Congratulations! You have successfully modified ext/toplee/config.m4. Module toplee is now compiled into PHP."
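One detail of the skeleton function is worth spelling out: the `%.78s` precision caps each substituted string at 78 characters, which is what keeps the message inside the 256-byte buffer even though plain sprintf is used. The sketch below reproduces that formatting outside PHP so it can be tried in isolation. The helper name sketch_format_message is hypothetical, and it uses snprintf as an extra safeguard, which the generated skeleton does not do.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-alone helper mirroring the skeleton's message.
 * Each %.78s copies at most 78 characters of its argument, so the result
 * stays below 256 bytes no matter how long ext or arg is.
 * Returns the length of the formatted message. */
int sketch_format_message(char *buf, size_t size, const char *ext, const char *arg)
{
    return snprintf(buf, size,
        "Congratulations! You have successfully modified ext/%.78s/config.m4. "
        "Module %.78s is now compiled into PHP.", ext, arg);
}
```

Calling it with a 256-byte buffer and an argument of any length never truncates the fixed part of the message; only the two substituted strings are shortened.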
Below are our changes to toplee.c so that it supports the functions and interfaces we planned. This is the source of toplee.c:
/*
  +----------------------------------------------------------------------+
  | PHP Version 4                                                        |
  +----------------------------------------------------------------------+
  | Copyright (c) 1997-2002 The PHP Group                                |
  +----------------------------------------------------------------------+
  | This source file is subject to version 2.02 of the PHP license,      |
  | that is bundled with this package in the file LICENSE, and is        |
  | available at through the world-wide-web at                           |
  | http://www.php.net/license/2_02.txt.                                 |
  | If you did not receive a copy of the PHP license and are unable to   |
  | obtain it through the world-wide-web, please send a note to          |
  | license@php.net so we can mail you a copy immediately.               |
  +----------------------------------------------------------------------+
  | Author:                                                              |
  +----------------------------------------------------------------------+

  $Id: header,v 1.10 2002/02/28 08:25:27 sebastian Exp $
*/

#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

#include "php.h"
#include "php_ini.h"
#include "ext/standard/info.h"
#include "php_gbk.h"
#include "toplee_util.h"

/* If you declare any globals in php_gbk.h uncomment this:
ZEND_DECLARE_MODULE_GLOBALS(gbk)
*/

/* True global resources - no need for thread safety here */
static int le_gbk;

/* {{{ gbk_functions[]
 *
 * Every user visible function must have an entry in gbk_functions[].
 */
function_entry gbk_functions[] = {
    PHP_FE(toplee_decode_utf, NULL)
    PHP_FE(toplee_decode_utf_gb, NULL)
    PHP_FE(toplee_decode_utf_big5, NULL)
    PHP_FE(toplee_encode_utf_gb, NULL)

    PHP_FE(toplee_big52gbk, NULL)
    PHP_FE(toplee_gbk2big5, NULL)
    PHP_FE(toplee_fan2jian, NULL)
    PHP_FE(toplee_normalize_name, NULL)
    {NULL, NULL, NULL} /* Must be the last line in gbk_functions[] */
};
/* }}} */

/* {{{ gbk_module_entry
 */
zend_module_entry gbk_module_entry = {
#if ZEND_MODULE_API_NO >= 20010901
    STANDARD_MODULE_HEADER,
#endif
    "gbk",
    gbk_functions,
    PHP_MINIT(gbk),
    PHP_MSHUTDOWN(gbk),
    PHP_RINIT(gbk), /* Replace with NULL if there's nothing to do at request start */
    PHP_RSHUTDOWN(gbk), /* Replace with NULL if there's nothing to do at request end */
    PHP_MINFO(gbk),
#if ZEND_MODULE_API_NO >= 20010901
    "0.1", /* Replace with version number for your extension */
#endif
    STANDARD_MODULE_PROPERTIES
};
/* }}} */

#ifdef COMPILE_DL_GBK
ZEND_GET_MODULE(gbk)
#endif

/* {{{ PHP_INI
 */
/* Remove comments and fill if you need to have entries in php.ini*/
PHP_INI_BEGIN()
    PHP_INI_ENTRY("gbk2uni", "", PHP_INI_SYSTEM, NULL)
    PHP_INI_ENTRY("uni2gbk", "", PHP_INI_SYSTEM, NULL)
    PHP_INI_ENTRY("uni2big5", "", PHP_INI_SYSTEM, NULL)
    PHP_INI_ENTRY("big52uni", "", PHP_INI_SYSTEM, NULL)
    PHP_INI_ENTRY("big52gbk", "", PHP_INI_SYSTEM, NULL)
    PHP_INI_ENTRY("gbk2big5", "", PHP_INI_SYSTEM, NULL)
    // STD_PHP_INI_ENTRY("gbk.global_value", "42", PHP_INI_ALL, OnUpdateInt, global_value, zend_gbk_globals, gbk_globals)
    // STD_PHP_INI_ENTRY("gbk.global_string", "foobar", PHP_INI_ALL, OnUpdateString, global_string, zend_gbk_globals, gbk_globals)
PHP_INI_END()

/* }}} */

/* {{{ php_gbk_init_globals
 */
/* Uncomment this function if you have INI entries
static void php_gbk_init_globals(zend_gbk_globals *gbk_globals)
{
    gbk_globals->global_value = 0;
    gbk_globals->global_string = NULL;
}
*/
/* }}} */

char gbk2uni_file[256];
char uni2gbk_file[256];
char big52uni_file[256];
char uni2big5_file[256];
char gbk2big5_file[256];
char big52gbk_file[256];

//utf file init flag
static int initutf=0;

/* {{{ PHP_MINIT_FUNCTION
 */
PHP_MINIT_FUNCTION(gbk)
{
    /* If you have INI entries, uncomment these lines
    ZEND_INIT_MODULE_GLOBALS(gbk, php_gbk_init_globals, NULL);*/
    REGISTER_INI_ENTRIES();
    memset(gbk2uni_file, 0, sizeof(gbk2uni_file));
    memset(uni2gbk_file, 0, sizeof(uni2gbk_file));
    memset(big52uni_file, 0, sizeof(big52uni_file));
    memset(uni2big5_file, 0, sizeof(uni2big5_file));
    memset(gbk2big5_file, 0, sizeof(gbk2big5_file));
    memset(big52gbk_file, 0, sizeof(big52gbk_file));

    /* Copy the code table paths from php.ini; each copy is bounded by its own buffer */
    strncpy(gbk2uni_file, INI_STR("gbk2uni"), sizeof(gbk2uni_file)-1);
    strncpy(uni2gbk_file, INI_STR("uni2gbk"), sizeof(uni2gbk_file)-1);
    strncpy(big52uni_file, INI_STR("big52uni"), sizeof(big52uni_file)-1);
    strncpy(uni2big5_file, INI_STR("uni2big5"), sizeof(uni2big5_file)-1);
    strncpy(gbk2big5_file, INI_STR("gbk2big5"), sizeof(gbk2big5_file)-1);
    strncpy(big52gbk_file, INI_STR("big52gbk"), sizeof(big52gbk_file)-1);

    //InitMMResource();
    InitResource();
    /* All six code table paths must be configured, otherwise the module fails to load */
    if ((uni2gbk_file[0] == '\0') || (uni2big5_file[0] == '\0')
        || (gbk2big5_file[0] == '\0') || (big52gbk_file[0] == '\0')
        || (gbk2uni_file[0] == '\0') || (big52uni_file[0] == '\0'))
    {
        return FAILURE;
    }

    if (gbk2uni_file[0] != '\0')
    {
        if (LoadOneCodeTable(CODE_GBK2UNI, gbk2uni_file) != NULL)
        {
            toplee_cleanup_mmap(NULL);
            return FAILURE;
        }
    }

    if (uni2gbk_file[0] != '\0')
    {
        if (LoadOneCodeTable(CODE_UNI2GBK, uni2gbk_file) != NULL)
        {
            toplee_cleanup_mmap(NULL);
            return FAILURE;
        }
    }

    if (big52uni_file[0] != '\0')
    {
        if (LoadOneCodeTable(CODE_BIG52UNI, big52uni_file) != NULL)
        {
            toplee_cleanup_mmap(NULL);
            return FAILURE;
        }
    }

    if (uni2big5_file[0] != '\0')
    {
        if (LoadOneCodeTable(CODE_UNI2BIG5, uni2big5_file) != NULL)
        {
            toplee_cleanup_mmap(NULL);
            return FAILURE;
        }
    }

    if (gbk2big5_file[0] != '\0')
    {
        if (LoadOneCodeTable(CODE_GBK2BIG5, gbk2big5_file) != NULL)
        {
            toplee_cleanup_mmap(NULL);
            return FAILURE;
        }
    }

    if (big52gbk_file[0] != '\0')
    {
        if (LoadOneCodeTable(CODE_BIG52GBK, big52gbk_file) != NULL)
        {
            toplee_cleanup_mmap(NULL);
            return FAILURE;
        }
    }

    initutf = 1;
    return SUCCESS;
}
/* }}} */

/* {{{ PHP_MSHUTDOWN_FUNCTION
 */
PHP_MSHUTDOWN_FUNCTION(gbk)
{
    /* uncomment this line if you have INI entries*/
    UNREGISTER_INI_ENTRIES();

    toplee_cleanup_mmap(NULL);
    return SUCCESS;
}
/* }}} */

/* Remove if there's nothing to do at request start */
/* {{{ PHP_RINIT_FUNCTION
 */
PHP_RINIT_FUNCTION(gbk)
{
    return SUCCESS;
}
/* }}} */

/* Remove if there's nothing to do at request end */
/* {{{ PHP_RSHUTDOWN_FUNCTION
 */
PHP_RSHUTDOWN_FUNCTION(gbk)
{
    return SUCCESS;
}
/* }}} */

/* {{{ PHP_MINFO_FUNCTION
 */
PHP_MINFO_FUNCTION(gbk)
{
    php_info_print_table_start();
    php_info_print_table_header(2, "gbk support", "enabled");
    php_info_print_table_end();

    /* Remove comments if you have entries in php.ini*/
    DISPLAY_INI_ENTRIES();

}
/* }}} */


/* Remove the following function when you have succesfully modified config.m4
   so that your module can be compiled into PHP, it exists only for testing
   purposes. */

/* {{{ proto toplee_decode_utf(string s)
 */
PHP_FUNCTION(toplee_decode_utf)
{
    char *s = NULL, *t=NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;
    t = strdup(s);
    if (t==NULL)
        RETURN_FALSE;


    DecodePureUTF(t, KEEP_UNICODE);
    RETVAL_STRING(t,1);
    free(t);
    return;
}
/* }}} */

/* {{{ proto toplee_decode_utf_gb(string s)
 */
PHP_FUNCTION(toplee_decode_utf_gb)
{
    char *s = NULL, *t=NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;
    t = strdup(s);
    if (t==NULL)
        RETURN_FALSE;

    DecodePureUTF(t, DECODE_UNICODE);
    RETVAL_STRING(t,1);
    free(t);
    return;

}
/* }}} */

/* {{{ proto toplee_decode_utf_big5(string s)
 */
PHP_FUNCTION(toplee_decode_utf_big5)
{
    char *s = NULL, *t=NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;
    t = strdup(s);
    if (t==NULL)
        RETURN_FALSE;


    DecodePureUTF(t, DECODE_UNICODE | DECODE_BIG5);
    RETVAL_STRING(t,1);
    free(t);
    return;
}
/* }}} */
int EncodePureUTF(unsigned char* strSrc,
        unsigned char* strDst, int nDstLen, int nFlag)
{
    int nRet;
    int pos;
    unsigned short c;
    unsigned short* uBuf;
    unsigned short* p;
    int nSize;
    int nLen;
    int nReturn;

    nLen=strlen((const char*)strSrc);
    if (nDstLen < nLen*2+1)
        return 0;

    nSize=nLen+1;
    uBuf=(unsigned short*)emalloc(sizeof(unsigned short)*nSize);

    /* Convert the GBK (code page 936) input to 16-bit Unicode values */
    nRet=MultiByteToWideChar(936, 0, (const char*)strSrc, nLen,
            uBuf, nSize);

    nReturn=0;
    pos=nRet;
    p=uBuf; /* walk a copy so that uBuf itself can be freed afterwards */
    while (pos>0)
    {
        c = *p;
        if (c < 0x80) {
            strDst[nReturn++] = (char)c;
        } else if (c < 0x800) {
            strDst[nReturn++] = (0xc0 | (c >> 6));
            strDst[nReturn++] = (0x80 | (c & 0x3f));
        } else if (c < 0x10000) {
            strDst[nReturn++] = (0xe0 | (c >> 12));
            strDst[nReturn++] = (0x80 | ((c >> 6) & 0x3f));
            strDst[nReturn++] = (0x80 | (c & 0x3f));
        } else if (c < 0x200000) {
            /* never reached while c is a 16-bit value */
            strDst[nReturn++] = (0xf0 | (c >> 18));
            strDst[nReturn++] = (0x80 | ((c >> 12) & 0x3f));
            strDst[nReturn++] = (0x80 | ((c >> 6) & 0x3f));
            strDst[nReturn++] = (0x80 | (c & 0x3f));
        }
        pos--;
        p++;
    }
    strDst[nReturn]='\0';
    efree(uBuf);

    return nReturn;
}

/* {{{ proto toplee_encode_utf_gb(string s)
 */
PHP_FUNCTION(toplee_encode_utf_gb)
{
    char *s = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;
    char* sRet;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;
    sRet=emalloc(strlen(s)*2+1);

    EncodePureUTF((unsigned char*)s, (unsigned char*)sRet, strlen(s)*2+1, 0);
    /* sRet came from emalloc, so hand it to PHP without duplicating it */
    RETVAL_STRING(sRet,0);
    return;
}
/* }}} */


/* {{{ proto toplee_big52gbk(string s)
 */
PHP_FUNCTION(toplee_big52gbk)
{
    char *s = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;
    char* sRet = NULL;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;

    sRet=estrdup(s);
    if (NULL == sRet)
        RETURN_FALSE;

    BIG52GBK(sRet, strlen(sRet));
    /* sRet came from estrdup, so hand it to PHP without duplicating it */
    RETVAL_STRING(sRet,0);
    return;
}
/* }}} */

/* {{{ proto toplee_gbk2big5(string s)
 */
PHP_FUNCTION(toplee_gbk2big5)
{
    char *s = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;
    char* sRet = NULL;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;

    sRet=estrdup(s);
    if (NULL == sRet)
        RETURN_FALSE;

    GBK2BIG5(sRet, strlen(sRet));
    RETVAL_STRING(sRet,0);
    return;
}
/* }}} */

/* {{{ proto toplee_normalize_name(string s)
 */
PHP_FUNCTION(toplee_normalize_name)
{
    char *s = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;
    char* sRet = NULL;

    if (zend_parse_parameters(argc TSRMLS_CC, "s", &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;

    /* Work on a copy so that the caller's string is not modified in place */
    sRet=estrdup(s);
    NormalizeName(sRet);

    RETURN_STRING(sRet, 0);
}
/* }}} */

/* {{{ proto toplee_fan2jian(int code, string s)
 */
PHP_FUNCTION(toplee_fan2jian)
{
    char *s = NULL;
    int argc = ZEND_NUM_ARGS();
    int s_len;
    long code; /* the "l" specifier of zend_parse_parameters stores a long */
    char* sRet = NULL;
    char *pSource;
    char *pDest1=NULL, *pDest2=NULL;
    int nSourceLen, nDestLen;

    if (zend_parse_parameters(argc TSRMLS_CC, "ls", &code, &s, &s_len) == FAILURE)
        return;

    if (!initutf)
        RETURN_FALSE;

    pSource = s;
    nSourceLen = s_len;
    pDest1 = malloc(nSourceLen * 2);
    pDest2 = malloc(nSourceLen+1);
    if (NULL == pDest1 || NULL == pDest2)
        goto _f2j_err;

    memset(pDest1, 0, nSourceLen * 2);
    memset(pDest2, 0, nSourceLen + 1);
    nDestLen = MultiByteToWideChar((int)code, 0, pSource, nSourceLen, (short *)pDest1, nSourceLen * 2);

    if (0 >= nDestLen)
        goto _f2j_err;

    nDestLen = WideCharToMultiByte((int)code, 0, (short *)pDest1, nDestLen, pDest2, nSourceLen, NULL, NULL);
    if (0 >= nDestLen)
        goto _f2j_err;

    RETVAL_STRING(pDest2, 1);
    if (pDest1 != NULL)
        free(pDest1);
    if (pDest2 != NULL)
        free(pDest2);
    return;

_f2j_err:
    if (pDest1 != NULL)
        free(pDest1);
    if (pDest2 != NULL)
        free(pDest2);
    RETURN_FALSE;
}
/* }}} */

/*
 * Local variables:
 * tab-width: 4
 * c-basic-offset: 4
 * End:
 * vim600: noet sw=4 ts=4 fdm=marker
 * vim<600: noet sw=4 ts=4
 */
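The loop in EncodePureUTF above is the heart of the GBK-to-UTF conversion: once the input has been turned into Unicode values, each value is written out as one to three UTF-8 bytes. As a rough illustration of just that step, here is a self-contained sketch that encodes a single code point. The function name sketch_utf8_encode is hypothetical and is not part of the extension. Unlike the listing, it takes a full code point rather than a 16-bit value, so the four-byte case is reachable, and it rejects surrogates and values above U+10FFFF.

```c
#include <stddef.h>

/* Hypothetical helper: writes the UTF-8 encoding of code point cp into out
 * (which must have room for 4 bytes) and returns the number of bytes
 * written, or 0 if cp is a surrogate or lies above U+10FFFF. */
size_t sketch_utf8_encode(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {
        /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates are not valid code points in UTF-8 */
        /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, U+4E2D (the character 中) comes out as the three bytes E4 B8 AD. Since a two-byte GBK character becomes at most three UTF-8 bytes, an output buffer of twice the input length plus one, as used in the listing, is enough.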
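In fact, this file defines all the interfaces we set out to implement. What remains is to write the C code that actually implements them, and I will not discuss the technical details of those implementations here. The one key point to note is that you can put all the C source and header files that implement the interfaces defined in toplee.c into the ext/toplee directory, so that toplee.c can use them at compile time. All of this is standard C, so I (Michael) will say no more about it. Below I paste the code for the pieces we implemented: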