Commit 78ac8f5

yanruitao committed
Add usage documentation file
1 parent 3d05711 commit 78ac8f5

5 files changed, +183 -5 lines changed
README.md

+175
@@ -3,3 +3,178 @@ PHP extension for converting Chinese characters to pinyin

###
In FastCGI mode the extension stays resident in memory, so it is very fast.
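
### Configuration
Set `pinyin.dir=/path/to/pinyindir` in /path/to/php.ini. The configured path is the directory that holds the data files.
The directory contains two kinds of files:

The first is the surname file. There is only one, named `surnames`.

The second kind holds ordinary phrases and single characters. These files are named `words_0`, `words_1`, ..., `words_9` (at most 10), and lower-numbered files hold the more commonly used phrases.
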
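
To check a data directory against the file names listed above, something like the following could be used. This is only a sketch: `pinyin_data_files()` is a hypothetical helper, not part of the extension, and reading the setting with PHP's built-in `get_cfg_var()` assumes the value was written to the php.ini that PHP actually loaded.

```php
<?php
// Hypothetical helper (not provided by the extension): return the names of
// the data files described in the README that exist in $dir.
function pinyin_data_files($dir)
{
    // One surname file plus up to ten word files, words_0 .. words_9.
    $expected = array('surnames');
    for ($i = 0; $i <= 9; $i++) {
        $expected[] = 'words_' . $i;
    }

    $found = array();
    foreach ($expected as $name) {
        if (is_file(rtrim($dir, '/') . '/' . $name)) {
            $found[] = $name;
        }
    }
    return $found;
}

// get_cfg_var() returns false when the entry is absent from php.ini.
$dir = get_cfg_var('pinyin.dir');
print_r($dir ? pinyin_data_files($dir) : array());
```

An empty result means either that `pinyin.dir` is not set or that none of the expected files were found in it.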
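### Usage
The interface is kept as simple as possible: a single function, `chinese_to_pinyin(char *str, int flags)`, converts the input into different forms depending on the flags passed.
> `PINYIN_NONE`: pinyin without tone marks
> `PINYIN_UNICODE`: pinyin with tone marks
> `PINYIN_ISNAME`: the text to convert is a personal name
> `PINYIN_TRIM`: compact mode, removes all punctuation
> `PINYIN_FORMAT_EN`: converts all punctuation to English form
> `PINYIN_FORMAT_CH`: converts all punctuation to Chinese form
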
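
The flags are independent bits, so they are combined with `|` and tested with `&`. The sketch below illustrates this with the four bit values visible in `php_pinyin.h` in this commit (`PINYIN_NONE` and `PINYIN_UNICODE` are left out because their values are not shown here). `pinyin_flag_names()` is a hypothetical helper for illustration, and the constants are defined locally only when the extension has not already registered them.

```php
<?php
// Bit values as they appear in php_pinyin.h in this commit. They are defined
// here only if the extension is not loaded, so the sketch can run standalone.
$known = array(
    'PINYIN_ISNAME'    => 1 << 2,
    'PINYIN_TRIM'      => 1 << 3,
    'PINYIN_FORMAT_EN' => 1 << 4,
    'PINYIN_FORMAT_CH' => 1 << 5,
);
foreach ($known as $name => $bit) {
    if (!defined($name)) {
        define($name, $bit);
    }
}

// Hypothetical helper: list the names of the known flags set in $flags.
function pinyin_flag_names($flags)
{
    $names = array();
    $candidates = array('PINYIN_ISNAME', 'PINYIN_TRIM', 'PINYIN_FORMAT_EN', 'PINYIN_FORMAT_CH');
    foreach ($candidates as $name) {
        if ($flags & constant($name)) {
            $names[] = $name;
        }
    }
    return $names;
}

// Combine two flags with bitwise OR, then decode the mask again.
print_r(pinyin_flag_names(PINYIN_ISNAME | PINYIN_TRIM));
```

Here the decoded list contains `PINYIN_ISNAME` and `PINYIN_TRIM`, in that order.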
#### PINYIN_NONE: without tone marks
```php
print_r(chinese_to_pinyin("你因为穷用盗版的时候至少要知道自己是不对的,这说明你还有救。", PINYIN_NONE));
```
Output:

```
Array
(
    [0] => ni
    [1] => yin
    [2] => wei
    [3] => qiong
    [4] => yong
    [5] => dao
    [6] => ban
    [7] => de
    [8] => shi
    [9] => hou
    [10] => zhi
    [11] => shao
    [12] => yao
    [13] => zhi
    [14] => dao
    [15] => zi
    [16] => ji
    [17] => shi
    [18] => bu
    [19] => dui
    [20] => de,
    [21] => zhe
    [22] => shuo
    [23] => ming
    [24] => ni
    [25] => hai
    [26] => you
    [27] => jiu。
)
```

#### Without tone marks, punctuation stripped
```php
print_r(chinese_to_pinyin("你因为穷用盗版的时候至少要知道自己是不对的,这说明你还有救。", PINYIN_NONE|PINYIN_TRIM));
```

The result is below; all punctuation has been filtered out.
```
Array
(
    [0] => ni
    [1] => yin
    [2] => wei
    [3] => qiong
    [4] => yong
    [5] => dao
    [6] => ban
    [7] => de
    [8] => shi
    [9] => hou
    [10] => zhi
    [11] => shao
    [12] => yao
    [13] => zhi
    [14] => dao
    [15] => zi
    [16] => ji
    [17] => shi
    [18] => bu
    [19] => dui
    [20] => de
    [21] => zhe
    [22] => shuo
    [23] => ming
    [24] => ni
    [25] => hai
    [26] => you
    [27] => jiu
)
```
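
Because the result is an array with one syllable per element, callers will often want to join it into a single string. A minimal sketch using PHP's built-in `implode()`; the array is hard-coded from the first four elements of the result above so that the snippet runs without the extension.

```php
<?php
// First four syllables of the result above, hard-coded for illustration.
$syllables = array('ni', 'yin', 'wei', 'qiong');

// Join with spaces for display.
$spaced = implode(' ', $syllables);
echo $spaced, "\n"; // prints "ni yin wei qiong"

// Join without a separator, e.g. for a search key.
$compact = implode('', $syllables);
echo $compact, "\n"; // prints "niyinweiqiong"
```

With the extension loaded, `$syllables` would instead come from a call such as `chinese_to_pinyin($text, PINYIN_NONE|PINYIN_TRIM)`.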

#### With tone marks and formatted punctuation
```php
print_r(chinese_to_pinyin("你因为穷用盗版的时候至少要知道自己是不对的,这说明你还有救。", PINYIN_UNICODE|PINYIN_FORMAT_CH));
```

The output is below; the punctuation marks are emitted too, each as its own element.
```
Array
(
    [0] => nǐ
    [1] => yīn
    [2] => wèi
    [3] => qióng
    [4] => yòng
    [5] => dào
    [6] => bǎn
    [7] => de
    [8] => shí
    [9] => hòu
    [10] => zhì
    [11] => shǎo
    [12] => yào
    [13] => zhī
    [14] => dào
    [15] => zì
    [16] => jǐ
    [17] => shì
    [18] => bú
    [19] => duì
    [20] => de
    [21] => ,
    [22] => zhè
    [23] => shuō
    [24] => míng
    [25] => nǐ
    [26] => hái
    [27] => yǒu
    [28] => jiù
    [29] => 。
)
```

#### Names
The examples here use the names of a few friends (fairly distinctive names).
```php
print_r(chinese_to_pinyin("冼佩君", PINYIN_ISNAME));
print_r(chinese_to_pinyin("袁旭东", PINYIN_ISNAME));
print_r(chinese_to_pinyin("燕睿涛", PINYIN_ISNAME));
print_r(chinese_to_pinyin("单净净", PINYIN_ISNAME));
```

```
Array
(
    [0] => xiǎn
    [1] => pèi
    [2] => jūn
)
Array
(
    [0] => yuán
    [1] => xù
    [2] => dōng
)
Array
(
    [0] => yān
    [1] => ruì
    [2] => tāo
)
Array
(
    [0] => shàn
    [1] => jìng
    [2] => jìng
)
```

modules/pinyin.so

0 Bytes
Binary file not shown.

php_pinyin.h

+1 -1
@@ -96,7 +96,7 @@ void str_replace(const char *from, const char *to, char *str, char *ret, zend_bo
 #define PINYIN_ISNAME (1<<2)
 #define PINYIN_TRIM (1<<3) // omit punctuation
 #define PINYIN_FORMAT_EN (1<<4) // convert punctuation to English form
-#define PINYIN_FORMAT (1<<5) // split each punctuation mark into its own element
+#define PINYIN_FORMAT_CH (1<<5) // split each punctuation mark into its own element
 
 /* In every utility function you add that needs to use variables
    in php_pinyin_globals, call TSRMLS_FETCH(); after declaring other

pinyin.c

+3 -3
@@ -248,12 +248,12 @@ PHP_FUNCTION(chinese_to_pinyin)
     MyList *p_surname = pinyin_globals.mySurnameList->next;
 
     // remove punctuation
-    if(l & (PINYIN_TRIM|PINYIN_FORMAT_EN|PINYIN_FORMAT))
+    if(l & (PINYIN_TRIM|PINYIN_FORMAT_EN|PINYIN_FORMAT_CH))
     {
         int j = 0;
         for(; j<MY_TRIM_NUM; j++)
         {
-            if(l & PINYIN_FORMAT) // format only
+            if(l & PINYIN_FORMAT_CH) // format only
             {
                 memset(char_str, '\0', MAX_PUNCTUATION_SIZE);
                 strcat(char_str, "\t");
@@ -372,7 +372,7 @@ PHP_MINIT_FUNCTION(pinyin)
     REGISTER_LONG_CONSTANT("PINYIN_ISNAME", PINYIN_ISNAME, CONST_PERSISTENT | CONST_CS);
     REGISTER_LONG_CONSTANT("PINYIN_TRIM", PINYIN_TRIM, CONST_PERSISTENT | CONST_CS);
     REGISTER_LONG_CONSTANT("PINYIN_FORMAT_EN", PINYIN_FORMAT_EN, CONST_PERSISTENT | CONST_CS);
-    REGISTER_LONG_CONSTANT("PINYIN_FORMAT", PINYIN_FORMAT, CONST_PERSISTENT | CONST_CS);
+    REGISTER_LONG_CONSTANT("PINYIN_FORMAT_CH", PINYIN_FORMAT_CH, CONST_PERSISTENT | CONST_CS);
 
     /* If you have INI entries, uncomment these lines
     REGISTER_INI_ENTRIES();

tests/test.php

+4 -1
@@ -1,2 +1,5 @@
 <?php
-print_r(chinese_to_pinyin("你好..;我是ter", PINYIN_FORMAT));
+print_r(chinese_to_pinyin("冼佩君", PINYIN_ISNAME));
+print_r(chinese_to_pinyin("袁旭东", PINYIN_ISNAME));
+print_r(chinese_to_pinyin("燕睿涛", PINYIN_ISNAME));
+print_r(chinese_to_pinyin("单净净", PINYIN_ISNAME));
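
This commit renames the registered constant `PINYIN_FORMAT` to `PINYIN_FORMAT_CH`, so a script written against the old name stops working with builds that include this change. A script that has to run against builds from either side of the rename could resolve the flag at runtime, roughly as sketched below. `pinyin_format_ch_flag()` is a hypothetical helper, not part of the extension; it relies only on PHP's built-in `defined()` and `constant()`.

```php
<?php
// Hypothetical helper (not part of the extension): return the value of the
// punctuation-format flag under whichever name the loaded build registers,
// or null when neither name is defined.
function pinyin_format_ch_flag()
{
    foreach (array('PINYIN_FORMAT_CH', 'PINYIN_FORMAT') as $name) {
        if (defined($name)) {
            return constant($name);
        }
    }
    return null;
}

$flag = pinyin_format_ch_flag();
if ($flag !== null && function_exists('chinese_to_pinyin')) {
    // Same input as the test line removed in this commit.
    print_r(chinese_to_pinyin("你好..;我是ter", $flag));
} else {
    echo "pinyin extension not available\n";
}
```

The new name is checked first so that builds containing this commit take the preferred path.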
