Commit d35cb9a

v2.0.2
* Add support for utf-8 characters when parsing words.
1 parent 21fffb3 commit d35cb9a

4 files changed, 19 additions and 1 deletion

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ phpunit.xml
 .buildpath
 *.iml
 .idea/
+.phpunit.result.cache
 .project
 .settings

CHANGELOG-2.x.md

Lines changed: 4 additions & 0 deletions
@@ -2,6 +2,10 @@
 This changelog references the relevant changes done in 2.x versions.
 
 
+## v2.0.2
+* Add support for utf-8 characters when parsing words.
+
+
 ## v2.0.1
 * Do not truncate input in `Tokenizer::scan`. Removed `substr($input, 0, 256)` rule as we're unsure where/why it's there and seems safe to remove.

src/Tokenizer.php

4 Bytes
Binary file not shown.
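
Because the `src/Tokenizer.php` diff is not rendered, the actual change is not visible here. As a rough illustration only, the sketch below shows one common way to make word matching UTF-8 aware in PHP: Unicode character properties combined with the `u` pattern modifier. The function `scanWords` and its pattern are hypothetical and are not taken from the library's `Tokenizer`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Gdbots\QueryParser): returns every run of
 * Unicode letters, numbers, and combining marks found in the input.
 */
function scanWords(string $input): array
{
    // \p{L} = letters, \p{N} = numbers, \p{M} = combining marks.
    // The `u` modifier makes PCRE treat the pattern and subject as UTF-8,
    // so multibyte characters are matched whole rather than byte by byte.
    $count = preg_match_all('/[\p{L}\p{N}\p{M}]+/u', $input, $matches);

    // preg_match_all returns false on error (for example, invalid UTF-8).
    return $count === false ? [] : $matches[0];
}

// Same input as the 'utf chars' fixture in tests/Fixtures/test-queries.php.
print_r(scanWords('测试 測試'));
```

With the fixture input `'测试 測試'`, this sketch yields two words, mirroring the two `T_WORD` tokens the new test expects. The real tokenizer also handles characters such as `$` and `-` inside words, which this pattern deliberately leaves out.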

tests/Fixtures/test-queries.php

Lines changed: 14 additions & 1 deletion
@@ -9,8 +9,8 @@
 use Gdbots\QueryParser\Node\Field;
 use Gdbots\QueryParser\Node\Hashtag;
 use Gdbots\QueryParser\Node\Mention;
-use Gdbots\QueryParser\Node\Numbr;
 use Gdbots\QueryParser\Node\NumberRange;
+use Gdbots\QueryParser\Node\Numbr;
 use Gdbots\QueryParser\Node\Phrase;
 use Gdbots\QueryParser\Node\Subquery;
 use Gdbots\QueryParser\Node\Url;
@@ -1328,6 +1328,19 @@
             new Word('$p0rty-spicé'),
         ],
     ],
+
+    [
+        'name' => 'utf chars',
+        'input' => '测试 測試',
+        'expected_tokens' => [
+            [T::T_WORD, '测试'],
+            [T::T_WORD, '測試'],
+        ],
+        'expected_nodes' => [
+            new Word('测试'),
+            new Word('測試'),
+        ],
+    ],
     /*
      * END: ACCENTED CHARS
      */
