CJKSimpleCharacterRegExTokenizerTest   A

Complexity

Total Complexity 4

Size/Duplication

Total Lines 72
Duplicated Lines 0 %

Coupling/Cohesion

Components 0
Dependencies 1

Importance

Changes 0

Metric  Value
wmc     4
lcom    0
cbo     1
dl      0
loc     72
rs      10
c       0
b       0
f       0

4 Methods

Rating  Name                                    Duplication  Size  Complexity
A       testUnknownOption()                     0            7     1
A       testTokenize()                          0            13    1
B       testTokenizeWithEnabledExemptionList()  0            28    1
A       stringProvider()                        0            14    1
<?php

namespace Onoi\Tesa\Tests;

use Onoi\Tesa\Tokenizer\CJKSimpleCharacterRegExTokenizer;

/**
 * @covers \Onoi\Tesa\Tokenizer\CJKSimpleCharacterRegExTokenizer
 * @group onoi-tesa
 *
 * @license GNU GPL v2+
 * @since 0.1
 *
 * @author mwjames
 */
class CJKSimpleCharacterRegExTokenizerTest extends \PHPUnit_Framework_TestCase {

	public function testUnknownOption() {

		$this->assertInstanceOf(
			'\Onoi\Tesa\Tokenizer\CJKSimpleCharacterRegExTokenizer',
			new CJKSimpleCharacterRegExTokenizer()
		);
	}

	/**
	 * @dataProvider stringProvider
	 */
	public function testTokenize( $string, $expected ) {

		$instance = new CJKSimpleCharacterRegExTokenizer();

		$this->assertEquals(
			$expected,
			$instance->tokenize( $string )
		);

		$this->assertFalse(
			$instance->isWordTokenizer()
		);
	}

	public function testTokenizeWithEnabledExemptionList() {

		$string = '《红色中华》报改名为《新中华报》的同时,在';

		$tokenizer = $this->getMockBuilder( '\Onoi\Tesa\Tokenizer\Tokenizer' )
			->disableOriginalConstructor()
			->getMockForAbstractClass();

		$tokenizer->expects( $this->once() )
			->method( 'setOption' );

		$tokenizer->expects( $this->once() )
			->method( 'tokenize' )
			->with( $this->equalTo( $string ) )
			->will( $this->returnValue( array( $string ) ) );

		$instance = new CJKSimpleCharacterRegExTokenizer( $tokenizer );

		$instance->setOption(
			CJKSimpleCharacterRegExTokenizer::REGEX_EXEMPTION,
			array( '《', '》', ',' )
		);

		$this->assertEquals(
			array( '《红色中华》报改名', '《新中华报》', '同', ',' ),
			$instance->tokenize( $string )
		);
	}

	public function stringProvider() {

		$provider[] = array(
Coding Style, Comprehensibility
$provider was never initialized. Although PHP does not strictly require it, it is good practice to add $provider = array(); before the first $provider[] assignment.

An explicit array definition is generally preferable to an implicit one because it guarantees that the variable starts from a known state.

Let’s take a look at an example:

foreach ($collection as $item) {
    $myArray['foo'] = $item->getFoo();

    if ($item->hasBar()) {
        $myArray['bar'] = $item->getBar();
    }

    // do something with $myArray
}

As you can see in this example, the array $myArray is created implicitly the first time the body of the foreach loop runs. The value of the bar key is written only conditionally, so it may be left over from a previous iteration.

This might or might not be intended. To make your intention clear, make your code more readable, and avoid accidental bugs, we recommend adding an explicit initialization $myArray = array() either outside or inside the foreach loop.

			'《红色中华》报改名为《新中华报》的同时,在延安更名为新华通讯社。但是当时,新华社和《新中华报》还是同一个机构。',
			array( '红色中华', '报改名', '新中华报', '同', '延安更名', '新华通讯社', '新华社', '新中华报', '同一', '机构' )
		);

		$provider[] = array(
			'江泽民在北京人民大会堂会见参加全国法院工作会议和全国法院系统打击经济犯罪先进集体表彰大会代表时要求大家要充分认识打击经济犯罪的艰巨性和长期性',
			array( '江泽民', '北京人民大会堂会见参加全国法院工作会议', '全国法院系统打击经济犯罪先进集体表彰大会代表', '求大家', '充分认识打击经济犯罪', '艰巨性', '长期性' )
		);

		return $provider;
	}

}
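The note on stringProvider() recommends initializing an array explicitly before appending to it. A minimal sketch of that advice follows, under stated assumptions: collectFooAndBar() and exampleProvider() are hypothetical names that do not exist in the reviewed class, and each item is assumed to be a plain array with a foo key and an optional bar key.

```php
<?php

// Hypothetical helper for illustration only; not part of the reviewed test class.
// Assumes each item is an array with a 'foo' key and an optional 'bar' key.
function collectFooAndBar( array $collection ) {
	$results = array();

	foreach ( $collection as $item ) {
		// Reset on every iteration so 'bar' cannot carry over from a previous item.
		$myArray = array();
		$myArray['foo'] = $item['foo'];

		if ( isset( $item['bar'] ) ) {
			$myArray['bar'] = $item['bar'];
		}

		$results[] = $myArray;
	}

	return $results;
}

// Hypothetical data provider showing the same explicit initialization.
function exampleProvider() {
	$provider = array();

	$provider[] = array( 'input', array( 'expected' ) );

	return $provider;
}
```

Initializing inside the loop gives each iteration a fresh array. Initializing outside the loop would instead let values accumulate across iterations, which may or may not be what is wanted. In stringProvider() the array is only appended to, so a single $provider = array(); at the top of the method would be enough.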