Duplicate code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
Common duplication problems and their corresponding solutions are:
<?php
// ...
class KernelPCA extends PCA
{
    const KERNEL_RBF = 1;
    const KERNEL_SIGMOID = 2;
    const KERNEL_LAPLACIAN = 3;
    const KERNEL_LINEAR = 4;

    /**
     * Selected kernel function
     *
     * @var int
     */
    protected $kernel;

    /**
     * Gamma value used by the kernel
     *
     * @var float
     */
    protected $gamma;

    /**
     * Original dataset used to fit KernelPCA
     *
     * @var array
     */
    protected $data;

    /**
     * Kernel principal component analysis (KernelPCA) is an extension of PCA using
     * techniques of kernel methods. It is more suitable for data that involves
     * vectors that are not linearly separable<br><br>
     * Example: <b>$kpca = new KernelPCA(KernelPCA::KERNEL_RBF, null, 2, 15.0);</b>
     * will initialize the algorithm with an RBF kernel having the gamma parameter as 15,0. <br>
     * This transformation will return the same number of rows with only <i>2</i> columns.
     *
     * @param int $kernel
     * @param float $totalVariance Total variance to be preserved if numFeatures is not given
     * @param int $numFeatures Number of columns to be returned
     * @param float $gamma Gamma parameter is used with RBF and Sigmoid kernels
     *
     * @throws \Exception
     */
    public function __construct(int $kernel = self::KERNEL_RBF, $totalVariance = null, $numFeatures = null, $gamma = null)
    // ...

    /**
     * Takes a data and returns a lower dimensional version
     * of this data while preserving $totalVariance or $numFeatures. <br>
     * $data is an n-by-m matrix and returned array is
     * n-by-k matrix where k <= m
     *
     * @param array $data
     *
     * @return array
     */
    public function fit(array $data)
    // ...

    /**
     * Calculates similarity matrix by use of selected kernel function<br>
     * An n-by-m matrix is given and an n-by-n matrix is returned
     *
     * @param array $data
     * @param int $numRows
     *
     * @return array
     */
    protected function calculateKernelMatrix(array $data, int $numRows)
    // ...

    /**
     * Kernel matrix is centered in its original space by using the following
     * conversion:
     *
     * K′ = K − N.K − K.N + N.K.N where N is n-by-n matrix filled with 1/n
     *
     * @param array $matrix
     * @param int $n
     */
    protected function centerMatrix(array $matrix, int $n)
    // ...

    /**
     * Returns the callable kernel function
     *
     * @return \Closure
     */
    protected function getKernel()
    // ...

    /**
     * @param array $sample
     *
     * @return array
     */
    protected function getDistancePairs(array $sample)
    // ...

    /**
     * @param array $pairs
     *
     * @return array
     */
    protected function projectSample(array $pairs)
    // ...

    /**
     * Transforms the given sample to a lower dimensional vector by using
     * the variables obtained during the last run of <code>fit</code>.
     *
     * @param array $sample
     *
     * @return array
     */
    public function transform(array $sample)
    // ...
}
If you need to duplicate the same code in three or more different places, we strongly encourage you to extract it into a single class or operation.
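As a rough illustration of that advice, the sketch below moves a repeated column-mean calculation into one shared operation and has two callers reuse it. Every name here (`ColumnStats`, `centerColumn`, `columnVariance`) is hypothetical and is not taken from the `KernelPCA` listing; the example assumes PHP 7.4 or later because it uses arrow functions.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: the single home for logic that would otherwise be
// copy-pasted into every method that needs a column mean.
final class ColumnStats
{
    /** Returns the arithmetic mean of one column of a non-empty row-major matrix. */
    public static function mean(array $rows, int $column): float
    {
        if ($rows === []) {
            throw new InvalidArgumentException('Matrix must not be empty.');
        }

        return array_sum(array_column($rows, $column)) / count($rows);
    }
}

// Hypothetical caller 1: subtracts the shared mean from each value in the column.
function centerColumn(array $rows, int $column): array
{
    $mean = ColumnStats::mean($rows, $column);

    return array_map(fn (array $row): float => $row[$column] - $mean, $rows);
}

// Hypothetical caller 2: population variance of the column, reusing the same mean.
function columnVariance(array $rows, int $column): float
{
    $mean = ColumnStats::mean($rows, $column);
    $squares = array_map(fn (array $row): float => ($row[$column] - $mean) ** 2, $rows);

    return array_sum($squares) / count($rows);
}
```

With the mean computed in one place, a fix such as the empty-matrix check only has to be made once, which is the main benefit of removing the duplication.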
You can also find more detailed suggestions in the “Code” section of your repository.