| Conditions | 11 |
| Paths | 4 |
| Total Lines | 59 |
| Code Lines | 28 |
| Lines | 0 |
| Ratio | 0 % |
| Changes | 2 | ||
| Bugs | 0 | Features | 0 |
Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.
For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
Commonly applied refactorings include:
If many parameters/temporary variables are present:
| 1 | <?php |
||
| 163 | protected function initializeClusters($nbClusters, $seed) |
||
| 164 | { |
||
| 165 | if ($nbClusters <= 0) { |
||
| 166 | throw new InvalidArgumentException('invalid clusters number'); |
||
| 167 | } |
||
| 168 | |||
| 169 | switch ($seed) { |
||
| 170 | // the default seeding method chooses completely random centroid |
||
| 171 | case KMeans::INIT_RANDOM: |
||
| 172 | // get the space boundaries to avoid placing clusters centroid too far from points |
||
| 173 | list($min, $max) = $this->getBoundaries(); |
||
| 174 | |||
| 175 | // initialize N clusters with a random point within space boundaries |
||
| 176 | for ($n = 0; $n < $nbClusters; ++$n) { |
||
| 177 | $clusters[] = new Cluster($this, $this->getRandomPoint($min, $max)->getCoordinates()); |
||
|
|
|||
| 178 | } |
||
| 179 | |||
| 180 | break; |
||
| 181 | |||
| 182 | // the DASV seeding method consists of finding good initial centroids for the clusters |
||
| 183 | case KMeans::INIT_KMEANS_PLUS_PLUS: |
||
| 184 | // find a random point |
||
| 185 | $position = rand(1, count($this)); |
||
| 186 | for ($i = 1, $this->rewind(); $i < $position && $this->valid(); $i++, $this->next()); |
||
| 187 | $clusters[] = new Cluster($this, $this->current()->getCoordinates()); |
||
| 188 | |||
| 189 | // retains the distances between points and their closest clusters |
||
| 190 | $distances = new SplObjectStorage(); |
||
| 191 | |||
| 192 | // create k clusters |
||
| 193 | for ($i = 1; $i < $nbClusters; ++$i) { |
||
| 194 | $sum = 0; |
||
| 195 | |||
| 196 | // for each points, get the distance with the closest centroid already choosen |
||
| 197 | foreach ($this as $point) { |
||
| 198 | $distance = $point->getDistanceWith($point->getClosest($clusters)); |
||
| 199 | $sum += $distances[$point] = $distance; |
||
| 200 | } |
||
| 201 | |||
| 202 | // choose a new random point using a weighted probability distribution |
||
| 203 | $sum = rand(0, (int) $sum); |
||
| 204 | foreach ($this as $point) { |
||
| 205 | if (($sum -= $distances[$point]) > 0) { |
||
| 206 | continue; |
||
| 207 | } |
||
| 208 | |||
| 209 | $clusters[] = new Cluster($this, $point->getCoordinates()); |
||
| 210 | break; |
||
| 211 | } |
||
| 212 | } |
||
| 213 | |||
| 214 | break; |
||
| 215 | } |
||
| 216 | |||
| 217 | // assing all points to the first cluster |
||
| 218 | $clusters[0]->attachAll($this); |
||
| 219 | |||
| 220 | return $clusters; |
||
| 221 | } |
||
| 222 | |||
| 272 |
Adding an explicit array definition is generally preferable to implicit array definition as it guarantees a stable state of the code.
Let’s take a look at an example:
As you can see in this example, the array
$myArrayis initialized the first time when the foreach loop is entered. You can also see that the value of thebarkey is only written conditionally; thus, its value might result from a previous iteration.This might or might not be intended. To make your intention clear, your code more readible and to avoid accidental bugs, we recommend to add an explicit initialization $myArray = array() either outside or inside the foreach loop.