Passed
Push — master ( 31fead...1101eb )
by Ruben
02:09
created

ML::kmeans()   B

Complexity

Conditions 10
Paths 26

Size

Total Lines 45
Code Lines 30

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
eloc 30
dl 0
loc 45
rs 7.6666
c 0
b 0
f 0
cc 10
nc 26
nop 2

How to fix   Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:

1
<?php
2
/**
3
 *
4
 * (c) Ruben Dorado <[email protected]>
5
 *
6
 * For the full copyright and license information, please view the LICENSE
7
 * file that was distributed with this source code.
8
 */
9
namespace SiteAnalyzer;
10
11
use Exception;
12
13
/**
14
 * class ML
15
 *
16
 * @package   SiteAnalyzer
17
 * @author    Ruben Dorado <[email protected]>
18
 * @copyright 2018 Ruben Dorado
19
 * @license   http://www.opensource.org/licenses/MIT The MIT License
20
 */
21
class ML
22
{
23
    
24
    /*
25
     * @param
26
     */
27
    function kmeans($data, $nclusters)
0 ignored issues
show
Best Practice introduced by
It is generally recommended to explicitly declare the visibility for methods.

Adding explicit visibility (private, protected, or public) is generally recommend to communicate to other developers how, and from where this method is intended to be used.

Loading history...
28
    {
29
        $resp = [];
30
        $finished = false;
31
        $npoints = count($data);
32
        if (count($npoints) <= 0) throw new Exception("Not enough data. ");    
0 ignored issues
show
Bug introduced by
$npoints of type integer is incompatible with the type Countable|array expected by parameter $var of count(). ( Ignorable by Annotation )

If this is a false-positive, you can also ignore this issue in your code via the ignore-type  annotation

32
        if (count(/** @scrutinizer ignore-type */ $npoints) <= 0) throw new Exception("Not enough data. ");    
Loading history...
33
        $ndimensions = count($npoints[0]);
34
        $centroids = initCentroids($nclusters, $ndimensions, function(){return rand(0,100)/100;});   
0 ignored issues
show
Bug introduced by
The function initCentroids was not found. Maybe you did not declare it correctly or list all dependencies? ( Ignorable by Annotation )

If this is a false-positive, you can also ignore this issue in your code via the ignore-call  annotation

34
        $centroids = /** @scrutinizer ignore-call */ initCentroids($nclusters, $ndimensions, function(){return rand(0,100)/100;});   
Loading history...
Unused Code introduced by
The assignment to $centroids is dead and can be removed.
Loading history...
35
        while (!$finished) {
36
    
37
            // Assign each one of the points to one centroid        
38
            $nresp = [];
39
            for ($j = 0; $j < $npoints; $j++) {        
40
                $best = -1;
41
                $bdist = INF;
42
                for ($i = 0; $i < $nclusters; $i++) {
43
                    $ndist = eclideanDistance($npoints[$j], $nclusters[$i]);
0 ignored issues
show
Bug introduced by
The function eclideanDistance was not found. Maybe you did not declare it correctly or list all dependencies? ( Ignorable by Annotation )

If this is a false-positive, you can also ignore this issue in your code via the ignore-call  annotation

43
                    $ndist = /** @scrutinizer ignore-call */ eclideanDistance($npoints[$j], $nclusters[$i]);
Loading history...
44
                    if($bdist > $ndist) {
45
                        $bdist = $ndist;
46
                        $best = $i;
47
                    }            
48
                }
49
                $nresp[] = $best;
50
            }
51
        
52
            // Check change 
53
            if(count($resp)!=0) {
54
                $finished = true;
55
                for ($j = 0; $j < $npoints; $j++) {        
56
                    if($resp[$j]!==$nresp[$j]){
57
                        $finished = false;
58
                        break;
59
                    }
60
                }
61
                $resp = $nresp;
62
            }
63
                
64
            /** Recalculate the centroids*/
65
            $centroids = initCentroids($nclusters, $ndimensions, function(){return 0;});
66
            $counts = array_fill(0, $nclusters, 0);
67
            for ($j = 0; $j < $npoints; $j++) {              
68
                sumCentroid($centroids[$resp[$j]], $data[$j]);
0 ignored issues
show
Bug introduced by
The function sumCentroid was not found. Maybe you did not declare it correctly or list all dependencies? ( Ignorable by Annotation )

If this is a false-positive, you can also ignore this issue in your code via the ignore-call  annotation

68
                /** @scrutinizer ignore-call */ 
69
                sumCentroid($centroids[$resp[$j]], $data[$j]);
Loading history...
69
                $counts[$resp[$j]]++;            
70
            }
71
            normalizeCentroids($centroids, $counts);
0 ignored issues
show
Bug introduced by
The function normalizeCentroids was not found. Maybe you did not declare it correctly or list all dependencies? ( Ignorable by Annotation )

If this is a false-positive, you can also ignore this issue in your code via the ignore-call  annotation

71
            /** @scrutinizer ignore-call */ 
72
            normalizeCentroids($centroids, $counts);
Loading history...
72
        }
73
    }
74
75
    /*
76
     * @param
77
     */
78
    function initCentroids($nclusters, $ndimensions, $fvalue) 
0 ignored issues
show
Best Practice introduced by
It is generally recommended to explicitly declare the visibility for methods.

Adding explicit visibility (private, protected, or public) is generally recommend to communicate to other developers how, and from where this method is intended to be used.

Loading history...
79
    {
80
        $resp = [];
81
        for ($i = 0; $i < $nclusters; $i++) {
82
            $centroid = [];
83
            for ($d = 0; $d < $ndimensions; $d++) {
84
                $centroid[] = $fvalue();
85
            }
86
            $resp[] = $centroid;
87
        }
88
        return $resp;
89
    }
90
91
    /*
92
     * @param
93
     */
94
    function eclideanDistance($p1, $p2) {
0 ignored issues
show
Best Practice introduced by
It is generally recommended to explicitly declare the visibility for methods.

Adding explicit visibility (private, protected, or public) is generally recommend to communicate to other developers how, and from where this method is intended to be used.

Loading history...
95
       $len = count($p1);
96
       $acum = 0;
97
       for($i=0; $i<$len; $i++) {
98
           $acum += ($p1[$i] - $p2[$i])**2;
99
       }
100
       return sqrt($acum);
101
    }
102
    
103
}
104