Passed
Push — develop ( 54efe8...813855 )
by Adrien
28:44
created

Csv::loadIntoExisting()   B

Complexity

Conditions 11
Paths 41

Size

Total Lines 58
Code Lines 29

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 28
CRAP Score 11.0359

Importance

Changes 0
Metric Value
cc 11
eloc 29
nc 41
nop 2
dl 0
loc 58
ccs 28
cts 30
cp 0.9333
crap 11.0359
rs 7.3166
c 0
b 0
f 0

How to fix: Long Method, Complexity

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:
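One refactoring that fits this description is Extract Method. The sketch below is only an illustration, not the project's code: it assumes the "starting row" step of loadIntoExisting() is pulled into a hypothetical helper, resolveStartingRow(), named after the comment that introduces that step.

```php
<?php

// Hypothetical helper, not part of PhpSpreadsheet: the step commented as
// "Set our starting row based on whether we're in contiguous mode or not"
// extracted from loadIntoExisting() into its own small function.
function resolveStartingRow(bool $contiguous, int $contiguousRow, int $highestRow): int
{
    // Outside contiguous mode, loading always starts at the first row.
    if (!$contiguous) {
        return 1;
    }

    // -1 means no earlier contiguous load has recorded a row yet,
    // so fall back to the sheet's highest row.
    return ($contiguousRow === -1) ? $highestRow : $contiguousRow;
}

echo resolveStartingRow(false, -1, 10), PHP_EOL; // not contiguous
echo resolveStartingRow(true, -1, 10), PHP_EOL;  // contiguous, first load
echo resolveStartingRow(true, 42, 10), PHP_EOL;  // contiguous, later load
```

With a helper like this, the calling method would shrink by one branch, and the comment becomes the function name.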

<?php

namespace PhpOffice\PhpSpreadsheet\Reader;

use PhpOffice\PhpSpreadsheet\Cell\Coordinate;
use PhpOffice\PhpSpreadsheet\Shared\StringHelper;
use PhpOffice\PhpSpreadsheet\Spreadsheet;

class Csv extends BaseReader
{
    /**
     * Input encoding.
     *
     * @var string
     */
    private $inputEncoding = 'UTF-8';

    /**
     * Delimiter.
     *
     * @var string
     */
    private $delimiter;

    /**
     * Enclosure.
     *
     * @var string
     */
    private $enclosure = '"';

    /**
     * Sheet index to read.
     *
     * @var int
     */
    private $sheetIndex = 0;

    /**
     * Load rows contiguously.
     *
     * @var bool
     */
    private $contiguous = false;

    /**
     * Row counter for loading rows contiguously.
     *
     * @var int
     */
    private $contiguousRow = -1;

    /**
     * The character that can escape the enclosure.
     *
     * @var string
     */
    private $escapeCharacter = '\\';

    /**
     * Create a new CSV Reader instance.
     */
    public function __construct()
    {
        $this->readFilter = new DefaultReadFilter();
    }

    /**
     * Set input encoding.
     *
     * @param string $pValue Input encoding, eg: 'UTF-8'
     *
     * @return Csv
     */
    public function setInputEncoding($pValue)
    {
        $this->inputEncoding = $pValue;

        return $this;
    }

    /**
     * Get input encoding.
     *
     * @return string
     */
    public function getInputEncoding()
    {
        return $this->inputEncoding;
    }

    /**
     * Move filepointer past any BOM marker.
     */
    protected function skipBOM()
    {
        rewind($this->fileHandle);

        switch ($this->inputEncoding) {
            case 'UTF-8':
                fgets($this->fileHandle, 4) == "\xEF\xBB\xBF" ?
                    fseek($this->fileHandle, 3) : fseek($this->fileHandle, 0);

                break;
            case 'UTF-16LE':
                fgets($this->fileHandle, 3) == "\xFF\xFE" ?
                    fseek($this->fileHandle, 2) : fseek($this->fileHandle, 0);

                break;
            case 'UTF-16BE':
                fgets($this->fileHandle, 3) == "\xFE\xFF" ?
                    fseek($this->fileHandle, 2) : fseek($this->fileHandle, 0);

                break;
            case 'UTF-32LE':
                fgets($this->fileHandle, 5) == "\xFF\xFE\x00\x00" ?
                    fseek($this->fileHandle, 4) : fseek($this->fileHandle, 0);

                break;
            case 'UTF-32BE':
                fgets($this->fileHandle, 5) == "\x00\x00\xFE\xFF" ?
                    fseek($this->fileHandle, 4) : fseek($this->fileHandle, 0);

                break;
            default:
                break;
        }
    }

    /**
     * Identify any separator that is explicitly set in the file.
     */
    protected function checkSeparator()
    {
        $line = fgets($this->fileHandle);
        if ($line === false) {
            return;
        }

        if ((strlen(trim($line, "\r\n")) == 5) && (stripos($line, 'sep=') === 0)) {
            $this->delimiter = substr($line, 4, 1);

            return;
        }

        return $this->skipBOM();

Bug: Are you sure about the usage of $this->skipBOM()? The targeted method, PhpOffice\PhpSpreadsheet\Reader\Csv::skipBOM(), seems to always return null.

This check looks for function or method calls that always return null and whose return value is used.

class A
{
    function getObject()
    {
        return null;
    }
}

$a = new A();
if ($a->getObject()) {
    // never reached: getObject() always returns null
}

The method getObject() can return nothing but null, so it makes no sense to use the return value.

The reason is most likely that a function or method is incomplete or has been reduced for debug purposes.

    }

    /**
     * Infer the separator if it isn't explicitly set in the file or specified by the user.
     */
    protected function inferSeparator()
    {
        if ($this->delimiter !== null) {
            return;
        }

        $potentialDelimiters = [',', ';', "\t", '|', ':', ' '];
        $counts = [];
        foreach ($potentialDelimiters as $delimiter) {
            $counts[$delimiter] = [];
        }

        // Count how many times each of the potential delimiters appears in each line
        $numberLines = 0;
        while (($line = $this->getNextLine()) !== false && (++$numberLines < 1000)) {
            $countLine = [];
            for ($i = strlen($line) - 1; $i >= 0; --$i) {
                $char = $line[$i];
                if (isset($counts[$char])) {
                    if (!isset($countLine[$char])) {
                        $countLine[$char] = 0;
                    }
                    ++$countLine[$char];
                }
            }
            foreach ($potentialDelimiters as $delimiter) {
                $counts[$delimiter][] = isset($countLine[$delimiter])
                    ? $countLine[$delimiter]
                    : 0;
            }
        }

        // Calculate the mean square deviations for each delimiter (ignoring delimiters that haven't been found consistently)
        $meanSquareDeviations = [];
        $middleIdx = floor(($numberLines - 1) / 2);

        foreach ($potentialDelimiters as $delimiter) {
            $series = $counts[$delimiter];
            sort($series);

            $median = ($numberLines % 2)
                ? $series[$middleIdx]
                : ($series[$middleIdx] + $series[$middleIdx + 1]) / 2;

            if ($median === 0) {
                continue;
            }

            $meanSquareDeviations[$delimiter] = array_reduce(
                $series,
                function ($sum, $value) use ($median) {
                    return $sum + pow($value - $median, 2);
                }
            ) / count($series);
        }

        // ... and pick the delimiter with the smallest mean square deviation (in case of ties, the order in potentialDelimiters is respected)
        $min = INF;
        foreach ($potentialDelimiters as $delimiter) {
            if (!isset($meanSquareDeviations[$delimiter])) {
                continue;
            }

            if ($meanSquareDeviations[$delimiter] < $min) {
                $min = $meanSquareDeviations[$delimiter];
                $this->delimiter = $delimiter;
            }
        }

        // If no delimiter could be detected, fall back to the default
        if ($this->delimiter === null) {
            $this->delimiter = reset($potentialDelimiters);
        }

        return $this->skipBOM();

Bug: Are you sure about the usage of $this->skipBOM()? The targeted method, PhpOffice\PhpSpreadsheet\Reader\Csv::skipBOM(), seems to always return null, so its return value should not be used.

    }

    /**
     * Get the next full line from the file.
     *
     * @param string $line
     *
     * @return bool|string
     */
    private function getNextLine($line = '')
    {
        // Get the next line in the file
        $newLine = fgets($this->fileHandle);

        // Return false if there is no next line
        if ($newLine === false) {
            return false;
        }

        // Add the new line to the line passed in
        $line = $line . $newLine;

        // Drop everything that is enclosed to avoid counting false positives in enclosures
        $enclosure = preg_quote($this->enclosure, '/');
        $line = preg_replace('/(' . $enclosure . '.*' . $enclosure . ')/U', '', $line);

        // See if we have any enclosures left in the line
        $matches = [];
        preg_match('/(' . $enclosure . ')/', $line, $matches);

        // if we still have an enclosure then we need to read the next line as well
        if (count($matches) > 0) {
            $line = $this->getNextLine($line);
        }

        return $line;
    }

    /**
     * Return worksheet info (Name, Last Column Letter, Last Column Index, Total Rows, Total Columns).
     *
     * @param string $pFilename
     *
     * @throws Exception
     *
     * @return array
     */
    public function listWorksheetInfo($pFilename)
    {
        // Open file
        if (!$this->canRead($pFilename)) {
            throw new Exception($pFilename . ' is an Invalid Spreadsheet file.');
        }
        $this->openFile($pFilename);
        $fileHandle = $this->fileHandle;

        // Skip BOM, if any
        $this->skipBOM();
        $this->checkSeparator();
        $this->inferSeparator();

        $worksheetInfo = [];
        $worksheetInfo[0]['worksheetName'] = 'Worksheet';
        $worksheetInfo[0]['lastColumnLetter'] = 'A';
        $worksheetInfo[0]['lastColumnIndex'] = 0;
        $worksheetInfo[0]['totalRows'] = 0;
        $worksheetInfo[0]['totalColumns'] = 0;

        // Loop through each line of the file in turn
        while (($rowData = fgetcsv($fileHandle, 0, $this->delimiter, $this->enclosure, $this->escapeCharacter)) !== false) {

Bug: It seems like $fileHandle can also be of type false; however, parameter $handle of fgetcsv() only seems to accept resource. Maybe add an additional type check? (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

        while (($rowData = fgetcsv(/** @scrutinizer ignore-type */ $fileHandle, 0, $this->delimiter, $this->enclosure, $this->escapeCharacter)) !== false) {

            ++$worksheetInfo[0]['totalRows'];
            $worksheetInfo[0]['lastColumnIndex'] = max($worksheetInfo[0]['lastColumnIndex'], count($rowData) - 1);
        }

        $worksheetInfo[0]['lastColumnLetter'] = Coordinate::stringFromColumnIndex($worksheetInfo[0]['lastColumnIndex'] + 1);
        $worksheetInfo[0]['totalColumns'] = $worksheetInfo[0]['lastColumnIndex'] + 1;

        // Close file
        fclose($fileHandle);

Bug: It seems like $fileHandle can also be of type false; however, parameter $handle of fclose() only seems to accept resource. Maybe add an additional type check? (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

        fclose(/** @scrutinizer ignore-type */ $fileHandle);

        return $worksheetInfo;
    }

    /**
     * Loads Spreadsheet from file.
     *
     * @param string $pFilename
     *
     * @throws Exception
     *
     * @return Spreadsheet
     */
    public function load($pFilename)
    {
        // Create new Spreadsheet
        $spreadsheet = new Spreadsheet();

        // Load into this instance
        return $this->loadIntoExisting($pFilename, $spreadsheet);
    }

    /**
     * Loads PhpSpreadsheet from file into PhpSpreadsheet instance.
     *
     * @param string $pFilename
     * @param Spreadsheet $spreadsheet
     *
     * @throws Exception
     *
     * @return Spreadsheet
     */
    public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
    {
        $lineEnding = ini_get('auto_detect_line_endings');
        ini_set('auto_detect_line_endings', true);

Bug: true of type true is incompatible with the type string expected by parameter $newvalue of ini_set(). (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

        ini_set('auto_detect_line_endings', /** @scrutinizer ignore-type */ true);

        // Open file
        if (!$this->canRead($pFilename)) {
            throw new Exception($pFilename . ' is an Invalid Spreadsheet file.');
        }
        $this->openFile($pFilename);
        $fileHandle = $this->fileHandle;

        // Skip BOM, if any
        $this->skipBOM();
        $this->checkSeparator();
        $this->inferSeparator();

        // Create new PhpSpreadsheet object
        while ($spreadsheet->getSheetCount() <= $this->sheetIndex) {
            $spreadsheet->createSheet();
        }
        $sheet = $spreadsheet->setActiveSheetIndex($this->sheetIndex);

        // Set our starting row based on whether we're in contiguous mode or not
        $currentRow = 1;
        if ($this->contiguous) {
            $currentRow = ($this->contiguousRow == -1) ? $sheet->getHighestRow() : $this->contiguousRow;
        }

        // Loop through each line of the file in turn
        while (($rowData = fgetcsv($fileHandle, 0, $this->delimiter, $this->enclosure, $this->escapeCharacter)) !== false) {

Bug: It seems like $fileHandle can also be of type false; however, parameter $handle of fgetcsv() only seems to accept resource. Maybe add an additional type check? (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

        while (($rowData = fgetcsv(/** @scrutinizer ignore-type */ $fileHandle, 0, $this->delimiter, $this->enclosure, $this->escapeCharacter)) !== false) {

            $columnLetter = 'A';
            foreach ($rowData as $rowDatum) {
                if ($rowDatum != '' && $this->readFilter->readCell($columnLetter, $currentRow)) {
                    // Convert encoding if necessary
                    if ($this->inputEncoding !== 'UTF-8') {
                        $rowDatum = StringHelper::convertEncoding($rowDatum, 'UTF-8', $this->inputEncoding);
                    }

                    // Set cell value
                    $sheet->getCell($columnLetter . $currentRow)->setValue($rowDatum);
                }
                ++$columnLetter;
            }
            ++$currentRow;
        }

        // Close file
        fclose($fileHandle);

Bug: It seems like $fileHandle can also be of type false; however, parameter $handle of fclose() only seems to accept resource. Maybe add an additional type check? (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

        fclose(/** @scrutinizer ignore-type */ $fileHandle);

        if ($this->contiguous) {
            $this->contiguousRow = $currentRow;
        }

        ini_set('auto_detect_line_endings', $lineEnding);

        // Return
        return $spreadsheet;
    }

    /**
     * Get delimiter.
     *
     * @return string
     */
    public function getDelimiter()
    {
        return $this->delimiter;
    }

    /**
     * Set delimiter.
     *
     * @param string $delimiter Delimiter, eg: ','
     *
     * @return CSV
     */
    public function setDelimiter($delimiter)
    {
        $this->delimiter = $delimiter;

        return $this;
    }

    /**
     * Get enclosure.
     *
     * @return string
     */
    public function getEnclosure()
    {
        return $this->enclosure;
    }

    /**
     * Set enclosure.
     *
     * @param string $enclosure Enclosure, defaults to "
     *
     * @return CSV
     */
    public function setEnclosure($enclosure)
    {
        if ($enclosure == '') {
            $enclosure = '"';
        }
        $this->enclosure = $enclosure;

        return $this;
    }

    /**
     * Get sheet index.
     *
     * @return int
     */
    public function getSheetIndex()
    {
        return $this->sheetIndex;
    }

    /**
     * Set sheet index.
     *
     * @param int $pValue Sheet index
     *
     * @return CSV
     */
    public function setSheetIndex($pValue)
    {
        $this->sheetIndex = $pValue;

        return $this;
    }

    /**
     * Set Contiguous.
     *
     * @param bool $contiguous
     *
     * @return Csv
     */
    public function setContiguous($contiguous)
    {
        $this->contiguous = (bool) $contiguous;
        if (!$contiguous) {
            $this->contiguousRow = -1;
        }

        return $this;
    }

    /**
     * Get Contiguous.
     *
     * @return bool
     */
    public function getContiguous()
    {
        return $this->contiguous;
    }

    /**
     * Set escape backslashes.
     *
     * @param string $escapeCharacter
     *
     * @return $this
     */
    public function setEscapeCharacter($escapeCharacter)
    {
        $this->escapeCharacter = $escapeCharacter;

        return $this;
    }

    /**
     * Get escape backslashes.
     *
     * @return string
     */
    public function getEscapeCharacter()
    {
        return $this->escapeCharacter;
    }

    /**
     * Can the current IReader read the file?
     *
     * @param string $pFilename
     *
     * @return bool
     */
    public function canRead($pFilename)
    {
        // Check if file exists
        try {
            $this->openFile($pFilename);
        } catch (Exception $e) {
            return false;
        }

        fclose($this->fileHandle);

Bug: It seems like $this->fileHandle can also be of type false; however, parameter $handle of fclose() only seems to accept resource. Maybe add an additional type check? (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

        fclose(/** @scrutinizer ignore-type */ $this->fileHandle);

        // Trust file extension if any
        if (strtolower(pathinfo($pFilename, PATHINFO_EXTENSION)) === 'csv') {
            return true;
        }

        // Attempt to guess mimetype
        $type = mime_content_type($pFilename);
        $supportedTypes = [
            'text/csv',
            'text/plain',
            'inode/x-empty',
        ];

        return in_array($type, $supportedTypes, true);
    }
}
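
The "can also be of type false" notes in the listing could be addressed with an explicit check on the handle before it reaches fgetcsv() or fclose(). The following is a minimal standalone sketch, not the library's code: countCsvRows() is a hypothetical function, and the comma delimiter, double-quote enclosure and backslash escape are assumed values.

```php
<?php

// Hypothetical helper, not part of PhpSpreadsheet: counts the rows of a CSV
// file and refuses to continue when fopen() does not return a resource.
function countCsvRows(string $filename): int
{
    $handle = @fopen($filename, 'r');
    if ($handle === false) {
        // The explicit check the analysis asks for: from here on,
        // $handle is known to be a resource.
        throw new RuntimeException('Could not open ' . $filename . ' for reading.');
    }

    $rows = 0;
    try {
        // Assumed delimiter, enclosure and escape character.
        while (fgetcsv($handle, 0, ',', '"', '\\') !== false) {
            ++$rows;
        }
    } finally {
        // Runs whether or not reading succeeded, so the handle is always closed.
        fclose($handle);
    }

    return $rows;
}

// Usage: write a two-row file to a temporary path and count its rows.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "a,b\nc,d\n");
echo countCsvRows($path), PHP_EOL;
unlink($path);
```

Checking once, right after opening, keeps the later fgetcsv() and fclose() calls free of per-call annotations.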