Completed
Push — develop ( ac1c7a...ca6114 )
by Adrien
30:28 queued 32s
created

Csv::setDelimiter()   A

Complexity

Conditions 1
Paths 1

Size

Total Lines 5
Code Lines 2

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 3
CRAP Score 1

Importance

Changes 0
Metric Value
cc 1
eloc 2
nc 1
nop 1
dl 0
loc 5
ccs 3
cts 3
cp 1
crap 1
rs 9.4285
c 0
b 0
f 0
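The CRAP score of 1 in the table above is consistent with the commonly cited definition of the metric. The sketch below assumes that `cc` is the cyclomatic complexity and `cp` is the covered proportion (0 to 1); the function name `crapScore` is hypothetical and not part of any tool.

```php
<?php

declare(strict_types=1);

/**
 * CRAP score as commonly defined: comp^2 * (1 - cov)^3 + comp,
 * where comp is the cyclomatic complexity and cov the coverage in 0..1.
 * Assumption: the "cc" and "cp" metrics map onto comp and cov.
 */
function crapScore(int $complexity, float $coverage): float
{
    return $complexity ** 2 * (1.0 - $coverage) ** 3 + $complexity;
}

// Csv::setDelimiter(): cc = 1, cp = 1 (fully covered).
echo crapScore(1, 1.0), PHP_EOL;
```

With full coverage the first term vanishes, so the score equals the complexity itself.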
<?php

namespace PhpOffice\PhpSpreadsheet\Reader;

use PhpOffice\PhpSpreadsheet\Cell\Coordinate;
use PhpOffice\PhpSpreadsheet\Shared\StringHelper;
use PhpOffice\PhpSpreadsheet\Spreadsheet;

class Csv extends BaseReader
{
    /**
     * Input encoding.
     *
     * @var string
     */
    private $inputEncoding = 'UTF-8';

    /**
     * Delimiter.
     *
     * @var string
     */
    private $delimiter;

    /**
     * Enclosure.
     *
     * @var string
     */
    private $enclosure = '"';

    /**
     * Sheet index to read.
     *
     * @var int
     */
    private $sheetIndex = 0;

    /**
     * Load rows contiguously.
     *
     * @var bool
     */
    private $contiguous = false;

    /**
     * Row counter for loading rows contiguously.
     *
     * @var int
     */
    private $contiguousRow = -1;

    /**
     * Create a new CSV Reader instance.
     */
    public function __construct()
    {
        $this->readFilter = new DefaultReadFilter();
    }

    /**
     * Set input encoding.
     *
     * @param string $pValue Input encoding, e.g. 'UTF-8'
     *
     * @return $this
     */
    public function setInputEncoding($pValue)
    {
        $this->inputEncoding = $pValue;

        return $this;
    }

    /**
     * Get input encoding.
     *
     * @return string
     */
    public function getInputEncoding()
    {
        return $this->inputEncoding;
    }

    /**
     * Move the file pointer past any BOM marker.
     */
    protected function skipBOM()
    {
        rewind($this->fileHandle);

        switch ($this->inputEncoding) {
            case 'UTF-8':
                fgets($this->fileHandle, 4) == "\xEF\xBB\xBF" ?
                    fseek($this->fileHandle, 3) : fseek($this->fileHandle, 0);

                break;
            // Scrutinizer (duplication, 0 ignored issues): the four UTF-16/UTF-32
            // cases below are flagged as code duplicated across the project.
            // Duplicated code is one of the most pungent code smells. If you need
            // to duplicate the same code in three or more different places, we
            // strongly encourage you to look into extracting the code into a
            // single class or operation.
            case 'UTF-16LE':
                fgets($this->fileHandle, 3) == "\xFF\xFE" ?
                    fseek($this->fileHandle, 2) : fseek($this->fileHandle, 0);

                break;
            case 'UTF-16BE':
                fgets($this->fileHandle, 3) == "\xFE\xFF" ?
                    fseek($this->fileHandle, 2) : fseek($this->fileHandle, 0);

                break;
            case 'UTF-32LE':
                fgets($this->fileHandle, 5) == "\xFF\xFE\x00\x00" ?
                    fseek($this->fileHandle, 4) : fseek($this->fileHandle, 0);

                break;
            case 'UTF-32BE':
                fgets($this->fileHandle, 5) == "\x00\x00\xFE\xFF" ?
                    fseek($this->fileHandle, 4) : fseek($this->fileHandle, 0);

                break;
            default:
                break;
        }
    }

    /**
     * Identify any separator that is explicitly set in the file.
     */
    protected function checkSeparator()
    {
        $line = fgets($this->fileHandle);
        if ($line === false) {
            return;
        }

        if ((strlen(trim($line, "\r\n")) == 5) && (stripos($line, 'sep=') === 0)) {
            $this->delimiter = substr($line, 4, 1);

            return;
        }

        // Scrutinizer (bug, 0 ignored issues): skipBOM() always returns null, so
        // the original `return $this->skipBOM();` used a return value that
        // carries nothing. The check looks for calls to functions or methods
        // that always return null and whose return value is used; the usual
        // cause is a method that is incomplete or was reduced for debugging.
        $this->skipBOM();
    }

    /**
     * Infer the separator if it isn't explicitly set in the file or specified by the user.
     */
    protected function inferSeparator()
    {
        if ($this->delimiter !== null) {
            return;
        }

        $potentialDelimiters = [',', ';', "\t", '|', ':', ' '];
        $counts = [];
        foreach ($potentialDelimiters as $delimiter) {
            $counts[$delimiter] = [];
        }

        // Count how many times each of the potential delimiters appears in each line
        $numberLines = 0;
        while (($line = fgets($this->fileHandle)) !== false && (++$numberLines < 1000)) {
            // Drop everything that is enclosed to avoid counting false positives in enclosures
            $enclosure = preg_quote($this->enclosure, '/');
            $line = preg_replace('/(' . $enclosure . '.*' . $enclosure . ')/U', '', $line);

            $countLine = [];
            for ($i = strlen($line) - 1; $i >= 0; --$i) {
                $char = $line[$i];
                if (isset($counts[$char])) {
                    if (!isset($countLine[$char])) {
                        $countLine[$char] = 0;
                    }
                    ++$countLine[$char];
                }
            }
            foreach ($potentialDelimiters as $delimiter) {
                $counts[$delimiter][] = isset($countLine[$delimiter])
                    ? $countLine[$delimiter]
                    : 0;
            }
        }

        // Calculate the mean square deviation from the median for each delimiter
        // (ignoring delimiters that haven't been found consistently)
        $meanSquareDeviations = [];
        $middleIdx = floor(($numberLines - 1) / 2);

        foreach ($potentialDelimiters as $delimiter) {
            $series = $counts[$delimiter];
            sort($series);

            $median = ($numberLines % 2)
                ? $series[$middleIdx]
                : ($series[$middleIdx] + $series[$middleIdx + 1]) / 2;

            if ($median === 0) {
                continue;
            }

            $meanSquareDeviations[$delimiter] = array_reduce(
                $series,
                function ($sum, $value) use ($median) {
                    return $sum + pow($value - $median, 2);
                }
            ) / count($series);
        }

        // ... and pick the delimiter with the smallest mean square deviation
        // (in case of ties, the order in $potentialDelimiters is respected)
        $min = INF;
        foreach ($potentialDelimiters as $delimiter) {
            if (!isset($meanSquareDeviations[$delimiter])) {
                continue;
            }

            if ($meanSquareDeviations[$delimiter] < $min) {
                $min = $meanSquareDeviations[$delimiter];
                $this->delimiter = $delimiter;
            }
        }

        // If no delimiter could be detected, fall back to the default
        if ($this->delimiter === null) {
            $this->delimiter = reset($potentialDelimiters);
        }

        // skipBOM() returns nothing, so its result is not returned here.
        $this->skipBOM();
    }

    /**
     * Return worksheet info (Name, Last Column Letter, Last Column Index, Total Rows, Total Columns).
     *
     * @param string $pFilename
     *
     * @throws Exception
     *
     * @return array
     */
    public function listWorksheetInfo($pFilename)
    {
        // Open file
        if (!$this->canRead($pFilename)) {
            throw new Exception($pFilename . ' is an Invalid Spreadsheet file.');
        }
        $this->openFile($pFilename);
        $fileHandle = $this->fileHandle;

        // Skip BOM, if any
        $this->skipBOM();
        $this->checkSeparator();
        $this->inferSeparator();

        $worksheetInfo = [];
        $worksheetInfo[0]['worksheetName'] = 'Worksheet';
        $worksheetInfo[0]['lastColumnLetter'] = 'A';
        $worksheetInfo[0]['lastColumnIndex'] = 0;
        $worksheetInfo[0]['totalRows'] = 0;
        $worksheetInfo[0]['totalColumns'] = 0;

        // Loop through each line of the file in turn
        // Scrutinizer (bug, 1 ignored issue): $fileHandle can also be of type
        // false, but fgetcsv() only accepts a resource. Add a type check, or mark
        // the argument with `/** @scrutinizer ignore-type */` if this is a
        // false positive.
        while (($rowData = fgetcsv($fileHandle, 0, $this->delimiter, $this->enclosure)) !== false) {
            ++$worksheetInfo[0]['totalRows'];
            $worksheetInfo[0]['lastColumnIndex'] = max($worksheetInfo[0]['lastColumnIndex'], count($rowData) - 1);
        }

        $worksheetInfo[0]['lastColumnLetter'] = Coordinate::stringFromColumnIndex($worksheetInfo[0]['lastColumnIndex'] + 1);
        $worksheetInfo[0]['totalColumns'] = $worksheetInfo[0]['lastColumnIndex'] + 1;

        // Close file
        // Scrutinizer (bug, 1 ignored issue): the same type concern applies to
        // fclose(), which also only accepts a resource.
        fclose($fileHandle);

        return $worksheetInfo;
    }

    /**
     * Loads Spreadsheet from file.
     *
     * @param string $pFilename
     *
     * @throws Exception
     *
     * @return Spreadsheet
     */
    public function load($pFilename)
    {
        // Create new Spreadsheet
        $spreadsheet = new Spreadsheet();

        // Load into this instance
        return $this->loadIntoExisting($pFilename, $spreadsheet);
    }

    /**
     * Loads a spreadsheet from file into an existing Spreadsheet instance.
     *
     * @param string $pFilename
     * @param Spreadsheet $spreadsheet
     *
     * @throws Exception
     *
     * @return Spreadsheet
     */
    public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
    {
        $lineEnding = ini_get('auto_detect_line_endings');
        // Scrutinizer (bug, 0 ignored issues): `true` is incompatible with the
        // string type expected by parameter $newvalue of ini_set(). Ignorable
        // via `/** @scrutinizer ignore-type */` if this is a false positive.
        ini_set('auto_detect_line_endings', true);

        // Open file
        if (!$this->canRead($pFilename)) {
            throw new Exception($pFilename . ' is an Invalid Spreadsheet file.');
        }
        $this->openFile($pFilename);
        $fileHandle = $this->fileHandle;

        // Skip BOM, if any
        $this->skipBOM();
        $this->checkSeparator();
        $this->inferSeparator();

        // Create new sheets until the requested sheet index exists
        while ($spreadsheet->getSheetCount() <= $this->sheetIndex) {
            $spreadsheet->createSheet();
        }
        $sheet = $spreadsheet->setActiveSheetIndex($this->sheetIndex);

        // Set our starting row based on whether we're in contiguous mode or not
        $currentRow = 1;
        if ($this->contiguous) {
            $currentRow = ($this->contiguousRow == -1) ? $sheet->getHighestRow() : $this->contiguousRow;
        }

        // Loop through each line of the file in turn
        // Scrutinizer (bug, 1 ignored issue): $fileHandle can also be of type
        // false, but fgetcsv() only accepts a resource. Add a type check, or mark
        // the argument with `/** @scrutinizer ignore-type */` if this is a
        // false positive.
        while (($rowData = fgetcsv($fileHandle, 0, $this->delimiter, $this->enclosure)) !== false) {
            $columnLetter = 'A';
            foreach ($rowData as $rowDatum) {
                if ($rowDatum != '' && $this->readFilter->readCell($columnLetter, $currentRow)) {
                    // Convert encoding if necessary
                    if ($this->inputEncoding !== 'UTF-8') {
                        $rowDatum = StringHelper::convertEncoding($rowDatum, 'UTF-8', $this->inputEncoding);
                    }

                    // Set cell value
                    $sheet->getCell($columnLetter . $currentRow)->setValue($rowDatum);
                }
                ++$columnLetter;
            }
            ++$currentRow;
        }

        // Close file
        // Scrutinizer (bug, 1 ignored issue): the same type concern applies to
        // fclose(), which also only accepts a resource.
        fclose($fileHandle);

        if ($this->contiguous) {
            $this->contiguousRow = $currentRow;
        }

        ini_set('auto_detect_line_endings', $lineEnding);

        // Return
        return $spreadsheet;
    }

    /**
     * Get delimiter.
     *
     * @return string
     */
    public function getDelimiter()
    {
        return $this->delimiter;
    }

    /**
     * Set delimiter.
     *
     * @param string $delimiter Delimiter, e.g. ','
     *
     * @return Csv
     */
    public function setDelimiter($delimiter)
    {
        $this->delimiter = $delimiter;

        return $this;
    }

    /**
     * Get enclosure.
     *
     * @return string
     */
    public function getEnclosure()
    {
        return $this->enclosure;
    }

    /**
     * Set enclosure.
     *
     * @param string $enclosure Enclosure, defaults to "
     *
     * @return Csv
     */
    public function setEnclosure($enclosure)
    {
        if ($enclosure == '') {
            $enclosure = '"';
        }
        $this->enclosure = $enclosure;

        return $this;
    }

    /**
     * Get sheet index.
     *
     * @return int
     */
    public function getSheetIndex()
    {
        return $this->sheetIndex;
    }

    /**
     * Set sheet index.
     *
     * @param int $pValue Sheet index
     *
     * @return Csv
     */
    public function setSheetIndex($pValue)
    {
        $this->sheetIndex = $pValue;

        return $this;
    }

    /**
     * Set Contiguous.
     *
     * @param bool $contiguous
     *
     * @return Csv
     */
    public function setContiguous($contiguous)
    {
        $this->contiguous = (bool) $contiguous;
        if (!$contiguous) {
            $this->contiguousRow = -1;
        }

        return $this;
    }

    /**
     * Get Contiguous.
     *
     * @return bool
     */
    public function getContiguous()
    {
        return $this->contiguous;
    }

    /**
     * Can the current IReader read the file?
     *
     * @param string $pFilename
     *
     * @throws Exception
     *
     * @return bool
     */
    public function canRead($pFilename)
    {
        // Check if file exists
        try {
            $this->openFile($pFilename);
        } catch (Exception $e) {
            return false;
        }

        // Scrutinizer (bug, 1 ignored issue): $this->fileHandle can also be of
        // type false, but fclose() only accepts a resource. Add a type check, or
        // mark the argument with `/** @scrutinizer ignore-type */` if this is a
        // false positive.
        fclose($this->fileHandle);

        return true;
    }
}
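
The duplication flagged in `skipBOM()` could be reduced with a lookup table of BOM byte sequences. The following is a minimal standalone sketch under that assumption, not the library's implementation; `skipBomFor()` is a hypothetical helper, and it reads with `fread()` rather than `fgets()` so that a newline byte cannot cut the comparison short.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of PhpSpreadsheet): positions the stream just
 * past the BOM for the given encoding and returns the number of bytes skipped.
 *
 * @param resource $handle
 */
function skipBomFor($handle, string $encoding): int
{
    // Same BOM byte sequences as the switch in skipBOM().
    $boms = [
        'UTF-8' => "\xEF\xBB\xBF",
        'UTF-16LE' => "\xFF\xFE",
        'UTF-16BE' => "\xFE\xFF",
        'UTF-32LE' => "\xFF\xFE\x00\x00",
        'UTF-32BE' => "\x00\x00\xFE\xFF",
    ];

    rewind($handle);
    if (!isset($boms[$encoding])) {
        return 0; // unknown encoding: leave the pointer at the start
    }

    $bom = $boms[$encoding];
    // fread() is binary-safe; it does not stop at line endings.
    $prefix = fread($handle, strlen($bom));
    $length = ($prefix === $bom) ? strlen($bom) : 0;
    fseek($handle, $length);

    return $length;
}

// Usage with an in-memory stream that starts with a UTF-8 BOM.
$handle = fopen('php://memory', 'w+b');
fwrite($handle, "\xEF\xBB\xBFa,b\n");
$skipped = skipBomFor($handle, 'UTF-8');
$firstLine = fgets($handle);
fclose($handle);
```

Adding an encoding then means adding one table entry instead of another `case` branch.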
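
The heuristic in `inferSeparator()` can be tried in isolation. The sketch below is a simplified standalone variant that works on an array of lines instead of a file handle; `guessDelimiter()` is a hypothetical name, it uses `substr_count()` in place of the character loop, and it skips the 1000-line cap, so treat it as an illustration of the idea rather than a drop-in replacement.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical standalone version of the heuristic: count each candidate per
 * line, then prefer the candidate whose per-line counts deviate least from
 * their median. Earlier candidates win ties.
 *
 * @param string[] $lines
 */
function guessDelimiter(array $lines, string $enclosure = '"'): string
{
    $candidates = [',', ';', "\t", '|', ':', ' '];
    if ($lines === []) {
        return $candidates[0]; // nothing to inspect: fall back to the default
    }

    $quote = preg_quote($enclosure, '/');
    $enclosed = '/' . $quote . '.*?' . $quote . '/';

    $best = $candidates[0];
    $bestDeviation = INF;
    foreach ($candidates as $candidate) {
        $series = [];
        foreach ($lines as $line) {
            // Ignore enclosed text so quoted delimiters are not counted.
            $series[] = substr_count((string) preg_replace($enclosed, '', $line), $candidate);
        }
        sort($series);

        $n = count($series);
        $mid = intdiv($n - 1, 2);
        $median = ($n % 2 === 1) ? $series[$mid] : ($series[$mid] + $series[$mid + 1]) / 2;
        if ($median == 0) {
            continue; // candidate is absent from at least half of the lines
        }

        // Mean square deviation of the per-line counts from the median.
        $deviation = 0.0;
        foreach ($series as $count) {
            $deviation += ($count - $median) ** 2;
        }
        $deviation /= $n;

        if ($deviation < $bestDeviation) {
            $bestDeviation = $deviation;
            $best = $candidate;
        }
    }

    return $best;
}

$semicolon = guessDelimiter(['a;b;c', '1;2;3', 'x;y;z']);
$quoted = guessDelimiter(['"a,b";c', '"d,e";f']);
```

A delimiter that appears the same number of times on every line has a deviation of zero, which is why regular files are detected reliably while free text containing stray commas is not.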