Completed
Pull Request — master (#316)
by Adrien
02:40
created

StyleHelper   B

Complexity

Total Complexity 40

Size/Duplication

Total Lines 318
Duplicated Lines 0 %

Coupling/Cohesion

Components 1
Dependencies 1

Test Coverage

Coverage 98.1%

Importance

Changes 0
Metric Value
wmc 40
lcom 1
cbo 1
dl 0
loc 318
ccs 103
cts 105
cp 0.981
rs 8.2608
c 0
b 0
f 0

14 Methods

Rating   Name   Duplication   Size   Complexity  
A __construct() 0 5 1
A shouldFormatNumericValueAsDate() 0 15 3
B extractRelevantInfo() 0 20 5
A extractNumberFormats() 0 13 4
B extractStyleAttributes() 0 20 6
A getStylesAttributes() 0 8 2
A doesStyleIndicateDate() 0 16 3
A doesNumFmtIdIndicateDate() 0 13 3
A getFormatCodeForNumFmtId() 0 7 2
A isNumFmtIdBuiltInDateFormat() 0 4 1
A getCustomNumberFormats() 0 8 2
A isFormatCodeCustomDateFormat() 0 9 3
B isFormatCodeMatchingDateFormatPattern() 0 24 3
A getNumberFormatCode() 0 15 2

How to fix   Complexity   

Complex Class

Complex classes like StyleHelper often do many different things. To break such a class down, first identify a cohesive component within it. A common way to find one is to look for fields and methods that share the same prefixes or suffixes. You can also check the cohesion graph for unconnected or weakly connected components.
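As a rough illustration of the naming heuristic, the sketch below indexes the method names from the "14 Methods" table by the camelCase words they contain. The `indexMethodsByWord()` helper is hypothetical and is not part of Spout or of this report; it only shows how a shared word such as "Date" can point at a candidate component.

```php
<?php
// Method names copied from the "14 Methods" table above.
$methodNames = [
    '__construct', 'shouldFormatNumericValueAsDate', 'extractRelevantInfo',
    'extractNumberFormats', 'extractStyleAttributes', 'getStylesAttributes',
    'doesStyleIndicateDate', 'doesNumFmtIdIndicateDate', 'getFormatCodeForNumFmtId',
    'isNumFmtIdBuiltInDateFormat', 'getCustomNumberFormats', 'isFormatCodeCustomDateFormat',
    'isFormatCodeMatchingDateFormatPattern', 'getNumberFormatCode',
];

/**
 * Hypothetical helper (an assumption, not an existing API): indexes method
 * names by the camelCase words they contain.
 *
 * @param string[] $methodNames
 * @return array Mapping WORD => [METHOD_NAMES]
 */
function indexMethodsByWord(array $methodNames)
{
    $index = [];
    foreach ($methodNames as $methodName) {
        // Split before each uppercase letter: "isNumFmtId" => ["is", "Num", "Fmt", "Id"]
        $words = preg_split('/(?=[A-Z])/', $methodName, -1, PREG_SPLIT_NO_EMPTY);
        // array_unique() so a method is listed once per word, even if the word repeats
        foreach (array_unique($words) as $word) {
            $index[$word][] = $methodName;
        }
    }

    return $index;
}

$index = indexMethodsByWord($methodNames);

// Methods sharing the word "Date" are candidates for one extracted component.
print_r($index['Date']);
```

Words shared by many methods are only a hint; the cohesion graph and the fields each method touches should confirm the grouping before anything is moved.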

Once you have determined which fields and methods belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate and is often faster.
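A minimal sketch of what Extract Class could look like for this class, assuming the format-code checks are the component being pulled out. `DateFormatDetector` and `isDateFormat()` are hypothetical names, not part of Spout; the logic mirrors `isFormatCodeCustomDateFormat()` and `isFormatCodeMatchingDateFormatPattern()` from the source listing below, with the per-character loop folded into one alternation.

```php
<?php

/**
 * Hypothetical class extracted from StyleHelper (the name is an assumption).
 * It only knows about format codes, not about the styles.xml file.
 */
class DateFormatDetector
{
    const NUMBER_FORMAT_GENERAL = 'General';

    /**
     * @param string|null $formatCode
     * @return bool Whether the given format code looks like a date format
     */
    public function isDateFormat($formatCode)
    {
        // No format code, or the default "General" format: not a date
        if ($formatCode === null || strcasecmp($formatCode, self::NUMBER_FORMAT_GENERAL) === 0) {
            return false;
        }

        // Drop bracketed sections such as "[Red]", unless the bracket is escaped with "\"
        $formatCode = preg_replace('/(?<!\\\)\[.+?(?<!\\\)\]/', '', $formatCode);

        // Same date characters as in StyleHelper, not preceded by "\" (case insensitive)
        return preg_match('/(?<!\\\)(e|yy|m|d|h|s)/i', $formatCode) === 1;
    }
}

$detector = new DateFormatDetector();
var_dump($detector->isDateFormat('yyyy-mm-dd')); // bool(true)
var_dump($detector->isDateFormat('[Red]0.00'));  // bool(false)
var_dump($detector->isDateFormat('General'));    // bool(false)
```

StyleHelper would then keep the XML parsing and the per-ID cache, and delegate the format-code decision to the extracted object.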

While breaking up the class, analyze how other classes use StyleHelper and, based on those observations, consider applying Extract Interface as well.
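One way to picture Extract Interface here: callers only need the two public methods, `shouldFormatNumericValueAsDate()` and `getNumberFormatCode()`. In the sketch below, `StyleHelperInterface`, the `FixedStyleHelper` stand-in and the `describeCell()` caller are all hypothetical names introduced for illustration; none of them exist in Spout.

```php
<?php

/** Hypothetical interface exposing only StyleHelper's two public methods. */
interface StyleHelperInterface
{
    /**
     * @param int $styleId Zero-based style ID
     * @return bool
     */
    public function shouldFormatNumericValueAsDate($styleId);

    /**
     * @param int $styleId Zero-based style ID
     * @return string
     */
    public function getNumberFormatCode($styleId);
}

/** Hypothetical in-memory stand-in, useful in tests: no XLSX file is read. */
class FixedStyleHelper implements StyleHelperInterface
{
    /** @var array Mapping STYLE_ID => FORMAT_CODE for styles treated as dates */
    private $dateFormatCodes;

    public function __construct(array $dateFormatCodes)
    {
        $this->dateFormatCodes = $dateFormatCodes;
    }

    public function shouldFormatNumericValueAsDate($styleId)
    {
        return isset($this->dateFormatCodes[$styleId]);
    }

    public function getNumberFormatCode($styleId)
    {
        return $this->dateFormatCodes[$styleId];
    }
}

/** Hypothetical caller that depends on the interface, not on the concrete class. */
function describeCell(StyleHelperInterface $styleHelper, $styleId)
{
    if (!$styleHelper->shouldFormatNumericValueAsDate($styleId)) {
        return 'numeric';
    }

    return 'date (' . $styleHelper->getNumberFormatCode($styleId) . ')';
}

echo describeCell(new FixedStyleHelper([1 => 'm/d/yyyy']), 1), "\n"; // date (m/d/yyyy)
echo describeCell(new FixedStyleHelper([1 => 'm/d/yyyy']), 0), "\n"; // numeric
```

With callers typed against the interface, the real class can be split further without touching them.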

<?php

namespace Box\Spout\Reader\XLSX\Helper;

use Box\Spout\Reader\Wrapper\XMLReader;

/**
 * Class StyleHelper
 * This class provides helper functions related to XLSX styles
 *
 * @package Box\Spout\Reader\XLSX\Helper
 */
class StyleHelper
{
    /** Paths of XML files relative to the XLSX file root */
    const STYLES_XML_FILE_PATH = 'xl/styles.xml';

    /** Nodes used to find relevant information in the styles XML file */
    const XML_NODE_NUM_FMTS = 'numFmts';
    const XML_NODE_NUM_FMT = 'numFmt';
    const XML_NODE_CELL_XFS = 'cellXfs';
    const XML_NODE_XF = 'xf';

    /** Attributes used to find relevant information in the styles XML file */
    const XML_ATTRIBUTE_NUM_FMT_ID = 'numFmtId';
    const XML_ATTRIBUTE_FORMAT_CODE = 'formatCode';
    const XML_ATTRIBUTE_APPLY_NUMBER_FORMAT = 'applyNumberFormat';

    /** By convention, default style ID is 0 */
    const DEFAULT_STYLE_ID = 0;

    const NUMBER_FORMAT_GENERAL = 'General';

    /**
     * @see https://msdn.microsoft.com/en-us/library/ff529597(v=office.12).aspx
     * @var array Mapping between built-in numFmtId and the associated format - for dates only
     */
    protected static $builtinNumFmtIdToNumFormatMapping = [
        14 => 'm/d/yyyy', // @NOTE: ECMA spec is 'mm-dd-yy'
        15 => 'd-mmm-yy',
        16 => 'd-mmm',
        17 => 'mmm-yy',
        18 => 'h:mm AM/PM',
        19 => 'h:mm:ss AM/PM',
        20 => 'h:mm',
        21 => 'h:mm:ss',
        22 => 'm/d/yyyy h:mm', // @NOTE: ECMA spec is 'm/d/yy h:mm'
        45 => 'mm:ss',
        46 => '[h]:mm:ss',
        47 => 'mm:ss.0', // @NOTE: ECMA spec is 'mmss.0'
    ];

    /** @var string Path of the XLSX file being read */
    protected $filePath;

    /** @var array Array containing the IDs of built-in number formats indicating a date */
    protected $builtinNumFmtIdIndicatingDates;

    /** @var array Array containing a mapping NUM_FMT_ID => FORMAT_CODE */
    protected $customNumberFormats;

    /** @var array Array containing a mapping STYLE_ID => [STYLE_ATTRIBUTES] */
    protected $stylesAttributes;

    /** @var array Cache containing a mapping NUM_FMT_ID => IS_DATE_FORMAT. Used to avoid lots of recalculations */
    protected $numFmtIdToIsDateFormatCache = [];

    /**
     * @param string $filePath Path of the XLSX file being read
     */
    public function __construct($filePath)
    {
        $this->filePath = $filePath;
        $this->builtinNumFmtIdIndicatingDates = array_keys(self::$builtinNumFmtIdToNumFormatMapping);
    }

    /**
     * Returns whether the style with the given ID should consider
     * numeric values as timestamps and format the cell as a date.
     *
     * @param int $styleId Zero-based style ID
     * @return bool Whether a cell with the given style should display a date instead of a numeric value
     */
    public function shouldFormatNumericValueAsDate($styleId)
    {
        $stylesAttributes = $this->getStylesAttributes();

        // Default style (0) does not format numeric values as timestamps. Only custom styles do.
        // Also if the style ID does not exist in the styles.xml file, format as numeric value.
        // Using isset here because it is way faster than array_key_exists...
        if ($styleId === self::DEFAULT_STYLE_ID || !isset($stylesAttributes[$styleId])) {
            return false;
        }

        $styleAttributes = $stylesAttributes[$styleId];

        return $this->doesStyleIndicateDate($styleAttributes);
    }

    /**
     * Reads the styles.xml file and extracts the relevant information from it.
     *
     * @return void
     */
    protected function extractRelevantInfo()
    {
        $this->customNumberFormats = [];
        $this->stylesAttributes = [];

        $xmlReader = new XMLReader();

        if ($xmlReader->openFileInZip($this->filePath, self::STYLES_XML_FILE_PATH)) {
            while ($xmlReader->read()) {
                if ($xmlReader->isPositionedOnStartingNode(self::XML_NODE_NUM_FMTS)) {
                    $this->extractNumberFormats($xmlReader);
                } else if ($xmlReader->isPositionedOnStartingNode(self::XML_NODE_CELL_XFS)) {
                    $this->extractStyleAttributes($xmlReader);
                }
            }

            $xmlReader->close();
        }
    }

    /**
     * Extracts number formats from the "numFmt" nodes.
     * For simplicity, the number formats are kept in memory. This is possible thanks
     * to the reuse of formats. So 1 million cells should not use 1 million formats.
     *
     * @param \Box\Spout\Reader\Wrapper\XMLReader $xmlReader XML Reader positioned on the "numFmts" node
     * @return void
     */
    protected function extractNumberFormats($xmlReader)
    {
        while ($xmlReader->read()) {
            if ($xmlReader->isPositionedOnStartingNode(self::XML_NODE_NUM_FMT)) {
                $numFmtId = intval($xmlReader->getAttribute(self::XML_ATTRIBUTE_NUM_FMT_ID));
                $formatCode = $xmlReader->getAttribute(self::XML_ATTRIBUTE_FORMAT_CODE);
                $this->customNumberFormats[$numFmtId] = $formatCode;
            } else if ($xmlReader->isPositionedOnEndingNode(self::XML_NODE_NUM_FMTS)) {
                // Once done reading "numFmts" node's children
                break;
            }
        }
    }

    /**
     * Extracts style attributes from the "xf" nodes, inside the "cellXfs" section.
     * For simplicity, the styles attributes are kept in memory. This is possible thanks
     * to the reuse of styles. So 1 million cells should not use 1 million styles.
     *
     * @param \Box\Spout\Reader\Wrapper\XMLReader $xmlReader XML Reader positioned on the "cellXfs" node
     * @return void
     */
    protected function extractStyleAttributes($xmlReader)
    {
        while ($xmlReader->read()) {
            if ($xmlReader->isPositionedOnStartingNode(self::XML_NODE_XF)) {
                $numFmtId = $xmlReader->getAttribute(self::XML_ATTRIBUTE_NUM_FMT_ID);
                $normalizedNumFmtId = ($numFmtId !== null) ? intval($numFmtId) : null;

                // Cast the raw attribute string to a boolean: "0" and "" become false,
                // any other value (including the literal string "false") becomes true.
                $applyNumberFormat = $xmlReader->getAttribute(self::XML_ATTRIBUTE_APPLY_NUMBER_FORMAT);
                $normalizedApplyNumberFormat = ($applyNumberFormat !== null) ? !!$applyNumberFormat : null;

                $this->stylesAttributes[] = [
                    self::XML_ATTRIBUTE_NUM_FMT_ID => $normalizedNumFmtId,
                    self::XML_ATTRIBUTE_APPLY_NUMBER_FORMAT => $normalizedApplyNumberFormat,
                ];
            } else if ($xmlReader->isPositionedOnEndingNode(self::XML_NODE_CELL_XFS)) {
                // Once done reading "cellXfs" node's children
                break;
            }
        }
    }

    /**
     * @return array The custom number formats
     */
    protected function getCustomNumberFormats()
    {
        if (!isset($this->customNumberFormats)) {
            $this->extractRelevantInfo();
        }

        return $this->customNumberFormats;
    }

    /**
     * @return array The styles attributes
     */
    protected function getStylesAttributes()
    {
        if (!isset($this->stylesAttributes)) {
            $this->extractRelevantInfo();
        }

        return $this->stylesAttributes;
    }

    /**
     * @param array $styleAttributes Array containing the style attributes (2 keys: "applyNumberFormat" and "numFmtId")
     * @return bool Whether the style with the given attributes indicates that the number is a date
     */
    protected function doesStyleIndicateDate($styleAttributes)
    {
        $applyNumberFormat = $styleAttributes[self::XML_ATTRIBUTE_APPLY_NUMBER_FORMAT];
        $numFmtId = $styleAttributes[self::XML_ATTRIBUTE_NUM_FMT_ID];

        // A style may apply a date format if it has:
        //  - "applyNumberFormat" attribute not set to "false"
        //  - "numFmtId" attribute set
        // This is a preliminary check, as having "numFmtId" set just means the style should apply a specific number format,
        // but this is not necessarily a date.
        if ($applyNumberFormat === false || $numFmtId === null) {
            return false;
        }

        return $this->doesNumFmtIdIndicateDate($numFmtId);
    }

    /**
     * Returns whether the number format ID indicates that the number is a date.
     * The result is cached to avoid recomputing the same thing over and over, as
     * "numFmtId" attributes can be shared between multiple styles.
     *
     * @param int $numFmtId
     * @return bool Whether the number format ID indicates that the number is a date
     */
    protected function doesNumFmtIdIndicateDate($numFmtId)
    {
        if (!isset($this->numFmtIdToIsDateFormatCache[$numFmtId])) {
            $formatCode = $this->getFormatCodeForNumFmtId($numFmtId);

            $this->numFmtIdToIsDateFormatCache[$numFmtId] = (
                $this->isNumFmtIdBuiltInDateFormat($numFmtId) ||
                $this->isFormatCodeCustomDateFormat($formatCode)
            );
        }

        return $this->numFmtIdToIsDateFormatCache[$numFmtId];
    }

    /**
     * @param int $numFmtId
     * @return string|null The custom number format or NULL if none defined for the given numFmtId
     */
    protected function getFormatCodeForNumFmtId($numFmtId)
    {
        $customNumberFormats = $this->getCustomNumberFormats();

        // Using isset here because it is way faster than array_key_exists...
        return (isset($customNumberFormats[$numFmtId])) ? $customNumberFormats[$numFmtId] : null;
    }

    /**
     * @param int $numFmtId
     * @return bool Whether the number format ID is one of the built-in date formats
     */
    protected function isNumFmtIdBuiltInDateFormat($numFmtId)
    {
        return in_array($numFmtId, $this->builtinNumFmtIdIndicatingDates);
    }

    /**
     * @param string|null $formatCode
     * @return bool Whether the given format code indicates that the number is a date
     */
    protected function isFormatCodeCustomDateFormat($formatCode)
    {
        // if no associated format code or if using the default "General" format
        if ($formatCode === null || strcasecmp($formatCode, self::NUMBER_FORMAT_GENERAL) === 0) {
            return false;
        }

        return $this->isFormatCodeMatchingDateFormatPattern($formatCode);
    }

    /**
     * @param string $formatCode
     * @return bool Whether the given format code matches a date format pattern
     */
    protected function isFormatCodeMatchingDateFormatPattern($formatCode)
    {
        // Remove extra formatting (what's between [ ], the brackets should not be preceded by a "\")
        $pattern = '((?<!\\\)\[.+?(?<!\\\)\])';
        $formatCode = preg_replace($pattern, '', $formatCode);

        // custom date formats contain specific characters to represent the date:
        // e - yy - m - d - h - s
        // and all of their variants (yyyy - mm - dd...)
        $dateFormatCharacters = ['e', 'yy', 'm', 'd', 'h', 's'];

        $hasFoundDateFormatCharacter = false;
        foreach ($dateFormatCharacters as $dateFormatCharacter) {
            // character not preceded by "\" (case insensitive)
            $pattern = '/(?<!\\\)' . $dateFormatCharacter . '/i';

            if (preg_match($pattern, $formatCode)) {
                $hasFoundDateFormatCharacter = true;
                break;
            }
        }

        return $hasFoundDateFormatCharacter;
    }

    /**
     * Returns the format as defined in "styles.xml" of the given style.
     * NOTE: It is assumed that the style DOES have a number format associated with it.
     *
     * @param int $styleId Zero-based style ID
     * @return string The number format code associated with the given style
     */
    public function getNumberFormatCode($styleId)
    {
        $stylesAttributes = $this->getStylesAttributes();
        $styleAttributes = $stylesAttributes[$styleId];
        $numFmtId = $styleAttributes[self::XML_ATTRIBUTE_NUM_FMT_ID];

        if ($this->isNumFmtIdBuiltInDateFormat($numFmtId)) {
            $numberFormatCode = self::$builtinNumFmtIdToNumFormatMapping[$numFmtId];
        } else {
            $customNumberFormats = $this->getCustomNumberFormats();
            $numberFormatCode = $customNumberFormats[$numFmtId];
        }

        return $numberFormatCode;
    }
}