AbstractParserType   A

Complexity

Total Complexity 42

Size/Duplication

Total Lines 448
Duplicated Lines 43.08 %

Coupling/Cohesion

Components 1
Dependencies 26

Test Coverage

Coverage 0%

Importance

Changes 0

Metric  Value
wmc     42
lcom    1
cbo     26
dl      193
loc     448
ccs     0
cts     227
cp      0
rs      9.0399
c       0
b       0
f       0

24 Methods

Rating  Name                           Duplication  Size  Complexity
        getMapping()                   0            1     ?
        getAssociationMapping()        0            1     ?
        getFieldMapping()              0            1     ?
        getEntityFields()              0            1     ?
        getForeignMapping()            0            1     ?
A       __construct()                  0            5     1
A       setOptions()                   24           24    1
A       build()                        0            35    2
A       getOriginalIdCallback()        16           16    3
A       getOriginalUrlCallback()       16           16    3
A       getModificationDateCallback()  18           18    4
A       addCustomModifiers()           0            3     1
A       addEntityModifiers()           21           21    4
A       addAssociationModifiers()      15           15    2
A       addFieldModifiers()            4            16    2
C       addFieldTypeModifiers()        48           64    11
A       addFinalModifiers()            8            8     1
A       getMappedFields()              23           23    1
A       getUnmappedFields()            0            4     1
A       getExtraMappedFields()         0            4     1
A       getOriginalIdSelector()        0            3     1
A       getOriginalUrlSelector()       0            3     1
A       getModificationDateSelector()  0            3     1
A       getEntityManager()             0            4     1


Duplicated Code

Duplicate code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
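In this class, getOriginalIdCallback(), getOriginalUrlCallback() and getModificationDateCallback() repeat the same selector lookup. The following is a minimal sketch of one way to fold that lookup into a single helper, not the project's actual code. The names createSelectorCallback, StubCrawler and StubNode are hypothetical; the stubs only stand in for Symfony's Crawler so that the sketch runs without dependencies. One deliberate difference: the selector is passed in up front, whereas the original methods read it lazily inside the closure.

```php
<?php

// Hypothetical stand-in for a matched node; only html() is needed here.
final class StubNode
{
    private $html;

    public function __construct($html)
    {
        $this->html = $html;
    }

    public function html()
    {
        return $this->html;
    }
}

// Hypothetical stand-in for Symfony's Crawler that records which lookup ran.
final class StubCrawler
{
    public $lastLookup;
    private $html;

    public function __construct($html)
    {
        $this->html = $html;
    }

    public function filterXPath($selector)
    {
        $this->lastLookup = 'xpath';

        return new StubNode($this->html);
    }

    public function filter($selector)
    {
        $this->lastLookup = 'css';

        return new StubNode($this->html);
    }
}

/**
 * Builds a callback that resolves $selector against a crawler and returns
 * the node's HTML, optionally passed through $convert.
 * Selectors starting with "//" are treated as XPath, all others as CSS.
 */
function createSelectorCallback($selector, ?callable $convert = null)
{
    return function ($crawler) use ($selector, $convert) {
        if (!$selector) {
            return null;
        }

        $node = strpos($selector, '//') === 0
            ? $crawler->filterXPath($selector)
            : $crawler->filter($selector);
        $html = $node->html();

        if ($convert === null) {
            return $html;
        }

        // Like getModificationDateCallback(): only convert non-empty values.
        return $html ? $convert($html) : null;
    };
}

// Usage: the date callback differs from the id/url callbacks only in $convert.
$crawler = new StubCrawler('2015-06-01');
$dateCallback = createSelectorCallback('//time', function ($value) {
    return new \DateTime($value);
});
echo $crawler->lastLookup === null ? 'not run yet' : 'already run', PHP_EOL;
echo $dateCallback($crawler)->format('Y-m-d'), ' via ', $crawler->lastLookup, PHP_EOL;
```

With such a helper, each of the three methods would shrink to a single call, differing only in the selector and the optional conversion.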

Common duplication problems, and corresponding solutions are:

Complex Class

Tip: Before tackling complexity, make sure that you eliminate any duplication first. This can often reduce the size of classes significantly.

Complex classes like AbstractParserType often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields or methods that share the same prefixes or suffixes. You can also look at the cohesion graph to spot any unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
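As an illustration of Extract Class for this file: the type switch in addFieldTypeModifiers() (rated C, complexity 11) depends only on the field mapping and the locale options, so it is a plausible candidate for its own class. The sketch below is an assumption, not the project's design. FieldTypeRules is an invented name, and it returns transformer class names as strings, leaving out constructor arguments such as the number locale and scale, so that it runs standalone.

```php
<?php

// Hypothetical extracted class: decides which transformers apply to a field.
final class FieldTypeRules
{
    private $dateLocale;

    public function __construct($dateLocale = 'en')
    {
        $this->dateLocale = $dateLocale;
    }

    /**
     * Returns the short class names of the transformers to apply, in order,
     * for a Doctrine-style field mapping array.
     */
    public function transformersFor(array $mapping)
    {
        $names = [];

        switch ($mapping['type']) {
            case 'string':
                $names = [
                    'MultiLineToSingleLineTransformer',
                    'MultiSpaceToSingleSpaceTransformer',
                    'NormalizedStringTransformer',
                ];
                break;

            case 'text':
                $names = ['MultiSpaceToSingleSpaceTransformer', 'NormalizedStringTransformer'];
                break;

            case 'integer':
            case 'smallint':
            case 'decimal':
                $names = ['LocalizedStringToNumberTransformer'];
                break;

            case 'boolean':
                $names = ['StringToBooleanTransformer'];
                break;

            case 'date':
            case 'datetime':
                $names = [
                    $this->dateLocale === 'nl'
                        ? 'DutchStringToDateTimeTransformer'
                        : 'StringToDateTimeTransformer',
                    'DateTimeToIso8601Transformer',
                ];
                break;
        }

        // Unlike the original, a missing 'nullable' key is treated as false.
        if (!empty($mapping['nullable'])) {
            $names[] = 'EmptyValueToNullTransformer';
        }

        return $names;
    }
}

// Usage: a nullable date field with the Dutch date locale.
$rules = new FieldTypeRules('nl');
print_r($rules->transformersFor(['type' => 'date', 'nullable' => true]));
```

Keeping the decision separate from the calls to addTransformerBetween() would also make the type rules testable without a parser builder.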

While breaking up the class, it is a good idea to analyze how other classes use AbstractParserType, and based on these observations, apply Extract Interface, too.
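A rough sketch of Extract Interface, under stated assumptions: the three methods ending in "Selector" share a suffix and could form one small contract. The interface name ItemSelectorProviderInterface, the ExampleSelectors class and the configuredSelectors() helper are all hypothetical. Note that these methods are protected in AbstractParserType, so exposing them through an interface would require making them public.

```php
<?php

// Hypothetical interface grouping the selector getters.
interface ItemSelectorProviderInterface
{
    /** @return string|null CSS or XPath selector, or null when not configured */
    public function getOriginalIdSelector();

    /** @return string|null */
    public function getOriginalUrlSelector();

    /** @return string|null */
    public function getModificationDateSelector();
}

// Hypothetical implementation with made-up selectors.
final class ExampleSelectors implements ItemSelectorProviderInterface
{
    public function getOriginalIdSelector()
    {
        return '#listing-id';
    }

    public function getOriginalUrlSelector()
    {
        return '//link[@rel="canonical"]';
    }

    public function getModificationDateSelector()
    {
        return null;
    }
}

/**
 * Returns only the selectors that are actually configured, keyed by purpose.
 * Consumers depend on the interface, not on a concrete parser type.
 */
function configuredSelectors(ItemSelectorProviderInterface $provider)
{
    return array_filter([
        'original_id' => $provider->getOriginalIdSelector(),
        'original_url' => $provider->getOriginalUrlSelector(),
        'modification_date' => $provider->getModificationDateSelector(),
    ]);
}

print_r(configuredSelectors(new ExampleSelectors()));
```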

<?php

namespace TreeHouse\IoBundle\Scrape\Parser\Type;

use Doctrine\Common\Persistence\ManagerRegistry;
use Doctrine\ORM\EntityManagerInterface;
use Doctrine\ORM\Mapping\ClassMetadataInfo;
use Symfony\Component\DomCrawler\Crawler;
use Symfony\Component\HttpFoundation\ParameterBag;
use Symfony\Component\OptionsResolver\OptionsResolver;
use TreeHouse\Feeder\Modifier\Data\Transformer\DateTimeToIso8601Transformer;
use TreeHouse\Feeder\Modifier\Data\Transformer\EmptyValueToNullTransformer;
use TreeHouse\Feeder\Modifier\Data\Transformer\StringToBooleanTransformer;
use TreeHouse\Feeder\Modifier\Data\Transformer\TraversingTransformer;
use TreeHouse\Feeder\Modifier\Item\Transformer\ObsoleteFieldsTransformer;
use TreeHouse\Feeder\Modifier\Item\Transformer\TrimTransformer;
use TreeHouse\Feeder\Modifier\Item\Transformer\UnderscoreKeysTransformer;
use TreeHouse\IoBundle\Entity\Scraper;
use TreeHouse\IoBundle\Item\Modifier\Data\Transformer\DutchStringToDateTimeTransformer;
use TreeHouse\IoBundle\Item\Modifier\Data\Transformer\EntityToIdTransformer;
use TreeHouse\IoBundle\Item\Modifier\Data\Transformer\ForeignMappingTransformer;
use TreeHouse\IoBundle\Item\Modifier\Data\Transformer\LocalizedStringToNumberTransformer;
use TreeHouse\IoBundle\Item\Modifier\Data\Transformer\MultiLineToSingleLineTransformer;
use TreeHouse\IoBundle\Item\Modifier\Data\Transformer\MultiSpaceToSingleSpaceTransformer;
use TreeHouse\IoBundle\Item\Modifier\Data\Transformer\NormalizedStringTransformer;
use TreeHouse\IoBundle\Item\Modifier\Data\Transformer\StringToDateTimeTransformer;
use TreeHouse\IoBundle\Item\Modifier\Item\Filter\BlockedSourceFilter;
use TreeHouse\IoBundle\Item\Modifier\Item\Filter\ModifiedItemFilter;
use TreeHouse\IoBundle\Item\Modifier\Item\Transformer\DefaultValuesTransformer;
use TreeHouse\IoBundle\Item\Modifier\Item\Validator\OriginIdValidator;
use TreeHouse\IoBundle\Scrape\Modifier\Item\Mapper\NodeMapper;
use TreeHouse\IoBundle\Scrape\Modifier\Item\Mapper\ScrapedItemBagMapper;
use TreeHouse\IoBundle\Scrape\Parser\ParserBuilderInterface;
use TreeHouse\IoBundle\Source\Manager\CachedSourceManager;

abstract class AbstractParserType implements ParserTypeInterface
{
    /**
     * @var ManagerRegistry
     */
    protected $doctrine;

    /**
     * @var CachedSourceManager
     */
    protected $sourceManager;

    /**
     * @var array
     */
    protected $options = [];

    /**
     * @param ManagerRegistry     $doctrine
     * @param CachedSourceManager $sourceManager
     */
    public function __construct(ManagerRegistry $doctrine, CachedSourceManager $sourceManager)
    {
        $this->doctrine = $doctrine;
        $this->sourceManager = $sourceManager;
    }

    // Issue (duplication): This method seems to be duplicated in your project.
    // Duplicated code is one of the most pungent code smells. If you need to
    // duplicate the same code in three or more different places, we strongly
    // encourage you to look into extracting the code into a single class or
    // operation.
    /**
     * {@inheritDoc}
     */
    public function setOptions(OptionsResolver $resolver)
    {
        $resolver->setRequired([
            'forced',
            'scraper',
            'date_locale',
            'number_locale',
            'default_values',
        ]);

        $resolver->setDefaults([
            'forced' => false,
            'date_locale' => 'en',
            'number_locale' => 'en',
            'default_values' => [],
        ]);

        $resolver->setAllowedTypes('forced', 'bool');
        $resolver->setAllowedTypes('scraper', Scraper::class);
        $resolver->setAllowedTypes('default_values', 'array');

        $resolver->setAllowedValues('date_locale', ['en', 'nl']);
        $resolver->setAllowedValues('number_locale', ['en', 'nl']);
    }

    /**
     * @inheritdoc
     */
    public function build(ParserBuilderInterface $parser, array $options)
    {
        $this->options = $options;

        // set original id/url on the item
        $parser->addModifierBetween(new ScrapedItemBagMapper($this->getOriginalIdCallback(), $this->getOriginalUrlCallback(), $this->getModificationDateCallback()), 100, 200);

        // range 300-400: perform validation and checks for skipping early on
        $parser->addModifierBetween(new OriginIdValidator(), 300, 400);
        $parser->addModifierBetween(new BlockedSourceFilter($this->sourceManager), 300, 400);

        // check for modification dates, but only when not forced
        if ($options['forced'] === false) {
            $parser->addModifierBetween(new ModifiedItemFilter($this->sourceManager), 300, 400);
        }

        // 2000-3000: map paths
        $parser->addModifier(new NodeMapper($this->getMapping()), 2000);
        $parser->addModifierBetween(new TrimTransformer(), 2000, 3000);

        // 3000-3500: transform specific fields

        // 3500-4000: feed-type specific: reserved for field transformers before the regular transformers

        // 4000-5000: reserved for transformers added automatically based on entity field mapping
        $this->addEntityModifiers($parser, 4000, 5000);

        // 5000-6000: feed-type specific: reserved for field transformers after the regular transformers

        // 6000-7000: reserved for modifiers after all other modifiers are done
        $this->addFinalModifiers($parser, 6000, 7000);

        // give extending feed type a method for custom modifiers
        $this->addCustomModifiers($parser, $options);
    }

    /**
     * @return array
     */
    abstract protected function getMapping();

    /**
     * Returns mapping for an association, or null if it does not exist.
     *
     * @param string $association
     *
     * @return array|null
     */
    abstract protected function getAssociationMapping($association);

    /**
     * Returns mapping for a field, or null if it does not exist.
     *
     * @param string $field
     *
     * @return array|null
     */
    abstract protected function getFieldMapping($field);

    /**
     * Returns an array with all the field/association names for
     * the entity that is imported.
     *
     * @return array
     */
    abstract protected function getEntityFields();

    /**
     * Specify a mapping here from foreign configuration to our configuration.
     *
     * @return array
     */
    abstract protected function getForeignMapping();

    // Issue (duplication): This method seems to be duplicated in your project.
    /**
     * @return \Closure
     */
    protected function getOriginalIdCallback()
    {
        return function (Crawler $crawler) {
            if ($selector = $this->getOriginalIdSelector()) {
                // selectors starting with "//" are XPath, anything else is CSS
                if (preg_match('/^\/\//', $selector)) {
                    $node = $crawler->filterXPath($selector);
                } else {
                    $node = $crawler->filter($selector);
                }

                return $node->html();
            }

            return null;
        };
    }

    // Issue (duplication): This method seems to be duplicated in your project.
    /**
     * @return \Closure
     */
    protected function getOriginalUrlCallback()
    {
        return function (Crawler $crawler) {
            if ($selector = $this->getOriginalUrlSelector()) {
                // selectors starting with "//" are XPath, anything else is CSS
                if (preg_match('/^\/\//', $selector)) {
                    $node = $crawler->filterXPath($selector);
                } else {
                    $node = $crawler->filter($selector);
                }

                return $node->html();
            }

            return null;
        };
    }

    // Issue (duplication): This method seems to be duplicated in your project.
    /**
     * @return \Closure
     */
    protected function getModificationDateCallback()
    {
        return function (Crawler $crawler) {
            if ($selector = $this->getModificationDateSelector()) {
                // selectors starting with "//" are XPath, anything else is CSS
                if (preg_match('/^\/\//', $selector)) {
                    $node = $crawler->filterXPath($selector);
                } else {
                    $node = $crawler->filter($selector);
                }

                // only build a date when the node has content
                if ($date = $node->html()) {
                    return new \DateTime($date);
                }
            }

            return null;
        };
    }

    // Issue (unused code): The parameters $parser and $options are not used
    // and could be removed. This check looks for parameters that have been
    // defined for a function or method, but which are not used in the method body.
    /**
     * Override this method to add custom modifiers to the feed.
     *
     * @param ParserBuilderInterface $parser
     * @param array                  $options
     */
    protected function addCustomModifiers(ParserBuilderInterface $parser, array $options)
    {
    }

    // Issue (duplication): This method seems to be duplicated in your project.
    /**
     * Automatically adds modifiers based on entity field/association mapping.
     *
     * @param ParserBuilderInterface $parser
     * @param int                    $startIndex
     * @param int                    $endIndex
     */
    protected function addEntityModifiers(ParserBuilderInterface $parser, $startIndex, $endIndex)
    {
        // skip id
        $mappedFields = array_filter($this->getMappedFields(), function ($value) {
            return $value !== 'id';
        });

        foreach ($mappedFields as $key) {
            // see if association is in meta
            if (null !== $mapping = $this->getAssociationMapping($key)) {
                $this->addAssociationModifiers($parser, $key, $mapping, $startIndex, $endIndex);
                continue;
            }

            // see if field is in meta
            if (null !== $mapping = $this->getFieldMapping($key)) {
                $this->addFieldModifiers($parser, $key, $mapping, $startIndex, $endIndex);
                continue;
            }
        }
    }

    // Issue (duplication): This method seems to be duplicated in your project.
    /**
     * @param ParserBuilderInterface $parser
     * @param string                 $association The association name
     * @param array                  $mapping     The association mapping
     * @param int                    $startIndex
     * @param int                    $endIndex
     */
    protected function addAssociationModifiers(
        ParserBuilderInterface $parser,
        $association,
        array $mapping,
        $startIndex,
        $endIndex
    ) {
        $transformer = new EntityToIdTransformer($this->getEntityManager());

        // to-many associations hold a collection, so transform each element
        if ($mapping['type'] & ClassMetadataInfo::TO_MANY) {
            $transformer = new TraversingTransformer($transformer);
        }

        $parser->addTransformerBetween($transformer, $association, $startIndex, $endIndex);
    }

    /**
     * @param ParserBuilderInterface $parser
     * @param string                 $field      The field name
     * @param array                  $mapping    The field mapping
     * @param int                    $startIndex
     * @param int                    $endIndex
     */
    protected function addFieldModifiers(
        ParserBuilderInterface $parser,
        $field,
        array $mapping,
        $startIndex,
        $endIndex
    ) {
        // see if we need to translate it using foreign mapping
        $foreignMapping = $this->getForeignMapping();
        // Issue (duplication): This code seems to be duplicated across your project.
        if (array_key_exists($field, $foreignMapping)) {
            $transformer = new ForeignMappingTransformer($field, $foreignMapping[$field]);
            $parser->addTransformerBetween($transformer, $field, $startIndex, $endIndex);
        }

        $this->addFieldTypeModifiers($parser, $field, $mapping, $startIndex, $endIndex);
    }

    /**
     * @param ParserBuilderInterface $parser
     * @param string                 $field      The field name
     * @param array                  $mapping    The field mapping
     * @param int                    $startIndex
     * @param int                    $endIndex
     */
    protected function addFieldTypeModifiers(ParserBuilderInterface $parser, $field, array $mapping, $startIndex, $endIndex)
    {
        // Issue (duplication): The branches for 'string', 'text',
        // 'integer'/'smallint', 'decimal', 'boolean' and 'date'/'datetime'
        // seem to be duplicated across your project.

        // try to cast types
        switch ($mapping['type']) {
            case 'string':
                $parser->addTransformerBetween(new MultiLineToSingleLineTransformer(), $field, $startIndex, $endIndex);
                $parser->addTransformerBetween(new MultiSpaceToSingleSpaceTransformer(), $field, $startIndex, $endIndex);
                $parser->addTransformerBetween(new NormalizedStringTransformer(), $field, $startIndex, $endIndex);
                break;
            case 'text':
                $parser->addTransformerBetween(new MultiSpaceToSingleSpaceTransformer(), $field, $startIndex, $endIndex);
                $parser->addTransformerBetween(new NormalizedStringTransformer(), $field, $startIndex, $endIndex);
                break;

            case 'integer':
            case 'smallint':
                $parser->addTransformerBetween(
                    new LocalizedStringToNumberTransformer($this->options['number_locale'], 0, true, null),
                    $field,
                    $startIndex,
                    $endIndex
                );
                break;

            case 'decimal':
                $parser->addTransformerBetween(
                    new LocalizedStringToNumberTransformer($this->options['number_locale'], $mapping['scale'], true, null),
                    $field,
                    $startIndex,
                    $endIndex
                );
                break;

            case 'boolean':
                $parser->addTransformerBetween(
                    new StringToBooleanTransformer(['ja', 'j', 'y'], ['nee', 'n']),
                    $field,
                    $startIndex,
                    $endIndex
                );
                break;

            case 'date':
            case 'datetime':
                switch ($this->options['date_locale']) {
                    case 'nl':
                        $transformer = new DutchStringToDateTimeTransformer();
                        break;

                    default:
                        $transformer = new StringToDateTimeTransformer();
                        break;
                }

                $parser->addTransformerBetween($transformer, $field, $startIndex, $endIndex);
                $parser->addTransformerBetween(new DateTimeToIso8601Transformer(), $field, $startIndex, $endIndex);
                break;
        }

        if ($mapping['nullable']) {
            $parser->addTransformerBetween(new EmptyValueToNullTransformer(), $field, $startIndex, $endIndex);
        }
    }

    // Issue (duplication): This method seems to be duplicated in your project.
    /**
     * @param ParserBuilderInterface $parser
     * @param int                    $startIndex
     * @param int                    $endIndex
     */
    protected function addFinalModifiers(ParserBuilderInterface $parser, $startIndex, $endIndex)
    {
        // set default values
        $parser->addModifierBetween(new DefaultValuesTransformer($this->options['default_values']), $startIndex, $endIndex);

        // scrub obsolete fields
        $parser->addModifierBetween(new ObsoleteFieldsTransformer($this->getMappedFields()), $startIndex, $endIndex);
    }

    // Issue (duplication): This method seems to be duplicated in your project.
    /**
     * Returns the names of all mapped and extra mapped fields. These are the
     * fields that are allowed in the resulting item. The fields are
     * normalized to be lowercased and underscored (instead of dashes).
     *
     * @return array
     */
    protected function getMappedFields()
    {
        $fields = array_diff(
            array_unique(
                array_merge(
                    array_values($this->getEntityFields()),
                    array_values($this->getExtraMappedFields()),
                    array_keys($this->getMapping())
                )
            ),
            $this->getUnmappedFields()
        );

        $fields = new ParameterBag(array_flip($fields));

        // make sure id is not in it
        $fields->remove('id');

        $transformer = new UnderscoreKeysTransformer();
        $transformer->transform($fields);

        return $fields->keys();
    }

    /**
     * Specify fields here that are explicitly not mapped.
     *
     * @return array
     */
    protected function getUnmappedFields()
    {
        return [];
    }

    /**
     * Specify fields here that are not mapped directly, but need to stay in the resulting item.
     *
     * @return array
     */
    protected function getExtraMappedFields()
    {
        return [];
    }

    /**
     * Override to return the CSS or XPath selector for the original id.
     * With the default null, the id callback returns null.
     *
     * @return string|null
     */
    protected function getOriginalIdSelector()
    {
        return null;
    }

    /**
     * Override to return the CSS or XPath selector for the original url.
     * With the default null, the url callback returns null.
     *
     * @return string|null
     */
    protected function getOriginalUrlSelector()
    {
        return null;
    }

    /**
     * Override to return the CSS or XPath selector for the modification date.
     * With the default null, the date callback returns null.
     *
     * @return string|null
     */
    protected function getModificationDateSelector()
    {
        return null;
    }

    /**
     * @return EntityManagerInterface
     */
    protected function getEntityManager()
    {
        return $this->doctrine->getManager();
    }
}