Completed
Pull Request — master (#36)
by Tim
03:00
created

SubjectPlugin::processSubject()   B

Complexity

Conditions 6
Paths 7

Size

Total Lines 68
Code Lines 25

Duplication

Lines 3
Ratio 4.41 %

Code Coverage

Tests 0
CRAP Score 42

Importance

Changes 0
Metric Value
dl 3
loc 68
ccs 0
cts 33
cp 0
rs 8.5748
c 0
b 0
f 0
cc 6
eloc 25
nc 7
nop 1
crap 42

How to fix   Long Method   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:

1
<?php
2
3
/**
4
 * TechDivision\Import\Plugins\SubjectPlugin
5
 *
6
 * NOTICE OF LICENSE
7
 *
8
 * This source file is subject to the Open Software License (OSL 3.0)
9
 * that is available through the world-wide-web at this URL:
10
 * http://opensource.org/licenses/osl-3.0.php
11
 *
12
 * PHP version 5
13
 *
14
 * @author    Tim Wagner <[email protected]>
15
 * @copyright 2016 TechDivision GmbH <[email protected]>
16
 * @license   http://opensource.org/licenses/osl-3.0.php Open Software License (OSL 3.0)
17
 * @link      https://github.com/techdivision/import
18
 * @link      http://www.techdivision.com
19
 */
20
21
namespace TechDivision\Import\Plugins;
22
23
use TechDivision\Import\Utils\BunchKeys;
24
use TechDivision\Import\Utils\RegistryKeys;
25
use TechDivision\Import\ApplicationInterface;
26
use TechDivision\Import\Subjects\ExportableSubjectInterface;
27
use TechDivision\Import\Cli\Exceptions\LineNotFoundException;
28
use TechDivision\Import\Cli\Exceptions\FileNotFoundException;
29
use TechDivision\Import\Configuration\SubjectConfigurationInterface;
30
31
/**
32
 * Plugin that processes the subjects.
33
 *
34
 * @author    Tim Wagner <[email protected]>
35
 * @copyright 2016 TechDivision GmbH <[email protected]>
36
 * @license   http://opensource.org/licenses/osl-3.0.php Open Software License (OSL 3.0)
37
 * @link      https://github.com/techdivision/import
38
 * @link      http://www.techdivision.com
39
 */
40
class SubjectPlugin extends AbstractPlugin
41
{
42
43
    /**
44
     * The matches for the last processed CSV filename.
45
     *
46
     * @var array
47
     */
48
    protected $matches = array();
49
50
    /**
51
     * The number of imported bunches.
52
     *
53
     * @var integer
54
     */
55
    protected $bunches = 0;
56
57
    /**
58
     * Process the plugin functionality.
59
     *
60
     * @return void
61
     * @throws \Exception Is thrown, if the plugin can not be processed
62
     */
63
    public function process()
64
    {
65
        try {
66
            // immediately add the PID to lock this import process
67
            $this->lock();
68
69
            // load system logger and registry
70
            $importProcessor = $this->getImportProcessor();
71
72
            // start the transaction
73
            $importProcessor->getConnection()->beginTransaction();
74
75
            // load the plugin's subjects
76
            $subjects = $this->getPluginConfiguration()->getSubjects();
77
78
            // initialize the status information for the subjects
79
            /** @var \TechDivision\Import\Configuration\SubjectConfigurationInterface $subject */
80
            foreach ($subjects as $subject) {
81
                $status[$subject->getPrefix()] = array();
0 ignored issues
show
Coding Style Comprehensibility introduced by
$status was never initialized. Although not strictly required by PHP, it is generally a good practice to add $status = array(); before regardless.

Adding an explicit array definition is generally preferable to implicit array definition as it guarantees a stable state of the code.

Let’s take a look at an example:

foreach ($collection as $item) {
    $myArray['foo'] = $item->getFoo();

    if ($item->hasBar()) {
        $myArray['bar'] = $item->getBar();
    }

    // do something with $myArray
}

As you can see in this example, the array $myArray is initialized the first time when the foreach loop is entered. You can also see that the value of the bar key is only written conditionally; thus, its value might result from a previous iteration.

This might or might not be intended. To make your intention clear, your code more readible and to avoid accidental bugs, we recommend to add an explicit initialization $myArray = array() either outside or inside the foreach loop.

Loading history...
82
            }
83
84
            // and update it in the registry
85
            $this->getRegistryProcessor()->mergeAttributesRecursive($this->getSerial(), $status);
0 ignored issues
show
Bug introduced by
The variable $status does not seem to be defined for all execution paths leading up to this point.

If you define a variable conditionally, it can happen that it is not defined for all execution paths.

Let’s take a look at an example:

function myFunction($a) {
    switch ($a) {
        case 'foo':
            $x = 1;
            break;

        case 'bar':
            $x = 2;
            break;
    }

    // $x is potentially undefined here.
    echo $x;
}

In the above example, the variable $x is defined if you pass “foo” or “bar” as argument for $a. However, since the switch statement has no default case statement, if you pass any other value, the variable $x would be undefined.

Available Fixes

  1. Check for existence of the variable explicitly:

    function myFunction($a) {
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
        }
    
        if (isset($x)) { // Make sure it's always set.
            echo $x;
        }
    }
    
  2. Define a default value for the variable:

    function myFunction($a) {
        $x = ''; // Set a default which gets overridden for certain paths.
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
        }
    
        echo $x;
    }
    
  3. Add a value for the missing path:

    function myFunction($a) {
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
    
            // We add support for the missing case.
            default:
                $x = '';
                break;
        }
    
        echo $x;
    }
    
Loading history...
86
87
            // process all the subjects found in the system configuration
88
            /** @var \TechDivision\Import\Configuration\SubjectConfigurationInterface $subject */
89
            foreach ($subjects as $subject) {
90
                $this->processSubject($subject);
91
            }
92
93
            // update the number of imported bunches
94
            $this->getRegistryProcessor()->mergeAttributesRecursive(
95
                $this->getSerial(), array(RegistryKeys::BUNCHES => $this->bunches)
96
            );
97
98
            // finally, if a PID has been set (because CSV files has been found),
99
            // remove it from the PID file to unlock the importer
100
            $this->unlock();
101
102
            // commit the transaction
103
            $importProcessor->getConnection()->commit();
104
0 ignored issues
show
Coding Style introduced by
Blank line found at end of control structure
Loading history...
105
        } catch (\Exception $e) {
106
            // finally, if a PID has been set (because CSV files has been found),
107
            // remove it from the PID file to unlock the importer
108
            $this->unlock();
109
110
            // rollback the transaction
111
            $importProcessor->getConnection()->rollBack();
0 ignored issues
show
Bug introduced by
The variable $importProcessor does not seem to be defined for all execution paths leading up to this point.

If you define a variable conditionally, it can happen that it is not defined for all execution paths.

Let’s take a look at an example:

function myFunction($a) {
    switch ($a) {
        case 'foo':
            $x = 1;
            break;

        case 'bar':
            $x = 2;
            break;
    }

    // $x is potentially undefined here.
    echo $x;
}

In the above example, the variable $x is defined if you pass “foo” or “bar” as argument for $a. However, since the switch statement has no default case statement, if you pass any other value, the variable $x would be undefined.

Available Fixes

  1. Check for existence of the variable explicitly:

    function myFunction($a) {
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
        }
    
        if (isset($x)) { // Make sure it's always set.
            echo $x;
        }
    }
    
  2. Define a default value for the variable:

    function myFunction($a) {
        $x = ''; // Set a default which gets overridden for certain paths.
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
        }
    
        echo $x;
    }
    
  3. Add a value for the missing path:

    function myFunction($a) {
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
    
            // We add support for the missing case.
            default:
                $x = '';
                break;
        }
    
        echo $x;
    }
    
Loading history...
112
113
            // re-throw the exception
114
            throw $e;
115
        }
116
    }
117
118
    /**
119
     * Process the subject with the passed name/identifier.
120
     *
121
     * We create a new, fresh and separate subject for EVERY file here, because this would be
122
     * the starting point to parallelize the import process in a multithreaded/multiprocessed
123
     * environment.
124
     *
125
     * @param \TechDivision\Import\Configuration\SubjectConfigurationInterface $subject The subject configuration
126
     *
127
     * @return void
128
     * @throws \Exception Is thrown, if the subject can't be processed
129
     */
130
    protected function processSubject(SubjectConfigurationInterface $subject)
131
    {
132
133
        // clear the filecache
134
        clearstatcache();
135
136
        // load the actual status
137
        $status = $this->getRegistryProcessor()->getAttribute($serial = $this->getSerial());
138
139
        // query whether or not the configured source directory is available
140 View Code Duplication
        if (!is_dir($sourceDir = $status[RegistryKeys::SOURCE_DIRECTORY])) {
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

Loading history...
141
            throw new \Exception(sprintf('Source directory %s for subject %s is not available!', $sourceDir, $subject->getClassName()));
142
        }
143
144
        // initialize the array with the CSV files found in the source directory
145
        $files = glob(sprintf('%s/*.csv', $sourceDir));
146
147
        // sorting the files for the apropriate order
148
        usort($files, function ($a, $b) {
149
            return strcmp($a, $b);
150
        });
151
152
        // log a debug message
153
        $this->getSystemLogger()->debug(sprintf('Now checking directory %s for files to be imported', $sourceDir));
154
155
        // initialize the bunch number
156
        $bunches = 0;
157
158
        // iterate through all CSV files and process the subjects
159
        foreach ($files as $pathname) {
160
            // query whether or not that the file is part of the actual bunch
161
            if ($this->isPartOfBunch($subject->getPrefix(), $pathname)) {
162
                // initialize the subject and import the bunch
163
                $subjectInstance = $this->subjectFactory($subject);
164
165
                // query whether or not the subject needs an OK file,
166
                // if yes remove the filename from the file
167
                if ($subjectInstance->isOkFileNeeded()) {
168
                    $this->removeFromOkFile($pathname);
169
                }
170
171
                // finally import the CSV file
172
                $subjectInstance->import($serial, $pathname);
173
174
                // query whether or not, we've to export artefacts
175
                if ($subjectInstance instanceof ExportableSubjectInterface) {
176
                    $subjectInstance->export(
177
                        $this->matches[BunchKeys::FILENAME],
178
                        $this->matches[BunchKeys::COUNTER]
179
                    );
180
                }
181
182
                // raise the number of the imported bunches
183
                $bunches++;
184
            }
185
        }
186
187
        // raise the bunch number by the imported bunches
188
        $this->bunches = $this->bunches + $bunches;
189
190
        // reset the matches, because the exported artefacts
191
        $this->matches = array();
192
193
        // and and log a message that the subject has been processed
194
        $this->getSystemLogger()->debug(
195
            sprintf('Successfully processed subject %s with %d bunch(es)!', $subject->getClassName(), $bunches)
196
        );
197
    }
198
199
    /**
200
     * Factory method to create new handler instances.
201
     *
202
     * @param \TechDivision\Import\Configuration\SubjectConfigurationInterface $subjectConfiguration The subject configuration
203
     *
204
     * @return object The handler instance
205
     */
206
    protected function subjectFactory(SubjectConfigurationInterface $subjectConfiguration)
207
    {
208
209
        // load the subject class name
210
        $className = $subjectConfiguration->getClassName();
211
212
        // initialize the instances
213
        $productProcessor = null;
214
        $systemLogger = $this->getSystemLogger();
215
        $registryProcessor = $this->getRegistryProcessor();
216
217
        // instanciate and set the product processor, if specified
218
        if ($processorFactory = $subjectConfiguration->getProcessorFactory()) {
219
            $productProcessor = $processorFactory::factory(
220
                $this->getImportProcessor()->getConnection(),
221
                $subjectConfiguration
222
            );
223
        }
224
225
        // initialize a new handler with the passed class name
226
        return new $className(
227
            $systemLogger,
228
            $subjectConfiguration,
229
            $registryProcessor,
230
            $productProcessor
231
        );
232
    }
233
234
    /**
235
     * Queries whether or not, the passed filename is part of a bunch or not.
236
     *
237
     * @param string $prefix   The prefix to query for
238
     * @param string $filename The filename to query for
239
     *
240
     * @return boolean TRUE if the filename is part, else FALSE
241
     */
242 2
    protected function isPartOfBunch($prefix, $filename)
243
    {
244
245
        // initialize the pattern
246 2
        $pattern = '';
0 ignored issues
show
Unused Code introduced by
$pattern is not used, you could remove the assignment.

This check looks for variable assignements that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten for every possible time line.

Loading history...
247
248
        // query whether or not, this is the first file to be processed
249 2
        if (sizeof($this->matches) === 0) {
250
            // initialize the pattern to query whether the FIRST file has to be processed or not
251 2
            $pattern = sprintf(
252 2
                '/^.*\/(?<%s>%s)_(?<%s>.*)_(?<%s>\d+)\\.csv$/',
253 2
                BunchKeys::PREFIX,
254 2
                $prefix,
255 2
                BunchKeys::FILENAME,
256
                BunchKeys::COUNTER
257 2
            );
258
0 ignored issues
show
Coding Style introduced by
Blank line found at end of control structure
Loading history...
259 2
        } else {
260
            // initialize the pattern to query whether the NEXT file is part of a bunch or not
261 2
            $pattern = sprintf(
262 2
                '/^.*\/(?<%s>%s)_(?<%s>%s)_(?<%s>\d+)\\.csv$/',
263 2
                BunchKeys::PREFIX,
264 2
                $this->matches[BunchKeys::PREFIX],
265 2
                BunchKeys::FILENAME,
266 2
                $this->matches[BunchKeys::FILENAME],
267
                BunchKeys::COUNTER
268 2
            );
269
        }
270
271
        // initialize the array for the matches
272 2
        $matches = array();
273
274
        // update the matches, if the pattern matches
275 2
        if ($result = preg_match($pattern, $filename, $matches)) {
276 2
            $this->matches = $matches;
277 2
        }
278
279
        // stop processing, if the filename doesn't match
280 2
        return (boolean) $result;
281
    }
282
283
    /**
284
     * Return's an array with the names of the expected OK files for the actual subject.
285
     *
286
     * @return array The array with the expected OK filenames
287
     */
288
    protected function getOkFilenames()
289
    {
290
291
        // load the array with the available bunch keys
292
        $bunchKeys = BunchKeys::getAllKeys();
293
294
        // initialize the array for the available okFilenames
295
        $okFilenames = array();
296
297
        // prepare the OK filenames based on the found CSV file information
298
        for ($i = 1; $i <= sizeof($bunchKeys); $i++) {
0 ignored issues
show
Performance Best Practice introduced by
It seems like you are calling the size function sizeof() as part of the test condition. You might want to compute the size beforehand, and not on each iteration.

If the size of the collection does not change during the iteration, it is generally a good practice to compute it beforehand, and not on each iteration:

for ($i=0; $i<count($array); $i++) { // calls count() on each iteration
}

// Better
for ($i=0, $c=count($array); $i<$c; $i++) { // calls count() just once
}
Loading history...
299
            // intialize the array for the parts of the names (prefix, filename + counter)
300
            $parts = array();
301
            // load the parts from the matches
302
            for ($z = 0; $z < $i; $z++) {
303
                $parts[] = $this->matches[$bunchKeys[$z]];
304
            }
305
306
            // query whether or not, the OK file exists, if yes append it
307
            if (file_exists($okFilename = sprintf('%s/%s.ok', $this->getSourceDir(), implode('_', $parts)))) {
308
                $okFilenames[] = $okFilename;
309
            }
310
        }
311
312
        // prepare and return the pattern for the OK file
313
        return $okFilenames;
314
    }
315
316
    /**
317
     * Query whether or not, the passed CSV filename is in the OK file. If the filename was found,
318
     * it'll be returned and the method return TRUE.
319
     *
320
     * If the filename is NOT in the OK file, the method return's FALSE and the CSV should NOT be
321
     * imported/moved.
322
     *
323
     * @param string $filename The CSV filename to query for
324
     *
325
     * @return void
326
     * @throws \Exception Is thrown, if the passed filename is NOT in the OK file or it can NOT be removed from it
327
     */
328
    protected function removeFromOkFile($filename)
329
    {
330
331
        try {
332
            // load the expected OK filenames
333
            $okFilenames = $this->getOkFilenames();
334
335
            // iterate over the found OK filenames (should usually be only one, but could be more)
336
            foreach ($okFilenames as $okFilename) {
337
                // if the OK filename matches the CSV filename AND the OK file is empty
338
                if (basename($filename, '.csv') === basename($okFilename, '.ok') && filesize($okFilename) === 0) {
339
                    unlink($okFilename);
340
                    return;
341
                }
342
343
                // else, remove the CSV filename from the OK file
344
                $this->removeLineFromFile(basename($filename), $okFilename);
345
                return;
346
            }
347
348
            // throw an exception if either no OK file has been found,
349
            // or the CSV file is not in one of the OK files
350
            throw new \Exception(
351
                sprintf(
352
                    'Can\'t found filename %s in one of the expected OK files: %s',
353
                    $filename,
354
                    implode(', ', $okFilenames)
355
                )
356
            );
357
0 ignored issues
show
Coding Style introduced by
Blank line found at end of control structure
Loading history...
358
        } catch (\Exception $e) {
359
            // wrap and re-throw the exception
360
            throw new \Exception(
361
                sprintf(
362
                    'Can\'t remove filename %s from OK file: %s',
363
                    $filename,
364
                    $okFilename
365
                ),
366
                null,
367
                $e
368
            );
369
        }
370
    }
371
}
372