Completed
Branch master (3adcdb)
by Raffael
08:09 queued 04:17
created

Job::getFileByHash()   B

Complexity

Conditions 4
Paths 4

Size

Total Lines 29
Code Lines 16

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
dl 0
loc 29
rs 8.5806
c 0
b 0
f 0
cc 4
eloc 16
nc 4
nop 1
<?php

declare(strict_types=1);

/**
 * balloon
 *
 * @copyright   Copyright (c) 2012-2018 gyselroth GmbH (https://gyselroth.com)
 * @license     GPL-3.0 https://opensource.org/licenses/GPL-3.0
 */

namespace Balloon\App\Elasticsearch;

use Balloon\Filesystem;
use Balloon\Filesystem\Node\Collection;
use Balloon\Filesystem\Node\File;
use Balloon\Filesystem\Node\NodeInterface;
use Balloon\Filesystem\Storage;
use Balloon\Server;
use InvalidArgumentException;
use MongoDB\BSON\ObjectId;
use Psr\Log\LoggerInterface;
use TaskScheduler\AbstractJob;

class Job extends AbstractJob
{
    /**
     * Document actions.
     */
    const ACTION_CREATE = 0;
    const ACTION_UPDATE = 1;
    const ACTION_DELETE_COLLECTION = 2;
    const ACTION_DELETE_FILE = 3;
    const ACTION_ADD_SHARE = 4;
    const ACTION_DELETE_SHARE = 5;

    /**
     * Filesystem.
     *
     * @var Filesystem
     */
    protected $fs;

    /**
     * Elasticsearch.
     *
     * @var Elasticsearch
     */
    protected $es;

    /**
     * Logger.
     *
     * @var LoggerInterface
     */
    protected $logger;

    /**
     * Node attribute decorator.
     *
     * @var NodeAttributeDecorator
     */
    protected $decorator;

    /**
     * File size limit.
     *
     * @var int
     */
    protected $size_limit = 52428800;

    /**
     * Constructor.
     *
     * @param Elasticsearch          $es
     * @param Storage                $storage
Bug: There is no parameter named $storage. Was it maybe removed?

This check looks for PHPDoc comments describing methods or function parameters that do not exist on the corresponding method or function.

Consider the following example. The parameter $italy is not defined by the method finale(...).

/**
 * @param array $germany
 * @param array $island
 * @param array $italy
 */
function finale($germany, $island) {
    return "2:1";
}

The most likely cause is that the parameter was removed, but the annotation was not.

     * @param Server                 $server
     * @param NodeAttributeDecorator $decorator
     * @param LoggerInterface        $logger
     * @param iterable               $config
     */
    public function __construct(Elasticsearch $es, Server $server, NodeAttributeDecorator $decorator, LoggerInterface $logger, Iterable $config = null)
    {
        $this->es = $es;
        $this->fs = $server->getFilesystem();
        $this->decorator = $decorator;
        $this->logger = $logger;
        $this->setOptions($config);
        $this->setMemoryLimit();
    }

    /**
     * Set options.
     *
     * @param iterable $config
     *
     * @return Job
     */
    public function setOptions(?Iterable $config = null): self
    {
        if (null === $config) {
            return $this;
        }

        foreach ($config as $option => $value) {
            switch ($option) {
                case 'size_limit':
                    $this->size_limit = (int) $value;

                break;
                default:
                    throw new InvalidArgumentException('invalid option '.$option.' given');
            }
        }

        return $this;
    }

    /**
     * {@inheritdoc}
     */
    public function start(): bool
    {
        $this->logger->debug('elasticsearch document action ['.$this->data['action'].'] for node ['.$this->data['id'].']', [
            'category' => get_class($this),
        ]);

        switch ($this->data['action']) {
            case self::ACTION_CREATE:
                $this->createDocument($this->fs->findNodeById($this->data['id']));

            break;
            case self::ACTION_DELETE_FILE:
                $this->deleteFileDocument($this->data['id'], $this->data['hash']);

            break;
            case self::ACTION_DELETE_COLLECTION:
                $this->deleteCollectionDocument($this->data['id']);

            break;
            case self::ACTION_UPDATE:
                $this->updateDocument($this->fs->findNodeById($this->data['id']), $this->data['hash']);

            break;
            case self::ACTION_ADD_SHARE:
                $this->addShare($this->fs->findNodeById($this->data['id']));
Compatibility: $this->fs->findNodeById($this->data['id']) of type object<Balloon\Filesystem\Node\NodeInterface> is not a sub-type of object<Balloon\Filesystem\Node\Collection>. It seems like you assume a concrete implementation of the interface Balloon\Filesystem\Node\NodeInterface to be always present.

This check looks for parameters that are defined as one type in their type hint or doc comment but seem to be used as a narrower type, i.e. an implementation of an interface or a subclass.

Consider changing the type of the parameter or doing an instanceof check before assuming your parameter is of the expected type.
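The instanceof check suggested by this hint could look roughly like the following. This is a minimal sketch under stated assumptions: the three types are empty stand-ins for Balloon's NodeInterface, Collection and File, and requireCollection() is a hypothetical helper that does not exist in the inspected code.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins for Balloon\Filesystem\Node\NodeInterface,
// Collection and File, so that the sketch runs on its own.
interface NodeInterface
{
}

final class Collection implements NodeInterface
{
}

final class File implements NodeInterface
{
}

/**
 * Narrow a node to Collection before it is handed to a share action.
 * Hypothetical helper, not part of the Job class.
 */
function requireCollection(NodeInterface $node): Collection
{
    if (!$node instanceof Collection) {
        throw new InvalidArgumentException('share actions require a collection, got '.get_class($node));
    }

    return $node;
}
```

In start(), the share actions could then pass requireCollection(...) around the result of findNodeById(). Whether a non-collection should raise an exception or be skipped is a design choice the inspection does not settle.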

            break;
            case self::ACTION_DELETE_SHARE:
                $this->deleteShare($this->fs->findNodeById($this->data['id']));
Compatibility: $this->fs->findNodeById($this->data['id']) of type object<Balloon\Filesystem\Node\NodeInterface> is not a sub-type of object<Balloon\Filesystem\Node\Collection>. It seems like you assume a concrete implementation of the interface Balloon\Filesystem\Node\NodeInterface to be always present.

            break;
            default:
                throw new InvalidArgumentException('invalid document action given');
        }

        return true;
    }

    /**
     * Create document.
     *
     * @param NodeInterface $node
     *
     * @return bool
     */
    public function createDocument(NodeInterface $node): bool
    {
        $this->logger->info('create elasticsearch document for node ['.$node->getId().']', [
            'category' => get_class($this),
        ]);

        $params = $this->getParams($node);
        $params['body'] = $this->decorator->decorate($node);

        $this->es->getEsClient()->index($params);

        if ($node instanceof File) {
            $this->storeBlob($node);
        }

        return true;
    }

    /**
     * Update document.
     *
     * @param NodeInterface $node
     * @param null|string   $hash
     *
     * @return bool
     */
    public function updateDocument(NodeInterface $node, ?string $hash): bool
    {
        $this->logger->info('update elasticsearch document for node ['.$node->getId().']', [
            'category' => get_class($this),
        ]);

        $params = $this->getParams($node);
        $params['body'] = $this->decorator->decorate($node);
        $this->es->getEsClient()->index($params);

        if ($node instanceof File && $hash !== $node->getHash()) {
            if ($hash !== null) {
                $this->deleteBlobReference((string) $node->getId(), $hash);
            }

            $this->storeBlob($node);
        }

        return true;
    }

    /**
     * Delete collection document.
     *
     * @param ObjectId $node
     *
     * @return bool
     */
    public function deleteCollectionDocument(ObjectId $node): bool
    {
        $this->logger->info('delete elasticsearch document for collection ['.$node.']', [
            'category' => get_class($this),
        ]);

        $params = [
            'id' => (string) $node,
            'type' => 'storage',
            'index' => $this->es->getIndex(),
        ];

        $this->es->getEsClient()->delete($params);

        return true;
    }

    /**
     * Delete file document.
     *
     * @param ObjectId    $node
     * @param null|string $hash
     *
     * @return bool
     */
    public function deleteFileDocument(ObjectId $node, ?string $hash): bool
    {
        $this->logger->info('delete elasticsearch document for file ['.$node.']', [
            'category' => get_class($this),
        ]);

        $params = [
            'id' => (string) $node,
            'type' => 'storage',
            'index' => $this->es->getIndex(),
        ];

        $this->es->getEsClient()->delete($params);

        if ($hash !== null) {
            return $this->deleteBlobReference((string) $node, $hash);
        }

        // Without a hash there is no blob reference to clean up; the declared bool return type still needs a value.
        return true;
    }

    /**
     * Add share.
     *
     * @param Collection $collection
     *
     * @return bool
     */
    protected function addShare(Collection $collection): bool
    {
        $that = $this;
        $collection->doRecursiveAction(function ($node) use ($that) {
            if ($node instanceof Collection) {
                $that->addShare($node);
            } else {
                $that->storeBlob($node);
            }
        }, NodeInterface::DELETED_INCLUDE);

        return true;
    }

    /**
     * Delete share.
     *
     * @param Collection $collection
     *
     * @return bool
     */
    protected function deleteShare(Collection $collection): bool
    {
        $that = $this;
        $collection->doRecursiveAction(function ($node) use ($that) {
            if ($node instanceof Collection) {
                $that->deleteShare($node);
            } else {
                $that->storeBlob($node);
                //$result = $that->getFileByHash($node->getHash());
Unused Code / Comprehensibility: 67% of this comment could be valid code. Did you maybe forget this after debugging?

Sometimes obsolete code just ends up commented out instead of removed. In this case it is better to remove the code once you have checked you do not need it.

The code might also have been commented out for debugging purposes. In this case it is vital that someone uncomments it again or your project may behave in very unexpected ways in production.

This check looks for comments that seem to be mostly valid code and reports them.

                //if ($result !== null) {
Unused Code / Comprehensibility: 59% of this comment could be valid code. Did you maybe forget this after debugging?

                //     $this->updateBlob($result['_id'], ['share_ref' => []]);
Unused Code / Comprehensibility: 77% of this comment could be valid code. Did you maybe forget this after debugging?

                //}
            }
        }, NodeInterface::DELETED_INCLUDE);

        return true;
    }

    /**
     * Delete blob reference.
     *
     * @param string $id
     * @param string $hash
     *
     * @return bool
     */
    protected function deleteBlobReference(string $id, string $hash): bool
    {
        $result = $this->getFileByHash($hash);
        if ($result === null) {
            return true;
        }

        $ref = $result['_source']['metadata']['ref'];
        $key = array_search($id, array_column($ref, 'id'));

        if ($key !== false) {
            unset($ref[$key]);
        }

        if (count($ref) >= 1) {
            $this->logger->debug('elasticsearch blob document ['.$result['_id'].'] still has references left, just remove the reference ['.$id.']', [
                'category' => get_class($this),
            ]);

            $meta = $result['_source']['metadata'];
            $meta['ref'] = array_values($ref);

            return $this->updateBlob($result['_id'], $meta);
        }

        $this->logger->debug('elasticsearch blob document ['.$result['_id'].'] has no references left, remove completely', [
            'category' => get_class($this),
        ]);

        return $this->deleteBlob($result['_id']);
    }

    /**
     * Set memory limit.
     *
     * @return Job
     */
    protected function setMemoryLimit(): self
    {
        $raw = (int) ini_get('memory_limit');
        if ($raw === -1) {
            // -1 means unlimited, so there is nothing to raise.
            return $this;
        }

        // Assumes memory_limit uses megabyte shorthand such as "128M".
        $limit = $raw * 1024 * 1024;
        $required = $this->size_limit * 2;
        ini_set('memory_limit', (string) ($limit + $required));

        return $this;
    }

    /**
     * Get params.
     *
     * @param NodeInterface $node
     *
     * @return array
     */
    protected function getParams(NodeInterface $node): array
    {
        return [
            'index' => $this->es->getIndex(),
            'id' => (string) $node->getId(),
            'type' => 'storage',
        ];
    }

    /**
     * Delete blob.
     *
     * @param string $id
     *
     * @return bool
     */
    protected function deleteBlob(string $id): bool
    {
        $params = [
            'index' => $this->es->getIndex(),
            'id' => $id,
            'type' => 'fs',
        ];

        $this->es->getEsClient()->delete($params);

        return true;
    }

    /**
     * Get stored file.
     *
     * @param null|string $hash
     *
     * @return null|array
     */
    protected function getFileByHash(?string $hash): ?array
    {
        if ($hash === null) {
            return null;
        }

        $params = [
            'index' => $this->es->getIndex(),
            'type' => 'fs',
            'body' => [
                'query' => [
                    'match' => [
                        'md5' => $hash,
                    ],
                ],
            ],
        ];

        $result = $this->es->getEsClient()->search($params);

        if (count($result['hits']['hits']) === 0) {
            return null;
        }
        if (count($result['hits']['hits']) > 1) {
            throw new Exception\MultipleDocumentsFound('multiple elasticsearch documents found by the same hash');
        }

        return $result['hits']['hits'][0];
    }

    /**
     * Add or update blob.
     *
     * @param File $node
Bug: There is no parameter named $node. Was it maybe removed?

     *
     * @return bool
     */
    protected function storeBlob(File $file): bool
    {
        $this->logger->debug('store file blob for node ['.$file->getId().'] to elasticsearch', [
            'category' => get_class($this),
            'size' => $file->getSize(),
        ]);

        if ($file->getSize() > $this->size_limit) {
            $this->logger->debug('skip file blob ['.$file->getId().'] because size ['.$file->getSize().'] is bigger than the maximum configured ['.$this->size_limit.']', [
                'category' => get_class($this),
            ]);

            return false;
        }
        if ($file->getSize() === 0) {
            $this->logger->debug('skip empty file blob ['.$file->getId().']', [
                'category' => get_class($this),
            ]);

            return false;
        }

        $result = $this->getFileByHash($file->getHash());

        if ($result === null) {
            return $this->createNewBlob($file);
        }

        $update = $this->prepareUpdate($file, $result['_source']['metadata']);

        return $this->updateBlob($result['_id'], $update);
    }

    /**
     * Prepare references update.
     *
     * @param array $references
     * @param array $new_references
Documentation: There is no parameter named $new_references. Did you maybe mean $references?

This check looks for PHPDoc comments describing methods or function parameters that do not exist on the corresponding method or function. It has, however, found a similar but not annotated parameter which might be a good fit.

Consider the following example. The parameter $ireland is not defined by the method finale(...).

/**
 * @param array $germany
 * @param array $ireland
 */
function finale($germany, $island) {
    return "2:1";
}

The most likely cause is that the parameter was changed, but the annotation was not.

     * @param array $new_share_references
Documentation: There is no parameter named $new_share_references. Did you maybe mean $references?

     *
     * @return array
     */
    protected function prepareUpdate(File $file, array $references): array
    {
        $refs = array_column($references['ref'], 'id');
        if (!in_array((string) $file->getId(), $refs)) {
            $references['ref'][] = [
                'id' => (string) $file->getId(),
                'owner' => (string) $file->getOwner(),
            ];
        }

        // createNewBlob() only writes 'share_ref' for share members, so the key may be missing here.
        $share_refs = array_column($references['share_ref'] ?? [], 'id');
        $key = array_search((string) $file->getId(), $share_refs);
        if (!$file->isShareMember() && $key !== false) {
            unset($references['share_ref'][$key]);
            $references['share_ref'] = array_values($references['share_ref']);
        } elseif ($file->isShareMember() && $key === false) {
            $references['share_ref'][] = [
                'id' => (string) $file->getId(),
                'share' => (string) $file->getShareId(),
            ];
        }

        return $references;
    }

    /**
     * Add new blob.
     *
     * @param File $file
     *
     * @return bool
     */
    protected function createNewBlob(File $file): bool
    {
        $meta = [
            'ref' => [[
                'id' => (string) $file->getId(),
                'owner' => (string) $file->getOwner(),
            ]],
        ];

        if ($file->isShareMember()) {
            $meta['share_ref'] = [[
                'id' => (string) $file->getId(),
                'share' => (string) $file->getShareId(),
            ]];
        }

        $content = base64_encode(stream_get_contents($file->get()));

        $params = [
            'index' => $this->es->getIndex(),
            'id' => (string) new ObjectId(),
            'type' => 'fs',
            'body' => [
                'md5' => $file->getHash(),
                'metadata' => $meta,
                'content' => $content,
            ],
        ];

        $this->es->getEsClient()->index($params);

        return true;
    }

    /**
     * Update blob.
     *
     * @param File  $file
Bug: There is no parameter named $file. Was it maybe removed?

     * @param array $meta
     *
     * @return bool
     */
    protected function updateBlob(string $id, array $meta): bool
    {
        $params = [
            'index' => $this->es->getIndex(),
            'id' => $id,
            'type' => 'fs',
            'body' => [
                'doc' => [
                    'metadata' => $meta,
                ],
            ],
        ];

        $this->es->getEsClient()->update($params);

        return true;
    }
}
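setMemoryLimit() casts the result of ini_get('memory_limit') to int, which keeps only the leading digits of a shorthand value such as "128M" and drops the unit. The following is a minimal sketch of a unit-aware conversion, assuming the common K, M and G suffixes; shorthandToBytes() is a hypothetical helper and is not part of the inspected class.

```php
<?php

declare(strict_types=1);

/**
 * Convert a php.ini shorthand byte value ("512K", "128M", "1G", "-1") to bytes.
 * Hypothetical helper, shown only to illustrate the unit handling.
 */
function shorthandToBytes(string $value): int
{
    $value = trim($value);
    $number = (int) $value;                    // leading digits, e.g. 128 for "128M"
    $unit = strtolower(substr($value, -1));    // last character carries the unit, if any

    switch ($unit) {
        case 'g':
            return $number * 1024 ** 3;
        case 'm':
            return $number * 1024 ** 2;
        case 'k':
            return $number * 1024;
        default:
            return $number;                    // plain byte count, or -1 for "no limit"
    }
}
```

With such a helper, a value of -1 stays -1 and can be tested for directly before any arithmetic is done on the limit.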