Completed
Push — stable8.2 ( 35e240...680ee0 )
by Joas
29s
created

Scanner   F

Complexity

Total Complexity 88

Size/Duplication

Total Lines 403
Duplicated Lines 7.69 %

Coupling/Cohesion

Components 1
Dependencies 13

Test Coverage

Coverage 91.48%

Importance

Changes 4
Bugs 1 Features 3
Metric Value
c 4
b 1
f 3
dl 31
loc 403
ccs 204
cts 223
cp 0.9148
rs 3.4814
wmc 88
lcom 1
cbo 13

14 Methods

Rating  Name                   Duplication  Size  Complexity
A       __construct()          0            7     1
A       setUseTransactions()   0            3     1
A       getData()              0            7     2
F       scanFile()             0            70    24
A       removeFromCache()      0            7     2
A       addToCache()           10           14    3
A       updateCache()          7            11    3
A       getExistingChildren()  0            8     2
B       getNewChildren()       0            13    5
F       scanChildren()         3            76    27
A       isPartialFile()        0            10    3
B       backgroundScan()       0            21    7
A       setCacheActive()       0            3     1
B       scan()                 11           19    7

How to fix

Duplicated Code

Duplicate code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
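
As a rough illustration of that rule applied to this class: the default-`$reuse` computation is flagged as duplicated in both `scan()` and `scanChildren()`. The sketch below extracts it into one place. `ReuseFlags` and `resolveReuse()` are hypothetical names chosen for this example and do not exist in the original code; the constants simply mirror those declared in `Scanner`.

```php
<?php
// Hypothetical sketch: ReuseFlags and resolveReuse() are illustrative names,
// not part of OC\Files\Cache\Scanner. The constants mirror the ones in Scanner.
class ReuseFlags {
	const SCAN_RECURSIVE = true;
	const SCAN_SHALLOW = false;

	const REUSE_ETAG = 1;
	const REUSE_SIZE = 2;

	/**
	 * Resolve the "-1 means use the default" convention in a single place.
	 *
	 * @param bool $recursive one of the SCAN_* constants
	 * @param int $reuse a REUSE_* bitmask, or -1 to pick the default
	 * @return int the effective REUSE_* bitmask
	 */
	public static function resolveReuse($recursive, $reuse = -1) {
		if ($reuse !== -1) {
			// The caller passed an explicit bitmask, keep it unchanged.
			return $reuse;
		}
		// Same expression as in scan() and scanChildren(): shallow scans
		// also reuse the cached size, recursive scans only reuse the etag.
		return ($recursive === self::SCAN_SHALLOW)
			? self::REUSE_ETAG | self::REUSE_SIZE
			: self::REUSE_ETAG;
	}
}
```

Both methods could then start with a single call such as `$reuse = ReuseFlags::resolveReuse($recursive, $reuse);`, assuming the helper is made reachable from `Scanner`.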

Common duplication problems, and corresponding solutions are:

Complex Class

Tip: Before tackling complexity, eliminate any duplication first. This can often reduce the size of classes significantly.

Complex classes like Scanner often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields and methods that share the same prefixes or suffixes. You can also look at the cohesion graph to spot any unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use Scanner, and based on these observations, apply Extract Interface, too.
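
One possible reading of this advice for Scanner, as a hedged sketch only: the scanner-level lock handling in `scan()` could move into a small class of its own. `LockProvider`, `ScanLock`, `guarded()` and `RecordingLockProvider` are hypothetical names invented for this example; `LockProvider` is a simplified stand-in and not the real `OCP\Lock\ILockingProvider` API. The `try`/`finally` is also an addition of this sketch: the original code does not release the lock when an exception is thrown.

```php
<?php
// Hypothetical sketch of Extract Class + Extract Interface.
// LockProvider is a simplified stand-in, NOT the real OCP\Lock\ILockingProvider.
interface LockProvider {
	public function acquireLock($key, $type);
	public function releaseLock($key, $type);
}

// Extracted class: owns the lock key format and the acquire/release pairing.
class ScanLock {
	const LOCK_EXCLUSIVE = 2; // illustrative value only

	private $provider;
	private $storageId;

	public function __construct(LockProvider $provider, $storageId) {
		$this->provider = $provider;
		$this->storageId = $storageId;
	}

	/**
	 * Run $callback while holding the exclusive scanner lock for $path.
	 *
	 * @param string $path
	 * @param callable $callback
	 * @return mixed whatever $callback returns
	 */
	public function guarded($path, $callback) {
		$key = 'scanner::' . $this->storageId . '::' . $path;
		$this->provider->acquireLock($key, self::LOCK_EXCLUSIVE);
		try {
			return call_user_func($callback);
		} finally {
			// Always release, even if the callback throws.
			$this->provider->releaseLock($key, self::LOCK_EXCLUSIVE);
		}
	}
}

// In-memory provider that only records calls, used to demonstrate the sketch.
class RecordingLockProvider implements LockProvider {
	public $log = array();

	public function acquireLock($key, $type) {
		$this->log[] = 'acquire ' . $key;
	}

	public function releaseLock($key, $type) {
		$this->log[] = 'release ' . $key;
	}
}

$provider = new RecordingLockProvider();
$lock = new ScanLock($provider, 'home::alice');
$result = $lock->guarded('files', function () {
	return 42; // stands in for the actual scan work
});
```

Depending on a narrow interface rather than a concrete provider is what makes the recording test double possible here; whether that trade-off fits the real class is a design decision for its maintainers.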

<?php
/**
 * @author Arthur Schiwon <[email protected]>
 * @author Jörn Friedrich Dreyer <[email protected]>
 * @author Lukas Reschke <[email protected]>
 * @author Martin Mattel <[email protected]>
 * @author Michael Gapczynski <[email protected]>
 * @author Morris Jobke <[email protected]>
 * @author Olivier Paroz <[email protected]>
 * @author Owen Winkler <[email protected]>
 * @author Robin Appelman <[email protected]>
 * @author Robin McCorkell <[email protected]>
 * @author Thomas Müller <[email protected]>
 * @author Vincent Petry <[email protected]>
 *
 * @copyright Copyright (c) 2015, ownCloud, Inc.
 * @license AGPL-3.0
 *
 * This code is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Affero General Public License, version 3,
 * as published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU Affero General Public License for more details.
 *
 * You should have received a copy of the GNU Affero General Public License, version 3,
 * along with this program.  If not, see <http://www.gnu.org/licenses/>
 *
 */

namespace OC\Files\Cache;

use OC\Files\Filesystem;
use OC\Hooks\BasicEmitter;
use OCP\Config;
use OCP\Lock\ILockingProvider;

/**
 * Class Scanner
 *
 * Hooks available in scope \OC\Files\Cache\Scanner:
 *  - scanFile(string $path, string $storageId)
 *  - scanFolder(string $path, string $storageId)
 *  - postScanFile(string $path, string $storageId)
 *  - postScanFolder(string $path, string $storageId)
 *
 * @package OC\Files\Cache
 */
class Scanner extends BasicEmitter {
	/**
	 * @var \OC\Files\Storage\Storage $storage
	 */
	protected $storage;

	/**
	 * @var string $storageId
	 */
	protected $storageId;

	/**
	 * @var \OC\Files\Cache\Cache $cache
	 */
	protected $cache;

	/**
	 * @var boolean $cacheActive If true, perform cache operations, if false, do not affect cache
	 */
	protected $cacheActive;

	/**
	 * @var bool $useTransactions whether to use transactions
	 */
	protected $useTransactions = true;

	/**
	 * @var \OCP\Lock\ILockingProvider
	 */
	protected $lockingProvider;

	const SCAN_RECURSIVE = true;
	const SCAN_SHALLOW = false;

	const REUSE_ETAG = 1;
	const REUSE_SIZE = 2;

	public function __construct(\OC\Files\Storage\Storage $storage) {
		$this->storage = $storage;
		$this->storageId = $this->storage->getId();
		$this->cache = $storage->getCache();
		$this->cacheActive = !Config::getSystemValue('filesystem_cache_readonly', false);
		$this->lockingProvider = \OC::$server->getLockingProvider();
	}

	/**
	 * Whether to wrap the scanning of a folder in a database transaction
	 * On default transactions are used
	 *
	 * @param bool $useTransactions
	 */
	public function setUseTransactions($useTransactions) {
		$this->useTransactions = $useTransactions;
	}

	/**
	 * get all the metadata of a file or folder
	 * *
	 *
	 * @param string $path
	 * @return array an array of metadata of the file
	 */
	public function getData($path) {
		$data = $this->storage->getMetaData($path);
		if (is_null($data)) {
			\OCP\Util::writeLog('OC\Files\Cache\Scanner', "!!! Path '$path' is not accessible or present !!!", \OCP\Util::DEBUG);
		}
		return $data;
	}

	/**
	 * scan a single file and store it in the cache
	 *
	 * @param string $file
	 * @param int $reuseExisting
	 * @param int $parentId
	 * @param array | null $cacheData existing data in the cache for the file to be scanned
	 * @param bool $lock set to false to disable getting an additional read lock during scanning
	 * @return array an array of metadata of the scanned file
	 * @throws \OC\ServerNotAvailableException
	 * @throws \OCP\Lock\LockedException
	 */
	public function scanFile($file, $reuseExisting = 0, $parentId = -1, $cacheData = null, $lock = true) {
		if (!self::isPartialFile($file)
			and !Filesystem::isFileBlacklisted($file)
		) {
			if ($lock) {
				$this->storage->acquireLock($file, ILockingProvider::LOCK_SHARED, $this->lockingProvider);
			}
			$this->emit('\OC\Files\Cache\Scanner', 'scanFile', array($file, $this->storageId));
			\OC_Hook::emit('\OC\Files\Cache\Scanner', 'scan_file', array('path' => $file, 'storage' => $this->storageId));
			$data = $this->getData($file);
			if ($data) {
				$parent = dirname($file);
				if ($parent === '.' or $parent === '/') {
					$parent = '';
				}
				if ($parentId === -1) {
					$parentId = $this->cache->getId($parent);
				}

				// scan the parent if it's not in the cache (id -1) and the current file is not the root folder
				if ($file and $parentId === -1) {
					$parentData = $this->scanFile($parent);
					$parentId = $parentData['fileid'];
				}
				if ($parent) {
					$data['parent'] = $parentId;
				}
				if (is_null($cacheData)) {
					$cacheData = $this->cache->get($file);
				}
				if ($cacheData and $reuseExisting and isset($cacheData['fileid'])) {
					// prevent empty etag
					if (empty($cacheData['etag'])) {
						$etag = $data['etag'];
					} else {
						$etag = $cacheData['etag'];
					}
					$fileId = $cacheData['fileid'];
					$data['fileid'] = $fileId;
					// only reuse data if the file hasn't explicitly changed
					if (isset($data['storage_mtime']) && isset($cacheData['storage_mtime']) && $data['storage_mtime'] === $cacheData['storage_mtime']) {
						$data['mtime'] = $cacheData['mtime'];
						if (($reuseExisting & self::REUSE_SIZE) && ($data['size'] === -1)) {
							$data['size'] = $cacheData['size'];
						}
						if ($reuseExisting & self::REUSE_ETAG) {
							$data['etag'] = $etag;
						}
					}
					// Only update metadata that has changed
					$newData = array_diff_assoc($data, $cacheData);
				} else {
					$newData = $data;
					$fileId = -1;
				}
				if (!empty($newData)) {
					$data['fileid'] = $this->addToCache($file, $newData, $fileId);
				}
				$this->emit('\OC\Files\Cache\Scanner', 'postScanFile', array($file, $this->storageId));
				\OC_Hook::emit('\OC\Files\Cache\Scanner', 'post_scan_file', array('path' => $file, 'storage' => $this->storageId));
			} else {
				$this->removeFromCache($file);
			}
			if ($lock) {
				$this->storage->releaseLock($file, ILockingProvider::LOCK_SHARED, $this->lockingProvider);
			}
			return $data;
		}
		return null;
	}

	protected function removeFromCache($path) {
		\OC_Hook::emit('Scanner', 'removeFromCache', array('file' => $path));
		$this->emit('\OC\Files\Cache\Scanner', 'removeFromCache', array($path));
		if ($this->cacheActive) {
			$this->cache->remove($path);
		}
	}

	/**
	 * @param string $path
	 * @param array $data
	 * @param int $fileId
	 * @return int the id of the added file
	 */
	protected function addToCache($path, $data, $fileId = -1) {
		\OC_Hook::emit('Scanner', 'addToCache', array('file' => $path, 'data' => $data));
		$this->emit('\OC\Files\Cache\Scanner', 'addToCache', array($path, $this->storageId, $data));
		// Scrutinizer: this code seems to be duplicated across your project.
		if ($this->cacheActive) {
			if ($fileId !== -1) {
				$this->cache->update($fileId, $data);
				return $fileId;
			} else {
				return $this->cache->put($path, $data);
			}
		} else {
			return -1;
		}
	}

	/**
	 * @param string $path
	 * @param array $data
	 * @param int $fileId
	 */
	protected function updateCache($path, $data, $fileId = -1) {
		\OC_Hook::emit('Scanner', 'addToCache', array('file' => $path, 'data' => $data));
		$this->emit('\OC\Files\Cache\Scanner', 'updateCache', array($path, $this->storageId, $data));
		// Scrutinizer: this code seems to be duplicated across your project.
		if ($this->cacheActive) {
			if ($fileId !== -1) {
				$this->cache->update($fileId, $data);
			} else {
				$this->cache->put($path, $data);
			}
		}
	}

	/**
	 * scan a folder and all its children
	 *
	 * @param string $path
	 * @param bool $recursive
	 * @param int $reuse
	 * @param bool $lock set to false to disable getting an additional read lock during scanning
	 * @return array an array of the meta data of the scanned file or folder
	 */
	public function scan($path, $recursive = self::SCAN_RECURSIVE, $reuse = -1, $lock = true) {
		// Scrutinizer: this code seems to be duplicated across your project.
		if ($reuse === -1) {
			$reuse = ($recursive === self::SCAN_SHALLOW) ? self::REUSE_ETAG | self::REUSE_SIZE : self::REUSE_ETAG;
		}
		// Scrutinizer: this code seems to be duplicated across your project.
		if ($lock) {
			$this->lockingProvider->acquireLock('scanner::' . $this->storageId . '::' . $path, ILockingProvider::LOCK_EXCLUSIVE);
			$this->storage->acquireLock($path, ILockingProvider::LOCK_SHARED, $this->lockingProvider);
		}
		$data = $this->scanFile($path, $reuse, -1, null, $lock);
		if ($data and $data['mimetype'] === 'httpd/unix-directory') {
			$size = $this->scanChildren($path, $recursive, $reuse, $data, $lock);
			$data['size'] = $size;
		}
		// Scrutinizer: this code seems to be duplicated across your project.
		if ($lock) {
			$this->storage->releaseLock($path, ILockingProvider::LOCK_SHARED, $this->lockingProvider);
			$this->lockingProvider->releaseLock('scanner::' . $this->storageId . '::' . $path, ILockingProvider::LOCK_EXCLUSIVE);
		}
		return $data;
	}

	/**
	 * Get the children currently in the cache
	 *
	 * @param int $folderId
	 * @return array[]
	 */
	protected function getExistingChildren($folderId) {
		$existingChildren = array();
		$children = $this->cache->getFolderContentsById($folderId);
		foreach ($children as $child) {
			$existingChildren[$child['name']] = $child;
		}
		return $existingChildren;
	}

	/**
	 * Get the children from the storage
	 *
	 * @param string $folder
	 * @return string[]
	 */
	protected function getNewChildren($folder) {
		$children = array();
		if ($dh = $this->storage->opendir($folder)) {
			if (is_resource($dh)) {
				while (($file = readdir($dh)) !== false) {
					if (!Filesystem::isIgnoredDir($file)) {
						$children[] = $file;
					}
				}
			}
		}
		return $children;
	}

	/**
	 * scan all the files and folders in a folder
	 *
	 * @param string $path
	 * @param bool $recursive
	 * @param int $reuse
	 * @param array $folderData existing cache data for the folder to be scanned
	 * @param bool $lock set to false to disable getting an additional read lock during scanning
	 * @return int the size of the scanned folder or -1 if the size is unknown at this stage
	 */
	protected function scanChildren($path, $recursive = self::SCAN_RECURSIVE, $reuse = -1, $folderData = null, $lock = true) {
		// Scrutinizer: this code seems to be duplicated across your project.
		if ($reuse === -1) {
			$reuse = ($recursive === self::SCAN_SHALLOW) ? self::REUSE_ETAG | self::REUSE_SIZE : self::REUSE_ETAG;
		}
		$this->emit('\OC\Files\Cache\Scanner', 'scanFolder', array($path, $this->storageId));
		$size = 0;
		$childQueue = array();
		if (is_array($folderData) and isset($folderData['fileid'])) {
			$folderId = $folderData['fileid'];
		} else {
			$folderId = $this->cache->getId($path);
		}
		$existingChildren = $this->getExistingChildren($folderId);
		$newChildren = $this->getNewChildren($path);

		if ($this->useTransactions) {
			\OC_DB::beginTransaction();
		}
		$exceptionOccurred = false;
		foreach ($newChildren as $file) {
			$child = ($path) ? $path . '/' . $file : $file;
			try {
				$existingData = isset($existingChildren[$file]) ? $existingChildren[$file] : null;
				$data = $this->scanFile($child, $reuse, $folderId, $existingData, $lock);
				if ($data) {
					if ($data['mimetype'] === 'httpd/unix-directory' and $recursive === self::SCAN_RECURSIVE) {
						$childQueue[$child] = $data;
					} else if ($data['size'] === -1) {
						$size = -1;
					} else if ($size !== -1) {
						$size += $data['size'];
					}
				}
			} catch (\Doctrine\DBAL\DBALException $ex) {
				// Scrutinizer (bug): the class Doctrine\DBAL\DBALException does not exist. Did you
				// forget a USE statement, or did you not list all dependencies? Scrutinizer analyzes
				// composer.json/composer.lock, if available, to determine the classes and functions
				// defined by your dependencies; this class was found neither there nor in the
				// analyzed files of the repository.

				// might happen if inserting duplicate while a scanning
				// process is running in parallel
				// log and ignore
				\OCP\Util::writeLog('core', 'Exception while scanning file "' . $child . '": ' . $ex->getMessage(), \OCP\Util::DEBUG);
				$exceptionOccurred = true;
			} catch (\OCP\Lock\LockedException $e) {
				if ($this->useTransactions) {
					\OC_DB::rollback();
				}
				throw $e;
			}
		}
		$removedChildren = \array_diff(array_keys($existingChildren), $newChildren);
		foreach ($removedChildren as $childName) {
			$child = ($path) ? $path . '/' . $childName : $childName;
			$this->removeFromCache($child);
		}
		if ($this->useTransactions) {
			\OC_DB::commit();
		}
		if ($exceptionOccurred) {
			// It might happen that the parallel scan process has already
			// inserted mimetypes but those weren't available yet inside the transaction
			// To make sure to have the updated mime types in such cases,
			// we reload them here
			\OC::$server->getMimeTypeLoader()->reset();
		}

		foreach ($childQueue as $child => $childData) {
			$childSize = $this->scanChildren($child, self::SCAN_RECURSIVE, $reuse, $childData, $lock);
			if ($childSize === -1) {
				$size = -1;
			} else if ($size !== -1) {
				$size += $childSize;
			}
		}
		if (!is_array($folderData) or !isset($folderData['size']) or $folderData['size'] !== $size) {
			$this->updateCache($path, array('size' => $size), $folderId);
		}
		$this->emit('\OC\Files\Cache\Scanner', 'postScanFolder', array($path, $this->storageId));
		return $size;
	}

	/**
	 * check if the file should be ignored when scanning
	 * NOTE: files with a '.part' extension are ignored as well!
	 *       prevents unfinished put requests to be scanned
	 *
	 * @param string $file
	 * @return boolean
	 */
	public static function isPartialFile($file) {
		if (pathinfo($file, PATHINFO_EXTENSION) === 'part') {
			return true;
		}
		if (strpos($file, '.part/') !== false) {
			return true;
		}

		return false;
	}

	/**
	 * walk over any folders that are not fully scanned yet and scan them
	 */
	public function backgroundScan() {
		$lastPath = null;
		while (($path = $this->cache->getIncomplete()) !== false && $path !== $lastPath) {
			try {
				// Scrutinizer (bug): $path, defined by $this->cache->getIncomplete() in the loop
				// condition above, can also be of type boolean; however, scan() only seems to
				// accept string. Maybe add an additional type check? If a function can return
				// multiple different values and you cannot be sure to receive a single type in
				// this context, an additional type check is recommended.
				$this->scan($path, self::SCAN_RECURSIVE, self::REUSE_ETAG);
				\OC_Hook::emit('Scanner', 'correctFolderSize', array('path' => $path));
				if ($this->cacheActive) {
					$this->cache->correctFolderSize($path);
				}
			} catch (\OCP\Files\StorageInvalidException $e) {
				// skip unavailable storages
			} catch (\OCP\Files\StorageNotAvailableException $e) {
				// skip unavailable storages
			} catch (\OCP\Lock\LockedException $e) {
				// skip unavailable storages
			}
			// FIXME: this won't proceed with the next item, needs revamping of getIncomplete()
			// to make this possible
			$lastPath = $path;
		}
	}

	/**
	 * Set whether the cache is affected by scan operations
	 *
	 * @param boolean $active The active state of the cache
	 */
	public function setCacheActive($active) {
		$this->cacheActive = $active;
	}
}