Completed
Push — spip-2-stable ( d79e3d )
by cam
30:47 queued 16:59
created

distant.php ➔ recuperer_infos_distantes()   F

Complexity

Conditions 32
Paths 12920

Size

Total Lines 121
Code Lines 67

Duplication

Lines 16
Ratio 13.22 %

Importance

Changes 2
Bugs 0 Features 0
Metric Value
cc 32
eloc 67
nc 12920
nop 3
dl 16
loc 121
rs 2
c 2
b 0
f 0

How to fix   Long Method    Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include Extract Method.
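As a minimal sketch of extracting a method from the function rated above, the Content-Type parsing that recuperer_infos_distantes() does inline could move into its own helper. The name extraire_mime_type() is hypothetical and not part of SPIP; the regular expression is the one from the listing.

```php
<?php
// Hypothetical helper (not part of SPIP): isolates the Content-Type parsing
// that recuperer_infos_distantes() performs inline.
function extraire_mime_type($headers) {
	// Prefix a newline so the header also matches on the first line.
	if (preg_match(",\nContent-Type: *([^[:space:];]*),i", "\n$headers", $regs)) {
		return trim($regs[1]);
	}
	return ''; // unknown
}

$headers = "Server: example\nContent-Type: text/html; charset=utf-8\nContent-Length: 42\n";
echo extraire_mime_type($headers), "\n"; // prints text/html
```

The caller would then read `$mime_type = extraire_mime_type($headers);`, and the comment that used to explain the block becomes the function name.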

<?php

/***************************************************************************\
 *  SPIP, Systeme de publication pour l'internet                           *
 *                                                                         *
 *  Copyright (c) 2001-2016                                                *
 *  Arnaud Martin, Antoine Pitrou, Philippe Riviere, Emmanuel Saint-James  *
 *                                                                         *
 *  This program is free software distributed under the GNU/GPL license.   *
 *  For details, see the COPYING.txt file or the online help.              *
\***************************************************************************/

if (!defined('_ECRIRE_INC_VERSION')) return;

if (!defined('_INC_DISTANT_VERSION_HTTP')) define('_INC_DISTANT_VERSION_HTTP', "HTTP/1.0");
if (!defined('_INC_DISTANT_CONTENT_ENCODING')) define('_INC_DISTANT_CONTENT_ENCODING', "gzip");
if (!defined('_INC_DISTANT_USER_AGENT')) define('_INC_DISTANT_USER_AGENT', 'SPIP-' . $GLOBALS['spip_version_affichee'] . " (" . $GLOBALS['home_server'] . ")");

if (!defined('_REGEXP_COPIE_LOCALE')) define('_REGEXP_COPIE_LOCALE', ',' .
       preg_replace('@^https?:@', 'https?:', $GLOBALS['meta']['adresse_site'])
       . "/?spip.php[?]action=acceder_document.*file=(.*)$,");

//@define('_COPIE_LOCALE_MAX_SIZE',2097152); // size in bytes (inc/utils already defines it)

/**
 * Creates the local copy of a remote file if needed
 *
 *
 * Takes a path relative to the root directory, or a URL
 * Returns a path relative to the root directory, or false
 *
 * http://doc.spip.org/@copie_locale
 *
 * @param $source
 * @param string $mode
 *   'test' - only test
 *   'auto' - load if needed
 *   'modif' - if already present, load only if If-Modified-Since
 *   'force' - always load (update)
 * @param string $local
Bug introduced by
There is no parameter named $local. Was it maybe removed?

This check looks for PHPDoc comments describing methods or function parameters that do not exist on the corresponding method or function.

Consider the following example. The parameter $italy is not defined by the method finale(...).

/**
 * @param array $germany
 * @param array $island
 * @param array $italy
 */
function finale($germany, $island) {
    return "2:1";
}

The most likely cause is that the parameter was removed, but the annotation was not.

 *   allows specifying the name of the local file (e.g. to store a cache rather than an IMG document)
 * @return bool|string
 */
function copie_locale($source, $mode='auto') {

	// if the source is one of our own protected-document URLs, return the path
	if ($mode !== 'force' AND preg_match(_REGEXP_COPIE_LOCALE, $source, $local)) {
		$source = urldecode($local[1]);
		$local = substr(_DIR_IMG,strlen(_DIR_RACINE))  . $source;
		return (@file_exists($local) OR @file_exists(_DIR_IMG . $source )) ? $local : false;
	}
	$local = fichier_copie_locale($source);
	$localrac = _DIR_RACINE.$local;
	$t = ($mode=='force') ? false  : @file_exists($localrac);

	// test whether the file exists
	if ($mode=='test') return $t ? $local : '';

	// if $local = '' the file was refused by fichier_copie_locale(),
	// for example a file that is not among our documents;
	// in that case do not try to download it only to fail afterwards

	if (!$local) return false;

	// otherwise see whether we must/can download it
	if ($local==$source OR !preg_match(',^\w+://,', $source))
		return $local;

	if ($mode=='modif' OR !$t){
		// use a unique temporary file to handle failures during the fetch
		// and possible concurrent fetches
		include_spip("inc/acces");
		$localractmp = "$localrac.".creer_uniqid().".tmp";
		$res = recuperer_page($source, $localractmp, false, _COPIE_LOCALE_MAX_SIZE, '', '', false, $t ? filemtime($localrac) : '');
		if ($res) {
			// on success, delete the old file and rename
			spip_log("copie_locale : recuperation $source sur $localractmp taille $res OK, renommage en $localrac");
			spip_unlink($localrac);
			@rename($localractmp, $localrac);
Security Best Practice introduced by
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.

If you suppress an error, we recommend checking for the error condition explicitly:

// For example instead of
@mkdir($dir);

// Better use
if (@mkdir($dir) === false) {
    throw new \RuntimeException('The directory '.$dir.' could not be created.');
}
		} else {
			// otherwise delete the temporary file, which failed and is probably corrupted...
			spip_log("copie_locale : Echec recuperation $source sur $localractmp, fichier supprime",_LOG_INFO_IMPORTANTE);
			spip_unlink($localractmp);
		}
		if (!$res) return $t ? $local : false;

		// for possible indexing
		pipeline('post_edition',
			array(
				'args' => array(
					'operation' => 'copie_locale',
					'source' => $source,
					'fichier' => $local
				),
				'data' => null
			)
		);
	}

	return $local;
}
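The suppressed rename() flagged in copie_locale() could be checked explicitly. The sketch below is only one possible shape for that check: remplacer_fichier() is a hypothetical helper, not a SPIP function, and it uses plain unlink() where SPIP would presumably call spip_unlink().

```php
<?php
// Hypothetical helper (not part of SPIP): replaces $dest with $tmp and
// reports failure instead of silently ignoring it.
function remplacer_fichier($tmp, $dest) {
	if (file_exists($dest) && !@unlink($dest)) {
		return false; // the old copy could not be removed
	}
	if (!@rename($tmp, $dest)) {
		@unlink($tmp); // do not leave a stray temporary file behind
		return false;
	}
	return true;
}

// Usage with throwaway files in the system temp directory.
$dest = tempnam(sys_get_temp_dir(), 'dst');
$tmp = tempnam(sys_get_temp_dir(), 'tmp');
file_put_contents($tmp, 'new content');
var_dump(remplacer_fichier($tmp, $dest)); // bool(true)
unlink($dest);
```

With a helper like this, copie_locale() could log the failure and return false (or the previous copy) rather than report success for a file that was never moved into place.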

// http://doc.spip.org/@prepare_donnees_post
function prepare_donnees_post($donnees, $boundary = '') {

	// let the caller that requested the POST format its own data,
	// for a SOAP call for example;
	// the header is separated from the data by a double line break;
	// here we normalize all line breaks (\r\n, \r or \n) to \r\n
	if (is_string($donnees) && strlen($donnees)){
		$entete = "";
		// first convert every \r\n and \r to a plain \n
		$donnees = str_replace("\r\n","\n",$donnees);
		$donnees = str_replace("\r","\n",$donnees);
		// a double line break marks the end of the header and the start of the data
		$p = strpos($donnees, "\n\n");
		if ($p!==FALSE){
			$entete = str_replace("\n", "\r\n", substr($donnees, 0, $p+1));
			$donnees = substr($donnees, $p+2);
		}
		$chaine = str_replace("\n", "\r\n", $donnees);
	}
	else {
		/* automatic boundary */
		// With more than 500 bytes of data, switch to a multipart boundary
		if ($boundary===''){
			$taille = 0;
			foreach ($donnees as $cle => $valeur){
				if (is_array($valeur)){
					foreach ($valeur as $val2){
						$taille += strlen($val2);
					}
				} else {
					// should spip_strlen() from inc/charsets be used here?
					$taille += strlen($valeur);
				}
			}
			if ($taille>500){
				$boundary = substr(md5(rand() . 'spip'), 0, 8);
			}
		}

		if (is_string($boundary) and strlen($boundary)){
			// build an HTTP string for a POST with a boundary
			$entete = "Content-Type: multipart/form-data; boundary=$boundary\r\n";
			$chaine = '';
			if (is_array($donnees)) {
				foreach ($donnees as $cle => $valeur) {
					$chaine .= "\r\n--$boundary\r\n";
					$chaine .= "Content-Disposition: form-data; name=\"$cle\"\r\n";
					$chaine .= "\r\n";
					$chaine .= $valeur;
				}
				$chaine .= "\r\n--$boundary\r\n";
			}
		} else {
			// build a plain HTTP string for a POST
			$entete = 'Content-Type: application/x-www-form-urlencoded'."\r\n";
			$chaine = array();
			if (is_array($donnees)) {
				foreach ($donnees as $cle => $valeur) {
					if (is_array($valeur)) {
						foreach ($valeur as $val2) {
							$chaine[] = rawurlencode($cle).'='.rawurlencode($val2);
						}
					} else {
						$chaine[] = rawurlencode($cle).'='.rawurlencode($valeur);
					}
				}
				$chaine = implode('&', $chaine);
			} else {
				$chaine = $donnees;
			}
		}
	}
	return array($entete, $chaine);
}
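To make the urlencoded branch of prepare_donnees_post() concrete, here is a small standalone sketch of the same encoding rule. The function name encoder_formulaire() and the sample data are assumptions for illustration only.

```php
<?php
// Hypothetical standalone version of the urlencoded branch of
// prepare_donnees_post(); the function name is an assumption.
function encoder_formulaire(array $donnees) {
	$paires = array();
	foreach ($donnees as $cle => $valeur) {
		// An array value yields one key=value pair per element.
		foreach ((array) $valeur as $val) {
			$paires[] = rawurlencode($cle) . '=' . rawurlencode($val);
		}
	}
	return implode('&', $paires);
}

echo encoder_formulaire(array('q' => 'spip cms', 'tag' => array('a', 'b'))), "\n";
// prints: q=spip%20cms&tag=a&tag=b
```

Repeating the key for each array element matches what the listing does. Note that rawurlencode() encodes a space as %20 rather than +.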

//
// Fetches a page from the net
// and, if needed, transcodes it to the local charset
//
// options: get_headers to also retrieve the headers
// taille_max: stop reading content beyond this size (0 = headers only ==> HEAD)
// By default taille_max = 1 MB.
// datas, a string or an array, to POST data
// boundary, to force sending with that method
// and refuser_gz to force refusing compression (case of spell-checking servers)
// date_verif, a Unix timestamp, to stop fetching if the remote page has not been modified since the given date
// uri_referer, to specify a different referer
// The second argument ($trans):
// * if it is a long string, it is the name of a file
//   into which the page is written directly
// * if it is true/null, it corresponds to an encoding/charset request
// http://doc.spip.org/@recuperer_page
function recuperer_page($url, $trans = false, $get_headers = false,
                        $taille_max = null, $datas = '', $boundary = '', $refuser_gz = false,
                        $date_verif = '', $uri_referer = ''){
	$gz = false;
Unused Code introduced by
$gz is not used, you could remove the assignment.

This check looks for variable assignments that are either overwritten by other assignments or never used afterwards.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead: the first because $myVar is never used, the second because $higher is always overwritten on every possible execution path.


	// $copy = copy to a file?
	$copy = (is_string($trans) AND strlen($trans)>5); // avoid "false" :-)

	if (is_null($taille_max))
		$taille_max = $copy ? _COPIE_LOCALE_MAX_SIZE : 1048576;

	// Accept feed:// URLs and URLs that are missing the http:// prefix
	$url = preg_replace(',^feed://,i', 'http://', $url);
	if (!preg_match(',^[a-z]+://,i', $url)) $url = 'http://' . $url;

	if ($taille_max==0)
		$get = 'HEAD';
	else
		$get = 'GET';

	if (!empty($datas)) {
		$get = 'POST';
		list($type, $postdata) = prepare_donnees_post($datas, $boundary);
		$datas = $type . 'Content-Length: ' . strlen($postdata) . "\r\n\r\n" . $postdata;
	}

	// at most ten attempts in case of 301 headers...
	for ($i = 0; $i<10; $i++){
		$url = recuperer_lapage($url, $trans, $get, $taille_max, $datas, $refuser_gz, $date_verif, $uri_referer);
Bug introduced by
It seems like $trans can also be of type string; however, recuperer_lapage() only seems to accept boolean. Maybe add an additional type check?

If a method or function can return multiple different values, and you are not sure that only a single type can reach this context, we recommend adding an additional type check:

/**
 * @return array|string
 */
function returnsDifferentValues($x) {
    if ($x) {
        return 'foo';
    }

    return array();
}

$x = returnsDifferentValues($y);
if (is_array($x)) {
    // $x is an array.
}

If this is a common case that PHP Analyzer should handle natively, please let us know by opening an issue.
		if (!$url) return false;
		if (is_array($url)){
			list($headers, $result) = $url;
			return ($get_headers ? $headers . "\n" : '') . $result;
		} else spip_log("recuperer page recommence sur $url");
	}
}

// same arguments as above (almost)
// returns the URL on a 301, an array (headers, body) on success, false otherwise
// if $trans is null -> only the headers are wanted
// if $trans is a string, it is the name of a file to write into directly
// http://doc.spip.org/@recuperer_lapage
function recuperer_lapage($url, $trans = false, $get = 'GET', $taille_max = 1048576, $datas = '', $refuser_gz = false, $date_verif = '', $uri_referer = ''){
	// $copy = copy to a file?
	$copy = (is_string($trans) AND strlen($trans)>5); // avoid "false" :-)

	// when writing straight to a file, refuse gzip so that
	// nothing has to be handled in memory
	if ($copy)
		$refuser_gz = true;

	// open the connection and send the request and its headers
	list($f, $fopen) = init_http($get, $url, $refuser_gz, $uri_referer, $datas, _INC_DISTANT_VERSION_HTTP, $date_verif);
	if (!$f){
		spip_log("ECHEC init_http $url");
		return false;
	}

	// Except in fopen mode, send the input stream
	// and read the response headers
	if ($fopen)
		$headers = '';
	else {
		$headers = recuperer_entetes($f, $date_verif);
		if (is_numeric($headers)){
			fclose($f);
			// Inexplicable workaround to counter
			// the liberty-curbing actions of the Middle Kingdom
			if ($headers){
				spip_log("HTTP status $headers pour $url");
				return false;
			}
			elseif ($result = @file_get_contents($url))
				return array('', $result);
			else
				return false;
		}
		if (!is_array($headers)){ // Location case
			fclose($f);
			include_spip('inc/filtres');
			return suivre_lien($url, $headers);
		}
		$headers = join('', $headers);
	}

	if ($trans===NULL) return array($headers, '');

	// if the body must be decompressed, go through a temporary file,
	// otherwise memory blows up on large streams

	$gz = preg_match(",\bContent-Encoding: .*gzip,is", $headers) ?
		(_DIR_TMP . md5(uniqid(mt_rand())) . '.tmp.gz') : '';

#	spip_log("entete ($trans $copy $gz)\n$headers");
	$result = recuperer_body($f, $taille_max, $gz ? $gz : ($copy ? $trans : ''));
	fclose($f);
	if (!$result)
		return array($headers, $result);

	// Decompress if needed
	if ($gz){
		$result = join('', gzfile($gz));
		supprimer_fichier($gz);
	}
	// Should it be imported into our local charset?
	if ($trans===true){
		include_spip('inc/charsets');
		$result = transcoder_page($result, $headers);
	}

	return array($headers, $result);
}

// http://doc.spip.org/@recuperer_body
function recuperer_body($f, $taille_max = 1048576, $fichier = ''){
	$taille = 0;
	$result = '';
	$fp = false;
	if ($fichier){
		$fp = spip_fopen_lock($fichier, 'w', LOCK_EX);
		if (!$fp)
			return false;
		$result = 0; // the file size is returned
	}
	while (!feof($f) AND $taille<$taille_max){
		$res = fread($f, 16384);
		$taille += strlen($res);
		if ($fp){
			fwrite($fp, $res);
			$result = $taille;
		}
		else
			$result .= $res;
	}
	if ($fp)
		spip_fclose_unlock($fp);
	return $result;
}
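One consequence of the loop in recuperer_body() is that the size limit is only tested between chunks, so the result can exceed $taille_max by up to one chunk. The sketch below reproduces that behaviour on an in-memory stream. lire_flux_limite() and its $bloc parameter are hypothetical, added only so the effect shows up with small numbers.

```php
<?php
// Hypothetical reduction of the read loop in recuperer_body(), with a
// configurable chunk size so the effect is visible on a small stream.
function lire_flux_limite($f, $taille_max, $bloc = 16384) {
	$taille = 0;
	$result = '';
	while (!feof($f) && $taille < $taille_max) {
		$res = fread($f, $bloc);
		if ($res === false) {
			break; // read error
		}
		$taille += strlen($res);
		$result .= $res;
	}
	return $result;
}

// In-memory stream standing in for the socket.
$f = fopen('php://memory', 'w+');
fwrite($f, str_repeat('a', 100));
rewind($f);
echo strlen(lire_flux_limite($f, 10, 8)), "\n"; // prints 16: two chunks of 8
fclose($f);
```

If a hard cap matters, the read length could be clamped to `min($bloc, $taille_max - $taille)`; whether SPIP relies on the looser behaviour is not stated in the listing.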

// Reads the HTTP response headers from socket $f and returns:
// the (string) value of the Location header if one was found
// the (numeric) status value if it is not 200, notably Not-Modified
// the array of headers in all other cases

// http://doc.spip.org/@recuperer_entetes
function recuperer_entetes($f, $date_verif = ''){
	$s = @trim(fgets($f, 16384));

	if (!preg_match(',^HTTP/[0-9]+\.[0-9]+ ([0-9]+),', $s, $r)){
		return 0;
	}
	$status = intval($r[1]);
	$headers = array();
	$not_modif = $location = false;
	while ($s = trim(fgets($f, 16384))){
		$headers[] = $s . "\n";
		preg_match(',^([^:]*): *(.*)$,i', $s, $r);
		list(, $d, $v) = $r;
		if (strtolower(trim($d))=='location' AND $status>=300 AND $status<400){
			$location = $v;
		}
		elseif ($date_verif AND ($d=='Last-Modified')) {
			if ($date_verif>=strtotime($v)){
				// Case where the remote page has not changed since
				// the last visit
				$not_modif = true;
			}
		}
	}

	if ($location)
		return $location;
	if ($status!=200 or $not_modif)
		return $status;
	return $headers;
}

// If we must keep a local copy of remote files, it may as well be
// in a canonical place -- a bijective mapping would be even better,
// but right now I cannot think of a way, given the limitations
// of filesystems
// http://doc.spip.org/@nom_fichier_copie_locale
function nom_fichier_copie_locale($source, $extension){
	if (version_compare($spip_version_branche,"3.0.0") < 0)
Bug introduced by
The variable $spip_version_branche does not exist. Did you forget to declare it?

This check marks access to variables or properties that have not been declared yet. While PHP has no explicit notion of declaring a variable, accessing it before a value is assigned to it is most likely a bug.

		include_spip('inc/getdocument');
	else
		include_spip('inc/documents');
	$d = creer_repertoire_documents('distant'); # IMG/distant/
	$d = sous_repertoire($d, $extension); # IMG/distant/pdf/

	// always behave as if we were at the root
	if (_DIR_RACINE)
		$d = preg_replace(',^' . preg_quote(_DIR_RACINE) . ',', '', $d);

	$m = md5($source);

	return $d
		. substr(preg_replace(',[^\w-],', '', basename($source)) . '-' . $m, 0, 12)
		. substr($m, 0, 4)
		. ".$extension";
}
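The undeclared $spip_version_branche flagged in nom_fichier_copie_locale() is a function-scope issue: inside a function a bare variable name does not see the global, so the comparison presumably always runs against an empty value and takes the first branch. A minimal sketch of reading the global explicitly follows. The helper name est_avant_spip3() and the version value are assumptions for illustration.

```php
<?php
// Assumed example value; SPIP itself defines this global elsewhere.
$GLOBALS['spip_version_branche'] = '2.1.30';

// Hypothetical helper (not part of SPIP): reads the branch from $GLOBALS,
// because a bare $spip_version_branche is undefined inside a function.
function est_avant_spip3() {
	$branche = isset($GLOBALS['spip_version_branche'])
		? $GLOBALS['spip_version_branche']
		: '';
	return version_compare($branche, '3.0.0') < 0;
}

var_dump(est_avant_spip3()); // bool(true) for the 2.1.30 value above
```

Inside the original function, a `global $spip_version_branche;` declaration or a direct `$GLOBALS['spip_version_branche']` read would have the same effect.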

//
// Returns the name of the local copy of the source
//
// http://doc.spip.org/@fichier_copie_locale
function fichier_copie_locale($source){
	// If it is already local, nothing to do
	if (!preg_match(',^\w+://,', $source)){
		if (_DIR_RACINE)
			$source = preg_replace(',^' . preg_quote(_DIR_RACINE) . ',', '', $source);
		return $source;
	}

	// optimization: check whether the extension can be guessed from the URL and whether the file
	// has already been copied locally with that extension;
	// if so it is reliable and there is no need to query the database
	$path_parts = pathinfo($source);
	$ext = $path_parts ? $path_parts['extension'] : '';
	if ($ext
	AND preg_match(',^\w+$,', $ext) // no php?foo=1&...
	AND $f = nom_fichier_copie_locale($source, $ext)
	AND file_exists(_DIR_RACINE . $f)
	)
		return $f;


	// If it is already in the documents table,
	// return the name of its potential copy

	$ext = sql_getfetsel("extension", "spip_documents", "fichier=" . sql_quote($source) . " AND distant='oui' AND extension <> ''");


	if ($ext) return nom_fichier_copie_locale($source, $ext);

	// check whether the extension given in the file name is allowed
	// and whether the file has already been fetched

	$ext = $path_parts ? $path_parts['extension'] : '';

	if ($ext AND sql_getfetsel("extension", "spip_types_documents", "extension=" . sql_quote($ext))){
		$f = nom_fichier_copie_locale($source, $ext);
		if (file_exists(_DIR_RACINE . $f))
			return $f;
	}

	// Ping to see whether its extension is known and allowed,
	// caching the result of the ping

	$cache = sous_repertoire(_DIR_CACHE, 'rid') . md5($source);
	if (!@file_exists($cache)
		OR !$path_parts = @unserialize(spip_file_get_contents($cache))
		OR _request('var_mode')=='recalcul'
	){
		$path_parts = recuperer_infos_distantes($source, 0, false);
		ecrire_fichier($cache, serialize($path_parts));
	}
	$ext = $path_parts ? $path_parts['extension'] : '';
	if ($ext AND sql_getfetsel("extension", "spip_types_documents", "extension=" . sql_quote($ext))){
		return nom_fichier_copie_locale($source, $ext);
	}
	spip_log("pas de copie locale pour $source");
}


// Fetch information about a remote document without downloading too much of it
#$a['body'] = string
#$a['type_image'] = boolean
#$a['titre'] = string
#$a['largeur'] = intval
#$a['hauteur'] = intval
#$a['taille'] = intval
#$a['extension'] = string
#$a['fichier'] = string

// http://doc.spip.org/@recuperer_infos_distantes
function recuperer_infos_distantes($source, $max = 0, $charger_si_petite_image = true){

	# load the MIME type aliases
	include_spip('base/typedoc');
	global $mime_alias;

	$a = array();
	$mime_type = '';
	// Directly load the start of images and HTML files
	// so as to gather as much information as possible (title, size, etc.). If
	// that fails, the user will have to enter it...
	if ($headers = recuperer_page($source, false, true, $max, '', '', true)){
		list($headers, $a['body']) = preg_split(',\n\n,', $headers, 2);

		if (preg_match(",\nContent-Type: *([^[:space:];]*),i", "\n$headers", $regs))
Duplication introduced by
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

			$mime_type = (trim($regs[1]));
		else
			$mime_type = ''; // unknown

		// Apply the aliases
		while (isset($mime_alias[$mime_type]))
			$mime_type = $mime_alias[$mime_type];

		// If the MIME type is meaningless
		// (text/plain, application/octet-stream or empty),
		// the server may not know
		// what it is serving; try to detect the type from the URL's extension
		// or from Content-Disposition: attachment; filename=...
		$t = null;
		if (in_array($mime_type, array('text/plain', '', 'application/octet-stream'))){
			if (!$t
				AND preg_match(',\.([a-z0-9]+)(\?.*)?$,i', $source, $rext)
			){
				$t = sql_fetsel("extension", "spip_types_documents", "extension=" . sql_quote($rext[1],'','text'));
			}
			if (!$t
Duplication introduced by
This code seems to be duplicated across your project.

				AND preg_match(",^Content-Disposition:\s*attachment;\s*filename=(.*)$,Uims", $headers, $m)
				AND preg_match(',\.([a-z0-9]+)(\?.*)?$,i', $m[1], $rext)
			){
				$t = sql_fetsel("extension", "spip_types_documents", "extension=" . sql_quote($rext[1],'','text'));
			}
		}

		// Another MIME type (or text/plain with a file of unknown extension)
		if (!$t)
			$t = sql_fetsel("extension", "spip_types_documents", "mime_type=" . sql_quote($mime_type));

		// Still nothing? (e.g. audio/x-ogg instead of application/ogg)
		// Try again with the extension
		if (!$t
Duplication introduced by
This code seems to be duplicated across your project.

			AND $mime_type!='text/plain'
			AND preg_match(',\.([a-z0-9]+)(\?.*)?$,i', $source, $rext)
		){
			$t = sql_fetsel("extension", "spip_types_documents", "extension=" . sql_quote($rext[1],'','text')); # avoid xxx.3 => 3gp (> SPIP 3)
		}


		if ($t){
			spip_log("mime-type $mime_type ok, extension " . $t['extension']);
			$a['extension'] = $t['extension'];
		} else {
			# by default fall back to '.bin' if it is allowed
			spip_log("mime-type $mime_type inconnu");
			$t = sql_fetsel("extension", "spip_types_documents", "extension='bin'");
			if (!$t) return false;
			$a['extension'] = $t['extension'];
		}

		if (preg_match(",\nContent-Length: *([^[:space:]]*),i",
			"\n$headers", $regs)
		)
			$a['taille'] = intval($regs[1]);
	}

	// HEAD failed, try with GET
	if (!$a AND !$max){
		spip_log("tenter GET $source");
		$a = recuperer_infos_distantes($source, _COPIE_LOCALE_MAX_SIZE);
	}

	// If it is a reasonably small image or an HTML file,
	// reload the document with GET and collect additional data...
	if (preg_match(',^image/(jpeg|gif|png|swf),', $mime_type)){
		if ($max==0
			AND $a['taille']<1024*1024
				AND (strpos($GLOBALS['meta']['formats_graphiques'], $a['extension'])!==false)
					AND $charger_si_petite_image
		){
			$a = recuperer_infos_distantes($source, 1024*1024);
		}
		else if ($a['body']
		AND $a['taille'] < 1024*1024
		) {
			$a['fichier'] = _DIR_RACINE . nom_fichier_copie_locale($source, $a['extension']);
			ecrire_fichier($a['fichier'], $a['body']);
			$size_image = @getimagesize($a['fichier']);
			$a['largeur'] = intval($size_image[0]);
			$a['hauteur'] = intval($size_image[1]);
			$a['type_image'] = true;
		}
	}

	// SWF file: if the dimensions are unknown, default to 425x350,
	// which is better than 0x0
	if ($a AND $a['extension']=='swf'
		AND !$a['largeur']
	){
		$a['largeur'] = 425;
		$a['hauteur'] = 350;
	}

	if ($mime_type=='text/html'){
		include_spip('inc/filtres');
		$page = recuperer_page($source, true, false, 1024*1024);
		if (preg_match(',<title>(.*?)</title>,ims', $page, $regs))
			$a['titre'] = corriger_caracteres(trim($regs[1]));
		if (!$a['taille']) $a['taille'] = strlen($page); # approximately
	}

	return $a;
}
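The three duplication notices in recuperer_infos_distantes() all concern the same pattern: match a file extension at the end of a string, then look it up in spip_types_documents. As a sketch of how the matching half might be shared, the hypothetical helper below (not a SPIP function) wraps the regular expression from the listing. The database lookup is left to the caller.

```php
<?php
// Hypothetical helper (not part of SPIP): the extension-matching pattern
// that appears three times in recuperer_infos_distantes(), in one place.
function extension_depuis_chaine($chaine) {
	if (preg_match(',\.([a-z0-9]+)(\?.*)?$,i', $chaine, $rext)) {
		return $rext[1];
	}
	return '';
}

echo extension_depuis_chaine('http://example.org/doc/rapport.pdf?x=1'), "\n"; // prints pdf
```

Each of the three call sites would then reduce to one call on $source or on the Content-Disposition file name, followed by the existing sql_fetsel() query.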


/**
 * Test whether a host can be reached directly or must go through a proxy;
 * the proxy and the list of excluded hosts can be passed as parameters,
 * for testing purposes during configuration
 *
 * @param string $host
 * @param string $http_proxy
Documentation introduced by
Should the type for parameter $http_proxy not be string|null?

This check looks for @param annotations where the type inferred by our type inference engine differs from the declared type.

It makes a suggestion as to what type it considers more descriptive.

Most often this is a case of a parameter that can be null in addition to its declared types.

 * @param string $http_noproxy
Documentation introduced by
Should the type for parameter $http_noproxy not be string|null?

 * @return string
 */
function need_proxy($host, $http_proxy = null, $http_noproxy = null){
	if (is_null($http_proxy))
		$http_proxy = @$GLOBALS['meta']["http_proxy"];
	if (is_null($http_noproxy))
		$http_noproxy = @$GLOBALS['meta']["http_noproxy"];

	$domain = substr($host, strpos($host, '.'));

	return ($http_proxy
		AND (strpos(" $http_noproxy ", " $host ")===false
		AND (strpos(" $http_noproxy ", " $domain ")===false)))
		? $http_proxy : '';
}
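The exclusion rule in need_proxy() is compact, so here is a standalone sketch of the same logic with every setting passed explicitly. The name proxy_pour_hote(), the proxy URL and the exclusion list are assumptions for illustration. The behaviour is intended to mirror the listing, not to replace it.

```php
<?php
// Hypothetical standalone variant of need_proxy() that takes all settings
// as arguments instead of reading $GLOBALS['meta'].
function proxy_pour_hote($host, $http_proxy, $http_noproxy) {
	// ".example.org" for "www.example.org"; the host itself if it has no dot.
	$pos = strpos($host, '.');
	$domain = ($pos === false) ? $host : substr($host, $pos);

	// Padding with spaces makes strpos() match whole list entries only.
	$exclu = strpos(" $http_noproxy ", " $host ") !== false
		|| strpos(" $http_noproxy ", " $domain ") !== false;

	return ($http_proxy && !$exclu) ? $http_proxy : '';
}

$noproxy = 'localhost .example.org';
var_dump(proxy_pour_hote('www.example.org', 'http://proxy.local:3128', $noproxy)); // string(0) ""
var_dump(proxy_pour_hote('www.spip.net', 'http://proxy.local:3128', $noproxy));    // the proxy URL
```

Because only the part of the host after its first label is compared, an entry such as `.example.org` excludes `www.example.org` but not `a.b.example.org`.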

//
// Sends an HTTP request with headers
// returns the descriptor from which to read the response
//
// http://doc.spip.org/@init_http
function init_http($method, $url, $refuse_gz = false, $referer = '', $datas = "", $vers = "HTTP/1.0", $date = ''){
	$user = $via_proxy = $proxy_user = '';
Unused Code introduced by
$proxy_user is not used, you could remove the assignment.

Unused Code introduced by
$via_proxy is not used, you could remove the assignment.

	$fopen = false;

	$t = @parse_url($url);
	$host = $t['host'];
	if ($t['scheme']=='http'){
		$scheme = 'http';
		$noproxy = '';
	} elseif ($t['scheme']=='https') {
		$scheme = 'ssl';
		$noproxy = 'ssl://';
		if (!isset($t['port']) || !($port = $t['port'])) $t['port'] = 443;
Duplication introduced by
This code seems to be duplicated across your project.

	}
	else {
		$scheme = $t['scheme'];
		$noproxy = $scheme . '://';
	}
	if (isset($t['user']))
		$user = array($t['user'], $t['pass']);

	if (!isset($t['port']) || !($port = $t['port'])) $port = 80;
Duplication introduced by
This code seems to be duplicated across your project.

	if (!isset($t['path']) || !($path = $t['path'])) $path = "/";
Duplication introduced by
This code seems to be duplicated across your project.

	if (@$t['query']) $path .= "?" . $t['query'];

	$f = lance_requete($method, $scheme, $user, $host, $path, $port, $noproxy, $refuse_gz, $referer, $datas, $vers, $date);
	if (!$f){
		// fallback: fopen
		if (!_request('tester_proxy')){
			$f = @fopen($url, "rb");
			spip_log("connexion vers $url par simple fopen");
			$fopen = true;
		}
		else
			$f = false;
		// total failure
	}

	return array($f, $fopen);
}

// http://doc.spip.org/@lance_requete
function lance_requete($method, $scheme, $user, $host, $path, $port, $noproxy, $refuse_gz = false, $referer = '', $datas = "", $vers = "HTTP/1.0", $date = ''){

	$proxy_user = '';
	$http_proxy = need_proxy($host);

	if ($http_proxy){
		$path = (($scheme=='ssl') ? 'https://' : "$scheme://")
			. (!$user ? '' : (urlencode($user[0]) . ":" . urlencode($user[1]) . "@"))
			. "$host" . (($port!=80) ? ":$port" : "") . $path;
		$t2 = @parse_url($http_proxy);
		$first_host = $t2['host'];
		if (!($port = $t2['port'])) $port = 80;
		if ($t2['user'])
			$proxy_user = base64_encode($t2['user'] . ":" . $t2['pass']);
	}
	else
		$first_host = $noproxy . $host;

	$f = @fsockopen($first_host, $port);
	spip_log("Recuperer $path sur $first_host:$port par $f");
	if (!$f) return false;

	$site = $GLOBALS['meta']["adresse_site"];

	if ($user) $user = base64_encode($user[0] . ":" . $user[1]);

	$req = "$method $path $vers\r\n"
		. "Host: $host\r\n"
		. "User-Agent: " . _INC_DISTANT_USER_AGENT . "\r\n"
		. ($refuse_gz ? '' : ("Accept-Encoding: " . _INC_DISTANT_CONTENT_ENCODING . "\r\n"))
		. (!$site ? '' : "Referer: $site/$referer\r\n")
		. (!$date ? '' : "If-Modified-Since: " . (gmdate("D, d M Y H:i:s", $date) . " GMT\r\n"))
		. (!$user ? '' : "Authorization: Basic $user\r\n")
		. (!$proxy_user ? '' : "Proxy-Authorization: Basic $proxy_user\r\n")
		. (!strpos($vers, '1.1') ? '' : "Keep-Alive: 300\r\nConnection: keep-alive\r\n");

#	spip_log("Requete\n$req");
	fputs($f, $req);
	fputs($f, $datas ? $datas : "\r\n");
	return $f;
}
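lance_requete() assembles the request as one long concatenation, which makes individual headers hard to check. As an illustration of pulling one of them out, the sketch below formats the conditional header the way the listing does. The helper name entete_if_modified_since() is hypothetical.

```php
<?php
// Hypothetical helper (not part of SPIP): formats the conditional header
// the same way lance_requete() does inline.
function entete_if_modified_since($date) {
	if (!$date) {
		return ''; // no date given: send no conditional header
	}
	return "If-Modified-Since: " . gmdate("D, d M Y H:i:s", $date) . " GMT\r\n";
}

echo entete_if_modified_since(86400); // If-Modified-Since: Fri, 02 Jan 1970 00:00:00 GMT
```

gmdate() formats in UTC regardless of the server's time zone, which is why the literal " GMT" suffix is safe here.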

?>
Best Practice introduced by
It is not recommended to use PHP's closing tag ?> in files other than templates.

Using a closing tag in PHP files that only contain PHP code is not recommended as you might accidentally add whitespace after the closing tag which would then be output by PHP. This can cause severe problems, for example headers cannot be sent anymore.

A simple precaution is to leave off the closing tag as it is not required, and it also has no negative effects whatsoever.
