Completed
Push — master ( 7884e3...ea4e2c )
by cam
05:07
created

distant.php ➔ nom_fichier_copie_locale()   A

Complexity

Conditions 2
Paths 2

Size

Total Lines 18

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 2
nc 2
nop 2
dl 0
loc 18
rs 9.6666
c 0
b 0
f 0
<?php

/***************************************************************************\
 *  SPIP, Système de publication pour l'internet                           *
 *                                                                         *
 *  Copyright © with tenderness since 2001                                 *
 *  Arnaud Martin, Antoine Pitrou, Philippe Rivière, Emmanuel Saint-James  *
 *                                                                         *
 *  This program is free software distributed under the GNU/GPL license.   *
 *  For more details see the COPYING.txt file or the online help.          *
\***************************************************************************/

/**
 * This file handles fetching remote data
 *
 * @package SPIP\Core\Distant
 **/
if (!defined('_ECRIRE_INC_VERSION')) {
	return;
}

if (!defined('_INC_DISTANT_VERSION_HTTP')) {
	define('_INC_DISTANT_VERSION_HTTP', 'HTTP/1.0');
}
if (!defined('_INC_DISTANT_CONTENT_ENCODING')) {
	define('_INC_DISTANT_CONTENT_ENCODING', 'gzip');
}
if (!defined('_INC_DISTANT_USER_AGENT')) {
	define('_INC_DISTANT_USER_AGENT', 'SPIP-' . $GLOBALS['spip_version_affichee'] . ' (' . $GLOBALS['home_server'] . ')');
}
if (!defined('_INC_DISTANT_MAX_SIZE')) {
	define('_INC_DISTANT_MAX_SIZE', 2097152);
}
if (!defined('_INC_DISTANT_CONNECT_TIMEOUT')) {
	define('_INC_DISTANT_CONNECT_TIMEOUT', 10);
}

define('_REGEXP_COPIE_LOCALE', ',' .
	preg_replace(
		'@^https?:@',
		'https?:',
		(isset($GLOBALS['meta']['adresse_site']) ? $GLOBALS['meta']['adresse_site'] : '')
	)
	. '/?spip.php[?]action=acceder_document.*file=(.*)$,');

//@define('_COPIE_LOCALE_MAX_SIZE',2097152); // size (inc/utils already defines it)
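The `_REGEXP_COPIE_LOCALE` pattern defined above is easier to follow with a concrete value. The sketch below rebuilds the same expression outside SPIP; the site address `https://example.org` and the sample URL are assumptions made for illustration, not values taken from the project.

```php
<?php
// Hypothetical site address; in SPIP this comes from $GLOBALS['meta']['adresse_site'].
$adresse_site = 'https://example.org';

// Same construction as _REGEXP_COPIE_LOCALE: accept http or https for our own site.
$regexp = ',' . preg_replace('@^https?:@', 'https?:', $adresse_site)
	. '/?spip.php[?]action=acceder_document.*file=(.*)$,';

// Hypothetical protected-document URL pointing back at the same site.
$url = 'http://example.org/spip.php?action=acceder_document&arg=12&file=jpg%2Fphoto.jpg';
$found = preg_match($regexp, $url, $match);

// Capture group 1 is the still-encoded "file" parameter.
$fichier = $found ? urldecode($match[1]) : null;
echo $fichier, "\n";
```

In `copie_locale()` the decoded value is then prefixed with the IMG directory, so a request for the site's own protected document resolves to a local path instead of triggering an HTTP download.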

/**
 * Creates the local copy of a remote file if needed
 *
 * Takes a path relative to the root directory, or a URL
 * Returns a path relative to the root directory, or false
 *
 * @link https://www.spip.net/4155
 * @pipeline_appel post_edition
 *
 * @param string $source
 * @param string $mode
 *   - 'test' - only test
 *   - 'auto' - load if needed
 *   - 'modif' - if already present, load only if If-Modified-Since
 *   - 'force' - always load (update)
 * @param string $local

Issue (Documentation):
Should the type for parameter $local not be string|null?

This check looks for @param annotations where the type inferred by our type inference engine differs from the declared type.

It makes a suggestion as to what type it considers more descriptive.

Most often this is a case of a parameter that can be null in addition to its declared types.

 *   allows specifying the local file name (e.g. storing a cache, not an IMG document)
 * @param int $taille_max

Issue (Documentation):
Should the type for parameter $taille_max not be integer|null?

 *   maximum size of the local copy, defaults to _COPIE_LOCALE_MAX_SIZE
 * @return bool|string

Issue (Documentation):
Consider making the return type a bit more specific; maybe use string|false.

This check looks for the generic type array as a return type and suggests a more specific type. This type is inferred from the actual code.

 */
function copie_locale($source, $mode = 'auto', $local = null, $taille_max = null) {

	// if the URL is our own document-protection URL, return the path
	if ($mode !== 'force' and preg_match(_REGEXP_COPIE_LOCALE, $source, $match)) {
		$source = substr(_DIR_IMG, strlen(_DIR_RACINE)) . urldecode($match[1]);

		return @file_exists($source) ? $source : false;
	}

	if (is_null($local)) {
		$local = fichier_copie_locale($source);
	} else {

Issue (Duplication):
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

		if (_DIR_RACINE and strncmp(_DIR_RACINE, $local, strlen(_DIR_RACINE)) == 0) {
			$local = substr($local, strlen(_DIR_RACINE));
		}
	}

	// if $local = '' the file was refused by fichier_copie_locale(),
	// for example a file that is not among our documents;
	// in that case we do not try to download it only to fail afterwards
	if (!$local) {

Issue (Bug Best Practice):
The expression $local of type string|null is loosely compared to false; this is ambiguous if the string can be empty. You might want to explicitly use === null instead.

In PHP, under loose comparison (like ==, or !=, or switch conditions), values of different types might be equal.

For string values, the empty string '' is a special case, in particular the following results might be unexpected:

''   == false // true
''   == null  // true
'ab' == false // false
'ab' == null  // false

// It is often better to use strict comparison
'' === false // false
'' === null  // false
		return false;
	}

	$localrac = _DIR_RACINE . $local;
	$t = ($mode == 'force') ? false : @file_exists($localrac);

	// test whether the file exists
	if ($mode == 'test') {
		return $t ? $local : '';
	}

	// otherwise see whether we must/can download it
	if ($local == $source or !tester_url_absolue($source)) {
		return $local;
	}

	if ($mode == 'modif' or !$t) {
		// go through a unique temporary file to handle failures during retrieval
		// and possible concurrent retrievals
		include_spip('inc/acces');
		if (!$taille_max) {

Issue (Bug Best Practice):
The expression $taille_max of type integer|null is loosely compared to false; this is ambiguous if the integer can be zero. You might want to explicitly use === null instead.

In PHP, under loose comparison (like ==, or !=, or switch conditions), values of different types might be equal.

For integer values, zero is a special case, in particular the following results might be unexpected:

0   == false // true
0   == null  // true
123 == false // false
123 == null  // false

// It is often better to use strict comparison
0 === false // false
0 === null  // false
			$taille_max = _COPIE_LOCALE_MAX_SIZE;
		}
		$res = recuperer_url(
			$source,
			array('file' => $localrac, 'taille_max' => $taille_max, 'if_modified_since' => $t ? filemtime($localrac) : '')
		);
		if (!$res or (!$res['length'] and $res['status'] != 304)) {
			spip_log("copie_locale : Echec recuperation $source sur $localrac status : " . $res['status'], 'distant' . _LOG_INFO_IMPORTANTE);
		}
		if (!$res['length']) {
			// if $t is set, this is probably just a not-modified response
			return $t ? $local : false;
		}
		spip_log("copie_locale : recuperation $source sur $localrac taille " . $res['length'] . ' OK', 'distant');

		// for possible indexing
		pipeline(
			'post_edition',
			array(
				'args' => array(
					'operation' => 'copie_locale',
					'source' => $source,
					'fichier' => $local,
					'http_res' => $res['length'],
				),
				'data' => null
			)
		);
	}

	return $local;
}
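
The four `$mode` values of `copie_locale()` boil down to a small decision about whether a download is attempted. The sketch below isolates that decision in a hypothetical helper, `doit_telecharger()`, which does not exist in SPIP. It deliberately leaves out the early returns for the site's own protected URLs, for refused files and for non-absolute sources.

```php
<?php
/**
 * Hypothetical helper: does copie_locale() attempt a download for this mode?
 * $existe stands for the result of file_exists() on the local copy.
 */
function doit_telecharger(string $mode, bool $existe): bool {
	// 'force' ignores any existing copy, like $t = false in the original.
	$t = ($mode === 'force') ? false : $existe;
	if ($mode === 'test') {
		// 'test' only reports whether the copy exists, it never downloads.
		return false;
	}
	// 'modif' always asks the server (with If-Modified-Since when a copy exists);
	// 'auto' and 'force' download only when no usable copy is present.
	return $mode === 'modif' || !$t;
}

foreach (array('test', 'auto', 'modif', 'force') as $mode) {
	echo $mode, ': ', var_export(doit_telecharger($mode, true), true), "\n";
}
```

With an existing local copy, only `'modif'` and `'force'` reach `recuperer_url()`; `'auto'` returns the existing path as is.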

/**
 * Checks that the URL of a remote document is really remote
 * and not a localhost URL that would expose information about the server
 * inspired by https://core.trac.wordpress.org/browser/trunk/src/wp-includes/http.php?rev=36435#L500
 *
 * @param string $url
 * @param array $known_hosts
 *   known and accepted external URLs/hosts
 * @return false|string
 *   the URL, or false on failure
 */
function valider_url_distante($url, $known_hosts = array()) {
	if (!function_exists('protocole_verifier')){
		include_spip('inc/filtres_mini');
	}

	if (!protocole_verifier($url, array('http', 'https'))) {
		return false;
	}

	$parsed_url = parse_url($url);
	if (!$parsed_url or empty($parsed_url['host']) ) {
		return false;
	}

	if (isset($parsed_url['user']) or isset($parsed_url['pass'])) {
		return false;
	}

	if (false !== strpbrk($parsed_url['host'], ':#?[]')) {
		return false;
	}

	if (!is_array($known_hosts)) {
		$known_hosts = array($known_hosts);
	}
	$known_hosts[] = $GLOBALS['meta']['adresse_site'];
	$known_hosts[] = url_de_base();
	$known_hosts = pipeline('declarer_hosts_distants', $known_hosts);

	$is_known_host = false;
	foreach ($known_hosts as $known_host) {
		$parse_known = parse_url($known_host);
		if ($parse_known
		  and strtolower($parse_known['host']) === strtolower($parsed_url['host'])) {
			$is_known_host = true;
			break;
		}
	}

	if (!$is_known_host) {
		$host = trim($parsed_url['host'], '.');
		if (preg_match('#^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$#', $host)) {
			$ip = $host;
		} else {
			$ip = gethostbyname($host);
			if ($ip === $host) {
				// Error condition for gethostbyname()
				$ip = false;
			}
		}
		if ($ip) {

Issue (Bug Best Practice):
The expression $ip of type false|string is loosely compared to true; this is ambiguous if the string can be empty. You might want to explicitly use !== false instead.

			$parts = array_map('intval', explode( '.', $ip ));
			if (127 === $parts[0] or 10 === $parts[0] or 0 === $parts[0]
			  or ( 172 === $parts[0] and 16 <= $parts[1] and 31 >= $parts[1] )
			  or ( 192 === $parts[0] && 168 === $parts[1] )
			) {
				return false;
			}
		}
	}

	if (empty($parsed_url['port'])) {
		return $url;
	}

	$port = $parsed_url['port'];
	if ($port === 80  or $port === 443  or $port === 8080) {

Issue (Unused Code, Bug):
The strict comparison === seems to always evaluate to false as the types of $port (string) and 80 (integer) can never be identical. Maybe you want to use a loose comparison == instead?
		return $url;
	}

	if ($is_known_host) {
		foreach ($known_hosts as $known_host) {
			$parse_known = parse_url($known_host);
			if ($parse_known
				and !empty($parse_known['port'])
			  and strtolower($parse_known['host']) === strtolower($parsed_url['host'])
			  and $parse_known['port'] == $port) {
				return $url;
			}
		}
	}

	return false;
}
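
The address-range test inside `valider_url_distante()` can be read in isolation. The sketch below extracts it into a hypothetical helper, `est_ip_privee()`, which is not part of SPIP and assumes a well-formed dotted IPv4 string as input. The last line also prints what `parse_url()` yields for an explicit port, which is relevant to the strict comparisons on `$port`: PHP documents that component as an integer.

```php
<?php
/**
 * Hypothetical helper isolating the range test: true when a dotted IPv4
 * address is loopback, in 0.0.0.0/8, or in one of the private ranges.
 */
function est_ip_privee(string $ip): bool {
	$parts = array_map('intval', explode('.', $ip));

	return $parts[0] === 127 || $parts[0] === 10 || $parts[0] === 0
		|| ($parts[0] === 172 && $parts[1] >= 16 && $parts[1] <= 31)
		|| ($parts[0] === 192 && $parts[1] === 168);
}

var_dump(est_ip_privee('192.168.1.10')); // inside 192.168.0.0/16
var_dump(est_ip_privee('172.32.0.1'));   // just outside 172.16.0.0/12

// Type of the port component returned by parse_url().
var_dump(parse_url('http://example.org:8080/x')['port']);
```

Only IPv4 is covered here, as in the original function; IPv6 literals are already rejected earlier by the `strpbrk()` check on `:`, `[` and `]`.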

/**
 * Prepares the data for a POST
 * if $donnees is a string
 *  - the sender is responsible for adding the boundary, handling the Content-Type,
 *    separating the headers from the data with an empty line, etc.
 *  - line breaks are converted to the right format
 *  - the string is split into header/body (separated by an empty line)
 * if $donnees is an array
 *  - it is turned into a string, with a boundary if needed or supplied, and the right Content-Type
 *
 * @param string|array $donnees
 * @param string $boundary
 * @return array
 *   header, body
 */
function prepare_donnees_post($donnees, $boundary = '') {

	// let the function that requested the POST format its data itself,
	// for a SOAP call for example
	// the header is separated from the data by a double line break
	// here we convert every line break (\r\n, \r or \n) to \r\n
	if (is_string($donnees) && strlen($donnees)) {
		$entete = '';
		// first turn every \r\n and \r into a plain \n
		$donnees = str_replace("\r\n", "\n", $donnees);
		$donnees = str_replace("\r", "\n", $donnees);
		// a double line break marks the end of the header and the start of the data
		$p = strpos($donnees, "\n\n");
		if ($p !== false) {
			$entete = str_replace("\n", "\r\n", substr($donnees, 0, $p + 1));
			$donnees = substr($donnees, $p + 2);
		}
		$chaine = str_replace("\n", "\r\n", $donnees);
	} else {
		/* automatic boundary */
		// if there are more than 500 bytes of data, use a boundary
		if ($boundary === '') {
			$taille = 0;
			foreach ($donnees as $cle => $valeur) {

Issue (Bug):
The expression $donnees of type array|string is not guaranteed to be traversable. How about adding an additional type check?

There are different options of fixing this problem.

  1. If you want to be on the safe side, you can add an additional type-check:

    $collection = json_decode($data, true);
    if ( ! is_array($collection)) {
        throw new \RuntimeException('$collection must be an array.');
    }
    
    foreach ($collection as $item) { /** ... */ }
    
  2. If you are sure that the expression is traversable, you might want to add a doc comment cast to improve IDE auto-completion and static analysis:

    /** @var array $collection */
    $collection = json_decode($data, true);
    
    foreach ($collection as $item) { /** .. */ }
    
  3. Mark the issue as a false-positive: Just hover the remove button, in the top-right corner of this issue for more options.

				if (is_array($valeur)) {
					foreach ($valeur as $val2) {
						$taille += strlen($val2);
					}
				} else {
					// should spip_strlen() from inc/charsets be used instead?
					$taille += strlen($valeur);
				}
			}
			if ($taille > 500) {
				$boundary = substr(md5(rand() . 'spip'), 0, 8);
			}
		}

		if (is_string($boundary) and strlen($boundary)) {
			// build an HTTP string for a POST with a boundary
			$entete = "Content-Type: multipart/form-data; boundary=$boundary\r\n";
			$chaine = '';
			if (is_array($donnees)) {
				foreach ($donnees as $cle => $valeur) {
					if (is_array($valeur)) {
						foreach ($valeur as $val2) {
							$chaine .= "\r\n--$boundary\r\n";
							$chaine .= "Content-Disposition: form-data; name=\"{$cle}[]\"\r\n";
							$chaine .= "\r\n";
							$chaine .= $val2;
						}
					} else {
						$chaine .= "\r\n--$boundary\r\n";
						$chaine .= "Content-Disposition: form-data; name=\"$cle\"\r\n";
						$chaine .= "\r\n";
						$chaine .= $valeur;
					}
				}
				$chaine .= "\r\n--$boundary\r\n";
			}
		} else {
			// build a simple HTTP string for a POST
			$entete = 'Content-Type: application/x-www-form-urlencoded' . "\r\n";
			$chaine = array();
			if (is_array($donnees)) {
				foreach ($donnees as $cle => $valeur) {
					if (is_array($valeur)) {
						foreach ($valeur as $val2) {
							$chaine[] = rawurlencode($cle) . '[]=' . rawurlencode($val2);
						}
					} else {
						$chaine[] = rawurlencode($cle) . '=' . rawurlencode($valeur);
					}
				}
				$chaine = implode('&', $chaine);
			} else {
				$chaine = $donnees;
			}
		}
	}

	return array($entete, $chaine);
}
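
The string branch of `prepare_donnees_post()` normalizes line endings and then splits the payload at the first empty line. The sketch below reproduces only that branch in a hypothetical standalone helper, `decouper_entete_corps()`; the name and the sample SOAP-like payload are assumptions made for illustration.

```php
<?php
/**
 * Hypothetical helper reproducing the string branch:
 * returns array($entete, $corps) with CRLF line endings.
 */
function decouper_entete_corps(string $donnees): array {
	$entete = '';
	// Normalize \r\n first, then any remaining lone \r, to a plain \n.
	$donnees = str_replace(array("\r\n", "\r"), "\n", $donnees);
	// The first empty line separates the headers from the body.
	$p = strpos($donnees, "\n\n");
	if ($p !== false) {
		// Keep the first \n so the header block ends with a line break.
		$entete = str_replace("\n", "\r\n", substr($donnees, 0, $p + 1));
		$donnees = substr($donnees, $p + 2);
	}

	return array($entete, str_replace("\n", "\r\n", $donnees));
}

list($entete, $corps) = decouper_entete_corps("Content-Type: text/xml\r\n\r\n<a/>\n");
var_dump($entete, $corps);
```

When the string contains no empty line, the header part stays empty and the whole input is treated as the body, which matches the original behaviour.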

/**
 * Converts a URL whose host is in UTF-8 to ASCII
 * Uses the library https://github.com/phlylabs/idna-convert/tree/v0.9.1
 * in its last version compatible with all PHP 5 versions
 * The PHP function idn_to_ascii depends on the php5-intl package and is rarely available
 *
 * @param string $url_idn
 * @return array|string
 */
function url_to_ascii($url_idn) {

	if ($parts = parse_url($url_idn)) {
		$host = $parts['host'];
		if (!preg_match(',^[a-z0-9_\.\-]+$,i', $host)) {
			include_spip('inc/idna_convert.class');
			$IDN = new idna_convert();
			$host_ascii = $IDN->encode($host);
			$url_idn = explode($host, $url_idn, 2);
			$url_idn = implode($host_ascii, $url_idn);
		}
		// and urlencode the UTF-8 characters in the path if needed
		$url_idn = preg_replace_callback('/[^\x20-\x7f]/', function($match) { return urlencode($match[0]); }, $url_idn);
	}

	return $url_idn;
}
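
The last step of `url_to_ascii()`, percent-encoding of non-ASCII bytes, can be tried on its own. The sketch below applies the same `preg_replace_callback()` call to a made-up URL whose host is already ASCII; the host conversion through `idna_convert` is not shown.

```php
<?php
// "caf\xC3\xA9" is "café" encoded in UTF-8 (two bytes for the last letter).
$url = "http://example.org/caf\xC3\xA9?x=1";

// Without the /u modifier the pattern works byte by byte,
// so each byte outside the 0x20-0x7f range is percent-encoded separately.
$ascii = preg_replace_callback(
	'/[^\x20-\x7f]/',
	function ($match) {
		return urlencode($match[0]);
	},
	$url
);
echo $ascii, "\n";
```

Characters that are already ASCII, including `?`, `=` and `/`, are left untouched, so an existing query string keeps its structure.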

/**
 * Retrieves the content of a URL,
 * transcoding it into the local charset if needed
 *
 * @uses init_http()
 * @uses recuperer_entetes_complets()
 * @uses recuperer_body()
 * @uses transcoder_page()
 * @uses prepare_donnees_post()
 *
 * @param string $url
 * @param array $options
 *   bool transcoder : true to transcode the page into the site charset
 *   string methode : type of HTTP request to make (HEAD, GET, POST, PUT, DELETE)
 *   int taille_max : stop reading the content beyond this size (0 = headers only ==> HEAD request). Defaults to _INC_DISTANT_MAX_SIZE, or _COPIE_LOCALE_MAX_SIZE when copying to a file
 *   array headers : associative array of HTTP headers to send
 *   string|array datas : to send data (array) and/or complete headers, with a line break between headers and data (string, @see prepare_donnees_post()) (forces the POST method if the data is not empty)
 *   string boundary : boundary used to format datas given as an array
 *   bool refuser_gz : to force refusing compression (case of spell-checking servers)
 *   int if_modified_since : a Unix timestamp to stop the retrieval if the remote page has not been modified since a given date
 *   string uri_referer : to specify a different referer
 *   string file : name of the file to copy the content into
 *   int follow_location : number of redirects to follow (0 to follow none)
 *   string version_http : version of the HTTP protocol to use (defaults to the _INC_DISTANT_VERSION_HTTP constant)
 * @return array|bool
 *   false on failure
 *   array otherwise:
 *     int status : the status of the page
 *     string headers : the headers of the page
 *     string page : the content of the page (empty when copied to a file)
 *     int last_modified : timestamp of the last modification
 *     string location : redirect URL sent by the page
 *     string url : actual URL of the retrieved page
 *     int length : size of the content or of the file
 *
 *     string file : name of the file if saved to a file
 */
function recuperer_url($url, $options = array()) {
	// Remember the method that was explicitly supplied, if any
	$methode_demandee = $options['methode'] ?? '';
	$default = array(
		'transcoder' => false,
		'methode' => 'GET',
		'taille_max' => null,
		'headers' => [],
		'datas' => '',
		'boundary' => '',
		'refuser_gz' => false,
		'if_modified_since' => '',
		'uri_referer' => '',
		'file' => '',
		'follow_location' => 10,
		'version_http' => _INC_DISTANT_VERSION_HTTP,
	);
	$options = array_merge($default, $options);
	// copy directly into a file?
	$copy = $options['file'];

	if ($options['methode'] == 'HEAD') {
		$options['taille_max'] = 0;
	}
	if (is_null($options['taille_max'])) {
		$options['taille_max'] = $copy ? _COPIE_LOCALE_MAX_SIZE : _INC_DISTANT_MAX_SIZE;
	}


	// Add the specific headers if needed
	$head_add = '';
	if (!empty($options['headers'])) {
		foreach ($options['headers'] as $champ => $valeur) {
			$head_add .= $champ . ': ' . $valeur . "\r\n";
		}
		// do not pass them to recuperer_url again when following a Location, because they will already be in datas
		unset($options['entetes']);
	}

	if (!empty($options['datas'])) {
		list($head, $postdata) = prepare_donnees_post($options['datas'], $options['boundary']);
		$head .= $head_add;
		if (stripos($head, 'Content-Length:') === false) {
			$head .= 'Content-Length: ' . strlen($postdata) . "\r\n";
		}
		$options['datas'] = $head . "\r\n" . $postdata;
		if (
			strlen($postdata)
			and !$methode_demandee
		) {
			$options['methode'] = 'POST';
		}
	} elseif ($head_add) {
		$options['datas'] = $head_add . "\r\n";
	}

	// Accept URLs in feed:// format, URLs missing the http:// prefix, and protocol-relative URLs
	$url = preg_replace(',^feed://,i', 'http://', $url);
	if (!tester_url_absolue($url)) {
		$url = 'http://' . $url;
	} elseif (strncmp($url, '//', 2) == 0) {
		$url = 'http:' . $url;
	}

	$url = url_to_ascii($url);

	$result = array(
		'status' => 0,
		'headers' => '',
		'page' => '',
		'length' => 0,
		'last_modified' => '',
		'location' => '',
		'url' => $url
	);

	// when writing directly to a file, refuse gzip so the content does not have to be handled in memory
	$refuser_gz = (($options['refuser_gz'] or $copy) ? true : false);

	// open the connection and send the request and its headers
	list($handle, $fopen) = init_http(
		$options['methode'],
		$url,

Issue (Bug):
It seems like $url can also be of type array; however, init_http() does only seem to accept string, maybe add an additional type check?

If a method or function can return multiple different values and unless you are sure that you only can receive a single value in this context, we recommend to add an additional type check:

/**
 * @return array|string
 */
function returnsDifferentValues($x) {
    if ($x) {
        return 'foo';
    }

    return array();
}

$x = returnsDifferentValues($y);
if (is_array($x)) {
    // $x is an array.
}

If this a common case that PHP Analyzer should handle natively, please let us know by opening an issue.

		$refuser_gz,
		$options['uri_referer'],
		$options['datas'],
		$options['version_http'],
		$options['if_modified_since']
	);
	if (!$handle) {
		spip_log("ECHEC init_http $url", 'distant' . _LOG_ERREUR);

		return false;
	}

	// Except in fopen mode, send the input stream
	// and retrieve the response headers
	if (!$fopen) {
		$res = recuperer_entetes_complets($handle, $options['if_modified_since']);
		if (!$res) {
			fclose($handle);
			$t = @parse_url($url);
			$host = $t['host'];
			// Unexplained convoluted workaround to counter
			// the freedom-restricting actions of the Middle Kingdom
			if (!need_proxy($host)
				and $res = @file_get_contents($url)
			) {
				$result['length'] = strlen($res);
				if ($copy) {
					ecrire_fichier($copy, $res);
					$result['file'] = $copy;
				} else {
					$result['page'] = $res;
				}
				$res = array(
					'status' => 200,
				);
			} else {
				spip_log("ECHEC chinoiserie $url", 'distant' . _LOG_ERREUR);
				return false;
			}
		} elseif ($res['location'] and $options['follow_location']) {
			$options['follow_location']--;
			fclose($handle);
			include_spip('inc/filtres');
			$url = suivre_lien($url, $res['location']);

Issue (Bug):
It seems like $url can also be of type array; however, suivre_lien() does only seem to accept string, maybe add an additional type check?

			spip_log("recuperer_url recommence sur $url", 'distant');

			return recuperer_url($url, $options);
		} elseif ($res['status'] !== 200) {
			spip_log('HTTP status ' . $res['status'] . " pour $url", 'distant');
		}
		$result['status'] = $res['status'];
		if (isset($res['headers'])) {
			$result['headers'] = $res['headers'];
		}
		if (isset($res['last_modified'])) {
			$result['last_modified'] = $res['last_modified'];
		}
		if (isset($res['location'])) {
			$result['location'] = $res['location'];
		}
	}

	// we only want the headers
	if (!$options['taille_max'] or $options['methode'] == 'HEAD' or $result['status'] == '304') {
		return $result;
	}


	// if the content must be decompressed, do it through a temporary file
	// otherwise memory blows up on large streams

	$gz = false;
	if (preg_match(",\bContent-Encoding: .*gzip,is", $result['headers'])) {
		$gz = (_DIR_TMP . md5(uniqid(mt_rand())) . '.tmp.gz');
	}

	// if the content was not already retrieved through the fallback method
	if (!$result['length']) {
		$res = recuperer_body($handle, $options['taille_max'], $gz ? $gz : $copy);
		fclose($handle);
		if ($copy) {
			$result['length'] = $res;
			$result['file'] = $copy;
		} elseif ($res) {
			$result['page'] = &$res;
			$result['length'] = strlen($result['page']);
		}
		if (!$result['status']) {
			$result['status'] = 200; // we succeeded, after all!
		}
	}
	if (!$result['page']) {
		return $result;
	}

	// Decompress if needed
	if ($gz) {

Issue (Bug Best Practice):
The expression $gz of type string|false is loosely compared to true; this is ambiguous if the string can be empty. You might want to explicitly use !== false instead.

		$result['page'] = implode('', gzfile($gz));
		supprimer_fichier($gz);
	}

	// Should it be imported into our local charset?
	if ($options['transcoder']) {
		include_spip('inc/charsets');
		$result['page'] = transcoder_page($result['page'], $result['headers']);
	}

	return $result;
}
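
Before opening the connection, `recuperer_url()` tidies up the URL it was given: `feed://` becomes `http://`, a missing scheme is filled in, and protocol-relative URLs get `http:`. The sketch below mirrors that step in a hypothetical helper, `normaliser_url()`. The scheme test is a simplified assumption and is not SPIP's `tester_url_absolue()`, whose exact rules are not shown on this page.

```php
<?php
/**
 * Hypothetical stand-in for the URL clean-up step of recuperer_url().
 * The scheme detection below is a simplified assumption.
 */
function normaliser_url(string $url): string {
	// feed:// is treated as plain HTTP.
	$url = preg_replace(',^feed://,i', 'http://', $url);
	if (strncmp($url, '//', 2) === 0) {
		// Protocol-relative URL: default to http.
		return 'http:' . $url;
	}
	if (!preg_match(',^[a-z][a-z0-9+.-]*://,i', $url)) {
		// The scheme was omitted altogether.
		return 'http://' . $url;
	}

	return $url;
}

foreach (array('feed://example.org/rss', 'example.org', '//example.org/x', 'https://example.org') as $u) {
	echo $u, ' => ', normaliser_url($u), "\n";
}
```

A URL that already carries a scheme is returned unchanged, so `https://` sources are never downgraded by this step.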

/**
 * Retrieves a URL unless it is already in a file cache
 *
 * The cache lifetime is given by the `delai_cache` option
 * The other options and the return format are the same as for the `recuperer_url` function
 * @uses recuperer_url()
 *
 * @param string $url
 * @param array $options
 *   int delai_cache : acceptable age of the content (in seconds)
 * @return array|bool|mixed
 */
function recuperer_url_cache($url, $options = array()) {
	if (!defined('_DELAI_RECUPERER_URL_CACHE')) {
		define('_DELAI_RECUPERER_URL_CACHE', 3600);
	}
	$default = array(
		'transcoder' => false,
		'methode' => 'GET',
		'taille_max' => null,
		'datas' => '',
		'boundary' => '',
		'refuser_gz' => false,
		'if_modified_since' => '',
		'uri_referer' => '',
		'file' => '',
		'follow_location' => 10,
		'version_http' => _INC_DISTANT_VERSION_HTTP,
		'delai_cache' => in_array(_VAR_MODE, ['preview', 'recalcul']) ? 0 : _DELAI_RECUPERER_URL_CACHE,
	);
	$options = array_merge($default, $options);

	// cases where caching is not possible
	if (!empty($options['data']) or $options['methode'] == 'POST') {
		return recuperer_url($url, $options);
	}

	// do not retry the same failing URL several times (it is therefore not cached)
	static $errors = array();
	if (isset($errors[$url])) {
		return $errors[$url];
	}

	$sig = $options;
	unset($sig['if_modified_since']);
	unset($sig['delai_cache']);
	$sig['url'] = $url;

	$dir = sous_repertoire(_DIR_CACHE, 'curl');
	$cache = md5(serialize($sig)) . '-' . substr(preg_replace(',\W+,', '_', $url), 0, 80);
	$sub = sous_repertoire($dir, substr($cache, 0, 2));
	$cache = "$sub$cache";

	$res = false;
	$is_cached = file_exists($cache);
	if ($is_cached
		and (filemtime($cache) > $_SERVER['REQUEST_TIME'] - $options['delai_cache'])
	) {
		lire_fichier($cache, $res);
		if ($res = unserialize($res)) {
			// set last_modified and status=304?
		}
	}
	if (!$res) {
		$res = recuperer_url($url, $options);
		// do not reload this uncached URL within the same hit since it is unavailable
		if (!$res) {
			if ($is_cached) {
				// retrieval failed but we had a cache: use it
				lire_fichier($cache, $res);
				$res = unserialize($res);
			}

			return $errors[$url] = $res;
		}
		ecrire_fichier($cache, serialize($res));
	}

	return $res;
}
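
The cache file name used by `recuperer_url_cache()` combines a hash of the request options with a readable slug of the URL. The sketch below rebuilds just that name in a hypothetical helper, `cle_cache()`; the directory handling done by `sous_repertoire()` is left out and the sample URL is made up.

```php
<?php
/**
 * Hypothetical helper: builds the cache file name (without directories)
 * the same way recuperer_url_cache() does.
 */
function cle_cache(string $url, array $options): string {
	$sig = $options;
	// These two options must not change which cache entry is used.
	unset($sig['if_modified_since'], $sig['delai_cache']);
	$sig['url'] = $url;

	// 32-character hash, a dash, then at most 80 characters of URL slug.
	return md5(serialize($sig)) . '-' . substr(preg_replace(',\W+,', '_', $url), 0, 80);
}

$a = cle_cache('http://example.org/a?b=1', array('methode' => 'GET', 'delai_cache' => 60));
$b = cle_cache('http://example.org/a?b=1', array('methode' => 'GET', 'delai_cache' => 3600));
var_dump($a === $b);
```

Because `delai_cache` and `if_modified_since` are removed before hashing, callers asking for different freshness windows share one cache file, while any other option change (method, headers, referer) produces a separate entry.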

/**
 * Retrieves the content pointed to by the resource passed as argument,
 * i.e. the body of the URL whose headers have already been retrieved
 * $taille_max allows truncating it
 *
 * @param resource $handle
 * @param int $taille_max
 * @param string $fichier
 *   file into which the content of the resource is copied
 * @return bool|int|string

Issue (Documentation):
Consider making the return type a bit more specific; maybe use integer|false|string.

This check looks for the generic type array as a return type and suggests a more specific type. This type is inferred from the actual code.

 *   bool false on failure
 *   int size of the file if the fichier argument is given
 *   string content of the resource
 */
function recuperer_body($handle, $taille_max = _INC_DISTANT_MAX_SIZE, $fichier = '') {
	$taille = 0;
	$result = '';
	$fp = false;
	if ($fichier) {
		include_spip('inc/acces');
		$tmpfile = "$fichier." . creer_uniqid() . '.tmp';
		$fp = spip_fopen_lock($tmpfile, 'w', LOCK_EX);
		if (!$fp and file_exists($fichier)) {
			return filesize($fichier);
		}
		if (!$fp) {
			return false;
		}
		$result = 0; // we return the size of the file
	}
	while (!feof($handle) and $taille < $taille_max) {
		$res = fread($handle, 16384);
		$taille += strlen($res);
		if ($fp) {
			fwrite($fp, $res);
			$result = $taille;
		} else {
			$result .= $res;
		}
	}
	if ($fp) {
		spip_fclose_unlock($fp);
		spip_unlink($fichier);
		@rename($tmpfile, $fichier);

Issue (Bug):
The variable $tmpfile does not seem to be defined for all execution paths leading up to this point.

If you define a variable conditionally, it can happen that it is not defined for all execution paths.

Let’s take a look at an example:

function myFunction($a) {
    switch ($a) {
        case 'foo':
            $x = 1;
            break;

        case 'bar':
            $x = 2;
            break;
    }

    // $x is potentially undefined here.
    echo $x;
}

In the above example, the variable $x is defined if you pass “foo” or “bar” as argument for $a. However, since the switch statement has no default case statement, if you pass any other value, the variable $x would be undefined.

Available Fixes

  1. Check for existence of the variable explicitly:

    function myFunction($a) {
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
        }
    
        if (isset($x)) { // Make sure it's always set.
            echo $x;
        }
    }
    
  2. Define a default value for the variable:

    function myFunction($a) {
        $x = ''; // Set a default which gets overridden for certain paths.
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
        }
    
        echo $x;
    }
    
  3. Add a value for the missing path:

    function myFunction($a) {
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
    
            // We add support for the missing case.
            default:
                $x = '';
                break;
        }
    
        echo $x;
    }
    
Loading history...
Security Best Practice introduced by
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.

If you suppress an error, we recommend checking for the error condition explicitly:

// For example instead of
@mkdir($dir);

// Better use
if (@mkdir($dir) === false) {
    throw new \RuntimeException('The directory '.$dir.' could not be created.');
}
Loading history...
721
		if (!file_exists($fichier)) {
722
			return false;
723
		}
724
	}
725
726
	return $result;
727
}
728
729
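The read loop of `recuperer_body()` stops at end of stream or once the running size reaches `$taille_max`. Because it reads whole chunks, the result can exceed the limit by up to one chunk minus one byte. The sketch below reproduces only that loop on an in-memory stream; `lire_flux_limite()` and the small chunk size are assumptions made for the illustration.

```php
<?php
// Hypothetical helper, not part of SPIP: read a stream in fixed-size chunks
// until EOF or until at least $max bytes have been read.
function lire_flux_limite($handle, $max, $chunk = 16384) {
	$taille = 0;
	$contenu = '';
	while (!feof($handle) and $taille < $max) {
		$bloc = fread($handle, $chunk);
		if ($bloc === false) {
			break; // read error
		}
		$taille += strlen($bloc);
		$contenu .= $bloc;
	}

	return $contenu;
}

// 100 bytes available, limit of 10 bytes, chunks of 8 bytes
$h = fopen('php://memory', 'w+');
fwrite($h, str_repeat('a', 100));
rewind($h);
$lu = lire_flux_limite($h, 10, 8);
fclose($h);
```

With these values two chunks are read, so `$lu` holds 16 bytes rather than 10: the limit is a threshold for stopping, not an exact cut.
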
/**
 * Read the HTTP response headers from the socket $handle
 *
 * Returns false on failure or, on success, an associative array holding:
 * - the status
 * - the complete set of headers
 * - the last modification date, if known
 * - the redirection URL, if specified
 *
 * @param resource $handle
 * @param int|bool $if_modified_since
 * @return false|array
 *   int status
 *   string headers
 *   int last_modified
 *   string location
 */
function recuperer_entetes_complets($handle, $if_modified_since = false) {
	$result = array('status' => 0, 'headers' => array(), 'last_modified' => 0, 'location' => '');

	$s = @trim(fgets($handle, 16384));
	if (!preg_match(',^HTTP/[0-9]+\.[0-9]+ ([0-9]+),', $s, $r)) {
		return false;
	}
	$result['status'] = intval($r[1]);
	while ($s = trim(fgets($handle, 16384))) {
		$result['headers'][] = $s . "\n";
		// skip lines that do not have the "name: value" shape
		if (!preg_match(',^([^:]*): *(.*)$,i', $s, $r)) {
			continue;
		}
		list(, $d, $v) = $r;
		// header names are case-insensitive
		$d = strtolower(trim($d));
		if ($d == 'location' and $result['status'] >= 300 and $result['status'] < 400) {
			$result['location'] = $v;
		} elseif ($d == 'last-modified') {
			$result['last_modified'] = strtotime($v);
		}
	}
	if ($if_modified_since
		and $result['last_modified']
		and $if_modified_since > $result['last_modified']
		and $result['status'] == 200
	) {
		$result['status'] = 304;
	}

	$result['headers'] = implode('', $result['headers']);

	return $result;
}

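The header parsing above can be tried without a network connection by feeding it an in-memory stream. The following is a simplified sketch: `lire_entetes()` is a hypothetical name, it keeps only the status, `Location` and `Last-Modified`, and it compares header names case-insensitively.

```php
<?php
// Hypothetical, simplified variant of the header parsing shown above.
function lire_entetes($handle) {
	$ligne = trim((string) fgets($handle, 16384));
	if (!preg_match(',^HTTP/[0-9]+\.[0-9]+ ([0-9]+),', $ligne, $r)) {
		return false;
	}
	$res = array('status' => intval($r[1]), 'last_modified' => 0, 'location' => '');
	// an empty line marks the end of the headers
	while ($ligne = trim((string) fgets($handle, 16384))) {
		if (!preg_match(',^([^:]*): *(.*)$,', $ligne, $r)) {
			continue; // not a "name: value" line
		}
		$nom = strtolower(trim($r[1]));
		if ($nom === 'location' and $res['status'] >= 300 and $res['status'] < 400) {
			$res['location'] = $r[2];
		} elseif ($nom === 'last-modified') {
			$res['last_modified'] = intval(strtotime($r[2]));
		}
	}

	return $res;
}

// Simulate a socket with an in-memory stream
$h = fopen('php://memory', 'w+');
fwrite($h, "HTTP/1.1 301 Moved Permanently\r\n"
	. "location: https://example.org/nouveau\r\n"
	. "Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT\r\n"
	. "\r\n"
	. "corps");
rewind($h);
$entetes = lire_entetes($h);
$reste = stream_get_contents($h);
fclose($h);
```

After the call the stream is positioned at the start of the body, which is what lets a body reader such as `recuperer_body()` take over.
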
/**
 * Compute the canonical name of a local copy of a remote file
 *
 * If local copies of remote files have to be kept, they may as well live
 * in a canonical location
 *
 * @note
 *   A bijective mapping would be even better, but no obvious scheme
 *   comes to mind given the limitations of filesystems
 *
 * @param string $source
 *     URL of the source
 * @param string $extension
 *     File extension
 * @return string
 *     File name for the local copy
 **/
function nom_fichier_copie_locale($source, $extension) {
	include_spip('inc/documents');

	$d = creer_repertoire_documents('distant'); # IMG/distant/
	$d = sous_repertoire($d, $extension); # IMG/distant/pdf/

	// always work as if we were at the site root
	if (_DIR_RACINE) {
		$d = preg_replace(',^' . preg_quote(_DIR_RACINE) . ',', '', $d);
	}

	$m = md5($source);

	return $d
	. substr(preg_replace(',[^\w-],', '', basename($source)) . '-' . $m, 0, 12)
	. substr($m, 0, 4)
	. ".$extension";
}

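The naming scheme above combines a sanitised, truncated base name with the start of the MD5 of the URL. The sketch below isolates that computation; `nom_copie()` is a hypothetical name and the directory is passed as a plain string instead of being created through `creer_repertoire_documents()` and `sous_repertoire()`.

```php
<?php
// Hypothetical helper: same naming scheme as above, without touching the disk.
function nom_copie($source, $extension, $dir = 'IMG/distant/') {
	$m = md5($source);

	return $dir . $extension . '/'
		// base name stripped of everything but word characters and hyphens,
		// padded with the hash and cut to 12 characters
		. substr(preg_replace(',[^\w-],', '', basename($source)) . '-' . $m, 0, 12)
		// first 4 hexadecimal digits of the hash
		. substr($m, 0, 4)
		. ".$extension";
}

$nom = nom_copie('https://example.org/docs/rapport-annuel.pdf', 'pdf');
$encore = nom_copie('https://example.org/docs/rapport-annuel.pdf', 'pdf');
```

The file part is always 12 + 4 characters plus the extension, and the same URL always yields the same name. Two different URLs can still collide, which is the non-bijectivity the note in the docblock refers to.
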
/**
 * Give the name of the local copy of the source
 *
 * Either takes the file extension directly from the source URL,
 * or tries to work it out.
 *
 * @uses nom_fichier_copie_locale()
 * @uses recuperer_infos_distantes()
 *
 * @param string $source
 *      URL of the remote source
 * @return string|null
 *      Computed file name, or null when no local copy can be named
 **/
function fichier_copie_locale($source) {
	// Already local: nothing to do
	if (!tester_url_absolue($source)) {
		if (_DIR_RACINE) {
			$source = preg_replace(',^' . preg_quote(_DIR_RACINE) . ',', '', $source);
		}

		return $source;
	}

	// optimisation: check whether the extension can be guessed from the URL and whether
	// the file has already been copied locally with that extension;
	// in that case it is reliable and there is no need to query the database
	$path_parts = pathinfo($source);
	if (!isset($path_parts['extension'])) {
		$path_parts['extension'] = '';
	}
	$ext = $path_parts ? $path_parts['extension'] : '';
	if ($ext
		and preg_match(',^\w+$,', $ext) // not php?truc=1&...
		and $f = nom_fichier_copie_locale($source, $ext)
		and file_exists(_DIR_RACINE . $f)
	) {
		return $f;
	}


	// If it is already in the documents table,
	// return the name of its potential copy
	$ext = sql_getfetsel('extension', 'spip_documents', 'fichier=' . sql_quote($source) . " AND distant='oui' AND extension <> ''");

	if ($ext) {
		return nom_fichier_copie_locale($source, $ext);
	}

	// check whether the extension given in the file name is valid
	// and whether the file has already been fetched

	$ext = $path_parts ? $path_parts['extension'] : '';

	if ($ext and sql_getfetsel('extension', 'spip_types_documents', 'extension=' . sql_quote($ext))) {
		$f = nom_fichier_copie_locale($source, $ext);
		if (file_exists(_DIR_RACINE . $f)) {
			return $f;
		}
	}

	// Ping the source to find out whether its extension is known and allowed,
	// caching the result of the ping

	$cache = sous_repertoire(_DIR_CACHE, 'rid') . md5($source);
	if (!@file_exists($cache)
		or !$path_parts = @unserialize(spip_file_get_contents($cache))
		or _request('var_mode') == 'recalcul'
	) {
		$path_parts = recuperer_infos_distantes($source, 0, false);
		ecrire_fichier($cache, serialize($path_parts));
	}
	$ext = !empty($path_parts['extension']) ? $path_parts['extension'] : '';
	if ($ext and sql_getfetsel('extension', 'spip_types_documents', 'extension=' . sql_quote($ext))) {
		return nom_fichier_copie_locale($source, $ext);
	}
	spip_log("pas de copie locale pour $source", 'distant' . _LOG_ERREUR);

	return null;
}


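The first shortcut in `fichier_copie_locale()` trusts the extension found in the URL only when it consists of word characters, which rules out URLs ending in a query string. A small sketch of that check, with the hypothetical helper name `extension_depuis_url()`:

```php
<?php
// Hypothetical helper: return the extension found in a URL, or '' when the
// candidate is missing or contains anything other than word characters.
function extension_depuis_url($source) {
	$ext = pathinfo($source, PATHINFO_EXTENSION);

	return preg_match(',^\w+$,', $ext) ? $ext : '';
}

$pdf = extension_depuis_url('https://example.org/docs/rapport.pdf');
$php = extension_depuis_url('https://example.org/page.php?truc=1&x=2');
$rien = extension_depuis_url('https://example.org/dossier/fichier');
```

In the second case `pathinfo()` reports `php?truc=1&x=2` as the extension, which the pattern rejects, so the function moves on to the database lookups and, as a last resort, to the cached ping.
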
/**
 * Retrieve information about a remote document without downloading too much of it
 *
 * @param string $source
 *     URL of the source
 * @param int $max
 *     Maximum size of the file to download
 * @param bool $charger_si_petite_image
 *     Download the document if it is a small image
 * @return false|array
 *     false on failure, otherwise pairs of information among:
 *
 *     - 'body' = string
 *     - 'type_image' = boolean
 *     - 'titre' = string
 *     - 'largeur' = int
 *     - 'hauteur' = int
 *     - 'taille' = int
 *     - 'extension' = string
 *     - 'fichier' = string
 *     - 'mime_type' = string
 **/
function recuperer_infos_distantes($source, $max = 0, $charger_si_petite_image = true) {

	// no point wasting time
	if (!tester_url_absolue($source)) {
		return false;
	}

	# load the mime type aliases
	include_spip('base/typedoc');

	$a = array();
	$mime_type = '';
	// Load the beginning of images and HTML files directly,
	// so as to catch as much information as possible (title, size, etc.).
	// If that fails, the user will have to enter it...
	$reponse = recuperer_url($source, ['taille_max' => $max, 'refuser_gz' => true]);
	$headers = $reponse['headers'] ?? '';
	$a['body'] = $reponse['page'] ?? '';
	if ($headers) {
		if (preg_match(",\nContent-Type: *([^[:space:];]*),i", "\n$headers", $regs)) {
			$mime_type = (trim($regs[1]));
		} else {
			$mime_type = '';
		} // unknown

		// Apply the aliases
		while (isset($GLOBALS['mime_alias'][$mime_type])) {
			$mime_type = $GLOBALS['mime_alias'][$mime_type];
		}

		// With a meaningless mime type
		// (text/plain, application/octet-stream or empty),
		// the server may not know what it is serving;
		// try to detect the type from the extension in the URL
		// or from Content-Disposition: attachment; filename=...
		$t = null;
		if (in_array($mime_type, array('text/plain', '', 'application/octet-stream'))) {
			if (!$t
				and preg_match(',\.([a-z0-9]+)(\?.*)?$,i', $source, $rext)
			) {
				$t = sql_fetsel('extension', 'spip_types_documents', 'extension=' . sql_quote($rext[1], '', 'text'));
			}
			if (!$t
				and preg_match(',^Content-Disposition:\s*attachment;\s*filename=(.*)$,Uims', $headers, $m)
				and preg_match(',\.([a-z0-9]+)(\?.*)?$,i', $m[1], $rext)
			) {
				$t = sql_fetsel('extension', 'spip_types_documents', 'extension=' . sql_quote($rext[1], '', 'text'));
			}
		}

		// Other mime type (or text/plain with a file of unknown extension)
		if (!$t) {
			$t = sql_fetsel('extension', 'spip_types_documents', 'mime_type=' . sql_quote($mime_type));
		}

		// Still nothing? (e.g. audio/x-ogg instead of application/ogg)
		// Try again with the extension
		if (!$t
			and $mime_type != 'text/plain'
			and preg_match(',\.([a-z0-9]+)(\?.*)?$,i', $source, $rext)
		) {
			# avoid xxx.3 => 3gp (> SPIP 3)
			$t = sql_fetsel('extension', 'spip_types_documents', 'extension=' . sql_quote($rext[1], '', 'text'));
		}

		if ($t) {
			spip_log("mime-type $mime_type ok, extension " . $t['extension'], 'distant');
			$a['extension'] = $t['extension'];
		} else {
			# by default fall back on '.bin' if it is allowed
			spip_log("mime-type $mime_type inconnu", 'distant');
			$t = sql_fetsel('extension', 'spip_types_documents', "extension='bin'");
			if (!$t) {
				return false;
			}
			$a['extension'] = $t['extension'];
		}

		if (preg_match(",\nContent-Length: *([^[:space:]]*),i", "\n$headers", $regs)) {
			$a['taille'] = intval($regs[1]);
		}
	}

	// HEAD failed, try with GET
	if (!$a and !$max) {
		spip_log("tenter GET $source", 'distant');
		$a = recuperer_infos_distantes($source, _INC_DISTANT_MAX_SIZE);
	}

	// nothing found: no point insisting
	if (!$a) {
		return false;
	}

	// For an image that is not too big, or an HTML file, reload
	// the document with GET and collect additional data...
	include_spip('inc/filtres_images_lib_mini');
	if (strpos($mime_type, "image/") === 0
	  and $extension = _image_trouver_extension_depuis_mime($mime_type)) {
		if ($max == 0
			and (empty($a['taille']) or $a['taille'] < _INC_DISTANT_MAX_SIZE)
			and in_array($extension, formats_image_acceptables())
			and $charger_si_petite_image
		) {
			$a = recuperer_infos_distantes($source, _INC_DISTANT_MAX_SIZE);
		} else {
			if ($a['body']) {
				$a['extension'] = $extension;
				$a['fichier'] = _DIR_RACINE . nom_fichier_copie_locale($source, $extension);
				ecrire_fichier($a['fichier'], $a['body']);
				$size_image = @spip_getimagesize($a['fichier']);
				$a['largeur'] = intval($size_image[0]);
				$a['hauteur'] = intval($size_image[1]);
				$a['type_image'] = true;
			}
		}
	}

	// swf file: without a size, default to 425x350,
	// which is better than 0x0
	// Flash is dead!
	if ($a and isset($a['extension']) and $a['extension'] == 'swf'
		and empty($a['largeur'])
	) {
		$a['largeur'] = 425;
		$a['hauteur'] = 350;
	}

	if ($mime_type == 'text/html') {
		include_spip('inc/filtres');
		$page = recuperer_url($source, ['transcoder' => true, 'taille_max' => _INC_DISTANT_MAX_SIZE]);
		$page = $page['page'] ?? '';
		if (preg_match(',<title>(.*?)</title>,ims', $page, $regs)) {
			$a['titre'] = corriger_caracteres(trim($regs[1]));
		}
		if (!isset($a['taille']) or !$a['taille']) {
			$a['taille'] = strlen($page); # approximately
		}
	}
	$a['mime_type'] = $mime_type;

	return $a;
}


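The function above extracts the MIME type and the size from the raw header block with two regular expressions, then resolves the type through an alias table. The sketch below isolates these steps. The helper names and the alias table passed as an argument are assumptions (SPIP reads `$GLOBALS['mime_alias']`), and the cycle guard is an addition that the original loop does not have.

```php
<?php
// Hypothetical helper: extract the MIME type from raw response headers
// and resolve it through an alias table.
function mime_depuis_entetes($headers, array $alias = array()) {
	$mime = '';
	if (preg_match(",\nContent-Type: *([^[:space:];]*),i", "\n$headers", $regs)) {
		$mime = trim($regs[1]);
	}
	// follow alias chains, stopping if an alias loops back on itself
	$vus = array();
	while (isset($alias[$mime]) and !isset($vus[$mime])) {
		$vus[$mime] = true;
		$mime = $alias[$mime];
	}

	return $mime;
}

// Hypothetical helper: read Content-Length as an integer, or null when absent
function taille_depuis_entetes($headers) {
	if (preg_match(",\nContent-Length: *([^[:space:]]*),i", "\n$headers", $regs)) {
		return intval($regs[1]);
	}

	return null;
}

$headers = "Server: demo\ncontent-type: image/pjpeg; charset=binary\nContent-Length: 1234\n";
$mime = mime_depuis_entetes($headers, array('image/pjpeg' => 'image/jpeg'));
$taille = taille_depuis_entetes($headers);
```

Prefixing the header block with a newline lets the same pattern match a header on the first line as well as on any later one.
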
/**
 * Test whether a host can be reached directly or must go through a proxy
 *
 * The proxy and the list of excluded hosts can be passed as parameters,
 * for testing purposes during configuration
 *
 * @param string $host
 * @param string|null $http_proxy
 * @param string|null $http_noproxy
 * @return string
 *   proxy to use, or '' when the host must be reached directly
 */
function need_proxy($host, $http_proxy = null, $http_noproxy = null) {
	if (is_null($http_proxy)) {
		$http_proxy = isset($GLOBALS['meta']['http_proxy']) ? $GLOBALS['meta']['http_proxy'] : null;
	}
	// nothing to do without a proxy :)
	if (is_null($http_proxy) or !$http_proxy = trim($http_proxy)) {
		return '';
	}

	if (is_null($http_noproxy)) {
		$http_noproxy = isset($GLOBALS['meta']['http_noproxy']) ? $GLOBALS['meta']['http_noproxy'] : null;
	}
	// without exceptions, return the proxy
	if (is_null($http_noproxy) or !$http_noproxy = trim($http_noproxy)) {
		return $http_proxy;
	}

	// if the host or one of its parent domains is in $http_noproxy, make an exception;
	// $http_noproxy may hold several domains separated by spaces or line breaks
	$http_noproxy = str_replace("\n", " ", $http_noproxy);
	$http_noproxy = str_replace("\r", " ", $http_noproxy);
	$http_noproxy = " $http_noproxy ";
	$domain = $host;
	// if the exact domain www.example.org is among the exceptions
	if (strpos($http_noproxy, " $domain ") !== false) {
		return '';
	}

	while (strpos($domain, '.') !== false) {
		$domain = explode('.', $domain);
		array_shift($domain);
		$domain = implode('.', $domain);

		// or if a parent domain starting with a . is among the exceptions (meaning that it covers all subdomains)
		if (strpos($http_noproxy, " .$domain ") !== false) {
			return '';
		}
	}

	// ok, this is not an exception
	return $http_proxy;
}


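The exception matching in `need_proxy()` can be exercised on its own. The sketch below keeps the same rules (exact host match, or a parent domain listed with a leading dot) but returns a boolean; `est_exclu_du_proxy()` is a hypothetical name.

```php
<?php
// Hypothetical helper: true when $host is covered by the exception list.
function est_exclu_du_proxy($host, $noproxy) {
	// entries may be separated by spaces or line breaks
	$noproxy = ' ' . str_replace(array("\n", "\r"), ' ', trim($noproxy)) . ' ';
	// exact match on the host name
	if (strpos($noproxy, " $host ") !== false) {
		return true;
	}
	// otherwise walk up the parent domains, looking for ".parent"
	$domain = $host;
	while (strpos($domain, '.') !== false) {
		$domain = substr($domain, strpos($domain, '.') + 1);
		if (strpos($noproxy, " .$domain ") !== false) {
			return true;
		}
	}

	return false;
}

$liste = "intranet.local\n.example.org";
$sous_domaine = est_exclu_du_proxy('www.example.org', $liste);
$exact = est_exclu_du_proxy('intranet.local', $liste);
$domaine_nu = est_exclu_du_proxy('example.org', $liste);
$autre = est_exclu_du_proxy('example.net', $liste);
```

With these rules `.example.org` covers `www.example.org` but not the bare `example.org`, which has to be listed separately if it must bypass the proxy too.
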
/**
 * Initialise an HTTP request with headers
 *
 * Splits the URL into scheme + host + path + port and launches the request.
 * Returns the descriptor from which the response can be read.
 *
 * @uses lance_requete()
 *
 * @param string $method
 *   HEAD, GET, POST
 * @param string $url
 * @param bool $refuse_gz
 * @param string $referer
 * @param string $datas
 * @param string $vers
 * @param string $date
 * @return array
 *   array($f, $fopen): the descriptor (false on failure) and whether the fopen fallback was used
 */
function init_http($method, $url, $refuse_gz = false, $referer = '', $datas = '', $vers = 'HTTP/1.0', $date = '') {
	// lance_requete() expects an array (user, password); empty when there is no authentication
	$user = array();
	$fopen = false;

	$t = @parse_url($url);
	$host = $t['host'];
	if ($t['scheme'] == 'http') {
		$scheme = 'http';
		$noproxy = '';
	} elseif ($t['scheme'] == 'https') {
		$scheme = 'ssl';
		$noproxy = 'ssl://';
		if (!isset($t['port']) || !($port = $t['port'])) {
			$t['port'] = 443;
		}
	} else {
		$scheme = $t['scheme'];
		$noproxy = $scheme . '://';
	}
	if (isset($t['user'])) {
		// the password may be absent from the URL
		$user = array($t['user'], isset($t['pass']) ? $t['pass'] : '');
	}

	if (!isset($t['port']) || !($port = $t['port'])) {
		$port = 80;
	}
	if (!isset($t['path']) || !($path = $t['path'])) {
		$path = '/';
	}

	if (!empty($t['query'])) {
		$path .= '?' . $t['query'];
	}

	$f = lance_requete($method, $scheme, $user, $host, $path, $port, $noproxy, $refuse_gz, $referer, $datas, $vers, $date);
	if (!$f or !is_resource($f)) {
		// fallback: fopen, unless lance_requete timed out,
		// which corresponds to $f===110
		if ($f !== 110
			and !need_proxy($host)
			and !_request('tester_proxy')
			and (!isset($GLOBALS['inc_distant_allow_fopen']) or $GLOBALS['inc_distant_allow_fopen'])
		) {
			$f = @fopen($url, 'rb');
			spip_log("connexion vers $url par simple fopen", 'distant');
			$fopen = true;
		} else {
			// complete failure
			$f = false;
		}
	}

	return array($f, $fopen);
}

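The URL decomposition performed by `init_http()` relies on `parse_url()`, which omits the keys it cannot find. The sketch below shows the same defaults (port 443 for https, 80 otherwise, path `/`, query string appended to the path) in a hypothetical helper `decomposer_url()` that returns an array instead of launching a request.

```php
<?php
// Hypothetical helper: split a URL the way init_http() does, with defaults.
function decomposer_url($url) {
	$t = parse_url($url);
	if (!$t or empty($t['scheme']) or empty($t['host'])) {
		return false;
	}
	$https = ($t['scheme'] === 'https');
	// default port depends on the scheme
	$port = !empty($t['port']) ? intval($t['port']) : ($https ? 443 : 80);
	// default path is the root; the query string travels with the path
	$path = !empty($t['path']) ? $t['path'] : '/';
	if (!empty($t['query'])) {
		$path .= '?' . $t['query'];
	}

	return array(
		'scheme' => $https ? 'ssl' : $t['scheme'],
		'host' => $t['host'],
		'port' => $port,
		'path' => $path,
	);
}

$a = decomposer_url('https://example.org/a/b?x=1');
$b = decomposer_url('http://example.org:8080');
$c = decomposer_url('pas une url');
```

Returning false for an unparsable URL is a choice made for this sketch; `init_http()` itself does not check the result of `parse_url()`.
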
/**
 * Launch the request itself
 *
 * @param string $method
 *   type of request (GET, HEAD, POST...)
 * @param string $scheme
 *   protocol (http, tls, ftp...)
 * @param array $user
 *   pair (user, password) for HTTP authentication
 * @param string $host
 *   domain name
 * @param string $path
 *   path of the requested page
 * @param int|string $port
 *   port used for the connection
 * @param string $noproxy
 *   protocol prefix used when the request does not go through a proxy
 * @param bool $refuse_gz
 *   refuse GZ compression
 * @param string $referer
 *   referer
 * @param string $datas
 *   posted data
 * @param string $vers
 *   HTTP version
 * @param int|string $date
 *   timestamp for the If-Modified-Since header
 * @return resource|int|false
 *   false or an int error number on failure
 *   socket resource to the requested URL on success
 */
function lance_requete(
	$method,
	$scheme,
	$user,
	$host,
	$path,
	$port,
	$noproxy,
	$refuse_gz = false,
	$referer = '',
	$datas = '',
	$vers = 'HTTP/1.0',
	$date = ''
) {

	$proxy_user = '';
	$http_proxy = need_proxy($host);
	if ($user) {
		$user = urlencode($user[0]) . ':' . urlencode($user[1]);
	}

	$connect = '';
	if ($http_proxy) {
		if (!defined('_PROXY_HTTPS_NOT_VIA_CONNECT') and in_array($scheme, array('tls', 'ssl'))) {
			$path_host = (!$user ? '' : "$user@") . $host . (($port != 80) ? ":$port" : '');
			$connect = 'CONNECT ' . $path_host . " $vers\r\n"
				. "Host: $path_host\r\n"
				. "Proxy-Connection: Keep-Alive\r\n";
		} else {
			$path = (in_array($scheme, array('tls', 'ssl')) ? 'https://' : "$scheme://")
				. (!$user ? '' : "$user@")
				. "$host" . (($port != 80) ? ":$port" : '') . $path;
		}
		$t2 = @parse_url($http_proxy);
		$first_host = $t2['host'];
		// parse_url() omits the keys it does not find: test them with empty()
		$port = !empty($t2['port']) ? $t2['port'] : 80;
		if (!empty($t2['user'])) {
			$proxy_user = base64_encode($t2['user'] . ':' . (isset($t2['pass']) ? $t2['pass'] : ''));
		}
	} else {
		$first_host = $noproxy . $host;
	}

	if ($connect) {
		$streamContext = stream_context_create(array(
			'ssl' => array(
				'verify_peer' => false,
				'allow_self_signed' => true,
				'SNI_enabled' => true,
				'peer_name' => $host,
			)
		));
		$f = @stream_socket_client(
			"tcp://$first_host:$port",
			$errno,
			$errstr,
			_INC_DISTANT_CONNECT_TIMEOUT,
			STREAM_CLIENT_CONNECT,
			$streamContext
		);
		spip_log("Recuperer $path sur $first_host:$port par $f (via CONNECT)", 'connect');
		if (!$f) {
			spip_log("Erreur connexion $errno $errstr", 'distant' . _LOG_ERREUR);
			return $errno;
		}
		stream_set_timeout($f, _INC_DISTANT_CONNECT_TIMEOUT);

		fputs($f, $connect);
		fputs($f, "\r\n");
		$res = fread($f, 1024);
		// the status line must have at least two fields and a 200 status
		if (!$res
			or count($res = explode(' ', $res)) < 2
			or $res[1] !== '200'
		) {
			spip_log("Echec CONNECT sur $first_host:$port", 'connect' . _LOG_INFO_IMPORTANTE);
			fclose($f);

			return false;
		}
		// important: otherwise we read too fast and the data is not available yet
		stream_set_blocking($f, true);
		// send the handshake
		stream_socket_enable_crypto($f, true, STREAM_CRYPTO_METHOD_SSLv23_CLIENT);
		spip_log("OK CONNECT sur $first_host:$port", 'connect');
	} else {
		$ntry = 3;
		do {
			$f = @fsockopen($first_host, $port, $errno, $errstr, _INC_DISTANT_CONNECT_TIMEOUT);
			// sleep() returns 0 on success, hence the explicit comparison:
			// a bare "and sleep(1)" would end the loop after the first attempt
		} while (!$f and $ntry-- and $errno !== 110 and sleep(1) === 0);
		spip_log("Recuperer $path sur $first_host:$port par $f");
		if (!$f) {
			spip_log("Erreur connexion $errno $errstr", 'distant' . _LOG_ERREUR);

			return $errno;
		}
		stream_set_timeout($f, _INC_DISTANT_CONNECT_TIMEOUT);
	}

	$site = isset($GLOBALS['meta']['adresse_site']) ? $GLOBALS['meta']['adresse_site'] : '';

	$host_port = $host;
	if ($port != (in_array($scheme, array('tls', 'ssl')) ? 443 : 80)) {
		$host_port .= ":$port";
	}
	$req = "$method $path $vers\r\n"
		. "Host: $host_port\r\n"
		. 'User-Agent: ' . _INC_DISTANT_USER_AGENT . "\r\n"
		. ($refuse_gz ? '' : ('Accept-Encoding: ' . _INC_DISTANT_CONTENT_ENCODING . "\r\n"))
		. (!$site ? '' : "Referer: $site/$referer\r\n")
		. (!$date ? '' : 'If-Modified-Since: ' . (gmdate('D, d M Y H:i:s', $date) . " GMT\r\n"))
		. (!$user ? '' : ('Authorization: Basic ' . base64_encode($user) . "\r\n"))
		. (!$proxy_user ? '' : "Proxy-Authorization: Basic $proxy_user\r\n")
		. (!strpos($vers, '1.1') ? '' : "Keep-Alive: 300\r\nConnection: keep-alive\r\n");

#	spip_log("Requete\n$req", 'distant');
	fputs($f, $req);
	fputs($f, $datas ? $datas : "\r\n");

	return $f;
}
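The request text assembled at the end of `lance_requete()` can be previewed without opening a socket. The sketch below builds a reduced request (request line, `Host`, optional `If-Modified-Since`) with a hypothetical helper `construire_requete()`; the other headers are left out to keep it short.

```php
<?php
// Hypothetical helper: assemble the text of a minimal HTTP request.
function construire_requete($method, $host, $port, $path, $tls = false, $vers = 'HTTP/1.0', $date = 0) {
	// the port only appears in Host when it is not the default for the scheme
	$host_port = $host;
	if ($port != ($tls ? 443 : 80)) {
		$host_port .= ":$port";
	}

	return "$method $path $vers\r\n"
		. "Host: $host_port\r\n"
		// HTTP dates are always expressed in GMT
		. (!$date ? '' : 'If-Modified-Since: ' . gmdate('D, d M Y H:i:s', $date) . " GMT\r\n")
		// an empty line ends the header block
		. "\r\n";
}

$date = gmmktime(7, 28, 0, 10, 21, 2015);
$req = construire_requete('GET', 'example.org', 8080, '/index.html', false, 'HTTP/1.0', $date);
$req_tls = construire_requete('HEAD', 'example.org', 443, '/', true);
```

Each header line ends with CRLF and the block is closed by an empty line, which `lance_requete()` sends separately when there is no posted data.
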