Completed
Push — master ( 8e3f70...b785b5 )
by Andreas
19:42
created

midcom_services_indexer_document   A

Complexity

Total Complexity 40

Size/Duplication

Total Lines 555
Duplicated Lines 0 %

Test Coverage

Coverage 22.79%

Importance

Changes 0
Metric Value
eloc 125
c 0
b 0
f 0
dl 0
loc 555
ccs 31
cts 136
cp 0.2279
rs 9.2
wmc 40

23 Methods

Rating   Name   Duplication   Size   Complexity  
A __construct() 0 3 1
A get_field() 0 7 2
A get_field_record() 0 7 2
A remove_field() 0 3 1
A list_fields() 0 3 1
A add_person() 0 3 1
A _add_field() 0 6 2
A fields_to_members() 0 23 3
A read_metadata_from_object() 0 19 6
A add_keyword() 0 3 1
A add_date() 0 4 1
A add_text() 0 3 1
A add_date_pair() 0 4 1
A _set_type() 0 6 2
A process_topic() 0 14 3
A is_a() 0 3 1
A read_authorname() 0 10 2
A add_result() 0 3 1
A add_unstored() 0 3 1
A html2text() 0 12 1
A read_person() 0 6 2
A members_to_fields() 0 28 3
A add_unindexed() 0 3 1

How to fix   Complexity   

Complex Class

Complex classes like midcom_services_indexer_document often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common way to find such a component is to look for fields and methods that share the same prefix or suffix.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use midcom_services_indexer_document, and based on these observations, apply Extract Interface, too.
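
As a rough illustration of Extract Class applied to this file, the field-storage methods (get_field(), remove_field(), list_fields(), _add_field()) could move into a small collaborator. This is only a sketch: the class name example_field_store and its method names are hypothetical, and the charset conversion done by the real class is deliberately left out.

```php
<?php
// Hypothetical sketch of Extract Class: field storage pulled out of the
// document class. Names are illustrative; charset conversion is omitted.
final class example_field_store
{
    /** @var array<string, array{name: string, type: string, content: mixed}> */
    private array $fields = [];

    public function add(string $name, string $type, $content): void
    {
        // A field of the same name is overwritten silently.
        $this->fields[$name] = ['name' => $name, 'type' => $type, 'content' => $content];
    }

    /** @return mixed The field content, or false if the field is missing. */
    public function get(string $name)
    {
        if (!array_key_exists($name, $this->fields)) {
            return false;
        }
        return $this->fields[$name]['content'];
    }

    public function remove(string $name): void
    {
        // Nonexistent fields are ignored silently.
        unset($this->fields[$name]);
    }

    public function names(): array
    {
        return array_keys($this->fields);
    }
}

$store = new example_field_store();
$store->add('title', 'text', 'Hello');
var_dump($store->names());
```

The document class would then hold one such store and delegate to it, which keeps the member-to-field mapping separate from the storage details.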

<?php
/**
 * @package midcom.services
 * @author The Midgard Project, http://www.midgard-project.org
 * @copyright The Midgard Project, http://www.midgard-project.org
 * @license http://www.gnu.org/licenses/lgpl.html GNU Lesser General Public License
 */

/**
 * This class encapsulates a single indexer document. It is used for both indexing
 * and retrieval.
 *
 * A document consists of a number of fields, each field has different properties
 * when handled by the indexer (exact behavior depends, as always, on the indexer
 * backend in use). On retrieval, this field information is lost, all fields being
 * of the same type (naturally). The core indexer backend supports these field
 * types:
 *
 * - <i>date</i> is a date-wrapped field suitable for use with the Date Filter.
 * - <i>keyword</i> is stored and indexed, but not tokenized.
 * - <i>unindexed</i> is stored but neither indexed nor tokenized.
 * - <i>unstored</i> is not stored, but indexed and tokenized.
 * - <i>text</i> is stored, indexed and tokenized.
 *
 * This class should not be instantiated directly, a new instance of this class
 * can be obtained using the midcom_services_indexer class.
 *
 * A number of predefined fields are available using member fields. These fields
 * are all meta-fields. See their individual documentation for details. All fields
 * are mandatory unless mentioned otherwise explicitly and, as always, assumed to
 * be in the local charset.
 *
 * Remember that neither date nor unstored fields are available on retrieval.
 * For the core fields, all timestamps are therefore stored twice, once as a
 * searchable field and once as a readable timestamp.
 *
 * The class will automatically pass all data to the i18n charset conversion functions,
 * thus you work using your site's charset like usual. UTF-8 conversion is done
 * implicitly.
 *
 * @package midcom.services
 * @see midcom_services_indexer
 * @todo The Type field is not yet handled properly.
 */
class midcom_services_indexer_document
{
    /**
     * An associative array containing all fields of the current document.
     *
     * Each field is indexed by its name (a string). The value is another
     * array containing the fields "name", "type" and "content".
     *
     * @var Array
     */
    private $_fields = [];

    /**
     * The i18n service, used for charset conversion.
     *
     * @var midcom_services_i18n
     */
    protected $_i18n;

    /**
     * This is the score of this document. Only populated on resultset documents,
     * of course.
     *
     * @var double
     */
    public $score = 0.0;

    /* ------ START OF DOCUMENT FIELDS --------- */

    /**
     * The Resource Identifier of this document.
     *
     * Must be UTF-8 on assignment already.
     *
     * This field is mandatory.
     *
     * @var string
     */
    public $RI = '';

    /**
     * Two letter language code of the document content
     *
     * This field is optional.
     *
     * @var string
     */
    public $lang = '';

    /**
     * The GUID of the topic the document is assigned to.
     *
     * May be empty for non-midgard resources.
     *
     * This field is mandatory.
     *
     * @var string GUID
     */
    public $topic_guid = '';

    /**
     * The name of the component responsible for the document.
     *
     * May be empty for non-midgard resources.
     *
     * This field is mandatory.
     *
     * @var string
     */
    public $component = '';

    /**
     * The fully qualified URL to the document, this should be a PermaLink.
     *
     * This field is mandatory.
     *
     * @var string
     */
    public $document_url = '';

    /**
     * The time of document creation, this is a UNIX timestamp.
     *
     * This field is mandatory.
     *
     * @var int
     */
    public $created = 0;

    /**
     * The time of the last document modification, this is a UNIX timestamp.
     *
     * This field is mandatory.
     *
     * @var int
     */
    public $edited = 0;

    /**
     * The timestamp of indexing.
     *
     * This field is added automatically and to be considered read-only.
     *
     * @var int
     */
    public $indexed = 0;

    /**
     * The MidgardPerson who created the object.
     *
     * This is optional.
     *
     * @var midcom_db_person
     */
    public $creator;

    /**
     * The MidgardPerson who modified the object the last time.
     *
     * This is optional.
     *
     * @var midcom_db_person
     */
    public $editor;

    /**
     * The title of the document
     *
     * This is mandatory.
     *
     * @var string
     */
    public $title = '';

    /**
     * The content of the document
     *
     * This is mandatory.
     *
     * This field is empty on documents retrieved from the index.
     *
     * @var string
     */
    public $content = '';

    /**
     * The abstract of the document
     *
     * This is optional.
     *
     * @var string
     */
    public $abstract = '';

    /**
     * The author of the document
     *
     * This is optional.
     *
     * @var string
     */
    public $author = '';

    /**
     * An additional tag indicating the source of the document for use by the
     * component doing the indexing.
     *
     * This value is not indexed and should not be used by anybody except the
     * component doing the indexing.
     *
     * This is optional.
     *
     * @var string
     */
    public $source = '';

    /**
     * The full path to the topic that houses the document.
     *
     * For external resources, this should be either a MidCOM topic, to which this
     * resource is associated or some "directory" after which you could filter.
     * You may also leave it empty, which keeps it out of any topic-specific search.
     *
     * The value should be fully qualified, as returned by MIDCOM_NAV_FULLURL, including
     * a trailing slash, e.g. https://host/path/to/topic/
     *
     * This is optional.
     *
     * @var string
     */
    public $topic_url = '';

    /**
     * The type of the document, set by subclasses and added to the index
     * automatically.
     *
     * The type *must* reflect the original type hierarchy. It is to be set
     * using the $this->_set_type call <i>after</i> initializing the base class.
     *
     * @see is_a()
     * @see _set_type()
     * @var string
     */
    public $type = '';

    /**
     * This exists to support #651 without rewriting all components' index methods
     *
     * If set to false the indexer backend will silently skip this document.
     *
     * @see http://trac.midgard-project.org/ticket/651
     * @var boolean
     */
    public $actually_index = true;

    /* ------ END OF DOCUMENT FIELDS --------- */

    /**
     * Initialize the object, nothing fancy here.
     */
    public function __construct()
    {
        $this->_i18n = midcom::get()->i18n;
    }

    /**
     * Returns the contents of the field name or false on failure.
     *
     * @param string $name The name of the field.
     * @return mixed The content of the field or false on failure.
     */
    public function get_field($name)
    {
        if (!array_key_exists($name, $this->_fields)) {
            debug_add("Field {$name} not found in the document.", MIDCOM_LOG_INFO);
            return false;
        }
        return $this->_i18n->convert_from_utf8($this->_fields[$name]['content']);
    }

    /**
     * Returns the complete internal field record, including type and UTF-8 encoded
     * content.
     *
     * This should normally not be used from the outside, it is geared towards the
     * indexer backends, which need the full field information on indexing.
     *
     * @param string $name The name of the field.
     * @return array|false The full content record, or false if the field does not exist.
     */
    public function get_field_record($name)
    {
        if (!array_key_exists($name, $this->_fields)) {
            debug_add("Field {$name} not found in the document.", MIDCOM_LOG_INFO);
            return false;
        }
        return $this->_fields[$name];
    }

    /**
     * Returns a list of all defined fields.
     */
    public function list_fields() : array
    {
        return array_keys($this->_fields);
    }

    /**
     * Remove a field from the list. Nonexistent fields are ignored silently.
     *
     * @param string $name The name of the field.
     */
    public function remove_field($name)
    {
        unset($this->_fields[$name]);
    }

    /**
     * Add a date field. A timestamp is expected, which is automatically
     * converted to a suitable ISO timestamp before storage.
     *
     * Direct specification of the ISO timestamp is not yet possible due
     * to lacking validation outside the timestamp range.
     *
     * If a field of the same name is already present, it is overwritten
     * silently.
     */
    public function add_date(string $name, int $timestamp)
    {
        // This is always UTF-8 conformant.
        $this->_add_field($name, 'date', gmstrftime('%Y-%m-%dT%H:%M:%SZ', $timestamp), true);
    }

    /**
     * Create a normal date field and an unindexed _TS-postfixed timestamp field at the same time.
     *
     * This is useful because the date fields are not in a readable format;
     * it can't even be determined that they were a date in the first place,
     * so the _TS field is quite useful if you need the original value for the
     * timestamp.
     *
     * @param string $name The field's name, "_TS" is appended for the plain-timestamp field.
     * @param int $timestamp The timestamp to store.
     */
    public function add_date_pair(string $name, int $timestamp)
    {
        $this->add_date($name, $timestamp);
        $this->add_unindexed("{$name}_TS", $timestamp);
    }

    public function add_keyword(string $name, string $content)
    {
        $this->_add_field($name, 'keyword', $content);
    }

    public function add_unindexed(string $name, string $content)
    {
        $this->_add_field($name, 'unindexed', $content);
    }

    public function add_unstored(string $name, string $content)
    {
        $this->_add_field($name, 'unstored', $this->html2text($content));
    }

    public function add_text(string $name, string $content)
    {
        $this->_add_field($name, 'text', $this->html2text($content));
    }

    /**
     * Add a search result field, this should normally not be done
     * manually, the indexer will call this function when creating a
     * document out of a search result.
     *
     * @param string $name The field's name.
     * @param string $content The field's content, which is <b>assumed to be UTF-8 already</b>
     */
    public function add_result(string $name, $content)
    {
        $this->_add_field($name, 'result', $content, true);
    }

    /**
     * Add a person field.
     */
    private function add_person(string $name, ?midcom_db_person $person)
    {
        $this->add_text($name, $person->guid ?? '');
    }

    /**
     * This will translate all member variables into appropriate
     * field records, the backend should call this immediately before
     * indexing.
     *
     * This call will automatically populate indexed with time()
     * and author with the name of the creator (if set).
     */
    public function members_to_fields()
    {
        // Complete fields
        $this->indexed = time();
        if (   $this->author == ''
            && isset($this->creator->name)) {
            $this->author = $this->creator->name;
        }

        // __RI does not need to be populated, this is done by backends.
        $this->add_unindexed('__LANG', $this->lang);
        $this->add_text('__TOPIC_GUID', $this->topic_guid);
        $this->add_text('__COMPONENT', $this->component);
        $this->add_unindexed('__DOCUMENT_URL', $this->document_url);
        $this->add_text('__TOPIC_URL', $this->topic_url);
        $this->add_date_pair('__CREATED', $this->created);
        $this->add_date_pair('__EDITED', $this->edited);
        $this->add_date_pair('__INDEXED', $this->indexed);
        $this->add_text('title', $this->title);
        $this->add_unstored('content', $this->content);

        $this->add_unindexed('__SOURCE', $this->source);
        $this->add_person('__CREATOR', $this->creator);
        $this->add_person('__EDITOR', $this->editor);

        $this->add_text('author', $this->author);
        $this->add_text('abstract', $this->abstract);
        $this->add_text('__TYPE', $this->type);
    }

    /**
     * Populate all relevant members with the respective values after
     * retrieving a document from the index
     */
    public function fields_to_members()
    {
        $this->RI = $this->get_field('__RI');
Documentation Bug introduced by
It seems like $this->get_field('__RI') can also be of type false. However, the property $RI is declared as type string. Maybe add an additional type check?

Our type inference engine has found a suspicious assignment of a value to a property. This check raises an issue when a value that can be of a mixed type is assigned to a property that is type hinted more strictly.

For example, imagine you have a variable $accountId that can either hold an Id object or false (if there is no account id yet). Your code now assigns that value to the id property of an instance of the Account class. This class holds a proper account, so the id value must no longer be false.

Either this assignment is in error or a type check should be added for that assignment.

class Id
{
    public $id;

    public function __construct($id)
    {
        $this->id = $id;
    }

}

class Account
{
    /** @var  Id $id */
    public $id;
}

$account_id = false;

if (starsAreRight()) {
    $account_id = new Id(42);
}

$account = new Account();
// Check the value being assigned, not the Account instance.
if ($account_id instanceof Id)
{
    $account->id = $account_id;
}
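
One way to add the suggested type check, sketched here under assumptions: a small helper that falls back to an empty string when get_field() reports a missing field. The function name field_as_string is hypothetical and not part of MidCOM.

```php
<?php
// Hypothetical helper, not part of MidCOM: normalise a get_field() result
// so that a string-typed property never receives false.
function field_as_string($value): string
{
    // get_field() returns false when the field does not exist.
    return $value === false ? '' : (string) $value;
}

var_dump(field_as_string(false)); // missing field
var_dump(field_as_string('en'));  // existing field
```

Inside fields_to_members() this could be used as, for instance, $this->lang = field_as_string($this->get_field('__LANG')); whether an empty string is the right fallback depends on how callers treat missing fields.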
        $this->lang = $this->get_field('__LANG');
Documentation Bug introduced by
It seems like $this->get_field('__LANG') can also be of type false. However, the property $lang is declared as type string. Maybe add an additional type check?
        $this->topic_guid = $this->get_field('__TOPIC_GUID');
Documentation Bug introduced by
It seems like $this->get_field('__TOPIC_GUID') can also be of type false. However, the property $topic_guid is declared as type string. Maybe add an additional type check?
        $this->component = $this->get_field('__COMPONENT');
Documentation Bug introduced by
It seems like $this->get_field('__COMPONENT') can also be of type false. However, the property $component is declared as type string. Maybe add an additional type check?
        $this->document_url = $this->get_field('__DOCUMENT_URL');
Documentation Bug introduced by
It seems like $this->get_field('__DOCUMENT_URL') can also be of type false. However, the property $document_url is declared as type string. Maybe add an additional type check?
        $this->topic_url = $this->get_field('__TOPIC_URL');
Documentation Bug introduced by
It seems like $this->get_field('__TOPIC_URL') can also be of type false. However, the property $topic_url is declared as type string. Maybe add an additional type check?
        $this->created = $this->get_field('__CREATED_TS');
Documentation Bug introduced by
It seems like $this->get_field('__CREATED_TS') of type false or string is incompatible with the declared type integer of property $created.

Our type inference engine has found an assignment to a property that is incompatible with the declared type of that property.

Either this assignment is in error or the assigned type should be added to the documentation/type hint for that property.
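
For the timestamp members, a comparable sketch casts the stored value back to an integer. The helper name field_as_timestamp is an assumption for illustration and does not exist in MidCOM; it treats a missing field as 0, which matches the declared default of the properties.

```php
<?php
// Hypothetical helper, not part of MidCOM: convert a get_field() result
// (a string, or false when the field is missing) into an int timestamp.
function field_as_timestamp($value): int
{
    return $value === false ? 0 : (int) $value;
}

var_dump(field_as_timestamp('1262304000'));
var_dump(field_as_timestamp(false));
```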
        $this->edited = $this->get_field('__EDITED_TS');
Documentation Bug introduced by
It seems like $this->get_field('__EDITED_TS') of type false or string is incompatible with the declared type integer of property $edited.
        $this->indexed = $this->get_field('__INDEXED_TS');
Documentation Bug introduced by
It seems like $this->get_field('__INDEXED_TS') of type false or string is incompatible with the declared type integer of property $indexed.
        $this->title = $this->get_field('title');
Documentation Bug introduced by
It seems like $this->get_field('title') can also be of type false. However, the property $title is declared as type string. Maybe add an additional type check?

        $this->source = $this->get_field('__SOURCE');
Documentation Bug introduced by
It seems like $this->get_field('__SOURCE') can also be of type false. However, the property $source is declared as type string. Maybe add an additional type check?
        if ($creator = $this->get_field('__CREATOR')) {
            $this->creator = $this->read_person($creator);
        }
        if ($editor = $this->get_field('__EDITOR')) {
            $this->editor = $this->read_person($editor);
        }
        $this->author = $this->get_field('author');
Documentation Bug introduced by
It seems like $this->get_field('author') can also be of type false. However, the property $author is declared as type string. Maybe add an additional type check?
        $this->abstract = $this->get_field('abstract');
Documentation Bug introduced by
It seems like $this->get_field('abstract') can also be of type false. However, the property $abstract is declared as type string. Maybe add an additional type check?
        $this->type = $this->get_field('__TYPE');
Documentation Bug introduced by
It seems like $this->get_field('__TYPE') can also be of type false. However, the property $type is declared as type string. Maybe add an additional type check?
461
    }
462
463
    /**
464
     * Internal helper which actually stores a field.
465
     */
466
    protected function _add_field(string $name, string $type, $content, bool $is_utf8 = false)
467
    {
468
        $this->_fields[$name] = [
469
            'name' => $name,
470
            'type' => $type,
471
            'content' => ($is_utf8 ? $content : $this->_i18n->convert_to_utf8($content))
472
        ];
473
    }
474
    /**
     * Convert HTML to plain text (relatively simple):
     *
     * Basically, JavaScript blocks and
     * HTML tags are stripped, and all HTML entities
     * are converted to their native equivalents.
     *
     * Tags are replaced with a space rather than an empty string, so that constructs like
     * <li>torben</li><li>nehmer</li> are recognized correctly.
     */
    public function html2text(string $text) : string
    {
        $search = [
            "'\s*<script[^>]*?>.*?</script>\s*'si", // Strip out javascript
            "'\s*<[\/\!]*?[^<>]*?>\s*'si", // Strip out html tags
        ];
        $replace = [
            ' ',
            ' ',
        ];
        $result = $this->_i18n->html_entity_decode(preg_replace($search, $replace, $text));
        return trim(preg_replace('/\s+/s', ' ', $result));
    }

    /**
     * Checks whether this document is an instance of the given document type.
     *
     * This is equivalent to the is_a object hierarchy check, except that it
     * works with MidCOM documents. The check is a prefix comparison against
     * the type string built by _set_type().
     *
     * @see $type
     * @see _set_type()
     */
    public function is_a(string $document_type) : bool
    {
        return str_starts_with($this->type, $document_type);
    }

    /**
     * Sets the type of the object, reflecting the inheritance hierarchy.
     *
     * @see $type
     * @see is_a()
     */
    protected function _set_type(string $type)
    {
        if (empty($this->type)) {
            $this->type = $type;
        } else {
            $this->type .= "_{$type}";
        }
    }

    /**
     * Tries to determine the topic GUID and component using NAP's reverse-lookup capabilities.
     *
     * If this fails, you have to set the members $topic_guid, $topic_url and
     * $component manually.
     */
    protected function process_topic()
    {
        $nav = new midcom_helper_nav();
        $object = $nav->resolve_guid($this->source, true);
        if (!$object) {
            debug_add("Failed to resolve the topic, skipping autodetection.");
            return;
        }
        if ($object[MIDCOM_NAV_TYPE] == 'leaf') {
            $object = $nav->get_node($object[MIDCOM_NAV_NODEID]);
        }
        $this->topic_guid = $object[MIDCOM_NAV_GUID];
        $this->topic_url = $object[MIDCOM_NAV_FULLURL];
        $this->component = $object[MIDCOM_NAV_COMPONENT];
    }

    /**
     * Tries to resolve created, revised, author, editor and creator for the document from the Midgard object
     */
    public function read_metadata_from_object(midcom_core_dbaobject $object)
    {
        // if published is set to non-empty value, use it as creation date
        $this->created = $object->metadata->published ?: $object->metadata->created;
        // Revised
        $this->edited = $object->metadata->revised;
        // Heuristics to determine author
        if (!empty($object->metadata->authors)) {
            $this->author = $this->read_authorname($object->metadata->authors);
        } elseif (!empty($object->metadata->creator)) {
            $this->author = $this->read_authorname($object->metadata->creator);
        }
        // Creator
        if (isset($object->metadata->creator)) {
            $this->creator = $this->read_person($object->metadata->creator);
        }
        // Editor
        if (isset($object->metadata->revisor)) {
            $this->editor = $this->read_person($object->metadata->revisor);
        }
    }

    /**
     * Gets the person for the given GUID; results are cached.
     */
    private function read_person(string $guid) : ?midcom_db_person
    {
        try {
            return midcom_db_person::get_cached($guid);
        } catch (midcom_error $e) {
            return null;
        }
    }

    /**
     * Gets the person name for the given ID (if it is an imploded_wrapped list of multiple GUIDs, the first one is used)
     */
    private function read_authorname(string $input) : string
    {
        // Check for imploded_wrapped datamanager storage.
        if (str_contains($input, '|')) {
            // Find first non-empty value in the array and use that.
            // array_filter() preserves keys, so reindex before reading index 0.
            $id_arr = array_values(array_filter(explode('|', $input)));
            $input = $id_arr[0] ?? null;
        }

        return midcom::get()->auth->get_user($input)->name ?? '';
    }
}
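
The text handling in html2text(), the _set_type()/is_a() pairing and the imploded_wrapped parsing in read_authorname() can be tried outside MidCOM with a standalone sketch such as the one below. It is an approximation under stated assumptions, not the class's implementation: the `example_*` functions are hypothetical, PHP's built-in `html_entity_decode()` stands in for the `$this->_i18n` helper, the type and GUID values are made up, and the ID parsing reindexes with `array_values()` because `array_filter()` preserves keys.

```php
<?php
declare(strict_types=1);

// Approximates html2text(); html_entity_decode() is an assumed substitute
// for the i18n service used by the class.
function example_html2text(string $text): string
{
    $search = [
        "'\s*<script[^>]*?>.*?</script>\s*'si", // strip JavaScript blocks
        "'\s*<[\/\!]*?[^<>]*?>\s*'si",          // strip HTML tags
    ];
    // Replace with a space so that adjacent elements do not run together.
    $stripped = preg_replace($search, ' ', $text);
    $decoded = html_entity_decode($stripped, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse whitespace runs into single spaces.
    return trim(preg_replace('/\s+/s', ' ', $decoded));
}

// Approximates _set_type(): each call appends one more level to the type.
function example_extend_type(string $current, string $type): string
{
    return $current === '' ? $type : "{$current}_{$type}";
}

// Approximates is_a(): a plain prefix comparison on the type string.
function example_is_a(string $doc_type, string $wanted): bool
{
    return str_starts_with($doc_type, $wanted);
}

// Picks the first non-empty ID from an imploded_wrapped value like "|a|b|".
function example_first_id(string $input): ?string
{
    if (!str_contains($input, '|')) {
        return $input;
    }
    // array_filter() keeps the original keys, so reindex before reading [0].
    $ids = array_values(array_filter(explode('|', $input)));
    return $ids[0] ?? null;
}

var_dump(example_html2text('<li>torben</li><li>nehmer</li>')); // string(13) "torben nehmer"

$type = example_extend_type(example_extend_type('', 'midcom'), 'article');
var_dump($type);                          // string(14) "midcom_article"
var_dump(example_is_a($type, 'midcom'));  // bool(true)
var_dump(example_is_a($type, 'other'));   // bool(false)

var_dump(example_first_id('|guid-a|guid-b|')); // string(6) "guid-a"
```

Because the type check is a plain prefix match, a partial segment such as `'midcom_art'` would also match `'midcom_article'`; callers that need exact hierarchy levels may want to compare whole underscore-separated segments instead.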