Completed
Push — master ( 1d9aa7...03d576 )
by Henri
10:00
created

Model::searchConceptsAndInfo()   C

Complexity

Conditions 10
Paths 68

Size

Total Lines 47
Code Lines 29

Duplication

Lines 0
Ratio 0 %
Metric Value
dl 0
loc 47
rs 5.1579
cc 10
eloc 29
nc 68
nop 1

How to fix: Complexity

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign that the commented part should be extracted into a new method; the comment is a good starting point for naming it.

Commonly applied refactorings include:
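
Extract Method is one such refactoring. As a minimal sketch of how it could apply to the labelling loop at the end of searchConceptsAndInfo(), the loop body is moved into a small named function. StubConcept, annotateConcept() and annotateConcepts() are hypothetical names introduced only for this illustration; the stub stands in for the concept objects returned by queryConceptInfo(), and the isset() guard on 'lang' is an added assumption, not part of the original method.

```php
<?php
// Hypothetical stand-in for the concept objects returned by queryConceptInfo().
class StubConcept
{
    public $foundBy = null;
    public $foundByType = null;
    public $contentLang = null;

    public function setFoundBy($label, $type)
    {
        $this->foundBy = $label;
        $this->foundByType = $type;
    }

    public function setContentLang($lang)
    {
        $this->contentLang = $lang;
    }
}

/**
 * Extracted loop body: record on one concept how the search hit found it.
 */
function annotateConcept($concept, array $hit)
{
    // hit key => type passed to setFoundBy(), in the same order as the original ifs
    $labelTypes = array(
        'altLabel' => 'alt',
        'hiddenLabel' => 'hidden',
        'matchedPrefLabel' => 'lang',
    );
    foreach ($labelTypes as $key => $type) {
        if (isset($hit[$key])) {
            $concept->setFoundBy($hit[$key], $type);
        }
    }
    if (isset($hit['lang'])) { // assumption: tolerate hits without a language
        $concept->setContentLang($hit['lang']);
    }
}

/**
 * What remains in the caller: one guard per hit instead of one per condition.
 */
function annotateConcepts(array $hits, array $concepts)
{
    foreach ($hits as $idx => $hit) {
        if (isset($concepts[$idx])) {
            annotateConcept($concepts[$idx], $hit);
        }
    }
    return $concepts;
}

$hits = array(array('uri' => 'http://example.org/c1', 'altLabel' => 'feline', 'lang' => 'en'));
$concepts = annotateConcepts($hits, array(new StubConcept()));
```

With the body extracted, the four separate conditions collapse into a lookup table and a single existence check, which is the kind of change that lowers the condition and path counts reported above.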

1
<?php
Coding Style, Compatibility
For compatibility and reusability of your code, PSR1 recommends that a file should introduce either new symbols (like classes, functions, etc.) or have side-effects (like outputting something, or including other files), but not both at the same time. The first symbol is defined on line 22 and the first side effect is on line 11.

The PSR-1: Basic Coding Standard recommends that a file should either introduce new symbols (classes, functions, constants, or similar) or have side effects, but not both. Side effects are anything that executes logic, such as printing output, changing ini settings, or writing to a file.

The idea behind this recommendation is that merely auto-loading a class should not change the state of an application. It also promotes a cleaner style of programming and makes your code less prone to errors, because the logic is not spread out all over the place.

To learn more about PSR-1, see the PHP-FIG site.
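
One way to follow this recommendation, sketched under assumptions: move the prefix registration that currently runs at include time into a method, and call that method from a bootstrap file. StubNamespace and Prefixes are hypothetical names; StubNamespace only stands in for EasyRdf_Namespace so that the sketch runs on its own. Both parts appear in one block here for brevity; in a real layout the final call would live in a separate file.

```php
<?php
// --- Declarations only (would live in their own file) ---

/** Hypothetical stand-in for EasyRdf_Namespace. */
class StubNamespace
{
    private static $prefixes = array();

    public static function set($prefix, $uri)
    {
        self::$prefixes[$prefix] = $uri;
    }

    public static function get($prefix)
    {
        return isset(self::$prefixes[$prefix]) ? self::$prefixes[$prefix] : null;
    }
}

/** Hypothetical holder for the often-needed namespace prefixes. */
class Prefixes
{
    public static function register()
    {
        StubNamespace::set('skosmos', 'http://purl.org/net/skosmos#');
        StubNamespace::set('void', 'http://rdfs.org/ns/void#');
        StubNamespace::set('skosext', 'http://purl.org/finnonto/schema/skosext#');
        StubNamespace::set('isothes', 'http://purl.org/iso25964/skos-thes#');
        StubNamespace::set('mads', 'http://www.loc.gov/mads/rdf/v1#');
    }
}

// --- Side effect only (would live in a separate bootstrap file) ---
Prefixes::register();
```

Loading the class file then changes nothing by itself; the prefixes are registered only when the bootstrap code asks for it.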

2
/**
3
 * Copyright (c) 2012 Aalto University and University of Helsinki
4
 * MIT License
5
 * see LICENSE.txt for more information
6
 */
7
8
/**
9
 * Setting some often needed namespace prefixes
10
 */
11
EasyRdf_Namespace::set('skosmos', 'http://purl.org/net/skosmos#');
12
EasyRdf_Namespace::set('void', 'http://rdfs.org/ns/void#');
13
EasyRdf_Namespace::set('skosext', 'http://purl.org/finnonto/schema/skosext#');
14
EasyRdf_Namespace::set('isothes', 'http://purl.org/iso25964/skos-thes#');
15
EasyRdf_Namespace::set('mads', 'http://www.loc.gov/mads/rdf/v1#');
16
17
/**
18
 * Model provides access to the data.
19
 * @property EasyRdf_Graph $graph
20
 * @property GlobalConfig $global_config
21
 */
22
class Model
Coding Style, Compatibility
PSR-1 recommends that each class be in a namespace of at least one level to avoid collisions.

You can fix this by adding a namespace to your class:

namespace YourVendor;

class YourClass { }

When choosing a vendor namespace, try to pick something that is not too generic to avoid conflicts with other libraries.
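
One consequence worth noting when namespacing a class like this one: inside a namespace, an unqualified class name such as Exception is resolved relative to that namespace, so global classes need a leading backslash or a use import. A minimal sketch, assuming a hypothetical vendor namespace ExampleVendor\Skosmos and a cut-down Model that only throws:

```php
<?php
namespace ExampleVendor\Skosmos; // hypothetical vendor namespace

use Exception; // import the global class so the unqualified name resolves

class Model
{
    public function getVocabulary($vocid)
    {
        // without the import above this would look for ExampleVendor\Skosmos\Exception
        throw new Exception("Vocabulary id '$vocid' not found in configuration.");
    }
}

// Demonstration only; in real code this would sit in a separate file.
$message = null;
try {
    (new Model())->getVocabulary('mesh');
} catch (Exception $e) {
    $message = $e->getMessage();
}
```

The same applies to the library classes used in this file: references such as EasyRdf_Graph would need a leading backslash or an import once the class is namespaced.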

23
{
24
    /** EasyRdf_Graph graph instance */
25
    private $graph;
26
    /** cache for Vocabulary objects */
27
    private $all_vocabularies = null;
28
    /** cache for Vocabulary objects */
29
    private $vocabs_by_graph = null;
30
    /** cache for Vocabulary objects */
31
    private $vocabs_by_urispace = null;
32
    /** how long to store retrieved URI information in APC cache */
33
    private $URI_FETCH_TTL = 86400; // 1 day
34
    private $global_config;
35
36
    /**
37
     * Initializes the Model object
38
     */
39
    public function __construct($config)
40
    {
41
        $this->global_config = $config;
42
        $this->initializeVocabularies();
43
    }
44
45
    /**
46
     * Returns the GlobalConfig object given to the Model as a constructor parameter.
47
     * @return GlobalConfig
48
     */
49
    public function getConfig() {
50
      return $this->global_config;
51
    }
52
53
    /**
54
     * Initializes the configuration from the vocabularies.ttl file
55
     */
56
    private function initializeVocabularies()
57
    {
58
        if (!file_exists($this->getConfig()->getVocabularyConfigFile())) {
59
            throw new Exception($this->getConfig()->getVocabularyConfigFile() . ' is missing, please provide one.');
60
        }
61
62
        try {
63
            // use APC user cache to store parsed vocabularies.ttl configuration
64
            if (function_exists('apc_store') && function_exists('apc_fetch')) {
65
                $key = realpath($this->getConfig()->getVocabularyConfigFile()) . ", " . filemtime($this->getConfig()->getVocabularyConfigFile());
66
                $this->graph = apc_fetch($key);
67
                if ($this->graph === false) { // was not found in cache
68
                    $this->graph = new EasyRdf_Graph();
69
                    $this->graph->parse(file_get_contents($this->getConfig()->getVocabularyConfigFile()));
70
                    apc_store($key, $this->graph);
71
                }
72
            } else { // APC not available, parse on every request
73
                $this->graph = new EasyRdf_Graph();
74
                $this->graph->parse(file_get_contents($this->getConfig()->getVocabularyConfigFile()));
75
            }
76
        } catch (Exception $e) {
77
            echo "Error: " . $e->getMessage();
78
        }
79
    }
80
81
    /**
82
     * Return the version of this Skosmos installation, or "unknown" if
83
     * it cannot be determined. The version information is based on Git tags.
84
     * @return string version
85
     */
86
    public function getVersion()
87
    {
88
        $ver = null;
89
        if (file_exists('.git')) {
90
            $ver = shell_exec('git describe --tags');
91
        }
92
93
        if ($ver === null) {
94
            return "unknown";
95
        }
96
97
        return $ver;
98
    }
99
100
    /**
101
     * Return all the vocabularies available.
102
     * @param boolean $categories whether you want everything included in a subarray of
103
     * a category.
104
     * @param boolean $shortname set to true if you want the vocabularies sorted by 
105
     * their shortnames instead of their titles.
106
     */
107
    public function getVocabularyList($categories = true, $shortname = false)
108
    {
109
        $cats = $this->getVocabularyCategories();
110
        $ret = array();
111
        foreach ($cats as $cat) {
112
            $catlabel = $cat->getTitle();
113
114
            // find all the vocabs in this category
115
            $vocs = array();
116
            foreach ($cat->getVocabularies() as $voc) {
117
                $vocs[$shortname ? $voc->getConfig()->getShortname() : $voc->getConfig()->getTitle()] = $voc;
118
            }
119
            uksort($vocs, 'strcoll');
120
121
            if (sizeof($vocs) > 0 && $categories) {
122
                $ret[$catlabel] = $vocs;
123
            } elseif (sizeof($vocs) > 0) {
124
                $ret = array_merge($ret, $vocs);
125
            }
126
127
        }
128
129
        if (!$categories) {
130
            uksort($ret, 'strcoll');
131
        }
132
133
        return $ret;
134
    }
135
136
    /**
137
     * Return all types (RDFS/OWL classes) present in the specified vocabulary or all vocabularies.
138
     * @return array Array with URIs (string) as key and array of (label, superclassURI) as value
139
     */
140
    public function getTypes($vocid = null, $lang = null)
141
    {
142
        $sparql = (isset($vocid)) ? $this->getVocabulary($vocid)->getSparql() : $this->getDefaultSparql();
143
        $result = $sparql->queryTypes($lang);
144
145
        foreach ($result as $uri => $values) {
146
            if (empty($values)) {
147
                $shorteneduri = EasyRdf_Namespace::shorten($uri);
148
                if ($shorteneduri !== null) {
149
                    $trans = gettext($shorteneduri);
150
                    if ($trans) {
151
                        $result[$uri] = array('label' => $trans);
152
                    }
153
                }
154
            }
155
        }
156
157
        return $result;
158
    }
159
160
    /**
161
     * Return the languages present in the configured vocabularies.
162
     * @return array Array with lang codes (string)
163
     */
164
    public function getLanguages($lang)
165
    {
166
        $vocabs = $this->getVocabularyList(false);
167
        $ret = array();
168
        foreach ($vocabs as $vocab) {
169
            foreach ($vocab->getConfig()->getLanguages() as $langcode) {
170
                $langlit = Punic\Language::getName($langcode, $lang);
171
                $ret[$langlit] = $langcode;
172
            }
173
        }
174
        ksort($ret);
175
        return array_unique($ret);
176
    }
177
178
    /**
179
     * returns a concept's RDF data in downloadable format
180
     * @param string $vocid vocabulary id, or null for global data (from all vocabularies)
181
     * @param string $uri concept URI
182
     * @param string $format the format in which you want to get the result; currently this function supports
183
     * text/turtle, application/ld+json, application/json (served as JSON-LD) and application/rdf+xml (the fallback)
184
     * @return string RDF data in the requested serialization
185
     */
186
    public function getRDF($vocid, $uri, $format)
187
    {
188
189
        if ($format == 'text/turtle') {
190
            $retform = 'turtle';
191
            $serialiser = new EasyRdf_Serialiser_Turtle();
192
        } elseif ($format == 'application/ld+json' || $format == 'application/json') {
193
            $retform = 'jsonld'; // serve JSON-LD for both JSON-LD and plain JSON requests
194
            $serialiser = new EasyRdf_Serialiser_JsonLd();
195
        } else {
196
            $retform = 'rdfxml';
197
            $serialiser = new EasyRdf_Serialiser_RdfXml();
198
        }
199
200
        if ($vocid !== null) {
201
            $vocab = $this->getVocabulary($vocid);
202
            $sparql = $vocab->getSparql();
203
            $arrayClass = $vocab->getConfig()->getArrayClassURI();
204
            $vocabs = array($vocab);
205
        } else {
206
            $sparql = $this->getDefaultSparql();
207
            $arrayClass = null;
208
            $vocabs = null;
209
        }
210
        $result = $sparql->queryConceptInfo($uri, $arrayClass, $vocabs, true);
211
212
        return $serialiser->serialise($result, $retform);
213
    }
214
215
    /**
216
     * Makes a SPARQL-query to the endpoint that retrieves concept
217
     * references as its search results.
218
     * @param ConceptSearchParameters $params an object that contains all the parameters needed for the search
219
     * @return array search results
220
     */
221
    public function searchConcepts($params)
222
    {
223
        // don't even try to search for empty prefix
224
        if ($params->getSearchTerm() === "" || !preg_match('/[^*]/', $params->getSearchTerm())) {
225
            return array();
226
        }
227
228
        $vocids = $params->getVocabIds();
229
        $vocabs = $params->getVocabs();
230
231
        if (sizeof($vocids) == 1) { // search within vocabulary
232
            $voc = $vocabs[0];
233
            $sparql = $voc->getSparql();
234
        } else { // multi-vocabulary or global search
235
            $voc = null;
236
            $sparql = $this->getDefaultSparql();
237
        }
238
239
        $results = $sparql->queryConcepts($vocabs, $params->getHidden(), $params->getAdditionalFields(), $params->getUnique(), $params);
240
        $ret = array();
241
242
        foreach ($results as $hit) {
243
            if (sizeof($vocids) == 1) {
244
                $hit['vocab'] = $vocids[0];
245
            } else {
246
                try {
247
                    $voc = $this->getVocabularyByGraph($hit['graph']);
248
                    $hit['vocab'] = $voc->getId();
249
                } catch (Exception $e) {
250
                    trigger_error($e->getMessage(), E_USER_WARNING);
251
                    $voc = null;
252
                    $hit['vocab'] = "???";
253
                }
254
            }
255
            unset($hit['graph']);
256
257
            $hit['voc'] = $voc;
258
259
            // if uri is an external vocab uri that is included in the current vocab
260
            $realvoc = $this->guessVocabularyFromURI($hit['uri']);
261
            if ($realvoc != $voc) {
262
                unset($hit['localname']);
263
                $hit['exvocab'] = ($realvoc !== null) ? $realvoc->getId() : "???";
264
            }
265
266
            $ret[] = $hit;
267
        }
268
269
        return $ret;
270
    }
271
272
    /**
273
     * Function for performing a search for concepts and their data fields.
274
     * @param ConceptSearchParameters $params an object that contains all the parameters needed for the search
275
     * @return array array with keys 'count' and 'results'
276
     */
277
    public function searchConceptsAndInfo($params)
278
    {
279
        $params->setUnique(true);
280
        $allhits = $this->searchConcepts($params);
281
        $hits = array_slice($allhits, $params->getOffset(), $params->getSearchLimit());
282
283
        $uris = array();
284
        $vocabs = array();
285
        $uniqueVocabs = array();
286
        foreach ($hits as $hit) {
287
            $uniqueVocabs[$hit['voc']->getId()] = $hit['voc']->getId();
288
            $vocabs[] = $hit['voc'];
289
            $uris[] = $hit['uri'];
290
        }
291
        if (sizeof($uniqueVocabs) == 1) {
292
            $voc = $vocabs[0];
293
            $sparql = $voc->getSparql();
294
            $arrayClass = $voc->getConfig()->getArrayClassURI();
295
        } else {
296
            $arrayClass = null;
297
            $sparql = $this->getDefaultSparql();
298
        }
299
        $ret = $sparql->queryConceptInfo($uris, $arrayClass, $vocabs, null, $params->getSearchLang());
300
301
        // For marking that the concept has been found through an alternative label, hidden
302
        // label or a label in another language
303
        foreach ($hits as $idx => $hit) {
304
            if (isset($hit['altLabel']) && isset($ret[$idx])) {
305
                $ret[$idx]->setFoundBy($hit['altLabel'], 'alt');
306
            }
307
308
            if (isset($hit['hiddenLabel']) && isset($ret[$idx])) {
309
                $ret[$idx]->setFoundBy($hit['hiddenLabel'], 'hidden');
310
            }
311
312
            if (isset($hit['matchedPrefLabel']) && isset($ret[$idx])) {
313
                $ret[$idx]->setFoundBy($hit['matchedPrefLabel'], 'lang');
314
            }
315
316
            if (isset($ret[$idx])) {
317
                $ret[$idx]->setContentLang($hit['lang']);
318
            }
319
320
        }
321
322
        return array('count' => sizeof($allhits), 'results' => $ret);
323
    }
324
325
    /**
326
     * Creates dataobjects from an input array.
327
     * @param string $class the type of class eg. 'Vocabulary'.
328
     * @param array $resarr contains the EasyRdf_Resources.
329
     */
330
    private function createDataObjects($class, $resarr)
331
    {
332
        $ret = array();
333
        foreach ($resarr as $res) {
334
            $ret[] = new $class($this, $res);
335
        }
336
337
        return $ret;
338
    }
339
340
    /**
341
     * Returns the cached vocabularies.
342
     * @return array of Vocabulary dataobjects
343
     */
344
    public function getVocabularies()
345
    {
346
        if ($this->all_vocabularies === null) { // initialize cache
347
            $vocs = $this->graph->allOfType('skosmos:Vocabulary');
348
            $this->all_vocabularies = $this->createDataObjects("Vocabulary", $vocs);
349
            foreach ($this->all_vocabularies as $voc) {
350
                // register vocabulary ids as RDF namespace prefixes
351
                $prefix = preg_replace('/\W+/', '', $voc->getId()); // strip non-word characters
352
                try {
353
                    if ($prefix != '' && EasyRdf_Namespace::get($prefix) === null) // if not already defined
354
                    {
355
                        EasyRdf_Namespace::set($prefix, $voc->getUriSpace());
356
                    }
357
358
                } catch (Exception $e) {
359
                    // not valid as namespace identifier, ignore
360
                }
361
            }
362
        }
363
364
        return $this->all_vocabularies;
365
    }
366
367
    /**
368
     * Returns the cached vocabularies from a category.
369
     * @param EasyRdf_Resource $cat the category in question
370
     * @return array of vocabulary dataobjects
371
     */
372
    public function getVocabulariesInCategory($cat)
373
    {
374
        $vocs = $this->graph->resourcesMatching('dc:subject', $cat);
375
376
        return $this->createDataObjects("Vocabulary", $vocs);
377
    }
378
379
    /**
380
     * Creates dataobjects of all the different vocabulary categories (Health etc.).
381
     * @return array of Dataobjects of the type VocabularyCategory.
382
     */
383
    public function getVocabularyCategories()
384
    {
385
        $cats = $this->graph->allOfType('skos:Concept');
386
387
        return $this->createDataObjects("VocabularyCategory", $cats);
388
    }
389
390
    /**
391
     * Returns the label defined in vocabularies.ttl with the appropriate language.
392
     * @param string $lang language code of returned labels, eg. 'fi'
393
     * @return string the label for vocabulary categories.
394
     */
395
    public function getClassificationLabel($lang)
396
    {
397
        $cats = $this->graph->allOfType('skos:ConceptScheme');
398
        $label = $cats ? $cats[0]->label($lang) : null;
399
400
        return $label;
401
    }
402
403
    /**
404
     * Returns a single cached vocabulary.
405
     * @param string $vocid the vocabulary id eg. 'mesh'.
406
     * @return Vocabulary vocabulary dataobject; an Exception is thrown if the id is not found
407
     */
408
    public function getVocabulary($vocid)
409
    {
410
        $vocs = $this->getVocabularies();
411
        foreach ($vocs as $voc) {
412
            if ($voc->getId() == $vocid) {
413
                return $voc;
414
            }
415
        }
416
        throw new Exception("Vocabulary id '$vocid' not found in configuration.");
417
    }
418
419
    /**
420
     * Return the vocabulary that is stored in the given graph on the given endpoint.
421
     *
422
     * @param $graph string graph URI
423
     * @param $endpoint string endpoint URL (default SPARQL endpoint if omitted)
424
     * @return Vocabulary vocabulary stored in this graph; an Exception is thrown if none is found
425
     */
426
    public function getVocabularyByGraph($graph, $endpoint = null)
427
    {
428
        if ($endpoint === null) {
429
            $endpoint = $this->getConfig()->getDefaultEndpoint();
430
        }
431
        if ($this->vocabs_by_graph === null) { // initialize cache
432
            $this->vocabs_by_graph = array();
433
            foreach ($this->getVocabularies() as $voc) {
434
                $key = json_encode(array($voc->getGraph(), $voc->getEndpoint()));
435
                $this->vocabs_by_graph[$key] = $voc;
436
            }
437
        }
438
439
        $key = json_encode(array($graph, $endpoint));
440
        if (array_key_exists($key, $this->vocabs_by_graph)) {
441
            return $this->vocabs_by_graph[$key];
442
        } else {
443
            throw new Exception("no vocabulary found for graph $graph and endpoint $endpoint");
444
        }
445
446
    }
447
448
    /**
449
     * Guess which vocabulary a URI originates from, based on the declared
450
     * vocabulary URI spaces.
451
     *
452
     * @param $uri string URI to search
453
     * @return Vocabulary vocabulary of this URI, or null if not found
454
     */
455
    public function guessVocabularyFromURI($uri)
456
    {
457
        if ($this->vocabs_by_urispace === null) { // initialize cache
458
            $this->vocabs_by_urispace = array();
459
            foreach ($this->getVocabularies() as $voc) {
460
                $this->vocabs_by_urispace[$voc->getUriSpace()] = $voc;
461
            }
462
        }
463
464
        // try to guess the URI space and look it up in the cache
465
        $res = new EasyRdf_Resource($uri);
466
        $namespace = substr($uri, 0, -strlen($res->localName()));
467
        if (array_key_exists($namespace, $this->vocabs_by_urispace)) {
468
            return $this->vocabs_by_urispace[$namespace];
469
        }
470
471
        // didn't work, try to match with each URI space separately
472
        foreach ($this->vocabs_by_urispace as $urispace => $voc) {
473
            if (strpos($uri, $urispace) === 0) {
474
                return $voc;
475
            }
476
        }
477
478
        // not found
479
        return null;
480
    }
481
482
    /**
483
     * Get the label for a resource, preferring 1. the given language 2. configured languages 3. any language.
484
     * @param EasyRdf_Resource $res resource whose label to return
485
     * @param string $lang preferred language
486
     * @return EasyRdf_Literal label as an EasyRdf_Literal object, or null if not found
487
     */
488
    public function getResourceLabel($res, $lang)
489
    {
490
        $langs = array_merge(array($lang), array_keys($this->getConfig()->getLanguages()));
491
        foreach ($langs as $l) {
492
            $label = $res->label($l);
493
            if ($label !== null) {
494
                return $label;
495
            }
496
497
        }
498
        return $res->label(); // desperate check for label in any language; will return null if even this fails
499
    }
500
501
    private function fetchResourceFromUri($uri)
502
    {
503
        try {
504
            // change the timeout setting for external requests
505
            $httpclient = EasyRdf_Http::getDefaultHttpClient();
506
            $httpclient->setConfig(array('timeout' => $this->getConfig()->getHttpTimeout()));
507
            EasyRdf_Http::setDefaultHttpClient($httpclient);
508
509
            $client = EasyRdf_Graph::newAndLoad(EasyRdf_Utils::removeFragmentFromUri($uri));
510
            return $client->resource($uri);
511
        } catch (Exception $e) {
512
            return null;
513
        }
514
    }
515
516
    public function getResourceFromUri($uri)
517
    {
518
        EasyRdf_Format::unregister('json'); // prevent parsing errors for sources which return invalid JSON
519
        // using apc cache for the resource if available
520
        if (function_exists('apc_store') && function_exists('apc_fetch')) {
521
            $key = 'fetch: ' . EasyRdf_Utils::removeFragmentFromUri($uri);
522
            $resource = apc_fetch($key);
523
            if ($resource === null || $resource === false) { // was not found in cache, or previous request failed
524
                $resource = $this->fetchResourceFromUri($uri);
525
                apc_store($key, $resource, $this->URI_FETCH_TTL);
526
            }
527
        } else { // APC not available, parse on every request
528
            $resource = $this->fetchResourceFromUri($uri);
529
        }
530
        return $resource;
531
    }
532
533
    /**
534
     * Returns a SPARQL endpoint object.
535
     * @param string $dialect eg. 'JenaText'.
536
     * @param string $endpoint url address of endpoint
537
     * @param string $graph uri for the target graph.
538
     */
539
    public function getSparqlImplementation($dialect, $endpoint, $graph)
540
    {
541
        $classname = $dialect . "Sparql";
542
543
        return new $classname($endpoint, $graph, $this);
544
    }
545
546
    /**
547
     * Returns a SPARQL endpoint object using the default implementation set in the config.inc.
548
     */
549
    public function getDefaultSparql()
550
    {
551
        return $this->getSparqlImplementation($this->getConfig()->getDefaultSparqlDialect(), $this->getConfig()->getDefaultEndpoint(), '?graph');
552
    }
553
554
}
555