Passed
Push — master ( 67a964...2aff1a )
by Konrad
04:08
created

LoadQueryHandler   C

Complexity

Total Complexity 55

Size/Duplication

Total Lines 306
Duplicated Lines 0 %

Test Coverage

Coverage 90.5%

Importance

Changes 1
Bugs 0 Features 0
Metric Value
eloc 174
dl 0
loc 306
rs 6
c 1
b 0
f 0
ccs 162
cts 179
cp 0.905
wmc 55

10 Methods

Rating  Name               Duplication  Size  Complexity
A       bufferIDSQL()      0            27    4
A       bufferGraphSQL()   0            15    2
A       runQuery()         0            30    2
A       getMaxTripleID()   0            10    2
C       addT()             0            33    10
A       getTripleID()      0            30    3
A       getMaxTermID()     0            18    6
A       bufferTripleSQL()  0            19    2
D       getStoredTermID()  0            68    18
A       checkSQLBuffers()  0            19    6

How to fix: Complexity

Complex Class

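Complex classes like LoadQueryHandler often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields and methods that share the same prefix or suffix. In LoadQueryHandler, for example, bufferIDSQL(), bufferGraphSQL() and bufferTripleSQL() share the buffer prefix and all write to the same SQL buffers.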

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use LoadQueryHandler, and based on these observations, apply Extract Interface, too.
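
As a rough illustration of Extract Class applied to this listing, the three buffer*SQL() methods and checkSQLBuffers() could move into a small collaborator. The sketch below is only one possible shape: the class name SqlInsertBuffer and its add()/flush() methods are hypothetical and not part of the package, and the value tuples are assumed to be escaped by the caller.

```php
<?php

/**
 * Hypothetical helper (not part of the package): collects value tuples per
 * table and renders one multi-row INSERT OR IGNORE statement per table.
 */
final class SqlInsertBuffer
{
    /** @var array<string, array{columns: string, rows: array<int, string>}> */
    private array $tables = [];

    /** Queue one already-escaped value tuple, such as "(1, 10)", for $table. */
    public function add(string $table, string $columns, string $tuple): void
    {
        $this->tables[$table]['columns'] = $columns;
        $this->tables[$table]['rows'][] = $tuple;
    }

    /**
     * Return the pending statements keyed by table name and empty the buffer.
     *
     * @return array<string, string>
     */
    public function flush(): array
    {
        $statements = [];
        foreach ($this->tables as $table => $entry) {
            $statements[$table] = 'INSERT OR IGNORE INTO '.$table
                .' ('.$entry['columns'].') VALUES '
                .implode(', ', $entry['rows']);
        }
        $this->tables = [];

        return $statements;
    }
}

// Usage: two g2t rows end up in a single statement.
$buffer = new SqlInsertBuffer();
$buffer->add('g2t', 'g, t', '(1, 10)');
$buffer->add('g2t', 'g, t', '(1, 11)');
$statements = $buffer->flush();
```

With such a class in place, LoadQueryHandler would only decide when to flush, while the statement assembly lives in one spot instead of three near-identical methods.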

<?php

/*
 * This file is part of the sweetrdf/InMemoryStoreSqlite package and licensed under
 * the terms of the GPL-3 license.
 *
 * (c) Konrad Abicht <[email protected]>
 * (c) Benjamin Nowack
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

namespace sweetrdf\InMemoryStoreSqlite\Store\QueryHandler;

use function sweetrdf\InMemoryStoreSqlite\calcURI;
use function sweetrdf\InMemoryStoreSqlite\getNormalizedValue;
use sweetrdf\InMemoryStoreSqlite\Store\TurtleLoader;

class LoadQueryHandler extends QueryHandler
{
    private string $target_graph;

    /**
     * @todo required?
     */
    private int $t_count;

    private int $write_buffer_size = 2500;

    public function runQuery($infos, $data = '', $keep_bnode_ids = 0)
    {
        $url = $infos['query']['url'];
        $graph = $infos['query']['target_graph'];
        $this->target_graph = $graph ? calcURI($graph) : calcURI($url);
        $this->keep_bnode_ids = $keep_bnode_ids;
Issue (Bug, Best Practice): The property keep_bnode_ids does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.

        // remove parameters
        $parserLogger = $this->store->getLoggerPool()->createNewLogger('Turtle');
        $loader = new TurtleLoader($parserLogger);
        $loader->setCaller($this);

        /* logging */
        $this->t_count = 0;
        $this->t_start = 0;
Issue (Bug, Best Practice): The property t_start does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.
        /* load and parse */
        $this->max_term_id = $this->getMaxTermID();
Issue (Bug, Best Practice): The property max_term_id does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.
        $this->max_triple_id = $this->getMaxTripleID();
Issue (Bug, Best Practice): The property max_triple_id does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.

        $this->term_ids = [];
Issue (Bug, Best Practice): The property term_ids does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.
        $this->triple_ids = [];
Issue (Bug, Best Practice): The property triple_ids does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.
        $this->sql_buffers = [];
Issue (Bug, Best Practice): The property sql_buffers does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.
        $loader->parse($url, $data);

        /* done */
        $this->checkSQLBuffers(1);

        return [
            't_count' => $this->t_count,
            'load_time' => 0,
        ];
    }
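
The "property does not exist" findings in runQuery() could be resolved by declaring each property on the class, which also matters because PHP 8.2 deprecates creating properties dynamically. The following is a minimal sketch on a stand-in class: the class name is hypothetical, and the property types and default values are assumptions inferred from how the values are assigned in the listing.

```php
<?php

// Stand-in for illustration only; the real class extends QueryHandler.
class LoadQueryHandlerSketch
{
    // Assumed types, inferred from the assignments in runQuery().
    private bool $keep_bnode_ids = false;
    private int $t_start = 0;
    private int $max_term_id = 1;
    private int $max_triple_id = 1;

    /** @var array<string, array<string, int>> term value => table key => ID */
    private array $term_ids = [];

    /** @var array<string, int> serialized triple => triple ID */
    private array $triple_ids = [];

    /** @var array<string, string> table name => pending INSERT statement */
    private array $sql_buffers = [];

    /** List the declared property names; none is created dynamically. */
    public function declaredProperties(): array
    {
        return array_keys(get_object_vars($this));
    }
}

$names = (new LoadQueryHandlerSketch())->declaredProperties();
```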

    public function addT($s, $p, $o, $s_type, $o_type, $o_dt = '', $o_lang = '')
    {
        $type_ids = ['uri' => '0', 'bnode' => '1', 'literal' => '2'];
        $g = $this->getStoredTermID($this->target_graph, '0', 'id');
        $s = (('bnode' == $s_type) && !$this->keep_bnode_ids) ? '_:b'.abs(crc32($g.$s)).'_'.(\strlen($s) > 12 ? substr(substr($s, 2), -10) : substr($s, 2)) : $s;
        $o = (('bnode' == $o_type) && !$this->keep_bnode_ids) ? '_:b'.abs(crc32($g.$o)).'_'.(\strlen($o) > 12 ? substr(substr($o, 2), -10) : substr($o, 2)) : $o;
        /* triple */
        $t = [
            's' => $this->getStoredTermID($s, $type_ids[$s_type], 's'),
            'p' => $this->getStoredTermID($p, '0', 'id'),
            'o' => $this->getStoredTermID($o, $type_ids[$o_type], 'o'),
            'o_lang_dt' => $this->getStoredTermID($o_dt.$o_lang, $o_dt ? '0' : '2', 'id'),
            'o_comp' => getNormalizedValue($o),
            's_type' => $type_ids[$s_type],
            'o_type' => $type_ids[$o_type],
        ];
        $t['t'] = $this->getTripleID($t);
        if (\is_array($t['t'])) {/* t exists already */
            $t['t'] = $t['t'][0];
        } else {
            $this->bufferTripleSQL($t);
        }
        /* g2t */
        $g2t = ['g' => $g, 't' => $t['t']];
        $this->bufferGraphSQL($g2t);
        ++$this->t_count;
        /* check buffers */
        if (0 == ($this->t_count % $this->write_buffer_size)) {
            $force_write = 1;
            $reset_buffers = (0 == ($this->t_count % ($this->write_buffer_size * 2)));
            $refresh_lock = (0 == ($this->t_count % 25000));
            $split_tables = (0 == ($this->t_count % ($this->write_buffer_size * 10)));
            $this->checkSQLBuffers($force_write, $reset_buffers, $refresh_lock, $split_tables);
Issue (Unused Code): The call to sweetrdf\InMemoryStoreSq...dler::checkSQLBuffers() has too many arguments starting with $refresh_lock. (Ignorable by Annotation)

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

            $this->/** @scrutinizer ignore-call */
                   checkSQLBuffers($force_write, $reset_buffers, $refresh_lock, $split_tables);

This check compares calls to functions or methods with their respective definitions. If the call has more arguments than are defined, it raises an issue.

If a function is defined several times with a different number of parameters, the check may pick up the wrong definition and report false positives. One codebase where this has been known to happen is WordPress. Please note the @ignore annotation hint above.
        }
    }
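
Part of addT()'s complexity rating of 10 comes from the two identical nested ternaries that rewrite blank node labels. One way to reduce it, sketched below, is to move that expression into a single helper. The function name scopeBnodeId() is hypothetical, and the string parameter types are an assumption, since the listing declares none.

```php
<?php

/**
 * Hypothetical helper: builds the graph-scoped blank node label that addT()
 * computes inline for both subject and object.
 */
function scopeBnodeId(string $graphId, string $bnode): string
{
    // Drop the leading "_:" of the original label.
    $local = substr($bnode, 2);

    // Labels longer than 12 characters keep only their last 10 characters.
    $suffix = \strlen($bnode) > 12 ? substr($local, -10) : $local;

    // Same hash input as the listing: graph term ID followed by the full label.
    return '_:b'.abs(crc32($graphId.$bnode)).'_'.$suffix;
}

$short = scopeBnodeId('7', '_:x1');
$long = scopeBnodeId('7', '_:abcdefghijklmnop');
```

Inside addT(), the two assignments would then shrink to a condition on the term type and keep_bnode_ids followed by one call each.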

    public function getMaxTermID(): int
    {
        $sql = '';
        foreach (['id2val', 's2val', 'o2val'] as $tbl) {
            $sql .= $sql ? ' UNION ' : '';
            $sql .= 'SELECT MAX(id) as id FROM '.$tbl;
        }
        $r = 0;

        $rows = $this->store->getDBObject()->fetchList($sql);

        if (\is_array($rows)) {
Issue: The condition is_array($rows) is always true.
            foreach ($rows as $row) {
                $r = ($r < $row['id']) ? $row['id'] : $r;
            }
        }

        return $r + 1;
    }
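
Since the is_array() check is reported as always true, the row handling in getMaxTermID() could be reduced to a small pure function. This is a sketch under the assumption that each row is an array with an id key that may be null (MAX() over an empty table yields NULL); the function name nextId() is hypothetical.

```php
<?php

/**
 * Hypothetical helper: returns the next free ID given rows that each carry
 * the MAX(id) of one table. A null or missing id counts as 0.
 *
 * @param array<int, array<string, mixed>> $rows
 */
function nextId(array $rows): int
{
    $max = 0;
    foreach ($rows as $row) {
        $max = max($max, (int) ($row['id'] ?? 0));
    }

    return $max + 1;
}

$next = nextId([['id' => 3], ['id' => null], ['id' => 7]]);
```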

    /**
     * @todo change DB schema and avoid using this function because it does not protect against race conditions
     *
     * @return int
     */
    public function getMaxTripleID()
    {
        $sql = 'SELECT MAX(t) AS `id` FROM triple';

        $row = $this->store->getDBObject()->fetchRow($sql);
        if (isset($row['id'])) {
            return $row['id'] + 1;
        }

        return 1;
    }

    public function getStoredTermID($val, $type_id, $tbl)
    {
        /* buffered */
        if (isset($this->term_ids[$val])) {
            if (!isset($this->term_ids[$val][$tbl])) {
                foreach (['id', 's', 'o'] as $other_tbl) {
                    if (isset($this->term_ids[$val][$other_tbl])) {
                        $this->term_ids[$val][$tbl] = $this->term_ids[$val][$other_tbl];
Issue (Bug, Best Practice): The property term_ids does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.
                        $this->bufferIDSQL($tbl, $this->term_ids[$val][$tbl], $val, $type_id);
                        break;
                    }
                }
            }

            return $this->term_ids[$val][$tbl];
        }
        /* db */
        $sub_tbls = ('id' == $tbl)
            ? ['id2val', 's2val', 'o2val']
            : ('s' == $tbl
                ? ['s2val', 'id2val', 'o2val']
                : ['o2val', 'id2val', 's2val']
            );

        foreach ($sub_tbls as $sub_tbl) {
            $id = 0;
            /* via hash */
            if (preg_match('/^(s2val|o2val)$/', $sub_tbl)) {
                $sql = 'SELECT id, val
                    FROM '.$sub_tbl.'
                    WHERE val_hash = "'.$this->getValueHash($val).'"';

                $rows = $this->store->getDBObject()->fetchList($sql);
                if (\is_array($rows)) {
                    foreach ($rows as $row) {
                        if ($row['val'] == $val) {
                            $id = $row['id'];
                            break;
                        }
                    }
                }
            } else {
                $binaryValue = $this->store->getDBObject()->escape($val);
                if (false !== empty($binaryValue)) {
                    $sql = 'SELECT id FROM '.$sub_tbl." WHERE val = '".$binaryValue."'";

                    $row = $this->store->getDBObject()->fetchRow($sql);
                    if (\is_array($row) && isset($row['id'])) {
                        $id = $row['id'];
                    }
                }
            }
            if (0 < $id) {
                $this->term_ids[$val] = [$tbl => $id];
                if ($sub_tbl != $tbl.'2val') {
                    $this->bufferIDSQL($tbl, $id, $val, $type_id);
                }
                break;
            }
        }
        /* new */
        if (!isset($this->term_ids[$val])) {
            $this->term_ids[$val] = [$tbl => $this->max_term_id];
            $this->bufferIDSQL($tbl, $this->max_term_id, $val, $type_id);
            ++$this->max_term_id;
        }

        return $this->term_ids[$val][$tbl];
    }
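
getStoredTermID() carries the worst rating in the class (D, complexity 18). A first, low-risk step might be to pull the nested ternary that picks the table lookup order into a helper backed by a lookup array. The sketch below assumes the same three table keys as the listing and, like the original expression, treats any key other than id and s as o; the function name lookupOrder() is hypothetical.

```php
<?php

/**
 * Hypothetical helper: returns the *2val tables to search for a term,
 * starting with the table that matches the requested key.
 *
 * @return array<int, string>
 */
function lookupOrder(string $tbl): array
{
    $orders = [
        'id' => ['id2val', 's2val', 'o2val'],
        's' => ['s2val', 'id2val', 'o2val'],
        'o' => ['o2val', 'id2val', 's2val'],
    ];

    // Unknown keys fall back to the "o" order, as in the original ternary.
    return $orders[$tbl] ?? $orders['o'];
}

$order = lookupOrder('s');
```

The same approach could be repeated for the hash-based lookup and the plain value lookup, each of which is a self-contained branch in the loop body.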

    public function getTripleID($t)
    {
        $val = serialize($t);
        /* buffered */
        if (isset($this->triple_ids[$val])) {
            /* hack for "don't insert this triple" */
            return [$this->triple_ids[$val]];
        }
        /* db */
        $sql = 'SELECT t
                  FROM triple
                 WHERE s = '.$t['s'].'
                    AND p = '.$t['p'].'
                    AND o = '.$t['o'].'
                    AND o_lang_dt = '.$t['o_lang_dt'].'
                    AND s_type = '.$t['s_type'].'
                    AND o_type = '.$t['o_type'].'
                 LIMIT 1';
        $row = $this->store->getDBObject()->fetchRow($sql);
        if (isset($row['t'])) {
            /* hack for "don't insert this triple" */
            $this->triple_ids[$val] = $row['t'];
Issue (Bug, Best Practice): The property triple_ids does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.

            return [$row['t']];
        } else {
            /* new */
            $this->triple_ids[$val] = $this->max_triple_id;
            ++$this->max_triple_id;

            return $this->triple_ids[$val];
        }
    }

    public function bufferTripleSQL($t)
    {
        $tbl = 'triple';
        $sql = ', ';

        $sqlHead = 'INSERT OR IGNORE INTO ';

        if (!isset($this->sql_buffers[$tbl])) {
            $this->sql_buffers[$tbl] = $sqlHead;
Issue (Bug, Best Practice): The property sql_buffers does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.
            $this->sql_buffers[$tbl] .= $tbl;
            $this->sql_buffers[$tbl] .= ' (t, s, p, o, o_lang_dt, o_comp, s_type, o_type) VALUES';
            $sql = ' ';
        }

        $oCompEscaped = $this->store->getDBObject()->escape($t['o_comp']);

        $this->sql_buffers[$tbl] .= $sql.'('.$t['t'].', '.$t['s'].', '.$t['p'].', ';
        $this->sql_buffers[$tbl] .= $t['o'].', '.$t['o_lang_dt'].", '";
        $this->sql_buffers[$tbl] .= $oCompEscaped."', ".$t['s_type'].', '.$t['o_type'].')';
    }

    public function bufferGraphSQL($g2t)
    {
        $tbl = 'g2t';
        $sql = ', ';

        /*
         * Use appropriate INSERT syntax, depending on the DBS.
         */
        $sqlHead = 'INSERT OR IGNORE INTO ';

        if (!isset($this->sql_buffers[$tbl])) {
            $this->sql_buffers[$tbl] = $sqlHead.$tbl.' (g, t) VALUES';
Issue (Bug, Best Practice): The property sql_buffers does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.
            $sql = ' ';
        }
        $this->sql_buffers[$tbl] .= $sql.'('.$g2t['g'].', '.$g2t['t'].')';
    }

    public function bufferIDSQL($tbl, $id, $val, $val_type)
    {
        $tbl = $tbl.'2val';
        if ('id2val' == $tbl) {
            $cols = 'id, val, val_type';
            $vals = '('.$id.", '".$this->store->getDBObject()->escape($val)."', ".$val_type.')';
        } elseif (preg_match('/^(s2val|o2val)$/', $tbl)) {
            $cols = 'id, val_hash, val';
            $vals = '('.$id.", '"
                .$this->getValueHash($val)
                ."', '"
                .$this->store->getDBObject()->escape($val)
                ."')";
        } else {
            $cols = 'id, val';
            $vals = '('.$id.", '".$this->store->getDBObject()->escape($val)."')";
        }
        if (!isset($this->sql_buffers[$tbl])) {
            $this->sql_buffers[$tbl] = '';
Issue (Bug, Best Practice): The property sql_buffers does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.
            $sqlHead = 'INSERT OR IGNORE INTO ';

            $sql = $sqlHead.$tbl.'('.$cols.') VALUES ';
        } else {
            $sql = ', ';
        }
        $sql .= $vals;
        $this->sql_buffers[$tbl] .= $sql;
    }

    public function checkSQLBuffers($force_write = 0, $reset_id_buffers = 0)
    {
        foreach (['triple', 'g2t', 'id2val', 's2val', 'o2val'] as $tbl) {
            $buffer_size = isset($this->sql_buffers[$tbl]) ? 1 : 0;
            if ($buffer_size && $force_write) {
                $this->store->getDBObject()->exec($this->sql_buffers[$tbl]);
                /* table error */
                $this->store->getDBObject()->getErrorMessage();
                unset($this->sql_buffers[$tbl]);

                /* reset term id buffers */
                if ($reset_id_buffers) {
                    $this->term_ids = [];
Issue (Bug, Best Practice): The property term_ids does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.
                    $this->triple_ids = [];
Issue (Bug, Best Practice): The property triple_ids does not exist. Although not strictly required by PHP, it is generally a best practice to declare properties explicitly.
                }
            }
        }

        return 1;
    }
}
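
checkSQLBuffers() accepts only $force_write and $reset_id_buffers, so the $refresh_lock and $split_tables values computed in addT() are never used. One option, sketched here, is to compute just the two supported flags in a small pure function and pass those. The function name flushFlags() and its array return shape are hypothetical.

```php
<?php

/**
 * Hypothetical helper: derives the two flags that checkSQLBuffers() accepts
 * from the running triple count and the write buffer size.
 *
 * @return array{force_write: bool, reset_id_buffers: bool}
 */
function flushFlags(int $tripleCount, int $writeBufferSize): array
{
    // Write whenever the count reaches a multiple of the buffer size.
    $forceWrite = 0 === $tripleCount % $writeBufferSize;

    // Reset the ID caches on every second write, as in the listing.
    $resetIdBuffers = $forceWrite && 0 === $tripleCount % ($writeBufferSize * 2);

    return ['force_write' => $forceWrite, 'reset_id_buffers' => $resetIdBuffers];
}

$flags = flushFlags(5000, 2500);
```

The call in addT() would then pass exactly two arguments, which removes the "too many arguments" finding without needing the ignore-call annotation.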