|
1
|
|
|
<?php |
|
2
|
|
|
namespace Ivory\Lang\SqlPattern; |
|
3
|
|
|
|
|
4
|
|
|
use Ivory\Exception\NoDataException; |
|
5
|
|
|
|
|
6
|
|
|
/** |
|
7
|
|
|
* An immutable representation of an SQL pattern. |
|
8
|
|
|
* |
|
9
|
|
|
* This class is not intended for direct construction - instead, use {@link SqlPatternParser} to had an `SqlPattern` |
|
10
|
|
|
* created, or {@link \Ivory\Query\SqlRelationRecipe} or {@link \Ivory\Query\SqlCommandRecipe} to use it indirectly. |
|
11
|
|
|
* |
|
12
|
|
|
* An SQL pattern is a plain SQL string with special placeholders for parameters. |
|
13
|
|
|
* |
|
14
|
|
|
* There are two kinds of parameters which might be used in a pattern: |
|
15
|
|
|
* - *named parameters* - these are specified by an explicit name; and |
|
16
|
|
|
* - *positional parameters* - specified solely by their position relative to other positional parameters. |
|
17
|
|
|
* |
|
18
|
|
|
* Multiple occurrences of the same named parameter may be used in the SQL pattern, referring to the same one value. On |
|
19
|
|
|
* the other hand, positional parameters may not be reused - a value must be provided for each positional parameter, and |
|
20
|
|
|
* placeholders may not explicitly refer to positional parameter values. |
|
21
|
|
|
* |
|
22
|
|
|
* Placeholders use the following syntax in SQL patterns: |
|
23
|
|
|
* <pre> |
|
24
|
|
|
* %[type][:name] |
|
25
|
|
|
* </pre> |
|
26
|
|
|
* where: |
|
27
|
|
|
* * `name` is the name of the parameter (if not specified, the parameter is treated as a positional parameter); and |
|
28
|
|
|
* * `type` is an explicit type specification, governing how the value given for the parameter will be encoded to the |
|
29
|
|
|
* SQL string. If type is not given, it is inferred from the actual data type of the parameter value. |
|
30
|
|
|
* |
|
31
|
|
|
* Examples: |
|
32
|
|
|
* * valid `name`s: `tbl`, `person_id`, `p1`; |
|
33
|
|
|
* * valid `type` specifications: `s`, `int_singleton`, `t1`, `public.planet`, `public.planet[]`, `int[][]`, |
|
34
|
|
|
* `public."int"`, `"my schema"."my type"`, `{double precision}`. |
|
35
|
|
|
* |
|
36
|
|
|
* Detailed rules on syntax and the corresponding semantics follow: |
|
37
|
|
|
* * Let us define a "token", first, which in this context means a string of one or more letters, digits and underscore |
|
38
|
|
|
* characters (`_`), not starting with a digit. |
|
39
|
|
|
* * `name` may only be a single token. Especially, the name may not be a number - referring to positional arguments is |
|
40
|
|
|
* not supported. |
|
41
|
|
|
* * `type` may either be: |
|
42
|
|
|
* * a single token, or |
|
43
|
|
|
* * any string enclosed in double quotes (the `"` character; to write a double quote character literally, use two of |
|
44
|
|
|
* them), or |
|
45
|
|
|
* * any string within a pair of curly braces (the `{` and `}`); note there is no way to type the closing brace |
|
46
|
|
|
* literally using this variant). |
|
47
|
|
|
* * The `type` may optionally be prefixed with `typeschema.`, where `typeschema` meets the same syntax rules as `type`. |
|
48
|
|
|
* * The `type` specification may be ended with an empty pair of square brackets, indicating an array type. (Multiple |
|
49
|
|
|
* pairs of brackets are accepted, although these are treated the same as a mere one bracket pair, consistently with |
|
50
|
|
|
* what PostgreSQL does.) Note, however, that only empty pairs of brackets are recognized as part of the type |
|
51
|
|
|
* specification. An array placeholder immediately followed by a subscript works as expected: `SELECT %bigint[][2]` |
|
52
|
|
|
* selects the item under index 2 from the provided array. |
|
53
|
|
|
* * Names of available types, as well as the rules inferring the type automatically (when the type is not specified), |
|
54
|
|
|
* are defined by the {@link Ivory\Type\TypeDictionary} used by the connection for which the recipe will be serialized |
|
55
|
|
|
* to an SQL string. The standard Ivory deployment registers all types defined in the connected-to database under |
|
56
|
|
|
* their fully qualified names (e.g., `public.sometype`) and also some aliases, especially those corresponding to the |
|
57
|
|
|
* SQL reserved types (e.g., `int`). Besides, some custom types special for being used in SQL patterns may be |
|
58
|
|
|
* registered (e.g., `sql`). |
|
59
|
|
|
* * Registered types need not be schema-qualified. Just the name of the type is sufficient - the PostgreSQL |
|
60
|
|
|
* `search_path` facility is leveraged to identify the type. Before the `search_path` schemas are actually searched, |
|
61
|
|
|
* custom types and type aliases are considered. |
|
62
|
|
|
* * Note the difference between quoted type name and an unquoted one (i.e., using the first or the third syntax) is the |
|
63
|
|
|
* same as for PostgreSQL: a quoted type name is case sensitive and it cannot refer to an |
|
64
|
|
|
* {@link \Ivory\Lang\Sql\Types::getReservedTypes() SQL reserved type}. Recall specifying, e.g., `SELECT 1::"int"` |
|
65
|
|
|
* addresses a user-defined type named `int`, while `SELECT 1::int` always refers to the reserved type, regardless of |
|
66
|
|
|
* `search_path` or any user-defined types. This is, actually, also the reason for the curly braces syntax - |
|
67
|
|
|
* otherwise, multi-word SQL reserved types (such as `double precision`) could not be specified. The braces syntax |
|
68
|
|
|
* may, of course, be used for regular built-in or user-defined types as long as there is no conflicting reserved type |
|
69
|
|
|
* (which would be registered with the `TypeDictionary` as an alias). |
|
70
|
|
|
* |
|
71
|
|
|
* In specific situations, multiple same-named placeholders may be used with different type specifications, e.g., |
|
72
|
|
|
* `SELECT id FROM %ident:tbl UNION SELECT object_id FROM log WHERE table = %s:tbl`. This is perfectly legal - a single |
|
73
|
|
|
* value for the `tbl` parameter will be encoded as an identifier for the first placeholder and as a string literal for |
|
74
|
|
|
* the second placeholder. |
|
75
|
|
|
* |
|
76
|
|
|
* To use a literal `%` in the SQL string, type `%%`. |
|
77
|
|
|
* |
|
78
|
|
|
* The percent signs are searched in the whole string, regardless of the surrounding content. Namely, a percent sign |
|
79
|
|
|
* inside string constants written in the SQL string literally *will* be interpreted as a placeholder and replaced with |
|
80
|
|
|
* a provided value. |
|
81
|
|
|
* |
|
82
|
|
|
* Note that even {@link IRelationRecipe} and {@link ICommandRecipe} objects may be used as parameter values to form a |
|
83
|
|
|
* more complex recipe, e.g., a |
|
84
|
|
|
* {@link https://www.postgresql.org/docs/current/static/queries-with.html Common Table Expression}. |
|
85
|
|
|
* |
|
86
|
|
|
* @internal Ivory design note: The positional parameters could have been treated as parameters named by their |
|
87
|
|
|
* zero-based position. This is not the case, though. If placeholders could refer to positional parameters (e.g., |
|
88
|
|
|
* <tt>%s:0</tt>), it would only complicate the specification without any significant benefit. Especially the |
|
89
|
|
|
* {@link SqlRecipe::fromFragments()} would be overcomplicated as the placeholders would have to be re-numbered. |
|
90
|
|
|
* |
|
91
|
|
|
* @internal Ivory design note: The common placeholder syntax ":name" is intentionally unsupported. PostgreSQL uses |
|
92
|
|
|
* <tt>::type</tt> for typecasts, which could be mistaken for the named arguments written as ":name". Moreover, it would |
|
93
|
|
|
* collide with Embedded SQL which also uses the same syntax. Rather than complicating the queries with escaping, Ivory |
|
94
|
|
|
* requires the leading % sign, which also simplifies parsing the patterns - both for the machine and for the humans. |
|
95
|
|
|
*/ |
|
96
|
|
|
class SqlPattern |
|
97
|
|
|
{ |
|
98
|
|
|
private $sqlTorso; |
|
99
|
|
|
/** @var SqlPatternPlaceholder[] */ |
|
100
|
|
|
private $positionalPlaceholders; |
|
101
|
|
|
/** @var SqlPatternPlaceholder[][] */ |
|
102
|
|
|
private $namedPlaceholderMap; |
|
103
|
|
|
/** @var SqlPatternPlaceholder[] */ |
|
104
|
|
|
private $placeholderSequence; |
|
105
|
|
|
|
|
106
|
|
|
/** |
|
107
|
|
|
* @param string $sqlTorso torso of the SQL parsed from the described SQL pattern; this is the pattern with removed |
|
108
|
|
|
* placeholders and unescaped <tt>%%</tt> sequences |
|
109
|
|
|
* @param SqlPatternPlaceholder[] $positionalPlaceholders |
|
110
|
|
|
* list of positional placeholders, in order of appearance, used in the described |
|
111
|
|
|
* SQL pattern |
|
112
|
|
|
* @param SqlPatternPlaceholder[][] $namedPlaceholderMap |
|
113
|
|
|
* map of named placeholders used in the described SQL pattern: name => list of all |
|
114
|
|
|
* placeholders (in order of appearance) referring to the parameter name |
|
115
|
|
|
*/ |
|
116
|
|
|
public function __construct(string $sqlTorso, array $positionalPlaceholders, array $namedPlaceholderMap) |
|
117
|
|
|
{ |
|
118
|
|
|
assert( |
|
119
|
|
|
!$positionalPlaceholders || array_keys($positionalPlaceholders) == range(0, count($positionalPlaceholders) - 1), |
|
|
|
|
|
|
120
|
|
|
new \InvalidArgumentException('$positionalPlaceholders array is not a list - keys do not form a sequence') |
|
121
|
|
|
); |
|
122
|
|
|
|
|
123
|
|
|
$this->sqlTorso = $sqlTorso; |
|
124
|
|
|
$this->positionalPlaceholders = $positionalPlaceholders; |
|
125
|
|
|
$this->namedPlaceholderMap = $namedPlaceholderMap; |
|
126
|
|
|
|
|
127
|
|
|
$this->initPlaceholderSequence(); |
|
128
|
|
|
} |
|
129
|
|
|
|
|
130
|
|
|
private function initPlaceholderSequence() |
|
131
|
|
|
{ |
|
132
|
|
|
$byOffset = []; |
|
133
|
|
|
foreach ($this->positionalPlaceholders as $plcHld) { |
|
134
|
|
|
$byOffset[$plcHld->getOffset()] = $plcHld; |
|
135
|
|
|
} |
|
136
|
|
|
foreach ($this->namedPlaceholderMap as $plcHlds) { |
|
137
|
|
|
foreach ($plcHlds as $plcHld) { |
|
138
|
|
|
$byOffset[$plcHld->getOffset()] = $plcHld; |
|
139
|
|
|
} |
|
140
|
|
|
} |
|
141
|
|
|
ksort($byOffset); |
|
142
|
|
|
$this->placeholderSequence = array_values($byOffset); |
|
|
|
|
|
|
143
|
|
|
} |
|
144
|
|
|
|
|
145
|
|
|
/** |
|
146
|
|
|
* Returns the described SQL pattern with placeholders removed and sequences `%%` unescaped. |
|
147
|
|
|
* |
|
148
|
|
|
* Parameter values must be inserted in this string in place of the removed placeholders to form a valid SQL |
|
149
|
|
|
* statement. Use {@link fillSql()} or {@link generateSql()} for that. |
|
150
|
|
|
*/ |
|
151
|
|
|
public function getSqlTorso(): string |
|
152
|
|
|
{ |
|
153
|
|
|
return $this->sqlTorso; |
|
154
|
|
|
} |
|
155
|
|
|
|
|
156
|
|
|
/** |
|
157
|
|
|
* @return SqlPatternPlaceholder[] list of positional placeholders, in order of appearance, used in the |
|
158
|
|
|
* described SQL pattern |
|
159
|
|
|
*/ |
|
160
|
|
|
public function getPositionalPlaceholders() |
|
161
|
|
|
{ |
|
162
|
|
|
return $this->positionalPlaceholders; |
|
163
|
|
|
} |
|
164
|
|
|
|
|
165
|
|
|
/** |
|
166
|
|
|
* @return SqlPatternPlaceholder[][] map of named placeholders used in the described SQL pattern: name => list |
|
167
|
|
|
* of all placeholders (in order of appearance) referring to the parameter |
|
168
|
|
|
* name |
|
169
|
|
|
*/ |
|
170
|
|
|
public function getNamedPlaceholderMap() |
|
171
|
|
|
{ |
|
172
|
|
|
return $this->namedPlaceholderMap; |
|
173
|
|
|
} |
|
174
|
|
|
|
|
175
|
|
|
/** |
|
176
|
|
|
* Generate SQL string from this pattern with encoded parameter values requested to be filled by the caller. |
|
177
|
|
|
* |
|
178
|
|
|
* This is a more general method to {@link fillSql()}. The reason for this method is that a single named parameter |
|
179
|
|
|
* may have multiple placeholders within the pattern, each with a different type specification. Therefore, each |
|
180
|
|
|
* occurrence might result in different encoding. This method presents the caller each placeholder, one by one, so |
|
181
|
|
|
* that the caller provides the encoded parameter value. |
|
182
|
|
|
* |
|
183
|
|
|
* Technically, this is achieved by returning a `\Generator` being treated as a coroutine. Each placeholder is |
|
184
|
|
|
* yielded from the generator as a {@link SqlPatternPlaceholder} object, describing the placeholder to fill the |
|
185
|
|
|
* value for. The caller has to {@link \Generator::send() send()} the encoded value for this placeholder, and then |
|
186
|
|
|
* take the next placeholder. After iterating over all placeholders for which a value was requested, the final SQL |
|
187
|
|
|
* string may be retrieved by calling {@link \Generator::getReturn()} on the generator. |
|
188
|
|
|
* |
|
189
|
|
|
* The encoded values sent to the generator are treated as strings. They has to either be strings or other types |
|
190
|
|
|
* convertible to strings. `null`, however, is considered as if the caller forgot to provide a value at all, and a |
|
191
|
|
|
* `NoDataException` is thrown in such a case. |
|
192
|
|
|
* |
|
193
|
|
|
* Example: |
|
194
|
|
|
* <code> |
|
195
|
|
|
* <?php |
|
196
|
|
|
* $pattern = new SqlPattern( |
|
197
|
|
|
* 'SELECT id FROM UNION SELECT object_id FROM log WHERE table = ', |
|
198
|
|
|
* [new SqlPatternPlaceholder(15, 'tbl', 'ident'), new SqlPatternPlaceholder(62, 'tbl', 'string')], |
|
199
|
|
|
* [] |
|
200
|
|
|
* ); |
|
201
|
|
|
* $gen = $pattern->generateSql(); |
|
202
|
|
|
* while ($gen->valid()) { |
|
203
|
|
|
* $placeholder = $gen->current(); |
|
204
|
|
|
* $serializedValue = ($placeholder->getTypeName() == 'string' ? "'person'" : 'person'); |
|
205
|
|
|
* $gen->send($serializedValue); |
|
206
|
|
|
* } |
|
207
|
|
|
* echo $gen->getReturn(); // prints "SELECT id FROM person UNION SELECT object_id FROM log WHERE table = 'person'" |
|
208
|
|
|
* </code> |
|
209
|
|
|
* |
|
210
|
|
|
* @throws NoDataException if no encoded value is sent for a yielded parameter |
|
211
|
|
|
*/ |
|
212
|
|
|
public function generateSql(): \Generator |
|
213
|
|
|
{ |
|
214
|
|
|
$result = ''; |
|
215
|
|
|
$offset = 0; |
|
216
|
|
|
foreach ($this->placeholderSequence as $plcHdr) { |
|
217
|
|
|
$encodedValue = yield $plcHdr; |
|
218
|
|
|
if ($encodedValue === null) { |
|
219
|
|
|
throw new NoDataException( |
|
220
|
|
|
"No value encoded for placeholder {$plcHdr->getNameOrPosition()} at offset {$plcHdr->getOffset()}." |
|
221
|
|
|
); |
|
222
|
|
|
} |
|
223
|
|
|
$result .= substr($this->sqlTorso, $offset, $plcHdr->getOffset() - $offset) . $encodedValue; |
|
224
|
|
|
$offset = $plcHdr->getOffset(); |
|
225
|
|
|
} |
|
226
|
|
|
$result .= substr($this->sqlTorso, $offset); |
|
227
|
|
|
|
|
228
|
|
|
return $result; |
|
229
|
|
|
} |
|
230
|
|
|
} |
|
231
|
|
|
|
This check marks implicit conversions of arrays to boolean values in a comparison. While in PHP an empty array is considered to be equal (but not identical) to false, this is not always apparent.
Consider making the comparison explicit by using
empty(..)or! empty(...)instead.