Passed
Pull Request — master (#95)
by Mauro
06:03 queued 02:57
created

Xliff20   C

Complexity

Total Complexity 53

Size/Duplication

Total Lines 366
Duplicated Lines 0 %

Importance

Changes 3
Bugs 0 Features 0
Metric   Value
eloc     138
c        3
b        0
f        0
dl       0
loc      366
rs       6.96
wmc      53

6 Methods

Rating   Name                            Duplication   Size   Complexity
A        handleOpenXliffTag()            0             8      3
A        updateCounts()                  0             5      2
C        tagOpen()                       0             85     16
D        tagClose()                      0             140    22
A        getWordCountGroupForXliffV2()   0             28     4
A        prepareTranslation()            0             18     6

How to fix   Complexity   

Complex Class

Complex classes like Xliff20 often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields/methods that share the same prefixes, or suffixes.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use Xliff20, and based on these observations, apply Extract Interface, too.
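
As a rough illustration of the Extract Class and Extract Interface advice above: in Xliff20, the members $mdaGroupCounter, $unitContainsMda and getWordCountGroupForXliffV2() all deal with <mda:metadata>, so the markup rendering is one candidate for extraction. The sketch below is only an assumption of how that could look. WordCountGroupBuilderInterface and MdaWordCountGroupBuilder are hypothetical names that do not exist in Matecat\XliffParser, and the sketch omits the indentation whitespace that the original method embeds in its output.

```php
<?php
declare( strict_types=1 );

// Hypothetical interface (not part of Matecat\XliffParser): what a replacer
// would depend on after applying Extract Interface.
interface WordCountGroupBuilderInterface {
    public function build( string $unitId, array $segmentsCountArray, bool $withMetadataTag = true ): string;
}

// Hypothetical extracted class: it only renders the word-count metadata,
// so it can be tested without running the XML parser.
final class MdaWordCountGroupBuilder implements WordCountGroupBuilderInterface {

    public function build( string $unitId, array $segmentsCountArray, bool $withMetadataTag = true ): string {
        $tag = $withMetadataTag ? '<mda:metadata>' : '';

        // array_values() guarantees a 0-based index for the metaGroup ids
        foreach ( array_values( $segmentsCountArray ) as $index => $item ) {
            $id  = 'word_count_tu.' . $unitId . '.' . $index;
            $tag .= '<mda:metaGroup id="' . $id . '" category="row_xml_attribute">'
                    . '<mda:meta type="x-matecat-raw">' . $item[ 'raw_word_count' ] . '</mda:meta>'
                    . '<mda:meta type="x-matecat-weighted">' . $item[ 'eq_word_count' ] . '</mda:meta>'
                    . '</mda:metaGroup>';
        }

        return $withMetadataTag ? $tag . '</mda:metadata>' : $tag;
    }
}

// Usage: render the metadata for one unit with a single segment.
$builder = new MdaWordCountGroupBuilder();
$xml     = $builder->build( '42', [ [ 'raw_word_count' => 7, 'eq_word_count' => 3.5 ] ] );
echo $xml, PHP_EOL;
```

With a split like this, tagOpen() and tagClose() would delegate to the builder instead of assembling the markup themselves, which is the kind of reduction in per-method complexity the report is asking for.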

<?php
/**
 * Created by PhpStorm.
 * @author hashashiyyin [email protected] / [email protected]
 * Date: 02/08/24
 * Time: 17:51
 *
 */

namespace Matecat\XliffParser\XliffReplacer;

use Matecat\XliffParser\Utils\Strings;

class Xliff20 extends AbstractXliffReplacer {

    /**
     * @var int
     */
    private int $mdaGroupCounter = 0;
    /**
     * @var bool
     */
    protected bool $unitContainsMda = false;   // check if <unit> already contains a <mda:metadata> (forXliff v 2.*)

    /**
     * @var string
     */
    protected string $alternativeMatchesTag = 'mtc:matches';

    /**
     * @var string
     */
    protected string $tuTagName = 'unit';

    /**
     * @var string
     */
    protected string $namespace = "matecat";       // Custom namespace

    /**
     * @var array
     */
    protected array $nodesToBuffer = [
            'source',
            'mda:metadata',
            'memsource:additionalTagData',
            'originalData',
            'note'
    ];

    /**
     * @inheritDoc
     */
    protected function tagOpen( $parser, string $name, array $attr ) {

        $this->handleOpenUnit( $name, $attr );

        if ( 'mda:metadata' === $name ) {
            $this->unitContainsMda = true;
        }

        $this->trySetAltTrans( $name );;
        $this->checkSetInTarget( $name );

        // open buffer
        $this->setInBuffer( $name );

        // check if we are inside a <target>, obviously this happen only if there are targets inside the trans-unit
        // <target> must be stripped to be replaced, so this check avoids <target> reconstruction
        if ( !$this->inTarget ) {

            // We need bufferIsActive for not target nodes with currentTransUnitIsTranslatable = 'NO'
            if($name === 'target' and $this->currentTransUnitIsTranslatable === 'no'){
Comprehensibility Best Practice
Using logical operators such as and instead of && is generally not recommended.

PHP has two types of connecting operators (logical operators, and boolean operators):

                 Logical Operator   Boolean Operator
AND - meaning    and                &&
OR - meaning     or                 ||

The difference between these is their precedence, which determines the order in which the parts of an expression are evaluated. In most cases, you would want to use a boolean operator like && or ||.

Let’s take a look at a few examples:

// Logical operators have lower precedence:
$f = false or true;

// is executed like this:
($f = false) or true;


// Boolean operators have higher precedence:
$f = false || true;

// is executed like this:
$f = (false || true);

Logical Operators are used for Control-Flow

One case where you explicitly want to use logical operators is for control-flow such as this:

$x === 5
    or die('$x must be 5.');

// Instead of
if ($x !== 5) {
    die('$x must be 5.');
}

Since die introduces problems of its own (for example, it makes code hard to test and prevents any kind of more sophisticated error handling), you probably do not want to use this in real-world code. Before PHP 8.0, logical operators also could not be combined with throw, because throw was a statement:

// Before PHP 8.0 the following is a parse error.
// As of PHP 8.0, throw is an expression and this form is valid.
$x === 5
    or throw new RuntimeException('$x must be 5.');

These limitations lead to logical operators rarely being of use in current PHP code.

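
To relate the note above to the flagged line, here is a small standalone sketch (not taken from the project; all variable names are made up for illustration). It suggests that inside a parenthesized condition with no assignment, and and && give the same result, so the flagged code most likely behaves as intended. The recommendation is about avoiding the precedence trap that appears as soon as an assignment is involved.

```php
<?php
declare( strict_types=1 );

// Illustrative stand-ins for the state checked on the flagged line.
$inTarget       = false;
$isTranslatable = 'no';

// Inside a parenthesized condition without assignment, both spellings agree.
$withAnd       = ( !$inTarget and $isTranslatable === 'no' );
$withAmpersand = ( !$inTarget && $isTranslatable === 'no' );

// In an unparenthesized assignment they differ, because = binds tighter than and.
$a = true and false;  // parsed as ( $a = true ) and false
$b = true && false;   // parsed as $b = ( true && false )

var_dump( $withAnd, $withAmpersand, $a, $b );
// prints bool(true), bool(true), bool(true), bool(false)
```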
                $this->bufferIsActive = true;
            }

            $tag = '';

            //
            // ============================================
            // only for Xliff 2.*
            // ============================================
            //
            // In xliff v2 we MUST add <mda:metadata> BEFORE <notes>/<originalData>/<segment>/<ignorable>
            //
            // As documentation says, <unit> contains:
            //
            // - elements from other namespaces, OPTIONAL
            // - Zero or one <notes> elements followed by
            // - Zero or one <originalData> element followed by
            // - One or more <segment> or <ignorable> elements in any order.
            //
            // For more info please refer to:
            //
            // http://docs.oasis-open.org/xliff/xliff-core/v2.0/os/xliff-core-v2.0-os.html#unit
            //
            if ( in_array( $name, [ 'notes', 'originalData', 'segment', 'ignorable' ] ) &&
                    $this->unitContainsMda === false &&
                    !empty( $this->transUnits[ $this->currentTransUnitId ] ) &&
                    !$this->hasWrittenCounts
            ) {
                // we need to update counts here
                $this->updateCounts();
                $this->hasWrittenCounts = true;
                $tag                    .= $this->getWordCountGroupForXliffV2();
                $this->unitContainsMda  = true;
            }

            // construct tag
            $tag .= "<$name ";

            foreach ( $attr as $k => $v ) {
                //normal tag flux, put attributes in it but skip for translation state and set the right value for the attribute
                if ( $k != 'state' ) {
                    $tag .= "$k=\"$v\" ";
                }
            }

            $seg = $this->getCurrentSegment();

            if ( $name === $this->tuTagName && !empty( $seg ) && isset( $seg[ 'sid' ] ) ) {

                // add `matecat:segment-id` to xliff v.2*
                if ( strpos( $tag, 'matecat:segment-id' ) === false ) {
                    $tag .= "matecat:segment-id=\"{$seg[ 'sid' ]}\" ";
                }

            }

            // replace state for xliff v2
            if ( 'segment' === $name ) { // add state to segment in Xliff v2
                [ $stateProp, ] = StatusToStateAttribute::getState( $this->xliffVersion, $seg[ 'status' ] );
                $tag .= $stateProp;
            }

            $tag = $this->handleOpenXliffTag( $name, $attr, $tag );

            $this->checkForSelfClosedTagAndFlush( $parser, $tag );

        }

    }

    /**
     * @param string $name
     * @param array  $attr
     * @param string $tag
     *
     * @return string
     */
    protected function handleOpenXliffTag( string $name, array $attr, string $tag ): string {
        $tag = parent::handleOpenXliffTag( $name, $attr, $tag );
        // add oasis xliff 20 namespace
        if ( $name === 'xliff' && !array_key_exists( 'xmlns:mda', $attr ) ) {
            $tag .= 'xmlns:mda="urn:oasis:names:tc:xliff:metadata:2.0"';
        }

        return $tag;
    }

    /**
     * @inheritDoc
     */
    protected function tagClose( $parser, string $name ) {
        $tag = '';

        /**
         * if is a tag within <target> or
         * if it is an empty tag, do not add closing tag because we have already closed it in
         *
         * self::tagOpen method
         */
        if ( !$this->isEmpty ) {

            // write closing tag if is not a target
            // EXCLUDE the target nodes with currentTransUnitIsTranslatable = 'NO'
            if ( !$this->inTarget and $this->currentTransUnitIsTranslatable !== 'no' ) {
Comprehensibility Best Practice
Using logical operators such as and instead of && is generally not recommended.

                $tag = "</$name>";
            }

            if ( 'target' == $name && !$this->inAltTrans ) {

                if ( isset( $this->transUnits[ $this->currentTransUnitId ] ) ) {

                    $seg = $this->getCurrentSegment();

                    // update counts
                    if ( !$this->hasWrittenCounts && !empty( $seg ) ) {
                        $this->updateSegmentCounts( $seg );
                    }

                    // delete translations so the prepareSegment
                    // will put source content in target tag
                    if ( $this->sourceInTarget ) {
                        $seg[ 'translation' ] = '';
                        $this->resetCounts();
                    }

                    // append $translation
                    $translation = $this->prepareTranslation( $seg );

                    //append translation
                    $tag = "<target>$translation</target>";

                } elseif( !empty($this->CDATABuffer) and $this->currentTransUnitIsTranslatable === 'no' ) {
Comprehensibility Best Practice
Using logical operators such as and instead of && is generally not recommended.


                    // These are target nodes with currentTransUnitIsTranslatable = 'NO'
                    $this->bufferIsActive = false;
                    $tag                  = $this->CDATABuffer . "</$name>";
                    $this->CDATABuffer    = "";
                }

                // signal we are leaving a target
                $this->targetWasWritten = true;
                $this->inTarget         = false;
                $this->postProcAndFlush( $this->outputFP, $tag, true );

            } elseif ( in_array( $name, $this->nodesToBuffer ) ) { // we are closing a critical CDATA section

                $this->bufferIsActive = false;

                // only for Xliff 2.*
                // write here <mda:metaGroup> and <mda:meta> if already present in the <unit>
                if ( 'mda:metadata' === $name && $this->unitContainsMda && !$this->hasWrittenCounts ) {

                    // we need to update counts here
                    $this->updateCounts();
                    $this->hasWrittenCounts = true;

                    $tag = $this->CDATABuffer;
                    $tag .= $this->getWordCountGroupForXliffV2( false );
                    $tag .= "    </mda:metadata>";

                } else {
                    $tag = $this->CDATABuffer . "</$name>";
                }

                $this->CDATABuffer = "";

                //flush to the pointer
                $this->postProcAndFlush( $this->outputFP, $tag );

            } elseif ( 'segment' === $name ) {

                // only for Xliff 2.*
                // if segment has no <target> add it BEFORE </segment>
                if ( !$this->targetWasWritten ) {

                    $seg = $this->getCurrentSegment();

                    if ( isset( $seg[ 'translation' ] ) ) {

                        $translation = $this->prepareTranslation( $seg );
                        // replace the tag
                        $tag = "<target>$translation</target>";

                        $tag .= '</segment>';

                    }

                }

                // update segmentPositionInTu
                $this->segmentInUnitPosition++;

                $this->postProcAndFlush( $this->outputFP, $tag );

                // we are leaving <segment>, reset $segmentHasTarget
                $this->targetWasWritten = false;

            } elseif ( $this->bufferIsActive ) { // this is a tag ( <g | <mrk ) inside a seg or seg-source tag
                $this->CDATABuffer .= "</$name>";
                // Do NOT Flush
            } else { //generic tag closure do Nothing
                // flush to pointer
                $this->postProcAndFlush( $this->outputFP, $tag );
            }
        } elseif ( in_array( $name, $this->nodesToBuffer ) ) {

            $this->isEmpty        = false;
            $this->bufferIsActive = false;
            $tag                  = $this->CDATABuffer;
            $this->CDATABuffer    = "";

            //flush to the pointer
            $this->postProcAndFlush( $this->outputFP, $tag );

        } else {
            //ok, nothing to be done; reset flag for next coming tag
            $this->isEmpty = false;
        }

        // try to signal that we are leaving a target
        $this->tryUnsetAltTrans( $name );

        // check if we are leaving a <trans-unit> (xliff v1.*) or <unit> (xliff v2.*)
        if ( $this->tuTagName === $name ) {
            $this->currentTransUnitIsTranslatable = null;
            $this->inTU                           = false;
            $this->unitContainsMda                = false;
            $this->hasWrittenCounts               = false;

            $this->resetCounts();
        }
    }

    /**
     * Update counts
     */
    private function updateCounts() {

        $seg = $this->getCurrentSegment();
        if ( !empty( $seg ) ) {
            $this->updateSegmentCounts( $seg );
        }

    }

    /**
     * @param bool $withMetadataTag
     *
     * @return string
     */
    private function getWordCountGroupForXliffV2( bool $withMetadataTag = true ): string {

        $this->mdaGroupCounter++;
        $segments_count_array = $this->counts[ 'segments_count_array' ];

        $tag = '';

        if ( $withMetadataTag === true ) {
            $tag .= '<mda:metadata>';
        }

        $index = 0;
        foreach ( $segments_count_array as $segments_count_item ) {

            $id = 'word_count_tu.' . $this->currentTransUnitId . '.' . $index;
            $index++;

            $tag .= "    <mda:metaGroup id=\"" . $id . "\" category=\"row_xml_attribute\">
                                <mda:meta type=\"x-matecat-raw\">" . $segments_count_item[ 'raw_word_count' ] . "</mda:meta>
                                <mda:meta type=\"x-matecat-weighted\">" . $segments_count_item[ 'eq_word_count' ] . "</mda:meta>
                            </mda:metaGroup>";
        }

        if ( $withMetadataTag === true ) {
            $tag .= '</mda:metadata>';
        }

        return $tag;

    }

    /**
     * prepare segment tagging for xliff insertion
     *
     * @param array $seg
     *
     * @return string
     */
    protected function prepareTranslation( array $seg ): string {

        $segment     = Strings::removeDangerousChars( $seg [ 'segment' ] );
        $translation = Strings::removeDangerousChars( $seg [ 'translation' ] );
        $dataRefMap  = ( isset( $seg[ 'data_ref_map' ] ) ) ? Strings::jsonToArray( $seg[ 'data_ref_map' ] ) : [];

        if ( $seg [ 'translation' ] == '' ) {
            $translation = $segment;
        } else {
            if ( $this->callback instanceof XliffReplacerCallbackInterface ) {
                $error = ( !empty( $seg[ 'error' ] ) ) ? $seg[ 'error' ] : null;
                if ( $this->callback->thereAreErrors( $seg[ 'sid' ], $segment, $translation, $dataRefMap, $error ) ) {
                    $translation = '|||UNTRANSLATED_CONTENT_START|||' . $segment . '|||UNTRANSLATED_CONTENT_END|||';
                }
            }
        }

        return $translation;

    }

}