Conditions | 10 |
Total Lines | 24 |
Code Lines | 19 |
Lines | 0 |
Ratio | 0 % |
Changes | 0 |
Complex classes like ocrd_models.xpath_functions.pc_textequiv() often do many different things. To break such a class down, identify a cohesive component within it. A common way to find one is to look for fields or methods that share the same prefix or suffix.
Once you have determined which fields belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate and is often faster.
from ocrd_utils import xywh_from_points

# ...

@_export
def pc_textequiv(nodes):
    """
    Extract TextEquiv/Unicode from all nodes, then concatenate
    (interspersed with spaces or newlines).
    """
    text = ''
    for node in nodes:
        # FIXME: find out why we need to go to the parent here
        node = node.parent.value
        if text and node.tag.endswith('Region'):
            text += '\n'
        if text and node.tag.endswith('Line'):
            text += '\n'
        if text and node.tag.endswith('Word'):
            text += ' '
        equiv = node.find(f'{node.prefix}:TextEquiv', node.nsmap)
        if equiv is None:
            continue
        string = equiv.find(f'{node.prefix}:Unicode', node.nsmap)
        if string is None:
            continue
        text += str(string.text)
    return text
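Since pc_textequiv() is a function rather than a class, the closest analogue to the advice above is probably to extract helper functions: one that picks the separator and one that reads the TextEquiv/Unicode text. The following is a minimal sketch of that idea, not the project's implementation. It uses the standard library's xml.etree.ElementTree instead of the XPath node wrappers the original receives, so it skips the node.parent.value step. The names PAGE_NS, SEPARATORS, separator_for, unicode_text and join_textequiv are hypothetical, and the PAGE 2019-07-15 namespace is an assumption about the input. It also differs from the original in one respect: an empty Unicode element contributes an empty string rather than the literal text 'None' that str(None) would produce.

```python
import xml.etree.ElementTree as ET

# Assumption: documents use the PAGE 2019-07-15 namespace.
PAGE_NS = {'pc': 'http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15'}

# Separator emitted before an element, keyed by the suffix of its tag name.
# This mirrors the three if-statements in pc_textequiv().
SEPARATORS = (('Region', '\n'), ('Line', '\n'), ('Word', ' '))


def separator_for(tag):
    """Return the separator to emit before an element with this tag."""
    for suffix, separator in SEPARATORS:
        if tag.endswith(suffix):
            return separator
    return ''


def unicode_text(element, ns=PAGE_NS):
    """Return the TextEquiv/Unicode text of element, or None if absent."""
    equiv = element.find('pc:TextEquiv', ns)
    if equiv is None:
        return None
    string = equiv.find('pc:Unicode', ns)
    if string is None:
        return None
    # Deviation from the original: empty content becomes '' instead of 'None'.
    return string.text or ''


def join_textequiv(elements, ns=PAGE_NS):
    """Concatenate the text of all elements, interspersed with separators."""
    text = ''
    for element in elements:
        # As in the original, the separator is added even if the element
        # turns out to have no TextEquiv/Unicode content.
        if text:
            text += separator_for(element.tag)
        content = unicode_text(element, ns)
        if content is not None:
            text += content
    return text
```

With the separator table pulled out, the loop body no longer branches on element type, and supporting another level would only need a new entry in SEPARATORS.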