| Metric | Value |
| --- | --- |
| Conditions | 14 |
| Total Lines | 68 |
| Lines | 0 |
| Ratio | 0 % |
| Changes | 1 |
| Bugs | 0 |
| Features | 0 |
Small methods make your code easier to understand, especially when combined with a good name. Moreover, when a method is small, finding a good name for it is usually much easier.
For example, if you find yourself adding comments to a method's body, that is usually a sign that the commented part should be extracted into a new method, with the comment serving as a starting point for the new method's name.
Commonly applied refactorings include:

- Extract Method: move a cohesive part of the method's body into a new, well-named method (see the sketch after this list).
- If many parameters/temporary variables are present: they get in the way of extracting methods. Replace Temp with Query eliminates temporaries, while Introduce Parameter Object and Preserve Whole Object slim down long parameter lists.
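As a minimal sketch of Extract Method (all names below are hypothetical, not taken from the file under review), a commented block becomes a method whose name is derived from the comment:

```python
# Before: a comment marks a cohesive block inside a longer method.
def report(amounts):
    total = sum(amounts)
    # print banner
    print("*" * 40)
    print("Order report")
    print("*" * 40)
    print("total: {}".format(total))

# After: the commented block is extracted, and the comment becomes the name.
def print_banner():
    print("*" * 40)
    print("Order report")
    print("*" * 40)

def report(amounts):
    total = sum(amounts)
    print_banner()
    print("total: {}".format(total))
```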
Complex classes like the DependencyUtils class to which lexicalize_path() belongs often do a lot of different things. To break such a class down, we need to identify a cohesive component within it. A common approach to finding such a component is to look for fields or methods that share the same prefixes or suffixes.
Once you have determined the fields and methods that belong together, you can apply the Extract Class refactoring (sketched below). If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often faster to apply.
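A minimal sketch of Extract Class, again with hypothetical names: fields sharing a prefix move into a class of their own, and the original class delegates to it.

```python
# Before: the `billing_` prefix signals a cohesive component hiding in Customer.
class Customer:
    def __init__(self, name, billing_street, billing_city):
        self.name = name
        self.billing_street = billing_street
        self.billing_city = billing_city

# After: the prefixed fields become a class of their own.
class BillingAddress:
    def __init__(self, street, city):
        self.street = street
        self.city = city

class Customer:
    def __init__(self, name, billing_address):
        self.name = name
        self.billing_address = billing_address
```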
The flagged method:

```python
#!/usr/bin/env python
# ... (file lines 2-146 not shown in this report) ...

@staticmethod
def lexicalize_path(sentence,
                    path,
                    words=False,
                    lemmas=False,
                    tags=False,
                    simple_tags=False,
                    entities=False):
    """
    Lexicalizes a path in a syntactic dependency graph using Odin-style
    token constraints.  Operates on the output of
    `DependencyUtils.retrieve_edges`.

    Parameters
    ----------
    sentence : processors.ds.Sentence
        The `Sentence` from which the `path` was found.  Used to
        lexicalize the `path`.
    path : list
        The path to lexicalize: a sequence of
        (source node, relation, destination node) edges, as produced by
        `DependencyUtils.retrieve_edges`.
    words : bool
        Whether or not to encode nodes in the `path` with a token
        constraint constructed from `Sentence.words`.
    lemmas : bool
        Whether or not to encode nodes in the `path` with a token
        constraint constructed from `Sentence.lemmas`.
    tags : bool
        Whether or not to encode nodes in the `path` with a token
        constraint constructed from `Sentence.tags`.
    simple_tags : bool
        Whether or not to encode nodes in the `path` with a token
        constraint constructed from `DependencyUtils.simplify_tag`
        applied to `Sentence.tags`.
    entities : bool
        Whether or not to encode nodes in the `path` with a token
        constraint constructed from `Sentence._entities`.

    Returns
    -------
    [str]
        The lexicalized form of `path`, encoded according to the
        specified parameters.
    """
    UNKNOWN = LabelManager.UNKNOWN
    lexicalized_path = []
    relations = []
    nodes = []
    # gather edges and nodes
    for edge in path:
        relations.append(edge[1])
        nodes.append(edge[0])
    nodes.append(path[-1][-1])

    for (i, node) in enumerate(nodes):
        # build token constraints
        token_constraints = []
        # words
        if words:
            token_constraints.append("word=\"{}\"".format(sentence.words[node]))
        # PoS tags
        if tags and sentence.tags[node] != UNKNOWN:
            token_constraints.append("tag=\"{}\"".format(sentence.tags[node]))
        # lemmas
        if lemmas and sentence.lemmas[node] != UNKNOWN:
            token_constraints.append("lemma=\"{}\"".format(sentence.lemmas[node]))
        # NE labels
        if entities and sentence._entities[node] != UNKNOWN:
            token_constraints.append("entity=\"{}\"".format(sentence._entities[node]))
        # simple tags
        if simple_tags and sentence.tags[node] != UNKNOWN:
            token_constraints.append("tag={}".format(DependencyUtils.simplify_tag(sentence.tags[node])))
        # build and store the node pattern (nodes without constraints are skipped)
        if len(token_constraints) > 0:
            node_pattern = "[{}]".format(" & ".join(token_constraints))
            lexicalized_path.append(node_pattern)
        # append next edge
        if i < len(relations):
            lexicalized_path.append(relations[i])
    return lexicalized_path
```
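Applying the advice above to this method: the commented, per-flag constraint checks form a cohesive block that Extract Method could pull out. A minimal sketch follows; `_token_constraints_for_node` is a hypothetical name introduced here, not part of the library.

```python
# Hypothetical helper extracted from the per-node loop of lexicalize_path();
# it would live on DependencyUtils alongside the original method.
@staticmethod
def _token_constraints_for_node(sentence, node, words, lemmas, tags,
                                simple_tags, entities):
    UNKNOWN = LabelManager.UNKNOWN
    constraints = []
    if words:
        constraints.append("word=\"{}\"".format(sentence.words[node]))
    if tags and sentence.tags[node] != UNKNOWN:
        constraints.append("tag=\"{}\"".format(sentence.tags[node]))
    if lemmas and sentence.lemmas[node] != UNKNOWN:
        constraints.append("lemma=\"{}\"".format(sentence.lemmas[node]))
    if entities and sentence._entities[node] != UNKNOWN:
        constraints.append("entity=\"{}\"".format(sentence._entities[node]))
    if simple_tags and sentence.tags[node] != UNKNOWN:
        constraints.append("tag={}".format(DependencyUtils.simplify_tag(sentence.tags[node])))
    return constraints

# The per-node loop in lexicalize_path() then shrinks to a few readable lines:
for (i, node) in enumerate(nodes):
    token_constraints = DependencyUtils._token_constraints_for_node(
        sentence, node, words, lemmas, tags, simple_tags, entities)
    if token_constraints:
        lexicalized_path.append("[{}]".format(" & ".join(token_constraints)))
    if i < len(relations):
        lexicalized_path.append(relations[i])
```

The five boolean flags always travel together through this code, so they are also a natural candidate for Introduce Parameter Object, as suggested above.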