Conditions | 17 |
Total Lines | 76 |
Code Lines | 45 |
Comment Lines | 14 |
Comment Ratio | 18.42 % |
Tests | 32 |
CRAP Score | 17 |
Changes | 0 |
Small methods make your code easier to understand, particularly when paired with a good name. A small method is also usually much easier to name well.
For example, if you find yourself adding comments inside a method's body, that is usually a good sign that the commented part should be extracted into a new method, with the comment serving as a starting point for the new method's name.
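As a minimal sketch of that comment-to-method step, the following uses hypothetical names (`order_total_before`, `compute_subtotal`, `order_total_after`) that do not come from the analyzed project. The commented block in the first function becomes a helper whose name is derived from the comment.

```python
def order_total_before(items, tax_rate):
    """Original version: a comment marks a block that could be its own method."""
    # compute the subtotal of all items
    subtotal = 0.0
    for price, quantity in items:
        subtotal += price * quantity
    return subtotal * (1.0 + tax_rate)


def compute_subtotal(items):
    """Extracted helper; its name is taken from the former comment."""
    return sum(price * quantity for price, quantity in items)


def order_total_after(items, tax_rate):
    """Refactored version: the comment has become a method name."""
    return compute_subtotal(items) * (1.0 + tax_rate)
```

Both versions are intended to return the same result for the same input; only the structure changes.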
Commonly applied refactorings include:
If many parameters/temporary variables are present:
Complex classes, like the one containing abydos.distance._phonetic_edit_distance.PhoneticEditDistance._alignment_matrix(), often do a lot of different things. To break such a class down, we need to identify a cohesive component within it. A common approach to finding such a component is to look for fields or methods that share the same prefixes or suffixes.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often faster.
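The following is a rough sketch of Extract Class under assumed names. `AlignerBefore`, `EditCosts`, and `AlignerAfter` are hypothetical and are not part of abydos. The three fields sharing the `cost_` prefix form the cohesive component that moves into its own class.

```python
from dataclasses import dataclass


class AlignerBefore:
    """Hypothetical class: the three cost_* fields share a prefix."""

    def __init__(self, cost_ins=1.0, cost_del=1.0, cost_sub=1.0):
        self.cost_ins = cost_ins
        self.cost_del = cost_del
        self.cost_sub = cost_sub

    def cost_of_deleting_all(self, src):
        return len(src) * self.cost_del


@dataclass(frozen=True)
class EditCosts:
    """Extracted class that owns the fields formerly prefixed with cost_."""

    ins: float = 1.0
    delete: float = 1.0
    sub: float = 1.0

    def of_deleting_all(self, src):
        return len(src) * self.delete


class AlignerAfter:
    """The original class now delegates cost questions to EditCosts."""

    def __init__(self, costs=None):
        self.costs = costs if costs is not None else EditCosts()

    def cost_of_deleting_all(self, src):
        return self.costs.of_deleting_all(src)
```

After the extraction, the cost-related logic can be tested and reused independently of the class that performs the alignment.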
# -*- coding: utf-8 -*-
# ...

    def _alignment_matrix(self, src, tar, backtrace=True):
        """Return the phonetic edit distance alignment matrix.

        Parameters
        ----------
        src : str
            Source string for comparison
        tar : str
            Target string for comparison
        backtrace : bool
            Return the backtrace matrix as well

        Returns
        -------
        numpy.ndarray or tuple(numpy.ndarray, numpy.ndarray)
            The alignment matrix and (optionally) the backtrace matrix


        .. versionadded:: 0.4.1

        """
        ins_cost, del_cost, sub_cost, trans_cost = self._cost

        src_len = len(src)
        tar_len = len(tar)

        # Convert both strings to sequences of phonetic feature values
        src = ipa_to_features(src)
        tar = ipa_to_features(tar)

        # (src_len + 1) x (tar_len + 1) matrix of cumulative edit costs
        d_mat = np.zeros((src_len + 1, tar_len + 1), dtype=float)
        if backtrace:
            trace_mat = np.zeros((src_len + 1, tar_len + 1), dtype=np.int8)
        # First column: delete every source symbol
        for i in range(1, src_len + 1):
            d_mat[i, 0] = i * del_cost
            if backtrace:
                trace_mat[i, 0] = 0
        # First row: insert every target symbol
        for j in range(1, tar_len + 1):
            d_mat[0, j] = j * ins_cost
            if backtrace:
                trace_mat[0, j] = 1

        for i in range(src_len):
            for j in range(tar_len):
                traces = ((i + 1, j), (i, j + 1), (i, j))
                opts = (
                    d_mat[traces[0]] + ins_cost,  # ins
                    d_mat[traces[1]] + del_cost,  # del
                    d_mat[traces[2]]
                    + (
                        sub_cost
                        * (1.0 - cmp_features(src[i], tar[j], self._weights))
                        if src[i] != tar[j]
                        else 0
                    ),  # sub/==
                )
                d_mat[i + 1, j + 1] = min(opts)
                if backtrace:
                    trace_mat[i + 1, j + 1] = int(np.argmin(opts))

                if self._mode == 'osa':
                    if (
                        i + 1 > 1
                        and j + 1 > 1
                        and src[i] == tar[j - 1]
                        and src[i - 1] == tar[j]
                    ):
                        # transposition
                        d_mat[i + 1, j + 1] = min(
                            d_mat[i + 1, j + 1],
                            d_mat[i - 1, j - 1] + trans_cost,
                        )
                        if backtrace:
                            trace_mat[i + 1, j + 1] = 2
        if backtrace:
            return d_mat, trace_mat
        return d_mat
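One way the method above might be split into smaller pieces is sketched below. This is an illustration under stated assumptions, not the project's code: the helper names `init_border`, `substitution_cost`, and `alignment_matrix` are hypothetical, plain Python lists stand in for the NumPy arrays, a caller-supplied `similarity` function stands in for `cmp_features`, and the backtrace and OSA transposition branches are left out to keep the sketch short.

```python
def init_border(src_len, tar_len, ins_cost, del_cost):
    """Build the cost matrix and fill its first column and first row."""
    d_mat = [[0.0] * (tar_len + 1) for _ in range(src_len + 1)]
    for i in range(1, src_len + 1):
        d_mat[i][0] = i * del_cost  # delete every source symbol
    for j in range(1, tar_len + 1):
        d_mat[0][j] = j * ins_cost  # insert every target symbol
    return d_mat


def substitution_cost(src_sym, tar_sym, sub_cost, similarity):
    """Cost of replacing src_sym with tar_sym; zero when they are equal."""
    if src_sym == tar_sym:
        return 0.0
    return sub_cost * (1.0 - similarity(src_sym, tar_sym))


def alignment_matrix(src, tar, costs=(1.0, 1.0, 1.0),
                     similarity=lambda a, b: 0.0):
    """Fill the alignment matrix using the two extracted helpers.

    The default similarity (always 0.0) is an assumption that reduces
    the computation to plain Levenshtein distance.
    """
    ins_cost, del_cost, sub_cost = costs
    d_mat = init_border(len(src), len(tar), ins_cost, del_cost)
    for i in range(len(src)):
        for j in range(len(tar)):
            d_mat[i + 1][j + 1] = min(
                d_mat[i + 1][j] + ins_cost,  # ins
                d_mat[i][j + 1] + del_cost,  # del
                d_mat[i][j]
                + substitution_cost(src[i], tar[j], sub_cost, similarity),
            )
    return d_mat
```

With the default arguments, `alignment_matrix("kitten", "sitting")[-1][-1]` evaluates to `3.0`, the usual Levenshtein distance for that pair. Each helper now has a single responsibility and a name that replaces one of the inline comments.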