| Metric | Value |
| --- | --- |
| Conditions | 17 |
| Total Lines | 85 |
| Code Lines | 50 |
| Lines | 0 |
| Ratio | 0% |
| Tests | 32 |
| CRAP Score | 17 |
| Changes | 0 |
Small methods make your code easier to understand, particularly when they have good names. A small method is also usually much easier to name well.
For example, if you find yourself adding comments inside a method's body, this is usually a sign that the commented part should be extracted into a new method. The comment is a good starting point for naming that new method.
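As a minimal sketch of that idea, suppose a method body once held two loops, each introduced by a comment such as "fill the first column with deletion costs". The names `initial_costs` and `first_row_and_column` are hypothetical and are not taken from abydos; they only illustrate how the comment can turn into a method name.

```python
from typing import List, Tuple


def initial_costs(length: int, unit_cost: float) -> List[float]:
    """Return the cumulative cost of 0..length consecutive edits."""
    return [i * unit_cost for i in range(length + 1)]


def first_row_and_column(
    src: str, tar: str, ins_cost: float = 1.0, del_cost: float = 1.0
) -> Tuple[List[float], List[float]]:
    """Return the border row and column of an edit-distance matrix."""
    # Each call replaces a commented loop; the helper name now carries
    # the information the comment used to carry.
    first_column = initial_costs(len(src), del_cost)
    first_row = initial_costs(len(tar), ins_cost)
    return first_row, first_column
```

The extracted helper is short enough to test on its own, which is often a side benefit of this refactoring.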
Commonly applied refactorings include:
If many parameters/temporary variables are present:
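One option in that case is to bundle related values into a small parameter object. The sketch below is illustrative only: `EditCosts` and `cheapest_step` are hypothetical names, not part of abydos, and the four cost fields merely mirror the kind of tuple that the method under review unpacks.

```python
from typing import NamedTuple


class EditCosts(NamedTuple):
    """Hypothetical parameter object bundling the four edit costs."""

    insert: float = 1.0
    delete: float = 1.0
    substitute: float = 1.0
    transpose: float = 1.0


def cheapest_step(
    left: float, up: float, diagonal: float, same: bool, costs: EditCosts
) -> float:
    """Return the cheapest way to reach a cell from its three neighbours."""
    return min(
        left + costs.insert,  # insertion
        up + costs.delete,  # deletion
        diagonal + (0.0 if same else costs.substitute),  # match or substitution
    )
```

Passing one `costs` argument instead of four loose numbers keeps the signature short and gives each cost a name at the call site.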
Complex code such as abydos.distance._phonetic_edit_distance.PhoneticEditDistance._alignment_matrix() often does a lot of different things. To break down the class that contains it, we need to identify a cohesive component within that class. A common way to find such a component is to look for fields or methods that share the same prefix or suffix.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often faster.
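A minimal sketch of Extract Class follows, assuming a class whose bookkeeping fields and methods all share a `trace_` prefix. `Backtrace` and `Aligner` are hypothetical names chosen for illustration and do not exist in abydos.

```python
from typing import List


class Backtrace:
    """Hypothetical component extracted from members sharing a trace_ prefix."""

    def __init__(self, rows: int, cols: int) -> None:
        # One operation code per matrix cell, initialised to 0.
        self._ops: List[List[int]] = [[0] * cols for _ in range(rows)]

    def record(self, row: int, col: int, op: int) -> None:
        """Store the operation code chosen for a cell."""
        self._ops[row][col] = op

    def op_at(self, row: int, col: int) -> int:
        """Return the operation code stored for a cell."""
        return self._ops[row][col]


class Aligner:
    """The original class, which now delegates bookkeeping to Backtrace."""

    def __init__(self, rows: int, cols: int) -> None:
        self.backtrace = Backtrace(rows, cols)

    def mark_transposition(self, row: int, col: int) -> None:
        # Code 2 is an arbitrary value used by this sketch for transpositions.
        self.backtrace.record(row, col, 2)
```

After the extraction, the original class only coordinates the work, and the new class can be understood and tested in isolation.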
# Copyright 2019-2020 by Christopher C. Little.

    def _alignment_matrix(
        self, src: str, tar: str, backtrace: bool = True
    ) -> Union[np.ndarray, Tuple[np.ndarray, np.ndarray]]:
        """Return the phonetic edit distance alignment matrix.

        Parameters
        ----------
        src : str
            Source string for comparison
        tar : str
            Target string for comparison
        backtrace : bool
            Return the backtrace matrix as well

        Returns
        -------
        numpy.ndarray or tuple(numpy.ndarray, numpy.ndarray)
            The alignment matrix and (optionally) the backtrace matrix


        .. versionadded:: 0.4.1

        """
        ins_cost, del_cost, sub_cost, trans_cost = self._cost

        src_len = len(src)
        tar_len = len(tar)

        src_list = ipa_to_features(src)
        tar_list = ipa_to_features(tar)

        d_mat = np.zeros((src_len + 1, tar_len + 1), dtype=np.float_)
        if backtrace:
            trace_mat = np.zeros((src_len + 1, tar_len + 1), dtype=np.int8)
        for i in range(1, src_len + 1):
            d_mat[i, 0] = i * del_cost
            if backtrace:
                trace_mat[i, 0] = 0
        for j in range(1, tar_len + 1):
            d_mat[0, j] = j * ins_cost
            if backtrace:
                trace_mat[0, j] = 1

        for i in range(src_len):
            for j in range(tar_len):
                traces = ((i + 1, j), (i, j + 1), (i, j))
                opts = (
                    d_mat[traces[0]] + ins_cost,  # ins
                    d_mat[traces[1]] + del_cost,  # del
                    d_mat[traces[2]]
                    + (
                        sub_cost
                        * (
                            1.0
                            - cmp_features(
                                src_list[i],
                                tar_list[j],
                                cast(Sequence[float], self._weights),
                            )
                        )
                        if src_list[i] != tar_list[j]
                        else 0
                    ),  # sub/==
                )
                d_mat[i + 1, j + 1] = min(opts)
                if backtrace:
                    trace_mat[i + 1, j + 1] = int(np.argmin(opts))

                if self._mode == 'osa':
                    if (
                        i + 1 > 1
                        and j + 1 > 1
                        and src_list[i] == tar_list[j - 1]
                        and src_list[i - 1] == tar_list[j]
                    ):
                        # transposition
                        d_mat[i + 1, j + 1] = min(
                            d_mat[i + 1, j + 1],
                            d_mat[i - 1, j - 1] + trans_cost,
                        )
                        if backtrace:
                            trace_mat[i + 1, j + 1] = 2
        if backtrace:
            return d_mat, trace_mat
        return d_mat