| Conditions | 17 |
| Total Lines | 85 |
| Code Lines | 50 |
| Lines | 0 |
| Ratio | 0 % |
| Tests | 32 |
| CRAP Score | 17 |
| Changes | 0 |
Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.
For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
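As a minimal sketch of that advice, the snippet below shows a function whose body carries an explanatory comment, followed by a version where the commented part has been extracted and the comment has become the new function's name. All function names here are hypothetical and chosen only for illustration.

```python
from typing import List


def mean_word_length_before(text: str) -> float:
    """Original version: the comment marks a block worth extracting."""
    # split the text into lowercase words without punctuation
    words = []
    for raw in text.split():
        word = raw.strip('.,;:!?').lower()
        if word:
            words.append(word)
    if not words:
        return 0.0
    return sum(len(word) for word in words) / len(words)


def split_into_clean_words(text: str) -> List[str]:
    """Split the text into lowercase words without punctuation."""
    stripped = (raw.strip('.,;:!?').lower() for raw in text.split())
    return [word for word in stripped if word]


def mean_word_length(text: str) -> float:
    """Refactored version: the comment has turned into a function name."""
    words = split_into_clean_words(text)
    if not words:
        return 0.0
    return sum(len(word) for word in words) / len(words)
```

Both versions return the same result; the refactored one simply reads as two short steps, and the extracted helper can be tested on its own.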
Commonly applied refactorings include:
If many parameters/temporary variables are present:
Complex code such as abydos.distance._phonetic_edit_distance.PhoneticEditDistance._alignment_matrix() often does a lot of different things. To break the enclosing class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields and methods that share the same prefixes or suffixes.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
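The following is a hedged sketch of Extract Class, not code taken from abydos. It assumes a hypothetical class whose fields share a `cost_` prefix; the prefix hints at a cohesive component, which is then moved into its own class (`EditCosts`, also a hypothetical name).

```python
from dataclasses import dataclass


class EditDistanceBefore:
    """Before: four fields and one method share the cost_ prefix."""

    def __init__(self, mode: str = 'lev') -> None:
        self.mode = mode
        self.cost_ins = 1.0
        self.cost_del = 1.0
        self.cost_sub = 1.0
        self.cost_trans = 1.0

    def cost_of_gap(self, length: int, inserting: bool) -> float:
        return length * (self.cost_ins if inserting else self.cost_del)


@dataclass(frozen=True)
class EditCosts:
    """After: the shared prefix has become a class of its own."""

    ins: float = 1.0
    delete: float = 1.0
    sub: float = 1.0
    trans: float = 1.0

    def of_gap(self, length: int, inserting: bool) -> float:
        return length * (self.ins if inserting else self.delete)


class EditDistance:
    """After: the measure delegates everything cost-related to EditCosts."""

    def __init__(self, mode: str = 'lev', costs: EditCosts = EditCosts()) -> None:
        self.mode = mode
        self.costs = costs
```

Because `EditCosts` is frozen, sharing one default instance between objects is safe, and a caller can swap in different costs with `EditDistance(costs=EditCosts(ins=2.0))`.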
    # Copyright 2019-2020 by Christopher C. Little.

        def _alignment_matrix(
            self, src: str, tar: str, backtrace: bool = True
        ) -> Union[np.ndarray, Tuple[np.ndarray, np.ndarray]]:
            """Return the phonetic edit distance alignment matrix.

            Parameters
            ----------
            src : str
                Source string for comparison
            tar : str
                Target string for comparison
            backtrace : bool
                Return the backtrace matrix as well

            Returns
            -------
            numpy.ndarray or tuple(numpy.ndarray, numpy.ndarray)
                The alignment matrix and (optionally) the backtrace matrix


            .. versionadded:: 0.4.1

            """
            ins_cost, del_cost, sub_cost, trans_cost = self._cost

            src_len = len(src)
            tar_len = len(tar)

            src_list = ipa_to_features(src)
            tar_list = ipa_to_features(tar)

            d_mat = np.zeros((src_len + 1, tar_len + 1), dtype=np.float_)
            if backtrace:
                trace_mat = np.zeros((src_len + 1, tar_len + 1), dtype=np.int8)
            for i in range(1, src_len + 1):
                d_mat[i, 0] = i * del_cost
                if backtrace:
                    trace_mat[i, 0] = 0
            for j in range(1, tar_len + 1):
                d_mat[0, j] = j * ins_cost
                if backtrace:
                    trace_mat[0, j] = 1

            for i in range(src_len):
                for j in range(tar_len):
                    traces = ((i + 1, j), (i, j + 1), (i, j))
                    opts = (
                        d_mat[traces[0]] + ins_cost,  # ins
                        d_mat[traces[1]] + del_cost,  # del
                        d_mat[traces[2]]
                        + (
                            sub_cost
                            * (
                                1.0
                                - cmp_features(
                                    src_list[i],
                                    tar_list[j],
                                    cast(Sequence[float], self._weights),
                                )
                            )
                            if src_list[i] != tar_list[j]
                            else 0
                        ),  # sub/==
                    )
                    d_mat[i + 1, j + 1] = min(opts)
                    if backtrace:
                        trace_mat[i + 1, j + 1] = int(np.argmin(opts))

                    if self._mode == 'osa':
                        if (
                            i + 1 > 1
                            and j + 1 > 1
                            and src_list[i] == tar_list[j - 1]
                            and src_list[i - 1] == tar_list[j]
                        ):
                            # transposition
                            d_mat[i + 1, j + 1] = min(
                                d_mat[i + 1, j + 1],
                                d_mat[i - 1, j - 1] + trans_cost,
                            )
                            if backtrace:
                                trace_mat[i + 1, j + 1] = 2
            if backtrace:
                return d_mat, trace_mat
            return d_mat
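One way the Extract Method advice might apply to this routine is sketched below. It is a simplified, standalone approximation rather than the library's implementation. It works on plain sequences instead of IPA feature vectors, omits the backtrace matrix, and uses nested lists instead of NumPy arrays. The helper names (`init_matrix`, `substitution_cost`, `is_transposition`) and the `similarity` callback, which stands in for `cmp_features`, are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]
Costs = Tuple[float, float, float, float]  # ins, del, sub, trans


def no_similarity(a: str, b: str) -> float:
    """Placeholder similarity: distinct symbols share nothing."""
    return 0.0


def init_matrix(src_len: int, tar_len: int, ins_cost: float, del_cost: float) -> Matrix:
    """Create the matrix and fill in its first row and first column."""
    d_mat = [[0.0] * (tar_len + 1) for _ in range(src_len + 1)]
    for i in range(1, src_len + 1):
        d_mat[i][0] = i * del_cost
    for j in range(1, tar_len + 1):
        d_mat[0][j] = j * ins_cost
    return d_mat


def substitution_cost(
    a: str, b: str, sub_cost: float, similarity: Callable[[str, str], float]
) -> float:
    """Scale the substitution cost by dissimilarity; equal symbols are free."""
    if a == b:
        return 0.0
    return sub_cost * (1.0 - similarity(a, b))


def is_transposition(src: Sequence[str], tar: Sequence[str], i: int, j: int) -> bool:
    """Check whether the symbol pair ending at src[i] is swapped in tar."""
    return i > 0 and j > 0 and src[i] == tar[j - 1] and src[i - 1] == tar[j]


def alignment_matrix(
    src: Sequence[str],
    tar: Sequence[str],
    costs: Costs = (1.0, 1.0, 1.0, 1.0),
    similarity: Callable[[str, str], float] = no_similarity,
    mode: str = 'lev',
) -> Matrix:
    """Fill the edit distance matrix using the small helpers above."""
    ins_cost, del_cost, sub_cost, trans_cost = costs
    d_mat = init_matrix(len(src), len(tar), ins_cost, del_cost)
    for i in range(len(src)):
        for j in range(len(tar)):
            d_mat[i + 1][j + 1] = min(
                d_mat[i + 1][j] + ins_cost,
                d_mat[i][j + 1] + del_cost,
                d_mat[i][j] + substitution_cost(src[i], tar[j], sub_cost, similarity),
            )
            if mode == 'osa' and is_transposition(src, tar, i, j):
                d_mat[i + 1][j + 1] = min(
                    d_mat[i + 1][j + 1], d_mat[i - 1][j - 1] + trans_cost
                )
    return d_mat
```

With the default unit costs, the bottom-right cell is the usual edit distance, for example `alignment_matrix('kitten', 'sitting')[-1][-1]`. Each helper now holds one or two conditions, so the main loop reads as a sequence of named steps.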