Conditions | 14 |
Total Lines | 101 |
Code Lines | 31 |
Lines | 0 |
Ratio | 0 % |
Tests | 11 |
CRAP Score | 14 |
Changes | 0 |
Small methods make your code easier to understand, particularly when combined with a good name. Besides, if your method is small, finding a good name is usually much easier.
For example, if you find yourself adding comments to a method's body, this is usually a good sign that you should extract the commented part into a new method and use the comment as a starting point when naming it.
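As a minimal sketch of the idea above, the commented step in the "before" function becomes its own small function whose name is derived from the comment. The function names (`normalize_name_before`, `keep_ascii_letters`, `normalize_name_after`) are hypothetical and are not taken from the analyzed project.

```python
def normalize_name_before(raw: str) -> str:
    """Before: the body needs comments to explain its steps."""
    # strip whitespace and uppercase
    cleaned = raw.strip().upper()
    # keep only the letters A-Z
    return "".join(c for c in cleaned if "A" <= c <= "Z")


def keep_ascii_letters(text: str) -> str:
    """Extracted step; the old comment 'keep only the letters A-Z' became the name."""
    return "".join(c for c in text if "A" <= c <= "Z")


def normalize_name_after(raw: str) -> str:
    """After: the body reads as a sequence of named steps, no comments needed."""
    return keep_ascii_letters(raw.strip().upper())
```

Both versions behave identically; only the structure changes, which is what makes the extraction safe to apply in small steps.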
Commonly applied refactorings include:
If many parameters/temporary variables are present:
Complex units like abydos.phonetic._soundex.Soundex.encode() often do a lot of different things. To break such a unit down, we need to identify a cohesive component within its class. A common way to find such a component is to look for fields or methods that share the same prefix or suffix.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often faster to apply.
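The following is a minimal sketch of Extract Class, assuming a class whose fields share a `pad_` prefix (for example `pad_enabled`, `pad_length`, `pad_char`). The `Padding` and `Encoder` classes are hypothetical illustrations and do not exist in the analyzed project.

```python
from dataclasses import dataclass


@dataclass
class Padding:
    """Cohesive component built from fields that shared a 'pad_' prefix."""

    enabled: bool = True
    length: int = 4
    char: str = "0"

    def apply(self, code: str) -> str:
        # Pad on the right if enabled, then cut to the target length.
        if self.enabled:
            code += self.char * self.length
        return code[: self.length]


@dataclass
class Encoder:
    """The original class now delegates padding instead of owning its fields."""

    padding: Padding

    def finish(self, code: str) -> str:
        return self.padding.apply(code)
```

After the extraction, the padding behavior can be tested and changed without touching the rest of the encoder.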
# Copyright 2014-2020 by Christopher C. Little.

    def encode(self, word: str, **kwargs: Any) -> str:
        """Return the Soundex code for a word.

        Parameters
        ----------
        word : str
            The word to transform

        Returns
        -------
        str
            The Soundex value

        Examples
        --------
        >>> pe = Soundex()
        >>> pe.encode("Christopher")
        'C623'
        >>> pe.encode("Niall")
        'N400'
        >>> pe.encode('Smith')
        'S530'
        >>> pe.encode('Schmidt')
        'S530'

        >>> Soundex(max_length=-1).encode('Christopher')
        'C623160000000000000000000000000000000000000000000000000000000000'
        >>> Soundex(max_length=-1, zero_pad=False).encode('Christopher')
        'C62316'

        >>> Soundex(reverse=True).encode('Christopher')
        'R132'

        >>> pe.encode('Ashcroft')
        'A261'
        >>> pe.encode('Asicroft')
        'A226'

        >>> pe_special = Soundex(var='special')
        >>> pe_special.encode('Ashcroft')
        'A226'
        >>> pe_special.encode('Asicroft')
        'A226'


        .. versionadded:: 0.1.0
        .. versionchanged:: 0.3.6
            Encapsulated in class
        .. versionchanged:: 0.6.0
            Made return a str only (comma-separated)

        """
        # uppercase, normalize, decompose, and filter non-A-Z out
        word = unicode_normalize('NFKD', word.upper())

        if self._var == 'Census' and (
            'recurse' not in kwargs or kwargs['recurse'] is not False
        ):
            if word[:3] in {'VAN', 'CON'} and len(word) > 4:
                return '{0},{1}'.format(
                    self.encode(word, recurse=False),
                    self.encode(word[3:], recurse=False),
                )
            if word[:2] in {'DE', 'DI', 'LA', 'LE'} and len(word) > 3:
                return '{0},{1}'.format(
                    self.encode(word, recurse=False),
                    self.encode(word[2:], recurse=False),
                )
            # Otherwise, proceed as usual (var='American' mode, ostensibly)

        word = ''.join(c for c in word if c in self._uc_set)

        # Nothing to convert, return base case
        if not word:
            if self._zero_pad:
                return '0' * self._max_length
            return '0'

        # Reverse word if computing Reverse Soundex
        if self._reverse:
            word = word[::-1]

        # apply the Soundex algorithm
        sdx = word.translate(self._trans)

        if self._var == 'special':
            sdx = sdx.replace('9', '0')  # special rule for 1880-1910 census
        else:
            sdx = sdx.replace('9', '')  # rule 1
        sdx = self._delete_consecutive_repeats(sdx)  # rule 3

        if word[0] in 'HW':
            sdx = word[0] + sdx
        else:
            sdx = word[0] + sdx[1:]
        sdx = sdx.replace('0', '')  # rule 1

        if self._zero_pad:
            sdx += '0' * self._max_length  # rule 4

        return sdx[: self._max_length]
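To illustrate how the commented steps in the listing above could be extracted into small, named functions, here is a simplified standalone sketch. It covers only the default path (no Census, special, or reverse variants, and no Unicode normalization). The helper names and the module-level `TRANS` table are assumptions made for this example; the table follows the standard Soundex digit groups, with H and W mapped to a placeholder `9` as the listing's `replace('9', '')` step suggests. It is not the library's own implementation.

```python
from itertools import groupby

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# Assumed mapping: vowels and Y -> 0, H and W -> 9, consonants -> Soundex groups.
TRANS = str.maketrans(LETTERS, "01230129022455012623019202")


def keep_letters(word: str) -> str:
    """Uppercase the word and drop everything outside A-Z."""
    return "".join(c for c in word.upper() if c in LETTERS)


def empty_code(max_length: int, zero_pad: bool) -> str:
    """Base case for a word with nothing to convert."""
    return "0" * max_length if zero_pad else "0"


def delete_consecutive_repeats(text: str) -> str:
    """Collapse runs of the same character into one."""
    return "".join(key for key, _ in groupby(text))


def digits_for(word: str) -> str:
    """Translate letters to digits and keep the first letter as the head."""
    sdx = word.translate(TRANS).replace("9", "")
    sdx = delete_consecutive_repeats(sdx)
    sdx = word[0] + (sdx if word[0] in "HW" else sdx[1:])
    return sdx.replace("0", "")


def pad_and_truncate(sdx: str, max_length: int, zero_pad: bool) -> str:
    """Right-pad with zeros if requested, then cut to max_length."""
    if zero_pad:
        sdx += "0" * max_length
    return sdx[:max_length]


def encode(word: str, max_length: int = 4, zero_pad: bool = True) -> str:
    """Compose the small steps; each one replaces a comment in the original."""
    word = keep_letters(word)
    if not word:
        return empty_code(max_length, zero_pad)
    return pad_and_truncate(digits_for(word), max_length, zero_pad)
```

With this decomposition, each former comment ("filter non-A-Z out", "return base case", "apply the Soundex algorithm") corresponds to one function, and the top-level `encode` contains only three statements.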