| Metric | Value |
| --- | --- |
| Conditions | 15 |
| Total Lines | 80 |
| Code Lines | 38 |
| Lines | 0 |
| Ratio | 0 % |
| Tests | 34 |
| CRAP Score | 15 |
| Changes | 0 |
Small methods make your code easier to understand, especially when combined with a good name. Moreover, when a method is small, finding a good name is usually much easier.
For example, if you find yourself adding comments to a method's body, that is usually a sign that you should extract the commented part into a new method, using the comment as a starting point for the new method's name.
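To illustrate the comment-to-method-name idea, here is a minimal sketch with hypothetical names (`Order`, `taxed_total`, and the report functions are all illustrative, not from abydos):

```python
from dataclasses import dataclass


@dataclass
class Order:
    price: float
    tax_rate: float


# Before: the comment hints at a missing abstraction.
def report_before(orders):
    total = 0
    # sum the taxed price of each order
    for order in orders:
        total += order.price * (1 + order.tax_rate)
    return 'Total: %.2f' % total


# After Extract Method: the commented block becomes a function
# whose name is taken from the comment.
def taxed_total(orders):
    """Sum the taxed price of each order."""
    return sum(order.price * (1 + order.tax_rate) for order in orders)


def report_after(orders):
    return 'Total: %.2f' % taxed_total(orders)
```

The behavior is unchanged; the comment simply became a name the reader can trust.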
Commonly applied refactorings include:

- Extract Method
- Decompose Conditional

If many parameters or temporary variables are present:

- Replace Temp with Query
- Introduce Parameter Object
- Replace Method with Method Object
Complex methods like abydos.distance._relaxed_hamming.RelaxedHamming.dist_abs() often do a lot of different things. To break such a method down, we need to identify a cohesive component within its class. A common approach to finding such a component is to look for fields/methods that share the same prefixes, or suffixes.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
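A minimal sketch of Extract Class, with hypothetical names (`Customer`, `Address`, and the `billing_` fields are illustrative, not from abydos): the shared `billing_` prefix marks a cohesive component that can be pulled into its own class.

```python
# Before: the 'billing_' prefix marks a component hiding inside Customer.
class CustomerBefore:
    def __init__(self, name, billing_street, billing_city):
        self.name = name
        self.billing_street = billing_street
        self.billing_city = billing_city


# After Extract Class: the prefixed fields become their own class.
class Address:
    def __init__(self, street, city):
        self.street = street
        self.city = city

    def label(self):
        return '%s, %s' % (self.street, self.city)


class Customer:
    def __init__(self, name, billing_address):
        self.name = name
        self.billing_address = billing_address
```

Behavior that operated on the prefixed fields (like `label`) moves with them, so `Customer` shrinks and `Address` becomes reusable.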
```python
# -*- coding: utf-8 -*-

def dist_abs(self, src, tar):
    """Return the discounted Hamming distance between two strings.

    Parameters
    ----------
    src : str
        Source string for comparison
    tar : str
        Target string for comparison

    Returns
    -------
    float
        Relaxed Hamming distance

    Examples
    --------
    >>> cmp = RelaxedHamming()
    >>> cmp.dist_abs('cat', 'hat')
    1.0
    >>> cmp.dist_abs('Niall', 'Neil')
    1.4
    >>> cmp.dist_abs('aluminum', 'Catalan')
    6.4
    >>> cmp.dist_abs('ATCG', 'TAGC')
    0.8


    .. versionadded:: 0.4.1

    """
    if src == tar:
        return 0

    if len(src) != len(tar):
        replacement_char = 1
        while chr(replacement_char) in src or chr(replacement_char) in tar:
            replacement_char += 1
        replacement_char = chr(replacement_char)
        if len(src) < len(tar):
            src += replacement_char * (len(tar) - len(src))
        else:
            tar += replacement_char * (len(src) - len(tar))

    if self.params['tokenizer']:
        src = self.params['tokenizer'].tokenize(src).get_list()
        tar = self.params['tokenizer'].tokenize(tar).get_list()

    score = 0
    for pos in range(len(src)):
        if src[pos] == tar[pos : pos + 1][0]:
            continue

        try:
            diff = (
                tar[pos + 1 : pos + self._maxdist + 1].index(src[pos]) + 1
            )
        except ValueError:
            diff = 0
        try:
            found = (
                tar[max(0, pos - self._maxdist) : pos][::-1].index(
                    src[pos]
                )
                + 1
            )
        except ValueError:
            found = 0

        if found and diff:
            diff = min(diff, found)
        elif found:
            diff = found

        if diff:
            score += min(1.0, self._discount * diff)
        else:
            score += 1.0

    return score
```