| Conditions | 10 |
| Total Lines | 57 |
| Code Lines | 19 |
| Lines | 0 |
| Ratio | 0 % |
| Changes | 0 |
Small methods make your code easier to understand, especially when combined with a good name. Finding a good name is also usually much easier when the method is small.
For example, if you find yourself adding comments to a method's body, this is usually a good sign that the commented part should be extracted into a new method, with the comment serving as a starting point for the new method's name.
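As a minimal sketch of this idea, the snippet below takes a commented step (padding a term with start and stop symbols, modelled on the listing further down) and moves it into its own function. The names `qgrams_inline`, `pad_term`, and `qgrams_extracted` are hypothetical and not part of abydos; the sketch also leaves out the length checks and the `skip` handling of the real method.

```python
def qgrams_inline(term, qval=2, start_stop='$#'):
    """Before: the padding step is explained by a comment."""
    # Pad the term with start & stop symbols
    if start_stop and qval > 1:
        term = start_stop[0] * (qval - 1) + term + start_stop[-1] * (qval - 1)
    return [term[i:i + qval] for i in range(len(term) - (qval - 1))]


def pad_term(term, qval, start_stop):
    """Return term padded with start & stop symbols (hypothetical helper)."""
    if start_stop and qval > 1:
        return start_stop[0] * (qval - 1) + term + start_stop[-1] * (qval - 1)
    return term


def qgrams_extracted(term, qval=2, start_stop='$#'):
    """After: the comment is gone; the helper's name carries the intent."""
    padded = pad_term(term, qval, start_stop)
    return [padded[i:i + qval] for i in range(len(padded) - (qval - 1))]
```

Both variants return the same list for the same input, so the extraction changes only how the code reads, not what it does.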
Commonly applied refactorings include:
If many parameters/temporary variables are present:
Complex code like abydos.qgram.QGrams.__init__() often does a lot of different things. To break the enclosing class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields/methods that share the same prefixes or suffixes.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
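The following sketch illustrates Extract Class under stated assumptions: the classes `QGramsBefore`, `Padding`, and `QGramsAfter`, and the `pad_`-prefixed fields, are invented for illustration and do not reflect how abydos is actually organised. The shared `pad_` prefix is the hint that these members form one component.

```python
class QGramsBefore:
    """Before: members sharing the 'pad_' prefix hint at a hidden component."""

    def __init__(self, term, pad_start='$', pad_stop='#'):
        self.term = term
        self.pad_start = pad_start
        self.pad_stop = pad_stop

    def pad_apply(self, qval):
        n = qval - 1
        return self.pad_start * n + self.term + self.pad_stop * n


class Padding:
    """Extracted class: owns the start/stop symbols and the padding logic."""

    def __init__(self, start='$', stop='#'):
        self.start = start
        self.stop = stop

    def apply(self, term, qval):
        n = qval - 1
        return self.start * n + term + self.stop * n


class QGramsAfter:
    """After: the padding concern is delegated to a Padding instance."""

    def __init__(self, term, padding=None):
        self.term = term
        self.padding = padding if padding is not None else Padding()

    def padded(self, qval):
        return self.padding.apply(self.term, qval)
```

After the extraction, the padding rules can be tested and changed on their own, and the original class keeps a single reference to the new component instead of several related fields.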
```python
# -*- coding: utf-8 -*-

def __init__(self, term, qval=2, start_stop='$#', skip=0):
    """Initialize QGrams.

    :param str word: a string to extract q-grams from
    :param int or iterable qval: the q-gram length (defaults to 2), can be
        an integer, range object, or list
    :param str start_stop: a string of length >= 0 indicating start & stop
        symbols.
        If the string is '', q-grams will be calculated without start &
        stop symbols appended to each end.
        Otherwise, the first character of start_stop will pad the beginning
        of the string and the last character of start_stop will pad the end
        of the string before q-grams are calculated. (In the case that
        start_stop is only 1 character long, the same symbol will be used
        for both.)
    :param int or iterable skip: the number of characters to skip, can be
        an integer, range object, or list

    >>> qg = QGrams('AATTATAT')
    >>> qg
    QGrams({'AT': 3, 'TA': 2, '$A': 1, 'AA': 1, 'TT': 1, 'T#': 1})

    >>> qg = QGrams('AATTATAT', qval=1, start_stop='')
    >>> qg
    QGrams({'A': 4, 'T': 4})

    >>> qg = QGrams('AATTATAT', qval=3, start_stop='')
    >>> qg
    QGrams({'TAT': 2, 'AAT': 1, 'ATT': 1, 'TTA': 1, 'ATA': 1})
    """
    # Save the term itself
    self.term = term
    self.ordered_list = []

    if not isinstance(qval, Iterable):
        qval = (qval,)
    if not isinstance(skip, Iterable):
        skip = (skip,)

    for qval_i in qval:
        for skip_i in skip:
            if len(term) < qval_i or qval_i < 1:
                continue

            if start_stop and qval_i > 1:
                term = start_stop[0]*(qval_i-1) + term + start_stop[-1]*(qval_i-1)

            # Having appended start & stop symbols (or not), save the result
            # but only for the longest valid qval_i
            if len(term) > len(self.term_ss):
                self.term_ss = term

            skip_i += 1
            self.ordered_list += [term[i:i+(qval_i*skip_i):skip_i] for i in
                                  range(len(term)-(qval_i-1))]

    super(QGrams, self).__init__(self.ordered_list)
```
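The two `isinstance` checks in the listing above follow the same pattern, and the nested loops only enumerate every combination of the two arguments, so both are candidates for extraction. A minimal sketch, assuming hypothetical helper names `as_iterable` and `qval_skip_pairs` that do not exist in abydos:

```python
from collections.abc import Iterable
from itertools import product


def as_iterable(value):
    """Wrap a scalar in a 1-tuple; pass iterables (list, range, ...) through."""
    return value if isinstance(value, Iterable) else (value,)


def qval_skip_pairs(qval=2, skip=0):
    """Return every (qval_i, skip_i) pair, with qval varying slowest,
    which is the same order as the nested loops in the listing."""
    return list(product(as_iterable(qval), as_iterable(skip)))
```

With helpers like these, the body of `__init__()` could iterate over a single sequence of pairs, which removes one nesting level and two of the conditions counted by the complexity metric.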