Conditions | 10 |
Total Lines | 74 |
Code Lines | 25 |
Lines | 0 |
Ratio | 0 % |
Tests | 20 |
CRAP Score | 10 |
Changes | 0 |
Small methods make your code easier to understand, particularly when combined with a good name. A small method is also usually much easier to name.
For example, if you find yourself adding comments to a method's body, this is usually a sign that the commented part should be extracted into a new method, with the comment serving as a starting point for the new method's name.
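As a minimal sketch of this idea, the comment "Having appended start & stop symbols (or not)" in the method under review suggests a name for the padding step. The helper name append_start_stop_symbols is hypothetical and is not part of abydos; the body mirrors the padding expression in the listing.

```python
def append_start_stop_symbols(term, qval, start_stop='$#'):
    """Pad term with (qval - 1) start symbols and (qval - 1) stop symbols.

    Hypothetical helper: the name is derived from the original comment.
    """
    # No padding when no symbols are given or when q-grams are single characters.
    if not start_stop or qval <= 1:
        return term
    pad = qval - 1
    # The first character pads the beginning, the last character pads the end;
    # a one-character start_stop therefore pads both ends with the same symbol.
    return start_stop[0] * pad + term + start_stop[-1] * pad
```

With this helper in place, the if/else block inside the loop would shrink to a single call, and the explanatory comment would no longer be needed.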
Commonly applied refactorings include:
If many parameters/temporary variables are present:
Complex units such as abydos.tokenizer._qgrams.QGrams.__init__() often do many different things. To break one down, we need to identify a cohesive component within its class. A common way to find such a component is to look for fields or methods that share the same prefix or suffix.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
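In the method under review, the fields _term and _term_ss share the _term prefix, so they are one candidate for Extract Class. The sketch below assumes a hypothetical PaddedTerm class (not part of abydos) that owns both values and the "keep only the longest padded form" rule.

```python
class PaddedTerm:
    """Hypothetical holder for a term and its longest padded form."""

    def __init__(self, term):
        self.raw = term        # plays the role of self._term
        self.longest = term    # plays the role of self._term_ss

    def record(self, candidate):
        """Keep candidate only if it is longer than the longest form so far."""
        if len(candidate) > len(self.longest):
            self.longest = candidate
```

Whether this split is worth it depends on how the rest of the class uses the two fields; it is shown here only to illustrate the prefix heuristic.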
# -*- coding: utf-8 -*-

    def __init__(self, term, qval=2, start_stop='$#', skip=0):
        """Initialize QGrams.

        Parameters
        ----------
        term : str
            A string to extract q-grams from
        qval : int or Iterable
            The q-gram length (defaults to 2), can be an integer, range object,
            or list
        start_stop : str
            A string of length >= 0 indicating start & stop symbols.
            If the string is '', q-grams will be calculated without start &
            stop symbols appended to each end.
            Otherwise, the first character of start_stop will pad the
            beginning of the string and the last character of start_stop
            will pad the end of the string before q-grams are calculated.
            (In the case that start_stop is only 1 character long, the same
            symbol will be used for both.)
        skip : int or Iterable
            The number of characters to skip, can be an integer, range object,
            or list

        Examples
        --------
        >>> qg = QGrams('AATTATAT')
        >>> qg
        QGrams({'AT': 3, 'TA': 2, '$A': 1, 'AA': 1, 'TT': 1, 'T#': 1})

        >>> qg = QGrams('AATTATAT', qval=1, start_stop='')
        >>> qg
        QGrams({'A': 4, 'T': 4})

        >>> qg = QGrams('AATTATAT', qval=3, start_stop='')
        >>> qg
        QGrams({'TAT': 2, 'AAT': 1, 'ATT': 1, 'TTA': 1, 'ATA': 1})

        """
        # Save the term itself
        self._term = term
        self._term_ss = term
        self._ordered_list = []

        if not isinstance(qval, Iterable):
            qval = (qval,)
        if not isinstance(skip, Iterable):
            skip = (skip,)

        for qval_i in qval:
            for skip_i in skip:
                if len(self._term) < qval_i or qval_i < 1:
                    continue

                if start_stop and qval_i > 1:
                    term = (
                        start_stop[0] * (qval_i - 1)
                        + self._term
                        + start_stop[-1] * (qval_i - 1)
                    )
                else:
                    term = self._term

                # Having appended start & stop symbols (or not), save the
                # result, but only for the longest valid qval_i
                if len(term) > len(self._term_ss):
                    self._term_ss = term

                skip_i += 1
                self._ordered_list += [
                    term[i : i + (qval_i * skip_i) : skip_i]
                    for i in range(len(term) - (qval_i - 1))
                ]

        super(QGrams, self).__init__(self._ordered_list)
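One possible decomposition of the loop in this method is sketched below. It covers only the argument normalization and the slicing step and deliberately leaves out start/stop padding. The function names as_tuple, skip_grams and ordered_qgrams are hypothetical and are not part of abydos; the slice expression mirrors the one in the listing.

```python
from collections import Counter
from collections.abc import Iterable


def as_tuple(value):
    """Wrap a single value in a tuple; pass iterables through unchanged."""
    return value if isinstance(value, Iterable) else (value,)


def skip_grams(term, qval, skip):
    """Return q-grams of term, taking every (skip + 1)-th character.

    Mirrors the original slice; with skip > 0 the windows near the end of
    the term can come out shorter than qval characters.
    """
    step = skip + 1
    return [
        term[i : i + qval * step : step]
        for i in range(len(term) - (qval - 1))
    ]


def ordered_qgrams(term, qval=2, skip=0):
    """Collect q-grams for every (qval, skip) pair, without padding."""
    grams = []
    for qval_i in as_tuple(qval):
        # Skip lengths that are invalid or longer than the term itself.
        if len(term) < qval_i or qval_i < 1:
            continue
        for skip_i in as_tuple(skip):
            grams += skip_grams(term, qval_i, skip_i)
    return grams


if __name__ == '__main__':
    # Matches the unpadded docstring example (qval=3, start_stop='').
    print(Counter(ordered_qgrams('AATTATAT', qval=3)))
```

Each helper now has one job and can be tested on its own, which is the main benefit this kind of split is meant to bring.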