Conditions | 10
Total Lines | 53
Lines | 0
Ratio | 0 %
Tests | 1
CRAP Score | 100.3389
Changes | 7
Bugs | 0
Features | 0
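The CRAP score above combines cyclomatic complexity (the 10 conditions) with test coverage. A minimal sketch, assuming the standard Savoia formula; the tool's exact variant may differ slightly, which would explain the fractional 100.3389:

```python
def crap_score(complexity, coverage):
    """Change Risk Anti-Patterns score.

    complexity: cyclomatic complexity of the method
    coverage: fraction of the method covered by tests, in [0, 1]
    """
    # CRAP = comp^2 * (1 - coverage)^3 + comp
    return complexity ** 2 * (1 - coverage) ** 3 + complexity


# With 10 conditions and no test coverage, the score is already 110;
# full coverage reduces it to the complexity itself.
print(crap_score(10, 0.0))  # 110.0
print(crap_score(10, 1.0))  # 10.0
```

The cubic term means coverage pays off quickly: even partial coverage of a complex method pulls the score down sharply.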
Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.
For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
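As a hypothetical illustration (the functions and field names here are invented, not from this codebase), the comment becomes the name of the extracted method:

```python
# Before: a comment marks a cohesive block inside a longer function.
def report(orders):
    total = 0
    # compute the total price including 10% tax
    for order in orders:
        total += order['price'] * 1.10
    return 'Total: ' + format(total, '.2f')


# After: the commented block is pulled out, and the comment
# supplies the method name (Extract Method).
def total_price_with_tax(orders, tax_rate=0.10):
    return sum(order['price'] * (1 + tax_rate) for order in orders)


def report_refactored(orders):
    return 'Total: ' + format(total_price_with_tax(orders), '.2f')
```

The comment disappears because the name now carries the same information, and the extracted helper can be tested on its own.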
Commonly applied refactorings include:

- Extract Method
- If many parameters/temporary variables are present: Replace Method with Method Object
Complex methods like fetch_and_preprocess() often do a lot of different things. To break such a method down, we need to identify a cohesive component within it. A common approach to finding such a component is to look for variables or fields that share the same prefixes or suffixes.
Once you have determined the parts that belong together, you can apply the Extract Method refactoring; if the extracted data and behaviour warrant a class of their own, Extract Class (or, where the component specializes existing behaviour, Extract Subclass) is also a candidate.
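A hypothetical sketch of the prefix heuristic (the class and field names are invented for illustration): fields sharing a prefix are moved into their own class.

```python
# Before: the repeated 'hand_'/'ankle_' prefixes hint at a hidden class.
class Recording:
    def __init__(self):
        self.hand_acc_x = 0.0
        self.hand_acc_y = 0.0
        self.ankle_acc_x = 0.0
        self.ankle_acc_y = 0.0


# After: the shared prefix becomes its own class (Extract Class),
# and the prefix survives only as the attribute name.
class Sensor:
    def __init__(self, acc_x=0.0, acc_y=0.0):
        self.acc_x = acc_x
        self.acc_y = acc_y


class RecordingRefactored:
    def __init__(self):
        self.hand = Sensor()
        self.ankle = Sensor()
```

Behaviour that only touches one sensor's fields can then move onto Sensor, shrinking the original class further.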
def fetch_and_preprocess(directory_to_extract_to, columns_to_use=None):
    """
    High level function to fetch_and_preprocess the PAMAP2 dataset

    directory_to_extract_to: the directory where the data will be stored
    columns_to_use: the columns to use
    """
    if columns_to_use is None:
        columns_to_use = ['hand_acc_16g_x', 'hand_acc_16g_y', 'hand_acc_16g_z',
                          'ankle_acc_16g_x', 'ankle_acc_16g_y', 'ankle_acc_16g_z',
                          'chest_acc_16g_x', 'chest_acc_16g_y', 'chest_acc_16g_z']
    targetdir = fetch_data(directory_to_extract_to)
    outdatapath = targetdir + '/PAMAP2_Dataset/slidingwindow512cleaned/'
    if not os.path.exists(outdatapath):
        os.makedirs(outdatapath)
    if os.path.isfile(outdatapath + 'x_train.npy'):
        print('Data previously pre-processed and np-files saved to ' +
              outdatapath)
    else:
        datadir = targetdir + '/PAMAP2_Dataset/Protocol'
        filenames = listdir(datadir)
        print('Start pre-processing all ' + str(len(filenames)) + ' files...')
        # Load the files and put them in a list of pandas dataframes:
        datasets = [pd.read_csv(datadir + '/' + fn, header=None, sep=' ')
                    for fn in filenames]
        datasets = addheader(datasets)  # add headers to the datasets
        # Interpolate dataset to get same sample rate between channels
        datasets_filled = [d.interpolate() for d in datasets]
        # Create mapping for class labels
        classlabels, nr_classes, mapclasses = map_clas(datasets_filled)
        # Create input (x) and output (y) sets
        xall = [np.array(data[columns_to_use]) for data in datasets_filled]
        yall = [np.array(data.activityID) for data in datasets_filled]
        xylists = [split_activities(y, x) for x, y in zip(xall, yall)]
        Xlists, ylists = zip(*xylists)
        ybinarylists = [transform_y(y, mapclasses, nr_classes) for y in ylists]
        # Split in train, test and val
        x_trainlist, y_trainlist = split_data(Xlists, ybinarylists, slice(0, 6))
        x_vallist, y_vallist = split_data(Xlists, ybinarylists, indices=6)
        test_range = slice(7, len(datasets_filled))
        x_testlist, y_testlist = split_data(Xlists, ybinarylists, test_range)
        # Take sliding-window frames, target is label of last time step,
        # and store as numpy file
        slidingwindow_store(y_list=y_trainlist, x_list=x_trainlist,
                            X_name='X_train', y_name='y_train',
                            outdatapath=outdatapath, shuffle=True)
        slidingwindow_store(y_list=y_vallist, x_list=x_vallist,
                            X_name='X_val', y_name='y_val',
                            outdatapath=outdatapath, shuffle=False)
        slidingwindow_store(y_list=y_testlist, x_list=x_testlist,
                            X_name='X_test', y_name='y_test',
                            outdatapath=outdatapath, shuffle=False)
        print('Processed data successfully stored in ' + outdatapath)
    return outdatapath
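Applied to the listing above, the advice would mean pulling each commented step into its own helper. A minimal, self-contained sketch of the resulting shape (the helper names are hypothetical, and real file I/O is replaced by pure functions so the example runs on its own):

```python
def load_datasets(raw_rows):
    # Stand-in for reading the raw files and interpolating missing samples;
    # here it just normalizes each dataset.
    return [sorted(rows) for rows in raw_rows]


def split_train_val_test(datasets, train_end=6, val_index=6, test_start=7):
    # Mirrors the slice(0, 6) / indices=6 / slice(7, len(...)) split
    # used in fetch_and_preprocess.
    train = datasets[:train_end]
    val = datasets[val_index:val_index + 1]
    test = datasets[test_start:]
    return train, val, test


def preprocess(raw_rows):
    # The top-level function reads as a table of contents:
    # each step has a name instead of a comment.
    datasets = load_datasets(raw_rows)
    return split_train_val_test(datasets)
```

Each helper can now be unit-tested in isolation, which would also bring the CRAP score down by raising coverage without touching complexity.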
This check looks for invalid names for a range of different identifiers.
You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.
If your project includes a Pylint configuration file, the settings contained in that file take precedence.
To find out more about Pylint, please refer to their site.
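For instance, if the defaults flag names in this module, the relevant regexes can be widened in a Pylint configuration file. The patterns below are illustrative, not a recommendation:

```ini
[BASIC]
# Allow longer snake_case names for functions, arguments and variables,
# e.g. fetch_and_preprocess's directory_to_extract_to parameter.
function-rgx=[a-z_][a-z0-9_]{2,40}$
argument-rgx=[a-z_][a-z0-9_]{2,40}$
variable-rgx=[a-z_][a-z0-9_]{2,40}$
```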