diff_classifier.utils.csv_to_pd()   B

Complexity

Conditions 5

Size

Total Lines 58
Code Lines 33

Duplication

Lines 58
Ratio 100 %

Importance

Changes 0

Metric   Value
eloc     33
dl       58
loc      58
rs       8.6213
c        0
b        0
f        0
cc       5
nop      1

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
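As a minimal sketch of that advice (the function names here are hypothetical and not part of diff_classifier), the commented step inside a loop can be pulled out into its own function, with the comment supplying the name:

```python
import math


def path_length_inline(points):
    """Long-form version: the comment marks a step that could be extracted."""
    total = 0.0
    for i in range(1, len(points)):
        # distance between consecutive points
        dx = points[i][0] - points[i - 1][0]
        dy = points[i][1] - points[i - 1][1]
        total += math.sqrt(dx * dx + dy * dy)
    return total


def distance_between(p, q):
    """Euclidean distance between two (x, y) points (hypothetical helper)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def path_length(points):
    """Same result as path_length_inline, with the commented step extracted."""
    return sum(distance_between(p, q) for p, q in zip(points, points[1:]))
```

Both versions compute the same value; the second simply gives the commented step a name that can be read, tested, and reused on its own.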

Commonly applied refactorings include Extract Method, where part of a method's body is moved into its own well-named method.

"""Utility functions used throughout diff_classifier.

This module includes general functions for tasks such as importing files and
converting between data types. Currently only includes a function to generate
pandas dataframes for csv output from TrackMate.

"""
import pandas as pd


def csv_to_pd(csvfname):
    """Reads TrackMate csv output file and converts to pandas dataframe.

    A specialized function designed specifically for TrackMate output files.
    This edits out the header at the beginning of the file.

    Parameters
    ----------
    csvfname : string
        Output csv from a file similar to trackmate_template.  Must
        include a 'Data starts here.' line in order to parse correctly.

    Returns
    -------
    data : pandas DataFrame
        Contains all trajectories from csvfname.

    Examples
    --------
    >>> data = csv_to_pd('../data/test.csv')

    """
    csvfile = open(csvfname)

    try:
        # Count the header lines up to and including the marker line.
        line = 'test'
        counter = 0
        while line != 'Data starts here.\n':
            line = csvfile.readline()
            counter = counter + 1
            if counter > 2000:
                break

        data = pd.read_csv(csvfname, skiprows=counter)
        # sort_values returns a new DataFrame, so the result must be kept.
        data = data.sort_values(['Track_ID', 'Frame'], ascending=[True, True])
        data = data.astype('float64')

        # Renumber the track IDs consecutively, starting from 0.
        partids = data.Track_ID.unique()
        counter = 0
        for partid in partids:
            data.loc[data.Track_ID == partid, 'Track_ID'] = counter
            counter = counter + 1
    except Exception:
        # Fall back to an empty DataFrame with the expected columns.
        print('No data in csv file.')
        rawd = {'Track_ID': [],
                'Spot_ID': [],
                'Frame': [],
                'X': [],
                'Y': [],
                'Quality': [],
                'SN_Ratio': [],
                'Mean_Intensity': []}
        cols = ['Track_ID', 'Spot_ID', 'Frame', 'X', 'Y', 'Quality',
                'SN_Ratio', 'Mean_Intensity']
        data = pd.DataFrame(data=rawd, index=[])
        data = data[cols]
        data = data.astype('float64')
    finally:
        # Close the file handle whether or not parsing succeeded.
        csvfile.close()

    return data
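One way the Long Method advice might apply to csv_to_pd is to extract the header scan into its own function. The sketch below is an illustration under stated assumptions, not the project's code: the name count_header_lines and its parameters are hypothetical, it strips line endings before comparing (so a marker on the last line or with Windows line endings still matches), and it raises ValueError when the marker is missing instead of silently stopping after 2000 lines.

```python
import io


def count_header_lines(lines, marker='Data starts here.', limit=2000):
    """Return how many lines to skip so that reading resumes after marker.

    Hypothetical helper: `lines` is any iterable of text lines, such as an
    open file. Raises ValueError if marker is not among the first `limit`
    lines.
    """
    for count, line in enumerate(lines, start=1):
        if line.rstrip('\r\n') == marker:
            return count
        if count >= limit:
            break
    raise ValueError('marker line %r not found' % marker)


# Small in-memory stand-in for a TrackMate-style file (made-up contents).
sample = io.StringIO(
    'some header text\n'
    'more header text\n'
    'Data starts here.\n'
    'Track_ID,Frame\n'
    '0,1\n'
)
rows_to_skip = count_header_lines(sample)
```

With a helper like this, csv_to_pd could pass the returned count to the skiprows argument of pd.read_csv and catch ValueError to build the empty DataFrame, which would leave the main function shorter and make the header scan testable without any csv parsing.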