Passed
Branch master (17b603)
by P.R.
01:31
created

etlt.cleaner.DateCleaner.DateCleaner.clean()   F

Complexity

Conditions 27

Size

Total Lines 52
Code Lines 28

Duplication

Lines 52
Ratio 100 %

Code Coverage

Tests 25
CRAP Score 27

Importance

Changes 0
Metric Value
cc 27
eloc 28
nop 2
dl 52
loc 52
ccs 25
cts 25
cp 1
crap 27
rs 0
c 0
b 0
f 0

How to fix   Long Method    Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:

Complexity

Complex classes like etlt.cleaner.DateCleaner.DateCleaner.clean() often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields/methods that share the same prefixes, or suffixes.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

1
"""
2
ETLT
3
4
Copyright 2016 Set Based IT Consultancy
5
6
Licence MIT
7
"""
8 1
import re
9
10
11 1 View Code Duplication
class DateCleaner:
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated in your project.
Loading history...
12
    """
13
    Utility class for converting dates in miscellaneous formats to ISO-8601 (YYYY-MM-DD) format.
14
    """
15
16
    # ------------------------------------------------------------------------------------------------------------------
17 1
    month_map = {
18
        # English
19
        'jan': '01',
20
        'feb': '02',
21
        'mar': '03',
22
        'apr': '04',
23
        'may': '05',
24
        'jun': '06',
25
        'jul': '07',
26
        'aug': '08',
27
        'sep': '09',
28
        'oct': '10',
29
        'nov': '11',
30
        'dec': '12',
31
32
        # Dutch
33
        'mrt': '03',
34
        'mei': '05',
35
        'okt': '10'
36
    }
37
38
    # ------------------------------------------------------------------------------------------------------------------
39 1
    @staticmethod
40 1
    def clean(date, ignore_time=False):
41
        """
42
        Converts a date in miscellaneous format to ISO-8601 (YYYY-MM-DD) format.
43
44
        :param str date: The input date.
45
        :param bool ignore_time: If true any trailing time prt is ignore.
46
47
        :rtype: str
48
        """
49
        # Return empty input immediately.
50 1
        if not date:
51 1
            return date
52
53 1
        parts = re.split(r'[\-/. ]', date)
54
55 1
        if (len(parts) == 3) or \
56
                (len(parts) > 3 and ignore_time) or \
57
                (len(parts) == 4 and re.match(r'^[0:]*$', parts[3])) or \
58
                (len(parts) == 5 and re.match(r'^[0:]*$', parts[3]) and re.match(r'^0*$', parts[4])):
59 1
            if len(parts[0]) == 4 and len(parts[1]) <= 2 and len(parts[2]) <= 2:
60
                # Assume date is in  YYYY-MM-DD of YYYY-M-D format.
61 1
                return parts[0] + '-' + ('00' + parts[1])[-2:] + '-' + ('00' + parts[2])[-2:]
62
63 1
            if len(parts[0]) <= 2 and len(parts[1]) <= 2 and len(parts[2]) == 4:
64
                # Assume date is in  DD-MM-YYYY or D-M-YYYY format.
65 1
                return parts[2] + '-' + ('00' + parts[1])[-2:] + '-' + ('00' + parts[0])[-2:]
66
67 1
            if len(parts[0]) <= 2 and len(parts[1]) <= 2 and len(parts[2]) == 2:
68
                # Assume date is in  DD-MM-YY or D-M-YY format.
69 1
                year = '19' + parts[2] if parts[2] >= '20' else '20' + parts[2]
70
71 1
                return year + '-' + ('00' + parts[1])[-2:] + '-' + ('00' + parts[0])[-2:]
72
73
        # Try DDmonYYYY or DDmonYYYY HH:mm:ss format
74 1
        pattern = r'^(\d{2})([a-z]{3})(\d{4})' + ('.*$' if ignore_time else r'(\D(\d{1,2})\D(\d{1,2})\D(\d{1,2}))?$')
75 1
        match = re.match(pattern, date.lower())
76 1
        if match and match.group(2) in DateCleaner.month_map:
77 1
            ret = match.group(3) + '-' + DateCleaner.month_map[match.group(2)] + '-' + match.group(1)
78 1
            if len(match.groups()) == 7 and match.group(4):
79 1
                ret += 'T' + match.group(5) + ':' + match.group(6) + ':' + match.group(7)
80 1
            return ret
81
82
        # Try YYYYMMDD format.
83 1
        pattern = r'^\d{8}' + ('.*$' if ignore_time else '$')
84 1
        match = re.match(pattern, date)
85 1
        if match:
86
            # Assume date is YYYYMMDD format
87 1
            return date[0:4] + '-' + date[4:6] + '-' + date[6:8]
88
89
        # Format not recognized. Just return the original string.
90 1
        return date
91
92
# ----------------------------------------------------------------------------------------------------------------------
93