Passed
Push — master ( ea0d3c...cc0a45 )
by KAMI
03:34
created

osm_poi_matchmaker.dao.data_handlers.insert()   B

Complexity

Conditions 5

Size

Total Lines 25
Code Lines 23

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 5
eloc 23
nop 2
dl 0
loc 25
rs 8.8613
c 0
b 0
f 0
1
# -*- coding: utf-8 -*-
Issue: Missing module docstring
2
3
try:
4
    import logging
5
    import sys
6
    import hashlib
7
    from osm_poi_matchmaker.dao.data_structure import City, POI_common, POI_address, Street_type
8
    from osm_poi_matchmaker.libs import address
9
    from osm_poi_matchmaker.dao import poi_array_structure
10
except ImportError as err:
11
    logging.error('Error %s import module: %s', __name__, err)
12
    logging.exception("Exception occurred")
13
    sys.exit(128)
14
15
POI_COLS = poi_array_structure.POI_COLS
16
17
18
def get_or_create(session, model, **kwargs):
Issue: Missing function or method docstring
19
    instance = session.query(model).filter_by(**kwargs).first()
20
    if instance:
Duplication: This code seems to be duplicated in your project.
unused-code: Unnecessary "else" after "return"
21
        logging.debug('Already added: %s' ,instance)
Coding Style: No space allowed before comma
Coding Style: Exactly one space required after comma
22
        return instance
23
    else:
24
        try:
25
            instance = model(**kwargs)
26
            session.add(instance)
27
            return instance
28
        except Exception as e:
Coding Style Naming: Variable name "e" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to the Pylint site.
29
            logging.error('Cannot add to the database. (%s)', e)
30
            logging.exception("Exception occurred")
31
            raise (e)
Unused Code Coding Style: There is an unnecessary parenthesis after raise.
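A minimal sketch of how the notes on get_or_create (unnecessary "else" after "return", the parenthesised raise, the short exception name) might be addressed. FakeQuery, FakeSession and the City class below are hypothetical stand-ins so the example runs without a database; they are not part of osm_poi_matchmaker or SQLAlchemy.

```python
import logging


class FakeQuery:
    """Hypothetical stand-in for a query object, for illustration only."""

    def __init__(self, rows):
        self._rows = rows

    def filter_by(self, **kwargs):
        # Keep only rows whose attributes equal every keyword argument.
        matches = [row for row in self._rows
                   if all(getattr(row, key, None) == value
                          for key, value in kwargs.items())]
        return FakeQuery(matches)

    def first(self):
        return self._rows[0] if self._rows else None


class FakeSession:
    """Hypothetical stand-in for a database session."""

    def __init__(self):
        self.rows = []

    def query(self, model):
        return FakeQuery([row for row in self.rows if isinstance(row, model)])

    def add(self, instance):
        self.rows.append(instance)


class City:
    """Hypothetical model that stores its keyword arguments as attributes."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


def get_or_create(session, model, **kwargs):
    """Return the first row matching kwargs, creating and adding it if absent."""
    instance = session.query(model).filter_by(**kwargs).first()
    if instance:
        logging.debug('Already added: %s', instance)
        return instance
    # No "else" needed: the branch above has already returned.
    try:
        instance = model(**kwargs)
        session.add(instance)
    except Exception:
        logging.exception('Cannot add to the database.')
        raise  # a bare raise re-raises the active exception unchanged
    return instance
```

Calling get_or_create twice with the same keyword arguments returns the same object and leaves a single row in the stand-in session.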
32
33
34
def get_or_create_poi(session, model, **kwargs):
Issue: Missing function or method docstring
35
    if kwargs['poi_common_id'] is not None:
36
        if kwargs['poi_common_id'] is not None and kwargs['poi_addr_city'] is not None and (
37
                (kwargs['poi_addr_street'] and kwargs['poi_addr_housenumber'] is not None) or (
38
                kwargs['poi_conscriptionnumber'] is not None)):
Coding Style: Wrong hanging indentation (add 4 spaces).
39
            logging.debug('Fully filled basic data record')
40
        else:
41
            logging.warning('Missing record data: %s', kwargs)
42
    instance = session.query(model)\
43
        .filter_by(poi_common_id=kwargs['poi_common_id'])\
44
        .filter_by(poi_addr_city=kwargs['poi_addr_city'])\
45
        .filter_by(poi_addr_street=kwargs['poi_addr_street'])\
46
        .filter_by(poi_addr_housenumber=kwargs['poi_addr_housenumber'])\
47
        .filter_by(poi_conscriptionnumber=kwargs['poi_conscriptionnumber'])\
48
        .filter_by(poi_branch=kwargs['poi_branch'])\
49
        .first()
50
    if instance:
Duplication: This code seems to be duplicated in your project.
unused-code: Unnecessary "else" after "return"
51
        logging.debug('Already added: %s', instance)
52
        return instance
53
    else:
54
        try:
55
            instance = model(**kwargs)
56
            session.add(instance)
57
            return instance
58
        except Exception as e:
Coding Style Naming: Variable name "e" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
59
            logging.error('Cannot add to the database. (%s)', e)
60
            logging.exception("Exception occurred")
61
            raise (e)
Unused Code Coding Style: There is an unnecessary parenthesis after raise.
62
63
64
def insert_city_dataframe(session, city_df):
Issue: Missing function or method docstring
65
    city_df.columns = ['city_post_code', 'city_name']
66
    try:
67
        for index, city_data in city_df.iterrows():
Unused Code: The variable index seems to be unused.
68
            get_or_create(session, City, city_post_code=city_data['city_post_code'],
69
                          city_name=address.clean_city(city_data['city_name']))
70
    except Exception as e:
Best Practice: Catching very general exceptions such as Exception is usually not recommended.

Generally, you would want to handle very specific errors in the exception handler. This ensures that you do not hide other types of errors which should be fixed.

So, unless you specifically plan to handle any error, consider adding a more specific exception.

Coding Style Naming: Variable name "e" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
71
72
        logging.error('Rolled back: %s.', e)
73
        logging.error(city_data)
Issue: The variable city_data does not seem to be defined in case the for loop on line 67 is not entered. Are you sure this can never be the case?
74
        logging.exception("Exception occurred")
75
        session.rollback()
76
    else:
77
        logging.info('Successfully added %s city items to the dataset.', len(city_df))
78
        session.commit()
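The note about city_data possibly being undefined can be illustrated without a database or pandas. The sketch below is an assumption-laden simplification: insert_rows and handler are hypothetical names, and plain dictionaries stand in for DataFrame rows. Binding the loop variable before the loop keeps the error handler safe even when iteration fails before the first row. For the unused index reported on line 67, the usual convention would be `for _, city_data in city_df.iterrows():`.

```python
import logging


def insert_rows(rows, handler):
    """Apply handler to each row; log the last row seen if something fails."""
    current = None  # bound up front, so the handler below can always use it
    try:
        for current in rows:
            handler(current)
    except (KeyError, TypeError, ValueError) as err:
        # If iteration failed before the first row, current is still None.
        logging.error('Rolled back: %s. Last row: %s', err, current)
        return False
    return True
```

With `rows=None` the failure happens before any row is read, yet the handler still runs cleanly because `current` already exists.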
79
80
81
def insert_street_type_dataframe(session, city_df):
Issue: Missing function or method docstring
82
    city_df.columns = ['street_type']
83
    try:
84
        for index, city_data in city_df.iterrows():
Unused Code: The variable index seems to be unused.
85
            get_or_create(session, Street_type, street_type=city_data['street_type'])
86
    except Exception as e:
Best Practice: Catching very general exceptions such as Exception is usually not recommended.
Coding Style Naming: Variable name "e" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
87
        logging.error('Rolled back: %s.', e)
88
        logging.error(city_data)
Issue: The variable city_data does not seem to be defined in case the for loop on line 84 is not entered. Are you sure this can never be the case?
89
        logging.exception("Exception occurred")
90
        session.rollback()
91
    else:
92
        logging.info('Successfully added %s street type items to the dataset.', len(city_df))
93
        session.commit()
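The "Best Practice" note on catching Exception recurs throughout this file. As a small, database-free illustration of the narrower alternative, the sketch below handles only the conversion errors it expects and lets everything else propagate. The helper name to_postcode and its default value are assumptions made for this example and do not appear in the inspected module. For the database calls themselves, a narrower choice would probably be SQLAlchemy's base exception class, sqlalchemy.exc.SQLAlchemyError.

```python
import logging


def to_postcode(value, default=0):
    """Convert a raw post code to int, handling only the expected failures."""
    try:
        return int(value)
    except (TypeError, ValueError) as err:
        # Only conversion problems are handled here; other errors propagate.
        logging.warning('Invalid post code %r: %s', value, err)
        return default
```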
94
95
96
def insert_common_dataframe(session, common_df):
Issue: Missing function or method docstring
97
    common_df.columns = ['poi_name', 'poi_tags', 'poi_url_base', 'poi_code']
98
    try:
99
        for index, poi_common_data in common_df.iterrows():
Unused Code: The variable index seems to be unused.
100
            get_or_create(session, POI_common, **poi_common_data)
101
    except Exception as e:
Best Practice: Catching very general exceptions such as Exception is usually not recommended.
Coding Style Naming: Variable name "e" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
102
        logging.error('Rolled back: %s.', e)
103
        logging.error(poi_common_data)
Issue: The variable poi_common_data does not seem to be defined in case the for loop on line 99 is not entered. Are you sure this can never be the case?
104
        logging.exception("Exception occurred")
105
        session.rollback()
106
    else:
107
        logging.info('Successfully added %s common items to the dataset.', len(common_df))
108
        session.commit()
109
110
111
def search_for_postcode(session, city_name):
Issue: Missing function or method docstring
112
    city_col = session.query(City.city_post_code).filter(City.city_name == city_name).all()
113
    if len(city_col) == 1:
unused-code: Unnecessary "else" after "return"
114
        return city_col
115
    else:
116
        logging.info('Cannot determine the post code from city name (%s).', city_name)
117
        return None
118
119
120
def insert_poi_dataframe(session, poi_df):
Issue: Missing function or method docstring
121
    poi_df.columns = POI_COLS
122
    poi_df[['poi_postcode']] = poi_df[['poi_postcode']].fillna('0000')
123
    poi_df[['poi_postcode']] = poi_df[['poi_postcode']].astype(int)
124
    poi_dict = poi_df.to_dict('records')
125
    try:
126
        for poi_data in poi_dict:
127
            city_col = session.query(City.city_id).filter(City.city_name == poi_data['poi_city']).filter(
Coding Style: This line is too long as per the coding-style (105/100).

This check looks for lines that are too long. You can specify the maximum line length.
128
                City.city_post_code == poi_data['poi_postcode']).first()
129
            common_col = session.query(POI_common.pc_id).filter(POI_common.poi_code == poi_data['poi_code']).first()
Coding Style: This line is too long as per the coding-style (116/100).
130
            poi_data['poi_addr_city'] = city_col
131
            poi_data['poi_common_id'] = common_col
132
            if 'poi_name' in poi_data: del poi_data['poi_name']
Coding Style: More than one statement on a single line
133
            if 'poi_code' in poi_data: del poi_data['poi_code']
Coding Style: More than one statement on a single line
134
            get_or_create_poi(session, POI_address, **poi_data)
135
    except Exception as e:
Coding Style Naming: Variable name "e" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
136
        logging.error('Rolled back: %s.', e)
137
        logging.error(poi_data)
138
        logging.exception("Exception occurred")
139
        session.rollback()
140
        raise (e)
Unused Code Coding Style: There is an unnecessary parenthesis after raise.
141
    else:
142
        try:
143
            session.commit()
144
            logging.info('Successfully added %s POI items to the dataset.', len(poi_dict))
145
        except Exception as e:
Best Practice: Catching very general exceptions such as Exception is usually not recommended.
Coding Style Naming: Variable name "e" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
146
            logging.error('Unsuccessful commit: %s.', e)
147
            logging.exception("Exception occurred")
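The two "More than one statement on a single line" notes in insert_poi_dataframe both concern a conditional `del`. One possible way to avoid the pattern, sketched here under the assumption that the record is a plain dict, is dict.pop with a default, which ignores missing keys. The helper name strip_keys is hypothetical.

```python
def strip_keys(record, keys=('poi_name', 'poi_code')):
    """Return a copy of record without the given keys; missing keys are ignored."""
    cleaned = dict(record)  # copy, so the caller's dict is left untouched
    for key in keys:
        cleaned.pop(key, None)  # the default of None avoids a KeyError
    return cleaned
```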
148
149
150
def insert_type(session, type_data):
Issue: Missing function or method docstring
151
    try:
152
        for i in type_data:
153
            get_or_create(session, POI_common, **i)
154
    except Exception as e:
Best Practice: Catching very general exceptions such as Exception is usually not recommended.
Coding Style Naming: Variable name "e" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
155
        logging.error('Rolled back: %s.', e)
156
        logging.error(i)
157
        logging.exception("Exception occurred")
158
        session.rollback()
159
    else:
160
        logging.info('Successfully added %s type items to the dataset.', len(type_data))
161
        session.commit()
162
163
164
def insert(session, **kwargs):
Issue: Missing function or method docstring
165
    try:
166
        city_col = session.query(City.city_id).filter(City.city_name == kwargs['poi_city']).filter(
167
            City.city_post_code == kwargs['poi_postcode']).first()
168
        common_col = session.query(POI_common.pc_id).filter(POI_common.poi_code == kwargs['poi_code']).first()
Coding Style: This line is too long as per the coding-style (110/100).
169
        kwargs['poi_addr_city'] = city_col
170
        kwargs['poi_common_id'] = common_col
171
        kwargs['poi_hash'] = hashlib.sha512(
172
            '{}{}{}{}{}{}'.format(kwargs['poi_code'], kwargs['poi_postcode'], kwargs['poi_city'],
173
                                  kwargs['poi_addr_street'], kwargs['poi_addr_housenumber'],
174
                                  kwargs['poi_conscriptionnumber']).lower().replace(' ', '').encode(
175
                'utf-8')).hexdigest()
Coding Style: Wrong hanging indentation (add 22 spaces).
176
        if 'poi_name' in kwargs: del kwargs['poi_name']
Coding Style: More than one statement on a single line
177
        if 'poi_code' in kwargs: del kwargs['poi_code']
Coding Style: More than one statement on a single line
178
        get_or_create(session, POI_address, **kwargs)
179
    except Exception as e:
Best Practice: Catching very general exceptions such as Exception is usually not recommended.
Coding Style Naming: Variable name "e" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
180
        logging.error('Rolled back: %s.', e)
181
        logging.error(kwargs)
182
        logging.exception("Exception occurred")
183
        session.rollback()
184
    else:
185
        logging.debug('Successfully added the item to the dataset.')
186
        session.commit()
187
    finally:
188
        session.close()
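The poi_hash computation in insert() is what draws the hanging-indentation note on line 175. As a sketch only, the same steps (join six fields, lowercase, drop spaces, SHA-512 hex digest) could be moved into a small helper. The names HASH_FIELDS and poi_hash are assumptions introduced for this example; the field order follows the format call in the listing.

```python
import hashlib

# Field order taken from the format() call in insert().
HASH_FIELDS = ('poi_code', 'poi_postcode', 'poi_city',
               'poi_addr_street', 'poi_addr_housenumber',
               'poi_conscriptionnumber')


def poi_hash(record):
    """Return the SHA-512 hex digest of the normalized address fields."""
    joined = ''.join(str(record[field]) for field in HASH_FIELDS)
    normalized = joined.lower().replace(' ', '')
    return hashlib.sha512(normalized.encode('utf-8')).hexdigest()
```

Because the input is lowercased and stripped of spaces before hashing, records that differ only in letter case or spacing produce the same digest.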
189