# -*- coding: utf-8 -*-
"""Helpers for inserting city, street type and POI records into the database."""

try:
    import logging
    import sys
    import hashlib
    from osm_poi_matchmaker.dao.data_structure import City, POI_common, POI_address, Street_type
    from osm_poi_matchmaker.libs import address
    from osm_poi_matchmaker.dao import poi_array_structure
except ImportError as err:
    logging.error('Error %s import module: %s', __name__, err)
    logging.exception('Exception occurred')

    sys.exit(128)

POI_COLS = poi_array_structure.POI_COLS


def get_or_create(session, model, **kwargs):
    """Return the first row of `model` matching `kwargs`, or add a new one to the session."""
    instance = session.query(model).filter_by(**kwargs).first()
    if instance:
        logging.debug('Already added: %s', instance)
        return instance
    try:
        instance = model(**kwargs)
        session.add(instance)
        return instance
    except Exception as err:
        logging.error('Cannot add to the database. (%s)', err)
        logging.exception('Exception occurred')
        raise


def get_or_create_poi(session, model, **kwargs):
    """Return the POI row matching the identifying address fields, or add a new one."""
    if kwargs['poi_common_id'] is not None:
        # A record counts as fully filled when it has a city plus either a street
        # with a house number or a conscription number.
        has_street_address = (kwargs['poi_addr_street']
                              and kwargs['poi_addr_housenumber'] is not None)
        has_conscription_number = kwargs['poi_conscriptionnumber'] is not None
        if kwargs['poi_addr_city'] is not None and (
                has_street_address or has_conscription_number):
            logging.debug('Fully filled basic data record')
        else:
            logging.warning('Missing record data: %s', kwargs)
    instance = (session.query(model)
                .filter_by(poi_common_id=kwargs['poi_common_id'])
                .filter_by(poi_addr_city=kwargs['poi_addr_city'])
                .filter_by(poi_addr_street=kwargs['poi_addr_street'])
                .filter_by(poi_addr_housenumber=kwargs['poi_addr_housenumber'])
                .filter_by(poi_conscriptionnumber=kwargs['poi_conscriptionnumber'])
                .filter_by(poi_branch=kwargs['poi_branch'])
                .first())
    if instance:
        logging.debug('Already added: %s', instance)
        return instance
    try:
        instance = model(**kwargs)
        session.add(instance)
        return instance
    except Exception as err:
        logging.error('Cannot add to the database. (%s)', err)
        logging.exception('Exception occurred')
        raise


def get_or_create_cache(session, model, **kwargs):
    """Return the cached row for an OSM object, or add a new one to the session."""
    # Default to None so the check below is safe when no OSM id or type is given.
    instance = None
    if kwargs.get('osm_id') is not None and kwargs.get('osm_object_type'):
        # Look the object up by its OSM id and object type.
        instance = (session.query(model)
                    .filter_by(osm_id=kwargs.get('osm_id'))
                    .filter_by(osm_object_type=kwargs.get('osm_object_type'))
                    .first())
    if instance:
        logging.debug('Already added: %s', instance)
        return instance
    try:
        instance = model(**kwargs)
        session.add(instance)
        return instance
    except Exception as err:
        logging.error('Cannot add to the database. (%s)', err)
        logging.exception('Exception occurred')
        raise


def get_or_create_common(session, model, **kwargs):
    """Return the row of `model` with the given `poi_code`, or add a new one to the session."""
    # Default to None so the check below is safe when poi_code is None or empty.
    instance = None
    if kwargs['poi_code'] is not None and kwargs['poi_code'] != '':
        instance = session.query(model).filter_by(poi_code=kwargs['poi_code']).first()
    if instance:
        logging.debug('Already added: %s', instance)
        return instance
    try:
        instance = model(**kwargs)
        session.add(instance)
        return instance
    except Exception as err:
        logging.error('Cannot add to the database. (%s)', err)
        logging.exception('Exception occurred')
        raise


def insert_city_dataframe(session, city_df):
    """Insert every row of `city_df` as a City; roll back everything on any failure."""
    city_df.columns = ['city_post_code', 'city_name']
    city_data = None  # Keeps the error log below safe if the loop never starts.
    try:
        for _, city_data in city_df.iterrows():
            get_or_create(session, City, city_post_code=city_data['city_post_code'],
                          city_name=address.clean_city(city_data['city_name']))
    except Exception as err:
        logging.error('Rolled back: %s.', err)
        logging.error(city_data)
        logging.exception('Exception occurred')

        session.rollback()
    else:
        logging.info('Successfully added %s city items to the dataset.', len(city_df))
        session.commit()


def insert_street_type_dataframe(session, city_df):
    """Insert every row of `city_df` as a Street_type; roll back everything on any failure."""
    city_df.columns = ['street_type']
    street_type_data = None  # Keeps the error log below safe if the loop never starts.
    try:
        for _, street_type_data in city_df.iterrows():
            get_or_create(session, Street_type, street_type=street_type_data['street_type'])
    except Exception as err:
        logging.error('Rolled back: %s.', err)
        logging.error(street_type_data)
        logging.exception('Exception occurred')

        session.rollback()
    else:
        logging.info('Successfully added %s street type items to the dataset.', len(city_df))
        session.commit()


def insert_common_dataframe(session, common_df):
    """Insert every row of `common_df` as a POI_common; roll back everything on any failure."""
    common_df.columns = ['poi_name', 'poi_tags', 'poi_url_base', 'poi_code']
    poi_common_data = None  # Keeps the error log below safe if the loop never starts.
    try:
        for _, poi_common_data in common_df.iterrows():
            get_or_create_common(session, POI_common, **poi_common_data)
    except Exception as err:
        logging.error('Rolled back: %s.', err)
        logging.error(poi_common_data)
        logging.exception('Exception occurred')

        session.rollback()
    else:
        logging.info('Successfully added %s common items to the dataset.', len(common_df))
        session.commit()


def search_for_postcode(session, city_name):
    """Return the one-row result list with the post code of `city_name`, else None."""
    city_col = session.query(City.city_post_code).filter(City.city_name == city_name).all()
    if len(city_col) == 1:
        return city_col
    logging.info('Cannot determine the post code from city name (%s).', city_name)
    return None


def insert_poi_dataframe(session, poi_df):
    """Insert every row of `poi_df` as a POI address; roll back and re-raise on failure."""
    poi_df.columns = POI_COLS
    # Missing post codes become 0 once the column is cast to int.
    poi_df[['poi_postcode']] = poi_df[['poi_postcode']].fillna('0000')
    poi_df[['poi_postcode']] = poi_df[['poi_postcode']].astype(int)
    poi_dict = poi_df.to_dict('records')
    try:
        for poi_data in poi_dict:
            city_col = (session.query(City.city_id)
                        .filter(City.city_name == poi_data['poi_city'])
                        .filter(City.city_post_code == poi_data['poi_postcode'])
                        .first())
            common_col = (session.query(POI_common.pc_id)
                          .filter(POI_common.poi_code == poi_data['poi_code'])
                          .first())
            poi_data['poi_addr_city'] = city_col
            poi_data['poi_common_id'] = common_col
            # poi_name and poi_code are not passed on to POI_address.
            poi_data.pop('poi_name', None)
            poi_data.pop('poi_code', None)
            get_or_create_poi(session, POI_address, **poi_data)
    except Exception as err:
        logging.error('Rolled back: %s.', err)
        logging.error(poi_data)
        logging.exception('Exception occurred')

        session.rollback()
        raise
    else:
        try:
            session.commit()
            logging.info('Successfully added %s POI items to the dataset.', len(poi_dict))
        except Exception as err:
            logging.error('Unsuccessful commit: %s.', err)
            logging.exception('Exception occurred')


def insert_type(session, type_data):
    """Insert every POI type dict in `type_data`; roll back everything on any failure."""
    type_item = None  # Keeps the error log below safe if the loop never starts.
    try:
        for type_item in type_data:
            get_or_create_common(session, POI_common, **type_item)
    except Exception as err:
        logging.error('Rolled back: %s.', err)
        logging.error(type_item)
        logging.exception('Exception occurred')

        session.rollback()
    else:
        logging.info('Successfully added %s type items to the dataset.', len(type_data))
        session.commit()


def insert(session, **kwargs):
    """Insert a single POI address built from `kwargs`, then close the session."""
    try:
        city_col = (session.query(City.city_id)
                    .filter(City.city_name == kwargs['poi_city'])
                    .filter(City.city_post_code == kwargs['poi_postcode'])
                    .first())
        common_col = (session.query(POI_common.pc_id)
                      .filter(POI_common.poi_code == kwargs['poi_code'])
                      .first())
        kwargs['poi_addr_city'] = city_col
        kwargs['poi_common_id'] = common_col
        # Hash the identifying fields, lowercased and with spaces removed.
        hash_source = '{}{}{}{}{}{}'.format(
            kwargs['poi_code'], kwargs['poi_postcode'], kwargs['poi_city'],
            kwargs['poi_addr_street'], kwargs['poi_addr_housenumber'],
            kwargs['poi_conscriptionnumber'])
        kwargs['poi_hash'] = hashlib.sha512(
            hash_source.lower().replace(' ', '').encode('utf-8')).hexdigest()
        # poi_name and poi_code are not passed on to POI_address.
        kwargs.pop('poi_name', None)
        kwargs.pop('poi_code', None)
        get_or_create_poi(session, POI_address, **kwargs)
    except Exception as err:
        logging.error('Rolled back: %s.', err)
        logging.error(kwargs)
        logging.exception('Exception occurred')

        session.rollback()
    else:
        logging.debug('Successfully added the item to the dataset.')
        session.commit()
    finally:
        session.close()
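

The `poi_hash` value computed in `insert()` can be read as a normalised fingerprint of the identifying address fields. A minimal standalone sketch of that step follows. The helper name `poi_fingerprint` and the sample values are hypothetical and are not part of the module.

```python
import hashlib


def poi_fingerprint(*fields):
    """Hypothetical helper: SHA-512 hex digest of the joined, normalised fields."""
    # Concatenate the fields as text, as '{}{}...'.format(...) does in insert().
    joined = ''.join(str(field) for field in fields)
    # Lowercase and drop spaces so that casing and spacing do not change the hash.
    normalised = joined.lower().replace(' ', '')
    return hashlib.sha512(normalised.encode('utf-8')).hexdigest()


# Hypothetical sample values: both spellings normalise to the same text.
first = poi_fingerprint('hucba', 1011, 'Budapest', 'Fő utca', '1', None)
second = poi_fingerprint('HUCBA', 1011, 'budapest', 'fő  utca', '1', None)
```

Because of the normalisation, records that differ only in letter case or spacing would receive the same hash, which may or may not be the intended deduplication rule.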
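

The four `get_or_create*` helpers repeat the same "return the existing row or add a new one" tail. One possible way to factor it out is sketched below. The name `first_or_add` and its `lookup` parameter are hypothetical, and the sketch assumes only that the session exposes `query()`, `filter_by()`, `first()` and `add()` in the way the helpers above already use them.

```python
import logging


def first_or_add(session, model, lookup, values):
    """Hypothetical helper: return the row matching `lookup`, or add one built from `values`.

    Pass `lookup=None` to skip the query and always add a new instance.
    """
    instance = None
    if lookup is not None:
        instance = session.query(model).filter_by(**lookup).first()
    if instance:
        logging.debug('Already added: %s', instance)
        return instance
    # Nothing matched, so build a new object and register it with the session.
    instance = model(**values)
    session.add(instance)
    return instance
```

With such a helper, `get_or_create_common` could build `lookup` from `poi_code` (or pass `None` when the code is empty) and delegate the rest, leaving the error logging in one place.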