osm_poi_matchmaker.dataproviders.attic.hu_tesco   A

Complexity

Total Complexity 10

Size/Duplication

Total Lines 110
Duplicated Lines 0 %

Importance

Changes 0
Metric Value
eloc 96
dl 0
loc 110
rs 10
c 0
b 0
f 0
wmc 10

3 Methods

Rating   Name   Duplication   Size   Complexity  
A hu_tesco.__init__() 0 5 1
A hu_tesco.types() 0 14 1
C hu_tesco.process() 0 65 8
# -*- coding: utf-8 -*-
Missing module docstring

try:
    import logging
    import os
    import re
    import pandas as pd
    from osm_poi_matchmaker.dao.data_handlers import insert_poi_dataframe
    from osm_poi_matchmaker.libs.soup import save_downloaded_soup
    from osm_poi_matchmaker.libs.address import extract_street_housenumber_better_2, clean_city
    from osm_poi_matchmaker.dao import poi_array_structure
except ImportError as err:
    logging.error('Error %s import module: %s', __name__, err)
    logging.exception('Exception occurred')

    sys.exit(128)
Comprehensibility, Best Practice: Undefined variable 'sys'
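The finding above comes from calling `sys.exit` without importing `sys`. In addition, `logging` is imported inside the same `try` block whose failure it is meant to report. One possible way to address both, shown as a minimal sketch, is to import the standard-library modules unconditionally and guard only the imports that can realistically fail. The helper `import_or_exit` and its parameters are hypothetical and not part of the project; `json` stands in for a third-party module so that the sketch runs anywhere.

```python
import importlib
import logging
import sys  # imported up front so sys.exit() is always available


def import_or_exit(module_name, exit_code=128):
    """Import a module by name, or log the failure and exit.

    Hypothetical helper: it mirrors the logging calls and exit code 128
    used in the listing, but is not part of osm_poi_matchmaker.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        logging.error('Error %s import module: %s', __name__, err)
        logging.exception('Exception occurred')
        sys.exit(exit_code)


# Stand-in for a guarded third-party import such as pandas.
json_module = import_or_exit('json')
```

Keeping the original `try`/`except` and simply adding `import logging` and `import sys` above it would resolve the finding just as well; the helper only makes the failure path easier to exercise.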

POI_COLS = poi_array_structure.POI_COLS
POI_DATA = 'http://tesco.hu/aruhazak/nyitvatartas'


class hu_tesco():
Coding Style, Naming: Class name "hu_tesco" doesn't conform to PascalCase naming style ('[^\\W\\da-z][^\\W_]+$' pattern)

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.

Missing class docstring

    def __init__(self, session, download_cache, filename='hu_tesco.html'):
        self.session = session
        self.link = POI_DATA
        self.download_cache = download_cache
        self.filename = filename

    @staticmethod
    def types():
Missing function or method docstring
        data = [
            {'poi_code': 'hutescoexp', 'poi_name': 'Tesco Expressz', 'poi_type': 'shop',
             'poi_tags': "{'shop': 'convenience', 'operator': 'Tesco Global Áruházak Zrt.', 'brand': 'Tesco', 'contact:facebook':'https://www.facebook.com/tescoaruhazak', 'contact:youtube':'https://www.youtube.com/user/TescoMagyarorszag', 'payment:cash': 'yes', 'payment:debit_cards': 'yes'}",
Coding Style: This line is too long as per the coding-style (293/100).

This check looks for lines that are too long. You can specify the maximum line length.

             'poi_url_base': 'https://www.tesco.hu'},
            {'poi_code': 'hutescoext', 'poi_name': 'Tesco Extra', 'poi_type': 'shop',
             'poi_tags': "{'shop': 'supermarket', 'operator': 'Tesco Global Áruházak Zrt.', 'brand': 'Tesco', 'contact:facebook':'https://www.facebook.com/tescoaruhazak', 'contact:youtube':'https://www.youtube.com/user/TescoMagyarorszag', 'payment:cash': 'yes', 'payment:debit_cards': 'yes'}",
Coding Style: This line is too long as per the coding-style (293/100).
             'poi_url_base': 'https://www.tesco.hu'},
            {'poi_code': 'hutescosup', 'poi_name': 'Tesco', 'poi_type': 'shop',
             'poi_tags': "{'shop': 'supermarket', 'operator': 'Tesco Global Áruházak Zrt.', 'brand': 'Tesco', 'contact:facebook':'https://www.facebook.com/tescoaruhazak', 'contact:youtube':'https://www.youtube.com/user/TescoMagyarorszag', 'payment:cash': 'yes', 'payment:debit_cards': 'yes'}",
Coding Style: This line is too long as per the coding-style (293/100).
             'poi_url_base': 'https://www.tesco.hu'},
        ]
        return data
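The three 293-character lines flagged in `types()` differ only in the `shop` value. A possible way to shorten them, sketched below, is to keep the shared tags in one dictionary and build each entry from it. `COMMON_TAGS` and `poi_type_entry` are hypothetical names introduced for illustration. The sketch also assumes that whatever consumes `poi_tags` accepts the output of `str()` on a dict, which spaces every key/value pair uniformly and so is not byte-identical to the hand-written strings in the listing.

```python
# Tags shared by every Tesco format, copied from the listing.
COMMON_TAGS = {
    'operator': 'Tesco Global Áruházak Zrt.',
    'brand': 'Tesco',
    'contact:facebook': 'https://www.facebook.com/tescoaruhazak',
    'contact:youtube': 'https://www.youtube.com/user/TescoMagyarorszag',
    'payment:cash': 'yes',
    'payment:debit_cards': 'yes',
}


def poi_type_entry(code, name, shop):
    """Build one types() entry; helper name and signature are hypothetical."""
    # 'shop' goes first so the key order matches the original strings.
    tags = {'shop': shop, **COMMON_TAGS}
    return {'poi_code': code, 'poi_name': name, 'poi_type': 'shop',
            'poi_tags': str(tags), 'poi_url_base': 'https://www.tesco.hu'}


def types():
    """Return the same three POI type definitions as hu_tesco.types()."""
    return [
        poi_type_entry('hutescoexp', 'Tesco Expressz', 'convenience'),
        poi_type_entry('hutescoext', 'Tesco Extra', 'supermarket'),
        poi_type_entry('hutescosup', 'Tesco', 'supermarket'),
    ]
```

With this layout no line exceeds the 100-character limit reported by the check, and a change to a shared tag is made in one place.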

    def process(self):
Missing function or method docstring
Comprehensibility: This function exceeds the maximum number of variables (39/15).
        soup = save_downloaded_soup('{}'.format(self.link), os.path.join(self.download_cache, self.filename))
Coding Style: This line is too long as per the coding-style (109/100).
Bug: It seems like a value for argument filetype is missing in the function call.
        data = []
        if soup is not None:
            # parse the html using Beautiful Soup and store in variable `soup`
            table = soup.find('table', attrs={'class': 'tescoce-table'})
            table_body = table.find('tbody')
            rows = table_body.find_all('tr')
            for row in rows:
                cols = row.find_all('td')
                link = cols[0].find('a').get('href') if cols[0].find('a') is not None else []
                cols = [element.text.strip() for element in cols]
                cols[0] = cols[0].split('\n')[0]
                del cols[-1]
                del cols[-1]
                cols.append(link)
                data.append(cols)
            for poi_data in data:
                # Assign: code, postcode, city, name, branch, website, original, street, housenumber, conscriptionnumber, ref, geom
Coding Style: This line is too long as per the coding-style (131/100).
                street, housenumber, conscriptionnumber = extract_street_housenumber_better_2(poi_data[3])
Coding Style: This line is too long as per the coding-style (106/100).
                tesco_replace = re.compile('(expressz{0,1})', re.IGNORECASE)
                poi_data[0] = tesco_replace.sub('Expressz', poi_data[0])
                if 'xpres' in poi_data[0]:
                    name = 'Tesco Expressz'
                    code = 'hutescoexp'
                elif 'xtra' in poi_data[0]:
                    name = 'Tesco Extra'
                    code = 'hutescoext'
                else:
                    name = 'Tesco'
                    code = 'hutescosup'
                poi_data[0] = poi_data[0].replace('TESCO', 'Tesco')
                poi_data[0] = poi_data[0].replace('Bp.', 'Budapest')
                postcode = poi_data[1].strip()
                city = clean_city(poi_data[2].split(',')[0])
                branch = poi_data[0]
                website = poi_data[4]
                nonstop = None
                mo_o = None
                th_o = None
                we_o = None
                tu_o = None
                fr_o = None
                sa_o = None
                su_o = None
                mo_c = None
                th_c = None
                we_c = None
                tu_c = None
                fr_c = None
                sa_c = None
                su_c = None
                original = poi_data[3]
                geom = None
                ref = None
                insert_data.append(
Comprehensibility, Best Practice: The variable insert_data does not seem to be defined.
Comprehensibility, Best Practice: Undefined variable 'insert_data'
                    [code, postcode, city, name, branch, website, original, street, housenumber, conscriptionnumber,
Coding Style: This line is too long as per the coding-style (116/100).
                     ref, geom, nonstop, mo_o, tu_o, we_o, th_o, fr_o, sa_o, su_o, mo_c, th_c, we_c, tu_c, fr_c, sa_c,
Coding Style: This line is too long as per the coding-style (118/100).
                     su_c])
            if len(insert_data) < 1:
Comprehensibility, Best Practice: Undefined variable 'insert_data'
                logging.warning('Resultset is empty. Skipping ...')
            else:
                df = pd.DataFrame(insert_data)
Coding Style, Naming: Variable name "df" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
Comprehensibility, Best Practice: Undefined variable 'insert_data'
                df.columns = POI_COLS
                insert_poi_dataframe(self.session, df)
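Two of the findings on `process()` are closely related: `insert_data` is appended to without ever being created, and the fifteen opening-hours placeholders account for a large share of the 39 local variables. The sketch below shows one way both could be addressed while keeping the row layout of the listing (12 descriptive fields followed by `nonstop`, seven opening times, and seven closing times). The names `classify`, `build_rows`, `HOURS_PLACEHOLDERS`, and the `split_address` parameter are hypothetical; `split_address` stands in for `extract_street_housenumber_better_2`, and `clean_city` is left out because its behaviour is not shown on this page.

```python
import re

# nonstop plus seven opening and seven closing times; this source provides none.
HOURS_PLACEHOLDERS = [None] * 15
EXPRESSZ = re.compile('(expressz{0,1})', re.IGNORECASE)


def classify(branch):
    """Return (name, code, normalized branch) using the rules from process()."""
    branch = EXPRESSZ.sub('Expressz', branch)
    if 'xpres' in branch:
        name, code = 'Tesco Expressz', 'hutescoexp'
    elif 'xtra' in branch:
        name, code = 'Tesco Extra', 'hutescoext'
    else:
        name, code = 'Tesco', 'hutescosup'
    branch = branch.replace('TESCO', 'Tesco').replace('Bp.', 'Budapest')
    return name, code, branch


def build_rows(data, split_address):
    """Turn scraped [branch, postcode, city, address, link] lists into rows."""
    insert_data = []  # created before the loop, unlike in the listing
    for poi_data in data:
        street, housenumber, conscriptionnumber = split_address(poi_data[3])
        name, code, branch = classify(poi_data[0])
        postcode = poi_data[1].strip()
        city = poi_data[2].split(',')[0]  # clean_city() omitted in this sketch
        website, original = poi_data[4], poi_data[3]
        ref = geom = None
        insert_data.append(
            [code, postcode, city, name, branch, website, original, street,
             housenumber, conscriptionnumber, ref, geom] + HOURS_PLACEHOLDERS)
    return insert_data
```

Because `build_rows` returns an empty list when nothing was scraped, the existing `len(insert_data) < 1` check would keep working unchanged on its result.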