Push — dev (8582b4...82307e)

data.datasets.calculate_dlr.dlr()   F

Complexity
  Conditions: 14

Size
  Total Lines: 118
  Code Lines: 67

Duplication
  Lines: 0
  Ratio: 0 %

Importance
  Changes: 0

Metric  Value
eloc    67
dl      0
loc     118
rs      3.6
c       0
b       0
f       0
cc      14
nop     0

How to fix

Long Method

Small methods make your code easier to understand, especially when combined with a good name. Moreover, when a method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, that is usually a good sign that you should extract the commented part into a new method, using the comment as a starting point when coming up with a good name for it.

Commonly applied refactorings include:

  * Extract Method
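The advice above can be sketched with a small, hypothetical example (the function names and the 19 % VAT rate are invented for illustration, not taken from the reviewed code):

```python
# Before: a long method with an inline, commented step.
def report_total(prices):
    # apply 19 % VAT to each price
    total = sum(p * 1.19 for p in prices)
    return f"Total: {total:.2f}"


# After: the commented step is extracted into a small method whose
# name is derived from the comment.
def apply_vat(price):
    """Return the price including 19 % VAT."""
    return price * 1.19


def report_total_refactored(prices):
    total = sum(apply_vat(p) for p in prices)
    return f"Total: {total:.2f}"
```

Both versions behave identically; the refactored one replaces the comment with a name the reader can trust.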

Complexity

Complex classes and functions like data.datasets.calculate_dlr.dlr() often do a lot of different things. To break such a unit down, we need to identify a cohesive component within it. A common approach to finding such a component is to look for fields or methods that share the same prefix or suffix.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often faster.
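As a hypothetical sketch of Extract Class (the `Weather` and `LineRating` names and fields are invented for illustration, not taken from the reviewed code): fields sharing the `weather_` prefix are pulled into their own class.

```python
# Before: one class mixing line data with weather data,
# recognizable by the shared "weather_" prefix.
class LineRating:
    def __init__(self):
        self.weather_wind_speed = 3.0
        self.weather_temperature = 20.0
        self.line_id = 42


# After: the shared-prefix fields are moved into an extracted class.
class Weather:
    def __init__(self, wind_speed, temperature):
        self.wind_speed = wind_speed
        self.temperature = temperature


class LineRatingRefactored:
    def __init__(self, line_id, weather):
        self.line_id = line_id
        self.weather = weather
```

The extracted class can then be reused and tested on its own.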

"""
Use the concept of dynamic line rating (DLR) to calculate the
time-dependent capacity of HV transmission lines.
Based mainly on Planungsgrundsaetze-2020,
available at:
<https://www.transnetbw.de/files/pdf/netzentwicklung/netzplanungsgrundsaetze/UENB_PlGrS_Juli2020.pdf>
"""

from pathlib import Path

from shapely.geometry import Point
import geopandas as gpd
import numpy as np
import pandas as pd
import xarray as xr

from egon.data import config, db
from egon.data.datasets import Dataset
from egon.data.datasets.scenario_parameters import get_sector_parameters


class Calculate_dlr(Dataset):
    """Calculate DLR and assign values to each line in the db

    Parameters
    ----------
    None

    *Dependencies*
      * :py:class:`DataBundle <egon.data.datasets.data_bundle.DataBundle>`
      * :py:class:`Osmtgmod <egon.data.datasets.osmtgmod.Osmtgmod>`
      * :py:class:`WeatherData <egon.data.datasets.era5.WeatherData>`
      * :py:class:`FixEhvSubnetworks <egon.data.datasets.FixEhvSubnetworks>`

    *Resulting tables*
      * :py:class:`grid.egon_etrago_line_timeseries
        <egon.data.datasets.etrago_setup.EgonPfHvLineTimeseries>` is filled
    """

    #:
    name: str = "dlr"
    #:
    version: str = "0.0.2"

    def __init__(self, dependencies):
        super().__init__(
            name=self.name,
            version=self.version,
            dependencies=dependencies,
            tasks=(dlr,),
        )


def dlr():
    """Calculate DLR and assign values to each line in the db

    Parameters
    ----------
    None

    """
    cfg = config.datasets()["dlr"]
    for scn in set(config.settings()["egon-data"]["--scenarios"]):
        weather_year = get_sector_parameters("global", scn)["weather_year"]

        regions_shape_path = (
            Path(".")
            / "data_bundle_egon_data"
            / "regions_dynamic_line_rating"
            / "Germany_regions.shp"
        )

        # Calculate hourly DLR per region
        dlr_hourly_dic, dlr_hourly = DLR_Regions(
            weather_year, regions_shape_path
        )

        regions = gpd.read_file(regions_shape_path)
        regions = regions.sort_values(by=["Region"])

        # Connect to the database
        con = db.engine()

        sql = f"""
        SELECT scn_name, line_id, topo, s_nom FROM
        {cfg['sources']['trans_lines']['schema']}.
        {cfg['sources']['trans_lines']['table']}
        """
        df = gpd.GeoDataFrame.from_postgis(
            sql, con, crs="EPSG:4326", geom_col="topo"
        )

        trans_lines_R = {}
        for i in regions.Region:
            shape_area = regions[regions["Region"] == i]
            trans_lines_R[i] = gpd.clip(df, shape_area)
        trans_lines = df[["s_nom"]]
        trans_lines["in_regions"] = [[] for i in range(len(df))]

        trans_lines[["line_id", "geometry", "scn_name"]] = df[
            ["line_id", "topo", "scn_name"]
        ]
        trans_lines = gpd.GeoDataFrame(trans_lines)
        # Assign to each transmission line the region(s) to which it belongs
        for i in trans_lines_R:
            for j in trans_lines_R[i].index:
                trans_lines.loc[j, "in_regions"].append(i)
        trans_lines["crossborder"] = ~trans_lines.within(regions.unary_union)

        DLR = []

        # Assign to each transmission line the final DLR values based on
        # location and type of line (overhead or underground)
        for i in trans_lines.index:
            # The concept of DLR does not apply to crossborder lines
            if trans_lines.loc[i, "crossborder"]:
                DLR.append([1] * 8760)
                continue
            # Underground lines have DLR = 1
            if (
                trans_lines.loc[i, "s_nom"] % 280 == 0
                or trans_lines.loc[i, "s_nom"] % 550 == 0
                or trans_lines.loc[i, "s_nom"] % 925 == 0
            ):
                DLR.append([1] * 8760)
                continue
            # Lines completely within one of the regions have the DLR of
            # that region
            if len(trans_lines.loc[i, "in_regions"]) == 1:
                region = int(trans_lines.loc[i, "in_regions"][0])
                DLR.append(dlr_hourly_dic["R" + str(region) + "-DLR"])
                continue
            # For lines crossing 2 or more regions, the lowest DLR among
            # the regions is assigned per hour.
            if len(trans_lines.loc[i, "in_regions"]) > 1:
                reg = []
                for j in trans_lines.loc[i, "in_regions"]:
                    reg.append("Reg_" + str(j))
                min_DLR_reg = dlr_hourly[reg].min(axis=1)
                DLR.append(list(min_DLR_reg))

        trans_lines["s_max_pu"] = DLR

        # Delete unnecessary columns
        trans_lines.drop(
            columns=["in_regions", "s_nom", "geometry", "crossborder"],
            inplace=True,
        )

        # Modify column "s_max_pu" to fit the requirements of the table
        trans_lines["s_max_pu"] = trans_lines.apply(
            lambda x: list(x["s_max_pu"]), axis=1
        )
        trans_lines["temp_id"] = 1

        # Delete existing data
        db.execute_sql(
            f"""
            DELETE FROM {cfg['sources']['line_timeseries']['schema']}.
            {cfg['sources']['line_timeseries']['table']};
            """
        )

        # Insert into database
        trans_lines.to_sql(
            f"{cfg['targets']['line_timeseries']['table']}",
            schema=f"{cfg['targets']['line_timeseries']['schema']}",
            con=db.engine(),
            if_exists="append",
            index=False,
        )

    return 0


def DLR_Regions(weather_year, regions_shape_path):
    """Calculate DLR values for the given regions

    Parameters
    ----------
    weather_year : int
        year of the weather data downloaded from ERA5
    regions_shape_path : str
        path to the shape file with the shapes of the regions to analyze

    """
    # Load, index and sort the shapefile with the 9 regions defined by
    # NEP 2020
    regions = gpd.read_file(regions_shape_path)
    regions = regions.set_index(["Region"])
    regions = regions.sort_values(by=["Region"])

    # The data downloaded using Atlite is loaded into 'weather_data_raw'.
    file_name = f"germany-{weather_year}-era5.nc"
    weather_info_path = Path(".") / "cutouts" / file_name
    weather_data_raw = xr.open_mfdataset(str(weather_info_path))
    weather_data_raw = weather_data_raw.rio.write_crs(4326)
    weather_data_raw = weather_data_raw.rio.clip_box(
        minx=5.5, miny=47, maxx=15.5, maxy=55.5
    )

    wind_speed_raw = weather_data_raw.wnd100m.values
    temperature_raw = weather_data_raw.temperature.values
    roughness_raw = weather_data_raw.roughness.values
    index = weather_data_raw.indexes._indexes
    # The info in 'weather_data_raw' has 3 dimensions. All the relevant data
    # is stored in the 2-dimensional array 'weather_data'.
    weather_data = np.zeros(shape=(wind_speed_raw.size, 5))
    count = 0
    for hour in range(index["time"].size):
        for row in range(index["y"].size):
            for column in range(index["x"].size):
                rough = roughness_raw[hour, row, column]
                ws_100m = wind_speed_raw[hour, row, column]
                # Use the log law to calculate wind speed at 50 m height
                ws_50m = ws_100m * (np.log(50 / rough) / np.log(100 / rough))
                weather_data[count, 0] = hour
                weather_data[count, 1] = index["y"][row]
                weather_data[count, 2] = index["x"][column]
                weather_data[count, 3] = ws_50m
                weather_data[count, 4] = (
                    temperature_raw[hour, row, column] - 273.15
                )
                count += 1

    weather_data = pd.DataFrame(
        weather_data, columns=["hour", "lat", "lon", "wind_s", "temp"]
    )

    region_selec = weather_data[0 : index["x"].size * index["y"].size].copy()
    region_selec["geom"] = region_selec.apply(
        lambda x: Point(x["lon"], x["lat"]), axis=1
    )
    region_selec = gpd.GeoDataFrame(region_selec)
    region_selec = region_selec.set_geometry("geom")
    region_selec["region"] = np.zeros(index["x"].size * index["y"].size)

    # Mask weather information for each region defined by NEP 2020
    for reg in regions.index:
        weather_region = gpd.clip(region_selec, regions.loc[reg][0])
        region_selec.loc[
            region_selec.isin(weather_region).any(axis=1), "region"
        ] = reg

    weather_data["region"] = (
        region_selec["region"].tolist() * index["time"].size
    )
    weather_data = weather_data[weather_data["region"] != 0]

    # Create a data frame to save the results (min wind speed, max
    # temperature and DLR per region for the 8760 hours of a year)
    time = pd.date_range(
        f"{weather_year}-01-01", f"{weather_year}-12-31 23:00:00", freq="H"
    )
    dlr = pd.DataFrame(
        0,
        columns=[
            f"R{i}-{var}"
            for i in range(1, 10)
            for var in ("Wind_min", "Temp_max", "DLR")
        ],
        index=time,
    )

    # Calculate and save min wind speed and max temperature in the dataframe.
    # Since the dataframe generated by era5.weather_df_from_era5() is sorted
    # by date, it is faster to calculate the hourly results using blocks of
    # data defined by "step" instead of using a filter or a search function.
    for reg, df in weather_data.groupby("region"):
        step = df.shape[0] / len(time)
        for t in range(0, len(time)):
            low_limit = int(t * step)
            up_limit = int(step * (t + 1))
            dlr.iloc[t, 0 + int(reg - 1) * 3] = min(
                df.iloc[low_limit:up_limit, 3]
            )
            dlr.iloc[t, 1 + int(reg - 1) * 3] = max(
                df.iloc[low_limit:up_limit, 4]
            )

    # The next loop uses the min wind speed and max temperature calculated
    # previously to define the hourly DLR for each region based on the table
    # given by NEP 2020, p. 31
    for i in range(0, len(regions)):
        for j in range(0, len(time)):
            if dlr.iloc[j, 1 + i * 3] <= 5:
                if dlr.iloc[j, 0 + i * 3] < 3:
                    dlr.iloc[j, 2 + i * 3] = 1.30
                elif dlr.iloc[j, 0 + i * 3] < 4:
                    dlr.iloc[j, 2 + i * 3] = 1.35
                elif dlr.iloc[j, 0 + i * 3] < 5:
                    dlr.iloc[j, 2 + i * 3] = 1.45
                else:
                    dlr.iloc[j, 2 + i * 3] = 1.50
            elif dlr.iloc[j, 1 + i * 3] <= 15:
                if dlr.iloc[j, 0 + i * 3] < 3:
                    dlr.iloc[j, 2 + i * 3] = 1.20
                elif dlr.iloc[j, 0 + i * 3] < 4:
                    dlr.iloc[j, 2 + i * 3] = 1.25
                elif dlr.iloc[j, 0 + i * 3] < 5:
                    dlr.iloc[j, 2 + i * 3] = 1.35
                elif dlr.iloc[j, 0 + i * 3] < 6:
                    dlr.iloc[j, 2 + i * 3] = 1.45
                else:
                    dlr.iloc[j, 2 + i * 3] = 1.50
            elif dlr.iloc[j, 1 + i * 3] <= 25:
                if dlr.iloc[j, 0 + i * 3] < 3:
                    dlr.iloc[j, 2 + i * 3] = 1.10
                elif dlr.iloc[j, 0 + i * 3] < 4:
                    dlr.iloc[j, 2 + i * 3] = 1.15
                elif dlr.iloc[j, 0 + i * 3] < 5:
                    dlr.iloc[j, 2 + i * 3] = 1.20
                elif dlr.iloc[j, 0 + i * 3] < 6:
                    dlr.iloc[j, 2 + i * 3] = 1.30
                else:
                    dlr.iloc[j, 2 + i * 3] = 1.40
            elif dlr.iloc[j, 1 + i * 3] <= 35:
                if dlr.iloc[j, 0 + i * 3] < 3:
                    dlr.iloc[j, 2 + i * 3] = 1.00
                elif dlr.iloc[j, 0 + i * 3] < 4:
                    dlr.iloc[j, 2 + i * 3] = 1.05
                elif dlr.iloc[j, 0 + i * 3] < 5:
                    dlr.iloc[j, 2 + i * 3] = 1.10
                elif dlr.iloc[j, 0 + i * 3] < 6:
                    dlr.iloc[j, 2 + i * 3] = 1.15
                else:
                    dlr.iloc[j, 2 + i * 3] = 1.25
            else:
                dlr.iloc[j, 2 + i * 3] = 1.00

    DLR_hourly_df_dic = {}
    for i in dlr.columns[range(2, 29, 3)]:  # columns with DLR values
        DLR_hourly_df_dic[i] = dlr[i].values

    dlr_hourly = pd.DataFrame(index=time)
    for i in range(len(regions)):
        dlr_hourly["Reg_" + str(i + 1)] = dlr.iloc[:, 3 * i + 2]

    return DLR_hourly_df_dic, dlr_hourly