Passed
Pull Request — dev (#1054)
by
unknown
01:28
created

data.datasets.power_plants   B

Complexity

Total Complexity 45

Size/Duplication

Total Lines 930
Duplicated Lines 1.4 %

Importance

Changes 0
Metric Value
wmc 45
eloc 488
dl 13
loc 930
rs 8.8
c 0
b 0
f 0

12 Functions

Rating   Name   Duplication   Size   Complexity  
B allocate_other_power_plants() 0 159 3
A assign_bus_id() 0 63 4
B insert_biomass_plants() 0 73 5
A insert_hydro_biomass() 0 19 2
A scale_prox2now() 0 33 4
A create_tables() 0 31 2
B insert_hydro_plants() 0 88 5
A select_target() 0 34 1
B assign_voltage_level() 0 86 4
A filter_mastr_geometry() 0 55 3
B assign_voltage_level_by_capacity() 0 22 7
B allocate_conventional_non_chp_power_plants() 0 159 4

1 Method

Rating   Name   Duplication   Size   Complexity  
A PowerPlants.__init__() 0 22 1

How to fix   Duplicated Code    Complexity   

Duplicated Code

Duplicate code is one of the most pungent code smells. A rule that is often used is to re-structure code once it is duplicated in three or more places.

Common duplication problems, and corresponding solutions are:

Complexity

 Tip:   Before tackling complexity, make sure that you eliminate any duplication first. This often can reduce the size of classes significantly.

Complex classes like data.datasets.power_plants often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields/methods that share the same prefixes, or suffixes.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

1
"""The central module containing all code dealing with power plant data.
2
"""
3
from geoalchemy2 import Geometry
4
from sqlalchemy import BigInteger, Column, Float, Integer, Sequence, String
5
from sqlalchemy.dialects.postgresql import JSONB
6
from sqlalchemy.ext.declarative import declarative_base
7
from sqlalchemy.orm import sessionmaker
8
import geopandas as gpd
9
import numpy as np
10
import pandas as pd
11
12
from egon.data import db
13
from egon.data.datasets import Dataset
14
from egon.data.datasets.mastr import (
15
    WORKING_DIR_MASTR_OLD, WORKING_DIR_MASTR_NEW
16
)
17
from egon.data.datasets.power_plants.conventional import (
18
    match_nep_no_chp,
19
    select_nep_power_plants,
20
    select_no_chp_combustion_mastr,
21
)
22
from egon.data.datasets.power_plants.mastr import (
23
    EgonPowerPlantsBiomass,
24
    EgonPowerPlantsHydro,
25
    EgonPowerPlantsPv,
26
    EgonPowerPlantsWind,
27
    import_mastr,
28
)
29
from egon.data.datasets.power_plants.pv_rooftop import pv_rooftop_per_mv_grid
30
from egon.data.datasets.power_plants.pv_rooftop_buildings import (
31
    geocode_mastr_data,
32
    pv_rooftop_to_buildings,
33
)
34
import egon.data.config
35
import egon.data.datasets.power_plants.assign_weather_data as assign_weather_data  # noqa: E501
36
import egon.data.datasets.power_plants.pv_ground_mounted as pv_ground_mounted
37
import egon.data.datasets.power_plants.wind_farms as wind_onshore
38
import egon.data.datasets.power_plants.wind_offshore as wind_offshore
39
40
Base = declarative_base()
41
42
43 View Code Duplication
class EgonPowerPlants(Base):
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated in your project.
Loading history...
44
    __tablename__ = "egon_power_plants"
45
    __table_args__ = {"schema": "supply"}
46
    id = Column(BigInteger, Sequence("pp_seq"), primary_key=True)
47
    sources = Column(JSONB)
48
    source_id = Column(JSONB)
49
    carrier = Column(String)
50
    el_capacity = Column(Float)
51
    bus_id = Column(Integer)
52
    voltage_level = Column(Integer)
53
    weather_cell_id = Column(Integer)
54
    scenario = Column(String)
55
    geom = Column(Geometry("POINT", 4326), index=True)
56
57
58
class PowerPlants(Dataset):
59
    def __init__(self, dependencies):
60
        super().__init__(
61
            name="PowerPlants",
62
            version="0.0.15",
63
            dependencies=dependencies,
64
            tasks=(
65
                create_tables,
66
                import_mastr,
67
                insert_hydro_biomass,
68
                allocate_conventional_non_chp_power_plants,
69
                allocate_other_power_plants,
70
                {
71
                    wind_onshore.insert,
72
                    pv_ground_mounted.insert,
73
                    (
74
                        pv_rooftop_per_mv_grid,
75
                        geocode_mastr_data,
76
                        pv_rooftop_to_buildings,
77
                    ),
78
                },
79
                wind_offshore.insert,
80
                assign_weather_data.weatherId_and_busId,
81
            ),
82
        )
83
84
85
def create_tables():
86
    """Create tables for power plant data
87
    Returns
88
    -------
89
    None.
90
    """
91
92
    # Tables for future scenarios
93
    cfg = egon.data.config.datasets()["power_plants"]
94
    db.execute_sql(f"CREATE SCHEMA IF NOT EXISTS {cfg['target']['schema']};")
95
    engine = db.engine()
96
    db.execute_sql(
97
        f"""DROP TABLE IF EXISTS
98
        {cfg['target']['schema']}.{cfg['target']['table']}"""
99
    )
100
101
    db.execute_sql("""DROP SEQUENCE IF EXISTS pp_seq""")
102
    EgonPowerPlants.__table__.create(bind=engine, checkfirst=True)
103
104
    # Tables for status quo
105
    tables = [
106
        EgonPowerPlantsWind,
107
        EgonPowerPlantsPv,
108
        EgonPowerPlantsBiomass,
109
        EgonPowerPlantsHydro,
110
    ]
111
    for t in tables:
112
        db.execute_sql(
113
            f"DROP TABLE IF EXISTS {t.__table_args__['schema']}.{t.__tablename__} CASCADE;"
114
        )
115
        t.__table__.create(bind=engine, checkfirst=True)
116
117
118
def scale_prox2now(df, target, level="federal_state"):
119
    """Scale installed capacities linear to status quo power plants
120
121
    Parameters
122
    ----------
123
    df : pandas.DataFrame
124
        Status Quo power plants
125
    target : pandas.Series
126
        Target values for future scenario
127
    level : str, optional
128
        Scale per 'federal_state' or 'country'. The default is 'federal_state'.
129
130
    Returns
131
    -------
132
    df : pandas.DataFrame
133
        Future power plants
134
135
    """
136
137
    if level == "federal_state":
138
        df.loc[:, "Nettonennleistung"] = (
139
            df.groupby(df.Bundesland)
140
            .Nettonennleistung.apply(lambda grp: grp / grp.sum())
141
            .mul(target[df.Bundesland.values].values)
142
        )
143
    else:
144
        df.loc[:, "Nettonennleistung"] = df.Nettonennleistung.apply(
145
            lambda x: x / x.sum()
146
        ).mul(target.values)
147
148
    df = df[df.Nettonennleistung > 0]
149
150
    return df
151
152
153
def select_target(carrier, scenario):
154
    """Select installed capacity per scenario and carrier
155
156
    Parameters
157
    ----------
158
    carrier : str
159
        Name of energy carrier
160
    scenario : str
161
        Name of scenario
162
163
    Returns
164
    -------
165
    pandas.Series
166
        Target values for carrier and scenario
167
168
    """
169
    cfg = egon.data.config.datasets()["power_plants"]
170
171
    return (
172
        pd.read_sql(
173
            f"""SELECT DISTINCT ON (b.gen)
174
                         REPLACE(REPLACE(b.gen, '-', ''), 'ü', 'ue') as state,
175
                         a.capacity
176
                         FROM {cfg['sources']['capacities']} a,
177
                         {cfg['sources']['geom_federal_states']} b
178
                         WHERE a.nuts = b.nuts
179
                         AND scenario_name = '{scenario}'
180
                         AND carrier = '{carrier}'
181
                         AND b.gen NOT IN ('Baden-Württemberg (Bodensee)',
182
                                           'Bayern (Bodensee)')""",
183
            con=db.engine(),
184
        )
185
        .set_index("state")
186
        .capacity
187
    )
188
189
190
def filter_mastr_geometry(mastr, federal_state=None):
191
    """Filter data from MaStR by geometry
192
193
    Parameters
194
    ----------
195
    mastr : pandas.DataFrame
196
        All power plants listed in MaStR
197
    federal_state : str or None
198
        Name of federal state whoes power plants are returned.
199
        If None, data for Germany is returned
200
201
    Returns
202
    -------
203
    mastr_loc : pandas.DataFrame
204
        Power plants listed in MaStR with geometry inside German boundaries
205
206
    """
207
    cfg = egon.data.config.datasets()["power_plants"]
208
209
    if type(mastr) == pd.core.frame.DataFrame:
210
        # Drop entries without geometry for insert
211
        mastr_loc = mastr[
212
            mastr.Laengengrad.notnull() & mastr.Breitengrad.notnull()
213
        ]
214
215
        # Create geodataframe
216
        mastr_loc = gpd.GeoDataFrame(
217
            mastr_loc,
218
            geometry=gpd.points_from_xy(
219
                mastr_loc.Laengengrad, mastr_loc.Breitengrad, crs=4326
220
            ),
221
        )
222
    else:
223
        mastr_loc = mastr.copy()
224
225
    # Drop entries outside of germany or federal state
226
    if not federal_state:
227
        sql = f"SELECT geometry as geom FROM {cfg['sources']['geom_germany']}"
228
    else:
229
        sql = f"""
230
        SELECT geometry as geom
231
        FROM boundaries.vg250_lan_union
232
        WHERE REPLACE(REPLACE(gen, '-', ''), 'ü', 'ue') = '{federal_state}'"""
233
234
    mastr_loc = (
235
        gpd.sjoin(
236
            gpd.read_postgis(sql, con=db.engine()).to_crs(4326),
237
            mastr_loc,
238
            how="right",
239
        )
240
        .query("index_left==0")
241
        .drop("index_left", axis=1)
242
    )
243
244
    return mastr_loc
245
246
247
def insert_biomass_plants(scenario):
248
    """Insert biomass power plants of future scenario
249
250
    Parameters
251
    ----------
252
    scenario : str
253
        Name of scenario.
254
255
    Returns
256
    -------
257
    None.
258
259
    """
260
    cfg = egon.data.config.datasets()["power_plants"]
261
262
    # import target values from NEP 2021, scneario C 2035
263
    target = select_target("biomass", scenario)
264
265
    # import data for MaStR
266
    mastr = pd.read_csv(
267
        WORKING_DIR_MASTR_OLD / cfg["sources"]["mastr_biomass"]).query(
268
        "EinheitBetriebsstatus=='InBetrieb'"
269
    )
270
271
    # Drop entries without federal state or 'AusschließlichWirtschaftszone'
272
    mastr = mastr[
273
        mastr.Bundesland.isin(
274
            pd.read_sql(
275
                f"""SELECT DISTINCT ON (gen)
276
        REPLACE(REPLACE(gen, '-', ''), 'ü', 'ue') as states
277
        FROM {cfg['sources']['geom_federal_states']}""",
278
                con=db.engine(),
279
            ).states.values
280
        )
281
    ]
282
283
    # Scaling will be done per federal state in case of eGon2035 scenario.
284
    if scenario == "eGon2035":
285
        level = "federal_state"
286
    else:
287
        level = "country"
288
289
    # Choose only entries with valid geometries inside DE/test mode
290
    mastr_loc = filter_mastr_geometry(mastr).set_geometry("geometry")
291
292
    # Scale capacities to meet target values
293
    mastr_loc = scale_prox2now(mastr_loc, target, level=level)
294
295
    # Assign bus_id
296
    if len(mastr_loc) > 0:
297
        mastr_loc["voltage_level"] = assign_voltage_level(
298
            mastr_loc, cfg, WORKING_DIR_MASTR_OLD
299
        )
300
        mastr_loc = assign_bus_id(mastr_loc, cfg)
301
302
    # Insert entries with location
303
    session = sessionmaker(bind=db.engine())()
304
305
    for i, row in mastr_loc.iterrows():
306
        if not row.ThermischeNutzleistung > 0:
307
            entry = EgonPowerPlants(
308
                sources={"el_capacity": "MaStR scaled with NEP 2021"},
309
                source_id={"MastrNummer": row.EinheitMastrNummer},
310
                carrier="biomass",
311
                el_capacity=row.Nettonennleistung,
312
                scenario=scenario,
313
                bus_id=row.bus_id,
314
                voltage_level=row.voltage_level,
315
                geom=f"SRID=4326;POINT({row.Laengengrad} {row.Breitengrad})",
316
            )
317
            session.add(entry)
318
319
    session.commit()
320
321
322
def insert_hydro_plants(scenario):
323
    """Insert hydro power plants of future scenario.
324
325
    Hydro power plants are diveded into run_of_river and reservoir plants
326
    according to Marktstammdatenregister.
327
    Additional hydro technologies (e.g. turbines inside drinking water
328
    systems) are not considered.
329
330
    Parameters
331
    ----------
332
    scenario : str
333
        Name of scenario.
334
335
    Returns
336
    -------
337
    None.
338
339
    """
340
    cfg = egon.data.config.datasets()["power_plants"]
341
342
    # Map MaStR carriers to eGon carriers
343
    map_carrier = {
344
        "run_of_river": ["Laufwasseranlage"],
345
        "reservoir": ["Speicherwasseranlage"],
346
    }
347
348
    for carrier in map_carrier.keys():
349
350
        # import target values
351
        target = select_target(carrier, scenario)
352
353
        # import data for MaStR
354
        mastr = pd.read_csv(
355
            WORKING_DIR_MASTR_NEW / cfg["sources"]["mastr_hydro"]).query(
356
            "EinheitBetriebsstatus=='InBetrieb'"
357
        )
358
359
        # Choose only plants with specific carriers
360
        mastr = mastr[mastr.ArtDerWasserkraftanlage.isin(map_carrier[carrier])]
361
362
        # Drop entries without federal state or 'AusschließlichWirtschaftszone'
363
        mastr = mastr[
364
            mastr.Bundesland.isin(
365
                pd.read_sql(
366
                    f"""SELECT DISTINCT ON (gen)
367
            REPLACE(REPLACE(gen, '-', ''), 'ü', 'ue') as states
368
            FROM {cfg['sources']['geom_federal_states']}""",
369
                    con=db.engine(),
370
                ).states.values
371
            )
372
        ]
373
374
        # Scaling will be done per federal state in case of eGon2035 scenario.
375
        if scenario == "eGon2035":
376
            level = "federal_state"
377
        else:
378
            level = "country"
379
380
        # Scale capacities to meet target values
381
        mastr = scale_prox2now(mastr, target, level=level)
382
383
        # Choose only entries with valid geometries inside DE/test mode
384
        mastr_loc = filter_mastr_geometry(mastr).set_geometry("geometry")
385
        # TODO: Deal with power plants without geometry
386
387
        # Assign bus_id and voltage level
388
        if len(mastr_loc) > 0:
389
            mastr_loc["voltage_level"] = assign_voltage_level(
390
                mastr_loc, cfg, WORKING_DIR_MASTR_NEW
391
            )
392
            mastr_loc = assign_bus_id(mastr_loc, cfg)
393
394
        # Insert entries with location
395
        session = sessionmaker(bind=db.engine())()
396
        for i, row in mastr_loc.iterrows():
397
            entry = EgonPowerPlants(
398
                sources={"el_capacity": "MaStR scaled with NEP 2021"},
399
                source_id={"MastrNummer": row.EinheitMastrNummer},
400
                carrier=carrier,
401
                el_capacity=row.Nettonennleistung,
402
                scenario=scenario,
403
                bus_id=row.bus_id,
404
                voltage_level=row.voltage_level,
405
                geom=f"SRID=4326;POINT({row.Laengengrad} {row.Breitengrad})",
406
            )
407
            session.add(entry)
408
409
        session.commit()
410
411
412
def assign_voltage_level(mastr_loc, cfg, mastr_working_dir):
413
    """Assigns voltage level to power plants.
414
415
    If location data inluding voltage level is available from
416
    Marktstammdatenregister, this is used. Otherwise the voltage level is
417
    assigned according to the electrical capacity.
418
419
    Parameters
420
    ----------
421
    mastr_loc : pandas.DataFrame
422
        Power plants listed in MaStR with geometry inside German boundaries
423
424
    Returns
425
    -------
426
    pandas.DataFrame
427
        Power plants including voltage_level
428
429
    """
430
    mastr_loc["Spannungsebene"] = np.nan
431
    mastr_loc["voltage_level"] = np.nan
432
433
    if "LokationMastrNummer" in mastr_loc.columns:
434
        # Adjust column names to format of MaStR location dataset
435
        if mastr_working_dir == WORKING_DIR_MASTR_OLD:
436
            cols = ["LokationMastrNummer", "Spannungsebene"]
437
        elif mastr_working_dir == WORKING_DIR_MASTR_NEW:
438
            cols = ["MaStRNummer", "Spannungsebene"]
439
        else:
440
            raise ValueError("Invalid MaStR working directory!")
441
442
        location = pd.read_csv(
443
            mastr_working_dir / cfg["sources"]["mastr_location"],
444
            usecols=cols,
445
        ).rename(
446
            columns={"MaStRNummer": "LokationMastrNummer"}
447
        ).set_index("LokationMastrNummer")
448
449
        location = location[~location.index.duplicated(keep="first")]
450
451
        mastr_loc.loc[
452
            mastr_loc[
453
                mastr_loc.LokationMastrNummer.isin(location.index)
454
            ].index,
455
            "Spannungsebene",
456
        ] = location.Spannungsebene[
457
            mastr_loc[
458
                mastr_loc.LokationMastrNummer.isin(location.index)
459
            ].LokationMastrNummer
460
        ].values
461
462
        # Transfer voltage_level as integer from Spanungsebene
463
        map_voltage_levels = pd.Series(
464
            data={
465
                "Höchstspannung": 1,
466
                "Hoechstspannung": 1,
467
                "UmspannungZurHochspannung": 2,
468
                "Hochspannung": 3,
469
                "UmspannungZurMittelspannung": 4,
470
                "Mittelspannung": 5,
471
                "UmspannungZurNiederspannung": 6,
472
                "Niederspannung": 7,
473
            }
474
        )
475
476
        mastr_loc.loc[
477
            mastr_loc[mastr_loc["Spannungsebene"].notnull()].index,
478
            "voltage_level",
479
        ] = map_voltage_levels[
480
            mastr_loc.loc[
481
                mastr_loc[mastr_loc["Spannungsebene"].notnull()].index,
482
                "Spannungsebene",
483
            ].values
484
        ].values
485
486
    else:
487
        print(
488
            "No information about MaStR location available. "
489
            "All voltage levels are assigned using threshold values."
490
        )
491
492
    # If no voltage level is available from mastr, choose level according
493
    # to threshold values
494
495
    mastr_loc.voltage_level = assign_voltage_level_by_capacity(mastr_loc)
496
497
    return mastr_loc.voltage_level
498
499
500
def assign_voltage_level_by_capacity(mastr_loc):
501
502
    for i, row in mastr_loc[mastr_loc.voltage_level.isnull()].iterrows():
503
504
        if row.Nettonennleistung > 120:
505
            level = 1
506
        elif row.Nettonennleistung > 20:
507
            level = 3
508
        elif row.Nettonennleistung > 5.5:
509
            level = 4
510
        elif row.Nettonennleistung > 0.2:
511
            level = 5
512
        elif row.Nettonennleistung > 0.1:
513
            level = 6
514
        else:
515
            level = 7
516
517
        mastr_loc.loc[i, "voltage_level"] = level
518
519
    mastr_loc.voltage_level = mastr_loc.voltage_level.astype(int)
520
521
    return mastr_loc.voltage_level
522
523
524
def assign_bus_id(power_plants, cfg):
525
    """Assigns bus_ids to power plants according to location and voltage level
526
527
    Parameters
528
    ----------
529
    power_plants : pandas.DataFrame
530
        Power plants including voltage level
531
532
    Returns
533
    -------
534
    power_plants : pandas.DataFrame
535
        Power plants including voltage level and bus_id
536
537
    """
538
539
    mv_grid_districts = db.select_geodataframe(
540
        f"""
541
        SELECT * FROM {cfg['sources']['egon_mv_grid_district']}
542
        """,
543
        epsg=4326,
544
    )
545
546
    ehv_grid_districts = db.select_geodataframe(
547
        f"""
548
        SELECT * FROM {cfg['sources']['ehv_voronoi']}
549
        """,
550
        epsg=4326,
551
    )
552
553
    # Assign power plants in hv and below to hvmv bus
554
    power_plants_hv = power_plants[power_plants.voltage_level >= 3].index
555
    if len(power_plants_hv) > 0:
556
        power_plants.loc[power_plants_hv, "bus_id"] = gpd.sjoin(
557
            power_plants[power_plants.index.isin(power_plants_hv)],
558
            mv_grid_districts,
559
        ).bus_id
560
561
    # Assign power plants in ehv to ehv bus
562
    power_plants_ehv = power_plants[power_plants.voltage_level < 3].index
563
564
    if len(power_plants_ehv) > 0:
565
        ehv_join = gpd.sjoin(
566
            power_plants[power_plants.index.isin(power_plants_ehv)],
567
            ehv_grid_districts,
568
        )
569
570
        if "bus_id_right" in ehv_join.columns:
571
            power_plants.loc[power_plants_ehv, "bus_id"] = gpd.sjoin(
572
                power_plants[power_plants.index.isin(power_plants_ehv)],
573
                ehv_grid_districts,
574
            ).bus_id_right
575
576
        else:
577
            power_plants.loc[power_plants_ehv, "bus_id"] = gpd.sjoin(
578
                power_plants[power_plants.index.isin(power_plants_ehv)],
579
                ehv_grid_districts,
580
            ).bus_id
581
582
    # Assert that all power plants have a bus_id
583
    assert power_plants.bus_id.notnull().all(), f"""Some power plants are
584
    not attached to a bus: {power_plants[power_plants.bus_id.isnull()]}"""
585
586
    return power_plants
587
588
589
def insert_hydro_biomass():
590
    """Insert hydro and biomass power plants in database
591
592
    Returns
593
    -------
594
    None.
595
596
    """
597
    cfg = egon.data.config.datasets()["power_plants"]
598
    db.execute_sql(
599
        f"""
600
        DELETE FROM {cfg['target']['schema']}.{cfg['target']['table']}
601
        WHERE carrier IN ('biomass', 'reservoir', 'run_of_river')
602
        """
603
    )
604
605
    for scenario in ["eGon2035"]:
606
        insert_biomass_plants(scenario)
607
        insert_hydro_plants(scenario)
608
609
610
def allocate_conventional_non_chp_power_plants():
611
612
    carrier = ["oil", "gas"]
613
614
    cfg = egon.data.config.datasets()["power_plants"]
615
616
    # Delete existing plants in the target table
617
    db.execute_sql(
618
        f"""
619
         DELETE FROM {cfg ['target']['schema']}.{cfg ['target']['table']}
620
         WHERE carrier IN ('gas', 'oil')
621
         AND scenario='eGon2035';
622
         """
623
    )
624
625
    for carrier in carrier:
626
627
        nep = select_nep_power_plants(carrier)
628
629
        if nep.empty:
630
            print(f"DataFrame from NEP for carrier {carrier} is empty!")
631
632
        else:
633
634
            mastr = select_no_chp_combustion_mastr(carrier)
635
636
            # Assign voltage level to MaStR
637
            mastr["voltage_level"] = assign_voltage_level(
638
                mastr.rename({"el_capacity": "Nettonennleistung"}, axis=1),
639
                cfg,
640
                WORKING_DIR_MASTR_OLD
641
            )
642
643
            # Initalize DataFrame for matching power plants
644
            matched = gpd.GeoDataFrame(
645
                columns=[
646
                    "carrier",
647
                    "el_capacity",
648
                    "scenario",
649
                    "geometry",
650
                    "MaStRNummer",
651
                    "source",
652
                    "voltage_level",
653
                ]
654
            )
655
656
            # Match combustion plants of a certain carrier from NEP list
657
            # using PLZ and capacity
658
            matched, mastr, nep = match_nep_no_chp(
659
                nep,
660
                mastr,
661
                matched,
662
                buffer_capacity=0.1,
663
                consider_carrier=False,
664
            )
665
666
            # Match plants from NEP list using city and capacity
667
            matched, mastr, nep = match_nep_no_chp(
668
                nep,
669
                mastr,
670
                matched,
671
                buffer_capacity=0.1,
672
                consider_carrier=False,
673
                consider_location="city",
674
            )
675
676
            # Match plants from NEP list using plz,
677
            # neglecting the capacity
678
            matched, mastr, nep = match_nep_no_chp(
679
                nep,
680
                mastr,
681
                matched,
682
                consider_location="plz",
683
                consider_carrier=False,
684
                consider_capacity=False,
685
            )
686
687
            # Match plants from NEP list using city,
688
            # neglecting the capacity
689
            matched, mastr, nep = match_nep_no_chp(
690
                nep,
691
                mastr,
692
                matched,
693
                consider_location="city",
694
                consider_carrier=False,
695
                consider_capacity=False,
696
            )
697
698
            # Match remaining plants from NEP using the federal state
699
            matched, mastr, nep = match_nep_no_chp(
700
                nep,
701
                mastr,
702
                matched,
703
                buffer_capacity=0.1,
704
                consider_location="federal_state",
705
                consider_carrier=False,
706
            )
707
708
            # Match remaining plants from NEP using the federal state
709
            matched, mastr, nep = match_nep_no_chp(
710
                nep,
711
                mastr,
712
                matched,
713
                buffer_capacity=0.7,
714
                consider_location="federal_state",
715
                consider_carrier=False,
716
            )
717
718
            print(f"{matched.el_capacity.sum()} MW of {carrier} matched")
719
            print(f"{nep.c2035_capacity.sum()} MW of {carrier} not matched")
720
721
            matched.crs = "EPSG:4326"
722
723
            # Assign bus_id
724
            # Load grid district polygons
725
            mv_grid_districts = db.select_geodataframe(
726
                f"""
727
            SELECT * FROM {cfg['sources']['egon_mv_grid_district']}
728
            """,
729
                epsg=4326,
730
            )
731
732
            ehv_grid_districts = db.select_geodataframe(
733
                f"""
734
            SELECT * FROM {cfg['sources']['ehv_voronoi']}
735
            """,
736
                epsg=4326,
737
            )
738
739
            # Perform spatial joins for plants in ehv and hv level seperately
740
            power_plants_hv = gpd.sjoin(
741
                matched[matched.voltage_level >= 3],
742
                mv_grid_districts[["bus_id", "geom"]],
743
                how="left",
744
            ).drop(columns=["index_right"])
745
            power_plants_ehv = gpd.sjoin(
746
                matched[matched.voltage_level < 3],
747
                ehv_grid_districts[["bus_id", "geom"]],
748
                how="left",
749
            ).drop(columns=["index_right"])
750
751
            # Combine both dataframes
752
            power_plants = pd.concat([power_plants_hv, power_plants_ehv])
753
754
            # Insert into target table
755
            session = sessionmaker(bind=db.engine())()
756
            for i, row in power_plants.iterrows():
757
                entry = EgonPowerPlants(
758
                    sources={"el_capacity": row.source},
759
                    source_id={"MastrNummer": row.MaStRNummer},
760
                    carrier=row.carrier,
761
                    el_capacity=row.el_capacity,
762
                    voltage_level=row.voltage_level,
763
                    bus_id=row.bus_id,
764
                    scenario=row.scenario,
765
                    geom=f"SRID=4326;POINT({row.geometry.x} {row.geometry.y})",
766
                )
767
                session.add(entry)
768
            session.commit()
769
770
771
def allocate_other_power_plants():
772
773
    # Get configuration
774
    cfg = egon.data.config.datasets()["power_plants"]
775
    boundary = egon.data.config.settings()["egon-data"]["--dataset-boundary"]
776
777
    db.execute_sql(
778
        f"""
779
        DELETE FROM {cfg['target']['schema']}.{cfg['target']['table']}
780
        WHERE carrier ='others'
781
        """
782
    )
783
784
    # Define scenario, carrier 'others' is only present in 'eGon2035'
785
    scenario = "eGon2035"
786
787
    # Select target values for carrier 'others'
788
    target = db.select_dataframe(
789
        f"""
790
        SELECT sum(capacity) as capacity, carrier, scenario_name, nuts
791
            FROM {cfg['sources']['capacities']}
792
            WHERE scenario_name = '{scenario}'
793
            AND carrier = 'others'
794
            GROUP BY carrier, nuts, scenario_name;
795
        """
796
    )
797
798
    # Assign name of federal state
799
800
    map_states = {
801
        "DE1": "BadenWuerttemberg",
802
        "DEA": "NordrheinWestfalen",
803
        "DE7": "Hessen",
804
        "DE4": "Brandenburg",
805
        "DE5": "Bremen",
806
        "DEB": "RheinlandPfalz",
807
        "DEE": "SachsenAnhalt",
808
        "DEF": "SchleswigHolstein",
809
        "DE8": "MecklenburgVorpommern",
810
        "DEG": "Thueringen",
811
        "DE9": "Niedersachsen",
812
        "DED": "Sachsen",
813
        "DE6": "Hamburg",
814
        "DEC": "Saarland",
815
        "DE3": "Berlin",
816
        "DE2": "Bayern",
817
    }
818
819
    target = (
820
        target.replace({"nuts": map_states})
821
        .rename(columns={"nuts": "Bundesland"})
822
        .set_index("Bundesland")
823
    )
824
    target = target.capacity
825
826
    # Select 'non chp' power plants from mastr table
827
    mastr_combustion = select_no_chp_combustion_mastr("others")
828
829
    # Rename columns
830
    mastr_combustion = mastr_combustion.rename(
831
        columns={
832
            "carrier": "Energietraeger",
833
            "plz": "Postleitzahl",
834
            "city": "Ort",
835
            "federal_state": "Bundesland",
836
            "el_capacity": "Nettonennleistung",
837
        }
838
    )
839
840
    # Select power plants representing carrier 'others' from MaStR files
841
    mastr_sludge = pd.read_csv(
842
        WORKING_DIR_MASTR_OLD / cfg["sources"]["mastr_gsgk"]).query(
843
        """EinheitBetriebsstatus=='InBetrieb'and Energietraeger=='Klaerschlamm'"""  # noqa: E501
844
    )
845
    mastr_geothermal = pd.read_csv(
846
        WORKING_DIR_MASTR_OLD / cfg["sources"]["mastr_gsgk"]).query(
847
        "EinheitBetriebsstatus=='InBetrieb' and Energietraeger=='Geothermie' "
848
        "and Technologie == 'ORCOrganicRankineCycleAnlage'"
849
    )
850
851
    mastr_sg = mastr_sludge.append(mastr_geothermal)
852
853
    # Insert geometry column
854
    mastr_sg = mastr_sg[~(mastr_sg["Laengengrad"].isnull())]
855
    mastr_sg = gpd.GeoDataFrame(
856
        mastr_sg,
857
        geometry=gpd.points_from_xy(
858
            mastr_sg["Laengengrad"], mastr_sg["Breitengrad"], crs=4326
859
        ),
860
    )
861
862
    # Exclude columns which are not essential
863
    mastr_sg = mastr_sg.filter(
864
        [
865
            "EinheitMastrNummer",
866
            "Nettonennleistung",
867
            "geometry",
868
            "Energietraeger",
869
            "Postleitzahl",
870
            "Ort",
871
            "Bundesland",
872
        ],
873
        axis=1,
874
    )
875
    # Rename carrier
876
    mastr_sg.Energietraeger = "others"
877
878
    # Change data type
879
    mastr_sg["Postleitzahl"] = mastr_sg["Postleitzahl"].astype(int)
880
881
    # Capacity in MW
882
    mastr_sg.loc[:, "Nettonennleistung"] *= 1e-3
883
884
    # Merge different sources to one df
885
    mastr_others = mastr_sg.append(mastr_combustion).reset_index()
886
887
    # Delete entries outside Schleswig-Holstein for test mode
888
    if boundary == "Schleswig-Holstein":
889
        mastr_others = mastr_others[
890
            mastr_others["Bundesland"] == "SchleswigHolstein"
891
        ]
892
893
    # Scale capacities prox to now to meet target values
894
    mastr_prox = scale_prox2now(mastr_others, target, level="federal_state")
895
896
    # Assign voltage_level based on scaled capacity
897
    mastr_prox["voltage_level"] = np.nan
898
    mastr_prox["voltage_level"] = assign_voltage_level_by_capacity(mastr_prox)
899
900
    # Rename columns
901
    mastr_prox = mastr_prox.rename(
902
        columns={
903
            "Energietraeger": "carrier",
904
            "Postleitzahl": "plz",
905
            "Ort": "city",
906
            "Bundesland": "federal_state",
907
            "Nettonennleistung": "el_capacity",
908
        }
909
    )
910
911
    # Assign bus_id
912
    mastr_prox = assign_bus_id(mastr_prox, cfg)
913
    mastr_prox = mastr_prox.set_crs(4326, allow_override=True)
914
915
    # Insert into target table
916
    session = sessionmaker(bind=db.engine())()
917
    for i, row in mastr_prox.iterrows():
918
        entry = EgonPowerPlants(
919
            sources=row.el_capacity,
920
            source_id={"MastrNummer": row.EinheitMastrNummer},
921
            carrier=row.carrier,
922
            el_capacity=row.el_capacity,
923
            voltage_level=row.voltage_level,
924
            bus_id=row.bus_id,
925
            scenario=scenario,
926
            geom=f"SRID=4326; {row.geometry}",
927
        )
928
        session.add(entry)
929
    session.commit()
930