Passed
Pull Request — dev (#826)
by
unknown
01:35
created

data.datasets.electricity_demand_timeseries.cts_buildings   F

Complexity

Total Complexity 62

Size/Duplication

Total Lines 1519
Duplicated Lines 4.28 %

Importance

Changes 0
Metric Value
wmc 62
eloc 729
dl 65
loc 1519
rs 3.311
c 0
b 0
f 0

18 Functions

Rating   Name   Duplication   Size   Complexity  
A cts_heat() 33 33 2
A place_buildings_with_amenities() 0 41 3
B create_synthetic_buildings() 0 67 6
A delete_synthetic_cts_buildings() 0 14 2
A remove_double_bus_id() 0 47 3
B get_cts_electricity_peak_load() 0 57 5
C calc_building_profiles() 0 106 10
A amenities_without_buildings() 0 60 2
A select_cts_buildings() 0 25 1
B cells_with_cts_demand_only() 0 72 2
B get_cts_heat_peak_load() 0 69 6
A start_logging() 0 12 1
B calc_building_demand_profile_share() 0 105 3
C buildings_with_amenities() 0 183 5
A cts_electricity() 32 32 2
A calc_census_cell_share() 0 56 4
B cts_buildings() 0 187 2
B buildings_without_amenities() 0 99 2

1 Method

Rating   Name   Duplication   Size   Complexity  
A CtsElectricityBuildings.__init__() 0 9 1

How to fix   Duplicated Code    Complexity   

Duplicated Code

Duplicate code is one of the most pungent code smells. A rule that is often used is to re-structure code once it is duplicated in three or more places.

Common duplication problems, and corresponding solutions are:

Complexity

 Tip:   Before tackling complexity, make sure that you eliminate any duplication first. This often can reduce the size of classes significantly.

Complex classes like data.datasets.electricity_demand_timeseries.cts_buildings often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields/methods that share the same prefixes, or suffixes.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

1
"""
2
CTS electricity and heat demand time series for scenarios in 2035 and 2050
3
assigned to OSM-buildings.
4
5
Disaggregation of cts heat & electricity demand time series from MV Substation
6
to census cells via annual demand and then to OSM buildings via
7
amenity tags or randomly if no sufficient OSM-data is available in the
8
respective census cell. If no OSM-buildings or synthetic residential buildings
9
are available new synthetic 5x5m buildings are generated.
10
11
The resulting data is stored in separate tables
12
13
* `openstreetmap.osm_buildings_synthetic`:
14
    Lists generated synthetic building with id, zensus_population_id and
15
    building type. This table is already created within
16
    :func:`hh_buildings.map_houseprofiles_to_buildings()`
17
* `openstreetmap.egon_cts_buildings`:
18
    Table of all selected cts buildings with id, census cell id, geometry and
19
    amenity count in building. This table is created within
20
    :func:`cts_buildings()`
21
* `demand.egon_cts_electricity_demand_building_share`:
22
    Table including the mv substation electricity profile share of all selected
23
    cts buildings for scenario eGon2035 and eGon100RE. This table is created
24
    within :func:`cts_electricity()`
25
* `demand.egon_cts_heat_demand_building_share`:
26
    Table including the mv substation heat profile share of all selected
27
    cts buildings for scenario eGon2035 and eGon100RE. This table is created
28
    within :func:`cts_heat()`
29
* `demand.egon_building_peak_loads`:
30
    Mapping of demand time series and buildings including cell_id, building
31
    area and peak load. This table is already created within
32
    :func:`hh_buildings.get_building_peak_loads()`
33
34
**The following datasets from the database are mainly used for creation:**
35
36
* `openstreetmap.osm_buildings_filtered`:
37
    Table of OSM-buildings filtered by tags to selecting residential and cts
38
    buildings only.
39
* `openstreetmap.osm_amenities_shops_filtered`:
40
    Table of OSM-amenities filtered by tags to select cts only.
41
* `openstreetmap.osm_amenities_not_in_buildings_filtered`:
42
    Table of amenities which do not intersect with any building from
43
    `openstreetmap.osm_buildings_filtered`
44
* `openstreetmap.osm_buildings_synthetic`:
45
    Table of synthetic residential buildings
46
* `boundaries.egon_map_zensus_buildings_filtered_all`:
47
    Mapping table of census cells and buildings filtered even if population
48
    in census cell = 0.
49
* `demand.egon_demandregio_zensus_electricity`:
50
    Table of annual electricity load demand for residential and cts at census
51
    cell level. Residential load demand is derived from aggregated residential
52
    building profiles. DemandRegio CTS load demand at NUTS3 is distributed to
53
    census cells linearly to heat demand from peta5.
54
* `demand.egon_peta_heat`:
55
    Table of annual heat load demand for residential and cts at census cell
56
    level from peta5.
57
* `demand.egon_etrago_electricity_cts`:
58
    Scaled cts electricity time series for every MV substation. Derived from
59
    DemandRegio SLP for selected economic sectors at nuts3. Scaled with annual
60
    demand from `demand.egon_demandregio_zensus_electricity`
61
* `demand.egon_etrago_heat_cts`:
62
    Scaled cts heat time series for every MV substation. Derived from
63
    DemandRegio SLP Gas for selected economic sectors at nuts3. Scaled with
64
    annual demand from `demand.egon_peta_heat`.
65
66
**What is the goal?**
67
68
To disaggregate cts heat and electricity time series from MV substation level
69
to geo-referenced buildings. DemandRegio and Peta5 is used to identify census
70
cells with load demand. Openstreetmap data is used and filtered via tags to
71
identify buildings and count amenities within. The number of amenities serve
72
to assign the appropriate load demand share to the building.
73
74
**What is the challenge?**
75
76
The OSM, DemandRegio and Peta5 dataset differ from each other. The OSM dataset
77
is a community based dataset which is extended throughout and does not claim to
78
be complete. Therefore not all census cells which have a demand assigned by
79
DemandRegio or Peta5 methodology also have buildings with respective tags or no
80
buildings at all. Merging these datasets inconsistencies need
81
to be addressed. For example: not yet tagged buildings or amenities in OSM
82
83
**How are these datasets combined?**
84
85
------>>>>>> continue
86
87
Firstly, all cts buildings are selected. Buildings which have cts amenities
88
inside.
89
90
91
**What are central assumptions during the data processing?**
92
93
* Mapping census to OSM data is not trivial. Discrepancies are substituted.
94
* Missing OSM buildings are generated by census building count.
95
* If no census building count data is available, the number of buildings is
96
derived by an average rate of households/buildings applied to the number of
97
households.
98
99
**Drawbacks and limitations of the data**
100
101
* Missing OSM buildings in cells without census building count are derived by
102
an average rate of households/buildings applied to the number of households.
103
As only whole houses can exist, the substitute is ceiled to the next higher
104
integer. Ceiling is applied to avoid rounding to amount of 0 buildings.
105
106
* As this datasets is a cascade after profile assignement at census cells
107
also check drawbacks and limitations in hh_profiles.py.
108
109
110
111
Example Query
112
-----
113
114
115
Notes
116
-----
117
118
This module docstring is rather a dataset documentation. Once, a decision
119
is made in ... the content of this module docstring needs to be moved to
120
docs attribute of the respective dataset class.
121
"""
122
123
import logging
124
125
from geoalchemy2 import Geometry
126
from geoalchemy2.shape import to_shape
127
from sqlalchemy import REAL, Column, Integer, String, func
128
from sqlalchemy.ext.declarative import declarative_base
129
import geopandas as gpd
130
import numpy as np
131
import pandas as pd
132
import saio
133
134
from egon.data import db
135
from egon.data.datasets import Dataset
136
from egon.data.datasets.electricity_demand import (
137
    EgonDemandRegioZensusElectricity,
138
)
139
from egon.data.datasets.electricity_demand.temporal import (
140
    EgonEtragoElectricityCts,
141
    calc_load_curves_cts,
142
)
143
from egon.data.datasets.electricity_demand_timeseries.hh_buildings import (
144
    BuildingPeakLoads,
145
    OsmBuildingsSynthetic,
146
)
147
from egon.data.datasets.electricity_demand_timeseries.tools import (
148
    random_ints_until_sum,
149
    random_point_in_square,
150
    specific_int_until_sum,
151
    write_table_to_postgis,
152
    write_table_to_postgres,
153
)
154
from egon.data.datasets.heat_demand import EgonPetaHeat
155
from egon.data.datasets.zensus_mv_grid_districts import MapZensusGridDistricts
156
from egon.data.datasets.zensus_vg250 import DestatisZensusPopulationPerHa
157
158
engine = db.engine()
159
Base = declarative_base()
160
161
# import db tables
162
saio.register_schema("openstreetmap", engine=engine)
163
saio.register_schema("boundaries", engine=engine)
164
165
166
class EgonCtsElectricityDemandBuildingShare(Base):
167
    __tablename__ = "egon_cts_electricity_demand_building_share"
168
    __table_args__ = {"schema": "demand"}
169
170
    id = Column(Integer, primary_key=True)
171
    scenario = Column(String, primary_key=True)
172
    bus_id = Column(Integer, index=True)
173
    profile_share = Column(REAL)
174
175
176
class EgonCtsHeatDemandBuildingShare(Base):
177
    __tablename__ = "egon_cts_heat_demand_building_share"
178
    __table_args__ = {"schema": "demand"}
179
180
    id = Column(Integer, primary_key=True)
181
    scenario = Column(String, primary_key=True)
182
    bus_id = Column(Integer, index=True)
183
    profile_share = Column(REAL)
184
185
186
class CtsBuildings(Base):
187
    __tablename__ = "egon_cts_buildings"
188
    __table_args__ = {"schema": "openstreetmap"}
189
190
    serial = Column(Integer, primary_key=True)
191
    id = Column(Integer, index=True)
192
    zensus_population_id = Column(Integer, index=True)
193
    geom_building = Column(Geometry("Polygon", 3035))
194
    n_amenities_inside = Column(Integer)
195
    source = Column(String)
196
197
198
class BuildingHeatPeakLoads(Base):
199
    __tablename__ = "egon_building_heat_peak_loads"
200
    __table_args__ = {"schema": "demand"}
201
202
    building_id = Column(Integer, primary_key=True)
203
    scenario = Column(String, primary_key=True)
204
    sector = Column(String, primary_key=True)
205
    peak_load_in_w = Column(REAL)
206
207
208
def start_logging(name=None):
209
    """Start logging into console"""
210
    log = logging.getLogger(name)
211
    log.propagate = False
212
    log.setLevel(logging.INFO)
213
    logformat = logging.Formatter(
214
        "%(asctime)s %(message)s", "%m/%d/%Y %H:%M:%S"
215
    )
216
    sh = logging.StreamHandler()
217
    sh.setFormatter(logformat)
218
    log.addHandler(sh)
219
    return log
220
221
222
def amenities_without_buildings():
223
    """
224
    Amenities which have no buildings assigned and are in
225
    a cell with cts demand are determined.
226
227
    Returns
228
    -------
229
    pd.DataFrame
230
        Table of amenities without buildings
231
    """
232
    from saio.openstreetmap import osm_amenities_not_in_buildings_filtered
233
234
    with db.session_scope() as session:
235
        cells_query = (
236
            session.query(
237
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
238
                # TODO can be used for square around amenity
239
                #  (1 geom_amenity: 1 geom_building)
240
                #  not unique amenity_ids yet
241
                osm_amenities_not_in_buildings_filtered.geom_amenity,
242
                osm_amenities_not_in_buildings_filtered.egon_amenity_id,
243
                # EgonDemandRegioZensusElectricity.demand,
244
                # # TODO can be used to generate n random buildings
245
                # # (n amenities : 1 randombuilding)
246
                # func.count(
247
                #     osm_amenities_not_in_buildings_filtered.egon_amenity_id
248
                # ).label("n_amenities_inside"),
249
                # DestatisZensusPopulationPerHa.geom,
250
            )
251
            .filter(
252
                func.st_within(
253
                    osm_amenities_not_in_buildings_filtered.geom_amenity,
254
                    DestatisZensusPopulationPerHa.geom,
255
                )
256
            )
257
            .filter(
258
                DestatisZensusPopulationPerHa.id
259
                == EgonDemandRegioZensusElectricity.zensus_population_id
260
            )
261
            .filter(
262
                EgonDemandRegioZensusElectricity.sector == "service",
263
                EgonDemandRegioZensusElectricity.scenario == "eGon2035"
264
                #         ).group_by(
265
                #             EgonDemandRegioZensusElectricity.zensus_population_id,
266
                #             DestatisZensusPopulationPerHa.geom,
267
            )
268
        )
269
    # # TODO can be used to generate n random buildings
270
    # df_cells_with_amenities_not_in_buildings = gpd.read_postgis(
271
    #     cells_query.statement, cells_query.session.bind, geom_col="geom"
272
    # )
273
    #
274
275
    # # TODO can be used for square around amenity
276
    df_amenities_without_buildings = gpd.read_postgis(
277
        cells_query.statement,
278
        cells_query.session.bind,
279
        geom_col="geom_amenity",
280
    )
281
    return df_amenities_without_buildings
282
283
284
def place_buildings_with_amenities(df, amenities=None, max_amenities=None):
285
    """
286
    Building centroids are placed randomly within census cells.
287
    The Number of buildings is derived from n_amenity_inside, the selected
288
    method and number of amenities per building.
289
290
    Returns
291
    -------
292
    df: gpd.GeoDataFrame
293
        Table of buildings centroids
294
    """
295
    if isinstance(max_amenities, int):
296
        # amount of amenities is randomly generated within bounds
297
        # (max_amenities, amenities per cell)
298
        df["n_amenities_inside"] = df["n_amenities_inside"].apply(
299
            random_ints_until_sum, args=[max_amenities]
300
        )
301
    if isinstance(amenities, int):
302
        # Specific amount of amenities per building
303
        df["n_amenities_inside"] = df["n_amenities_inside"].apply(
304
            specific_int_until_sum, args=[amenities]
305
        )
306
307
    # Unnest each building
308
    df = df.explode(column="n_amenities_inside")
309
310
    # building count per cell
311
    df["building_count"] = df.groupby(["zensus_population_id"]).cumcount() + 1
312
313
    # generate random synthetic buildings
314
    edge_length = 5
315
    # create random points within census cells
316
    points = random_point_in_square(geom=df["geom"], tol=edge_length / 2)
317
318
    df.reset_index(drop=True, inplace=True)
319
    # Store center of polygon
320
    df["geom_point"] = points
321
    # Drop geometry of census cell
322
    df = df.drop(columns=["geom"])
323
324
    return df
325
326
327
def create_synthetic_buildings(df, points=None, crs="EPSG:3035"):
328
    """
329
    Synthetic buildings are generated around points.
330
331
    Parameters
332
    ----------
333
    df: pd.DataFrame
334
        Table of census cells
335
    points: gpd.GeoSeries or str
336
        List of points to place buildings around or column name of df
337
    crs: str
338
        CRS of result table
339
340
    Returns
341
    -------
342
    df: gpd.GeoDataFrame
343
        Synthetic buildings
344
    """
345
346
    if isinstance(points, str) and points in df.columns:
347
        points = df[points]
348
    elif isinstance(points, gpd.GeoSeries):
349
        pass
350
    else:
351
        raise ValueError("Points are of the wrong type")
352
353
    # Create building using a square around point
354
    edge_length = 5
355
    df["geom_building"] = points.buffer(distance=edge_length / 2, cap_style=3)
356
357
    if "geom_point" not in df.columns:
358
        df["geom_point"] = df["geom_building"].centroid
359
360
    # TODO Check CRS
361
    df = gpd.GeoDataFrame(
362
        df,
363
        crs=crs,
364
        geometry="geom_building",
365
    )
366
367
    # TODO remove after implementation of egon_building_id
368
    df.rename(columns={"id": "egon_building_id"}, inplace=True)
369
370
    # get max number of building ids from synthetic residential table
371
    with db.session_scope() as session:
372
        max_synth_residential_id = session.execute(
373
            func.max(OsmBuildingsSynthetic.id)
374
        ).scalar()
375
    max_synth_residential_id = int(max_synth_residential_id)
376
377
    # create sequential ids
378
    df["egon_building_id"] = range(
379
        max_synth_residential_id + 1,
380
        max_synth_residential_id + df.shape[0] + 1,
381
    )
382
383
    df["area"] = df["geom_building"].area
384
    # set building type of synthetic building
385
    df["building"] = "cts"
386
    # TODO remove in #772
387
    df = df.rename(
388
        columns={
389
            # "zensus_population_id": "cell_id",
390
            "egon_building_id": "id",
391
        }
392
    )
393
    return df
394
395
396
def buildings_with_amenities():
397
    """
398
    Amenities which are assigned to buildings are determined
399
    and grouped per building and zensus cell. Buildings
400
    covering multiple cells therefore exists multiple times
401
    but in different zensus cells. This is necessary to cover
402
    all cells with a cts demand. If buildings exist in multiple
403
    substations, their amenities are summed and assigned and kept in
404
    one substation only. If as a result, a census cell is uncovered,
405
    a synthetic amenity is placed. The buildings are aggregated
406
    afterwards during the calculation of the profile_share.
407
408
    Returns
409
    -------
410
    df_buildings_with_amenities: gpd.GeoDataFrame
411
        Contains all buildings with amenities per zensus cell.
412
    df_lost_cells: gpd.GeoDataFrame
413
        Contains synthetic amenities in lost cells. Might be empty
414
    """
415
416
    from saio.openstreetmap import osm_amenities_in_buildings_filtered
417
418
    with db.session_scope() as session:
419
        cells_query = (
420
            session.query(
421
                osm_amenities_in_buildings_filtered,
422
                MapZensusGridDistricts.bus_id,
423
            )
424
            .filter(
425
                MapZensusGridDistricts.zensus_population_id
426
                == osm_amenities_in_buildings_filtered.zensus_population_id
427
            )
428
            .filter(
429
                EgonDemandRegioZensusElectricity.zensus_population_id
430
                == osm_amenities_in_buildings_filtered.zensus_population_id
431
            )
432
            .filter(
433
                EgonDemandRegioZensusElectricity.sector == "service",
434
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
435
            )
436
        )
437
    df_amenities_in_buildings = pd.read_sql(
438
        cells_query.statement, cells_query.session.bind, index_col=None
439
    )
440
441
    df_amenities_in_buildings["geom_building"] = df_amenities_in_buildings[
442
        "geom_building"
443
    ].apply(to_shape)
444
    df_amenities_in_buildings["geom_amenity"] = df_amenities_in_buildings[
445
        "geom_amenity"
446
    ].apply(to_shape)
447
448
    df_amenities_in_buildings["n_amenities_inside"] = 1
449
450
    # add identifier column for buildings in multiple substations
451
    df_amenities_in_buildings[
452
        "duplicate_identifier"
453
    ] = df_amenities_in_buildings.groupby(["id", "bus_id"])[
454
        "n_amenities_inside"
455
    ].transform(
456
        "cumsum"
457
    )
458
    df_amenities_in_buildings = df_amenities_in_buildings.sort_values(
459
        ["id", "duplicate_identifier"]
460
    )
461
    # sum amenities of buildings with multiple substations
462
    df_amenities_in_buildings[
463
        "n_amenities_inside"
464
    ] = df_amenities_in_buildings.groupby(["id", "duplicate_identifier"])[
465
        "n_amenities_inside"
466
    ].transform(
467
        "sum"
468
    )
469
470
    # create column to always go for bus_id with max amenities
471
    df_amenities_in_buildings[
472
        "max_amenities"
473
    ] = df_amenities_in_buildings.groupby(["id", "bus_id"])[
474
        "n_amenities_inside"
475
    ].transform(
476
        "sum"
477
    )
478
    # sort to go for
479
    df_amenities_in_buildings.sort_values(
480
        ["id", "max_amenities"], ascending=False, inplace=True
481
    )
482
483
    # identify lost zensus cells
484
    df_lost_cells = df_amenities_in_buildings.loc[
485
        df_amenities_in_buildings.duplicated(
486
            subset=["id", "duplicate_identifier"], keep="first"
487
        )
488
    ]
489
    df_lost_cells.drop_duplicates(
490
        subset=["zensus_population_id"], inplace=True
491
    )
492
493
    # drop buildings with multiple substation and lower max amenity
494
    df_amenities_in_buildings.drop_duplicates(
495
        subset=["id", "duplicate_identifier"], keep="first", inplace=True
496
    )
497
498
    # check if lost zensus cells are already covered
499
    if not df_lost_cells.empty:
500
        if not (
501
            df_amenities_in_buildings["zensus_population_id"]
502
            .isin(df_lost_cells["zensus_population_id"])
503
            .empty
504
        ):
505
            # query geom data for cell if not
506
            with db.session_scope() as session:
507
                cells_query = session.query(
508
                    DestatisZensusPopulationPerHa.id,
509
                    DestatisZensusPopulationPerHa.geom,
510
                ).filter(
511
                    DestatisZensusPopulationPerHa.id.in_(
512
                        df_lost_cells["zensus_population_id"]
513
                    )
514
                )
515
516
            df_lost_cells = gpd.read_postgis(
517
                cells_query.statement,
518
                cells_query.session.bind,
519
                geom_col="geom",
520
            )
521
            # TODO maybe adapt method
522
            # place random amenity in cell
523
            df_lost_cells["n_amenities_inside"] = 1
524
            df_lost_cells.rename(
525
                columns={
526
                    "id": "zensus_population_id",
527
                },
528
                inplace=True,
529
            )
530
            df_lost_cells = place_buildings_with_amenities(
531
                df_lost_cells, amenities=1
532
            )
533
            df_lost_cells.rename(
534
                columns={
535
                    # "id": "zensus_population_id",
536
                    "geom_point": "geom_amenity",
537
                },
538
                inplace=True,
539
            )
540
            df_lost_cells.drop(
541
                columns=["building_count", "n_amenities_inside"], inplace=True
542
            )
543
        else:
544
            df_lost_cells = None
545
    else:
546
        df_lost_cells = None
547
548
    # drop helper columns
549
    df_amenities_in_buildings.drop(
550
        columns=["duplicate_identifier"], inplace=True
551
    )
552
553
    # sum amenities per building and cell
554
    df_amenities_in_buildings[
555
        "n_amenities_inside"
556
    ] = df_amenities_in_buildings.groupby(["zensus_population_id", "id"])[
557
        "n_amenities_inside"
558
    ].transform(
559
        "sum"
560
    )
561
    # drop duplicated buildings
562
    df_buildings_with_amenities = df_amenities_in_buildings.drop_duplicates(
563
        ["id", "zensus_population_id"]
564
    )
565
    df_buildings_with_amenities.reset_index(inplace=True, drop=True)
566
567
    df_buildings_with_amenities = df_buildings_with_amenities[
568
        ["id", "zensus_population_id", "geom_building", "n_amenities_inside"]
569
    ]
570
    df_buildings_with_amenities.rename(
571
        columns={
572
            # "zensus_population_id": "cell_id",
573
            "egon_building_id": "id"
574
        },
575
        inplace=True,
576
    )
577
578
    return df_buildings_with_amenities, df_lost_cells
579
580
581
def buildings_without_amenities():
582
    """
583
    Buildings (filtered and synthetic) in cells with
584
    cts demand but no amenities are determined.
585
586
    Returns
587
    -------
588
    df_buildings_without_amenities: gpd.GeoDataFrame
589
        Table of buildings without amenities in zensus cells
590
        with cts demand.
591
    """
592
    from saio.boundaries import egon_map_zensus_buildings_filtered_all
593
    from saio.openstreetmap import (
594
        osm_amenities_shops_filtered,
595
        osm_buildings_filtered,
596
        osm_buildings_synthetic,
597
    )
598
599
    # buildings_filtered in cts-demand-cells without amenities
600
    with db.session_scope() as session:
601
602
        # Synthetic Buildings
603
        q_synth_buildings = session.query(
604
            osm_buildings_synthetic.cell_id.cast(Integer).label(
605
                "zensus_population_id"
606
            ),
607
            osm_buildings_synthetic.id.cast(Integer).label("id"),
608
            osm_buildings_synthetic.area.label("area"),
609
            osm_buildings_synthetic.geom_building.label("geom_building"),
610
            osm_buildings_synthetic.geom_point.label("geom_point"),
611
        )
612
613
        # Buildings filtered
614
        q_buildings_filtered = session.query(
615
            egon_map_zensus_buildings_filtered_all.zensus_population_id,
616
            osm_buildings_filtered.id,
617
            osm_buildings_filtered.area,
618
            osm_buildings_filtered.geom_building,
619
            osm_buildings_filtered.geom_point,
620
        ).filter(
621
            osm_buildings_filtered.id
622
            == egon_map_zensus_buildings_filtered_all.id
623
        )
624
625
        # Amenities + zensus_population_id
626
        q_amenities = (
627
            session.query(
628
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
629
            )
630
            .filter(
631
                func.st_within(
632
                    osm_amenities_shops_filtered.geom_amenity,
633
                    DestatisZensusPopulationPerHa.geom,
634
                )
635
            )
636
            .distinct(DestatisZensusPopulationPerHa.id)
637
        )
638
639
        # Cells with CTS demand but without amenities
640
        q_cts_without_amenities = (
641
            session.query(
642
                EgonDemandRegioZensusElectricity.zensus_population_id,
643
            )
644
            .filter(
645
                EgonDemandRegioZensusElectricity.sector == "service",
646
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
647
            )
648
            .filter(
649
                EgonDemandRegioZensusElectricity.zensus_population_id.notin_(
650
                    q_amenities
651
                )
652
            )
653
            .distinct()
654
        )
655
656
        # Buildings filtered + synthetic buildings residential in
657
        # cells with CTS demand but without amenities
658
        cells_query = q_synth_buildings.union(q_buildings_filtered).filter(
659
            egon_map_zensus_buildings_filtered_all.zensus_population_id.in_(
660
                q_cts_without_amenities
661
            )
662
        )
663
664
    # df_buildings_without_amenities = pd.read_sql(
665
    #     cells_query.statement, cells_query.session.bind, index_col=None)
666
    df_buildings_without_amenities = gpd.read_postgis(
667
        cells_query.statement,
668
        cells_query.session.bind,
669
        geom_col="geom_building",
670
    )
671
672
    df_buildings_without_amenities = df_buildings_without_amenities.rename(
673
        columns={
674
            # "zensus_population_id": "cell_id",
675
            "egon_building_id": "id",
676
        }
677
    )
678
679
    return df_buildings_without_amenities
680
681
682
def select_cts_buildings(df_buildings_wo_amenities, max_n):
683
    """
684
    N Buildings (filtered and synthetic) in each cell with
685
    cts demand are selected. Only the first n buildings
686
    are taken for each cell. The buildings are sorted by surface
687
    area.
688
689
    Returns
690
    -------
691
    df_buildings_with_cts_demand: gpd.GeoDataFrame
692
        Table of buildings
693
    """
694
695
    df_buildings_wo_amenities.sort_values(
696
        "area", ascending=False, inplace=True
697
    )
698
    # select first n ids each census cell if available
699
    df_buildings_with_cts_demand = (
700
        df_buildings_wo_amenities.groupby("zensus_population_id")
701
        .nth(list(range(max_n)))
702
        .reset_index()
703
    )
704
    df_buildings_with_cts_demand.reset_index(drop=True, inplace=True)
705
706
    return df_buildings_with_cts_demand
707
708
709
def cells_with_cts_demand_only(df_buildings_without_amenities):
710
    """
711
    Cells with cts demand but no amenities or buildilngs
712
    are determined.
713
714
    Returns
715
    -------
716
    df_cells_only_cts_demand: gpd.GeoDataFrame
717
        Table of cells with cts demand but no amenities or buildings
718
    """
719
    from saio.openstreetmap import osm_amenities_shops_filtered
720
721
    # cells mit amenities
722
    with db.session_scope() as session:
723
        sub_query = (
724
            session.query(
725
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
726
            )
727
            .filter(
728
                func.st_within(
729
                    osm_amenities_shops_filtered.geom_amenity,
730
                    DestatisZensusPopulationPerHa.geom,
731
                )
732
            )
733
            .distinct(DestatisZensusPopulationPerHa.id)
734
        )
735
736
        cells_query = (
737
            session.query(
738
                EgonDemandRegioZensusElectricity.zensus_population_id,
739
                EgonDemandRegioZensusElectricity.scenario,
740
                EgonDemandRegioZensusElectricity.sector,
741
                EgonDemandRegioZensusElectricity.demand,
742
                DestatisZensusPopulationPerHa.geom,
743
            )
744
            .filter(
745
                EgonDemandRegioZensusElectricity.sector == "service",
746
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
747
            )
748
            .filter(
749
                EgonDemandRegioZensusElectricity.zensus_population_id.notin_(
750
                    sub_query
751
                )
752
            )
753
            .filter(
754
                EgonDemandRegioZensusElectricity.zensus_population_id
755
                == DestatisZensusPopulationPerHa.id
756
            )
757
        )
758
759
    df_cts_cell_without_amenities = gpd.read_postgis(
760
        cells_query.statement,
761
        cells_query.session.bind,
762
        geom_col="geom",
763
        index_col=None,
764
    )
765
766
    # TODO maybe remove
767
    df_buildings_without_amenities = df_buildings_without_amenities.rename(
768
        columns={"cell_id": "zensus_population_id"}
769
    )
770
771
    # Census cells with only cts demand
772
    df_cells_only_cts_demand = df_cts_cell_without_amenities.loc[
773
        ~df_cts_cell_without_amenities["zensus_population_id"].isin(
774
            df_buildings_without_amenities["zensus_population_id"].unique()
775
        )
776
    ]
777
778
    df_cells_only_cts_demand.reset_index(drop=True, inplace=True)
779
780
    return df_cells_only_cts_demand
781
782
783
def calc_census_cell_share(scenario="eGon2035", sector="electricity"):
784
    """
785
    The profile share for each census cell is calculated by it's
786
    share of annual demand per substation bus. The annual demand
787
    per cell is defined by DemandRegio/Peta5. The share is for both
788
    scenarios identical as the annual demand is linearly scaled.
789
790
    Parameters
791
    ----------
792
    scenario: str
793
        Scenario for which the share is calculated.
794
    sector: str
795
        Scenario for which the share is calculated.
796
797
    Returns
798
    -------
799
    df_census_share: pd.DataFrame
800
    """
801
    if sector == "electricity":
802
        demand_table = EgonDemandRegioZensusElectricity
803
    elif sector == "heat":
804
        demand_table = EgonPetaHeat
805
806
    with db.session_scope() as session:
807
        cells_query = (
808
            session.query(demand_table, MapZensusGridDistricts.bus_id)
0 ignored issues
show
introduced by
The variable demand_table does not seem to be defined for all execution paths.
Loading history...
809
            .filter(demand_table.sector == "service")
810
            .filter(demand_table.scenario == scenario)
811
            .filter(
812
                demand_table.zensus_population_id
813
                == MapZensusGridDistricts.zensus_population_id
814
            )
815
        )
816
817
    df_demand = pd.read_sql(
818
        cells_query.statement,
819
        cells_query.session.bind,
820
        index_col="zensus_population_id",
821
    )
822
823
    # get demand share of cell per bus
824
    df_census_share = df_demand["demand"] / df_demand.groupby("bus_id")[
825
        "demand"
826
    ].transform("sum")
827
    df_census_share = df_census_share.rename("cell_share")
828
829
    df_census_share = pd.concat(
830
        [
831
            df_census_share,
832
            df_demand[["bus_id", "scenario"]],
833
        ],
834
        axis=1,
835
    )
836
837
    df_census_share.reset_index(inplace=True)
838
    return df_census_share
839
840
841
def calc_building_demand_profile_share(
842
    df_cts_buildings, scenario="eGon2035", sector="electricity"
843
):
844
    """
845
    Share of cts electricity demand profile per bus for every selected building
846
    is calculated. Building-amenity share is multiplied with census cell share
847
    to get the substation bus profile share for each building. The share is
848
    grouped and aggregated per building as some buildings exceed the shape of
849
    census cells and have amenities assigned from multiple cells. Building
850
    therefore get the amenity share of all census cells.
851
852
    Parameters
853
    ----------
854
    df_cts_buildings: gpd.GeoDataFrame
855
        Table of all buildings with cts demand assigned
856
    scenario: str
857
        Scenario for which the share is calculated.
858
    sector: str
859
        Sector for which the share is calculated.
860
861
    Returns
862
    -------
863
    df_building_share: pd.DataFrame
864
        Table of bus profile share per building
865
866
    """
867
868
    from saio.boundaries import egon_map_zensus_buildings_filtered_all
869
870
    def calc_building_amenity_share(df_cts_buildings):
871
        """
872
        Calculate the building share by the number amenities per building
873
        within a census cell. Building ids can exist multiple time but with
874
        different zensus_population_ids.
875
        """
876
        df_building_amenity_share = df_cts_buildings[
877
            "n_amenities_inside"
878
        ] / df_cts_buildings.groupby("zensus_population_id")[
879
            "n_amenities_inside"
880
        ].transform(
881
            "sum"
882
        )
883
        df_building_amenity_share = pd.concat(
884
            [
885
                df_building_amenity_share.rename("building_amenity_share"),
886
                df_cts_buildings[["zensus_population_id", "id"]],
887
            ],
888
            axis=1,
889
        )
890
        return df_building_amenity_share
891
892
    df_building_amenity_share = calc_building_amenity_share(df_cts_buildings)
893
894
    df_census_cell_share = calc_census_cell_share(
895
        scenario=scenario, sector=sector
896
    )
897
898
    df_demand_share = pd.merge(
899
        left=df_building_amenity_share,
900
        right=df_census_cell_share,
901
        left_on="zensus_population_id",
902
        right_on="zensus_population_id",
903
    )
904
    df_demand_share["profile_share"] = df_demand_share[
905
        "building_amenity_share"
906
    ].multiply(df_demand_share["cell_share"])
907
908
    # TODO bus_id fix
909
    # df_demand_share = df_demand_share[
910
    #     ["id", "bus_id", "scenario", "profile_share"]
911
    # ]
912
    df_demand_share = df_demand_share[["id", "scenario", "profile_share"]]
913
914
    # assign bus_id via census cell of building centroid
915
    with db.session_scope() as session:
916
        cells_query = session.query(
917
            egon_map_zensus_buildings_filtered_all.id,
918
            egon_map_zensus_buildings_filtered_all.zensus_population_id,
919
            MapZensusGridDistricts.bus_id,
920
        ).filter(
921
            MapZensusGridDistricts.zensus_population_id
922
            == egon_map_zensus_buildings_filtered_all.zensus_population_id
923
        )
924
925
    df_egon_map_zensus_buildings_buses = pd.read_sql(
926
        cells_query.statement,
927
        cells_query.session.bind,
928
        index_col=None,
929
    )
930
    df_demand_share = pd.merge(
931
        left=df_demand_share, right=df_egon_map_zensus_buildings_buses, on="id"
932
    )
933
934
    # TODO adapt groupby?
935
    # Group and aggregate per building for multi cell buildings
936
    df_demand_share = (
937
        df_demand_share.groupby(["scenario", "id", "bus_id"])
938
        .sum()
939
        .reset_index()
940
    )
941
    if df_demand_share.duplicated("id", keep=False).any():
942
        print(
943
            df_demand_share.loc[df_demand_share.duplicated("id", keep=False)]
944
        )
945
    return df_demand_share
946
947
948
def calc_building_profiles(
949
    egon_building_id=None,
950
    bus_id=None,
951
    scenario="eGon2035",
952
    sector="electricity",
953
):
954
    """
955
    Calculate the demand profile for each building. The profile is
956
    calculated by the demand share of the building per substation bus.
957
958
    Parameters
959
    ----------
960
    egon_building_id: int
961
        Id of the building for which the profile is calculated.
962
        If not given, the profiles are calculated for all buildings.
963
    bus_id: int
964
        Id of the substation for which the all profiles are calculated.
965
        If not given, the profiles are calculated for all buildings.
966
    scenario: str
967
        Scenario for which the share is calculated.
968
    sector: str
969
        Sector for which the share is calculated.
970
971
    Returns
972
    -------
973
    df_building_profiles: pd.DataFrame
974
        Table of demand profile per building
975
    """
976
    if sector == "electricity":
977
        with db.session_scope() as session:
978
            cells_query = session.query(
979
                EgonCtsElectricityDemandBuildingShare,
980
            ).filter(
981
                EgonCtsElectricityDemandBuildingShare.scenario == scenario
982
            )
983
984
        df_demand_share = pd.read_sql(
985
            cells_query.statement, cells_query.session.bind, index_col=None
986
        )
987
988
        # TODO maybe use demand.egon_etrago_electricity_cts
989
        # with db.session_scope() as session:
990
        #     cells_query = (
991
        #         session.query(
992
        #             EgonEtragoElectricityCts
993
        #         ).filter(
994
        #             EgonEtragoElectricityCts.scn_name == scenario)
995
        #     )
996
        #
997
        # df_cts_profiles = pd.read_sql(
998
        #     cells_query.statement,
999
        #     cells_query.session.bind,
1000
        # )
1001
        # df_cts_profiles = pd.DataFrame.from_dict(
1002
        #   df_cts_profiles.set_index('bus_id')['p_set'].to_dict(),
1003
        #   orient="index")
1004
        df_cts_profiles = calc_load_curves_cts(scenario)
1005
1006
    elif sector == "heat":
1007
        with db.session_scope() as session:
1008
            cells_query = session.query(
1009
                EgonCtsHeatDemandBuildingShare,
1010
            ).filter(EgonCtsHeatDemandBuildingShare.scenario == scenario)
1011
1012
        df_demand_share = pd.read_sql(
1013
            cells_query.statement, cells_query.session.bind, index_col=None
1014
        )
1015
1016
        # TODO cts heat substation profiles missing
1017
1018
    # get demand share of selected building id
1019
    if isinstance(egon_building_id, int):
1020
        if egon_building_id in df_demand_share["id"]:
0 ignored issues
show
introduced by
The variable df_demand_share does not seem to be defined for all execution paths.
Loading history...
1021
            df_demand_share = df_demand_share.loc[
1022
                df_demand_share["id"] == egon_building_id
1023
            ]
1024
        else:
1025
            raise KeyError(f"Building with id {egon_building_id} not found")
1026
    # TODO maybe add list
1027
    # elif isinstance(egon_building_id, list):
1028
1029
    # get demand share of all buildings for selected bus id
1030
    if isinstance(bus_id, int):
1031
        if bus_id in df_demand_share["bus_id"]:
1032
            df_demand_share = df_demand_share.loc[
1033
                df_demand_share["bus_id"] == bus_id
1034
            ]
1035
        else:
1036
            raise KeyError(f"Bus with id {bus_id} not found")
1037
1038
    # get demand profile for all buildings for selected demand share
1039
    # TODO takes a few seconds per iteration
1040
    df_building_profiles = pd.DataFrame()
1041
    for bus_id, df in df_demand_share.groupby("bus_id"):
1042
        shares = df.set_index("id", drop=True)["profile_share"]
1043
        profile = df_cts_profiles.loc[:, bus_id]
0 ignored issues
show
introduced by
The variable df_cts_profiles does not seem to be defined for all execution paths.
Loading history...
1044
        # building_profiles = profile.apply(lambda x: x * shares)
1045
        building_profiles = np.outer(profile, shares)
1046
        building_profiles = pd.DataFrame(
1047
            building_profiles, index=profile.index, columns=shares.index
1048
        )
1049
        df_building_profiles = pd.concat(
1050
            [df_building_profiles, building_profiles], axis=1
1051
        )
1052
1053
    return df_building_profiles
1054
1055
1056
def delete_synthetic_cts_buildings():
1057
    """
1058
    All synthetic cts buildings are deleted from the DB. This is necessary if
1059
    the task is run multiple times as the existing synthetic buildings
1060
    influence the results.
1061
    """
1062
    # import db tables
1063
    from saio.openstreetmap import osm_buildings_synthetic
1064
1065
    # cells mit amenities
1066
    with db.session_scope() as session:
1067
        session.query(osm_buildings_synthetic).filter(
1068
            osm_buildings_synthetic.building == "cts"
1069
        ).delete()
1070
1071
1072
def remove_double_bus_id(df_cts_buildings):
1073
    """"""
1074
    # from saio.boundaries import egon_map_zensus_buildings_filtered_all
1075
1076
    # assign bus_id via census cell of building centroid
1077
    with db.session_scope() as session:
1078
        cells_query = session.query(
1079
            # egon_map_zensus_buildings_filtered_all.id,
1080
            # egon_map_zensus_buildings_filtered_all.zensus_population_id,
1081
            MapZensusGridDistricts.zensus_population_id,
1082
            MapZensusGridDistricts.bus_id
1083
            # ).filter(
1084
            #     MapZensusGridDistricts.zensus_population_id
1085
            #     == egon_map_zensus_buildings_filtered_all.zensus_population_id
1086
        )
1087
1088
    df_egon_map_zensus_buildings_buses = pd.read_sql(
1089
        cells_query.statement,
1090
        cells_query.session.bind,
1091
        index_col=None,
1092
    )
1093
    df_cts_buildings = pd.merge(
1094
        left=df_cts_buildings,
1095
        right=df_egon_map_zensus_buildings_buses,
1096
        on="zensus_population_id",
1097
    )
1098
1099
    substation_per_building = df_cts_buildings.groupby("id")[
1100
        "bus_id"
1101
    ].nunique()
1102
    building_id = substation_per_building.loc[
1103
        substation_per_building > 1
1104
    ].index
1105
    df_duplicates = df_cts_buildings.loc[
1106
        df_cts_buildings["id"].isin(building_id)
1107
    ]
1108
    for unique_id in df_duplicates["id"].unique():
1109
        drop_index = df_duplicates[df_duplicates["id"] == unique_id].index[0]
1110
        print(
1111
            f"Buildings {df_cts_buildings.loc[drop_index, 'id']}"
1112
            f" dropped because of double substation"
1113
        )
1114
        df_cts_buildings.drop(index=drop_index, inplace=True)
1115
1116
    df_cts_buildings.drop(columns="bus_id", inplace=True)
1117
1118
    return df_cts_buildings
1119
1120
1121
def cts_buildings():
1122
    """
1123
    Assigns CTS demand to buildings and calculates the respective demand
1124
    profiles. The demand profile per substation are disaggregated per
1125
    annual demand share of each census cell and by the number of amenities
1126
    per building within the cell. If no building data is available,
1127
    synthetic buildings are generated around the amenities. If no amenities
1128
    but cts demand is available, buildings are randomly selected. If no
1129
    building nor amenity is available, random synthetic buildings are
1130
    generated. The demand share is stored in the database.
1131
1132
    Note:
1133
    -----
1134
    Cells with CTS demand, amenities and buildings do not change within
1135
    the scenarios, only the demand itself. Therefore scenario eGon2035
1136
    can be used universally to determine the cts buildings but not for
1137
    he demand share.
1138
    """
1139
1140
    log = start_logging(name="CTS-buildings")
1141
    log.info("Start logging!")
1142
    # Buildings with amenities
1143
    df_buildings_with_amenities, df_lost_cells = buildings_with_amenities()
1144
    log.info("Buildings with amenities selected!")
1145
1146
    # Median number of amenities per cell
1147
    median_n_amenities = int(
1148
        df_buildings_with_amenities.groupby("zensus_population_id")[
1149
            "n_amenities_inside"
1150
        ]
1151
        .sum()
1152
        .median()
1153
    )
1154
    # TODO remove
1155
    print(f"Median amenity value: {median_n_amenities}")
1156
1157
    # Remove synthetic CTS buildings if existing
1158
    delete_synthetic_cts_buildings()
1159
    log.info("Old synthetic cts buildings deleted!")
1160
1161
    # Amenities not assigned to buildings
1162
    df_amenities_without_buildings = amenities_without_buildings()
1163
    log.info("Amenities without buildlings selected!")
1164
1165
    # Append lost cells due to duplicated ids, to cover all demand cells
1166
    if not df_lost_cells.empty:
1167
1168
        df_lost_cells["amenities"] = median_n_amenities
1169
        # create row for every amenity
1170
        df_lost_cells["amenities"] = (
1171
            df_lost_cells["amenities"].astype(int).apply(range)
1172
        )
1173
        df_lost_cells = df_lost_cells.explode("amenities")
1174
        df_lost_cells.drop(columns="amenities", inplace=True)
1175
        df_amenities_without_buildings = df_amenities_without_buildings.append(
1176
            df_lost_cells, ignore_index=True
1177
        )
1178
        log.info("Lost cells due to substation intersection appended!")
1179
1180
    # One building per amenity
1181
    df_amenities_without_buildings["n_amenities_inside"] = 1
1182
    # Create synthetic buildings for amenites without buildings
1183
    df_synthetic_buildings_with_amenities = create_synthetic_buildings(
1184
        df_amenities_without_buildings, points="geom_amenity"
1185
    )
1186
    log.info("Synthetic buildings created!")
1187
1188
    # TODO write to DB and remove renaming
1189
    write_table_to_postgis(
1190
        df_synthetic_buildings_with_amenities.rename(
1191
            columns={
1192
                "zensus_population_id": "cell_id",
1193
                "egon_building_id": "id",
1194
            }
1195
        ),
1196
        OsmBuildingsSynthetic,
1197
        drop=False,
1198
    )
1199
    log.info("Synthetic buildings exported to DB!")
1200
1201
    # Cells without amenities but CTS demand and buildings
1202
    df_buildings_without_amenities = buildings_without_amenities()
1203
    log.info("Buildings without amenities in demand cells identified!")
1204
1205
    # TODO Fix Adhoc Bugfix duplicated buildings
1206
    # drop building ids which have already been used
1207
    mask = df_buildings_without_amenities.loc[
1208
        df_buildings_without_amenities["id"].isin(
1209
            df_buildings_with_amenities["id"]
1210
        )
1211
    ].index
1212
    df_buildings_without_amenities = df_buildings_without_amenities.drop(
1213
        index=mask
1214
    ).reset_index(drop=True)
1215
    log.info(f"{len(mask)} duplicated ids removed!")
1216
1217
    # select median n buildings per cell
1218
    df_buildings_without_amenities = select_cts_buildings(
1219
        df_buildings_without_amenities, max_n=median_n_amenities
1220
    )
1221
    df_buildings_without_amenities["n_amenities_inside"] = 1
1222
    log.info(f"{median_n_amenities} buildings per cell selected!")
1223
1224
    # Create synthetic amenities and buildings in cells with only CTS demand
1225
    df_cells_with_cts_demand_only = cells_with_cts_demand_only(
1226
        df_buildings_without_amenities
1227
    )
1228
    log.info("Cells with only demand identified!")
1229
1230
    # Median n Amenities per cell
1231
    df_cells_with_cts_demand_only["amenities"] = median_n_amenities
1232
    # create row for every amenity
1233
    df_cells_with_cts_demand_only["amenities"] = (
1234
        df_cells_with_cts_demand_only["amenities"].astype(int).apply(range)
1235
    )
1236
    df_cells_with_cts_demand_only = df_cells_with_cts_demand_only.explode(
1237
        "amenities"
1238
    )
1239
    df_cells_with_cts_demand_only.drop(columns="amenities", inplace=True)
1240
1241
    # Only 1 Amenity per Building
1242
    df_cells_with_cts_demand_only["n_amenities_inside"] = 1
1243
    df_cells_with_cts_demand_only = place_buildings_with_amenities(
1244
        df_cells_with_cts_demand_only, amenities=1
1245
    )
1246
    df_synthetic_buildings_without_amenities = create_synthetic_buildings(
1247
        df_cells_with_cts_demand_only, points="geom_point"
1248
    )
1249
    log.info(f"{median_n_amenities} synthetic buildings per cell created")
1250
1251
    # TODO write to DB and remove (backup) renaming
1252
    write_table_to_postgis(
1253
        df_synthetic_buildings_without_amenities.rename(
1254
            columns={
1255
                "zensus_population_id": "cell_id",
1256
                "egon_building_id": "id",
1257
            }
1258
        ),
1259
        OsmBuildingsSynthetic,
1260
        drop=False,
1261
    )
1262
    log.info("Synthetic buildings exported to DB")
1263
1264
    # Concat all buildings
1265
    columns = [
1266
        "zensus_population_id",
1267
        "id",
1268
        "geom_building",
1269
        "n_amenities_inside",
1270
        "source",
1271
    ]
1272
1273
    df_buildings_with_amenities["source"] = "bwa"
1274
    df_synthetic_buildings_with_amenities["source"] = "sbwa"
1275
    df_buildings_without_amenities["source"] = "bwoa"
1276
    df_synthetic_buildings_without_amenities["source"] = "sbwoa"
1277
1278
    df_cts_buildings = pd.concat(
1279
        [
1280
            df_buildings_with_amenities[columns],
1281
            df_synthetic_buildings_with_amenities[columns],
1282
            df_buildings_without_amenities[columns],
1283
            df_synthetic_buildings_without_amenities[columns],
1284
        ],
1285
        axis=0,
1286
        ignore_index=True,
1287
    )
1288
    df_cts_buildings = remove_double_bus_id(df_cts_buildings)
1289
    log.info("Double bus_id checked")
1290
1291
    # TODO maybe remove after #772
1292
    df_cts_buildings["id"] = df_cts_buildings["id"].astype(int)
1293
1294
    # Write table to db for debugging
1295
    # TODO remove later
1296
    df_cts_buildings = gpd.GeoDataFrame(
1297
        df_cts_buildings, geometry="geom_building", crs=3035
1298
    )
1299
    df_cts_buildings = df_cts_buildings.reset_index().rename(
1300
        columns={"index": "serial"}
1301
    )
1302
    write_table_to_postgis(
1303
        df_cts_buildings,
1304
        CtsBuildings,
1305
        drop=True,
1306
    )
1307
    log.info("CTS buildings exported to DB!")
1308
1309
1310 View Code Duplication
def cts_electricity():
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated in your project.
Loading history...
1311
    """
1312
    Calculate cts electricity demand share of hvmv substation profile
1313
     for buildings.
1314
    """
1315
    log = start_logging(name="CTS-electricity")
1316
    log.info("Start logging!")
1317
    with db.session_scope() as session:
1318
        cells_query = session.query(CtsBuildings)
1319
1320
    df_cts_buildings = pd.read_sql(
1321
        cells_query.statement, cells_query.session.bind, index_col=None
1322
    )
1323
    log.info("CTS buildings from DB imported!")
1324
    df_demand_share_2035 = calc_building_demand_profile_share(
1325
        df_cts_buildings, scenario="eGon2035", sector="electricity"
1326
    )
1327
    log.info("Profile share for egon2035 calculated!")
1328
    df_demand_share_100RE = calc_building_demand_profile_share(
1329
        df_cts_buildings, scenario="eGon100RE", sector="electricity"
1330
    )
1331
    log.info("Profile share for egon100RE calculated!")
1332
    df_demand_share = pd.concat(
1333
        [df_demand_share_2035, df_demand_share_100RE],
1334
        axis=0,
1335
        ignore_index=True,
1336
    )
1337
1338
    write_table_to_postgres(
1339
        df_demand_share, EgonCtsElectricityDemandBuildingShare, drop=True
1340
    )
1341
    log.info("Profile share exported to DB!")
1342
1343
1344 View Code Duplication
def cts_heat():
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated in your project.
Loading history...
1345
    """
1346
    Calculate cts electricity demand share of hvmv substation profile
1347
     for buildings.
1348
    """
1349
    log = start_logging(name="CTS-heat")
1350
    log.info("Start logging!")
1351
    with db.session_scope() as session:
1352
        cells_query = session.query(CtsBuildings)
1353
1354
    df_cts_buildings = pd.read_sql(
1355
        cells_query.statement, cells_query.session.bind, index_col=None
1356
    )
1357
    log.info("CTS buildings from DB imported!")
1358
1359
    df_demand_share_2035 = calc_building_demand_profile_share(
1360
        df_cts_buildings, scenario="eGon2035", sector="heat"
1361
    )
1362
    log.info("Profile share for egon2035 calculated!")
1363
    df_demand_share_100RE = calc_building_demand_profile_share(
1364
        df_cts_buildings, scenario="eGon100RE", sector="heat"
1365
    )
1366
    log.info("Profile share for egon100RE calculated!")
1367
    df_demand_share = pd.concat(
1368
        [df_demand_share_2035, df_demand_share_100RE],
1369
        axis=0,
1370
        ignore_index=True,
1371
    )
1372
1373
    write_table_to_postgres(
1374
        df_demand_share, EgonCtsHeatDemandBuildingShare, drop=True
1375
    )
1376
    log.info("Profile share exported to DB!")
1377
1378
1379
def get_cts_electricity_peak_load():
1380
    """
1381
    Get electricity peak load of all CTS buildings for both scenarios and
1382
    store in DB.
1383
    """
1384
    log = start_logging(name="CTS-electricity-peak-load")
1385
    log.info("Start logging!")
1386
    # Delete rows with cts demand
1387
    with db.session_scope() as session:
1388
        session.query(BuildingPeakLoads).filter(
1389
            BuildingPeakLoads.sector == "cts"
1390
        ).delete()
1391
    log.info("CTS Peak load removed from DB!")
1392
1393
    for scenario in ["eGon2035", "eGon100RE"]:
1394
1395
        with db.session_scope() as session:
1396
            cells_query = session.query(
1397
                EgonCtsElectricityDemandBuildingShare
1398
            ).filter(
1399
                EgonCtsElectricityDemandBuildingShare.scenario == scenario
1400
            )
1401
1402
        df_demand_share = pd.read_sql(
1403
            cells_query.statement, cells_query.session.bind, index_col=None
1404
        )
1405
1406
        df_cts_profiles = calc_load_curves_cts(scenario=scenario)
1407
1408
        df_peak_load = pd.merge(
1409
            left=df_cts_profiles.max(axis=0).astype(float).rename("max"),
1410
            right=df_demand_share,
1411
            left_on="bus_id",
1412
            right_on="bus_id",
1413
        )
1414
1415
        # Convert unit from MWh to W
1416
        df_peak_load["max"] = df_peak_load["max"] * 1e6
1417
        df_peak_load["peak_load_in_w"] = (
1418
            df_peak_load["max"] * df_peak_load["profile_share"]
1419
        )
1420
        log.info(f"Peak load for {scenario} determined!")
1421
1422
        df_peak_load.rename(columns={"id": "building_id"}, inplace=True)
1423
        df_peak_load["sector"] = "cts"
1424
1425
        df_peak_load = df_peak_load[
1426
            ["building_id", "sector", "scenario", "peak_load_in_w"]
1427
        ]
1428
1429
        # Write peak loads into db
1430
        with db.session_scope() as session:
1431
            session.bulk_insert_mappings(
1432
                BuildingPeakLoads,
1433
                df_peak_load.to_dict(orient="records"),
1434
            )
1435
        log.info(f"Peak load for {scenario} exported to DB!")
1436
1437
1438
def get_cts_heat_peak_load():
1439
    """
1440
    Get heat peak load of all CTS buildings for both scenarios and store in DB.
1441
    """
1442
    log = start_logging(name="CTS-heat-peak-load")
1443
    log.info("Start logging!")
1444
1445
    BuildingHeatPeakLoads.__table__.create(bind=engine, checkfirst=True)
1446
1447
    # Delete rows with cts demand
1448
    with db.session_scope() as session:
1449
        session.query(BuildingHeatPeakLoads).filter(
1450
            BuildingHeatPeakLoads.sector == "cts"
1451
        ).delete()
1452
    log.info("CTS Peak load removed from DB!")
1453
1454
    for scenario in ["eGon2035", "eGon100RE"]:
1455
1456
        with db.session_scope() as session:
1457
            cells_query = session.query(EgonCtsHeatDemandBuildingShare).filter(
1458
                EgonCtsHeatDemandBuildingShare.scenario == scenario
1459
            )
1460
1461
        df_demand_share = pd.read_sql(
1462
            cells_query.statement, cells_query.session.bind, index_col=None
1463
        )
1464
1465
        with db.session_scope() as session:
1466
            cells_query = session.query(EgonEtragoElectricityCts).filter(
1467
                EgonEtragoElectricityCts.scn_name == scenario
1468
            )
1469
1470
        df_cts_profiles = pd.read_sql(
1471
            cells_query.statement,
1472
            cells_query.session.bind,
1473
        )
1474
        df_cts_profiles = pd.DataFrame.from_dict(
1475
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
1476
            orient="index",
1477
        )
1478
1479
        df_peak_load = pd.merge(
1480
            left=df_cts_profiles.max(axis=0).astype(float).rename("max"),
1481
            right=df_demand_share,
1482
            left_on="bus_id",
1483
            right_on="bus_id",
1484
        )
1485
1486
        # Convert unit from MWh to W
1487
        df_peak_load["max"] = df_peak_load["max"] * 1e6
1488
        df_peak_load["peak_load_in_w"] = (
1489
            df_peak_load["max"] * df_peak_load["profile_share"]
1490
        )
1491
        log.info(f"Peak load for {scenario} determined!")
1492
1493
        df_peak_load.rename(columns={"id": "building_id"}, inplace=True)
1494
        df_peak_load["sector"] = "cts"
1495
1496
        df_peak_load = df_peak_load[
1497
            ["building_id", "sector", "scenario", "peak_load_in_w"]
1498
        ]
1499
1500
        # Write peak loads into db
1501
        with db.session_scope() as session:
1502
            session.bulk_insert_mappings(
1503
                BuildingHeatPeakLoads,
1504
                df_peak_load.to_dict(orient="records"),
1505
            )
1506
        log.info(f"Peak load for {scenario} exported to DB!")
1507
1508
1509
class CtsElectricityBuildings(Dataset):
1510
    def __init__(self, dependencies):
1511
        super().__init__(
1512
            name="CtsElectricityBuildings",
1513
            version="0.0.0",
1514
            dependencies=dependencies,
1515
            tasks=(
1516
                cts_buildings,
1517
                {cts_electricity, cts_heat},
1518
                get_cts_electricity_peak_load,
1519
            ),
1520
        )
1521