Passed
Pull Request — dev (#905), created by unknown at 01:38

create_synthetic_buildings()   B

Complexity

Conditions 6

Size

Total Lines 66
Code Lines 29

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric                        Value
eloc (effective LOC)          29
dl                            0
loc (total lines)             66
rs                            8.2506
c                             0
b                             0
f                             0
cc (cyclomatic complexity)    6
nop (number of parameters)    3

How to fix: Long Method

Small methods make your code easier to understand, particularly when combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign that the commented part should be extracted into a new method, with the comment as a starting point for naming the new method.

Commonly applied refactorings include:

"""
CTS electricity and heat demand time series for scenarios in 2035 and 2050
assigned to OSM buildings.

Disaggregation of cts heat & electricity demand time series from MV substation
to census cells via annual demand and then to OSM buildings via
amenity tags, or randomly if no sufficient OSM data is available in the
respective census cell. If no OSM buildings or synthetic residential buildings
are available, new synthetic 5x5m buildings are generated.

The resulting data is stored in separate tables:

* `openstreetmap.osm_buildings_synthetic`:
    Lists generated synthetic buildings with id, zensus_population_id and
    building type. This table is already created within
    :func:`hh_buildings.map_houseprofiles_to_buildings()`
* `openstreetmap.egon_cts_buildings`:
    Table of all selected cts buildings with id, census cell id, geometry and
    amenity count in building. This table is created within
    :func:`cts_buildings()`
* `demand.egon_cts_electricity_demand_building_share`:
    Table including the mv substation electricity profile share of all selected
    cts buildings for scenarios eGon2035 and eGon100RE. This table is created
    within :func:`cts_electricity()`
* `demand.egon_cts_heat_demand_building_share`:
    Table including the mv substation heat profile share of all selected
    cts buildings for scenarios eGon2035 and eGon100RE. This table is created
    within :func:`cts_heat()`
* `demand.egon_building_peak_loads`:
    Mapping of demand time series and buildings including cell_id, building
    area and peak load. This table is already created within
    :func:`hh_buildings.get_building_peak_loads()`

**The following datasets from the database are mainly used for creation:**

* `openstreetmap.osm_buildings_filtered`:
    Table of OSM buildings filtered by tags to select residential and cts
    buildings only.
* `openstreetmap.osm_amenities_shops_filtered`:
    Table of OSM amenities filtered by tags to select cts only.
* `openstreetmap.osm_amenities_not_in_buildings_filtered`:
    Table of amenities which do not intersect with any building from
    `openstreetmap.osm_buildings_filtered`
* `openstreetmap.osm_buildings_synthetic`:
    Table of synthetic residential buildings
* `boundaries.egon_map_zensus_buildings_filtered_all`:
    Mapping table of census cells and buildings, filtered even if population
    in census cell = 0.
* `demand.egon_demandregio_zensus_electricity`:
    Table of annual electricity load demand for residential and cts at census
    cell level. Residential load demand is derived from aggregated residential
    building profiles. DemandRegio CTS load demand at NUTS3 is distributed to
    census cells linearly to heat demand from peta5.
* `demand.egon_peta_heat`:
    Table of annual heat load demand for residential and cts at census cell
    level from peta5.
* `demand.egon_etrago_electricity_cts`:
    Scaled cts electricity time series for every MV substation. Derived from
    DemandRegio SLP for selected economic sectors at nuts3. Scaled with annual
    demand from `demand.egon_demandregio_zensus_electricity`
* `demand.egon_etrago_heat_cts`:
    Scaled cts heat time series for every MV substation. Derived from
    DemandRegio SLP Gas for selected economic sectors at nuts3. Scaled with
    annual demand from `demand.egon_peta_heat`.

**What is the goal?**

To disaggregate cts heat and electricity time series from MV substation level
to geo-referenced buildings, the annual demand from DemandRegio and Peta5 is
used to identify census cells with load demand. We use OpenStreetMap data and
filter tags to identify buildings and count the amenities within. The number
of amenities and the annual demand serve to assign a demand share of the MV
substation profile to the building.

**What is the challenge?**

The OSM, DemandRegio and Peta5 datasets differ from each other. The OSM dataset
is a community-based dataset which is continuously extended and does not claim
to be complete. Therefore, not all census cells which have a demand assigned by
the DemandRegio or Peta5 methodology also have buildings with respective tags,
or sometimes even any building at all. Furthermore, the substation load areas
are determined dynamically in a previous dataset. When merging these datasets,
their different scopes (census cell shapes, building shapes) and their
inconsistencies need to be addressed. For example: not yet tagged buildings or
amenities in OSM, or building shapes exceeding census cells.

**How are these datasets combined?**

The methodology for heat and electricity is the same and differs only in the
annual demand and MV/HV substation profile. In a previous dataset
(openstreetmap), we filter all OSM buildings and amenities for tags we relate
to the cts sector. Amenities are mapped to intersecting buildings and then
intersected with the annual demand which exists at census cell level. We obtain
census cells with demand and amenities, and cells without amenities. If there
is no data on amenities, n synthetic ones are assigned to existing buildings;
we use the median value of amenities per census cell for n and consider all
filtered buildings plus synthetic residential buildings. If no building data is
available, a synthetic building is randomly generated. This also happens for
amenities which couldn't be assigned to any osm building. All census cells with
an annual demand are covered this way, and we obtain four different categories
of amenities:

* Buildings with amenities
* Synthetic buildings with amenities
* Buildings with synthetic amenities
* Synthetic buildings with synthetic amenities

The amenities are summed per census cell (of amenity) and building to derive
the building amenity share per census cell. Multiplied with the annual demand,
we receive the profile demand share for each cell. Some buildings exceed the
census cell shape and have amenities in different cells although mapped to one
building only. To have unique buildings, the demand share is summed once more
per building id. This factor can now be used to obtain the profile for each
building.

A schematic flow chart exists in the corresponding issue #671:
https://github.com/openego/eGon-data/issues/671#issuecomment-1260740258

**What are central assumptions during the data processing?**

* We assume OSM data to be the most reliable and complete open source dataset.
* We assume building and amenity tags to be truthful and accurate.
* Mapping census to OSM data is not trivial. Discrepancies are substituted.
* Missing OSM buildings are generated for each amenity.
* Missing amenities are generated by the median value of amenities per census
cell.

**Drawbacks and limitations of the data**

* The shape of the profiles for each building is similar within one substation
load area and only scaled differently.
* Load areas are generated dynamically. In case of buildings with amenities
exceeding borders, the number of amenities is transferred randomly to either
of the two load areas.
* The completeness of the OSM data depends on community contribution and is
crucial to the quality of our results.
* Randomly selected buildings and generated amenities may inadequately reflect
reality, but are chosen for the sake of simplicity as a measure to fill data
gaps.
* Since this dataset is a cascade after the generation of synthetic residential
buildings, also check the drawbacks and limitations in hh_buildings.py.
* Synthetic buildings may be placed within osm buildings which exceed multiple
census cells. This is currently accepted but may be solved in #953.
* Scattered high peak loads occur and might lead to single MV grid connections
in ding0. In some cases this might not be viable. Postprocessing is needed and
may be solved in #954.

Example Query
-----

Notes
-----

This module docstring is rather a dataset documentation. Once a decision
is made in ... the content of this module docstring needs to be moved to the
docs attribute of the respective dataset class.
"""
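The share arithmetic described above (building amenity share per census cell, multiplied with the annual demand of the cell, then summed once more per building id) can be sketched with a toy pandas example; all ids and numbers here are hypothetical:

```python
import pandas as pd

# Toy data: amenity counts per (census cell, building) and annual
# demand per census cell (hypothetical numbers for illustration).
amenities = pd.DataFrame(
    {
        "zensus_population_id": [1, 1, 2],
        "building_id": [10, 11, 11],
        "n_amenities_inside": [2, 2, 1],
    }
)
annual_demand = pd.Series({1: 100.0, 2: 50.0}, name="demand")

# Building amenity share per census cell ...
amenities["cell_share"] = amenities["n_amenities_inside"] / amenities.groupby(
    "zensus_population_id"
)["n_amenities_inside"].transform("sum")

# ... multiplied with the annual demand of the cell ...
amenities["demand_share"] = amenities["cell_share"] * amenities[
    "zensus_population_id"
].map(annual_demand)

# ... and summed once more per building id, because buildings can
# exceed the census cell shape and appear in several cells.
demand_share = amenities.groupby("building_id")["demand_share"].sum()
```

Building 11 appears in both cells and collects half of cell 1's demand plus all of cell 2's, so its total share exceeds that of building 10.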

from geoalchemy2 import Geometry
from geoalchemy2.shape import to_shape
from sqlalchemy import REAL, Column, Integer, String, func
from sqlalchemy.ext.declarative import declarative_base
import geopandas as gpd
import numpy as np
import pandas as pd
import saio

from egon.data import db
from egon.data import logger as log
from egon.data.datasets import Dataset
from egon.data.datasets.electricity_demand import (
    EgonDemandRegioZensusElectricity,
)
from egon.data.datasets.electricity_demand.temporal import (
    EgonEtragoElectricityCts,
)
from egon.data.datasets.electricity_demand_timeseries.hh_buildings import (
    BuildingElectricityPeakLoads,
    OsmBuildingsSynthetic,
)
from egon.data.datasets.electricity_demand_timeseries.tools import (
    random_ints_until_sum,
    random_point_in_square,
    specific_int_until_sum,
    write_table_to_postgis,
    write_table_to_postgres,
)
from egon.data.datasets.heat_demand import EgonPetaHeat
from egon.data.datasets.heat_demand_timeseries import EgonEtragoHeatCts
from egon.data.datasets.zensus_mv_grid_districts import MapZensusGridDistricts
from egon.data.datasets.zensus_vg250 import DestatisZensusPopulationPerHa

engine = db.engine()
Base = declarative_base()

# import db tables
saio.register_schema("openstreetmap", engine=engine)
saio.register_schema("boundaries", engine=engine)

class EgonCtsElectricityDemandBuildingShare(Base):
    __tablename__ = "egon_cts_electricity_demand_building_share"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    bus_id = Column(Integer, index=True)
    profile_share = Column(REAL)


class EgonCtsHeatDemandBuildingShare(Base):
    __tablename__ = "egon_cts_heat_demand_building_share"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    bus_id = Column(Integer, index=True)
    profile_share = Column(REAL)


class CtsBuildings(Base):
    __tablename__ = "egon_cts_buildings"
    __table_args__ = {"schema": "openstreetmap"}

    serial = Column(Integer, primary_key=True)
    id = Column(Integer, index=True)
    zensus_population_id = Column(Integer, index=True)
    geom_building = Column(Geometry("Polygon", 3035))
    n_amenities_inside = Column(Integer)
    source = Column(String)


class BuildingHeatPeakLoads(Base):
    __tablename__ = "egon_building_heat_peak_loads"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    sector = Column(String, primary_key=True)
    peak_load_in_w = Column(REAL)


class CtsDemandBuildings(Dataset):
    def __init__(self, dependencies):
        super().__init__(
            name="CtsDemandBuildings",
            version="0.0.0",
            dependencies=dependencies,
            tasks=(
                cts_buildings,
                {cts_electricity, cts_heat},
                {get_cts_electricity_peak_load, get_cts_heat_peak_load},
            ),
        )

def amenities_without_buildings():
    """
    Amenities which have no buildings assigned and are in a cell with cts
    demand are determined.

    Returns
    -------
    pd.DataFrame
        Table of amenities without buildings
    """
    from saio.openstreetmap import osm_amenities_not_in_buildings_filtered

    with db.session_scope() as session:
        cells_query = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
                osm_amenities_not_in_buildings_filtered.geom_amenity,
                osm_amenities_not_in_buildings_filtered.egon_amenity_id,
            )
            .filter(
                func.st_within(
                    osm_amenities_not_in_buildings_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .filter(
                DestatisZensusPopulationPerHa.id
                == EgonDemandRegioZensusElectricity.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
        )

    df_amenities_without_buildings = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom_amenity",
    )
    return df_amenities_without_buildings

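The `func.st_within(geom_amenity, geom)` filter in the query above corresponds to shapely's `within` predicate; a minimal in-memory sketch with made-up coordinates:

```python
from shapely.geometry import Point, box

# A square census cell (EPSG:3035 uses metres; coordinates made up)
cell = box(0, 0, 100, 100)

# ST_Within(geom_amenity, cell.geom) matches Point.within(cell)
inside = Point(50, 50).within(cell)    # amenity inside the cell
outside = Point(150, 50).within(cell)  # amenity outside the cell
```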
def place_buildings_with_amenities(df, amenities=None, max_amenities=None):
    """
    Building centroids are placed randomly within census cells.
    The number of buildings is derived from n_amenities_inside, the
    selected method and the number of amenities per building.

    Returns
    -------
    df: gpd.GeoDataFrame
        Table of building centroids
    """
    if isinstance(max_amenities, int):
        # amount of amenities is randomly generated within bounds
        # (max_amenities, amenities per cell)
        df["n_amenities_inside"] = df["n_amenities_inside"].apply(
            random_ints_until_sum, args=[max_amenities]
        )
    if isinstance(amenities, int):
        # specific amount of amenities per building
        df["n_amenities_inside"] = df["n_amenities_inside"].apply(
            specific_int_until_sum, args=[amenities]
        )

    # Unnest each building
    df = df.explode(column="n_amenities_inside")

    # building count per cell
    df["building_count"] = df.groupby(["zensus_population_id"]).cumcount() + 1

    # generate random synthetic buildings
    edge_length = 5
    # create random points within census cells
    points = random_point_in_square(geom=df["geom"], tol=edge_length / 2)

    df.reset_index(drop=True, inplace=True)
    # Store center of polygon
    df["geom_point"] = points
    # Drop geometry of census cell
    df = df.drop(columns=["geom"])

    return df

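The unnest step above (`explode` followed by `cumcount`) can be illustrated on a small frame with hypothetical amenity counts:

```python
import pandas as pd

# Each row is a census cell with a list of per-building amenity counts
df = pd.DataFrame(
    {
        "zensus_population_id": [1, 2],
        "n_amenities_inside": [[2, 1], [3]],
    }
)

# explode unnests one row per building ...
df = df.explode(column="n_amenities_inside")
# ... and cumcount numbers the buildings within each cell
df["building_count"] = df.groupby(["zensus_population_id"]).cumcount() + 1
```

Cell 1 ends up with two building rows (counts 1 and 2), cell 2 with one.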
def create_synthetic_buildings(df, points=None, crs="EPSG:3035"):
    """
    Synthetic buildings are generated around points.

    Parameters
    ----------
    df: pd.DataFrame
        Table of census cells
    points: gpd.GeoSeries or str
        Points to place buildings around, or the column name of df
    crs: str
        CRS of the result table

    Returns
    -------
    df: gpd.GeoDataFrame
        Synthetic buildings
    """

    if isinstance(points, str) and points in df.columns:
        points = df[points]
    elif isinstance(points, gpd.GeoSeries):
        pass
    else:
        raise ValueError("Points are of the wrong type")

    # Create building using a square around point
    edge_length = 5
    df["geom_building"] = points.buffer(distance=edge_length / 2, cap_style=3)

    if "geom_point" not in df.columns:
        df["geom_point"] = df["geom_building"].centroid

    df = gpd.GeoDataFrame(
        df,
        crs=crs,
        geometry="geom_building",
    )

    # TODO remove after #772 implementation of egon_building_id
    df.rename(columns={"id": "egon_building_id"}, inplace=True)

    # get max number of building ids from synthetic residential table
    with db.session_scope() as session:
        max_synth_residential_id = session.execute(
            func.max(OsmBuildingsSynthetic.id)
        ).scalar()
    max_synth_residential_id = int(max_synth_residential_id)

    # create sequential ids
    df["egon_building_id"] = range(
        max_synth_residential_id + 1,
        max_synth_residential_id + df.shape[0] + 1,
    )

    df["area"] = df["geom_building"].area
    # set building type of synthetic building
    df["building"] = "cts"
    # TODO remove after #772
    df = df.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id",
        }
    )
    return df

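The 5x5m footprint above comes from buffering each point with half the edge length and a square cap (`cap_style=3`). `GeoSeries.buffer` forwards to shapely, so the effect can be checked on a single point:

```python
from shapely.geometry import Point

edge_length = 5
# A square cap (cap_style=3) turns the buffer into an axis-aligned
# square with side length 2 * distance, i.e. edge_length here.
building = Point(10.0, 10.0).buffer(distance=edge_length / 2, cap_style=3)
```

The resulting polygon is a 5m x 5m square centred on the input point, hence an area of 25 m².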
def buildings_with_amenities():
    """
    Amenities which are assigned to buildings are determined
    and grouped per building and zensus cell. Buildings
    covering multiple cells therefore exist multiple times
    but in different zensus cells. This is necessary to cover
    all cells with a cts demand. If buildings exist in multiple
    substations, their amenities are summed and kept in
    one substation only. If, as a result, a census cell is uncovered,
    a synthetic amenity is placed. The buildings are aggregated
    afterwards during the calculation of the profile_share.

    Returns
    -------
    df_buildings_with_amenities: gpd.GeoDataFrame
        Contains all buildings with amenities per zensus cell.
    df_lost_cells: gpd.GeoDataFrame
        Contains synthetic amenities in lost cells. Might be empty
    """

    from saio.openstreetmap import osm_amenities_in_buildings_filtered

    with db.session_scope() as session:
        cells_query = (
            session.query(
                osm_amenities_in_buildings_filtered,
                MapZensusGridDistricts.bus_id,
            )
            .filter(
                MapZensusGridDistricts.zensus_population_id
                == osm_amenities_in_buildings_filtered.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id
                == osm_amenities_in_buildings_filtered.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
        )
    df_amenities_in_buildings = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col=None
    )

    df_amenities_in_buildings["geom_building"] = df_amenities_in_buildings[
        "geom_building"
    ].apply(to_shape)
    df_amenities_in_buildings["geom_amenity"] = df_amenities_in_buildings[
        "geom_amenity"
    ].apply(to_shape)

    df_amenities_in_buildings["n_amenities_inside"] = 1

    # add identifier column for buildings in multiple substations
    df_amenities_in_buildings[
        "duplicate_identifier"
    ] = df_amenities_in_buildings.groupby(["id", "bus_id"])[
        "n_amenities_inside"
    ].transform(
        "cumsum"
    )
    df_amenities_in_buildings = df_amenities_in_buildings.sort_values(
        ["id", "duplicate_identifier"]
    )
    # sum amenities of buildings with multiple substations
    df_amenities_in_buildings[
        "n_amenities_inside"
    ] = df_amenities_in_buildings.groupby(["id", "duplicate_identifier"])[
        "n_amenities_inside"
    ].transform(
        "sum"
    )

    # create column to always go for bus_id with max amenities
    df_amenities_in_buildings[
        "max_amenities"
    ] = df_amenities_in_buildings.groupby(["id", "bus_id"])[
        "n_amenities_inside"
    ].transform(
        "sum"
    )
    # sort to keep the bus_id with the maximum amenities first
    df_amenities_in_buildings.sort_values(
        ["id", "max_amenities"], ascending=False, inplace=True
    )

    # identify lost zensus cells
    df_lost_cells = df_amenities_in_buildings.loc[
        df_amenities_in_buildings.duplicated(
            subset=["id", "duplicate_identifier"], keep="first"
        )
    ]
    df_lost_cells.drop_duplicates(
        subset=["zensus_population_id"], inplace=True
    )

    # drop buildings with multiple substations, keeping the one
    # with the maximum amenities
    df_amenities_in_buildings.drop_duplicates(
        subset=["id", "duplicate_identifier"], keep="first", inplace=True
    )

    # check if lost zensus cells are already covered
    if not df_lost_cells.empty:
        if not (
            df_lost_cells["zensus_population_id"]
            .isin(df_amenities_in_buildings["zensus_population_id"])
            .all()
        ):
            # query geom data for cells which are not covered yet
            with db.session_scope() as session:
                cells_query = session.query(
                    DestatisZensusPopulationPerHa.id,
                    DestatisZensusPopulationPerHa.geom,
                ).filter(
                    DestatisZensusPopulationPerHa.id.in_(
                        df_lost_cells["zensus_population_id"]
                    )
                )

            df_lost_cells = gpd.read_postgis(
                cells_query.statement,
                cells_query.session.bind,
                geom_col="geom",
            )

            # place random amenity in cell
            df_lost_cells["n_amenities_inside"] = 1
            df_lost_cells.rename(
                columns={
                    "id": "zensus_population_id",
                },
                inplace=True,
            )
            df_lost_cells = place_buildings_with_amenities(
                df_lost_cells, amenities=1
            )
            df_lost_cells.rename(
                columns={
                    # "id": "zensus_population_id",
                    "geom_point": "geom_amenity",
                },
                inplace=True,
            )
            df_lost_cells.drop(
                columns=["building_count", "n_amenities_inside"], inplace=True
            )
        else:
            df_lost_cells = None
    else:
        df_lost_cells = None

    # drop helper columns
    df_amenities_in_buildings.drop(
        columns=["duplicate_identifier"], inplace=True
    )

    # sum amenities per building and cell
    df_amenities_in_buildings[
        "n_amenities_inside"
    ] = df_amenities_in_buildings.groupby(["zensus_population_id", "id"])[
        "n_amenities_inside"
    ].transform(
        "sum"
    )
    # drop duplicated buildings
    df_buildings_with_amenities = df_amenities_in_buildings.drop_duplicates(
        ["id", "zensus_population_id"]
    )
    df_buildings_with_amenities.reset_index(inplace=True, drop=True)

    df_buildings_with_amenities = df_buildings_with_amenities[
        ["id", "zensus_population_id", "geom_building", "n_amenities_inside"]
    ]
    df_buildings_with_amenities.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id"
        },
        inplace=True,
    )

    return df_buildings_with_amenities, df_lost_cells

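The final aggregation above (sum `n_amenities_inside` per cell and building, then keep one row per pair) reduces one-row-per-amenity data like this (hypothetical ids):

```python
import pandas as pd

# One row per amenity, as returned by the database query
df = pd.DataFrame(
    {
        "zensus_population_id": [1, 1, 1, 2],
        "id": [10, 10, 11, 10],
        "n_amenities_inside": [1, 1, 1, 1],
    }
)

# sum amenities per building and cell ...
df["n_amenities_inside"] = df.groupby(["zensus_population_id", "id"])[
    "n_amenities_inside"
].transform("sum")
# ... and drop the duplicated building/cell rows
df = df.drop_duplicates(["id", "zensus_population_id"]).reset_index(drop=True)
```

Building 10 keeps one row per cell it appears in, with the amenity count of that cell.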
def buildings_without_amenities():
    """
    Buildings (filtered and synthetic) in cells with
    cts demand but no amenities are determined.

    Returns
    -------
    df_buildings_without_amenities: gpd.GeoDataFrame
        Table of buildings without amenities in zensus cells
        with cts demand.
    """
    from saio.boundaries import egon_map_zensus_buildings_filtered_all
    from saio.openstreetmap import (
        osm_amenities_shops_filtered,
        osm_buildings_filtered,
        osm_buildings_synthetic,
    )

    # buildings_filtered in cts-demand-cells without amenities
    with db.session_scope() as session:

        # Synthetic buildings
        q_synth_buildings = session.query(
            osm_buildings_synthetic.cell_id.cast(Integer).label(
                "zensus_population_id"
            ),
            osm_buildings_synthetic.id.cast(Integer).label("id"),
            osm_buildings_synthetic.area.label("area"),
            osm_buildings_synthetic.geom_building.label("geom_building"),
            osm_buildings_synthetic.geom_point.label("geom_point"),
        )

        # Filtered buildings
        q_buildings_filtered = session.query(
            egon_map_zensus_buildings_filtered_all.zensus_population_id,
            osm_buildings_filtered.id,
            osm_buildings_filtered.area,
            osm_buildings_filtered.geom_building,
            osm_buildings_filtered.geom_point,
        ).filter(
            osm_buildings_filtered.id
            == egon_map_zensus_buildings_filtered_all.id
        )

        # Amenities + zensus_population_id
        q_amenities = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
            )
            .filter(
                func.st_within(
                    osm_amenities_shops_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .distinct(DestatisZensusPopulationPerHa.id)
        )

        # Cells with CTS demand but without amenities
        q_cts_without_amenities = (
            session.query(
                EgonDemandRegioZensusElectricity.zensus_population_id,
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id.notin_(
                    q_amenities
                )
            )
            .distinct()
        )

        # Filtered buildings + synthetic residential buildings in
        # cells with CTS demand but without amenities
        cells_query = q_synth_buildings.union(q_buildings_filtered).filter(
            egon_map_zensus_buildings_filtered_all.zensus_population_id.in_(
                q_cts_without_amenities
            )
        )

    df_buildings_without_amenities = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom_building",
    )

    df_buildings_without_amenities = df_buildings_without_amenities.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id",
        }
    )

    return df_buildings_without_amenities

def select_cts_buildings(df_buildings_wo_amenities, max_n):
    """
    N buildings (filtered and synthetic) in each cell with
    cts demand are selected. Only the first max_n buildings,
    sorted by descending surface area, are taken for each cell.

    Parameters
    ----------
    df_buildings_wo_amenities: gpd.GeoDataFrame
        Table of buildings without amenities
    max_n: int
        Maximum number of buildings to select per cell

    Returns
    -------
    df_buildings_with_cts_demand: gpd.GeoDataFrame
        Table of buildings
    """

    df_buildings_wo_amenities.sort_values(
        "area", ascending=False, inplace=True
    )
    # select the first max_n ids of each census cell if available
    df_buildings_with_cts_demand = (
        df_buildings_wo_amenities.groupby("zensus_population_id")
        .nth(list(range(max_n)))
        .reset_index()
    )
    df_buildings_with_cts_demand.reset_index(drop=True, inplace=True)

    return df_buildings_with_cts_demand

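The selection above relies on `GroupBy.nth` accepting a list of positions: after sorting by area, at most `max_n` (largest) buildings survive per cell, and cells with fewer buildings keep what they have. A sketch with made-up areas:

```python
import pandas as pd

max_n = 2
df = pd.DataFrame(
    {
        "zensus_population_id": [1, 1, 1, 2],
        "id": [10, 11, 12, 13],
        "area": [50.0, 80.0, 20.0, 30.0],
    }
).sort_values("area", ascending=False)

# nth(range(max_n)) keeps at most the first two rows of each group,
# i.e. the two largest buildings per census cell
selected = df.groupby("zensus_population_id").nth(list(range(max_n)))
```

Cell 1 keeps buildings 11 and 10 (areas 80 and 50); building 12 is dropped; cell 2 keeps its only building 13.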
def cells_with_cts_demand_only(df_buildings_without_amenities):
728
    """
729
    Cells with cts demand but no amenities or buildilngs
730
    are determined.
731
732
    Returns
733
    -------
734
    df_cells_only_cts_demand: gpd.GeoDataFrame
735
        Table of cells with cts demand but no amenities or buildings
    """
    from saio.openstreetmap import osm_amenities_shops_filtered

    # cells with amenities
    with db.session_scope() as session:
        sub_query = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
            )
            .filter(
                func.st_within(
                    osm_amenities_shops_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .distinct(DestatisZensusPopulationPerHa.id)
        )

        cells_query = (
            session.query(
                EgonDemandRegioZensusElectricity.zensus_population_id,
                EgonDemandRegioZensusElectricity.scenario,
                EgonDemandRegioZensusElectricity.sector,
                EgonDemandRegioZensusElectricity.demand,
                DestatisZensusPopulationPerHa.geom,
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id.notin_(
                    sub_query
                )
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id
                == DestatisZensusPopulationPerHa.id
            )
        )

    df_cts_cell_without_amenities = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom",
        index_col=None,
    )

    # TODO remove after #722
    df_buildings_without_amenities = df_buildings_without_amenities.rename(
        columns={"cell_id": "zensus_population_id"}
    )

    # Census cells with only cts demand
    df_cells_only_cts_demand = df_cts_cell_without_amenities.loc[
        ~df_cts_cell_without_amenities["zensus_population_id"].isin(
            df_buildings_without_amenities["zensus_population_id"].unique()
        )
    ]

    df_cells_only_cts_demand.reset_index(drop=True, inplace=True)

    return df_cells_only_cts_demand


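# A minimal sketch of the anti-join used above, with hypothetical cell ids:
# cells that appear in the demand table but not among the cells that already
# have buildings assigned are kept.

```python
import pandas as pd

# hypothetical demand cells and cells that already have buildings assigned
df_demand_cells = pd.DataFrame({"zensus_population_id": [1, 2, 3, 4]})
df_building_cells = pd.DataFrame({"zensus_population_id": [2, 4]})

# keep only the cells with demand but no buildings
df_only_demand = df_demand_cells.loc[
    ~df_demand_cells["zensus_population_id"].isin(
        df_building_cells["zensus_population_id"].unique()
    )
].reset_index(drop=True)
# cells 1 and 3 remain
```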
def calc_census_cell_share(scenario, sector):
    """
    The profile share for each census cell is calculated by its share of
    annual demand per substation bus. The annual demand per cell is defined
    by DemandRegio/Peta5. The share is identical for both scenarios, as the
    annual demand is scaled linearly.

    Parameters
    ----------
    scenario: str
        Scenario for which the share is calculated: "eGon2035" or "eGon100RE"
    sector: str
        Sector for which the share is calculated: "electricity" or "heat"

    Returns
    -------
    df_census_share: pd.DataFrame
    """
    if sector == "electricity":
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonDemandRegioZensusElectricity,
                    MapZensusGridDistricts.bus_id,
                )
                .filter(EgonDemandRegioZensusElectricity.sector == "service")
                .filter(EgonDemandRegioZensusElectricity.scenario == scenario)
                .filter(
                    EgonDemandRegioZensusElectricity.zensus_population_id
                    == MapZensusGridDistricts.zensus_population_id
                )
            )

    elif sector == "heat":
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonPetaHeat, MapZensusGridDistricts.bus_id)
                .filter(EgonPetaHeat.sector == "service")
                .filter(EgonPetaHeat.scenario == scenario)
                .filter(
                    EgonPetaHeat.zensus_population_id
                    == MapZensusGridDistricts.zensus_population_id
                )
            )

    else:
        # guard against cells_query being undefined further down
        raise KeyError("Sector needs to be either 'electricity' or 'heat'")

    df_demand = pd.read_sql(
        cells_query.statement,
        cells_query.session.bind,
        index_col="zensus_population_id",
    )

    # get demand share of cell per bus
    df_census_share = df_demand["demand"] / df_demand.groupby("bus_id")[
        "demand"
    ].transform("sum")
    df_census_share = df_census_share.rename("cell_share")

    df_census_share = pd.concat(
        [
            df_census_share,
            df_demand[["bus_id", "scenario"]],
        ],
        axis=1,
    )

    df_census_share.reset_index(inplace=True)
    return df_census_share


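# The cell-share computation above can be sketched with hypothetical demand
# values: each cell's annual demand is divided by the summed demand of all
# cells mapped to the same substation bus.

```python
import pandas as pd

# hypothetical annual demand per census cell, mapped to two substation buses
df_demand = pd.DataFrame(
    {
        "zensus_population_id": [1, 2, 3, 4],
        "bus_id": [10, 10, 10, 20],
        "demand": [2.0, 3.0, 5.0, 4.0],
    }
).set_index("zensus_population_id")

# share of each cell in its bus total
cell_share = df_demand["demand"] / df_demand.groupby("bus_id")[
    "demand"
].transform("sum")
# cell 1 -> 0.2, cell 2 -> 0.3, cell 3 -> 0.5, cell 4 -> 1.0
```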
def calc_building_demand_profile_share(
    df_cts_buildings, scenario="eGon2035", sector="electricity"
):
    """
    The share of the cts demand profile per bus is calculated for every
    selected building. The building-amenity share is multiplied with the
    census cell share to get the substation bus profile share for each
    building. The share is grouped and aggregated per building, as some
    buildings extend beyond the shape of a single census cell and have
    amenities assigned from multiple cells. Such buildings therefore
    aggregate the amenity shares of all their census cells.

    Parameters
    ----------
    df_cts_buildings: gpd.GeoDataFrame
        Table of all buildings with cts demand assigned
    scenario: str
        Scenario for which the share is calculated.
    sector: str
        Sector for which the share is calculated.

    Returns
    -------
    df_building_share: pd.DataFrame
        Table of bus profile share per building

    """

    def calc_building_amenity_share(df_cts_buildings):
        """
        Calculate the building share by the number of amenities per building
        within a census cell. Building ids can occur multiple times, but
        with different zensus_population_ids.
        """
        df_building_amenity_share = df_cts_buildings[
            "n_amenities_inside"
        ] / df_cts_buildings.groupby("zensus_population_id")[
            "n_amenities_inside"
        ].transform(
            "sum"
        )
        df_building_amenity_share = pd.concat(
            [
                df_building_amenity_share.rename("building_amenity_share"),
                df_cts_buildings[["zensus_population_id", "id"]],
            ],
            axis=1,
        )
        return df_building_amenity_share

    df_building_amenity_share = calc_building_amenity_share(df_cts_buildings)

    df_census_cell_share = calc_census_cell_share(
        scenario=scenario, sector=sector
    )

    df_demand_share = pd.merge(
        left=df_building_amenity_share,
        right=df_census_cell_share,
        on="zensus_population_id",
    )
    df_demand_share["profile_share"] = df_demand_share[
        "building_amenity_share"
    ].multiply(df_demand_share["cell_share"])

    # only pass selected columns
    df_demand_share = df_demand_share[
        ["id", "bus_id", "scenario", "profile_share"]
    ]

    # Group and aggregate per building for multi-cell buildings
    df_demand_share = (
        df_demand_share.groupby(["scenario", "id", "bus_id"])
        .sum()
        .reset_index()
    )
    if df_demand_share.duplicated("id", keep=False).any():
        print(
            df_demand_share.loc[df_demand_share.duplicated("id", keep=False)]
        )
    return df_demand_share


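# The two-level share composition above can be sketched with hypothetical
# inputs: two buildings in one census cell with 1 and 3 amenities, where the
# cell holds 40 % of its substation's annual demand.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id": [100, 101],
        "zensus_population_id": [1, 1],
        "n_amenities_inside": [1, 3],
        "cell_share": [0.4, 0.4],
    }
)

# share of each building within its census cell, by amenity count
df["building_amenity_share"] = df["n_amenities_inside"] / df.groupby(
    "zensus_population_id"
)["n_amenities_inside"].transform("sum")

# share of each building in the substation bus profile
df["profile_share"] = df["building_amenity_share"] * df["cell_share"]
# building 100 -> 0.25 * 0.4 = 0.1, building 101 -> 0.75 * 0.4 = 0.3
```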
def calc_cts_building_profiles(
    bus_ids,
    scenario,
    sector,
):
    """
    Calculate the cts demand profile for each building. The profile is
    calculated from the demand share of the building per substation bus.

    Parameters
    ----------
    bus_ids: list of int
        Ids of the substations for which the selected building profiles are
        calculated.
    scenario: str
        Scenario for which the share is calculated: "eGon2035" or "eGon100RE"
    sector: str
        Sector for which the share is calculated: "electricity" or "heat"

    Returns
    -------
    df_building_profiles: pd.DataFrame
        Table of demand profiles per building. Column names are building IDs
        and the index is the hour of the year as int (0-8759).

    """
    if sector == "electricity":
        # Get cts building electricity demand share of selected buildings
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonCtsElectricityDemandBuildingShare,
                )
                .filter(
                    EgonCtsElectricityDemandBuildingShare.scenario == scenario
                )
                .filter(
                    EgonCtsElectricityDemandBuildingShare.bus_id.in_(bus_ids)
                )
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        # Get substation cts electricity load profiles of selected bus_ids
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonEtragoElectricityCts).filter(
                    EgonEtragoElectricityCts.scn_name == scenario
                )
            ).filter(EgonEtragoElectricityCts.bus_id.in_(bus_ids))

        df_cts_substation_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_substation_profiles = pd.DataFrame.from_dict(
            df_cts_substation_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="index",
        )

    elif sector == "heat":
        # Get cts building heat demand share of selected buildings
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonCtsHeatDemandBuildingShare,
                )
                .filter(EgonCtsHeatDemandBuildingShare.scenario == scenario)
                .filter(EgonCtsHeatDemandBuildingShare.bus_id.in_(bus_ids))
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        # Get substation cts heat load profiles of selected bus_ids
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonEtragoHeatCts).filter(
                    EgonEtragoHeatCts.scn_name == scenario
                )
            ).filter(EgonEtragoHeatCts.bus_id.in_(bus_ids))

        df_cts_substation_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_substation_profiles = pd.DataFrame.from_dict(
            df_cts_substation_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="index",
        )

    else:
        raise KeyError("Sector needs to be either 'electricity' or 'heat'")

    # TODO remove after #722
    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    # get demand profile for all buildings for selected demand share
    df_building_profiles = pd.DataFrame()
    for bus_id, df in df_demand_share.groupby("bus_id"):
        shares = df.set_index("building_id", drop=True)["profile_share"]
        try:
            profile_ts = df_cts_substation_profiles.loc[bus_id]
        except KeyError:
            # This should only happen within the SH cutout
            log.info(
                f"No CTS profile found for substation with bus_id: {bus_id}"
            )
            continue

        building_profiles = np.outer(profile_ts, shares)
        building_profiles = pd.DataFrame(
            building_profiles, index=profile_ts.index, columns=shares.index
        )
        df_building_profiles = pd.concat(
            [df_building_profiles, building_profiles], axis=1
        )

    return df_building_profiles


def delete_synthetic_cts_buildings():
    """
    All synthetic cts buildings are deleted from the DB. This is necessary
    if the task is run multiple times, as the existing synthetic buildings
    would influence the results.
    """
    # import db tables
    from saio.openstreetmap import osm_buildings_synthetic

    # delete synthetic cts buildings
    with db.session_scope() as session:
        session.query(osm_buildings_synthetic).filter(
            osm_buildings_synthetic.building == "cts"
        ).delete()


def remove_double_bus_id(df_cts_buildings):
    """This is a backup ad-hoc fix in case a building is still assigned to
    two substations. In this case one of the building's entries is simply
    dropped. As this currently affects only one building with one amenity,
    the deviation is negligible."""
    # assign bus_id via census cell of amenity
    with db.session_scope() as session:
        cells_query = session.query(
            MapZensusGridDistricts.zensus_population_id,
            MapZensusGridDistricts.bus_id,
        )

    df_egon_map_zensus_buildings_buses = pd.read_sql(
        cells_query.statement,
        cells_query.session.bind,
        index_col=None,
    )
    df_cts_buildings = pd.merge(
        left=df_cts_buildings,
        right=df_egon_map_zensus_buildings_buses,
        on="zensus_population_id",
    )

    substation_per_building = df_cts_buildings.groupby("id")[
        "bus_id"
    ].nunique()
    building_id = substation_per_building.loc[
        substation_per_building > 1
    ].index
    df_duplicates = df_cts_buildings.loc[
        df_cts_buildings["id"].isin(building_id)
    ]
    for unique_id in df_duplicates["id"].unique():
        drop_index = df_duplicates[df_duplicates["id"] == unique_id].index[0]
        print(
            f"Building {df_cts_buildings.loc[drop_index, 'id']}"
            f" dropped because of double substation assignment"
        )
        df_cts_buildings.drop(index=drop_index, inplace=True)

    df_cts_buildings.drop(columns="bus_id", inplace=True)

    return df_cts_buildings


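# The duplicate detection above can be sketched with a hypothetical mapping
# in which building 7 touches census cells in two different substation
# districts: counting distinct bus_ids per building id flags it.

```python
import pandas as pd

df = pd.DataFrame({"id": [7, 7, 8], "bus_id": [10, 20, 10]})

# number of distinct substations per building
substations_per_building = df.groupby("id")["bus_id"].nunique()
double_ids = substations_per_building.loc[substations_per_building > 1].index
# building 7 is flagged; one of its rows would be dropped
```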
def cts_buildings():
    """
    Assigns CTS demand to buildings and calculates the respective demand
    profiles. The demand profile per substation is disaggregated by the
    annual demand share of each census cell and by the number of amenities
    per building within the cell. If no building data is available,
    synthetic buildings are generated around the amenities. If no amenities
    but cts demand is available, buildings are randomly selected. If
    neither building nor amenity is available, random synthetic buildings
    are generated. The demand share is stored in the database.

    Note
    ----
    Cells with CTS demand, amenities and buildings do not change between
    the scenarios, only the demand itself. Therefore scenario eGon2035
    can be used universally to determine the cts buildings, but not the
    demand share.
    """

    log.info("Start logging!")
    # Buildings with amenities
    df_buildings_with_amenities, df_lost_cells = buildings_with_amenities()
    log.info("Buildings with amenities selected!")

    # Median number of amenities per cell
    median_n_amenities = int(
        df_buildings_with_amenities.groupby("zensus_population_id")[
            "n_amenities_inside"
        ]
        .sum()
        .median()
    )
    log.info(f"Median amenity value: {median_n_amenities}")

    # Remove synthetic CTS buildings if existing
    delete_synthetic_cts_buildings()
    log.info("Old synthetic cts buildings deleted!")

    # Amenities not assigned to buildings
    df_amenities_without_buildings = amenities_without_buildings()
    log.info("Amenities without buildings selected!")

    # Append lost cells due to duplicated ids, to cover all demand cells
    if not df_lost_cells.empty:

        # Number of synth amenities per cell
        df_lost_cells["amenities"] = median_n_amenities
        # create row for every amenity
        df_lost_cells["amenities"] = (
            df_lost_cells["amenities"].astype(int).apply(range)
        )
        df_lost_cells = df_lost_cells.explode("amenities")
        df_lost_cells.drop(columns="amenities", inplace=True)
        df_amenities_without_buildings = pd.concat(
            [df_amenities_without_buildings, df_lost_cells],
            ignore_index=True,
        )
        log.info("Lost cells due to substation intersection appended!")

    # One building per amenity
    df_amenities_without_buildings["n_amenities_inside"] = 1
    # Create synthetic buildings for amenities without buildings
    df_synthetic_buildings_with_amenities = create_synthetic_buildings(
        df_amenities_without_buildings, points="geom_amenity"
    )
    log.info("Synthetic buildings created!")

    # TODO remove renaming after #722
    write_table_to_postgis(
        df_synthetic_buildings_with_amenities.rename(
            columns={
                "zensus_population_id": "cell_id",
                "egon_building_id": "id",
            }
        ),
        OsmBuildingsSynthetic,
        engine=engine,
        drop=False,
    )
    log.info("Synthetic buildings exported to DB!")

    # Cells without amenities but with CTS demand and buildings
    df_buildings_without_amenities = buildings_without_amenities()
    log.info("Buildings without amenities in demand cells identified!")

    # Backup bugfix for duplicated buildings which occur in the SQL query:
    # drop building ids which have already been used
    mask = df_buildings_without_amenities.loc[
        df_buildings_without_amenities["id"].isin(
            df_buildings_with_amenities["id"]
        )
    ].index
    df_buildings_without_amenities = df_buildings_without_amenities.drop(
        index=mask
    ).reset_index(drop=True)
    log.info(f"{len(mask)} duplicated ids removed!")

    # select median n buildings per cell
    df_buildings_without_amenities = select_cts_buildings(
        df_buildings_without_amenities, max_n=median_n_amenities
    )
    df_buildings_without_amenities["n_amenities_inside"] = 1
    log.info(f"{median_n_amenities} buildings per cell selected!")

    # Create synthetic amenities and buildings in cells with only CTS demand
    df_cells_with_cts_demand_only = cells_with_cts_demand_only(
        df_buildings_without_amenities
    )
    log.info("Cells with only demand identified!")

    # TODO implement overlay prevention #953 here
    # Median n amenities per cell
    df_cells_with_cts_demand_only["amenities"] = median_n_amenities
    # create row for every amenity
    df_cells_with_cts_demand_only["amenities"] = (
        df_cells_with_cts_demand_only["amenities"].astype(int).apply(range)
    )
    df_cells_with_cts_demand_only = df_cells_with_cts_demand_only.explode(
        "amenities"
    )
    df_cells_with_cts_demand_only.drop(columns="amenities", inplace=True)

    # Only 1 amenity per building
    df_cells_with_cts_demand_only["n_amenities_inside"] = 1
    df_cells_with_cts_demand_only = place_buildings_with_amenities(
        df_cells_with_cts_demand_only, amenities=1
    )
    df_synthetic_buildings_without_amenities = create_synthetic_buildings(
        df_cells_with_cts_demand_only, points="geom_point"
    )
    log.info(f"{median_n_amenities} synthetic buildings per cell created!")

    # TODO remove renaming after #722
    write_table_to_postgis(
        df_synthetic_buildings_without_amenities.rename(
            columns={
                "zensus_population_id": "cell_id",
                "egon_building_id": "id",
            }
        ),
        OsmBuildingsSynthetic,
        engine=engine,
        drop=False,
    )
    log.info("Synthetic buildings exported to DB!")

    # Concat all buildings
    columns = [
        "zensus_population_id",
        "id",
        "geom_building",
        "n_amenities_inside",
        "source",
    ]

    df_buildings_with_amenities["source"] = "bwa"
    df_synthetic_buildings_with_amenities["source"] = "sbwa"
    df_buildings_without_amenities["source"] = "bwoa"
    df_synthetic_buildings_without_amenities["source"] = "sbwoa"

    df_cts_buildings = pd.concat(
        [
            df_buildings_with_amenities[columns],
            df_synthetic_buildings_with_amenities[columns],
            df_buildings_without_amenities[columns],
            df_synthetic_buildings_without_amenities[columns],
        ],
        axis=0,
        ignore_index=True,
    )
    df_cts_buildings = remove_double_bus_id(df_cts_buildings)
    log.info("Double bus_id checked!")

    # TODO remove dtype correction after #722
    df_cts_buildings["id"] = df_cts_buildings["id"].astype(int)

    df_cts_buildings = gpd.GeoDataFrame(
        df_cts_buildings, geometry="geom_building", crs=3035
    )
    df_cts_buildings = df_cts_buildings.reset_index().rename(
        columns={"index": "serial"}
    )
    # Write table to db for debugging and postprocessing
    write_table_to_postgis(
        df_cts_buildings,
        CtsBuildings,
        engine=engine,
        drop=True,
    )
    log.info("CTS buildings exported to DB!")

def cts_electricity():
    """
    Calculate the cts electricity demand share of the hvmv substation
    profile for buildings.
    """
    log.info("Start logging!")
    with db.session_scope() as session:
        cells_query = session.query(CtsBuildings)

    df_cts_buildings = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col=None
    )
    log.info("CTS buildings from DB imported!")
    df_demand_share_2035 = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon2035", sector="electricity"
    )
    log.info("Profile share for eGon2035 calculated!")

    df_demand_share_100RE = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon100RE", sector="electricity"
    )
    log.info("Profile share for eGon100RE calculated!")

    df_demand_share = pd.concat(
        [df_demand_share_2035, df_demand_share_100RE],
        axis=0,
        ignore_index=True,
    )
    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    write_table_to_postgres(
        df_demand_share,
        EgonCtsElectricityDemandBuildingShare,
        engine=engine,
        drop=True,
    )
    log.info("Profile share exported to DB!")

def cts_heat():
    """
    Calculate the cts heat demand share of the hvmv substation profile
    for buildings.
    """
    log.info("Start logging!")
    with db.session_scope() as session:
        cells_query = session.query(CtsBuildings)

    df_cts_buildings = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col=None
    )
    log.info("CTS buildings from DB imported!")

    df_demand_share_2035 = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon2035", sector="heat"
    )
    log.info("Profile share for eGon2035 calculated!")
    df_demand_share_100RE = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon100RE", sector="heat"
    )
    log.info("Profile share for eGon100RE calculated!")
    df_demand_share = pd.concat(
        [df_demand_share_2035, df_demand_share_100RE],
        axis=0,
        ignore_index=True,
    )

    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    write_table_to_postgres(
        df_demand_share,
        EgonCtsHeatDemandBuildingShare,
        engine=engine,
        drop=True,
    )
    log.info("Profile share exported to DB!")

def get_cts_electricity_peak_load():
    """
    Get the electricity peak load of all CTS buildings for both scenarios
    and store it in the DB.
    """
    log.info("Start logging!")

    BuildingElectricityPeakLoads.__table__.create(bind=engine, checkfirst=True)

    # Delete rows with cts demand
    with db.session_scope() as session:
        session.query(BuildingElectricityPeakLoads).filter(
            BuildingElectricityPeakLoads.sector == "cts"
        ).delete()
    log.info("Cts electricity peak load removed from DB!")

    for scenario in ["eGon2035", "eGon100RE"]:

        with db.session_scope() as session:
            cells_query = session.query(
                EgonCtsElectricityDemandBuildingShare
            ).filter(
                EgonCtsElectricityDemandBuildingShare.scenario == scenario
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        with db.session_scope() as session:
            cells_query = session.query(EgonEtragoElectricityCts).filter(
                EgonEtragoElectricityCts.scn_name == scenario
            )

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="columns",
        )

        df_peak_load = pd.merge(
            left=df_cts_profiles.max().astype(float).rename("max"),
            right=df_demand_share,
            left_index=True,
            right_on="bus_id",
        )

        # Convert unit from MW to W
        df_peak_load["max"] = df_peak_load["max"] * 1e6
        df_peak_load["peak_load_in_w"] = (
            df_peak_load["max"] * df_peak_load["profile_share"]
        )
        log.info(f"Peak load for {scenario} determined!")

        # TODO remove after #772
        df_peak_load.rename(columns={"id": "building_id"}, inplace=True)
        df_peak_load["sector"] = "cts"

        # Write peak loads into db
        write_table_to_postgres(
            df_peak_load,
            BuildingElectricityPeakLoads,
            engine=engine,
            drop=False,
            index=False,
            if_exists="append",
        )

        log.info(f"Peak load for {scenario} exported to DB!")

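# The peak-load scaling used above can be sketched with hypothetical numbers:
# the bus-level profile peak (in MW) is converted to W and scaled by the
# building's profile share.

```python
# hypothetical bus peak of 2 MW and a building holding 10 % of the profile
bus_peak_mw = 2.0
profile_share = 0.1

# convert MW -> W, then scale by the building's share of the bus profile
peak_load_in_w = bus_peak_mw * 1e6 * profile_share
```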
def get_cts_heat_peak_load():
    """
    Get the heat peak load of all CTS buildings for both scenarios and
    store it in the DB.
    """
    log.info("Start logging!")

    BuildingHeatPeakLoads.__table__.create(bind=engine, checkfirst=True)

    # Delete rows with cts demand
    with db.session_scope() as session:
        session.query(BuildingHeatPeakLoads).filter(
            BuildingHeatPeakLoads.sector == "cts"
        ).delete()
    log.info("Cts heat peak load removed from DB!")

    for scenario in ["eGon2035", "eGon100RE"]:

        with db.session_scope() as session:
            cells_query = session.query(
                EgonCtsHeatDemandBuildingShare
            ).filter(
                EgonCtsHeatDemandBuildingShare.scenario == scenario
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )
        log.info(f"Retrieved demand share for scenario: {scenario}")

        with db.session_scope() as session:
            cells_query = session.query(EgonEtragoHeatCts).filter(
                EgonEtragoHeatCts.scn_name == scenario
            )

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        log.info(f"Retrieved substation profiles for scenario: {scenario}")

        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="columns",
        )

        df_peak_load = pd.merge(
            left=df_cts_profiles.max().astype(float).rename("max"),
            right=df_demand_share,
            left_index=True,
            right_on="bus_id",
        )

        # Convert unit from MW to W
        df_peak_load["max"] = df_peak_load["max"] * 1e6
        df_peak_load["peak_load_in_w"] = (
            df_peak_load["max"] * df_peak_load["profile_share"]
        )
        log.info(f"Peak load for {scenario} determined!")

        # TODO remove after #772
        df_peak_load.rename(columns={"id": "building_id"}, inplace=True)
        df_peak_load["sector"] = "cts"

        # Write peak loads into db
        write_table_to_postgres(
            df_peak_load,
            BuildingHeatPeakLoads,
            engine=engine,
            drop=False,
            index=False,
            if_exists="append",
        )

        log.info(f"Peak load for {scenario} exported to DB!")