"""
CTS electricity and heat demand time series for scenarios in 2035 and 2050
assigned to OSM-buildings.

Disaggregation of cts heat & electricity demand time series from MV substation
to census cells via annual demand and then to OSM buildings via
amenity tags, or randomly if not enough OSM data is available in the
respective census cell. If no OSM-buildings or synthetic residential buildings
are available, new synthetic 5x5m buildings are generated.

The resulting data is stored in separate tables:

* `openstreetmap.osm_buildings_synthetic`:
    Lists generated synthetic buildings with id, zensus_population_id and
    building type. This table is already created within
    :func:`hh_buildings.map_houseprofiles_to_buildings()`
* `openstreetmap.egon_cts_buildings`:
    Table of all selected cts buildings with id, census cell id, geometry and
    amenity count in building. This table is created within
    :func:`cts_buildings()`
* `demand.egon_cts_electricity_demand_building_share`:
    Table including the mv substation electricity profile share of all
    selected cts buildings for the scenarios eGon2035 and eGon100RE. This
    table is created within :func:`cts_electricity()`
* `demand.egon_cts_heat_demand_building_share`:
    Table including the mv substation heat profile share of all selected
    cts buildings for the scenarios eGon2035 and eGon100RE. This table is
    created within :func:`cts_heat()`
* `demand.egon_building_electricity_peak_loads`:
    Mapping of electricity demand time series and buildings including cell_id,
    building area and peak load. This table is already created within
    :func:`hh_buildings.get_building_peak_loads()`
* `boundaries.egon_map_zensus_mvgd_buildings`:
    A final mapping table including all buildings used for residential and
    cts heat and electricity time series, including census cells, mvgd bus_id
    and building type (osm or synthetic)

**The following datasets from the database are mainly used for creation:**

* `openstreetmap.osm_buildings_filtered`:
    Table of OSM-buildings filtered by tags to select residential and cts
    buildings only.
* `openstreetmap.osm_amenities_shops_filtered`:
    Table of OSM-amenities filtered by tags to select cts only.
* `openstreetmap.osm_amenities_not_in_buildings_filtered`:
    Table of amenities which do not intersect with any building from
    `openstreetmap.osm_buildings_filtered`
* `openstreetmap.osm_buildings_synthetic`:
    Table of synthetic residential buildings
* `boundaries.egon_map_zensus_buildings_filtered_all`:
    Mapping table of census cells and buildings, filtered even if population
    in census cell = 0.
* `demand.egon_demandregio_zensus_electricity`:
    Table of annual electricity load demand for residential and cts at census
    cell level. Residential load demand is derived from aggregated residential
    building profiles. DemandRegio CTS load demand at NUTS3 is distributed to
    census cells linearly to heat demand from peta5.
* `demand.egon_peta_heat`:
    Table of annual heat load demand for residential and cts at census cell
    level from peta5.
* `demand.egon_etrago_electricity_cts`:
    Scaled cts electricity time series for every MV substation. Derived from
    DemandRegio SLP for selected economic sectors at nuts3. Scaled with annual
    demand from `demand.egon_demandregio_zensus_electricity`
* `demand.egon_etrago_heat_cts`:
    Scaled cts heat time series for every MV substation. Derived from
    DemandRegio SLP Gas for selected economic sectors at nuts3. Scaled with
    annual demand from `demand.egon_peta_heat`.

**What is the goal?**

To disaggregate cts heat and electricity time series from MV substation level
to geo-referenced buildings, the annual demand from DemandRegio and Peta5 is
used to identify census cells with load demand. We use OpenStreetMap data and
filter tags to identify buildings and count the amenities within. The number
of amenities and the annual demand serve to assign a demand share of the MV
substation profile to the building.

**What is the challenge?**

The OSM, DemandRegio and Peta5 datasets differ from each other. The OSM
dataset is a community-based dataset which is continuously extended and does
not claim to be complete. Therefore, not all census cells which have a demand
assigned by the DemandRegio or Peta5 methodology also have buildings with
respective tags, or sometimes even any building at all. Furthermore, the
substation load areas are determined dynamically in a previous dataset. When
merging these datasets, their different scopes (census cell shapes, building
shapes) and their inconsistencies need to be addressed. For example: buildings
or amenities not yet tagged in OSM, or building shapes exceeding census cells.


**How are these datasets combined?**


The methodology for heat and electricity is the same and only differs in the
annual demand and MV/HV substation profile. In a previous dataset
(openstreetmap), we filter all OSM buildings and amenities for tags we relate
to the cts sector. Amenities are mapped to intersecting buildings and then
intersected with the annual demand which exists at census cell level. We
obtain census cells with demand and amenities and without amenities. If there
is no data on amenities, n synthetic ones are assigned to existing buildings;
we use the median value of amenities per census cell for n and consider all
filtered buildings plus synthetic residential buildings. If no building data
is available, a synthetic building is randomly generated. This also happens
for amenities which couldn't be assigned to any osm building. All census cells
with an annual demand are covered this way, and we obtain four different
categories of buildings with amenities:

* Buildings with amenities
* Synthetic buildings with amenities
* Buildings with synthetic amenities
* Synthetic buildings with synthetic amenities

The amenities are summed per census cell (of the amenity) and building to
derive the building amenity share per census cell. Multiplied with the annual
demand, this yields the profile demand share for each cell. Some buildings
exceed the census cell shape and have amenities in different cells although
mapped to one building only. To obtain unique buildings, the demand share is
summed once more per building id. This factor can now be used to obtain the
profile for each building.
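The share derivation above can be sketched with pandas (hypothetical numbers;
building 1 spans two census cells):

```python
import pandas as pd

# one row per (building, census cell) with the amenity count inside
df = pd.DataFrame(
    {
        "building_id": [1, 1, 2],
        "zensus_population_id": [10, 11, 11],
        "n_amenities_inside": [2, 1, 3],
        "annual_demand": [100.0, 200.0, 200.0],  # demand of the cell
    }
)

# building amenity share per census cell
df["cell_share"] = df["n_amenities_inside"] / df.groupby(
    "zensus_population_id"
)["n_amenities_inside"].transform("sum")

# demand share per cell, then summed per building id for unique buildings
df["demand_share"] = df["cell_share"] * df["annual_demand"]
share_per_building = df.groupby("building_id")["demand_share"].sum()
```

Summing once more per building id makes building 1, which has amenities in
both cells, appear only once in the result.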

A schematic flow chart exists in the corresponding issue #671:
https://github.com/openego/eGon-data/issues/671#issuecomment-1260740258


**What are central assumptions during the data processing?**

* We assume OSM data to be the most reliable and complete open source dataset.
* We assume building and amenity tags to be truthful and accurate.
* Mapping census to OSM data is not trivial. Discrepancies are substituted.
* Missing OSM buildings are generated for each amenity.
* Missing amenities are generated by median value of amenities/census cell.


**Drawbacks and limitations of the data**

* Shape of profiles for each building is similar within a MVGD and only
  scaled with a different factor.
* MVGDs are generated dynamically. For buildings with amenities exceeding
  MVGD borders, amenities which are assigned to a different MVGD than the
  building centroid are dropped for the sake of simplicity, as one building
  should not have a connection to two MVGDs.
* The completeness of the OSM data depends on community contribution and is
  crucial to the quality of our results.
* Randomly selected buildings and generated amenities may inadequately
  reflect reality, but are chosen for the sake of simplicity as a measure to
  fill data gaps.
* Since this dataset comes as a cascade after the generation of synthetic
  residential buildings, also check the drawbacks and limitations in
  hh_buildings.py.
* Synthetic buildings may be placed within osm buildings which exceed
  multiple census cells. This is currently accepted but may be solved in #953.
* Scattered high peak loads occur and might lead to single MV grid
  connections in ding0. In some cases this might not be viable. Postprocessing
  is needed and may be solved in #954.


Example Query
-------------


Notes
-----

This module docstring is rather a dataset documentation. Once a decision
is made in ... the content of this module docstring needs to be moved to
the docs attribute of the respective dataset class.
"""
167
168
from geoalchemy2 import Geometry
169
from geoalchemy2.shape import to_shape
170
from psycopg2.extensions import AsIs, register_adapter
171
from sqlalchemy import REAL, Column, Integer, String, func
172
from sqlalchemy.ext.declarative import declarative_base
173
import geopandas as gpd
174
import numpy as np
175
import pandas as pd
176
import saio
177
178
from egon.data import db
179
from egon.data import logger as log
180
from egon.data.datasets import Dataset
181
from egon.data.datasets.electricity_demand import (
182
    EgonDemandRegioZensusElectricity,
183
)
184
from egon.data.datasets.electricity_demand.temporal import (
185
    EgonEtragoElectricityCts,
186
)
187
from egon.data.datasets.electricity_demand_timeseries.hh_buildings import (
188
    BuildingElectricityPeakLoads,
189
    OsmBuildingsSynthetic,
190
)
191
from egon.data.datasets.electricity_demand_timeseries.mapping import (
192
    map_all_used_buildings,
193
)
194
from egon.data.datasets.electricity_demand_timeseries.tools import (
195
    random_ints_until_sum,
196
    random_point_in_square,
197
    specific_int_until_sum,
198
    write_table_to_postgis,
199
    write_table_to_postgres,
200
)
201
from egon.data.datasets.heat_demand import EgonPetaHeat
202
from egon.data.datasets.heat_demand_timeseries import EgonEtragoHeatCts
203
from egon.data.datasets.zensus_mv_grid_districts import MapZensusGridDistricts
204
from egon.data.datasets.zensus_vg250 import DestatisZensusPopulationPerHa
205
206
engine = db.engine()
207
Base = declarative_base()
208
209
# import db tables
210
saio.register_schema("openstreetmap", engine=engine)
211
saio.register_schema("boundaries", engine=engine)
212
213
214
class EgonCtsElectricityDemandBuildingShare(Base):
    __tablename__ = "egon_cts_electricity_demand_building_share"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    bus_id = Column(Integer, index=True)
    profile_share = Column(REAL)


class EgonCtsHeatDemandBuildingShare(Base):
    __tablename__ = "egon_cts_heat_demand_building_share"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    bus_id = Column(Integer, index=True)
    profile_share = Column(REAL)


class CtsBuildings(Base):
    __tablename__ = "egon_cts_buildings"
    __table_args__ = {"schema": "openstreetmap"}

    serial = Column(Integer, primary_key=True)
    id = Column(Integer, index=True)
    zensus_population_id = Column(Integer, index=True)
    geom_building = Column(Geometry("Polygon", 3035))
    n_amenities_inside = Column(Integer)
    source = Column(String)


class BuildingHeatPeakLoads(Base):
    __tablename__ = "egon_building_heat_peak_loads"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    sector = Column(String, primary_key=True)
    peak_load_in_w = Column(REAL)


class CtsDemandBuildings(Dataset):
    def __init__(self, dependencies):
        super().__init__(
            name="CtsDemandBuildings",
            version="0.0.3",
            dependencies=dependencies,
            tasks=(
                cts_buildings,
                {cts_electricity, cts_heat},
                get_cts_electricity_peak_load,
                map_all_used_buildings,
                assign_voltage_level_to_buildings,
            ),
        )


def amenities_without_buildings():
    """
    Amenities which have no buildings assigned and are in a cell with cts
    demand are determined.

    Returns
    -------
    gpd.GeoDataFrame
        Table of amenities without buildings
    """
    from saio.openstreetmap import osm_amenities_not_in_buildings_filtered

    with db.session_scope() as session:
        cells_query = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
                osm_amenities_not_in_buildings_filtered.geom_amenity,
                osm_amenities_not_in_buildings_filtered.egon_amenity_id,
            )
            .filter(
                func.st_within(
                    osm_amenities_not_in_buildings_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .filter(
                DestatisZensusPopulationPerHa.id
                == EgonDemandRegioZensusElectricity.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
        )

    df_amenities_without_buildings = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom_amenity",
    )
    return df_amenities_without_buildings


def place_buildings_with_amenities(df, amenities=None, max_amenities=None):
    """
    Building centroids are placed randomly within census cells.
    The number of buildings is derived from n_amenities_inside, the selected
    method and the number of amenities per building.

    Returns
    -------
    df: gpd.GeoDataFrame
        Table of building centroids
    """
    if isinstance(max_amenities, int):
        # amount of amenities is randomly generated within bounds
        # (max_amenities, amenities per cell)
        df["n_amenities_inside"] = df["n_amenities_inside"].apply(
            random_ints_until_sum, args=[max_amenities]
        )
    if isinstance(amenities, int):
        # Specific amount of amenities per building
        df["n_amenities_inside"] = df["n_amenities_inside"].apply(
            specific_int_until_sum, args=[amenities]
        )

    # Unnest each building
    df = df.explode(column="n_amenities_inside")

    # building count per cell
    df["building_count"] = df.groupby(["zensus_population_id"]).cumcount() + 1

    # generate random synthetic buildings
    edge_length = 5
    # create random points within census cells
    points = random_point_in_square(geom=df["geom"], tol=edge_length / 2)

    df.reset_index(drop=True, inplace=True)
    # Store the random point as the building centroid
    df["geom_point"] = points
    # Drop geometry of census cell
    df = df.drop(columns=["geom"])

    return df
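The unnesting step above (explode plus cumcount) in isolation, with the
per-building counts hard-coded instead of coming from
`random_ints_until_sum`/`specific_int_until_sum`:

```python
import pandas as pd

# per-cell amenity count already split into a list of per-building counts
df = pd.DataFrame(
    {
        "zensus_population_id": [10, 11],
        "n_amenities_inside": [[2, 1], [3]],
    }
)

# one row per building, numbered consecutively within each census cell
df = df.explode(column="n_amenities_inside")
df["building_count"] = df.groupby(["zensus_population_id"]).cumcount() + 1
```

Cell 10 ends up with two buildings (counts 2 and 1), cell 11 with one.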


def create_synthetic_buildings(df, points=None, crs="EPSG:3035"):
    """
    Synthetic buildings are generated around points.

    Parameters
    ----------
    df: pd.DataFrame
        Table of census cells
    points: gpd.GeoSeries or str
        List of points to place buildings around or column name of df
    crs: str
        CRS of result table

    Returns
    -------
    df: gpd.GeoDataFrame
        Synthetic buildings
    """

    if isinstance(points, str) and points in df.columns:
        points = df[points]
    elif isinstance(points, gpd.GeoSeries):
        pass
    else:
        raise ValueError("Points are of the wrong type")

    # Create building using a square around point
    edge_length = 5
    df["geom_building"] = points.buffer(distance=edge_length / 2, cap_style=3)

    if "geom_point" not in df.columns:
        df["geom_point"] = df["geom_building"].centroid

    df = gpd.GeoDataFrame(
        df,
        crs=crs,
        geometry="geom_building",
    )

    # TODO remove after #772 implementation of egon_building_id
    df.rename(columns={"id": "egon_building_id"}, inplace=True)

    # get max number of building ids from synthetic residential table
    with db.session_scope() as session:
        max_synth_residential_id = session.execute(
            func.max(OsmBuildingsSynthetic.id)
        ).scalar()
    max_synth_residential_id = int(max_synth_residential_id)

    # create sequential ids
    df["egon_building_id"] = range(
        max_synth_residential_id + 1,
        max_synth_residential_id + df.shape[0] + 1,
    )

    df["area"] = df["geom_building"].area
    # set building type of synthetic building
    df["building"] = "cts"
    # TODO remove after #772
    df = df.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id",
        }
    )
    return df
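The square footprint produced by `buffer(distance=edge_length / 2,
cap_style=3)` (square cap) can be written out in plain coordinates, without
any GIS dependency:

```python
def square_around(x, y, edge_length=5.0):
    """Corner coordinates of the axis-aligned square a point is buffered to."""
    half = edge_length / 2
    return [
        (x - half, y - half),
        (x + half, y - half),
        (x + half, y + half),
        (x - half, y + half),
    ]

# a 5 m edge length yields a 5x5 m footprint centred on the point
corners = square_around(0.0, 0.0)
```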


def buildings_with_amenities():
    """
    Amenities which are assigned to buildings are determined and grouped per
    building and zensus cell. Buildings covering multiple cells therefore
    exist multiple times, but in different zensus cells. This is necessary to
    cover as many cells with a cts demand as possible. If buildings exist in
    multiple mvgds (bus_id), only the amenities within the same mvgd as the
    building centroid are kept. If, as a result, a census cell is not covered
    by any building, a synthetic amenity is placed. The buildings are
    aggregated afterwards during the calculation of the profile_share.

    Returns
    -------
    df_buildings_with_amenities: gpd.GeoDataFrame
        Contains all buildings with amenities per zensus cell.
    df_lost_cells: gpd.GeoDataFrame
        Contains synthetic amenities in lost cells. Might be empty.
    """

    from saio.boundaries import egon_map_zensus_buildings_filtered_all
    from saio.openstreetmap import osm_amenities_in_buildings_filtered

    with db.session_scope() as session:
        cells_query = (
            session.query(
                osm_amenities_in_buildings_filtered,
                MapZensusGridDistricts.bus_id,
            )
            .filter(
                MapZensusGridDistricts.zensus_population_id
                == osm_amenities_in_buildings_filtered.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id
                == osm_amenities_in_buildings_filtered.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
        )
        df_amenities_in_buildings = pd.read_sql(
            cells_query.statement, con=session.connection(), index_col=None
        )

    df_amenities_in_buildings["geom_building"] = df_amenities_in_buildings[
        "geom_building"
    ].apply(to_shape)
    df_amenities_in_buildings["geom_amenity"] = df_amenities_in_buildings[
        "geom_amenity"
    ].apply(to_shape)

    # retrieve building centroid bus_id
    with db.session_scope() as session:

        cells_query = session.query(
            egon_map_zensus_buildings_filtered_all.id,
            MapZensusGridDistricts.bus_id.label("building_bus_id"),
        ).filter(
            egon_map_zensus_buildings_filtered_all.zensus_population_id
            == MapZensusGridDistricts.zensus_population_id
        )

        df_building_bus_id = pd.read_sql(
            cells_query.statement, con=session.connection(), index_col=None
        )

    df_amenities_in_buildings = pd.merge(
        left=df_amenities_in_buildings, right=df_building_bus_id, on="id"
    )

    # identify amenities whose bus_id differs from their building's
    identified_amenities = df_amenities_in_buildings.loc[
        df_amenities_in_buildings["bus_id"]
        != df_amenities_in_buildings["building_bus_id"]
    ].index

    lost_cells = df_amenities_in_buildings.loc[
        identified_amenities, "zensus_population_id"
    ].unique()

    # check if lost zensus cells are already covered
    if not (
        df_amenities_in_buildings["zensus_population_id"]
        .isin(lost_cells)
        .empty
    ):
        # query geom data for cell if not
        with db.session_scope() as session:
            cells_query = session.query(
                DestatisZensusPopulationPerHa.id,
                DestatisZensusPopulationPerHa.geom,
            ).filter(
                DestatisZensusPopulationPerHa.id.in_(pd.Index(lost_cells))
            )

            df_lost_cells = gpd.read_postgis(
                cells_query.statement,
                cells_query.session.bind,
                geom_col="geom",
            )

        # place random amenity in cell
        df_lost_cells["n_amenities_inside"] = 1
        df_lost_cells.rename(
            columns={
                "id": "zensus_population_id",
            },
            inplace=True,
        )
        df_lost_cells = place_buildings_with_amenities(
            df_lost_cells, amenities=1
        )
        df_lost_cells.rename(
            columns={
                # "id": "zensus_population_id",
                "geom_point": "geom_amenity",
            },
            inplace=True,
        )
        df_lost_cells.drop(
            columns=["building_count", "n_amenities_inside"], inplace=True
        )
    else:
        df_lost_cells = None

    df_amenities_in_buildings.drop(identified_amenities, inplace=True)
    df_amenities_in_buildings.drop(columns="building_bus_id", inplace=True)

    df_amenities_in_buildings["n_amenities_inside"] = 1

    # sum amenities per building and cell
    df_amenities_in_buildings["n_amenities_inside"] = (
        df_amenities_in_buildings.groupby(["zensus_population_id", "id"])[
            "n_amenities_inside"
        ].transform("sum")
    )
    # drop duplicated buildings
    df_buildings_with_amenities = df_amenities_in_buildings.drop_duplicates(
        ["id", "zensus_population_id"]
    )
    df_buildings_with_amenities.reset_index(inplace=True, drop=True)

    df_buildings_with_amenities = df_buildings_with_amenities[
        ["id", "zensus_population_id", "geom_building", "n_amenities_inside"]
    ]
    df_buildings_with_amenities.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id"
        },
        inplace=True,
    )

    return df_buildings_with_amenities, df_lost_cells


def buildings_without_amenities():
    """
    Buildings (filtered and synthetic) in cells with
    cts demand but no amenities are determined.

    Returns
    -------
    df_buildings_without_amenities: gpd.GeoDataFrame
        Table of buildings without amenities in zensus cells
        with cts demand.
    """
    from saio.boundaries import egon_map_zensus_buildings_filtered_all
    from saio.openstreetmap import (
        osm_amenities_shops_filtered,
        osm_buildings_filtered,
        osm_buildings_synthetic,
    )

    # buildings_filtered in cts-demand-cells without amenities
    with db.session_scope() as session:

        # Synthetic Buildings
        q_synth_buildings = session.query(
            osm_buildings_synthetic.cell_id.cast(Integer).label(
                "zensus_population_id"
            ),
            osm_buildings_synthetic.id.cast(Integer).label("id"),
            osm_buildings_synthetic.area.label("area"),
            osm_buildings_synthetic.geom_building.label("geom_building"),
            osm_buildings_synthetic.geom_point.label("geom_point"),
        )

        # Buildings filtered
        q_buildings_filtered = session.query(
            egon_map_zensus_buildings_filtered_all.zensus_population_id,
            osm_buildings_filtered.id,
            osm_buildings_filtered.area,
            osm_buildings_filtered.geom_building,
            osm_buildings_filtered.geom_point,
        ).filter(
            osm_buildings_filtered.id
            == egon_map_zensus_buildings_filtered_all.id
        )

        # Amenities + zensus_population_id
        q_amenities = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
            )
            .filter(
                func.st_within(
                    osm_amenities_shops_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .distinct(DestatisZensusPopulationPerHa.id)
        )

        # Cells with CTS demand but without amenities
        q_cts_without_amenities = (
            session.query(
                EgonDemandRegioZensusElectricity.zensus_population_id,
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id.notin_(
                    q_amenities
                )
            )
            .distinct()
        )

        # Buildings filtered + synthetic residential buildings in
        # cells with CTS demand but without amenities
        cells_query = q_synth_buildings.union(q_buildings_filtered).filter(
            egon_map_zensus_buildings_filtered_all.zensus_population_id.in_(
                q_cts_without_amenities
            )
        )

    # df_buildings_without_amenities = pd.read_sql(
    #     cells_query.statement, cells_query.session.bind, index_col=None)
    df_buildings_without_amenities = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom_building",
    )

    df_buildings_without_amenities = df_buildings_without_amenities.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id",
        }
    )

    return df_buildings_without_amenities


def select_cts_buildings(df_buildings_wo_amenities, max_n):
    """
    Up to max_n buildings (filtered and synthetic) are selected in each
    cell with cts demand. The buildings are sorted by surface area in
    descending order and only the first max_n buildings are taken for
    each cell.

    Returns
    -------
    df_buildings_with_cts_demand: gpd.GeoDataFrame
        Table of buildings
    """

    df_buildings_wo_amenities.sort_values(
        "area", ascending=False, inplace=True
    )
    # select the first max_n ids of each census cell if available
    df_buildings_with_cts_demand = (
        df_buildings_wo_amenities.groupby("zensus_population_id")
        .nth(list(range(max_n)))
        .reset_index()
    )
    df_buildings_with_cts_demand.reset_index(drop=True, inplace=True)

    return df_buildings_with_cts_demand
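The per-cell selection in isolation, on hypothetical data. Note that
`groupby().head(max_n)` keeps the first max_n rows of each group and is the
version-stable equivalent of the `.nth(list(range(max_n)))` call used above
(whose return shape changed across pandas versions):

```python
import pandas as pd

max_n = 2
df = pd.DataFrame(
    {
        "zensus_population_id": [10, 10, 10, 11],
        "id": [1, 2, 3, 4],
        "area": [30.0, 20.0, 10.0, 5.0],
    }
).sort_values("area", ascending=False)

# the two largest buildings of cell 10 and the only building of cell 11
selected = df.groupby("zensus_population_id").head(max_n)
```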
711
712
713
def cells_with_cts_demand_only(df_buildings_without_amenities):
714
    """
715
    Cells with cts demand but no amenities or buildilngs
716
    are determined.
717
718
    Returns
719
    -------
720
    df_cells_only_cts_demand: gpd.GeoDataFrame
721
        Table of cells with cts demand but no amenities or buildings
722
    """
723
    from saio.openstreetmap import osm_amenities_shops_filtered
724
725
    # cells mit amenities
726
    with db.session_scope() as session:
        sub_query = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
            )
            .filter(
                func.st_within(
                    osm_amenities_shops_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .distinct(DestatisZensusPopulationPerHa.id)
        )

        cells_query = (
            session.query(
                EgonDemandRegioZensusElectricity.zensus_population_id,
                EgonDemandRegioZensusElectricity.scenario,
                EgonDemandRegioZensusElectricity.sector,
                EgonDemandRegioZensusElectricity.demand,
                DestatisZensusPopulationPerHa.geom,
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id.notin_(
                    sub_query
                )
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id
                == DestatisZensusPopulationPerHa.id
            )
        )

    df_cts_cell_without_amenities = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom",
        index_col=None,
    )

    # TODO remove after #722
    df_buildings_without_amenities = df_buildings_without_amenities.rename(
        columns={"cell_id": "zensus_population_id"}
    )

    # Census cells with only cts demand
    df_cells_only_cts_demand = df_cts_cell_without_amenities.loc[
        ~df_cts_cell_without_amenities["zensus_population_id"].isin(
            df_buildings_without_amenities["zensus_population_id"].unique()
        )
    ]

    df_cells_only_cts_demand.reset_index(drop=True, inplace=True)

    return df_cells_only_cts_demand

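The selection of cells with CTS demand but no usable buildings above is essentially a pandas anti-join; a minimal sketch with made-up cell ids:

```python
import pandas as pd

# Hypothetical cells with CTS demand and cells already covered by buildings.
df_cts_cells = pd.DataFrame({"zensus_population_id": [1, 2, 3, 4]})
df_buildings = pd.DataFrame({"zensus_population_id": [2, 4]})

# Keep only cells that have no building candidates (anti-join via isin).
df_only_cts_demand = df_cts_cells.loc[
    ~df_cts_cells["zensus_population_id"].isin(
        df_buildings["zensus_population_id"].unique()
    )
].reset_index(drop=True)

print(df_only_cts_demand["zensus_population_id"].tolist())  # [1, 3]
```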
def calc_census_cell_share(scenario, sector):
    """
    The profile share for each census cell is calculated by its
    share of annual demand per substation bus. The annual demand
    per cell is defined by DemandRegio/Peta5. The share is identical
    for both scenarios as the annual demand is scaled linearly.

    Parameters
    ----------
    scenario: str
        Scenario for which the share is calculated: "eGon2035" or "eGon100RE"
    sector: str
        Sector for which the share is calculated: "electricity" or "heat"

    Returns
    -------
    df_census_share: pd.DataFrame
    """
    if sector == "electricity":
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonDemandRegioZensusElectricity,
                    MapZensusGridDistricts.bus_id,
                )
                .filter(EgonDemandRegioZensusElectricity.sector == "service")
                .filter(EgonDemandRegioZensusElectricity.scenario == scenario)
                .filter(
                    EgonDemandRegioZensusElectricity.zensus_population_id
                    == MapZensusGridDistricts.zensus_population_id
                )
            )

    elif sector == "heat":
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonPetaHeat, MapZensusGridDistricts.bus_id)
                .filter(EgonPetaHeat.sector == "service")
                .filter(EgonPetaHeat.scenario == scenario)
                .filter(
                    EgonPetaHeat.zensus_population_id
                    == MapZensusGridDistricts.zensus_population_id
                )
            )

    else:
        raise KeyError("Sector needs to be either 'electricity' or 'heat'")

    df_demand = pd.read_sql(
        cells_query.statement,
        cells_query.session.bind,
        index_col="zensus_population_id",
    )

    # get demand share of cell per bus
    df_census_share = df_demand["demand"] / df_demand.groupby("bus_id")[
        "demand"
    ].transform("sum")
    df_census_share = df_census_share.rename("cell_share")

    df_census_share = pd.concat(
        [
            df_census_share,
            df_demand[["bus_id", "scenario"]],
        ],
        axis=1,
    )

    df_census_share.reset_index(inplace=True)
    return df_census_share

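The share calculation above reduces to a grouped `transform("sum")`; a toy example (cell ids, bus ids and demand values are invented):

```python
import pandas as pd

# Toy demand table: two census cells on bus 1, one cell on bus 2.
df_demand = pd.DataFrame(
    {"bus_id": [1, 1, 2], "demand": [30.0, 70.0, 50.0]},
    index=pd.Index([101, 102, 103], name="zensus_population_id"),
)

# Demand share of each census cell within its substation bus.
cell_share = (
    df_demand["demand"]
    / df_demand.groupby("bus_id")["demand"].transform("sum")
).rename("cell_share")

print(cell_share.tolist())  # [0.3, 0.7, 1.0]
```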
def calc_building_demand_profile_share(
    df_cts_buildings, scenario="eGon2035", sector="electricity"
):
    """
    The share of the cts demand profile per bus is calculated for every
    selected building. The building-amenity share is multiplied with the
    census cell share to obtain the substation bus profile share for each
    building. The share is grouped and aggregated per building, as some
    buildings exceed the shape of census cells and have amenities assigned
    from multiple cells. Such buildings therefore get the amenity share of
    all their census cells.

    Parameters
    ----------
    df_cts_buildings: gpd.GeoDataFrame
        Table of all buildings with cts demand assigned
    scenario: str
        Scenario for which the share is calculated.
    sector: str
        Sector for which the share is calculated.

    Returns
    -------
    df_building_share: pd.DataFrame
        Table of bus profile share per building

    """

    def calc_building_amenity_share(df_cts_buildings):
        """
        Calculate the building share by the number of amenities per building
        within a census cell. Building ids can occur multiple times but with
        different zensus_population_ids.
        """
        df_building_amenity_share = df_cts_buildings[
            "n_amenities_inside"
        ] / df_cts_buildings.groupby("zensus_population_id")[
            "n_amenities_inside"
        ].transform(
            "sum"
        )
        df_building_amenity_share = pd.concat(
            [
                df_building_amenity_share.rename("building_amenity_share"),
                df_cts_buildings[["zensus_population_id", "id"]],
            ],
            axis=1,
        )
        return df_building_amenity_share

    df_building_amenity_share = calc_building_amenity_share(df_cts_buildings)

    df_census_cell_share = calc_census_cell_share(
        scenario=scenario, sector=sector
    )

    df_demand_share = pd.merge(
        left=df_building_amenity_share,
        right=df_census_cell_share,
        left_on="zensus_population_id",
        right_on="zensus_population_id",
    )
    df_demand_share["profile_share"] = df_demand_share[
        "building_amenity_share"
    ].multiply(df_demand_share["cell_share"])

    # only pass selected columns
    df_demand_share = df_demand_share[
        ["id", "bus_id", "scenario", "profile_share"]
    ]

    # Group and aggregate per building for multi cell buildings
    df_demand_share = (
        df_demand_share.groupby(["scenario", "id", "bus_id"])
        .sum()
        .reset_index()
    )
    if df_demand_share.duplicated("id", keep=False).any():
        print(
            df_demand_share.loc[df_demand_share.duplicated("id", keep=False)]
        )
    return df_demand_share

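The two shares are multiplied and then summed per building, so a building spanning two census cells collects the share of both; a sketch with invented share values:

```python
import pandas as pd

# Building 10 spans two census cells, building 11 only one (values invented).
df = pd.DataFrame(
    {
        "id": [10, 10, 11],
        "bus_id": [1, 1, 1],
        "scenario": ["eGon2035"] * 3,
        "building_amenity_share": [0.5, 1.0, 0.5],
        "cell_share": [0.25, 0.75, 0.25],
    }
)
df["profile_share"] = df["building_amenity_share"] * df["cell_share"]

# Aggregate per building to handle buildings crossing cell borders.
out = (
    df.groupby(["scenario", "id", "bus_id"])["profile_share"]
    .sum()
    .reset_index()
)
print(out["profile_share"].tolist())  # [0.875, 0.125]
```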
def get_peta_demand(mvgd, scenario):
    """
    Retrieve annual peta heat demand for CTS for either
    eGon2035 or eGon100RE scenario.

    Parameters
    ----------
    mvgd : int
        ID of substation for which to get CTS demand.
    scenario : str
        Possible options are eGon2035 or eGon100RE

    Returns
    -------
    df_peta_demand : pd.DataFrame
        Annual CTS heat demand per census cell. Columns of the dataframe
        are zensus_population_id and demand.

    """

    with db.session_scope() as session:
        query = (
            session.query(
                MapZensusGridDistricts.zensus_population_id,
                EgonPetaHeat.demand,
            )
            .filter(MapZensusGridDistricts.bus_id == int(mvgd))
            .filter(
                MapZensusGridDistricts.zensus_population_id
                == EgonPetaHeat.zensus_population_id
            )
            .filter(
                EgonPetaHeat.sector == "service",
                EgonPetaHeat.scenario == scenario,
            )
        )

        df_peta_demand = pd.read_sql(
            query.statement, query.session.bind, index_col=None
        )

    return df_peta_demand

def calc_cts_building_profiles(
    bus_ids,
    scenario,
    sector,
):
    """
    Calculate the cts demand profile for each building. The profile is
    calculated by the demand share of the building per substation bus.

    Parameters
    ----------
    bus_ids: list of int
        IDs of the substations for which selected building profiles are
        calculated.
    scenario: str
        Scenario for which the share is calculated: "eGon2035" or "eGon100RE"
    sector: str
        Sector for which the share is calculated: "electricity" or "heat"

    Returns
    -------
    df_building_profiles: pd.DataFrame
        Table of demand profile per building. Column names are building IDs
        and index is hour of the year as int (0-8759).

    """
    if sector == "electricity":
        # Get cts building electricity demand share of selected buildings
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonCtsElectricityDemandBuildingShare,
                )
                .filter(
                    EgonCtsElectricityDemandBuildingShare.scenario == scenario
                )
                .filter(
                    EgonCtsElectricityDemandBuildingShare.bus_id.in_(bus_ids)
                )
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        # Get substation cts electricity load profiles of selected bus_ids
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonEtragoElectricityCts).filter(
                    EgonEtragoElectricityCts.scn_name == scenario
                )
            ).filter(EgonEtragoElectricityCts.bus_id.in_(bus_ids))

        df_cts_substation_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_substation_profiles = pd.DataFrame.from_dict(
            df_cts_substation_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="index",
        )
        # df_cts_profiles = calc_load_curves_cts(scenario)

    elif sector == "heat":
        # Get cts building heat demand share of selected buildings
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonCtsHeatDemandBuildingShare,
                )
                .filter(EgonCtsHeatDemandBuildingShare.scenario == scenario)
                .filter(EgonCtsHeatDemandBuildingShare.bus_id.in_(bus_ids))
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        # Get substation cts heat load profiles of selected bus_ids
        # (this profile only contains zensus cells with individual heating;
        #  in order to obtain a profile for the whole MV grid it is afterwards
        #  scaled by the grid's total CTS demand from peta)
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonEtragoHeatCts).filter(
                    EgonEtragoHeatCts.scn_name == scenario
                )
            ).filter(EgonEtragoHeatCts.bus_id.in_(bus_ids))

        df_cts_substation_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_substation_profiles = pd.DataFrame.from_dict(
            df_cts_substation_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="index",
        )
        for bus_id in bus_ids:
            if bus_id in df_cts_substation_profiles.index:
                # get peta demand to scale load profile to
                peta_cts_demand = get_peta_demand(bus_id, scenario)
                scaling_factor = (
                    peta_cts_demand.demand.sum()
                    / df_cts_substation_profiles.loc[bus_id, :].sum()
                )
                # scale load profile
                df_cts_substation_profiles.loc[bus_id, :] *= scaling_factor

    else:
        raise KeyError("Sector needs to be either 'electricity' or 'heat'")

    # TODO remove after #722
    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    # get demand profile for all buildings for selected demand share
    df_building_profiles = pd.DataFrame()
    for bus_id, df in df_demand_share.groupby("bus_id"):
        shares = df.set_index("building_id", drop=True)["profile_share"]
        try:
            profile_ts = df_cts_substation_profiles.loc[bus_id]
        except KeyError:
            # This should only happen within the SH cutout
            log.info(
                f"No CTS profile found for substation with bus_id:"
                f" {bus_id}"
            )
            continue

        building_profiles = np.outer(profile_ts, shares)
        building_profiles = pd.DataFrame(
            building_profiles, index=profile_ts.index, columns=shares.index
        )
        df_building_profiles = pd.concat(
            [df_building_profiles, building_profiles], axis=1
        )

    return df_building_profiles

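Disaggregating a bus profile to its buildings, as in the loop above, is an outer product of the time series with the share vector; sketched with a four-step profile (all numbers invented):

```python
import numpy as np
import pandas as pd

# Hypothetical substation profile (4 time steps) and building shares.
profile_ts = pd.Series([10.0, 20.0, 40.0, 30.0])
shares = pd.Series(
    [0.25, 0.75], index=pd.Index([100, 101], name="building_id")
)

# Each column is one building's scaled-down copy of the bus profile.
building_profiles = pd.DataFrame(
    np.outer(profile_ts, shares),
    index=profile_ts.index,
    columns=shares.index,
)

print(building_profiles[100].tolist())  # [2.5, 5.0, 10.0, 7.5]
```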
def delete_synthetic_cts_buildings():
    """
    All synthetic cts buildings are deleted from the DB. This is necessary
    if the task is run multiple times as the existing synthetic buildings
    influence the results.
    """
    # import db tables
    from saio.openstreetmap import osm_buildings_synthetic

    # delete all synthetic cts buildings
    with db.session_scope() as session:
        session.query(osm_buildings_synthetic).filter(
            osm_buildings_synthetic.building == "cts"
        ).delete()

def remove_double_bus_id(df_cts_buildings):
    """Backup ad-hoc fix in case a building is still assigned to two
    substations. In this case one of the assignments is dropped. As this
    currently affects only one building with one amenity, the deviation is
    negligible."""
    # assign bus_id via census cell of amenity
    with db.session_scope() as session:
        cells_query = session.query(
            MapZensusGridDistricts.zensus_population_id,
            MapZensusGridDistricts.bus_id,
        )

    df_egon_map_zensus_buildings_buses = pd.read_sql(
        cells_query.statement,
        cells_query.session.bind,
        index_col=None,
    )
    df_cts_buildings = pd.merge(
        left=df_cts_buildings,
        right=df_egon_map_zensus_buildings_buses,
        on="zensus_population_id",
    )

    substation_per_building = df_cts_buildings.groupby("id")[
        "bus_id"
    ].nunique()
    building_id = substation_per_building.loc[
        substation_per_building > 1
    ].index
    df_duplicates = df_cts_buildings.loc[
        df_cts_buildings["id"].isin(building_id)
    ]
    for unique_id in df_duplicates["id"].unique():
        drop_index = df_duplicates[df_duplicates["id"] == unique_id].index[0]
        print(
            f"Building {df_cts_buildings.loc[drop_index, 'id']}"
            f" dropped because of duplicated substation assignment"
        )
        df_cts_buildings.drop(index=drop_index, inplace=True)

    df_cts_buildings.drop(columns="bus_id", inplace=True)

    return df_cts_buildings

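The duplicate check above relies on `groupby(...).nunique()`; a minimal sketch (building and bus ids invented):

```python
import pandas as pd

# Building 7 is (wrongly) assigned to two substations.
df = pd.DataFrame({"id": [5, 7, 7], "bus_id": [1, 1, 2]})

# Count distinct substations per building; more than one means a conflict.
substation_per_building = df.groupby("id")["bus_id"].nunique()
double_ids = substation_per_building.loc[substation_per_building > 1].index

print(list(double_ids))  # [7]
```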
def cts_buildings():
    """
    Assigns CTS demand to buildings and calculates the respective demand
    profiles. The demand profile per substation is disaggregated by the
    annual demand share of each census cell and by the number of amenities
    per building within the cell. If no building data is available,
    synthetic buildings are generated around the amenities. If no amenities
    but cts demand is available, buildings are randomly selected. If
    neither building nor amenity is available, random synthetic buildings
    are generated. The demand share is stored in the database.

    Notes
    -----
    Cells with CTS demand, amenities and buildings do not change within
    the scenarios, only the demand itself. Therefore scenario eGon2035
    can be used universally to determine the cts buildings but not the
    demand share.
    """
    # ========== Register np datatypes with SQLA ==========
    def adapt_numpy_float64(numpy_float64):
        return AsIs(numpy_float64)

    def adapt_numpy_int64(numpy_int64):
        return AsIs(numpy_int64)

    register_adapter(np.float64, adapt_numpy_float64)
    register_adapter(np.int64, adapt_numpy_int64)
    # =====================================================

    log.info("Start logging!")
    # Buildings with amenities
    df_buildings_with_amenities, df_lost_cells = buildings_with_amenities()
    log.info("Buildings with amenities selected!")

    # Median number of amenities per cell
    median_n_amenities = int(
        df_buildings_with_amenities.groupby("zensus_population_id")[
            "n_amenities_inside"
        ]
        .sum()
        .median()
    )
    log.info(f"Median amenity value: {median_n_amenities}")

    # Remove synthetic CTS buildings if existing
    delete_synthetic_cts_buildings()
    log.info("Old synthetic cts buildings deleted!")

    # Amenities not assigned to buildings
    df_amenities_without_buildings = amenities_without_buildings()
    log.info("Amenities without buildings selected!")

    # Append lost cells due to duplicated ids, to cover all demand cells
    if not df_lost_cells.empty:

        # Number of synth amenities per cell
        df_lost_cells["amenities"] = median_n_amenities
        # create row for every amenity
        df_lost_cells["amenities"] = (
            df_lost_cells["amenities"].astype(int).apply(range)
        )
        df_lost_cells = df_lost_cells.explode("amenities")
        df_lost_cells.drop(columns="amenities", inplace=True)
        df_amenities_without_buildings = pd.concat(
            [df_amenities_without_buildings, df_lost_cells],
            ignore_index=True,
        )
        log.info(
            f"{df_lost_cells.shape[0]} lost cells due to substation "
            f"intersection appended!"
        )

    # One building per amenity
    df_amenities_without_buildings["n_amenities_inside"] = 1
    # Create synthetic buildings for amenities without buildings
    df_synthetic_buildings_with_amenities = create_synthetic_buildings(
        df_amenities_without_buildings, points="geom_amenity"
    )
    log.info("Synthetic buildings created!")

    # TODO remove renaming after #722
    write_table_to_postgis(
        df_synthetic_buildings_with_amenities.rename(
            columns={
                "zensus_population_id": "cell_id",
                "egon_building_id": "id",
            }
        ),
        OsmBuildingsSynthetic,
        engine=engine,
        drop=False,
    )
    log.info("Synthetic buildings exported to DB!")

    # Cells without amenities but CTS demand and buildings
    df_buildings_without_amenities = buildings_without_amenities()
    log.info("Buildings without amenities in demand cells identified!")

    # Backup bugfix for duplicated buildings which occur in the SQL query:
    # drop building ids which have already been used
    mask = df_buildings_without_amenities.loc[
        df_buildings_without_amenities["id"].isin(
            df_buildings_with_amenities["id"]
        )
    ].index
    df_buildings_without_amenities = df_buildings_without_amenities.drop(
        index=mask
    ).reset_index(drop=True)
    log.info(f"{len(mask)} duplicated ids removed!")

    # select median n buildings per cell
    df_buildings_without_amenities = select_cts_buildings(
        df_buildings_without_amenities, max_n=median_n_amenities
    )
    df_buildings_without_amenities["n_amenities_inside"] = 1
    log.info(f"{median_n_amenities} buildings per cell selected!")

    # Create synthetic amenities and buildings in cells with only CTS demand
    df_cells_with_cts_demand_only = cells_with_cts_demand_only(
        df_buildings_without_amenities
    )
    log.info("Cells with only demand identified!")

    # TODO implement overlay prevention #953 here
    # Median n amenities per cell
    df_cells_with_cts_demand_only["amenities"] = median_n_amenities
    # create row for every amenity
    df_cells_with_cts_demand_only["amenities"] = (
        df_cells_with_cts_demand_only["amenities"].astype(int).apply(range)
    )
    df_cells_with_cts_demand_only = df_cells_with_cts_demand_only.explode(
        "amenities"
    )
    df_cells_with_cts_demand_only.drop(columns="amenities", inplace=True)

    # Only 1 amenity per building
    df_cells_with_cts_demand_only["n_amenities_inside"] = 1
    df_cells_with_cts_demand_only = place_buildings_with_amenities(
        df_cells_with_cts_demand_only, amenities=1
    )
    df_synthetic_buildings_without_amenities = create_synthetic_buildings(
        df_cells_with_cts_demand_only, points="geom_point"
    )
    log.info(f"{median_n_amenities} synthetic buildings per cell created")

    # TODO remove renaming after #722
    write_table_to_postgis(
        df_synthetic_buildings_without_amenities.rename(
            columns={
                "zensus_population_id": "cell_id",
                "egon_building_id": "id",
            }
        ),
        OsmBuildingsSynthetic,
        engine=engine,
        drop=False,
    )
    log.info("Synthetic buildings exported to DB")

    # Concat all buildings
    columns = [
        "zensus_population_id",
        "id",
        "geom_building",
        "n_amenities_inside",
        "source",
    ]

    df_buildings_with_amenities["source"] = "bwa"
    df_synthetic_buildings_with_amenities["source"] = "sbwa"
    df_buildings_without_amenities["source"] = "bwoa"
    df_synthetic_buildings_without_amenities["source"] = "sbwoa"

    df_cts_buildings = pd.concat(
        [
            df_buildings_with_amenities[columns],
            df_synthetic_buildings_with_amenities[columns],
            df_buildings_without_amenities[columns],
            df_synthetic_buildings_without_amenities[columns],
        ],
        axis=0,
        ignore_index=True,
    )
    df_cts_buildings = remove_double_bus_id(df_cts_buildings)
    log.info("Double bus_id checked")

    # TODO remove dtype correction after #722
    df_cts_buildings["id"] = df_cts_buildings["id"].astype(int)

    df_cts_buildings = gpd.GeoDataFrame(
        df_cts_buildings, geometry="geom_building", crs=3035
    )
    df_cts_buildings = df_cts_buildings.reset_index().rename(
        columns={"index": "serial"}
    )

    # Write table to db for debugging and postprocessing
    write_table_to_postgis(
        df_cts_buildings,
        CtsBuildings,
        engine=engine,
        drop=True,
    )
    log.info("CTS buildings exported to DB!")

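The `range`/`explode` idiom used above to create one row per synthetic amenity works like this (cell id and median value are invented):

```python
import pandas as pd

# One demand-only cell that should receive three synthetic amenities.
df_cells = pd.DataFrame({"zensus_population_id": [42]})
df_cells["amenities"] = 3

# Turn the amenity count into one row per amenity, then drop the helper.
df_cells["amenities"] = df_cells["amenities"].astype(int).apply(range)
df_cells = df_cells.explode("amenities").drop(columns="amenities")

print(len(df_cells))  # 3
```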
def cts_electricity():
    """
    Calculate the cts electricity demand share of the hvmv substation
    profile for buildings.
    """
    log.info("Start logging!")
    with db.session_scope() as session:
        cells_query = session.query(CtsBuildings)

    df_cts_buildings = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col=None
    )
    log.info("CTS buildings from DB imported!")
    df_demand_share_2035 = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon2035", sector="electricity"
    )
    log.info("Profile share for eGon2035 calculated!")

    df_demand_share_100RE = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon100RE", sector="electricity"
    )
    log.info("Profile share for eGon100RE calculated!")

    df_demand_share = pd.concat(
        [df_demand_share_2035, df_demand_share_100RE],
        axis=0,
        ignore_index=True,
    )
    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    write_table_to_postgres(
        df_demand_share,
        EgonCtsElectricityDemandBuildingShare,
        drop=True,
    )
    log.info("Profile share exported to DB!")

def cts_heat():
    """
    Calculate the cts heat demand share of the hvmv substation profile
    for buildings.
    """
    log.info("Start logging!")
    with db.session_scope() as session:
        cells_query = session.query(CtsBuildings)

    df_cts_buildings = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col=None
    )
    log.info("CTS buildings from DB imported!")

    df_demand_share_2035 = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon2035", sector="heat"
    )
    log.info("Profile share for eGon2035 calculated!")
    df_demand_share_100RE = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon100RE", sector="heat"
    )
    log.info("Profile share for eGon100RE calculated!")
    df_demand_share = pd.concat(
        [df_demand_share_2035, df_demand_share_100RE],
        axis=0,
        ignore_index=True,
    )

    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    write_table_to_postgres(
        df_demand_share,
        EgonCtsHeatDemandBuildingShare,
        drop=True,
    )
    log.info("Profile share exported to DB!")

def get_cts_electricity_peak_load():
    """
    Get electricity peak load of all CTS buildings for both scenarios and
    store in DB.
    """
    log.info("Start logging!")

    BuildingElectricityPeakLoads.__table__.create(bind=engine, checkfirst=True)

    # Delete rows with cts demand
    with db.session_scope() as session:
        session.query(BuildingElectricityPeakLoads).filter(
            BuildingElectricityPeakLoads.sector == "cts"
        ).delete()
    log.info("Cts electricity peak load removed from DB!")

    for scenario in ["eGon2035", "eGon100RE"]:

        with db.session_scope() as session:
            cells_query = session.query(
                EgonCtsElectricityDemandBuildingShare
            ).filter(
                EgonCtsElectricityDemandBuildingShare.scenario == scenario
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        with db.session_scope() as session:
            cells_query = session.query(EgonEtragoElectricityCts).filter(
                EgonEtragoElectricityCts.scn_name == scenario
            )

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="columns",
        )

        df_peak_load = pd.merge(
            left=df_cts_profiles.max().astype(float).rename("max"),
            right=df_demand_share,
            left_index=True,
            right_on="bus_id",
        )

        # Convert unit from MW to W
        df_peak_load["max"] = df_peak_load["max"] * 1e6
        df_peak_load["peak_load_in_w"] = (
            df_peak_load["max"] * df_peak_load["profile_share"]
        )
        log.info(f"Peak load for {scenario} determined!")

        # TODO remove after #772
        df_peak_load.rename(columns={"id": "building_id"}, inplace=True)
        df_peak_load["sector"] = "cts"

        # Write peak loads into db
        write_table_to_postgres(
            df_peak_load,
            BuildingElectricityPeakLoads,
            drop=False,
            index=False,
            if_exists="append",
        )

        log.info(f"Peak load for {scenario} exported to DB!")

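The peak-load step above scales the bus profile maximum (in MW) to watts and multiplies with each building's profile share; a toy example (all values invented):

```python
import pandas as pd

# Hypothetical bus peak of 2 MW and two building shares on that bus.
bus_peak_mw = pd.Series({1: 2.0}, name="max")
df_share = pd.DataFrame(
    {"bus_id": [1, 1], "building_id": [10, 11], "profile_share": [0.25, 0.75]}
)

df_peak = pd.merge(
    left=bus_peak_mw, right=df_share, left_index=True, right_on="bus_id"
)
# Convert MW to W, then scale by each building's share of the bus profile.
df_peak["peak_load_in_w"] = df_peak["max"] * 1e6 * df_peak["profile_share"]

print(df_peak["peak_load_in_w"].tolist())  # [500000.0, 1500000.0]
```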
1537 View Code Duplication
def get_cts_heat_peak_load():
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated in your project.
Loading history...
    """
    Get heat peak load of all CTS buildings for both scenarios and store in DB.
    """
    log.info("Start logging!")

    BuildingHeatPeakLoads.__table__.create(bind=engine, checkfirst=True)

    # Delete rows with cts demand
    with db.session_scope() as session:
        session.query(BuildingHeatPeakLoads).filter(
            BuildingHeatPeakLoads.sector == "cts"
        ).delete()
    log.info("Cts heat peak load removed from DB!")

    for scenario in ["eGon2035", "eGon100RE"]:

        with db.session_scope() as session:
            cells_query = session.query(
                EgonCtsElectricityDemandBuildingShare
            ).filter(
                EgonCtsElectricityDemandBuildingShare.scenario == scenario
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )
        log.info(f"Retrieved demand share for scenario: {scenario}")

        with db.session_scope() as session:
            cells_query = session.query(EgonEtragoHeatCts).filter(
                EgonEtragoHeatCts.scn_name == scenario
            )

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        log.info(f"Retrieved substation profiles for scenario: {scenario}")

        # Pivot the per-substation profile lists into one column per bus_id
        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="columns",
        )

        df_peak_load = pd.merge(
            left=df_cts_profiles.max().astype(float).rename("max"),
            right=df_demand_share,
            left_index=True,
            right_on="bus_id",
        )

        # Convert unit from MW to W
        df_peak_load["max"] = df_peak_load["max"] * 1e6
        df_peak_load["peak_load_in_w"] = (
            df_peak_load["max"] * df_peak_load["profile_share"]
        )
        log.info(f"Peak load for {scenario} determined!")

        # TODO remove after #772
        df_peak_load.rename(columns={"id": "building_id"}, inplace=True)
        df_peak_load["sector"] = "cts"

        # Write peak loads into db
        write_table_to_postgres(
            df_peak_load,
            BuildingHeatPeakLoads,
            drop=False,
            index=False,
            if_exists="append",
        )

        log.info(f"Peak load for {scenario} exported to DB!")

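The per-building peak load above is the profile maximum per substation, converted from MW to W and scaled by each building's profile share. A minimal standalone sketch with made-up bus ids, building ids and shares (not data from the pipeline):

```python
import pandas as pd

# Hypothetical substation heat profiles in MW, one column per bus_id
profiles = pd.DataFrame({1: [0.2, 0.5, 0.3], 2: [1.0, 0.8, 0.9]})
# Hypothetical per-building shares of each substation profile
shares = pd.DataFrame(
    {
        "building_id": [10, 11, 12],
        "bus_id": [1, 1, 2],
        "profile_share": [0.4, 0.6, 1.0],
    }
)

# Profile maximum per substation joined onto the share table
peak = pd.merge(
    left=profiles.max().astype(float).rename("max"),
    right=shares,
    left_index=True,
    right_on="bus_id",
)
# Convert MW -> W and split the substation peak by building share
peak["peak_load_in_w"] = peak["max"] * 1e6 * peak["profile_share"]
```

Building 10 here receives 0.4 of bus 1's 0.5 MW peak, i.e. 200 000 W.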
def assign_voltage_level_to_buildings():
    """
    Add voltage level to all buildings by summed peak demand.

    All entries with the same building id get the voltage level corresponding
    to their summed residential and cts peak demand.
    """

    with db.session_scope() as session:
        cells_query = session.query(BuildingElectricityPeakLoads)

        df_peak_loads = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )

    df_peak_load_buildings = df_peak_loads.groupby(
        ["building_id", "scenario"]
    )["peak_load_in_w"].sum()
    df_peak_load_buildings = df_peak_load_buildings.to_frame()
    df_peak_load_buildings.loc[:, "voltage_level"] = 0

    # Identify voltage_level by thresholds defined in the eGon project
    df_peak_load_buildings.loc[
        df_peak_load_buildings["peak_load_in_w"] <= 0.1 * 1e6, "voltage_level"
    ] = 7
    df_peak_load_buildings.loc[
        df_peak_load_buildings["peak_load_in_w"] > 0.1 * 1e6, "voltage_level"
    ] = 6
    df_peak_load_buildings.loc[
        df_peak_load_buildings["peak_load_in_w"] > 0.2 * 1e6, "voltage_level"
    ] = 5
    df_peak_load_buildings.loc[
        df_peak_load_buildings["peak_load_in_w"] > 5.5 * 1e6, "voltage_level"
    ] = 4
    df_peak_load_buildings.loc[
        df_peak_load_buildings["peak_load_in_w"] > 20 * 1e6, "voltage_level"
    ] = 3
    df_peak_load_buildings.loc[
        df_peak_load_buildings["peak_load_in_w"] > 120 * 1e6, "voltage_level"
    ] = 1

    # Broadcast the building-level voltage level back to the per-sector rows
    df_peak_load = pd.merge(
        left=df_peak_loads.drop(columns="voltage_level"),
        right=df_peak_load_buildings["voltage_level"],
        how="left",
        left_on=["building_id", "scenario"],
        right_index=True,
    )

    # Write peak loads into db: drop the table and replace it with the new data
    write_table_to_postgres(
        df_peak_load,
        BuildingElectricityPeakLoads,
        drop=True,
        index=False,
        if_exists="append",
    )
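The chained `.loc` assignments implement a monotone threshold mapping from peak load to eGon voltage level (note that level 2 is deliberately never assigned). Outside the module, the same mapping can be sketched more compactly with `pd.cut`; this is an illustrative standalone alternative, not the project's code:

```python
import pandas as pd


def voltage_level(peak_load_in_w: pd.Series) -> pd.Series:
    """Map peak load in W to eGon voltage levels 7, 6, 5, 4, 3, 1.

    Bins are right-closed, matching the `<=`/`>` thresholds above:
    e.g. exactly 0.1 MW falls into level 7, exactly 0.2 MW into level 6.
    """
    bins = [-float("inf"), 0.1e6, 0.2e6, 5.5e6, 20e6, 120e6, float("inf")]
    labels = [7, 6, 5, 4, 3, 1]
    return pd.cut(peak_load_in_w, bins=bins, labels=labels).astype(int)


# One example load per bracket, in W
loads = pd.Series([50_000, 150_000, 1e6, 10e6, 50e6, 200e6])
levels = voltage_level(loads)
```

A single `pd.cut` also guards against the ordering bug the chained form invites: with `.loc`, later assignments must use strictly larger thresholds or they silently overwrite correct values.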