Passed: Pull Request — dev (#991), created by unknown at 01:39

data.datasets.electricity_demand_timeseries.cts_buildings (Rating: F)

Complexity

Total Complexity 59

Size/Duplication

Total Lines 1588
Duplicated Lines 13.85 %

Importance

Changes 0
Metric   Value
wmc      59
eloc     775
dl       220
loc      1588
rs       3.905
c        0
b        0
f        0

17 Functions

Rating   Name   Duplication   Size   Complexity  
A place_buildings_with_amenities() 0 41 3
B create_synthetic_buildings() 0 66 6
A delete_synthetic_cts_buildings() 0 14 2
A remove_double_bus_id() 0 43 3
C calc_cts_building_profiles() 0 119 8
A amenities_without_buildings() 0 41 2
A select_cts_buildings() 0 25 1
B cells_with_cts_demand_only() 0 72 2
A calc_building_demand_profile_share() 0 81 2
C buildings_with_amenities() 0 183 5
B calc_census_cell_share() 0 67 5
B buildings_without_amenities() 0 99 2
A cts_heat() 37 37 2
B get_cts_electricity_peak_load() 72 72 5
B get_cts_heat_peak_load() 74 74 5
A cts_electricity() 37 37 2
B cts_buildings() 0 224 3

1 Method

Rating   Name   Duplication   Size   Complexity  
A CtsDemandBuildings.__init__() 0 9 1


Duplicated Code

Duplicate code is one of the most pungent code smells. A rule that is often used is to re-structure code once it is duplicated in three or more places.

Common duplication problems and corresponding solutions are:

Complexity

 Tip:   Before tackling complexity, make sure that you eliminate any duplication first. This often can reduce the size of classes significantly.

Complex classes like data.datasets.electricity_demand_timeseries.cts_buildings often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields/methods that share the same prefixes, or suffixes.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
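As a hypothetical sketch of the Extract Class refactoring described above, applied to a task class of the kind this module contains (all class and method names here are invented for illustration, not the module's actual API):

```python
# Before: one task class mixing building-geometry handling and
# demand accounting (hypothetical names)
class CtsBuildingsMixed:
    def load_building_geometries(self): ...
    def buffer_building_geometries(self): ...
    def calc_demand_share(self): ...
    def calc_demand_profile(self): ...


# After Extract Class: the calc_* methods share a prefix and form a
# cohesive component, so they move into a class of their own
class DemandShareCalculator:
    def calc_demand_share(self): ...
    def calc_demand_profile(self): ...


class CtsBuildingGeometries:
    def load_building_geometries(self): ...
    def buffer_building_geometries(self): ...
```

Each resulting class now changes for one reason only, which is exactly what the shared-prefix heuristic is meant to surface.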

"""
CTS electricity and heat demand time series for scenarios in 2035 and 2050
assigned to OSM-buildings.

Disaggregation of cts heat & electricity demand time series from MV substation
to census cells via annual demand and then to OSM buildings via
amenity tags or randomly if no sufficient OSM data is available in the
respective census cell. If no OSM buildings or synthetic residential buildings
are available, new synthetic 5x5m buildings are generated.

The resulting data is stored in separate tables:

* `openstreetmap.osm_buildings_synthetic`:
    Lists generated synthetic buildings with id, zensus_population_id and
    building type. This table is already created within
    :func:`hh_buildings.map_houseprofiles_to_buildings()`
* `openstreetmap.egon_cts_buildings`:
    Table of all selected cts buildings with id, census cell id, geometry and
    amenity count in building. This table is created within
    :func:`cts_buildings()`
* `demand.egon_cts_electricity_demand_building_share`:
    Table including the MV substation electricity profile share of all
    selected cts buildings for the scenarios eGon2035 and eGon100RE. This
    table is created within :func:`cts_electricity()`
* `demand.egon_cts_heat_demand_building_share`:
    Table including the MV substation heat profile share of all selected
    cts buildings for the scenarios eGon2035 and eGon100RE. This table is
    created within :func:`cts_heat()`
* `demand.egon_building_peak_loads`:
    Mapping of demand time series and buildings including cell_id, building
    area and peak load. This table is already created within
    :func:`hh_buildings.get_building_peak_loads()`

**The following datasets from the database are mainly used for creation:**

* `openstreetmap.osm_buildings_filtered`:
    Table of OSM buildings filtered by tags to select residential and cts
    buildings only.
* `openstreetmap.osm_amenities_shops_filtered`:
    Table of OSM amenities filtered by tags to select cts only.
* `openstreetmap.osm_amenities_not_in_buildings_filtered`:
    Table of amenities which do not intersect with any building from
    `openstreetmap.osm_buildings_filtered`
* `openstreetmap.osm_buildings_synthetic`:
    Table of synthetic residential buildings
* `boundaries.egon_map_zensus_buildings_filtered_all`:
    Mapping table of census cells and buildings, including cells with a
    population of 0.
* `demand.egon_demandregio_zensus_electricity`:
    Table of annual electricity load demand for residential and cts at census
    cell level. Residential load demand is derived from aggregated residential
    building profiles. DemandRegio CTS load demand at NUTS3 level is
    distributed to census cells linearly to the heat demand from Peta5.
* `demand.egon_peta_heat`:
    Table of annual heat load demand for residential and cts at census cell
    level from Peta5.
* `demand.egon_etrago_electricity_cts`:
    Scaled cts electricity time series for every MV substation. Derived from
    DemandRegio SLP for selected economic sectors at NUTS3 level. Scaled with
    the annual demand from `demand.egon_demandregio_zensus_electricity`
* `demand.egon_etrago_heat_cts`:
    Scaled cts heat time series for every MV substation. Derived from
    DemandRegio SLP Gas for selected economic sectors at NUTS3 level. Scaled
    with the annual demand from `demand.egon_peta_heat`.

**What is the goal?**

To disaggregate cts heat and electricity time series from MV substation level
to geo-referenced buildings, the annual demand from DemandRegio and Peta5 is
used to identify census cells with load demand. We use OpenStreetMap data and
filter tags to identify buildings and count the amenities within. The number
of amenities and the annual demand serve to assign a demand share of the MV
substation profile to the building.

**What is the challenge?**

The OSM, DemandRegio and Peta5 datasets differ from each other. The OSM
dataset is a community-based dataset which is continuously extended and does
not claim to be complete. Therefore, not all census cells which have a demand
assigned by the DemandRegio or Peta5 methodology also have buildings with the
respective tags, or sometimes even any building at all. Furthermore, the
substation load areas are determined dynamically in a previous dataset. When
merging these datasets, their different scopes (census cell shapes, building
shapes) and inconsistencies need to be addressed. For example: not yet tagged
buildings or amenities in OSM, or building shapes exceeding census cells.


**How are these datasets combined?**


The methodology for heat and electricity is the same and only differs in the
annual demand and MV/HV substation profile. In a previous dataset
(openstreetmap), we filter all OSM buildings and amenities for tags we relate
to the cts sector. Amenities are mapped to intersecting buildings and then
intersected with the annual demand which exists at census cell level. We
obtain census cells with demand, both with and without amenities. If there is
no data on amenities, n synthetic ones are assigned to existing buildings
(all filtered buildings plus synthetic residential buildings); n is the
median number of amenities per census cell. If no building data is available,
a synthetic building is randomly generated. This also happens for amenities
which couldn't be assigned to any OSM building. All census cells with an
annual demand are covered this way, and we obtain four different categories:

* Buildings with amenities
* Synthetic buildings with amenities
* Buildings with synthetic amenities
* Synthetic buildings with synthetic amenities

The amenities are summed per census cell (of the amenity) and building to
derive the building amenity share per census cell. Multiplied with the annual
demand, we receive the profile demand share for each cell. Some buildings
exceed the census cell shape and have amenities in different cells although
they are mapped to one building only. To obtain unique buildings, the demand
share is summed once more per building id. This factor can now be used to
obtain the profile for each building.

A schematic flow chart exists in the corresponding issue #671:
https://github.com/openego/eGon-data/issues/671#issuecomment-1260740258
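The share computation described above can be reduced to a toy example (the column names follow this module; the demand and amenity values are invented for illustration):

```python
import pandas as pd

# Toy data: building 1 spans census cells 10 and 11, building 2 lies in 11.
df = pd.DataFrame(
    {
        "building_id": [1, 1, 2],
        "zensus_population_id": [10, 11, 11],
        "n_amenities_inside": [2, 1, 3],
        "annual_demand": [500.0, 800.0, 800.0],  # demand of the census cell
    }
)

# amenity share of each building within its census cell
df["cell_share"] = df["n_amenities_inside"] / df.groupby(
    "zensus_population_id"
)["n_amenities_inside"].transform("sum")

# demand share per building and cell, then summed once more per building id
# so that buildings spanning multiple cells become unique
df["demand_share"] = df["cell_share"] * df["annual_demand"]
demand_share_per_building = df.groupby("building_id")["demand_share"].sum()
```

Building 1 takes all of cell 10 (2 of 2 amenities) plus a quarter of cell 11 (1 of 4 amenities), building 2 the remaining three quarters of cell 11; multiplying each share by the cell's annual demand and summing per building yields the per-building factor applied to the substation profile.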


**What are central assumptions during the data processing?**

* We assume OSM data to be the most reliable and complete open source dataset.
* We assume building and amenity tags to be truthful and accurate.
* Mapping census to OSM data is not trivial. Discrepancies are substituted.
* Missing OSM buildings are generated for each amenity.
* Missing amenities are generated by the median value of amenities per census
cell.


**Drawbacks and limitations of the data**

* The shape of the profile for each building is similar within one substation
load area and only scaled differently.
* Load areas are generated dynamically. In case of buildings with amenities
exceeding borders, the number of amenities is transferred randomly to either
of the two load areas.
* The completeness of the OSM data depends on community contribution and is
crucial to the quality of our results.
* Randomly selected buildings and generated amenities may inadequately reflect
reality, but are chosen for the sake of simplicity as a measure to fill data
gaps.
* Since this dataset is a cascade after the generation of synthetic
residential buildings, also check the drawbacks and limitations in
hh_buildings.py.
* Synthetic buildings may be placed within OSM buildings which exceed multiple
census cells. This is currently accepted but may be solved in #953.
* Scattered high peak loads occur and might lead to single MV grid connections
in ding0. In some cases this might not be viable. Postprocessing is needed and
may be solved in #954.


Example Query
-------------


Notes
-----

This module docstring is rather a dataset documentation. Once a decision
is made in ... the content of this module docstring needs to be moved to
the docs attribute of the respective dataset class.
"""

from geoalchemy2 import Geometry
from geoalchemy2.shape import to_shape
from psycopg2.extensions import AsIs, register_adapter
from sqlalchemy import REAL, Column, Integer, String, func
from sqlalchemy.ext.declarative import declarative_base
import geopandas as gpd
import numpy as np
import pandas as pd
import saio

from egon.data import db
from egon.data import logger as log
from egon.data.datasets import Dataset
from egon.data.datasets.electricity_demand import (
    EgonDemandRegioZensusElectricity,
)
from egon.data.datasets.electricity_demand.temporal import (
    EgonEtragoElectricityCts,
)
from egon.data.datasets.electricity_demand_timeseries.hh_buildings import (
    BuildingElectricityPeakLoads,
    OsmBuildingsSynthetic,
)
from egon.data.datasets.electricity_demand_timeseries.tools import (
    random_ints_until_sum,
    random_point_in_square,
    specific_int_until_sum,
    write_table_to_postgis,
    write_table_to_postgres,
)
from egon.data.datasets.heat_demand import EgonPetaHeat
from egon.data.datasets.heat_demand_timeseries import EgonEtragoHeatCts
from egon.data.datasets.zensus_mv_grid_districts import MapZensusGridDistricts
from egon.data.datasets.zensus_vg250 import DestatisZensusPopulationPerHa

engine = db.engine()
Base = declarative_base()

# import db tables
saio.register_schema("openstreetmap", engine=engine)
saio.register_schema("boundaries", engine=engine)


class EgonCtsElectricityDemandBuildingShare(Base):
    __tablename__ = "egon_cts_electricity_demand_building_share"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    bus_id = Column(Integer, index=True)
    profile_share = Column(REAL)


class EgonCtsHeatDemandBuildingShare(Base):
    __tablename__ = "egon_cts_heat_demand_building_share"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    bus_id = Column(Integer, index=True)
    profile_share = Column(REAL)


class CtsBuildings(Base):
    __tablename__ = "egon_cts_buildings"
    __table_args__ = {"schema": "openstreetmap"}

    serial = Column(Integer, primary_key=True)
    id = Column(Integer, index=True)
    zensus_population_id = Column(Integer, index=True)
    geom_building = Column(Geometry("Polygon", 3035))
    n_amenities_inside = Column(Integer)
    source = Column(String)


class BuildingHeatPeakLoads(Base):
    __tablename__ = "egon_building_heat_peak_loads"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    sector = Column(String, primary_key=True)
    peak_load_in_w = Column(REAL)


class CtsDemandBuildings(Dataset):
    def __init__(self, dependencies):
        super().__init__(
            name="CtsDemandBuildings",
            version="0.0.1",
            dependencies=dependencies,
            tasks=(
                cts_buildings,
                {cts_electricity, cts_heat},
                {get_cts_electricity_peak_load, get_cts_heat_peak_load},
            ),
        )


def amenities_without_buildings():
    """
    Amenities which have no buildings assigned and are in a cell with cts
    demand are determined.

    Returns
    -------
    pd.DataFrame
        Table of amenities without buildings
    """
    from saio.openstreetmap import osm_amenities_not_in_buildings_filtered

    with db.session_scope() as session:
        cells_query = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
                osm_amenities_not_in_buildings_filtered.geom_amenity,
                osm_amenities_not_in_buildings_filtered.egon_amenity_id,
            )
            .filter(
                func.st_within(
                    osm_amenities_not_in_buildings_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .filter(
                DestatisZensusPopulationPerHa.id
                == EgonDemandRegioZensusElectricity.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
        )

    df_amenities_without_buildings = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom_amenity",
    )
    return df_amenities_without_buildings


def place_buildings_with_amenities(df, amenities=None, max_amenities=None):
    """
    Building centroids are placed randomly within census cells.
    The number of buildings is derived from n_amenity_inside, the selected
    method and the number of amenities per building.

    Returns
    -------
    df: gpd.GeoDataFrame
        Table of building centroids
    """
    if isinstance(max_amenities, int):
        # amount of amenities is randomly generated within bounds
        # (max_amenities, amenities per cell)
        df["n_amenities_inside"] = df["n_amenities_inside"].apply(
            random_ints_until_sum, args=[max_amenities]
        )
    if isinstance(amenities, int):
        # specific amount of amenities per building
        df["n_amenities_inside"] = df["n_amenities_inside"].apply(
            specific_int_until_sum, args=[amenities]
        )

    # unnest each building
    df = df.explode(column="n_amenities_inside")

    # building count per cell
    df["building_count"] = df.groupby(["zensus_population_id"]).cumcount() + 1

    # generate random synthetic buildings
    edge_length = 5
    # create random points within census cells
    points = random_point_in_square(geom=df["geom"], tol=edge_length / 2)

    df.reset_index(drop=True, inplace=True)
    # store the center of the polygon
    df["geom_point"] = points
    # drop the geometry of the census cell
    df = df.drop(columns=["geom"])

    return df
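`random_point_in_square` comes from the tools module of this package; its exact implementation is not shown here, but its assumed behavior can be sketched with a minimal stand-in that returns plain coordinate tuples instead of geometries (the function name is reused only for illustration):

```python
import numpy as np

def random_point_in_square(bounds, tol):
    """Hypothetical stand-in: draw one uniform random point inside a
    rectangular cell, kept at least `tol` away from the border so a
    building of edge length 2 * tol fits entirely inside the cell."""
    minx, miny, maxx, maxy = bounds
    x = np.random.uniform(minx + tol, maxx - tol)
    y = np.random.uniform(miny + tol, maxy - tol)
    return x, y

# 100m x 100m census cell, 5m building edge -> tol = 2.5
x, y = random_point_in_square((0.0, 0.0, 100.0, 100.0), tol=2.5)
```

The margin of `edge_length / 2` used in the call above guarantees that the 5x5m square generated around the point later on never crosses the cell border.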


def create_synthetic_buildings(df, points=None, crs="EPSG:3035"):
    """
    Synthetic buildings are generated around points.

    Parameters
    ----------
    df: pd.DataFrame
        Table of census cells
    points: gpd.GeoSeries or str
        List of points to place buildings around or column name of df
    crs: str
        CRS of result table

    Returns
    -------
    df: gpd.GeoDataFrame
        Synthetic buildings
    """

    if isinstance(points, str) and points in df.columns:
        points = df[points]
    elif isinstance(points, gpd.GeoSeries):
        pass
    else:
        raise ValueError("Points are of the wrong type")

    # create building using a square around point
    edge_length = 5
    df["geom_building"] = points.buffer(distance=edge_length / 2, cap_style=3)

    if "geom_point" not in df.columns:
        df["geom_point"] = df["geom_building"].centroid

    df = gpd.GeoDataFrame(
        df,
        crs=crs,
        geometry="geom_building",
    )

    # TODO remove after #772 implementation of egon_building_id
    df.rename(columns={"id": "egon_building_id"}, inplace=True)

    # get max number of building ids from synthetic residential table
    with db.session_scope() as session:
        max_synth_residential_id = session.execute(
            func.max(OsmBuildingsSynthetic.id)
        ).scalar()
    max_synth_residential_id = int(max_synth_residential_id)

    # create sequential ids
    df["egon_building_id"] = range(
        max_synth_residential_id + 1,
        max_synth_residential_id + df.shape[0] + 1,
    )

    df["area"] = df["geom_building"].area
    # set building type of synthetic building
    df["building"] = "cts"
    # TODO remove after #772
    df = df.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id",
        }
    )
    return df
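`buffer(distance=edge_length / 2, cap_style=3)` uses shapely's square cap style, so each point grows into an axis-aligned square of side `edge_length`. The corner arithmetic behind that can be checked by hand with a small helper (written here only to illustrate the geometry, not part of the module):

```python
def square_around_point(x, y, edge_length=5.0):
    """Corners of the square that a square-cap buffer of radius
    edge_length / 2 produces around (x, y), counter-clockwise
    starting at the lower-left corner."""
    h = edge_length / 2
    return [
        (x - h, y - h),
        (x + h, y - h),
        (x + h, y + h),
        (x - h, y + h),
    ]

corners = square_around_point(0.0, 0.0)
```

With `edge_length = 5` this yields the 5x5m footprint mentioned in the module docstring, and its centroid coincides with the original point.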


def buildings_with_amenities():
    """
    Amenities which are assigned to buildings are determined
    and grouped per building and zensus cell. Buildings
    covering multiple cells therefore exist multiple times
    but in different zensus cells. This is necessary to cover
    all cells with a cts demand. If buildings exist in multiple
    substations, their amenities are summed and kept in
    one substation only. If, as a result, a census cell is uncovered,
    a synthetic amenity is placed. The buildings are aggregated
    afterwards during the calculation of the profile_share.

    Returns
    -------
    df_buildings_with_amenities: gpd.GeoDataFrame
        Contains all buildings with amenities per zensus cell.
    df_lost_cells: gpd.GeoDataFrame
        Contains synthetic amenities in lost cells. Might be empty
    """

    from saio.openstreetmap import osm_amenities_in_buildings_filtered

    with db.session_scope() as session:
        cells_query = (
            session.query(
                osm_amenities_in_buildings_filtered,
                MapZensusGridDistricts.bus_id,
            )
            .filter(
                MapZensusGridDistricts.zensus_population_id
                == osm_amenities_in_buildings_filtered.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id
                == osm_amenities_in_buildings_filtered.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
        )
    df_amenities_in_buildings = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col=None
    )

    df_amenities_in_buildings["geom_building"] = df_amenities_in_buildings[
        "geom_building"
    ].apply(to_shape)
    df_amenities_in_buildings["geom_amenity"] = df_amenities_in_buildings[
        "geom_amenity"
    ].apply(to_shape)

    df_amenities_in_buildings["n_amenities_inside"] = 1

    # add identifier column for buildings in multiple substations
    df_amenities_in_buildings[
        "duplicate_identifier"
    ] = df_amenities_in_buildings.groupby(["id", "bus_id"])[
        "n_amenities_inside"
    ].transform("cumsum")
    df_amenities_in_buildings = df_amenities_in_buildings.sort_values(
        ["id", "duplicate_identifier"]
    )
    # sum amenities of buildings with multiple substations
    df_amenities_in_buildings[
        "n_amenities_inside"
    ] = df_amenities_in_buildings.groupby(["id", "duplicate_identifier"])[
        "n_amenities_inside"
    ].transform("sum")

    # create column to always go for bus_id with max amenities
    df_amenities_in_buildings[
        "max_amenities"
    ] = df_amenities_in_buildings.groupby(["id", "bus_id"])[
        "n_amenities_inside"
    ].transform("sum")
    # sort to prefer the substation with the most amenities
    df_amenities_in_buildings.sort_values(
        ["id", "max_amenities"], ascending=False, inplace=True
    )

    # identify lost zensus cells
    df_lost_cells = df_amenities_in_buildings.loc[
        df_amenities_in_buildings.duplicated(
            subset=["id", "duplicate_identifier"], keep="first"
        )
    ]
    df_lost_cells.drop_duplicates(
        subset=["zensus_population_id"], inplace=True
    )

    # drop buildings with multiple substations and lower max amenity
    df_amenities_in_buildings.drop_duplicates(
        subset=["id", "duplicate_identifier"], keep="first", inplace=True
    )

    # check if lost zensus cells are already covered
    if not df_lost_cells.empty:
        if not (
            df_amenities_in_buildings["zensus_population_id"]
            .isin(df_lost_cells["zensus_population_id"])
            .empty
        ):
            # query geom data for cell if not
            with db.session_scope() as session:
                cells_query = session.query(
                    DestatisZensusPopulationPerHa.id,
                    DestatisZensusPopulationPerHa.geom,
                ).filter(
                    DestatisZensusPopulationPerHa.id.in_(
                        df_lost_cells["zensus_population_id"]
                    )
                )

            df_lost_cells = gpd.read_postgis(
                cells_query.statement,
                cells_query.session.bind,
                geom_col="geom",
            )

            # place random amenity in cell
            df_lost_cells["n_amenities_inside"] = 1
            df_lost_cells.rename(
                columns={
                    "id": "zensus_population_id",
                },
                inplace=True,
            )
            df_lost_cells = place_buildings_with_amenities(
                df_lost_cells, amenities=1
            )
            df_lost_cells.rename(
                columns={
                    # "id": "zensus_population_id",
                    "geom_point": "geom_amenity",
                },
                inplace=True,
            )
            df_lost_cells.drop(
                columns=["building_count", "n_amenities_inside"], inplace=True
            )
        else:
            df_lost_cells = None
    else:
        df_lost_cells = None

    # drop helper columns
    df_amenities_in_buildings.drop(
        columns=["duplicate_identifier"], inplace=True
    )

    # sum amenities per building and cell
    df_amenities_in_buildings[
        "n_amenities_inside"
    ] = df_amenities_in_buildings.groupby(["zensus_population_id", "id"])[
        "n_amenities_inside"
    ].transform("sum")
    # drop duplicated buildings
    df_buildings_with_amenities = df_amenities_in_buildings.drop_duplicates(
        ["id", "zensus_population_id"]
    )
    df_buildings_with_amenities.reset_index(inplace=True, drop=True)

    df_buildings_with_amenities = df_buildings_with_amenities[
        ["id", "zensus_population_id", "geom_building", "n_amenities_inside"]
    ]
    df_buildings_with_amenities.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id"
        },
        inplace=True,
    )

    return df_buildings_with_amenities, df_lost_cells
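The groupby/transform pattern used above to keep a building in the substation with the most amenities can be reduced to a toy example (the real function additionally tracks a `duplicate_identifier` and lost cells; all ids and counts here are invented):

```python
import pandas as pd

# building 1 intersects substations 10 (two amenities) and 20 (one amenity)
df = pd.DataFrame(
    {
        "id": [1, 1, 1, 2],
        "bus_id": [10, 10, 20, 20],
        "n_amenities_inside": [1, 1, 1, 1],
    }
)

# total amenities per building and substation
df["max_amenities"] = df.groupby(["id", "bus_id"])[
    "n_amenities_inside"
].transform("sum")

# sort so the substation with the most amenities comes first per building,
# then keep only that substation's rows for each building
df = df.sort_values(["id", "max_amenities"], ascending=[True, False])
df = df[df.groupby("id")["bus_id"].transform("first") == df["bus_id"]]
```

Building 1 keeps its rows in substation 10 and drops the row in substation 20; in the real function, the cells dropped this way are the "lost cells" that receive a synthetic amenity afterwards.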


def buildings_without_amenities():
    """
    Buildings (filtered and synthetic) in cells with
    cts demand but no amenities are determined.

    Returns
    -------
    df_buildings_without_amenities: gpd.GeoDataFrame
        Table of buildings without amenities in zensus cells
        with cts demand.
    """
    from saio.boundaries import egon_map_zensus_buildings_filtered_all
    from saio.openstreetmap import (
        osm_amenities_shops_filtered,
        osm_buildings_filtered,
        osm_buildings_synthetic,
    )

    # buildings_filtered in cts-demand-cells without amenities
    with db.session_scope() as session:

        # synthetic buildings
        q_synth_buildings = session.query(
            osm_buildings_synthetic.cell_id.cast(Integer).label(
                "zensus_population_id"
            ),
            osm_buildings_synthetic.id.cast(Integer).label("id"),
            osm_buildings_synthetic.area.label("area"),
            osm_buildings_synthetic.geom_building.label("geom_building"),
            osm_buildings_synthetic.geom_point.label("geom_point"),
        )

        # filtered buildings
        q_buildings_filtered = session.query(
            egon_map_zensus_buildings_filtered_all.zensus_population_id,
            osm_buildings_filtered.id,
            osm_buildings_filtered.area,
            osm_buildings_filtered.geom_building,
            osm_buildings_filtered.geom_point,
        ).filter(
            osm_buildings_filtered.id
            == egon_map_zensus_buildings_filtered_all.id
        )

        # amenities + zensus_population_id
        q_amenities = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
            )
            .filter(
                func.st_within(
                    osm_amenities_shops_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .distinct(DestatisZensusPopulationPerHa.id)
        )

        # cells with CTS demand but without amenities
        q_cts_without_amenities = (
            session.query(
                EgonDemandRegioZensusElectricity.zensus_population_id,
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id.notin_(
                    q_amenities
                )
            )
            .distinct()
        )

        # filtered buildings + synthetic residential buildings in cells
        # with CTS demand but without amenities
        cells_query = q_synth_buildings.union(q_buildings_filtered).filter(
            egon_map_zensus_buildings_filtered_all.zensus_population_id.in_(
                q_cts_without_amenities
            )
        )

    df_buildings_without_amenities = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom_building",
    )

    df_buildings_without_amenities = df_buildings_without_amenities.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id",
        }
    )

    return df_buildings_without_amenities


def select_cts_buildings(df_buildings_wo_amenities, max_n):
    """
    N buildings (filtered and synthetic) in each cell with
    cts demand are selected. Only the first n buildings
    are taken for each cell. The buildings are sorted by surface
    area.

    Returns
    -------
    df_buildings_with_cts_demand: gpd.GeoDataFrame
        Table of buildings
    """

    df_buildings_wo_amenities.sort_values(
        "area", ascending=False, inplace=True
    )
    # select the first n ids of each census cell if available
    df_buildings_with_cts_demand = (
        df_buildings_wo_amenities.groupby("zensus_population_id")
        .nth(list(range(max_n)))
        .reset_index()
    )
    df_buildings_with_cts_demand.reset_index(drop=True, inplace=True)

    return df_buildings_with_cts_demand
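`groupby(...).nth(list(range(max_n)))` keeps up to `max_n` rows per group, which is why cells with fewer buildings are not an error. A toy run with invented values:

```python
import pandas as pd

# cell 1 has three buildings, cell 2 only one
df = pd.DataFrame(
    {
        "zensus_population_id": [1, 1, 1, 2],
        "id": [11, 12, 13, 21],
        "area": [30.0, 50.0, 20.0, 40.0],
    }
)

max_n = 2
# sort by area so the largest buildings are picked first
df = df.sort_values("area", ascending=False)
# up to the first max_n rows of each census cell
selected = df.groupby("zensus_population_id").nth(list(range(max_n)))
```

Cell 1 contributes its two largest buildings (ids 12 and 11), cell 2 its single building (id 21); the smallest building in cell 1 (id 13) is dropped.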


def cells_with_cts_demand_only(df_buildings_without_amenities):
    """
    Cells with cts demand but no amenities or buildings
    are determined.

    Returns
    -------
    df_cells_only_cts_demand: gpd.GeoDataFrame
        Table of cells with cts demand but no amenities or buildings
    """
    from saio.openstreetmap import osm_amenities_shops_filtered

    # cells with amenities
    with db.session_scope() as session:
        sub_query = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
            )
            .filter(
                func.st_within(
                    osm_amenities_shops_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .distinct(DestatisZensusPopulationPerHa.id)
        )

        cells_query = (
            session.query(
                EgonDemandRegioZensusElectricity.zensus_population_id,
                EgonDemandRegioZensusElectricity.scenario,
                EgonDemandRegioZensusElectricity.sector,
                EgonDemandRegioZensusElectricity.demand,
                DestatisZensusPopulationPerHa.geom,
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id.notin_(
                    sub_query
                )
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id
                == DestatisZensusPopulationPerHa.id
            )
        )

    df_cts_cell_without_amenities = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom",
        index_col=None,
    )

    # TODO remove after #722
    df_buildings_without_amenities = df_buildings_without_amenities.rename(
        columns={"cell_id": "zensus_population_id"}
    )

    # census cells with only cts demand
    df_cells_only_cts_demand = df_cts_cell_without_amenities.loc[
        ~df_cts_cell_without_amenities["zensus_population_id"].isin(
            df_buildings_without_amenities["zensus_population_id"].unique()
        )
    ]

    df_cells_only_cts_demand.reset_index(drop=True, inplace=True)

    return df_cells_only_cts_demand
800
801
802
def calc_census_cell_share(scenario, sector):
    """
    The profile share for each census cell is calculated by its
    share of annual demand per substation bus. The annual demand
    per cell is defined by DemandRegio/Peta5. The share is identical
    for both scenarios as the annual demand is linearly scaled.

    Parameters
    ----------
    scenario: str
        Scenario for which the share is calculated: "eGon2035" or "eGon100RE"
    sector: str
        Sector for which the share is calculated: "electricity" or "heat"

    Returns
    -------
    df_census_share: pd.DataFrame
    """
    if sector == "electricity":
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonDemandRegioZensusElectricity,
                    MapZensusGridDistricts.bus_id,
                )
                .filter(EgonDemandRegioZensusElectricity.sector == "service")
                .filter(EgonDemandRegioZensusElectricity.scenario == scenario)
                .filter(
                    EgonDemandRegioZensusElectricity.zensus_population_id
                    == MapZensusGridDistricts.zensus_population_id
                )
            )

    elif sector == "heat":
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonPetaHeat, MapZensusGridDistricts.bus_id)
                .filter(EgonPetaHeat.sector == "service")
                .filter(EgonPetaHeat.scenario == scenario)
                .filter(
                    EgonPetaHeat.zensus_population_id
                    == MapZensusGridDistricts.zensus_population_id
                )
            )

    else:
        raise ValueError(f"Sector unknown: {sector}")

    df_demand = pd.read_sql(
        cells_query.statement,
        cells_query.session.bind,
        index_col="zensus_population_id",
    )

    # get demand share of cell per bus
    df_census_share = df_demand["demand"] / df_demand.groupby("bus_id")[
        "demand"
    ].transform("sum")
    df_census_share = df_census_share.rename("cell_share")

    df_census_share = pd.concat(
        [
            df_census_share,
            df_demand[["bus_id", "scenario"]],
        ],
        axis=1,
    )

    df_census_share.reset_index(inplace=True)
    return df_census_share


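The groupby-transform pattern used for the cell share can be illustrated with a toy demand table (hypothetical cell, bus, and demand values):

```python
import pandas as pd

# Hypothetical annual demand per census cell with its substation bus
df_demand = pd.DataFrame(
    {
        "zensus_population_id": [1, 2, 3],
        "bus_id": [100, 100, 200],
        "demand": [30.0, 70.0, 50.0],
    }
).set_index("zensus_population_id")

# Share of each cell in the annual demand of its substation bus
cell_share = df_demand["demand"] / df_demand.groupby("bus_id")[
    "demand"
].transform("sum")
```

Cells 1 and 2 share bus 100, so their shares are 0.3 and 0.7; cell 3 is alone on bus 200 and gets a share of 1.0. By construction the shares of each bus sum to one.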
def calc_building_demand_profile_share(
    df_cts_buildings, scenario="eGon2035", sector="electricity"
):
    """
    Calculate the share of the cts demand profile per bus for every
    selected building. The building-amenity share is multiplied with the
    census cell share to get the substation bus profile share for each
    building. The share is grouped and aggregated per building as some
    buildings exceed the shape of census cells and have amenities assigned
    from multiple cells. Such buildings therefore get the amenity share of
    all census cells they intersect.

    Parameters
    ----------
    df_cts_buildings: gpd.GeoDataFrame
        Table of all buildings with cts demand assigned
    scenario: str
        Scenario for which the share is calculated.
    sector: str
        Sector for which the share is calculated.

    Returns
    -------
    df_building_share: pd.DataFrame
        Table of bus profile share per building

    """

    def calc_building_amenity_share(df_cts_buildings):
        """
        Calculate the building share by the number of amenities per
        building within a census cell. Building ids can exist multiple
        times but with different zensus_population_ids.
        """
        df_building_amenity_share = df_cts_buildings[
            "n_amenities_inside"
        ] / df_cts_buildings.groupby("zensus_population_id")[
            "n_amenities_inside"
        ].transform(
            "sum"
        )
        df_building_amenity_share = pd.concat(
            [
                df_building_amenity_share.rename("building_amenity_share"),
                df_cts_buildings[["zensus_population_id", "id"]],
            ],
            axis=1,
        )
        return df_building_amenity_share

    df_building_amenity_share = calc_building_amenity_share(df_cts_buildings)

    df_census_cell_share = calc_census_cell_share(
        scenario=scenario, sector=sector
    )

    df_demand_share = pd.merge(
        left=df_building_amenity_share,
        right=df_census_cell_share,
        on="zensus_population_id",
    )
    df_demand_share["profile_share"] = df_demand_share[
        "building_amenity_share"
    ].multiply(df_demand_share["cell_share"])

    # only pass selected columns
    df_demand_share = df_demand_share[
        ["id", "bus_id", "scenario", "profile_share"]
    ]

    # Group and aggregate per building for multi cell buildings
    df_demand_share = (
        df_demand_share.groupby(["scenario", "id", "bus_id"])
        .sum()
        .reset_index()
    )
    if df_demand_share.duplicated("id", keep=False).any():
        print(
            df_demand_share.loc[df_demand_share.duplicated("id", keep=False)]
        )
    return df_demand_share


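The two-level share calculation (amenity share within a cell times the cell's share of its bus) can be sketched with hypothetical numbers — two buildings in one census cell with 1 and 3 amenities, where the cell covers 40 % of its bus' annual demand:

```python
import pandas as pd

# Hypothetical toy data for one census cell
df = pd.DataFrame(
    {
        "id": [1, 2],
        "zensus_population_id": [42, 42],
        "n_amenities_inside": [1, 3],
        "cell_share": [0.4, 0.4],
    }
)

# Amenity share within the cell, then the bus-level profile share
df["building_amenity_share"] = df["n_amenities_inside"] / df.groupby(
    "zensus_population_id"
)["n_amenities_inside"].transform("sum")
df["profile_share"] = df["building_amenity_share"] * df["cell_share"]
```

The two profile shares (0.25 × 0.4 and 0.75 × 0.4) sum to the cell share of 0.4, so no demand is lost or double-counted within the cell.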
def calc_cts_building_profiles(
    egon_building_ids,
    bus_ids,
    scenario,
    sector,
):
    """
    Calculate the cts demand profile for each building. The profile is
    calculated by the demand share of the building per substation bus.

    Parameters
    ----------
    egon_building_ids: list of int
        Ids of the buildings for which the profiles are calculated.
    bus_ids: list of int
        Ids of the substations for which selected building profiles are
        calculated.
    scenario: str
        Scenario for which the share is calculated: "eGon2035" or "eGon100RE"
    sector: str
        Sector for which the share is calculated: "electricity" or "heat"

    Returns
    -------
    df_building_profiles: pd.DataFrame
        Table of demand profile per building
    """
    if sector == "electricity":
        # Get cts building electricity demand share of selected buildings
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonCtsElectricityDemandBuildingShare,
                )
                .filter(
                    EgonCtsElectricityDemandBuildingShare.scenario == scenario
                )
                .filter(
                    EgonCtsElectricityDemandBuildingShare.building_id.in_(
                        egon_building_ids
                    )
                )
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        # Get substation cts electricity load profiles of selected bus_ids
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonEtragoElectricityCts).filter(
                    EgonEtragoElectricityCts.scn_name == scenario
                )
            ).filter(EgonEtragoElectricityCts.bus_id.in_(bus_ids))

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="index",
        )
        # df_cts_profiles = calc_load_curves_cts(scenario)

    elif sector == "heat":
        # Get cts building heat demand share of selected buildings
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonCtsHeatDemandBuildingShare,
                )
                .filter(EgonCtsHeatDemandBuildingShare.scenario == scenario)
                .filter(
                    EgonCtsHeatDemandBuildingShare.building_id.in_(
                        egon_building_ids
                    )
                )
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        # Get substation cts heat load profiles of selected bus_ids
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonEtragoHeatCts).filter(
                    EgonEtragoHeatCts.scn_name == scenario
                )
            ).filter(EgonEtragoHeatCts.bus_id.in_(bus_ids))

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="index",
        )

    else:
        raise ValueError(f"Sector unknown: {sector}")

    # TODO remove after #722
    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    # get demand profile for all buildings for selected demand share
    df_building_profiles = pd.DataFrame()
    for bus_id, df in df_demand_share.groupby("bus_id"):
        shares = df.set_index("building_id", drop=True)["profile_share"]
        profile_ts = df_cts_profiles.loc[bus_id]
        building_profiles = np.outer(profile_ts, shares)
        building_profiles = pd.DataFrame(
            building_profiles, index=profile_ts.index, columns=shares.index
        )
        df_building_profiles = pd.concat(
            [df_building_profiles, building_profiles], axis=1
        )

    return df_building_profiles


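The `np.outer` step above turns one substation profile and a vector of building shares into one profile column per building. A minimal sketch with a hypothetical 4-timestep profile and two buildings:

```python
import numpy as np
import pandas as pd

# Hypothetical substation profile (4 timesteps) and building shares
profile_ts = pd.Series([10.0, 20.0, 0.0, 5.0], name="p_set")
shares = pd.Series(
    [0.25, 0.75], index=pd.Index([10, 11], name="building_id")
)

# Each column is one building's profile: profile value times its share
building_profiles = pd.DataFrame(
    np.outer(profile_ts, shares),
    index=profile_ts.index,
    columns=shares.index,
)
```

Because the shares sum to one per bus, the building profiles sum back to the substation profile at every timestep.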
def delete_synthetic_cts_buildings():
    """
    All synthetic cts buildings are deleted from the DB. This is necessary if
    the task is run multiple times as the existing synthetic buildings
    influence the results.
    """
    # import db tables
    from saio.openstreetmap import osm_buildings_synthetic

    # delete synthetic cts buildings
    with db.session_scope() as session:
        session.query(osm_buildings_synthetic).filter(
            osm_buildings_synthetic.building == "cts"
        ).delete()


def remove_double_bus_id(df_cts_buildings):
    """This is a backup ad-hoc fix in case there is still a building which
    is assigned to 2 substations. In this case one of the assignments is
    just dropped. As this currently accounts for only one building with one
    amenity the deviation is negligible."""
    # assign bus_id via census cell of amenity
    with db.session_scope() as session:
        cells_query = session.query(
            MapZensusGridDistricts.zensus_population_id,
            MapZensusGridDistricts.bus_id,
        )

    df_egon_map_zensus_buildings_buses = pd.read_sql(
        cells_query.statement,
        cells_query.session.bind,
        index_col=None,
    )
    df_cts_buildings = pd.merge(
        left=df_cts_buildings,
        right=df_egon_map_zensus_buildings_buses,
        on="zensus_population_id",
    )

    substation_per_building = df_cts_buildings.groupby("id")[
        "bus_id"
    ].nunique()
    building_id = substation_per_building.loc[
        substation_per_building > 1
    ].index
    df_duplicates = df_cts_buildings.loc[
        df_cts_buildings["id"].isin(building_id)
    ]
    for unique_id in df_duplicates["id"].unique():
        drop_index = df_duplicates[df_duplicates["id"] == unique_id].index[0]
        print(
            f"Building {df_cts_buildings.loc[drop_index, 'id']}"
            f" dropped because of double substation"
        )
        df_cts_buildings.drop(index=drop_index, inplace=True)

    df_cts_buildings.drop(columns="bus_id", inplace=True)

    return df_cts_buildings


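The detection step used above (a building counts as doubly assigned when it maps to more than one distinct `bus_id`) can be shown with a toy frame, using hypothetical building and bus ids:

```python
import pandas as pd

# Hypothetical mapping of buildings to substation buses; building 7
# appears under two different buses
df = pd.DataFrame({"id": [7, 7, 8], "bus_id": [100, 200, 100]})

# Buildings assigned to more than one substation
n_buses = df.groupby("id")["bus_id"].nunique()
double_ids = n_buses.loc[n_buses > 1].index.tolist()
```

Only building 7 is flagged; building 8 maps to a single bus and is left untouched.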
def cts_buildings():
    """
    Assigns CTS demand to buildings and calculates the respective demand
    profiles. The demand profile per substation is disaggregated by the
    annual demand share of each census cell and by the number of amenities
    per building within the cell. If no building data is available,
    synthetic buildings are generated around the amenities. If no amenities
    but cts demand is available, buildings are randomly selected. If
    neither building nor amenity is available, random synthetic buildings
    are generated. The demand share is stored in the database.

    Notes
    -----
    Cells with CTS demand, amenities and buildings do not change within
    the scenarios, only the demand itself. Therefore scenario eGon2035
    can be used universally to determine the cts buildings but not for
    the demand share.
    """
    # ========== Register np datatypes with SQLA ==========
    def adapt_numpy_float64(numpy_float64):
        return AsIs(numpy_float64)

    def adapt_numpy_int64(numpy_int64):
        return AsIs(numpy_int64)

    register_adapter(np.float64, adapt_numpy_float64)
    register_adapter(np.int64, adapt_numpy_int64)
    # =====================================================

    log.info("Start logging!")
    # Buildings with amenities
    df_buildings_with_amenities, df_lost_cells = buildings_with_amenities()
    log.info("Buildings with amenities selected!")

    # Median number of amenities per cell
    median_n_amenities = int(
        df_buildings_with_amenities.groupby("zensus_population_id")[
            "n_amenities_inside"
        ]
        .sum()
        .median()
    )
    log.info(f"Median amenity value: {median_n_amenities}")

    # Remove synthetic CTS buildings if existing
    delete_synthetic_cts_buildings()
    log.info("Old synthetic cts buildings deleted!")

    # Amenities not assigned to buildings
    df_amenities_without_buildings = amenities_without_buildings()
    log.info("Amenities without buildings selected!")

    # Append lost cells due to duplicated ids, to cover all demand cells
    if not df_lost_cells.empty:

        # Number of synthetic amenities per cell
        df_lost_cells["amenities"] = median_n_amenities
        # create row for every amenity
        df_lost_cells["amenities"] = (
            df_lost_cells["amenities"].astype(int).apply(range)
        )
        df_lost_cells = df_lost_cells.explode("amenities")
        df_lost_cells.drop(columns="amenities", inplace=True)
        df_amenities_without_buildings = pd.concat(
            [df_amenities_without_buildings, df_lost_cells],
            ignore_index=True,
        )
        log.info("Lost cells due to substation intersection appended!")

    # One building per amenity
    df_amenities_without_buildings["n_amenities_inside"] = 1
    # Create synthetic buildings for amenities without buildings
    df_synthetic_buildings_with_amenities = create_synthetic_buildings(
        df_amenities_without_buildings, points="geom_amenity"
    )
    log.info("Synthetic buildings created!")

    # TODO remove renaming after #722
    write_table_to_postgis(
        df_synthetic_buildings_with_amenities.rename(
            columns={
                "zensus_population_id": "cell_id",
                "egon_building_id": "id",
            }
        ),
        OsmBuildingsSynthetic,
        engine=engine,
        drop=False,
    )
    log.info("Synthetic buildings exported to DB!")

    # Cells without amenities but CTS demand and buildings
    df_buildings_without_amenities = buildings_without_amenities()
    log.info("Buildings without amenities in demand cells identified!")

    # Backup bugfix for duplicated buildings which occur in the SQL query:
    # drop building ids which have already been used
    mask = df_buildings_without_amenities.loc[
        df_buildings_without_amenities["id"].isin(
            df_buildings_with_amenities["id"]
        )
    ].index
    df_buildings_without_amenities = df_buildings_without_amenities.drop(
        index=mask
    ).reset_index(drop=True)
    log.info(f"{len(mask)} duplicated ids removed!")

    # select median n buildings per cell
    df_buildings_without_amenities = select_cts_buildings(
        df_buildings_without_amenities, max_n=median_n_amenities
    )
    df_buildings_without_amenities["n_amenities_inside"] = 1
    log.info(f"{median_n_amenities} buildings per cell selected!")

    # Create synthetic amenities and buildings in cells with only CTS demand
    df_cells_with_cts_demand_only = cells_with_cts_demand_only(
        df_buildings_without_amenities
    )
    log.info("Cells with only demand identified!")
    # TODO implement overlay prevention #953 here
    # Median n amenities per cell
    df_cells_with_cts_demand_only["amenities"] = median_n_amenities
    # create row for every amenity
    df_cells_with_cts_demand_only["amenities"] = (
        df_cells_with_cts_demand_only["amenities"].astype(int).apply(range)
    )
    df_cells_with_cts_demand_only = df_cells_with_cts_demand_only.explode(
        "amenities"
    )
    df_cells_with_cts_demand_only.drop(columns="amenities", inplace=True)

    # Only 1 amenity per building
    df_cells_with_cts_demand_only["n_amenities_inside"] = 1
    df_cells_with_cts_demand_only = place_buildings_with_amenities(
        df_cells_with_cts_demand_only, amenities=1
    )
    df_synthetic_buildings_without_amenities = create_synthetic_buildings(
        df_cells_with_cts_demand_only, points="geom_point"
    )
    log.info(f"{median_n_amenities} synthetic buildings per cell created!")

    # TODO remove renaming after #722
    write_table_to_postgis(
        df_synthetic_buildings_without_amenities.rename(
            columns={
                "zensus_population_id": "cell_id",
                "egon_building_id": "id",
            }
        ),
        OsmBuildingsSynthetic,
        engine=engine,
        drop=False,
    )
    log.info("Synthetic buildings exported to DB!")

    # Concat all buildings
    columns = [
        "zensus_population_id",
        "id",
        "geom_building",
        "n_amenities_inside",
        "source",
    ]

    df_buildings_with_amenities["source"] = "bwa"
    df_synthetic_buildings_with_amenities["source"] = "sbwa"
    df_buildings_without_amenities["source"] = "bwoa"
    df_synthetic_buildings_without_amenities["source"] = "sbwoa"

    df_cts_buildings = pd.concat(
        [
            df_buildings_with_amenities[columns],
            df_synthetic_buildings_with_amenities[columns],
            df_buildings_without_amenities[columns],
            df_synthetic_buildings_without_amenities[columns],
        ],
        axis=0,
        ignore_index=True,
    )
    df_cts_buildings = remove_double_bus_id(df_cts_buildings)
    log.info("Double bus_id checked!")

    # TODO remove dtype correction after #722
    df_cts_buildings["id"] = df_cts_buildings["id"].astype(int)

    df_cts_buildings = gpd.GeoDataFrame(
        df_cts_buildings, geometry="geom_building", crs=3035
    )
    df_cts_buildings = df_cts_buildings.reset_index().rename(
        columns={"index": "serial"}
    )

    # Fix #989
    # Mismatched zensus population id
    from saio.boundaries import egon_map_zensus_buildings_filtered_all

    with db.session_scope() as session:
        query = session.query(
            egon_map_zensus_buildings_filtered_all.id,
            egon_map_zensus_buildings_filtered_all.zensus_population_id,
        ).filter(
            egon_map_zensus_buildings_filtered_all.id.in_(
                df_cts_buildings.id.values
            )
        )

        df_map_zensus_population_id = pd.read_sql(
            query.statement, session.connection(), index_col=None
        )

    df_cts_buildings = pd.merge(
        left=df_cts_buildings.drop(columns="zensus_population_id"),
        right=df_map_zensus_population_id,
        on="id",
    )

    # Write table to db for debugging and postprocessing
    write_table_to_postgis(
        df_cts_buildings,
        CtsBuildings,
        engine=engine,
        drop=True,
    )
    log.info("CTS buildings exported to DB!")


def cts_electricity():
    """
    Calculate cts electricity demand share of hvmv substation profile
    for buildings.
    """
    log.info("Start logging!")
    with db.session_scope() as session:
        cells_query = session.query(CtsBuildings)

    df_cts_buildings = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col=None
    )
    log.info("CTS buildings from DB imported!")
    df_demand_share_2035 = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon2035", sector="electricity"
    )
    log.info("Profile share for eGon2035 calculated!")

    df_demand_share_100RE = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon100RE", sector="electricity"
    )
    log.info("Profile share for eGon100RE calculated!")

    df_demand_share = pd.concat(
        [df_demand_share_2035, df_demand_share_100RE],
        axis=0,
        ignore_index=True,
    )
    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    write_table_to_postgres(
        df_demand_share,
        EgonCtsElectricityDemandBuildingShare,
        engine=engine,
        drop=True,
    )
    log.info("Profile share exported to DB!")


def cts_heat():
    """
    Calculate cts heat demand share of hvmv substation profile
    for buildings.
    """
    log.info("Start logging!")
    with db.session_scope() as session:
        cells_query = session.query(CtsBuildings)

    df_cts_buildings = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col=None
    )
    log.info("CTS buildings from DB imported!")

    df_demand_share_2035 = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon2035", sector="heat"
    )
    log.info("Profile share for eGon2035 calculated!")
    df_demand_share_100RE = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon100RE", sector="heat"
    )
    log.info("Profile share for eGon100RE calculated!")
    df_demand_share = pd.concat(
        [df_demand_share_2035, df_demand_share_100RE],
        axis=0,
        ignore_index=True,
    )

    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    write_table_to_postgres(
        df_demand_share,
        EgonCtsHeatDemandBuildingShare,
        engine=engine,
        drop=True,
    )
    log.info("Profile share exported to DB!")


def get_cts_electricity_peak_load():
    """
    Get electricity peak load of all CTS buildings for both scenarios and
    store in DB.
    """
    log.info("Start logging!")

    BuildingElectricityPeakLoads.__table__.create(bind=engine, checkfirst=True)

    # Delete rows with cts demand
    with db.session_scope() as session:
        session.query(BuildingElectricityPeakLoads).filter(
            BuildingElectricityPeakLoads.sector == "cts"
        ).delete()
    log.info("Cts electricity peak load removed from DB!")

    for scenario in ["eGon2035", "eGon100RE"]:

        with db.session_scope() as session:
            cells_query = session.query(
                EgonCtsElectricityDemandBuildingShare
            ).filter(
                EgonCtsElectricityDemandBuildingShare.scenario == scenario
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        with db.session_scope() as session:
            cells_query = session.query(EgonEtragoElectricityCts).filter(
                EgonEtragoElectricityCts.scn_name == scenario
            )

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="columns",
        )

        df_peak_load = pd.merge(
            left=df_cts_profiles.max().astype(float).rename("max"),
            right=df_demand_share,
            left_index=True,
            right_on="bus_id",
        )

        # Convert unit from MWh to W
        df_peak_load["max"] = df_peak_load["max"] * 1e6
        df_peak_load["peak_load_in_w"] = (
            df_peak_load["max"] * df_peak_load["profile_share"]
        )
        log.info(f"Peak load for {scenario} determined!")

        # TODO remove after #772
        df_peak_load.rename(columns={"id": "building_id"}, inplace=True)
        df_peak_load["sector"] = "cts"

        # Write peak loads into db
        write_table_to_postgres(
            df_peak_load,
            BuildingElectricityPeakLoads,
            engine=engine,
            drop=False,
            index=False,
            if_exists="append",
        )

        log.info(f"Peak load for {scenario} exported to DB!")


def get_cts_heat_peak_load():
    """
    Get heat peak load of all CTS buildings for both scenarios and store in DB.
    """
    log.info("Start logging!")

    BuildingHeatPeakLoads.__table__.create(bind=engine, checkfirst=True)

    # Delete rows with cts demand
    with db.session_scope() as session:
        session.query(BuildingHeatPeakLoads).filter(
            BuildingHeatPeakLoads.sector == "cts"
        ).delete()
    log.info("Cts heat peak load removed from DB!")

    for scenario in ["eGon2035", "eGon100RE"]:

        with db.session_scope() as session:
            cells_query = session.query(
                EgonCtsHeatDemandBuildingShare
            ).filter(
                EgonCtsHeatDemandBuildingShare.scenario == scenario
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )
        log.info(f"Retrieved demand share for scenario: {scenario}")

        with db.session_scope() as session:
            cells_query = session.query(EgonEtragoHeatCts).filter(
                EgonEtragoHeatCts.scn_name == scenario
            )

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        log.info(f"Retrieved substation profiles for scenario: {scenario}")

        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="columns",
        )

        df_peak_load = pd.merge(
            left=df_cts_profiles.max().astype(float).rename("max"),
            right=df_demand_share,
            left_index=True,
            right_on="bus_id",
        )

        # Convert unit from MWh to W
        df_peak_load["max"] = df_peak_load["max"] * 1e6
        df_peak_load["peak_load_in_w"] = (
            df_peak_load["max"] * df_peak_load["profile_share"]
        )
        log.info(f"Peak load for {scenario} determined!")

        # TODO remove after #772
        df_peak_load.rename(columns={"id": "building_id"}, inplace=True)
        df_peak_load["sector"] = "cts"

        # Write peak loads into db
        write_table_to_postgres(
            df_peak_load,
            BuildingHeatPeakLoads,
            engine=engine,
            drop=False,
            index=False,
            if_exists="append",
        )

        log.info(f"Peak load for {scenario} exported to DB!")
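The peak-load disaggregation in the two functions above boils down to one multiplication per building: the bus profile maximum, scaled by 1e6 and by the building's profile share. A minimal sketch with hypothetical numbers:

```python
# Hypothetical numbers: a substation profile peak of 2.0 (in the unit
# stored in p_set) and a building profile share of 10 %
bus_peak = 2.0
profile_share = 0.1

# Same arithmetic as above: scale by 1e6, then by the building's share
peak_load_in_w = bus_peak * 1e6 * profile_share
```

With these values the building's peak load comes out at 200 kW, i.e. 2e5 W.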