Passed
Pull Request — dev (#991)

data.datasets.electricity_demand_timeseries.cts_buildings — Rating: F

Complexity

Total Complexity: 59

Size/Duplication

Total Lines: 1615
Duplicated Lines: 13.62 %

Importance

Changes: 0

Metric   Value
wmc      59
eloc     777
dl       220
loc      1615
rs       3.9029
c        0
b        0
f        0
17 Functions

Rating   Name   Duplication   Size   Complexity  
A place_buildings_with_amenities() 0 41 3
B create_synthetic_buildings() 0 66 6
A amenities_without_buildings() 0 41 2
A select_cts_buildings() 0 25 1
B cells_with_cts_demand_only() 0 72 2
C buildings_with_amenities() 0 183 5
B calc_census_cell_share() 0 67 5
B buildings_without_amenities() 0 99 2
A cts_heat() 37 37 2
A delete_synthetic_cts_buildings() 0 14 2
A remove_double_bus_id() 0 43 3
B get_cts_electricity_peak_load() 72 72 5
C calc_cts_building_profiles() 0 119 8
B get_cts_heat_peak_load() 74 74 5
B calc_building_demand_profile_share() 0 108 3
A cts_electricity() 37 37 2
B cts_buildings() 0 224 2

1 Method

Rating   Name   Duplication   Size   Complexity  
A CtsDemandBuildings.__init__() 0 9 1


"""
CTS electricity and heat demand time series for scenarios in 2035 and 2050
assigned to OSM-buildings.

Disaggregation of CTS heat and electricity demand time series from MV
substation to census cells via annual demand, and then to OSM buildings via
amenity tags, or randomly if no sufficient OSM data is available in the
respective census cell. If no OSM buildings or synthetic residential buildings
are available, new synthetic 5x5m buildings are generated.

The resulting data is stored in separate tables

* `openstreetmap.osm_buildings_synthetic`:
    Lists generated synthetic buildings with id, zensus_population_id and
    building type. This table is already created within
    :func:`hh_buildings.map_houseprofiles_to_buildings()`
* `openstreetmap.egon_cts_buildings`:
    Table of all selected CTS buildings with id, census cell id, geometry and
    amenity count in building. This table is created within
    :func:`cts_buildings()`
* `demand.egon_cts_electricity_demand_building_share`:
    Table including the MV substation electricity profile share of all selected
    CTS buildings for the scenarios eGon2035 and eGon100RE. This table is
    created within :func:`cts_electricity()`
* `demand.egon_cts_heat_demand_building_share`:
    Table including the MV substation heat profile share of all selected
    CTS buildings for the scenarios eGon2035 and eGon100RE. This table is
    created within :func:`cts_heat()`
* `demand.egon_building_peak_loads`:
    Mapping of demand time series and buildings including cell_id, building
    area and peak load. This table is already created within
    :func:`hh_buildings.get_building_peak_loads()`

**The following datasets from the database are mainly used for creation:**

* `openstreetmap.osm_buildings_filtered`:
    Table of OSM buildings filtered by tags to select residential and CTS
    buildings only.
* `openstreetmap.osm_amenities_shops_filtered`:
    Table of OSM amenities filtered by tags to select CTS only.
* `openstreetmap.osm_amenities_not_in_buildings_filtered`:
    Table of amenities which do not intersect with any building from
    `openstreetmap.osm_buildings_filtered`
* `openstreetmap.osm_buildings_synthetic`:
    Table of synthetic residential buildings
* `boundaries.egon_map_zensus_buildings_filtered_all`:
    Mapping table of census cells and buildings, filtered even if the
    population in a census cell = 0.
* `demand.egon_demandregio_zensus_electricity`:
    Table of annual electricity load demand for residential and CTS at census
    cell level. Residential load demand is derived from aggregated residential
    building profiles. DemandRegio CTS load demand at NUTS3 is distributed to
    census cells linearly to heat demand from Peta5.
* `demand.egon_peta_heat`:
    Table of annual heat load demand for residential and CTS at census cell
    level from Peta5.
* `demand.egon_etrago_electricity_cts`:
    Scaled CTS electricity time series for every MV substation. Derived from
    DemandRegio SLP for selected economic sectors at NUTS3. Scaled with annual
    demand from `demand.egon_demandregio_zensus_electricity`
* `demand.egon_etrago_heat_cts`:
    Scaled CTS heat time series for every MV substation. Derived from
    DemandRegio SLP Gas for selected economic sectors at NUTS3. Scaled with
    annual demand from `demand.egon_peta_heat`.

**What is the goal?**

To disaggregate CTS heat and electricity time series from MV substation level
to geo-referenced buildings, the annual demand from DemandRegio and Peta5 is
used to identify census cells with load demand. We use OpenStreetMap data and
filter tags to identify buildings and count the amenities within. The number
of amenities and the annual demand serve to assign a demand share of the MV
substation profile to the building.

**What is the challenge?**

The OSM, DemandRegio and Peta5 datasets differ from each other. The OSM
dataset is a community-based dataset which is extended continuously and does
not claim to be complete. Therefore, not all census cells which have a demand
assigned by the DemandRegio or Peta5 methodology also have buildings with the
respective tags, or sometimes even any building at all. Furthermore, the
substation load areas are determined dynamically in a previous dataset. When
merging these datasets, their different scopes (census cell shapes, building
shapes) and their inconsistencies need to be addressed. For example: buildings
or amenities not yet tagged in OSM, or building shapes exceeding census cells.

**How are these datasets combined?**

The methodology for heat and electricity is the same; they differ only in the
annual demand and the MV/HV substation profile. In a previous dataset
(openstreetmap), we filter all OSM buildings and amenities for tags we relate
to the CTS sector. Amenities are mapped to intersecting buildings and then
intersected with the annual demand which exists at census cell level. We
obtain census cells with demand and amenities, and cells without amenities.
If there is no data on amenities, n synthetic ones are assigned to existing
buildings; for n we use the median value of amenities per census cell over all
filtered buildings plus synthetic residential buildings. If no building data
is available, a synthetic building is randomly generated. This also happens
for amenities which couldn't be assigned to any OSM building. All census cells
with an annual demand are covered this way, and we obtain four different
categories of buildings:

* Buildings with amenities
* Synthetic buildings with amenities
* Buildings with synthetic amenities
* Synthetic buildings with synthetic amenities

The amenities are summed per census cell (of the amenity) and building to
derive the building amenity share per census cell. Multiplied with the annual
demand, we receive the profile demand share for each cell. Some buildings
exceed the census cell shape and have amenities in different cells although
mapped to one building only. To obtain unique buildings, the demand share is
summed once more per building id. This factor can now be used to obtain the
profile for each building.
A schematic flow chart exist in the correspondent issue #671:
118
https://github.com/openego/eGon-data/issues/671#issuecomment-1260740258
119
120
121
**What are central assumptions during the data processing?**
122
123
* We assume OSM data to be the most reliable and complete open source dataset.
124
* We assume building and amenity tags to be truthful and accurate.
125
* Mapping census to OSM data is not trivial. Discrepancies are substituted.
126
* Missing OSM buildings are generated for each amenity.
127
* Missing amenities are generated by median value of amenities/census cell.
128
129
130
**Drawbacks and limitations of the data**
131
132
* Shape of profiles for each building is similar within one substation load
133
area and only scaled differently.
134
* Load areas are generated dynamically. In case of buildings with amenities
135
exceeding borders, the number of amenities are transferred randomly to either
136
of two load areas.
137
* The completeness of the OSM data depends on community contribution and is
138
crucial to the quality of our results.
139
* Randomly selected buildings and generated amenities may inadequately reflect
140
reality, but are chosen for sake of simplicity as a measure to fill data gaps.
141
* Since this dataset is a cascade after generation of synthetic residential
142
buildings also check drawbacks and limitations in hh_buildings.py
143
* Synthetic buildings may be placed within osm buildings which exceed multiple
144
census cells. This is currently accepted but may be solved in  #953
145
* Scattered high peak loads occur and might lead to single MV grid connections
146
in ding0. In some cases this might not be viable. Postprocessing is needed and
147
may be solved in #954
148
149
150
Example Query
-------------

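This section was left empty in the source. A minimal sketch of a query on the
resulting share table (the layout follows the table description above; the
bus_id value and the commented helper call are assumptions):

```python
# Hypothetical query: electricity profile shares of all CTS buildings
# below one MV substation (bus_id 12345 is a placeholder).
sql = """
SELECT building_id, bus_id, scenario, profile_share
FROM demand.egon_cts_electricity_demand_building_share
WHERE scenario = 'eGon2035'
AND bus_id = 12345
"""
# df = db.select_dataframe(sql)  # needs a populated eGon-data database
```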
Notes
-----

This module docstring is rather a dataset documentation. Once a decision
is made in ... the content of this module docstring needs to be moved to
the docs attribute of the respective dataset class.
"""

from geoalchemy2 import Geometry
from geoalchemy2.shape import to_shape
from psycopg2.extensions import AsIs, register_adapter
from sqlalchemy import REAL, Column, Integer, String, func
from sqlalchemy.ext.declarative import declarative_base
import geopandas as gpd
import numpy as np
import pandas as pd
import saio

from egon.data import db
from egon.data import logger as log
from egon.data.datasets import Dataset
from egon.data.datasets.electricity_demand import (
    EgonDemandRegioZensusElectricity,
)
from egon.data.datasets.electricity_demand.temporal import (
    EgonEtragoElectricityCts,
)
from egon.data.datasets.electricity_demand_timeseries.hh_buildings import (
    BuildingElectricityPeakLoads,
    OsmBuildingsSynthetic,
)
from egon.data.datasets.electricity_demand_timeseries.tools import (
    random_ints_until_sum,
    random_point_in_square,
    specific_int_until_sum,
    write_table_to_postgis,
    write_table_to_postgres,
)
from egon.data.datasets.heat_demand import EgonPetaHeat
from egon.data.datasets.heat_demand_timeseries import EgonEtragoHeatCts
from egon.data.datasets.zensus_mv_grid_districts import MapZensusGridDistricts
from egon.data.datasets.zensus_vg250 import DestatisZensusPopulationPerHa

engine = db.engine()
Base = declarative_base()

# import db tables
saio.register_schema("openstreetmap", engine=engine)
saio.register_schema("boundaries", engine=engine)


class EgonCtsElectricityDemandBuildingShare(Base):
    __tablename__ = "egon_cts_electricity_demand_building_share"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    bus_id = Column(Integer, index=True)
    profile_share = Column(REAL)


class EgonCtsHeatDemandBuildingShare(Base):
    __tablename__ = "egon_cts_heat_demand_building_share"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    bus_id = Column(Integer, index=True)
    profile_share = Column(REAL)


class CtsBuildings(Base):
    __tablename__ = "egon_cts_buildings"
    __table_args__ = {"schema": "openstreetmap"}

    serial = Column(Integer, primary_key=True)
    id = Column(Integer, index=True)
    zensus_population_id = Column(Integer, index=True)
    geom_building = Column(Geometry("Polygon", 3035))
    n_amenities_inside = Column(Integer)
    source = Column(String)


class BuildingHeatPeakLoads(Base):
    __tablename__ = "egon_building_heat_peak_loads"
    __table_args__ = {"schema": "demand"}

    building_id = Column(Integer, primary_key=True)
    scenario = Column(String, primary_key=True)
    sector = Column(String, primary_key=True)
    peak_load_in_w = Column(REAL)


class CtsDemandBuildings(Dataset):
    def __init__(self, dependencies):
        super().__init__(
            name="CtsDemandBuildings",
            version="0.0.1",
            dependencies=dependencies,
            tasks=(
                cts_buildings,
                {cts_electricity, cts_heat},
                {get_cts_electricity_peak_load, get_cts_heat_peak_load},
            ),
        )

def amenities_without_buildings():
    """
    Amenities which have no buildings assigned and are in a cell with cts
    demand are determined.

    Returns
    -------
    pd.DataFrame
        Table of amenities without buildings
    """
    from saio.openstreetmap import osm_amenities_not_in_buildings_filtered

    with db.session_scope() as session:
        cells_query = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
                osm_amenities_not_in_buildings_filtered.geom_amenity,
                osm_amenities_not_in_buildings_filtered.egon_amenity_id,
            )
            .filter(
                func.st_within(
                    osm_amenities_not_in_buildings_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .filter(
                DestatisZensusPopulationPerHa.id
                == EgonDemandRegioZensusElectricity.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
        )

    df_amenities_without_buildings = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom_amenity",
    )
    return df_amenities_without_buildings


def place_buildings_with_amenities(df, amenities=None, max_amenities=None):
    """
    Building centroids are placed randomly within census cells.
    The number of buildings is derived from n_amenity_inside, the selected
    method and the number of amenities per building.

    Returns
    -------
    df: gpd.GeoDataFrame
        Table of building centroids
    """
    if isinstance(max_amenities, int):
        # amount of amenities is randomly generated within bounds
        # (max_amenities, amenities per cell)
        df["n_amenities_inside"] = df["n_amenities_inside"].apply(
            random_ints_until_sum, args=[max_amenities]
        )
    if isinstance(amenities, int):
        # Specific amount of amenities per building
        df["n_amenities_inside"] = df["n_amenities_inside"].apply(
            specific_int_until_sum, args=[amenities]
        )

    # Unnest each building
    df = df.explode(column="n_amenities_inside")

    # building count per cell
    df["building_count"] = df.groupby(["zensus_population_id"]).cumcount() + 1

    # generate random synthetic buildings
    edge_length = 5
    # create random points within census cells
    points = random_point_in_square(geom=df["geom"], tol=edge_length / 2)

    df.reset_index(drop=True, inplace=True)
    # Store center of polygon
    df["geom_point"] = points
    # Drop geometry of census cell
    df = df.drop(columns=["geom"])

    return df

def create_synthetic_buildings(df, points=None, crs="EPSG:3035"):
    """
    Synthetic buildings are generated around points.

    Parameters
    ----------
    df: pd.DataFrame
        Table of census cells
    points: gpd.GeoSeries or str
        List of points to place buildings around or column name of df
    crs: str
        CRS of result table

    Returns
    -------
    df: gpd.GeoDataFrame
        Synthetic buildings
    """

    if isinstance(points, str) and points in df.columns:
        points = df[points]
    elif isinstance(points, gpd.GeoSeries):
        pass
    else:
        raise ValueError("Points are of the wrong type")

    # Create building using a square around point
    edge_length = 5
    df["geom_building"] = points.buffer(distance=edge_length / 2, cap_style=3)

    if "geom_point" not in df.columns:
        df["geom_point"] = df["geom_building"].centroid

    df = gpd.GeoDataFrame(
        df,
        crs=crs,
        geometry="geom_building",
    )

    # TODO remove after #772 implementation of egon_building_id
    df.rename(columns={"id": "egon_building_id"}, inplace=True)

    # get max number of building ids from synthetic residential table
    with db.session_scope() as session:
        max_synth_residential_id = session.execute(
            func.max(OsmBuildingsSynthetic.id)
        ).scalar()
    max_synth_residential_id = int(max_synth_residential_id)

    # create sequential ids
    df["egon_building_id"] = range(
        max_synth_residential_id + 1,
        max_synth_residential_id + df.shape[0] + 1,
    )

    df["area"] = df["geom_building"].area
    # set building type of synthetic building
    df["building"] = "cts"
    # TODO remove after #772
    df = df.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id",
        }
    )
    return df

def buildings_with_amenities():
    """
    Amenities which are assigned to buildings are determined
    and grouped per building and zensus cell. Buildings
    covering multiple cells therefore exist multiple times
    but in different zensus cells. This is necessary to cover
    all cells with a cts demand. If buildings exist in multiple
    substations, their amenities are summed and kept in
    one substation only. If, as a result, a census cell is uncovered,
    a synthetic amenity is placed. The buildings are aggregated
    afterwards during the calculation of the profile_share.

    Returns
    -------
    df_buildings_with_amenities: gpd.GeoDataFrame
        Contains all buildings with amenities per zensus cell.
    df_lost_cells: gpd.GeoDataFrame
        Contains synthetic amenities in lost cells. Might be empty
    """

    from saio.openstreetmap import osm_amenities_in_buildings_filtered

    with db.session_scope() as session:
        cells_query = (
            session.query(
                osm_amenities_in_buildings_filtered,
                MapZensusGridDistricts.bus_id,
            )
            .filter(
                MapZensusGridDistricts.zensus_population_id
                == osm_amenities_in_buildings_filtered.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id
                == osm_amenities_in_buildings_filtered.zensus_population_id
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
        )
    df_amenities_in_buildings = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col=None
    )

    df_amenities_in_buildings["geom_building"] = df_amenities_in_buildings[
        "geom_building"
    ].apply(to_shape)
    df_amenities_in_buildings["geom_amenity"] = df_amenities_in_buildings[
        "geom_amenity"
    ].apply(to_shape)

    df_amenities_in_buildings["n_amenities_inside"] = 1

    # add identifier column for buildings in multiple substations
    df_amenities_in_buildings[
        "duplicate_identifier"
    ] = df_amenities_in_buildings.groupby(["id", "bus_id"])[
        "n_amenities_inside"
    ].transform("cumsum")
    df_amenities_in_buildings = df_amenities_in_buildings.sort_values(
        ["id", "duplicate_identifier"]
    )
    # sum amenities of buildings with multiple substations
    df_amenities_in_buildings[
        "n_amenities_inside"
    ] = df_amenities_in_buildings.groupby(["id", "duplicate_identifier"])[
        "n_amenities_inside"
    ].transform("sum")

    # create column to always go for the bus_id with max amenities
    df_amenities_in_buildings[
        "max_amenities"
    ] = df_amenities_in_buildings.groupby(["id", "bus_id"])[
        "n_amenities_inside"
    ].transform("sum")
    # sort so the bus_id with the maximum amenity count comes first
    df_amenities_in_buildings.sort_values(
        ["id", "max_amenities"], ascending=False, inplace=True
    )

    # identify lost zensus cells
    df_lost_cells = df_amenities_in_buildings.loc[
        df_amenities_in_buildings.duplicated(
            subset=["id", "duplicate_identifier"], keep="first"
        )
    ]
    df_lost_cells.drop_duplicates(
        subset=["zensus_population_id"], inplace=True
    )

    # drop buildings with multiple substations, keeping the one with
    # the higher amenity count
    df_amenities_in_buildings.drop_duplicates(
        subset=["id", "duplicate_identifier"], keep="first", inplace=True
    )

    # check if lost zensus cells are already covered
    if not df_lost_cells.empty:
        if not (
            df_amenities_in_buildings["zensus_population_id"]
            .isin(df_lost_cells["zensus_population_id"])
            .empty
        ):
            # query geom data for cell if not
            with db.session_scope() as session:
                cells_query = session.query(
                    DestatisZensusPopulationPerHa.id,
                    DestatisZensusPopulationPerHa.geom,
                ).filter(
                    DestatisZensusPopulationPerHa.id.in_(
                        df_lost_cells["zensus_population_id"]
                    )
                )

            df_lost_cells = gpd.read_postgis(
                cells_query.statement,
                cells_query.session.bind,
                geom_col="geom",
            )

            # place random amenity in cell
            df_lost_cells["n_amenities_inside"] = 1
            df_lost_cells.rename(
                columns={
                    "id": "zensus_population_id",
                },
                inplace=True,
            )
            df_lost_cells = place_buildings_with_amenities(
                df_lost_cells, amenities=1
            )
            df_lost_cells.rename(
                columns={
                    # "id": "zensus_population_id",
                    "geom_point": "geom_amenity",
                },
                inplace=True,
            )
            df_lost_cells.drop(
                columns=["building_count", "n_amenities_inside"], inplace=True
            )
        else:
            df_lost_cells = None
    else:
        df_lost_cells = None

    # drop helper columns
    df_amenities_in_buildings.drop(
        columns=["duplicate_identifier"], inplace=True
    )

    # sum amenities per building and cell
    df_amenities_in_buildings[
        "n_amenities_inside"
    ] = df_amenities_in_buildings.groupby(["zensus_population_id", "id"])[
        "n_amenities_inside"
    ].transform("sum")
    # drop duplicated buildings
    df_buildings_with_amenities = df_amenities_in_buildings.drop_duplicates(
        ["id", "zensus_population_id"]
    )
    df_buildings_with_amenities.reset_index(inplace=True, drop=True)

    df_buildings_with_amenities = df_buildings_with_amenities[
        ["id", "zensus_population_id", "geom_building", "n_amenities_inside"]
    ]
    df_buildings_with_amenities.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id"
        },
        inplace=True,
    )

    return df_buildings_with_amenities, df_lost_cells

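# Illustrative sketch (not part of the pipeline): the transform("sum") +
# drop_duplicates pattern used above, shown on toy data. Amenity rows are
# summed per (cell, building) and each building is then kept once per cell.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "zensus_population_id": [1, 1, 1, 2],
        "id": [10, 10, 11, 10],
        "n_amenities_inside": [1, 1, 1, 1],
    }
)
# sum amenities per building and cell without collapsing rows yet
toy["n_amenities_inside"] = toy.groupby(["zensus_population_id", "id"])[
    "n_amenities_inside"
].transform("sum")
# keep each building once per cell
toy = toy.drop_duplicates(["id", "zensus_population_id"]).reset_index(
    drop=True
)
```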
def buildings_without_amenities():
    """
    Buildings (filtered and synthetic) in cells with
    cts demand but no amenities are determined.

    Returns
    -------
    df_buildings_without_amenities: gpd.GeoDataFrame
        Table of buildings without amenities in zensus cells
        with cts demand.
    """
    from saio.boundaries import egon_map_zensus_buildings_filtered_all
    from saio.openstreetmap import (
        osm_amenities_shops_filtered,
        osm_buildings_filtered,
        osm_buildings_synthetic,
    )

    # buildings_filtered in cts-demand-cells without amenities
    with db.session_scope() as session:

        # Synthetic Buildings
        q_synth_buildings = session.query(
            osm_buildings_synthetic.cell_id.cast(Integer).label(
                "zensus_population_id"
            ),
            osm_buildings_synthetic.id.cast(Integer).label("id"),
            osm_buildings_synthetic.area.label("area"),
            osm_buildings_synthetic.geom_building.label("geom_building"),
            osm_buildings_synthetic.geom_point.label("geom_point"),
        )

        # Buildings filtered
        q_buildings_filtered = session.query(
            egon_map_zensus_buildings_filtered_all.zensus_population_id,
            osm_buildings_filtered.id,
            osm_buildings_filtered.area,
            osm_buildings_filtered.geom_building,
            osm_buildings_filtered.geom_point,
        ).filter(
            osm_buildings_filtered.id
            == egon_map_zensus_buildings_filtered_all.id
        )

        # Amenities + zensus_population_id
        q_amenities = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
            )
            .filter(
                func.st_within(
                    osm_amenities_shops_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .distinct(DestatisZensusPopulationPerHa.id)
        )

        # Cells with CTS demand but without amenities
        q_cts_without_amenities = (
            session.query(
                EgonDemandRegioZensusElectricity.zensus_population_id,
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id.notin_(
                    q_amenities
                )
            )
            .distinct()
        )

        # Buildings filtered + synthetic residential buildings in
        # cells with CTS demand but without amenities
        cells_query = q_synth_buildings.union(q_buildings_filtered).filter(
            egon_map_zensus_buildings_filtered_all.zensus_population_id.in_(
                q_cts_without_amenities
            )
        )

    df_buildings_without_amenities = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom_building",
    )

    df_buildings_without_amenities = df_buildings_without_amenities.rename(
        columns={
            # "zensus_population_id": "cell_id",
            "egon_building_id": "id",
        }
    )

    return df_buildings_without_amenities

def select_cts_buildings(df_buildings_wo_amenities, max_n):
    """
    N buildings (filtered and synthetic) in each cell with
    cts demand are selected. Only the first n buildings
    are taken for each cell. The buildings are sorted by surface
    area.

    Returns
    -------
    df_buildings_with_cts_demand: gpd.GeoDataFrame
        Table of buildings
    """

    df_buildings_wo_amenities.sort_values(
        "area", ascending=False, inplace=True
    )
    # select the first max_n ids of each census cell if available
    df_buildings_with_cts_demand = (
        df_buildings_wo_amenities.groupby("zensus_population_id")
        .nth(list(range(max_n)))
        .reset_index()
    )
    df_buildings_with_cts_demand.reset_index(drop=True, inplace=True)

    return df_buildings_with_cts_demand

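# Illustrative sketch (not part of the pipeline): the per-cell selection used
# in ``select_cts_buildings`` on toy data. Buildings are sorted by area and at
# most ``max_n`` rows are kept per census cell. Note that the return shape of
# ``GroupBy.nth`` changed across pandas versions, so only the selected ids are
# checked here.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "zensus_population_id": [1, 1, 1, 2],
        "id": [10, 11, 12, 13],
        "area": [50.0, 200.0, 100.0, 30.0],
    }
)
# largest buildings first, then keep at most max_n rows per cell
toy = toy.sort_values("area", ascending=False)
max_n = 2
selected = toy.groupby("zensus_population_id").nth(list(range(max_n)))
```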
def cells_with_cts_demand_only(df_buildings_without_amenities):
    """
    Cells with cts demand but no amenities or buildings
    are determined.

    Returns
    -------
    df_cells_only_cts_demand: gpd.GeoDataFrame
        Table of cells with cts demand but no amenities or buildings
    """
    from saio.openstreetmap import osm_amenities_shops_filtered

    # cells with amenities
    with db.session_scope() as session:
        sub_query = (
            session.query(
                DestatisZensusPopulationPerHa.id.label("zensus_population_id"),
            )
            .filter(
                func.st_within(
                    osm_amenities_shops_filtered.geom_amenity,
                    DestatisZensusPopulationPerHa.geom,
                )
            )
            .distinct(DestatisZensusPopulationPerHa.id)
        )

        cells_query = (
            session.query(
                EgonDemandRegioZensusElectricity.zensus_population_id,
                EgonDemandRegioZensusElectricity.scenario,
                EgonDemandRegioZensusElectricity.sector,
                EgonDemandRegioZensusElectricity.demand,
                DestatisZensusPopulationPerHa.geom,
            )
            .filter(
                EgonDemandRegioZensusElectricity.sector == "service",
                EgonDemandRegioZensusElectricity.scenario == "eGon2035",
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id.notin_(
                    sub_query
                )
            )
            .filter(
                EgonDemandRegioZensusElectricity.zensus_population_id
                == DestatisZensusPopulationPerHa.id
            )
        )

    df_cts_cell_without_amenities = gpd.read_postgis(
        cells_query.statement,
        cells_query.session.bind,
        geom_col="geom",
        index_col=None,
    )

    # TODO remove after #722
    df_buildings_without_amenities = df_buildings_without_amenities.rename(
        columns={"cell_id": "zensus_population_id"}
    )

    # Census cells with only cts demand
    df_cells_only_cts_demand = df_cts_cell_without_amenities.loc[
        ~df_cts_cell_without_amenities["zensus_population_id"].isin(
            df_buildings_without_amenities["zensus_population_id"].unique()
        )
    ]

    df_cells_only_cts_demand.reset_index(drop=True, inplace=True)

    return df_cells_only_cts_demand

def calc_census_cell_share(scenario, sector):
    """
    The profile share for each census cell is calculated by its share
    of annual demand per substation bus. The annual demand per cell is
    defined by DemandRegio/Peta5. The share is identical for both
    scenarios as the annual demand is scaled linearly.

    Parameters
    ----------
    scenario: str
        Scenario for which the share is calculated: "eGon2035" or "eGon100RE"
    sector: str
        Sector for which the share is calculated: "electricity" or "heat"

    Returns
    -------
    df_census_share: pd.DataFrame
    """
    if sector == "electricity":
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonDemandRegioZensusElectricity,
                    MapZensusGridDistricts.bus_id,
                )
                .filter(EgonDemandRegioZensusElectricity.sector == "service")
                .filter(EgonDemandRegioZensusElectricity.scenario == scenario)
                .filter(
                    EgonDemandRegioZensusElectricity.zensus_population_id
                    == MapZensusGridDistricts.zensus_population_id
                )
            )

    elif sector == "heat":
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonPetaHeat, MapZensusGridDistricts.bus_id)
                .filter(EgonPetaHeat.sector == "service")
                .filter(EgonPetaHeat.scenario == scenario)
                .filter(
                    EgonPetaHeat.zensus_population_id
                    == MapZensusGridDistricts.zensus_population_id
                )
            )

    else:
        raise ValueError(f"Sector {sector} is not valid.")

    df_demand = pd.read_sql(
        cells_query.statement,
        cells_query.session.bind,
        index_col="zensus_population_id",
    )

    # get demand share of cell per bus
    df_census_share = df_demand["demand"] / df_demand.groupby("bus_id")[
        "demand"
    ].transform("sum")
    df_census_share = df_census_share.rename("cell_share")

    df_census_share = pd.concat(
        [
            df_census_share,
            df_demand[["bus_id", "scenario"]],
        ],
        axis=1,
    )

    df_census_share.reset_index(inplace=True)
    return df_census_share


def calc_building_demand_profile_share(
    df_cts_buildings, scenario="eGon2035", sector="electricity"
):
    """
    Calculates the share of the CTS demand profile per bus for every
    selected building. The building-amenity share is multiplied with the
    census cell share to get the substation bus profile share for each
    building. The share is grouped and aggregated per building, as some
    buildings exceed the shape of census cells and have amenities
    assigned from multiple cells. Such buildings therefore get the
    amenity share of all their census cells.

    Parameters
    ----------
    df_cts_buildings: gpd.GeoDataFrame
        Table of all buildings with cts demand assigned
    scenario: str
        Scenario for which the share is calculated.
    sector: str
        Sector for which the share is calculated.

    Returns
    -------
    df_building_share: pd.DataFrame
        Table of bus profile share per building

    """

    def calc_building_amenity_share(df_cts_buildings):
        """
        Calculate the building share by the number of amenities per
        building within a census cell. Building ids can exist multiple
        times but with different zensus_population_ids.
        """
        df_building_amenity_share = df_cts_buildings[
            "n_amenities_inside"
        ] / df_cts_buildings.groupby("zensus_population_id")[
            "n_amenities_inside"
        ].transform(
            "sum"
        )
        df_building_amenity_share = pd.concat(
            [
                df_building_amenity_share.rename("building_amenity_share"),
                df_cts_buildings[["zensus_population_id", "id"]],
            ],
            axis=1,
        )
        return df_building_amenity_share

    df_building_amenity_share = calc_building_amenity_share(df_cts_buildings)

    df_census_cell_share = calc_census_cell_share(
        scenario=scenario, sector=sector
    )

    df_demand_share = pd.merge(
        left=df_building_amenity_share,
        right=df_census_cell_share,
        left_on="zensus_population_id",
        right_on="zensus_population_id",
    )
    df_demand_share["profile_share"] = df_demand_share[
        "building_amenity_share"
    ].multiply(df_demand_share["cell_share"])

    # Fix #989
    # Mismatched bus_id
    # May result in failing sanity checks
    from saio.boundaries import egon_map_zensus_buildings_filtered_all

    with db.session_scope() as session:
        query = session.query(
            egon_map_zensus_buildings_filtered_all.id,
            MapZensusGridDistricts.bus_id,
        ).filter(
            egon_map_zensus_buildings_filtered_all.id.in_(
                df_cts_buildings.id.values
            ),
            MapZensusGridDistricts.zensus_population_id
            == egon_map_zensus_buildings_filtered_all.zensus_population_id,
        )

        df_map_bus_id = pd.read_sql(
            query.statement, session.connection(), index_col=None
        )

    df_demand_share = pd.merge(
        left=df_demand_share.drop(columns="bus_id"),
        right=df_map_bus_id,
        on="id",
    )

    # only pass selected columns
    df_demand_share = df_demand_share[
        ["id", "bus_id", "scenario", "profile_share"]
    ]

    # Group and aggregate per building for multi-cell buildings
    df_demand_share = (
        df_demand_share.groupby(["scenario", "id", "bus_id"])
        .sum()
        .reset_index()
    )
    if df_demand_share.duplicated("id", keep=False).any():
        print(
            df_demand_share.loc[df_demand_share.duplicated("id", keep=False)]
        )
    return df_demand_share


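The share combination above can be illustrated with toy values (ids and shares hypothetical): each building's amenity share within its cell is multiplied by that cell's share of the annual bus demand, yielding the building's share of the substation profile:

```python
import pandas as pd

# toy building-amenity shares and a single census-cell share
df_building_amenity_share = pd.DataFrame(
    {
        "id": [11, 12],
        "zensus_population_id": [100, 100],
        "building_amenity_share": [0.5, 0.5],
    }
)
df_census_cell_share = pd.DataFrame(
    {"zensus_population_id": [100], "cell_share": [0.4], "bus_id": [1]}
)

# merge cell share onto buildings, then multiply the two shares
df_demand_share = pd.merge(
    left=df_building_amenity_share,
    right=df_census_cell_share,
    on="zensus_population_id",
)
df_demand_share["profile_share"] = df_demand_share[
    "building_amenity_share"
].multiply(df_demand_share["cell_share"])
print(df_demand_share["profile_share"].tolist())  # [0.2, 0.2]
```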
def calc_cts_building_profiles(
    egon_building_ids,
    bus_ids,
    scenario,
    sector,
):
    """
    Calculate the CTS demand profile for each building. The profile is
    calculated from the demand share of the building per substation bus.

    Parameters
    ----------
    egon_building_ids: list of int
        Ids of the buildings for which the profiles are calculated.
    bus_ids: list of int
        Ids of the substations for which the selected building profiles
        are calculated.
    scenario: str
        Scenario for which the share is calculated: "eGon2035" or "eGon100RE"
    sector: str
        Sector for which the share is calculated: "electricity" or "heat"

    Returns
    -------
    df_building_profiles: pd.DataFrame
        Table of demand profile per building
    """
    if sector == "electricity":
        # Get cts building electricity demand share of selected buildings
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonCtsElectricityDemandBuildingShare,
                )
                .filter(
                    EgonCtsElectricityDemandBuildingShare.scenario == scenario
                )
                .filter(
                    EgonCtsElectricityDemandBuildingShare.building_id.in_(
                        egon_building_ids
                    )
                )
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        # Get substation cts electricity load profiles of selected bus_ids
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonEtragoElectricityCts).filter(
                    EgonEtragoElectricityCts.scn_name == scenario
                )
            ).filter(EgonEtragoElectricityCts.bus_id.in_(bus_ids))

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="index",
        )
        # df_cts_profiles = calc_load_curves_cts(scenario)

    elif sector == "heat":
        # Get cts building heat demand share of selected buildings
        with db.session_scope() as session:
            cells_query = (
                session.query(
                    EgonCtsHeatDemandBuildingShare,
                )
                .filter(EgonCtsHeatDemandBuildingShare.scenario == scenario)
                .filter(
                    EgonCtsHeatDemandBuildingShare.building_id.in_(
                        egon_building_ids
                    )
                )
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        # Get substation cts heat load profiles of selected bus_ids
        with db.session_scope() as session:
            cells_query = (
                session.query(EgonEtragoHeatCts).filter(
                    EgonEtragoHeatCts.scn_name == scenario
                )
            ).filter(EgonEtragoHeatCts.bus_id.in_(bus_ids))

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="index",
        )

    else:
        raise ValueError(f"Sector {sector} is not valid.")

    # TODO remove after #722
    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    # get demand profile for all buildings for selected demand share
    df_building_profiles = pd.DataFrame()
    for bus_id, df in df_demand_share.groupby("bus_id"):
        shares = df.set_index("building_id", drop=True)["profile_share"]
        profile_ts = df_cts_profiles.loc[bus_id]
        building_profiles = np.outer(profile_ts, shares)
        building_profiles = pd.DataFrame(
            building_profiles, index=profile_ts.index, columns=shares.index
        )
        df_building_profiles = pd.concat(
            [df_building_profiles, building_profiles], axis=1
        )

    return df_building_profiles


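The profile disaggregation in the loop above boils down to an outer product: every time step of the bus profile is spread over the buildings by their profile share. A toy sketch (hypothetical bus profile and shares):

```python
import numpy as np
import pandas as pd

# toy substation load profile (3 time steps) and building shares at that bus
profile_ts = pd.Series([100.0, 200.0, 50.0], name="p_set")
shares = pd.Series([0.25, 0.75], index=[11, 12], name="profile_share")

# outer product: rows = time steps, columns = buildings
building_profiles = pd.DataFrame(
    np.outer(profile_ts, shares),
    index=profile_ts.index,
    columns=shares.index,
)
print(building_profiles[11].tolist())  # [25.0, 50.0, 12.5]
```

Because the shares sum to one per bus, the building profiles add back up to the original substation profile at every time step.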
def delete_synthetic_cts_buildings():
    """
    All synthetic CTS buildings are deleted from the DB. This is
    necessary if the task is run multiple times, as the existing
    synthetic buildings influence the results.
    """
    # import db tables
    from saio.openstreetmap import osm_buildings_synthetic

    # delete all cts buildings from the synthetic buildings table
    with db.session_scope() as session:
        session.query(osm_buildings_synthetic).filter(
            osm_buildings_synthetic.building == "cts"
        ).delete()


def remove_double_bus_id(df_cts_buildings):
    """
    Backup ad-hoc fix in case a building is still assigned to two
    substations. In this case, one of the duplicate rows is dropped.
    As this currently affects only one building with one amenity,
    the deviation is negligible.
    """
    # assign bus_id via census cell of amenity
    with db.session_scope() as session:
        cells_query = session.query(
            MapZensusGridDistricts.zensus_population_id,
            MapZensusGridDistricts.bus_id,
        )

    df_egon_map_zensus_buildings_buses = pd.read_sql(
        cells_query.statement,
        cells_query.session.bind,
        index_col=None,
    )
    df_cts_buildings = pd.merge(
        left=df_cts_buildings,
        right=df_egon_map_zensus_buildings_buses,
        on="zensus_population_id",
    )

    substation_per_building = df_cts_buildings.groupby("id")[
        "bus_id"
    ].nunique()
    building_id = substation_per_building.loc[
        substation_per_building > 1
    ].index
    df_duplicates = df_cts_buildings.loc[
        df_cts_buildings["id"].isin(building_id)
    ]
    for unique_id in df_duplicates["id"].unique():
        drop_index = df_duplicates[df_duplicates["id"] == unique_id].index[0]
        print(
            f"Building {df_cts_buildings.loc[drop_index, 'id']}"
            f" dropped because of double substation"
        )
        df_cts_buildings.drop(index=drop_index, inplace=True)

    df_cts_buildings.drop(columns="bus_id", inplace=True)

    return df_cts_buildings


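The duplicate detection above can be sketched with toy data (all ids hypothetical): counting distinct `bus_id` values per building id reveals buildings mapped to more than one substation:

```python
import pandas as pd

# toy building-to-bus mapping: building 12 touches two substations
df = pd.DataFrame({"id": [11, 12, 12], "bus_id": [1, 1, 2]})

# buildings with more than one distinct substation
substation_per_building = df.groupby("id")["bus_id"].nunique()
doubled = substation_per_building.loc[substation_per_building > 1].index
print(doubled.tolist())  # [12]
```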
def cts_buildings():
    """
    Assigns CTS demand to buildings and calculates the respective demand
    profiles. The demand profile per substation is disaggregated by the
    annual demand share of each census cell and by the number of
    amenities per building within the cell. If no building data is
    available, synthetic buildings are generated around the amenities.
    If no amenities but CTS demand is available, buildings are selected
    by their surface area. If neither building nor amenity is available,
    synthetic buildings are generated at random locations. The demand
    share is stored in the database.

    Notes
    -----
    Cells with CTS demand, amenities and buildings do not change within
    the scenarios, only the demand itself. Therefore, scenario eGon2035
    can be used universally to determine the CTS buildings, but not for
    the demand share.
    """
    # ========== Register np datatypes with SQLA ==========
    def adapt_numpy_float64(numpy_float64):
        return AsIs(numpy_float64)

    def adapt_numpy_int64(numpy_int64):
        return AsIs(numpy_int64)

    register_adapter(np.float64, adapt_numpy_float64)
    register_adapter(np.int64, adapt_numpy_int64)
    # =====================================================

    log.info("Start logging!")
    # Buildings with amenities
    df_buildings_with_amenities, df_lost_cells = buildings_with_amenities()
    log.info("Buildings with amenities selected!")

    # Median number of amenities per cell
    median_n_amenities = int(
        df_buildings_with_amenities.groupby("zensus_population_id")[
            "n_amenities_inside"
        ]
        .sum()
        .median()
    )
    log.info(f"Median amenity value: {median_n_amenities}")

    # Remove synthetic CTS buildings if existing
    delete_synthetic_cts_buildings()
    log.info("Old synthetic cts buildings deleted!")

    # Amenities not assigned to buildings
    df_amenities_without_buildings = amenities_without_buildings()
    log.info("Amenities without buildings selected!")

    # Append lost cells due to duplicated ids, to cover all demand cells
    if not df_lost_cells.empty:

        # Number of synthetic amenities per cell
        df_lost_cells["amenities"] = median_n_amenities
        # create one row for every amenity
        df_lost_cells["amenities"] = (
            df_lost_cells["amenities"].astype(int).apply(range)
        )
        df_lost_cells = df_lost_cells.explode("amenities")
        df_lost_cells.drop(columns="amenities", inplace=True)
        df_amenities_without_buildings = df_amenities_without_buildings.append(
            df_lost_cells, ignore_index=True
        )
        log.info("Lost cells due to substation intersection appended!")

    # One building per amenity
    df_amenities_without_buildings["n_amenities_inside"] = 1
    # Create synthetic buildings for amenities without buildings
    df_synthetic_buildings_with_amenities = create_synthetic_buildings(
        df_amenities_without_buildings, points="geom_amenity"
    )
    log.info("Synthetic buildings created!")

    # TODO remove renaming after #722
    write_table_to_postgis(
        df_synthetic_buildings_with_amenities.rename(
            columns={
                "zensus_population_id": "cell_id",
                "egon_building_id": "id",
            }
        ),
        OsmBuildingsSynthetic,
        engine=engine,
        drop=False,
    )
    log.info("Synthetic buildings exported to DB!")

    # Cells without amenities but CTS demand and buildings
    df_buildings_without_amenities = buildings_without_amenities()
    log.info("Buildings without amenities in demand cells identified!")

    # Backup bugfix for duplicated buildings which occur in the SQL query:
    # drop building ids which have already been used
    mask = df_buildings_without_amenities.loc[
        df_buildings_without_amenities["id"].isin(
            df_buildings_with_amenities["id"]
        )
    ].index
    df_buildings_without_amenities = df_buildings_without_amenities.drop(
        index=mask
    ).reset_index(drop=True)
    log.info(f"{len(mask)} duplicated ids removed!")

    # select median n buildings per cell
    df_buildings_without_amenities = select_cts_buildings(
        df_buildings_without_amenities, max_n=median_n_amenities
    )
    df_buildings_without_amenities["n_amenities_inside"] = 1
    log.info(f"{median_n_amenities} buildings per cell selected!")

    # Create synthetic amenities and buildings in cells with only CTS demand
    df_cells_with_cts_demand_only = cells_with_cts_demand_only(
        df_buildings_without_amenities
    )
    log.info("Cells with only demand identified!")

    # TODO implement overlay prevention #953 here
    # Median n amenities per cell
    df_cells_with_cts_demand_only["amenities"] = median_n_amenities
    # create one row for every amenity
    df_cells_with_cts_demand_only["amenities"] = (
        df_cells_with_cts_demand_only["amenities"].astype(int).apply(range)
    )
    df_cells_with_cts_demand_only = df_cells_with_cts_demand_only.explode(
        "amenities"
    )
    df_cells_with_cts_demand_only.drop(columns="amenities", inplace=True)

    # Only 1 amenity per building
    df_cells_with_cts_demand_only["n_amenities_inside"] = 1
    df_cells_with_cts_demand_only = place_buildings_with_amenities(
        df_cells_with_cts_demand_only, amenities=1
    )
    df_synthetic_buildings_without_amenities = create_synthetic_buildings(
        df_cells_with_cts_demand_only, points="geom_point"
    )
    log.info(f"{median_n_amenities} synthetic buildings per cell created")

    # TODO remove renaming after #722
    write_table_to_postgis(
        df_synthetic_buildings_without_amenities.rename(
            columns={
                "zensus_population_id": "cell_id",
                "egon_building_id": "id",
            }
        ),
        OsmBuildingsSynthetic,
        engine=engine,
        drop=False,
    )
    log.info("Synthetic buildings exported to DB")

    # Concat all buildings
    columns = [
        "zensus_population_id",
        "id",
        "geom_building",
        "n_amenities_inside",
        "source",
    ]

    df_buildings_with_amenities["source"] = "bwa"
    df_synthetic_buildings_with_amenities["source"] = "sbwa"
    df_buildings_without_amenities["source"] = "bwoa"
    df_synthetic_buildings_without_amenities["source"] = "sbwoa"

    df_cts_buildings = pd.concat(
        [
            df_buildings_with_amenities[columns],
            df_synthetic_buildings_with_amenities[columns],
            df_buildings_without_amenities[columns],
            df_synthetic_buildings_without_amenities[columns],
        ],
        axis=0,
        ignore_index=True,
    )
    df_cts_buildings = remove_double_bus_id(df_cts_buildings)
    log.info("Double bus_id checked")

    # TODO remove dtype correction after #722
    df_cts_buildings["id"] = df_cts_buildings["id"].astype(int)

    df_cts_buildings = gpd.GeoDataFrame(
        df_cts_buildings, geometry="geom_building", crs=3035
    )
    df_cts_buildings = df_cts_buildings.reset_index().rename(
        columns={"index": "serial"}
    )

    # # Fix #989
    # # Mismatched zensus population id
    # from saio.boundaries import egon_map_zensus_buildings_filtered_all
    #
    # with db.session_scope() as session:
    #     query = session.query(
    #         egon_map_zensus_buildings_filtered_all.id,
    #         egon_map_zensus_buildings_filtered_all.zensus_population_id,
    #     ).filter(
    #         egon_map_zensus_buildings_filtered_all.id.in_(
    #             df_cts_buildings.id.values
    #         )
    #     )
    #
    #     df_map_zensus_population_id = pd.read_sql(
    #         query.statement, session.connection(), index_col=None
    #     )
    #
    # df_cts_buildings = pd.merge(
    #     left=df_cts_buildings.drop(columns="zensus_population_id"),
    #     right=df_map_zensus_population_id,
    #     on="id",
    # )

    # Write table to db for debugging and postprocessing
    write_table_to_postgis(
        df_cts_buildings,
        CtsBuildings,
        engine=engine,
        drop=True,
    )
    log.info("CTS buildings exported to DB!")


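The row-per-amenity expansion used above (assign an integer count, turn it into a `range`, then `explode`) can be sketched with toy cells (cell ids and the median value are hypothetical):

```python
import pandas as pd

# toy cells that carry only demand: create one row per synthetic amenity
df_cells = pd.DataFrame({"zensus_population_id": [1, 2]})
median_n_amenities = 3

df_cells["amenities"] = median_n_amenities
# replace the count with a range object, then explode into one row each
df_cells["amenities"] = df_cells["amenities"].astype(int).apply(range)
df_cells = df_cells.explode("amenities")
df_cells.drop(columns="amenities", inplace=True)

print(len(df_cells))  # 2 cells x 3 amenities = 6 rows
```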
def cts_electricity():
    """
    Calculate the CTS electricity demand share of the HV/MV substation
    profile for buildings.
    """
    log.info("Start logging!")
    with db.session_scope() as session:
        cells_query = session.query(CtsBuildings)

    df_cts_buildings = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col=None
    )
    log.info("CTS buildings from DB imported!")
    df_demand_share_2035 = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon2035", sector="electricity"
    )
    log.info("Profile share for eGon2035 calculated!")

    df_demand_share_100RE = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon100RE", sector="electricity"
    )
    log.info("Profile share for eGon100RE calculated!")

    df_demand_share = pd.concat(
        [df_demand_share_2035, df_demand_share_100RE],
        axis=0,
        ignore_index=True,
    )
    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    write_table_to_postgres(
        df_demand_share,
        EgonCtsElectricityDemandBuildingShare,
        engine=engine,
        drop=True,
    )
    log.info("Profile share exported to DB!")


def cts_heat():
    """
    Calculate the CTS heat demand share of the HV/MV substation
    profile for buildings.
    """
    log.info("Start logging!")
    with db.session_scope() as session:
        cells_query = session.query(CtsBuildings)

    df_cts_buildings = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col=None
    )
    log.info("CTS buildings from DB imported!")

    df_demand_share_2035 = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon2035", sector="heat"
    )
    log.info("Profile share for eGon2035 calculated!")
    df_demand_share_100RE = calc_building_demand_profile_share(
        df_cts_buildings, scenario="eGon100RE", sector="heat"
    )
    log.info("Profile share for eGon100RE calculated!")
    df_demand_share = pd.concat(
        [df_demand_share_2035, df_demand_share_100RE],
        axis=0,
        ignore_index=True,
    )

    df_demand_share.rename(columns={"id": "building_id"}, inplace=True)

    write_table_to_postgres(
        df_demand_share,
        EgonCtsHeatDemandBuildingShare,
        engine=engine,
        drop=True,
    )
    log.info("Profile share exported to DB!")


def get_cts_electricity_peak_load():
    """
    Get the electricity peak load of all CTS buildings for both
    scenarios and store it in the DB.
    """
    log.info("Start logging!")

    BuildingElectricityPeakLoads.__table__.create(bind=engine, checkfirst=True)

    # Delete rows with cts demand
    with db.session_scope() as session:
        session.query(BuildingElectricityPeakLoads).filter(
            BuildingElectricityPeakLoads.sector == "cts"
        ).delete()
    log.info("Cts electricity peak load removed from DB!")

    for scenario in ["eGon2035", "eGon100RE"]:

        with db.session_scope() as session:
            cells_query = session.query(
                EgonCtsElectricityDemandBuildingShare
            ).filter(
                EgonCtsElectricityDemandBuildingShare.scenario == scenario
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )

        with db.session_scope() as session:
            cells_query = session.query(EgonEtragoElectricityCts).filter(
                EgonEtragoElectricityCts.scn_name == scenario
            )

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="columns",
        )

        df_peak_load = pd.merge(
            left=df_cts_profiles.max().astype(float).rename("max"),
            right=df_demand_share,
            left_index=True,
            right_on="bus_id",
        )

        # Convert unit from MW to W
        df_peak_load["max"] = df_peak_load["max"] * 1e6
        df_peak_load["peak_load_in_w"] = (
            df_peak_load["max"] * df_peak_load["profile_share"]
        )
        log.info(f"Peak load for {scenario} determined!")

        # TODO remove after #772
        df_peak_load.rename(columns={"id": "building_id"}, inplace=True)
        df_peak_load["sector"] = "cts"

        # Write peak loads into db
        write_table_to_postgres(
            df_peak_load,
            BuildingElectricityPeakLoads,
            engine=engine,
            drop=False,
            index=False,
            if_exists="append",
        )

        log.info(f"Peak load for {scenario} exported to DB!")


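The peak-load derivation above can be sketched with toy numbers (bus id, shares and the profile are hypothetical): the maximum of each bus profile is merged onto the building shares via the bus id, scaled from MW to W, and weighted by the profile share:

```python
import pandas as pd

# toy bus profile in MW (columns = bus_id) and building profile shares
df_cts_profiles = pd.DataFrame({1: [2.0, 4.0, 3.0]})
df_demand_share = pd.DataFrame(
    {"building_id": [11, 12], "bus_id": [1, 1], "profile_share": [0.25, 0.75]}
)

# merge the per-bus profile maximum onto the building shares
df_peak_load = pd.merge(
    left=df_cts_profiles.max().astype(float).rename("max"),
    right=df_demand_share,
    left_index=True,
    right_on="bus_id",
)
df_peak_load["max"] = df_peak_load["max"] * 1e6  # MW -> W
df_peak_load["peak_load_in_w"] = (
    df_peak_load["max"] * df_peak_load["profile_share"]
)
print(df_peak_load["peak_load_in_w"].tolist())  # [1000000.0, 3000000.0]
```

Note this assigns each building its share of the coincident bus peak, not an individually computed peak per building.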
def get_cts_heat_peak_load():
    """
    Get the heat peak load of all CTS buildings for both scenarios and
    store it in the DB.
    """
    log.info("Start logging!")

    BuildingHeatPeakLoads.__table__.create(bind=engine, checkfirst=True)

    # Delete rows with cts demand
    with db.session_scope() as session:
        session.query(BuildingHeatPeakLoads).filter(
            BuildingHeatPeakLoads.sector == "cts"
        ).delete()
    log.info("Cts heat peak load removed from DB!")

    for scenario in ["eGon2035", "eGon100RE"]:

        with db.session_scope() as session:
            cells_query = session.query(
                EgonCtsHeatDemandBuildingShare
            ).filter(
                EgonCtsHeatDemandBuildingShare.scenario == scenario
            )

        df_demand_share = pd.read_sql(
            cells_query.statement, cells_query.session.bind, index_col=None
        )
        log.info(f"Retrieved demand share for scenario: {scenario}")

        with db.session_scope() as session:
            cells_query = session.query(EgonEtragoHeatCts).filter(
                EgonEtragoHeatCts.scn_name == scenario
            )

        df_cts_profiles = pd.read_sql(
            cells_query.statement,
            cells_query.session.bind,
        )
        log.info(f"Retrieved substation profiles for scenario: {scenario}")

        df_cts_profiles = pd.DataFrame.from_dict(
            df_cts_profiles.set_index("bus_id")["p_set"].to_dict(),
            orient="columns",
        )

        df_peak_load = pd.merge(
            left=df_cts_profiles.max().astype(float).rename("max"),
            right=df_demand_share,
            left_index=True,
            right_on="bus_id",
        )

        # Convert unit from MW to W
        df_peak_load["max"] = df_peak_load["max"] * 1e6
        df_peak_load["peak_load_in_w"] = (
            df_peak_load["max"] * df_peak_load["profile_share"]
        )
        log.info(f"Peak load for {scenario} determined!")

        # TODO remove after #772
        df_peak_load.rename(columns={"id": "building_id"}, inplace=True)
        df_peak_load["sector"] = "cts"

        # Write peak loads into db
        write_table_to_postgres(
            df_peak_load,
            BuildingHeatPeakLoads,
            engine=engine,
            drop=False,
            index=False,
            if_exists="append",
        )

        log.info(f"Peak load for {scenario} exported to DB!")
1615
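
The review flags `get_cts_electricity_peak_load()` and `get_cts_heat_peak_load()` as duplicated: both repeat the same merge-and-scale step that turns per-substation profile maxima and per-building demand shares into per-building peak loads. A minimal sketch of how that shared step could be factored into one helper; the helper name `compute_peak_load` and the toy data are illustrative, not part of the project:

```python
import pandas as pd


def compute_peak_load(df_cts_profiles, df_demand_share):
    """Per-building peak load in W from hourly substation profiles (MWh per
    hour, i.e. average MW) and per-building demand shares."""
    df_peak_load = pd.merge(
        left=df_cts_profiles.max().astype(float).rename("max"),
        right=df_demand_share,
        left_index=True,
        right_on="bus_id",
    )
    # Hourly MWh values equal average MW, so scaling by 1e6 yields W.
    df_peak_load["max"] = df_peak_load["max"] * 1e6
    df_peak_load["peak_load_in_w"] = (
        df_peak_load["max"] * df_peak_load["profile_share"]
    )
    return df_peak_load


# Toy example: two substations (bus_id 10 and 20), three buildings
profiles = pd.DataFrame({10: [0.5, 1.0, 0.8], 20: [0.2, 0.4, 0.3]})
shares = pd.DataFrame(
    {
        "bus_id": [10, 10, 20],
        "building_id": [1, 2, 3],
        "profile_share": [0.6, 0.4, 1.0],
    }
)
result = compute_peak_load(profiles, shares)
```

Each scenario loop in both functions could then call this helper and only differ in which ORM table (`BuildingElectricityPeakLoads` or `BuildingHeatPeakLoads`) the result is written to.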