data.datasets.electricity_demand.get_annual_household_el_demand_cells() - Code Metrics - Inspection of "Fix/#1180 missing cts buildings" - openego/eGon-data - Scrutinizer

Passed

Pull Request — dev (#1181)

created 2025-02-03 09:46 UTC

get_annual_household_el_demand_cells() C

↳ Parent: data.datasets.electricity_demand

Size

Total Lines	111
Code Lines	68

Duplication

Lines	0
Ratio	0 %

Metric	Value
eloc	68
dl	0
loc	111
rs	5.2472
c	0
b	0
f	0
cc	10
nop	0

How to fix Long Method Complexity

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:

- If many parameters/temporary variables are present:
  - Replace Temp with Query
  - Introduce Parameter Object; often combined with Preserve Whole Object
  - If the above two are insufficient: Replace Method with Method Object
- If you have long conditionals: Decompose Conditional
- Otherwise: Extract Method
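As an illustration of the "Extract Method" advice for this particular function: the four `if "<scenario>" in scenarios:` blocks in the listing differ only in the scenario name and the factor column. The sketch below is one possible way to fold them into a lookup table plus a helper. It uses plain lists and dicts instead of DataFrames so that it runs on its own; the names `SCENARIO_FACTOR_COLUMNS` and `scale_annual_demand` are hypothetical and do not exist in eGon-data.

```python
# Hypothetical lookup table: scenario name -> factor column used in the loop.
# The pairs are taken from the four "if" blocks in the listed function.
SCENARIO_FACTOR_COLUMNS = {
    "eGon2035": "factor_2035",
    "eGon100RE": "factor_2050",
    "status2019": "factor_2019",
    "status2023": "factor_2023",
}


def scale_annual_demand(profile_sums, factors_by_column, scenarios):
    """Scale annual profile sums with the factor of each requested scenario.

    profile_sums: annual sum per building profile (stand-in for
        ``df_profiles.loc[:, df["profile_id"]].sum(axis=0)``).
    factors_by_column: {column name: list of factors, one per profile}.
    scenarios: scenario names to evaluate.
    """
    result = {}
    for scenario in scenarios:
        column = SCENARIO_FACTOR_COLUMNS[scenario]
        factors = factors_by_column[column]
        # Element-wise product, as in the original per-scenario blocks.
        result[scenario] = [s * f for s, f in zip(profile_sums, factors)]
    return result


demand = scale_annual_demand(
    profile_sums=[10.0, 20.0],
    factors_by_column={"factor_2035": [2.0, 0.5]},
    scenarios=["eGon2035"],
)
print(demand)  # {'eGon2035': [20.0, 10.0]}
```

With such a helper, adding a scenario would mean adding one dictionary entry rather than another conditional branch, which is what lowers the branch count of the function.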

Complexity

Complex units like data.datasets.electricity_demand.get_annual_household_el_demand_cells() often do a lot of different things. To break such a unit down, we need to identify a cohesive component within it. A common approach to finding such a component is to look for fields/methods that share the same prefixes or suffixes.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
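The unit inspected here is a module-level function, so the closest function-level counterpart of this advice is to pull one cohesive decision out into its own helper. A candidate in the listing is the nested conditional expression that picks `iterate_over` and relies on the local `ve()` function to raise from inside an expression. The following is a minimal sketch of that extraction, assuming the two boundary names shown in the listing are the only valid ones; `select_grouping_column` is a hypothetical name, not part of eGon-data.

```python
# Hypothetical helper: maps the configured dataset boundary to the column
# that the demand calculation groups by.
GROUPING_COLUMNS = {
    "Everything": "nuts3",
    "Schleswig-Holstein": "cell_id",
}


def select_grouping_column(dataset_boundary):
    """Return the grouping column for a dataset boundary.

    Raises ValueError with the same message as the listed code when the
    boundary is not known.
    """
    try:
        return GROUPING_COLUMNS[dataset_boundary]
    except KeyError:
        raise ValueError(
            f"'{dataset_boundary}' is not a valid dataset boundary."
        ) from None


print(select_grouping_column("Everything"))  # nuts3
```

The lookup replaces two branches with a dictionary access and removes the need for a function whose only purpose is to raise.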

"""The central module containing all code dealing with processing
 data from demandRegio

"""
from sqlalchemy import Column, Float, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
import pandas as pd

from egon.data import db
from egon.data.datasets import Dataset
from egon.data.datasets.electricity_demand.temporal import insert_cts_load
from egon.data.datasets.electricity_demand_timeseries.hh_buildings import (
    HouseholdElectricityProfilesOfBuildings,
    get_iee_hh_demand_profiles_raw,
)
from egon.data.datasets.electricity_demand_timeseries.hh_profiles import (
    HouseholdElectricityProfilesInCensusCells,
)
from egon.data.datasets.zensus_vg250 import DestatisZensusPopulationPerHa
import egon.data.config

# will be later imported from another file ###
Base = declarative_base()
engine = db.engine()


class HouseholdElectricityDemand(Dataset):
    def __init__(self, dependencies):
        super().__init__(
            name="HouseholdElectricityDemand",
            version="0.0.5",
            dependencies=dependencies,
            tasks=(create_tables, get_annual_household_el_demand_cells),
        )


class CtsElectricityDemand(Dataset):
    def __init__(self, dependencies):
        super().__init__(
            name="CtsElectricityDemand",
            version="0.0.2",
            dependencies=dependencies,
            tasks=(distribute_cts_demands, insert_cts_load),
        )


class EgonDemandRegioZensusElectricity(Base):
    __tablename__ = "egon_demandregio_zensus_electricity"
    __table_args__ = {"schema": "demand", "extend_existing": True}
    zensus_population_id = Column(
        Integer, ForeignKey(DestatisZensusPopulationPerHa.id), primary_key=True
    )
    scenario = Column(String(50), primary_key=True)
    sector = Column(String, primary_key=True)
    demand = Column(Float)


def create_tables():
    """Create tables for demandregio data
    Returns
    -------
    None.
    """
    db.execute_sql("CREATE SCHEMA IF NOT EXISTS demand;")
    db.execute_sql("CREATE SCHEMA IF NOT EXISTS society;")
    engine = db.engine()
    EgonDemandRegioZensusElectricity.__table__.drop(
        bind=engine, checkfirst=True
    )
    EgonDemandRegioZensusElectricity.__table__.create(
        bind=engine, checkfirst=True
    )


def get_annual_household_el_demand_cells():
    """
    Determine the annual household electricity demand per census cell.

    The profiles assigned to the buildings of every cell are summed up
    over the year and scaled with the respective nuts3 factor of each
    configured scenario (2019, 2023, 2035 and 2050).

    Notes
    -----
    In test mode 'SH' the iteration takes place by 'cell_id' to avoid
    intensive RAM usage. For the whole of Germany 'nuts3' is used and
    more than 32 GB of RAM is necessary.
    """

    with db.session_scope() as session:
        cells_query = (
            session.query(
                HouseholdElectricityProfilesOfBuildings,
                HouseholdElectricityProfilesInCensusCells.nuts3,
                HouseholdElectricityProfilesInCensusCells.factor_2019,
                HouseholdElectricityProfilesInCensusCells.factor_2023,
                HouseholdElectricityProfilesInCensusCells.factor_2035,
                HouseholdElectricityProfilesInCensusCells.factor_2050,
            )
            .filter(
                HouseholdElectricityProfilesOfBuildings.cell_id
                == HouseholdElectricityProfilesInCensusCells.cell_id
            )
            .order_by(HouseholdElectricityProfilesOfBuildings.id)
        )

    df_buildings_and_profiles = pd.read_sql(
        cells_query.statement, cells_query.session.bind, index_col="id"
    )

    # Read demand profiles from egon-data-bundle
    df_profiles = get_iee_hh_demand_profiles_raw()

    def ve(s):
        raise (ValueError(s))

    dataset = egon.data.config.settings()["egon-data"]["--dataset-boundary"]
    scenarios = egon.data.config.settings()["egon-data"]["--scenarios"]

    iterate_over = (
        "nuts3"
        if dataset == "Everything"
        else "cell_id"
        if dataset == "Schleswig-Holstein"
        else ve(f"'{dataset}' is not a valid dataset boundary.")
    )

    df_annual_demand = pd.DataFrame(
        columns=scenarios + ["zensus_population_id"]
    )

    for _, df in df_buildings_and_profiles.groupby(by=iterate_over):
        df_annual_demand_iter = pd.DataFrame(
            columns=scenarios + ["zensus_population_id"]
        )

        if "eGon2035" in scenarios:
            df_annual_demand_iter["eGon2035"] = (
                df_profiles.loc[:, df["profile_id"]].sum(axis=0)
                * df["factor_2035"].values
            )
        if "eGon100RE" in scenarios:
            df_annual_demand_iter["eGon100RE"] = (
                df_profiles.loc[:, df["profile_id"]].sum(axis=0)
                * df["factor_2050"].values
            )
        if "status2019" in scenarios:
            df_annual_demand_iter["status2019"] = (
                df_profiles.loc[:, df["profile_id"]].sum(axis=0)
                * df["factor_2019"].values
            )

        if "status2023" in scenarios:
            df_annual_demand_iter["status2023"] = (
                df_profiles.loc[:, df["profile_id"]].sum(axis=0)
                * df["factor_2023"].values
            )
        df_annual_demand_iter["zensus_population_id"] = df["cell_id"].values
        df_annual_demand = pd.concat([df_annual_demand, df_annual_demand_iter])

    df_annual_demand = (
        df_annual_demand.groupby("zensus_population_id").sum().reset_index()
    )
    df_annual_demand["sector"] = "residential"
    df_annual_demand = df_annual_demand.melt(
        id_vars=["zensus_population_id", "sector"],
        var_name="scenario",
        value_name="demand",
    )
    # convert from Wh to MWh
    df_annual_demand["demand"] = df_annual_demand["demand"] / 1e6

    # delete all cells for residentials
    with db.session_scope() as session:
        session.query(EgonDemandRegioZensusElectricity).filter(
            EgonDemandRegioZensusElectricity.sector == "residential"
        ).delete()

    # Insert data to target table
    df_annual_demand.to_sql(
        name=EgonDemandRegioZensusElectricity.__table__.name,
        schema=EgonDemandRegioZensusElectricity.__table__.schema,
        con=db.engine(),
        index=False,
        if_exists="append",
    )


def distribute_cts_demands():
    """Distribute electrical demands for cts to zensus cells.

    The demands on nuts3 level from demandregio are distributed to the
    zensus cells in proportion to the heat demand of cts in each cell.

    Returns
    -------
    None.

    """

    sources = egon.data.config.datasets()["electrical_demands_cts"]["sources"]

    target = egon.data.config.datasets()["electrical_demands_cts"]["targets"][
        "cts_demands_zensus"
    ]

    db.execute_sql(
        f"""DELETE FROM {target['schema']}.{target['table']}
                   WHERE sector = 'service'"""
    )

    # Select match between zensus cells and nuts3 regions of vg250
    map_nuts3 = db.select_dataframe(
        f"""SELECT zensus_population_id, vg250_nuts3 as nuts3 FROM
        {sources['map_zensus_vg250']['schema']}.
        {sources['map_zensus_vg250']['table']}""",
        index_col="zensus_population_id",
    )

    # Insert data per scenario
    for scn in egon.data.config.settings()["egon-data"]["--scenarios"]:
        # Select heat_demand per zensus cell
        peta = db.select_dataframe(
            f"""SELECT zensus_population_id, demand as heat_demand,
            sector, scenario FROM
            {sources['heat_demand_cts']['schema']}.
            {sources['heat_demand_cts']['table']}
            WHERE scenario = '{scn}'
            AND sector = 'service'""",
            index_col="zensus_population_id",
        )

        # Add nuts3 key to zensus cells
        peta["nuts3"] = map_nuts3.nuts3

        # Calculate share of nuts3 heat demand per zensus cell
        for nuts3, df in peta.groupby("nuts3"):
            peta.loc[df.index, "share"] = (
                df["heat_demand"] / df["heat_demand"].sum()
            )

        # Select forecasted electrical demands from demandregio table
        demand_nuts3 = db.select_dataframe(
            f"""SELECT nuts3, SUM(demand) as demand FROM
            {sources['demandregio']['schema']}.
            {sources['demandregio']['table']}
            WHERE scenario = '{scn}'
            AND wz IN (
                SELECT wz FROM
                {sources['demandregio_wz']['schema']}.
                {sources['demandregio_wz']['table']}
                WHERE sector = 'CTS')
            GROUP BY nuts3""",
            index_col="nuts3",
        )

        # Scale demands on nuts3 level linear to heat demand share
        peta["demand"] = peta["share"].mul(
            demand_nuts3.demand[peta["nuts3"]].values
        )

        # Rename index
        peta.index = peta.index.rename("zensus_population_id")

        # Insert data to target table
        peta[["scenario", "demand", "sector"]].to_sql(
            target["table"],
            schema=target["schema"],
            con=db.engine(),
            if_exists="append",
        )
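
The proportional distribution described in the docstring of `distribute_cts_demands()` can be worked through on a small example. The sketch below is not the module's implementation: it uses plain dicts in place of the DataFrames, and the function name and arguments are hypothetical. It also assumes every region has a non-zero total heat demand, since the share would otherwise be undefined.

```python
from collections import defaultdict


def distribute_by_share(heat_demand, cell_to_region, region_demand):
    """Split each region's demand across its cells by heat demand share.

    heat_demand: {cell id: heat demand of the cell}
    cell_to_region: {cell id: nuts3 code of the cell}
    region_demand: {nuts3 code: electricity demand of the region}
    """
    # Total heat demand per region, the denominator of every share.
    region_heat = defaultdict(float)
    for cell, heat in heat_demand.items():
        region_heat[cell_to_region[cell]] += heat

    result = {}
    for cell, heat in heat_demand.items():
        region = cell_to_region[cell]
        share = heat / region_heat[region]
        result[cell] = share * region_demand[region]
    return result


cells = distribute_by_share(
    heat_demand={1: 30.0, 2: 10.0, 3: 5.0},
    cell_to_region={1: "A", 2: "A", 3: "B"},
    region_demand={"A": 100.0, "B": 8.0},
)
print(cells)  # {1: 75.0, 2: 25.0, 3: 8.0}
```

Cells 1 and 2 hold 75 % and 25 % of region A's heat demand and therefore receive 75 and 25 of its 100 units; cell 3 is alone in region B and receives all of it.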

