pypsa-eur/scripts/build_industrial_distribution_key.py
Fabian Neumann 1fc1d2a17d
Revision complete (#139)
2021-07-01 20:09:04 +02:00


"""Build industrial distribution keys from hotmaps database."""

import uuid
from itertools import product

import geopandas as gpd
import pandas as pd


def locate_missing_industrial_sites(df):
    """
    Locate industrial sites without valid locations based on
    their city and country. Should only be used if the model's
    spatial resolution is coarser than individual cities.
    """

    # geopy is an optional dependency, so it is imported lazily
    try:
        from geopy.geocoders import Nominatim
        from geopy.extra.rate_limiter import RateLimiter
    except ImportError:
        raise ModuleNotFoundError(
            "Optional dependency 'geopy' not found. "
            "Install via 'conda install -c conda-forge geopy' "
            "or set 'industry: hotmaps_locate_missing: false'."
        )

    locator = Nominatim(user_agent=str(uuid.uuid4()))
    # wait at least two seconds between requests to the geocoding service
    geocode = RateLimiter(locator.geocode, min_delay_seconds=2)

    def locate_missing(s):

        if pd.isna(s.City) or s.City == "CONFIDENTIAL":
            return None

        loc = geocode([s.City, s.Country], geometry='wkt')
        if loc is not None:
            print(f"Found:\t{loc}\nFor:\t{s['City']}, {s['Country']}\n")
            return f"POINT({loc.longitude} {loc.latitude})"
        else:
            return None

    missing = df.index[df.geom.isna()]

    df.loc[missing, 'coordinates'] = df.loc[missing].apply(locate_missing, axis=1)

    # report stats
    num_still_missing = df.coordinates.isna().sum()
    num_found = len(missing) - num_still_missing
    share_missing = len(missing) / len(df) * 100
    share_still_missing = num_still_missing / len(df) * 100
    print(f"Found {num_found} missing locations.",
          f"Share of missing locations reduced from {share_missing:.2f}% to {share_still_missing:.2f}%.")

    return df


def prepare_hotmaps_database(regions):
    """
    Load hotmaps database of industrial sites and map onto bus regions.
    """

    df = pd.read_csv(snakemake.input.hotmaps_industrial_database, sep=";", index_col=0)

    # the geom column holds "<srid>;<WKT point>"
    df[["srid", "coordinates"]] = df.geom.str.split(';', expand=True)

    if snakemake.config['industry'].get('hotmaps_locate_missing', False):
        df = locate_missing_industrial_sites(df)

    # remove those sites without valid locations
    df.drop(df.index[df.coordinates.isna()], inplace=True)

    df['coordinates'] = gpd.GeoSeries.from_wkt(df['coordinates'])

    gdf = gpd.GeoDataFrame(df, geometry='coordinates', crs="EPSG:4326")

    # note: newer geopandas releases name the `op` argument `predicate`
    gdf = gpd.sjoin(gdf, regions, how="inner", op='within')

    gdf.rename(columns={"index_right": "bus"}, inplace=True)
    gdf["country"] = gdf.bus.str[:2]

    return gdf


def build_nodal_distribution_key(hotmaps, regions):
    """Build nodal distribution keys for each sector."""

    sectors = hotmaps.Subsector.unique()
    countries = regions.index.str[:2].unique()

    keys = pd.DataFrame(index=regions.index, columns=sectors, dtype=float)

    pop = pd.read_csv(snakemake.input.clustered_pop_layout, index_col=0)
    pop['country'] = pop.index.str[:2]
    ct_total = pop.total.groupby(pop['country']).sum()
    keys['population'] = pop.total / pop.country.map(ct_total)

    for sector, country in product(sectors, countries):

        regions_ct = regions.index[regions.index.str.contains(country)]

        facilities = hotmaps.query("country == @country and Subsector == @sector")

        if not facilities.empty:
            emissions = facilities["Emissions_ETS_2014"]
            if emissions.sum() == 0:
                key = pd.Series(1 / len(facilities), facilities.index)
            else:
                # BEWARE: this is a strong assumption
                emissions = emissions.fillna(emissions.mean())
                key = emissions / emissions.sum()
            key = key.groupby(facilities.bus).sum().reindex(regions_ct, fill_value=0.)
        else:
            key = keys.loc[regions_ct, 'population']

        keys.loc[regions_ct, sector] = key

    return keys


if __name__ == "__main__":
    if 'snakemake' not in globals():
        from helper import mock_snakemake
        snakemake = mock_snakemake(
            'build_industrial_distribution_key',
            simpl='',
            clusters=48,
        )

    regions = gpd.read_file(snakemake.input.regions_onshore).set_index('name')

    hotmaps = prepare_hotmaps_database(regions)

    keys = build_nodal_distribution_key(hotmaps, regions)

    keys.to_csv(snakemake.output.industrial_distribution_key)
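

# ---------------------------------------------------------------------------
# Illustrative sketch, not part of the original script: it walks through the
# emissions-weighted branch of build_nodal_distribution_key() for a single
# country and sector. The function name, bus names and emission values are
# hypothetical and chosen only to make the arithmetic easy to follow.
# Only pandas is assumed. The function is never called by the workflow.
# ---------------------------------------------------------------------------
import pandas as pd


def toy_distribution_key():
    """Return an emissions-weighted key for three made-up facilities."""

    # hypothetical facilities: two on bus "DE0 0", one on bus "DE0 1"
    facilities = pd.DataFrame(
        {
            "bus": ["DE0 0", "DE0 0", "DE0 1"],
            "Emissions_ETS_2014": [30.0, None, 10.0],
        }
    )

    # hypothetical bus regions of the country; "DE0 2" hosts no facility
    regions_ct = pd.Index(["DE0 0", "DE0 1", "DE0 2"])

    emissions = facilities["Emissions_ETS_2014"]

    # same strong assumption as in the script: facilities without reported
    # emissions are assigned the mean of the reported ones
    emissions = emissions.fillna(emissions.mean())

    # share of each facility in the country's sector emissions
    key = emissions / emissions.sum()

    # aggregate facility shares per bus; buses without facilities get zero
    return key.groupby(facilities.bus).sum().reindex(regions_ct, fill_value=0.0)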