Transnational methodology


Introduction

Scope and targets

In the CitiEnGov project, partners have been asked to describe how they deal with energy-related data about buildings, mobility and public lighting.

The descriptions covered their current situation and readiness regarding energy data management, the tools they use to manage the data, and their suggestions for improving these tools.

Energy data management and the tools used for it vary from region to region, which is why harmonized energy-related data are needed (see the next chapter for an explanation). Standards and technologies for sharing interoperable energy-related data are available at European level. The transnational methodology described in this document is based on these standards and on an evaluation of the partners’ answers. The medium-term goal is to make energy-related data available on the basis of a harmonized energy data model. This means that energy data are collected accurately and reliably in every region, together with ICT services for sharing them.

Due to the technical nature of the subject, the text presented here is mainly addressed to ICT and geo-ICT experts with sufficient skills in:

  • Data and database modelling, data extraction/transformation/load
  • Web services for presenting and sharing data
  • Standards for interoperability, in particular related to geographic information

This text is also available in PDF version here.

The need for harmonized energy-related data

Cities are places where energy is produced and consumed, so it makes sense to focus on them when dealing with energy data: they offer great potential for reducing energy consumption and thus increasing efficiency. As a direct consequence of this focus, it is of utmost importance to have comprehensive knowledge of the supply of and demand for energy resources, including their spatial distribution within urban areas. Precise, reliable and integrated knowledge of the energy infrastructure in urban space, of the characteristics of buildings, and of their mutual dependencies and interrelations plays a relevant role in advanced simulations and energy-related analyses.

A major challenge in climate change mitigation, when it comes to energy-related data, is continuous access to the data, which should be straightforward, reliable and designed to work over a long period of time. Such data could support sustainable energy policies and influence public investments, since they help reduce the energy consumption of buildings and transport.

As reported by the Joint Research Centre of the European Commission in Location data for buildings related energy efficiency policies, “to implement and monitor energy efficiency policies effectively, local authorities and Member States are required to report on baseline scenarios (e.g. the Baseline Emissions Inventories in the Covenant of Mayors initiative) and on progress made at regular intervals (Annual Reports for the Energy Efficiency Directive and the Energy Performance of Buildings Directive and Monitoring Emissions Inventories every two years for the CoM)”.

Some tools are available to local authorities and Member States for reporting on and monitoring energy efficiency policies, but they are very basic in their operation. They only allow users to input approximate and aggregated values, and when local data are not available, local authorities may rely on national data, which often do not reflect the actual situation at local level. Harmonized data within a common framework would solve this, as they would be structured to capture data from single buildings, through districts, up to the national level. Therefore, creating a framework for reporting and monitoring energy efficiency policies, with a strong emphasis on harmonized data, could improve interoperability between different directives and initiatives.

Scaling and relation between EU Directives and location (source: EC JRC, 2015, Location data for buildings related energy efficiency policies)

Within such a framework, providing all relevant data on buildings accurately and consistently will significantly improve data quality and reliability, and thus enable effective scenario modelling, which is used to support the overall policy process. Furthermore, from a market perspective, web-based tools that provide access to energy performance data of buildings could improve knowledge of the territory and support the activities of companies and industries operating in this area, such as energy service companies or companies dealing with the renovation of buildings.

The CitiEnGov harmonized data model

CitiEnGov project is dealing with three main sectors:

  • Buildings
  • Mobility
  • Public lighting

These sectors cover a wide range of areas that need to be considered. Especially when dealing with the topic of energy within Public Authorities, this represents a large spectrum of activities that are often listed in action plans such as SEAPs or SECAPs. The Covenant of Mayors states that action plans (SEAPs or SECAPs) should include actions that cover the sectors of activity from both public and private actors, covering the whole geographical area of the local authority committed (open reference).

While it is also noted that signatories are free to choose their main areas of action, it is anticipated that most action plans cover the sectors that fall under the risk & vulnerability assessment and the emission inventory. The mitigation part of action plans has recommended sectors, which are also the main part of the CitiEnGov project:

  • Municipal buildings, equipment/facilities
  • Tertiary (non municipal) buildings, equipment/facilities
  • Residential buildings
  • Transport
  • Industry
  • Local electricity production
  • Local heat/cold production
  • Others (e.g. Agriculture, Forestry, Fisheries)

The adaptation part of SECAP documents focuses on the resilience of cities. Here, the sectors to be chosen depend mostly on the context and individual situation of the city. However, some of the main sectors that can improve the resilience of a city are:

  • Infrastructure
  • Public Services
  • Land Use Planning
  • Environment & Biodiversity
  • Agriculture & Forestry
  • Economy

Transnationality and the need for using INSPIRE

The idea presented here is to build up the “transnational template” starting from initiatives already defined at European level by the data specifications related to the INSPIRE Directive.

The conceptual model starts from the Data Specifications defined by the INSPIRE Directive as baseline, and considers all requirements and characteristics of energy data that partners provided.

Even though the implementation of INSPIRE data models is neither the focus nor the goal of CitiEnGov, they will be used as a starting point and as a common approach to reach a common view and common semantics about energy data.

Therefore, the objective of this activity will be twofold:

  • a common conceptual data model, to be considered as a possible target schema for exporting and sharing data outside the local context and outside the organization;
  • a reference implementation, as SQL-based relational database (possibly for Oracle and PostGIS platforms)

It is noteworthy that the final goal is not to force CitiEnGov partners to change the way they use energy-related data internally, but to help them to generate a neutral and standardized semantics.

The importance of sharing the same semantics about energy-related data can be clarified with the following example: in March 2017, during a CitiEnGov videoconference (SIPRO, GOLEA, DEDAGROUP PUBLIC SERVICES), a practical requirement coming from Slovenian regions was discussed, where data about energy consumption are usually shared between utilities (data providers/custodians) and Public Authorities. Data about consumption are:

  • temporally aggregated on annual basis
  • divided by fuel (e.g. gas, electricity, district heating, … )
  • divided by “building” categories

In the case of building “categories” GOLEA mentioned that they usually get these data divided in terms of “uses of buildings”:

  • residential
  • industrial
  • offices
  • commerce
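The shape of such data can be illustrated with a minimal Python sketch that sums individual consumption values into annual totals per fuel and per use of building. The `Reading` class, its field names and the sample values are assumptions made for illustration only; they are not part of the CitiEnGov data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One consumption value as a utility might deliver it (hypothetical layout)."""
    year: int
    fuel: str          # e.g. "gas", "electricity", "district heating"
    building_use: str  # e.g. "residential", "offices"
    kwh: float


def aggregate_annual(readings):
    """Sum consumption per (year, fuel, building use)."""
    totals = defaultdict(float)
    for r in readings:
        totals[(r.year, r.fuel, r.building_use)] += r.kwh
    return dict(totals)


# Made-up sample values, for illustration only.
sample = [
    Reading(2016, "gas", "residential", 1200.0),
    Reading(2016, "gas", "residential", 800.0),
    Reading(2016, "electricity", "offices", 500.0),
]
annual = aggregate_annual(sample)
```

The grouping key mirrors the three characteristics listed above: the year, the fuel and the building category.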

Indeed, even though these categories are quite similar in different countries, often they do not have the same meaning. That’s why we need to look at INSPIRE in terms of semantics (and not merely in terms of Directive’s principles, data requirements or technical specifications); semantics practically means that we already have some basic concepts like buildings’ typologies, or (better) “uses of buildings” as already defined by INSPIRE: http://inspire.ec.europa.eu/codelist/CurrentUseValue

The codelist above contains what INSPIRE conceives when we think of “uses of buildings”. This codelist is:

  • not closed, but can be extended, as in this example
  • available in different EU languages … therefore users can switch from English to German or Slovenian or Polish and get the clear definition of each value, in national languages, as in this example
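As a rough illustration of what sharing semantics means in practice, the following Python sketch translates the local categories mentioned above into URIs of the INSPIRE CurrentUseValue codelist. The mapping itself (for example "commerce" to `trade`) is an assumption made for this example and would have to be agreed by the partners and checked against the INSPIRE registry.

```python
INSPIRE_CURRENT_USE = "http://inspire.ec.europa.eu/codelist/CurrentUseValue"

# Assumed mapping from local categories to INSPIRE code values (illustrative only).
LOCAL_TO_INSPIRE = {
    "residential": "residential",
    "industrial": "industrial",
    "offices": "office",
    "commerce": "trade",
}


def to_inspire_uri(local_category: str) -> str:
    """Return the codelist URI for a local category, or raise KeyError if unmapped."""
    key = local_category.strip().lower()
    if key not in LOCAL_TO_INSPIRE:
        raise KeyError(f"no INSPIRE mapping defined for {local_category!r}")
    return f"{INSPIRE_CURRENT_USE}/{LOCAL_TO_INSPIRE[key]}"


offices_uri = to_inspire_uri("Offices")
```

Raising an error for unmapped categories, rather than guessing, keeps gaps in the mapping visible so that the codelist can be extended deliberately.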

Of course, this is just a simple example of what we mean when talking about “semantics” related to energy data. In the deliverable DT1.2.1 project partners already shared a common definition of other “concepts” like:

  • energy type (primary, estimated, final, …)
  • energy source (biogas, natural gas, electricity, solid fuels, warm water or stream, …)
  • heating systems (central heating, district heating, electric radiators, solar heating, stove, …)
  • … etc

A first conceptual version of the data model was provided to CitiEnGov partners in July 2017. To facilitate understanding of, and subsequent agreement on, the conceptual model (by September 15th, 2017), CitiEnGov partners were provided with 2 different documents:

  • PowerPoint slides, explaining the rationale of the proposed data model
  • Excel spreadsheet, containing the list of classes/tables and their attributes needed to cover all possible aspects of “energy database” related to buildings, transport and public lighting

The data model consists of 3 main classes (that will be tables in the physical database implementation) corresponding to the 3 sectors the project is focused on:

  • building
  • transport
  • installation (public light)

A physical implementation of the data model has been developed in CitiEnGov with a standard SQL structure provided to all CitiEnGov partners; the CitiEnGov SQL data model is available for the two most widely used spatial relational database platforms: Oracle and PostgreSQL/PostGIS.

Conceptual data model

As mentioned above, the data model relies on the INSPIRE Data Specifications: for instance, for the "Buildings" sector the Technical Guidelines considered are available at: http://inspire.jrc.ec.europa.eu/documents/Data_Specifications/INSPIRE_DataSpecification_BU_v3.0.pdf

The following image summarizes the conceptual data model:

Figure: CitiEnGov conceptual data model (CitiEnGov ConceptualDataModel.png)

The model is based on the following basic classes:

Buildings
stores data about the building stock at different levels of detail: building units, buildings, energy plants and facilities, blocks of buildings or districts.
open details about Buildings
Installation
stores data about building HVAC systems, energy meters and public lighting lamps and lines.
open details about Installation
Transport
stores data about transport (number of vehicles, renewal rate, …) for different transportation groups (municipal fleet, public transport, private/commercial cars, etc.). It can optionally be associated with a spatial element (e.g. the administrative area where the fleet is contained, the spatial extent of the public transport line being described, etc.).
open details about Transport
EnergyAmount
stores all the information about energy: primary energy production, final energy consumption, renewable production, vehicle fuel consumption.
open details about Energy amount
Geometry
the classes Buildings, Installation and Transport all have a geometry attribute (mandatory for the first two classes) that can be populated with a point, a line or a 2D polygon.
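To make the relations between these classes more concrete, they can be sketched as Python dataclasses. This is only an illustration: apart from the class names and the mandatory or optional geometry, every attribute name below is a hypothetical placeholder and not the attribute list agreed by the partners.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Coordinate = Tuple[float, float]


@dataclass
class Geometry:
    kind: str                      # "point", "line" or "polygon" (2D)
    coordinates: List[Coordinate]

    def __post_init__(self):
        if self.kind not in ("point", "line", "polygon"):
            raise ValueError(f"unsupported geometry kind: {self.kind}")


@dataclass
class EnergyAmount:
    kind: str         # e.g. "final energy consumption" (hypothetical values)
    year: int
    value_kwh: float


@dataclass
class Building:
    identifier: str
    geometry: Geometry             # mandatory
    energy: List[EnergyAmount] = field(default_factory=list)


@dataclass
class Installation:
    identifier: str
    geometry: Geometry             # mandatory
    energy: List[EnergyAmount] = field(default_factory=list)


@dataclass
class Transport:
    identifier: str
    group: str                     # e.g. "municipal fleet"
    number_of_vehicles: int
    geometry: Optional[Geometry] = None   # optional spatial element
    energy: List[EnergyAmount] = field(default_factory=list)


school = Building("B1", Geometry("point", [(14.5, 46.05)]),
                  [EnergyAmount("final energy consumption", 2016, 85000.0)])
fleet = Transport("T1", "municipal fleet", 12)
```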

Physical implementation of data model

As mentioned above, the physical implementation of the data model will be a reference implementation for the two most widely used platforms, Oracle and PostgreSQL/PostGIS. The SQL scripts for creating the database are available (in pdf files) at the following links:


The physical implementation of the CitiEnGov harmonized data model will be used to populate the database with data already available at partners’ premises or collected during the CitiEnGov project. These data will be transformed by CitiEnGov partners using ETL (Extract, Transform, Load) tools. Several options exist to achieve this data transformation:

  • using SQL or PL/SQL (or PL/pgSQL) scripting language
  • Kettle software
  • FME software
  • HALE software
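Whatever tool is chosen, the three ETL steps have the same shape. The following self-contained Python sketch uses only the standard library and an in-memory SQLite database as a stand-in for Oracle or PostGIS. The source columns, the target table `energy_amount` and the conversion factor are assumptions made for illustration and do not come from the CitiEnGov scripts.

```python
import csv
import io
import sqlite3

# Hypothetical source file as a partner might hold it locally.
SOURCE_CSV = """building_id,use,year,gas_m3
B1,Residential,2016,1000
B2,Offices,2016,250
"""

KWH_PER_M3_GAS = 10.0  # assumed conversion factor, for illustration only


def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise the category labels and convert volumes to kWh."""
    return [
        (row["building_id"], row["use"].strip().lower(), int(row["year"]),
         float(row["gas_m3"]) * KWH_PER_M3_GAS)
        for row in rows
    ]


def load(conn, records):
    """Load: write the transformed records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS energy_amount "
        "(building_id TEXT, building_use TEXT, year INTEGER, kwh REAL)"
    )
    conn.executemany("INSERT INTO energy_amount VALUES (?, ?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute(
    "SELECT building_id, building_use, kwh FROM energy_amount ORDER BY building_id"
).fetchall()
```

Keeping the three steps as separate functions makes it possible to replace only the load step when moving from this stand-in to the actual target database.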

The physical data model will be provided to partners containing the following SQL statements:

  • CREATE statements for all tables of the “SCC solutions database”. In SQL, a CREATE statement creates an object in a relational database management system (RDBMS). In the SQL 1992 specification, the types of objects that can be created are schemas, tables, views, domains, character sets, collations, translations, and assertions. Many implementations extend the syntax to allow the creation of additional objects, such as indexes and user profiles.
  • ALTER statements to add constraints related to Primary Keys. In SQL, an ALTER statement changes the properties of an object inside a relational database management system (RDBMS).
  • INSERT INTO statements, used to insert new records into the tables corresponding to codelists; each INSERT specifies both the column names and the values to be inserted.
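The three kinds of statements can be sketched as follows. The table name `cl_current_use` and its columns are hypothetical and are not taken from the CitiEnGov scripts. The PostgreSQL-style text shows the CREATE / ALTER / INSERT sequence as described above; the executable part uses Python's built-in SQLite, which cannot add a primary key through ALTER TABLE, so the key is declared inline there.

```python
import sqlite3

# PostgreSQL-style statements in the three-step shape described above.
# Shown for reference only; they are not executed in this sketch.
POSTGRES_STYLE = """
CREATE TABLE cl_current_use (code VARCHAR(50) NOT NULL, label VARCHAR(200));
ALTER TABLE cl_current_use ADD CONSTRAINT pk_cl_current_use PRIMARY KEY (code);
INSERT INTO cl_current_use (code, label) VALUES ('residential', 'Residential');
"""

conn = sqlite3.connect(":memory:")

# SQLite stand-in: the primary key is declared directly in the CREATE statement.
conn.execute("CREATE TABLE cl_current_use (code TEXT PRIMARY KEY, label TEXT)")

# INSERT naming both the columns and the values, as in the codelist scripts.
conn.executemany(
    "INSERT INTO cl_current_use (code, label) VALUES (?, ?)",
    [("residential", "Residential"), ("industrial", "Industrial")],
)

# The primary key rejects a second record with the same code.
try:
    conn.execute(
        "INSERT INTO cl_current_use (code, label) VALUES ('residential', 'Duplicate')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM cl_current_use").fetchone()[0]
```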

ICT services to share energy data

The sharing of energy-related data will rely on the deployment of web geo-ICT services based on open standards. These web services will range from catalogue services for browsing and searching data in distributed metadata catalogues to services for visualizing or accessing data. Client applications implemented by CitiEnGov partners to present energy-related data (e.g. portals) need to use these web services directly, connecting to them through standard interfaces/protocols. Data services are services related to data ingestion, management, view and access; from the data provider/publisher point of view (and also according to the ISO 19119 taxonomy), the data services can be grouped in the following macro-categories:

  • discovery services
  • viewing services
  • access services (download)
  • processing services (subsetting, ordering, filtering)

These web geo-ICT services may be implemented using proprietary solutions like Esri ArcGIS Server (http://server.arcgis.com/en/) or open source ones like GeoServer (http://geoserver.org/). It is crucial that the solution chosen by the partner implements open standard protocols like the ones mentioned hereafter.


Discovery services

The discovery of energy datasets is usually performed through search functions in metadata catalogues; metadata describe the general characteristics of each dataset, independently of the distribution formats or of the availability of services that operate on the dataset. One dataset, whether geographical, tabular or other, may have different representations; in the case of geographical data, the “discovery metadata” may provide a general but structured description (responsible parties, dates, licenses, lineage, …) and refer to one or more “resources”. For instance, a metadata record describing a geographical dataset may refer to one or more of the following “resources” in different possible formats and standard protocols:

  • a CSV or XLS formatted file containing the tabular representation of data
  • a ZIP file containing vector representation of data (e.g. SHP with DBF for attributes), to allow Geographic Information Systems’ users to easily work on simple flat datasets
  • a KML encoded file, for being represented in Google Earth or other 3D / globe viewers
  • a GML encoded file, in case of complex spatial data to be provided in an interoperable and open standard format
  • a web service conformant to OGC WMS standard interface, to allow the visualisation of maps in web or desktop map viewers
  • a web service conformant to OGC WFS standard interface service, with dynamic outputs based on the same formats (SHP/ZIP, KML, GML, …) so to allow the downloading of subsets of data based on filters, or for the downloading of frequently updated data
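The relation between one discovery metadata record and its several resources can be sketched in Python as below. The field names, the protocol labels and the example.org URLs are hypothetical; an actual catalogue would follow its own metadata profile.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    kind: str   # e.g. "CSV", "SHP/ZIP", "KML", "GML", "OGC:WMS", "OGC:WFS"
    url: str


@dataclass
class MetadataRecord:
    title: str
    responsible_party: str
    license: str
    resources: List[Resource] = field(default_factory=list)


def search(catalogue, text):
    """Return the records whose title contains the search text (case-insensitive)."""
    needle = text.lower()
    return [record for record in catalogue if needle in record.title.lower()]


# Made-up catalogue content, for illustration only.
catalogue = [
    MetadataRecord(
        "Public lighting lamps", "Example Municipality", "CC BY 4.0",
        [Resource("CSV", "https://example.org/data/lamps.csv"),
         Resource("OGC:WMS", "https://example.org/geoserver/wms"),
         Resource("OGC:WFS", "https://example.org/geoserver/wfs")],
    ),
    MetadataRecord("Municipal buildings", "Example Municipality", "CC BY 4.0"),
]

hits = search(catalogue, "lighting")
kinds = [resource.kind for resource in hits[0].resources]
```

The record describes the dataset once, while each resource adds one way of obtaining or viewing it.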

View services

Since data visualization is sometimes mistaken for data access, it may be appropriate to highlight here the principal differences:

  • accessing data involves the possibility of querying, sub-setting and filtering (it’s a necessary condition, but not sufficient since certain view services have the capabilities of expressing a filter);
  • accessing data necessarily uses a physical data format, but does not depend on it; the representation of data instead is an integral part of viewing services;
  • very often in viewing services, the representation of data completely hides underlying data making it impossible to recover them (approximations, portrayal, simplifications, generalization, aggregation etc.; usually these are part of the viewing service).
  • data coming from an access service can be subsequently elaborated without loss or without the need of particular pre-elaboration.

CitiEnGov partners may expose services for viewing energy-related data via web services based on well-known APIs for representing tabular data, or through the WMS / WMTS protocols defined by the Open Geospatial Consortium (OGC) for maps. For spatial data, viewing means producing an image from the data by applying a set of rendering rules, whereas viewing a classic alphanumeric dataset can be achieved by producing a tabular or a graphical representation. The different infrastructural components are optimized to handle the different types of data, and this results in various protocols and standards being used in the data services. In the same way, accessing data can have several implementations: WFS for spatial data, CSV for tabular data, a SPARQL endpoint for linked data (see the following section). The CitiEnGov partners may also offer functionalities to let clients visualise:

  • tabular data, with filtering/searching capabilities to extract or sort subset of datasets
  • graphics (dashboards), based on open source Javascript libraries to render statistical data with high quality diagrams and presentation styles
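For the map case, a view request boils down to asking the server for a rendered image. The Python sketch below builds a WMS 1.3.0 GetMap URL from the standard request parameters. The endpoint, the layer name and the bounding box are hypothetical, and the sketch only constructs the URL without contacting any server.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width, height,
                   crs="EPSG:3857", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap request URL.

    bbox is (minx, miny, maxx, maxy) in the units of the given CRS.
    Note that some CRSs (e.g. EPSG:4326) use a different axis order in WMS 1.3.0.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",          # empty value requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer name.
map_url = wms_getmap_url(
    "https://example.org/geoserver/wms", "energy:public_lighting",
    (1600000, 5780000, 1620000, 5800000), 800, 800,
)
```

The response to such a request is an image only, which is why the underlying values cannot be recovered from a view service.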

Download services

In the context of CitiEnGov project, different representations of energy data are foreseen:

  • tabular data, with records and rows to present data in CSV, XLS or other formats
  • geographic vector, with spatial features representing buildings, transport networks or public lighting with vectors
  • geographic coverage, with raster images of spatial phenomena (e.g. energy production may be provided as spatial data in the form of a raster layer, with regular grid containing cells with different values of energy consumption)
  • geographic sensor, with near real-time data coming from sensors (e.g. energy consumption at single municipal buildings level)

As per INSPIRE definitions, a download service for vector geographic data is equivalent to a web service implementing the OGC WFS standard interface; the intention being that the user is given access to the raw data values instead of a cartographic representation as is the case with e.g. WMS requests that only return a map image. Access to the raw data enables two key benefits:

  1. the ability to perform calculation and analysis using the vector geometries or raster cell data
  2. the ability to draw non-pixelated map images at all scales using client-side rendering
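The first benefit can be illustrated with a short Python sketch: a WFS 2.0.0 GetFeature URL is built from standard parameters, and a calculation is then run on raw feature values. The endpoint, the feature type name, the `kwh` property and the inline GeoJSON sample are assumptions for illustration; whether a server offers `application/json` as an output format depends on the implementation. No request is actually sent.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(base_url, type_name, count=None, output_format=None):
    """Build a WFS 2.0.0 GetFeature request URL."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "2.0.0",
        "REQUEST": "GetFeature",
        "TYPENAMES": type_name,
    }
    if count is not None:
        params["COUNT"] = count
    if output_format is not None:
        params["OUTPUTFORMAT"] = output_format   # server-dependent value
    return f"{base_url}?{urlencode(params)}"


def total_consumption(feature_collection, prop="kwh"):
    """Sum one numeric property over all features of a GeoJSON FeatureCollection."""
    return sum(feature["properties"][prop]
               for feature in feature_collection["features"])


# Made-up response content standing in for what a download service could return.
SAMPLE = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [14.50, 46.05]},
         "properties": {"kwh": 1200.0}},
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [14.51, 46.06]},
         "properties": {"kwh": 800.0}},
    ],
}

feature_url = wfs_getfeature_url("https://example.org/geoserver/wfs",
                                 "energy:building", count=100,
                                 output_format="application/json")
total = total_consumption(SAMPLE)
```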

CitiEnGov partners may implement an extended set of download services that goes beyond the INSPIRE requirements. Each extension provides a specific performance benefit and the total implementation includes the following protocols and formats.


| Service protocol | Data format, transport | Benefits |
| --- | --- | --- |
| Web Feature Service (WFS) | GML/XML, GeoJSON, CSV | Provides interoperable methods to access and work with remote spatial data sources. |
| SPARQL | RDF/XML, RDF/JSON | Provides a basis for easy extension of any dataset through RDF triple assertion. Provides Linked Data publishing. |
| Custom vector data service | TileJSON | The vector equivalent of tile map services for raster data. To remove the overhead of clipping custom extents for vector data, tiles are pre-generated. Client applications can buffer neighboring tiles into memory in order to provide smooth panning experiences. |
| Custom table data service | WebCSV, JSON, XML | This type of service can provide access to non-spatial tabular data in one of the three formats listed. WebCSV is the lightest format but has limited support in client libraries. JSON has relatively low overhead and is widely supported by browser based end-user clients. Finally, XML is very easy to parse using any software technology despite a significant markup overhead. |
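The overhead comparison made for the custom table data service can be checked with a small Python sketch that serialises the same records in three ways using only the standard library. Plain CSV is used here as a stand-in for WebCSV, and the records and field names are made up for illustration.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Made-up records, for illustration only.
ROWS = [
    {"building_id": "B1", "year": 2016, "kwh": 10000.0},
    {"building_id": "B2", "year": 2016, "kwh": 2500.0},
]


def to_csv(rows):
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    return json.dumps(rows)


def to_xml(rows):
    root = ET.Element("rows")
    for row in rows:
        element = ET.SubElement(root, "row")
        for name, value in row.items():
            ET.SubElement(element, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Size in characters of each representation of the same records.
sizes = {
    "csv": len(to_csv(ROWS)),
    "json": len(to_json(ROWS)),
    "xml": len(to_xml(ROWS)),
}
```

For these records the CSV text is the shortest and the XML text the longest, since JSON repeats the field names in every record and XML repeats them twice, as opening and closing tags.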

Processing services

In the context of CitiEnGov, processing services are “partner-driven” web services linked to the detailed requirements coming from each partner in terms of data processing and user engagement. Several use cases aim to perform e.g. calculations on data about buildings, transport network, public lighting. This relies, of course, on well-known data models that contain the information that is required to run the appropriate equations/algorithms. Data will be read from the partner data store and will be consumed by the processing service where the actual analysis code is implemented. Therefore, the processing services may be a set of independent end-user applications that will consume their business logic via the APIs and (optionally) the client-side JavaScript libraries implemented at partners’ level. The following table summarizes a list of possible operations that can be performed for different categories of processing services:

Categories of processing services

| Type of data | Visualization | Querying | Processing |
| --- | --- | --- | --- |
| Non-spatial graph | X | X | N/A |
| Non-spatial table | X | X | N/A |
| Spatial graph | X | X | Proximity |
| Spatial raster | X | N/A | N/A |
| Spatial table | X | X | Proximity, overlay |
| Spatial table: building | X | X | Energy performance |
| Spatial table: network | X | X | Route calculation |
| Spatial table: point | X | X | Interpolation |
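As an example of one of the operations in the table, a proximity query on spatial point data could look like the following Python sketch. It assumes coordinates in a projected reference system expressed in metres, so that plain Euclidean distance is meaningful; the lamp identifiers and positions are made up.

```python
import math


def within_distance(origin, points, radius_m):
    """Return the identifiers of all points lying within radius_m of origin.

    origin is an (x, y) pair and points maps identifiers to (x, y) pairs,
    all in the same projected coordinate system with metre units.
    """
    origin_x, origin_y = origin
    return [
        identifier
        for identifier, (x, y) in points.items()
        if math.hypot(x - origin_x, y - origin_y) <= radius_m
    ]


# Hypothetical public lighting lamps around a building located at (0, 0).
lamps = {"L1": (10.0, 0.0), "L2": (30.0, 40.0), "L3": (200.0, 0.0)}
nearby = within_distance((0.0, 0.0), lamps, 50.0)
```

A production service would normally delegate this to the spatial functions of the database, but the principle is the same.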

Technical references for services

This section contains the technical references about interfaces, versions, operations, etc. required at server or client levels. The details of these technical references are based on previous EU projects (e.g. eENVplus, GeoSmartCity), available at the deliverables public access pages.

Technical references are divided into three main sections:

  • client: set of requirements related to client software (desktop or web) directly used by human beings to search/discover, view, access energy-related data
  • server: set of requirements related to server components, to be made available at partners’ level
  • interface: set of requirements related to standard interfaces and protocols to be considered at client and/or server side levels to guarantee interoperability

The detailed list of technical specifications is available in the separate page ICT technical guidelines.