Available online at www.sciencedirect.com

ScienceDirect

Procedia Computer Science 113 (2017) 9-16
www.elsevier.com/locate/procedia

The 8th International Conference on Emerging Ubiquitous Systems and Pervasive Networks (EUSPN 2017)

Modeling and Processing Big Data of Power Transmission Grid Substation Using Neo4j

Arbër Perçuku a,*, Daniela Minkovska b, Lyudmila Stoyanova c

a Kosovar, PhD Candidate at Faculty of Computer Systems and Technologies - TUS, str. Xh. Mustafa BB9/1, h1 No 7, Pristina 10000, Kosovo
b Bulgarian, Faculty of Computer Systems and Technologies - TUS, Blv. Kl. Ohridski, Sofia, Bulgaria
c Bulgarian, Faculty of Computer Systems and Technologies - TUS, Blv. Kl. Ohridski, Sofia, Bulgaria

Abstract

Data sizes in the power transmission grid have increased rapidly, which creates challenges. These data are large in
volume; they are generated fast and in different formats, and come from various sources such as electrical substations.
Traditional relational databases are inadequate in terms of response time, perform poorly when applied to very large
data sets, and are difficult to evolve according to business needs. To address this shortcoming, Big Data
implementations are leveraging new technologies such as NoSQL data stores. This research paper aims to improve this
process by modeling and processing those data using the Neo4j database. It presents the modeling and processing of the
data of a power transmission grid substation which has two power transformers, and then adds a new power transformer
to simulate how a Neo4j database evolves according to business needs.

© 2017 The Authors. Published by Elsevier B.V.
Peer-review under responsibility of the Conference Program Chairs.

Keywords: Big Data; Power transmission grid; Neo4j; Processing and modeling

* Corresponding author. Tel.: +386 (0) 49 687 307
E-mail address: arber.percuku@gmail.com

1877-0509 © 2017 The Authors. Published by Elsevier B.V.
10.1016/j.procs.2017.08.276

1. Introduction

The data generated by the different components and sources of the power transmission grid are crucial for the
operation of the power system. RTUs provide data on energy demand, supply, operation, system performance, etc.
These data are massive; they are generated fast and come from various substations, so managing and utilizing them
is a challenge. Traditional relational databases are inadequate in terms of response time and do not perform well
when applied to very large data sets. To address this shortcoming, Big Data implementations are leveraging new
technologies such as NoSQL data stores.
Relational databases are designed to hold data in tabular structures, but they struggle to model connected datasets
and deal poorly with relationships. Relationships exist in a relational database, but only at modeling time, as a
means of joining tables. When the structure of the dataset becomes more complex and less uniform, the relational
model becomes burdened with large join tables, sparsely populated rows and lots of null values; these affect
performance and also make the database difficult to evolve according to business needs.
Graph databases provide a natural means for modeling connected data and make it easy to evolve the model according
to business needs. The flexibility of the graph model allows new nodes and relationships to be added while the
original data remain intact. Neo4j allows more than one label to be added to a node; labels are used to represent the
roles a node plays in the graph. Relationships form paths. Querying the graph, also known as traversing it, means
following those paths, and because the data model is path-oriented by nature, such operations are efficient.
Big Data on the power grid have several applications, such as: better utilization and optimization of the system,
improved grid balancing, improved predictive equipment analytics, fault detection, protection, operations, planning,
market, asset management, etc.

Nomenclature

ACID    Atomicity, Consistency, Isolation, Durability
CQL     Cypher Query Language
EMS     Energy Management System
E-R     Entity Relationship
NoSQL   Not Only SQL
RDBMS   Relational Database Management System
RTU     Remote Terminal Unit
SCADA   Supervisory Control and Data Acquisition
SS      Substation

2. Big Data and power transmission grid

Data are created constantly. New digital technologies, particularly mobile, social, cloud, automation devices and
Big Data, have transformed the computer industry. Big Data describes the continuous increase in data and the
technologies needed to collect, store, manage and analyze them. These have an impact on people, processes and
technology. From a technology point of view, Big Data means hardware and software that integrate, organize,
manage, analyze and present data characterized mainly by three Vs: Volume (large amounts of data), Variety
(various data sources and formats) and Velocity (speed of data generation).
The electrical power transmission grid is an automated system. It has substations (SS), in which Remote Terminal
Units (RTUs) are embedded to control and monitor the grid using modern technologies. The data collected from
those RTUs are large in volume; they are generated fast, come from various sources (substations) in different
formats, and need to be processed and analyzed using Big Data technologies.

2.1. Big Data features

Big Data refers to large amounts of data produced very quickly by a high number of diverse sources [5]. Big Data
presents great opportunities, as it helps to develop new products and services and to make decisions. These data
can be created by people or generated by machines, and they cover many sectors such as energy, transport and
healthcare. The three Vs (Volume, Velocity and Variety) are the main characteristics of Big Data and have become a
common subject. They allow businesses to analyze large sets of data about different aspects of their activity and
provide a new level of insight and opportunity, often in real time or near real time.
Recently, the data in business and society have been growing. Contributors to this growth, which is enabled by
low-cost storage, include social data, mobile computing data, online business data, sensor data, automation process
and energy parameter data, digitization of voice and multimedia, and so on. Increasing volumes and complexity of
data have led to the emergence of new technologies and processes. It is not enough to collect huge volumes of data;
Big Data initiatives have to identify the right data, organize them in a form that can be explored with analytics,
and then use them to gain insights. Improving a business process depends on having the data to understand it in
more detail, and these data can be power grid information, medical equipment data, telemetry from aircraft, etc.
With analytical tools, these types of data enable better prediction of future activity and performance, and allow
organizations to adjust processes to achieve better outcomes. As an example, monitoring a power grid or a plant and
their equipment helps us to know in advance when they may fail, and so to take preventive action.
A range of new technologies such as NoSQL have emerged to deal with the technical aspects of processing and
managing large amounts of data. Big Data can bring performance improvements by leading to more effective decision
making and optimized business processes.

2.2. Electrical power transmission grid data

Energy plays an important role in keeping homes and businesses running without problems and in supplying
appliances in all sectors. The main objective of energy policy is to provide secure, reliable, sustainable and
low-cost supplies of electrical energy to customers. Matching power generation and load in the electrical grid,
while maintaining the stability and reliability of the grid and the quality of supply, is a challenge. Electrical
energy is generated in power plants and transmitted over large distances through the system called the power
transmission grid. Electrical substations are secondary stations of electricity generation, transmission and
distribution systems, where voltage is converted from high to low or the reverse using transformers; electric power
may flow through several substations between the power plants and the consumer. A transmission substation connects
two or more transmission lines, power transformers, etc.
Power grid reliability is achieved through control and monitoring of system conditions, where automated actions
and decision making are intended to ensure that the system is stable and balanced at reasonable cost. Control of the
power transmission grid relies on the measurement and monitoring of electrical parameters such as system frequency,
voltage, current, active power, reactive power and so on.
Control, monitoring and remote operation are necessary for any modern power system. Usually a SCADA system is
used for this purpose. Power system automation is the process of automatically controlling the power system via
instrumentation and control devices, and substation automation refers to using data to control and automate
capabilities within the substation.
The management and utilization of the data generated by the different components and sources of the power
transmission grid are crucial for the deployment and operation of the system. RTUs and communication devices
provide data on energy demand, supply, operation and system performance. Those data are massive, are generated fast
and come from various substations. Traditional databases do not show the best performance for such large amounts
of data. Big Data technologies allow those massive amounts of data to be analyzed, coordinated and used efficiently.
Big Data can help to: improve the operating efficiency of the transmission grid; predict equipment failures and
power outages; effectively integrate renewable energy sources; make better decisions, etc.
Power system automation, control, monitoring, etc. require reliable real-time data processing. To support this
requirement and to enable future analysis and decision making, large amounts of data should be saved and stored in
the archiving database. Huge amounts of data can be generated from various sources through diverse measurements.
In this paper we take a substation at the 110 kV level and focus on processing the following data: system frequency,
voltages, currents, apparent power, active power, reactive power, winding temperature of the power transformer, oil
temperature of the power transformer, tap-changer position of the power transformer, and manually entered data.

3. Graph database - Neo4j

The core structure of a graph database is the node-relationship structure. Nodes and relationships support
properties, which are key-value pairs. Graph databases are navigated by following relationships, which is not the
case in an RDBMS because of its rigid schema and table structure. A graph database stores data in a graph and is
capable of representing any kind of data in a highly accessible way. The records in a graph database are called
nodes. Nodes are connected through typed, directed arcs called relationships. Each node and relationship can have
attributes, referred to as properties. A label is a name that organizes nodes into groups.
The most widely used graph database is Neo4j. Neo4j is a NoSQL database whose data model is a graph, specifically
a property graph. It is written in Java and is open source [10].
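
The property graph described above can be sketched in a few lines of Python. This is only an in-memory illustration of the concepts (labels, key-value properties, typed and directed relationships), not how Neo4j stores data; the class names, the FEEDS relationship type and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A graph record: labels group nodes, properties are key-value pairs."""
    labels: set
    properties: dict = field(default_factory=dict)
    outgoing: list = field(default_factory=list)  # relationships that start here


@dataclass(eq=False)
class Relationship:
    """A typed, directed arc from `start` to `end` with its own properties."""
    type: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


def connect(start, rel_type, end, **properties):
    """Create a directed relationship and register it on its start node."""
    rel = Relationship(rel_type, start, end, properties)
    start.outgoing.append(rel)
    return rel


# Hypothetical example data: one substation feeding one transformer.
substation = Node({"Substation"}, {"ss_name": "V2", "KV": 110})
transformer = Node({"Transformer", "Load"}, {"name": "T1", "MVA": 25})
feeds = connect(substation, "FEEDS", transformer, MW=18)

# Navigating means following relationships from a node, not joining tables.
reached = [rel.end.properties["name"] for rel in substation.outgoing]
print(reached)
```

Note that the transformer carries two labels, which mirrors the idea that a node can play more than one role in the graph.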

3.1. Neo4j features

Since the beginning of computer software, the data that applications must deal with have grown rapidly and have
become very complex. This complexity includes not only the size of the data but also their relations, frequent
changes to their structure, and concurrent access to the data. Relational databases are not the best solution for
data that change in these ways. As a result, different technologies have been created to solve those problems, and
they are grouped under the name Not only SQL (NoSQL) databases.
A graph is a collection of vertices and edges or, put simply, a set of nodes and the relationships that connect
them. Graphs represent entities as nodes, and the ways in which those entities relate as relationships. A property
graph contains nodes and relationships; nodes have properties, which are key-value pairs; nodes can be labeled with
one or more labels; relationships are named and directed, and have a start node and an end node; relationships can
also contain properties. The main advantages of Neo4j are as follows:
• It is easy and fast to retrieve, traverse and navigate highly connected data.
• It is easy to represent connected data.
• It represents semi-structured data easily.
• Neo4j CQL query language commands are in a human-readable format and easy to learn.
• It uses a simple and powerful data model.
• It does not require complex joins to retrieve connected or related data, because the adjacent nodes or
relationship details of a node can be retrieved without joins or indexes.
If we use an RDBMS to store highly connected data, it does not perform well when traversing large amounts of
data. In these scenarios, a Neo4j graph database improves application performance. Such applications contain lots
of structured, semi-structured and unstructured connected data. It is not easy to represent this kind of
unstructured connected data in an RDBMS, and retrieval or traversal of such data is difficult and slow. It is easy
to store and retrieve this kind of highly connected data with a Neo4j graph database.
Neo4j offers ACID transactions and high availability. Neo4j is a scalable database and is easy to model with
because of its node-relationship-property structure. It does not require a schema or data typing, so it is very
flexible. The limitation is that nodes cannot reference themselves directly. Its replication capability can
replicate entire graphs. Neo4j currently supports around 34 billion nodes and 34 billion relationships [10].

3.2. Modeling data with Neo4j

Graph data modeling is the process in which a Neo4j user describes an arbitrary domain as a connected graph of
nodes and relationships [6]. Graph databases provide a natural means for modeling the data. Graph modeling differs
from other modeling techniques because it brings the logical and physical models closer together, and graph
databases are so-called whiteboard friendly, meaning that what we draw on a whiteboard can easily be implemented
inside the database.
Unlike a traditional RDBMS, Neo4j is a schemaless database [3], which means we do not need to define tables and
relationships before adding data. A node can have any properties, and any node can be related to any other node. The
data model of a Neo4j database is implicit in the data it contains: it is a description of what we want to put in
the database, rather than an explicit part of the database itself, that is, a set of prescriptions enforced by the
database that constrain what it will accept. Neo4j data modeling is descriptive rather than prescriptive [3]; as a
result, it is easy to make changes when the application's view of the data needs to expand or change.
The most widely used graph query language for Neo4j is Cypher, an expressive graph database query language. Like
most query languages, Cypher is composed of clauses. Cypher works by matching patterns in the data, so one way to
see our data model is as an inventory of basic patterns, for example: a person lives at an address; an underground
station is connected to another underground station on an underground line; an electrical substation is connected
to another substation through an overhead line, and so on.
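
To give a rough idea of what matching a pattern means, the following Python sketch filters a small list of relationships by start label, relationship type and end label. It is a toy illustration of the concept and not how Cypher is implemented; the labels and the relationship types CONNECTED_TO and FEEDS are hypothetical names chosen for the example.

```python
# Each relationship is stored as (start, type, end); node labels live in a dict.
labels = {
    "SS_V1": {"Substation"},
    "SS_V2": {"Substation"},
    "SS_V3": {"Substation"},
    "T1": {"Transformer"},
}
relationships = [
    ("SS_V2", "CONNECTED_TO", "SS_V1"),
    ("SS_V2", "CONNECTED_TO", "SS_V3"),
    ("SS_V2", "FEEDS", "T1"),
]


def match(start_label, rel_type, end_label):
    """Return every (start, end) pair that fits (start_label)-[rel_type]->(end_label)."""
    return [
        (start, end)
        for start, kind, end in relationships
        if kind == rel_type
        and start_label in labels[start]
        and end_label in labels[end]
    ]


# Roughly what a pattern such as (a:Substation)-[:CONNECTED_TO]->(b:Substation) asks for.
pairs = match("Substation", "CONNECTED_TO", "Substation")
print(pairs)
```

The pattern is directed: asking for a transformer that feeds a substation returns nothing, because the stored relationship points the other way.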
Once we have a description of the data that will be stored in the database, we can use it to reason about queries.
A common way to express this description is to draw fragments of graphs that represent common patterns in our data,
which lets us visualize the model. A Neo4j graph database has mainly the following building blocks: Nodes,
Properties, Relationships, Labels and the Data Browser.
In relational modeling, the first stage is to understand the entities in the domain and how they are related. This
requires an E-R diagram, which takes us from the conceptual model to the logical model and then directly to table
design and normalization of the database. Before we add rows to the tables, a complex model already carries
considerable overhead in the form of foreign key constraints and join tables. One of the challenges of the
relational world is that normalized models are often not fast enough for real-world needs, so the data model has to
be changed to suit the database. This process is called denormalization, and it duplicates data. Very often, during
design and development, the model has to be revised to accommodate the needs of the application. Because of their
rigid schema and complex modeling characteristics, relational databases are not a good tool for fast changes;
performance and the ability to maintain data integrity may decrease when growth and change are fast. The graph
model, by contrast, copes well with such growth and change. Instead of transforming the model into tables, as in
relational modeling, in graph modeling we refine it to produce an accurate representation of the domain that meets
the application's needs. This means that for each entity in the domain we capture its roles as labels, its
attributes as properties, and its connections to other entities as relationships.
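
The step from tables to a graph can be illustrated with a short Python sketch: each row becomes a node, its columns become properties, and each foreign-key value becomes a relationship. The table contents, column names and the MEASURES relationship type below are assumptions made for the example and are not taken from the paper's schema.

```python
# Hypothetical rows; the column names are assumptions, not the paper's schema.
parameters = [
    {"par_id": 1, "code": "MW"},
    {"par_id": 2, "code": "MVAR"},
]
measurements = [
    {"par_id": 1, "value": 18.0},
    {"par_id": 2, "value": 11.0},
    {"par_id": 1, "value": 19.0},
]


def rows_to_graph(parents, children, key, label_parent, label_child, rel_type):
    """Turn each row into a node and each foreign-key value into a relationship."""
    nodes = {}
    relationships = []
    for row in parents:
        nodes[(label_parent, row[key])] = dict(row)
    for index, row in enumerate(children):
        child_id = (label_child, index)
        # The foreign key becomes a relationship, so it is dropped from the properties.
        nodes[child_id] = {k: v for k, v in row.items() if k != key}
        relationships.append((child_id, rel_type, (label_parent, row[key])))
    return nodes, relationships


nodes, relationships = rows_to_graph(
    parameters, measurements, "par_id", "Parameter", "Measurement", "MEASURES"
)
print(len(nodes), len(relationships))
```

In this sketch the label is kept as the first element of each node identifier; a fuller version would also check that every foreign key points at an existing parent row.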
In this research case, a power transmission grid substation is modeled using Neo4j; after that, the addition of a
new power transformer is simulated, motivated by increasing energy demand on the consumption side.

4. Transmission grid substations data in Neo4j

After the power equipment of a transmission grid substation has been modeled in the SCADA/EMS database, the
measurement data and all other event data that come from the substation need to be recorded for further analysis.
Those data are usually processed and recorded in a traditional database inside the SCADA/EMS system.
We will use a Neo4j database to process and model those kinds of data. In our research case we analyze a
substation SS_V2, which is connected to two other substations, SS_V1 and SS_V3, through overhead lines and has two
power transformers, T1 and T2 (T3 is the new transformer that will be added), see figure 1.
Using a traditional database to process and record the specified measurement data, we have to create three tables
(Parameters, Par_desc and Measurements_data) together with constraints, primary keys, foreign keys and so on, see
figure 2. As the volume of data increases, the traditional database will not perform well because:
• when we query a relationship, all JOINs are executed every time
• executing a JOIN means searching for a key in another table
• executing a JOIN means looking up a key, so more entries mean more lookups and slower JOINs
• evolving the database when the business requires it is not easy
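
The difference between looking up a key and following a stored reference can be illustrated with a small Python sketch. It is not a benchmark and says nothing about the internals of any real database; the table sizes and names are hypothetical, and the key search is written out as a plain binary search so that the number of probes can be counted.

```python
# Hypothetical data: 1000 parameter rows and one measurement row per parameter.
parameter_keys = list(range(1000))  # sorted primary keys
parameter_rows = [{"par_id": k, "code": f"P{k}"} for k in parameter_keys]
measurements = [{"par_id": k, "value": float(k)} for k in parameter_keys]


def join_lookup(par_id):
    """Relational style: find the parent row by searching a sorted key index."""
    steps = 0
    low, high = 0, len(parameter_keys)
    while low < high:  # binary search: the probe count grows with the table size
        steps += 1
        mid = (low + high) // 2
        if parameter_keys[mid] < par_id:
            low = mid + 1
        else:
            high = mid
    return parameter_rows[low], steps


# Graph style: the reference to the parent is stored once, when the link is created.
linked = [
    {"value": m["value"], "parameter": parameter_rows[m["par_id"]]}
    for m in measurements
]


def pointer_lookup(measurement):
    """Follow the stored reference; the cost does not depend on the table size."""
    return measurement["parameter"], 1


row, join_steps = join_lookup(777)
same_row, pointer_steps = pointer_lookup(linked[777])
print(join_steps, pointer_steps)
```

Both calls reach the same parent row; the key search needs several probes that increase as the table grows, while the stored reference is followed in a single step.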
With a Neo4j database, pointers are used instead of look-ups, all JOINs are done at creation time, the database can
evolve continuously, and performance tends to remain constant as the volume of data increases. In our research case,
the measured parameters that we are going to model and save in the Neo4j database are as follows:
KV0, KV4, KV8      Voltage in kilovolts (kV), phase 0, 4 or 8
AMP0, AMP4, AMP8   Current in amperes, phase 0, 4 or 8
MVA                Electric apparent power in megavolt-amperes
MW                 Active power in megawatts
MVAR               Reactive power in megavolt-amperes reactive
TAP                Transformer tap changer
WTEM               Transformer winding temperature
OTEM               Transformer oil temperature
MAN_P              Manually entered value
Hz                 System frequency
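
As a small aid for working with these parameters, the list above can be kept as a lookup table and used to check the property keys of a record before it is stored. The following Python sketch is an illustration only; the unit strings are assumptions where the text does not state one (marked with None), and the helper name is hypothetical.

```python
# Parameter codes and meanings as listed in the text; units that the text
# does not state are left as None rather than guessed.
PARAMETERS = {
    "KV0": ("Voltage, phase 0", "kV"),
    "KV4": ("Voltage, phase 4", "kV"),
    "KV8": ("Voltage, phase 8", "kV"),
    "AMP0": ("Current, phase 0", "A"),
    "AMP4": ("Current, phase 4", "A"),
    "AMP8": ("Current, phase 8", "A"),
    "MVA": ("Apparent power", "MVA"),
    "MW": ("Active power", "MW"),
    "MVAR": ("Reactive power", "MVAr"),
    "TAP": ("Transformer tap changer", None),
    "WTEM": ("Transformer winding temperature", None),
    "OTEM": ("Transformer oil temperature", None),
    "MAN_P": ("Manually entered value", None),
    "Hz": ("System frequency", "Hz"),
}


def unknown_codes(record):
    """Return the keys of a measurement record that are not known parameter codes."""
    return sorted(code for code in record if code not in PARAMETERS)


print(unknown_codes({"MW": 18, "MVAR": 11, "man_p": 1}))
```

Property keys in Neo4j are case-sensitive, so a check like this flags a key written as man_p when the catalogue lists MAN_P; whichever spelling is chosen should be used consistently.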

Fig.1. Substation SS_V2
Fig.2. Tables in traditional database

We use Neo4j Community Edition version 2.3.7 to create the nodes, relationships and properties that model the
substation SS_V2. Let us create three substation nodes, Subs_V2, Subs_V1 and Subs_V3, with some of their
properties:
CREATE (Subs_V2:SS_v2 {tso_id:1, tso_name:"K", ss_no:2, ss_name:"V2", Hz:50.00, KV:110, date_time:"06/04/17 0:00"})
CREATE (Subs_V1:SS_v1 {tso_id:1, tso_name:"K", ss_no:1, ss_name:"V1", Hz:50.00, KV:110, date_time:"06/04/17 0:00"})
CREATE (Subs_V3:SS_v3 {tso_id:1, tso_name:"K", ss_no:3, ss_name:"V3", Hz:50.00, KV:110, date_time:"06/04/17 0:00"})
Next, we model the power transformers T1 and T2 by creating two nodes, V2T1 and V2T2, with some of their
properties:
CREATE (V2T1:Load_V2T1 {ss_no:2, ss_name:"V2", KV0:110.01, KV4:109.5, KV8:109.9, MVA:25, MW:18, MVAR:11, AMP0:120,
AMP4:125, AMP8:122, TAP:5, WTEM:32, OTEM:37, date_time:"06/04/17 0:00", man_p:1})
CREATE (V2T2:Load_V2T2 {ss_no:2, ss_name:"V2", KV0:110.01, KV4:109.5, KV8:109.9, MVA:25, MW:19, MVAR:10, AMP0:120,
AMP4:125, AMP8:122, TAP:5, WTEM:32, OTEM:37, date_time:"06/04/17 0:00", man_p:1})
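
When such statements are generated by a program rather than typed by hand, the property values can be passed as a parameter map instead of being written into the query text. The following Python sketch only builds the query string and the parameter map; it does not connect to a database. The function name is hypothetical, and the `{props}` placeholder assumes the parameter syntax of Neo4j 2.x (later releases use `$props`).

```python
import re

IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def create_node_query(label, properties):
    """Build a CREATE statement plus its parameter map for one node.

    Labels cannot be passed as Cypher parameters, so the label is validated
    and inlined; property values travel separately in the `props` parameter.
    """
    if not IDENTIFIER.match(label):
        raise ValueError(f"unsafe label: {label!r}")
    # `{props}` is assumed to be the Neo4j 2.x parameter placeholder.
    query = f"CREATE (n:{label} {{props}}) RETURN n"
    return query, {"props": dict(properties)}


query, params = create_node_query(
    "Load_V2T1", {"ss_no": 2, "ss_name": "V2", "MW": 18, "MVAR": 11}
)
print(query)
print(params)
```

Keeping the values out of the query text avoids quoting mistakes and keeps numeric values numeric, which matters when the same measurements are later compared or aggregated.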
Next, we create the relationships r1 and r2 from the substation node Subs_V2 to the loads T1 (V2T1) and T2
(V2T2), with some of their properties, using one statement per relationship:
MATCH (Subs_2:SS_v2), (V2T1:Load_V2T1) WHERE Subs_2.ss_name = 'V2' AND V2T1.ss_name = 'V2'
CREATE (Subs_2)-[r1:Load1 {MW:"18", MVAR:"11"}]->(V2T1) RETURN r1, Subs_2, V2T1
MATCH (Subs_2:SS_v2), (V2T2:Load_V2T2), (V2T1:Load_V2T1) WHERE Subs_2.ss_name = 'V2' AND V2T2.ss_name = 'V2'
CREATE (Subs_2)-[r2:Load2 {MW:"19", MVAR:"10"}]->(V2T2) RETURN r2, Subs_2, V2T1, V2T2
Now that the substation SS_V2, the power transformers T1 and T2, and their relationships are modeled in Neo4j, a
query can match all the nodes and their relationships, see figure 3.
MATCH (Subs_2:SS_v2), (V2T2:Load_V2T2), (V2T1:Load_V2T1) WHERE Subs_2.ss_name = 'V2' AND V2T2.ss_name = 'V2'
RETURN Subs_2, V2T1, V2T2
As explained above, the Neo4j database is schemaless and can evolve easily if the business requires it. We model a
new power transformer T3 by creating a new node V2T3 and its relationship r3, see figure 4:
CREATE (V2T3:Load_V2T3 {ss_no:2, ss_name:"V2", KV0:110.01, KV4:109.5, KV8:109.9, MVA:25, MW:18, MVAR:11,
AMP0:120, AMP4:125, AMP8:122, TAP:5, WTEM:32, OTEM:37, date_time:"06/04/17 0:00", man_p:1})
MATCH (Subs_2:SS_v2), (V2T3:Load_V2T3) WHERE Subs_2.ss_name = 'V2' AND V2T3.ss_name = 'V2'
CREATE (Subs_2)-[r3:Load3 {MW:"17", MVAR:"10"}]->(V2T3) RETURN r3, Subs_2, V2T3
From the above results and simulations, we can see how easy it is to add more devices to an electrical substation
using Neo4j, because a Neo4j database evolves quickly. In Appendix A, we simulate adding more properties to the new
transformer node (T3) by using the LOAD CSV statement.
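
The claim that new elements can be added while the existing data remain intact can be checked on a toy model. The Python sketch below mirrors the substation with a plain dictionary, adds the third transformer, and then compares the earlier nodes and relationships with a snapshot taken before the change. It is an in-memory illustration under simplified assumptions (only a few properties are kept), not a test of Neo4j itself.

```python
import copy

# The substation and its two transformers, reduced to a few properties.
graph = {
    "nodes": {
        "Subs_V2": {"ss_name": "V2", "KV": 110},
        "V2T1": {"MW": 18, "MVAR": 11},
        "V2T2": {"MW": 19, "MVAR": 10},
    },
    "relationships": [
        ("Subs_V2", "Load1", "V2T1"),
        ("Subs_V2", "Load2", "V2T2"),
    ],
}
before = copy.deepcopy(graph)


def add_transformer(graph, substation, name, rel_type, properties):
    """Add one transformer node and link it to the substation."""
    if name in graph["nodes"]:
        raise ValueError(f"node {name} already exists")
    graph["nodes"][name] = dict(properties)
    graph["relationships"].append((substation, rel_type, name))


add_transformer(graph, "Subs_V2", "V2T3", "Load3", {"MW": 18, "MVAR": 11})

# Everything that existed before the change is still there, unmodified.
unchanged = all(graph["nodes"][k] == v for k, v in before["nodes"].items())
print(unchanged, len(graph["relationships"]))
```

No existing node or relationship had to be rewritten to make room for V2T3, which is the property the simulation in this section relies on.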

Fig.3. The Nodes Subs_V2, V2T1 and V2T2
Fig.4. Adding a new node V2T3

5. Conclusion

The Big Data of a transmission grid substation were analyzed. Those data come from RTUs; they are large in volume,
come from various sources, and are generated at high speed and in different formats. Traditional relational
databases do not perform well when applied to very large data sets, and they also make the database difficult to
evolve based on business needs. Neo4j is a schemaless database, which means we do not need to define tables and
relationships before adding data. This process was explored and examined. Neo4j graph databases are specifically
designed to handle graph-based data more efficiently than traditional relational databases. The paper proposes the
use of a Neo4j database to store the data that come from power grid substations, and shows how a new element such
as a power transformer can be added to it. From the results and tests that we have done, we can see that Neo4j
enables good operating performance and makes it easy to evolve the database according to business needs.

References

1. Robinson I, Webber J, Eifrem E. Graph Databases: New Opportunities for Connected Data; 2015
2. Hurwitz J, Nugent A, Halper Dr. F, Kaufman M. Big Data For Dummies; 2013
3. Vukotic A, Watt N. Neo4j in Action; 2015
4. http://www.gartner.com/it-glossary/big-data/. accessed: 2017.31.03.
5. https://ec.europa.eu/digital-single-market/en/big-data/. accessed: 2017.08.04.
6. https://neo4j.com/developer/guide-data-modeling/. accessed: 2017.16.04.
7. Wessler M. Big Data Analytics Dummies; 2013
8. B.M. Weedy, B.J. Cory, N. Jenkins, J.B. Ekanayake, G. Strbac. Electric Power Systems; 2012
9. https://www.big-data-europe.eu/energy//. accessed: 2017.01.04
10. https://neo4j.com/. accessed: 2017.06.03
11. https://github.com/neo4j/. accessed: 2017.06.03
12. McDonald J. Electric Power Substations Engineering. 2nd ed., New York; 2007, p.128-148
13. https://ec.europa.eu/jrc/en/big-data/. accessed: 2017.04.01.
14. Ishwarappa, Anuradha J. A Brief Introduction On Big Data 5Vs Characteristics and Hadoop Technology, Procedia Computer Science 48
(2015) 319-324
15. Rauf Baig A, Jabeen H. Big data analytics for behavior monitoring of students, Procedia Computer Science 82 (2016) 43-48
16. https://www.oracle.com/big-data/index.html/. accessed: 2017.31.03.
17. https://www.ibm.com/big-data/us/en//. accessed: 2017.31.03.
18. https://en.wikipedia.org/wiki/Big_data/. accessed: 2017.31.03.
19. http://www.gartner.com/id=2081316/. accessed: 2017.04.04.
20. Diamantoulakis P, Kapinas V, Karagiannidis G. Big Data Analytics for Dynamic Energy Management in Smart
Grids, Big Data Research 2 (2015) 94-101
21. Naimur Rahman M, Esmailpour A, Zhao J. Machine Learning with Big Data An Efficient Electricity Generation forecasting
System, Big Data Research
22. Hillman C, Petrie K, Cobley A, Whitehorn M. Real-Time processing of proteomics data, 2016 IEEE International
conference on Big Data (Big Data)
23. Botev V, Almgren M, Gulisano V, Landsiedel O, Papatriantafilou M, Van Rooij J. Detecting Non-Technical
Energy Losses through Structural Periodic Patterns in AMI data, 2016 IEEE International conference on Big Data (Big Data)
24. Menenberg M, Pathak S, Udyapuram H, Gavirneni S, Roychowdhury S. Topic Modeling for Management Sciences,
A Network-based Approach, 2016 IEEE International conference on Big Data (Big Data)
25. Gundla N, Chen Zh. Creating NoSQL Biological Database with Ontologies for Query Relaxation, Procedia Computer
Science 91 (2016) 460-469
26. Rainey B, Gleich D. Massive Graph Processing on Nanocomputers, 2016 IEEE International Conference on Big Data (Big
Data)
27. Makris A, Tserpes K, Andronikou V, Anagnostopoulos D. A classification of NoSQL data stores based
on key design characteristics, Procedia Computer Science 97 (2016) 94-103
28. Portela F, Lima L, Santos M. Why Big Data? Towards a project assessment framework, Procedia Computer Science 98
(2016) 604-609
29. Elgendy N, Elragal A. Big Data Analytics in Support of the Decision Making Process, Procedia Computer Science 100 (2016)
1071-1084
30. Deri J, Franchetti F, Moura J. Big Data Computation of Taxi Movement in New York City, 2016 IEEE International
Conference on Big Data (Big Data)
31. Bicevska Z, Oditis I. Towards NoSQL-based Data Warehouse Solutions, Procedia Computer Science 104 (2017) 104-111
32. Schmarzo B. Big Data Understanding How Data Powers Big Business; 2013, p. 25-53
33. Wu Ch, Lin F, Chang W, Tsai W, Lin H, Yang Ch. Big Data Development
Platform for Engineering Applications, 2016 IEEE International Conference on Big Data (Big Data)
34. Al-Jarrah O, Yoo PD, Muhaidat S, Karagiannidis G, Taha K. Efficient Machine Learning for Big Data: A Review,
Big Data Research 2 (2015) 87-93
35. Singh D, Reddy Ch. A survey on platforms for big data analytics, Journal of Big Data 2014
36. Zikopoulos P, Eaton Ch, de Roos D, Deutsch Th, Lapis G. Understanding Big Data; 2012, p.1-51
37. Gennaro M, Paffumi E, Martini G. Big Data for Supporting Low-Carbon Road Transport Policies in Europe: Applications,
Challenges and Opportunities, Big Data Research 6 (2016) 11-25
38. Eifrem E. Neo4j the benefits of graph databases, Neo Technology, http://neotechnology.com
39. Hunger M, Boyd R, Lyon W. The Definitive Guide to Graph Databases for the RDBMS Developer, neo4j.com
40. Kunkel J. Graph Processing with Neo4j Lecture BigData Analytics; 2015
41. Van Bruggen R. Learning Neo4j; 2014, p. 73-91
42. Castelltort A, Fauvet C, Guidoni J, Laurent A, Sala M. Towards NoSQL graph based master data management systems:
building a generic and collaborative solution, Int. J. Emerg. Sci., 4(3), 87-102, September 2014
43. Cuzzocrea A. Big Data Provenance: State-Of-The-Art Analysis and Emerging Research Challenges, EDBT/ICDT Workshops 2016
44. Cuzzocrea A, Cosulschi M, De Virgilio R. An Effective and Efficient MapReduce Algorithm for Computing BFS-Based
Traversals of Large-Scale RDF Graphs. Algorithms 9(1): 7 (2016)
45. Pournaras E, Yao M, Ambrosio R, Warnier M. Organizational Control Reconfigurations for a Robust Smart Power
Grid. Internet of Things and Inter-cooperative Computational Technologies for Collective Intelligence 2013: 189-206

Appendix A. An example appendix

To import data from a file in CSV format into Neo4j, LOAD CSV is used to get the data into the query. It can be
combined with USING PERIODIC COMMIT, which instructs Neo4j to perform a commit after a number of rows:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///D:/Neo4j/els_nodes/els005.csv" AS line
// LOAD CSV reads every field as a string, so the properties below are stored as strings
CREATE (V2T3:Load_V2T3 {ss_no:line.ss_no, ss_name:line.ss_name, KV0:line.KV0, KV4:line.KV4, KV8:line.KV8, MVA:line.MVA,
MVAR:line.MVAR, MW:line.MW, AMP0:line.AMP0, AMP4:line.AMP4, AMP8:line.AMP8,
TAP:line.TAP, WTEM:line.WTEM, OTEM:line.OTEM, date_time:line.date_time, man_p:line.man_p})
This load of data was performed using Neo4j version 2.3.7 on a computer with an Intel(R) Core(TM) i5 CPU at
2.53 GHz, 4 GB of RAM and a 64-bit operating system, see figure a1. Neo4j browser path: http://localhost:7474/browser/.
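
Because values read from a CSV file arrive as text, it can help to decide which columns should be numeric before loading. The Python sketch below shows the idea on a made-up excerpt with a few of the columns; the sample row and the choice of integer and float columns are assumptions for illustration. In Cypher for Neo4j 2.x, the corresponding conversions would be done with the toInt() and toFloat() functions.

```python
import csv
import io

# Hypothetical excerpt of a file such as els005.csv, with only five columns.
sample = io.StringIO("ss_no,ss_name,MW,MVAR,TAP\n2,V2,18.5,11.2,5\n")

INTEGER_FIELDS = {"ss_no", "TAP"}
FLOAT_FIELDS = {"MW", "MVAR"}


def typed(row):
    """Convert the string fields of one CSV row to int or float where expected."""
    converted = {}
    for key, value in row.items():
        if key in INTEGER_FIELDS:
            converted[key] = int(value)
        elif key in FLOAT_FIELDS:
            converted[key] = float(value)
        else:
            converted[key] = value
    return converted


raw_rows = list(csv.DictReader(sample))
rows = [typed(row) for row in raw_rows]
print(raw_rows[0]["MW"], rows[0]["MW"])
```

Without such a conversion, a value like "18.5" is compared and sorted as text, which gives misleading results for measurements.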

Fig.a1. Adding properties in V2T3 node
