
Technical University of Moldova

THE APPLICATION OF BLOCKCHAIN AND IPFS
TECHNOLOGIES FOR ASSURING CONFIDENTIALITY AND
ACCESSIBILITY OF SENSITIVE DATA

Student: Dodon Ion, gr. IS-211M
Supervisor: Zaharia Gabriel, univ. assist.
Chişinău, 2023
MINISTRY OF EDUCATION AND RESEARCH OF THE REPUBLIC OF MOLDOVA
Technical University of Moldova
Faculty of Computers, Informatics and Microelectronics
Department of Software Engineering and Automatics
Admitted for defence
Head of department: Fiodorov I., PhD, assoc. prof.
_____________________________
„____” _________________ 2022
The application of blockchain and IPFS technologies for assuring
confidentiality and accessibility of sensitive data
Master’s thesis
Student: __________ __ Dodon Ion, gr. IS-211M
Supervisor: ________ ____ Zaharia Gabriel, univ. assist.
Consultant: _____ _______ Catruc Mariana, univ. lecturer
Chişinău, 2023
SUMMARY
Keywords: Data, security, decentralization, blockchain, IPFS.
The field of information technology is developing very quickly, and more and more new techniques and
technologies appear that move it forward. Some of the newest ideas of recent years apply the principles
of democracy to solve certain problems of the information age. These problems concern the security of
data, the recognition of data ownership, and others.
This work, titled “The application of blockchain and IPFS technologies for assuring confidentiality and
accessibility of sensitive data” and prepared as a master’s thesis by Ion Dodon, student at the Technical
University of Moldova, aims to solve some problems related to the confidentiality of sensitive data.
Blockchain and IPFS technologies are used for this purpose, because they have the characteristics needed
to build a secure data protection system that does not depend on a third-party company.
The work is divided into the following parts: domain analysis, specification of the functional and,
above all, non-functional requirements, system design, and system implementation. It ends with the
conclusions, which present the observations made during the work and the results obtained.
The Internet grows in importance with every passing day. Every day, in every area of society, personal
computers, servers and smartphones are used, and all of them communicate with each other. A large part
of the data about us, data that exposes our personality, is stored online. This data can be misused: for
example, it can be collected by a company or even by individuals and sold to other companies or
individuals, so that society can later be manipulated, since its preferences and other data about the
population are already well known. Much of the data stored on servers is sensitive, such as personal
medical records. Of course, this data is currently kept as securely as possible, but in the end it is kept
on servers that belong to companies offering IT services (in the cloud), which means that we entrust this
data to them. Another risk is the case in which these servers fail and the data can no longer be
recovered. The problems mentioned are well known and preventive measures exist, but these measures are
certainly not perfect. This work proposes new ideas for protecting the data of internet users without
depending on an external company: data that nobody other than its owner can obtain, and that cannot be
lost when a server fails.
To reach this level of security, a system will be created that stores data on an IPFS network, which
means that the system is decentralized and, when a server fails, the data is quickly recovered from other
nodes (servers). A Smart Contract will be created that works as a data manager and records who owns
which data. All these operations will be transparent, and the stored data will be encrypted.
ABSTRACT
Keywords: Data, security, decentralization, blockchain, IPFS.
The field of information technology develops very quickly, and more and more new techniques and
technologies appear that make it progress. Some of the newest ideas to have emerged in recent years
apply the principles of democracy to solve certain problems of our information age. These problems
relate to the safety of data, the recognition of data ownership, and others.
This work, titled ”The application of blockchain and IPFS technologies for assuring confidentiality and
accessibility of sensitive data” and prepared as a master’s thesis by Ion Dodon, student at the Technical
University of Moldova, aims to solve some problems related to the confidentiality of sensitive data.
Blockchain and IPFS technologies are used for this, as they have the features needed to create a secure
data protection system that does not depend on a third-party company.
The work is divided into the following parts: domain analysis, specification of functional and
especially non-functional requirements, system design, and system implementation. It ends with the
conclusions, which present the observations made during the work and the results obtained.
The Internet is becoming more and more important every day. Today, personal computers, servers and
smartphones are used in every area of society, and all of them communicate with each other. A large
part of the data about us, data that exposes our personality, is stored online. This data can be used
for bad purposes: for example, it can be collected by a company or even by individuals and sold to other
companies or individuals, so that society can later be manipulated, since its preferences and other data
about the population are already well known. Much of the data stored on servers is of a sensitive
nature, such as personal medical records. Of course, this data is currently kept as safe as possible, but
ultimately it is kept on servers belonging to companies that offer IT services (in the cloud), which
means that we entrust them with this data. Another risk is the case in which these servers fail and the
data cannot be recovered. The mentioned problems are well known and preventive measures exist, but
these measures are certainly not perfect. This paper aims to come up with new ideas for protecting
internet users’ data without depending on an external company: data that cannot be obtained by anyone
other than its owner, and that cannot be lost if a server goes down.
To achieve this level of security, a system will be created that stores data on an IPFS network, which
means that it will be a decentralized system and, if a server fails, the data is quickly recovered from
other nodes (servers). A Smart Contract will be created that acts as a data manager and records who
owns what data. All these operations will be transparent, and the stored data will be encrypted.
Contents
List of figures
List of tables
Acronyms
Listings
INTRODUCTION
1 DOMAIN ANALYSIS
1.1 Problem analysis and definition
1.2 Scope and solution
1.2.1 The inspiration
1.3 Importance of the topic
1.4 Existing solutions
1.4.1 IPFS pinning services versus Google Drive
1.4.2 Advantages and disadvantages
1.4.3 Cost analysis
1.5 Proof of Concept
2 REQUIREMENTS SPECIFICATION
2.1 Functional Requirements
2.2 Non-Functional Requirements
3 SYSTEM DESIGN
3.1 Use cases
3.2 The environments involved
3.3 High level architecture
3.4 Data protection using AES
3.5 Data storage flow
3.6 Integrated components
3.6.1 MetaMask
3.6.2 Etherscan
3.6.3 IPFS Desktop
3.6.4 Blockchain testnets
4 SYSTEM IMPLEMENTATION
4.1 Technologies
4.1.1 Solidity
4.1.2 Hardhat
4.1.3 IPFS-core
4.1.4 Data protection using AES
4.1.5 Goerli testnet
4.2 The Smart Contract
4.2.1 Build process
4.2.2 Resulting artifacts
4.2.3 Deployment
4.3 The client application
4.4 License
CONCLUSIONS
Bibliography
Appendices
A The Smart Contract
List of Figures
1.1 Comparing the movement of data in IPFS to centralized client-server models
1.2 Google Drive Pricing
1.3 Smart Contract operations consuming Gas
1.4 Gas refund process
1.5 Solidity operations gas usage
2.1 Types of requirements
3.1 Main use cases
3.2 High-level architecture
3.3 Symmetric key cryptography
3.4 Asymmetric key cryptography
3.5 Data storage flow
3.6 Smart Contract development and the related components
3.7 MetaMask extension interface
3.8 Transaction history listed in Etherscan
4.1 Summarised comparison between DES and AES
List of Tables
1.1 Filebase-Google Drive advantages and disadvantages
1.2 Advantages and disadvantages of the system
Acronyms
ABI Application Binary Interface.
AES Advanced Encryption Standard.
API Application Programming Interface.
CID Content Identifier.
DES Data Encryption Standard.
EVM Ethereum Virtual Machine.
GUI Graphical User Interface.
IPFS InterPlanetary File System.
JSON JavaScript Object Notation.
NFT Non-Fungible Token.
NPM Node Package Manager.
P2P Peer-to-peer.
PoS Proof of Stake.
PoW Proof of Work.
RPC Remote Procedure Call.
RSA Rivest–Shamir–Adleman.
SPA Single Page Application.
UML Unified Modelling Language.
Listings
4.1.1 Smart Contract definition example
4.1.2 How to connect to IPFS using IPFS-core
4.2.1 Generate Hardhat project
4.2.2 Hardhat project dependencies
4.2.3 Environment variables
4.2.4 Hardhat testnet configuration
4.2.5 Hardhat project structure
4.2.6 The Ledger
4.2.7 Smart Contract function to publish CID
4.2.8 Function to get the owner address of CID
4.2.9 Compile the contract
4.2.10 The Smart Contract ABI
4.2.11 Contract deployment
4.2.12 Deployer accounts per blockchain network
4.3.1 Client app dependencies
4.3.2 Client side function to upload a file
A.0.1 The Smart Contract
INTRODUCTION
People use the internet every day: at work, at school, at home for entertainment or remote work, and
almost everywhere else. At first, computers were used simply to access the internet; then people started
to share data about themselves online, such as what they like and what they do. Later, the internet and
computers spread into almost every field of society, such as the medical field, the governmental field
and others. At this point, people started to host even more sensitive data on the internet, for example
medical certificates or personal IDs. This data should be protected from malicious people, and those who
host it should make sure they do not lose it. Techniques to protect data do exist, but they are
sometimes not reliable enough.
This paper describes a system that can be used to protect data with a higher level of security. First,
the functional and non-functional requirements for such a system are defined. Based on these
requirements, the most appropriate technologies and techniques for implementing the system were
selected. The functional requirements are few and simple: a user must be able to write some data and to
retrieve what was stored earlier. The non-functional requirements are the most important for the
project; they are what makes it special.
Based on the non-functional requirements and on research into the available technologies, IPFS was
found to be a good technology for assuring availability, because it is a protocol that stores data in a
decentralized way: if a server goes down, the other servers (also known as nodes) can recover the data.
To assure data encryption and transparency, Smart Contracts running on the Ethereum blockchain can be
used. The Smart Contract stores the information about who owns what.
The first chapter is the domain analysis. In it, the problem is analysed and defined, and the scope and
solution are explained. The first chapter also explains why the user would have to pay $0.36 to store a
document on this system. The second chapter specifies the functional and non-functional requirements of
the system. As the project focuses on showing a new approach to protecting data online, there are only
a few simple functional requirements, while the non-functional requirements are the most important
aspect of the project and form the backbone of what should be achieved. The third chapter presents the
design of the system using annotated diagrams, which describe how the selected technologies can be put
together to implement a system that fulfils the defined requirements. The fourth chapter is the actual
implementation of the system, with code examples and explanations.
1 DOMAIN ANALYSIS
It is clear that information technologies play an increasingly important role in our lives. These days
they are used in almost every area of society, including sensitive areas such as state security,
personal health and the military domain. This means that people should pay special attention to
information security. This chapter explains the weaknesses in how data is stored today and how
sensitive data can be protected on the internet.
1.1 Problem analysis and definition

Unfortunately, there are many cases of fraud that lead to material and non-material losses. A
well-known example of personal data being taken and passed on to other companies for suspicious
purposes is the Facebook-Cambridge Analytica scandal. [4] In this scandal, personal data that a large
number of people had stored on Facebook ended up with another company without their knowledge.
Facebook stores data such as photos, what people like, what they think, and other personal information
of this kind. One might say that this information is not that sensitive, but this is debatable. And
what about the information mentioned above, for example personal health information? This is considered
sensitive data, and it should be carefully protected from malicious people and from any possibility of
loss.
In today’s world, the internet is mostly composed of a large number of servers that process or store
data. These servers are mostly owned by big companies such as Google, Amazon, Facebook and others. This
is unlike the early days of the internet, when there were only a few servers here and there, owned by
those who actually needed to use the internet, and those servers did not hold information as important
as today’s.
The idea for this project came from seeing that there are now many powerful new technologies that could
be used to protect sensitive data but are not used for it. The purpose of the project is to use these
technologies to build a new system that offers a higher level of security for sensitive data. The
motivation is to help people have more trust in how their data is stored on the internet by offering
them a better storage solution.
Most data storage systems are run by a third-party company, and people must obey its rules. Most of
these companies, including the ones listed above, have internal rules that are not visible to the
public, and no one can know what they actually do with people’s data. Besides that, they dictate their
own prices, and these prices might not be fair or equal for everyone. All of these are problems, because
the internet will be used more and more for a long time and more sensitive data will be stored on it.
It is important to improve the way people hold data on the internet, at least their most sensitive data.
Problem definition - Sensitive data is stored on servers belonging to third-party companies, and
these companies could use the data for their own purposes without the knowledge of its real owner.
1.2 Scope and solution

For anyone to be able to trust a system with their data, the system should have the following
characteristics: it should not be held by any third-party company, it should not be corruptible, and
its rules should be the same for everyone who uses it. It should also assure users that their data will
not be lost if something bad happens to the system. Other data storage providers also ensure the last
point, but not the former ones. To build such a system, the first step is to find the technologies that
make it possible.
1.2.1 The inspiration

Such systems are already in use, and their technologies can be reused. A person using the pseudonym
Satoshi Nakamoto [14] developed an idea named blockchain. [2] Blockchain allows immutable and
transparent data storage. In the beginning, it was only used for cryptocurrencies. A cryptocurrency,
crypto-currency, or crypto is a digital currency designed to work as a medium of exchange through a
computer network that is not reliant on any central authority, such as a government or bank, to uphold
or maintain it.[3] To assure users that their data is always available, the IPFS (InterPlanetary File
System) protocol can be used.[6] IPFS is a protocol, hypermedia and file sharing peer-to-peer network
for storing and sharing data in a distributed file system. IPFS uses content-addressing to uniquely
identify each file in a global namespace connecting IPFS hosts. These two technologies (blockchain and
IPFS) are widely used in the NFT (Non-Fungible Token) world. [9] An NFT is a digital asset that is
hashed; the hash that represents the asset is stored on a blockchain along with the address of the
owner.
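The idea of storing a hash together with the address of the owner can be illustrated with a minimal
sketch. It is not the Smart Contract developed in this thesis: the names `content_id` and
`OwnershipLedger` are hypothetical, a plain SHA-256 hex digest stands in for a real IPFS CID (which uses
a different encoding), and an in-memory dictionary stands in for blockchain storage.

```python
import hashlib
from typing import Dict, Optional


def content_id(data: bytes) -> str:
    """Stand-in for an IPFS CID: the hex SHA-256 digest of the content."""
    return hashlib.sha256(data).hexdigest()


class OwnershipLedger:
    """Toy in-memory ledger mapping a content identifier to its owner address."""

    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}

    def publish(self, cid: str, owner: str) -> None:
        # An identifier can be registered only once, so ownership cannot be overwritten.
        if cid in self._owners:
            raise ValueError("content already registered")
        self._owners[cid] = owner

    def owner_of(self, cid: str) -> Optional[str]:
        return self._owners.get(cid)


document = b"medical certificate"
cid = content_id(document)

ledger = OwnershipLedger()
ledger.publish(cid, "0xAliceAddress")  # hypothetical owner address
```

Because the identifier is derived from the content itself, the same bytes always map to the same
identifier, and any change to the bytes produces a different one.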
The system described in this paper is similar to how NFTs are stored, but NFTs are publicly visible.
In contrast, the system described here has to keep information confidential, so the data should be
encrypted. In the blockchain world, each user has pairs of public and private keys. These keys can be
used to encrypt and decrypt the stored data.
Using a blockchain is a good idea because it is immutable by nature, meaning that no one can corrupt
the data, and there is no third-party company that the user has to deal with. The rules are equal for
every user, as in a democratic world. Of course, the mentioned technologies are not perfect; they also
have weaknesses. There are attacks that can be used to corrupt blockchains. Some of the best known are
the Sybil attack and the 51% attack. [17] [1] Blockchains are based on consensus. For a blockchain to
work democratically, it should run on multiple nodes, and anyone can spin up a node. Each node strives
to mine blocks; this process is known as PoW (Proof of Work), while a newer approach is PoS (Proof of
Stake).[12] [11] Proof of Stake appeared later and is considered more environmentally friendly. Each
node strives to mine more blocks, since a blockchain is a chain of blocks mined by the nodes that run
the network, and nodes are rewarded for the blocks they mine: the more blocks a node mines, the larger
its reward. In the consensus process, nodes agree with each other on the blockchain state. This means
that if one node mines more blocks and has the longest chain, the other nodes must validate that it did
a correct job and that its blockchain state is correct. It is like voting: if more than 50% agree that
the blocks were mined correctly, those blocks are added to the blockchain and synchronized with the
other nodes. But what if someone holds more than 50% of the nodes? That person can then decide for the
rest of the miners. This is called the 51% attack. Another attack is the Sybil attack, in which a
single party creates many fake identities (nodes) to gain disproportionate influence over the network.
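The voting analogy above can be sketched as follows. This is a deliberately simplified model that
assumes one vote per node, as in the description above; real PoW and PoS networks weigh participants by
computing power or stake rather than by node count. The function names are illustrative only.

```python
from typing import Sequence


def block_accepted(votes: Sequence[bool]) -> bool:
    """Accept a block only if strictly more than 50% of the votes approve it."""
    return 2 * sum(votes) > len(votes)


def simulate_vote(total_nodes: int, attacker_nodes: int, block_is_valid: bool) -> bool:
    """Honest nodes approve only valid blocks; attacker nodes approve everything."""
    honest_votes = [block_is_valid] * (total_nodes - attacker_nodes)
    attacker_votes = [True] * attacker_nodes
    return block_accepted(honest_votes + attacker_votes)
```

With 10 nodes, an attacker controlling 4 of them cannot push an invalid block through, while an
attacker controlling 6 of them can, which is the situation the 51% attack describes.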
Figure 1.1 presents the structural difference between a centralized data storage system and IPFS.
More details about this comparison can be found in the Medium article IPFS: A Complete Analysis of
The Distributed Web. [8]
Figure 1.1 - Comparing the movement of data in IPFS to centralized client-server models
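The difference shown in Figure 1.1 can be simulated in a few lines. The sketch below is a toy in-memory
model, not the IPFS protocol: the `Node` class and the `fetch_from_network` function are hypothetical,
and it assumes that the same content has already been stored on several nodes.

```python
import hashlib
from typing import Dict, List, Optional


class Node:
    """Toy storage node that keeps content under its SHA-256 digest."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.online = True
        self.blocks: Dict[str, bytes] = {}

    def store(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blocks[key] = data
        return key

    def fetch(self, key: str) -> Optional[bytes]:
        # An offline node cannot answer requests.
        if not self.online:
            return None
        return self.blocks.get(key)


def fetch_from_network(nodes: List[Node], key: str) -> Optional[bytes]:
    """Ask each node in turn and return the first copy whose hash matches the key."""
    for node in nodes:
        data = node.fetch(key)
        if data is not None and hashlib.sha256(data).hexdigest() == key:
            return data
    return None


data = b"sensitive document"

# Decentralized case: three nodes hold the content, one of them goes down.
nodes = [Node("a"), Node("b"), Node("c")]
key = [node.store(data) for node in nodes][0]
nodes[0].online = False
recovered = fetch_from_network(nodes, key)

# Centralized case: the only server goes down.
central = [Node("server")]
central_key = central[0].store(data)
central[0].online = False
lost = fetch_from_network(central, central_key)
```

In the decentralized case the content is still returned by one of the remaining nodes, while in the
centralized case the request fails once the single server is offline.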
1.3 Importance of the topic

As information technologies play an ever more important role in people’s lives, it is important to give
priority to new technologies that perform better and are more secure, both when developing new systems
and when replacing old systems that are still in use. In a world where everything relies on computers,
the internet, smartphones and other devices, and where almost every piece of information is stored on
them, it is important to protect that information. New ways to hack current systems are continuously
being developed. Current systems are updated, including their security, to today’s standards. New rules
and techniques appear to better secure data on the internet, but new hacks against them appear as well.
In this way, there is a kind of war between hackers and scientists who want to find better ways to
secure data on the internet.
As mentioned above, blockchain will be used to secure the data on the internet. Chapters 2 and 3
(Requirements Specification and System Design) and Chapter 4 (System Implementation) show how
blockchain and IPFS can be used to encrypt the data, sign it, store it in a decentralized way, etc. When
blockchain was invented, it was intended as a platform for payment registries and ledger systems in the
digital world. Blockchain is immutable by nature. All money transactions are stored on it, and it is
not easy to hack the blockchain so that someone can corrupt it and obtain money illegally. Blockchain
has a well-defined role of storing data transparently and democratically. Democratically means that the
same rules of use apply to everyone. But it has a cost, which is explained in Chapter 2 (Requirements
Specification).
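The immutability mentioned above comes from the way blocks are linked: each block stores the hash of the
previous one, so changing an earlier block breaks the link that follows it. The following sketch assumes
a simplified block layout (index, data, previous hash) and hypothetical helper names; it leaves out
consensus, signatures and mining.

```python
import hashlib
import json
from typing import Dict, List


def block_hash(block: Dict) -> str:
    """Hash a block deterministically by serializing it with sorted keys."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: List[Dict], data: str) -> None:
    """Append a block that references the hash of the current last block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "prev_hash": prev_hash})


def is_valid(chain: List[Dict]) -> bool:
    """Check that every block still points at the actual hash of its predecessor."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True


chain: List[Dict] = []
for entry in ["Alice pays Bob 5", "Bob pays Carol 2", "Carol pays Dave 1"]:
    append_block(chain, entry)

valid_before = is_valid(chain)

# Tampering with an earlier block invalidates the link stored in the next block.
chain[1]["data"] = "Bob pays Mallory 200"
valid_after = is_valid(chain)
```

To hide such a change, an attacker would have to recompute every later block and convince the rest of
the network to accept the rewritten chain.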
1.4 Existing solutions

The traditional solution for data storage is to use a service from one of the big companies, such as
Google or Amazon, or from smaller companies that also offer file storage, such as Dropbox. These
companies operate large amounts of their own hardware, distributed across almost every populated area
of the world so that the data is more accessible. The servers are specialized in storage and are
deployed in data centres, all of which are owned by the company. To use their storage services, one has
to obey their rules, which is not necessarily a problem. The issue is that they are not 100%
transparent, and not everything can be known about what they do with people’s data. In terms of
security they are quite secure, but they do not use blockchains, because blockchains run against their
profit.
There are also solutions that use the mentioned technologies, including IPFS. For example, Filecoin is
a decentralized storage network with its own cryptocurrency, which is used to pay for storage on the
network. [5] Because several companies use IPFS at their core for file storage, this paper compares
IPFS with Google Drive, which is one of the most popular traditional file storage providers.
1.4.1 IPFS pinning services versus Google Drive

There are many IPFS-based pinning services, such as Filebase, Kaleido, Infura, Fleek, Pinata, Space,
Eternum, Temporal, Peergos, DAppNode, IPFS Cluster, Textile Buckets, nft.storage, Fission.[7] Almost
all of them are new, certainly newer than Google Drive, and they naturally have fewer reviews. Only
one of the IPFS pinning services, namely Filebase, is used for this comparison. Filebase and Google
Drive are compared on the following criteria: pricing, integrations, ratings/reviews, company
information, product details, and features. The comparison details are provided by Sourceforge.net.
[20]
1. Filebase
S3-compatible object storage based on Web3. Build Web3 with Web3, storing data on any of the networks
we support, including IPFS, Sia, Skynet, and Storj. Filebase serves as a convenient way for users to
upload data to highly secure and geo-redundant blockchain networks without having to handle contracts,
SLAs, or cryptocurrencies. On Filebase, all files pinned to IPFS are also kept on the Sia network. This
creates a situation where the storage tier for Filebase IPFS nodes is highly available and, most
notably, geo-redundant. Get started now with our unlimited 5 GB free tier. We don’t charge for outgoing
requests or API requests, so you don’t have to worry about high overhead. There are no minimum object
size requirements or data retention policies on Filebase. Our API is a drop-in replacement for existing
applications and tools that use AWS S3, making it easy to migrate to Filebase.
Pricing: Starting Price: $5.99 per month
Pricing Details: Our minimum subscription fee of $5.99 includes up to your first 1 TB of storage and 1
TB of bandwidth. Additional storage and outgoing bandwidth transfer are billed at $0.0059 / GB.
Free Version: Free Version available.
Integration: Filebase integrates with the following services: Comet Backup, Amazon S3, NetDrive, Cy-
berduck, Couchdrop, Simplebackups, BackupSheep, Arq, Dropshare, and SnapShooter.
Ratings/Reviews: User Reviews as per Sourceforge.net
– overall 5.0 / 5;
– ease 5.0 / 5;
– features 4.5 / 5;
– design 4.5 / 5;
– support 5.0 / 5.
Features:
– blockchain;
– cloud storage:
    a) encryption;
    b) file sharing;
– IPFS pinning.
2. Google Drive
Store, share, and access your files from any device. Your first 15 GB of storage is free. With Drive
Enterprise, businesses only pay for the storage employees use. It comes with Google Docs, Sheets, and
Slides — and works seamlessly with Microsoft Office. Keep photos, stories, designs, drawings, recordings,
videos, and more. Your first 15 GB of storage is free with a Google Account. Your files in Drive can be
reached from any smartphone, tablet, or computer. So wherever you go, your files follow. You can quickly
invite others to view, download, and collaborate on all the files you want–no email attachment is needed.
Get started with Drive for free.
Starting Price: Free
Free Version: Free Version available.
Integration: Google Drive probably has the richest integration set. The full set of services with which
Google Drive can be integrated is listed in the Google Drive Integrations resource. Some of them are
Connecteam, monday.com, Wrike, and ClickUp.
Ratings/Reviews: User Reviews as per Sourceforge.net
– overall 4.9 / 5;
– ease 4.9 / 5;
– features 5.0 / 5;
– design 5.0 / 5;
– support 4.8 / 5.
1.4.2 Advantages and disadvantages

Another way to compare the selected platforms is by their advantages and disadvantages from the
user's point of view. The advantages and disadvantages of Filebase and Google Drive are presented in
Table 1.1, first for Filebase and then for Google Drive. The following points are taken into account:
integration, how the data is stored with regard to system architecture, cost, whether the system has a
free tier, whether it is held by a company, security, available features, and others. After these two
alternatives are compared in Table 1.1, a separate table shows the advantages and disadvantages of the
system described in this paper, named IPFSdataStore. IPFSdataStore differs in many respects from the
systems described in Table 1.1, notably in storage capacity and speed, because storing data on a
blockchain takes time until the data is confirmed as permanently stored.
The strong points of this system, however, are its security and the fact that it is decentralized, with
no company controlling it; all of this is due to blockchain and IPFS.
Table 1.1 - Filebase-Google Drive advantages and disadvantages

Data storage provider: Filebase
Advantages:
1. It offers more integration options for most Web3 storage providers.
2. It has a free tier.
3. It has options for backup and data recovery.
4. Automatic scaling without any configuration.
Disadvantages:
1. It is held by a company.
2. The storage capability depends on the number of nodes that are part of the network.

Data storage provider: Google Drive
Advantages:
1. Data is decentralized.
2. It offers lots of possibilities for integration.
3. It offers a fixed amount of storage for free.
4. It is possible to store big objects on the platform.
5. It does not have a file type limit.
6. It has the ability to preview file content online.
Disadvantages:
1. It is held by a company.
2. The decentralization of data is limited by the company's data centers.
3. Despite its multiple security systems, the service isn't 100% hack-proof.
4. It has a limit on how much can be uploaded in a day.
From Table 1.1 it is noticeable that Google Drive has more advantages than Filebase but, at the same
time, also more disadvantages. Each of the two platforms has its own strong point in a specific field:
for example, Google Drive is good for storing lots of data, while Filebase is good for storing data in
a decentralized way.
Table 1.2 below presents the advantages and disadvantages of the system described in this paper. As
in Table 1.1, advantages and disadvantages are enumerated, but this time for a system that has a very
narrow use. IPFSdataStore owes its advantages to the blockchain and IPFS technologies. Blockchain is
valuable because it can share data quickly and securely among entities. In fact, blockchain can bring
many advantages to businesses, whether those businesses use private or public blockchains.
The strengths of blockchain are the following: trust, improved security and privacy, decentralized
structure, reduced costs, speed, visibility and traceability, immutability, tokenization, and individual
control of data. It creates trust between entities where trust does not exist or is unproven. There is no
central entity that coordinates everything. Everything is a peer-to-peer network and democratic rules are applied to
make this system work. The data stored on the blockchain is distributed across a network of computers,
which makes hacking even more difficult. Blockchains are immutable, which enables a secure and reliable
audit of the information. It is fast because many intermediaries that would otherwise be needed to
secure the system are absent. Tokenization is the process by which the value of an asset (physical or
digital) is converted into a digital token, which is then recorded and transferred through the
blockchain. According to Joe Davey, chief technology officer at global consulting firm West Monroe,
tokenization has taken root in digital art and other virtual assets, but it has broader applications that
can simplify business transactions. Utilities, for example, can use tokenization to trade carbon credits
as part of carbon capping programs.
Table 1.2 - Advantages and disadvantages of the system

Data storage provider: IPFSdataStore
Advantages:
1. The data cannot be corrupted. The data is signed and encrypted at a low level.
2. It is open source.
3. It is not held by any company.
4. Data is decentralized.
5. It behaves equally for everyone. The system promotes democratic rules.
Disadvantages:
1. It has fewer integration options.
2. It cannot store big objects (or it is expensive to store them).
3. It does not offer a free amount of storage.
4. It does not have a backup option or data recovery option.
5. The storage capability depends on the number of nodes that are part of the network.
6. It has file size and type limits.
7. It does not offer the ability to preview file content online.
8. It does not offer the ability to edit the stored data (the stored data is immutable).
From Table 1.2 it is noticeable that there are a lot of disadvantages, but those disadvantages are seen
from the point of view of a traditional data storage system. The good thing is that the advantages are
very strong and difficult to achieve with a storage system that is held by a company.
It is convenient to present a comparison in tabular form because the differences are easy to see side
by side. IPFSdataStore has 5 advantages and 8 disadvantages. Even though it has more disadvantages than
advantages, it is a recommended tool for people who want to store their data online securely.
1.4.3 Cost analysis
Google Drive offers two kinds of plans: a monthly plan and an annual plan. Here, its monthly plan is
compared with the monthly plans of the other services. The plans, as listed on Google's official site,
are shown in Figure 1.2.
Figure 1.2 - Google Drive Pricing
As can be seen from the image, Google Drive has a free plan that offers a generous amount of storage
space: 15 GB. One can also observe that users pay 0.99$ per month to add 100 GB to their account.
Alongside these 100 GB, the user gets access to Google experts, can share the 100 GB with 5 other users,
and receives some extra member benefits. These advantages are called Google One. To make it easier to
compare the costs of Google Drive, Filebase, and IPFSdataStore, the amount one has to pay per month is
used.
Filebase is free to use for all users who keep up to 5 GB of data on it, with no credit card needed.
Beyond 5 GB, users have to upgrade to the subscription model. The minimum subscription fee is $5.99,
which includes up to 1 TB of storage and 1 TB of bandwidth. Unlike Google Drive, Filebase also has
bandwidth restrictions. For additional storage capacity and transfer, the user has to pay $0.0059 / GB.
The subscription is renewed monthly until it is canceled. Filebase states that it does not charge for
ingress or for the number of API requests. There is no minimum file or sector size, there are no
retention issues or retrieval delays, and no special software or skill set is required, meaning that it
is an easy-to-use service.
It is clear that Filebase, like Google Drive, charges a fixed price per month. The IPFSdataStore pricing
is explained next. Filebase, Google Drive, and most companies that offer any kind of IT service keep the
money within the company. In the case of IPFSdataStore, as there is no corporation, the money is
distributed to those who take part in the mining process. It is like a community where whoever works
more and is more trustworthy makes more money.
As mentioned previously, IPFSdataStore will not be held by any corporation, but, as nothing in life is
free, the user will of course have to pay some money to store the data. That money goes to the miners.
Miners are those who run the nodes on which the data is stored. Running a node consumes energy. One can
think of a node as a server running special software whose role is to mine blockchain blocks. As
mentioned above, there are two ways of proving that a blockchain node is valid: PoW and PoS. PoW is more
expensive, while PoS is cheaper because it is based on a voting concept and is environment-friendly.
1.5 Proof of Concept

This subsection considers a scenario in which a number of users use IPFSdataStore, to see how
profitable the system is for them with regard to security, functionality, and cost. Suppose that there
are 100 000 000 users of the system. Reading data from IPFS does not cost money. As for storing data,
users who run their own node on the IPFS network do not have to pay, but users of a Pinning service [7]
have to pay, as in the case of Filebase. This is explained in a discussion on ycombinator.com, which
describes IPFS as being like a torrent.
One has to spin up a node and contribute to the network. So, with regard to IPFS, there are two
alternatives: spinning up a node or using a Pinning service. As the project aims not to use any
third-party company, the best and most secure way is to use one's own pinning node. Starting a node is
easy: it only requires installing one piece of software and keeping it running, as explained on the
official IPFS site. [10]
The second part of the system is the Smart Contract. [15] The contract has the role of keeping track of
who owns what. The Smart Contract will also give the user a public and a private key. The private key
will be used to sign the files and the public key will be used to verify whether the files are owned by
a specific person. The Smart Contract can be understood as a mapping where the key is the hash of the
file stored on IPFS and the value can hold more data, including the public key of the owner. In this
way, everyone knows who owns what. The key will be the hash of the file stored on IPFS, but the files on
IPFS will actually be encrypted using the user's private key and can be decrypted only with that same
private key. This process can be a bit different and will be explained in more detail in the following
chapters.
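The mapping described above can be illustrated with a minimal sketch. It is only an in-memory Python
model written for illustration, not the contract code of this thesis. The names OwnershipRecord,
OwnershipLedger, register and owner_of, the file_name field, and the rule that an already registered
hash cannot be overwritten are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class OwnershipRecord:
    owner_public_key: str  # public key of the owner, as described in the text
    file_name: str         # hypothetical extra metadata held in the value


class OwnershipLedger:
    """In-memory stand-in for the Smart Contract mapping (not on-chain code)."""

    def __init__(self) -> None:
        # Key: hash (CID) of the file stored on IPFS. Value: ownership record.
        self._records: Dict[str, OwnershipRecord] = {}

    def register(self, cid: str, owner_public_key: str, file_name: str) -> None:
        # Assumption: the first registration wins and is never overwritten.
        if cid in self._records:
            raise ValueError(f"CID already registered: {cid}")
        self._records[cid] = OwnershipRecord(owner_public_key, file_name)

    def owner_of(self, cid: str) -> Optional[str]:
        # Anyone can look up who owns a given hash; unknown hashes give None.
        record = self._records.get(cid)
        return record.owner_public_key if record else None


ledger = OwnershipLedger()
ledger.register("example-cid-1", "0xExampleOwnerKey", "passport.pdf")
print(ledger.owner_of("example-cid-1"))
```

Because the lookup needs only the hash, ownership can be checked by anyone without access to the
encrypted file itself.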
As each transaction executed on a Smart Contract costs money, it is necessary to find out how many
operations are executed on the Smart Contract when the user stores a file. On a Smart Contract, each
atomic operation costs a fixed price. To make the calculation, the Ethereum blockchain is taken as an
example, and we suppose that this blockchain is used to create the Smart Contract. Many other
blockchains support Smart Contracts, such as Cardano, Solana, Avalanche, Tron, and others. The
cryptocurrency of the Ethereum blockchain is Ether. This means that each transaction executed on the
Ethereum blockchain costs a specific amount of Ether or Wei, where Wei is a smaller unit of Ether. This
cost is named the GAS fee.
What is Gas? Gas is the unit that measures the amount of computational work needed to perform
specific operations on the Ethereum network. Since Ethereum transactions need computational resources
to execute, each transaction requires a fee. Gas refers to the fee needed to complete a transaction on
Ethereum successfully. The operations that consume gas are illustrated in Figure 1.3.
Figure 1.3 - Smart Contract operations consuming Gas
Gas fees are paid in Ethereum's native currency, ether (ETH). Gas prices are denoted in gwei, which is
itself a smaller unit of ETH: each gwei equals 0.000000001 ETH (10^-9 ETH). For instance, rather than
stating that gas costs 0.000000001 ether, one can say that gas costs 1 gwei. 'gwei' stands for
'giga-wei', and its value is 1,000,000,000 wei. Wei (named after Wei Dai, the creator of b-money) is the
smallest unit of ETH.
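The relation between these units can be checked with a short Python sketch. The helper names
gwei_to_wei and wei_to_ether are hypothetical. The only facts used are that 1 gwei equals 10^9 wei and
that 1 ETH equals 10^9 gwei, that is, 10^18 wei.

```python
from decimal import Decimal

WEI_PER_GWEI = 10**9    # 1 gwei = 1,000,000,000 wei
WEI_PER_ETHER = 10**18  # 1 ETH = 10^9 gwei = 10^18 wei


def gwei_to_wei(gwei) -> int:
    """Convert an amount in gwei (int, str or Decimal) to whole wei."""
    return int(Decimal(str(gwei)) * WEI_PER_GWEI)


def wei_to_ether(wei: int) -> Decimal:
    """Convert an amount in wei to ether, using Decimal to avoid float rounding."""
    return Decimal(wei) / Decimal(WEI_PER_ETHER)


one_gwei_in_wei = gwei_to_wei(1)
print(one_gwei_in_wei)                # amount of wei in one gwei
print(wei_to_ether(one_gwei_in_wei))  # the same amount expressed in ETH
```

Decimal is used instead of float so that small fractions of an ether are represented exactly.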
Why do Gas fees exist? In brief, gas fees help keep the Ethereum network secure. Requiring a fee for
every computation performed on the network prevents bad actors from spamming the blockchain. In order
to avoid accidental or hostile infinite loops or other computational wastage in the code, each
transaction is required to set a limit on how many computational steps of code execution it can use.
The basic unit of computation is "gas".
Although a transaction includes a limit, any gas that is not used in the transaction is returned to the
user (i.e., max fee - (base fee + tip) is returned). The gas refund process is presented in Figure 1.4.
Figure 1.4 - Gas refund process
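The refund rule quoted above can be written as a small Python sketch. The function names and the
numbers are illustrative assumptions, and all values are per unit of gas, in gwei. The sketch also
assumes that the tip is capped so that base fee plus tip never exceeds the max fee, as in Ethereum's
fee market.

```python
def effective_gas_price(max_fee: int, base_fee: int, tip: int) -> int:
    """Price actually paid per unit of gas (all values in gwei)."""
    if max_fee < base_fee:
        raise ValueError("max fee is below the base fee")
    # The tip is capped so that base fee + tip never exceeds the max fee.
    return base_fee + min(tip, max_fee - base_fee)


def refund_per_gas(max_fee: int, base_fee: int, tip: int) -> int:
    """Part of the max fee that is returned: max fee - (base fee + tip)."""
    return max_fee - effective_gas_price(max_fee, base_fee, tip)


# Illustrative values only: the user offers up to 30 gwei per unit of gas.
print(refund_per_gas(max_fee=30, base_fee=13, tip=2))
```

With these example values the user pays 15 gwei per unit of gas and the remaining 15 gwei per unit is
returned.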
As mentioned, each computational operation costs Wei. In the Solidity programming language, which is
one of the most popular languages for writing smart contracts, each atomic operation has a price, as
shown in Figure 1.5. [16]
Figure 1.5 - Solidity operations gas usage
This is just a part of all the operations that can appear in a contract written in the Solidity
language. The entire table can be seen on the following link.
Suppose that there are 100 000 000 users of IPFSdataStore and that each of them stores 10
documents on the platform. The table presented in Figure 1.5 shows that SSTORE is the operation used
to store something on the blockchain. This operation costs 20,000 gas when a non-zero value is stored
in a location that previously held zero. On Oct 07 2022 the price of one unit of gas was 13.49 Gwei, so
in total it costs 20000 * 13.49 = 269800 Gwei. In Ether, this is equal to 0.0002698, and in USD this is
equal to 0.36$, since on 10th October 2022, when this was calculated, 1 ETH was valued at 1328.72$. This
means that saving a record showing that a user holds a specific document costs 0.36$, and therefore 10
documents cost 3.6$. This is the entire cost for a user who runs his/her own IPFS node. In total, 100
000 000 users would spend 100 000 000 * 3.6$ = 360 million U.S. dollars if each one stored 10
documents.
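The arithmetic of this estimate can be reproduced with a short Python sketch. It uses only the figures
quoted above (20,000 gas per SSTORE, 13.49 Gwei per unit of gas, 1328.72$ per ETH). The function name
storage_cost_usd is hypothetical, and the model assumes, as the text does, that one stored record
corresponds to a single SSTORE operation.

```python
from decimal import Decimal

SSTORE_GAS = 20_000                 # gas for storing a non-zero value in an empty slot
GAS_PRICE_GWEI = Decimal("13.49")   # gas price quoted in the text
ETH_USD = Decimal("1328.72")        # ETH price quoted in the text
GWEI_PER_ETH = Decimal(10) ** 9


def storage_cost_usd(records: int) -> Decimal:
    """Cost in USD of saving `records` ownership records, one SSTORE each."""
    cost_gwei = SSTORE_GAS * GAS_PRICE_GWEI * records
    cost_eth = cost_gwei / GWEI_PER_ETH
    return cost_eth * ETH_USD


per_record = storage_cost_usd(1)
per_user = storage_cost_usd(10)
all_users = storage_cost_usd(10 * 100_000_000)

print(round(per_record, 2))  # cost of one record
print(round(per_user, 2))    # cost of ten records for one user
print(round(all_users))      # cost for 100 000 000 users with ten records each
```

The sketch keeps unrounded intermediate values, so its per-user result is slightly below the figure
obtained by multiplying the rounded 0.36$ by ten.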
Domain Analysis Conclusions

In today's world, the internet has an important role in our lives. The internet and information
technologies overall are widely used in every area of society, and very sensitive data is stored on
servers held by third-party companies which we have to trust. Fortunately, in October 2008 a system
for online money transfers was invented that is decentralized and can handle money transfers or any
other operations securely and reliably. This is how Bitcoin appeared. The same technologies were later
used to store NFTs and to show their ownership democratically. These technologies are blockchain and
IPFS. They can be used wherever sensitive information needs to be stored in a way that it cannot be
corrupted.
This chapter compared different services that can be used to store files online. Three kinds of
storage services were analyzed: traditional storage services held by big companies such as Google,
Pinning services that are based on IPFS but charge a monthly subscription, and the service described
in this paper, which aims to be as independent and democratic as possible. The last one is called
IPFSdataStore, for convenience when referring to it, and it has a very narrow domain of application.
The chapter also presented a cost analysis for the users of the system. It is free to retrieve a file
from IPFS and to read ownership information from the Smart Contract. However, it costs money to save
any information on the contract: for example, a user who stores 10 files on IPFS and wants to record
the fact that he/she is the owner of the files has to pay 3.6$. This holds only if the user runs
his/her own IPFS node; otherwise, he/she also has to pay a Pinning service for using the IPFS network.
2 REQUIREMENTS SPECIFICATION
In every project, setting up the requirements is an important step in bringing the project to life. If
the requirements are set up correctly, the project has a higher chance of success. If they are not
defined correctly or are not defined at all, this causes communication problems across the entire team
working on the system and inevitably leads to delays.
As shown in Figure 2.1, there are different types of requirements: business requirements, user
requirements, and product requirements, which in turn lead to functional and non-functional
requirements. As the diagram shows, everything starts with business requirements, and each subsequent
type is derived from the previous one.
Figure 2.1 - Types of requirements
Functional and non-functional requirements are enumerated in more detail in sections 2.1 Functional
Requirements and 2.2 Non-Functional Requirements respectively. The other types of requirements should
not be confused with the functional requirements, which are more technical and have a higher level of
detail so that the developer can more easily understand what to implement and how. The other types of
requirements are broadly explained here.
Business requirements focus on business value. As shown in Figure 2.1, the level of detail in the
requirements increases from left to right. This means that the business requirements have the lowest
level of detail. This is because these requirements are most often defined by the owner of the
project, who knows in general terms what he/she wants but knows less about the steps needed to reach
the final goal, which is to get the project done correctly.
User requirements describe what the user can do on the platform. These requirements can be
represented in different types of diagrams, such as UML (Unified Modeling Language) Use Case diagrams,
or can be described in user stories or scenarios.
Product requirements describe what actions the system should perform in order to fulfill the business
and user requirements. One product requirement can be composed of a number of smaller functional
requirements. The final system is a set of implemented functional requirements, and the behavior of the
implemented system reflects the non-functional requirements.
There is a small set of functional requirements for the system described in this thesis, because the
purpose of this work is not to implement much functionality for the product but to demonstrate that
such a system, implemented on blockchain and IPFS, can bring a lot of security and performance
benefits, especially for systems that operate with sensitive data. This means that for this project
the non-functional requirements are the most important.
2.1 Functional Requirements

The functional requirements are the features that need to be implemented in order to let users achieve
their goals when using the system. These requirements govern the system's behavior under certain
conditions. The functional requirements, like the other types of requirements, should be documented
throughout the system development lifetime. IPFSdataStore has a small set of functional requirements,
which are listed below. These functional requirements are enough to solve the problem stated in
chapter 1, Domain Analysis.
R1 Users should be able to connect to the system using a crypto wallet provider such as MetaMask
or another. The user should not be forced to create an account with his/her email or any other personal
information. The wallet provider should be easily integrated with the web page that represents the
system. Each wallet will have a public and a private key. The public key is, as its name says, public,
while the private key should be hidden from everyone.
R2 The user should be able to upload a file after connecting to one of his/her crypto wallets. The
crypto wallet should be Ethereum based. The file will first be encrypted using the AES encryption
algorithm. The web page will ask the user for an encryption password. After the encrypted file is
uploaded to IPFS, the user should get a CID (Content Identifier) that can be used to retrieve the file.
When the file is downloaded, the user should again be asked for the same password that was used for
encryption in order to decrypt the downloaded file.
R3 The data that keeps track of who owns what will be stored on a Smart Contract.
The Smart Contract will charge a fee for each record saved in it. The fee unit will be GAS, and the GAS
price fluctuates based on various factors around the world. The Smart Contract will also
have a balance, like any other Smart Contract deployed on a blockchain. This balance will be used to
hold donations. The Contract should accept donations as a simple crypto-value transfer from another
contract or from a wallet held by a person. On the web page, there should be a button that shows the
amount of donations the Smart Contract has received. The donations should be withdrawable only with
the same wallet that was used to deploy the Smart Contract.
R4 On the SPA (Single Page App) there should be a button that can be used to get the list of all
CIDs of the stored encrypted files. These CIDs can later be used to download the encrypted files.
It should also be possible to get the list of CIDs of the files stored by another user by specifying
that user's wallet address, which is actually the public key.
R5 The user should be able to get the address of the wallet that owns a file stored on IPFS.
This can be used to demonstrate who the owner of a file is, because each unique file encrypted with the
same key will have a unique CID.
R6 Once the user connects the SPA to a wallet provider, all the CIDs of the files that were already
stored should be listed below. If the user changes the connected wallet, the CIDs should also change
accordingly.
R7 The entire SPA should be deployed on IPFS. The app should be accessible through its CID. This
means that the app will be open source, and therefore the user will have more trust in it. The app will
not have a backend part and the user will be able to see all the code the app consists of.
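The contract-related requirements in the list above (per-wallet CID listing, donations, and withdrawal
restricted to the deployer) can be sketched as a small in-memory model. This is an illustrative Python
stand-in and not contract code. The class name LedgerModel and the method names store_cid, cids_of,
donate and withdraw are assumptions, and gas fees are not modeled.

```python
from collections import defaultdict
from typing import Dict, List


class LedgerModel:
    """Illustrative model of the contract's bookkeeping (not on-chain code)."""

    def __init__(self, deployer: str) -> None:
        self.deployer = deployer      # wallet address that deployed the contract
        self.balance_wei = 0          # donations received so far
        self._cids_by_wallet: Dict[str, List[str]] = defaultdict(list)

    def store_cid(self, sender: str, cid: str) -> None:
        # Record that `sender` stored the encrypted file identified by `cid`.
        self._cids_by_wallet[sender].append(cid)

    def cids_of(self, wallet: str) -> List[str]:
        # Any wallet address can be queried, not only the caller's own.
        return list(self._cids_by_wallet.get(wallet, []))

    def donate(self, amount_wei: int) -> None:
        # A donation is a plain value transfer into the contract balance.
        if amount_wei <= 0:
            raise ValueError("donation must be positive")
        self.balance_wei += amount_wei

    def withdraw(self, sender: str) -> int:
        # Only the wallet that deployed the contract may withdraw donations.
        if sender != self.deployer:
            raise PermissionError("only the deployer can withdraw")
        amount, self.balance_wei = self.balance_wei, 0
        return amount


model = LedgerModel(deployer="0xDeployer")
model.store_cid("0xAlice", "example-cid-1")
model.donate(1_000)
print(model.cids_of("0xAlice"), model.balance_wei)
```

Switching the connected wallet in the SPA then corresponds to calling cids_of with a different
address.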
2.2 Non-Functional Requirements

The non-functional requirements describe how the system behaves under certain loads and under attack,
how many users it should support, how responsive it is, and so on. The non-functional requirements of
IPFSdataStore are listed below. These requirements are the ones that make this system special and they
are mostly related to security, that is, data protection. The requirements are not many but they are
enough to differentiate this system from the others.
R1 The system should support not only file storage but also raw string storage. The size of a stored
value should not exceed 500MB. This size will be enough to store sensitive information such as
identity documents.
R2 The data stored on IPFS should be encrypted so that no one else is able to read it. The encryption
will be based on the AES encryption algorithm, since this is the symmetric key encryption algorithm
recommended for use today. The encryption process should not take more than 1 minute, but this will
depend on the client's hardware because the app will run in the client's browser.
R3 There should be a maximum of 10GB that a user can store on IPFS per wallet address. This
amount should be enough for a user if he/she wants to store only sensitive information.
R4 The system should be accessible anywhere on the planet where there is internet access, and it
should be as easy to access as a simple website in a browser.
R5 Since all the data is stored on IPFS and on a Smart Contract (later named – the Ledger) there
shouldn’t be restrictions on how many users the system will hold. This relies on the blockchain capacity
and how many nodes are mining the respective blockchain, and of course, this also depends on how many
nodes contribute to the IPFS network.
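The two storage limits listed above (500MB per stored value and 10GB per wallet address) can be
expressed as a simple check. The sketch below is an assumption about how such a check could look. The
text does not say where the limits are enforced, the function name can_store is hypothetical, and
decimal units (1 MB = 10^6 bytes) are assumed.

```python
MAX_VALUE_BYTES = 500 * 10**6   # 500MB per stored value (decimal units assumed)
MAX_WALLET_BYTES = 10 * 10**9   # 10GB per wallet address (decimal units assumed)


def can_store(value_size: int, already_stored: int) -> bool:
    """Return True if a new value of `value_size` bytes respects both limits."""
    if value_size <= 0:
        return False
    if value_size > MAX_VALUE_BYTES:
        return False
    # The wallet's total after this upload must stay within the per-wallet cap.
    return already_stored + value_size <= MAX_WALLET_BYTES


print(can_store(value_size=2 * 10**6, already_stored=0))       # small document
print(can_store(value_size=600 * 10**6, already_stored=0))     # single value too large
print(can_store(value_size=400 * 10**6, already_stored=9_800 * 10**6))  # wallet cap exceeded
```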
Requirements Specification Conclusions

This chapter described the different types of requirements by which a system should be described:
business requirements, user requirements, product requirements, functional requirements, and
non-functional requirements. The last two were described in more detail. The functional and
non-functional requirements are not many, but they are enough to describe a system that aims to
demonstrate how people's most sensitive data can be stored on the internet encrypted and without
depending on any third-party company.
The functional requirements are mostly focused on the ability to connect to the system using a crypto
wallet provider and to store encrypted files on the IPFS network, while the non-functional
requirements are focused on the capabilities of the platform described in this paper. Some of the
described restrictions concern the amount of storage a user can use. It should be noted that some
requirements do not depend on the system itself but on the platforms that the system is built upon,
in this case IPFS and the Ethereum blockchain.
3 SYSTEM DESIGN
Each system should have a well-defined design before implementation. This helps the developers
understand how to implement the system and will later help to extend it or to solve any issues with
it. While drawing the design diagrams, the architect can spot issues that could appear later in the
system, before it is implemented. This is because people in some situations understand things better
when looking at a diagram.
This chapter presents the diagrams that explain how the system works, what its components are, and
how those components interact with each other. The platform consists of several components that come
from different environments. Those environments differ in the way they are constructed, and each of
them has its own advantages and disadvantages in specific situations. As mentioned in the previous
chapters, this platform will consist of components that are not held by any third-party company and
that are open source. There will be a single component that is primarily developed within this
project, and it has the responsibility of integrating all the other ones in order to achieve the goal
of allowing users to store their data online in a secure way.
The environments that interact are the following: an Ethereum blockchain, an IPFS network (for which
a local node will be needed), and the local environment (the user's computer) where a SPA will be
running. This chapter describes the use cases of the system, the high-level design showing the
components, how the data will be protected on IPFS, the flow of the data storage, and the external
components that complement the system.
3.1 Use cases

The IPFSdataStore platform, as mentioned previously, has a few use cases, and those are enough for
achieving its goal of providing people with a secure way to store their data online. Use Case
diagrams are a way of presenting the actions a user can perform on a given platform. This kind of UML
(Unified Modelling Language) diagram also shows how actions relate to each other, for example, which
actions are required before another, broader action can be done. This is an easy way to present all
the system's functionalities in a concise manner.
Use Case diagrams also show the actors that are involved and interact with the platform. For
IPFSdataStore there is one actor, the user who primarily uses the platform. Another actor can also be
considered, namely the developer of the platform. IPFSdataStore can receive donations because each
Smart Contract, like any other blockchain address, has a balance and can receive and send value. Each
Smart Contract has only one deployer, and the deployer address cannot be changed. For IPFSdataStore,
only the deployer of the Smart Contract will be able to withdraw the donated value.
Figure 3.1 presents the most important use cases of the system.
Figure 3.1 - Main use cases
As shown in this figure, the user will be able to connect to a crypto wallet, for instance, MetaMask.
MetaMask is an open-source extension for Chromium-based browsers that holds the key pairs of crypto
wallets, among other things. After the SPA is connected to the wallet, the user can select a file from
local storage and is then asked for a key that will be used to encrypt the file. Encryption is needed
because the file will later be stored on the IPFS network, which is open to everyone: anyone can see
what is stored under a given CID. The same key that was used for encryption should be used for
decryption. The encryption process is based on a symmetric key algorithm, AES.
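The text does not specify how the key or password typed by the user is turned into an AES key. One
common approach, assumed here and not confirmed by this thesis, is to derive a 256-bit key from the
password with PBKDF2. The sketch below uses Python's standard hashlib module. The function name
derive_aes_key, the salt length, and the iteration count are illustrative choices, and the sketch
covers only key derivation, not the AES encryption itself.

```python
import hashlib
import os


def derive_aes_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte (AES-256 sized) key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


# Assumption: the random salt is not secret and is kept alongside the encrypted file,
# so that the same key can be derived again at decryption time.
salt = os.urandom(16)
key_for_encryption = derive_aes_key("example password", salt)
key_for_decryption = derive_aes_key("example password", salt)

print(key_for_encryption == key_for_decryption)  # same password and salt, same key
```

Because the algorithm is symmetric, decryption succeeds only if the user enters the same password
again, which yields the same key.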
Another use case is retrieving an encrypted file based on its CID. After the encrypted file is
downloaded, it should be decrypted using the same key that was used for encryption. A further use
case is getting the owning address of the data stored under a given CID. This can be used to prove
that a given CID is owned by only one person. Lastly, one of the most important use cases is getting
the CIDs of all the data that was stored using a given wallet. This is useful because CIDs are long
strings of characters that are difficult to remember. The same is true of wallet addresses, but these
are managed by MetaMask, for instance.
3.2 The environments involved
The platform is built of components that are part of three totally different environments: the
blockchain, the InterPlanetary File System (IPFS), and the local machine of the user. One important
component of the system is the smart contract, which will be deployed on the Ethereum blockchain. The
data owned by the user will be stored on the IPFS network. Finally, the local computer of the user
will be used to run the SPA in the browser. The wallet management will also run in the user's browser.
Blockchain is a technology used to store transparent data in a democratic way. The blockchain is
immutable, and its characteristics make it extremely difficult to tamper with. This is why blockchain
has been used and is still used for sensitive data storage, such as keeping records of money
transactions, in voting systems, and so on.
IPFS is a new way of storing large amounts of data without involving third-party companies such as
Google, Facebook, Amazon, or any other. It is a decentralized data storage mechanism to which anyone
can contribute storage and be rewarded for the storage offered. This can be achieved through Filecoin.
IPFS is actually a protocol very similar to other P2P (peer-to-peer) protocols. With IPFS it is enough
to start a node, and through it the user can store any file or even folder on the network.
3.3 High level architecture

Figure 3.2 presents the high-level architecture of the system: its components and how they interact.
As explained earlier, there will be a wallet manager. The wallet manager holds key pairs, each
consisting of a private and a public key. The address of a wallet is derived from its public key, so
the public key effectively identifies the wallet. An address can also represent a contract.
The user interface will be a SPA. It will run locally in the browser, will be open source on GitHub,
and will also be deployed on IPFS. The single-page app will communicate with the wallet manager, and
the wallet manager will send transactions to the blockchain. After a transaction has been mined on the
blockchain, the wallet manager is notified, and through it the SPA.
The app running locally is also responsible for encrypting the data before storing it and decrypting it
after downloading. After data is stored on IPFS, the user gets a CID, which acts like the address at
which the data is stored. This address is derived from a hash of the data, meaning that the same data
always has the same CID. If a user encrypts the same data with a different key, however, the stored
bytes are different and therefore the hash is different too.
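The relation between key, stored bytes, and identifier can be illustrated with a minimal Node.js sketch. It is not the project's code: a plain SHA-256 hex digest stands in for the CID (real CIDs wrap such a digest in additional encoding), and the helper names `contentId` and `encrypt` are hypothetical. A fixed IV is used only so that the effect of the key can be observed in isolation.

```javascript
const crypto = require("node:crypto");

// Stand-in for a CID: the hex SHA-256 digest of the stored bytes (assumption of this sketch).
function contentId(bytes) {
  return crypto.createHash("sha256").update(bytes).digest("hex");
}

// AES-256-CTR with a fixed all-zero IV, so the output depends only on data and key.
// A fixed IV is NOT safe in a real application; it only keeps this demonstration deterministic.
function encrypt(plaintext, passphrase) {
  const key = crypto.createHash("sha256").update(passphrase).digest(); // 32-byte key
  const iv = Buffer.alloc(16, 0);
  const cipher = crypto.createCipheriv("aes-256-ctr", key, iv);
  return Buffer.concat([cipher.update(plaintext), cipher.final()]);
}

const file = Buffer.from("same file contents");
const idA1 = contentId(encrypt(file, "key A"));
const idA2 = contentId(encrypt(file, "key A"));
const idB = contentId(encrypt(file, "key B"));

console.log(idA1 === idA2); // identical bytes give an identical identifier
console.log(idA1 === idB);  // a different key gives different bytes and a different identifier
```

With a random IV per upload, as is normal practice, even the same file and the same key would produce different stored bytes and therefore a different identifier.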
Last but not least, the Smart Contract is deployed on the Ethereum blockchain. Ethereum, or another
blockchain that is EVM (Ethereum Virtual Machine) compatible, can be used for Smart Contracts. Smart
Contracts provide a kind of transparent and distributed computing and data storage. Storing data and
computing on a blockchain is costly, however: each operation performed on the blockchain costs GAS, as
does each piece of data stored. This is why the blockchain cannot be used to store the files the user
owns. Instead, it can be used to store data that takes little space and represents relations between
entities.
Figure 3.2 - High-level architecture
The later sections describe the external components in more detail: MetaMask, IPFS Desktop, Etherscan,
testnets, and others.
3.4 Data protection using AES

There are several ways to encrypt and decrypt the data before storing it on the IPFS network. One set
of algorithms is based on a symmetric key and another set is based on an asymmetric key. Symmetric-key
encryption means that the same key is used for encryption as well as for decryption. Asymmetric-key
algorithms use two keys, one public and one private. The two-key algorithms are usually slower and are
mostly used for exchanging a symmetric key, which is later used for the encryption of the actual
data. [19] [13]
Some of the most popular symmetric-key encryption algorithms are AES, DES, IDEA, Blowfish, and others.
The one that is generally considered the most performant and secure is AES. Because it is symmetric-key
based, the algorithm can encrypt large amounts of data quickly. It is used by many of today's
applications, especially governmental ones. It can be implemented at the hardware level or at the
software level. At the hardware level the algorithm is faster but more difficult to set up, because
special hardware may be required. For this project, the encryption is performed at the software level,
because this decouples the project from the client's hardware, so the system is not hardware dependent.
A more detailed explanation of how symmetric and asymmetric key cryptography work, and of the
differences between them, can be found on ssl2buy.com. [18] As shown in Figure 3.3 and Figure 3.4, both
of which are fully explained on ssl2buy.com, with symmetric-key cryptography the two parties involved
in the secure communication must use the same key for encryption and decryption, while with
asymmetric-key encryption the two parties do not need to exchange any secret key.
Figure 3.3 - Symmetric key cryptography
Asymmetric encryption is the newer of the two techniques. One of the most popular algorithms for
public-private key cryptography is RSA. The algorithm was invented by Rivest, Shamir, and Adleman,
hence its name. Like most other cryptographic techniques, RSA is based on mathematical principles, in
this case on properties of prime numbers.
The two keys are related to each other. First the private key is chosen, and the public key is then
generated from it. The public key is used for encryption, and only the corresponding private key can
decrypt the data. There is no other key that can decrypt it.
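The way the two families are typically combined (a slow asymmetric algorithm protects a small symmetric key, and the fast symmetric algorithm protects the data) can be sketched with Node's built-in `crypto` module. This is only an illustration of the general idea described above; the project itself uses a user-supplied symmetric key and does not exchange keys. The choice of AES-256-GCM and a 2048-bit RSA key is an assumption of the sketch.

```javascript
const crypto = require("node:crypto");

// Recipient's RSA key pair, generated here only for the demonstration.
const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });

// Sender: encrypt the actual data with a fresh symmetric AES-256-GCM key.
const aesKey = crypto.randomBytes(32);
const iv = crypto.randomBytes(12);
const cipher = crypto.createCipheriv("aes-256-gcm", aesKey, iv);
const ciphertext = Buffer.concat([cipher.update("sensitive data", "utf8"), cipher.final()]);
const tag = cipher.getAuthTag();

// Sender: encrypt only the small AES key with the recipient's public key.
const wrappedKey = crypto.publicEncrypt(publicKey, aesKey);

// Recipient: recover the AES key with the private key, then decrypt the data.
const unwrappedKey = crypto.privateDecrypt(privateKey, wrappedKey);
const decipher = crypto.createDecipheriv("aes-256-gcm", unwrappedKey, iv);
decipher.setAuthTag(tag);
const recovered = Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");

console.log(recovered);
```

Only the holder of `privateKey` can recover `aesKey`, so the symmetric key never has to be shared over a separate secret channel.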
A very similar technique is used to generate wallets for blockchains. The address of a wallet is
derived from the public key, and the private key is used to sign and send transactions. Only the user
who owns the private key of an account can send crypto value from that account. It is very easy to
obtain a crypto wallet: it is enough to generate a key pair with any tool that produces addresses
compatible with the ones used by the blockchain. The user first chooses a private key, which should be
long and hard to guess, and then the public key is generated from this private key.
Figure 3.4 - Asymmetric key cryptography
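The derivation of a wallet's public key from its private key can be sketched as follows. The sketch assumes the secp256k1 elliptic curve, which Ethereum uses, and relies only on Node's built-in `crypto` module; the helper name `publicKeyFromPrivate` is hypothetical. The final step of turning the public key into an Ethereum address requires a Keccak-256 hash, which is not part of the module and is therefore left out.

```javascript
const crypto = require("node:crypto");

// Derives the uncompressed public key (hex) that corresponds to a given private key (hex).
function publicKeyFromPrivate(privateKeyHex) {
  const ecdh = crypto.createECDH("secp256k1");
  ecdh.setPrivateKey(Buffer.from(privateKeyHex, "hex"));
  return ecdh.getPublicKey("hex"); // 0x04 || X || Y, 65 bytes
}

// Generate a fresh key pair, then show that the private key alone reproduces the public key.
const wallet = crypto.createECDH("secp256k1");
wallet.generateKeys();
const privateKeyHex = wallet.getPrivateKey("hex");
const derivedPublicKey = publicKeyFromPrivate(privateKeyHex);

console.log(derivedPublicKey === wallet.getPublicKey("hex"));
```

The reverse direction, recovering the private key from the public key, is computationally infeasible, which is why the public key and the address can be shared freely.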
The next section explains the entire process of how a file is encrypted, saved to IPFS, and registered
in the Smart Contract.
3.5 Data storage flow

This section explains all the steps needed to store a file on IPFS securely and register it in a Smart
Contract deployed on the Ethereum blockchain. The components involved, their roles, and how they
interact with each other are presented graphically in Figure 3.5.
First of all, the user accesses the SPA through an ipfs:// link. After the page has loaded, the user
connects one of his or her accounts. This is not a classical account represented by an email and a
password, or something similar. An account is a crypto wallet, in this case an Ethereum-based wallet.
The encryption and decryption of the data take place on the user's machine. To connect a crypto wallet,
the user clicks a button that opens MetaMask. MetaMask asks for a password, because all actions
performed through MetaMask should be done only by the person who owns the accounts it manages. After
the password is entered, MetaMask asks the user which accounts (wallets) he/she wants to connect to the
SPA. After selecting the accounts, the user presses connect, and in this way the user interface is
connected to a wallet provider, which in this case is MetaMask.
Once the user has connected the SPA to MetaMask, he/she can use it to perform actions on the
blockchain. To store some data on IPFS, the user selects a file from the local machine's hard drive and
is then asked for a key to be used for encryption. After the key is entered, the file is encrypted
using the AES algorithm. To store the file on IPFS, the user presses the Store File button. After a few
seconds, the user receives the response from IPFS containing the CID of the stored data.
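The encryption step of this flow can be sketched in Node.js as shown below. The sketch is an assumption-laden illustration, not the thesis implementation: the source does not state the AES mode or how the user's key is turned into an AES key, so AES-256-GCM and the scrypt key derivation function are chosen here, and the payload layout and the names `encryptFile` and `decryptFile` are hypothetical.

```javascript
const crypto = require("node:crypto");

// Encrypts a file buffer with a key derived from the user's passphrase.
// Payload layout (assumption of this sketch): salt (16) || iv (12) || tag (16) || ciphertext.
function encryptFile(fileBytes, passphrase) {
  const salt = crypto.randomBytes(16);
  const key = crypto.scryptSync(passphrase, salt, 32); // 256-bit AES key
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(fileBytes), cipher.final()]);
  return Buffer.concat([salt, iv, cipher.getAuthTag(), ciphertext]);
}

// Reverses encryptFile; throws if the passphrase is wrong or the payload was modified.
function decryptFile(payload, passphrase) {
  const salt = payload.subarray(0, 16);
  const iv = payload.subarray(16, 28);
  const tag = payload.subarray(28, 44);
  const ciphertext = payload.subarray(44);
  const key = crypto.scryptSync(passphrase, salt, 32);
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

const fileBytes = Buffer.from("contents of the selected file");
const payload = encryptFile(fileBytes, "user supplied key");
const restored = decryptFile(payload, "user supplied key");
console.log(restored.toString("utf8"));
```

The `payload` buffer is what would be handed to IPFS in the Store File step. Because GCM is an authenticated mode, this variant reports a wrong key as an error instead of returning corrupted bytes.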
Figure 3.5 - Data storage flow
The obtained CID is then sent to the Smart Contract using the account selected in MetaMask. A
transaction is created for this purpose, and this transaction costs GAS. The GAS price depends on the
state of the blockchain. The selected MetaMask account is charged the amount of gas consumed multiplied
by the gas price, paid in ETH. The Smart Contract registers the CID as a key in a map, and its value is
the address of the wallet that was used to register the CID. In this way, the Smart Contract keeps
track of who owns specific data on IPFS, because the CID acts like the address of the data on IPFS.
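The fee formula above can be worked through with a short sketch. The gas amount and gas price below are hypothetical numbers chosen for easy arithmetic, and the helper names are illustrative. `BigInt` is used because amounts in wei (1 ETH = 10^18 wei, 1 gwei = 10^9 wei) exceed the safe range of ordinary JavaScript numbers.

```javascript
const WEI_PER_GWEI = 10n ** 9n;
const WEI_PER_ETH = 10n ** 18n;

// Fee in wei = gas used * gas price, with the gas price given in gwei.
function transactionFeeWei(gasUsed, gasPriceGwei) {
  return BigInt(gasUsed) * BigInt(gasPriceGwei) * WEI_PER_GWEI;
}

// Formats an amount in wei as a decimal ETH string without trailing zeros.
function formatEth(wei) {
  const whole = wei / WEI_PER_ETH;
  const fraction = (wei % WEI_PER_ETH).toString().padStart(18, "0").replace(/0+$/, "");
  return fraction ? `${whole}.${fraction}` : `${whole}`;
}

// Hypothetical transaction: 50,000 gas consumed at a gas price of 20 gwei.
const fee = transactionFeeWei(50000, 20);
console.log(formatEth(fee)); // 0.001
```

The same registration therefore costs more when the network is busy and the gas price rises, even though the amount of gas consumed stays the same.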
To get the file back in decrypted form, the user must know the CID. Knowing the CID, the user can
download the data from IPFS. After downloading, the user is asked for the same key that was used for
encryption in order to decrypt the file. If the wrong key is specified, the "decrypted" data will be
corrupted.
3.6 Integrated components

This project consists of a number of components that work together to achieve the goal described in
the Domain Analysis chapter (chapter 1). Figure 3.6 presents the components related to the Ethereum
blockchain and to the process of Smart Contract development.
Figure 3.6 - Smart Contract development and the related components
The only components developed for this project are the SPA and the Smart Contract. The single-page
application will be deployed on IPFS, while the Smart Contract (also named the Ledger) will be deployed
on the Ethereum blockchain. The other components fulfill the remaining needs and are integrated with
the whole system. The external components used are the MetaMask browser extension, the Ethereum
blockchain, the IPFS network, and Etherscan. The ones that have not been explained so far are explained
in the next sections and in the Implementation chapter (chapter 4).
3.6.1 MetaMask
MetaMask is an Ethereum-based crypto wallet manager. This wallet manager is open source. It can be
used to call Smart Contracts or to send amounts of ether from one account to another. The manager can
be considered a gateway to the Ethereum blockchain. Figure 3.7 shows the interface of MetaMask.
Figure 3.7 - MetaMask extension interface
MetaMask is simple and easy to use. The extension can be used to swap Ethereum-based assets and to see
the previous transactions made with a specific wallet, and it is easy to connect to different
blockchain networks, even testnets. It can also hold multiple wallets.
3.6.2 Etherscan
Etherscan is the block explorer for Ethereum-based blockchains. One can think of it as the Google of
the Ethereum blockchain. Figure 3.8 presents one section of a page from Etherscan that shows general
information about the current state of the mainnet Ethereum blockchain. This information includes the
current Ether price, the total number of transactions, the medium GAS price, the Ethereum transaction
history for the last 14 days, the market cap, the number of the last finalized block, and the number of
the last safe block.
Etherscan can also be used to look into the code of a Smart Contract, because everything on the
blockchain is open and transparent. It is also possible to see the ABI of a Smart Contract, the
transactions related to it, its events, and so on. In general, it is possible to see everything that
is related to the blockchain.
Figure 3.8 - Transaction history listed in Etherscan
3.6.3 IPFS Desktop

IPFS Desktop is an application that spins up an IPFS node. This node can be used to interact with the
IPFS network, to see the content stored at a specific CID, to create shareable links, and for other
useful things. The application is open source. This makes it suitable for applications that require a
high level of security and lets people place more trust in it. The application can be installed on
Windows, macOS, and Linux-based operating systems. Some of the features supported by the app are listed
below.
– start IPFS node;
– store files on IPFS network;
– download file from IPFS network;
– visualize the contents on a given CID;
– deploy static websites.
3.6.4 Blockchain testnets

Blockchain testnets are used for development purposes. In the beginning, the Ethereum mainnet was
based on Proof of Work. Earlier testnets were Ropsten, Rinkeby, and other similar ones. Recently these
became deprecated because the mainnet transitioned to Proof of Stake as the technique for proving the
correctness of a block. The testnets now in use are Goerli and Sepolia. These two are based on Proof of
Stake.
Testnets use fake ETH. This is useful for development purposes, where it is risky to use real money.
Testnets have their own scanners, very similar to Etherscan for the mainnet. For example, the scanner
for the Goerli testnet is goerli.etherscan.io. Fake ETH is obtained from faucets. A faucet is a simple
website that can be used to transfer fake ETH into the wallets used for development purposes.
System Design Conclusions

This chapter presented the high-level architecture of the system: the components that allow the system
to achieve its goal and how they interact with each other. The beginning of the chapter presented the
functionalities supported by the system using Use Case Diagrams. Then the overall architecture was
presented and explained.
The environments that interact to make the system work are an Ethereum-based blockchain, the IPFS
network, and the user's local machine, which runs a SPA in the browser. Another important component,
which makes the connection between the user interface and the blockchain, is the wallet manager,
MetaMask. This chapter also explained why the AES encryption algorithm was chosen and how it will be
used in the system to protect the client's data. The architecture was then presented with a diagram
showing the flow involved in saving a file on the IPFS network securely. Finally, the chapter briefly
described the other components that are needed for the system to work or that can be used to monitor
its activity.
4 SYSTEM IMPLEMENTATION
This chapter describes the technologies used to bring the concept to life, shows code snippets from the
actual implementation, and covers other aspects related to the implementation of the system. There are
usually several ways to implement the same system, depending on the chosen techniques and technologies.
The implementation also depends largely on the defined architecture. Different technologies,
techniques, philosophies, and architectures produce systems with different non-functional
characteristics, while the functional requirements are satisfied and work mostly the same way. So, to
get better performance, a more secure system, and other improved non-functional properties, it is
important to choose appropriate technologies and techniques and to have a well-defined system design.
In general, the technologies chosen for this system are proven to be reliable. They are also some of
the most popular, have great documentation, and are supported by many developers. Smart Contract
development is no longer new to developers, and Smart Contracts are already used by many clients whose
requirements these technologies fit well. These requirements usually concern security and transparency.
IPFS is already used by artists to share their artworks, known in the online world as NFTs.
4.1 Technologies
There are several languages available for developing Smart Contracts. Most of the time the language
depends on the blockchain being used. There are blockchains that support Smart Contract development
using general-purpose programming languages such as Python. Once Smart Contracts were invented,
languages specific to their development appeared. One of those languages is Solidity.
4.1.1 Solidity
Solidity is a Smart Contract development language. It has all the features needed to develop a
feature-rich Smart Contract. The language is high-level and object-oriented. It allows, for example,
the creation of a Smart Contract that represents a subcurrency. Each Smart Contract file has a .sol
extension. Each file should specify a license identifier and the compiler version to be used to compile
the Smart Contract.
Listing 4.1.1 presents a simple contract example and shows how the version and license can be specified
at the top of the file. A contract is like a class in the Java programming language. It has state
(attributes) and behavior (methods). Each contract resides at an address on the blockchain where it is
deployed. In Listing 4.1.1, uint storedData is the state, while function set(uint x) public and
function get() public view returns (uint) are methods. Methods can be called from another contract or
from outside the blockchain, for example using MetaMask. Methods also have different access modifiers,
such as public, private, and external. There are other method specifiers as well, such as payable.
Payable means that the method can receive ETH sent together with the call.
// SPDX-License-Identifier: GPL-3.0
pragma solidity >=0.4.16 <0.9.0;

contract SimpleStorage {
    uint storedData;

    function set(uint x) public {
        storedData = x;
    }

    function get() public view returns (uint) {
        return storedData;
    }
}
Listing 4.1.1: Smart Contract definition example
Solidity also supports the creation of interfaces. Interfaces help to define common traits for
contracts.
4.1.2 Hardhat
When developing a Smart Contract, it is not recommended to use the mainnet, because the mainnet uses
real money. For each Smart Contract deployment, the developer has to pay an amount of GAS. There are
testnets that can be used with fake money for testing purposes. Testnets are recommended for testing
because they are very similar to the mainnet. They are not convenient for development, however,
because testnets, like mainnets, are slow. They are slow because each transaction has to be confirmed
by a few mined blocks to make sure that it was successfully registered in a block. This is where tools
such as Hardhat, Truffle, and Ganache help. Hardhat is used for this project because it is widely used
by many developers and because it is based on JavaScript, a well-known programming language used in
many applications of different types.
Hardhat has many features that are handy when developing a Smart Contract. One of the most important
is the ability to run a fake local blockchain node. This local blockchain is quick, so the developer
does not have to wait long for a transaction to be mined and confirmed. Any valid blockchain, even the
local one, can be easily added to MetaMask. MetaMask is an extensible tool that works with any
Ethereum-based blockchain.
Hardhat also has many plugins. For example, it has plugins that make deployment easier and plugins
that help test the Smart Contract from the outside. Some of the Hardhat plugins are the following:
– @nomicfoundation/hardhat-toolbox;
– @nomicfoundation/hardhat-chai-matchers;
– @nomiclabs/hardhat-ethers;
– @nomiclabs/hardhat-etherscan;
– @nomiclabs/hardhat-vyper.
These are just a fraction of them. The framework has plugins for almost anything that is needed for
Smart Contract development. The framework can be installed as an npm package and has an extension for
VS Code.
4.1.3 IPFS-core
As presented in the System Design chapter (chapter 3), the system is composed of several components,
two of which are the ledger, deployed on an Ethereum blockchain, and a client app, which in this case
is a single-page application. The client app connects to the IPFS network, and for this it needs a way
to connect to the network.
IPFS-core is a JavaScript-based library. The library allows connecting to the IPFS network and
provides useful functions to upload data to and download data from the network. It includes everything
needed to integrate the app with the IPFS network.
It is easy to integrate the library into the application. As shown in Listing 4.1.2, all that is needed
to get a connection to the network is to import the library and call the .create() function.
import * as IPFS from 'ipfs-core'

const ipfs = await IPFS.create()
const { cid } = await ipfs.add('Hello world')
Listing 4.1.2: How to connect to IPFS using IPFS-core
The cid is the content identifier explained in the System Design chapter (chapter 3). The library is
free and open source, dual-licensed under MIT and Apache-2.0.
4.1.4 Data protection using AES

In order to protect the data stored on the IPFS network, the application encrypts it using a
symmetric-key encryption algorithm. The selected encryption algorithm is AES, because it is considered
more performant than the others. Another popular symmetric-key encryption algorithm is DES, which
appeared before AES. There is also a derivative of DES called 3DES, but because it is a derivative,
the technique does not differ very much from DES. A symmetric-key encryption technique was chosen
because it is faster than asymmetric-key encryption techniques, so large amounts of data can be
encrypted quickly using AES.
DES and 3DES are no longer considered secure enough for current systems. AES has the advantage that it
allows keys of different lengths. It also uses a more elegant mathematical approach. DES is based on a
key of 56-bit length, while AES allows keys of 128-bit, 192-bit, or 256-bit length. This makes the
algorithm much stronger. Another disadvantage of DES is that it is efficient mainly in hardware, while
AES was designed to be efficient in both hardware and software. Figure 4.1 presents a short summarized
comparison of the two encryption algorithms.
Figure 4.1 - Summarised comparison between DES and AES
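The effect of the key length can be made concrete with a small calculation: a key of k bits has 2^k possible values, so every additional bit doubles the work of an exhaustive search. The sketch below uses `BigInt` to keep the numbers exact; the helper names are illustrative.

```javascript
// Number of possible keys for a key of the given length in bits.
function keySpace(bits) {
  return 2n ** BigInt(bits);
}

// How many times larger the AES key space is compared with the 56-bit DES key space.
function aesToDesRatio(aesBits) {
  return keySpace(aesBits) / keySpace(56);
}

console.log(keySpace(56).toString()); // 72057594037927936
for (const bits of [128, 192, 256]) {
  const matches = aesToDesRatio(bits) === 2n ** BigInt(bits - 56);
  console.log(`AES-${bits} has 2^${bits - 56} times more keys than DES: ${matches}`);
}
```

Even the shortest AES key therefore offers 2^72 times more keys than DES, which is why an exhaustive key search that is feasible against DES is not feasible against AES.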
When the user wants to upload a file to the IPFS network, he/she is first asked for the encryption
key. After the user provides the encryption key, the file is encrypted and sent to the network in
encrypted form. The encryption takes place at the software level, more precisely, in this case, in the
browser.
When the user wants to download the encrypted file, he/she has to provide the same key that was used
when the file was encrypted and saved to the IPFS network. The file is first downloaded, and then, as
in the encryption case, the decryption takes place at the software level, in the browser. No error is
raised if the user specifies the wrong decryption key. The client app still tries to decrypt the file,
but the resulting data is not the original: the result is a file consisting of bytes that do not
represent any valid data format.
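The behaviour described above, where a wrong key silently produces unusable bytes, can be reproduced with an unauthenticated AES mode. The source does not state which mode or library the client app uses, so AES-256-CTR and the helper names below are assumptions made only for this sketch.

```javascript
const crypto = require("node:crypto");

// Derives a 32-byte key from a passphrase. Plain SHA-256 keeps the sketch short;
// a real application should prefer a slow key derivation function such as scrypt.
const keyFrom = (passphrase) => crypto.createHash("sha256").update(passphrase).digest();

// In CTR mode, encryption and decryption are the same operation (XOR with a keystream).
function aesCtr(data, passphrase, iv) {
  const cipher = crypto.createCipheriv("aes-256-ctr", keyFrom(passphrase), iv);
  return Buffer.concat([cipher.update(data), cipher.final()]);
}

const iv = crypto.randomBytes(16);
const original = Buffer.from("%PDF-1.7 example file contents");
const encrypted = aesCtr(original, "correct key", iv);

const withCorrectKey = aesCtr(encrypted, "correct key", iv);
const withWrongKey = aesCtr(encrypted, "wrong key", iv); // no exception is thrown

console.log(withCorrectKey.equals(original)); // true
console.log(withWrongKey.equals(original));   // false
```

If detecting a wrong key is desirable, an authenticated mode such as AES-GCM could be used instead, since it fails explicitly when the key does not match.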
4.1.5 Goerli testnet

Testnets are also blockchains, but they are not used with real money. They are very similar to the
mainnet; the only difference is that the currency on a testnet is not recognized as valuable, because
testnets are used only for development purposes. In fact, there are services called faucets that give
a developer a certain amount of crypto for testing an application that is still in the development
phase. Since these faucets exist, the currency on testnets has no value at all. Testnets were created
for development and testing because it is not realistic to use the mainnet for these purposes. Using
the mainnet for testing and development would lead to the loss of a lot of money, because the process
of developing any kind of software is based on a trial-and-error approach.
Testnets that behave almost identically to the mainnet can also be slow. The more users a testnet has,
the slower it becomes, because more transactions are waiting to be included in blocks. There are also
other kinds of blockchains used for testing and development. Even if on the surface they act like a
mainnet, under the hood they are implemented very differently. This difference is intentional: these
blockchains are made to be fast in order to allow a fast development cycle. Like the testnets mentioned
earlier, they do not use real money. Such a blockchain is usually integrated into a tool for developing
Smart Contracts, and this tool comes with many other features that are useful for Smart Contract
development. An example of such a tool, Hardhat, was already presented in one of the previous sections
of this chapter.
As for testnets, there are many of them, not just one. Each blockchain mainnet has several testnets.
The Ethereum mainnet, for instance, had the following testnets: Ropsten, Rinkeby, Kovan, and others.
"Had" because these are already deprecated. They were deprecated because in 2022 Ethereum switched to
a different block approval approach. Earlier the PoW (Proof of Work) approach was used, but now the
PoS (Proof of Stake) approach is used. After the Ethereum mainnet switched to PoS, two testnets
working with the proof of stake approach remained in use: Goerli and Sepolia. These two testnets are
very similar to the actual Ethereum mainnet. It does not make a noticeable difference which one of them
is chosen. For this project, the Goerli testnet was chosen. This means that the Ledger Smart Contract
will be deployed on the Goerli testnet and all the transactions will be registered on this testnet.
This applies only to the testing and development phases. In production, the mainnet will be used.
Each mainnet has one or more block explorers. Block explorers are interfaces (web, mobile, or other)
that listen to all blockchain interactions and keep track of what has happened on the blockchain. For
the Ethereum mainnet, the official block explorer is Etherscan. Testnets can also have their own block
explorers. For the Goerli testnet, for instance, the official block explorer is
https://goerli.etherscan.io/. This website shows all the blocks that have been approved, which
transactions are approved and which are pending, all the information about the transactions, and a lot
of other useful information.
4.2 The Smart Contract
The Smart Contract, also called the Ledger, is responsible for maintaining the relationship between
the wallet address with which a file was uploaded to IPFS and the CID of the uploaded file. The Ledger
is like a key-value database. It basically consists of a hash map in which the key is the CID of the
file and the value is the address of the wallet. In this way, the Ledger keeps track of who owns each
piece of data stored on the IPFS network.
Smart Contracts are agreements between the users of the platform. Everyone using a Smart Contract
agrees to the contract's rules. Everyone can see these rules, because contracts are deployed on a
blockchain and everything on a blockchain is publicly available. This means that everyone can see the
code of the Smart Contract. Smart Contracts are computer programs written either in programming
languages designed specifically for them or in general-purpose programming languages. A contract is
executed automatically when a transaction is sent to its address from a wallet or even from another
contract. When a transaction is sent to a Smart Contract, the contract has all the information about
the transaction and also has access to all the funds that were sent with it.
For this platform, besides registering who owns each piece of data stored on the IPFS network, the
Smart Contract can do a little more. For example, it can receive donations, and it can report all the
CIDs of the data that a user owns.
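The bookkeeping performed by the Ledger can be modelled in a few lines of JavaScript. This is an in-memory illustration of the data structure only, not the Solidity contract: the class and method names are hypothetical, and the rule that an already registered CID cannot be registered again is an assumption that follows from the requirement that a CID has a single owner.

```javascript
// In-memory model of the Ledger: CID -> owner address, plus the reverse lookup.
class LedgerModel {
  constructor() {
    this.ownerByCid = new Map();  // CID -> owner address
    this.cidsByOwner = new Map(); // owner address -> list of CIDs
  }

  // Registers a CID for the sending wallet; a CID can have only one owner.
  register(cid, sender) {
    if (this.ownerByCid.has(cid)) {
      throw new Error(`CID already registered: ${cid}`);
    }
    this.ownerByCid.set(cid, sender);
    if (!this.cidsByOwner.has(sender)) this.cidsByOwner.set(sender, []);
    this.cidsByOwner.get(sender).push(cid);
  }

  // Returns the owning address of a CID, or null if the CID is unknown.
  ownerOf(cid) {
    return this.ownerByCid.get(cid) ?? null;
  }

  // Returns a copy of the list of CIDs registered by an address.
  cidsOf(owner) {
    return [...(this.cidsByOwner.get(owner) ?? [])];
  }
}

// Hypothetical addresses and CIDs, shortened for readability.
const ledger = new LedgerModel();
ledger.register("cid-1", "0xAlice");
ledger.register("cid-2", "0xAlice");
ledger.register("cid-3", "0xBob");
console.log(ledger.ownerOf("cid-3"), ledger.cidsOf("0xAlice"));
```

On the blockchain, each `register` call would be a transaction that costs gas, while the two lookups are read-only calls.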
4.2.1 Build process

There are many tools that can be used to develop a Smart Contract, such as Ganache, Truffle, and
Hardhat. These tools serve as frameworks that provide all the instruments needed to develop a Smart
Contract. Hardhat was chosen for this project because it is a JavaScript-based framework and the same
programming language is used to develop the client app. The client app is a single-page app that runs
in the browser and is deployed on the IPFS network.
First of all, Hardhat is installed into the project using the following command:
npm install --save-dev hardhat
Listing 4.2.1: Generate Hardhat project
After installation, the project is generated by running npx hardhat, which asks the developer whether
to use JavaScript or TypeScript. When the execution finishes, the package.json file contains the
information about the project, as shown in Listing 4.2.2. Two of the most important dependencies are
@nomiclabs/hardhat-ethers and ethers. These two libraries allow interaction with the blockchain. chai
is used for testing the Smart Contract. hardhat-deploy is a Hardhat plugin used to deploy the Smart
Contract on the blockchain. hardhat-gas-reporter is a Hardhat plugin that creates gas usage reports
for the gas that was used to deploy the Smart Contract. solidity-coverage is a library used to
calculate the coverage percentage of the tests. These are the most important libraries used with
Hardhat.
{
  ...
  "devDependencies": {
    "@chainlink/contracts": "^0.3.1",
    "@nomiclabs/hardhat-ethers": "npm:hardhat-deploy-ethers@^0.3.0-beta.13",
    "@nomiclabs/hardhat-etherscan": "^3.0.0",
    "@nomiclabs/hardhat-waffle": "^2.0.2",
    "chai": "^4.3.4",
    "dotenv": "^14.2.0",
    "ethereum-waffle": "^3.4.0",
    "ethers": "^5.5.3",
    "hardhat": "^2.8.3",
    "hardhat-deploy": "^0.9.29",
    "hardhat-gas-reporter": "^1.0.7",
    "prettier-plugin-solidity": "^1.0.0-beta.19",
    "solidity-coverage": "^0.7.18"
  },
  ...
}
Listing 4.2.2: Hardhat project dependencies
After the basic Hardhat project structure is created, the project files include a file called
hardhat.config.js. This file holds all the configuration Hardhat needs to work with the testnets, or
even with the mainnet, although the latter is not recommended.
As shown in Listing 4.2.3, hardhat.config.js reads some constants that should not be visible to the
public. These are the private keys of the wallets that are going to be used to deploy the contract and
the keys to some APIs.
const COINMARKETCAP_API_KEY = process.env.COINMARKETCAP_API_KEY
const GOERLI_RPC_URL = process.env.GOERLI_RPC_URL
const GANACHE_RPC_URL = process.env.GANACHE_RPC_URL
const PRIVATE_KEY = process.env.PRIVATE_KEY
const ETHERSCAN_API_KEY = process.env.ETHERSCAN_API_KEY
Listing 4.2.3: Environment variables
These environment variables can be defined in a .env file or at the operating system level. Their
values should not be hard-coded in the system, so that they can be changed easily without affecting
the system's functionality.
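One common way to make such settings safer to work with is to fail early when a required variable is missing and to fall back to a default for optional ones. The helpers below are a hypothetical sketch, not part of the project's configuration; the example URLs are placeholders, and the environment object is passed in explicitly so that the behaviour can be shown without touching the real process.env.

```javascript
// Reads a required setting and fails fast with a clear message when it is missing.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Reads an optional setting and falls back to a default value.
function optionalEnv(name, fallback, env = process.env) {
  const value = env[name];
  return value === undefined || value === "" ? fallback : value;
}

// Demonstration with a hypothetical environment object.
const demoEnv = { GOERLI_RPC_URL: "https://example.invalid/rpc" };
const rpcUrl = requireEnv("GOERLI_RPC_URL", demoEnv);
const ganacheUrl = optionalEnv("GANACHE_RPC_URL", "http://127.0.0.1:7545", demoEnv);
console.log(rpcUrl, ganacheUrl);
```

A missing PRIVATE_KEY would then stop the deployment immediately with a readable error instead of failing later inside a network call.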
The imported variables are used to configure the components of Hardhat, such as the testnets, the gas
reporter, the accounts, Etherscan, the Solidity compiler version to be used, and others. Listing 4.2.4
shows how the testnet networks are configured to be used by Hardhat.
module.exports = {
    defaultNetwork: "hardhat",
    networks: {
        hardhat: {
            // from hardhat
            // port 8545
            chainId: 31337,
            // gasPrice: 130000000000,
        },
        ganache: {
            url: GANACHE_RPC_URL,
            chainId: 1337,
        },
        goerli: {
            url: GOERLI_RPC_URL,
            accounts: [PRIVATE_KEY],
            chainId: 5,
            blockConfirmations: 3,
        },
    },
    ...
    namedAccounts: {
        deployer: {
            default: 0,
            1: 0, // mainnet
            5: 0,
        },
    },
    ...
}
Listing 4.2.4: Hardhat testnet configuration
As shown in Listing 4.2.4, three blockchains are configured for use in development and testing. The
first one, hardhat, is provided by default by Hardhat. It is not a testnet, but it behaves like one.
This blockchain is very fast, which is convenient for the developer. The second one is ganache, which
is also not a real testnet but a simulation of one. Under the hood, it does not work like a real
blockchain. It is also useful for development, and it has a GUI (Graphical User Interface) that shows
all the blocks, accounts, transactions, and other useful information about the blockchain. The last
one is goerli, which is a real testnet. Goerli is a real blockchain that is very similar to the
mainnet, except that it does not use real money. The Goerli testnet is used only for development and
testing.
Each blockchain whether it’s the mainnet, a testnet, or a fake blockchain for development, has a
chainId. The chainId is used to differentiate between the many blockchains there are available. In the
previous configuration, the chainId is specified alongside an URL. This URL shows an access point to the
blockchain through RPC (Remote Procedure Call). It is also possible to specify in the configuration the
number of accounts that are going to be used with the blockchain and the number of blocks to be used for
transaction confirmation. Confirming a transaction means that X blocks were mined after a transaction was
created in the blockchain.
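The confirmation rule can be written down as a small calculation. The sketch below assumes the convention that the block containing the transaction counts as the first confirmation; the function names are chosen for this example and are not taken from Hardhat or from the thesis code.

```javascript
// Number of confirmations, assuming the block that contains the transaction
// counts as confirmation number one.
function confirmations(txBlockNumber, latestBlockNumber) {
  if (latestBlockNumber < txBlockNumber) return 0 // transaction not in the chain yet
  return latestBlockNumber - txBlockNumber + 1
}

// True once the transaction has at least `required` confirmations.
function isConfirmed(txBlockNumber, latestBlockNumber, required) {
  return confirmations(txBlockNumber, latestBlockNumber) >= required
}

// A transaction included in block 100, checked against blockConfirmations: 3.
const justMined = isConfirmed(100, 100, 3)      // only block 100 exists so far
const twoBlocksLater = isConfirmed(100, 102, 3) // blocks 100, 101 and 102 exist
```

With blockConfirmations set to 3, as in the goerli entry, the deployment is treated as final only after two further blocks follow the one that contains the transaction.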
Once the Hardhat project has been configured, the code for the Smart Contract can be written.
The project has the file structure presented in Listing 4.2.5.
- artifacts/
- contracts/
- deploy/
- deployments/
- node_modules/
- scripts/
- test/
- utils/
- .env
- package.json
- hardhat.config.js
...
Listing 4.2.5: Hardhat project structure
The contracts folder holds the Solidity files in which the Smart Contracts are developed. Inside it
there is also a test folder, which holds the unit tests written at the Solidity level. The Smart
Contracts can also be tested in integration. Integration tests can be written in any programming
language that has the libraries needed to call a Smart Contract that is already deployed on a testnet.
They are located in the test directory under the project's root directory; alternatively, scripts placed
in a scripts folder under the root directory can be used to exercise the Smart Contract from the
outside. The deploy folder contains the JS files used to deploy the Smart Contract on a testnet. These
deployment scripts are similar to database migrations written in SQL. The deployments directory
contains the artifacts that result from deploying a Smart Contract on a testnet. The .env file
specifies the properties that are injected into the JavaScript scripts. These properties configure the
deployment of the Smart Contract or any library that reports information about the contract after
deployment, for example how much gas was spent on the deployment. The package.json file lists the
required libraries and their versions and, as explained earlier, hardhat.config.js holds the
configuration of the Hardhat project.
As mentioned, the Smart Contract is developed in a language called Solidity. Many programming
languages can be used to develop Smart Contracts. Some of them are even well-known general-purpose
languages such as Python and Rust. Solidity is a niche programming language that is used only
for Smart Contract development. Listing 4.2.6 presents the Solidity code of the Smart Contract used in
this project; the code is explained in more detail below.
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.7;

error NotOwnerError();

contract Ledger {
    address public immutable i_owner;
    address[] public users;
    mapping(address => uint256) public s_donatorToDonatedAmounts;
    mapping(string => address) public s_cidToAddress;
    mapping(address => string[]) public s_addressToOwnedCids;

    event NewCidRegistered(address ownerAddress, string cid);
    event DonationsWithdrawal();
    event NewDonation();

    constructor() {
        i_owner = msg.sender;
    }

    // methods to perform actions on the smart contract
    ...
}
Listing 4.2.6: The Ledger
The full code of this contract can be found in Appendix A.
Each Smart Contract should specify its license identifier. For this project, the MIT license was
chosen; Section 4.4 explains why it was chosen and what other licenses are available. In Solidity, the
license is specified on the first line of the file as a comment of the form
SPDX-License-Identifier: MIT. After the license, the version of the compiler used to compile the Smart
Contract must be specified. This is done using pragma solidity followed by the compiler version. For
this project ^0.8.7 was chosen, which accepts compiler version 0.8.7 and later 0.8.x releases.
Solidity also supports custom error types. An error type is defined with the keyword error
followed by the error name, which by convention starts with a capital letter. This contract uses
NotOwnerError to revert the execution when an owner-only action is attempted by someone other than the
owner of the contract. Recall that the owner of the contract is the wallet address that was used to
deploy it. As will be shown later, there is a function that withdraws all the funds collected from
donations, and only the owner of the contract can call it.
Each contract starts with the contract keyword followed by the contract's name, which starts with
a capital letter, much like class names in the Java programming language. A contract is similar
to a Java class: it has state and behavior. Mutating the state of the contract requires gas to be
burned, so it costs fees. Just as Java classes can implement interfaces, contracts in Solidity
can implement interfaces too. Implementing an interface allows the creation of contracts that share
similar behavior. For example, the ERC20 specification is used to create Ethereum-based tokens, and it
is simple to create a custom token by implementing the ERC20 interface.
The state of the contract is one of its most important parts, because the contract exists to hold
information. This information is visible to everyone and cannot be tampered with. The Ledger Smart
Contract uses a few mappings to record who owns each CID of the files stored on the IPFS network. The
i_owner attribute is immutable and holds the address of the wallet that was used to deploy the
contract. The users attribute is an array that holds the wallet addresses that stored a CID through
this Smart Contract. The s_cidToAddress attribute is a storage mapping that records the owner of a
given CID. The other attributes serve similar purposes or are used for donations.
Solidity also supports emitting events. Events are used to log information about what happened
in the Smart Contract. For example, it is possible to log all CID registrations, all donations, and so
on. In this case, there are three types of events: NewCidRegistered, DonationsWithdrawal, and
NewDonation. Events can also carry data. When an event is emitted, its data is recorded in the
transaction logs, where it can be inspected with a block explorer.
As in object-oriented programming languages, Solidity Smart Contracts have constructors, which
are called automatically when the contract is deployed. The constructor should be used to initialise
the state of the contract. In this case, it sets the owner of the Smart Contract. Only the owner can
withdraw the donations held by the Smart Contract.
Listing 4.2.7 presents the function that registers a new CID in the Smart Contract's
state on the blockchain.
function publishCid(string memory cid) public {
    s_cidToAddress[cid] = msg.sender;
    s_addressToOwnedCids[msg.sender].push(cid);
    users.push(msg.sender);
    emit NewCidRegistered(msg.sender, cid);
}
Listing 4.2.7: Smart Contract function to publish CID
Functions in Solidity have visibility specifiers. This function is public because it should be
callable by anyone. The other visibility specifiers are external, internal, and private.
A private function is visible only inside the contract in which it is defined. An external function
can be called only from outside the contract, that is, through a transaction or by another contract,
since contracts can also call other contracts. Finally, an internal function can be called within the
contract or from another contract that inherits from the contract in which the function is defined.
The function takes a string, which is the content identifier of the data stored on the
InterPlanetary File System network. This content identifier is stored in the s_cidToAddress mapping to
keep track of its owner. Every function call has access to a global variable called msg. It behaves
like an object that holds information about the current call: the address of the wallet that called
the function (msg.sender), the value sent with the call, if any (msg.value), and other useful
information. After the caller's address has been saved in the contract's state, the CID is appended to
s_addressToOwnedCids. As explained earlier, this attribute stores the CIDs owned by a specific
wallet/user. Then the caller is added to the users array to record that this user has used the
contract to store some data. Finally, the NewCidRegistered event is emitted.
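The bookkeeping performed by publishCid can be mirrored in plain JavaScript to make the data flow easier to follow. The class below is an in-memory model written for illustration only: LedgerModel, the sender parameter (standing in for msg.sender), and the placeholder addresses and CIDs are assumptions of this sketch, and nothing here runs on a blockchain.

```javascript
// In-memory model of the Ledger bookkeeping. It is NOT the contract: there is
// no gas and no persistence, and `sender` stands in for msg.sender.
const ZERO_ADDRESS = "0x" + "0".repeat(40)

class LedgerModel {
  constructor() {
    this.users = []                     // mirrors `users`
    this.cidToAddress = new Map()       // mirrors `s_cidToAddress`
    this.addressToOwnedCids = new Map() // mirrors `s_addressToOwnedCids`
    this.events = []                    // stands in for the event log
  }

  publishCid(sender, cid) {
    this.cidToAddress.set(cid, sender)
    const owned = this.addressToOwnedCids.get(sender) ?? []
    owned.push(cid)
    this.addressToOwnedCids.set(sender, owned)
    this.users.push(sender)
    this.events.push({ name: "NewCidRegistered", ownerAddress: sender, cid })
  }

  getOwnerOfCid(cid) {
    // An unset Solidity mapping entry reads as the zero address.
    return this.cidToAddress.get(cid) ?? ZERO_ADDRESS
  }
}

// Placeholder addresses and CIDs, chosen for the example.
const ledger = new LedgerModel()
ledger.publishCid("0xAlice", "cid-1")
ledger.publishCid("0xAlice", "cid-2")
ledger.publishCid("0xBob", "cid-1") // same CID again: the recorded owner changes
```

The model reproduces two properties of Listing 4.2.7 as written: publishing a CID that is already registered overwrites its recorded owner, and users receives one entry per call rather than one entry per distinct wallet.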
Listing 4.2.8 presents the function that can be used to verify who is the owner of a given
CID.
function getOwnerOfCid(string memory cid) public view returns (address) {
    return s_cidToAddress[cid];
}
Listing 4.2.8: Function to get the owner address of CID
This function is as simple as it looks. It only returns the address of the wallet that was used to
store the given CID on the Smart Contract. There are some differences compared with the previous
function. This function returns a value, which is an address; Solidity has a dedicated data type
called address to represent wallet addresses. The function also carries a new modifier called view.
A view function only reads the state of the Smart Contract and does not change it. Therefore, calling
it directly from outside the blockchain does not require a transaction and costs no gas. Anyone can
call this function because it is public, so checking who owns a given CID is free.
4.2.2 Resulting artifacts

Once the Solidity code of the Ledger Smart Contract is written, it can be compiled using Hardhat.
The contract is compiled with the yarn command shown in Listing 4.2.9. After the contract has been
compiled, a new folder is created in the root directory of the Hardhat project.
$ yarn hardhat compile
Listing 4.2.9: Compile the contract
This folder is called artifacts and it contains two folders: build-info and contracts. The build-info
folder contains a JSON file with detailed information about the build. The contracts folder contains
two files for each compiled contract. This project has only one contract, Ledger, so there are two
files: Ledger.dbg.json and Ledger.json. Ledger.dbg.json is just a JSON file that points to the file in
the build-info directory. The most important file is Ledger.json. A shortened part of this file is
shown in Listing 4.2.10.
{
    "contractName": "Ledger",
    "sourceName": "contracts/Ledger.sol",
    "abi": [
        {
            "inputs": [],
            "stateMutability": "nonpayable",
            "type": "constructor"
        },
        {
            "inputs": [],
            "name": "NotOwnerError",
            "type": "error"
        },
        ...
    ],
    "bytecode": "0x60a0604052fff6040c578300080700033",
    ...
}
Listing 4.2.10: The Smart Contract ABI
The JSON file contains the result of the contract compilation: the contract name, the source
Solidity file, the bytecode, and other useful data. The most important part is the ABI (Application
Binary Interface). The ABI is a JSON array that describes all the components of the Smart Contract,
such as the functions, the constructor, the events, and the errors. In general, the ABI is what a
caller, whether a client application or another Smart Contract, needs in order to call the Smart
Contract. Listing 4.2.10 shows two entries; most of the entries were removed from the ABI to keep the
example simple. The first entry describes the constructor. As shown, the constructor has no input
parameters and has stateMutability set to nonpayable. Payable functions are the ones that can receive
funds from the caller when they are called. The second entry describes the NotOwnerError error. The
bytecode is what is stored on the blockchain, and it is the code that the EVM runs.
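Because the ABI is an ordinary JSON array, it can be inspected with a few lines of JavaScript. The sketch below groups ABI entries by their type field; it uses only the two entries visible in Listing 4.2.10, and groupAbiByType is a name introduced for this example.

```javascript
// Shortened ABI: only the two entries visible in Listing 4.2.10.
const abi = [
  { inputs: [], stateMutability: "nonpayable", type: "constructor" },
  { inputs: [], name: "NotOwnerError", type: "error" },
]

// Group ABI entries by their `type` field. A Map is used instead of a plain
// object because "constructor" is also an inherited property of every object.
function groupAbiByType(abiEntries) {
  const groups = new Map()
  for (const entry of abiEntries) {
    const label = entry.name ?? `(${entry.type})` // constructors have no name
    if (!groups.has(entry.type)) groups.set(entry.type, [])
    groups.get(entry.type).push(label)
  }
  return groups
}

const grouped = groupAbiByType(abi)
```

Run against the full Ledger.json, the same function would also list the function and event entries that were omitted from the shortened listing.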
4.2.3 Deployment
After the contract has been compiled, it can be deployed on a devnet, a testnet, or even the
mainnet. Deployment first requires a deploy script. There is a good Hardhat plugin called
hardhat-deploy that makes deployment very easy. The deployment scripts are like database migrations.
They should be placed in the deploy folder in the root directory of the Hardhat project. The scripts
are written in JavaScript and their file names should follow a certain pattern; for example, the first
script, which deploys the Ledger, is named 01-deploy-ledger. The idea is to start each script name
with a number that represents the order in which the scripts should run. The plugin keeps track of
what has already been deployed, so that only new deployments are carried out.
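The numeric prefix only gives the intended order if the names also sort correctly as text. The sketch below assumes that the scripts are executed in lexicographic file-name order; apart from 01-deploy-ledger, the file names are hypothetical.

```javascript
// Hypothetical script names; only 01-deploy-ledger comes from the text.
const padded = ["10-verify.js", "02-seed.js", "01-deploy-ledger.js"]
const unpadded = ["10-verify.js", "2-seed.js", "1-deploy-ledger.js"]

// Default Array.prototype.sort compares strings character by character.
const paddedOrder = [...padded].sort()
const unpaddedOrder = [...unpadded].sort()
// With zero padding the numeric order and the text order agree.
// Without it, "10-..." sorts before "2-...".
```

This is why a two-digit, zero-padded prefix such as 01 is a safer convention than a bare 1 once a project has ten or more scripts.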
Listing 4.2.11 presents the deployment script for the Ledger Smart Contract.
{
...
module.exports = async ({ getNamedAccounts, deployments }) => {
    const { deploy, log } = deployments
    const { deployer } = await getNamedAccounts()
    const chainId = network.config.chainId
    log("Deploying Ledger and waiting for confirmations...")
    const ledger = await deploy("Ledger", {
        from: deployer,
        args: [],
        log: true,
        // we need to wait if on a live network so we can verify properly
        waitConfirmations: network.config.blockConfirmations || 1,
    })
    log(`Ledger deployed at ${ledger.address}`)
    if (isTestnetNetwork(network) && process.env.ETHERSCAN_API_KEY) {
        await verify(ledger.address, [])
    }
}
...
}
Listing 4.2.11: Contract deployment
This small script looks up the contract by its name, Ledger, which corresponds to the Solidity file in
the contracts folder. The deployer variable identifies the wallet used to deploy the Smart Contract. It
comes from hardhat.config.js, which specifies the wallet to be used for deployment on each network, as
shown in Listing 4.2.12.
...
namedAccounts: {
    deployer: {
        default: 0,
        1: 0, // mainnet
        5: 0,
    },
},
...
Listing 4.2.12: Deployer accounts per blockchain network
As can be seen from the listing above, it is possible to specify which wallet is used when the
deployment command is run. On the left side is the chainId, and on the right side is the index of the
wallet in the list of wallets defined for that blockchain network. If no entry exists for a network,
the default entry is used, which here is the first wallet in the list.
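The lookup described above can be sketched as a function that prefers the entry for the current chainId and falls back to default. This is an illustration of the rule, not the code of hardhat-deploy; resolveAccountIndex is a hypothetical name, and the second configuration is invented to make the fallback visible.

```javascript
// Configuration from Listing 4.2.12.
const namedAccounts = { deployer: { default: 0, 1: 0, 5: 0 } }

// Hypothetical variant in which Goerli (chainId 5) uses the second wallet.
const variantAccounts = { deployer: { default: 0, 5: 1 } }

// Sketch of the lookup rule: chain-specific entry first, then `default`.
function resolveAccountIndex(config, name, chainId) {
  const perChain = config[name]
  if (perChain === undefined) throw new Error(`Unknown named account: ${name}`)
  return chainId in perChain ? perChain[chainId] : perChain.default
}

const onGoerli = resolveAccountIndex(variantAccounts, "deployer", 5)      // explicit entry
const onHardhat = resolveAccountIndex(variantAccounts, "deployer", 31337) // falls back to default
const onMainnet = resolveAccountIndex(namedAccounts, "deployer", 1)
```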
4.3 The client application

To interact with the Smart Contract deployed on a blockchain, a client application with a
user-friendly interface is needed. The encryption process also takes place on the client side.
Before storing a file on the IPFS network, the client application encrypts it, and when the file is
downloaded, the client application decrypts it.
The client application can be of any type: a web application (single-page or not), a mobile
application, a desktop application, and so on. For this project, a SPA was implemented because it can
be served from the IPFS network and run entirely in the browser. This supports the goal of minimizing
third-party involvement.
SvelteKit was used to implement the SPA because it is a JavaScript framework that is easy to use and
solves many problems of older libraries such as ReactJS and Angular, some of which relate to
component state management. AES encryption was implemented with a JavaScript library called
crypto-js. The package.json file, which lists the libraries used, is presented in Listing 4.3.1.
...
"devDependencies": {
    "@neoconfetti/svelte": "^1.0.0",
    "@sveltejs/kit": "next",
    ...
    "svelte": "^3.46.0",
    "vite": "^3.1.0"
},
"type": "module",
"dependencies": {
    "@sveltejs/adapter-static": "^1.0.0-next.48",
    "bootstrap": "^5.2.2",
    "buffer": "^6.0.3",
    "crypto-js": "^4.1.1",
    "encrypt-uint8array": "^1.0.0",
    "ethers": "^5.7.2",
    "ipfs-core": "^0.17.0",
    "ipfs-http-client": "^59.0.0"
}
Listing 4.3.1: Client app dependencies
These are some of the most important libraries used to create the SPA. adapter-static is used to
build static files from the Svelte source code. These static files can be uploaded to the IPFS network
and, since the whole SPA consists of pre-built static files, the application can be accessed
directly from the IPFS network. The buffer library is used to convert any file into a buffer of bytes.
These bytes are then encrypted using AES and uploaded to the IPFS network. The ethers library is one of
the most important: it is used to interact with the Smart Contract, that is, to call the contract
functions. Finally, ipfs-core is used to interact with the IPFS network in order to upload and
download files.
The client application is very simple and covers the essential functionality. It offers the
possibility to upload a file to the IPFS network, obtain the CID of that file, and register the CID in
the Ledger. The client app also offers the ability to download a file from the IPFS network by its
CID. These two are the main functionalities that are supported, but there are more. Listing 4.3.2
presents the function that takes a file specified by the user, encrypts it, and stores it on the IPFS
network.
const publishFile = async () => {
    // validations...
    const encrypted = await encrypt(files[0])
    console.log('encrypted', encrypted)
    const result = await ipfs.add(encrypted)
    const cid = result.path
    const transactionResponse = await contract.publishCid(cid)
    await listenForTransactionMine(transactionResponse, provider)
    // show modal...
}
Listing 4.3.2: Client side function to upload a file
The encrypt function converts the file into a buffer of bytes and then encrypts those bytes using
CryptoJS.AES.encrypt(wordArray, key).toString(). The call to toString() serializes the ciphertext as a
string, and this string is what is saved to the IPFS network.
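The encrypt-then-serialize step can be illustrated with a self-contained round trip. Note that this sketch deliberately swaps in a different technique from the prototype: it uses Node's built-in crypto module with AES-256-GCM and scrypt key derivation instead of crypto-js, and it runs in Node rather than in the browser. The function names, the passphrase, and the byte layout of the output are assumptions of this example.

```javascript
const nodeCrypto = require("crypto")

// Derive a 256-bit AES key from a passphrase. The salt is stored with the output.
function deriveKey(passphrase, salt) {
  return nodeCrypto.scryptSync(passphrase, salt, 32)
}

// Encrypt bytes and serialize salt | iv | tag | ciphertext as one base64 string.
function encryptBytes(plainBytes, passphrase) {
  const salt = nodeCrypto.randomBytes(16)
  const iv = nodeCrypto.randomBytes(12) // 96-bit nonce for GCM
  const cipher = nodeCrypto.createCipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv)
  const body = Buffer.concat([cipher.update(plainBytes), cipher.final()])
  const tag = cipher.getAuthTag() // 16-byte authentication tag
  return Buffer.concat([salt, iv, tag, body]).toString("base64")
}

// Reverse the layout above; throws if the passphrase or the data is wrong.
function decryptBytes(encoded, passphrase) {
  const raw = Buffer.from(encoded, "base64")
  const salt = raw.subarray(0, 16)
  const iv = raw.subarray(16, 28)
  const tag = raw.subarray(28, 44)
  const body = raw.subarray(44)
  const decipher = nodeCrypto.createDecipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv)
  decipher.setAuthTag(tag)
  return Buffer.concat([decipher.update(body), decipher.final()])
}

// Placeholder content and passphrase.
const serialized = encryptBytes(Buffer.from("sensitive file contents"), "example passphrase")
const restored = decryptBytes(serialized, "example passphrase").toString()
```

As in the prototype, the value that would be handed to the storage layer is a single string, and only someone who knows the passphrase can recover the original bytes.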
4.4 License
There are many types of licenses that can be used to protect a project. They can be
divided into two categories. The first category consists of proprietary licenses, which cover
proprietary products and mostly protect a company's rights to its products. The second category
consists of licenses used for open-source projects.
There are many open-source licenses; the best known and most widely used are
the following:
– Apache License 2.0;
– BSD 3-Clause ”New” or ”Revised” license;
– BSD 2-Clause ”Simplified” or ”FreeBSD” license;
– GNU General Public License (GPL);
– GNU Library or ”Lesser” General Public License (LGPL);
– MIT license;
– Mozilla Public License 2.0;
– Common Development and Distribution License;
– Eclipse Public License version 2.0.
This project is released under the MIT license. It is one of the least restrictive licenses and
gives others a great deal of freedom. The source code is open to everyone and free to use. One can
even use this code as part of proprietary software.
Implementation Conclusions
The chapter explains how the system is implemented as a prototype, which demonstrates that it is
possible to have an independent platform on which people can store sensitive data without fear of
losing it or of someone else reading it. Most of the technologies used to implement the system
were listed, along with the reasons for choosing them.
Solidity is the language of choice for writing Smart Contracts. It was designed for contract
development and has all the features needed to write a contract. JavaScript has libraries for almost
anything, and it was one of the first languages to get tools that facilitate contract development.
JavaScript was also used on the client side, since it is the dominant language for writing
single-page applications for the browser. Hardhat is the framework of choice because it
has a built-in local blockchain that makes testing the Smart Contract easier.
CONCLUSIONS
The internet plays an increasingly important role in people's lives, and a lot of sensitive data is
stored on proprietary platforms. This means that users have to entrust their data to those companies.
Unfortunately, there have already been cases in which people's data was sold, which can cause serious
problems related to the violation of human rights.
This thesis project demonstrates with a functional prototype that it is possible to store sensitive
data on a platform that is independent and that ensures the data will not be lost. A system that is
decentralized and open-source increases people's trust in it. A decentralized platform is hosted
and run collaboratively by those who use it or who want to contribute to the longevity of the project.
Over approximately the last 10 years, technologies have appeared that are especially useful for data
protection, equality, and transparency. Two of those technologies are blockchain and IPFS
(InterPlanetary File System). The blockchain is used for decentralized computation and for keeping
immutable data, especially sensitive data, while IPFS is used for storing large files on a
decentralized network.
During the work on this thesis, a prototype was implemented that allows encrypted files to be
uploaded to and downloaded from the IPFS network. For this, the user needs a crypto wallet managed
through a wallet manager such as MetaMask. The prototype also makes it possible to check who owns
a given piece of data and what data is owned by a given crypto wallet address. Even though it is
possible to see what data a user has uploaded to the IPFS network, it is not possible to read that
data, because the client application encrypts it before uploading.
The goal of the project is to show an idea that works. It is licensed under the MIT license, which
means that it is open source and free for everyone. Everyone is allowed to take this prototype and
expand it into a real product that protects people's data.
Bibliography
1. 51% attack. [online] [accessed 10.10.2022].
Available: https://en.bitcoinwiki.org/wiki/51%25_attack.
2. Blockchain [online] [accessed 20.10.2022].

Available: https://en.wikipedia.org/wiki/Blockchain.
3. Cryptocurrency [online] [accessed 27.10.2022].

Available: https://en.wikipedia.org/wiki/Cryptocurrency.
4. Facebook–Cambridge Analytica data scandal [online] [accessed 10.10.2022].

Available: https://en.wikipedia.org/wiki/Facebook%E2%80%93Cambridge_Analytica_data_scandal.
5. Filecoin [online] [accessed 15.10.2022].

Available: https://en.wikipedia.org/wiki/Filecoin.
6. InterPlanetary File System [online] [accessed 27.10.2022].

Available: https://en.wikipedia.org/wiki/InterPlanetary_File_System.
7. IPFS Pinning Services [online] [accessed 12.11.2022].

Available: https://sourceforge.net/software/ipfs-pinning/.
8. IPFS: A Complete Analysis of The Distributed Web [online] [accessed 7.10.2022].

Available: https://medium.com/zkcapital/ipfs-the-distributed-web-e21a5496d32d.
9. Non-fungible token [online] [accessed 05.11.2022].

Available: https://en.wikipedia.org/wiki/Non-fungible_token.
10. Pin files using IPFS [online] [accessed 12.11.2022].

Available: https://docs.ipfs.tech/how-to/pin-files/#three-kinds-of-pins.
11. Proof of Stake [online] [accessed 15.11.2022].

Available: https://en.wikipedia.org/wiki/Proof_of_stake.
12. Proof of work [online] [accessed 15.10.2022].

Available: https://en.wikipedia.org/wiki/Proof_of_work.
13. Public-key cryptography [online] [accessed 18.11.2022].

Available: https://en.wikipedia.org/wiki/Public-key_cryptography.
14. Satoshi Nakamoto [online] [accessed 10.10.2022].

Available: https://en.wikipedia.org/wiki/Satoshi_Nakamoto.
15. Smart Contracts [online] [accessed 11.10.2022].
Available: https://en.wikipedia.org/wiki/Smart_contract.
16. Solidity programming language [online] [accessed 07.11.2022].

Available: https://docs.soliditylang.org/en/v0.8.17/.
17. Sybil Attack [online] [accessed 05.11.2022].

Available: https://www.imperva.com/learn/application-security/sybil-attack.
18. Symmetric vs. Asymmetric Encryption [online] [accessed 18.11.2022].

Available: https://www.ssl2buy.com/wiki/symmetric-vs-asymmetric-encryption-what-are-differences.
19. Symmetric-key algorithm [online] [accessed 18.11.2022].

Available: https://en.wikipedia.org/wiki/Symmetric-key_algorithm.
20. The Complete Open-Source and Business Software Platform [online] [accessed 12.11.2022].
Available: https://sourceforge.net/.
Appendix A
The Smart Contract
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.7;

error NotOwnerError();

contract Ledger {
    address public immutable i_owner;
    address[] public users;
    mapping(address => uint256) public s_donatorToDonatedAmounts;
    mapping(string => address) public s_cidToAddress;
    mapping(address => string[]) public s_addressToOwnedCids;

    event NewCidRegistered(address ownerAddress, string cid);
    event DonationsWithdrawal();
    event NewDonation();

    constructor() {
        i_owner = msg.sender;
    }

    function publishCid(string memory cid) public {
        s_cidToAddress[cid] = msg.sender;
        s_addressToOwnedCids[msg.sender].push(cid);
        users.push(msg.sender);
        emit NewCidRegistered(msg.sender, cid);
    }

    function getPublishedCids() public view returns (string[] memory) {
        return s_addressToOwnedCids[msg.sender];
    }

    function getPublishedCidsByUser(address userAddress)
        public
        view
        returns (string[] memory)
    {
        return s_addressToOwnedCids[userAddress];
    }

    function getOwnerOfCid(string memory cid) public view returns (address) {
        return s_cidToAddress[cid];
    }

    function withdraw() public onlyOwner {
        emit DonationsWithdrawal();
        (bool success, ) = payable(msg.sender).call{
            value: address(this).balance
        }("");
        require(success, "Call failed!");
    }

    // called when no calldata is specified
    receive() external payable {
        s_donatorToDonatedAmounts[msg.sender] += msg.value;
        emit NewDonation();
    }

    // called when the function from calldata is not found
    fallback() external payable {
        if (msg.value > 0) {
            revert("Fallback");
        }
    }

    modifier onlyOwner() {
        if (msg.sender != i_owner) {
            revert NotOwnerError();
        }
        _;
    }
}
Listing A.0.1: The Smart Contract