
Contents

Foreword by Andy Stanford-Clark xv

Foreword by Alexandra Deschamps-Sonsino xvii

Introduction by Stefan Grasmann xix

Preface by Boris Adryan xxiii


Reasons for This Book . . . . . . . . . . . . . . . . . . . . . . . xxiii
How to Navigate This Book . . . . . . . . . . . . . . . . . . . . xxv

Acknowledgments xxxi

I Physical Principles and Information 1


Chapter 1 Electricity and Electromagnetism 3
1.1 Matter, Elements and Atoms . . . . . . . . . . . . . . . . 4
1.1.1 Electron Configuration and Atomic Orbitals . . . 5
1.1.2 Conductors and Semiconductors . . . . . . . . . . 8
1.1.3 Electric Charge, Current and Voltage . . . . . . . 10
1.2 Electric and Magnetic Fields . . . . . . . . . . . . . . . 19
1.2.1 Magnets and Magnetism . . . . . . . . . . . . . . 19
1.2.2 Interactions of Electric and Magnetic Fields . . . 20
1.2.3 Electromagnetic Spectrum . . . . . . . . . . . . . 23

Chapter 2 Electronics 45
2.1 Components . . . . . . . . . . . . . . . . . . . . . . . . 45
2.1.1 Passive Components . . . . . . . . . . . . . . . . 46


2.1.2 Active Components . . . . . . . . . . . . . . . . 53


2.2 Analogue and Digital Circuits . . . . . . . . . . . . . . . 63
2.2.1 Logic Gates . . . . . . . . . . . . . . . . . . . . . 63
2.2.2 Memory . . . . . . . . . . . . . . . . . . . . . . 63
2.2.3 Binary Calculations . . . . . . . . . . . . . . . . 66
2.2.4 Logic Chips . . . . . . . . . . . . . . . . . . . . 69
2.3 Programmable Computers . . . . . . . . . . . . . . . . . 69
2.3.1 Field-Programmable Gate Arrays . . . . . . . . . 71
2.3.2 Microcontrollers . . . . . . . . . . . . . . . . . . 73
2.3.3 Multipurpose Computers . . . . . . . . . . . . . 74

Chapter 3 Information Theory and Computing 75


3.1 Information Content . . . . . . . . . . . . . . . . . . . . 75
3.2 A/D and D/A Conversion . . . . . . . . . . . . . . . . . 76
3.3 Digital Signal Processing . . . . . . . . . . . . . . . . . 80
3.4 Computability . . . . . . . . . . . . . . . . . . . . . . . 81

II Historical Perspective of the Internet of Things 85


Chapter 4 50 Years of Networking 87
4.1 The Early Internet . . . . . . . . . . . . . . . . . . . . . 87
4.2 World Wide Web and Web 2.0 . . . . . . . . . . . . . . . 90
4.2.1 World Wide Web . . . . . . . . . . . . . . . . . . 91
4.2.2 Web 2.0 . . . . . . . . . . . . . . . . . . . . . . 92
4.3 Connecting Things . . . . . . . . . . . . . . . . . . . . . 92
4.3.1 Industrial Control Systems . . . . . . . . . . . . . 92
4.3.2 The Internet of Things . . . . . . . . . . . . . . . 93

III Applications of M2M and IoT 95


Chapter 5 The Difference Between M2M and IoT 97

Chapter 6 Common Themes Around IoT Ecosystems 101


6.1 Industry . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.1.1 Smart Energy . . . . . . . . . . . . . . . . . . . 105
6.1.2 Smart Manufacturing . . . . . . . . . . . . . . . 107
6.1.3 Smart Retail . . . . . . . . . . . . . . . . . . . . 110

6.1.4 Agriculture . . . . . . . . . . . . . . . . . . . . . 110


6.2 Cities and Municipalities . . . . . . . . . . . . . . . . . 112
6.2.1 Energy, Gas and Water . . . . . . . . . . . . . . . 112
6.2.2 Environment . . . . . . . . . . . . . . . . . . . . 113
6.2.3 Traffic . . . . . . . . . . . . . . . . . . . . . . . 114
6.2.4 Security and Safety . . . . . . . . . . . . . . . . 115
6.2.5 Summary . . . . . . . . . . . . . . . . . . . . . . 116
6.3 Connected Vehicle . . . . . . . . . . . . . . . . . . . . . 116
6.4 Smart Buildings and Assisted Living . . . . . . . . 118
6.4.1 Smart Buildings . . . . . . . . . . . . . . . . 118
6.4.2 Assisted Living . . . . . . . . . . . . . . . . 120

Chapter 7 Drivers and Limitations 123


7.1 Drivers for Adoption . . . . . . . . . . . . . . . . . . . . 123
7.2 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . 124

IV Architectures of M2M and IoT Solutions 127


Chapter 8 Components of M2M and IoT Solutions 129
8.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 129
8.2 Sensors and Actuators . . . . . . . . . . . . . . . . . . . 130
8.3 Gateways and Hub Devices . . . . . . . . . . . . . . . . 132
8.4 Cloud and Data Platforms . . . . . . . . . . . . . . . . . 134

Chapter 9 Architectural Considerations 137


9.1 Network Topologies . . . . . . . . . . . . . . . . . . . . 137
9.2 Spatial Dimensions of Networking . . . . . . . . . . . . 139

Chapter 10 Common IoT Architectures 141


10.1 Mesh Networks . . . . . . . . . . . . . . . . . . . . . . 141
10.2 Local Gateway . . . . . . . . . . . . . . . . . . . . . . . 143
10.3 Direct Connection . . . . . . . . . . . . . . . . . . . . . 145

Chapter 11 Human Interfaces 147


11.1 User Experience and Interfaces . . . . . . . . . . . . . . 147
11.2 Mobile Phones and End Devices . . . . . . . . . . . . . 149

V Hardware 153
Chapter 12 Hardware Development 155

Chapter 13 Power 159


13.1 Constraints of Field-Deployed Devices . . . . . . . . . . 160
13.2 Power Adapters . . . . . . . . . . . . . . . . . . . . . . 160
13.2.1 Conventional AC/DC Adapters . . . . . . . . . . 160
13.2.2 USB . . . . . . . . . . . . . . . . . . . . . . . . 162
13.2.3 PoE . . . . . . . . . . . . . . . . . . . . . . . . . 162
13.3 Batteries . . . . . . . . . . . . . . . . . . . . . . . . . . 163
13.3.1 Battery Chemistry . . . . . . . . . . . . . . . . . 164
13.3.2 Rechargeable Batteries . . . . . . . . . . . . . . . 167
13.3.3 Battery Types and Real-Life Properties . . . . . . 170
13.4 Renewable Energy Sources . . . . . . . . . . . . . . . . 173
13.4.1 Solar Panels . . . . . . . . . . . . . . . . . . . . 174
13.4.2 Energy Harvesting . . . . . . . . . . . . . . . . . 176

Chapter 14 Actuators 177


14.1 From Buzzers to Speakers (Sound) . . . . . . . . . . . . 177
14.2 From Indicator Lights to Displays (Light) . . . . . . . . . 178
14.3 From Vibration to Rotation to Switching (Motion) . . . . 180
14.3.1 Vibration and Piezoelectric Motors . . . . . . . . 180
14.3.2 Solenoids and Electromagnetic Motors . . . . . . 180
14.3.3 Relays . . . . . . . . . . . . . . . . . . . . . . . 183
14.4 Other Forms of Energy . . . . . . . . . . . . . . . . . . 185

Chapter 15 Sensors 187


15.1 Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
15.2 Location . . . . . . . . . . . . . . . . . . . . . . . . . . 188
15.2.1 Global Localization . . . . . . . . . . . . . . . . 189
15.2.2 Indoor Localization . . . . . . . . . . . . . . . . 190
15.3 Physical Triggers . . . . . . . . . . . . . . . . . . . . . . 191
15.3.1 Position, Motion and Acceleration . . . . . . . . 191
15.3.2 Force and Pressure . . . . . . . . . . . . . . . . . 193
15.3.3 Light and Sound . . . . . . . . . . . . . . . . . . 195
15.3.4 Temperature . . . . . . . . . . . . . . . . . . . . 197
15.3.5 Current . . . . . . . . . . . . . . . . . . . . . . . 200
15.4 Chemical Triggers . . . . . . . . . . . . . . . . . . . . . 201

15.4.1 Solid Particles . . . . . . . . . . . . . . . . . . . 201


15.4.2 Humidity . . . . . . . . . . . . . . . . . . . . . . 203
15.4.3 pH and Other Ion-Specific Indicators . . . . . . . 205
15.4.4 Alkanes, Alcohols and Amines . . . . . . . . . . 206

Chapter 16 Embedded Systems 207


16.1 Microcontrollers . . . . . . . . . . . . . . . . . . . . . . 208
16.1.1 Architectures . . . . . . . . . . . . . . . . . . . . 210
16.1.2 Power Consumption . . . . . . . . . . . . . . . . 211
16.1.3 Input-Output Capability . . . . . . . . . . . . . . 211
16.1.4 Operating Systems and Programming . . . . . . . 212

VI Device Communication 213


Chapter 17 Communication Models 215
17.1 Open Systems Interconnection Reference Model . . . . . 216
17.1.1 Layer 1: Physical . . . . . . . . . . . . . . . . . . 216
17.1.2 Layer 2: Data Link . . . . . . . . . . . . . . . . . 216
17.1.3 Layer 3: Network . . . . . . . . . . . . . . . . . 218
17.1.4 Layer 4: Transport . . . . . . . . . . . . . . . . . 218
17.1.5 Layers 5–7: Session, Presentation, Application . 218
17.2 Transmission Control Protocol/Internet Protocol Model . 219

Chapter 18 Information Encoding and Standard Quantities 221


18.1 Coding Schemes . . . . . . . . . . . . . . . . . . . . . . 221
18.2 Information Quantities . . . . . . . . . . . . . . . . . . . 222
18.3 Information Encoding . . . . . . . . . . . . . . . . . . . 223

Chapter 19 Industry Standards 225


19.1 Hardware Interfaces . . . . . . . . . . . . . . . . . . . . 226
19.1.1 Communication Principles . . . . . . . . . . . . . 226
19.1.2 Serial/UART . . . . . . . . . . . . . . . . . . . . 228
19.1.3 Serial Buses . . . . . . . . . . . . . . . . . . . . 229
19.1.4 Joint Test Action Group . . . . . . . . . . . . . . 233
19.1.5 Universal Serial Bus . . . . . . . . . . . . . . . . 233
19.2 Longer-Range Wired Communications . . . . . . . . . . 235
19.2.1 Fieldbus Systems . . . . . . . . . . . . . . . . . 238
19.2.2 Ethernet . . . . . . . . . . . . . . . . . . . . . . 248

19.2.3 Powerline . . . . . . . . . . . . . . . . . . . . . 254


19.3 Wireless Standards . . . . . . . . . . . . . . . . . . . . . 255
19.3.1 Passive and Near-Field Radio . . . . . . . . . . . 261
19.3.2 Data Radio . . . . . . . . . . . . . . . . . . . . . 265
19.3.3 Cellular Data Services . . . . . . . . . . . . . . . 279
19.3.4 Satellite Communication . . . . . . . . . . . . . . 285

VII Software 289


Chapter 20 Introduction 291
20.1 Common Issues of Distributed Systems . . . . . . . . . . 292
20.1.1 The Fallacies of Distributed Computing . . . . . . 294
20.1.2 Identity and Openness of IoT Systems . . . . . . 294

Chapter 21 Embedded Software Development 297


21.1 Power Saving and Sleep Management . . . . . . . . . . . 298
21.2 Real-Time Requirements and Interrupts . . . . . . . . . . 299

Chapter 22 Network Protocols: Internet and IoT 301


22.1 Network Protocols . . . . . . . . . . . . . . . . . . . . . 302
22.2 Network Protocols in the Context of the OSI Model . . . 302
22.2.1 Advantages of a Layered Communication Protocol Model . . . . . . . . . 303
22.2.2 Vertical and Horizontal Communication within the OSI Model . . . . . . 304
22.2.3 Data Encapsulation . . . . . . . . . . . . . . . . 304
22.2.4 Indirect Connection and Message Routing . . . . 305
22.2.5 OSI Layers Revisited . . . . . . . . . . . . . . . 306
22.3 Internet Protocol Suite . . . . . . . . . . . . . . . . . . . 309
22.3.1 TCP/IP and the OSI Model . . . . . . . . . . . . 309
22.3.2 Layers of TCP/IP Messaging . . . . . . . . . . . 310
22.3.3 Internet Protocol . . . . . . . . . . . . . . . . . . 311
22.3.4 TCP . . . . . . . . . . . . . . . . . . . . . . . . 315
22.3.5 UDP . . . . . . . . . . . . . . . . . . . . . . . . 317
22.3.6 Ports . . . . . . . . . . . . . . . . . . . . . . . . 318
22.4 HTTP and HTTP/2 . . . . . . . . . . . . . . . . . . . . . 319
22.4.1 HTTP Methods . . . . . . . . . . . . . . . . . . 320
22.4.2 HTTP/2.0 . . . . . . . . . . . . . . . . . . . . . 320

22.4.3 HTTP Authentication . . . . . . . . . . . . . . . 323


22.4.4 RESTful APIs . . . . . . . . . . . . . . . . . . . 324
22.4.5 HTTP for IoT Communication . . . . . . . . . . 324
22.5 CoAP . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
22.5.1 UDP as Transport Protocol . . . . . . . . . . . . 326
22.5.2 Protocol Features . . . . . . . . . . . . . . . . . 326
22.5.3 Use Cases . . . . . . . . . . . . . . . . . . . . . 327
22.5.4 CoAP Discovery . . . . . . . . . . . . . . . . . . 328
22.5.5 Comparison to HTTP . . . . . . . . . . . . . . . 328
22.6 XMPP . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
22.6.1 Protocol Features . . . . . . . . . . . . . . . . . 329
22.6.2 XMPP as an IoT Protocol . . . . . . . . . . . . . 331
22.6.3 Use Cases . . . . . . . . . . . . . . . . . . . . . 331
22.7 AMQP . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
22.7.1 Characteristics of AMQP . . . . . . . . . . . . . 332
22.7.2 Basic Concepts . . . . . . . . . . . . . . . . . . . 333
22.7.3 Protocol Features . . . . . . . . . . . . . . . . . 334
22.7.4 AMQP for the Internet of Things . . . . . . . . . 335
22.7.5 AMQP 0.9.1 vs 1.0 . . . . . . . . . . . . . . . . 336
22.7.6 Use Cases . . . . . . . . . . . . . . . . . . . . . 336
22.8 MQTT . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
22.8.1 Publish/Subscribe . . . . . . . . . . . . . . . . . 337
22.8.2 Protocol Characteristics . . . . . . . . . . . . . . 339
22.8.3 Features . . . . . . . . . . . . . . . . . . . . . . 340
22.8.4 Use Cases . . . . . . . . . . . . . . . . . . . . . 341
22.9 Other Protocols . . . . . . . . . . . . . . . . . . . . . . 341
22.10 Choosing an IoT Protocol . . . . . . . . . . . . . . 342

Chapter 23 Backend Software 345


23.1 IoT Platform Services . . . . . . . . . . . . . . . . . . . 345
23.2 Functions of an IoT Backend . . . . . . . . . . . . . . . 347
23.2.1 Message Handling . . . . . . . . . . . . . . . . . 347
23.2.2 Storage . . . . . . . . . . . . . . . . . . . . . . . 349

Chapter 24 Data Analytics 353


24.1 Why, When and Where of IoT Analytics . . . . . . . . . 354
24.2 Exemplary Methods for Data Analytics . . . . . . . . . . 355
24.2.1 Exemplary Methods for Edge Processing . . . . . 357

24.2.2 Exemplary Methods for Stream Processing . . . . 359


24.2.3 Exemplary Methods for Batch Processing . . . . . 361

Chapter 25 Conceptual Interoperability 371


25.1 Device Catalogs and Information Models . . . . . . . . . 373
25.2 Ontologies . . . . . . . . . . . . . . . . . . . . . . . . . 374
25.2.1 Structure and Reasoning . . . . . . . . . . . . . . 376
25.2.2 Building and Annotation . . . . . . . . . . . . . . 376

VIII Security 379


Chapter 26 Security and the Internet of Things 381
26.1 Boundaries . . . . . . . . . . . . . . . . . . . . . . . . . 382
26.2 Other Attacks . . . . . . . . . . . . . . . . . . . . . . . 383
26.3 The Fundamentals of Security . . . . . . . . . . . . . . . 385
26.3.1 Confidentiality . . . . . . . . . . . . . . . . . . . 385
26.3.2 Integrity . . . . . . . . . . . . . . . . . . . . . . 386
26.3.3 Availability . . . . . . . . . . . . . . . . . . . . . 386
26.3.4 CIA+ . . . . . . . . . . . . . . . . . . . . . . . . 387
26.3.5 Authentication . . . . . . . . . . . . . . . . . . . 387
26.4 Access Control . . . . . . . . . . . . . . . . . . . . . . . 389
26.5 Non-Repudiation . . . . . . . . . . . . . . . . . . . . . . 391

Chapter 27 A Beginner’s Guide to Encryption 393


27.1 Shared Key Encryption . . . . . . . . . . . . . . . . . . 393
27.2 Public Key Cryptography . . . . . . . . . . . . . . . . . 395
27.2.1 Prime Numbers and Elliptic Curves . . . . . . . . 398
27.2.2 Man-in-the-Middle Attacks . . . . . . . . . . . . 398
27.2.3 Certificates and Certificate Authorities . . . . . . 400
27.2.4 Transport Layer Security . . . . . . . . . . . . . 401
27.2.5 An Example TLS Handshake . . . . . . . . . . . 402
27.2.6 Datagram Transport Layer Security . . . . . . . . 404
27.3 Cryptography on Small Devices . . . . . . . . . . . . . . 405

Chapter 28 Threats, Challenges, and Concerns for IoT Security and Privacy 407
28.1 A1: Device Confidentiality . . . . . . . . . . . . . . . . 407
28.2 B1: Network Confidentiality . . . . . . . . . . . . . . . . 409
28.3 C1: Cloud/Server Confidentiality . . . . . . . . . . . . . 411

28.4 A2: Hardware Integrity . . . . . . . . . . . . . . . . . . 412


28.5 B2: Network Integrity . . . . . . . . . . . . . . . . . . . 413
28.6 C2: Cloud/Server Integrity . . . . . . . . . . . . . . . . . 414
28.7 A3: Device Availability . . . . . . . . . . . . . . . . . . 414
28.8 B3: Network Availability . . . . . . . . . . . . . . . . . 414
28.9 C3: Cloud/Server Availability . . . . . . . . . . . . . . . 415
28.10 A4: Device Authentication . . . . . . . . . . . . . . . . 415
28.11 B4: Network Authentication . . . . . . . . . . . . . . . 415
28.12 C4: Cloud/Server Authentication . . . . . . . . . . . . . 416
28.13 A5: Device Access Control . . . . . . . . . . . . . . . . 417
28.14 B5: Network Access Control . . . . . . . . . . . . . . . 418
28.15 C5: Cloud/Server Access Control . . . . . . . . . . . . . 418
28.16 A6: Device Non-Repudiation . . . . . . . . . . . . . . . 419
28.17 B6: Network Non-Repudiation . . . . . . . . . . . . . . 419
28.18 C6: Cloud/Server Non-Repudiation . . . . . . . . . . . . 420
28.19 Summary of the Threat Matrix . . . . . . . . . . . . . . 420

Chapter 29 Building Secure IoT Systems 423


29.1 How to Do Better . . . . . . . . . . . . . . . . . . . . . 423
29.1.1 Device Registration . . . . . . . . . . . . . . . . 424
29.1.2 Device Identity System . . . . . . . . . . . . . . 426
29.1.3 Personal Cloud Middleware . . . . . . . . . . . . 426
29.1.4 Pseudonymous Data Sharing . . . . . . . . . . . 427
29.2 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . 428
29.3 Principles of IoT Security . . . . . . . . . . . . . . . . . 428

About the Authors 431

Index 433
Foreword by Andy Stanford-Clark

When it comes to the Internet of Things, a lot of people just get on and do it. Other
people need to know more of the detail about the how and the why of the elements
that make up that broad spectrum of technologies we know as the Internet of Things.
If you are in the latter camp, this book is for you.
Spanning several gamuts in great detail, this book starts with the fundamental
particles that make up our world, covering the spins of electrons and leading
into electricity, electronics and digital logic. This introduces computers of various
shapes and sizes, and justifies the software that runs on them. Software brings data
analytics, applications, products and services, and all of these together enable the
creation of Internet of Things solutions.
This book primarily caters to those who are interested in researching, developing and building the Internet of Things, but who appreciate the scientific and technical foundations as well. It provides a detailed grounding in electricity, electronics, information processing and the many building blocks of modern devices.
Here you will learn about everything from the fundamental properties of matter through to an exploration of security and privacy. To appreciate the options available to you in the end-to-end architectures from device to application via cloud and networks, you will go on a journey from flip-flops to teraflops, from magnetic fields to fieldbus.
This book will satisfy the curiosity of the most curious, and it provides a solid
foundation for undergraduate study. Ultimately, in the words of the authors, “the
book may be interesting for those who simply use the Internet of Things, and
wonder why it takes so long to develop good technical solutions and services”.

Andy Stanford-Clark
IBM Distinguished Engineer for IoT

Foreword by Alexandra Deschamps-Sonsino

The Internet of Things seems to many in 2017 like an unfulfilled promise, an oasis
on the horizon at the end of a long walk in the desert of ideas. Kevin Ashton's
expression has, for many, yet to make good on its promise. The Internet is vast,
and so are things. The choice of technologies to use (connectivity, hardware) and
the things to connect (in our homes, at work, in our cities) are not infinite, but the
possibilities are nearly endless. At the same time, we're running out of patience
as designers and product people who like things to be made real, to be here now, at any
cost. We're also confused as to what success looks like with the Internet of Things.
Is it good that everything is connected all the time, regardless of our knowledge and
our understanding? Is it better if we're involved in some decision making? What if
that connectivity does us harm?
These are not just ethical questions; they are the bread and butter of an IoT
developer, designer or entrepreneur.

The Technical Foundations of IoT goes beyond any publication before it to
educate those who would otherwise limit their understanding of the Internet of Things
to “hardware is hard”. It is, but that doesn't mean you can't learn something and build
something with your eyes wide open to the possibilities and challenges, and with the
ability to avoid the common pitfalls.
I envy anyone coming to this book to understand the Internet of Things
for the first time. This is a comprehensive exploration of every technical aspect
of the field that never talks down to its audience. Startups and small businesses
starting to plan their technical journey will find a framework, an approach and
solutions here. If that's not your cup of tea and you're on the management team,
at least you'll know how to talk to your engineers about their proposal and to
your technical team about their proposed solution, and generally educate yourself.
That’s of vital importance now as we see political leaders incapable of talking about
security without getting the basics wrong. The general public will start to ask for
the impossible from technologists and connected product startups if we don’t build


up a good public argument for why we build things in particular ways. This book
will help you build that argument for your team and for others too.
Ubiquitous connectivity of devices, objects and services is not a dream but a
reality made possible if you’ve got the right tools. This is a great tool to start with.

Alex Deschamps-Sonsino
Designswarm,
Inventor of the
Good Night Lamp
Introduction by Stefan Grasmann

The business potential of the Internet of Things (IoT) is a big topic and has been
the subject of many publications in recent years. I have had the pleasure of working
with different customers in several verticals for the past ten years as an IoT solution
provider, and I have found the following reasons the most convincing:

Product manufacturers name “customer intimacy” as one of their key motivators
for diving into IoT. They are no longer satisfied to just sell a product. It is
equally important to them that their customers are happy using that product and
recommend its usage. That is why manufacturers want to track how their product
is used, and why they want to establish a channel to their end users and learn for
the future. IoT enables a very attractive after-sales service business. Many product
companies even strive to transform into product and service companies. IoT enables
that transformation. As a side effect, manufacturers can skip the resellers that
usually sit between them and their end customers and that potentially blur their view of
the real needs of the customer.
A related point is that IoT solutions usually create benefits for many different
stakeholders. Let's take an example from consumer IoT. If you buy a connected
body scale, then you as the user have an immediate benefit: you not only see your
daily weight, but you are also served an app that shows you a diagram of your
historical development and achievements. As described above, the vendors of the
scale benefit equally, because they now have direct contact with you as their user.
They know how often and for how long you use their product. They learn about
your goals. They learn about the limits of their product. They additionally gather
lots of data with which to develop a better next product. Finally, they collect
statistical data that might have great value for completely different players; in this
case, insurance companies or health authorities.
The arguments above already apply if you connect just one product to its
manufacturer and invent new services. Consider the business potential that arises


through the network effect if we connect several different things and vertical networks.
We already experienced this effect when the World Wide Web created new
connections for digital information, and when mobile phones and social media
platforms digitally connected everyone. It is easy to imagine the potential if we also
interconnect all the things around us. Add the fact that IoT spans nearly all
market verticals. Analysts see IoT potential everywhere, from industrial predictive
maintenance to commercial IoT, from connected vehicles to continuous surveillance
of patients, from smart grids and smart cities to smart homes.

However, this enormous business potential creates the following challenges
for engineering teams that build IoT solutions:

• Every customer wants to be the first in the market. Time to market is
extremely ambitious. The pressure to deliver fast is very high. There is rarely
time for experiments.

• IoT solutions usually try to create a new market or reshape an existing one. That
means there is no fixed product vision and no certainty that the solution
will succeed. Your market will change during development, and you need to
adapt quickly. Lean and agile development is crucial for success.

• Security and privacy are silent prerequisites. Few stakeholders talk about
them, but everybody takes them for granted. Engineers need to come up with
a solid approach to incorporate security and privacy into the foundations of
their solutions. The problem: securing a solution usually takes time, in strong
contradiction to the market requirements.
In consequence, IoT engineers need a solid understanding of the involved technologies
to cope with these challenges. These technologies are very broad and hard to
grasp as a whole because they cover many disciplines. That is why I think this
book will play a crucial role in the upcoming years. IoT will need lots of motivated,
well-trained engineers who create solutions that fulfill the business promises I
described above. The business drivers are enormous. So are the technological dangers if
those solutions are created shortsightedly.

Boris Adryan, Dominik Obermaier and Paul Fremantle tackle all
these crucial aspects at their root. They created this book to provide a common
technological ground for newcomers and experienced developers who want to understand
this field holistically.
Having worked in this domain for the last ten years, I wholeheartedly say to
you: read, understand and use this book as your technical IoT foundation.

Stefan Grasmann
Managing Director, Competence Center
Zühlke Engineering GmbH

Preface by Boris Adryan

The Internet of Things. Imagine a world in which everyday devices are connected
to one another and where they can exchange and integrate data for the benefit of the
world and the people around them. It’s a world in which the Internet acts as glue
between these devices, and the combination of their data allows actionable insight
that is greater than the sum of information that they yield individually.
This is probably one of the broadest and most generic definitions of the
Internet of Things (IoT) that you can find. Some of you may already perceive
our world as this place of ubiquitous computing and information exchange, while
for others the degree of connectivity and data integration isn't going far enough.
You may not call it the Internet of Things; you could say machine-to-machine (M2M)
communication, or you could even call it a marketing hype. But whatever your term for
it, it is a technological trend that is going to require the rethinking of many products
and services. As such, if you think of yourself as a creator of the future, you should
be prepared and have an understanding of the building blocks that make up the Internet
of Things.

REASONS FOR THIS BOOK

Analogies between the real world and the Internet of Things. My formal background
is in biology, a field ranging in scale from interactions between molecules to entire
ecosystems on this planet. While nobody can comprehend the entire wealth of
knowledge we have about the living world today, scholars and researchers in the
many areas of biology have long tried to write domain-specific textbooks to help
educate the following academic generations. University studies these days allow for
a considerable amount of specialization, and while I cannot even distinguish the
types of grass on a wild meadow, I successfully ran a research group for
molecular genetics for many years. For my final examinations I was expected to
know whatever there was in Alberts (Molecular Biology of the Cell), Stryer
(Biochemistry), Gilbert (Developmental Biology) and a few others, most of which


were easily near 1,000 pages or more. It is this degree of specialization that allows
graduates to go on to do PhDs in academia, join industrial R&D departments, or at
least demonstrate an aptitude for learning. However, to avoid the loss of the bigger
picture and to enable us to give our detailed knowledge the appropriate context,
during my studies we were always referred back to Neil A. Campbell’s Biology. At
a mere 1,400 pages, this book provided the framework for whichever new aspect
of biology we were to explore. Starting out with the basic chemistry required to
understand the composition and interactions of biomolecules, it guided us through
the different levels of biological complexity, from molecules to cells, from cells
to tissues, from tissues to organisms, and from organisms to ecosystems. Now
think of humans alone. What is our evolutionary origin? What are the principles
of evolution? How do we develop from a fertilized egg into an electrical engineer?
How do active decisions and physiological reactions control our very existence? It
soon becomes clear that even a great textbook like Campbell’s can only provide
a core vocabulary of a subject. Nevertheless, it is this vocabulary that allows
students today to explore different fields of biology, but also to go back to the very
basics and first principles that may give them an appreciation of the entire subject.
The Internet of Things is a result of human creation. Nevertheless, in many
ways it is a complex system, as rich in facets as life itself. It builds on physical
principles; it has hardware components that we can touch and software that enables
that hardware to function; and, once connected to the Internet, its devices interact just
as organisms do in an ecosystem, and these ecosystems constantly evolve. The
undoubtedly complicated mechanisms of the market, the politics of industry consortia
and monetisation models aside, the technology around the Internet of Things spans
scales and metrics from nanoamperes to zettabytes, from lithium thionyl chloride batteries
to information theory, from machine-centric protocols to seemingly enchanted consumer
products. Successful teams are often multidisciplinary, bringing together experts
from electrical engineering, computer science and product design, as well as specialists
from particular verticals such as manufacturing, transport, energy or health. While
each of these professions has its own training, and as academic disciplines they
may exist as individual entities, their interaction in the Internet of Things is novel
and not yet well reflected in the literature.
This book aspires to provide an overview of the technical aspects of the Internet
of Things, offering more theory and background information than is usually
required for implementation. Current book recommendations with general applicability
comprise Adrian McEwen's Designing the Internet of Things (2014), Claire
Rowland's Designing Connected Products (2014) and, more recently, Dirk Slama's
Enterprise IoT: Strategies and Best Practices for Connected Products and Services
(2015). These are complemented by a vast array of hands-on manuals and beginners'
guides with titles along the lines of IoT with (fill in any popular hardware
platform, programming language, etc.). However, none of these books provides a
systematic introduction to the scientific basis and technical aspects of the Internet of
Things in the way that Campbell's Biology does for the life sciences, with more
detailed information than these introductory works offer. While we will see real-life
Internet of Things solutions in various verticals and focus on particular technical
challenges, we won't spend much time on commercialization strategies or business models.
This book primarily caters to those who are interested in researching, developing
and building the Internet of Things, but who also appreciate the scientific and
technical foundations. As our technical and service design decisions have an immediate
impact on people, we will also discuss security, privacy and ethical implications.
It is my conviction that people who develop the Internet of Things should be
able to explain the why and how behind the technology. Not only may a thorough
background enable us to provide better solutions to problems, but I think graduates
of any current degree course relevant to the Internet of Things should be able to
question claims and design decisions that are offered. We are “standing on the
shoulders of giants” as engineers, programmers and designers. However, we should
not accept the status quo as a religious belief but have the knowledge to go back to
the basics. If the Internet of Things is going to become as ubiquitous and prevalent
as predicted by industry analysts, we are going to shape the future of this planet with
this technology. While it is convenient to ignore aspects that are not immediately
relevant for our day-to-day work, treating them as if they worked by magic is a dangerous
stance. As future engineers, designers and creators of Internet-connected products
and services, we should have a basic understanding of electricity, electronics,
information processing and the many building blocks of modern devices. If you
disagree, this book is not for you.

HOW TO NAVIGATE THIS BOOK

In analogy to an undergraduate textbook for biology or chemistry, we will move
step-by-step from the physical principles of electricity and electromagnetic signals
to simple electronics (see Figure 1). We will see how digital circuits implement
boolean logic and how certain hardware designs allow the execution of prepro-
grammed commands. The meta-level of software allows us to explore basic ques-
tions of information processing and what computability means in theoretical and
practical terms. Following the historical development of computer networking, we
will see how the first experimental connection between computers led to the creation
of the Internet, and how the invention of the World Wide Web triggered the enor-
mous growth of this ubiquitous infrastructure. From the first network-connected
devices and machine-to-machine communication, our journey continues to exam-
ples of the current Internet of Things. We will then see a few reference architectures
for the development of Internet-connected infrastructure. The hard- and software
chapters of this book will introduce the basic building blocks, both in terms of sen-
sor and actuator components that can be used to build Internet of Things solutions
as well as the protocols behind device communication and the Internet. We will
look at software on embedded and gateway devices, challenges in the design of
backend solutions, as well as data analytics and service design strategies. The book
concludes with security and privacy in the Internet of Things, one being primarily
a technical challenge, whereas privacy and user interaction are tightly linked to the
specific preferences of individual users.
This book was planned with a particular undergraduate course in mind. It was
thus primarily targeted at students who pursue studies that will ultimately allow
them to design hardware, software, products and/or services for the Internet of
Things, and I wanted them to have an appreciation of the technical and historical
dimensions of their subject. While it is written in a way that it could be read front-
to-back, I considered that many people who received a good STEM education at
school would want to skip over a few of the introductory chapters. The chapters
are therefore as much as possible self-contained, but references to other parts of the
book are presented where relevant.
While the emphasis in the initial phase of writing the book may have been on
students in higher education, it soon occurred to me that professionals already
working with the Internet of Things or enthusiastic amateurs may want to use it
to acquire some knowledge in aspects that were not traditional components of their
respective professional backgrounds. Although I hold the somewhat extreme view
that everyone working in the field should know about the technical and historical
foundations, I appreciate that not everyone has the time or interest in my highly
subjective summary of physics, computing and history.
Thus, if you’re coming from product or interface design, or another less
technical background, I suggest that you start reading around the core M2M and
IoT chapters in Parts IV to VIII: Why is the choice of batteries so important for
standalone sensors? What are the advantages of the various IoT data exchange
protocols over a simple HTTP connection? What is an attack surface and why
should we care about it in consumer products? These are practical questions and
in the process of addressing them the book will teach you vocabulary that is useful
in conversations with more technical professionals.

Figure 1 How to navigate this book. [Placeholder for the original two-page diagram.
Technical and historical foundations: Part I (Physical Principles and Information)
covers atoms, solid-state physics, electricity and electromagnetism ("radio");
electric components and binary logic as the foundations of electric computing; and
information and codes for understanding the analogue world, spanning a spatial
dimension from atoms to devices. Part II (History of the Internet) runs from the
Cold War, SAGE and ARPANET through TCP/IP, the WWW and the dotcom boom to early
connected things such as the Trojan Room Coffee Pot and the Ambient Orb, a
historical perspective from 1930s computing to industrial control systems and
other "things" on the Internet. Core M2M and IoT: Part III (Applications of M2M
and IoT: industry, "smart" cities, infrastructure and the home, with a spatial
dimension from personal area networks and wearables to industrial ecosystems and
just-in-time delivery), Part IV (Architectures: challenges with wireless
technologies and gateway devices, from sensor/actuator via gateway device and
phone 'hub' to the Internet) and Part VIII (Security: why securing the IoT is
difficult and important, and how to secure it). Building blocks: Part V (Hardware:
power options, sensor types, actuators, embedded systems), Part VI (Device
Communication: communication models, information encoding, standard technologies)
and Part VII (Software: Internet and IoT protocols, embedded software, backend
software, data analytics, M2M/IoT interoperability). These hardware and software
components are often seen in, but not restricted to, IoT devices.]
Hard- and software developers who want to grasp the complexity of the In-
ternet of Things are hopefully going to find the book useful as it can complement
their picture of the processes that are required to make an end-to-end solution work.
If you’re a programmer for embedded systems, it can be useful to understand the
limitations of backend software (see Chapter 23), whereas data analytics profes-
sionals may not be aware of the specific challenges power management poses for
edge processing. If your entry point into programming was not through studies
of computer science, the foundation chapters on digital logic and the history of
computing may shed some light on concepts you’ve learned to accept as given (see
Parts I and II). At the same time, if your hardware knowledge comes from self-
study and your response to most engineering questions is based on empiricism, it is
worth going back to the first chapters of the book that introduce physical forces and
electromagnetic signals (Part I). While electrical engineers and hardware specialists
are often assumed to just know how stuff works, they may find useful information in
the discussion of Internet of Things ecosystems (see Part III), and why distributed
systems like sensor networks pose challenges to the overall design logic (Part IV).
Ultimately, the book may be interesting for those who simply use the Internet
of Things, and wonder why it takes so long to develop good technical solutions and
services.

Boris Adryan
June 2017
Acknowledgments

The authors wish to thank Aileen Storry from Artech House (UK) for believing in
this project and for her overall coordination.
We wish to thank Mark Setrem for critical proofreading of Chapters 1–11,
Toby Jaffey for useful suggestions and amendments to Chapters 4–11, and three
anonymous reviewers for editorial advice with respect to the overall manuscript. We
apologize that many good suggestions could not be taken into account for the
sake of time and brevity.
We further acknowledge Andy Stanford-Clark for the Foreword and Stefan
Grasmann for a very general introduction into the business opportunities provided
by the Internet of Things. Last but not least, a special thank you goes to Alexandra
Deschamps-Sonsino, both for writing a foreword as well as for organizing the
Internet of Things Meetup Group in London, which is a true melting pot of talent.
Many of the personal contacts (too many to be named) that fueled the writing of
this book were first made at one of these meetings.
BA: A big thank you to the Royal Society (UK), whose generous University
Research Fellowship enabled me to embark on the exciting journey from develop-
mental biology to the Internet of Things. I would also like to express my gratitude to
the IoT communities in London and Cambridge for many hours of fun and geekery.
In particular, IBM’s Node-RED team Nick O’Leary and Dave “CJ” Conway-Jones,
and Andrew Lindsay for blog posts that have drawn me into technical experiments.
I would also like to thank James Governor and Fintan Ryan for putting me on stage
at thingmonk, a catalyst for my transition into commercial IoT.
I owe the biggest apology to Ilka, Finnegan, Frederick and Florentine for too many
weekends I couldn’t spend with you.
DO: I’d like to thank Boris, the mastermind of this book, for his guidance,
his patience, and his enthusiasm. I’m honored to be part of this opus. A huge
thank you goes out to Christian Götz, Christoph Schäbel and Florian Raschbichler
for proofreading my first attempts and drafts. My apologies to Claudia for all the

evenings and weekends we couldn’t spend together. Finally I want to thank Josef
and Monika Obermaier. I owe you everything.
PF: Thanks very much to Boris for inviting me to join him on this journey,
and for his enthusiasm and his drive to create such a thorough, deep and engaging
book. I hope I’ve kept up the standard. Big thanks to Ruchith Fernando and Prabath
Siriwardena for getting me excited about Identity, and to Benjamin Aziz for his
help during my PhD work. Finally, thanks to Jane, Anna and Dan who inspire me
in everything I do.
Part I

Physical Principles and Information
Chapter 1
Electricity and Electromagnetism

Every new development in information technology is ultimately an exploitation of
the laws of physics. The execution of even the most complex software boils down to
the state change of semiconducting material in a processor, and that is nothing more
than pushing electrons around. The exchange of signals among wireless devices
is based on the controlled emission and interpretation of electromagnetic waves.
Sound from speakers, rotating motors and the shining of light: all of this is the
result of electricity and magnetism.
Information technology and computer science themselves have introduced
meta-levels: using what is in essence just applied mathematics, we can assess
whether problems can in principle be solved with a computer or whether they require
impossible amounts of calculation. The efficiency of an algorithm can be determined in the absence of a
computer and completely without any knowledge about its implementation. The
laws of binary logic hold true no matter if we execute code on a set of transistors or
the latest multicore processor.
Most people treat everyday devices as black boxes. This attitude even extends
into the professional context, for a good reason: We cannot constantly go back
and think about the basics. Under normal circumstances, product designers do
not need to know software development and boolean logic, programmers don’t
need to engage with the intricacies of processor architectures and the electrical
engineers who are putting together different functionalities on a single chip may
not have to engage with Ohm’s law (although they hopefully still remember it).
Physics and information theory are omnipresent, yet usually we do not let ourselves
be distracted by the details not immediately relevant to our work. However, as
product designers, engineers and scientists, we should at least have a notion of the
underlying principles.


In this chapter, we will be looking at electricity and electromagnetism. It can
by no means replace a solid foundation in mathematics, physics and chemistry.
Rather it should be understood as a guide to the appropriate vocabulary and concepts
that underlie any modern technology, giving an opportunity for further studies
where needed.

1.1 MATTER, ELEMENTS AND ATOMS

All matter is made up of chemical elements. There are more than 100 elements that
can be observed under certain experimental conditions, but only around 80 or so of
them occur as stable isotopes on Earth. The most common elements on this planet’s
crust are oxygen, silicon, aluminium, iron, calcium, sodium, magnesium and
potassium. On the level of soil and in the organismic world, however, these are
complemented by carbon, hydrogen, nitrogen, phosphorus and sulphur in considerable
amounts. Other elements such as copper, zinc, lead or lithium are comparatively
scarce. Some elements of relevance to electronics such as iridium are extremely
rare, being about two orders of magnitude less common than gold.
The periodic table (not shown) provides a summary of chemical elements.
They are ordered by increasing atomic number (i.e., the number of protons in the
core of the atom). The rows of the periodic table are called periods, whereas the
columns are called groups. There are dozens of characteristics for each chemical
element. Often, elements in the same group share similar properties, as they have
a similar electron configuration of the outer shell. In this book we are only
interested in the electron configuration (i.e., in which layers electrons are
coordinated around the core), as this influences their electrical properties.
displayed for each element are the symbol and name. Electrical conductivity (the
opposite of resistance, measured in mho/cm) and ionising energy (measured in
kJ/mol) give an indication of an element’s suitability as electrical conductor or
electron donor, a property important in the discussion of electrochemical batteries.
Conductivity values indicate, for example, that silver and copper are very good
conductors.
Elements are distinguishable from each other as their smallest divisible unit,
the atoms, have a different number of protons: It is typically the most prominent
number in the periodic table of elements and is commonly referred to as atomic
number. The protons, together with neutrons, are the physical makeup of atomic
nuclei. For learning about electricity and magnetism these are the smallest
components of matter that we may have to care about; the entities and forces that
hold the positively charged protons and the chargeless neutrons together are relevant

Figure 1.1 Bohr-Rutherford model and image of a hydrogen atom. (A) The Bohr-Rutherford model
still proposed that electrons circle the nuclear core like planets orbit the sun, although already adding
constraints known from nuclear physics in 1913. The energetic difference when an electron falls back
from an outer shell to an inner shell is compensated by emitting a photon of the energy ∆E = h · f.
(B) In 2013 a research team at the FOM Institute for Atomic and Molecular
Physics (AMOLF) in the Netherlands made the first atomic photograph of a hydrogen atom. (Image
courtesy of Aneta Stodolna, FOM.)

only for nuclear physics. The smallest atom (hydrogen) has a core diameter of 1.75
femtometres (1.75 · 10⁻¹⁵ m), with the core contributing only a tiny fraction to
the size of the total hydrogen atom with its electron shell. A single hydrogen atom is around
the smallest dimensions that can be resolved by X-ray or laser-based photography,
a structure of approximately half an Ångström (1 Å = 10⁻¹⁰ m) in diam-
eter. In neutral atoms, there is an equal number of protons and negatively charged
electrons, otherwise we are talking about ions. Most properties of materials that we
care about in the context of electronics (electricity, magnetism, light) are directly
linked to the presence and behavior of electrons. Nucleus and electrons make up
only a tiny proportion of the overall volume of an atom. It is amazing to imagine
that the solid appearance of many pure elements (at room temperature) is mostly
down to electrostatic forces that act between nucleus and electrons!

1.1.1 Electron Configuration and Atomic Orbitals

The diameter of an electron is roughly 1.5x that of a proton. The distribu-
tion of electrons around the nucleus is called electron configuration. Atom models of
the last century often depicted electrons on coplanar circular trajectories around the
nucleus, much like planets revolve around the much heavier sun. Refined ideas such
as the Bohr-Rutherford model (see Figure 1.1) departed from the idea of electrons
that purely follow the laws of classical mechanics:

• From all possible trajectories, only distinct orbits around the nucleus are
allowed.
• The change between orbits requires either the addition (absorption) or loss
(emission) of energy. For example, if an electron changes from an outer to
an inner shell, a photon of frequency f is emitted that is proportional to the
energy loss of the electron, ∆E = h · f, with h being the Planck constant.
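As a worked example (my own sketch, not taken from the book), the relation ∆E = h · f lets us compute the photon emitted when a hydrogen electron falls back from the second to the first shell, using the standard Rydberg energy levels for hydrogen:

```python
# Worked example (illustration, not from the book): the photon emitted when a
# hydrogen electron falls back from shell n=2 to shell n=1.
h = 6.626e-34      # Planck constant in J*s
eV = 1.602e-19     # one electronvolt in joules
c = 2.998e8        # speed of light in m/s

# Hydrogen energy levels (Rydberg formula): E_n = -13.6 eV / n^2
E2 = -13.6 / 2**2
E1 = -13.6 / 1**2
delta_E = (E2 - E1) * eV   # energy released: 10.2 eV, in joules

f = delta_E / h            # frequency, from delta_E = h * f
wavelength = c / f         # ~121.6 nm: the ultraviolet Lyman-alpha line
print(f"f = {f:.3e} Hz, wavelength = {wavelength * 1e9:.1f} nm")
```

The result, roughly 2.47 · 10¹⁵ Hz, illustrates why such transitions radiate in the ultraviolet rather than the visible range.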

On each energy level there is a maximum number of electrons that can carry
the potential energy to keep it in that shell. This electron capacity follows the rule
2 · n². The shells have historically been named K (n = 1), L (n = 2), M (n = 3),
N (n = 4), O (n = 5) and P (n = 6) from innermost to outermost. We now know
that atomic orbits are not circular and absolute. The Schrödinger equation defines a
wave function ψ that describes the probability of observing an electron at a given
spatiotemporal coordinate. The imaginary space in which the electron is likely to
occur is called orbital (see Table 1.1). Every orbital is associated with a particular
shape or spatial probability and can take up to two electrons (the Pauli exclusion
principle). With the increasing number of electrons per shell, at the same energy
level, spatially more complex wave functions are describing the orbitals: s (one
spatial type, sphere), p (three spatial types, dumbbells along the three main axes),
d (five spatial types) and f (seven spatial types).
Orbitals are the spaces of highest likelihood of encountering an electron. They
have characteristic shapes that are predicted by the wave function ψ. From the
innermost (K) atomic shell, n increases with the growing number of electrons,
as each shell can only carry 2 · n² electrons. The quantum number l denotes a rough
geometry of an orbital. While s orbitals (l = 0) only know one symmetry, the
quantum number m_l describes a particular spatial orientation. Each orbital can only
be occupied by two electrons.
We can characterize a single electron and communicate its address in the
theoretical model. Its shell, n, is also referred to as principal quantum number.
The orbital angular momentum quantum number l indicates an s- (l = 0), p- (l = 1),
d- (l = 2) or f-type (l = 3) orbital. As electrons are not static, their trajectory within
the orbital is expressed as m_l. Note that m_l = 0 describes symmetrical orbitals and
linear trajectories, while |m_l| > 0 refers to more complex spatial orientations in the
z-direction. States with |m_l| > 0 also possess higher potential energy, and in a strong
magnetic field m_l can be increased or decreased, depending on the orientation of the
field.

Table 1.1 Orbitals. [Reconstructed summary of the original table: for each principal
quantum number n, the available orbital types and their spatial orientations are
s (one spherical type), p (three types: p_z, p_x, p_y), d (five types: d_z², d_xz,
d_yz, d_xy, d_x²−y²) and f (seven types: f_z³, f_xz², f_yz², f_xyz, f_z(x²−y²),
f_x(x²−3y²), f_y(3x²−y²)); each orbital takes at most two electrons.]

The electrons themselves induce a magnetic moment that is dependent on m_l;


it is therefore also referred to as magnetic quantum number. Each orbital identified by
n, l and m_l can carry two electrons of opposite spin quantum numbers m_s = +1/2 (up)
or −1/2 (down), also contributing to the magnetic moment. It is important to keep in
mind that this is only a simple visual model of the complex mathematics underlying
quantum theory. In many applications, it is in fact sufficient to think about electrons
in the simple terms of the Bohr-Rutherford model, although in this book we may go
back and forth between these descriptions where it seems opportune.
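The counting rules behind this model (shells n, orbital geometries l, orientations m_l) can be enumerated in a few lines of code. This is my own sketch to make the bookkeeping concrete, not code from the book; it covers shells up to n = 4, where the s, p, d and f names suffice:

```python
# Enumerate the orbitals of a shell using the quantum-number rules from the
# text: for shell n, l runs from 0 to n-1, and m_l runs from -l to +l.
ORBITAL_NAMES = "spdf"  # l = 0, 1, 2, 3; sufficient for n <= 4

def orbitals(n):
    """Return all (n, orbital type, m_l) triples for a given shell n."""
    result = []
    for l in range(n):
        for m_l in range(-l, l + 1):
            result.append((n, ORBITAL_NAMES[l], m_l))
    return result

for n in (1, 2, 3):
    orbs = orbitals(n)
    # Two electrons per orbital reproduces the 2*n^2 capacity rule.
    print(f"n={n}: {len(orbs)} orbitals, capacity {2 * len(orbs)} electrons")
```

Running this prints capacities of 2, 8 and 18 electrons for the K, L and M shells, matching the 2 · n² rule stated above.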
Electrons of the outermost shell often participate in chemical reactions and
interactions between atoms. Electrons of some atoms can leave or be accepted,
while others can be permanently shared between atoms to form molecules. These
outer electrons are called valence electrons. For most main-group elements, these
are electrons of the s and p orbitals, while the so-called transition metals also con-
tribute d orbital electrons to chemical bonds. Valence electrons are of relevance to
us as they directly influence conductivity. Chemically inert elements and molecules
without free electrons show no conductivity. Metallic atoms have low ionization
energies (i.e., these electrons can leave the atom after addition of low amounts of
energy, such as in an electric field). This is a property that is linked to the immedi-
ate environment of the donor atom. For example, while elementary copper (Cu) is
an excellent conductor and the freed electrons can move along, following an ionic
reaction with chlorine (Cl) they would participate in a lattice and become part of an
ionic bond in a CuCl2 crystal, rendering the lattice an insulator.

1.1.2 Conductors and Semiconductors

While sharing electrons is a feature of ionic interactions and covalent bonds between
atoms, the free flow of electrons between atoms is known as electric current. The
wave function states that electrons can only have spatially discrete localizations
around the nucleus, the orbitals. In an idealized model of atoms in conducting
material, the superposition of orbitals allows valence electrons to participate in a
more continuous exchange layer between the atoms. This is commonly referred to
as electric band model, or, more casually, as free electron gas (see Figure 1.2). The
interested reader may follow up on these phenomena in a textbook on molecular
orbital theory.
If there are no electrons in the electric band, or if the band is fully occupied by
electrons, no charge transport can happen. Only free electrons and the availability
of an electron hole elsewhere (a surplus of positive charge elsewhere, as shown for
the central beryllium (Be) atoms in Figure 1.2) enable conductivity and the flow of
[Figure 1.2 graphic: four beryllium atoms with clearly separated 1s energy bands
and partially overlapping 2s (valence) and 2p (conduction) energy bands; electron
configuration 1s² 2s².]

Figure 1.2 Model of electron transfer through the conduction band. Molecule orbitals can be inter-
preted as energetic levels. Beryllium atoms normally have a 1s² 2s² electron configuration (two electrons
in 1s, two electrons in 2s). For the purpose of the argument, let the two central atoms lack an electron, here
denoted with a + sign. The energetic levels and physical distances of the 1s orbitals are clearly separated
between the four atoms. However, the 2s (valence) and 2p (conduction) orbitals are energetically close
and partially overlapping. That is, 2s electrons can transiently and spontaneously participate in the 2p
orbital. Energetically this is a vertical movement, but the close proximity of the atoms allows electrons to
spill over to a neighboring atom when returning to the lower level.

electrons. This movement of electrons is not entirely unconstrained. Depending on
the element, electrons require less or more energy to move from one potential well
to another. In good conductors the energetic levels of different bands are similar
and partially overlapping, meaning that electrons require only a small amount
of energy to fill up holes at slightly higher levels. In non- or less conducting
elements, the energetic levels are so distinct that the fill-up cannot occur across
band boundaries. The energy difference is called bandgap in the language of solid-
state physics. The hopping between potential wells is often described as electron
diffusion. This is obviously a simplifying model, and it does not imply that electrons
freely diffuse from one end of a conductor to the other, but rather occupy and release
electron holes in a stochastic process. In semiconducting material, additional energy
contributed by heat or light enables electrons to jump between electric bands, that
is, to cross the bandgap (see Figure 1.3). These materials can conduct under certain
conditions: The reader may know them as part of thermistors or light-dependent
resistors.
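To make the temperature dependence concrete, here is a minimal sketch (my own illustration with assumed example values, not taken from the book) of the Beta-parameter model commonly used to describe NTC thermistors, whose resistance falls as thermal energy lifts more electrons across the bandgap:

```python
import math

# Beta-parameter model of an NTC thermistor: R(T) = R0 * exp(B * (1/T - 1/T0)).
# R0, T0 and B below are assumed example values (a typical 10 kOhm part),
# not figures from the book.
def ntc_resistance(T_kelvin, R0=10_000.0, T0=298.15, B=3950.0):
    """Resistance in ohms at absolute temperature T_kelvin."""
    return R0 * math.exp(B * (1 / T_kelvin - 1 / T0))

for celsius in (0, 25, 50):
    T = celsius + 273.15
    print(f"{celsius:>3} degC -> {ntc_resistance(T):8.0f} ohm")
```

The output shows resistance dropping by roughly an order of magnitude between 0 °C and 50 °C, the behaviour exploited in temperature sensing.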
The properties of semiconductors can be influenced by bringing in additional
electrons or by removing electrons in a chemical process called doping (see Figure
1.4). The doping is either n-type (e.g., by infusion with phosphorus or lithium), which
leaves an additional electron, or p-type (e.g., with aluminum or gallium), which
[Figure 1.3 graphic: energy-band diagrams for a non-conductor (insulator), a
semiconductor and a conductor, showing the valence band, the conductive band and
the external energy h · f needed to lift an electron across the bandgap.]

Figure 1.3 Valence and conductive band in insulators, semiconductors and conductors. The conductive
band is energetically unreachable in non-conducting material; the bandgap is too large. In semiconduct-
ing material, the bandgap can be overcome by additional external energy (e.g., in the form of heat or
light). Conductive materials are independent of external energy; electrons can overcome the bandgap
spontaneously.

removes an electron and creates an additional positive charge. This is a very
powerful method to tune the conductivity of materials, even at very small ratios
between a silicon lattice (four shared electron pairs) and the doping material (e.g.,
arsenic, one surplus electron; indium, lacking one valence electron).
Many semiconductor crystal diodes work through the combination of p- and
n-type materials, as the boundary interface between the two show an interesting
electrical behavior. The basic unit for semiconductor electronics is the p-n junction,
and while a p-n block is used in simple rectifying diodes that act as unidirectional
valves in an electric circuit, the common bipolar junction transistors combine p-
n blocks into n-p-n or p-n-p types and can be used as switches (more about this
in Section 2.1.2 about transistors and other active electronic parts). It is important
to note that simply using p- and n-type materials in series in a circuit does not
yield the same effect; both p- and n-material need to be part of the same crystal.
The underlying idea for the p-n block is that electrons can more easily go from
the electron-rich n-type material to the electron-depleted p-type, whereas in the
direction of p-type to n-type the material serves as an insulator (see Figure 1.5).
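The valve-like behaviour of the p-n junction can be sketched with the Shockley diode equation. This is my own illustration with assumed typical parameter values (saturation current, thermal voltage), not a model taken from the book:

```python
import math

# Shockley diode equation: I = I_s * (exp(V / (n * V_T)) - 1).
# i_s and v_t are assumed typical values for a small silicon diode at ~300 K.
def diode_current(v, i_s=1e-12, n=1.0, v_t=0.02585):
    """Current in amperes through an ideal p-n junction at voltage v."""
    return i_s * (math.exp(v / (n * v_t)) - 1)

print(f"+0.7 V forward:  {diode_current(0.7):.3e} A")   # conducts strongly
print(f"-0.7 V reverse:  {diode_current(-0.7):.3e} A")  # blocks, ~ -i_s
```

The asymmetry is striking: a forward bias of 0.7 V drives a current of hundreds of milliamperes, while the same voltage in reverse leaves only the picoampere-scale saturation current, which is the unidirectional valve described above.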

1.1.3 Electric Charge, Current and Voltage

In Section 1.1.1 we established that electrons carry a negative charge and can move
between atoms. Atoms that have lost electrons are positively charged and attract
available free electrons.
[Figure 1.4 graphic: (A) a silicon grid with n- and p-type electron flow;
(B) n-doping with arsenic ([Ar] 4s² 3d¹⁰ 4p³, five electrons in the outer shell,
one surplus electron) and p-doping with indium ([Kr] 5s² 4d¹⁰ 5p¹, three electrons
in the outer shell, creating an electron deficit).]

Figure 1.4 Electron flow and doping in a silicon grid. (A) A silicon grid (atoms in black) with covalent
bonds depicted as bulges, normally featuring two participating electrons (small circles). Electrons that
become free because of a net flow from right (electron source) to left (positive charge) participate
in n-type conduction, whereas electrons that move into available electron holes participate in p-type
conduction. (B) Contamination with doping atoms (in gray) can be n-type (e.g., arsenic, 4p³: electron-
donating orbital) or p-type (e.g., indium, 5p¹: electron-deficient orbital).
[Figure 1.5 graphic: panels A–C showing arsenic ions (+), indium ions (−),
localized and free electrons, electron holes and filled electron holes, and the
boundary layer between the n- and p-zones.]

Figure 1.5 Electron diffusion through an n-p boundary. (A) Initially, surplus n-material electrons close
to the n-p boundary diffuse into the p-zone. (B) This creates a layer between the n- and p-zones, which is
depleted of free electrons as the surplus n-zone electrons are repelled by the negatively charged p-zone
and drawn back into the positively charged n-zone, a force called space charge. (C) Only when an electric
field is applied can free electrons from the n-zone diffuse to the p-zone, but not vice versa.

1.1.3.1 Static Electricity

Materials can become charged by the so-called triboelectric effect, which is the transfer
of electrons from one material to the other due to different electrochemical potential.
That is, no continuous current is flowing, but the donor remains positively charged
and the receptor becomes negatively charged until they can neutralize their charge.
The effect is significantly accelerated by direct contact between the materials (e.g.,
rubbing the cat) but also works across small distances. If the built-up charge is
large enough and the material close enough to a neutral or differently charged
material, then a discharge through otherwise isolating medium such as air generates
an electric spark.

1.1.3.2 Coulomb’s Law

The basic unit of the charge q is the coulomb (C). An individual electron has
the negative elementary charge −e of ca. −1.602 · 10⁻¹⁹ C, while, formally,
protons carry +e. With the exception of phenomena when protons are split into
their elementary particles, the quarks, a positive charge q is thus always a multiple
of +e, while a negative charge is always a multiple of −e.

One can calculate the force F between two charges, Q₁ and Q₂, at distance r
using Coulomb’s law, which states that:

F = c_coeff · (Q₁ · Q₂) / r²

This resembles Newton’s law of gravitation and suggests that the force
between two charges scales with the squared distance between them. If the force
is negative, both charges attract each other, and if it is positive, they repel each
other. It becomes obvious why charges with the same sign seek to avoid each other.
In a neutral vacuum, the constant c_coeff is applicable if the two charges have
spherically symmetric distribution, with ε₀ being the electric constant:

c_coeff = 1 / (4 · π · ε₀) ≈ 8.99 · 10⁹ N·m²/C²
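As a quick numerical sanity check of Coulomb's law, here is a minimal sketch; the two charge values and the distance are illustrative numbers of our own choosing, not from the text:

```python
# Coulomb force between two point charges -- illustrative example values.
import math

EPSILON_0 = 8.854e-12                    # electric constant, C^2/(N*m^2)
C_COEFF = 1 / (4 * math.pi * EPSILON_0)  # ~8.99e9 N*m^2/C^2

def coulomb_force(q1, q2, r):
    """Force in newtons; negative sign = attraction, positive = repulsion."""
    return C_COEFF * q1 * q2 / r**2

# Two charges of +1 uC and -1 uC at 10 cm distance (assumed values):
f = coulomb_force(1e-6, -1e-6, 0.1)
print(f)   # ~ -0.9 N, i.e. the charges attract each other
```

Note how the sign convention of the formula falls out of the product Q₁ · Q₂: opposite charges give a negative (attractive) force.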

Maxwell’s equations are the fundamental rules that govern the interactions of
electric and magnetic fields. While the proper understanding of Maxwell’s theory
requires a thorough understanding of physics and considerable mathematical skill,
we accept here that the electric field constant ε₀ ≈ 8.854 · 10⁻¹² A·s/(V·m)
is linked to the magnetic field constant

μ₀ = 4 · π · 10⁻⁷ V·s/(A·m)

and the speed of light c via

c = 1 / √(ε₀ · μ₀)

As we know that the force is in newtons (N), we can infer the unit of ε₀ as:

[ε₀] = C² / (N·m²)

The force field around a charge can be obtained through a vectorized form
of Coulomb’s law. If we were to place an imaginary small charge, q, in between
our two charges, Q₁ and Q₂, both would apply a force on q and attract or repel it
to different degrees, depending on their distance to it. The trajectory of q would
be the sum of the attractive and repelling forces, describing a line if placed straight
14 The Technical Foundations of IoT

electric field lines between electric field lines between


opposite charges repelling charges

Figure 1.6 Field lines between electric charges. Schematic display of field lines between opposite and
repelling charges. The arrow represents the direction of the force, and the density of lines the strength
for a particular surface area. Note that field lines naturally occur in all three spatial directions, but only
two are drawn in surface projections.

between Q₁ and Q₂, or elliptic trajectories more laterally. In the case of equal
signs, the trajectories would point outward (see Figure 1.6).

The strength E of the field is measured in newtons per coulomb [N/C]. If we
assume an ideal homogeneous field (e.g., above an infinite negatively charged plane)
and move a negative charge towards it from d_far to d_near, we have to invest
energy: if the local field strength is E, and we want to move a charge Q
by distance d from far to near, we require the work:

W = E · Q · d

The unit of work is the joule (J). It is noteworthy that the same laws also apply in
inhomogeneous electrical fields, but the precise calculation involves the solution of
integrals and we omit their treatment for the sake of clarity.
The energy we have invested is stored as potential electric energy (it is not
lost as, e.g., heat, but can be released by allowing the charge back to its original
position). The potential electric energy is a value specific to a particular object in
an electric field. We can express the potential energy in d_near as the potential
energy in d_far plus some amount of work. The amount of work is proportional
to the charge in question. By normalizing to the charge Q, we can obtain the
difference between the electric potential φ_far in d_far and the potential φ_near in d_near. The
Electricity and Electromagnetism 15


Figure 1.7 Electrical fields between two infinite planes. For the definition of capacitance it is useful to
imagine two charged planes Q⁺ and Q⁻ of infinite spatial expansion that are separated by distance d.
Above every point on plane Q⁺ acts a field of strength E. That field radiates circularly with rapidly
decaying strength, such that the contributions of neighboring points are negligible.

difference

U = φ_far − φ_near = W / Q

defines the electric potential difference between d_far and d_near. It is the
voltage, measured in volts (V).
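To make the relationship between field strength, work and voltage concrete, here is a minimal sketch; field strength, charge and distance are illustrative values of our own choosing:

```python
# Work to move a charge through a homogeneous field, and the resulting voltage.
E = 1000.0   # field strength in N/C (assumed example value)
Q = 2e-6     # charge being moved, in coulombs (assumed)
d = 0.05     # distance moved against the field, in meters (assumed)

W = E * Q * d   # work in joules: W = E * Q * d
U = W / Q       # charge-normalized work = potential difference in volts

print(W)   # ~1e-4 J
print(U)   # ~50 V (equals E * d for a homogeneous field)
```

Dividing the work by the charge again recovers U = E · d, which is why the voltage between two points is independent of the test charge used to probe it.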

1.1.3.3 Capacitance and Current

Capacitance

The field strength above a charged infinite plane can be calculated as

E = 2 · π · c_coeff · Q / A

with the charge Q and the surface area A taking a radial distribution of charge around a point
above the plane into account (see Figure 1.7).
We now imagine two parallel planes Q⁺ and Q⁻, which are positively or
negatively charged, respectively. They are separated by distance d. For reasons of
simplicity, we assume that Q⁺ and Q⁻ have identical values, just with opposite
signs. The field strength between both planes can then be calculated as:

E = 2 · π · c_coeff · Q / A + 2 · π · c_coeff · Q / A = 4 · π · c_coeff · Q / A

To move a small charge q against the field, we need to invest work just as in the
previous example. It is calculated as W = E · q · d. We remember that by normalizing
the work by the charge q, we obtain the electric potential difference U between the
16 The Technical Foundations of IoT

two plates at distance d. With

U = W / q = E · d = 4 · π · c_coeff · Q · d / A

follows:

Q = A / (4 · π · c_coeff · d) · U

We can hence link the charge of our planes, Q, to their electric potential difference U
via the proportionality factor

C = A / (4 · π · c_coeff · d)

which is also referred to as capacitance. It follows that:

Q = C · U or C = Q / U

The unit of C is thus coulomb per volt, or farad (F). The reader may have realized that
through algebraic transformation we can rewrite 4 · π · c_coeff as 1/ε₀. This
allows us to write C = ε₀ · A / d, offering an alternative definition of the electrical field
constant. We can calculate the charge Q that occurs on our two planes of surface
area A at distance d by applying voltage U across them. This storage of charge is
one of the underlying principles of capacitors in electronic circuits.
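The derivation above reduces to C = ε₀ · A / d for a vacuum-filled parallel-plate capacitor. A short sketch, with plate dimensions we picked for illustration:

```python
# Parallel-plate capacitance C = epsilon_0 * A / d, and stored charge Q = C * U.
EPSILON_0 = 8.854e-12  # electric field constant, A*s/(V*m)

def plate_capacitance(area_m2, distance_m):
    """Capacitance in farads of two plates in vacuum."""
    return EPSILON_0 * area_m2 / distance_m

C = plate_capacitance(0.01, 0.001)  # 10 cm x 10 cm plates, 1 mm apart (assumed)
Q = C * 5.0                         # charge stored when 5 V is applied
print(C)   # ~8.85e-11 F, i.e. about 89 pF
print(Q)   # ~4.4e-10 C
```

Halving the plate distance doubles the capacitance, which matches the d in the denominator of the formula.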

Current

So far we have focused on immobile charges that exert a force across distances.
We are now looking at situations in which the charges themselves are allowed to
move, as we would observe in an electric circuit. In the previous section we saw that
electrons with their negative charge −e can relocate between atoms in conductive
material (see Section 1.1.2). And rather than using planes with opposite charge
that would discharge very fast, we let our circuit be powered by a battery, which
represents a significantly larger repertoire of electrons. The electrons move from the
negatively charged pole (−) of the battery to its positively charged pole (+), where
there is a surplus of positive charges that attract the electrons. The actual electron
flow is thus opposite to the conventional direction of electric current, which we

usually see depicted in schematic drawings. This convention dates back to a time
when the nature of electricity was not yet fully understood.
In a circuit we are not observing the transfer of a single electron, but a
continuous homogeneous stream of charges from (−) to (+). The stream is spatially
uniform as the moving charges are of the same sign and repel each other because
of the electrostatic force. The amount of charge per unit of time is the current, I,
measured in amperes (A).
Per definition, 1 C = 1 A · s (i.e., the coulomb is, in fact, a derived unit). The
potential difference between two points is the voltage, which, as we have seen in
the last section, can also be expressed as N·m/C. Both metrics are linked through the
capacitance. A convenient overview of possible conversions between these units is
shown here:

1 V = 1 J/C = 1 N·m/(A·s) = 1 W/A = 1 kg·m²/(A·s³)

1.1.3.4 Ohm’s Law and Power

Ohm’s Law

Every circuit without a load yields a short circuit, the nearly instantaneous complete
transfer of negative charges, and death to the power supply. The simplest load
is a resistor, a component that limits the amount of charge that can pass through it.
Overall, the amount of charge per unit of time, the current, is proportional to the electric potential
difference, the voltage, in the simple relationship I ∼ U. The resistance R,
measured in ohms (Ω), serves as a proportionality factor, such that

U = R · I

This relationship is known as Ohm’s law. Most electronics practitioners know
Ohm’s triangles by heart and won’t even require the simple arithmetic used to
derive them (see Figure 1.8).
An application: if we imagine a battery of voltage U and a resistor R, the current
flowing through the circuit is limited to I = U/R. As 1 A is the
equivalent of 1 C/s, we can divide the charge passing per second by the elementary
charge to infer how many electrons are moving per second at any given point in our circuit.
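A minimal sketch of that calculation; the battery voltage and resistance are values we picked for illustration (the book's own example numbers were lost in this copy):

```python
# Ohm's law and electron count per second -- illustrative values, not from the text.
U = 9.0               # battery voltage in volts (assumed)
R = 100.0             # resistance in ohms (assumed)
E_CHARGE = 1.602e-19  # elementary charge in coulombs

I = U / R                            # current in amperes
electrons_per_second = I / E_CHARGE  # 1 A = 1 C/s, so divide by e
print(I)                    # 0.09 A
print(electrons_per_second) # ~5.6e17 electrons passing any point each second
```

Even this modest current corresponds to hundreds of quadrillions of electrons per second, which is why we work with the coulomb rather than counting individual charges.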
Extensions of Ohm’s law consider cases where resistors are placed in series
or in parallel within the circuit. If two or more resistors are present in series, their
joint resistance is

R_total;series = R₁ + R₂ + ... + R_n

U = I × R        I = U / R        R = U / I

Figure 1.8 Ohm’s triangles. Ohm’s law, U = I · R, is used to express the dependence of voltage,
resistance and current in a circuit. By writing down this relationship in triangular shape, and interpreting
the horizontal line as division for the calculation of current and resistance, we can easily infer the basic
algebra to calculate these values.

This allows us to calculate the overall current as I_total = U / R_total;series. As the current
is the same at each position within the circuit, for each resistor we can determine the
voltage drop as U₁ = I_total · R₁, U₂ = I_total · R₂ and so forth. In the case of parallel resistors,
their total resistance follows from the sum of reciprocals

1 / R_total;parallel = 1/R₁ + 1/R₂ + ... + 1/R_n

The overall current I_total = U / R_total;parallel is split between the parallel subcircuits
containing R₁, R₂ and so forth. The respective currents I₁, I₂, ..., I_n are calculated
as U/R₁, U/R₂, ..., U/R_n and add up to I_total.
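These series and parallel rules translate directly into code. A sketch with resistor and supply values assumed by us:

```python
# Series and parallel resistor combinations (example values assumed by us).
def r_series(resistors):
    """Joint resistance of resistors in series: simple sum."""
    return sum(resistors)

def r_parallel(resistors):
    """Joint resistance of resistors in parallel: reciprocal of summed reciprocals."""
    return 1.0 / sum(1.0 / r for r in resistors)

U = 12.0                  # supply voltage in volts (assumed)
series = [100.0, 220.0]   # two resistors in series (ohms)
parallel = [100.0, 220.0] # the same two in parallel

Rs = r_series(series)                   # 320 ohms
I_total = U / Rs                        # same current through both resistors
drops = [I_total * r for r in series]   # per-resistor voltage drops

Rp = r_parallel(parallel)                    # 68.75 ohms
branch_currents = [U / r for r in parallel]  # per-branch currents, summing to I_total
print(Rs, I_total, drops, Rp, branch_currents)
```

Note that the voltage drops in the series case sum back to U, and the branch currents in the parallel case sum back to U / R_total;parallel, as the text states.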

Power

When current flows through a load, electric energy is transformed into other forms
of energy; for example, heat, motion or light (in physics, we also refer to such
components as transducers). In the case of resistors it is kinetic energy from
collisions of electrons with the resisting material, thereby emitting heat. By definition,
the power P is the amount of charge Q moved across the electric potential
difference U per unit of time t, such that

P = U · Q / t = U · I

The unit of power is the watt (W), or 1 joule per second. Going back to our
battery-and-resistor example, every second the electric energy P = U · I is converted
into heat across the resistor.
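Continuing with the same assumed battery-and-resistor numbers as before, the dissipated power can be computed as:

```python
# Power dissipated in a resistor: P = U * I, equivalently U**2 / R.
U = 9.0      # volts (assumed example value)
R = 100.0    # ohms (assumed)

I = U / R
P = U * I    # same as U**2 / R
print(P)     # ~0.81 W converted into heat
```

This is also the calculation used to pick a resistor's power rating: a part rated below the dissipated wattage will overheat.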

1.2 ELECTRIC AND MAGNETIC FIELDS

Electromagnetism and electricity are two intimately linked physical phenomena. In
the discussion of Coulomb’s law, we already briefly mentioned Maxwell’s
equations (see Section 1.1.3), and we used the magnetic field constant when relating
the electric field constant to the speed of light. The movement of electric charges yields a magnetic
field, and magnetic fields can induce the flow of electrons. The function of electric
motors and generators as well as radio communications can be explained with
these interactions. In the previous sections we have generally only looked at the
strength of electric fields and have neglected their direction. For the interaction
between electric and magnetic fields, their precise orientation is important, and we
introduce their vector notation in this section.

1.2.1 Magnets and Magnetism

Before we look more into magnetic fields, we need to understand the nature of
magnets. We have briefly mentioned the rules that govern the localization and
trajectory of electrons in the atomic shell. In simple terms, the magnetic quantum
number m_l describes the asymmetry of an electron trajectory, with −l ≤ m_l ≤ +l
deviating from linear trajectories to ones with distinct Z-dimension. Each electron is
a moving charge and thus itself yields a magnetic field. It acts as a magnetic dipole
(i.e., like a little bar magnet), with an orientation dependent on m_l and the spin
quantum number m_s.
On the atomic and molecular level, we can observe diamagnetic or para-
magnetic behavior. If all quantum numbers except m_s are equal, the magnetic fields
of the up- and down-spin electrons cancel each other out, and the atom (or molecule)
does not react to an external magnetic field. We are talking about diamagnetism
in this case. If we have unpaired electrons per orbital, their dipole moment yields
paramagnetic behavior; that is, the atom (or molecule) does react to a magnetic field.
This explains why a stream of liquid helium (diamagnetic; 1s² configuration) is un-
affected while liquid oxygen (paramagnetic; with unpaired electrons in its outer
orbitals) gets diverted when flowing through a strong magnetic field.
On the macroscopic level we can distinguish ferromagnetism, ferrimagnetism
and antiferromagnetism, all of which describe conditions in which the magnetic
dipole moments can influence each other. When we colloquially refer to magnetized
materials, we usually mean ferromagnetic matter with magnetic domains (also
known as Weiss domains) of only a few cubic micrometers featuring dipoles with
parallel orientation. Through exposure to an external magnetic field these can be
aligned to change the material from a nonmagnetic to a magnetic state. Addition

Figure 1.9 The expansion of magnetic domains in ferromagnetic matter. Two-dimensional schematic
representation of Weiss domains in ferromagnetic material. While the material may appear nonmagnetic
as long as the magnetic dipoles are more or less randomly distributed, aligning the magnetic domains
through an external field may eventually lead to a complete reorientation of dipoles in one direction,
rendering the material magnetic.

Figure 1.10 Photograph of iron filings around a bar magnet. From Newton Henry Black and Harvey
N. Davis (1913): Practical Physics, The MacMillan Company, USA (p. 242, Fig. 200).

of thermal energy randomizes the orientation of the magnetic dipoles again,
thus returning the material into a nonmagnetic state (see Figure 1.9).

1.2.2 Interactions of Electric and Magnetic Fields

Magnetic Fields

It is possible to visualize magnetic flux lines with iron filings that orientate around a
bar magnet, which shows their resemblance of electric flux lines (see Figure 1.10).
For historical reasons and in reference to the Earth’s magnetic field, the poles of
magnetic fields are referred to as North and South, and the convention is that the
force extends from North to South.

The strength and orientation of the magnet is given through its dipole moment m,
and the strength of the magnetic field is described as H = B/μ. B is the magnetic flux
density, and μ scales with the magnetic permeability of the medium: μ = μ_m · μ₀,
where μ₀ is the magnetic field constant and μ_m a material-specific constant. For
simplicity, and especially when the medium does not respond to the magnetic field,
μ_m = 1. The strength B of the magnetic field is given in tesla (T), and 1 T = 1 N·s/(C·m).

Lorentz Force

The force an electric charge q experiences when entering a magnetic field depends
on its speed |v| and the strength of the magnetic field |B|:

F = q · v × B

|F| is a force measured in newtons, because:

C · (m/s) · N·s/(C·m) = N

The cross product v × B shows that F stands orthogonally on v and B,
and that |F| is maximal when the charge is entering the magnetic field orthogonally,
and that the charge experiences no force when it is traveling along the magnetic field
vector. In the simple case of v and B being orthogonal to each other (and not at
any other arbitrary angle), we can use the so-called three-finger or right-hand rule
to determine the direction of the resulting vector (see Figure 1.11).
Individual electrons that enter a magnetic field are diverted, which is the key
principle of traditional cathode ray tube (CRT) displays, where modulation of the
field determines where electrons fall on the phosphorescent screen. Electrons in a
wire are spatially constrained; their combined force can add up considerably and
make the wire move within the magnetic field. This is the functional principle
of electric motors (see Figure 1.12). The force that acts on the wire follows
F = Q · (v × B). The speed of the electrons in the wire is determined by v = l/Δt,
with l being the length of the wire in the magnetic field and Δt the time the electrons
require to travel through it. Because the current I = Q/Δt (remember the definition
1 A = 1 C/s, thus Q = I · Δt), we can rewrite F = Q · (v × B) as:

F = I · Δt · (l/Δt) × B = I · (l × B)


Figure 1.11 Visualizing the right-hand rule. Three fingers of the right hand define orthogonal axes. If
electrons move in the direction of the thumb with speed v, within a magnetic field B whose field lines are
extending along the index finger, a force F acts in the direction of the middle finger.

For example, a wire of length l carrying a current I orthogonally through a magnetic
field B will experience a force |F| = I · l · B. Because we’re assuming an
orthogonal angle between the wire and the magnetic field, the factor sin(θ) (the
sine of the angle between l and B; see the definition of the cross product) is 1 and
was therefore omitted. For a sense of scale, a force of 1 N is the equivalent of the
pressure of a weight of about 100 g on Earth.
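A small sketch of F = I · l · B · sin(θ); the current, wire length and field strength are values we chose for illustration:

```python
# Force on a current-carrying wire in a magnetic field: F = I * l * B * sin(theta).
import math

def wire_force(current_a, length_m, b_tesla, angle_rad=math.pi / 2):
    """Force in newtons; angle defaults to orthogonal (sin = 1)."""
    return current_a * length_m * b_tesla * math.sin(angle_rad)

# 2 A through 0.5 m of wire, orthogonal to a 0.1 T field (assumed values):
F = wire_force(2.0, 0.5, 0.1)
print(F)   # ~0.1 N, roughly the weight of a 10 g mass on Earth
```

Setting the angle to 0 reproduces the no-force case of a wire running parallel to the field lines.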
Moving electrons themselves produce a magnetic field. We have neglected
the magnetic field around the wire in the last example because its strength is
considerably smaller than the external field. The strength of the magnetic field of a
long conducting wire is:

|B| = μ₀ · I / (2 · π · r)

It is thus dependent on the current I in the wire and proportional to the magnetic
field constant μ₀, and decays with the radius r around the wire. At r = 1 m
distance from our wire with I = 1 A we therefore experience a magnetic field of
|B| = 2 · 10⁻⁷ T. We can depict B as a concentric ring of radius r
around the wire, with an orientation that follows the right-hand rule: if we hold the
wire in the right hand and I flows in the direction of our extended thumb, the flux
lines follow our fingers. This is seen in Figure 1.11.
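Sketching the field-around-a-wire formula, using the 1 A and 1 m values from the example above (the 1 cm case is our own addition):

```python
# Magnetic flux density around a long straight wire: B = mu_0 * I / (2 * pi * r).
import math

MU_0 = 4 * math.pi * 1e-7   # magnetic field constant, V*s/(A*m)

def wire_field(current_a, radius_m):
    """Flux density in tesla at distance radius_m from the wire."""
    return MU_0 * current_a / (2 * math.pi * radius_m)

print(wire_field(1.0, 1.0))    # ~2e-7 T at 1 m from a 1 A wire
print(wire_field(1.0, 0.01))   # ~2e-5 T at 1 cm -- the field decays with 1/r
```

Note how μ₀ and the 2π cancel neatly for r = 1 m and I = 1 A, which is why the 2 · 10⁻⁷ T figure is a convenient reference point.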

Induction

So far we have looked at moving charges through a wire in a magnetic field.


However, we can also induce a current by pushing the wire into that field. We

F₁ = I · (l₁ × B), pushing into the plane
F₂ = I · (l₂ × B), parallel with B, therefore no effect
F₃ = I · (l₃ × B), opposite direction of F₁

Figure 1.12 Principle of an electric motor. A simple electric motor would use a bent wire
rotating about an axis in a magnetic field, as shown. On the left, the wire would experience a force that
pushes it into the direction of the book, whereas on the right side the wire would move toward us.

remember that the force |F| a charge q at speed |v| experiences in a field of
strength |B| is q · |v| · |B|. Work is defined as the displacement of an object
with force F over distance d, W = F · d [N·m]. We can rewrite W = |F| · d as:

W = q · |v| · |B| · l

with l being the length of the wire that is pushed into the magnetic field with speed
|v|. We also remember that the voltage U was the charge-normalized work, W/Q, thus:

U = |v| · |B| · l

An example: let a wire of length l be pushed orthogonally into a magnetic field
of strength B at speed v; the induced voltage is then U = v · B · l. Motion from steam,
wind or water is used to move the wire in a magnetic field in generators for energy production.
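A sketch of the induction formula U = v · B · l; the speed, field strength and wire length are values we assumed for illustration:

```python
# Induced voltage in a wire moved orthogonally through a magnetic field: U = v * B * l.
def induced_voltage(speed_ms, b_tesla, length_m):
    """Voltage in volts induced along the moving wire."""
    return speed_ms * b_tesla * length_m

# A 0.2 m wire pushed at 3 m/s through a 0.5 T field (assumed values):
print(induced_voltage(3.0, 0.5, 0.2))   # ~0.3 V
```

A generator simply repeats this motion continuously, converting mechanical into electrical energy.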

1.2.3 Electromagnetic Spectrum

We briefly mentioned Maxwell’s equations when we first introduced the electric


and magnetic field constants (see Section 1.1.3). Along with the Lorentz force (see
Section 1.2.2) the equations fully explain the interactions of electric and magnetic
fields, and provide a framework to link their properties to quantum mechanical
principles. One of the key implications of Maxwell’s equations is the infinite range
of electromagnetic waves such as light or radio.

Figure 1.13 Electromagnetic spectrum. The electromagnetic spectrum from femtometer to megameter
wavelength, spanning cosmic and gamma rays, X-rays, UV, infrared, microwaves and radio waves. The
visible spectrum (roughly 400 nm to 700 nm) is shown for reference. Data communication happens with
long waves with meters to kilometers between peaks. The most relevant wavelengths and frequencies are
highlighted as “radio”.

In contrast to radio waves, we can see light. To get an idea of where the energy
of these waves can come from in natural processes, let us first look at the mechanism
that makes some materials glow under ultraviolet (UV) light: fluorescence. The
energetic state of an electron can be elevated (excited) by the absorption of energy;
for example, here, UV radiation (remember that it is the high-energy component of
sunlight that gives sunburn). This excitation can yield a change in quantum numbers
(i.e., the electrons in fluorescent materials can briefly occupy an orbital that is one
or two energetic steps higher than their original trajectory). After nanoseconds, and after
vibrational energy relaxation leading to the so-called Stokes shift, the electron
returns to its original energetic level. The remaining energetic difference is emitted
as an electromagnetic wave with frequency f, following ΔE = h · f. This is also the core
principle of light-emitting diodes, in which electric energy is converted into visible
light.
Within the electromagnetic spectrum, light covers only a small segment of
possible frequencies (see Figure 1.13). Frequency f and wavelength λ are linked
through the speed of light:

c = f · λ

That means higher frequencies yield shorter wavelengths. In the context of light, if
f = 5 · 10¹⁴ Hz (hertz, 1/s), then

λ = (3 · 10⁸ m/s) / (5 · 10¹⁴ 1/s) = 6 · 10⁻⁷ m = 600 nm

is the wavelength of orange light.
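The conversion between frequency and wavelength can be sketched as follows; the orange-light figure matches the calculation above, while the 2.4 GHz example is our own addition:

```python
# Wavelength <-> frequency conversion via the speed of light, c = f * lambda.
C = 3e8   # speed of light in m/s (rounded)

def wavelength(freq_hz):
    """Wavelength in meters for a given frequency in hertz."""
    return C / freq_hz

print(wavelength(5e14))    # ~6e-7 m = 600 nm, orange light
print(wavelength(2.4e9))   # ~0.125 m, a 2.4 GHz radio signal
```

The same one-line formula covers the entire spectrum of Figure 1.13, from gamma rays to longwave radio.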



Some properties of electromagnetic waves can only be explained if we treat
them as a stream of particles called photons. Along the spatial and temporal axes,
the peaks of the sine wave can be interpreted as areas of increased likelihood of
observing a photon. However, other features of electromagnetic waves rely entirely
on their wave characteristics. This phenomenon is referred to as wave-particle
duality.
As you can see in Figure 1.13, radio waves are in principle of the same
nature as light, but with waves having significantly longer distances between peaks
(i.e., meters to kilometers). Their interaction with matter is therefore different. A
rough analogy is that even a sheet of paper is sufficient to block light as one
can fit thousands of individual wave segments into the thickness of the paper
(where they are absorbed). At the same time, longer waves such as radio only
momentarily interact with even thicker material, such as walls, and pass through
mostly unhindered. Whether or not absorption happens also depends on the electron
shell of the matter, which explains why radio waves can penetrate some materials
(like brick walls) and are blocked by others (like iron). In summary, there are three
key factors that determine electromagnetic wave absorption:
1. The length/frequency and intensity (amplitude) of the wave.
2. The chemical properties and thickness of the barrier.
3. The structural makeup of the barrier, potentially allowing internal reflection
and diffraction.
We have previously calculated the strength of a magnetic field caused by
moving electrons (see Section 1.2.2). What happens, more precisely, is the
induction of a magnetic field as the result of a change in the electric field, and fast
oscillations of electrons between two poles lead to the emission of (electro)magnetic
waves.

1.2.3.1 Electromagnetic Waves

Maxwell first postulated that electric fields induce magnetic fields, and magnetic
fields in turn induce electric fields, even in the absence of a charge and medium (i.e.,
in vacuum). As the electric field is then independent of a positive or negative pole (in
comparison to the electric fields between two charged planes), we are talking about
a rotating field in analogy to the magnetic field around a wire. Maxwell formulated
that the rotation of the electric field E is the negative change of the B field over time,

Figure 1.14 Anatomy of an electromagnetic wave. The electromagnetic wave in this example is
linearly polarized. That is, when the wave is propagating toward a particular point on the X-axis along the
v vector, the oscillation of the electric field vector E would be observed as an up or down movement
along the Z-axis. In circularly polarized waves the electric field vector rotates around the X-axis either
clockwise or counterclockwise, whereas elliptical polarization requires a change both in field vector
direction and amplitude along the X-axis.

or rot E = −ΔB/Δt. The rotation of the magnetic field follows as the change of the
E field over time, or rot B = (1/c²) · ΔE/Δt.
Following Lenz’s law, both field vectors E and B stand perpendicular on
each other, propagating the continuation of the electromagnetic wave at the speed
of light as the result of their mutual oscillating reinforcement (see Figure 1.14).
The amplitudes of both components are in phase. The wave is polarized when
the orientations of E and B do not change. Circular polarization describes the
phenomenon when the field vector rotates around the propagation axis (see Figure
1.15). In technical applications it can be beneficial to use unpolarized signals (i.e.,
many waves with random field vector orientations) to increase penetration, as
the interaction with matter also depends on the polarization of light.
It is noteworthy that electromagnetic waves emerge from concrete physical
objects (i.e., atoms, molecules, or an antenna in technical applications), but
that the field changes continue infinitely in vacuum. In the atmosphere the reach
of electromagnetic waves is limited, as they can lose energy (amplitude) through
absorption, reflection, refraction, diffraction and interference on air molecules, geological
landmarks and buildings. When electromagnetic waves pass through matter,
the loss can be calculated and is often communicated in decibels (dB), following:

loss [dB] = 10 · log₁₀ (signal strength after barrier / signal strength before barrier)

Figure 1.15 Polarization of electromagnetic waves. (A) Anatomy of a polarized electromagnetic wave.
Only the E vector is shown for the purpose of clarity. (B) Looking at the wave propagating out of the
book towards v, linearly polarized waves have their maxima at positions separated by 180°, and the E
vector only fluctuates between the two. (C) In circularly polarized waves, E rotates around v and
covers 0° to 360° from peak to peak.

For example, if a barrier halves the signal strength, the loss is 10 · log₁₀(0.5) ≈ −3 dB.
The same calculation can be used to study the gain in signal
strength; the decibel value is a common characteristic of antennas.
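A short sketch of the decibel calculation; the halving and 1% figures are our own illustrative inputs:

```python
# Signal loss or gain in decibels: dB = 10 * log10(P_after / P_before).
import math

def decibel(p_before, p_after):
    """Negative result = loss, positive = gain."""
    return 10 * math.log10(p_after / p_before)

print(decibel(1.0, 0.5))    # about -3 dB: halving the power
print(decibel(1.0, 0.01))   # about -20 dB: only 1% of the power remains
```

Because the scale is logarithmic, losses of consecutive barriers simply add up in dB, which is what makes the unit convenient for link budgets.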

Poynting vector

The directional energy flux density of an electromagnetic wave is the Poynting
vector, the vector product of the electric and the magnetic field vector. Its unit is
W/m², or N/(m·s). In a graphical representation, the Poynting vector describes a segment
in a sphere that emanates from a point antenna (see Figure 1.16).

Reflection, Refraction, Diffraction and Interference

The laws that govern the behavior of visible light also hold true for other electro-
magnetic waves.
• Reflection describes the mirroring of a wave on smooth surfaces, and it
can be accompanied by the absorption of the wave and the conversion
of the wave’s energy into other forms, such as heat. Reflection is normally
incomplete, and it depends on the interplay between the material, the
wavelength and the angle at which the incidental wave hits the object.
A major reason for the nearly global reach of some radio communication
signals is the reflection of long waves in the so-called F-layer

Figure 1.16 Graphical representation of the Poynting vector. It describes the directional energy flux in
W/m². The vector that emanates from the origin in this schematic follows v in Figures 1.14 and 1.15.

of the Earth’s atmosphere. Radio signals can also be reflected from building
structures, and the duplication of waves can lead to interference (see Figure
1.17).
• Refraction is another phenomenon we know from light. At the interface
between different media, such as air and water, the direction of propagation
is dependent on the ratio of their refractive indices, which are wavelength-
dependent material-specific constants. This is also the main principle behind
prismatic glasses that split white light into its different monochromatic
components.
• Diffraction refers to the change in propagation when a wave encounters a gap
or slit that is roughly the dimension of the wavelength. Originally discovered
for very small slits and visible light by Huygens in the seventeenth century, in
the widest sense this can include the passage between buildings or between a
hill and the atmosphere (in the case of radio waves). Diffraction leads to a characteristic
speckle pattern with multiple intensity maxima around a central
absolute maximum. The pattern is the result of interference: electromagnetic
waves add up (see Figure 1.17). In the case of waves that are in phase, the amplitudes
of both waves add in intensity (constructive interference), whereas
an offset of half a wavelength between both waves leads to their cancellation
(destructive interference).
Because of refraction and reflection, there can be multipath propagation of electromagnetic
waves that yields destructive interference. In the so-called Fresnel zone,
any obstruction may lead to signal loss even when there is a clear line of sight
between the sender and the receiver of a radio signal.

Figure 1.17 Interference. An electromagnetic wave propagates from a source (here: a lamp) toward a
wall with slits. The wave front is depicted from the top and from the side. Note how the peak maxima
from the side view align with the lines of the wave front. While emanating in a radial pattern from the
source, at larger distances the wave fronts appear as parallel lines. The wave front is then diffracted
at the two slits, giving rise to two waves w₁ and w₂. The signals of both w₁ and w₂ add up. As the
diffraction also leads to a shift in phase between w₁ and w₂ depending on the relative position and angle
to both slits, areas with constructive or destructive interference emerge. If both w₁ and w₂ are in phase,
the signal becomes significantly brighter, and if w₁ and w₂ are completely phase-shifted, the signal
disappears.
1
Figure 1.18 LC circuit. (A) Charge and current in the LC circuit at time points t = 0, 1/2 · π,
1 · π and 3/2 · π. (B) U and I at the corresponding time points. Note that 1 and −1 indicate the full
extent of U and I, but in opposite directions.

1.2.3.2 LC Circuit and Hertzian Dipole

LC Circuit

Before we discuss the technical possibilities to send and receive electromagnetic
waves, we are going to take a step back and have a look at a classical electronics
experiment: the LC circuit, also called the resonant circuit. The two letters L and
C stand for a coil (with an inductance L) and a capacitor (with capacitance C) that
are connected in a simple circuit (see Figure 1.18).
The coil induces a magnetic field when current is flowing through it. The
buildup of the magnetic field leads to a delayed ramp-up of current following the
inductor, which requires a change of voltage, following:

U = L · ΔI/Δt

with ΔI/Δt being the current change over time and L the inductance of the coil,
measured in henry (H). The henry is defined as the induction of 1 V by a current
change of 1 A per second.
Under ideal conditions (without any dampening), we can start to observe an
oscillating behavior of the LC circuit: The capacitor stores energy in an electric
field, which is dependent on the electric potential difference across it (see Section

Figure 1.19 From LC circuit to dipole antenna. The conceptual transformation of the LC circuit into a
dipole antenna: the distance between the capacitor plates becomes larger, ultimately corresponding to the
length of the antenna.

1.1.3). The coil, in contrast, stores energy in the magnetic field if current runs
through it and releases it after a short retention. Let us imagine that at t = 0 the
capacitor is fully charged. Current starts flowing across the potential difference at
the inductor but, with a shifted phase, follows the decay in voltage as the capacitor
discharges. If the values of L and C are appropriately chosen, we can observe a
situation in which I is maximal when U is already down to 0 (t = 1/2 · π). Due to the
electromotive force, the decaying magnetic field drives the charge to the other side
of the capacitor. At t = 1 · π the current in the system has reached 0, but the potential
difference now acts in the opposite direction, and the next half-cycle begins. Real-
life LC systems often oscillate at several megahertz, and one of their applications
is contactless tag systems such as radio frequency identification (RFID).
As the magnetic field of coils can also induce a current in a separate but nearby
circuit, it is possible to create chains of LC circuits with very complex behavior.

Hertzian Dipole

Conceptually, a simple antenna is a resonant circuit that has been reduced to
a conductive rod (see Figure 1.19). The Hertzian dipole is only a theoretical
construction, conceived by Heinrich Hertz in the 1880s, but it helps to understand
the mechanistic steps of sending and receiving electromagnetic waves. If the length
of the rod is half the wavelength of the radio signal it is designed to receive, then the
charges in the rod oscillate from end to end, following the sinusoidal signature of the
wave (see Figure 1.20). The ends of the rod act like the capacitor plates in the LC
circuit, and receiving the electromagnetic signal pushes the charges toward them.
The signal replaces the magnetic field of the LC circuit's inductor here. Whether

a radio signal can exert this force on the charges in the conductor depends on
the so-called impedance of the antenna, a metric expressed in ohms in analogy to
resistance. The voltage changes across the antenna can then be interpreted as a
function of the incoming signal.
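The half-wavelength rule above can be turned into a quick calculation, with λ = c/f. The helper below is an illustrative sketch of our own, not from the original text:

```python
# Half-wave dipole length for a given signal frequency: the rod is half
# the wavelength, lambda = c / f.
C0 = 299_792_458.0  # speed of light in vacuum, m/s

def half_wave_dipole_length(frequency_hz: float) -> float:
    """Length in meters of a rod tuned to half the signal wavelength."""
    return C0 / frequency_hz / 2.0

# FM radio at 100 MHz needs a rod of about 1.5 m,
# while a 2.4 GHz WiFi signal gets away with about 6.2 cm.
print(half_wave_dipole_length(100e6))   # ~1.499 m
print(half_wave_dipole_length(2.4e9))   # ~0.0625 m
```

This is why antennas for low frequencies are physically large, while gigahertz antennas fit on a printed circuit board.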
Sending an electromagnetic wave with the Hertzian dipole works in an
analogous way (see Figure 1.21). By applying a high frequency alternating current,
the charges are driven from end to end, and the oscillating magnetic field radiates
into the free space. However, mechanistically this implies a few more steps: We
can imagine that the electric and magnetic fields occur around the rod in alternating
order, first the electric field from pole to pole, then the magnetic field around the
rod, then the electric field again and so forth. While in a LC circuit the electric
field lines occur exclusively between the plates of the capacitor, they run around the
antenna and dissipate into free space when the magnetic field builds up.
The impedance of free space (i.e., vacuum) is Z0 = μ0 · c0 (roughly 377 Ω)
(see Section 1.1.3 for μ0 and c0). This value differs only insignificantly in air, but it is
higher in denser materials and relies on material-specific constants relating to their
permeability. This prevents the electric and magnetic fields from being reabsorbed
into the rod, and the wave is therefore emitted. The electric and magnetic fields at
(t = 0 · π, 1/2 · π) and (t = 1 · π, 3/2 · π) are thus the components of the positive and
negative peak of one wave period (see Figure 1.22 for the spatial radiation pattern
of a dipole antenna).

1.2.3.3 Antennas

The dipole rod is the simplest antenna implementation. There are many other
base geometries (monopole, dipole, loop, aperture antenna) and their subtypes (see
Figure 1.23), as well as combinations of antennas into arrays. Antenna theory is a
complex field and a detailed analysis of antenna designs is beyond the scope of this
book. In principle, purely from a physics perspective, the choice and configuration
of antennas depends on:
• Frequency (or frequency selectivity) and polarization (linear, circular, ellip-
tical) of the signal that is to be sent or received,
• Directionality and radiation pattern (i.e., omnidirectional or focussed),
• Gain and efficiency.
From a practical perspective, factors such as cost, space, desired range and
reliability are important. For some applications, the best-performing antenna is not
Figure 1.20 Dipole antenna. (A) The anatomy of a dipole antenna, top (x/y) and side (x/z) view. Only
the E field of a polarized electromagnetic wave that is in the plane of the antenna is shown. If the length of
the antenna (along the X-axis) is half the wavelength of a radio signal, charges in the antenna are pushed
from side to side as shown in the panel on the right. (B) At t = 0 · π the radio signal is 0 and the electron
density uniform along the antennal axis. As the signal deflects toward the left (i.e., having a positive
amplitude) at t = 1/4 · π, electrons are pushed towards the right. At t = 1/2 · π, the positive amplitude of
the signal is maximal and the majority of electrons at the right. After a falling flank and the radio signal
deflecting towards the right (i.e., having a negative amplitude), at t = 1 · π the electron density is briefly
uniform before peaking toward the left of the antenna (not shown).
Figure 1.21 Sending electromagnetic waves. (A) Electron density distribution in a sending antenna
at time points t = 0 · π, 1/2 · π, 1 · π and 3/2 · π. The change in the distribution is
mediated by the alternating current that is applied to the antenna. (B) The development of the E and B fields over
time. The electric field is strongest when the electron density is focused towards one side (t = 0 · π;
t = 1 · π); note the inverse orientation of the fields between these two time points. The B field
develops whenever the electron density returns to uniformity. At the beginning of the next cycle,
it is important to note that the E and B fields of the first cycle are self-sustained and independent
of the activity of the antenna (i.e., they propagate infinitely). The maxima of electron density
amplitude in the antenna correspond to the field maxima of the radiated electromagnetic wave.
Figure 1.22 Omnidirectionality of a dipole antenna. A dipole antenna radiates the signal in a toroidal
pattern into space. Specialized antenna designs can deliver more directional radiation patterns than the
basic dipole design. In this example, the antenna would be centred at the origin, with the rod extending
along the Z-axis. Left: Top view. Right: Side view. Note the separation of E field wave fronts with B
fields.

Figure 1.23 Different antenna designs. The canonical dipole antenna, the monopole and the long-
wire antenna are all closely related to the Hertzian dipole. They are cheap to build and see a wide
range of applications, such as radio and television reception (both as part of devices as well as roof-
top mounted), or citizen band and amateur radio communications. While the half-wave dipole has a
modest theoretical gain, improved monopole or dipole designs see higher gains, and a corner reflector
or a Yagi (see Figure 1.26) higher still. The dipole designs are typically used for sub-gigahertz
frequency ranges, although they are also used for gigahertz communication, for example
in WiFi routers. Loop antennas exist in circular shape as well as small wire coils, and are used for low-
and medium-frequency AM communication. Aperture antenna types such as the horn antenna or the
parabolic antenna see applications in microwave communication at gigahertz frequencies. While
the horn has moderate gains, parabolic antenna types often see considerably higher gains. Wave
guides are special types of aperture antennas. They come in various builds and, in some cases, resemble
optical fiber cables in both shape and function. Planar antenna types are often used as antenna boosters
or as passive (parasitic) elements in NFC or RFID applications.
Figure 1.24 Near- and far-field of an antenna. Schematic of a polarized wave extending into space.
E and B fields are offset by 90° in the near-field region and decay at different rates. Waves can be easily
blocked in the near-field and transition zone. Only from 2 wavelengths on, the electromagnetic
wave enters the so-called Fraunhofer zone and extends into infinity.

necessarily the most practical; just think about a parabolic antenna on a wearable
device... In practice, the best choice of antenna is often identified by empiricism.
For example, there is anecdotal evidence that wire whips are inferior to helical
antennas for some common bands. In the mobile phone sector, (planar)
inverted-F antennas (IFA or PIFA) have become a standard. Ceramic surface-mount
or wire trace antennas are options for including antenna functionality on a printed
circuit board (PCB). It is worth mentioning that some electric components act as
accidental antennas. Protecting other devices against electromagnetic fields (EMFs)
is an important requirement when going through certification.
One characteristic of antennas is their near- and far-field behavior (see Fig-
ure 1.24). In the very close range to the antenna, within a few wavelengths from the source,
the signal does not yet fully exhibit the properties of an electromagnetic wave but
can be seen as consecutive wave fronts; this region is referred to as directly reactive to the
antenna. Near-field and far-field radiation also respond differently to diffraction.
While in most cases the far-field performance of an antenna is prioritized, certain
communication standards such as near-field communication (NFC) operate with ex-
actly this near-field radiation.
As a rule of thumb, the near-field is often defined as distances of up to 1
wavelength from the antenna, and distances beyond 2 wavelengths are counted toward the
far-field (also called the Fraunhofer zone). In between, there is a transition zone (the

Fresnel zone) with properties of both fields. In the near-field, electric and magnetic
fields appear separated by 90°. The electrical field strength decays as 1/r³, while the
magnetic field decays with 1/r². In the transition zone, both fields decay at 1/r². Once
entering the far-field, the electrical and magnetic fields fall into phase, and both
decay at 1/r. A WiFi signal at 2.4 GHz has a wavelength of about 12.5 cm. That
is, already at most 25 cm from the antenna, the signal has all properties of an electromagnetic wave
and can penetrate building structures, whereas chances are that you may experience
difficulties placing a router inside a cabinet.
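The rule of thumb above can be sketched numerically; the 1- and 2-wavelength boundaries follow the text, while the helper names are our own:

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s

def wavelength_m(frequency_hz: float) -> float:
    """Wavelength in meters, lambda = c / f."""
    return C0 / frequency_hz

def field_regions(frequency_hz: float) -> dict:
    """Near-field (up to 1 wavelength) and far-field (beyond 2 wavelengths)
    boundaries, following the rule of thumb in the text."""
    lam = wavelength_m(frequency_hz)
    return {"near_field_up_to_m": lam, "far_field_from_m": 2 * lam}

# A 2.4-GHz WiFi signal: wavelength ~12.5 cm, far-field from ~25 cm.
print(field_regions(2.4e9))
```

For a sub-gigahertz radio, the same calculation yields boundaries of a meter or more, which matters for antenna placement.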
The choice of polarization often depends on the frequency of the signal. Low-
frequency (long wave) signals can benefit from vertical polarisation (vertical to
the ground) and mid-frequency signals from horizontal polarisation. The reason is
that these frequencies experience different types of reflection. Long wave signals
experience little dampening near the surface, are not reflected by the atmosphere,
and mostly follow the curvature of the Earth. Short wave signals, in contrast, are
reflected by air molecules but are more easily absorbed near the ground. Ultra-
high frequencies that are prevalent in most wireless communication devices usually
require line-of-sight propagation. As explained before, these signals are often sent
out with random or circular polarization to increase penetration. However, in home
applications (e.g., wireless routers), horizontal polarization is often desired to focus
the antenna on roughly the height where most devices are located.
The radiation pattern can be exactly calculated for simple antenna types like
the dipole, but must be measured experimentally for others. The NEC-2X antenna
simulation software is one freely available program for a variety of platforms. Al-
though the spatial pattern of an antenna is three-dimensional, it is usually displayed
in only two dimensions and normalized to a maximum gain (see Figure 1.25). In
the main direction, the shape is defined by azimuth and elevation. Additional visual
features include the region in which gain can be observed, as well as minor or
side lobes.
One of the most common antenna designs is the Yagi (see Figure 1.26), named
after one of its inventors, Hidetsugu Yagi. It is an extension of the dipole concept,
featuring additional rods that are spaced at defined fractions of a wavelength and which are mounted
in parallel to the dipole. These are referred to as directors. Behind the dipole, there
are additional elements called reflectors. Through the correct configuration of the
number of directors, their spacing and individual length, the Yagi can combine the
signals that are received or sent.
Figure 1.25 Antenna diagram. It captures the directionality and strength of the antenna. An omni-
directional dipole would fill the isotropic radiator profile. More directional antennas have a main lobe
(principal direction), side lobes and a back lobe. Ideally, side and back lobes are minimal for a directional
antenna. The −3 dB mark defines the half-maximum signal strength, and the width of the lobe at this
mark (the 3 dB beam width, or full width at half maximum) explains how focused the beam is. The
zero-point beam width helps to understand whether the signal strength falls uniformly over the width of
the beam, or drops sharply at a point.
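Decibel figures such as the −3 dB mark translate into linear power ratios via the standard relation ratio = 10^(dB/10); a quick numeric check (the function name is our own):

```python
def db_to_power_ratio(db: float) -> float:
    """Convert a decibel value into a linear power ratio."""
    return 10.0 ** (db / 10.0)

# -3 dB is (almost exactly) half power, which is why the 3 dB beam width
# is also called the full width at half maximum.
print(db_to_power_ratio(-3.0))   # ~0.501
print(db_to_power_ratio(0.0))    # 1.0
```

The same conversion applies to antenna gain figures quoted in dB: every 3 dB of gain corresponds to roughly a doubling of radiated power in the main direction.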


Figure 1.26 The Yagi antenna. It uses a dipole driver element along with various director elements and
an optional reflector. The dipole alone would emit uniformly in both directions of the X-axis; see "driver
emission" in the middle panel. The director elements mimic this signal, but slightly offset (phase-shifted)
relative to the driver. In the forward direction, this leads to constructive interference and an enhancement
of the signal, while in the backward direction, due to destructive interference, the signals cancel each
other out.

Regulation

The radio spectrum that is used for technical applications reaches from several
hertz to 3,000 GHz. The use of these base frequencies is governed by the International
Telecommunication Union (ITU), which has also, somewhat arbitrarily, named the different fre-
quency bands (see Table 1.2). It is noteworthy that the frequencies used for digital
data transmission (see Section 19.3 on wireless communication standards) occupy a
very narrow band of the overall spectrum.

1.2.3.4 Frequency Modulation

Electromagnetic waves themselves carry only very little information: We can mea-
sure their amplitude, their frequency and their polarization. However, in order to transmit
more information per unit of time, we can modulate the signal; that is, continuously
control the variation of the waveform. The basic wave is referred
to as a carrier signal.
The oldest modulation methods are amplitude modulation (AM) and fre-
quency modulation (FM) (see Figure 1.27), in which the carrier signal changes in
amplitude or frequency, respectively. In the case of AM, the amplitude of the carrier wave is
modulated in proportion to the signal wave. In the case of FM, the
carrier wave is altered in its frequency: at zero points of the signal wave, the carrier's
resting frequency remains unchanged, whereas positive and negative peaks in the
signal wave are encoded as a momentary increase (+Δf) or decrease
(−Δf) of the carrier wave's frequency.
A slight shift in phase (less than half a wavelength) can also be used for
encoding signals; this is called phase modulation. However, phase modulation is
not widely utilized in analog electronics. In digital communication, where the signal
is encoded in a binary stream of 0s and 1s, a phase shift may, for example, indicate 0
when the shift occurs at the upper amplitude and 1 when it occurs at the lower
amplitude.
In contrast to amplitude modulation that can directly be used to transmit, for
example, acoustic signals, digital modulation requires the knowledge of a protocol
or convention (Figure 1.27). There are almost a dozen digital modulation methods,
such as:
• Phase-shift keying (PSK)
• Frequency-shift keying (FSK)
• Amplitude-shift keying (ASK)


Figure 1.27 Signal modulation. (A) Analog modulation techniques by amplitude (AM) or frequency
(FM). There are two basic strategies to transmit a temporally variable signal over an otherwise stable
carrier wave. In the case of AM, the strength of the signal directly determines the strength of the carrier
wave. In the case of FM, the strength of the signal is translated into a momentary shift in frequency of the
carrier wave. (B) Digital modulation. Phase modulation: Amplitude and frequency of the carrier wave are
independent of the signal. A change in the logic level is indicated by breaking signal symmetry. When,
how often and how the symmetry is broken depends largely on the agreed encoding of the signal (see
Section 18.3). Frequency and amplitude modulation effectively use two different frequency or amplitude
settings to indicate a change in the logical level.

Derivatives of PSK are frequently used in wireless data communication like
WiFi and Bluetooth (here, it is not to be mistaken for the other PSK [preshared
key], a concept from data security). Phase-shift keying is in practice achieved by
alternating between sine and cosine output waves. An advantage of digital modulation
methods is their intrinsic ability to be combined with error detection techniques.
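The three keying schemes above can be sketched as a toy waveform generator. All parameter values and function names below are arbitrary illustrations of our own, not any real radio standard:

```python
import math

def modulate(bits, scheme, carrier_hz=10.0, sample_rate=1000, bit_time=0.1):
    """Toy sketch of ASK/FSK/PSK: one sinusoidal carrier whose amplitude,
    frequency or phase encodes each bit of the binary stream."""
    samples = []
    n_per_bit = int(sample_rate * bit_time)
    for i, bit in enumerate(bits):
        for n in range(n_per_bit):
            t = (i * n_per_bit + n) / sample_rate
            if scheme == "ASK":    # two amplitude levels
                amp = 1.0 if bit else 0.3
                samples.append(amp * math.sin(2 * math.pi * carrier_hz * t))
            elif scheme == "FSK":  # two frequencies
                f = carrier_hz * (2 if bit else 1)
                samples.append(math.sin(2 * math.pi * f * t))
            elif scheme == "PSK":  # two phases, 180 degrees apart
                phase = math.pi if bit else 0.0
                samples.append(math.sin(2 * math.pi * carrier_hz * t + phase))
    return samples

wave = modulate([0, 1, 1, 0], "PSK")
print(len(wave))  # 400 samples: 4 bits x 100 samples per bit
```

A receiver that knows the agreed scheme and timing can recover the bit stream from such a waveform, which is exactly the protocol knowledge the text refers to.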

Table 1.2
Use of Frequency Bands

Band | Code | ITU | Frequency & Wavelength | Application
Extremely low, super low and ultralow frequency | ELF, SLF, ULF | 1-3 | 3 Hz - 3,000 Hz, 100,000 - 100 km | Military communication, submarines
Very low frequency | VLF | 4 | 3 - 30 kHz, 100 - 10 km | Navigation, medical devices, metal detectors
Low frequency | LF | 5 | 30 - 300 kHz, 10 - 1 km | Clock time signal, AM radio, RFID, amateur radio
Medium frequency | MF | 6 | 300 - 3,000 kHz, 1,000 - 100 m | AM radio, amateur radio
Very high frequency | VHF | 8 | 30 - 300 MHz, 10 - 1 m | FM radio, television broadcasts, amateur radio, weather radio, mobile communication
Ultrahigh frequency | UHF | 9 | 300 - 3,000 MHz, 100 - 10 cm | Television broadcasts, microwave applications, mobile phones, 2.4-GHz WiFi, Bluetooth, ZigBee, GPS, digital radio
Super high frequency | SHF | 10 | 3 - 30 GHz, 10 - 1 cm | Satellite communications, 5-GHz WiFi, radar
Extremely high and tremendously high frequency | EHF, THF | 11-12 | 30 - 3,000 GHz, 10 mm - 100 µm | Scientific applications
Chapter 2
Electronics

Chapter 1 on electricity and electromagnetism has introduced you to the physical
principles that are behind any modern technology, be it the condition-dependent
flow of current in a sensor or the exchange of information through electromagnetic
waves. This chapter will provide a very brief overview of common electric com-
ponents, the building blocks of every circuit. We will then discuss the difference
between analog and digital circuits, and see how modern integrated circuits and
multipurpose computers take information processing to a meta-level, which is in-
dependent of the considerations of electricity and basic circuit design. While power
grids supply alternating current (AC), most consumer electronics and probably most
devices in the Internet of Things are operating on direct current (DC). We will there-
fore concentrate on DC electronics and only allude to AC where it is essential.
This chapter cannot replace a degree in electrical engineering. Depending
on the reader’s degree of specialization and need for detail, standard works such
as Horowitz and Hill’s The Art of Electronics and/or manufacturer data sheets are
highly recommended for further studies.

2.1 COMPONENTS

Electric components are defined as devices that cannot be further disassembled
without losing their electrical functionality (see Figure 2.1). In contrast to electrical
elements that represent idealized electric components in calculations, real-life com-
ponents often suffer from inaccuracy, energy loss and efficiency issues. The catego-
rization of components is somewhat fuzzy: We speak of passive components if their
functioning is not dependent on additional electric energy and if their characteristics


cannot be actively controlled (through electric current). These are inductors, resis-
tors and capacitors. Diodes can be either passive or active, but are usually counted
toward the latter group. Other active devices are transistors or integrated circuits
that combine active and passive components. These are often used to control, rec-
tify or amplify currents, depending on the state of a control signal and additional
electric energy. The aforementioned passive and active components are available
either as through-hole components or in various surface-mount packages (e.g.,
small-outline transistor, small-outline integrated circuit). Other active components
derive their energy from physical or chemical processes, such as stretch deformation
(piezo), heat/cold (Peltier), generators or batteries. Together with other complex
components that provide higher-level functionality such as input/output, displays
and many others, these are not components in a strict sense and we shall discuss
them in more depth in the hardware section (see Part V) of this book.
This section aims to deliver a general overview of the core components
often encountered in simple circuits. The selection of appropriate components is an
important step in circuit design and the market offers a huge choice of specialized
technology. As such the devices that are presented here can only be exemplary, but
should provide a sufficient entry point for further studies with other introductory
works into electronics.

2.1.1 Passive Components

2.1.1.1 Resistors

Probably the most common component in any electric circuit is the resistor. Re-
sistors are two-terminal components that limit the amount of current that can go
through a circuit at any one time, thereby also lowering the electric potential differ-
ence before and after the resistor (see Section 1.1.3.4). Resistors consume power,
which is primarily dissipated as heat and to a much lesser degree as inductance.
This can be problematic as the resistance of many materials is influenced by their
temperature, and the heat-up can therefore change the accuracy of the component.
The basic material and design of the resistor is therefore dependent on the desired
resistance, accuracy and sensitivity to noise.
The simplest resistors are small axial components with layers of carbon
or less conductive metal oxide that block some of the current in between the
supply wires. In some cases this material is applied directly to printed circuit
boards (PCBs) to bridge between conductive traces. Some resistor types require the
use of semiconductive material, for example the light-dependent resistors (LDRs)

Figure 2.1 An attempt at a taxonomy of electronic components. Passive components comprise in-
ductors, resistors and capacitors. Diodes and transistors are active components. Integrated circuits can
combine both active and passive basic components, but are themselves usually classified as active com-
ponents. In principle batteries and various electromechanical devices count as components also, although
they can get to a level of complexity that would warrant their classification as stand-alone devices.

in which the energy of the incoming light enables electrons to overcome the
bandgap. Other variable resistors are either manually preset (potentiometers) or
dependent on different physical, chemical or mechanical triggers. Potentiometers
often provide some mechanism to alter the distance that the current has to pass
through some less conductive material, which could be a knob or a slider. Resistors
that respond to physical, chemical or mechanical stimuli are, for example, the
aforementioned LDR, the humistor (consisting of a mineral core whose conductivity
changes with humidity) or force-dependent resistors (which follow the principle of
the potentiometer, but the distance between terminals is regulated by pressure or
stretch forces). Variable resistors can have linear or logarithmic characteristics: a
linear potentiometer responds to a change from position 1, to 2, to 3 with equal
steps in resistance, while a logarithmic one increases by a roughly constant factor
from step to step.
Resistors are available from fractions of an Ω to several megaΩ. The colored
bands on axial resistors encode their resistive value as well as their tolerance
(typically a few percent). In the case of other packaging types, such as surface-
mount components, this information can be found printed on the device. The power
rating of a resistor determines how much it can heat up before failing; usual
values are fractions of a watt. Higher power ratings are possible, but these
are specialist devices with special heat-dissipating ability.
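The color-band encoding can be sketched as a small lookup table. The band tables below follow the standard four-band color code; the function name and example values are our own illustration:

```python
# Decoding a four-band axial resistor: two significant digits,
# a multiplier band, and a tolerance band.
DIGITS = {"black": 0, "brown": 1, "red": 2, "orange": 3, "yellow": 4,
          "green": 5, "blue": 6, "violet": 7, "grey": 8, "white": 9}
TOLERANCE_PCT = {"brown": 1.0, "red": 2.0, "gold": 5.0, "silver": 10.0}

def decode_resistor(band1, band2, multiplier, tolerance):
    """Return (resistance in ohm, tolerance in percent) for four bands."""
    value = (10 * DIGITS[band1] + DIGITS[band2]) * 10 ** DIGITS[multiplier]
    return value, TOLERANCE_PCT[tolerance]

# yellow-violet-red-gold: 47 * 10^2 = 4700 ohm, +/- 5%
print(decode_resistor("yellow", "violet", "red", "gold"))  # (4700, 5.0)
```

Five-band precision resistors work the same way but carry three significant digits before the multiplier.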

2.1.1.2 Capacitors

Two-terminal capacitors are the second most common component in simple circuits.
They store electric charge between two surfaces and dissipate it over time (see
Section 1.1.3.3). We therefore find capacitors in applications where a current buffer
is beneficial (e.g., to compensate for high peak currents).
Capacitors exist with a huge range of capacities, ranging from picofarads
(pF) to several farads. For many applications it is sufficient to use a component
in the submicrofarad range, as even millifarad (mF) capacitors count
as big, and larger devices are often referred to as supercapacitors. As capacitance is
mainly a function of storage surface, capacitors are available in small surface-mount
packages and axial lead form, but can occasionally have a volume of several cm³.
One application of capacitors is filtering particular frequencies from alternat-
ing current (see Figure 2.2). The time constant τ = R · C gives an indication of
the time a capacitor needs to charge to 63% of its maximum capacity and maximal
voltage Vmax, or to discharge to 1/e of Vmax after it has been fully charged.

Figure 2.2 Application of capacitors in filters, and the charge/discharge curve. For a low-pass filter, if
Vin is temporally constant, Vout will follow with a short delay that depends on the time constant
(τ = RC) of the capacitor. If the amount of time required to charge the capacitor is significantly longer
than a fast-changing signal Vin, then the outgoing Vout is largely suppressed (i.e., low frequencies
can pass, whereas high frequencies cannot). In a high-pass filter, slow changes in the signal yield the
continuous charging and discharging of the capacitor, with only small amounts of current going through
the load.

At each time point t, the voltage across the capacitor is:

V(t) = Vmax · (1 − e^(−t/τ))
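The charging equation can be checked numerically; the component values below are our own illustration:

```python
import math

def capacitor_voltage(t_s, v_max, r_ohm, c_farad):
    """Voltage across a charging capacitor after t seconds, following
    V(t) = Vmax * (1 - e^(-t/tau)) with time constant tau = R * C."""
    tau = r_ohm * c_farad
    return v_max * (1.0 - math.exp(-t_s / tau))

# With R = 10 kOhm and C = 100 uF, tau = 1 s: after one time constant
# the capacitor has reached ~63% of Vmax, as stated in the text.
v = capacitor_voltage(1.0, 5.0, 10_000, 100e-6)
print(f"{v:.2f} V of 5 V")  # ~3.16 V
```

After five time constants the capacitor is, for practical purposes, fully charged (above 99% of Vmax).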

Capacitors are sensitive to excess voltage; when applying voltages that are too
high for their specification, capacitors can explode. In general, capacitors suffer
from a range of shortcomings, making them a necessary evil in many circuits. While
precise capacitors with tight tolerances are available for a price, more frequently
one can find components with considerably larger tolerances. Furthermore,
capacitors are subject to leak currents, the continuous loss of charge. While
a good capacitor can hold its charge for hours or days in the absence of any load,
excessive leak current can shorten this time significantly. As capacitors also have
an internal resistance (the so-called equivalent series resistance [ESR]), they may
heat up if any loads require significant currents. This loss of functionality makes
capacitors a common point of failure.
The cheapest and most commonly used capacitors are small ceramic ones, which are
available from the picofarad up to the lower microfarad range. Together with (metal) film capacitors,
their specifications are usually better than those of electrolytic devices. One can often
recognize ceramic capacitors by their bloblike epoxy coating, and film capacitors
as plastic blocks. In contrast to ceramic and film capacitors, electrolytic capacitors
usually feature cylindric shapes. While electrolytic devices can have significant
capacitance, their tolerance and ESR are much worse. Also, electrolytic capacitors
are polarized devices and can explode if installed incorrectly. Tantalum capacitors
share many features of electrolytic capacitors, but are less prone to heat and a

better choice for high-frequency applications. However, they are more expensive
than electrolytic ones as tantalum is a rare metal.
The specifications of capacitors are usually printed on them, albeit in a
device-specific shorthand. Depending on the package, these range from 3- or 4-
digit numbers encoding the capacity to longer codes including letter symbols that
encode the tolerance.

2.1.1.3 Inductors

Inductors are two-terminal components that build up a magnetic field around a coil.
This field is maintained as long as current is flowing and decays if the current
stops. The magnetic field then induces a voltage in the inductor in a time-dependent
manner, leading to a decoupling and phase-shift between voltage and current (see
Section 1.2.3.2 for further explanation).
Inductors are most relevant in AC applications where the current frequently
changes in direction, for example in radio frequency generators. They can also be
used to block high-frequency AC while allowing DC (or low-frequency AC) to pass.
Coils can also find applications where the electromagnetic properties are used to
interact with other magnetic matter; for example, in relays to lift a magnetic switch.
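The frequency-dependent blocking behavior mentioned above can be quantified through the inductive reactance, X_L = 2π · f · L, a standard AC result not derived in this text. A short numeric sketch (the values are our own illustration):

```python
import math

def inductive_reactance(frequency_hz: float, inductance_h: float) -> float:
    """Opposition (in ohm) an ideal inductor presents to alternating
    current of a given frequency: X_L = 2 * pi * f * L."""
    return 2.0 * math.pi * frequency_hz * inductance_h

# A 10 mH coil barely impedes 50 Hz mains-frequency AC (~3 ohm) but
# strongly blocks a 1 MHz radio-frequency signal (~63 kOhm).
print(inductive_reactance(50, 10e-3))    # ~3.1 ohm
print(inductive_reactance(1e6, 10e-3))   # ~62832 ohm
```

This is why a series inductor lets DC and low-frequency AC pass while suppressing high-frequency components.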
The simplest inductors are plain coils of wire. Together with inductors
where the wire is wound around a nonmagnetic core (e.g., made from ceramic),
these fall into the category of air core inductors. While the air core is less inductive
than that of the ferromagnetic core inductors, their ideal-versus-real life properties
are more aligned. Ferromagnetic cores suffer from current losses due to the hetero-
geneity of eddy currents within the magnetic material. In many applications there
is a shift away from simple inductors toward technically more complex gyrator
devices with better performance.
Typical inductors have inductances from the microhenry range (small axial ceramic core)
to several henry (large coils with a ferromagnetic core). As with resistors, the value of
axial through-hole components is encoded in colored bands, whereas surface-mount
inductors feature printed data.

2.1.1.4 Diodes

Diodes are two-terminal, polarized components that allow current to pass in only
one direction. They are often referred to as rectifiers, devices that can filter out
one direction of alternating current, such that only directed current remains in the
circuit. Modern crystal diodes are based on semiconductor physics (see Section
Figure 2.3 Rectifier bridge. Alternating current has positive and negative voltage maxima, with the
sign indicating the direction of current. Two pairs of diodes, here drawn with the same orientation and
diagonally opposite to each other, act together to facilitate that only rectified (DC) current reaches
the (+) terminal. The (−) terminal remains grounded throughout.

1.1.2). Simple diodes consist of a single p-n junction and act as a valve. The most
common ones are axial components with a mark that indicates the cathode (in the
sense of a DC circuit, the current from anode to cathode is let through), or small
surface-mount packages. Diodes have two ratings: current rating specifying the
maximum current that can be conducted through them, and a peak inverse voltage
(PIV) that a diode can act against before it breaks down and allows current to pass
in the inverse direction. Common silicon diodes have a so-called threshold or cut-in
voltage of about 0.7 V. Bridge rectifiers are ensembles of four diodes that utilize
both the positive and negative phase of AC to provide a (semi-) continuous DC
signal (see Figure 2.3). In real-life applications, rectifier circuits are buffered with
capacitors to provide a continuous DC output.
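Numerically, an ideal bridge (four lossless diodes, an assumption made here only for illustration) outputs the magnitude of its AC input, folding the negative half-wave upward:

```python
import math

def bridge_rectifier(v_in: float) -> float:
    """An ideal diode bridge passes the magnitude of the AC input voltage."""
    return abs(v_in)

# One 50 Hz cycle of a 10 V amplitude sine, sampled every millisecond:
samples = [10 * math.sin(2 * math.pi * 50 * t / 1000) for t in range(20)]
rectified = [bridge_rectifier(v) for v in samples]

# The (+) terminal never sees a negative voltage; a buffer capacitor
# would then smooth the remaining ripple toward a continuous DC level.
assert all(v >= 0 for v in rectified)
```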
The breakdown of a diode can be a desired event (see Figure 2.4). Picture
a subcircuit that under normal operating voltages does not see any current as
it is blocked by an inverse diode. If the voltage exceeds the PIV of the diode, it
breaks down and allows current to enter the subcircuit. Depending on the design
of the circuit and the subcircuit, this may save the device from high-voltage
damage. Zener diodes have p-n junctions that allow for much lower PIV than
normal diodes. They are designed for this purpose and can also be used for voltage
regulation in combination with a resistor. Schottky diodes feature a low forward
voltage drop and extremely fast switching times (i.e., very short delays to go from
a conductive to nonconductive state once the voltage has dropped lower than the
breakdown specification). This can be used for filtering purposes such as in radio
communication applications. The breakdown voltage that indicates the directional
switch is not to be confused with the power rating of a diode: if the current and
voltage (their product!) exceed certain limits, the diode will physically break. Zener
and Schottky diodes come in axial and surface-mount form factors.
52 The Technical Foundations of IoT

Figure 2.4 Diode breakdown. Breakdown can be a desired feature of a diode. While normal rectifier
diodes have a large reverse voltage (i.e., they are very resistant to current in one direction), Zener diodes
have much lower resistance to reverse voltage and are often used for switching applications. Note the
different scales of the x-axis for reverse voltage Vr and forward voltage Vf.

Photo Diodes

Photo diodes are conceptually similar to light-dependent resistors, although they
act in binary switching applications. While the LDR allows current to flow in light,
its resistance decaying with brightness, photo diodes mediate an on or off state,
allowing current to flow in the dark and not in light. The underlying principle of a
photo diode is that the semiconducting material is exposed to the outside through a
small window, and incoming photons create electron holes that propagate through
to the anode. The spectral sensitivity of photo diodes depends on the choice of
semiconducting material. Photo diodes can occasionally be recognized by small
spherical protection caps that cover the detector.

Light-Emitting Diodes

Light-emitting diodes (LEDs) use electroluminescent p-n junction semiconductors.
All diodes emit some form of energy when forward-biased, usually in the form
of heat. In LEDs the composition of the material allows the fine-tuned emission
of visible light, ranging from around 530 nm (green) to around 660 nm (red). The peak current
specifies the current that breaks the diode; therefore it is usually recommended to
use an LED together with a current-limiting resistor. LEDs are ubiquitous and do not require
any further description of their shape and form. It is worth mentioning that the
voltage drop that occurs across LEDs is dependent on their color and hence doping
agent. While for a typical forward current of 20 mA the drop is just over 2 V for red LEDs,
it is between 3 V and 3.5 V for green and blue variants.
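The usual sizing rule for that limiting resistor follows directly from Ohm's law: the resistor must drop whatever supply voltage remains after the LED's forward drop. The component values below are illustrative, not taken from the text:

```python
def led_series_resistor(v_supply: float, v_forward: float, i_forward: float) -> float:
    """Resistance (ohm) that limits the LED current to i_forward (ampere)."""
    return (v_supply - v_forward) / i_forward

# Red LED (~2 V drop) at 20 mA from a 5 V rail:
print(led_series_resistor(5.0, 2.0, 0.020))  # -> 150.0 ohm
```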

2.1.2 Active Components

2.1.2.1 Transistors

Transistors, short for transfer resistors, are without doubt the most important in-
vention in electronics in the last century. From single transistors that can be used
in breadboard circuits to switch or regulate current to modern computer processors
with billions of microscopic transistors, their first technological ancestors date back
to the 1920s and 1930s. Since then, transistors have been a hotbed of innovation.
The two most important groups of transistors are bipolar junction transis-
tors (BJT) and field effect transistors (FET). In principle both groups serve the
same purpose, switching and regulation, although the way they are controlled is
fundamentally different. While BJTs are controlled through current and have low
Table 2.1
Junction Bias in NPN and PNP Transistors

Voltage at    B/C       B/E       Outcome

NPN junction
E < B < C     Reverse   Forward   Active
E < B > C     Forward   Forward   Saturation
E > B < C     Reverse   Reverse   Cutoff
E > B > C     Forward   Reverse   Reverse-active
PNP junction
E < B < C     Reverse   Forward   Reverse-active
E > B < C     Reverse   Reverse   Cutoff
E < B > C     Forward   Forward   Saturation
E > B > C     Forward   Reverse   Active

input impedance, FETs use an electron potential difference to establish current
flow through the device and have higher impedance. As the name suggests, BJTs
are bipolar, meaning that the conducting channel uses both N- and P-type semiconductor
material (see Section 1.1.2). In contrast, FETs are unipolar and therefore also
simpler to implement on smaller footprints. Integrated circuits and microprocessors
usually operate with FETs.
A bipolar junction transistor has collector (C), emitter (E) and base (B)
terminals (see Figure 2.5). The collector-emitter axis indicates the flow of current
if the transistor is active, with the arrow of the emitter showing the conventional
direction of current. In NPN transistors, the base current induces an injection of
negative charges from the emitter that diffuse towards the collector, whereas in
PNP transistors it is the collection of electron holes that enables a current to flow.
Depending on the relative voltages at the E, B and C terminals, the n-p or p-n
interfaces within the transistor create a forward or reverse bias, yielding an overall
outcome following Table 2.1. It is worth mentioning that in practice NPN transistors
are more commonly used than PNP for their cheaper production cost and faster
response time. The activation of an NPN transistor is shown in Figure 2.6.
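As a rough software illustration (not from the text), the bias-to-outcome mapping of Table 2.1 for an NPN device can be written as a simple lookup; the dictionary keys mirror the B/C and B/E bias columns:

```python
# Operating region of an NPN BJT from its junction biases (cf. Table 2.1).
NPN_REGIONS = {
    ("reverse", "forward"): "active",
    ("forward", "forward"): "saturation",
    ("reverse", "reverse"): "cutoff",
    ("forward", "reverse"): "reverse-active",
}

def npn_region(bc_bias: str, be_bias: str) -> str:
    """Look up the operating region from the B/C and B/E junction biases."""
    return NPN_REGIONS[(bc_bias, be_bias)]

# A forward-biased B/E junction with a reverse-biased B/C junction
# puts the transistor into its normal amplifying (active) region:
print(npn_region("reverse", "forward"))  # -> active
```

For a PNP device the active and reverse-active outcomes swap, as in the table.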
Historically, BJTs are about 20 years younger than FETs, which were con-
ceived in the 1920s but first became commercially available in 1960. Analogously
to the BJTs, we are referring to N- or P-types for both the junction (JFET) and metal-
oxide-semiconductor (MOSFET) FET families. For reasons of simplicity, we will

Figure 2.5 Transistor circuits and rough anatomy of a BJT. (A) Exemplary circuits for NPN (left) and
PNP (right) transistors. If no current⊗is applied to the base (B) of the NPN transistor, the circuit from V+
to GND is not closed and the lamp ( ) is off. The base current (from the control subcircuit) triggers the
flow of current into the load subcircuit in the transistor. Phenomenologically, this is as if a direct physical
connection between the collector (C) and emitter (E) of the transistor has been established, putting the
lamp into an active circuit. As it takes very little current to activate the transistor, it is advisable to secure
the base with a current-limiting resistor. The same operational principle is true for the PNP transistor circuit: the
base current establishes an electrical connection between emitter and collector (although with inverse
orientation), switching the light on. However, while in the case of the NPN transistor we are looking
at the flow of current from the base towards the emitter, in the case of the PNP transistor it is current
coming from the emitter through the base to GND, which activates the transistor. (B) The underlying
principle of NPN (left) and PNP (right) transistors. Note that now we are looking at the movement of
electrons, which travel in the opposite direction of conventional/technical current. The NPN transistor
sees electrons (coming from GND) injected at the emitter and collected at the collector. This is facilitated
by some positive charge (electron holes) coming in from the base — one could also say that electrons
are allowed out of the base. While a small amount of charge gets lost in a process called recombination,
the overall flow of electrons from E to B (the n-p junction), and from B to C (the p-n junction) is higher.
The relative voltage over the B/E and B/C junction determines whether the transistor is switched on
or off (see Table 2.1). The PNP transistor works conceptually very similarly, although because of the
different doping used in PNP transistors, it is primarily positive charge carriers that are responsible for
the flow of current. As positive charge carriers translocate slower than electrons, PNP transistors exhibit
considerably slower switching behavior than NPN ones.

Figure 2.6 The switching mechanism of an NPN transistor explained. Flow of current in a BJT over
time. In an unconnected circuit (t=0) the boundary layers between B/E and B/C are as described for
simple n-p junctions (see Figure 1.5). Note the shift of the boundary layer between t=1 and t=2
when current is applied at collector and emitter. At t=3 voltage is applied at the base, leading to the
breakdown of the P/NE boundary. While some free electrons are lost through recombination, electron
holes are filling up (t=4). During t=5 the boundary layer between P/NC is finally disappearing,
providing complete continuity between NE and NC at t=6.

Figure 2.7 MOSFET vs. JFET. Cross section through n-MOSFET and n-JFET devices. Both types are
often made directly on a silicon base, and they are grown using different types of doped silicon. Source
(S) and drain (D) are directly in contact with the n-doped material. In both cases, the gate (G) is separated
from the conductive material and exerts its effect by building up an electric field.

concentrate on the differences between the N-type devices for both groups (see Fig-
ure 2.7). While BJTs are current-controlled devices (charges of the base directly par-
ticipate in the load circuit), in FETs the voltage at the gate (G) determines whether
current can flow between the source (S) or the drain (D). It is the electric field that
gives FETs their name and that determines if current can flow in the underlying
layers. In contrast to BJTs, the current remains continuously in one substrate (here:
n-type semiconductor), although variants of the MOSFET can utilize the so-called
n-gap. If present and the n-channel is continuous, current between S and D can
flow by default, but the gate voltage can limit it (depletion mode). If absent, the
gate voltage enables the flow of current (enhancement mode). MOSFET devices are
easier to fabricate and smaller than JFETs and are the principle component of most
modern electronics; however, they are prone to electrostatic damage. MOSFETs
also dissipate very little energy.
Phototransistors are among the more common special transistor types.
They omit base or gate and exhibit their regulatory function when activated through
light. In contrast to BJTs, they usually feature a separating semiconductor layer
within a single p-n junction.
Due to the versatile and ubiquitous nature of transistors it is difficult to
summarize all the form factors in which they are available, ranging from large,
metal-shielded cylinders with three leads to surface-mount packages to integrated
circuit versions. The key specification for the choice of a transistor is usually its
absolute maximum ratings: BJTs are sensitive to large collector-to-base or emitter-
to-base voltages, and FETs are sensitive to excessive gate voltage. Just as for diodes,
it is important to consider the reverse voltage in cases where polarity might change
in a circuit. The total maximum power dissipation is relevant in amplification
circuits, not for switching applications. It is often directly linked to the IC/IB
and IC/IE factors that express the amount of current amplification a transistor
can provide. Power MOSFETs are usually the choice for high-current applications,
working together with capacitors to ensure the continuous supply of current for
electric motors etc.

2.1.2.2 Integrated Circuits

Integrated circuits (ICs) are devices that embed more complicated circuits in a
simple package with defined interfaces. In simple terms, ICs are thus the hardware
equivalent to functions in a programming language. Notable and widely known
examples are the NE555 timer IC, the 7400 transistor–transistor logic (TTL) series,
the 4000 complementary metal-oxide semiconductor (CMOS) series, and many
other devices such as voltage regulators or operational amplifiers.
The broad definition of what constitutes an IC means they can incorporate
both passive and active components, ranging from a few diodes and capacitors to
billions of transistors. In terms of their output, we can broadly distinguish analog
ICs (where the inner workings of the IC determine an output from a continuous
range of voltage and/or current) and digital ICs (where the outcome is either HIGH
or LOW, ON or OFF, see 2.2.1). Voltage regulators and operational amplifiers fall
into that first category, while logic gates, computer memory and microprocessors
belong to the latter.
In the following sections we will briefly introduce a few common analog
ICs before discussing digital logic, logic gates and more highly integrated ICs.
Integrated circuits come in a wide range of usually standardized form factors.
Through-hole devices in dual inline package (DIP) are common for prototyping, and
small outline transistor (SOT) or small outline integrated circuit (SOIC) are surface-
mount devices. They have standardized dimensions and pin count. More complex
ICs such as microprocessors often have proprietary footprints and pin layouts.

Operational amplifier

Operational amplifiers (op amps) integrate a number of transistors, resistors and
capacitors to amplify a voltage (see Figure 2.8); for example, a millivolt-range input signal
may yield an output signal in the volt range. While such amplification is possible with a
single transistor, typical op amps provide a more versatile interface that allows
[Figure 2.8 panels: (A) the 741 pin layout — 1 offset null, 2 inverting input, 3 standard input, 4 GND, 5 offset null, 6 output, 7 V+, 8 not connected; (B) the op-amp circuit symbol; (C) the internal transistor-level schematic.]

Figure 2.8 Operational amplifier. The IC 741 is a good example of how complex functionality can be
provided with a defined interface. (A) Physical pin layout for the 741 from its data sheet. Note that pin 8
is not required but present to be pin-compatible with other 8-pin layouts. (B) The electric symbol for an
op amp, with pin numbers from the physical layout. (C) The inner workings of a 741 with 20 transistors,
10 resistors and a capacitor. The numbered terminals refer to the pins’ numbers.

amplification of an input signal, or both the inversion of a temporally variable
signal and its amplification over a wide range of input frequencies. Op amps are
therefore often used in radio frequency applications. Without going into the details
of their working and specifications, op amps are an excellent example of how a defined
functionality is available in a simple package, hiding away the complexity of the
actual circuit.
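Although the chapter deliberately skips op-amp internals, the closed-loop gain of the classic non-inverting configuration — a standard textbook circuit assumed here, with feedback resistor Rf and ground resistor Rg — illustrates how a defined amplification is obtained from the packaged device:

```python
def noninverting_gain(r_f: float, r_g: float) -> float:
    """Ideal closed-loop gain of a non-inverting op-amp stage: 1 + Rf/Rg."""
    return 1 + r_f / r_g

# With Rf = 9 kOhm and Rg = 1 kOhm the stage amplifies tenfold,
# so a 0.1 V input signal yields a 1 V output signal:
print(noninverting_gain(9_000, 1_000))  # -> 10.0
```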

Optocoupler

An optocoupler allows for the complete electronic separation of two circuits that
nevertheless have to communicate. They are useful in cases where the control signal
is weak and requires amplification, and also to relay the control signal from an
electronically sensitive device to an actuator with much larger operating current.
The separation is achieved through light, which is emitted from an LED and induces
a current in a photo diode. In Figure 2.9, the optocoupler IC integrates LED and
photo diode, as well as an entire operational amplifier circuit, to boost the signal.
This also exemplifies how functionalities can be combined in a modular way to
achieve more and more complex functions in simple packages.

Figure 2.9 Optocoupler. The optocoupler circuit is not drawn to scale. The IC integrates LED, photo
diode and a transistor along with an entire op amp as depicted in Figure 2.8. Signal and GND of the
LED are completely separated from the V+ and GND of the amplifier circuit. Once triggered, Voutput is
proportional to the signal.

Comparator

An analog comparator is a device with two inputs, nominally called V1 and V2,
and an output. If V1 is higher than V2, the output is sourced from the supply input
(V+); otherwise the output is connected to GND. Comparators have widespread
utility in hysteresis controllers, level shifters and other conversion applications. The
section on analog-to-digital conversion (3.2) shows an exemplary circuit involving
comparators.
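The behavior described above can be stated as a one-line sketch (an ideal rail-to-rail output is assumed, and the input names are placeholders):

```python
def comparator(v1: float, v2: float, v_supply: float = 5.0) -> float:
    """Output rails to the supply when v1 > v2, otherwise to GND (0 V)."""
    return v_supply if v1 > v2 else 0.0

print(comparator(2.0, 1.0))  # -> 5.0 (output sourced from V+)
print(comparator(1.0, 2.0))  # -> 0.0 (output connected to GND)
```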

Voltage regulator

Most circuit designs require a defined voltage for stable operation and a regulated
supply is therefore essential. There are various strategies to step the voltage in a
circuit with directional current up or down. Hence, voltage regulators (also called
Vreg) are very common ICs that accelerate the design and build of circuits that face
fluctuating voltages. They are complex devices that warrant entire books of their own,
making the choice of an optimal regulator a challenging task.

NE555 Timer IC

In combination with a capacitor and a resistor, the NE555 can be set up to provide
an output signal at very reproducible timings ranging from microseconds to hours.
It is at the core of many simple circuits for visual and acoustic alarms. Timing is an
essential task in many modern circuits, and thus the NE555 was a welcome device
that integrates 25 transistors, 15 resistors and 2 diodes. The NE555 was invented in
1971 and has since sold in very large quantities of nearly a billion per year.
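The text gives no timing formula, but the widely published approximation for the NE555 in its astable (free-running) configuration, f ≈ 1.44 / ((R1 + 2·R2)·C), shows how the external resistors and capacitor set the output frequency; the component values below are illustrative:

```python
def ne555_astable_frequency(r1: float, r2: float, c: float) -> float:
    """Approximate oscillation frequency (Hz) of the standard astable circuit."""
    return 1.44 / ((r1 + 2 * r2) * c)

# Two 10 kOhm resistors and a 10 uF capacitor give a few hertz --
# slow enough to blink an LED visibly:
print(round(ne555_astable_frequency(10e3, 10e3, 10e-6), 2))  # -> 4.8
```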
2.1.2.3 Generators, Piezo and Peltier Elements

The boundary between complex electronic components and other power, sensor
and actuator devices is somewhat ill-defined. For the purposes of this book, we
will only briefly cover the physical principles behind piezo and Peltier elements,
and concentrate on the devices that ultimately utilize those in Part V on Hardware.
By our earlier definition these active components require additional energy to
operate and/or they can be controlled electronically beyond their original design.
The chapter on electricity and electromagnetism (Chapter 1) already covered the
key principles behind generators and electric motors, which are good examples of
electromechanical devices.

Piezo Element

The so-called direct piezo effect uses mechanical force to generate an electric
current by moving positive and negative charges in a crystal toward each other.
The indirect piezo effect refers to the deformation of the same material by applying
a voltage. There are broadly speaking three applications of the piezoelectric effect
in electronics:
• For sensor applications and for energy harvesting
• To control movement (e.g., of motors or miniature speakers)
• For timekeeping and triggering
The most widely known piezoelectric material is quartz, silicon dioxide (SiO2 ). A
single silicon dioxide molecule is linear. In a crystal, however, the silicon atoms can
coordinate up to four oxygen atoms (see Figure 2.10). This allows for an atomic
packing in which the bonds between the atoms enable their dynamic and flexible
positioning within the lattice. A single silicon atom can be thought of as being tethered
to the crystal through springs. The speed at which the Si atom can vibrate within
such a system is called resonance frequency, a material-specific constant.
In timing applications the crystal is occasionally actively deformed by apply-
ing a small voltage for a short amount of time. The resulting resonance vibrations
can then be sensed through small fluctuations in current as the crystal relaxes, pro-
viding a trigger for counting and so forth. Timer circuits take care of the coordina-
tion between both modes.

Figure 2.10 Piezocrystal and its deformations. (A) Silicon (black) and oxygen (gray) atoms in a linear
silicon dioxide (SiO2 ) molecule. (B) Silicon and oxygen atoms in a quartz crystal. Note that the oxygen
atoms are depicted as smaller in this view as they are on a lower level in the stack. (C) Potential
deformations of a rectangular slice of quartz.

Peltier Element

Peltier elements are thermoelectric devices that use current to control the tempera-
ture of a metallic substrate. The heat/cold is often transferred onto a ceramic layer
that can keep the desired temperature longer. The elements utilize the physical effect
that the junction between two conductors either requires additional energy for
a charge to cross into the energetically less favorable substrate, or dissipates excess
energy when charge enters the conductor of a lower energetic state.
This thermoelectric effect occurs at any n-p junction of semiconductors and is usu-
ally considered a negative side effect. However, by utilizing specialized crystals and
doping, the effect can become much more pronounced and can be used in technical
applications. The thermoelectric effect works in both directions (i.e., it is possible
to generate a current by heating the interface between the n-p junctions). Thermo-
electric generators often use many such interfaces in series. While solar panels use
the photovoltaic effect to generate power from light, some systems use excess heat
to gain additional energy through the thermoelectric effect.
2.2 ANALOGUE AND DIGITAL CIRCUITS

The previous sections have featured a selection of analogue circuits: depending on
the application, they rely on a temporally constant voltage (e.g., to keep a light-emitting
diode shining) or a temporally changing voltage with otherwise uninterrupted
supply (e.g., to drive a motor), or the voltage itself is interpreted to yield
a particular outcome (e.g., a scaled version of the input signal using an operational
amplifier). Especially the continuous nature of the signal in the latter case emphasizes
the meaning of the term analog.

2.2.1 Logic gates

Digital circuits and digital logic differ from analog logic. For the sake
of simplicity, let us assume that digital design knows only two states: ON and OFF. Real
implementations are slightly more complex, as the voltage threshold for either state
can in principle be arbitrarily chosen. On and off are, depending on the context,
also referred to as TRUE and FALSE, HIGH and LOW, or 1 and 0. Especially in
the context of so-called Boolean algebra, the latter is prevalent.
Boolean logic defines basic operations like NOT, OR, AND, NOR (not OR),
NAND (not AND) and derived types such as XOR (exclusive OR). For example,
if A = 0, then NOT A (written Ā) = 1. A OR B can only be 0 if both A and B are 0;
otherwise it is 1. A truth table provides an overview of the logic functions (see
Figure 2.11). Through application of Boolean logic, it can be shown that all logic
operators can be expressed through combinations of either only NAND or only
NOR gates, respectively. Logic functions can be implemented in hardware through
combinations of transistors (see Figure 2.12). While the NOT and NOR gates are
indeed based on just a single transistor, logic gates and their derivatives are usually
packaged as ICs.
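The universality claim can be demonstrated in a few lines of Python (a logical sketch, of course, not a circuit-level model): every operator below is built from NAND alone.

```python
def nand(a: int, b: int) -> int:
    """The only primitive: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1

def not_(a: int) -> int:
    return nand(a, a)

def and_(a: int, b: int) -> int:
    return not_(nand(a, b))

def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))

def xor(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

# Reproduce the XOR column of the truth table in Figure 2.11:
print([xor(a, b) for a, b in ((0, 0), (0, 1), (1, 0), (1, 1))])  # -> [0, 1, 1, 0]
```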

2.2.2 Memory

Logic gates allow the implementation of memory; that is, the storage of a state over
a period of time (see Figure 2.13). This is one of the key prerequisites of modern-
day computing. Information storage on this lowest level is achieved by so-called
latches and flip-flops. The SR (set-reset) flip-flop serves to picture the idea behind
flip-flops (Figure 2.13A): the combination of two NOR gates produces the outputs
Q and Q̄ that take opposite states. If Q is high, Q̄ must be low, and vice versa. This
is achieved by routing the output of both NOR gates to an input of the other gate. If
NOT:          OR:             AND:
A | out       A B | out       A B | out
0 | 1         0 0 | 0         0 0 | 0
1 | 0         0 1 | 1         0 1 | 0
              1 0 | 1         1 0 | 0
              1 1 | 1         1 1 | 1

NOR:          NAND:           XOR:
A B | out     A B | out       A B | out
0 0 | 1       0 0 | 1         0 0 | 0
0 1 | 0       0 1 | 1         0 1 | 1
1 0 | 0       1 0 | 1         1 0 | 1
1 1 | 0       1 1 | 0         1 1 | 0

Figure 2.11 Logic functions and truth table. Input and output states for the logic functions NOT,
OR, AND, NOR, NAND and XOR. Note that while NOT knows only one input and one output, the other logic
functions combine two input states to derive a single output. The symbol for each logic function is shown
underneath the respective table. Note that negation (in NOT, NOR and NAND) is always indicated by a
small circle, and that OR-related functions have a rounded back and are pointed whereas AND-related
functions have a flat back and are half circular. At this level of understanding, the outcome of the logic
functions is purely arbitrary. Only when looking at the implementation on the level of transistors, see
Figure 2.12, can we see how the input states are combined to derive an output.

Figure 2.12 Implementation of logic gates. All logic gates are connected to a voltage-carrying HIGH
rail and a grounded LOW rail. The general principle behind the gate operations can be understood by
progressively working through the NOT, NOR, OR, NAND and AND examples: NOT: If A = 0 (off),
the transistor acts as barrier, and voltage (on, 1) flows to the output. If A = 1 (on), the transistor is
open, grounding the HIGH rail. As there is a resistor between the HIGH rail and the output in actual
implementations of the gate, the output is off (0). NOR: The NOR gate is an extension of the NOT gate,
but the transistor is controlled by both A and B lines (i.e., if either or both are on, the output is off). OR:
The OR function is the inverse of NOR, and can be achieved by having a NOT gate following the NOR
output. NAND: The NAND gate works similar to the NOR gate. If both A and B are on (1), the HIGH
rail gets grounded and the output is off (0). If none or only one of the transistors is on (1), the output
remains high (1). AND: As with NOR and OR, AND is just a NAND gate followed by a NOT gate.
the S (set) input is triggered, Q receives a current and is set high, while Q̄ is low. Q is
maintained high until R (reset) is triggered. That signal sets Q̄ high, and the NOR gate
at R leads to Q becoming low. On a macroscopic level this happens instantaneously;
however, in reality there is a propagation delay of a few nanoseconds until the new
states are established.
The RS flip-flop can be clocked; that is, made to rely on an external trigger that
allows the state to be changed (Figure 2.13B). This can be achieved by gating both
the S and R signals with a clock signal through an additional gate. In the absence of the
clock signal, changes of S and R are ignored. A clock signal can be provided by a
crystal, an active component introduced in Section 2.1.2.3.
The JK flip-flop is an extension of the clocked SR flip-flop (Figure 2.13C).
It stabilizes the circuit as it combines the S and R signals with a clock, as well as
feedback from the output of the flip-flop via a 3-input AND gate. It also extends
the SR flip-flop by allowing for both S and R being 1, an input disallowed in the
simpler version.
Other flip-flop types include the T flip-flop, which is essentially a version in
which both J and K are linked and which represents a clocked toggle switch, and
the D flip-flop with an additional data input. Most modern flip-flop implementations
offer additional control lines, often allowing for preset and clear functionality.
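The set-hold-reset behavior described above can be mimicked in software by iterating two cross-coupled NOR gates until they settle; the fixed iteration count stands in for the propagation delay (a sketch, not a timing-accurate model):

```python
def nor(a: int, b: int) -> int:
    return 0 if (a or b) else 1

def sr_latch(s: int, r: int, q: int, q_bar: int):
    """Update the cross-coupled NOR pair a few times until the state settles."""
    for _ in range(4):
        q, q_bar = nor(r, q_bar), nor(s, q)
    return q, q_bar

q, q_bar = sr_latch(s=1, r=0, q=0, q_bar=1)      # set
print(q, q_bar)                                  # -> 1 0
q, q_bar = sr_latch(s=0, r=0, q=q, q_bar=q_bar)  # hold: the state is remembered
print(q, q_bar)                                  # -> 1 0
q, q_bar = sr_latch(s=0, r=1, q=q, q_bar=q_bar)  # reset
print(q, q_bar)                                  # -> 0 1
```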

2.2.3 Binary Calculations

Any whole number can be represented as binary code; that is, there is a direct and
unique mapping between the decimal numeral system and the binary code with
the digits 0 and 1 (see Figure 2.14). It is therefore used as the principal numeral
system in digital applications. Each position in the code is referred to as a bit,
and 8 bits together as a byte; a byte allows the representation of 256 different
numbers (without the use of a so-called sign bit, 0 to 255). The conventions for the
encoding of larger numbers, decimal numbers and signed numbers are very much
that: conventions (e.g., first or last bit encoding + or -, first byte of a 16-bit word low
byte or high byte, and so forth). For the purpose of this section, we will continue
with a simple 8-bit binary code without signs and decimals.
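The byte representation described above is easy to verify in code; 147 is used here as in Figure 2.14:

```python
n = 147
bits = format(n, "08b")          # zero-padded to one byte (8 bits)
print(bits)                      # -> 10010011

# Summing the weighted bit positions recovers the decimal value:
value = sum(int(b) * 2 ** i for i, b in enumerate(reversed(bits)))
print(value)                     # -> 147 (= 128 + 16 + 2 + 1)
```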
Using the example of binary addition, we are going to see how two binary
numbers can be added with a digital circuit: the 1-bit full adder is a digital circuit
that utilizes logic gates to add two 1-bit values and allows for the consideration of
a carry-in bit (Figure 2.14B). A half adder (not shown) can produce a correct result
for the cases 0+0 (0), 1+0 (1) and 0+1 (1). However, 1+1 yields a sum of 0
and would need to carry over a 1 to the next position. This is implemented in the full
[Figure 2.13 panels: (A) the SR flip-flop truth table (S=0, R=0: hold; S=1, R=0: Q=1, Q̄=0; S=0, R=1: Q=0, Q̄=1; S=1, R=1: disallowed) with its NOR-gate build and timing diagram; (B) timing diagrams of an ideal asynchronous RS flip-flop versus a clocked RS flip-flop; (C) the JK flip-flop truth table and build with two 3-input AND gates.]

Figure 2.13 Flip-flop. (A) A set-reset flip-flop implementation with two NOR gates. The logic state
for S, R, Q and Q̄ in the timing diagram shows a SR cycle: S is set high, and as the A and B inputs of
the lower NOR gate are connected, the state is reflected in Q, where the state is held even after S returns
to low. Upon activating R, Q gets reset and Q̄ set high. (B) Comparison of an ideal (i.e., no delay) RS
flip-flop to a clocked RS flip-flop. Note how the brief R high state does not have consequences in the
absence of a clock signal. (C) Truth table and build of a flank-triggered JK flip-flop.
[Figure 2.14 panels: (A) the 8-bit byte 1 0 0 1 0 0 1 1, decomposed as 1×2^7 = 128, 0×2^6 = 0, 0×2^5 = 0, 1×2^4 = 16, 0×2^3 = 0, 0×2^2 = 0, 1×2^1 = 2, 1×2^0 = 1; binary 1 0 0 1 0 0 1 1 = decimal 147. (B) A 1-bit full adder with inputs A, B and carry-in, and outputs sum and carry-over. (C) Three chained 1-bit full adders adding input A: 0 1 0 (decimal 2) and input B: 0 1 1 (decimal 3) to output 1 0 1 (decimal 5), the leftmost carry serving as overflow.]

Figure 2.14 Binary addition. (A) The byte representation of the number 147. The decimal number 147
is the sum of 2^7 + 2^4 + 2^1 + 2^0 = 128 + 16 + 2 + 1, meaning that (from left to right) bits 1, 4, 7 and
8 are 1 (true, on, high), whereas bits 2, 3, 5 and 6 are 0 (false, off, low). (B) Circuit of a 1-bit full adder.
If the carry-in bit is low, the sum bit is 1 if A or B is high. If both A and B are high, the sum bit is 0 and
the carry-over bit is high. The circuit can take a high carry-in bit into account. (C) Addition of decimal 2
and decimal 3 with a series of three 1-bit full adders. Note how the input numbers are separated into their
bit representations; that is, decimal 2 contributes binary 1 to A1 but binary 0 to A0 and A2, and
the addition of binary 1 at both A1 and B1 yields sum2 = 0 but a carry-over of 1.
adder, which also takes the carry-in bit into account. By combining a set of n
1-bit full adders, one can implement the summation of any two n-bit numbers (Figure
2.14C).
The word size of a computer describes how many bits constitute the basic
unit of computation. For example, while the decimal numbers 147 and 108 can be
added in one step on an 8-bit processor (both summands and the result, decimal 255,
can be represented in 8 bits), the same operation takes considerably longer on a 4-bit
machine: the numbers have to be split into 4-bit chunks, and the intermediate results
require multiple time-consuming transfers to and from memory.
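The ripple-carry scheme described above can be sketched in a few lines of Python, with bitwise operators standing in for the hardware gates (a simplified software model, not a hardware description):

```python
def full_adder(a, b, carry_in):
    """One-bit full adder from the gate identities:
    sum = a XOR b XOR carry_in; carry_out = majority(a, b, carry_in)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out

def ripple_carry_add(a_bits, b_bits):
    """Add two equally long bit lists (least significant bit first)
    by chaining 1-bit full adders; returns sum bits and the overflow bit."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry  # the final carry is the overflow of the topmost stage

# Decimal 2 (binary 010) plus decimal 3 (binary 011), least significant bit first:
result, overflow = ripple_carry_add([0, 1, 0], [1, 1, 0])
# result is [1, 0, 1] read LSB first, i.e. binary 101 = decimal 5
```

Chaining three adders in this way reproduces the 2 + 3 = 5 example of Figure 2.14C.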

2.2.4 Logic Chips

The previous examples have shown digital logic implemented with bipolar junction
transistors. These transistor-transistor logic (TTL) chips are amongst the oldest
commercially available integrated circuits. The electronics company Texas Instruments
introduced the 7400 TTL series in 1966. Integrated circuits of the 7400 family
implement hundreds of logic, memory and calculation functions. Today the
7400 series still exists but has become less popular for industrial applications, since
a direct competitor had already been introduced in 1968: the so-called 4000
series of complementary metal-oxide semiconductor (CMOS) chips. Both have become
industry standards, giving a vast choice of ICs that are widely pin-compatible
and available from a multitude of manufacturers.
CMOS chips are implemented with MOSFETs, as the name suggests. These
can run on far less power while being less sensitive to the exact supply
voltage (5V for most TTL chips, 3-18V for CMOS chips). They also dissipate
less heat and can be manufactured with a smaller footprint. TTLs are
more sensitive to noisy input (e.g., sudden voltage spikes), whereas CMOS chips
are slower. It is usually possible to find a pin-compatible correspondence between
both families. For example, identifiers along the lines of 74Cxx denote actual members
of the 40xx family with full pin-compatibility to the 74xx series.

2.3 PROGRAMMABLE COMPUTERS

In the widest sense, modern computers are large systems of integrated circuits for the
processing, storage and transmission of digital data. The first electronic computers
employed relays (electromagnetic switches) or vacuum valves
instead of transistors to implement logic gates. The most prominent example of
an electromechanical computer is probably Konrad Zuse's Z2 design from 1940,
with 600 relays operating at a speed of approximately 5 Hz. The British Colossus
from 1943 was based on 1500 valves. It is often debated whether to attribute the
first computer label to the Z2, Colossus or the first purely electronic computer, the
Electronic Numerical Integrator and Computer (ENIAC), which appeared in 1946.
Remarkably, while the Z2 still required around 0.8 seconds for a binary addition,
the ENIAC already managed to do so in 0.2 milliseconds.
These early computer systems were still very much purpose-built and mostly
not programmable, with the exception of setting input variables. Modern computers
mostly follow a reference model introduced by John von Neumann in 1945. The
von Neumann architecture (VNA, see Figure 2.15) requires:
• A hardware system that is independent of the problem that the computer
is to solve.
• Programs, input and output data that share the same memory.
• Programs that are allowed to jump and are not only linear in their execution,
allowing for loops and conditional branching.
• Programs that can be self-modifying; also, programs can produce new programs.
• A system that is Turing-complete. This is an abstract concept introduced by
Alan Turing in the 1930s; one of its practical implications is that any
problem that is computable can be solved with the system.
The VNA is the leading principle of most modern computers and general-purpose
central processing units (CPUs). The brain of the device is the control unit,
which orchestrates all aspects of input and output (jointly referred to as I/O), along
with copying data from memory into the so-called registers of both the control
unit and the arithmetic unit. The latter provides the compute functionality discussed
in the section on binary calculations (see Section 2.2.3), but these days with many
thousands to millions of parallel units. As the control unit itself can be populated
with commands from memory, and processes can write back to memory, it becomes
clear how this architecture enables jumps, conditions and loops in software.
Today there is a vast range of CPU types for every purpose: for supercomputers,
for desktop machines, for laptops, for mobile phones, for embedded devices,
and so forth. Key chip design companies in 2017 are Intel, AMD, ARM, Qualcomm,
Atmel, Freescale and many more. While some of these businesses (e.g., Intel) are

Figure 2.15 The von Neumann architecture. A key requirement of the VNA is that there is no
separation between code and data, so both are referred to simply as data. The control unit
uses commands (dashed lines) to facilitate the transfer of data (solid lines) from the input into memory.
Arithmetic operations on the data are coordinated through the control unit, which triggers the exchange
of data between memory and arithmetic unit, and also between memory and output devices. The control unit can
also receive new commands from the arithmetic unit; for example, the update of variables in a branching
condition. It is noteworthy that the very first update of the control unit from input is managed by what is
referred to as the bootloader in modern computer systems.

also manufacturing processors, others (e.g., ARM) license their designs to third
parties and are not involved in large-scale production themselves. Without focussing
on a particular CPU design, in this section we are going to look at three levels of
CPU complexity.

2.3.1 Field-Programmable Gate Arrays

Field-programmable gate arrays (FPGAs) are groups of universal logic gates. They
are called field-programmable because the programming can be performed after the
device has been deployed in the field; unlike in a CPU, functionality is implemented by
reconfiguring the gates themselves. The basic building block of an FPGA is a logic cell, often referred
to as a slice (see Figure 2.16).
Each cell behaves like a complex integrated circuit with input bits A and B, an
optional carry-in bit and a clock signal. Through the inputs IN1, IN2 and IN3, the
logic function that the two main gates execute is selected. An additional input IN4
determines whether the outcome of the full adder or that of the gates is fed forward to the data
flip-flop. Each slice can be programmed independently using a hardware description
language such as Verilog or VHDL. Since their introduction in the early 1980s, the number of
slices has grown from under ten thousand to millions per FPGA chip.
The simple structure of FPGAs enables very fast execution times, especially
for parallel calculations. However, the relatively low level at which FPGAs are

Figure 2.16 Field-programmable gate array logic cell. An FPGA logic cell with A, B, IN1-4, carry-in and
clock inputs, and OUT and carry-out outputs. A, B and carry-in are the canonical input bits of a full adder
(see Figure 2.14). However, instead of performing a simple addition, the inputs IN1-4 determine how the three
input bits are combined with binary operations to yield an output (and carry-out). A clock signal is optional to
incorporate data from previous steps via a D flip-flop.
programmed prohibits the implementation of complex business logic. Most FPGAs
are utilized in hardware-accelerated signal or image processing.

2.3.2 Microcontrollers

Microcontrollers are integrated circuits that encapsulate a microprocessor (like a
CPU) along with memory for code and data, and ports for input and output. They
are sometimes also referred to as systems-on-chip (SoC), although that term is
also used for other hardware, including FPGAs. The first commercially available
microcontrollers came from Intel and Texas Instruments in the 1970s and targeted
the market for embedded computing, for example in industrial control systems and
household appliances, but also pocket calculators. These first systems were 4-bit
machines, whose program memory was erasable programmable read-only memory
(EPROM, which could only be erased by exposure to ultraviolet light) or cheaper
ROM, whose programming was performed at the manufacturing site.
Although microcontrollers were already produced and deployed in the billions (from remote controls
and calculators to power plant control), their programming remained accessible
only to engineers until the early 1990s. By then, new memory
technologies such as electrically erasable programmable ROM (EEPROM) and
flash (a Toshiba trademark) enabled the storage and physical update of microcontroller
programs without the use of specialist hardware, accelerating the development
process and uptake by more industries.
Currently, microcontrollers with up to 64-bit data word size are available,
rivalling the compute power of the microprocessors used in multipurpose computers.
However, the most prevalent microcontroller types are still 8-bit and 16-bit, with 8-
bit models costing around 0.25 USD in large quantities and annual sales of more
than four billion units in 2006. Microcontrollers are a very active area of research
and development, as well as marketing, especially with the Internet of Things in
mind, as they are going to be the processing platforms for most end devices that
generate or consume data. In 2014, sales were already estimated in the range of 18
billion microcontrollers for that year.
Microcontrollers are relatively low-speed and usually have significantly less
on-board memory than most multipurpose computers. This, however, makes them
cheaper both in price and in power consumption. Most controller
manufacturers provide development tools that allow code development in C, C++
or other high-level programming languages. These tools translate business logic
into binary code that can be interpreted by the controller. A special program called
a bootloader retrieves that code from memory when the system is powered up and
initiates its execution.
A great advantage of microcontrollers over microprocessors in other
systems is that they can directly interface with electronics. The integrated circuits
therefore have general-purpose input/output (GPIO) pins, which allow the controller
to send or receive digital signals. Some microcontrollers also incorporate digital-
to-analog converters (DAC), analog-to-digital converters (ADC) or pulse width modulators
(PWM) that enable them to work with analog signals (see A/D and D/A conversion,
Section 3.2). Modern microcontrollers often provide hardware-supported
ports, such as Universal Serial Bus (USB), Universal Asynchronous Receiver Transmitter
(UART), or other hardware interfaces (see Part VI, Device Communication).
The control units of microcontrollers are specially designed for real-time
processing, e.g., with interrupts that trigger code for immediate execution irrespective
of where in the main business logic the processor currently is.

2.3.3 Multipurpose Computers

Multipurpose computers are systems that integrate a microprocessor, expandable
memory, and interfaces for input and output devices. Desktop, laptop
and tablet computers as well as most modern smartphones fulfill the criteria for a
multipurpose computer: in other words, what most of us call a computer these days.
The definition of multipurpose is rather fuzzy and best appreciated in comparison
to embedded systems, which are usually built for a particular application.
Chapter 3
Information Theory and Computing

Claude Shannon published his seminal 1948 paper A Mathematical Theory of
Communication and laid the foundation of modern information theory. His work not
only established quantitatively the nature of information and key metrics relevant
for its transmission, but also inspired academic fields as diverse as media studies
and biology.

3.1 INFORMATION CONTENT

The sender/receiver model provides a universal description of communication (see
Figure 3.1). The sender has a message that needs to be encoded for information
transmission. When the receiver gets hold of this information, it needs to be


Figure 3.1 Shannon's model of information transfer. The model comprises a sender and a receiver. A
message from the sender needs to be encoded, sent over a communication channel, and decoded by the
receiver so that the message can be understood. Much of Shannon's interest revolved around communication
errors and strategies to recover messages from noisy transmissions.

decoded and the actual message extracted. A signal can have different degrees
of redundancy, and therefore requires different amounts of bandwidth for its
transmission: a brief verbal statement is probably less redundant than a song. One
reason to choose a poem or a song over a simple exclamation is its robustness to error
or noise. The core message “I love you!” is more easily misunderstood when said once
in a noisy environment than as a repetitive pop song.
Shannon’s work at Bell Labs mathematically addressed strategies for information
compression and recovery from error, as well as the capacity of communication
channels and more. A key equation from his paper allows us to calculate the
minimal amount of information required to transmit a message: he borrowed the
concept of entropy from physics, where it is a metric of stochasticity in statistical
thermodynamics. A brief example: at absolute zero (0 kelvin, −273.15°C) and in
vacuum, carbon atoms in graphite are nearly stationary. That is, the entropy of the
system is minimal, and the position of individual atoms can be predicted with little
uncertainty. At extremely high temperatures, however, in carbon vapor,
the atoms distribute and diffuse randomly at high speed. The entropy of that system
is high, so we need a lot more information to know the position of individual atoms.
Shannon used entropy as the metric for the amount of information required to
communicate a message. He used the smallest unit of information, the bit, as a
baseline for his calculations:

Number of bits needed = −Σᵢ pᵢ · log₂(pᵢ)

How many bits are required to express the outcome of a coin flip? The
likelihoods are p(head) = p(tail) = 1/2. With −(1/2 · log₂(1/2) + 1/2 · log₂(1/2)) = 1, we
expectedly require 1 bit (e.g., 1 = head, 0 = tail) to communicate the outcome of
the experiment. Analogously, the throw of a die with p = 1/6 for each number
would require 2.58496 bits. In practical terms, for use in a digital computer, one
would choose to encode it in 3 bits. As a check, we can look at an experiment with
256 different outcomes and see how many bits are required for its encoding:

−Σ (1/256) · log₂(1/256) = 8
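These calculations are easy to reproduce; the following Python sketch evaluates Shannon's formula for the three examples above:

```python
import math

def bits_needed(probabilities):
    """Shannon entropy: H = -sum(p * log2(p)) over all outcome probabilities."""
    return -sum(p * math.log2(p) for p in probabilities)

coin = bits_needed([0.5, 0.5])      # fair coin: 1.0 bit
die = bits_needed([1 / 6] * 6)      # fair die: ~2.58496 bits
byte = bits_needed([1 / 256] * 256) # 256 equally likely outcomes: 8.0 bits
```

In a digital computer, the die result would be rounded up to the next whole number of bits (3 bits), matching the discussion in the text.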

3.2 A/D AND D/A CONVERSION

A continuous physical quantity (e.g., a voltage from a sensor device) needs to be
periodically sampled to form a sequence of values that represent the amplitude
of that signal in the computer: it needs to be discretized. This is the task of
an analog-to-digital converter (ADC); the inverse function is performed by a
digital-to-analog converter (DAC), which interprets a digital code and generates a
corresponding (semi-)continuous voltage.
Both ADCs and DACs are characterized by the same metrics: granularity and
quantitative resolution (bit depth), sampling rate and temporal resolution (1/s), as well
as dynamic range.

Analog to digital conversion

For the sake of simplicity, we will focus on the example of a 2-bit ADC that encodes
a to input signal with binary 0000 for and binary 1111 for . The
intermediate voltages would be represented as 2-bit integer numbers (i.e., 0 to 3),
encoding the values , , and . If the ADC can in principle work
in the range of to but the actual signal is more restricted (e.g., to ),
a device with a good dynamic range would automatically scale and represent the
voltages , , and .
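As an illustration, a generic n-bit quantizer can be sketched in Python; the 0 V to 5 V range used in the example call is an assumption for illustration, not a property of any particular device:

```python
def adc_quantize(voltage, v_min, v_max, bits):
    """Map a voltage in [v_min, v_max] onto the nearest of 2**bits
    evenly spaced digital codes (0 .. 2**bits - 1)."""
    levels = 2 ** bits - 1
    fraction = (voltage - v_min) / (v_max - v_min)
    code = round(fraction * levels)
    return max(0, min(levels, code))  # clamp out-of-range inputs

# A 2-bit converter over an assumed 0 V to 5 V range:
codes = [adc_quantize(v, 0.0, 5.0, 2) for v in (0.0, 1.2, 3.0, 5.0)]
# codes == [0, 1, 2, 3]: each voltage falls into one of four levels
```

Increasing the `bits` argument reproduces the finer granularity shown in Figure 3.2.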
The example in Figure 3.2 shows that higher bit depths yield better quantitative
resolution. Many embedded devices, such as Arduino and mbed microcontrollers
with integrated ADCs, have a resolution of 10 to 12 bits. Some scientific
specialist devices provide resolutions of up to 64 bits, although this gain in digital
resolution is not always sensible, especially if the analog input signal is noisy and the
increased digital resolution only resolves the noise.
So far we have only looked at the outcome of a single conversion operation
on a captured signal (the circuit that takes this snapshot is called a sample-and-
hold device). In the case of temporally variable signals, the sampling rate (conversions
per unit of time) and the susceptibility to aliasing are relevant metrics. Note in
Figure 3.2 how the final ascent of the signal does not get captured until the signal is
digitized more frequently. ADC sampling rate and signal aliasing were theoretically
studied by Shannon and Nyquist, giving rise to the Nyquist-Shannon sampling theorem.
One practical consequence of the theorem is that ADCs need a sampling rate that
is at least twofold higher than the highest frequency they should faithfully digitize,
such that the frequency and change of the signal can be analyzed computationally.
There are many strategies for AD and DA conversion. Voltage drop over
resistors is the underlying principle of the simplest ADC devices (see Figure
3.3, left). A flash ADC breaks down a reference voltage (Vref, ideally the maximum
sample voltage Vsample) into equally spaced reference voltages. The sample voltage
is then fed into comparators (see integrated circuits, Section 2.1.2.2) and compared

[Figure 3.2: digitized signal [V] versus time [ms]. Top panels: increasing granularity (2-bit, 3-bit, 4-bit quantization). Bottom panels: increasing temporal resolution (sampling rate).]

Figure 3.2 Analog-to-digital conversion. Shaded box: a continuous dynamic signal increases over the
duration of the recording. In the top three panels (2-bit, 3-bit and 4-bit), the granularity of
the analog-to-digital conversion increases from 4 to 8 to 16 distinct steps. Accordingly, over the time
frame shown, the 2-bit ADC recognizes two different voltages, the 3-bit ADC three voltages and
the 4-bit ADC four voltages (indicated by the grey lines). To accurately detect the temporal
development of the signal, the sampling rate increases in the panels on the right. While at few sampling
steps the signal could conceivably be called linear, at higher sampling rates the curvature of the signal
becomes evident.
[Figure 3.3: left, a flash analog-to-digital converter with a resistor ladder providing reference voltages from 1/8 Vref to 7/8 Vref and one comparator per step, illustrated for Vsample ≈ 5/8 Vref; right, a digital-to-analog converter built from a resistor ladder (R and 2R elements) summing weighted contributions of Vmax.]

Figure 3.3 Anatomy of ADC/DAC devices. ADC (left) and DAC (right) implemented with resistor
ladders. The ADC breaks down the reference voltage Vref into equal chunks representing the steps from
0 to Vref. At each step a comparator emits 1 if the sample Vsample is greater than the reference, and 0
if it is not. In the example, a voltage of 5/8 Vref trips the five lowest comparators, and the comparator
outputs can jointly be read as a binary code. The DAC follows a similar principle, although through the
addition of equally stepped chunks of Vmax.
to each of the reference voltages. This naturally provides an on-or-off response at
each of the comparators, and the outputs can jointly be interpreted as a bit word.
These simple devices are prone to quantization errors and nonlinearity. Successive
approximation conversion designs combine an ADC step with the generation of
an equivalent DAC output, compare the generated signal to the input signal, and
improve the ADC estimate by averaging over a few iterations.
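The comparator ladder of a flash ADC can be mimicked in software; the following Python sketch is an idealized model (the sample and reference voltages are hypothetical values) producing the thermometer code for a sample just above 5/8 of the reference voltage:

```python
def flash_adc(v_sample, v_ref, n_comparators=7):
    """Simulate a flash ADC: each comparator at i/(n+1) * v_ref outputs 1
    if the sample exceeds its reference and 0 otherwise (thermometer code);
    counting the 1s yields the digital level."""
    thermometer = [1 if v_sample > (i / (n_comparators + 1)) * v_ref else 0
                   for i in range(1, n_comparators + 1)]
    return thermometer, sum(thermometer)

# A sample just above 5/8 of an assumed 8 V reference trips the five lowest comparators:
code, level = flash_adc(5.1, 8.0)
# code == [1, 1, 1, 1, 1, 0, 0], level == 5
```

The thermometer-to-binary step of a real flash ADC is here reduced to counting the high comparator outputs.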

Digital-to-analog conversion

Digital-to-analog conversion is at the heart of modern audio devices (i.e., digital
music to analog sound waves) and of motor control (simple analog motors). A simple
DAC adds up small voltages that are proportions of the maximum voltage the
system can provide. For example, setting the uppermost bit to 1 contributes the
largest Vmax component of our exemplary device (see Figure 3.3, right).

Pulse Width Modulation (PWM)

An alternative to employing a DAC is the use of PWM for the controlled output of
energy. In simple terms, PWM mimics output voltages lower than Vmax by
reducing the number of duty cycles. A duty cycle is the smallest amount of time a
digital output has to remain active when being switched on and immediately off again.
If an LED shines brightly at Vmax, outputting short spikes of Vmax volts 50% of the time
(omitting half of the duty cycles) reduces its brightness by half. This method
not only works for time-constant voltages; other wave forms can be emulated
using PWM strategies as well. For example, to emulate a half-sine wave
(over time), the wave form is divided into time slices. For each time slice, the
average value of the points on the wave is computed, and that value is emulated
using the strategy described for the single value. The number of duty cycles that
fit into a wave segment determines the analog resolution of the method; that is,
if 8 duty cycles fit into a wave segment, the values can be represented with 3-bit
resolution.
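The half-sine emulation described above can be sketched in Python as an idealized software model: each time slice's average of sin(x) is translated into an on/off pattern of duty cycles.

```python
import math

def pwm_pattern(target_fraction, cycles):
    """Approximate an analog level (as a fraction of Vmax) by keeping the
    output fully on for the closest possible share of the duty cycles."""
    on = round(target_fraction * cycles)
    return [1] * on + [0] * (cycles - on)

def pwm_half_sine(slices, cycles_per_slice):
    """Emulate one half-sine wave: average sin(x) over each slice of
    [0, pi] and reproduce that average with a PWM on/off pattern."""
    patterns = []
    for i in range(slices):
        a = math.pi * i / slices
        b = math.pi * (i + 1) / slices
        avg = (math.cos(a) - math.cos(b)) / (b - a)  # mean of sin over [a, b]
        patterns.append(pwm_pattern(avg, cycles_per_slice))
    return patterns

# Eight duty cycles per slice corresponds to 3-bit amplitude resolution:
waveform = pwm_half_sine(slices=4, cycles_per_slice=8)
# on-counts per slice: [3, 7, 7, 3] -- rising, peaking and falling like a half-sine
```

A low-pass filter on the real output (e.g., the LED's or motor's inertia) smooths these pulse trains back into an apparently analog level.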

3.3 DIGITAL SIGNAL PROCESSING

Electronic signals can be processed like mathematical functions. Analog and digital
circuits allow arithmetic operations such as the subtraction or addition of two or more
signals. Other methods apply filters to remove unwanted components like noise
from a signal, or edge and peak detection circuits allow the extraction of key
information from a signal stream.
Figure 2.2 provided an application of capacitors in cleaning analog signals
using high- and low-pass filters. In digital signal processing (DSP), the analog signal
is first digitized using an ADC, computationally processed in a microprocessor, and
returned as an analog signal:

analog input → ADC → digital signal processing → DAC → analog output

DSP has the advantage over analog processing that virtually any functionality
can be implemented at no additional hardware cost and can be reprogrammed.
Adaptive filters or compression algorithms that require looking forward
and/or backward in the signal are enabled through digital memory. Reasons against
DSP can be the time-consuming ADC/DAC conversions, as well as the fact that each
conversion implies some degree of information loss in comparison to the analog
input signal. DSPs are ubiquitous and these days usually available as small IC packages.
For example, DSPs are used in sound systems to boost particular frequencies
or add distortion to make digital recordings of music sound more natural. They are
also important in automotive applications; for example, where they execute calculations
for the antilock braking system (ABS). The advantage here is that the DSPs are used as
cheap, dedicated computers. The speed of the digital compute compensates for the
time of ADC/DAC conversion and provides an overall better response.
Most DSPs are optimized for multiplication and addition. This enables the
fast execution of many filtering algorithms in which the averaging of signals is involved,
including the Fourier transformation, which aims to replicate a signal through a
summation of trigonometric functions.
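A moving average is one of the simplest multiply-and-accumulate filters; the following Python sketch shows the principle (a software illustration, not an implementation of any particular DSP chip):

```python
def moving_average(samples, window):
    """A simple FIR low-pass filter: each output is the mean of the last
    `window` input samples -- the multiply-and-accumulate pattern that
    DSP hardware is optimized for."""
    out = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - window + 1): i + 1]  # trailing window
        out.append(sum(chunk) / len(chunk))
    return out

# A rapidly alternating (noisy) input is flattened toward its mean:
noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smooth = moving_average(noisy, window=2)
# smooth == [0.0, 0.5, 0.5, 0.5, 0.5, 0.5]
```

The same multiply-and-accumulate structure underlies more elaborate filters, including the discrete Fourier transform mentioned above.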

3.4 COMPUTABILITY

Computability is a vast field of mathematics and theoretical computer science.
Here, we are only going to briefly outline a few of its concepts to finish off
our introduction to the physical and information-theoretical principles behind
modern communication technology and the Internet of Things.
A problem counts as computable if it has a mathematical solution and can be
computed using an algorithm. One roughly distinguishes three cases:
• Problems that are computable. The algorithms can be:
– deterministic (i.e., there is one exact solution)
– non-deterministic (there is a large solution space, and every execution
of the algorithm may yield a different, correct answer)
• Problems that are computable in principle, but for which deriving a solution is
usually not practical because of compute time. Decryption in the absence of a
passphrase often falls into this class of problem.
• Problems that are not computable (i.e., for which no algorithm exists).
What constitutes computability and a solution depends on the definition. For example,
the question about the meaning of life is of a philosophical nature and cannot
be tackled with an algorithm; however, any other problem that can be decomposed
into a set of mathematical questions can. In order to constrain the definition of
computability and have a test bed for further analysis, Turing described an abstract
machine model with a read-write head that follows a programmable tape with discrete
fields. Each field can contain commands that move the head (an arbitrary
number of steps left or right) or that mediate the writing of characters to other
fields. The alphabet and the meaning of characters (as commands) differ between
different types of Turing machines. Turing addressed the question of which types of
algorithmic problems can be solved with his model machine (today we say computer
architectures are Turing-complete if they can mimic a Turing machine). It is
possible to show formally that programs with just loop, while or goto structures are
equivalent to Turing machines. A corollary is that any code on the basis of
goto commands (spaghetti code, as in many old BASIC dialects) can be rewritten
with more human-readable loop and while structures.
Computability also often refers to the complexity of an algorithm for solving a
problem (i.e., the amount of time required to solve it computationally). Some
problems can be solved in polynomial time; for example, the sorting of numbers
in a list, where compute time grows only modestly with the number of elements that
need to be sorted. These are referred to as P-type problems. Others require
exponential effort to solve. Those are called NP-problems (for nondeterministic
polynomial time) and in principle have a solution that can be determined programmatically,
but only with computational effort that makes their solution unfeasible or at least very
hard. Many cryptographic methods rely on problems that are NP-hard unless one is
provided with the appropriate keys. Also, P-type problems usually admit an efficient
algorithm, while NP-type problems frequently call for heuristic approaches to find a
solution instead of systematically exploring the parameter space. The traveling
salesman problem is one of the classic NP examples: what is the shortest route
through a set of different places that visits each place exactly once?
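The brute-force approach to the traveling salesman problem illustrates why it is intractable: the number of candidate routes grows factorially with the number of places. A small Python sketch with a hypothetical distance matrix:

```python
from itertools import permutations

def shortest_tour(dist):
    """Exhaustively check every ordering of places 1..n-1 (place 0 fixed
    as start and end): (n-1)! candidate routes, which is what makes the
    traveling salesman problem intractable for large n."""
    n = len(dist)
    best = None
    for perm in permutations(range(1, n)):
        route = (0,) + perm + (0,)
        length = sum(dist[route[i]][route[i + 1]] for i in range(n))
        if best is None or length < best[0]:
            best = (length, route)
    return best

# A hypothetical symmetric distance matrix for four places:
dist = [[0, 1, 4, 2],
        [1, 0, 2, 5],
        [4, 2, 0, 3],
        [2, 5, 3, 0]]
length, route = shortest_tour(dist)
# length == 8, e.g. the tour 0 -> 1 -> 2 -> 3 -> 0
```

Four places require only 6 route checks, but 20 places already require 19! ≈ 1.2 × 10^17, which is why practical solvers resort to heuristics instead.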
Part II

Historical Perspective of the Internet of Things
Chapter 4
50 Years of Networking

The ability to transfer data between and from a dispersed set of devices is the
key enabling technology of machine-to-machine communication and the Internet
of Things. The development of the system that today allows the Internet to be larger
than the sum of its parts dates back to the 1960s. The Internet is based on a variety of
conventions, hardware and software that enable devices to connect to other devices
virtually anywhere in the world. The concepts, protocols and technical details
behind the Internet will be covered in the respective chapters of this book (e.g., Part VI
on Device Communication and Part VII on Software). This chapter aims to provide
a historical perspective on how the Internet has grown from the interest of a handful of
academic researchers to a ubiquitous network of billions of connected devices.

4.1 THE EARLY INTERNET

In the aftermath of the Second World War, tensions between the United States and
their Western European allies on one side, and the Soviet Union and its partners
on the other, developed into the Cold War. By 1950, both superpowers were
capable of building nuclear weapons, and in the decade to come they developed long-
distance deployment systems, each threatening the opposing bloc with thermonuclear
annihilation.
The Semi-Automatic Ground Environment (SAGE) early-warning system under
the control of the North American Air Defense Command (NORAD) employed
an array of autonomous radar stations that surveyed the airspace over North America
for Soviet bombers. To enable a rapid response even though many of the radar
stations were based in remote areas near the northern polar circle, they fed back

their signals via telephone lines to central compute centres, where IBM AN/FSQ-7
vacuum tube computers calculated interception points for the air force on the basis
of radar trajectories. This network could be counted as the world’s first computer
network.
With the successful launch of Sputnik as the Earth’s first artificial satellite in
1957, the United States feared the delivery of Soviet nuclear weapons by intercontinental
missiles. In response to this surprise, and in order to accelerate their own
research efforts, the government organization Defense Advanced Research Projects
Agency (DARPA) was tasked to invest strategically into emerging technologies.
One of the desired capabilities was the rapid and robust exchange of information
between computer systems, and the distribution of complex computations between the
available hardware systems.
DARPA commissioned the Advanced Research Projects Agency Network
(ARPANET) project. It was the first computer network to use packet switching and
the first to employ the Network Control Program (NCP), a predecessor of TCP/IP.
In packet switching, a method still used today, a message is split into a set of
smaller packets that simultaneously seek independent routes to the destination,
where they are rejoined to make up the message. While one intrinsic feature of
packet switching is the robustness of the network to loss of service from individual
nodes in a decentralized network, it is an urban legend that its use was a deliberate
architectural decision to withstand nuclear attacks. The prototype ARPANET
started in 1969 and connected four computers: two at University of California
sites in Los Angeles and Santa Barbara, one at the Stanford Research Institute
(SRI), and one at the University of Utah. The four sites were connected using
Interface Message Processors (IMPs),
heavily customized versions of the $10,000 Honeywell DDP-515 minicomputer, to
execute the functions of what these days is known as a router. By 1973 over 40
academic institutions in the United States and the first few sites in Europe, such as
the University of London, were connected. The IMP software was ported to other
hardware systems and in 1977 there were more than 110 computers participating in
ARPANET (see Figure 4.1). In 1983 the Department of Defense split off MILNET
from ARPANET, but the two networks remained loosely connected for the
exchange of electronic mail.
Several subnetworks that used the same standards as ARPANET arose
in the early 1980s and connected to ARPANET. Many were academic in nature,
such as the German DFN or the British JANET, which was one of the first networks
to interface through a satellite link rather than through the otherwise prevalent
telephone lines. Historically, networks of European institutions had used a different
packet-switching protocol, X.25, which necessitated gateways that could switch
50 Years of Networking 89

Figure 4.1 A logical map of ARPANET in March 1977. (From: Heart, F., McKenzie, A., McQuillan,
J., and Walden, D., ARPANET Completion Report, Bolt, Beranek and Newman, Burlington, MA,
January 4, 1978.)

between protocols. The National Science Foundation in the United States operated
NSFNET, which was the first subnetwork to open up to nonacademic users and
allowed commercial providers to sell access to the ever-growing network. In 1990
ARPANET was officially decommissioned, having assisted in the creation of the
worldwide network that the Internet is today.
The technology behind ARPANET was not static. Already in 1969 the Request
for Comments (RFC) bulletin was introduced, a platform for the proposal
and discussion of protocols and standards in computer networking. This enabled
the participants of the network to help shape architectural decisions. Notably, three
early RFCs were 791-793 (IP, ICMP and TCP), which led to the switch from NCP
to TCP/IP in ARPANET in 1981. TCP/IP is the packet protocol suite still used today
(see Chapter 22).
Use cases and the etiquette of ARPANET were also subject to RFCs. Even
the term “Internet” for the ARPANET first occurred in an RFC. Since 1971, when
Ray Tomlinson first sent an electronic message, there was an increasing proportion
of network capacity used for email, which was formally specified in RFCs 524 and
561 in 1973. Further, the file transfer protocol (FTP; RFC 354, July 1972) was
established for the exchange of data files between computers. To date there are
nearly 8,000 RFCs, as well as about a dozen consortia that govern standards in the
Internet.

4.2 WORLD WIDE WEB AND WEB 2.0

The early Internet was used primarily for the exchange of textual information:
emails and newsgroups were the primary applications for network connections,
along with purely technical uses like remote login and file transfer. Newsgroups
on Usenet, first established in 1980, were simply text files that could be appended
to following RFCs 850 and 1036, and that were synchronized across servers via the
standard Unix-to-Unix Copy (UUCP) suite of Unix commands. However, once
information in newsgroups was older than what a server operator supported (from
days to months), the relevant parts of the file were deleted; thus, Usenet was not
suitable for the long-term provision of information.

4.2.1 World Wide Web

It was only in the late 1980s and early 1990s that attempts were made to use the
Internet for the systematic retrieval of distributed, archived information. The Gopher
protocol, built on TCP/IP, used a directory structure to organize content; in contrast
to the File Transfer Protocol (FTP), it could be browsed without knowledge of the
Unix command line through menu-driven text interfaces, and searched with the Very
Easy Rodent-Oriented Netwide Index to Computerized Archives (VERONICA).
Since 1989, Tim Berners-Lee at CERN had researched “a large hypertext database
with typed links,” the basic concept behind static but interlinked websites, and
a document-description language, the HyperText Markup Language (HTML),
which allowed the integration of textual information and images. He also invented
the HyperText Transfer Protocol (HTTP), which uses TCP/IP to transmit documents
to a viewer software that he termed a web browser. Berners-Lee’s first browser
was called WorldWideWeb (WWW), and he released it in December 1990.
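An HTTP exchange is plain text over a TCP connection, which is what made the protocol so easy to adopt. The sketch below uses the later HTTP/1.0 style of request and status lines rather than the one-line HTTP/0.9 of the very first browser, and the path and HTML body are invented placeholders:

```python
# The request a browser sends for a page (HTTP/1.0 needed no Host header):
request = (
    "GET /index.html HTTP/1.0\r\n"
    "\r\n"
)

# A minimal response a server might return: a status line, headers,
# a blank line, and then the HTML document itself.
response = (
    "HTTP/1.0 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body><a href='page2.html'>A typed link</a></body></html>"
)

# The first blank line separates the headers from the body:
head, _, body = response.partition("\r\n\r\n")
status_line = head.split("\r\n")[0]
assert status_line == "HTTP/1.0 200 OK"
assert "<a href=" in body        # a hypertext link to another document
```

The `<a href=…>` element in the body is the "typed link" idea in its simplest form: a document that names other documents, which the browser can fetch with exactly the same kind of request.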
The HTTP protocol and HTML standard saw rapid adoption by the academic
community, and after only two years, by January 1993, there were about 50
institutions that shared content on dedicated web servers. Already in the mid-
1980s a few commercial entities participated in the Internet, but the web was a
breakthrough in terms of user experience — allowing even lay people to seek
information online. In 1993, news organizations like Bloomberg, The Economist and
Wired were among the first businesses to register domain names, clear-text names
that are translated into server IP addresses.
The growing commercial success of the WWW and the rise of novel browser
software like Netscape, which could interpret different flavors of HTML, gave rise
to the WWW Consortium (W3C), a standards body to control and coordinate further
developments of the WWW. Currently there are about a dozen consortia that govern
different aspects of the Internet and WWW, including the Internet Architecture
Board (IAB) and the Internet Corporation for Assigned Names and Numbers
(ICANN).
Partly owing to the easy-to-use WWW, in contrast to command line and text-only
interfaces, the mid-1990s also saw increasing numbers of Internet service
providers (ISPs) that offered consumers dial-up connections to the Internet. This
in turn inspired the formation of countless Internet companies that aimed to market
products and services through the WWW. Many business models in this early era of
e-commerce were not viable and ultimately led to the burst of the dot-com bubble,
as the stock market crisis of the year 2000 came to be called.

4.2.2 Web 2.0

While the worldwide participation in the WWW grew despite the commercial
backlash, creating and publishing content on the WWW remained the domain of
specialists and keen hobbyists. Between 2002 and 2006, a range of social media
websites like Myspace, Facebook, Twitter and YouTube launched and allowed their
users to create and publish their own content on the web without the need for deeper
technical understanding. This revolution in terms of who provides and controls the
content of websites is commonly referred to as Web 2.0. At the same time, smart
phones became key devices for accessing the WWW. The decentralization and
democratization of both content provision and access contributed largely to the
web's historical and societal impact; for example, the Arab Spring protests of the
2010s were largely orchestrated through social media.

4.3 CONNECTING THINGS

The ability of the WWW to make text and other content such as images available
within a convenient browser session sparked exciting innovations: What if a web
server could not only provide a static picture, but programmatically update the
image file at regular intervals? Already in 1991 the Computer Laboratory at the
University of Cambridge had experimented with digital video and broadcast an
image of its coffee pot using a local network protocol. With the WWW gaining
popularity, in November 1993 they switched to displaying the coffee pot image on a
public website: the world's first webcam was born! The Trojan Room Coffee Pot,
named for its location in the laboratory's building, was the first prominent
thing on the Internet.

4.3.1 Industrial Control Systems

Ever since industrial processes have used electricity, relays and electromechanical
switches have been employed to monitor and control essential infrastructure (electricity
grid, water), process plants (e.g., for oil and gas) and production lines (manufacturing).
Before complex microcontrollers and embedded computers became more widely
available in the 1980s, programmable logic controllers (PLCs) replaced ladders or
arrays of relays for binary decision making, as relays were slow and required
electricians for programming and maintenance. Even now PLCs are used in industry
whenever the business logic does not require more expensive computing resources.

Originally, PLCs were used to control a particular piece of equipment. They were not
intended for remote use and often required direct human intervention (open-loop
control: the PLC acts as an implementation device, like a remote control). Later
models also allowed local feedback loops in which the system adapts to current
sensor measurements (closed-loop control), as well as monitoring from a remote
terminal. A logical extension of PLC-based control were distributed control systems
(DCSs), which integrated data from many different sensors in large plants to
calculate and provide feedback in real time. These were traditionally wired peer-to-peer
connections between PLCs and required dedicated transmission lines.
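The difference between open- and closed-loop control can be made concrete with a toy heater controller: in a closed loop the controller reacts to the measured value instead of blindly executing a command. The set point, hysteresis band and the crude plant model below are invented for illustration:

```python
def closed_loop_step(temp: float, heater_on: bool,
                     set_point: float = 60.0, band: float = 2.0) -> bool:
    """One cycle of bang-bang closed-loop control: decide the heater
    state from the current sensor measurement."""
    if temp < set_point - band:
        return True        # too cold: switch the heater on
    if temp > set_point + band:
        return False       # too hot: switch the heater off
    return heater_on       # within the band: keep the current state

# Simulate a crude plant: heating raises, cooling lowers the temperature.
temp, heater = 20.0, False
for _ in range(100):
    heater = closed_loop_step(temp, heater)
    temp += 1.5 if heater else -0.5

assert 55.0 < temp < 65.0   # the loop settles around the set point
```

An open-loop PLC, by contrast, would simply switch the heater on for a fixed time when commanded, with no guarantee about the resulting temperature.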
Industrial processes that are geographically distributed, not necessarily on one
campus but truly remote, gave rise to supervisory control and data acquisition
(SCADA) systems. The boundaries between DCS and SCADA are fuzzy,
but conventionally SCADA is used for systems that can robustly handle unreliable
connections with intermittent connection loss, high latency and varying bandwidth.
Packet switching and TCP/IP lend themselves naturally to these difficult circum-
stances. Hence, SCADA systems often utilize the Internet for monitoring and con-
trol purposes. SCADA implementations provide additional functionalities that are
largely enabled by the network connection and integration of data, for example:
• Remote control of specialized equipment via PLC
• Data acquisition through remote sensors
• System orchestration through open- and/or closed-loop control
• Graphical user interface, often animated and schematically representing the
process
• Database for system events (historian)
• Communication with higher control instances (e.g., data analytics)
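Several of these functionalities can be sketched together in a few lines: a supervisory cycle that polls remote sensors, records every event in a historian, and returns alarms for the operator's interface. The tag names and limits are invented, and a real SCADA system would of course poll over the network rather than from a dictionary:

```python
import time

class Historian:
    """Minimal event database: every reading and every alarm is logged."""
    def __init__(self) -> None:
        self.events: list[tuple[float, str, str, float]] = []

    def log(self, kind: str, tag: str, value: float) -> None:
        self.events.append((time.time(), kind, tag, value))

def supervise(readings: dict[str, float], limits: dict[str, float],
              historian: Historian) -> list[str]:
    """One polling cycle: log all data, return the tags in alarm state."""
    alarms = []
    for tag, value in readings.items():
        historian.log("reading", tag, value)
        if value > limits[tag]:
            historian.log("alarm", tag, value)
            alarms.append(tag)
    return alarms

hist = Historian()
alarms = supervise({"pump_pressure": 9.7, "tank_level": 0.4},
                   {"pump_pressure": 8.0, "tank_level": 0.9}, hist)
assert alarms == ["pump_pressure"]
assert len(hist.events) == 3    # two readings plus one alarm
```

The historian is what separates supervision from mere remote control: because every event is stored, higher control instances can later analyze trends across the whole plant.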
The modern terms for SCADA-style systems are machine-to-machine (M2M)
communication and the Industrial Internet of Things (IIoT). Common M2M protocols and interfaces are
presented in Part VI (Device Communication) and Part VII (Software).

4.3.2 The Internet of Things

The expression Internet of Things was originally coined by Kevin Ashton in 1999.
He referred to the increasing connection of the physical world to the Internet,
exemplified for him by the ever-increasing number of RFID tags that provided a
digital identity and linked products to databases. As such, Ashton had an industrial

process in mind when he talked about the Internet of Things, though the term is
increasingly used to refer to consumer products as well.
With the rise of cheap, ubiquitous Internet, the first consumer devices that
were not conventional end devices like desktop computers or, since the early 2000s,
smart phones, began to utilize network connections. The first wave saw Internet
radios and webcams with built-in web server capability.
It is hard to establish which products would qualify as the first real Internet of
Things devices. In 2006, the WiFi-connected Nabaztag digital pet (in the shape of
a stylized rabbit head) provided status information based on user interests (email,
stock market, weather) by subtle means such as lights, sounds and the direction of
its ears. It could also play and record voice messages for email, podcasts, and so
forth. The Nabaztag also had an application programming interface (API) to give
third parties access to the device's functionality, which gave rise to an ecosystem
of tools with which users could interact with their Nabaztag and exchange content.
Even earlier, since 2002, a spin-off from the MIT Media Lab called Ambient
Devices had experimented with Internet-connected devices that would inform their
users in nontraditional ways. Their one-pixel display, the Ambient Orb, was a
web-connected lamp that acted as a discreet indicator of status. Later they extended
their model of glanceable information to devices like the Ambient Umbrella, which
would inform its owner of an impending rain shower.
A significant step change for the invention of Internet-connected products, or
at least prototypes, was the Arduino. This open-source and open-hardware project
released its first microcontroller boards in 2005; its best-known model, the Arduino
UNO, followed in 2010. In contrast to other microcontroller boards that required
specialist knowledge and a complex toolchain to program, the Arduino was designed
to enable anyone to experiment with low-voltage circuits and computer control.
The system consisted of the actual controller board, as well as an integrated
development environment derived from Processing, in which programs are written
in a simplified dialect of C/C++. Another boost for experimentation with the
Internet of Things came from the Raspberry Pi, a credit-card-sized Linux computer
available from 2012. Notable fun projects that received wide media coverage
included Bubblino, a device that spots a hashtag on Twitter and responds with
soap bubbles, and Botanicalls, a set of applications that report the soil moisture
in a plant pot via Twitter.
Since the 2010s there have been countless products for the emerging smart
home market, along with a plethora of wearable devices that are often counted
as part of the Internet of Things. We will see more examples in Chapter 6 on
Applications of M2M and IoT.
Part III

Applications of M2M and IoT


Chapter 5
The Difference Between M2M and IoT

This is the first chapter of this book that exclusively addresses machine-to-machine
communication and the Internet of Things. As briefly mentioned in the preface as
well as in Chapter 4, there are a few semantic but also conceptual differences
between the two. Very briefly: in machine-to-machine communication, the Internet
replaces a direct physical or entirely private data connection; the business logic
is well-defined, and hardware and software solutions are built around those needs. The
Internet of Things is an extension of this concept, but in contrast to such purpose-built
connections and applications, everything (note the exaggeration) is connected to the
Internet, and software systems need to understand the context of data and provide
actionable insight and appropriate control. For the sake of simplicity, from here on
we will use Internet of Things in many cases that are classical machine-to-machine
communication, and use the abbreviation IoT as a generic term for both variants.
To better understand the differences between IoT and M2M, it is useful to
think about IoT as the result of a technical evolution (see Figure 5.1). Section 4.3
presented how industrial control systems allowed the computational control of
machinery in manufacturing and other industrial processes. Similar developments
happened in vertical markets, such as building control and public infrastructure
where municipalities could tap into existing telephone lines. In the M2M stage,
where most verticals are stuck at the time of writing, these direct connections are
complemented or replaced by other means. For example, mobile phone and satellite
technology allows the tracking of mobile assets such as vehicles, and together with
digital subscriber lines for localized infrastructure the Internet becomes the primary
route for data exchange. While these M2M connections all participate on the same
Internet, they are logically isolated and there is no communication between, for
example, connected industry and traffic information systems. The architecture of
these solutions is rigid, and the context for any analysis is defined by the business
logic.
We are on the brink of a world in which more and more infrastructure is
connected to the Internet, while at the same time a new breed of Internet-connected
consumer products emerges. Depending on the school of thought or industry analyst
firm, there might be some 50 billion connected devices by 2020. In its ultimate
realization, the IoT could connect everything. However, technical difficulties
such as the computational interpretation of arbitrary data from ever-changing device
combinations aside, questions of data ownership, privacy and provenance are going
to arise.
Currently many products and services are marketed under the IoT umbrella
that are neither M2M nor actual IoT. Technology pioneer Alexandra Deschamps-Sonsino
proposed a litmus test that helps to determine whether a thing participates in
the IoT (see Figure 5.2). Core criteria are the decentralized management of the data
in the cloud and the availability of a developer interface that allows privileged
third parties to access it. To really leverage the potential of IoT, software must be
able to understand the context of information, which can be enabled by adhering to
IoT interoperability standards. Along with strategies for defining the relationships
of devices to each other through ontologies, these issues are the subject of the
Software section (Part VII) of this book.
This chapter provides a few insights into the current state of IoT in various
verticals, and many examples of how individuals and communities could benefit
from the exchange of information between organizations, industries, markets and
infrastructure. The focus here is less on the precise technical details of the IoT
solutions (the book covers those in the subsequent four parts) than on highlighting
the potential of the technology if IoT were to become a network of interacting
ecosystems — an appetizer for the Internet of Things. Most of the discussed
scenarios assume that interoperability problems and issues of privacy and data
ownership are overcome. Energy markets, industry and connected vehicles are
exhaustively covered in the book Enterprise IoT by Slama et al., a recommended
read for those interested in more detailed information about the business incentives
behind the next steps in IoT.

[Figure 5.1 depicts three stages. Automation stage: industry (machine automation, industry robots,
process documentation), infrastructure for electricity, water and gas (machine automation, process
documentation), municipalities (traffic lights, status information at parking lots, air quality displays)
and buildings (heating, ventilation and A/C, lifts, blinds). M2M stage: demand and supply
communications, maintenance (predictive maintenance), vehicles (real-time asset tracking, distributed
manufacturing) and traffic (public transport updates, real-time management). IoT stage: retail and
consumer services, wearables, finance and health.]
Figure 5.1 The evolution from automation to the Internet of Things. It is difficult to assess to which
stage various verticals have evolved. While it can be assumed that almost all application fields are
employing technology of the automation stage, most companies are now trying to leverage the M2M
stage with defined data analytics. Ultimately, when IoT refers to an ecosystem, devices are autonomously
going to tap into relevant data sources.

[Figure 5.2 is a matrix that scores product categories (hardware, software, automation, M2M,
industrial IoT, gadget, proprietary IoT, IoT) against the features they offer: sensors/actuators/indicators,
local data collection, gateway device, cloud service, user interface (app), developer interface (API)
and interoperability; some features are marked optional for some categories. Categories without
connectivity are never IoT; the industrial categories are process-bound, the remainder are consumer
products.]
Figure 5.2 “A Litmus Test for IoT”, adapted from Alexandra Deschamps-Sonsino. It is important
to remember that this classification schema is based on the opinion of Alexandra Deschamps-Sonsino
and Boris Adryan; this is not a formal classification. Devices that are not connected and/or that collect
data only locally are not IoT. Cloud services, apps and application programming interfaces alone are
not IoT. In an industry context, most outward-facing features (user interface, apps, developer interface,
interoperability layer) are going to be process-bound (i.e., with a defined business logic in mind). Machinery
that connects to some sort of user interface falls into the automation category. If it makes use of the
Internet, but data access is exclusive to its stakeholders, it is M2M. If some of the data is offered through
an interoperability layer (e.g., providing access for suppliers and customers), it is industrial IoT. In the
context of consumer products, most devices for which an app constitutes a remote control but which don’t
offer developer access and interoperability are gadgets, or toys. To qualify for the IoT label, products
need to at least offer a developer interface, and ideally, guarantee a level of interoperability.
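The decision logic of the litmus test can be written down directly. The feature names and the exact precedence below are our informal reading of the schema, not a formal specification:

```python
def litmus_test(connected: bool, cloud: bool, developer_api: bool,
                interoperable: bool, consumer: bool) -> str:
    """Classify a product along the lines of the Figure 5.2 litmus test."""
    if not connected or not cloud:
        return "not IoT"              # data stays local: never IoT
    if consumer:
        if developer_api and interoperable:
            return "IoT"
        if developer_api:
            return "proprietary IoT"  # an API, but no interoperability layer
        return "gadget"               # the app is a mere remote control
    # industrial context: features are process-bound
    if interoperable:
        return "industrial IoT"       # data shared with suppliers/customers
    return "M2M"                      # Internet used, data stakeholder-exclusive

assert litmus_test(True, True, developer_api=False, interoperable=False,
                   consumer=True) == "gadget"
assert litmus_test(True, True, developer_api=True, interoperable=True,
                   consumer=True) == "IoT"
assert litmus_test(True, True, developer_api=False, interoperable=False,
                   consumer=False) == "M2M"
```

As with the figure itself, the point is not the exact rule set but that connectivity, cloud data, developer access and interoperability each move a product one step closer to the IoT label.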
Chapter 6
Common Themes Around IoT Ecosystems

It is difficult to predict the many applications of IoT once it has reached the stage of
ubiquitous expansion. We have not even started to comprehend the possibilities, just
as the concept of social media was unknown and its consequences unforeseeable at
the beginning of the Internet. Although IoT applications mean something different
for each industry, for municipalities and for consumers, the rationale can
be summarized as five core functionalities of the IoT (see also Figure 6.1):
• Demand and supply management
• Maintenance
• Information systems
• Remote access
• Services
When we use could in the examples for these five functions, in some cases this
already means is, or is about to become, reality. One application of IoT is the
communication of demand and supply. In an industrial setting, this could mean that
supply chain management sees a further degree of automation. Rather than
relying on manual stock keeping and reordering, connected sensors could count
and track assets during the manufacturing process and, depending on demand,
intelligent systems could take care of orders and asset tracking while new supply is
en route. However, the same principle applies to automatic restocking in a smart

[Figure 6.1, top, groups examples by application: Maintenance (industry: machinery; cities: roads,
bridges, power grid; home: appliances), Demand & Supply (industry: supply chain, asset tracking;
cities: waste management; home: automatic restocking), Remote Access (industry: telepresence;
cities: lights; home: appliances, heating), Information Systems (cities: parking, air quality, public
transport; home: occupancy, infotainment) and Services (industry: pay-per-use business models,
value-added services; cities: community engagement). The bottom groups the same examples by
vertical: Vehicles (autonomous driving, remote diagnostics, usage monitoring), Industry (optimised
manufacturing, smart retail, emerging business models), Cities (traffic management, waste
management, public transport), Home (energy management, infotainment, assisted living) and Health
(assisted living, compliance management), each combining the five core functionalities.]
Figure 6.1 Taxonomy of IoT ecosystems. Two strategies to visualize the impact of IoT on our lives
(top: based around applications; bottom: based around verticals). Note that these are just different sides
of the same coin: For example, the predictive maintenance of machinery is found in the Applications
taxonomy (Maintenance → Industry → machinery), as well as the Verticals taxonomy (Industry →
Optimised Manufacturing → Maintenance).

home. The Amazon Dash button, a small connected device that automatically
reorders household consumables at the press of a button, is a first step in this
direction.
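A threshold-based reordering rule of this kind is simple to state. In the sketch below, the product names, reorder points and order sizes are all invented; a real system would feed the resulting order into a supplier's ordering interface:

```python
def restock_orders(stock: dict[str, int],
                   reorder_at: dict[str, int],
                   order_size: dict[str, int]) -> dict[str, int]:
    """Return the quantity to order for every item at or below its
    reorder point, as a connected shelf or Dash-style button might."""
    return {item: order_size[item]
            for item, count in stock.items()
            if count <= reorder_at[item]}

orders = restock_orders(stock={"detergent": 1, "coffee": 6},
                        reorder_at={"detergent": 2, "coffee": 3},
                        order_size={"detergent": 4, "coffee": 10})
assert orders == {"detergent": 4}   # only detergent is below its threshold
```

The intelligence in an IoT setting lies less in this rule than in where the stock counts come from: connected sensors replace the manual stock-taking that the rule would otherwise depend on.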

The automated maintenance of equipment is another important application
with a large potential to save time and money. Predictive maintenance in industry
may take sensor data around common failure points in machinery and determine
the chance of failure, allowing for preventive repair. A related concept is
condition-based monitoring. For example, in the operation of a jet engine, the primary
purpose of condition-based monitoring is to ensure thrust remains within tolerance
and to take the engine out of service before it fails. In a city context, roads and bridges could
be monitored for structural integrity (e.g., the widening of cracks), eliminating the
need for site visits by engineers. Consumers may benefit from connected household
appliances that alert them and manufacturers of impending breakdown.
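A crude form of such failure prediction is trend detection on a sensor stream. The vibration readings, window length and alert threshold below are invented, and production systems would use far richer statistical or machine-learned models:

```python
def maintenance_due(samples: list[float], window: int = 5,
                    limit: float = 3.0) -> bool:
    """Flag a machine for preventive repair when the rolling mean of
    its last `window` vibration readings exceeds `limit`."""
    if len(samples) < window:
        return False                     # not enough data to judge
    recent = samples[-window:]
    return sum(recent) / window > limit

healthy = [1.1, 0.9, 1.0, 1.2, 1.0, 1.1]   # stable vibration level
wearing = [1.0, 1.4, 2.2, 3.1, 4.0, 4.8]   # steadily worsening bearing
assert maintenance_due(healthy) is False
assert maintenance_due(wearing) is True
```

Averaging over a window rather than reacting to single readings is what distinguishes a maintenance forecast from a simple alarm: one noisy spike does not schedule a repair, a sustained trend does.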

Remote access in the sense of an industrial control system is not new, but in
the future, city lights, home heating systems and even remote-controlled appliances
could enable novel, demand-based applications. The same is true for information
systems. Again, this is a function thus far mainly implemented in industrial control
systems, with the potential to penetrate many more areas of modern life. In a city
context, municipalities can use traffic and air quality data to provide better informa-
tion to citizens; in fact many communities already publish information like real-time
bus timetables or parking lot occupancy. While at this stage this data may be of low
resolution (e.g., simply indicating whether or not there are spaces in a parking lot),
we may see detailed information about the number of available on-street parking
spaces. In the context of home or building automation, information systems can
help to improve energy usage or security (e.g., when rooms can report occupancy,
and doors and windows whether they are open or closed).

An important aspect is the novel services that can emerge in the IoT. For industry
and retail, these may be pay-per-use business models (often referred to as from-capex-to-opex
models), value-added services such as automated maintenance, or the sale
of usage data to third parties: data from a connected tire may carry information
useful to insurance companies (information about the driving behavior of individuals),
car manufacturers (e.g., to inform engineering decisions about stress points),
producers of navigation systems (e.g., warning of a corner that frequently sees panic
braking) and many others. The service aspect for cities could bring new forms of
community engagement, for example the incorporation of data from citizen-owned

environmental sensors into information systems.

In an attempt to grasp the impact of IoT, the taxonomy above can also be centered
on verticals or domains, such as:
• Industry
• Agriculture
• Municipalities (Smart cities)
• Buildings
• Health
• Vehicles
For each of these verticals, one can draw up applications that touch upon the
five core functionalities mentioned before. It is important to keep in mind that these
are applications that seem obvious from a current perspective, and that the suggested
conceptual connections are entirely independent of the communication architecture
of the Internet. Also, some of these ecosystems are connected and embedded
within each other. For example, a smart city can have many smart buildings, and
some of those buildings may support assisted living and integrate closely with a
connected health solution. Again, it is up to us as users to comprehend how these
various IoT ecosystems, at their different scales of geographic distribution and time,
relate to each other, because from an information technology perspective they are
just different data sources that connect via the Internet.

6.1 INDUSTRY

For the purpose of this book, the electricity, water and gas infrastructures, the
manufacturing industry and retail all fall under the industry umbrella. The manufacturing
industry in particular is often referred to as Industrie 4.0 (with IoT seen as the fourth
industrial revolution after steam power, mass production and digitalization, and the
German spelling a nod to the origin of the phrase), industrial IoT (IIoT) or
simply smart manufacturing.

6.1.1 Smart Energy

Energy production and distribution are among the key challenges of the future.
The increasingly industrialized world population requires more energy than ever.
At the same time, the burning of fossil fuels is recognized as a contributor to global
warming, and, away from fossil and nuclear energy alike, there is a global trend
toward alternative, renewable energy sources. In contrast to the demand-driven
operation of coal and nuclear power plants, which were often established close to
the sites of high demand, renewable energy sources such as wind and sunlight face
the problem of storage and distribution. While off-peak storage heating systems
have been around for more than 40 years, these weather- and time-of-day-dependent
power plants need to utilize intelligent storage and improved spatial distribution,
as they are typically placed in areas with optimal weather patterns (see Figure 6.2).
A few smaller power plants using different types of renewable energy (wind,
solar, biogas) can be managed together as a virtual power plant, where the
fluctuating supply can be balanced locally before feeding into the grid. This requires
communication between the different plants, as it is important to avoid overproduction
of energy. Facilities such as pumped-storage hydroelectricity or power-to-heat or
power-to-gas conversion allow the storage of electricity off-grid. Traditional resupply
paths for energy storage simply monitored the grid for demand; with IoT, primary
energy providers can communicate with storage facilities to provide the optimal
supply for the network. These strategies touch upon all five core IoT functionalities,
including servitization, as the communication network enables real-time energy trading.
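The local balancing of a virtual power plant can be sketched as simple bookkeeping: surplus is diverted into storage before the remainder is fed into the grid, and shortfalls draw on storage before the grid. All figures below are invented megawatt values:

```python
def balance(outputs: list[float], demand: float,
            storage: float, capacity: float) -> tuple[float, float]:
    """Balance fluctuating plant outputs against local demand.
    Returns (new storage level, net feed into the grid)."""
    surplus = sum(outputs) - demand
    if surplus >= 0:
        stored = min(surplus, capacity - storage)   # charge storage first
        return storage + stored, surplus - stored   # the rest goes to the grid
    drawn = min(-surplus, storage)                  # discharge storage first
    return storage - drawn, surplus + drawn         # any remaining shortfall
                                                    # is drawn from the grid

# Windy afternoon: wind, solar and biogas produce 12 MW against 7 MW demand,
# with 3 MW of headroom left in storage.
storage, grid = balance([5.0, 4.0, 3.0], demand=7.0, storage=1.0, capacity=4.0)
assert storage == 4.0 and grid == 2.0
```

The communication requirement follows directly from this arithmetic: every plant and storage facility must report its current figures before the balance can be computed, which is exactly the data exchange the text describes.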
Large-scale industry customers and municipalities connected to the smart
grid could internally aim to balance demand and supply. It is conceivable that
(a) some industries convert waste heat and pressure back into electricity to supply
their own needs and those of neighboring production sites, and (b) production sites
automatically negotiate a schedule for energy usage with the grid and between each
other. Even before smart energy technology became available, it was not uncommon
for commercial customers to have interruptible contracts that benefit from a lower
price but take short interruptions in supply into account. Technology can only lead
to a further refinement of these contracts. Similar strategies may also be useful for
municipalities that use electricity for services such as district heating.
On the small scale, the future generation of buildings is going to be more
energy efficient and may include the local production of wind, solar or geothermal
power. Again, the communication among energy-demanding devices, local storage
batteries and the local microgrid may help to balance power consumption during
106 The Technical Foundations of IoT

Figure 6.2 The smart energy ecosystem. Renewable energy, energy storage, intelligent grid architectures and the interplay with industry and municipalities are key to overcoming environmental issues. (Diagram: large-scale wind and solar farms, fossil and nuclear plants and energy storage feed a distribution infrastructure that serves industry, municipalities, buildings and traffic; local microgrids combine wind turbines, solar power, batteries and vehicles.)
Common Themes Around IoT Ecosystems 107

peak times. While today heating and appliances are probably the most power-
demanding equipment in terms of peak currents, the charging of electric vehicles
is likely to become more prominent. With the capacity of car batteries massively
exceeding the needs of the daily commute, vehicles that integrate traffic information,
travel schedules and battery status can engage in microtransactions.
This means that they can sell electricity to feed into the microgrid or
buildings to taper off peak demand, or communicate with recharging stations when
and where to replenish. This, however, also requires further investment into smart
grid technology as some grid designs are intrinsically one-directional.
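The decision logic behind such vehicle-to-grid microtransactions can be sketched as follows; the price thresholds and the safety margin are invented for illustration only:

```python
# Illustrative vehicle-to-grid decision rule (invented thresholds): a car
# sells stored energy during peak demand only if enough charge remains
# for the planned commute, and recharges when prices are low.

def v2g_action(battery_kwh, commute_need_kwh, price_ct_per_kwh,
               sell_above_ct=35, buy_below_ct=20, reserve_factor=1.5):
    reserve = commute_need_kwh * reserve_factor  # keep a safety margin
    if price_ct_per_kwh >= sell_above_ct and battery_kwh > reserve:
        return ("sell", battery_kwh - reserve)   # energy offered to microgrid
    if price_ct_per_kwh <= buy_below_ct:
        return ("charge", 0.0)
    return ("idle", 0.0)

print(v2g_action(battery_kwh=60.0, commute_need_kwh=12.0, price_ct_per_kwh=40))
# ('sell', 42.0)
```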
An important aspect of the smart energy concept is the consumer. Remote
access to appliances or heating, ventilation and air conditioning, real-time infor-
mation about power consumption and cost (smart metering), as well as their active
participation in the energy market (in the form of microtransactions) may change
their behavior.
Remote condition monitoring for maintenance is an important application
on all technical and organizational levels, from generators in power plants to grid
components and distribution infrastructure to batteries. It has been said that, before
IoT, the primary "sensor" through which energy providers learned about failures in their
infrastructure was the phone call from an angry customer.

6.1.2 Smart Manufacturing

The roots of smart manufacturing lie in operational software and industrial control
systems. On the top floor the tasks of enterprise resource planning (ERP) and
material requirements planning (MRP) have long been standardized and are well
reflected in software. On the shop floor, manufacturing execution systems (MES)
have increasingly become a middleware to interface to DCS/SCADA systems (see
Connecting Things, Section 4.3).
Monitoring status information and remote control are traditional applications
in industry. In complex environments, such as machine floors, novel methods for
indoor tracking of personnel, tools and materials in an integrated manner can
improve both safety and productivity (see Figure 6.3). The precise tracking of hand
tools can even inform better routines for the production process. The data of material
tracking systems (e.g., via RFID tags) can help collate all information related to one
particular piece of a product (product history), data that is both of relevance for due
diligence as well as improved service to the consumer.
Most industries rely heavily on an uninterrupted supply chain. It is conceiv-
able that warehouse stock can be monitored and managed through asset tracking
systems, eliminating the need for manual counting and record-keeping.

Figure 6.3 Smart manufacturing. ERP, MRP and APS are classical components of manufacturing
software systems. They coordinate the acquisition of materials, schedule the workforce and plan the
production process. The requirements for a production cycle are fed into the MES, which controls
processes and machinery, often under consideration of immediately available materials. In traditional
manufacturing, the only interaction with the customer (typically wholesalers) was the order coming
into the ERP. In a connected world, every aspect from product design to the actual manufacturing step
to customer care and aftermarket services is part of the process. Warehouses and the supply chain
are highly integrated, allowing more efficient just-in-time delivery and production. Machine failure
on the shop floor can be prevented through predictive maintenance strategies. Continuous tracking of
components and location/usage data from handheld tools can provide immediate feedback to the MES,
allowing for optimization of production and quality control while in production. One example is the
connected ratchet in automotive production lines that monitors and documents the speed, number of
revolutions, force, torque, and so forth for each screw that is being mounted. This data can go into
product data management systems, and may help service engineers to detect problems even at a later
stage when the product is deployed in the field. IoT offers additional opportunities to engage with the
customer. As products can already be allocated to a particular order, it is possible for customers to
submit change requests and have the MES react to those in real time. Once deployed, the product can be
continuously improved through software updates. Usage data can be retrieved to understand customer
profiles, which along with maintenance information can help inform future product designs.

In combination with logistics information systems, supply chain management can take into
account not only the goods that are currently in stock, but also the quantity and
quality of materials in transit to the factory. The latter may be of special importance
where perishable goods that require stable environmental conditions are transported,
or where careful handling needs to be assured — sensors for GPS localization,
temperature and humidity, and orientation through gyroscopes come to mind.
The predictive maintenance of production machinery itself is a very important
application of IoT. While it is clear that any system needs to undergo regular
maintenance, the precise monitoring of equipment allows outages to be scheduled
before critical failures occur. In some cases it may be possible to automatically
order replacement parts so that just-in-time repairs are possible, entirely without
human intervention before a breakdown occurs. Condition-based monitoring also
increases availability as one can switch from schedule- or time-based preventive
maintenance to a predictive, demand-based system. Intelligent systems could feed
this information into planning and scheduling applications (APS) and/or even
determine an ideal service schedule on the basis of machine failure likelihood and
product demand.
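A minimal sketch of such demand-aware scheduling, assuming a hypothetical failure-likelihood estimate and a demand forecast per service slot (both invented here):

```python
# Sketch of demand-aware predictive maintenance (numbers invented): pick
# the service slot with the lowest forecast product demand once the
# failure likelihood passes a threshold.

def schedule_service(failure_prob, demand_by_slot, prob_threshold=0.2):
    """Return the slot with the lowest forecast product demand, or None
    if the machine is still considered healthy."""
    if failure_prob < prob_threshold:
        return None
    return min(demand_by_slot, key=demand_by_slot.get)

demand = {"Mon": 900, "Tue": 400, "Wed": 700}
print(schedule_service(failure_prob=0.35, demand_by_slot=demand))  # Tue
```

In practice the failure probability would come from condition monitoring models, but the trade-off shown here — service early, and in a quiet production window — is the core of the idea.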
The connected shop floor and the connected product enable novel customer
and aftermarket services. While in traditional for-stock and order-based production
the ordering process was concluded with the submission of the order by either a
stockist or consumer, it is conceivable that in the future their requests can propagate
so quickly through the system that last-minute amendments for customization are
possible, or conversely, that the production team on the shop floor can check for
details directly with the customer. Once delivered, the product can still be improved
by uploading software updates. Depending on the type of product, even an app store
model is possible that allows the customization of user interfaces, and so forth.
Product history and usage data allow predictive maintenance for the product, and
if appropriate service agreements are in place, the product itself may indicate to
either the producer or the customer (or both) that a repair is in order. This data is
also important to inform the design of future product generations. Every deployed
product is therefore also going to participate in a field test.
Especially in markets where long-term leasing agreements are common,
subscription-based or pay-per-use contracts are enabled by IoT. This may allow
manufacturers of expensive equipment to sell their product at a cheaper price and
expand their reach, and recover cost with each use. One of the most prominent
cases is the leasing of jet engines by airlines, which has given rise to an entire
industry. While the airplanes belong to the airline, the engines remain assets of the
vendor, who employs real-time monitoring and performs maintenance, thus taking
organizational and technical pressure off the airline.

6.1.3 Smart Retail

The challenges in retail in some sense resemble those of manufacturing in terms of
demand and supply. Intelligent supply chain management and the automated stock-
keeping and shipment between stores require the same asset tracking infrastructure
and afford the same opportunities.
Retail store information systems of the future can combine footfall data
and mobile phone traces (through beacon or WiFi triangulation)
to optimize indoor traffic and the placement of products along popular routes
from the entrance to cash registers. It is conceivable that this information can
be used in real-time to inform both the supply chain and analyze the impact of
marketing campaigns. Much like recommender systems provide special offers to
those researching a product online, in-store tracking can use a customer's repeated
presence in a store or an area within a store to trigger engagement via apps
or websites. This seems especially relevant for premium products that require an
additional impulse to buy.
Appliances that keep track of consumables (including the proverbial con-
nected fridge) enable novel types of services. Sharing this information with retail
stores may enable them to provide customized offers for individuals and house-
holds. If vending machines are continuously monitored and restocked, decentralized
outlets enable the exploration of novel and unusual marketplaces. One example of
this development is duty-free electronic products that can be purchased from vend-
ing machines in airports.

6.1.4 Agriculture

Food production is another key challenge in a world facing a growing population
and climate change. In industrialized countries, agricultural production is already a
highly engineered process. However, for crop production outside of industrialized
greenhouses, even in the developed world, the precise control of environmental
conditions is often not possible or complicated by the need for human intervention.
Critical challenges for plant growth are water, temperature, soil quality, diseases
and pests. Unfortunately, water is a scarce resource and the irrigation of fields
must be controlled, especially in times of drought. The runoff from fertilizer and
dung when it enters the water system can lead to excessive growth of algae, which
in turn diminishes the amount of oxygen in water. Chemical interventions against
plant diseases and pests are generally poisonous, and their distribution needs to be
carefully managed. The temperature of soil and offshoots can be held higher than
the environment by overlaying fields with black plastic foil, but again, this requires
manual labor and farmers try to avoid it when possible.
It is difficult to frame the different potential IoT applications into exactly
the same categories as shown in the taxonomy of Figure 6.1. Here are just a few
examples from the agricultural context that exemplify information systems, remote
access, and demand and supply management:
• The microclimate in fields can be monitored with probes that indicate which
parts of a field may require irrigation or protection. Through the automated
integration with weather forecasts, this enables information systems to advise
where and when to invest labor and resources, rather than using them based
on a schedule and on an entire field when a spatially more restricted measure
would be sufficient. Automated irrigation (with remote access) should take
into account soil humidity, temperature, groundwater levels and weather
forecast. Also, when fertilizer or chemical interventions have been applied,
it is advisable to minimize irrigation to avoid their dilution or transport
into the water system. This requires communication between sensors and
actuators.
• Not all dung is created equal, especially when smart energy (see Section
6.1.1) and modern agricultural production meet. Waste management systems
can determine the chemical composition of animal waste and prioritize its
use for energy production in biogas installations or fertilization. Cross-
energy solutions manage the demand and supply for heat, electricity or
biogas. This is where agricultural IoT can interface with the electricity grid
or local consumers.
• Asset tracking is an important IoT application also for the agricultural
user. This does not stop at vehicles and tools. Livestock can be tracked in
the field, and along with sensors for body temperature and motion, their
health status may be inferred, allowing for what effectively is predictive
maintenance. If within a given geofenced area there is a systematic increase
of body temperatures, preventive measures like quarantine may be taken
before a disease can spread. Ultimately, the tracking identity of the animal
can be communicated to downstream processing units, be it to monitor milk
production or proof of origin for the slaughterhouse. This is already done for
some large farm animals using RFID tags.
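The sensor/actuator interplay of the irrigation example above can be sketched as a simple rule; all thresholds here are invented for illustration:

```python
# Illustrative irrigation controller: combines soil humidity, rain forecast
# and recent chemical application, as described in the first bullet above.
# Thresholds are invented; a real controller would also use groundwater levels.

def should_irrigate(soil_humidity_pct, rain_forecast_mm,
                    days_since_chemicals, dry_below_pct=30,
                    rain_skip_mm=5, chemical_hold_days=2):
    if days_since_chemicals < chemical_hold_days:
        return False          # avoid washing fertilizer/pesticides into the water system
    if rain_forecast_mm >= rain_skip_mm:
        return False          # the weather will do the job
    return soil_humidity_pct < dry_below_pct

print(should_irrigate(soil_humidity_pct=22, rain_forecast_mm=1, days_since_chemicals=5))
# True
```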
Farmers are known to be technological early adopters (e.g., trickle irrigation,
large machinery and pesticide-resistant crops). It can therefore be anticipated that
important changes incorporating IoT strategies into agriculture will occur.

6.2 CITIES AND MUNICIPALITIES

Worldwide, people are moving to urban areas at an increasing rate. While currently
about half of the world population lives in cities, that number is expected to rise
to more than 70% by 2050, or even higher in emerging and developed countries
that are already largely urbanized. The world population is increasing at the same
time, with projections suggesting at least 10 billion people. This
puts immense pressure on cities and municipalities, whose budget is often not even
sufficient for maintaining existing infrastructure, let alone the development of new
structures and facilities that can cope with urban growth and increasing headcount.
Even with funds available, cities and municipalities often face limited capacity for
expansion, as roads, railways or pipework can typically not cross existing building
structures. While the challenges for city planners are manifold, the focus here is
going to be on the key issues of energy, gas and water, environment, traffic and
security. A general problem especially in the industrialized world is the age structure
of our society. This presents challenges around the services that cities can and
have to offer, and ranges from public transport options suitable for the elderly to
support with mobile nurses. Many cities already claim that they will go smart in the
near future; however, offering only a few digital services does not exhaust the
realm of possibilities. Here, the focus is on novel opportunities.

6.2.1 Energy, Gas and Water

Energy distribution is an important aspect in urban areas. The existing cables have
limited capacity and supplying a larger number of people with electricity requires
an integration with smart grid solutions (see Section 6.1.1) to balance supply
and demand. While it is difficult to impose restrictions on citizens, the energy
consumption of urban infrastructure itself will have to be reduced. The incentives
for municipalities are twofold: the demand on the power grid can be reduced and the
saved electricity made available for other uses, while at the same time saving
money.
Smart street lamps that reduce the light level during off-peak hours or switch
off certain areas entirely during the night (e.g., along highways) are one possible
solution. Many smart lighting systems that are connected to the Internet are already
commercially available. Modern streetlights are also capable of reporting their own
functional status (the maintenance aspect) and providing metrics about their energy efficiency
(information system). As the scheduled or demand-triggered control of lighting has
implications for security (crime) and safety (accidents), these solutions would bene-
fit from the integration with sensor networks that determine footfall and traffic flow.
Similar entry points for energy saving by establishing demand-based infrastructure
could be the optimal control of water pumps in the sewage system, which in turn
would require the integration with a water management system (e.g., a pump could
run more slowly and the system could rely more on a naturally occurring slope if there is
no flood imminent), or by provisioning a carefully priced energy mix of electricity
and district heating to incentivize optimal usage behavior.
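A hypothetical street-lamp dimming policy along these lines; the hours and brightness levels below are invented, not taken from any deployed system:

```python
# Hypothetical smart street lamp policy: full brightness when pedestrians or
# vehicles are detected, switched off during the quietest off-peak hours
# (e.g., along highways), dimmed otherwise at night. Numbers are illustrative.

def lamp_level(hour, activity_detected, off_peak=(1, 5)):
    """Return the lamp brightness in percent for a given hour (0-23)."""
    if activity_detected:
        return 100
    if off_peak[0] <= hour < off_peak[1]:
        return 0      # switch off entirely during the quietest hours
    if hour >= 22 or hour < 6:
        return 40     # dimmed night-time level
    return 0          # daylight: no artificial light needed

print(lamp_level(hour=23, activity_detected=False))  # 40
```

Integrating footfall and traffic-flow sensors corresponds to the `activity_detected` input, which overrides the schedule for safety.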
The pipework for water and gas in many North American and European
cities is more than a century old. There are estimates that up to 20% of the water
transported in pipes underground in England and Wales is lost every day due to leaks
in the decaying network. The identification and repair of leaks requires pigging
and occasionally the deployment of underground engineering teams and excavation.
With a suitable sensor network, the structural health of pipework could be surveyed
continuously, avoiding the time and cost of emergency engineering works. A major
role of the sewage system, especially when sewage and storm drains cannot be
separated, is the draining of floodwater. These systems would benefit from an
integration with water level measurements in wells and rivers, to determine if all
sewage should pass through wastewater treatment or, when flooding is likely, if,
for example, excessive rainwater should be diverted directly away from the city.
While such measures are already applied empirically, a solution participating in the
IoT would make relevant information (e.g., flood warnings) available to citizens in
real-time as well.

6.2.2 Environment

Air quality in cities is a major problem. While it can be anticipated that the shift
to electric vehicles and improved public transport may help address the issue, for
the time being traditional combustion engines are going to be a main contributor
to urban pollution. Whether diesel particles accumulate in the air or nitric oxide
rises above critical levels is often weather-dependent. Traffic
patterns, surrounding industries, prevalent heating systems and microclimate all
contribute to the uneven distribution of toxins in urban areas. Hence, the denser the
sensor network in an area, the better municipalities can make informed decisions
about countermeasures, such as traffic rerouting or incentivizing other sources of
heating. Air quality information systems are also a potential entry point for citizen
engagement: Data from home-based stationary or personal mobile sensor devices
can improve the spatial resolution of the analysis that is possible with the usually
more specialized sampling stations and diffusion tubes utilised by municipalities.
Water management and the monitoring of water quality are of special im-
portance in arid and semiarid areas of the world. Efficient rain water harvesting
and grey water recycling (reutilization of wastewater without fecal contamination)
require the same information infrastructure as discussed under energy, gas and water
(see Section 6.2.1). While the quality of water is continuously monitored in most
industrialized countries and the supply usually qualifies as drinking water, the necessities of
a dry climate may require a coarse-grained classification into wastewater, usable wa-
ter (e.g., for cleaning and irrigation) and drinking water. Reliable information is
key for the success of such a model, not only to control water flow to avoid cross-
contamination but also to build trust in the population.
The remote monitoring of waste levels in collection containers gives munic-
ipalities the ability to plan recycling strategies in the longer term (e.g., if there is
a trend toward more compostable or otherwise recyclable waste), but also to opti-
mize collection. Real-time information could enable the disposal services to stop
only where bins indeed need emptying, or if integrated with traffic information, to
reschedule or reroute service tours in areas of heavy traffic.
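The "stop only where bins indeed need emptying" idea reduces, at its core, to filtering bins by fill level; a toy sketch with invented bin names and data:

```python
# Toy collection planner: visit only bins whose reported fill level exceeds
# a threshold, mirroring the demand-based collection described above.
# Bin names and percentages are invented for illustration.

def bins_to_empty(fill_levels, threshold_pct=75):
    """fill_levels: mapping bin id -> fill level in percent."""
    return sorted(b for b, pct in fill_levels.items() if pct >= threshold_pct)

levels = {"bin_a": 90, "bin_b": 40, "bin_c": 78}
print(bins_to_empty(levels))  # ['bin_a', 'bin_c']
```

A real system would feed this shortlist into a routing engine together with live traffic information.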

6.2.3 Traffic

Urban traffic includes pedestrians, cyclists and motorists as well as various forms of
public transportation. Current sensor technologies allow the quantification of pedes-
trian traffic through pressure-sensitive panels in walkways, simple light barriers or,
indirectly, through counting wireless devices such as mobile phones in an area. This
information can be used in the long-term planning of city layouts as well as for more
immediate measures including barriers in cases of emergencies.
Traffic flow is already being monitored in many cities through induction
loops in road surfaces and, in some countries, numberplate recognition, and is
increasingly augmented with Bluetooth tracking. As part of an IoT solution, this
information could not only be used for the automatic control of traffic lights
and variable message signs along roads as already done, but also to relay that
information directly to vehicles and individuals to dynamically plan and adjust their
journeys. The lack of parking spaces is a key issue in urban traffic management.
While data from parking meters and pay-and-display units are occasionally already
available online in aggregated form, future guidance systems could take
real-time sensor data from on-street parking and relay this information to citizens
directly on the approach to their designated destination. Add smart vehicles, and the
routing to the next available parking space could be automated. As on-street parking
and parking garages have limited capacity, or in response to pollution levels, park-
and-ride schemes could adjust their pricing dynamically to incentivize their use. Bus
timetables with reliable estimated arrival times, based on real traffic information
along the route, could further drive the adoption of public transportation.
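A guidance step of this kind can be sketched as a nearest-free-space lookup over live occupancy data; the space names and coordinates are invented:

```python
# Sketch of a parking guidance step: pick the nearest free on-street space
# to the destination from live occupancy data. Names/coordinates invented.

from math import hypot

def nearest_free_space(destination, spaces):
    """spaces: mapping name -> (x, y, occupied)."""
    free = {n: (x, y) for n, (x, y, occ) in spaces.items() if not occ}
    if not free:
        return None
    return min(free, key=lambda n: hypot(free[n][0] - destination[0],
                                         free[n][1] - destination[1]))

spaces = {"P1": (0.0, 1.0, True), "P2": (2.0, 2.0, False), "P3": (5.0, 0.0, False)}
print(nearest_free_space((1.0, 1.0), spaces))  # P2
```

With smart vehicles in the loop, the returned space would feed directly into the navigation system rather than a sign or app.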
The structural health of roads and bridges lends itself to predictive mainte-
nance. The integrity of bridges requires regular visits from engineers and compli-
cated scaffolding. With the appropriate sensors in place, vibration and material ero-
sion could be measured to provide a better schedule for maintenance. This demand-
based system could reduce costs and alleviate potential traffic problems, especially
if an optimal schedule for repair can be determined on the basis of the different data
sources.

6.2.4 Security and Safety

With municipalities having better insight into where people in urban areas are
activities they might be involved in, public services can provide better security and
safety.
While warnings were traditionally issued acoustically via sirens, today alerts to mo-
bile devices may provide better coverage. In geographic areas where earthquakes
and tsunamis are a possible risk, relaying this information to citizens is
vital. Toxic fumes from increased volcanic activity accumulate more slowly,
and warnings can be issued in time. Impending landslides can be predicted with vibration
and stretch sensors. This data can help authorities to issue road closures or direct
pedestrians to more secure areas.
Security through tracking of individuals by their wireless footprint or via
camera surveillance is extremely controversial. However, noise-activated dispatch
of police patrols may provide a reasonable alternative. One already commercially
available system can recognize and pinpoint, for example, gunshots and car crashes
through a mesh of connected microphones. While still being a niche solution, one
particular use case for such noise maps is the sharing of data with citizens suffering
from conditions like anxiety, as the information may help them to navigate the city
without exposing them to their triggers, just as pollution data would help asthmatics
to avoid areas of risk.

6.2.5 Summary

In conclusion, the vast ecosystem of an urban area offers a huge range of possible
entry points for IoT. While some of the scenarios suggested in this section may
never see the light of day, others are readily available and wait for adoption. A
commercially very active area is traffic guidance and smart parking, so it can be
expected that at least some cities are going to experiment with traffic avoidance to
alleviate pollution.

6.3 CONNECTED VEHICLE

Modern vehicles have well over 50 separate electronic systems that control every-
thing from motor performance and assisted steering/braking to lights and entertain-
ment systems. With cars themselves becoming electric, this number is likely to rise
in the future. The connection of these systems to the Internet is only the next logical
step in the evolution of the car. The previous sections have shown a wide range of
examples how (predictive) maintenance can help to ensure functionality and reduce
the chance of failure and cost, and the connected car is no exception. Whether this
is marketed just as a premium service or if car metrics are transmitted by default to
enable engineers to improve future designs remains to be seen. Several car manufac-
turers already offer additional services that include data contracts, vehicle tracking
and automated emergency calls; for example, OnStar developed by General Mo-
tors. The European initiatives e-Call and b-Call aim to alert the police and rescue
services if the car detects a collision (e-Call) or if a breakdown occurs (b-Call).
The idea is that unless a driver actively cancels the operation, information about
the vehicle’s location as well as critical parameters are automatically transferred
to those services. In the long run and if services like e-/b-Call become mandatory,
this would enable municipalities to avoid the cost of maintaining emergency phones
along motorways, pushing the cost for these services towards owners of vehicles.
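A sketch of how such an automatic notification might be assembled; the field names are invented and do not follow any standardized e-Call data format:

```python
# Hypothetical e-Call-style payload assembly (field names invented): unless
# the driver actively cancels, the vehicle sends its location and critical
# parameters to the rescue coordination service.

import json

def ecall_payload(lat, lon, impact_g, cancelled_by_driver):
    """Return a JSON message for the rescue service, or None if cancelled."""
    if cancelled_by_driver:
        return None
    return json.dumps({"event": "collision", "lat": lat, "lon": lon,
                       "impact_g": impact_g}, sort_keys=True)

print(ecall_payload(52.52, 13.40, 8.3, cancelled_by_driver=False))
```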
It is still a long way from connected cars to fully autonomous driving aug-
mented by the IoT. However, there are already thriving ecosystems around vehicle
data services. Fleet management information systems allow operators to track vehicles
provide an overview of a fleet’s health: localization, fuel level and a digital logbook
usually represent the core services. This allows remote operators to plan journeys,
confirm scheduled breaks and assist financial controlling with critical data. These
systems have been used by logistics companies since the late 1990s. The section
about smart manufacturing (see Section 6.1.2) discussed how this information can
be used with other asset and load tracking solutions to facilitate an integrated supply
chain.
With electricity becoming the default fuel in the future, the whereabouts of
active vehicles demanding charge is critical information for energy providers. It would
only be reasonable to couple this data and allow two-way communication to balance
demand and supply, and inform drivers of optimal refuelling points along
their route.
There is a massive range of novel services that are enabled by the connected
vehicle. Car sharing schemes like ZipCar in North America and parts of Europe use
IoT for their pay-by-use service. Customers can locate available cars in real-time,
and after booking they can unlock the cars with their mobile phones (following a
remote trigger). Other subscription-based services include automatic routing and
traffic management, combining satellite navigation and real-time traffic updates,
and the automatic payment of fees and street charges when vehicles enter toll-
restricted areas. Private security companies also offer the retrieval and tracking of
stolen vehicles as soon as they leave a certain geofenced area, even before their
owners become aware of their loss.
Already existing services also include pay-by-use (one-off) insurance. The
insurance market per se has recognized interesting use cases of IoT, at this stage
primarily around the car: On the basis of usage data, driver behavior can be
assessed (e.g., on the basis of typical acceleration and deceleration patterns, average
speed, fuel consumption, even preferred parking locations) and help companies to
determine into which risk category a driver may fall. The benefit of such solutions
is twofold, although again, controversially discussed: Drivers who know that they
are being held accountable for their behavior (by means of their insurance rate) tend
to drive more carefully and economically, whereas insurance companies are able to
better understand their customers and calculate risk closer to their margins in order
to offer better deals (e.g., also for young and inexperienced drivers).
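A toy version of such a risk assessment, classifying a trip by the rate of harsh acceleration and braking events; thresholds are invented, and real actuarial models are far more involved:

```python
# Illustrative usage-based insurance score: count harsh acceleration and
# braking events (in m/s^2) in a trip trace and map them to a risk band.
# All thresholds are invented for illustration.

def risk_band(accel_trace_ms2, harsh_ms2=3.0):
    harsh_events = sum(1 for a in accel_trace_ms2 if abs(a) > harsh_ms2)
    rate = harsh_events / len(accel_trace_ms2)
    if rate < 0.05:
        return "low"
    if rate < 0.15:
        return "medium"
    return "high"

trip = [0.5, -1.2, 3.5, 0.2, -4.0, 0.1, 0.3, -0.2, 1.0, 0.4]
print(risk_band(trip))  # high
```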
Autonomous driving describes concepts ranging from assisted parking to fully
automated operation of the vehicle. The automated parking of vehicles only requires
a few sensors that determine the size of a parking space and computation of how to
optimally steer the car into it. These systems require the presence of the driver, either
in the seat or at least standing nearby to oversee the process. This assistance does not
rely on IoT. Real autonomous driving, however, requires communication between
vehicles, and in order to optimize traffic flow, communication with infrastructures
like digital road signage and traffic lights is required as well. At this stage of
automatization and data exchange, vehicles will fully integrate into the various
ecosystems of the IoT. However, latency and bandwidth are still inhibiting factors,
and thus further investment into the data infrastructure along the road is
necessary to reach this level of autonomy.

6.4 Smart Buildings and Assisted Living

Smart buildings, smart homes and assisted living go hand in hand. While the focus
of connected buildings is on building control and how to manage them efficiently
(e.g., in terms of cost or energy), smart home and assisted living solutions touch on
subjective targets such as feelings of convenience, security and individual health. As
such, there are significant potential consumer IoT products and services that deserve
consideration. The first section on smart buildings will focus on IoT for professional
building management. Many of the use cases discussed there are also applicable to
the smart home, a section where convenience and assisted living come to the fore.

6.4.1 Smart Buildings

Broadly speaking, building management is the same for residential homes, office
buildings and other, non-industrial buildings. The core amenities include controlled
temperature, hot/cold water and electricity (see Figure 6.4). Heating and lights can
be considered default infrastructure in any house, whereas depending on geographic
location, ventilation and air conditioning as well as automatically controlled doors
and windows may be more prevalent in commercially used buildings.
The latter settings often provide central control for heating, ventilation and
air conditioning (HVAC), and lights. More complex building management systems
also integrate data from room temperature sensors, occupancy/intrusion detection
sensors and fire alarms. Traditional systems are independent of the Internet;
in fact there are a range of bus systems (physical interfaces and data exchange
protocols) specifically developed for device management in buildings.
The premise of a smart building is that these infrastructural entities commu-
nicate with each other and allow an automated control system to react like a human
operator might. For example, if intense sunshine is detected and the temperature
rises inside an office, window blinds can automatically close, the windows shut
and the HVAC counteracts by providing cool air. As this is an energy-intense
process, it would only be triggered if passive-infrared sensors detect actual room
occupancy. In particular, green buildings that are built for optimal energy (a low
carbon footprint) and water usage to minimize their environmental impact require
such elaborate control systems. This also exemplifies why 'smart' does not
automatically necessitate an Internet connection.
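As an illustration of such rule-based coordination, consider the following sketch. All sensor names, thresholds and actuator commands are hypothetical and chosen for this example only; a real building management system would use tuned, per-zone parameters.

```python
def control_actions(sunlight_lux, room_temp_c, occupied,
                    temp_setpoint_c=22.0, glare_lux=20000):
    """Derive actuator commands from room sensor readings.

    All thresholds are illustrative assumptions, not values from
    any particular building management system.
    """
    actions = []
    hot_and_bright = sunlight_lux > glare_lux and room_temp_c > temp_setpoint_c
    if hot_and_bright:
        actions.append("close_blinds")
        actions.append("close_windows")
        # Active cooling is energy-intense, so only trigger it when
        # the passive-infrared sensor reports actual occupancy.
        if occupied:
            actions.append("hvac_cool")
    return actions

# Bright, warm and occupied: shade the room and cool actively.
print(control_actions(40000, 25.0, occupied=True))
# Bright and warm but empty: shade only, and save the energy.
print(control_actions(40000, 25.0, occupied=False))
```

Note that none of this logic requires an Internet connection; the value of connectivity only appears once such a controller exchanges data with external services.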
Common Themes Around IoT Ecosystems 119

[Figure 6.4 appears here. The diagram shows, on the room level, sensors for weather, temperature, humidity, air quality (CO2), light level, occupancy and hazard detection, alongside actuators for window blinds, heating, ventilation, air conditioning, lights and door/window control. On the building level, it shows the boiler (hot water, chilled water), ventilation, smart meters for grid electricity and water, greywater recycling, lift or escalator motors, batteries, control units and alarm systems, all coordinated by a central plant coordination centre with remote control and links to maintenance and public services.]
Figure 6.4 Smart buildings and IoT. Commonly available sensors and actuators on a room level, and
control systems on a building level in a smart building. Environmental sensors inside and outside inform
the coordination center of temperature, humidity, air quality, light level and occupancy. Together with
weather data, this determines how window blinds, automated doors and windows, heating, ventilation
and air conditioning can work together to provide ideal living/working conditions. This system does
not require an Internet connection to the outside world. However, enabling infrastructure in the central
plant to participate in the IoT allows usage-based monitoring and predictive maintenance of HVAC
components, lifts and escalators. Integration with the energy grid as well as the water system further
helps to diminish the building’s footprint without human intervention in the coordination center. Building
managers may be able to monitor or control critical infrastructure remotely, allowing the better utilization
of workforce and decentralized management services. Through integration of occupancy data, hazard
detection (e.g., of fire or gas) and intrusion alarms, relevant public services can be alerted and a measured
response coordinated. For example, emergency services may choose to respond differently to a fire alarm
in a fully occupied office building, in contrast to one that has only partial allocation,
based on actual data rather than an assumption derived from the time of day.

As participants in the IoT, smart buildings could interface with third parties
that provide predictive maintenance for machinery in the central plant (e.g., boilers,
ventilation systems and water pumps). Infrastructure such as the electricity grid
and water supply could indicate the current supply and demand level for these
resources, so that batteries can be charged either from the grid or local renewable
energy sources. While greywater needs to be disposed of occasionally, it may be
more ecological to do so when there is no particular strain on the sewage system,
as discussed in Section 6.2.1. Information exchange with public services (police
and fire departments, waste collection) can help streamline operations on both the
consumer and provider side. Interesting use cases arise from the possibility of
measuring occupancy: In shared offices with hot desks this may enable occasional
telecommuters to decide whether they are going to come in. In a residential setting,
this information can be used to determine whether family members are already
home (or in conjunction with other sensors, if somebody has broken in). Last but
not least, remote access enables building managers and third parties to manually
control infrastructure if needed.

6.3.3 Assisted Living

Assisted living, also called smart or augmented living, is an umbrella term for
technologies that improve quality of life. This includes the monitoring of
health, care for the elderly, but also the provision of convenience and feelings of
connectedness and safety. Potentially, this may be a large market for consumer
products that are not essential, but that may still be highly desirable.
Current wearable devices that are centered around e-health mostly serve the
quantification of physical activity. In the future, it can be anticipated that heart rate,
pulse, body temperature and blood pressure can be monitored in unobtrusive ways.
Google filed a patent for the measurement of blood sugar levels from liquid on
the cornea, a potential entry point for the management of diabetes. This data could
allow medical professionals to monitor vital signs over the long term, a prerequisite
for preventive measures, especially for demographic groups at risk.
Connected pill boxes, injection devices and medicine dispensers allow medi-
cal professionals but also pharmaceutical companies to measure the compliance rate
(i.e., if patients use medication at the prescribed times and amounts). This does not
imply that patients consciously avoid medication, but forgetfulness in old age is a
common reason for a lack of compliance. Other assistance for the elderly includes
wearable alarm systems that alert carers in cases of medical issues or accidents like
falls. Such systems already exist to some extent, but with their integration in an

IoT ecosystem the coupling to public services, health professionals and caregivers
would become more efficient. For example, in connection with IoT devices in the
smart home, it would become clear in which room of the house a fall has occurred
and/or if preventive measures such as the emergency shutdown of electricity, gas or
water should occur.
Smart watches and mobile phones connect to the IoT as personal information
portals. Potential offerings and services around connected appliances or cars have
been discussed. The connected fridge that orders milk or the crockpot that heats
food once users enter a geofenced area around their home are exaggerated examples
of the possibilities. What people consider reasonable use cases for consumer IoT is going
to be highly individual. Quite interestingly, products that indicate the well-being of
loved ones are popular at this stage: there is the Good Night Lamp, a group of
connected lamps that can be located anywhere in the world and that light up when one
of them is switched on, or the Pillow Talk, a system consisting of a wrist
band that records the pulse of a person and replays that rhythm through the speaker
in a pillow, which may be located elsewhere. Both of these products still fall into
the gadget category of the classification shown in Figure 5.2. However, allowing
developers to utilize the functionality of these devices for integration into whatever
application they choose is probably the next logical step. This extends the IoT to people, which
is why it is sometimes referred to as the Internet of People.
Chapter 7
Drivers and Limitations

The previous chapter has presented many actual or potential use cases of IoT. What
drives the adoption of IoT? What hinders it?

7.1 DRIVERS FOR ADOPTION

The driving force for industry users to connect facilities and digitalize processes is
an increase in efficiency; IoT is a measure to save cost and/or boost profit.
Also, the IoT presents an opportunity for companies to create unique selling points
and innovate in a novel service landscape around connected products, thus driving
further innovation in that field. Industries have always adopted an open stance
toward novel technologies that improve productivity, and industrial control systems
have lent themselves to further automation, giving rise to many M2M solutions.
The industrial IoT is therefore quite advanced, at least for solutions that connect
various branches and services within an organization: The coupling of supply chain
and asset tracking is often already reality, and the adoption of technologies like
the fully connected shop floor is probably just a question of time. However, so far
industry has been mostly unresponsive to calls for integration with energy providers
or other infrastructure. While there are arguments from energy economy, the financial
incentives for industry are not sufficient to implement such changes.
Cities increasingly see the advantages of going smart and recognise opportu-
nities for saving money. Often in collaboration with third parties (e.g., private bus
services or parking lot operators), traffic information systems such as real-time
bus timetables and parking information are becoming increasingly available. However, issues


with energy networks, gas and water pipes are usually addressed by immediate mea-
sures as municipalities are not able to make the upfront investment that a preventive
infrastructure would require. This is also true for many environmental applications of
IoT, which are desirable but not essential. Fortunately, policy makers have
recognised the need for improvements around energy usage. Legislation in the European
Union (the Energy Efficiency Directive, EED) now stipulates that by 2020 at least 80%
of electricity meters should be smart and allow two-way communication between
energy providers and consumers. This is undoubtedly only a first step toward the
wider adoption of IoT solutions.
Ultimately, consumers might become a major driving force in the adoption
of IoT. While connected devices are not yet common, Bosch for example has
announced that by 2020 all of their household appliances are going to have some
sort of IoT functionality. Analogously to the shift from basic mobile phones to
smartphones, this move is likely going to be copied by many manufacturers of
similar devices. At some stage, consumers are going to expect connectivity even
of basic appliances to benefit from integration with other devices and services.

7.2 LIMITATIONS

Beyond the financial constraints and the sometimes unclear cost/benefit ratio in
the uptake of IoT solutions, the three most important challenges are:
• Lack of interoperability
• Lack of trust in IoT security
• Privacy and data provenance issues
These issues are discussed in more detail in the respective sections of this
book: Part VII (Software) and Part VIII (Security).
In brief, industrial communication solutions are often highly customized and
have primarily one organization in mind. The integration of novel hardware is hin-
dered by the lack of backwards compatibility to existing systems, and software is
built around the needs of the organization at the time of purchase. That means that
interfaces to other organizations (data formats, processes) are often not available and
require further effort. In the example of industry and energy grid, while in principle
the industry partner could see the benefit of negotiating ideal consumption patterns
with the energy provider, their software may lack the standards that would enable
them to implement such change easily. Especially in large organizations there is
considerable administrative overhead, hindering otherwise relatively straightforward
technical solutions. Unfortunately, while the industry case can be solved and
managed as an organizational process, including agreements or binding contracts
between two or more partners, in the consumer space there are often market-strategic
considerations around standards that push consumers into either/or decisions. This
is comparable to forcing the consumer into the ecosystem of a particular mobile
phone or games console manufacturer. Car manufacturer X may want to define an
interface to communicate to electric recharging stations operated by company Y.
Pushing such a standard would diminish market access for manufacturer Z. How-
ever, manufacturer Z proposes a different standard and seeks support from energy
provider E, with E threatening to impose fees on Y if they should not subscribe to
this standard. Extend this hopeless scenario to household appliances, street lamps
and vehicles. So far, the IoT remains a fragmented space of consortia supporting
different, generally proprietary standards.
Opening processes and infrastructure that previously had been isolated re-
quires trust in the security of such solutions. Putting the control system of a nu-
clear power plant on the Internet without any protection against malicious access
is grossly negligent. Unfortunately, as Part VIII on security will show, the attack
surface for IoT solutions is significantly larger than hosting a few services in the
cloud: Devices can be physically hacked, radio connections intercepted, messages
spoofed and data centers sabotaged. Hence, IoT engineers will have to secure their
solutions on every possible level and build trust in their user
base. These issues exist for industry applications as well as for the consumer
IoT.
Last, especially in the consumer space, there are growing concerns about
privacy and data provenance. Who owns the data generated by IoT processes? Who
decides what can be done with it? How should consumers be able to control who can
use their data and for what purpose? These are questions that need to be addressed
on the legislative and technical level. Another challenge for the IoT!
Part IV

Architectures of M2M and IoT Solutions
Chapter 8
Components of M2M and IoT Solutions

8.1 OVERVIEW

This chapter is going to introduce the basic components of every M2M or IoT
project: What is required to build a complete end-to-end solution? How do these
components fit together? The answer, as so often, is entirely dependent
on the particular application. It is therefore important to gain an understanding
of network architecture and the different types of networks used for device com-
munication and the Internet. Examples of common IoT architectures are going to
highlight their applicability for different types of problems, their advantages and
shortcomings.
The data flow in many M2M applications is from a device to a coordinating
entity that uses computational decision making and back to a device. However,
information systems and processes that involve human supervision need interfaces
that are both insightful and intuitive. In the current consumer IoT, websites, mobile
phones and other devices with small screens are still the norm to interface with
things. In the future we may see connected devices that utilise entirely different
types of communication with the user; for example, interfaces that are (1) visual,
(2) auditory, (3) ambient or (4) tactile.
The classical model of information flow in computing is I-P-O: input, pro-
cessing and output. This model still holds true for the IoT. Input may be provided
by sensors, processing happens primarily on local computers, in the cloud and on
data platforms, and this may trigger output on remote actuators, including displays.
Depending on the technology used for data transmission, sensors and actuators may
require the use of a gateway device, frequently referred to as a hub. Much like a


wireless router for home use, they exchange short-range radio signals with end de-
vices and provide Internet access through a wired or long-range radio connection.
However, it is not an absolute requirement that a hub must be wireless, as can also
be seen in the discussion of network architectures in the subsequent chapters of this
part. While we speak of an Internet connection throughout most of this chapter, this
often implies a layered connection with intermediaries such as a private network or
local area network, which then connects to the Internet itself.
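A minimal sketch of this input-processing-output flow might look as follows; the threshold logic and command names are illustrative assumptions, not any device's actual API:

```python
# Minimal I-P-O pipeline: sensor input -> processing -> actuator output.
# The names and the alarm threshold are illustrative stand-ins only.

def process(readings, threshold=30.0):
    """Processing step: decide whether any reading crosses a limit.
    In a real deployment this could run locally, on a gateway or
    in the cloud."""
    return any(r > threshold for r in readings)

def run_pipeline(sensor_readings):
    alarm = process(sensor_readings)           # processing
    return "sound_alarm" if alarm else "idle"  # output on a remote actuator

print(run_pipeline([21.5, 22.0, 35.2]))  # one reading over the threshold
print(run_pipeline([21.5, 22.0, 23.1]))  # all readings within bounds
```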

To make conceptual links to real-world applications, this chapter often
refers to hardware and software standards, along with a range of technical
acronyms and brand names. These are further detailed in the respective chapters
of this book and, for the sake of readability, may not always be cross-referenced.

8.2 SENSORS AND ACTUATORS

The canonical input device in IoT applications is the sensor (e.g., for the detection
and measurement of light, sound or current). In complex systems, the boundary
between a simple sensor and alternative sources of data can be blurry. For example,
processed information from entities outside an IoT application may be interpreted as
a form of input that does not necessitate a sensor. At the same time, an application
may not necessarily require an immediate output device or actuator, as the entire
data flow serves as input for another process in the IoT ecosystem. However,
a simple end-to-end solution typically includes one or many sensor devices that
provide qualitative (present/absent) or quantitative (measured) data as well as
appropriate actuators (physical output) or visual interfaces (display in app/website).
The in-principle decentralized nature of the IoT enables sensor and actuator
devices to be installed in remote or otherwise hard-to-reach locations. A key
requirement for many devices is thus their independence from mains electricity
and/or a wired network connection, a container tracking device being an example of
such a self-sufficient system. This imposes a range of constraints on hardware and
software developers, who often need to find a balance between those different
requirements:
• If battery life is paramount for a device, the developers may choose to keep
the device in a quiescent state most of the time. Unlike mobile phones
that have to be recharged in relatively short intervals to remain functional,
IoT devices are expected to run for months, if not years, without a battery
replacement. Hence, displays are frequently omitted in such devices and,
if some form of visual output is required, that may be in the form of
energy-saving status LEDs. If relevant data comes in the form of complex
signals, some degree of decoupling the sensing and processing unit from the
communication module may be required.
• In order to maintain a connection, some radio and data standards need contact
with a base station more frequently than the time between measurements or
status updates. In order to extend battery life and keep the sensor/actuator
in low-power mode, one might choose to let the connection expire and
renegotiate with the base station when needed. However, depending on the
protocol used to establish the connection, this renegotiation may require the
device to be awake for longer than maintaining an existing connection using
short keep-alive signals.
• Frequent transmissions of raw data can strain the battery. While methods for
data compression or edge-processing exist, these are typically computation-
ally expensive, shifting the energy demand from the communication interface
to the processor.
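The keep-alive versus reconnection trade-off sketched in these points can be estimated with a simple duty-cycle model. The currents, timings and battery capacity below are made-up figures for illustration, not measurements of any particular radio:

```python
def average_current_ma(awake_ma, sleep_ma, awake_s, period_s):
    """Average current draw for a device that is awake for awake_s
    out of every period_s seconds (simple duty-cycle model)."""
    duty = awake_s / period_s
    return awake_ma * duty + sleep_ma * (1 - duty)

def battery_life_days(capacity_mah, avg_ma):
    """Idealized battery life, ignoring self-discharge and ageing."""
    return capacity_mah / avg_ma / 24.0

# Hypothetical radio: 40 mA awake, 0.01 mA asleep, 2000 mAh battery.
# Strategy A: a short keep-alive (0.1 s awake) every 30 seconds.
keepalive = average_current_ma(40, 0.01, awake_s=0.1, period_s=30)
# Strategy B: let the connection expire, renegotiate once an hour,
# but stay awake for a full 5 seconds to complete the handshake.
reconnect = average_current_ma(40, 0.01, awake_s=5, period_s=3600)

print(f"keep-alive: {battery_life_days(2000, keepalive):.0f} days")
print(f"reconnect:  {battery_life_days(2000, reconnect):.0f} days")
```

With these particular numbers the hourly reconnection strategy wins; with a chattier renegotiation protocol or shorter reporting intervals, the balance can easily flip in favor of keep-alives, which is exactly the design decision described above.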
In order to keep hardware prices low, widely deployed systems typically
utilize the lowest-spec compute components possible. Single-core 8- or 16-bit
processors and memory in the lower kilobyte range are therefore not unusual
for embedded systems, although some manufacturers argue that these controllers
are not sufficient for security and the computational requirements for encryption in
a safe IoT. This plays into the decision of what to do with data and where to do
it. It is important to note that field deployment usually necessitates small, robust
packaging. This imposes limitations on the type of embedded system, number and
type of hardware interfaces, ventilation, room for antennas and place for batteries.
That is, while it is tempting to choose a large battery with plenty of capacity, both
size and safety requirements may forbid this option.
In this section, sensors and actuators were treated as stand-alone devices.
However, IoT functionality can also be seen as an add-on to existing systems.
Conceptually, also in terms of the necessary hardware, a smart thermostat is simply
an electric thermostat with an additional connection (e.g., WiFi interface) that
enables the user to exchange information and commands with their phone. This
also means that while IoT sensors and actuators can pose a challenge to engineering
and design, some of the aforementioned issues have already been addressed as
part of the product design process before the item was Internet-enabled (e.g., a
large household appliance can supply the necessary electricity to keep the device
constantly connected to the Internet).

8.3 GATEWAYS AND HUB DEVICES

Preceding any architectural considerations around designing end-to-end IoT
solutions, this section briefly introduces what are known as gateway or hub devices.
They serve as intermediaries between end devices and the Internet (see Figure 8.1).
The form and function of gateways differ widely, ranging from small embedded
Linux systems with radio modules and some degree of data preprocessing to clas-
sical wireless routers as used in the consumer space to small battery-operated radio
transceivers with an uplink to communication satellites. Depending on the technol-
ogy, some end devices may use Internet protocols directly (e.g., IPv6-to-the-edge);
in the example of the satellite connection they may rely on proprietary standards
prescribed by the telecommunications operator and the data may only go on the
Internet once the signal has been received back on Earth.
Edge devices that do not have an independent network connection via Eth-
ernet, a digital subscriber line or a mobile data connection often communicate via
proprietary protocols, conventions of how the data is encoded in a physical signal.
On the physical layer there are radio standards of varying range that find utility in
connecting devices to a hub, but in buildings, industrial environments and cars this
could also be a wired fieldbus system. Short-range radio, such as ZigBee, Bluetooth
or WiFi, requires hubs to be locally deployed along with edge devices, whereas long-
range radio, such as LoRaWAN or Sigfox, allows the integration of devices across
a wider area, much like cellular networks. These solutions all necessitate that the
addressing of individual devices is organized by the gateway, as the Internet (IP)
address is issued to the hub, not down to the device level.
The function of the hub is to receive these proprietary messages and to format
the data such that it can be sent through the Internet, or conversely, to receive data
from the Internet and to send it to the device in a format it can interpret. The
hub therefore has at least two hardware interfaces, one to the radio or a fieldbus,
and one to connect to the Internet. While one Internet connection is sufficient,
on the edge device side the hub can interact with many different devices. This
can have certain advantages where expensive connections (e.g., via satellite) are
used: the hub can integrate data from the pool of devices and send them in a single
transmission. Ideally, all edge devices in a particular perimeter would connect to just
one gateway. This would save resources and allow for integrative data
processing before sending data across the network. Unfortunately, while
some hub devices integrate ZigBee, Bluetooth and WiFi, there are considerable
interoperability issues that either prevent edge devices from connecting to hubs from a

[Figure 8.1 appears here. The diagram shows an edge device (sensor or actuator) reaching the cloud either via radio standards or a fieldbus to a gateway/hub, or via a direct connection to the Internet over Ethernet, a digital subscriber line, a cellular network or a satellite link.]

Figure 8.1 Direct or gateway-mediated connections to the Internet. Edge devices can connect to the
Internet directly or through a gateway/hub. Here, an Ethernet connection implies the connection to some
network infrastructure that is readily using TCP/IP or other Internet protocols to communicate the data.
Alternative routes include digital subscriber lines, the cellular network or satellite links, and the data may
transit into the public Internet only at the remote site. When using indirect connections, the data needs
to be transferred either via radio (e.g., WiFi) or wired fieldbus systems (e.g., CAN) to the gateway/hub.
It is the hub that then mediates the Internet connection. The cloud is the site of data collection and
integration. The chapter on architectural considerations (10) provides a more fine-grained view of the
various connectivity options. Note that this schematic omits security measures like a hardware firewall
or virtual private networks that use encryption to protect the data transfer, topics covered in Part VIII.

different vendor, or the data format is not sufficiently documented for third parties
to make use of the information.
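In code, the hub's translation role might be sketched as follows. The 4-byte frame layout, field names and scaling factor are invented for illustration and do not correspond to any real radio protocol:

```python
import json
import struct

def decode_frame(frame: bytes) -> dict:
    """Decode a made-up proprietary radio frame: 1 byte device id,
    1 byte sensor type, 2 bytes big-endian raw value (tenths of a unit)."""
    device_id, sensor_type, raw = struct.unpack(">BBH", frame)
    return {"device": device_id, "type": sensor_type, "value": raw / 10.0}

def to_internet_payload(frame: bytes) -> str:
    """The hub reformats the proprietary frame into JSON that an
    Internet-facing service can ingest."""
    return json.dumps(decode_frame(frame))

# Device 7, sensor type 1 (say, temperature), raw value 235 -> 23.5
frame = struct.pack(">BBH", 7, 1, 235)
print(to_internet_payload(frame))
```

The reverse direction works analogously: the hub parses a command arriving from the Internet and re-encodes it into the frame format the edge device understands.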
While for the purpose of this section we discussed WiFi as a device-facing
interface and Ethernet as an Internet-facing interface, it is not uncommon in the
consumer IoT to have an edge device connect to a hub via a proprietary radio
protocol and the hub connecting via WiFi to a wireless modem router to the
Internet. Gateways often cannot be avoided in the design of an IoT architecture.
However, due to their central nature, they can represent a single point of failure
and also require additional configuration, which makes direct connections between
the device and the Internet desirable where cost and power options allow. As many
hardware manufacturers also define their own standard for the communication
between the end device and the hub, gateways are a key barrier to interoperability
in the IoT.

8.4 CLOUD AND DATA PLATFORMS

The term cloud is widely used synonymously with any computer resource that can be
accessed through the Internet. Typically, these are large data centers with hundreds
or thousands of computers, in which companies or individuals can purchase pro-
cessor time and storage capacity. These data centers are located all over the world.
The physical location as well as the precise hardware is often unbeknownst to the
users, who write, install and run software on these machines over the Internet. While
there are many types of offerings for different purposes from particular providers,
their services are colloquially referred to as the Salesforce cloud, the
Amazon cloud (Amazon Web Services), the Microsoft cloud (Microsoft Azure) or
the Google cloud, to name a few of the currently biggest vendors.
The cloud providers offer computational tools in the form of interconnected
services that can be accessed through code libraries or programming interfaces to
build cloud solutions, such as the backend for an IoT application (see Figure 8.2).
The IoT itself has been recognized by the cloud vendors as a worthwhile area for
investment, such that there are even entire toolsets specifically aimed at developing
IoT solutions. On top of this ground layer of functionality, many middleware
providers have built what is commonly referred to as an IoT platform. These are
systems that allow for device management, data ingestion, storage and analytics.
Common components of an IoT platform are message brokers that take care of the
distribution of incoming data to relevant business processes, time-series databases
with optimized queries for time-based or sensor-based aggregation, and real-time
analytics. While end-users can view graphs, perform simple calculations and export
some of the data through so-called dashboards, the true power comes through APIs
that allow programmers to utilize the functionality of the IoT platform and access
the data in a programmatic manner. This provides opportunities for integration
with other services, such as custom-written analytics tools or interfacing with other
applications. Unfortunately, by default most IoT platforms are closed and resemble
data silos, such that data exchange between platforms and the integration of data
across platforms is largely prevented by a lack of interoperability. This is where
considerable change needs to happen in the future in order to realize IoT ecosystems
as detailed in Chapter 6.
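To illustrate the broker concept at the core of such platforms, the following sketch routes published sensor readings to subscribed handlers and aggregates them as a simple time series. The topic naming scheme and the aggregation step are illustrative assumptions, not any vendor's API:

```python
from collections import defaultdict

class TinyBroker:
    """Toy publish/subscribe broker: routes messages by exact topic.
    Real brokers add wildcards, persistence and quality-of-service."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self.subscribers[topic]:
            handler(message)

# A minimal "time-series store": one handler appends readings; further
# handlers could drive real-time analytics, alerting or dashboards.
readings = []
broker = TinyBroker()
broker.subscribe("building/room1/temperature", readings.append)

for value in (21.5, 22.0, 22.5):
    broker.publish("building/room1/temperature", value)

# Time-based aggregation, here just a mean over the stored window.
print(f"mean: {sum(readings) / len(readings):.2f}")
```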
It is important to note that the concept of an IoT platform is not tied to
cloud platform providers and the cloud. There are plenty of platforms that are not
based on cloud vendor tools, but that are developed independently and installed
on commodity hardware of the middleware provider. In fact, ambitious home users
can maintain their own installations on small systems such as the Raspberry Pi.

[Figure 8.2 appears here. The diagram contrasts a traditional stack with a cloud stack. Traditional stack: clients (devices + users), business applications, server software, operating system and commodity hardware (rack servers, disk arrays, network switches), with overhead expenses for application development, configuration and security, system administration, space, electricity, cooling and ventilation and the ISP connection. Cloud stack: clients (devices + users), customer-specific mash-ups, Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS; virtual machines, dedicated servers), with the lower layers marking the responsibilities of the IoT solution provider.]

Figure 8.2 Traditional and cloud stack comparison. A provider of business applications in the tradi-
tional stack needs to take care of server software and operating systems on the software side, as well
as rack servers, disk space and network infrastructure on the hardware side. The overhead expenses for
systems administration and the operational costs to run a data center are significant. Cloud providers
offer infrastructure, software platforms and even software that IoT solution providers can build up on
to design IoT offerings for their customers. The costs for the solution provider in the traditional stack
include the running of a data center, whereas in the cloud stack there is a periodic fee, depending on the
amount of usage the solution sees. There is some anecdotal evidence that switching from maintaining
infrastructure to using cloud can be up to a hundredfold cheaper.

However, for professional users with higher demands in terms of data transfer and
storage, the total cost of ownership is considerably lower for cloud services, which
also provide faster scalability and better elasticity for their tenants, depending on the
level of service agreement. Such architectural decisions are sometimes not entirely
technically driven: Large corporations sometimes prefer the installation of server
hardware on their premises (on-prem) for a wide range of legal reasons. One concern
is the access rights of governmental organizations to user data in some
countries. Many cloud service providers therefore allow their customers to choose
a location where their data is processed and stored, in an attempt to alleviate such
concerns. It is therefore not uncommon that cloud companies maintain data centers
in the United States as well as a range of European countries.
Importantly, as stated in reference to home users, it is conceivable to run an
IoT platform on a local computer, and control the exchange of data with third parties
prior to transmitting the data over the public Internet. This restricts the utility of the
platform, but might be sufficient for home applications or in environments where
privacy and security are absolutely paramount.
Some of the technical challenges in the development of IoT backend
software are covered in Part VII (Software) of this book: Design considerations
for message handling (brokerage), database design and storage, as well as the
principles of data analytics both on the incoming data stream and in batch. The
section further covers some theoretical approaches around interoperability and data
exchange between platforms.
Chapter 9
Architectural Considerations

Before discussing design strategies and common examples of IoT architectures, it
is worth having a look at the foundations of network computing. This discussion
aims to develop an appreciation for the currently prevalent architectures. Which
topologies are used and what are their characteristics? What spatial dimensions do
computer networks encompass and how does this impact our design decisions?

9.1 NETWORK TOPOLOGIES

The connection between just two computers requires only one choice to be made: Is
the connection meant to be unidirectional, designating one computer as sender and
the other one as receiver of the information, or do both participate in a bidirectional
exchange and can take either role? Even though the flow of information in a simple
sender-receiver connection seems unidirectional, modern software protocols typically
utilize what is commonly referred to as a handshake: the negotiation of how data is
encoded, and confirmation of receipt (thus reversing the direction of the
communication). If in such a bidirectional scenario the physical connection can
simultaneously transmit data in both directions (e.g., when
the receipt for one data package is returned while the next package is
already on the way), we talk about a duplex connection; if the line can only be used
for one direction at a time, it is called half-duplex.
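As a toy illustration (with invented names, not any specific protocol), the send-and-acknowledge pattern of such a bidirectional exchange might be sketched like this:

```python
# Toy model of a bidirectional exchange: every data package travels
# sender -> receiver, and an acknowledgment travels back, reversing
# the direction of communication.

def exchange(packages):
    """Return the transmission log of send/ack pairs."""
    log = []
    for seq, payload in enumerate(packages):
        log.append(("sender->receiver", seq, payload))  # data package
        log.append(("receiver->sender", seq, "ACK"))    # confirmation of receipt
    return log

log = exchange(["temp=21.5", "temp=21.7"])
# On a full-duplex line the ACK for package 0 may still be in flight
# while package 1 is already being sent; on half-duplex they alternate.
```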
As soon as a third (and fourth, etc.) computer is added, there are a number of
network topologies or network layouts that have been used in the past (see Figure
9.1):


[Figure panels, left to right: bus (C1-C2-C3 with terminator), ring, mesh, star (with central router)]

Figure 9.1 Network topologies. The oldest topology is the bus. The ring topology is conceptually a bus
that rejoins itself at the terminator. The mesh topology can be seen in wireless sensor networks. Most
prevalent in local area networks as well as the IoT when including a hub device is the star topology with
a central router.

• The simplest topology is the half-duplex bus, a linear arrangement that
connects computer C1 with computer C2 , and computer C2 with computer
C3 . If a message is sent from C1 to C3 , software on C2 checks whether
it is the receiver of the message, and if not, forwards it to C3 , where it
is acknowledged. If C2 sends a message for C1 , the information goes to
C3 , which forwards the message. A device referred to as terminator then
indicates that there is no other computer in the bus. This reverses the direction
of transmission for the message, now going from C3 to C2 and, ultimately,
C1 . In the absence of a terminator (that is required on both ends of the
bus), messages can be lost or may be reflected indefinitely. The first office
networks were bus systems. While the bus is easy to implement and does
not require much cabling, data exchange is not very efficiently handled and
data collision from reflected messages leads to overall poor performance,
especially if messages are addressed to devices that are not actually present
in the network.
• The ring topology connects C1 to C2 , C2 to C3 , and C3 to C1 . There is no need
for a terminator. However, breakage of a single connection within the ring
leads to failure of the entire network, much like in the linear bus topology.
One advantage of the ring over the bus topology is that the directionality of
the ring can change in order to minimize the path to the recipient.
• In a mesh topology every computer is conceptually connected to every other
computer. This, of course, leads to a quadratic growth of the necessary
connections for any n computers, following n(n-1)/2. In a mesh network, a message
from a sender goes out to any other computer (node in the graph). If a direct
route is not possible due to failure, every computer can act as a repeater
to provide alternative, indirect routes for delivery. The mathematical Party
Problem determines how many and which connections can fail until the
delivery of the message can no longer be guaranteed. As a network design for
wired connections, the mesh topology is unfeasible because of the amount of
material and labour that is required to establish the links. However, radio
communication is intrinsically undirected; that is, every device in the reach
of the sender (or repeater) can participate in the network. This requires some
degree of overhead to avoid message duplication and unnecessary relay.
Mesh networks are often used in distributed sensing applications, where
the network is built from intelligent embedded sensor devices rather than computers.
• The star network is probably the most common network topology used
in local area networks and thus widely known even to nonspecialists. A
classical example is the switch/router for WiFi/Ethernet connections, where
a central computer (in that case a relatively low-spec embedded device) is
connected to all other computers in the network. Messages are sent to the central
router, which forwards them to the appropriate receiver. This provides the
shortest possible route between these devices. One can distinguish between
active routers that manage receipts and retries autonomously, or passive
devices that act as simple repeaters and leave receipt management to the end
devices themselves. The advantage of the star network is its robustness to
connection failures with computers that are not in a sender or receiver role.
However, the central switch/router is a single point of failure: if it fails,
the entire network breaks down.
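The wiring cost of these four topologies can be compared with a little arithmetic; the following sketch (illustrative only) counts the links needed to connect n computers:

```python
def links(topology, n):
    """Number of physical links needed to connect n computers."""
    if topology == "bus":
        return n - 1              # linear chain C1-C2-...-Cn
    if topology == "ring":
        return n                  # the chain rejoined into a loop
    if topology == "mesh":
        return n * (n - 1) // 2   # every computer wired to every other
    if topology == "star":
        return n                  # every computer wired to the central router
    raise ValueError(f"unknown topology: {topology}")

print(links("mesh", 3))   # 3 links: for three computers, mesh equals ring
print(links("mesh", 50))  # 1225 links: quadratic growth makes wiring infeasible
print(links("star", 50))  # 50 links: linear growth
```

The quadratic growth of the mesh is why fully wired meshes are rarely built, while bus and star stay linear in n.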

9.2 SPATIAL DIMENSIONS OF NETWORKING

Traditional (Internet) networking has broadly defined two spatial dimensions: A
local area network (LAN) typically refers to a spatially restricted domain, such as
an entire building or a floor, where the devices are connected in a simple architecture
such as a bus or star topology, and where the infrastructure is owned and controlled
by the network operator (i.e., the user can manage access privileges or quality of
service). Conceptually, this is contrasted by the wide area network (WAN), which
without any further implication about the network architecture refers to everything
that is outside the local network and implies that some or most of the infrastructure
is leased (e.g., with access through an ISP). The terms LAN/WAN are often utilized
to convey to a user whether the devices are inside or outside the local network (e.g.,
separate LAN/WAN settings are common for home routers).

Currently there are a variety of intermediate terms that describe the spatial dimen-
sion of the network. For example, the campus area network (CAN) would be the
network of a larger institution with more than one LAN that are nevertheless be-
hind one common gateway to the Internet (although in practice there may be more
than one gateway to provide redundancy). The metropolitan area network (MAN)
is often used to refer to networks operated by municipalities as a backbone for data
services as discussed in the M2M and IoT ecosystems chapter (see Section 6.2).
The usefulness of such acronyms is doubtful as they do not convey any technical
detail about the architecture or complexity of the underlying network structures.
The personal area network (PAN) deserves a closer look in the context of
the IoT: In the wearable device market, Bluetooth Low Energy (BLE) emerges as
a de facto standard to interface wearable but also implantable devices (e.g., pace-
makers). While meshing is in principle possible, BLE usually pairs these types
of equipment to mobile phones. The phone then acts as hub to collate the data
and interface to the Internet. Hence, the PAN is frequently used to describe the
spatial domain around the mobile user. Variations of the PAN include the body area
network (BAN) primarily for medical devices, or the static near-field networks that
are created ad hoc once a user device connects to passive RFID or NFC tags.
For most IoT applications the naming semantics around the network are irrelevant.
Most deployments resemble a PAN or LAN, as the transfer of data between the
devices and/or gateways over the Internet is largely beyond the control of the user.
Chapter 10
Common IoT Architectures

There is no one-size-fits-all solution for IoT architectures. In many cases, the choice
between a mesh network, a star topology with a local network, or a direct data
connection over the cellular network (or combinations thereof) depends on the par-
ticular constraints of an application. Each architecture has advantages and disad-
vantages. The following sections detail a few characteristics and show exemplary
applications for each. Information about most of the technical terms mentioned in
this section are featured in Part VI (Device Communication) and Part VII (Software)
of this book.

10.1 MESH NETWORKS

Mesh networks are useful for deployments in large but spatially restricted areas
where devices are placed at relatively high density. Typical applications are meshed
sensor networks, in which each device combines the ability to take measurements
and send data, but also to receive and forward incoming radio messages. The Insti-
tute of Electrical and Electronics Engineers (IEEE), a standards body, defined the
802.15.4 standard that specifies the low-data rate exchange between such devices.
Radio frequency modules such as ZigBee or those supporting IETF (Internet Engi-
neering Task Force) 6LoWPAN (IPv6 over Low-Power Personal Area Network, an
IPv6-to-the-edge standard) are enabled for meshing. The ultimate receiver of data
sent over the meshed network is a classical gateway to the Internet. Gateways can
do protocol conversion at different layers. While it is conceivable that a dedicated
6LoWPAN router takes encrypted IP frames out of 802.15.4 communication and
reframes them as 802.11 (standard WiFi) or 802.3 (standard Ethernet) packages,


[Figure legend: sensor node (black square); gateway node (gray square) connecting to the Internet; optimal route; failed or suboptimal route; idealized radio range (dashed circles)]

Figure 10.1 Mesh network. Sensor nodes (black squares) propagate measurements along various
routes with their radio range (indicated by dashed circles). On the way to a gateway node (gray square)
that mediates the connection to the Internet, there are many failing or suboptimal routes for a data
package. In order to prevent lag times, the optimal route is often remembered and reutilized.

other gateways may unpack this data and send it unencrypted through the Internet.
These approaches are conceptually very different and have different implications
for security and access management.
A great advantage of the mesh is that there is no single-point-of-failure
(i.e., if a device fails the remainder of the network should be unaffected, with the
exception of the gateway device), and that devices even under harsh radio frequency
conditions may be able to participate in the network by connecting to transiently
accessible partners (see Figure 10.1). Common application areas are therefore
building automation in the absence of wired fieldbus systems, as brick walls and
steel beams can hinder radio communication, spatially long or wide deployment
patterns (e.g., along railroad tracks or pipelines) in which the devices themselves
act like a linear bus, but also mobile assets such as cargo containers that can build
ad hoc networks to connect to a more remote base station. This implies that meshed
architectures are primarily used in industrial and business applications.
The biggest disadvantage of mesh networks is the relatively low data through-
put. Effective meshing requires frequent hops between radio channels to avoid col-
lisions between neighboring nodes, and the utilization of half-duplex, time-synchro-
nised send-and-receive patterns. The data package size must therefore be finely
balanced so that each package fits into one time frame: data is transmitted at a
maximum rate of 250 kbit/second, and this rate drops significantly with the distance
between the devices. On top of transmitting the payload, the protocol
needs to ensure that near-optimal routes between sensor nodes and the gateway are
established. This shortest-path routing can be both time and energy consuming in
the first instance, but once a network structure is established, remains stable until
failures occur. This is commonly referred to as self-healing property of mesh net-
works.
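The shortest-path routing and self-healing behavior can be illustrated with a plain breadth-first search that recomputes a route when a node fails (a conceptual sketch with invented node names; real 802.15.4/6LoWPAN stacks use more elaborate, energy-aware routing metrics):

```python
from collections import deque

def route(graph, start, goal, failed=frozenset()):
    """Breadth-first search for a shortest-hop path, skipping failed nodes."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph[node]:
            if nxt not in seen and nxt not in failed:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal unreachable with the remaining nodes

# Hypothetical sensor nodes A..D meshed with a gateway G.
mesh = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
        "D": ["B", "C", "G"], "G": ["D"]}

print(route(mesh, "A", "G"))                # ['A', 'B', 'D', 'G']
print(route(mesh, "A", "G", failed={"B"}))  # ['A', 'C', 'D', 'G'] -- self-healed
```

Once a route is found it can be cached and reused until a failure forces a recomputation, mirroring the self-healing property described above.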
There is also both a financial as well as an energy cost associated with the
components and the software that facilitate routing. Therefore, mixed architectures
combining reduced-function devices (i.e., exclusive senders) can be interspersed
with controller devices with full mesh functionality.

10.2 LOCAL GATEWAY

The concept of a gateway architecture is detailed in Gateways and Hub Devices
(Section 8.3). The two most common options are tethering IoT end devices to (a)
mobile phones via BLE or (b) to local gateways such as WiFi routers or hubs
supporting other types of radio standards (see Figure 10.2).
Most phone-based solutions build on standard BLE chip sets, for which
mature hardware and stable software stacks exist. The phone then passes data from
BLE to the Internet via WiFi or available cellular data connection, and vice versa.
Modern phones also have a degree of compute power, allowing for processing and
caching before that forwarding takes place. This also makes phone-based hubs an
attractive choice for development, because software updates of the hub service can
easily be provided. However, data exchange over BLE requires the user to manually
pair their device with the phone. This step may be a nonissue for professional
applications, where the IoT application is deployed by trained personnel, but in
the case of consumer products there may be situations in which the user either
cannot or does not want to provide the phone connection. This may exclude phone-
based gateways from applications such as pay-per-use, as it cannot be guaranteed
that the device can report the correct usage data unless appropriate device-locking
mechanisms (e.g., in the absence of an Internet connection) are in place.
Locally installed dedicated hubs connected to a static Internet connection
can provide more stable connectivity than a mobile phone. Options range from
classic WiFi routers to complex gateway devices combining radio modules, basic
processing and network access. The advantage of novel long-range radio standards

[Figure: gateway options with typical radio ranges: Bluetooth LE, 15m; Bluetooth, < 100m; WiFi, < 100m; ZigBee, < 50m; LoRaWAN, < 10km; cellular, several km. Phones connect via telco infrastructure, local gateways via ISP infrastructure, both reaching cloud services over the Internet.]
Figure 10.2 Gateway examples. A set of gateway options that can be used to connect devices to
the Internet. The example shows radio standards of varying typical range (ZigBee, < 50m; WiFi,
< 100m; LoRaWAN, < 10km — actual distances often are use-case specific and depend on the local
environment). The gateways connect to the Internet via standard broadband (e.g., DSL, VDSL) or other
existing networks (via Ethernet). In the case of using a mobile phone as gateway, the radio standards are
less wide-ranging, especially BLE, which is widely used in wearable devices. The phone connects to a
cellular tower via GSM, 3G or LTE. An alternative option, not shown, is the connection of end devices
via satellite communication. Companies like Iridium or Inmarsat offer communication devices and data
contracts, and send data packages over the Internet once the data has been received back at a ground
station. For all cases, not shown is how ISP and telecommunication provider infrastructure integrates
with the Internet (e.g., through the connection to backbones). Devices and data are typically protected
through firewalls and the use of virtual private networks. These are discussed in detail in Part VIII on
IoT security.

(e.g., LoRaWAN, Sigfox) is that they can represent a cost-effective alternative to
cellular data exchange, especially when many devices in a larger perimeter (up
to 10 km) need to be deployed. On the smaller scale, frequently seen in
older industrial deployments but also in the smart home arena, are hubs that act
as intermediates between short-range ZigBee-style radio devices (with a range of
up to 50m) and a local WiFi network, which then connects to the Internet. In the
smart home market, hubs are often marketed as a defensive measure as they introduce
a segregation between the Internet and the consumer device. However, while this
may be true for online hacking (and an effective countermeasure against that is the
use of a firewall; see Part VIII (Security)), attackers may still be able to eavesdrop on the
physical radio connection between the device and the hub. Encryption of the radio
traffic is thus just as important as securing the rest of the network. Unfortunately,
data encryption is a compute-intense operation and is therefore limited by the cost
of appropriate chips and energy consumption. A crucial current problem with radio-
based communication is the lack of interoperability between different standards. As
discussed in Part VI (Device Communication), there are at least a dozen proprietary
radio standards to choose from, and at this stage it is difficult to foresee which ones
may be able to become for the IoT what WiFi is for computer communication.
In practical terms, in both phone- as well as gateway-based systems, the hub
can represent a potential single-point-of-failure. The additional component requires
installation and configuration. The range of radio hubs is limited. As such, an
optimal ratio of end devices to hubs, as well as their placement, needs to
be found. In all cases, the required data rate may be the most constraining factor in
the choice of a radio standard.

10.3 DIRECT CONNECTION

A direct, cellular data connection to the Internet often represents the simplest
architecture for connectivity, as long as the IoT device is in the range of a cell
tower. In contrast to the gateway approach using a mobile phone and BLE, here
the end device itself holds a communication module and subscriber identity module
(SIM) or equivalent to connect to the cellular network. Modern cellular standards
such as 4G/LTE can support data rates that rival Gbit/s Ethernet connections,
although such peak rates are still difficult to achieve, and Global System for Mobile
Communications (GSM) 2G and International Mobile Telecommunications (IMT)
3G are prevalent as they are older and also more affordable (see also Section 19.3.3).

Worldwide deployments can become difficult to plan because of the diverse range
of telecommunication standards. In the mobile phone industry multiband devices
that can adjust to the four key frequency bands (used in the Americas, Europe and
Asia/Australia) are common, but with the lowest possible cost in mind, some chip
sets for use in connected devices only support one or two frequency bands.
Also, while GSM is still widely used in M2M applications, the overall shift to
faster standards for mobile entertainment leads to telecommunication providers
contemplating switching off their GSM networks. For example, 2G networks have
already been switched off in Canada, and Telstra in Australia (2016) and AT&T in
the United States (2017) have announced that they will follow. This requires
IoT solution providers to upgrade to 3G or 4G hardware, which currently is still
significantly more expensive than GSM modems. Interestingly, Telenor in Norway
plans to switch off 3G in 2020 and let GSM run until 2025, primarily because of the
number of installed M2M solutions, allowing companies a longer time for transition.
The cost of cellular communication is another factor worth considering when
planning a deployment. Some hardware manufacturers bundle their devices with
a service contract, so that a certain amount of data transfer is covered by the
purchase price. However, many cellular modems are provider-independent and can
be equipped with any SIM card. This has implications for device management and
cost, because for each IoT end device there must be a data contract in place.
In areas where there is no cellular network, satellite communication may
represent a feasible option for data connections. While expensive in operation
and dependent on special hardware, these modems link to a satellite via a surface
antenna. The routing of data through satellite connections leads to relatively high
latency, but especially in case of one-way data communication this can sometimes
be neglected.
Integration of the communications hardware and configuration before deploy-
ment to a customer makes cellular data a valuable option where the data service is to
be absolutely transparent (as in: invisible) to the user. So-called roaming SIMs can
connect to the networks of a range of telecommunication providers in any country,
which in practice means better coverage and reliability when the device is deployed
into the field. This lends itself especially to novel service models that require con-
nectivity all the time, such as pay-per-use and remote unlocking of features. In order
to prevent third parties, including potentially untrustworthy network operators, from
intercepting or eavesdropping on these connections, a standard solution is to establish a
virtual private network (VPN) on top of the physical connection — a strategy also
advisable for other types of Internet-based traffic.
Chapter 11
Human Interfaces

How do people interact with the IoT? In a world with billions of connected devices,
the IoT is going to penetrate every aspect of our lives. It is hard to imagine that
we are willing to connect to a website or to open an app every time we need
to retrieve information or control a device. User experience in the IoT and user
interfaces of devices and services are an essential aspect of the overall M2M/IoT
solution. Here, the focus is on some of the technical issues and expectations around
user experience (UX) and interfaces (UI). A highly recommended specialized guide
for practitioners is Designing Connected Products by Rowland et al., which has
emerged as a benchmark in the field, as well as Enchanted Objects by David Rose,
a conceptual paradigm for IoT.

11.1 USER EXPERIENCE AND INTERFACES

The physical world is direct and immediate. Unless there is a break in the electrical
circuitry around the house (unlikely), a power cut (possible, but still unlikely)
or a burnt-out lightbulb (possible, but infrequent), we expect a room light to be
responsive to the dial and switch of a dimmer in the wall. However, if the light can
also be controlled over the Internet, the following things might happen:
• We use the wall switch, but somebody else counteracts our action by using
the Internet connection.
• We use a smartphone app to control the light, but somebody else counteracts
our action by using the wall switch.

147
148 The Technical Foundations of IoT

Now, how do we interpret that the light is not responsive after trying to control
it from an app? Our experience with the two aforementioned scenarios suggests
that somebody is working against us. However, from an engineering perspective,
we know that the following might be true as well:
• The phone might not be connected to the Internet.
• The lights might not be connected to the Internet.
• A functional data connection might exist, but the latency is adding up
considerably between the different components and we may have to wait
for the signal to reach the light (see Figure 11.1).
• Delayed signals from a previous attempt might come through and counteract
our last commands (the commands arrive out-of-sync).
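The out-of-sync problem in the last point is commonly mitigated by tagging every command with a monotonically increasing sequence number (or timestamp) and discarding anything older than the last command applied. A minimal last-writer-wins sketch (the class and names are illustrative, not from any specific product):

```python
class LightState:
    """Apply commands in order, ignoring stale (out-of-sync) ones."""

    def __init__(self):
        self.on = False
        self.last_seq = -1

    def apply(self, seq, on):
        """Return True if the command was applied, False if discarded."""
        if seq <= self.last_seq:
            return False          # a delayed, older command: ignore it
        self.last_seq = seq
        self.on = on
        return True

light = LightState()
light.apply(1, True)       # the user's latest command: light on
light.apply(0, False)      # a delayed earlier "off" arrives late: discarded
print(light.on)            # True -- the latest intent wins
```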
While during the design and development phase the conceptual model (the
schematic description how action and reaction are coupled) is clear to the engineer,
in the future our homes may be full of devices for which even technologists lack
a clear understanding of how information is being relayed. This requires products
and services to communicate very clearly what is happening at each stage of the
interaction between the user and the device.
There are a few measures that can be taken to improve the user experience
through appropriate design and feedback. For example, if the wall switch automat-
ically assumes the setting representing the light’s status, the first two problematic
situations might be avoided. This would require the light switch to flick and the dim-
mer dial to assume the appropriate position. It is clear that this solution means ad-
ditional effort and cost at the hardware development step, but the overall experience
may become more logical and satisfying than the constant inconsistency between
wall switch and light status. Conversely, the light’s representation in the app should
always reflect the current situation, and not display an old status from when the app
was last used. On the software side, continuous feedback may help to shed light
on the current status and implicitly explain the conceptual model: Upon changing
the light in the app, information about whether an Internet connection is present, if
both the phone and the light are online, and finally, if the transfer of the command
has been successful, can help to enforce a consistent experience. When additional
end devices, physical switches and maybe a gateway device are being added to this
architecture, a good conceptual model, thoughtful design and feedback are even
more important: When were they last seen online? Who has seen the last message?
It is essential for the user to understand that they have used the app correctly and
that there is an issue outside their immediate control. Otherwise, the uncertainty

[Figure: command chain from wall switch/app via 3G, Internet exchange, cloud service, broadband, WiFi and proprietary radio to the lamp, with per-hop latencies between 1 ms and 50 ms]

Figure 11.1 Conceptual model of a light switch. Switching a light switch is immediate and direct.
In this exemplary IoT system, light control becomes a multistep process in which latencies can add up
considerably (timings only show order of magnitude). There is some delay between the touch interface
of the phone and an app interpreting this as an actual switch command. If not in the home network, the
phone uses a data connection to a cellular tower, and from there information is directed to an Internet
exchange. This information has to go to a cloud service and the response needs to find its way via a
broadband connection to the user’s WiFi router. If not ideally set up, there is further delay establishing
a WiFi connection to the proprietary radio base station for the lamp, and so on. What happens if a
cloud service is discontinued? If only one of the components in this chain-of-command is missing, the
system ceases to work. Designers have therefore coined the phrase graceful degradation, which means
that devices should still be useful even if Internet connections or cloud services go away. The connected
lamp would then act as conventional lamp.

about their own abilities coupled with a technology they don’t understand is going
to hinder its acceptance.
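The latency chain of Figure 11.1 can be approximated by simply summing the per-hop delays; the sketch below uses order-of-magnitude values loosely based on the figure (the exact assignment of timings to hops is an assumption):

```python
# One-way latency budget for the app-to-lamp command path
# (order-of-magnitude values; the per-hop assignment is illustrative).
hops_ms = {
    "touch event to app":           10,
    "app to cell tower (3G)":       50,
    "tower to Internet exchange":   50,
    "exchange to cloud service":    50,
    "cloud to broadband router":    20,
    "router to radio base station": 20,
    "base station to lamp":          1,
}
total = sum(hops_ms.values())
print(f"{total} ms before the lamp reacts")  # prints "201 ms before the lamp reacts"
```

Even with every component working, a fifth of a second can pass between the tap and the light responding, which is why immediate feedback in the app matters.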

11.2 MOBILE PHONES AND END DEVICES

Current IoT solutions use four main interfacing strategies. There is the display of
information in dedicated software systems (primarily in industrial applications, e.g.,
SCADA user interfaces), access to data through websites and data portals (common
for fitness-related gadgets with an incentive to share success), control of devices
using mobile phone apps (home automation and wearables), and control through or
signaling by the end devices themselves (e.g., the light is an actuator but could also
communicate a status using a blink pattern).

Lack of Interoperability

With the exception of custom-developed IoT solutions for industry, the user inter-
faces on mobile phones are typically not integrated. That is, in many cases of smart

home applications, wearables and health applications, users in the consumer IoT are
faced with a range of different apps that need to be used to control various aspects
of their lives. This leaves the integration of information and the control in the hands
of the human user, and IoT becomes synonymous with a remote control. Ideally,
software solutions would add interoperability and interface with many different de-
vices, thus eliminating the need for micromanagement. This requires computational
decision making: surfacing relevant information only when it is appropriate; other-
wise the interaction with the phone can be either inconvenient or highly annoying.
Unfortunately, this is hindered by the various hardware standards, protocols and
diverse APIs (if at all present) that would allow software to interface with all the
things. There is a more technical discussion on the issue of interoperability in Part
VII of this book (Software).

Voice Control and Constant Awareness

Recently there has been a move towards conversational interactions with the IoT
(chat-like systems such as Apple Siri, OK Google or Amazon Alexa), similar to
using these voice-activated assistants on mobile phones. Stand-alone devices like
Alexa are meant to become centerpieces of home IoT solutions, waiting for trigger
words to analyze voice commands from the user to control devices and retrieve
information. While conceptually attractive, the interoperability issue remains the
same as with separate apps and users are potentially left with just a subset of devices
from particular vendors that can be controlled this way. Even if the interoperability
issue can be overcome, it has already become clear that the amassed data from
many different devices needs to be filtered, and distilled information needs to
be communicated to the user in order to avoid an overload with unnecessary or
irrelevant data. The consumer IoT is intended as a convenient efficiency tool and
not meant to become a time sink like email has become.

Modifying Data Flow

The issue of interoperability and the difficulty of providing an integrated experience
has been widely recognized and identified as a potential business model. On the level
of mobile phone apps, tools like Thington or Muzzley offer a single user interface
for a variety of home automation devices, internally using the APIs provided for
these products. While development tools for data pipelines that accept and send
data from and to IoT devices following user-specific criteria have come a long
way, online services like If This Then That (IFTTT) or Zapier still require a certain

degree of technical savviness and an idea about the conceptual model to make their
use intuitive. The stand-alone tool Node-RED follows the general idea of a data
pipeline, but allows a much larger degree of customization than the aforementioned
online tools, combining the use of a graphical data flow editor and the use of a
programming language.

Enchanted Objects

However the technical solution works in the background, an important school of
thought proposes that screen devices are not the most effective way of receiving
information from the IoT. In his book Enchanted Objects, communication designer
David Rose proposes that the devices themselves need to indicate status. This
concept is called ambient information. Examples of enchantment, as he refers to it,
are the Ambient Orb, a table light that changes color in response to what is important
to the user; the Glow Cap, a connected pill box that attracts the user’s attention if
medication needs to be taken; and the Ambient Umbrella, an umbrella that indicates
with a discreet light if the weather forecast predicts rain. This paradigm also suggests
that IoT is going to enter our homes with small, almost unnoticeable changes
that will gradually increase the overall acceptance of the technology.

A particular challenge for designers can be the constraints of small, battery-
driven IoT devices. How do you facilitate input and output for devices that have no
screen or only small, low-resolution displays? How many different messages can
reasonably be encoded with a single-color LED? And while a clearly detectable
click can be reassuring when submitting information on a touch screen, the sound
file or even the software code to do so may be too large for an embedded device
with little memory. These are challenges discussed in the books mentioned in the
introduction to this chapter.
Part V

Hardware
Chapter 12
Hardware Development

This chapter introduces the hardware components that often play a role in the
design of sensor or actuator devices for the IoT. While Chapter 2 focused on the
smallest building blocks of practical electronics (components such as resistors,
diodes and transistors), the center of this set of chapters is on the next higher level
of modularity.
This book cannot provide the engineering training that the professional development
of electronic hardware products requires, nor can it provide blueprints
or discuss the interaction between hardware developers and product designers for
devices in any detail. The key processes of
• Planning
• Prototyping
• Testing
• Mass production
• Certification
all warrant their own specialist literature. The development of the communication
infrastructure itself, be it gateway devices for local deployment or carrier-grade
Internet backbone hardware, is a highly specialized area that is beyond the scope
of this book. Equally, the development of devices that integrate with the ecosystems
of particular solution providers (e.g., mobile phone manufacturers) usually requires
a level of engineering that cannot be captured in the space of a few dozen pages.
Hence, after a brief discussion of aforementioned key steps in developing hardware
products, the next chapters aim to give an overview of the diversity of the modules


that are used to build them, along with some technical background that explains
their function.

Planning, Prototyping and Testing

In the planning phase, engineers take an inventory of the specifications and requirements
that the electrical design needs to fulfill. While it is often desirable to involve product
design specialists at this step, it is still common to develop both electronics and
casing of a product separately. As discussed in Chapter 8 on components and
architectures, there might be the need to compromise in the electrical design: while
in some applications low power consumption and battery life are paramount, others
may necessitate power-hungry processing or broadband communication, such that
these functions take precedence over power considerations. The planning phase
allows these demands to be prioritized.
At the prototyping stage, engineers often produce a proof-of-concept device
on the basis of off-the-shelf modules and temporary wiring on a breadboard or
perfboard. With PCBs becoming a commodity that can be custom-ordered over the
Internet and that are easily available, experiments with surface-mount components
that are typically being used in mass production have become part of the design
process. Once a prototype exists, it can be tested against the requirements, and often
testing and further design are repeated in iterative steps. As the electronics matures,
so does the need to collaborate with a product design team to integrate both
electronics and casing into a field-deployable device.

Mass Production and Certification

A key difficulty is the conversion of an often manually built prototype into a robust
device that can be machine-manufactured in large quantities. For example, while
for classic breadboard development through-hole components with long leads and
of normal size are being used, modern production lines require very small surface-
mount components (SMCs) that run off a reel and are fixed to the PCBs on so-
called pick-and-place machines. This also requires strategies for automated testing,
as some components inevitably arrive broken, especially when buying electronics
in bulk for production. While PCBs for prototyping may be as small as the electrical
design requires, inside actual products they often need to provide additional space
for mounting screws and so forth.
Especially when dealing with consumer products, there are certifications that
need to be in place so that the device can be sold. These include electrical safety,

and evidence for adhering to environmental standards, but also limits on electromagnetic
interference (EMI) with other electronic products. As EMI is often dependent on
the particular electronic design, the components used and even the shape or material
of the case, it is the mass-production-ready design that requires certification. Table
12.1 shows a selection of certificates often seen on electronic products. While
CE/FCC marks are legally required for sales in the EU and US markets, respectively,
others are often used to add credibility and further incentivize purchase by
consumers.
For simple devices the initial prototyping step may only be a small fraction of
the overall time, effort and development cost compared to making the design ready
for mass production.

Table 12.1
Selection of Certifications Frequently Used for Electronic Products

CE (formally Conformité Européenne) — European Union: The mark certifies
adherence to EU legislation in respect to safety, and also to relevant directives,
such as the Electromagnetic Compatibility (EMC) Directive, the Low Voltage (LV)
Directive or the Toy Safety Directive, where appropriate.

Energy Star — US, Environmental Protection Agency and Department of Energy:
A benchmark for the energy efficiency of devices.

FCC (formally Certificate of Conformity) — US, Federal Communications
Commission: The mark certifies that the device produces electromagnetic emissions
under the limits approved by the FCC.

IP (Ingress Protection) — International Electrotechnical Commission: The IP code
provides detailed information on the protective properties of the casing with respect
to water and dust.

RoHS (Restriction of Hazardous Substances Directive) — European Union: The
RoHS mark certifies that a product is free of problematic substances (e.g., lead),
which are both detrimental to workers in production and difficult to dispose of.

UL — US, Underwriters Laboratories: An electrical and fire safety, and
sustainability mark by the independent provider Underwriters Laboratories.

VDE — Germany, Verband der Elektrotechnik, Elektronik und Informationstechnik:
The mark certifies electrical safety, largely according to DIN (Deutsche
Industrienorm) rules.
Chapter 13
Power

Every electronic device requires power: an electric potential difference and a
sufficient amount of charge (see Electricity and Electromagnetism, Chapter 1).
Alternating current (AC) is the primary means of transporting electricity over large
distances. Following Ohm’s law and taking the phase shift φ into account, the power
lost to the resistance R of a line carrying power P at voltage U is:

P_loss = I^2 · R = (P / (U · cos φ))^2 · R

This means that the higher U, the lower the relative loss of power caused by the
resistance R of the wire. Therefore, long-distance transmission of electricity over
the grid typically sees voltages of 110 kV and higher. As AC can be transformed
to high voltages more easily than direct current (DC), it has become the standard
for the power grid. For domestic and most industrial use cases, these voltages need
to be transformed to more practical levels. The voltage and the frequency of the
change in direction are the two key parameters that characterize household AC.
Typical combinations are 120 V/60 Hz (in North America) and 230 V/50 Hz
(in much of the rest of the world). Different socket types (number of pins, their
shape and arrangement) further diversify how household AC is used around the
world, allowing a supply wire (phase, active), a neutral wire (to close the circuit)
and an earth wire for protection. However, with the exception of AC motors, most
end devices and all gateways and computers operate internally with DC. We
therefore concentrate on supplying DC in this section.
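The effect of transmission voltage on resistive loss can be made concrete with a short calculation. This is an illustrative sketch only; the 5 Ω line resistance and 1 MW load below are made-up example values, not figures from the text:

```python
def line_loss_w(power_w, voltage_v, resistance_ohm, cos_phi=1.0):
    """Resistive transmission loss: P_loss = (P / (U * cos(phi)))**2 * R."""
    current_a = power_w / (voltage_v * cos_phi)
    return current_a ** 2 * resistance_ohm

# The same 1 MW load over a line with 5 ohm resistance:
loss_low = line_loss_w(1e6, 10_000, 5.0)    # at 10 kV: 50 kW lost
loss_high = line_loss_w(1e6, 110_000, 5.0)  # at 110 kV: ~413 W lost
```

Raising the voltage elevenfold cuts the loss by a factor of 11² = 121, which is exactly why the grid steps voltages up for transmission and back down for consumption.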

160 The Technical Foundations of IoT

13.1 CONSTRAINTS OF FIELD-DEPLOYED DEVICES

New and improved wireless technologies for data communication have been a sig-
nificant enabler for IoT. Without the need for an Ethernet connection, indoor deploy-
ments away from wall sockets and outdoor deployments remote from urban areas
and buildings are possible. However, with the exception of inductive harvesting used
for recharging batteries, electricity to run a device cannot be supplied wirelessly. In
many applications it is therefore essential to run devices on battery power. Unfor-
tunately, batteries have limited capacity, which constrains their lifetime, and their
peak and continuous currents are often not compatible with all modes of data com-
munication. Therefore their use poses technical challenges to hard- and software
design. The lifetime issue can occasionally be overcome by recharging batteries
using solar panels or other renewable forms of energy. Where this is not possible,
adapters that convert household AC to DC remain a power supply of choice.
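The lifetime constraint of battery-powered devices follows directly from capacity and average current draw. The sketch below uses a common back-of-envelope model with made-up example figures for a duty-cycled sensor node:

```python
def battery_life_hours(capacity_mah, active_ma, sleep_ma, duty_cycle):
    """Estimate lifetime from the duty-cycle-weighted average current.
    duty_cycle is the fraction of time spent in the active state (0..1)."""
    average_ma = duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma
    return capacity_mah / average_ma

# Hypothetical node: 2400 mAh cell, 100 mA radio bursts 1% of the time,
# 0.05 mA sleep current for the remaining 99%:
hours = battery_life_hours(2400, 100.0, 0.05, 0.01)  # roughly 95 days
```

The model ignores self-discharge and the voltage/current limits mentioned above, but it shows why aggressive sleep modes dominate battery-driven IoT design.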

13.2 POWER ADAPTERS

In environments with an existing infrastructure of electrical lines, and when additional
wiring is not critical, there are two main strategies for powering devices with
DC: power adapters and power-over-the-data-cable; for example, with Power-over-
Ethernet (PoE) or USB. Power adapters transform AC to DC, while data cables
normally tap into the internal DC supply of a host device. Both deliver much lower
voltages than what is available over the household AC. Moving the transformation
of AC to DC away from the device into a separate unit has an advantage in terms of
standardization: while wall sockets and voltages may differ from country to country,
a USB receptacle, power adapter socket or Ethernet jack represent a common inter-
face, so that the end devices do not require any further regional adjustments, which
is beneficial for mass production. Also, removing transformers away from end de-
vices is desirable for any radio communication, as electromagnetic interference is a
common problem especially with cheaper power adapters.

13.2.1 Conventional AC/DC Adapters

AC-based DC supplies come in different form factors. Simple wall plugs that deliver
DC via tip/jack connectors are a common solution for consumer devices, and many
computers and peripherals use larger, stand-alone power supplies with proprietary

connections to the device. In industrial settings one can see transformer units with
ceiling- or wall-mounted rails that connect to devices with screw terminals.
The rectifier circuit in the section on diodes (see Section 2.1.1.4) gives a good
demonstration of the underlying technical principle of AC/DC adapters. Depending
on the quality of the coils used in the circuit and whether or not voltage dividers and
so forth are being used, these can supply DC of up to almost double the input
AC voltage (utilizing an additional voltage doubler circuit). However, the majority
of the consumers relevant here (computers, gateway devices, actuators and sensors)
operate at low DC voltages. Most microcontroller boards operate at the standard
voltages of 5 V or 3.3 V, and more recent chipsets at 1.8 V. Typical wall plugs supply
voltages in the single- to low double-digit range, with some specialist adapters
supporting higher output voltages.
The main choice for a stand-alone adapter is typically between regulated
and unregulated DC, the latter being simpler and cheaper. This refers to the
presence of a voltage regulator that adjusts the output voltage depending on the
load, that is, the current required by the consumer. In unregulated adapters, devices
with lower current demand may see higher voltages, following Ohm’s law.
Likewise, unregulated power adapters often show a higher output voltage when
tested without load, as they are designed to deliver the nominal voltage under full
load. Another source of undesirable voltage deviations is the AC supply itself.
Unregulated adapters are prone to AC ripple, a periodic fluctuation of voltage
originating from the remaining waveform after it has passed through the rectifier
circuit. Adding stabilizing capacitors can provide some improvement; however,
only regulated power supplies with voltage regulators are mostly free of ripple noise.
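The magnitude of that ripple can be estimated with the standard textbook approximation V_pp ≈ I/(2fC) for a full-wave rectifier with a smoothing capacitor; the load and capacitor values below are illustrative assumptions:

```python
def ripple_pp_v(load_current_a, mains_hz, capacitance_f):
    """Peak-to-peak ripple of a full-wave rectifier with a smoothing
    capacitor, using the common approximation V_pp = I / (2 * f * C)."""
    return load_current_a / (2.0 * mains_hz * capacitance_f)

# 0.5 A load on 50 Hz mains with a 2200 uF smoothing capacitor:
v_pp = ripple_pp_v(0.5, 50.0, 2200e-6)  # about 2.3 V of ripple
```

Doubling the capacitance halves the ripple, which is why unregulated supplies trade bulk capacitors against output cleanliness.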
While the early generation of regulated AC/DC adapters used analog, linear
dissipative regulators that could generate significant excess heat, the prevalent
adapter types these days are switched-mode power supplies (SMPS), which are
smaller and a lot more energy-efficient, with transfer efficiencies in the range of
90–95%. There are several different types of SMPS:
• Forward converters
• Flyback converters
• Self-oscillating converters
They all have in common that the output DC is generated by the fast on-and-
off switching of inductors and other storage elements, which utilizes a digital
control circuit. The result is similar to that of pulse width modulation, with the DC
dependent on the number and duration of duty cycles. An undesired side effect of
high-frequency switching is electromagnetic interference; therefore SMPS require

line filters and shielding for radio frequencies. The design of power supplies is a
specialist subject in electrical engineering, hence the interested reader may find
more detailed discussions in data sheets and/or the literature.
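The duty-cycle averaging behind SMPS output can be sketched with the idealized relation for a buck (step-down) converter, V_out = D · V_in, where D is the duty cycle. This is a lossless textbook model, not a description of any particular adapter:

```python
def buck_vout(v_in, duty):
    """Ideal lossless buck converter: the average output voltage equals
    the duty cycle times the input voltage."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty cycle must be between 0 and 1")
    return v_in * duty

def duty_for(v_in, v_out):
    """Duty cycle needed to reach a target output voltage."""
    return v_out / v_in

d = duty_for(12.0, 5.0)  # ~0.42: the switch conducts 42% of each cycle
```

Real converters add losses and feedback control, but the same PWM-style averaging is what the digital control circuit regulates.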
Unregulated power supplies are cheap, and in cases where they are carefully
selected for a particular downstream application, their low price outweighs their
technical shortcomings. Regulated power supplies offer versatility, stability and efficiency,
although at a higher price point. Within electronic appliances, especially comput-
ers, there are commonly rails for various DC voltages that components within the
appliance can utilize. This can then be used to supply power-over-the-data-cable, as
detailed below.

13.2.2 USB

A very common but, in terms of voltage and current, least flexible option is
power over USB, at least until the release of the USB 3.1 specification. In older
USB standards, the two outermost pins (left and right) of the 5-pin USB cable
are connected to 5 V and ground, providing a maximum of 500 mA (or 900 mA
with USB 3.0) from a USB host controller that is built into a computer. In
fact, the specification goes as far as requiring that the maximum current be negotiated
before a device can draw this much. However, while this allows some peripherals
to be powered from a host computer, stand-alone USB power with currents up
to 1.5 A or higher has become a standard in itself: it is currently the prevalent
interface for charging mobile phones, with the exception of one large smartphone
vendor. Especially for devices that just need occasional recharging, relying on the
ubiquitous availability of a USB charger cable has become a frequent occurrence in
the consumer market.
The new specifications around USB 3.1 and USB Power Delivery allow up to 100 W at 20 V. This is an attempt
to make additional power cables obsolete, as bidirectional power supply can be
handy in a variety of scenarios: while a monitor may be supplied with power from
a desktop computer, a mains-connected monitor may just as well be used to charge
a transiently connected notebook.
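The nominal power budgets of these USB variants can be tabulated as voltage/current pairs. The figures below are the commonly cited nominal values (5 V/500 mA for USB 2.0 hosts, 5 V/900 mA for USB 3.0 hosts, and 20 V/5 A for the 100 W Power Delivery maximum):

```python
# (nominal voltage in V, maximum current in A) for common USB power modes
USB_POWER_MODES = {
    "usb2_host": (5.0, 0.5),
    "usb3_host": (5.0, 0.9),
    "usb_pd_max": (20.0, 5.0),
}

def usb_power_w(mode):
    """Available power for a given mode: P = U * I."""
    voltage, current = USB_POWER_MODES[mode]
    return voltage * current
```

A plain USB 2.0 host port thus offers only 2.5 W, while the Power Delivery maximum reaches 100 W, enough for the notebook-charging scenario described above.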

13.2.3 PoE

PoE is an industry standard (802.3at Type 1/2) and utilizes varying pairs of the
8 pins of the RJ-45 Ethernet jack as power and ground, depending on the type.
A limiting factor is the resistance of the Ethernet cable that is being used: older
category 3 (Cat-3) cable can only supply up to 15.4 W at 44 V (for PoE 802.3at

Type 1), whereas Cat-5 cable is suitable to carry up to 30 W (Type 2). PoE is often
used in industry settings and in building automation where it offers a convenient
solution to provide data communication and electricity over just one cable. The
higher voltage enables PoE to work over distances of up to 100 m, the maximum
length allowed by the Cat-5 specification for data transfer. It should be noted
that PoE requires special routers/switches that follow the 802.3at specification; the
conventional Ethernet specification does not include power. The source device is
referred to as power-sourcing equipment (PSE), whereas the powered device (PD)
is the consumer. It is noteworthy that PoE implements a setup phase during which
PSE and PD negotiate how much power is required and/or whether reduced power is
acceptable. This also allows for solutions in which data packets for a PD activate the
power supply between PSE and PD, which otherwise remains turned off.
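Why PoE uses a comparatively high voltage becomes clear from a first-order voltage-drop estimate over a long cable run. The 12.5 Ω loop resistance below is an assumed illustrative value for such a run, not a figure from the 802.3 standard:

```python
def cable_voltage_drop(supply_v, load_w, loop_resistance_ohm):
    """First-order series voltage drop on a powered cable run, treating
    the load as drawing constant power at the nominal supply voltage."""
    current_a = load_w / supply_v
    return current_a * loop_resistance_ohm

# A 15.4 W load over an assumed 12.5 ohm cable loop:
drop_48v = cable_voltage_drop(48.0, 15.4, 12.5)  # ~4 V lost at a 48 V supply
drop_12v = cable_voltage_drop(12.0, 15.4, 12.5)  # ~16 V: infeasible at 12 V
```

At 48 V the drop is a tolerable fraction of the supply; at 12 V the same load would drop more than the supply itself, so low-voltage power over long Ethernet runs is simply not workable.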

13.3 BATTERIES

A considerable proportion of IoT devices are going to be mobile assets and products
that are installed away from mains electricity. For many applications, the electricity
generated by renewable energy sources is not yet sufficient to run devices on it,
although it may be sufficient to recharge batteries. This means that a battery would
act as the primary power source that can supply the large currents required in
operation, but is continuously recharged during times when the device is off or in a
sleep mode. It is important to note that this strategy implies the use of, for example, small,
consumer product-sized solar panels, in contrast to rooftop-mounted panels, large
arrays of solar panels as well as industry-grade wind turbines, which contribute
to the electricity grid and essentially deliver mains electricity. For small devices
away from the grid, batteries are going to be of critical importance for most IoT
applications where users either do not want a power cord or where mains electricity
is simply not available.
Battery design and battery chemistry are very active areas of academic and
industrial research. There is no generally applicable set of rules as to which battery
type is best, and the selection of the right battery for an application can often be
difficult, resting on a good deal of empiricism. However, this section aims to provide
a general appreciation of the factors that influence the performance of a battery: its
chemistry and physical build. The concluding section provides a brief overview of
commonly available packages.

13.3.1 Battery Chemistry

Batteries provide a potential difference on the basis of electrochemical properties
and a redox reaction between battery components. Reduction-oxidation (redox)
reactions involve the transfer of electrons from one species of atoms to another.
For example, the reaction of zinc with copper sulfate:

Zn (solid) + CuSO4 (in solution) → ZnSO4 (in solution) + Cu (solid)

can be rewritten as two half-reactions (with sulfate acting as a spectator ion):

Zn → Zn2+ + 2 e− (oxidation) and

Cu2+ + 2 e− → Cu (reduction)

This means there is a transfer of electrons between the copper and zinc atoms, which
is not to be mistaken for the sharing of electrons in ionic or covalent bonds or in
an electronic band, as present in conductors (see “Conductors and Semiconductors” in
Section 1.1.2).
The key components of redox reactions are oxidizing and reducing agents.
Oxidizers receive (i.e., pull) electrons, whereas reducers donate (i.e., push) electrons
for the reaction. Why exactly some elements or molecules are good oxidizers or
reducers is beyond the scope of this book. It suffices to say that permanganate
(MnO4−) or electronegative elements such as chlorine (Cl2) are common oxidizers,
and iron (Fe), zinc (Zn) or lithium (Li) are reducing agents. There are many other agents,
but these are the ones most relevant for commonly available batteries.
A prototypic galvanic cell (see Figure 13.1) has two containers, one with zinc
sulfate and one with copper sulfate. They are connected in two distinct ways: A first
route leads from a zinc probe in the ZnSO4 solution to a copper probe in the CuSO4
solution. A second route is a salt bridge, in the easiest case a piece of paper wetted
with a sodium chloride solution. The zinc probe loses electrons, which induce a
flow of charge toward the copper probe. The electric potential difference and the
charge are what is being used in battery applications. The oxidized zinc (Zn2+) from
the probe goes into solution. At the same time, SO42− and Cu2+ separate, as copper
is reduced at the probe and the negatively charged sulfate travels over the salt
bridge towards the aqueous zinc. While at the beginning of this cycle there is an
excess of zinc in the left and an excess of sulfate in the right container, the whole
reaction stalls once the system reaches equilibrium.

Figure 13.1 Galvanic cell. A zinc probe in a zinc sulfate solution and a copper probe in a copper sulfate
solution are connected over a wire. When the two reservoirs are further connected over a salt bridge (e.g.,
a tissue soaked in sodium chloride), electrons will flow from the zinc probe to the copper probe (symbolized
here by an ampere meter that counts electrons). In that process, zinc ions from the probe will go into
solution and travel toward sulfate from the neighboring reservoir, while copper ions from the copper
sulfate solution will form a layer on the copper probe.

The term battery originates from many galvanic cells being connected together, in
a battery of cells, to deliver larger currents. Non-rechargeable batteries are also referred
to as primary cells. In the example of the galvanic cell, both probes reside in aqueous
solution. In modern batteries, this solution is typically replaced by a polymeric
electrolyte, and both compartments are shielded from each other using a separator
(e.g., made from cellulose).
The voltage that such a system can deliver depends on the standard electrode
potential, E0, of the reactants. This potential provides a quantitative description of
how large the tendency of atoms and molecules is to push or pull electrons in a
redox reaction. In the case of Cu2+ + 2 e− ⇌ Cu: E0 = +0.34 V, and for
Zn2+ + 2 e− ⇌ Zn: E0 = −0.76 V. This means that zinc is a sizeable donor and copper
a good acceptor of electrons, and their potential difference of 1.10 V is the voltage
difference between the two (under standard conditions, i.e., at 25 °C, concentrations
of 1 mol/L each, etc.).
The E0 values can give a convenient approximation of the voltage a galvanic
cell can provide. For example, a standard alkaline battery on the basis of zinc and
manganese dioxide produces the reaction:

Table 13.1
Standard Electrode Potential E0 of a Few Selected Species

Reaction                                        E0 [V]
Li+ + e− ⇌ Li                                   −3.04
Mn2+ + 2 e− ⇌ Mn                                −1.19
Zn2+ + 2 e− ⇌ Zn                                −0.76
Ni2+ + 2 e− ⇌ Ni                                −0.25
Pb2+ + 2 e− ⇌ Pb                                −0.13
2 H+ + 2 e− ⇌ H2                                0.00 (per definitionem)
2 MnO2 + H2O + 2 e− ⇌ Mn2O3 + 2 OH−             +0.15
SO42− + 4 H+ + 2 e− ⇌ SO2 + 2 H2O               +0.17
Cu2+ + 2 e− ⇌ Cu                                +0.34
SO2 + 4 H+ + 4 e− ⇌ S + 2 H2O                   +0.50
MnO4− + 2 H2O + 3 e− ⇌ MnO2 + 4 OH−             +0.59
Hg2+ + 2 e− ⇌ Hg                                +0.78
MnO4− + H+ + e− ⇌ HMnO4−                        +0.90
MnO2 + 4 H+ + e− ⇌ Mn3+ + 2 H2O                 +0.95
MnO2 + 4 H+ + 2 e− ⇌ Mn2+ + 2 H2O               +1.23
O2 + 4 H+ + 4 e− ⇌ 2 H2O                        +1.23
Cl2 + 2 e− ⇌ 2 Cl−                              +1.36
MnO4− + 8 H+ + 5 e− ⇌ Mn2+ + 4 H2O              +1.51
MnO4− + 4 H+ + 3 e− ⇌ MnO2 + 2 H2O              +1.70
HMnO4− + 3 H+ + 2 e− ⇌ MnO2 + 2 H2O             +2.09

Anode: Zn → Zn2+ + 2 e−: E0 = −0.76 V
Cathode: 2 MnO2 + 2 H+ + 2 e− → 2 MnO(OH): E0 = +0.95 V (because Mn3+
reacts to MnO(OH) in aqueous solution)

The potential difference of 1.71 V (+0.95 V − (−0.76 V)) is close to the desired value
of 1.5 V for standard-size alkaline batteries. This target value is carefully engineered
and takes into account a variety of factors. First, there is a range of alternative
reactions on the anode and cathode that happen at the same time as the primary reaction,
species in Table 13.1 serves to emphasize the breadth of potential side reactions,
as there is an abundance of H+ and e – in these aqueous solutions). There are also
other reactions taking place at the interface of anode and cathode, and the electrolyte
between them, which further complicate matters.
From the E0 values in Table 13.1 it becomes evident why lithium chemistries
involving chlorine are often used in powerful batteries. The theoretical E0 difference
of a lithium-thionylchloride battery is about 3.65 V, from the reactions:

Anode: Li → Li+ + e−

Cathode: 2 SOCl2 + 4 e− → S + SO2 + 4 Cl−

Note that the spectrum of E0 values is larger, but they are associated with chemical
species (e.g., strontium) that cannot be safely handled outside a laboratory setting.
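The approximation described above, taking the open-circuit voltage as the difference of standard electrode potentials, can be expressed compactly. The E0 values below are taken from Table 13.1; the dictionary keys are just illustrative labels:

```python
# Standard electrode potentials in volts, from Table 13.1
E0 = {
    "Li+/Li": -3.04,
    "Zn2+/Zn": -0.76,
    "Cu2+/Cu": +0.34,
    "MnO2/Mn3+": +0.95,
}

def cell_voltage(anode, cathode):
    """Approximate open-circuit cell voltage: E0(cathode) - E0(anode)."""
    return E0[cathode] - E0[anode]

daniell = cell_voltage("Zn2+/Zn", "Cu2+/Cu")     # ~1.10 V (galvanic cell)
alkaline = cell_voltage("Zn2+/Zn", "MnO2/Mn3+")  # ~1.71 V (alkaline chemistry)
```

As the text cautions, this neglects side reactions and electrode/electrolyte interface effects, so real cell voltages deviate from these ideal differences.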

13.3.2 Rechargeable Batteries

Rechargeable batteries are also referred to as secondary cells. With the advent of
mobile devices (laptop computers, mobile phones), there has been a general shift
towards integrated batteries that can be recharged. The reason is the limited capacity
of all battery types, and that recharging is more economical and environmentally
friendly than replacing entire batteries with cases and residual toxic chemicals.
Why are not all batteries designed for recharging? Primary cells typically
have a higher energy density; that is, more charge can be stored per volume.
Alkali-manganese and, in particular, lithium-thionylchloride primary cells store
considerably more energy per volume than secondary cells such as lead acid, with
the exception of some lithium polymer secondary cells that achieve similar density.
The chemistry of primary cells allows a typical load-independent discharge rate
of between 3 to 5% per year, with lithium-thionylchloride taking the lead at 0.5%
per year. This is in stark contrast to secondary cells with typical, inferior discharge

rates of up to 25% per month. However, rechargeable batteries are chemically more
robust and dynamic. Fast-running redox reactions allow higher peak currents, and
the internal resistance of secondary cells remains stable over the entire lifetime of
one discharge cycle, whereas the resistance increases in primary cells, which can
often be seen in a decline in their performance over time.
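The compounding effect of these self-discharge rates can be made concrete. This is a sketch only, assuming a constant percentage loss per period; real cells do not follow a perfectly exponential decay:

```python
def remaining_fraction(rate_per_period, periods):
    """Capacity fraction left after compounded self-discharge,
    assuming a constant fractional loss per period."""
    return (1.0 - rate_per_period) ** periods

primary_5y = remaining_fraction(0.04, 5)     # 4%/year primary: ~82% after 5 years
secondary_1y = remaining_fraction(0.25, 12)  # 25%/month secondary: ~3% after 1 year
```

This is why primary cells suit devices that must sit idle for years, while secondary cells only make sense where regular recharging is available.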
In principle, standard batteries can also be recharged, simply by applying an
external current to the anode and cathode, in the opposite direction to the electron
flow when the battery is in normal use. However, this is discouraged by all manufacturers,
as non-optimized chemistries may lead to the development of gas and
dangerous overpressure. If a battery is designed for recharging, its chemistry and
mechanical build have to counteract such processes.
Well-known rechargeable chemistries include lead, nickel or cadmium in con-
junction with sulfuric acid, potassium hydroxide or cadmium hydroxide. Lead and
cadmium are very toxic, and acids and bases are corrosive. Therefore, the produc-
tion of these accumulators has largely been discontinued. They were traditionally
found in car batteries or cheap consumer devices. Modern rechargeable batteries for
telecommunication or multimedia devices feature nickel-metal hydride (NiMH) or
lithium-ion chemistries. While they are not necessarily nonhazardous, they are free
of the problematic lead and cadmium compounds.
Lithium polymer (LiPo) secondary batteries have become a standard in most
mobile devices (see Figure 13.2). They are based on lithium-ion chemistries, but
their name references the typically polymeric pouch into which these batteries are
pressed. Lithium-ion accumulators work differently from lithium primary batteries,
as they provide energy by the directed intercalation of lithium ions into a matrix, in
contrast to the stochastic reactions seen for metallic lithium in solution. During
discharge, the ions travel from the negative substrate (e.g., lithium-graphite)
through an electrolyte (e.g., lithium hexafluorophosphate) and a separator that is
only penetrable by lithium towards the positive substrate (e.g., lithium-metal
oxide). The recharging process reverses this intercalation.
At the beginning of the 2000s, the quality and build of rechargeable NiMH
batteries and LiPo packs were still poor. The performance of batteries was variable
even across models of the same manufacturers, including significantly differing
degradation rates (i.e., declining energy density over time). Modern LiPos do not
require a conditioning phase in which the battery has to be fully charged and
completely discharged over several cycles before it is fully functional. They
are also not prone to the so-called memory effect, in which a battery cannot
discharge further than the point at which it is usually recharged. However, as lithium
chemistries may suffer from uncontrollable self-reinforcing reactions, referred to as


Figure 13.2 Recharging a LiPo battery. In a fully charged LiPo, the lithium ions sit in between layers
of Li-graphite material. During the discharge process, the ions migrate into layers of Li-metal oxide,
which store the lithium ions until the process is reversed by charging. This layered build allows for a
higher ion concentration per surface area and faster mobilization than in traditional builds using solid
anode and cathode rods.

thermal runaway, rechargeable batteries of this type need to be protected against
overcharging and excessive current, and need to have mechanical protection against
overpressure. LiPo and other lithium batteries are therefore often forbidden for
reasons of fire safety. There are special charger circuits for LiPo packs that prevent
overcharging and thus prolong battery life.

Power Management Circuits

The recharging of a battery depends on the temperature, the available voltage
and current, the capacity of the battery, and the time the recharging process is
supposed to take. There are in principle two strategies: relatively fast, periodic and
demand-dependent recharging, or trickle charging that is slow and primarily aims
to counteract the load-independent discharge of batteries. Especially in applications
where only occasional microcurrents are required, trickle charging from a solar
panel can be a reasonable option to maintain the readiness of the device over
extended periods of time. This is in contrast to the fast recharging of batteries, which
usually requires the constant supply of mains electricity.
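Whether trickle charging can sustain a device reduces to a simple daily energy balance. The panel output, sun hours and load figures below are made-up illustrative values for a hypothetical solar-assisted node:

```python
def daily_balance_mah(panel_ma, sun_hours, load_mah_per_day,
                      self_discharge_mah_per_day):
    """Net daily charge into the battery: harvested charge minus the load
    and the battery's own self-discharge (all in mAh per day)."""
    harvested = panel_ma * sun_hours
    return harvested - load_mah_per_day - self_discharge_mah_per_day

# Hypothetical: 20 mA panel, 4 sun-hours/day, 50 mAh/day load, 2 mAh/day leakage:
net = daily_balance_mah(20.0, 4.0, 50.0, 2.0)  # +28 mAh/day surplus
```

As long as the balance stays positive over the worst expected season, trickle charging maintains readiness; a negative balance means the battery will eventually be drained despite the panel.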
There are four conditions that are detrimental to all rechargeable batteries:
overdischarging, overcharging, shorting and overuse. Overdischarging pushes all
charge carriers to one side of the cell to a degree that the chemical reaction
is no longer reversible. Overcharging, shorting and drawing too much current for too
long all lead to excessive temperature, which promotes chemical decay and bears

[Schematic for Figure 13.3: a 9–12 V DC input feeds an LM317 voltage regulator
(Vreg) through 1N4007 diodes; a 180 Ω resistor, 1 kΩ potentiometer, 6.8 V Zener
diode and BC548 transistor form the cutoff stage, and a 10 Ω resistor sits in series
with the 6 V rechargeable battery.]
Figure 13.3 Charger circuit. This exemplary circuit shows how a 6 V battery can be recharged with
9 to 12 V DC. The 1N4007 diodes ensure that only current of the appropriate polarity can enter the
circuit. The LM317 is a standard IC that converts the incoming DC to the target voltage, and the precise
value can be set using the potentiometer. Once the battery is charged, the 6.8 V Zener diode breaks
down, switching the BC548 transistor and thus disabling further supply to the battery. The 10 Ω resistor
protects the transistor and prevents shorting of the battery in the absence of a load.

a risk of overpressure and explosion. Hence, a power management circuit balances


the demand on the battery and recharges it at an optimal rate.
The simplest recharging circuit utilizes a diode between the power supply
and the battery to prevent current flowing back into the charger when the battery
is fully charged and the power supply switched off. This strategy is also
employed in very cheap solar panel charging circuits. The next, more complex
variant of this charger uses a transistor and a Zener diode to ensure the charging
circuit is turned off once the battery is full (see Figure 13.3).
Recharging ICs like the MAX639 deliver a continuous charging current and
implement other security measures (i.e., they integrate complex charging circuits in
single packages).
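The charger in Figure 13.3 sets its target voltage through the LM317's resistor network. As a small sketch, the generic LM317 relation V_out ≈ 1.25 V × (1 + R2/R1) can be evaluated for the resistor values shown in the figure; the mapping of the figure's 180 Ω resistor and 1k potentiometer onto R1 and R2 is our assumption.

```python
def lm317_vout(r1_ohm, r2_ohm, v_ref=1.25, i_adj=50e-6):
    """Output voltage of an LM317 adjustable regulator.

    v_ref is the reference between OUT and ADJ (1.25 V); i_adj is the small
    current flowing out of the ADJ pin (typically 50 uA).
    """
    return v_ref * (1 + r2_ohm / r1_ohm) + i_adj * r2_ohm

# With a fixed 180 ohm resistor and a 1k potentiometer swept from zero to
# its full value, the output can be set between about 1.25 V and 8.2 V:
v_min = lm317_vout(180, 0)
v_max = lm317_vout(180, 1000)
```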

13.3.3 Battery Types and Real-Life Properties

Most hardware manufacturers rely on standardized battery types (see Table 13.2).
The most common cylindrical batteries follow American National Standards Insti-
tute (ANSI) norms AA, AAA, C and D. Other common form factors are the
block battery and (coin cell) watch batteries (e.g., of the CR2032 or CR2025 type).
In many cases these codes refer to both a form factor and a nominal voltage; how-
ever, the AA type in particular is available with two different standard voltages.
It is noteworthy that this is really only a minimal reflection of the diversity
of battery types, as a look into a specialist catalog will confirm. Including legacy
Power 171

Table 13.2
Common Battery Types

ANSI Code   Alternative Name   Size        Nominal Voltage
Tubular
AA          Mignon             H:   D:
AAA         Micro              H:   D:
C           Baby               H:   D:
D           Mono               H:   D:
Block type
9V                             x  x
Coin cell
CR2025                         H:   D:
CR2032                         H:   D:

form factors, there are several dozen variants of block-type batteries with voltages
from to , or more than twenty types of coin cells alone. Specialist batteries
with form factors specific to particular devices exist as well. For example,
mobile phone manufacturers often develop devices and batteries at the same time,
as they need to optimize size, shape, weight and capacity in an integrative manner.
However, for reasons of development cost, this device-specific design of batteries is
more the exception than the norm.
Suppose a desired voltage has been selected; following the choice of size,
there is often a further decision to be made for the internal build and chemistry of
the battery. This is reflected in the International Electrotechnical Commission (IEC)
battery norm. For example, this may indicate that an AA battery is either of the R6
(zinc-carbon) or LR6 (alkaline-manganese) type. Build and chemistry both have an
influence on overall capacity, which means that the IEC code can also express a
certain expected capacity.
Currently, alkaline-manganese batteries have largely superseded zinc-carbon
batteries, which are prone to leakage as the zinc electrode decomposes in the
process of discharging. That being said, alkaline-manganese batteries can also suffer
from leakage if stored for a prolonged period of time, although their build usually
provides some degree of protection from leakage. In addition, alkaline-manganese

[Figure 13.4 diagram: cross-sections of a 1.5V AA alkaline battery (metal case,
Zn powder matrix, Zn probe with collector to the + pole, cellulose separator with
KOH, MnO2 cathode material, plastic separator), a lithium-thionylchloride cell
(Li anode material, cellulose separator with AlCl3, SOCl2 cathode material), a
1.5V button cell in coin form factor, and a 9V block battery.]

Figure 13.4 Cross-sections through batteries. Top left: Alkaline battery. Note how the zinc probe is
separated from the rest of the case to reduce the risk of leakage. Top right: Lithium-thionylchloride
battery. Bottom left: Watch battery. Bottom right: 9V block battery. These batteries often have very
small capacities, as they are made of six smaller battery cells.

batteries have a higher capacity and lower discharge rate than zinc-carbon ones.
Cross-sections through common battery types are shown in Figure 13.4.
In summary, critical parameters for the choice of the right battery are:
• Nominal voltage
• Peak and continuous current
• Energy density (i.e., capacity per volume)
• Total capacity
• Impedance (the amount of energy lost as heat at higher currents)
• Operating temperature
• Orientation (i.e., the battery is always in a particular position)
• Self-discharge rate (see Section 13.3.2)
• Various other discharge characteristics (e.g., in dependence of load)
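Several of these parameters can be combined into a back-of-the-envelope runtime estimate. The sketch below assumes a duty-cycled device with illustrative current figures (not from any datasheet) and derates the nominal capacity to account for self-discharge and temperature effects:

```python
def runtime_hours(capacity_mah, sleep_ma, active_ma, active_fraction,
                  derating=0.8):
    """Estimated runtime from nominal capacity and average current draw.

    The derating factor (an assumed 0.8) crudely covers self-discharge,
    temperature and end-of-life voltage effects.
    """
    avg_ma = active_ma * active_fraction + sleep_ma * (1.0 - active_fraction)
    return derating * capacity_mah / avg_ma

# A hypothetical sensor node sleeping at 10 uA and waking to 20 mA for
# 0.1% of the time, running from a 2400 mAh cell:
hours = runtime_hours(2400, sleep_ma=0.01, active_ma=20, active_fraction=0.001)
years = hours / (24 * 365)
```

Note how the sleep current dominates the result: halving the active current barely changes the estimate, whereas halving the sleep current nearly doubles it.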
For many applications, lithium-thionylchloride primary cells have become
the best-in-class solution. Their nominal voltage of is ideal for logic
devices. They maintain that voltage almost until the end of their capacity,
in contrast to alkaline batteries that drop below their voltage level after
using only about 10% of their capacity. Many lithium-thionylchloride batteries
are developed to deliver this voltage for an extended temperature range from
− ◦ to ◦ , with anecdotal reports of being operational at − ◦ . Lithium-
thionylchloride batteries are especially suited for prolonged operation times (e.g.,
for devices with long sleep times that require only short pulses of small currents),
and can provide a stable supply for up to 10 years.
The selection of the optimal battery for a particular application is often
difficult. Fortunately, in contrast to off-the-shelf products available in supermarkets,
batteries that are sold in the business-to-business sector are usually accompanied
with data sheets and detailed specifications. The interested reader may find a wealth
of information on manufacturer websites.

13.4 RENEWABLE ENERGY SOURCES

Large-scale solar farms, wind turbines and other forms of electricity that fall un-
der the renewable energy umbrella are simply components of the energy grid, and

this book is not concerned with how mains electricity is generated and delivered.
However, small solar panels and energy harvesting technology can provide a valu-
able alternative for recharging batteries of a connected device independent of mains
electricity. In particular solar panels have achieved a level of efficiency that is suf-
ficient to drive a small microcontroller when the panel is fully exposed to light.
Therefore, this section focuses on solar power as the most widespread source of
renewable energy in small devices and consumer products, as well as two energy
harvesting strategies that are commercially available and have been developed past
an experimental stage.

13.4.1 Solar Panels

The solar panel is the smallest unit of a photovoltaic system. The panel represents
a physical module for the solar cell (system → panel → cell) and serves as
protection of the cell from mechanical damage and moisture, and as the electronic
connection to other panels. There is a vast list of characteristics that describe the
electric performance of a panel, for example the maximum power point (MPP),
voltage and current at the MPP, and its efficiency. These features depend on the
type and quality of different solar cells, as well as the electronic design of how
these cells are connected.
Depending on the use case, solar panels can be stiff or flexible. The variant
often mounted on buildings consists of a layer of reinforced plastic, which serves
as bed for the solar cell. This layer integrates channels for electronic connections
as well as mounting holes. The solar cell itself is sandwiched between the plastic
and toughened glass, and the entire build is fixed into an aluminum frame. Solar
panels for use in electronic devices often see the plastic and glass layers of
the aforementioned design replaced by foil. These panels are flexible, and physical
stability is often provided by the host device itself.
Solar cells are effectively photo diodes (see Conductors and Semiconductors,
Section 1.1.2, and Diodes, Section 2.1.1.4, as well as Figure 13.5). Incoming light is
absorbed by semiconducting material, usually silicon, where it drives electrons from
the p-n boundary layer towards the front contacts. This yields an overall current flow
from the back contact through the p-type and n-type layers to the front contacts.
The potential difference between the front and back contacts is the voltage that
can be used, in the case of silicon cells about . A standard x cell
can produce a current of up to under optimal conditions. Solar panels combine
many solar cells in order to achieve higher voltages. However, in contrast to batteries
where higher voltages are achieved by simply adding them in series, each solar cell

[Figure 13.5 diagram: cross-section through a solar cell and its electronic
equivalent: incoming light, front contact, anti-reflection layer, n-type silicon
with free electrons (-), p-type silicon with electron holes (+), and back
contact, with the resulting current flow.]

Figure 13.5 Solar cell. A solar cell resembles a p-n interface like a diode (see Figure 1.5 for reference).
The front contacts run across the surface of the cell (this is the pattern that can be seen when looking at
a panel). Using the energy of light, electrons overcome the p-n barrier and current flows from the front
to the back contacts. This explains why, strictly speaking, solar panels do not produce energy but simply
pump charges.

has a slightly different MPP. Hence, MPP tracking logic chips have been invented
that optimize the combination of several cells to yield an optimal joint voltage.
While the power simply follows P = U · I, the MPP defines the optimal point
(combination of U and I) under consideration of the open-circuit voltage (when the
poles are not connected) and the short-circuit current.
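The MPP can be found numerically by scanning the cell's I-V curve for the point that maximizes P = U · I. The sketch below uses a textbook single-diode model with assumed parameters (a short-circuit current of 3 A and a thermal voltage of 26 mV); it illustrates the search itself, not any particular cell.

```python
import math

def cell_current(v, i_sc=3.0, i_0=1e-9, v_t=0.026):
    """Single-diode model of a solar cell: I = Isc - I0 * (exp(V/Vt) - 1)."""
    return i_sc - i_0 * (math.exp(v / v_t) - 1.0)

def find_mpp(v_max=0.6, steps=1000):
    """Scan the I-V curve and return (voltage, power) at the MPP."""
    best_v, best_p = 0.0, 0.0
    for k in range(steps + 1):
        v = v_max * k / steps
        p = v * max(cell_current(v), 0.0)  # power delivered at this voltage
        if p > best_p:
            best_v, best_p = v, p
    return best_v, best_p

v_mpp, p_mpp = find_mpp()
```

Real MPP trackers perform essentially this search continuously in hardware, perturbing the operating voltage and observing whether the delivered power rises or falls.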
Over 90% of all solar panels utilize mono- or multicrystalline silicon cells
with added contacts. Amorphous and multi-crystalline silicon is much cheaper
to produce, at the cost of efficiency, which is between 10% (amorphous) and
15–20% (multicrystalline) in comparison to 16–22% for monocrystalline silicon.
Experimental solar cells on the basis of gallium and indium can achieve efficiencies
of around 40%, although under conditions that are hard to achieve in real-life applications.
The theoretical maximum efficiency of a perfect solar cell would be 85%, which is
predicted by Carnot’s thermodynamic theorem.
There has been a significant price decrease for solar panels over the past forty
years, as the price per watt has fallen by 20% for every doubling of industry capac-
ity. This price dependency, in analogy to Moore’s law that predicts the doubling of
compute power every two years, is called Swanson’s law.
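Swanson's law can be written as a simple learning curve: each doubling of cumulative industry capacity multiplies the price per watt by 0.8. A short sketch (the starting price below is a made-up illustration):

```python
import math

def swanson_price_per_watt(p0, capacity_growth_factor, learning_rate=0.20):
    """Price per watt after industry capacity grows by the given factor.

    Each doubling of capacity reduces the price by learning_rate (20%).
    """
    doublings = math.log2(capacity_growth_factor)
    return p0 * (1.0 - learning_rate) ** doublings

# An eightfold capacity increase corresponds to three doublings (0.8 ** 3),
# so a hypothetical starting price of 100 falls to about 51.2:
price = swanson_price_per_watt(100.0, 8.0)
```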

13.4.2 Energy Harvesting

Three methods of energy harvesting that have emerged from an experimental stage
and now see commercial use are piezo generators, inductive chargers and radio
frequency (RF) harvesting.
Quartz crystals are traditionally used in timekeeping (see Piezo Element in
Section 2.1.2.3). In brief, a crystal of silicon dioxide is mechanically deformed and
the regular vibrations of the atoms in the lattice lead to microcurrents that can be
measured electronically. That current can also be harvested and used for recharging
batteries. The larger the volume of a piezoceramic, the larger the current that
can be obtained. A typical form factor is the piezoceramic disc, a flat cylinder that
can be mounted mechanically underneath a surface that experiences pressure. As
microcurrents from a vibrating crystal (or one that deforms under pressure and
then relaxes) flow in both directions, for DC applications it is necessary to add
a rectifier bridge (see Diodes, Section 2.1.1.4). An example calculation of one
manufacturer gives an idea about the voltages that can be expected from their
piezoceramic film: Given a x patch of µ thickness, using the
material-specific parameters, a pressure of leads to a voltage of .
Low-supply-current DC-DC converters such as the MAX1674 can provide or
from just input voltage at as little as µ , at an efficiency of about 30%.
Piezoelectric harvesting is thus not the most energy-efficient method of generating
power; however, the key here is to find large surfaces that see continuous mechanical
pressure, such as floor panels or street surfaces.
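The usable energy per deformation cycle can be estimated from the piezo element's capacitance and the voltage it develops, via E = ½CV². The values below are hypothetical; a real design would consult the film manufacturer's datasheet as mentioned above.

```python
def piezo_energy_uj(capacitance_nf, voltage_v):
    """Energy stored per deformation cycle, E = 0.5 * C * V^2, in microjoules."""
    return 0.5 * (capacitance_nf * 1e-9) * voltage_v ** 2 * 1e6

# A hypothetical 100 nF patch reaching 10 V stores 5 uJ per press; at two
# presses per second that averages 10 uW before converter losses.
energy = piezo_energy_uj(100, 10)
avg_uw = energy * 2
```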
Inductive charging is already widely used for charging small consumer de-
vices such as mobile phones and electric toothbrushes. The underlying principle
is that AC from mains electricity passes through a coil, which induces a magnetic
field (see Interactions of Electric and Magnetic Fields, Section 1.2.2). That magnetic
field can be picked up by a second coil at some distance from the primary coil. Here,
the field induces an electric current that can then be used to charge a battery. The
efficiency of inductive transfer is in the range of about 80% at distances of .
The third more recent method, RF harvesting, borrows the principle of induc-
tive charging. However, rather than picking up a strong magnetic field as induced
with a coil, RF harvesting relies on antennas that respond to frequencies of the
electromagnetic spectrum commonly used in data communication. Experimentally
it has been shown that WiFi and mobile phone signals can be utilized to charge bat-
teries, although the bursty nature and the modulation of the signal makes them less
ideal than RF transmissions that come in continuously using nonvariant frequencies.
Commercially available solutions are optimized for distances of up to .
Chapter 14
Actuators

In the traditional sense an actuator provides the physical output from pneumatic,
hydraulic or electrical systems, primarily in the form of motion. It is the mediator
between an activated power source such as a steam engine and an object. While
in the modern use of the word even complex output peripherals can be referred
to as actuators, the term also covers any of the small electronic components that
are directly connected to control systems like microcontrollers. In this section the
focus is on those electric components (e.g., lights, buzzers and motors). With much
of the foundations already discussed in Chapters 1 and 2, it suffices to point out
a few standard actuators frequently used at the prototyping stage and to define the
boundary between basic actuators and complex output devices.

14.1 FROM BUZZERS TO SPEAKERS (SOUND)

The two most basic strategies to generate sound rely on electromechanical and
piezoelectric buzzers. The older electromechanical buzzers, which utilized the
sound of a frequently switching relay, have largely been superseded by piezoelectric
buzzers that use the controlled oscillation of a crystal to emit simple tones. Buzzers are
single-wire output devices. Typically, one terminal receives a small DC input volt-
age at the desired frequency, and one is connected to ground. It is important to note
that buzzers cannot create the same spectrum of frequencies and volume of actual
loudspeakers known from audio systems. These are based on the deflexion of a
membrane using electromagnetic forces and usually require some degree of control
electronics.
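Driving a piezoelectric buzzer from a microcontroller typically means toggling an output pin as a square wave at the desired pitch. The helper below only computes the toggle timing; pin assignments and wiring are outside its scope.

```python
def toggle_interval_us(freq_hz):
    """Half-period of a square wave in microseconds.

    To emit a tone at freq_hz, the output pin is toggled once per half
    period; two toggles make one full oscillation.
    """
    return 1_000_000.0 / (2.0 * freq_hz)

# Concert pitch A (440 Hz) requires a toggle roughly every 1136 microseconds.
interval = toggle_interval_us(440)
```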


[Figure 14.1 diagram: RGB LED pinouts in common cathode (-) and common anode
(+) configuration, each with separate R, G and B pins.]

Figure 14.1 Pin configuration of RGB light-emitting diodes. RGB LEDs exist in common cathode
and common anode configurations. The relative amount of the three color channels is controlled by
differential input voltages for each of the red (R), green (G) and blue (B) components in the common
cathode configuration. In the common anode configuration, the supply voltage is constant but controlled
access to ground determines the color of the LED.

14.2 FROM INDICATOR LIGHTS TO DISPLAYS (LIGHT)

LEDs are the core components of segmented displays, such as LED rings, LED
matrices or 7-segment displays. While a single monochromatic LED requires the
connection to a single data input (e.g., with a supply voltage of or )
and ground, red-green-blue (RGB) LEDs that can emit millions of colors typically
require three independent inputs. RGB LEDs exist in common anode and common
cathode configuration; that is, the rule of thumb that the longer leg of the LED
connects to the supply voltage is not necessarily true (see Figure 14.1).
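The distinction between the two configurations matters when channels are dimmed with PWM: a common-anode part needs inverted duty cycles, because its pins sink rather than source current. A sketch with 8-bit channel values (the helper name is our own):

```python
def pwm_duties(rgb, configuration='common-cathode'):
    """Per-channel PWM duty cycles (0.0-1.0) for an 8-bit RGB color.

    With a common cathode, a larger duty cycle means a brighter channel;
    with a common anode the drive is inverted, as the pin sinks current.
    """
    duties = [channel / 255.0 for channel in rgb]
    if configuration == 'common-anode':
        duties = [1.0 - d for d in duties]
    return duties

# Full red on a common-anode LED: the red pin is held low (duty 0.0),
# while the unused green and blue pins stay high.
anode_red = pwm_duties((255, 0, 0), 'common-anode')
```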
Segmented displays (e.g., 7-segment for the display of numbers), matrix
displays (e.g., with a by matrix of LEDs) or strips with dozens of LEDs
per meter require higher currents to drive all lights appropriately. In addition,
controlling each light individually with a dedicated output would easily saturate
even microcontrollers with many output ports. There are multiple possibilities to
address this problem: one is a technique called charlieplexing, which utilizes the
tristate capability of microcontroller I/O ports (see Figure 14.2). However, refresh
rate, peak current and the sensitivity of the circuit to the failure of even one single
LED puts practical limits to this strategy. In addition, as shown before, RGB LEDs
require three input lines and thus employing charlieplexing for color output adds
additional complexity to a project. An alternative to multiplexing is the use of
external controller ICs that take time-dependent input signals from just a few ports,
such as SPI or I2C (see Hardware Interfaces, Section 19.1).
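The charlieplexing idea can be sketched in a few lines: every ordered pair of ports addresses one LED, and to light it one port is driven HIGH, one LOW, and all others are left in the high-impedance INPUT state. The port naming below is illustrative.

```python
from itertools import permutations

def charlieplex_leds(n_ports):
    """All LEDs addressable with n tri-state ports: one per ordered port pair."""
    return list(permutations(range(n_ports), 2))

def port_states(n_ports, led):
    """Tri-state configuration that lights exactly one LED (anode, cathode)."""
    anode, cathode = led
    return ['HIGH' if p == anode else 'LOW' if p == cathode else 'INPUT'
            for p in range(n_ports)]

# Three ports address 3*3 - 3 = 6 LEDs, as in Figure 14.2:
leds = charlieplex_leds(3)
states = port_states(3, leds[0])
```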
Actuators 179

[Figure 14.2 diagram: ports I/O-1, I/O-2 and I/O-3 connected to six LEDs (1-6)
in a charlieplexing arrangement.]
Figure 14.2 Charlieplexing LEDs. Left: 7-segment display, showing “4”. Right: The principle of
charlieplexing. A port can either be ON (e.g., outputting voltage), OFF (port pulled to ground) or
INPUT, which is a high-impedance state that mimics the port being electrically disconnected from the
circuit. Assuming an initial all-OFF state for I/O ports and LEDs, in order to switch on LED 1, we
would keep I/O-1 at LOW and set I/O-2 to HIGH. The response of LED 4 would depend on the state of
I/O-3. It would not matter in this case if we had to pull I/O-3 to LOW; however, in order to avoid
complications if LED 4 were to be on, I/O-3 can momentarily be set to INPUT. Charlieplexing can in
theory address n² − n LEDs through n ports.

Liquid crystal displays (LCDs) utilize the ability of some organic molecules
(the liquid crystals) to alter the polarization of visible light. In combination with
polarization filters, by controlling the spatial orientation of the liquid crystal mate-
rial electronically, it can be determined whether light can pass in a particular cell.
Depending on the use case, those cells can be quadratic and aligned as a matrix
of pixels (picture elements) or they can be formed arbitrarily (e.g., as elements of
a 7-segment display). Basic LCDs utilize a mirror behind the cells and the picture
is formed by reflecting or blocking the light that is falling in, but some designs
also combine this with active background illumination. In combination with highly
resolved background illumination, one can devise color matrices in which each
individual pixel comprises three small LCD cells, one for each of the red, green
and blue color channels. As is the case for arrays of LEDs, the control of many pixels requires
the use of specialized driver ICs that are typically controlled via standard field bus
systems.
LCDs have been the principal component of many other technologies, notably
thin-film transistor (TFT) or in-plane switching (IPS) displays. The discussion of
these more advanced display technologies is beyond the scope of this book, but the
interested reader may want to follow up on the concepts of e-ink paper that use
electric fields to accumulate ink particles in the form of characters, or photo- and
electroluminescence as used in plasma screens or organic LEDs.

14.3 FROM VIBRATION TO ROTATION TO SWITCHING (MOTION)

Different technologies exert a mechanical force, encompassing:


• Vibration (vibration alarm)
• Linear motion (solenoid)
• Rotary motion (motor)
• Switching (relay)
To provide a logical structure for these different types of components, this section
begins with applications of piezoelectric devices, then introduces different types of
motors based on different operating principles, before expanding into the vast range
of relay technologies.

14.3.1 Vibration and Piezoelectric Motors

The deflexion of piezoelectric elements not only can be used to move air as in the
example of a buzzer, but also to move payload with larger mass. Depending on the
build, many vibration alarms utilize off-center mounted weights that are set into
motion with a piezoelectric mechanism. Similar to the buzzer, the simple applica-
tion of a driver voltage is sufficient for activation. More sophisticated designs of
piezoelectric motors are used to transport load along a linear axis by the alternat-
ing extension into two defined directions, exerting a push and a pull on a planar
load (inchworm motor). Following a similar principle, the controlled extension of
piezoelectric material can be used to push forward the center axis of a rotary motor.
The control of these motors is complex and therefore requires a certain amount
of control logic; piezoelectric motors are often used for scientific purposes where
the displacement of payload is measured in distances of a few nanometers. The
precision of piezoelectric elements is also used in inkjet printers where they propel
ink droplets.

14.3.2 Solenoids and Electromagnetic Motors

The basic principle of solenoids and electromagnetic motors is introduced in the


section on interactions of electric and magnetic fields (see Section 1.2.2). When
applying a current to a wire that is arranged in a tightly wound helix (the solenoid),
a magnetic field that is, to a first approximation, uniform and linear builds up in
the helical tunnel. If the current is strong and the helix tight enough, this magnetic

field is strong enough to push out or pull in a metallic rod. While solenoids work
with low voltages as available from microcontrollers, most require currents in the
ampere range to function properly.
The introductory example in Figure 1.12 explained that a wire that is carrying
a charge experiences a push or a pull into the plane of a magnetic field, depending on
the direction of the current. The example neglected to show what happens once the
wire has turned 180 degrees around the rotational axis. The new position effectively
means an inversion of the direction of current, which in theory would cause
the motor to stall and then rotate back by 180 degrees. If the motor is driven by AC,
it is possible to design a motor that functions entirely by alternating the current
at the same frequency as required to turn the motor. There are many technical
improvements to this design and variants that do not rely on the alternating character
of AC alone, all of which are beyond the scope of this book. In the case of DC
motors, the stator (the stationary part that provides the axis for the rotor wire) can
hold commutators or slip rings that provide the mechanical means to reverse the
current every half rotation. Commutators often employ metallic or carbon brushes
that sweep along the surface of the rotor. This causes friction, and thus reduces the
efficiency of the motor and represents the potential for mechanical wear-and-tear.
This gave rise to brushless DC motor designs that internally convert DC to AC. As
with solenoids, while the low voltage of microcontrollers may be sufficient to run
a DC motor, the current provided by output ports is typically not sufficient to drive
one.

Servo and Stepper Motors

Classical AC and DC motors can only be controlled by the voltage that is being
supplied. From a microcontroller perspective, this allows the speed control of DC
motors either by connecting the device to an analogue output port or by PWM; see
Section 3.2. As such, these motors are typical single-wire components, although
usually with an intermediary current booster, such as a MOSFET. For applications
that require precise movements and control of speed, angle and torque, this degree
of control is not sufficient. Here, servo and stepper motors provide a solution (see
Figure 14.3).
A servo motor is a DC motor with an integrated gear box, a sensor that
determines the current state of an output gear, and a control circuit. Two wires,
power and ground, provide a constant supply for the motor. A third wire controls the
position of the motor, usually by a voltage or PWM signal that encodes the desired
final position of the motor. Through reading a rotary encoder, potentiometer or

[Figure 14.3 diagram: left, a servo motor in top view (3-pin connector, IC
comparing actual and target position, DC motor, gear drive, potentiometer reading
the position of the output gear); center, a stepper motor in front view (coil
pairs A1/A2 and B1/B2 arranged around a bipolar N-S rotor); right, coil activity
and relative current directions for eight representative rotor positions.]

Figure 14.3 Servo and stepper motors. Left: Servo motor. The motor unit is constantly powered via
the positive (+) and ground (-) leads. The control lead (C) communicates with the IC, which determines
whether the DC motor is in the appropriate position by reading the potentiometer behind the main gear. If
further control is required, the IC supplies the DC motor with power. Center: Stepper motor. By applying
current of opposite direction to A1, A2 (left) and B1, B2 (right), the magnetic fields of the respective coils
attract the ends of the bipolar rotor. Right: Activity of coils and direction of current through the A1/2 and
B1/2 blocks for eight representative positions. To achieve continuous rotation, the configurations of A1/2
and B1/2 have to change transiently from stage to stage.

other type of position sensor, the control circuit performs a continuous comparison
between the actual and target positions. An integrated circuit then determines the
voltage and duration for the internal DC motor to rotate, taking into account
instantaneous feedback from the position sensor. This allows the servo motor to
maintain a position, even when an additional load or counterforce is applied. Servo
motors are typically a lot faster and can provide more holding torque than stepper
motors, although at the cost of granularity. Typical applications for servo motors are
the control of rudders where the final position is between − ◦ and ◦ .
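Hobby servos of this kind are commonly commanded with a pulse of roughly 1-2 ms repeated every 20 ms, mapping linearly onto the travel range. The sketch below assumes those conventional values; a specific servo's datasheet may define different pulse widths or travel.

```python
def servo_pulse_ms(angle_deg, travel_deg=180.0, min_ms=1.0, max_ms=2.0):
    """Pulse width commanding a given angle, clamped to the travel range.

    Assumes the common hobby-servo convention of a 1-2 ms pulse mapped
    linearly over the travel range, repeated in a 20 ms frame.
    """
    angle = max(0.0, min(travel_deg, angle_deg))
    return min_ms + (max_ms - min_ms) * angle / travel_deg

# The center position sits halfway between the two pulse-width extremes,
# and out-of-range requests are clamped rather than passed through.
center = servo_pulse_ms(90)    # 1.5 ms
limit = servo_pulse_ms(400)    # clamped to 2.0 ms
```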
Stepper motors are used for very precise movement control. They differ from
many DC motors in that they are not using a static magnetic field and a current-
driven rotor. Instead, stepper motors employ permanent magnets as rotors, which
align toward stators that change their magnetic polarity depending on a current. The
precision of a stepper motor is determined by a number of parameters, mostly by
the number of stators, the current waveform used to induce their magnetic field,
and the electronic control mechanism to orchestrate the magnetic field change. A
very simple stepper consists of four stators (labeled A1, A2, B1 and B2 in Figure
14.3), two of which constitute a pair that is poled opposite to the other pair. The
control electronics determines the direction of current for each pair, depending
on the desired position of the rotor. While in this example the resolution of the
motor is limited by the on or off of the stator pairs, in a so-called microstepper
motor with, for example, 12 stators (one every ◦ rather than one every ◦ )
and with a sinusoidal control pattern (where the voltage is not only switched on or
off, but ramps up and down in each stator independently), the resolution would be
greatly increased. While stepper motors have great precision, their internal control
is difficult at high speeds, when they are also most sensitive to high loads. Stepper
motors also use significantly more power than other DC motors. They typically
require specific driver circuits, some of which use proprietary protocols, while
others follow standard hardware interfaces like SPI or I2C.
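The commutation of the two stator pairs can be expressed as a repeating sequence of coil currents. A minimal full-step sketch for the four-stator example (the signs denote the current direction through the A and B pairs; microstepping is ignored here):

```python
# One full-step cycle: current direction through coil pairs (A, B).
FULL_STEP_CYCLE = [(+1, 0), (0, +1), (-1, 0), (0, -1)]

def coil_sequence(n_steps, direction=+1):
    """Coil states for n_steps full steps; direction=-1 reverses rotation."""
    return [FULL_STEP_CYCLE[(i * direction) % 4] for i in range(n_steps)]

# Four steps advance the bipolar rotor through one electrical revolution;
# walking the cycle backwards turns the rotor the other way.
forward = coil_sequence(4)
backward = coil_sequence(4, direction=-1)
```

A driver IC performs exactly this bookkeeping in hardware, adding current ramping and (in microstepping drivers) intermediate current levels.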

14.3.3 Relays

Relays are switches that allow the control of high work voltages or AC from low-
voltage DC circuits. While so-called solid-state relays that work purely on the basis
of semiconducting material exist, this section focuses on electromechanical relays
that utilize a solenoid to exert their function (see Figure 14.4, left). In principle,
relays use small, low-current DC to control a lever that, depending on the build,
either breaks (normally closed) or closes (normally open) a secondary circuit that
operates on higher voltage or AC. The operating modes normally closed (NC) or

[Figure 14.4 diagram: left, cross-section through a relay (solenoid, spring,
pivot, nonconductive lever and flexible contacts bridging the normally open
high-voltage work circuit, actuated by the low-voltage control circuit); right,
electronic symbols for SPST, SPDT, DPST and DPDT switches.]

Figure 14.4 Relays. Left: Cross-section through a relay. When a small voltage is applied, the solenoid
pushes a piston against the back of a nonconductive lever, which is normally held in place by a spring.
The lever pivots and pushes two flexible contacts against each other. This closes the normally open
high-voltage circuit. Right: Electronic symbols for switches and relays.

normally open (NO) are purely based on the mechanics inside the relay. Some
relays can be latched or time delayed (i.e., their state remains even when the
control voltage is switched off again). Analogously to the nomenclature for manual
switches, there are relays that follow
• SPST (single pole single throw)
• SPDT (single pole double throw)
• DPST (double pole single throw)
• DPDT (double pole double throw)
logic (see Figure 14.4, right). This means that relays may not only switch a single
work circuit with one control voltage (SPST), but may also alternate between the use
of two work circuits (SPDT). The DPST and DPDT forms replicate this behavior,
but with twice the number of output circuits.
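The naming scheme is regular enough to decode mechanically; the small helper below is purely an illustration of the nomenclature, not part of any relay library.

```python
def decode_relay_code(code):
    """Decode e.g. 'SPDT' into (poles, throws): S = single = 1, D = double = 2."""
    counts = {'S': 1, 'D': 2}
    poles = counts[code[0]]   # first letter: number of poles
    throws = counts[code[2]]  # third letter: number of throws per pole
    return poles, throws

# A DPDT relay switches two independent circuits between two throws each.
poles, throws = decode_relay_code('DPDT')
```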
Characteristics that often require consideration in the selection of relays
are the operation voltage and current of the control circuit, the maximum and
continuous voltage and current ratings of the work circuits, as well as the switch
times, from latent to active, active to latent, or the maximum number of switch
operations per second. Relays are mechanical devices, so strong external force on
the package can also inadvertently activate the work circuit. For critical
applications, manufacturers therefore specify vibration and shock resistance.

14.4 OTHER FORMS OF ENERGY

Any form of energy that one can transduce from electricity can be used for output
purposes (e.g., there are various types of heating contraptions and electrical igniters
that can in principle be controlled via microcontroller output ports). Although they
depend on external power supplies that can deliver a considerable amount of current,
they are often listed as actuators for use with microcontrollers.
Chapter 15
Sensors

Sensors inform electrical systems about the world around them. In analogy to the
discussion on actuators (see Chapter 14), here we are going to focus on sensors
that are directly connected to, or are part of, an embedded device, although in
principle anything with an electrical connection can become a sensor for the IoT.
We roughly differentiate between reading the fundamental dimensions of physics
(time, location) or measuring physical (e.g., motion, temperature, current) and
chemical (e.g., particles, gases) triggers. In this book we do not cover social media
activity, although the counting of contributions with a particular hashtag on Twitter
or sentiment analysis can provide important insight for computational decision
making.
Sensors can deliver qualitative information (water level above threshold?
– yes/no) or provide quantitative, absolute readings. For the latter it is further
important to differentiate between the sensitivity and resolution of a sensor, and the
resolution of these signals when they are read into a digital computer (see Section
3.2 on A/D conversion).
For the sake of scope and brevity, we cannot discuss the very active area
of computational image recognition that allows the inference of facts from still
images or real-time video feeds. Originating from quality control applications with
simple is/is-not decisions, current technology combines high-resolution cameras
with digital image processing for the identification of entities and patterns. Modern
image recognition sees applications in the tracking of license plates on roads or
faces in crowds, as well as many other tracking and/or counting purposes. These
systems are worth mentioning as they can be obtained as simple sensors, although
internally they combine digital imaging and complex algorithms.

188 The Technical Foundations of IoT

15.1 TIME

In the context of technical and business processes, we are interested in relative
time (e.g., uptime of a system, time since last event) or local time (expressed as
an offset to Coordinated Universal Time (UTC) or Greenwich Mean Time (GMT)).
Most processors have capabilities to measure relative time on the basis of their
own clock speed, conferred by crystals (see piezo elements in Section 2.1.2.3).
In connected devices local time can either be retrieved over the Internet (using
the [Simple] Network Time Protocol, [S]NTP), or from a real-time clock (RTC)
that is independent from other system components. RTCs are usually ICs that are
default parts of larger computer systems (including many mobile devices), but are
not typically included on basic microcontrollers. In contrast to timers
in processors, RTCs have their own battery supply, run continuously and often
include additional logic to compare and adjust to world time. The most common principle
behind RTCs is counting crystal oscillations, but counters that use, for example,
the powerline frequency as a baseline exist as well. RTCs are peripherals that usually
communicate via hardware interfaces.
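For connected devices without an RTC, local time can be retrieved with a minimal SNTP query. The sketch below, in Python for illustration, sends a 48-byte SNTP v3 client request and converts the server's transmit timestamp from the NTP epoch (1900) to the Unix epoch (1970); the server name and timeout are arbitrary example values:

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800

def parse_transmit_timestamp(packet: bytes) -> float:
    """Extract the transmit timestamp (bytes 40-47 of a 48-byte SNTP reply)
    and convert it to Unix time."""
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32

def query_sntp(server: str = "pool.ntp.org", timeout: float = 2.0) -> float:
    """Send a minimal SNTP v3 client request (LI=0, VN=3, Mode=3) over UDP
    port 123 and return the server's notion of Unix time."""
    request = b"\x1b" + 47 * b"\x00"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(request, (server, 123))
        reply, _ = s.recvfrom(48)
    return parse_transmit_timestamp(reply)
```

A production client would also compensate for network round-trip delay, which SNTP supports via the originate and receive timestamps in the same packet.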

15.2 LOCATION

Localization technologies are divided into global and indoor localization systems.
The reason for this division is simple: global navigation satellite systems (GNSSs)
ideally require unobstructed line-of-sight connections between the satellites and
their respective receiver for best performance. A reliable GNSS fix is based on
at least four satellites, which is often difficult to achieve indoors due to signal
attenuation and multiple reflections. Indoor positioning systems (IPSs) can utilize
locally deployed anchor points, such as WiFi routers or Bluetooth beacons, whose
signals are accordingly stronger.
The key principle behind both technology types is signal tri- or multilater-
ation: Given the distances d1, d2, ..., dn to three or more points in space, it is
possible to infer a unique position from the intersection of the spheres of these
radii around the points. In the case of GNSS, these are the relative distances from
orbiting satellites with precisely known positions around the globe. These
distances can be inferred from the time differences of arrival of precisely time-
aligned messages from different satellites at the receiver. For indoor localization,
the relative distances are inferred from either signal runtime or from the amount
of attenuation experienced by signals of known strength.
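The lateration step itself is simple algebra. As a sketch (not tied to any particular GNSS or IPS), the planar case with three anchor points can be solved by subtracting one circle equation from the others, which linearizes the problem:

```python
def trilaterate_2d(anchors, dists):
    """Planar trilateration. anchors = [(x0, y0), (x1, y1), (x2, y2)] are known
    positions, dists the measured ranges to each. Subtracting the first circle
    equation from the other two removes the quadratic terms, leaving a 2x2
    linear system solved here with Cramer's rule."""
    (x0, y0), (x1, y1), (x2, y2) = anchors
    d0, d1, d2 = dists
    a11, a12 = 2 * (x1 - x0), 2 * (y1 - y0)
    a21, a22 = 2 * (x2 - x0), 2 * (y2 - y0)
    b1 = (x1**2 + y1**2 - x0**2 - y0**2) - (d1**2 - d0**2)
    b2 = (x2**2 + y2**2 - x0**2 - y0**2) - (d2**2 - d0**2)
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("anchors must not be collinear")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)
```

Real receivers solve the three-dimensional, over-determined version of this system by least squares, and GNSS additionally estimates the receiver clock offset as a fourth unknown, which is why at least four satellites are needed.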

15.2.1 Global Localization

There are currently four global GNSS providers, as well as regional systems such as
the Indian Regional Navigation Satellite System (IRNSS) with fewer satellites and
reduced global reach. The four main systems are:
• Navigational Satellite Timing and Ranging, more commonly called Global
Positioning System (NAVSTAR / GPS)
• Global Navigation Satellite System (GLONASS)
• Galileo
• BeiDou Navigation Satellite System (BDS)
NAVSTAR/GPS (or just GPS) is the oldest commercially available GNSS. It is
owned and operated by the United States government, who established it in the late
1970s for military purposes. Since 1996, GPS has been made publicly available as
dual-use technology, first with deliberately degraded spatial resolution for civilian
use, and since 2000 with the full resolution that had previously been reserved for
the United States military. Since it was first established, GPS has used more than 70
satellites, with 31 currently in orbit and active. The Russian GLONASS system went
through a phase of inactiveness after its establishment in the mid-1990s. Today
there are 26 healthy satellites, making GLONASS the second most used GNSS
in the world since 2008, with a resolution comparable to that of GPS. Both
Galileo, operated by the European Space Agency, and BDS, from the China
National Space Administration, have been established to be independent from the
GNSS provided by the former superpowers. They still have fewer active satellites,
but are in a deployment phase that will make them equivalent competitors to GPS
and GLONASS in the future.
A comparison of the precise mechanisms behind these GNSSs is beyond the
scope of this book. Suffice it to say that the different GNSSs differ in the way
they generate time-aligned messages, and in how their respective messages are
encoded and transmitted. In principle the messages of different GNSSs can
be combined, with some experimental chipsets suggesting a resolution in the
centimeter range under optimal conditions. While these quad-constellation chipsets
are still maturing, GPS receivers have become cheap commodities that exist
in IC and breakout form factors. In the latter case, they often feature a simple serial
interface that communicates so-called National Marine Electronics Association
(NMEA) sentences in raw text, such as

$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47

The first field indicates the type of message (here: GPGGA, Global Posi-
tioning System Fix Data), followed by the time (12:35:19 UTC) and the latitude and
longitude (48° 7.038' North, 11° 31.000' East). This sentence further shows that
a fix from 8 satellites has been obtained. There are two distinct measures of altitude:
The position is 545.4 m above mean sea level, and the local mean sea level in turn lies
46.9 m above the WGS84 ellipsoid (World Geodetic System 1984). The sentence
finishes with a checksum (*47).
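Decoding such a sentence is straightforward string processing. A minimal Python parser for exactly this GPGGA field layout (a production parser would handle the many other NMEA sentence types and empty fields) might look like this:

```python
def nmea_checksum(sentence: str) -> str:
    """XOR of all characters between '$' and '*', as two uppercase hex digits."""
    body = sentence[1:sentence.index("*")]
    cs = 0
    for ch in body:
        cs ^= ord(ch)
    return f"{cs:02X}"

def parse_gpgga(sentence: str) -> dict:
    """Convert a $GPGGA sentence into decimal degrees and meters.
    Latitude is ddmm.mmm, longitude dddmm.mmm; minutes are divided by 60."""
    fields = sentence.split(",")
    lat = float(fields[2][:2]) + float(fields[2][2:]) / 60
    if fields[3] == "S":
        lat = -lat
    lon = float(fields[4][:3]) + float(fields[4][3:]) / 60
    if fields[5] == "W":
        lon = -lon
    return {
        "time_utc": fields[1],
        "lat": lat,
        "lon": lon,
        "satellites": int(fields[7]),
        "altitude_msl_m": float(fields[9]),
        "geoid_separation_m": float(fields[11]),
    }
```

Applied to the sentence above, the parser yields a latitude of about 48.1173° and a longitude of about 11.5167°, and the computed checksum matches the transmitted `47`.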
It is noteworthy that unfiltered GPS positions often scatter around the actual
position. In practice GPS coordinates often see progressive averaging using a
Kalman filter (see Section 24.2.1.2), and especially in automotive applications, a
mapping to the closest street.

15.2.2 Indoor Localization

GNSS chipsets are becoming increasingly sensitive and can, under certain condi-
tions, provide useful data even when used inside buildings. Mobile phone providers
have long supported law enforcement agencies with multilateration data based on
the relative signal strength to cell towers, a strategy that works in both directions
and that also enables mobile phone manufacturers to improve location data.
However, mobile phone multilateration is crude, and although cellular signals
penetrate buildings better, it is not a method of choice for indoor localization.
WiFi, Bluetooth or other radio multilateration strategies can be used to de-
termine the position of a device relative to anchor points (or beacons) supporting
these standards. One entry point for radio-based localization is the aforementioned
received signal strength indication (RSSI). However, RSSI values can vary signif-
icantly between even nearby positions, requiring multiple sample points for reli-
able multilateration. Depending on the number of base stations, the resolution of
RSSI-based solutions is of the order of several meters for WiFi, and somewhat better
for Bluetooth. On the basis of reference points within a building, one can also establish
a database of RSSI value sets for each position and use the measured RSSI values
for a quick lookup. This strategy is commonly referred to as fingerprinting, and it
further improves the achievable resolution. While the RSSI value can change dramatically when
a radio signal penetrates building structures, its propagation time is hardly affected. With
specialized WiFi and Bluetooth devices, it is possible to determine the time-of-
flight differences that the signal requires between a point of interest and several
base stations. As the signal propagates at the speed of light, the synchronization
and communication between the various base stations becomes the bottleneck of

the procedure. Alternatively, base stations with antenna arrays (minimally three; in
practice up to eight) can determine the angle-of-arrival (i.e., the time-of-flight dif-
ferences between antennas with known distance to each other allow the application
of simple trigonometry to infer the source of the signal).
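The fingerprinting strategy described above reduces to a nearest-neighbor lookup over stored RSSI vectors. A toy sketch (anchor names and dBm values are invented for illustration):

```python
def nearest_fingerprint(measured, database):
    """Return the location whose stored RSSI fingerprint is closest to the
    measured one in Euclidean distance. `measured` maps anchor id -> RSSI (dBm);
    `database` maps location name -> such a fingerprint dict. Only anchors
    present in both fingerprints contribute to the distance."""
    def distance(fp):
        common = measured.keys() & fp.keys()
        if not common:
            return float("inf")
        return sum((measured[k] - fp[k]) ** 2 for k in common) ** 0.5
    return min(database, key=lambda loc: distance(database[loc]))
```

Practical systems collect several fingerprints per location and average over the k nearest entries to smooth out the RSSI fluctuations mentioned above; this sketch keeps only the single closest match.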
Indoor positioning systems have not yet reached the level of maturity that
GNSSs have. There is no prevalent standard for WiFi positioning yet and, while
indoor asset or people tracking is technically possible, at this stage choosing a WiFi
IPS comes with the risk of vendor lock-in. Both Apple and Google, in their
role as mobile phone manufacturers, have proposed Bluetooth beacon standards for
indoor localization: iBeacon and Eddystone, respectively.

15.3 PHYSICAL TRIGGERS

In everyday sensor devices, physical triggers are those that are based on the
fundamental gravitational and electromagnetic forces: kinetic force, light and sound,
temperature, and current.

15.3.1 Position, Motion and Acceleration

Motion is the positional change of an object, and acceleration is the change of its
velocity over time. The simplest motion sensors are mechanical
tilt switches, in which a small metal ball within a cylindrical container rolls toward
or away from two contacts, indicating movement by establishing or interrupting
an electrical circuit. These two-leaded components often only work reliably in one
particular orientation. However, there are six degrees of freedom that characterize
the motion of an object:
• Translation (∆):
1. Up or down
2. Left or right
3. Forward and backward
• Rotation (θ, relative to the forward/backward axis):
1. Pitch (up or down)
2. Roll (left or right)
3. Yaw (rotating around the vertical axis)
Along with the three-dimensional orientation of the object, this yields nine degrees
of freedom: the translations ∆x, ∆y and ∆z, the rotations θx, θy and θz, and the
orientation relative to the x-, y- and z-axes. The first electronic iner-
tial navigation systems utilized gyroscopic platforms that maintained a fixed po-
sition independent of orientation. On these platforms were linear accelerometers
and mechanical gyroscopes, which provided an analogue readout of translation and
rotation on the basis of rotary encoders and resistance. German V2 rockets used
such systems for guidance, with analog computers continuously integrating the
readings to infer motion and rotation. Historically, the applications of inertial navigation in
military missiles, human space flight and aircraft drove their development
and highlighted the need for smaller digital computers to process their information.
Modern inertial measurement units (IMUs) combine data from an accelerom-
eter, gyroscope and magnetometer to infer the relative position of an object and
its movement in space. These devices are often marketed as 9DoF sensors, as they
hold a 3-axis accelerometer, a 3-axis gyroscope and a 3-axis magnetome-
ter. Some manufacturers add barometric pressure sensors as an indication of altitude.
These IMUs are often combined with location data from GNSS or IPS (see Section
15.2) in a computational process referred to as sensor fusion, allowing the respective
methods to augment each other and thus increasing the resolution and reliability
of a location fix.
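The reference method for such fusion is the Kalman filter (Section 24.2.1.2). A much simpler complementary filter illustrates the underlying idea of blending a drift-free but noisy accelerometer angle with a smooth but drifting integrated gyroscope rate; this is an illustrative sketch, not the algorithm used by any particular IMU:

```python
import math

def accel_pitch(ax, ay, az):
    """Pitch angle in degrees from the gravity vector measured by the
    accelerometer (noisy, but does not drift over time)."""
    return math.degrees(math.atan2(-ax, math.sqrt(ay * ay + az * az)))

def complementary_filter(pitch_deg, gyro_rate_dps, ax, ay, az, dt, alpha=0.98):
    """One filter step: mostly trust the integrated gyroscope rate (smooth,
    but drifting), and pull the estimate gently toward the accelerometer
    angle to cancel that drift."""
    gyro_estimate = pitch_deg + gyro_rate_dps * dt
    return alpha * gyro_estimate + (1 - alpha) * accel_pitch(ax, ay, az)
```

For a stationary, level device the filter converges toward zero pitch regardless of the initial estimate, which is exactly the drift correction that a raw gyroscope integration lacks.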
Most current consumer-grade IMUs utilize micro-electro-mechanical systems
(MEMS), which feature mechanical components that are manufactured on-chip.
The mounts and springs in the schematic (see Figure 15.1) were exactly that
in the past, but today are flexible structures with known properties directly etched
into the chip. While in the depicted accelerometer and gyroscope the capacitative
difference between two plates provides an indication of translation and rotation,
alternative builds are based on piezoelectric readout. The inference of the 9DoF
requires fast computation, including appropriate filtering (e.g., Kalman; see Section
24.2.1.2) to average out any stochastic noise in the system. Low-price components
may not possess the computing power to infer robust readings. Most IMUs
(as well as standalone accelerometers and gyroscopes) communicate via hardware
interfaces, allowing the rapid transfer of data to a processing unit. Most IMUs
feature sample rates of up to several kHz, and are available on breakout/evaluation
boards, although at their highest integration, some packages measure only a few
millimeters along each side, including the sensor fusion unit. Their power consumption
in active operation is typically very small.
Specialist IMUs can detect minute vibrations (i.e., extremely small accelerations
and angular rotations). Specialist gyroscopes are indeed widely used in
aviation and underwater navigation: Ring laser and fiberoptic gyroscopes uti-
lize the so-called Sagnac effect, the rotation-dependent interference of polarized
light. Other sensor types exploit the change of conductivity in well-stirred elec-
trolytic solutions. However, none of these aforementioned methods is compatible
with the demands of the mass market. Standard MEMS suffer from sensor drift,
which means that the real position and orientation have to be determined by alter-
native means on a regular basis. In environments where this is not possible (e.g.,
unmanned underwater vehicles), the more exotic and significantly more expensive
methodologies are used.

15.3.2 Force and Pressure

Force is the interaction that changes the motion of an object, and pressure is the
force generated by collisions of liquid and gas particles inside a closed container.
The more particles are within a given volume, the more collisions happen. Pressure
is defined as the force F that acts perpendicular to a surface A, or p = F/A.
There are two commonly used reference systems with respect to pressure: sci-
entific and technical applications are concerned with the full scale of pressure,
ranging from ultrahigh vacuum to high overpressure, measured with absolute pres-
sure sensors, and gauge pressure sensors that indicate over- or underpressure rela-
tive to barometric pressure. There are further sealed gauge sensors that maintain a
fixed internal reference to indicate normal atmospheric pressure independent of loca-
tion. Absolute pressure sensors with a narrow pressure range are now also routinely
used in mobile phones, where they serve as simple altimeters to determine
the relative vertical movement of a person over time.
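The altimeter conversion from pressure to height commonly uses the international barometric formula. The sketch below uses the standard-atmosphere constants and ignores weather-induced changes in sea-level pressure, which real devices calibrate against:

```python
def pressure_altitude_m(p_hpa: float, p0_hpa: float = 1013.25) -> float:
    """International barometric formula: altitude in meters above the
    reference pressure level p0 (standard sea-level pressure by default).
    The constants 44330 m and 5.255 stem from the standard atmosphere model."""
    return 44330.0 * (1.0 - (p_hpa / p0_hpa) ** (1.0 / 5.255))
```

Because only relative changes matter for tracking vertical movement, a phone can pick any recent reading as p0 and report height differences from there, sidestepping the unknown absolute sea-level pressure.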
The most common electric sensors for general-purpose pressure measurement
operate on the basis of the piezo effect (see Section 2.1.2.3), either by increas-
ing resistance when the sensor surface is deformed by pressure or by following the
piezo generator principle. The plastic versatility of piezoelectric material allows
the implementation of micrometer-scale sensing platforms with MEMS technology (e.g.,
for barometric pressure sensors). In combination with mechanical force collectors,
capacitative sensors in which the plate distance is dependent on the pressure (see

Figure 15.1 9DoF measurements. The principle behind the accelerometer (top, left), gyroscope (top, right)
and magnetometer (right). An accelerometer determines motion in one direction. In this example, two
measurements are taken for replication, by reading the capacities c1 and c2 that are dependent on the
distances d1 and d2 between the tethered outer plates and the fixed middle plate (see also Capacitance,
Section 1.1.3.3). Knowing the mass of the outer plates as well as the resistance provided by the
mounting springs, one can determine the acceleration in the direction of measurement from the ratio
between c1 and c2. The gyroscope is an extension of this principle: Here, the springs also allow lateral
movement, and in operation the upper and lower plates oscillate continuously between left and right.
If the device is rotated around its axis, the plates deflect upward and downward because of their inertia,
and the change in the c1/c2 ratio allows the inference of the angular rate. Both accelerometer and gyroscope
work in macroscopic implementations as well as in MEMS devices where they are built on-chip. A
magnetometer utilizes the deflection of electrons in the presence of an external magnetic field. When
parallel to a magnetic N-S axis, electron flow is in straight lines around the needle. When exposed to
a magnetic field, electron flow is deflected either toward or away from the external field, depending on
the direction of current. This deflection can be measured with a Hall effect sensor, inducing a potential
difference between two probe points in the needle that is proportional to the direction and degree of
deflection.

Section 1.1.3.3), or inductive sensors in which pressure is translated into the move-
ment of a wire in a magnetic field (see Section 1.2.2), are used in some applications.
Pressure sensors for industrial applications sometimes feature analog output, which
needs to be calibrated to give useful readings. For use in consumer devices, digital
pressure sensors with internal calibration are more prevalent. These can have very
small footprints of just a few millimeters along each side if not mounted on break-
out/evaluation boards, communicate via field bus systems, and often feature current
consumption of the order of magnitude of a few microamperes.
Sensors with a larger surface area that are subject to higher force, as used
in scales (remembering that weight is just the mass-dependent gravitational downward
force) or in mechanical flex sensors, often follow the principle of piezoelectric
sensors, although they use conductive polymers that see an increase of resistance
when deformed.

15.3.3 Light and Sound

Light and sound have different physical underpinnings. However, as they stimulate
the two most important human senses, they are typically discussed together.

15.3.3.1 Sound

Sound waves are periodic local air pressure changes. The human ear can hear sound
waves of frequencies between about 20 Hz and 20 kHz. The amplitude of the waves is
what we perceive as loudness. As such, the detection of sound very much follows
the principles of detecting any other force or pressure.
In microphones the force of the sound waves is collected on a diaphragm.
The vibration of the diaphragm is then translated into electrical changes by electro-
magnetic induction (a reverse loudspeaker), a change in capacity (as in capacitative
pressure sensors; see Section 15.3.2) or by piezoelectronics (see Section 2.1.2.3).
The more dynamic this initial sampling is, the better subsequent electronics and/or
software can, after further amplification, decompose the sound wave into its distinct
frequency components, differentiate amplitude and deal with noise.
The simplest sound detector is therefore a mono-channel microphone with
only the most basic amplification and analog output. The other extreme is highly
specialized digital devices with high frequency resolution across the audible spec-
trum and a vast dynamic range, which with at least two microphones can clearly
represent spatial differences in the signal (stereo). For use with embedded systems,
the choice is often a sound detector that combines a simple microphone, an ampli-
fier circuit, and potentially some processing such as thresholding. The latter can be
an especially useful feature: even in the absence of background noise a micro-
phone/amplifier circuit generates output, and defining a threshold at the hardware
level makes it possible to control for that.
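Such thresholding can equally be applied in software once the microphone signal has been digitized. A minimal sketch (the threshold value and sample amplitudes are arbitrary illustrations):

```python
def rms(samples):
    """Root-mean-square amplitude of one block of audio samples."""
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def sound_detected(samples, noise_floor):
    """Crude detector: report sound only when the block's RMS level rises
    above a previously calibrated noise floor, suppressing the output that
    a microphone/amplifier circuit produces even in silence."""
    return rms(samples) > noise_floor
```

The noise floor would be calibrated by recording a few blocks of ambient silence and taking their maximum RMS value, possibly with a safety margin.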
Sound is occasionally used in industrial environments to detect machine
failure. However, in this case sound is only a secondary effect that is caused by
the vibration of components. The detectors used to measure this type of sound often
resemble the MEMS found in IMUs (see Section 15.3.1). Laser scanning and
optoacoustic measurements are also employed in these specialist applications.

15.3.3.2 Light

Visible light, as well as the near-UV and infrared ranges of the spectrum, can be
detected with a variety of simple electric components, for example light-dependent
resistors (LDRs) or photo diodes (see Section 2.1.1).
Photoresistors on the basis of cadmium sulfide (CdS) as well as lead sulfide
(PbS) see widespread use as light detectors. CdS cells show their best sensitivity in
the middle of the visible spectrum, whereas PbS cells have peak sensitivity in the infrared
and may only be useful for visible light of strong intensity. However, these compo-
nents only allow the qualitative detection of light or, after calibration, some crude
quantitative approximation of the light level (see the exemplary Table 15.1). Both CdS-
and PbS-based components are not Restriction of Hazardous Substances Directive
(RoHS)-compliant. Because of their poor UV performance, light-dependent resis-
tors are not used for the detection of UV-A/B radiation. The solution is photo diodes
coupled with amplifier circuits, which makes UV detection more expensive than
other light-level measurements. While LDRs are passive components, the opera-
tion of a UV detector can draw current in the milliampere range. To measure light
quantitatively, both the overall intensity as well as the spectral intensity, which differs
over frequency, are relevant. Most digital luminosity sensors are based on photo
diode/amplifier/ADC integration, featuring output that is precalibrated by the man-
ufacturer (e.g., for output in lux or watts per cm²). Color sensors are devices that
follow the principle of luminosity sensors, but utilize filters to obtain different read-
ings for red, green and blue. In contrast to the aforementioned sensors that report
just one light level (e.g., at the peak of their sensitivity, corresponding to yellow
light), the separation into three color channels makes it possible to analyze the color even of
reflected light. These devices often feature fast sampling rates and communicate via
hardware interfaces like SPI or I2 C.

Table 15.1
Light Levels and Light-Dependent Resistance

Light Level                Lux

Darkness                   0
Moon light                 1
Room light                 10
Outside, dawn, overcast    100
Outside, day, overcast     1,000
Outside, sunny             10,000

For a typical CdS cell, the resistance falls with rising light level, from the megohm
range in darkness down to well below a kilohm in bright sunlight.
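A calibration table like Table 15.1 can be turned into an approximate lux reading by interpolating on a log-log scale, where an LDR's response is roughly a straight line. The calibration pairs below are invented placeholders; real values must come from the data sheet of the specific component:

```python
import math

# Hypothetical calibration pairs for one particular CdS cell:
# (resistance in ohm, illuminance in lux). Replace with data-sheet values.
CALIBRATION = [(200_000.0, 1.0), (20_000.0, 10.0),
               (2_000.0, 100.0), (200.0, 1_000.0)]

def lux_from_resistance(r_ohm):
    """Interpolate between calibration points in log-log space; readings
    outside the calibrated range are clamped to the nearest endpoint."""
    pts = sorted(CALIBRATION)  # ascending resistance
    if r_ohm <= pts[0][0]:
        return pts[0][1]       # brighter than the brightest calibration point
    if r_ohm >= pts[-1][0]:
        return pts[-1][1]      # darker than the darkest calibration point
    for (r1, l1), (r2, l2) in zip(pts, pts[1:]):
        if r1 <= r_ohm <= r2:
            t = (math.log(r_ohm) - math.log(r1)) / (math.log(r2) - math.log(r1))
            return math.exp(math.log(l1) + t * (math.log(l2) - math.log(l1)))
```

Note that for an LDR the mapping is inverted: low resistance corresponds to bright conditions, so the sorted list starts with the high-lux entries.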

It is important to note that most scanner devices (e.g., barcode and laser scanners) as
well as camera systems are based on the same principles as simple light sensors,
although, especially for the latter, at a much higher level of integration with many
millions of sensor pixels per chip.

15.3.4 Temperature

Temperature is a measure of the average kinetic energy of the particles in a substance.
As a key physical property, there is a wealth of scientific and technical methodologies
for the precise direct or indirect measurement of temperature. The most common electrical
temperature sensors are:
• Bandgap temperature sensor
• Temperature-dependent resistor (thermistor)
• Resistance temperature detectors (RTDs)
• Thermocouple

Bandgap Temperature Sensor

If the semiconductor components of a circuit are carefully characterized, the
bandgap effect allows temperature to be measured at practically no extra cost. The
forward voltage of a silicon diode is temperature-dependent; that is, knowing the
voltage allows the inference of the temperature using Boltzmann's constant and the
electron charge. The so-called Brokaw bandgap reference can be implemented with
just two transistors, four resistors and a comparator. This allows the integration of
temperature sensors in microprocessors and very small IC packages. Commands to
access the respective values are often part of standard microprocessors, but the
measurements may only be useful for crude approximations of temperature.

Thermistors and Resistance Temperature Detectors

Thermistors are amongst the cheapest components for temperature measurement.
Depending on their type, these resistors either see a decrease of resistance (negative
temperature coefficient [NTC]) or an increase of resistance (positive temperature
coefficient [PTC]) with rising temperature. As with light-dependent resistors, the
resistance needs to be calibrated for the conversion between resistance and tem-
perature. Both NTCs and PTCs show nonlinear behavior (i.e., the simple interpo-
lation of temperature given two resistance/temperature pairs is only valid over a
small temperature range). The Steinhart-Hart equation is a third-order approxima-
tion, 1/T = A + B·ln(R) + C·(ln(R))³, and manufacturer data sheets typically provide
the coefficients A, B and C for a given product.
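Applying the Steinhart-Hart equation is a one-line computation. The coefficients below are illustrative values for a nominal 10 kΩ NTC, not taken from any specific data sheet:

```python
import math

def steinhart_hart_kelvin(r_ohm: float, a: float, b: float, c: float) -> float:
    """Steinhart-Hart: 1/T = A + B*ln(R) + C*(ln(R))**3,
    with R in ohm and the returned temperature T in kelvin."""
    ln_r = math.log(r_ohm)
    return 1.0 / (a + b * ln_r + c * ln_r ** 3)

# Example coefficients for a nominal 10 kOhm NTC (illustrative only;
# take A, B and C from the data sheet of the actual part):
A, B, C = 1.009249522e-3, 2.378405444e-4, 2.019202697e-7
```

With these example coefficients, a reading of 10 kΩ maps to a temperature near room temperature, and, as expected for an NTC, lower resistance values map to higher temperatures.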
NTCs are commonly used for measurement applications, whereas PTCs
more often see applications where a cutoff of electricity is beneficial above a
certain temperature threshold (i.e., control or protection circuits). Intrinsically,
NTCs follow the principle of bandgap temperature sensors (i.e., the increase in
temperature gives rise to more charge carriers in a semiconductive polymer).
PTCs are made from ceramics whose resistance rises suddenly with an increase in
temperature, lending themselves to protective applications. Thermistors are related
to resistance temperature detectors, but are preferred over RTDs in moderate
temperature ranges for their greater precision.
RTDs utilize coils of elementary metal wire wrapped around a nonconductive
core (coil-element) or meandering on a nonconductive surface (thin-film). As the
material-specific conductivity per mass/volume and temperature is known, the
resistance provides a good approximation of temperature, with the length of the wire
compensating for imprecisions of production as the relative error decreases. RTDs
cover a greater temperature range than thermistors. The
so-called standard platinum resistance thermometer (SPRT) is the gold standard for
electronic temperature measurement, with accuracies of small fractions of a degree.
Because of the cost of platinum, the SPRT is not widely used. Most RTDs therefore
compensate for accuracy by using sophisticated circuits that aim to minimize the
systematic error of a simple voltage divider for the determination of resistance.

Table 15.2
Examples of Thermocouple Typology

Type   Conductors
J      Iron-constantan
K      Chromel-alumel
S      Platinum-rhodium
T      Copper-constantan

Each type is specified with a working temperature range and with guaranteed
accuracies for two tolerance classes (Class 1 and Class 2).

Thermocouple

The Seebeck effect describes the electric potential difference that can be observed
when two conductors of different materials and different temperatures are connected.
As a rule of thumb, this thermoelectric effect amounts to a few tens of microvolts
per kelvin of temperature difference (see the Peltier element for further information,
Section 2.1.2.3). Thermocouples are components that utilize this effect, in that they
keep part of the instrument at room temperature (reference temperature) and expose
the other part of the instrument to the temperature that is to be measured.
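To first order, the measured voltage therefore maps linearly to the temperature difference between the two junctions. The sensitivity used below (41 µV/K, roughly that of a type K couple near room temperature) and the cold-junction handling are simplifications; real instruments use per-type polynomial characteristics:

```python
def thermocouple_temp_c(v_volts: float, t_reference_c: float,
                        seebeck_uv_per_k: float = 41.0) -> float:
    """First-order model: the thermocouple voltage is proportional to the
    temperature difference between the measuring junction and the reference
    junction, so T_hot = T_ref + V / S with sensitivity S in uV/K."""
    return t_reference_c + (v_volts * 1e6) / seebeck_uv_per_k
```

A reading of 4.1 mV against a 25 °C reference junction thus corresponds to a measuring junction at about 125 °C under this linear model.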
Depending on the conductive metals being used, there is a typol-
ogy of thermocouples with different optimal temperature ranges, levels of accu-
racy/tolerance and resistance to harsh environments (see Table 15.2). Thermocou-
ples are primarily used in industrial settings, which led to the typology and also
to an ecosystem around mount and connector standards. They see continuous deploy-
ment at very high temperatures; tungsten/rhenium-alloy couples in particular can
withstand extreme heat in the absence of oxygen. Although thermocouples achieve
relatively high accuracy at these high temperatures, RTDs have replaced them for
many applications at more moderate temperatures.

Pyrometric Reading

Hot objects emit infrared radiation whose intensity decays over distance.
Pyrometers are often handheld remote-sensing devices that determine the distance
and infrared intensity level, allowing (a) convenient handling, (b) quick and safe
measurement and (c) the determination of temperatures at which physical probes
would melt (e.g., in the smelting industry).

Digital Temperature Reading

Thermistors, RTDs and thermocouples are intrinsically analog devices. With their
requirement for calibration, they are not very practical in applications where soft-
ware adjustments are difficult. This has led to the development of digital thermome-
ters that take care of the conversion and communicate their results over industrial
fieldbus systems.

15.3.5 Current

All ammeters for the direct measurement of current in a circuit are based on Ohm's
law (see Section 1.1.3.4) (i.e., via I = V/R, knowing voltage and resistance). The
choice of shunt resistor and the caveats when trying to measure very small currents
are best left to specialist textbooks.
The indirect measurement of current is often relevant in building automation
and energy cost monitoring, when the current draw of a third-party device or the
overall draw through the primary electricity supply is to be determined. In both
cases one measures the magnetic field around the wire, which is proportional to the
current it carries (see Section 1.2.2). There are two main strategies to determine the
magnetic field: measuring the Hall effect around AC and DC supply lines, and the
inductive effect around AC lines:
• Hall effect sensors require the precise orientation of a sensor plate orthogonal
to the magnetic field around the supply line. The plate itself is supplied with a
constant current, which yields a potential difference when the magnetic field
permeates it. This is the analog output signal.
• Inductive current meters are coils that are positioned around the supply line
like a cuff. As the current in the line changes direction constantly (typically at
50-60 Hz), the buildup and decay of the magnetic field induces an electric
current that can be converted into a voltage, which is proportional to the
strength of the magnetic field.
Both methods are calibrated to the size and geometry of their respective imple-
mentations. The voltages that are generated are often small, potentially requiring
amplification, which is provided on many devices in the breakout/evaluation form
factor.

15.4 CHEMICAL TRIGGERS

For the purpose of this book, we define chemical sensors as those that recognize the
presence of compounds (e.g., dust or smoke) and/or determine or quantify their
composition (e.g., the amount of nitric oxide in air).
It is the very nature of chemistry as a science to develop precise and accurate
methods for the sensitive detection and quantification of atoms and molecules.
Given that any device that yields some sort of electrical output may count as a
sensor, most specialist analytical instruments utilized in scientific research could
be classified as such. Many devices used in bioanalytics are chemical sensors;
although, for example, instruments for DNA sequence analysis have shrunk from
the size of refrigerators to that of USB sticks (lab-on-chip), they remain unusual
companions for microcontrollers and embedded systems.
There are more than 100 million substances indexed in the Chemical Abstract
Service (CAS). While chemically their identification is sometimes trivial in the
laboratory, there are massive constraints on our ability to detect and measure them
electronically. In most cases, detection relies on an indirect proof of the presence of
a particular substance; for example, the release of a proton following a specific
reaction leads to a change in pH (acidity), and this can be measured with an
electronic probe. This may require manual sample preparation, the exchange of
consumables and the cleaning of the device after use.
Here, we focus on the functioning of chemical sensors that are straightforward
to use with conventional microcontrollers and that do not necessitate any manual
handling.

15.4.1 Solid Particles

Particles are agglomerates of matter that are not defined further here. They are usually carried
in a medium, such as air or water. The number of dust and smoke particles in air
can be counted electronically, which sees applications in smoke detectors and in
environmental monitoring.
202 The Technical Foundations of IoT

Smoke

Traditional fire alarms utilized the radioactive ionization of air in a small gap. This
allowed an extremely small current to flow, typically amplified using a transistor.
If smoke particles bound to the ionized air molecules, the current dropped and activated an
alarm. Ionization alarms are environmentally problematic and are being progressively
replaced by photoelectric or carbon-monoxide alarms. Photoelectric sensors utilize
a small light (e.g., LED) and a photo diode that are mounted so that no light can
fall onto the detector. If smoke particles enter the casing, some light is scattered
on the particle and activates the photodetector. A problem for both ionization
and photoelectric detectors is differentiating fire smoke from normal dust, particles
contained in cigarette smoke, or simple water vapor. Many household detectors
therefore only test for carbon monoxide, which is a side product of burning organic
substances, or require both carbon monoxide and smoke detection and occasionally
heat to trigger an alarm.

Dust and Particulate Matter

Environmental and industrial applications require the counting of particles in a
medium, such as air or water. In contrast to smoke detectors that only focus on
detection, the precise quantitation of the number of particles is key when monitoring
product quality or pollution levels. There are three main strategies:
• Extinction
• Scattering
• Coulter effect
After particles above a certain size have been removed from a liquid with a filter/sieve,
the medium passes through the detection gap. For light scattering and Coulter
effect detection this gap is ideally dimensioned so that only one particle at a time
can pass through, triggering an event for each particle. For extinction devices,
the number of particles can be inferred from the amount of light that is being
blocked by the particles in the light path across a gap of defined width. For very
precise measurements, the method has to be calibrated to the extinction coefficients
of the most common particles. Extinction is often used for larger particles; for
example, for quality control in chemical plants where petrochemical produce can
be contaminated with particles in the micrometer range. The method is less sensitive than light
scattering, which works well for measuring particles in ultrapure water down to
the size of bacteria and large viruses. Devices using the Coulter effect test for the
loss of conductivity when particles pass through a pore in the semipermeable membrane
of a galvanic cell (see Section 13.3.1): each passing particle briefly blocks the pore,
leading to a brief interruption of the current.
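The extinction approach follows the Beer-Lambert law, I/I0 = exp(−σNL): if the extinction cross-section σ of the dominant particle type is known from calibration, the particle number density N can be inferred from the fraction of transmitted light. A minimal sketch, with hypothetical numbers:

```python
import math

def particle_density(transmitted_fraction, sigma_m2, path_m):
    """Infer particle number density (particles per m^3) from the fraction
    of light transmitted across the measurement gap, assuming Beer-Lambert
    extinction: I/I0 = exp(-sigma * N * L)."""
    return -math.log(transmitted_fraction) / (sigma_m2 * path_m)

# Hypothetical example: 90% transmission across a 1 cm gap for particles
# with a calibrated extinction cross-section of 1 um^2 (= 1e-12 m^2)
print(f"{particle_density(0.90, 1e-12, 0.01):.2e} particles per m^3")
```

The cross-section is exactly the quantity that the calibration against the most common particle types provides.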
Dry air measurements are mostly built around extinction/scattering detec-
tion. These devices can determine fine particles below 2.5 µm (particulate matter, PM2.5)
and coarse dust below 10 µm (PM10) as major industrial and urban pollutants. Many
consumer-grade devices do not allow counting the actual number of particles, but
use the opacity/reflectivity of air as a relative indicator of contamination over time,
facilitated by the slow movement of air with a small fan. Laser-based devices allow
more quantitative readings by quasi-imaging of particles, utilizing different beam
widths to distinguish between distinct particle sizes. Here, the air is moved through
a laser beam and the loss of signal intensity is proportional to the size of the particle
as it blocks out part of the circular beam. Assuming spherical particles, their num-
ber and overall volume and weight can be approximated. While cheap laser-based
devices are available in the price range of expensive extinction/scattering sensors,
their comparatively high operating current can limit their utility for mobile deployment.
The accuracy of these sensors depends to a large degree on the volume of
medium that is being tested for a reading, and on the speed at which the test is performed.
There is considerable debate whether diffusion into a precisely dimensioned sensor
chamber is sufficient, or whether a steady flow of medium in the range of several
liters per hour should actively pass through a sensor unit.
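The approximation of particle mass from counts mentioned above can be sketched as follows; the size bins, counts and the assumed bulk density are hypothetical:

```python
import math

def pm_mass_concentration(counts_per_m3, diameters_um, density_kg_m3=1650.0):
    """Approximate a mass concentration (ug/m^3) from per-size-bin particle
    counts, assuming spherical particles of a typical bulk density
    (the default density here is an assumption, not a universal constant)."""
    total_kg_per_m3 = 0.0
    for count, d_um in zip(counts_per_m3, diameters_um):
        radius_m = d_um * 1e-6 / 2.0
        volume_m3 = (4.0 / 3.0) * math.pi * radius_m ** 3  # sphere volume
        total_kg_per_m3 += count * volume_m3 * density_kg_m3
    return total_kg_per_m3 * 1e9  # kg/m^3 -> ug/m^3

# Two hypothetical size bins centered at 0.5 um and 2.5 um
print(round(pm_mass_concentration([5e7, 1e6], [0.5, 2.5]), 1), "ug/m^3")
```

Note how strongly the cubic dependence on diameter weights the larger bin: far fewer coarse particles dominate the mass estimate.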

15.4.2 Humidity

Humidity describes the amount of water vapor in air, or the amount of water stored
in soil. The detection and quantitation of water is essential, as most industries can
suffer from either too much (e.g., building maintenance) or too little (e.g., agriculture)
humidity. There is a vast range of scientific methods to quantitate water and water
vapor, thus here we are going to focus on sensor technologies routinely used with
microcontrollers and in embedded systems for mass production.
Hygrometers for water vapor and soil humidity share the same principles
but also use slightly different methods. Prevalent methods for the measurement of
relative and absolute humidity of air are:
• Resistive and capacitive measurements are related, as both exploit the elec-
trical properties of water. Capacitative sensors use a hygroscopic dielectric
substance whose permittivity, and thus the capacitance between the two plates,
changes with the amount of water that is absorbed from the environment.
The change in the properties of the capacitor can be directly correlated with
relative humidity, typically a small change in capacitance per 1% relative
humidity. In resistive sensors, the voltage required to establish a current between
two probes separated by a gap filled with a hygroscopic substance is measured.
Both sensor types require calibration and achieve between 2–3% accuracy at
5–95% relative humidity. As the conductivity of the hygroscopic material
can also be temperature-dependent, it may be necessary to infer humidity
based on reading both voltage/current and temperature. Resistive and capac-
itative sensors are amongst the cheapest available for the mass market. They
usually support field bus systems and draw current in the single-digit mA
range, mostly during conversion of the reading into a humidity value and
communication.
• Dew point hygrometers utilize Peltier cooling elements (see Peltier element,
Section 2.1.2.3). They test for electric conductivity or optical reflection on
the surface of the element once the temperature falls below the dew point and
water condenses. As the dew point is dependent on the relative humidity, a
lookup table can provide a good approximation between the voltage required
to chill the element and the level of humidity. Dew point hygrometers
are often used in industrial/manufacturing settings, where they can provide
accuracy in the range of ± 0.005% relative humidity.
• Thermal conductivity hygrometers test how easily air can transfer heat. If air
is dry, heat cannot be dissipated as easily as when the environment is more
humid. In thermal conductivity meters, a test and a control thermistor (see
Section 15.3.4) are in a circuit that sees a large current. The control ther-
mistor resides in an absolute dry atmosphere, whereas the test thermistor is
exposed to the outside. The temperature difference between both thermistors
is an indication of the absolute humidity in the environment.
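The lookup between dew point and relative humidity used by dew point hygrometers can be approximated in software with the Magnus formula. A sketch, using coefficients from one common parameterization of the formula:

```python
import math

# Magnus coefficients for water vapor over liquid water (one common
# parameterization, valid roughly between -45 and +60 degrees Celsius)
B, C = 17.62, 243.12

def _gamma(t_c):
    """Magnus exponent for a given temperature in degrees Celsius."""
    return B * t_c / (C + t_c)

def rh_from_dew_point(t_air_c, t_dew_c):
    """Relative humidity (%) from air temperature and measured dew point."""
    return 100.0 * math.exp(_gamma(t_dew_c) - _gamma(t_air_c))

def dew_point_from_rh(t_air_c, rh_percent):
    """Dew point (degrees C) from air temperature and relative humidity."""
    g = math.log(rh_percent / 100.0) + _gamma(t_air_c)
    return C * g / (B - g)

print(round(rh_from_dew_point(20.0, 10.0), 1))  # ~52.6 % RH
print(round(dew_point_from_rh(20.0, 52.6), 1))  # ~10.0 degrees C
```

In a real dew point hygrometer the inputs would be the ambient temperature and the Peltier surface temperature at which condensation is detected.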
Soil humidity sensors are occasionally based on the principle of resistive and
capacitative measurements, with the soil itself acting as the hygroscopic material. The
tensiometer is a device that determines the osmotic pressure inside a test vessel holding
a solution isotonic to plant cells. This vessel experiences negative pressure if there is
too little water in the soil, which can be measured using capacitative pressure sensors
(see Force and Pressure, Section 15.3.2). More recently, frequency domain (FD),
time domain transmission (TDT) and time domain reflectometry (TDR) sensors
have seen applications in agriculture. In simple terms, FD sensors are LC circuits
(see Section 1.2.3.2); their advantage over simple DC capacitative measurements is
that the resonance frequency of the circuit is a better overall estimator of soil humidity.
TDT and TDR are based on the penetration of soil with electromagnetic waves. The
velocity of the wave is dependent on the water content, and can be measured as
signal propagation changes in the range of a few hundred picoseconds. TDT and
TDR measurements are accurate to 1% and do not require calibration for the soil
type, but are significantly more expensive than FD or other devices.
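As a sketch of the TDR principle: the two-way travel time of the pulse yields the apparent permittivity via v = c/√εr, which an empirical calibration such as the widely used Topp polynomial can map to volumetric water content. Probe length and travel time below are hypothetical:

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s

def apparent_permittivity(two_way_time_s, probe_len_m):
    """Relative permittivity of the soil from the two-way travel time of a
    TDR pulse along a probe of known length, using v = c / sqrt(eps)."""
    return (C_LIGHT * two_way_time_s / (2.0 * probe_len_m)) ** 2

def topp_water_content(eps):
    """Volumetric water content (m^3/m^3) from permittivity via the
    empirical Topp et al. (1980) polynomial."""
    return -5.3e-2 + 2.92e-2 * eps - 5.5e-4 * eps**2 + 4.3e-6 * eps**3

# Hypothetical reading: 0.15 m probe, 2.5 ns two-way travel time
eps = apparent_permittivity(2.5e-9, 0.15)
print(round(eps, 1), round(topp_water_content(eps), 3))
```

The picosecond-scale timing resolution mentioned above is exactly what makes the permittivity, and thus the water content, measurable at 1% accuracy.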

15.4.3 pH and Other Ion-Specific Indicators

The concentration of hydrogen ions (H+ ) determines the acidity or alkalinity of a
solution. It is communicated as pH = −log10([H+ ]), conventionally ranging from 0
(acid) to 7 (neutral) to 14 (base). As with most compounds, there is a vast range
of methods to determine the pH in a laboratory setting. Here, we are focusing
on a single type of sensor frequently used in mobile pH meters for monitoring
environmental pH levels.
Measuring [H+ ] is based on the potential difference determined with a stan-
dard hydrogen electrode (SHE). The SHE is a platinum electrode that mediates the
redox reaction 2 H+ (solution) + 2 e− → H2 (gas) (see Section 13.3.1). The num-
ber of electrons that can be turned over in this reaction is directly dependent on
the concentration of H+ ions in the solution. The flow of electrons from the probe
into solution yields a potential difference that is equivalent to the concentration. pH
meters frequently suffer from sensor drift and need to be calibrated to a particular
temperature and pressure range.
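The relation between electrode potential and pH follows the Nernst equation, which predicts a slope of about 59.2 mV per pH unit at 25 °C. A sketch, assuming an idealized, drift-free electrode that reads 0 mV at pH 7:

```python
def nernst_slope_mv(temp_c=25.0):
    """Ideal electrode slope in mV per pH unit: (R * T * ln 10) / F."""
    R, F = 8.314462618, 96485.33212  # gas constant, Faraday constant
    return R * (temp_c + 273.15) * 2.302585093 / F * 1000.0

def ph_from_voltage(v_mv, v_at_ph7_mv=0.0, temp_c=25.0):
    """pH from the measured probe voltage, assuming an ideal, drift-free
    electrode that reads v_at_ph7_mv at pH 7 and follows the Nernst slope."""
    return 7.0 - (v_mv - v_at_ph7_mv) / nernst_slope_mv(temp_c)

print(round(nernst_slope_mv(25.0), 2))   # ~59.16 mV per pH unit at 25 C
print(round(ph_from_voltage(177.5), 2))  # an acidic sample near pH 4
```

Real probes deviate from the ideal slope, which is why the two-point calibration against buffer solutions mentioned above remains necessary in practice.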

Other Ions

The concentration of other ions (e.g. alkali metal and other metal ions, even some
organic compounds) can be determined by redox mechanisms as well. Often,
electrodes are surrounded by special pore filters that allow only water and the
compound in question to reach the electrode surface.
An important advance in pH measurement is the development of H+ -
sensitive field-effect transistors (ion-sensitive field-effect transistor [ISFET]).
These allow miniaturized lab-on-chip deployments that enable thousands of parallel
measurements at the footprint of an integrated circuit. While not yet compatible with
the demands of the mass market, ISFETs see potential applications in food safety
and mobile biomedical diagnostics.

15.4.4 Alkanes, Alcohols and Amines

Gases like carbon monoxide and nitric oxide, as well as organic substances such as
alkanes (Cn H2n+2 ), alcohols (Cn H2n+1 OH) and amines (R−NH2 ) in the gaseous phase,
can easily be oxidized, which enables their measurement. In simple terms, this means:
substance → substance+ + e− , and the detection of the electron as well as the selectivity using
pore filters works as described in the last section. Catalytic detectors mediate this
reaction by providing heat (heated filament), allowing oxygen to oxidize the substance
in question. Alternatively, some electrochemical sensors exploit the fact that some
reducing agents bind otherwise surface-bound oxygen, which allows the flow of
current in certain semiconductors.
All these methods have in common a correlation between the concentration of a sub-
stance and an output current. Selectivity is provided by the filtering of
alternative reactants by size or other physical or chemical properties, and in the case
of catalytic detectors, the temperature of the filament. Following calibration, the
current can be translated into absolute concentrations. A critical issue with catalytic
sensors is their relatively high power demand, as the filament requires a constant
heating current.
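As a sketch of how such a calibration might look for a generic metal-oxide gas sensor; the power-law constants below are hypothetical and would come from the device's datasheet or a calibration run:

```python
def gas_concentration_ppm(rs_ohms, r0_ohms, a, b):
    """Concentration estimate for a metal-oxide gas sensor from the ratio of
    its sensing resistance Rs to its clean-air resistance R0, using the
    power-law fit ppm = a * (Rs/R0)**b that approximates the log-log
    calibration curves found in typical datasheets."""
    return a * (rs_ohms / r0_ohms) ** b

# Hypothetical calibration constants a=100, b=-1.5: a drop of the sensing
# resistance to half the clean-air value indicates a higher concentration.
print(round(gas_concentration_ppm(5_000.0, 10_000.0, 100.0, -1.5), 1))  # 282.8
```

The clean-air resistance R0 is itself temperature- and humidity-dependent, so a deployed sensor would periodically re-establish it rather than rely on a fixed value.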
Chapter 16
Embedded Systems

Systems-on-chip and microcontrollers are at the center of many sensor or actuator
systems, as well as simple gateways. In Section 2.3 on programmable computers we
have distinguished them from even simpler computing devices by the presence of
memory and easily accessible input/output ports, and from multipurpose computers
by the execution of code that is typically specific for the task for which the controller
is deployed. Just as it is neither feasible nor sensible to provide a market overview for
any computer given their short update cycle, this chapter cannot serve as a substitute
for catalog browsing and vendor information. The focus here is on key properties
that can play a role in the selection of a microcontroller for use in embedded systems
without providing definitive recommendations. Often, these properties are tightly
linked to each other (for example, if battery lifetime is paramount, there is only
a limited choice of controllers that can be used; requirements around security and
safety dictate others).
Embedded systems are the compute units that are built into devices
that require electronic control, but that do not necessarily connect to a stand-alone
computer. This ranges from small sensors where the compute unit itself may be
larger than the actual physical sensor (for example, as a computing center within a
MEMS, e.g., see Section 15.3.1). It reaches to simple computers in home appliances
like washing machines, and ends in industrial machinery and even standard vehicles
that these days often house dozens if not hundreds of interconnected embedded
systems.
Embedded systems are generally built in two ways: either a task-specific
circuit board can host a microcontroller that controls the logic of the device, or
a microcontroller board with often standardized physical footprint and input/output
interface can act as host for task-specific add-ons. With the rise of the Arduino
and Raspberry Pi platforms (see Connecting Things, Section 4.3), credit card-size
microcontroller boards that can be easily programmed through USB connections to
computers have led to a democratisation of embedded programming. These boards
have produced vast ecosystems of shields (Arduino), capes (Beaglebone) or hats
(Raspberry Pi) that have specific functions like GPS functionality, motor control or
interfaces to various field bus systems. In the professional context, however, these
microcontroller boards rarely see applications beyond initial prototyping. Here,
manufacturers may test the capability of a microcontroller and develop code for it.
Ultimately, they are going to take that microcontroller and place it on their own printed
circuit boards. It is noteworthy that most microcontrollers exist in different form
factors and in different mounting types. While for breadboarding a sizeable dual-
inline format may be ideal for handling and experimentation, for mass production
smaller surface-mount components are typically preferred.
The Arduino UNO microcontroller board provides an excellent example of
how a microcontroller interfaces with the physical world (see Figure 16.1). It is
important to keep in mind that the Arduino is essentially a hobbyist device in
many respects. The ATmega 328P microcontroller is a large device in terms of
size, but controllers for professional applications can feature speeds of several
hundred megahertz, megabytes of memory, and dozens of I/O pins, at a fraction
of the physical footprint of the ATmega 328P. In fact, many of the ARM Cortex
microcontrollers found in current top-end embedded systems are closely related
to the powerful processors that can be found in mobile phones and so forth. The
microcontroller boards supporting these units are themselves highly complex and
require careful study in comparison to the Arduino UNO, whose intuitive use was a
particular design goal.

16.1 MICROCONTROLLERS

There are many different microcontroller boards such as the Arduino UNO
that expose more or less of a controller’s capability to the outside world. Under the
assumption that these boards are primarily for prototyping purposes, their properties
are less important than those of the actual controllers themselves. What are the
choices in the selection process?
[Figure 16.1 diagram: pin assignment of the ATmega 328P (left) and layout of the
Arduino UNO board, footprint ca. 6.5 cm by 5 cm (right).]

Figure 16.1 Arduino UNO. Left: The 28-pin ATmega 328P microcontroller (DIP). The controller is
supplied with power via the V+ and GND pin connections, and its clock is provided by an
external crystal. It features 32 kB program memory, 2 kB SRAM and 1 kB EEPROM. Most pins of
this microcontroller are general purpose input-output (GPIO) with some degree of specialization: pin 0
and 1 are default-enabled for serial communication, pins 3, 5, 6, 9 to 11 can produce pulse-width modulated
signals, and analog pins 0 to 5 report to the built-in 10-bit ADC. Right: The Arduino board exposes
the functionality of the ATmega 328P. The microcontroller (1) can be programmed via In-Circuit Serial
Programming (ICSP), a 6-pin interface in the case of the ATmega328P. The board is supplied with
power either by a dedicated battery port (2) or USB (3). The controller is driven by a 16 MHz crystal
(oval shape). There are two voltage regulators to provide on-board 5V and 3V3, as well as capacitors to
stabilize the supply. The USB connection that can be used to program the ATmega 328P is mediated by
a dedicated microcontroller, the ATmega 16U2 (3), which also coordinates the display of the hardware
serial interface over USB. Troubleshooting here is enabled by an ICSP header nearby. The Arduino board can
be reset with a push button (4), and the onboard LED and serial communication status indicators are surface-mount
LEDs (5). Female headers expose the GPIO pins (6). On the level of the circuit board, it becomes clear
that the headers are really just extensions of the pins of the microcontroller.

16.1.1 Architectures

ARM, AVR, the Intel 8051 series, MIPS, PIC or PowerPC are a few examples
of microcontroller architectures. An architecture is the principal logic behind the
functioning of a processor, along with a specific command set that can be used by
programmers to execute compute functions, access memory and input-output, and
manage power.
It is important to note that while some designs such as the AVR (AV-RISC,
by Atmel) or PIC (Peripheral Interface Controller, by Microchip Technology) are
specifically developed and built by a particular manufacturer, others like ARM or
PowerPC are theoretical designs that can be built by a range of license-holding
vendors. Each of these designs can give rise to an entire line of different processors
featuring different properties. For example, while ARM is generally known as a
reduced instruction set computer architecture (RISC, a fast instruction set often used
in microcontrollers), different versions such as ARMv7, ARMv8, ARM Cortex,
and derivatives thereof define entire lines for particular applications. Although the
nomenclature and naming differ from manufacturer to manufacturer, it is often
possible to identify a microcontroller with very similar capabilities from different
manufacturers, such as ATmega 328P (AVR) and PIC18F2520 (PIC). In this case,
the preferred programming environment, actual performance and cost may become
the decisive factor.
Within the product line of each manufacturer, the controllers can be divided
by application, ranging from general purpose microcontrollers and controllers with
accelerated command sets (e.g., for real-time applications) to microcontrollers
nearing the capability of a processor for multipurpose computing. The bus width
(8-, 16- or 32-bit) has a direct impact on execution speed (e.g., a 16-bit integer
can be modified in one step on a 16-bit machine, but requires several steps on an
8-bit machine). Increasing bus width also enables a larger addressable memory,
although this is often less relevant, as most microcontrollers are equipped with
little memory for the sake of cost. While 8-bit
microcontrollers have the least capability, their cost, especially when purchased
in bulk, still makes them worthwhile competitors when choosing a controller for
embedded systems in mass-produced appliances. These controllers
may not be a first choice for applications that require fast computation, but they are
usually sufficient for controlling even complex business logic and decisions within
a device.
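The extra work on a narrow bus can be illustrated by adding two 16-bit integers with 8-bit operations, propagating the carry by hand as an 8-bit ALU would:

```python
def add16_on_8bit(a, b):
    """Add two 16-bit values the way an 8-bit ALU does: low bytes first,
    then high bytes plus the carry flag. Returns (result, carry_out)."""
    lo = (a & 0xFF) + (b & 0xFF)          # first 8-bit addition
    carry = lo >> 8                        # carry flag from the low bytes
    hi = (a >> 8) + (b >> 8) + carry       # second 8-bit addition with carry
    result = ((hi & 0xFF) << 8) | (lo & 0xFF)
    return result, hi >> 8

print(add16_on_8bit(0x12FF, 0x0001))  # 0x1300 without overflow carry
```

A 16-bit machine performs the same addition in a single instruction, which is where the direct impact on execution speed comes from.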

16.1.2 Power Consumption

The power consumption of a microcontroller is tightly linked to the execution
speed. As every operation (including a no-operation cycle when idle) consumes
a given amount of power, doubling the speed of a processor usually also doubles its
consumption. While modern microcontrollers are more energy-efficient than their
predecessors, using a 16-MHz 8-bit controller is still likely going to conserve more
energy than using a 48-MHz 16-bit device. Most microcontrollers therefore support
reduced energy modes that throttle performance for the benefit of power saving.
Additional strategies for increased energy efficiency include disabling ADC/DAC
ports and interrupts, as well as the implementation of various sleep modes that effec-
tively suspend the controller and require energy only for the retention of
memory contents.
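A back-of-the-envelope battery-life estimate for a duty-cycled controller can make these tradeoffs concrete; all figures below are hypothetical:

```python
def battery_life_days(capacity_mah, active_ma, sleep_ma, duty_cycle):
    """Rough battery-life estimate for a duty-cycled microcontroller.
    duty_cycle is the fraction of time spent in the active state (0..1)."""
    average_ma = duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma
    return capacity_mah / average_ma / 24.0

# Hypothetical node: 2000 mAh cell, 15 mA when active, 10 uA asleep,
# waking up for measurements about 1% of the time.
print(round(battery_life_days(2000.0, 15.0, 0.01, 0.01), 1), "days")
```

The estimate shows why sleep modes dominate battery life: at a 1% duty cycle, even a tiny sleep current contributes a noticeable share of the average draw.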

16.1.3 Input-Output Capability

In contrast to processors for multipurpose computing, microcontrollers usually sup-
port a number of input and output pins (GPIO) for communication with peripheral
devices. GPIOs also often support pulse-width modulated output, as well as digital-
to-analog or analog-to-digital conversion. The differences in capabilities may be
exemplified with two AVR controllers: while the ATmega 328P has 13 digital and
6 analog GPIO ports, the ATmega 2560 features over 80 such ports. That is, the
ATmega 2560 can integrate data from four times as many input sensors.
While it is often possible to mimic the communication pattern of low-level
protocols (e.g., SPI, I2 C) or fieldbus systems (e.g., CAN) when interfacing to
external devices by direct bit manipulation on the GPIO (bit banging), many
microcontrollers have dedicated logic encoded in hardware that simplifies and
accelerates this communication. The nature of these interfaces and their working
is detailed in Part VI on Device Communication.
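Bit banging can be sketched in a few lines. The following simulates shifting one byte out MSB-first in SPI mode 0, with simple callbacks standing in for real GPIO writes and a loopback wire from MOSI to MISO; on actual hardware the callbacks would toggle and read physical pins:

```python
def bitbang_spi_byte(byte_out, set_clock, set_mosi, read_miso):
    """Shift one byte out MSB-first in SPI mode 0 using plain pin
    operations: set MOSI, raise the clock, sample MISO, lower the clock."""
    byte_in = 0
    for i in range(7, -1, -1):
        set_mosi((byte_out >> i) & 1)
        set_clock(1)
        byte_in = (byte_in << 1) | read_miso()
        set_clock(0)
    return byte_in

# Loopback harness standing in for real GPIO pins: MISO is wired to MOSI.
state = {"mosi": 0}
sampled = []

def set_clock(level):
    if level:                      # the peripheral samples on the rising edge
        sampled.append(state["mosi"])

def set_mosi(level):
    state["mosi"] = level

def read_miso():
    return state["mosi"]

echoed = bitbang_spi_byte(0xA5, set_clock, set_mosi, read_miso)
print(hex(echoed), sampled)
```

Dedicated hardware peripherals do exactly this shifting in silicon, which is why they are both faster and less demanding on the processor.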
Some GPIO ports are interrupt-enabled. This means that a signal (high, low,
rising or falling edge) can invoke code even though the processor may currently be
busy in a different part of the business logic. This is an essential feature for the real-time
response to input signals. While the addition of resistors to the GPIO may sound
trivial, pull-up or pull-down resistors on these ports force them into a defined state
even when they are not electrically connected, which can be a useful feature when
evaluating button presses or other operations where the electrical connection is only
present when the trigger is active.

16.1.4 Operating Systems and Programming

General purpose computers require an operating system that mediates between pro-
grams in the user space and machine commands in the processor. To accommodate
the needs of any possible use case, operating systems have to balance flexibility
and overhead. For most microcontrollers, even the most basic housekeeping per-
formed by an operating system would present too much of a strain on the hardware.
Here, so-called boot loaders copy machine commands straight into the processor
for immediate execution after startup. This complicates the programming of mi-
crocontrollers, as code editing and compilation to machine code have to happen on a
separate machine before, in a second step, the machine code is transferred into the
controller’s memory. While more cumbersome than developing on and for a local
desktop machine, modern integrated development environments that simplify these
steps exist for virtually all microcontroller boards.
There is a movement to include security-relevant functions as part of the mi-
crocontroller itself. That is, rather than leaving the implementation of cryptographic
functions to the application programmer, these functions could soon become a com-
modity that is easily accessible through machine commands specific to a controller.
This would lower the entry barrier for less experienced programmers, enabling them
to write more secure embedded software without explicit knowledge
of security methods.

In conclusion, for many IoT applications that require battery-driven operation, the
most energy-efficient microcontroller may be desirable. This will typically require
a compromise in terms of execution speed and features (such as a large number of
I/O). When developing hardware for the mass market, price is an especially impor-
tant parameter. Unit costs can differ significantly (from 1 USD for a simple 8-bit
controller to 10 USD for high-speed units with a floating-point unit).
Part VI

Device Communication
Chapter 17
Communication Models

Communication is an abstract concept. Similar to our definition of information in
Chapter 3, there is an intuitive component to how we use the term communication
in everyday conversation, as well as a set of physically quantifiable entities that
are used for communication in the technical sense. When we address the means of
communication in a technical context (i.e., between devices), we may refer to
• The hardware components (e.g., wires, transmitters and receivers)
• The nature of the transmission (e.g., a current, an electromagnetic wave) or
• How the transferred information is to be interpreted, for example:
– Is a positive voltage on or off?
– Do we read the incoming byte as an 8-bit unsigned integer (0 to 255) or
as a signed 8-bit integer (−128 to +127)?
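The two interpretations of the same byte can be demonstrated with Python's struct module:

```python
import struct

raw = bytes([0xFF])                    # the same byte on the wire
unsigned, = struct.unpack("B", raw)    # read as 8-bit unsigned
signed, = struct.unpack("b", raw)      # read as 8-bit two's complement
print(unsigned, signed)                # 255 -1
```

Nothing about the bit pattern itself changes; only the agreed-upon convention for reading it does, which is exactly the third layer of communication described above.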
Because communication follows particular conventions and standards, these
three layers are sometimes inseparable and a good knowledge of the multilayered
nature of communication is helpful, especially when working in interdisciplinary
teams. For example, software developers may focus primarily on the interpretation
of data in its digital representation, whereas hardware engineers are concerned about
physically measurable time differences between electrical signals.
The following chapters cover hardware interfaces used for connections be-
tween microcontrollers and external devices, as well as standards for wired and
wireless communication. The protocols that build on the Internet as transmission
channels are then detailed further in the Software section (Part VII). Both the fol-
lowing chapters and Part VII will frequently refer to the widely accepted Open Sys-
tems Interconnection (OSI) or the Transmission Control Protocol/Internet Protocol
(TCP/IP) models to explain some features of the physical and logical layers.

17.1 OPEN SYSTEMS INTERCONNECTION REFERENCE MODEL

The OSI reference model in its current form is over 30 years old. While it is often
used to explain the technical stack of the Internet, the primary aim of the model is
to provide a common reference point when developing communication standards and
to ensure compatibility with the respective upper or lower layers (see Figure 17.1).
Representing a compromise between two standardization bodies, the reference
model was published in 1984 as International Organization for Standardization
(ISO) 7498 or ITU X.200.

17.1.1 Layer 1: Physical

At the most basic layer there are standards and definitions around hardware in-
terfaces and electrical signals for data communication. For example, for Ethernet
cables, this concerns the dimensions of plugs, the spacing and meaning of pins,
or the voltage range at which a connection must operate. In the case of wireless
standards, Layer 1 specifies frequencies, modulation and signal strength: the core
WiFi, Bluetooth or proprietary radio protocol definitions begin with this first layer.
The hardware interfaces in Chapter 19 belong here.

17.1.2 Layer 2: Data Link

At the Data Link Layer, standards such as the Serial Line Internet Protocol (SLIP),
the Point-to-Point Protocol (PPP) or the Asynchronous Transfer Mode (ATM)
define how two adjacent nodes in a network communicate, using the physical
infrastructure defined in Layer 1. In other words, at this stage, it is already
clear how physical signal changes encode a bit stream, but adjacent nodes need to
become aware of each other. This happens by framing the bitstream into packets,
cells or frames (depending on the nomenclature) that establish a basic handshake
between the two nodes.
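Framing can be illustrated with SLIP, whose scheme (per RFC 1055) delimits packets with an END byte and escapes occurrences of the special bytes inside the payload:

```python
# SLIP special bytes as defined in RFC 1055
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD

def slip_encode(payload: bytes) -> bytes:
    """Frame a payload: escape any END/ESC bytes, then terminate with END."""
    out = bytearray()
    for b in payload:
        if b == END:
            out += bytes([ESC, ESC_END])
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(b)
    out.append(END)
    return bytes(out)

def slip_decode(frame: bytes) -> bytes:
    """Recover the payload from a single SLIP frame."""
    out, escaped = bytearray(), False
    for b in frame:
        if escaped:
            out.append(END if b == ESC_END else ESC)
            escaped = False
        elif b == ESC:
            escaped = True
        elif b != END:
            out.append(b)
    return bytes(out)

message = bytes([0x01, 0xC0, 0xDB, 0x02])
print(slip_encode(message).hex())  # 01dbdcdbdd02c0
```

The escaping guarantees that the END delimiter can never appear inside a frame, so the receiver can split the raw bit stream into packets unambiguously.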
Historically, SLIP was used for modem-based communication, whereas ATM
represents the underlying standard of the Integrated Services Digital Network
data unit          layer          examples

data               application    HTTP, FTP, 'IoT protocols'
data               presentation   TLS, SSL
data               session        NetBIOS
segment/datagram   transport      TCP, UDP
packet/datagram    network        IP, ARP, ICMP, IPSec
frame              data link      Ethernet, ATM, PPP
bit                physical       Ethernet, WiFi, USB, Bluetooth, SPI, I2C

(The session, presentation and application layers form the host/service layers with
peer-to-peer protocols; the physical, data link, network and transport layers form the
device/media layers with network protocols. The accompanying network diagram with
the numbered connection examples is not reproduced here.)

Figure 17.1 Open Systems Interconnection Model. The seven key layers of the OSI model are physical,
data link, network, transport, session, presentation and application. For each of these layers, there are
exemplary technical standards with an entry point at that level. For example, the Ethernet standard defines
physical dimensions as well as electrical properties for establishing a physical connection between two
devices (indicated as 1 in the exemplary network diagram). The smallest unit of data that can be captured
at that level is the bit. The first three to four layers are the device or media layers, detailing how data
is exchanged between nodes on the Internet (indicated as 2). The standards on these levels are typically
referred to as network protocols. Once a continuous physical connection between two edge nodes (the
sender and the receiver) is established, data connections are organised in sessions, presentation (e.g.,
unencrypted or encrypted) and application layers. These are protocols for the communication between
two devices (indicated as 3), including how information is requested and sent between them like in a
Web connection.
(ISDN). Other protocols based on fiber-optic communication are Synchronous Op-
tical Networking (SONET) and Synchronous Digital Hierarchy (SDH). It is impor-
tant to note that already at this level, the technologies are not entirely separable.
PPP provides a unification layer and allows different point-to-point protocols to
share the same physical link, irrespective of whether it is a serial connection, ISDN or
Ethernet. Standards like PPPoE specifically deal with PPP connections over Ether-
net, whereas PPPoA handles PPP connections via ATM. Both are the default methods
to establish the Digital Subscriber Line between a DSL exchange (provider infras-
tructure) and the end point (user).

17.1.3 Layer 3: Network

At the Network Layer, data from Layer 2 receives a unique address to form a packet
for transfer within the network. Rather than focussing on a physical one-on-one
connection as in Layer 2, the data packet carries the conceptual information to bring
its payload to a target node in the network, irrespective of the physical structure
of this network. Chapter 22 will detail how IP packets are formed and how the
information is interpreted by processes at Layer 3 and Layer 4. Other protocols
at Layer 3 are the Address Resolution Protocol (ARP) that links IP addresses to
hardware addresses (media access control [MAC]), or the Internet Control Message
Protocol (ICMP) that like ARP does not serve data communication but has a role
in Internet infrastructure maintenance. For example, ICMP packets can be used to
determine why a node does not respond to a connection request.

17.1.4 Layer 4: Transport

The Transport Layer determines how packets from Layer 3 are routed through the
network to reach a target computer. As with IP, because of their relevance when
developing IoT solutions we will detail TCP and User Datagram Protocol (UDP) in
Chapter 22.

17.1.5 Layers 5 – 7: Session, Presentation, Application

In the competing TCP/IP model (see Section 17.2) these layers are jointly referred
to as Application Layer. The session is a concept of a connection between two
computers that continuously exchange data delivered over Layer 4. While at Layer
4, depending on the protocol, there may be trial and error of delivering a packet,
Layer 5 assumes a stable peer-to-peer connection that is entirely agnostic to the
network structure. The Network Basic Input/Output System (NetBIOS) protocol
allows computers to share data and resources (such as printers) as if they were
connected directly. Layer 6 then deals with the presentation of the data, for example,
to establish that a series of bytes (such as in a graphic) is interpreted by both sender
and receiver in the same way, or to exchange data in otherwise encoded or encrypted
ways. Hence, Transport Layer Security (TLS) and its predecessor, Secure Sockets
Layer (SSL), are both placed at this OSI layer (see Part VIII on Security). The
Application Layer then hosts high-level protocols such as the HTTP or the various
IoT-specific protocols, which will be detailed further in Chapter 22.

17.2 TRANSMISSION CONTROL PROTOCOL/INTERNET PROTOCOL MODEL

The OSI and TCP/IP models are often displayed as competing proposals. However,
they can be understood in a complementary way, in which the OSI model is more
protocol agnostic and general, and the TCP/IP model is tailored toward discussing
particular software implementations for the Internet. The TCP/IP model consists of
four layers:
1. Network Interface
2. Internet
3. Host-to-host
4. Application
The network interface corresponds to the data link (OSI Layer 2) and assumes
that whatever data is transmitted over the network carries sufficient information
to be passed between two directly connected computers. The data is arranged in
frames. Only at the Internet Layer (corresponding to OSI Layer 3) does the network
have a notion of unique IP addresses. The Host-to-Host Layer via TCP establishes a
logical connection between computers across the network. On top of this connection
there is the Application Layer, an umbrella term for OSI Layers 5–7.
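The correspondence between the two models can be captured as a small lookup table; a minimal Python sketch (the dictionary structure and function name are illustrative, not part of either standard):

```python
# Mapping the four TCP/IP layers to the OSI layers they subsume,
# as described in the text above.
TCPIP_TO_OSI = {
    "Network Interface": ["data link"],                               # OSI Layer 2
    "Internet":          ["network"],                                 # OSI Layer 3
    "Host-to-host":      ["transport"],                               # OSI Layer 4
    "Application":       ["session", "presentation", "application"],  # OSI Layers 5-7
}

def osi_layers_for(tcpip_layer):
    """Return the OSI layer names covered by a given TCP/IP layer."""
    return TCPIP_TO_OSI[tcpip_layer]

print(osi_layers_for("Application"))  # ['session', 'presentation', 'application']
```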
Chapter 18
Information Encoding and Standard
Quantities
Chapter 3 on information theory focused on the number of bits that are necessary
to communicate numerical values or accurately represent them in computers. In the
section on binary calculations (see Section 2.2.3) we saw that 8 bits make up 1 byte
as one of the smallest quantities of information.

18.1 CODING SCHEMES

One byte is also often synonymously used for one character, although this is only
true for a particular way of encoding information. Which bit series do we have
to transmit to communicate the character string "byte"? The simplest scheme of
mapping alphanumeric characters to bit combinations is the American Standard
Code for Information Interchange (ASCII), which in its extended form defines the meaning of the 256
different bit combinations that can be encoded in a byte (see Figure 18.1 for an older
version utilizing only 7 bits).
Unfortunately, even 256 characters are not nearly enough to communicate
the special characters that are present in the languages around the world. With the
increasing internationalization and democratization of computing, it soon emerged
that ASCII would not be sufficient to encode the world’s information. The Unicode
Transformation Format (UTF) addresses the shortcomings of ASCII by offering
region-specific UTF codes with 8- or 16-bit length. With the excess coding capacity
of the 16-bit UTF (65,536 different characters), even graphical representations


Figure 18.1 ASCII table. This is an older 7-bit ASCII table from a 1970s publication (source:
Wikipedia), with the least significant bit first (LSBF). The first four bits are listed in columns b1 to
b4. The remaining three bits b5 to b7 are displayed as triplets above. For the character B, we would read
0100 001 from the table; because of the LSBF convention, this corresponds to the conventional (MSB-first)
binary value 1000 010. ASCII encodes not only alphanumerical characters, but also a range of control
characters that could be used by specialized hardware such as printers. For example, LF stands for line
feed, and sending the respective code would advance the printer paper to the next line.

of the smiley, :-), or social memes are now encoded, allowing the use of so-
called emojis as regular characters. While it is good practice to use UTF as
standard encoding in user-facing applications, for the sake of simplicity the low-
level communication between microcontrollers and peripheral devices often uses
simple ASCII.
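The practical difference between ASCII and UTF-8 can be made visible in a language like Python; a short illustration:

```python
# Plain ASCII text: one byte per character, identical in both encodings.
text = "byte"
print(text.encode("ascii"))   # b'byte' (4 bytes)
print(text.encode("utf-8"))   # b'byte' (4 bytes, identical)

# A character outside the ASCII range needs a multi-byte UTF-8 sequence.
umlaut = "ü"
print(umlaut.encode("utf-8"))  # b'\xc3\xbc' (2 bytes)
try:
    umlaut.encode("ascii")
except UnicodeEncodeError:
    print("not representable in ASCII")
```

The same mechanism extends to emojis, which occupy even longer multi-byte sequences while still behaving as single characters to the user.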

18.2 INFORMATION QUANTITIES

The quantification of information happens in bits and bytes. As both units are
derived from the dual system, a kilobyte (1 kB) traditionally referred to 1024 bytes,
and a megabyte (MB) to 1024 kilobytes, etc. The next four orders of magnitude
are denoted by the gigabyte (GB), terabyte (TB), petabyte (PB) and exabyte (EB)
and so forth. For marketing and regulatory purposes, in 1998 the IEC determined
that the suffixes for information quantities should follow the decimal system, with
1 kB representing 1000 bytes, as being more intuitive to use by lay people. The
proposal suggested the kibibyte as 1024 bytes, followed by the mebibyte for 1024
kibibytes, etc. In practice, the traditional units (kB, MB, and so forth) are often used
interchangeably with the dual as well as the decimal cascade, which can be a source
of confusion.
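The difference between the decimal (SI) and binary (IEC) cascades is easily quantified; a small sketch:

```python
# Decimal (SI) vs binary (IEC) interpretation of storage quantities.
SI  = {"kB": 10**3, "MB": 10**6, "GB": 10**9}
IEC = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30}

print(SI["MB"])               # 1000000
print(IEC["MiB"])             # 1048576
print(IEC["MiB"] - SI["MB"])  # 48576 bytes difference per "megabyte"
```

The discrepancy grows with each order of magnitude, which is why a "1 TB" drive reports noticeably less capacity when an operating system counts in binary units.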
For historic reasons, transfer rates are still occasionally communicated in baud
(after Emile Baudot, a French engineer in the nineteenth century). The baud (Bd,
symbols per second) dates back to a time when arbitrary encoding schemes were
used to maximize the throughput of a sparse data connection. While the use of 8-bit
ASCII indeed means 1 Bd = 1 byte per second, a simple 16-character alphabet could be
encoded in 4 bits, meaning that while still transferring 1 byte per second, the actual
symbol rate is 2 Bd. As the Bd rate is largely irrelevant for multimedia applications,
today the bit rate, bits per second (bit/s, also bps), is used more frequently. Derived
units are the kilobit (kbit/s) or megabit (Mbit/s) per second.
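The relation between symbol rate and bit rate can be stated as a one-line calculation; the numbers follow the example in the text:

```python
def bit_rate(baud, bits_per_symbol):
    """Bit rate in bit/s for a line signalling `baud` symbols per second."""
    return baud * bits_per_symbol

# With 8-bit symbols, 1 Bd carries 1 byte per second:
print(bit_rate(1, 8))  # 8 bit/s, i.e., 1 byte per second
# With a 4-bit alphabet, 1 byte per second requires 2 symbols per second:
print(bit_rate(2, 4))  # 8 bit/s again, but at 2 Bd
```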

18.3 INFORMATION ENCODING

The arbitrary nature of encoding binary information with physical signals has been
repeatedly indicated. Thus far, most of the examples in this book have associated a
high voltage with a logical true or 1, and a low or no voltage with a logical false or
0.
There are in fact a few alternatives to encode information for transmission (see
Figure 18.2). Besides the naive interpretation mentioned before, common unclocked
formats are uni- and bipolar non-return to zero level (NRZL) and non-return to zero,
inverted (NRZI). Unipolar NRZL can follow our naive convention, but also
allows the reversal of directionality (i.e., 0 could be tied to a positive voltage and 1
to ground). In a bidirectional scheme, one logic level is tied to a positive voltage,
while the other one is tied to the negative voltage; that is, with changed polarity.
Again, how a particular device interprets the either positive or negative voltage is a
question of definition; for example, the RS-232 interface (see Section 19.1.2) ties
− to − to 1 and to to 0. NRZL has the disadvantage that, in the
absence of an additional indicator that defines the boundaries between data bits,
long stretches of one and the same logic level can make interpretation difficult. This
is where NRZI provides some alleviation: This coding scheme treats occurrences of
logic 1 as a command to change level, whereas logic 0 does not change the voltage
level. In practice this means that long stretches of 1 induce a frequent change of
signal, which makes recovery of the data bit boundaries easier. However, in cases
Figure 18.2 Information encoding schemes. A series of data bits (raw information) needs to be sent.
The naive encoding ties logic 1 to a positive voltage and 0 to ground. NRZL can follow this convention,
or reverse it with inverted logic levels. NRZI encodes logic 1 as a change in the voltage level, and 0
with no change. Manchester encoding takes into account the data bits and a clock signal, which splits the
data bit slot into two halves. Through an XOR operation of the data with the clock signal, the number of
signal transitions is doubled. This operation allows for more efficient error correction.

of long stretches of 0, issues with clock recovery remain. A clocked information


scheme is Manchester encoding, which augments each data bit with a high and low
clock signal. The signal level is calculated as a result of the data bit, the transition
between data bits, and an XOR operation with the clock signal. With a clock signal
being present anyway, the advantage of Manchester encoding is the ability for error
correction.
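The unclocked and clocked schemes described above can be sketched in a few lines of Python; the Manchester convention chosen here (clock low in the first half of the bit slot) is one of two equally common choices:

```python
def nrzi(bits, level=0):
    """NRZI: a logic 1 toggles the line level, a logic 0 keeps it."""
    out = []
    for bit in bits:
        if bit == 1:
            level ^= 1  # change level on every 1
        out.append(level)
    return out

def manchester(bits):
    """Manchester: XOR each data bit with a two-phase clock.

    The clock is taken as 0 in the first half of the bit slot and 1 in
    the second half; the opposite convention is equally common.
    """
    out = []
    for bit in bits:
        out.extend([bit ^ 0, bit ^ 1])  # data XOR clock, both half-slots
    return out

data = [0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1]  # the bit series of Figure 18.2
print(nrzi(data))
print(manchester(data))
```

Note how long runs of 1 in the input produce frequent level changes in the NRZI output, while Manchester guarantees at least one transition per data bit regardless of the input.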
Chapter 19
Industry Standards

How data between two or more devices can be exchanged depends primarily on
the preferred route (wired, wireless) and a convention on how signal changes over
time are to be interpreted. Already at the lowest level of the OSI model neither the
laws of physics nor those of information theory impose a particular interpretation
of a millisecond-long voltage peak upon us (see Chapter 18 for more details on
information encoding).
This chapter is separated into the sections Hardware Interfaces, Wired Com-
munication and Wireless Standards, in which commonly used industry standards are
going to be introduced. Typically these standards are detailed in extensive technical
documentation, providing sufficient information for engineers to develop solutions.
Our focus here will be on their distinguishing principles and features, and not on
the degree of details required for an implementation.
We define:
• Hardware interfaces. These are standards that are used for the communi-
cation between devices in embedded systems (e.g., a microcontroller and a
sensor). While some embedded systems can tap into the vast ecosystem of
computer bus systems (e.g., to connect to IDE, ATA or PCI hard drives),
for the sake of brevity the section is restricted to those standards that are
supported even by the simplest devices.
• Longer-range wired communication. The focus here is going to be on stan-
dards often used in automotive, building and industry automation, where sen-
sor, actuator and microcontroller units are communicating using standardized
cable connections.


• Wireless standards. Standards in this area include the widely known telecom-
munication protocols typically used by mobile phones, meshed sensor net-
works and wireless data communication in general.
It is important to note that industry standards typically change over time, and there
is often competition of different standards for a particular purpose. Their rise and
fall depends on technical capability, spread and adoption in the field, and market
politics. The following standards therefore only represent a current sample of
common standards, not a comprehensive directory of past and present technologies.

19.1 HARDWARE INTERFACES

Digital hardware components interact with their environment either through propri-
etary connections and protocols with an arbitrary number of data lines, or a range
of standard interfaces that are further detailed in the following sections.
Conceptually, the most simple peripherals require only a single port. The press
of a button can be sensed using a digital input, whereas the temperature-dependent
voltage from a thermometer requires an analog input (see Section 3.2 on analog-to-
digital conversion). However, some devices need to send more complex data with
semantic structure, such as geographic information (see Section 15.2.1), or many
data points as required to fill a display. This information can be viewed as a stream
of bytes that need to be transferred.

19.1.1 Communication Principles

Common characteristics that describe data communication are: parallel or serial,
synchronous or asynchronous exchange? The general technical question is: How
can one device send a byte to another? An intuitive solution is sending 8 bits
in parallel, occupying 8 digital ports on each of the two devices. But how can two
bytes be sent? Obviously, a convention is required that indicates the end of the first
byte, and the beginning of the second. Thinking about this necessity leads to the
consideration of a second mode of transfer: rather than sending 8 bits in parallel, the
device can be restricted to sending data just over 1 digital port, utilizing the same
segmentation strategy that was previously employed to separate the byte values, to
send consecutive bits. This is called serial transfer, which in fact is the prevalent
mode of data transfer in most current protocols. The advantage of speed is therefore
often sacrificed for a more economic utilisation of the available input-output ports
(see also: embedded systems, Chapter 16).
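The serial mode can be illustrated with a short sketch that flattens bytes into a bitstream; the LSB-first order is one possible convention, not a requirement:

```python
def to_serial_bits(data, lsb_first=True):
    """Flatten a sequence of bytes into a serial bitstream (list of 0/1)."""
    stream = []
    for byte in data:
        bits = [(byte >> i) & 1 for i in range(8)]      # LSB first
        stream.extend(bits if lsb_first else bits[::-1])
    return stream

# Byte 1 (11100001) of Figure 19.1, transmitted MSB first:
print(to_serial_bits([0b11100001], lsb_first=False))  # [1, 1, 1, 0, 0, 0, 0, 1]
```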

Figure 19.1 Parallel or serial bitstreams. Sending two bytes (byte 1: 11100001 and byte 2: 11010101)
using parallel or serial transfer modes. Use of a parallel bitstream would require 8 separate data lines,
while 1 data line is sufficient for serial transfer. The convention proposed in this schematic uses positive
voltage to indicate 1 and ground to indicate 0. It is important to emphasize that this is only a convention. A system
that requires positive voltage for 1 and reverse polarity for 0 would be equally possible. The data is read
at distinct time points t(0..1) in the case of parallel and t(0..7) in the case of serial communication.

The reliance on a trigger to define the temporal boundaries of what is shown as
t0..7 in Figure 19.1 introduces two types of transfer: synchronous and asynchronous
signaling. In the example of serial communication, synchronous signaling requires
an external pacemaker that indicates to both sender and receiver when a bit is to
be sent or detected, respectively (see also the clock signal for Manchester encoding
in Chapter 18). While this leaves every possible time slot available for payload
and thus enables the fastest serial transfer, hardware solutions for synchronous
transfer are complex and often require additional components (e.g., oscillators).
In contrast, asynchronous transfer utilizes variations in the signal itself to indicate
the beginning and end of a byte. For example, at t-1 the sender could indicate the
transfer to start, and from t8 there could be a 1 signal 1.5x the length of a normal
bit. This would allow the receiver to separate individual bytes, and interpolate t0..7 to
interpret the appropriate bit pattern in between (e.g., using strategies augmented by
coding schemes like NRZI, Chapter 18). This concept of indicating payload bytes
is referred to as start bit and stop bit, and was often used in modem communication;
for example, in the common configuration termed 8N1, meaning 8 data bits,
no parity (N) and one stop bit, each frame preceded by a start bit. A parity bit is
a crude form of error detection based on even or odd counts of 1 in a sent byte.
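Asynchronous framing as in the 8N1 configuration can be sketched as follows; the idle-high line convention is an assumption matching classic UARTs:

```python
def frame_8n1(byte):
    """Frame one data byte as 8N1: start bit, 8 data bits (LSB first), stop bit.

    Line convention assumed here: idle/stop = 1, start = 0.
    """
    data_bits = [(byte >> i) & 1 for i in range(8)]  # LSB first
    return [0] + data_bits + [1]                     # start + data + stop

def parity_bit(byte, even=True):
    """Optional parity bit: makes the total count of 1-bits even (or odd)."""
    bit = bin(byte).count("1") % 2  # 1 if the byte has an odd number of ones
    return bit if even else bit ^ 1

frame = frame_8n1(ord("A"))  # 'A' = 0x41
print(frame)                 # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1] - 10 bits per byte
```

Sending a byte in 8N1 therefore costs 10 bits on the wire, a 25% framing overhead compared to the raw payload.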

19.1.2 Serial/UART

The simplest method to exchange binary information is the serial protocol. Two
wires, commonly labelled Tx (transmit) and Rx (receive) allow the exchange of in-
formation in duplex (i.e., both participants can send and receive data simultaneously
by connecting Tx1 to Rx2 , and Tx2 to Rx1 ). In the absence of other configuration
details, most peripherals offering serial input or output use their standard operating
voltage (e.g., 3V3 or 5V) to indicate 1, in a 8N1 configuration. Commonly used
transfer rates are 300, 600, 1,200, 2,400, 4,800, 9,600, 14,400, 19,200, 28,800,
38,400, 57,600, and 115,200 bits per second. Readers familiar with the modem
technology of the 1980s and 1990s will find these rates familiar.
While it is possible to implement serial communication purely in software by
bit banging a digital port, this puts strain on the CPU especially at very high speeds,
as the clock speed must exceed the bit rate significantly. Most microcontrollers
therefore have dedicated universal asynchronous receiver transmitter (UART) chips
that can segment outgoing data and generate a bitstream, or populate a buffer with
bytes on the basis of an incoming bitstream. These UARTs are usually tailored
towards the needs of embedded systems, and while specialist UARTs feature buffers
of several kilobytes and support transfer rates in the megabit-per-second range, the
type encapsulated in microcontrollers is usually less powerful. In many cases, the
first in-first out (FIFO) buffer serves as a first line of defence against high loads,
allowing the further reception of data while the CPU is busy processing. With
their dedicated processing, UART chips can also perform basic error detection and
correction.
For historical reasons, and also as they are still present in legacy industrial
hardware, the RS-232 and RS-485 serial interfaces deserve a mention. These are D-
subminiature (D-Sub, Sub-D) connectors in D-shape with a defined pin layout, first
introduced by Cannon for a variety of different purposes in the 1950s. The 9-pin
or 25-pin D-Sub formed the communication (COM) port of early personal comput-
ers, and were the standard interface between computers and modem devices. In
addition to the Tx (here termed TXD) and Rx (RXD) lines discussed before, RS-
232 also features pins to indicate readiness of the device (DTR/DSR) and for flow
control (RTS/CTS) (see Figure 19.2). RS-232 uses bipolar NRZL encoding, in
which nonzero voltage levels of opposite polarity indicate 1 and 0.

Figure 19.2 RS-232. RS-232 uses indicators to inform terminal software about various states. The
DSR/DTR pair establishes whether a device like a modem is connected and ready. The RI and DCD
inform the terminal about the presence of a call and physical connection. RTS/CTS control the flow
from the host to the device and vice versa. The RS-485 standard uses a similar pin configuration, but
utilizes different electrical levels that, for example, enable longer wire lengths (hundreds rather than a
dozen meters). There are mappings of the 9-pin to the 25-pin D-sub standard, not using the majority of
the pins of the latter.

19.1.3 Serial Buses

In principle, parallel data transfer should outperform serial transmission (see Sec-
tion 19.1.1) and in nonembedded computing, parallel interfaces such as Industry
Standard Architecture (ISA), Peripheral Component Interconnect (PCI) or Small
Computer System Interface (SCSI) were long the norm. An advantage of the afore-
mentioned buses, in contrast to a single (serial) connection for example, is the ability
to connect to a number of devices on the same data lines (see Figure 9.1 for the
conceptually related bus topology in networking). However, there are a range of
serial buses that combine the advantages of serial interfaces (significantly less cost
in terms of circuit board real estate and wiring) with the ability to attach various
devices, and that provide sufficiently performant connections between microcon-
trollers, on-board components and external devices.

19.1.3.1 Serial Peripheral Interface

The 4-wire Serial Peripheral Interface (SPI) is one definition of a serial bus. It was
introduced by Motorola and focuses primarily on wiring and transmission, with
no restriction on how data is relayed between host (the microcontroller) and slave

devices. SPI features a clock line for synchronization (SCLK), data lines to (MISO,
master in-slave out) and from (MOSI, master out-slave in) the host, as well as a slave
select (SS) line. The nomenclature can often be confusing and variations exist, with
SCLK often termed CLK on the slave device, DI (data in) on the device taking data
from MOSI, and DO (data out) sending data to MISO at the microcontroller. Slave
select then connects to chip select (CS) on the device. There are clever variations
in which the host can coordinate the exchange of information between devices and
mimic a ring bus. However, in its most simple incarnation, SCLK, MOSI and MISO
lines are shared between all slaves, but SS is specific for each (see Figure 19.3). This
means that individual devices can be selected using ordinary digital ports.
In operation, the master pulls SS low for the active device (i.e., the one that is
addressed for data exchange). It is good practice to use a pull-up resistor between V+
and SS, as otherwise undefined states may occur if the line is not actively pulled
down by the host. The clock signal at SCLK dictates the data rate. There is no
formal requirement and careful reading of the documentation is often required, as
some devices support any clock rate, whereas others expect minimal and maximal
rates ranging from kHz to MHz. This allows for theoretical transfer rates in the
megabit-per-second range. Several SPI modes are known, specifying whether SCLK
low or SCLK high indicates an active cycle (clock polarity, CPOL), or whether
MOSI/MISO are asserted at rising or falling SCLK (clock phase, CPHA). The
data on MOSI/MISO is free format (i.e., it is the responsibility of the device
manufacturer to specify the meaning of any received or sent bit patterns). Also,
the length of the transfer is open (i.e., the transmission may be indefinite as long as
SCLK is present and SS is low).
There is no handshake implemented in SPI, meaning unless there is an explicit
protocol implemented in the device, both host and slave can send data in full-duplex
without confirmation.
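The full-duplex nature of SPI, where one bit shifts out on MOSI while another shifts in on MISO on every clock cycle, can be modelled in software; a simplified mode-0 sketch, not tied to any particular device:

```python
def spi_exchange(master_byte, slave_byte):
    """Simulate one full-duplex SPI byte transfer (MSB first).

    On each of the eight clock cycles the master shifts a bit out on MOSI
    and simultaneously samples MISO; the slave does the opposite.
    Returns (byte received by master, byte received by slave).
    """
    master_in = slave_in = 0
    for i in range(7, -1, -1):            # MSB first
        mosi = (master_byte >> i) & 1     # master drives MOSI
        miso = (slave_byte >> i) & 1      # slave drives MISO
        slave_in = (slave_in << 1) | mosi
        master_in = (master_in << 1) | miso
    return master_in, slave_in

received, sent_to_slave = spi_exchange(0x9A, 0x5B)
print(hex(received), hex(sent_to_slave))  # 0x5b 0x9a
```

The symmetry of the loop illustrates why SPI needs no handshake: every byte clocked out necessarily clocks a byte in, whether or not the other side has anything meaningful to say.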

19.1.3.2 Inter-Integrated Circuit

The Inter-Integrated Circuit (I2 C) standard defined by Philips (now NXP) in 1982
supports a more structured exchange of data than SPI. It requires just two wires,
serial clock (SCL) and serial data (SDA), and much of the meaning of the data is
specified in the protocol. I2C has evolved over the past thirty years, increasing the
number of devices that can be addressed over the two wires from dozens to over
a thousand, and transfer rates from 100 kbit/s to 5 Mbit/s. However, the fastest mode
(ultrafast mode) does currently only support unidirectional communication, while
all other modes are bidirectional. Because of the structured format of data exchange,

SPI modes (Figure 19.3C):
mode 0: clock polarity low, clock phase rising
mode 1: clock polarity low, clock phase falling
mode 2: clock polarity high, clock phase rising
mode 3: clock polarity high, clock phase falling

Figure 19.3 Serial Peripheral Interface. (A) Architecture. The example shows a microcontroller (host)
connected to an output device (Hello World), a mixed device (Slide left to...) and a sensor (xyz). The
clock and data lines are shared where needed. Note that it is possible to operate the output device without
establishing the MISO/DO connection. The slave select signal is specific to every device. (B) Operation.
The slave select signal is pulled low in order to establish data transfer. Upon the first clock cycle, bits
are sent according to the clock (here: 011011 on MOSI, 101101 on MISO). (C) SPI modes specifying
definition of the clock signal.

I2C delivers potentially less throughput than SPI, but it is capable of addressing and
connecting devices with a minimum of circuit board real estate and is therefore
suitable for the communication of many ICs in a single circuit.
The two lines, SCL and SDA, are electrically configured in an open-collector
arrangement (i.e., connected to V+ through a pull-up resistor). This establishes
when they are high (near V+) or low (near ground). SCL and SDA lines are shared by
all slave devices connected to a host, and the host can give up its role and hand
over control to another device to act as host. Every I2 C communication happens
in two phases: first, the host sends a 10-bit address to specify the slave that it is
working with, followed by a bit that indicates whether it wants to read from or write
to that slave. Immediately in the next clock cycle, the corresponding slave confirms
by pulling SDA low. The designated sender (either host or slave) sends 8-bit data
over the SDA line, immediately confirmed with a low SDA by the receiver. The host
then concludes the communication by pulling SDA up again. The protocol further
allows the host to specify which register of the slave to read or write. The meaning
of those registers depends entirely on the nature of the device and is usually subject
to documentation. An important feature of I2C is clock stretching. This allows the
slave device to keep SCL low, thus stopping the communication until it is ready to
process any further requests by the host. This is a very effective mechanism of flow
control.
A potential weakness of I2C, especially in electrically noisy environments,
is that if there are any ambiguities in the interpretation of the signal, host and
slave devices can easily get out of sync, rendering the devices unaddressable. While
this issue can be resolved by resetting SCL after a time-out period, it clearly
comes at the cost of reduced transfer rates. Another issue is the shared address
space (i.e., device addresses are not strictly unique identifiers but are often used by
manufacturers to designate a particular device class). This then requires additional
tricks, such as physically interrupting SCL or SDA for the device that should remain
inactive, facilitated by the fact that I2C supports hot swapping. The total capacitance
of the bus is limited to 400 pF, which restricts its length to a few meters.
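The two-phase structure of an I2C transaction can be sketched as a bit-level frame builder; a deliberate simplification that uses 7-bit addressing (the text describes 10-bit addressing) and marks acknowledge slots with a placeholder rather than modelling the slave:

```python
def i2c_frame(address, read, payload):
    """Build a simplified I2C frame: address bits, R/W bit, then 8-bit data.

    ACK slots, where the receiver would pull SDA low, are marked with the
    placeholder string 'A'. Start/stop conditions are omitted for brevity.
    """
    bits = [(address >> i) & 1 for i in range(6, -1, -1)]  # 7-bit address, MSB first
    bits.append(1 if read else 0)                          # R/W bit
    frame = bits + ["A"]                                   # slave acknowledges
    for byte in payload:
        frame += [(byte >> i) & 1 for i in range(7, -1, -1)] + ["A"]
    return frame

# Hypothetical write of one byte (0x01) to a device at address 0x48:
frame = i2c_frame(0x48, read=False, payload=[0x01])
print(frame)
```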

19.1.3.3 1-Wire

The 1-wire serial interface introduced by Dallas (now Dallas-Maxim) is an out-
standing bus in many respects. It supports wire lengths of hundreds of meters, knows
a 64-bit address space in which each device has its own unique identifier (composed
of a 48-bit unique ID, an 8-bit device type identifier, and an 8-bit check sum), and
it doubles as both power supply (excluding ground) and data connection.

1-wire uses asynchronous serial transmission as discussed in Section 19.1.2,
and has relatively low throughput at just about 16 kbit per second. Moreover, like
I2C, the wire supports only half-duplex communication. Every 1-wire device must
carry a capacitor that supplies power to the device when a logical 0 (i.e.,
ground) is applied on the wire; this restricts 1-wire to device classes that can
operate in the microampere range with only very low milliampere peak currents. In
order to preserve this charge while the device is sending data, the signal is encoded
in long pulses at ground to indicate 0, and short pulses at V+ to indicate 1.
In the absence of signalling, the wire is usually pulled
up through a resistor to V+.
The interface supports hot swapping. The protocol itself resembles I2C in that
the host initiates communication with a long reset pulse at ground, followed by a
bitstream containing the address and request to either read or write information.
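The 64-bit ROM code can be taken apart and verified with the Dallas/Maxim CRC-8; the example ROM below is made up for illustration, and the LSB-first byte order is an assumption about how the code is commonly presented:

```python
def crc8_dallas(data):
    """Dallas/Maxim CRC-8 (polynomial x^8 + x^5 + x^4 + 1, reflected: 0x8C)."""
    crc = 0
    for byte in data:
        for _ in range(8):
            mix = (crc ^ byte) & 1
            crc >>= 1
            if mix:
                crc ^= 0x8C
            byte >>= 1
    return crc

def parse_rom(rom):
    """Split a 64-bit 1-wire ROM code (8 bytes, family code first) into its parts."""
    family, serial, crc = rom[0], rom[1:7], rom[7]
    ok = crc8_dallas(rom[:7]) == crc  # checksum covers the first 7 bytes
    return family, bytes(serial), ok

# A made-up ROM for a hypothetical device of family 0x28:
payload = [0x28, 0xFF, 0x64, 0x1E, 0x0F, 0x00, 0x00]
rom = payload + [crc8_dallas(payload)]
print(parse_rom(rom))
```

A useful property of this CRC is that running it over the full 8-byte ROM, checksum included, yields zero, which makes validation a one-liner on a microcontroller.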

19.1.4 Joint Test Action Group

The Joint Test Action Group (JTAG) interface was invented in the 1980s as a means
for the automated testing and programming of circuit boards in mass production.
The idea was to use a standard header which could connect to a testing infrastruc-
ture, probing the circuit board with defined inputs and reading the respective output,
or to copy firmware to microprocessors already mounted on the circuit board. This
is similar to the ICSP ports on the Arduino board as depicted in Figure 16.1, where
two groups of 6 pins each allow access to most device functions.
While JTAG itself resembles an SPI-like serial bus with defined functionality
(see Figure 19.4), the connector that plugs into or onto the header can differ from
vendor to vendor, thus rendering JTAG a somewhat ill-defined industry standard.
The core specification indicates 4 JTAG pins, but JTAG connections with vendor-
specific extensions with up to 40 pins are commonly seen in the field. This has also
led to rich ecosystems of programming adapters, although bit banging of the four
core JTAG pins is still possible.

19.1.5 Universal Serial Bus

The Universal Serial Bus (USB) is probably the most commonly known device
interface these days, having been the de facto standard for data communication in
the professional and consumer market since its introduction in 1995, as well
as having had an additional role as power adapter since the mid-2000s (see Section
13.2.2).

Figure 19.4 JTAG interface. Every chip on the circuit board has its own JTAG interface. JTAG
communication resembles SPI (see Figure 19.3). Similar to SS in SPI, TMS is mode select and specifies
the IC on the circuit that is to be tested or programmed. The clock signal from the host adapter (TCK)
can range from 1 to 100 megabits per second and paces data from TDI (data in) to the first IC. If an IC
is not selected on TMS, TDI is forwarded to TDO (data out); otherwise TDI is interpreted and the result
(e.g., content of a particular register in the IC) sent to TDO. This daisy chain link ultimately returns the
result to the host adapter. An interesting technical detail is the link between TCK and TDI/TDO: TDI
bits are asserted on the rising edge of TCK, whereas TDO pushes bits on the falling edge.

USB is a master-slave serial bus like SPI, but rather than wiring every device directly to the host, it allows complex device networks following a tiered star architecture, with a central host and additional tiers of hubs around it. For example, while the host may only expose a single USB interface, a hub with many ports (which can itself be part of an end device) makes it possible to connect a large number of devices. Power demands are the limiting factor for the physical size of the device network, with powered USB devices and so-called active (powered) hubs extending the maximum reach from the primary host to the end device in any one chain.
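As a back-of-the-envelope sketch, the tiered star can be modelled as a nested tree and checked against two limits from the USB specification: 127 addressable devices (7-bit addressing) and at most five external hubs between root port and end device. The tree encoding and function below are invented for this illustration.

```python
# Sketch: validating a USB device tree against bus-level limits:
# at most 127 addressable devices (7-bit addressing) and at most
# five external hubs between the root port and any end device.

MAX_DEVICES = 127
MAX_HUB_DEPTH = 5

def tree_is_valid(tree, depth=0):
    """`tree` is a nested list (a hub with its children) or the string
    'dev' (an end device). Returns (ok, number_of_addresses_used)."""
    if tree == 'dev':
        return True, 1
    if depth > MAX_HUB_DEPTH:       # hub chain too deep
        return False, 0
    ok, count = True, 1             # a hub consumes an address, too
    for child in tree:
        child_ok, child_count = tree_is_valid(child, depth + 1)
        ok, count = ok and child_ok, count + child_count
    return ok and count <= MAX_DEVICES, count

# a hub carrying two devices and one sub-hub with two more devices:
print(tree_is_valid(['dev', 'dev', ['dev', 'dev']]))   # -> (True, 6)
```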
Traditionally USB was a half-duplex format, featuring low-speed (1.5 Mbit/s)
and full-speed (12 Mbit/s) connections. The second generation of USB (USB 2.0)
added a high-speed mode of 480 Mbit/s. USB 3.0 brought 4.8 Gbit/s (super-speed)
and the latest incarnation (USB 3.1) can interface at full duplex in super-speed-plus mode at around 10 Gbit/s. The USB specification is complex, and a detailed discussion of its transfer modes, device classes and host controller roles is beyond the scope of this book. It suffices to say that USB has grown from a simple plug-and-play serial interface for keyboards and mice into a one-size-fits-all industry standard that has given rise to a huge ecosystem of devices.
However, two features deserve some consideration: the compatibility between different USB adapter/plug systems, fostering the connection of sensible device combinations, and the differential transfer that gives USB devices some resistance against electromagnetic interference and noise. Standard-size USB adapters
following the 1.0, 1.1 and 2.0 specifications typically have four pin connections (see Figure 19.5A). These are 1: V+, 2: D-, 3: D+ and 4: ground. Mini- or micro-USB
in addition feature an ID line between D+ and ground which, if the pin is pulled to ground, indicates that the device can take the host role, while the other device becomes a slave; the relevance of this is explained below. Host and slave roles are implicitly represented in the USB plug, and USB cables reflect sensible and uncommon device combinations in the cable matrix (Figure 19.5B); that is, some cables simply don't exist. USB 3.0 adapters feature up to six more pins, primarily adding
two additional D+/D- pairs. They are compatible with standard-size plug type A
adapters by having the four standard pins in a proximal position, allowing them to
communicate at up to USB 2.0 speed, while the additional pins are slightly recessed
and only connected in USB 3.0 sockets. The smaller adapter types resemble mini- or micro-adapters, but add extra pins in a lateral extension, which makes USB 3.0 mini- and micro-adapters incompatible with older devices. The new USB type C
adapter takes the pin number to 24, adding additional data line pairs. The rationale
behind the D+/D- data pairs is as follows: USB uses differential signalling (Figure
19.5C), in which both lines have to take opposite levels to yield a valid signal. If the signals are not complementary, the sample is invalid, and erroneous transmissions can thus more easily be identified.
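A toy model can make the error-detection argument concrete. The sketch below (Python, illustrative only; real receivers do this in hardware and additionally handle bit stuffing, which is omitted here) decodes an NRZI-coded stream, treating any sample where D+ and D- agree as invalid rather than as data. The starting line level is an assumption of this sketch.

```python
# Toy decoder for USB-style differential NRZI signalling: a logical 1
# is encoded as "no transition" on the line, a logical 0 as a
# transition. D- must mirror D+; samples where the two lines agree are
# flagged as invalid ('?') instead of being misread as data.

def nrzi_decode(d_plus, d_minus, start_level=1):
    bits, prev = [], start_level
    for hi, lo in zip(d_plus, d_minus):
        if hi == lo:                # lines not complementary: error
            bits.append('?')
            continue                # hold the last good line level
        bits.append(1 if hi == prev else 0)
        prev = hi
    return bits

d_plus  = [1, 1, 0, 0, 1, 1]
d_minus = [0, 0, 1, 1, 0, 1]        # last sample corrupted on D-
print(nrzi_decode(d_plus, d_minus))  # -> [1, 1, 0, 1, 0, '?']
```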
USB On-The-Go (OTG) added the ability for devices to take both host and slave roles, whereas the original specification considered a type A device the host and a type B device the slave. Today, many device classes (e.g., personal computers,
printers, digital cameras) come with a plug that takes the device capabilities into
account (e.g., it would be uncommon for a printer to take the host role if it is
connected to a computer, while it may well serve as host when a camera is directly
connected to it). This is mediated by pin 4: ID. Among OTG's other features is the ability to negotiate sessions, outside of which the data connection rests and therefore does not consume any power.

19.2 LONGER-RANGE WIRED COMMUNICATIONS

Section 19.1 on hardware interfaces primarily focused on how peripheral components communicate with the central computing unit within a device or around a local computer. This implied distances of at most a few meters and correspondingly short response times. In complex technical systems, dozens if not hundreds or thousands of devices

[Figure 19.5 diagram. (A) Pinouts of USB plug types A, B, mini-A, mini-B, micro-A, micro-B and C: types A and B expose pins 1: 5V, 2: D-, 3: D+ and 4: GND; the mini and micro types add a fifth ID pin. (B) Plug/cable compatibility matrix with combinations marked yes, no or rare. (C) NRZI-coded example sequence 1 0 1 1 0 1 0 on D+ and D-, with line levels of 2.8 V and 0.3 V.]

Figure 19.5 USB plugs and data connection. (A) USB plug types, not to scale. The most common plugs
are standard type A connectors on personal computers, now slowly being replaced with type C plugs.
Micro-B adapters are now commonly used as charging adapters. (B) Plug/cable compatibility matrix.
Some plug combinations are rare, while others don’t exist. Adapters following USB 3.0 specification
are not explicitly shown. Type C plugs are most often connected to type C, but cables to USB 3.0-type
standard, mini-A and micro-A can occasionally be found. (C) USB encodes signals following NRZI (see
Section 18.3). The D+/D- signals are maintained at 2.8 V and 0.3 V. Only if the signals take opposite, differential voltages is the signal valid. In the case of the exemplary signal loss on D+, the overall message would still remain valid, as the short disturbance can be recognized by looking at D-.

need to exchange information, sometimes over distances of several kilometers. For example, there are hundreds of kilometers of cabling in a modern Airbus. Even a standard family car today features over 50 microcontrollers that monitor and orchestrate everything from motor control and assisted braking to reading lights. HVAC systems are often centrally controlled in commercial buildings. And logic controllers on industry shop floors exchange information with manufacturing execution systems. Some of these scenarios are discussed in Chapter 6. Last but not least, every office computer network must exchange data over long distances.
Fieldbus systems mediate data communication within vehicles, buildings and
for industrial manufacturing. This chapter first introduces fieldbuses before moving
on to Ethernet as the de facto standard for wired Internet data communication,
and it concludes the section on longer-range wired communication with powerline
systems that send data over the local electrical grid. The logic behind this order is
as follows: Initial fieldbuses aimed for the most cost-effective connection of devices
over walkable distances, avoiding the complex and potentially slow protocol stack
that is inherent to Internet communication. In fact, many fieldbus systems were
developed in parallel to TCP/IP at a time when, for example, these Internet protocols
were just undergoing standardization. Only later, when Ethernet became more widespread for general-purpose data communication, did device manufacturers realize the opportunity to utilize this infrastructure for what is now termed industrial Ethernet for real-time applications. Thus, before delving into the intricacies of
actual Ethernet, the section on fieldbuses is going to mention it as a transport layer
for specific protocols.
While most of the discussion around wired communication is going to touch
upon conventions between device manufacturers (as they are standards), there are
also challenging technical issues when communicating digital signals over long
wires. For example, they act as antennas and are susceptible to electromagnetic
interference, and may therefore require additional measures against noise. There is also the effect of capacitive load: the accumulated capacitance of a long wire limits how quickly the line voltage can change when it carries the rapid spike trains of digital signals. Together, these issues place a physical constraint on the maximum data rate that can be communicated over long wires. And at the most basic level, while the book does not specifically address these rather mundane details for each technology, wired connections are not very tolerant of empty branches that leave the electric circuit open: so-called terminators that add a resistor (often 120 Ω in RS-485-based systems) to unused connectors are commonplace.

19.2.1 Fieldbus Systems

Even before digital computers became the norm in industrial device control, ma-
chines exchanged information among each other; tubes and pipes indicated status
by air pressure. In quite the same way, the 4-20 mA current loop was established
in the 1950s and was one of the earliest standards to communicate device status.
The loop is a wire pair forming an electric circuit that carries a current which, by
some convention, reflects the status of a device. For example, it may represent an analog measurement from a thermometer (e.g., 4 mA = 0 °C and 20 mA = 100 °C, with values in between interpolated) or could be used to indicate discrete logic (e.g., 4 mA = proceed, 12 mA = pause, 20 mA = stop). The current loop is set up in a way
that it is robust to voltage drop due to long wire lengths or the impedance of a
device. While this is technically fairly easy to implement and current loops are still
frequently encountered on factory floors, they have a critical shortcoming: They are
peer-to-peer and connecting many different devices requires significant wiring ef-
fort. Just as discussed for hardware interfaces (see Section 19.1), serial bus systems
that interconnect a range of devices and that use an addressing system to direct data
often provide the best compromise between data rate and installation effort. For
field applications with communication over sometimes considerable distances, this
has led to the term fieldbus.
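The 4-20 mA convention lends itself to a small worked example. The sketch below (Python, illustrative; the 3.6 mA wire-break threshold follows common "live zero" practice but is an assumption, not a figure quoted from any standard) linearly maps a loop current onto a measurement range:

```python
# Interpreting a 4-20 mA current loop with a "live zero": 4 mA marks
# the bottom of the measurement range, so a current far below 4 mA
# signals a broken loop rather than a small reading. The 3.6 mA
# threshold below is an assumed convention.

def loop_to_value(current_ma, lo, hi):
    """Linearly map 4..20 mA onto the range lo..hi."""
    if current_ma < 3.6:
        raise ValueError('implausibly low current: open loop / wire break?')
    return lo + (current_ma - 4.0) * (hi - lo) / 16.0

print(loop_to_value(12.0, 0.0, 100.0))   # mid-scale -> 50.0
print(loop_to_value(20.0, 0.0, 100.0))   # full scale -> 100.0
```

The "live zero" is what gives the loop its robustness: an open circuit reads 0 mA, which is unambiguously distinguishable from the lowest valid measurement at 4 mA.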

19.2.1.1 Fieldbus Ecosystems

Today there is a vast range of competing fieldbus standards. While this choice is
influenced by market politics and the attempt of vendors to be the one who rules
them all, there are some valid technical reasons for diversity and some bus systems
simply reflect the needs of different verticals: for example, the maximum response
times in automotive applications with communication between critical components
for safe driving are unlike those of the manufacturing industry, where there is
potentially more tolerance for lag, but more complex network architectures and
larger individual data points. Modern fieldbuses can roughly be divided into systems
that define their own hardware interfaces (plugs, cables, switches; including some
standard components such as RS-232) and software protocols, and those that build
on Ethernet for the physical layer (OSI 1-2), but which often add a protocol stack
that is distinct from TCP/IP (industrial Ethernet).
There are many approaches to connect a set of devices. For example, a bus
with ring topology (see Figure 9.1) may be implemented with a series of pairwise
RS-232 connections and allow convenient low-profile message handling, but may not be resource-efficient or easy to maintain on a shop floor with large distances
between devices. An Ethernet connection may allow the utilization of existing
infrastructure and require less wiring, but TCP/IP communication takes too long
for real-time messaging. At the same time, all bus topologies have pros and cons
in terms of network organization, depending on how many devices have a control,
sensor or actuator function, and where they are located relative to each other in
the network. For example, who controls traffic between mission-critical devices
when several sensors are trying to communicate over the same bus at the same
time? If the application is controlled centrally, polling from particular devices or
allotted time slots (TDMA, time division multiple access) may be a choice. If the
application lends itself to distributed control, pseudorandom approaches such as
carrier sense multiple access/collision avoidance (CSMA/CA) are useful, in which
devices can send data on the fieldbus after a predetermined break following a ranked
priority of each device. In the worst case, the messaging behavior of all devices
is random, to the extent that even additional grace periods after collisions are unpredictable (CSMA/CD, carrier sense multiple access/collision detection). While such communication patterns may eventually yield successful data exchange, the lag may be intolerable for real-time response. We will revisit some of these issues
in the more detailed discussions of a few exemplary fieldbuses.
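The ranked-priority flavor of CSMA/CA described above boils down to a simple rule: after the bus goes idle, each ready device waits a number of slots given by its rank, so the contender with the lowest rank transmits first. A minimal sketch, with device names and ranks invented for illustration:

```python
# Ranked-priority medium access: once the bus is idle, every device
# that wants to send waits a number of slots equal to its rank; the
# device with the lowest rank therefore seizes the bus first, and no
# two ranked devices ever collide.

def next_sender(ready_devices, rank):
    """Return the device that wins the next transmission slot."""
    return min(ready_devices, key=lambda device: rank[device])

rank = {'brake_ctrl': 0, 'engine': 1, 'hvac': 5, 'radio': 7}
print(next_sender(['hvac', 'brake_ctrl', 'radio'], rank))  # -> brake_ctrl
print(next_sender(['radio', 'hvac'], rank))                # -> hvac
```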
In the range of fieldbuses, some are established and widely used industry stan-
dards, while others are only employed by niche manufacturers for their particular
devices. The selection here provides just a glimpse into some common fieldbuses,
not in any particular order or by any means comprehensive:
• Application: Vehicle
– Controller area network (CAN)
– Local interconnect network (LIN)
– FlexRay
– Media oriented systems transport (MOST)
– Train communication network (TCN)
• Application: Building automation
– Building automation and control networks (BACnet)
– Digital addressable lighting interface (DALI)
– Local operating network (LON)
– DyNet
– KNX
• Application: Industrial automation
– BitBus
– ModBus
– ControlNet
– Interbus
– ProfiBus
– Foundation Fieldbus
It should be noted that the above classification into applications is rather
arbitrary, and that these fieldbuses also cover different layers of the OSI model.
While CAN is probably best known for its use in automotive, it can also be found in
building and industrial automation. In fact, successful buses like CAN can give rise
to entire domain-specific protocol stacks. For example, CANopen and DeviceNet
both build on CAN, the latter being an implementation of the Common Industrial
Protocol (CIP). However, CIP is also implemented on top of ControlNet, which has
a very different technical underpinning. This hints at the zoo of device interfaces
and exchange protocols on the modern factory floor. Much like in the IoT on a
global level, interoperability between devices of different manufacturers is a major
pain point when working with real-life installations.
In an attempt to provide standardization of fieldbus systems, since 1999
the IEC has defined a taxonomy under the name Communication Profile Families
(CPF), which aims to classify and define them by technology irrespective of their
brand names. While this is a worthwhile attempt, the CPF list is far from complete.
Using the example of an industrial setting (see Figure 19.6), the various levels
of the fieldbus pyramid and their communication requirements are discussed.

Transfer Rate and Data Size

On the bottom of the pyramid (see Figure 19.6) there are devices that may only be required to send or receive a few bits, but with very little tolerance for data loss or lag. This is the level where it may be desirable to have a fast, direct
sensor-actuator connection, which may not be possible because of distance or
network architecture. In this case, the fieldbus is a compromise. Toward the tip
of the pyramid, these response times typically become less critical, while at the
same time the size of data packets increases and the overall network may become
more complex. At the very tip of the pyramid, when information does not have
to be processed in real-time, the complexity of the system is ultimately the most
important issue, so that the Internet protocols detailed in Chapter 22 become relevant. As mentioned before, among other considerations that play into
the selection of a fieldbus is the handling of messages.

Update Cycle and Communication Pattern

In the example of Figure 19.6, a simple peer-to-peer communication pattern that is initiated by a sensor, carrying maybe a single raw value, is sufficient to inform the valve controller of the system pressure. However, on the field level the PLC might
pull data from all valves in synchronized iterative steps, perform a calculation, and
then update the valve controllers simultaneously. On the cell level, it might be
necessary that all PLCs can read the status of all other PLCs at any given time.
In such a scenario, the fieldbus changes conceptually from a transmission medium
to a means to provide shared memory. Some bus systems define on the application
layer (OSI layer 7) specialized data structures that represent the properties of the
physical objects that they control — this is in stark contrast to other fieldbuses that
just focus on the physical and network layers.
At some stage an embedded system, PLC or DCS can take the role of a gateway and interface the devices connected via the fieldbus to the Internet. As such, fieldbus systems extend the IoT into the parts of buildings or the shop floor that do not typically come with an independent Internet connection.

19.2.1.2 Traditional Fieldbuses

This section provides some technical details on three exemplary fieldbus systems, primarily from the industrial automation domain, where fieldbuses first originated, as well as on the CAN bus, which is important in automotive communication. The point of this exercise is less the accurate reflection of the specifications of the respective fieldbuses, and more to give the reader an appreciation of their utility and to provide some approximate figures for distances, times and data rates in fieldbus communication.

[Figure 19.6 diagram: fieldbus pyramid across four levels. Sensor/actuator level: pressure sensors and a valve controller about 100 m apart, connected via 1-wire, 4-20 mA or a fieldbus (1). Field level: PLCs 500-1000 m from the valves, connected via fieldbus or industrial Ethernet (2). Cell level: PLCs reporting to a DCS over 1000-5000 m (3). Command level: MES, APS and ERP systems connected worldwide over Ethernet.]

Figure 19.6 Fieldbus pyramid. This example pictures a plant at four levels. On the sensor/actuator level, pressure sensors feed back information to a valve controller that is about 100 m away. Fast response to over- or underpressure may be required here (1). Depending on the distance and the number of sensors, direct wiring of the sensors to the microcontroller in the valve's control box could be an option: the 1-wire hardware interface (see Section 19.1.3.3) allows for connections up to this distance. It would also be possible to have a direct 4-20 mA current loop running between a sensor and the controller and react to a change in current. Alternatively, a fast real-time-enabled fieldbus such as CAN would lend itself to short response times for a few data bits. On the field level, several such valve controllers would be attached to a PLC, possibly overseeing an entire pipeline within a factory. The distance ranges from 500 to 1000 m between the PLC and the valves (2). The PLC acts as coordination center without human intervention, processing information within short cycle times. Again, real-time-enabled fieldbuses may be required. On the cell level, several PLCs feed back information to a distributed control system (DCS), where data visualization for human supervision might be happening. This might be to control all processes within the entire plant, capturing processes within a 1000 to 5000 m perimeter (3). Most fieldbus systems should be able to return the necessary data within such constraints. On the command level, MES and other software components (advanced planning and scheduling, APS; ERP) may be drawing large data sets from the DCS via conventional Internet connections, i.e., Ethernet, with response times of several seconds. Industrial Ethernet may be used on all three levels (1-3) by excluding TCP/IP traffic in favor of purpose-built standards such as EtherCAT or ProfiNet.

BitBus

BitBus is potentially the oldest industrial fieldbus system. It was first introduced by Intel in 1983 and in 1991 was established as IEEE industry standard 1118. On the physical level BitBus uses half-duplex RS-485, a serial communication standard extending RS-232 (see Section 19.1.2) by the ability to connect multiple devices to the same transmission lines and a common ground. The link level is determined by SDLC (Synchronous Data Link Control), an IBM standard with error correction dating back to the 1970s. On the network level, constrained by signal run times and impedance, BitBus supports a synchronous mode over short distances with a maximum of 28 slave devices at 2.4 Mbit/s, and a self-clocking mode reaching considerably further, connecting up to 250 nodes at 62.5 kbit/s. With its master-slave protocol, BitBus is deterministic and has guaranteed response times. All slaves receive a message sent by the master, so it is the responsibility of each slave to identify which messages are addressed to it. Much of the BitBus standard was encapsulated in Intel's i8044 chip, which supported individual messages of up to 248 bytes, plus information for remote access and control (RAC, i.e., addressing). The addressing and routing information in each BitBus message also allows the forwarding of data to components that are not directly connected to the BitBus; the correct handling of such information, however, is in the hands of the application.
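The broadcast-and-filter behavior can be sketched in a few lines. The frame layout below (one address byte followed by a payload of at most 248 bytes) is a simplification invented for this illustration, not the real BitBus/SDLC frame format:

```python
# Broadcast with slave-side address filtering, as on BitBus: the
# master sends one frame to all slaves, and each slave keeps only the
# frames that carry its own address.

MAX_PAYLOAD = 248   # BitBus message payload limit in bytes

def make_frame(address, payload):
    assert 0 <= address <= 255 and len(payload) <= MAX_PAYLOAD
    return bytes([address]) + payload

def slave_receive(my_address, frame):
    """Return the payload if the frame is addressed to us, else None."""
    return frame[1:] if frame[0] == my_address else None

frame = make_frame(7, b'\x01\x02')
print(slave_receive(7, frame))   # -> b'\x01\x02'
print(slave_receive(9, frame))   # -> None
```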

A/S Interface

The bit-oriented A/S Interface (ASI) bus for the exchange of parameters between
actuators, sensors and PLCs was conceived by a consortium of eleven German
industrial companies and the specification first published in 1991. By the mid-
1990s, the first prototype interfaces became available and adoption began to rise,
leading to IEC standard IEC 62026-2 in 1999. An important focus of ASI is the
suitability of the hardware interface for the shop floor, as well as having minimal
overhead for the communication between PLCs and periphery. For example, the
yellow or red flat rubber cable specified for the physical connection is designed to be self-healing to IP67 in case of accidental damage, while being compact and flexible enough to allow for minimally invasive upgrades to the factory infrastructure. ASI expects a master device as well as a dedicated bus power supply, allowing it to power the messaging infrastructure as well as low-power peripheral devices (the signal is modulated over the power lines, with the power also providing the clock signal for Manchester encoding; see Section 18.3).

No device can be further than two repeaters away from the host, overall resembling a device tree as discussed for USB (see Section 19.1.5). As every segment can be at most 100 m long, this means that any two devices can be no further than 300 m apart. The master continuously iterates over all devices. Each master-slave dialog of 4 bits (net, without overhead) takes on the order of 150 µs. The update times in a fully extended network of 62 devices are thus specified with firm upper bounds for sensor input and actuator output. This equates to a data rate of 167 kbit/s between a master and a slave.
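Since the master simply iterates over all slaves, the cycle time is the per-dialog time multiplied by the number of slaves. The sketch below uses a dialog duration of about 150 µs, a commonly cited ballpark figure that should be treated as an assumption rather than a quotation from the specification:

```python
# Back-of-the-envelope ASI polling cycle: one master-slave dialog per
# slave and per cycle. The 150 µs dialog time is an assumed ballpark.

def cycle_time_ms(n_slaves, dialog_us=150):
    """Full polling cycle in milliseconds."""
    return n_slaves * dialog_us / 1000.0

print(cycle_time_ms(31))   # -> 4.65 (about 5 ms)
print(cycle_time_ms(62))   # -> 9.3 (about 10 ms)
```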

ProfiBus

ProfiBus (process field bus) started out as a collaborative research project between
more than a dozen industrial partners and universities in the late 1980s, and has been
standardized by various national bodies and since 1999 as IEC 61158/IEC 61784.
The original specification aimed to be a universally applicable messaging system for
industrial control systems termed ProfiBus FMS (fieldbus message specification)
— unifying communication from the sensor level to connecting shop floor to top
floor, and providing an application layer for complex devices. Unfortunately, with
comparatively long polling cycles and relatively low transfer rates, FMS was too heavyweight for many real-time applications.
The most widely used ProfiBus variant today is ProfiBus DP (Decentralised
Periphery), first established in 1993. The bus uses the half-duplex RS-485 standard
with a 9-pin D-Sub connector or an M12 plug on the physical layer. Devices are arranged in a linear bus with master-slave architecture that uses token passing on the link layer. Each linear bus can be up to 1200 m long (at the lowest data rates), and like ASI the topology allows for treelike expansions of the bus organization. However, overall, ProfiBus can only
support 127 devices on the bus. The transition from one bus segment to another
requires the use of a repeater device, which impacts the data rate. Depending on
the number of segments, physical segment length and overall length, the maximum
data rate of ProfiBus can range from 9.6 kbit/s to 12 Mbit/s. Each message can carry a payload of 0 to 246 bytes. ProfiBus DP defines different device classes, which impact their role and priority on the bus. Some devices can periodically become the
master on the bus, while others will only act as slaves. This gives rise to ProfiBus
DP’s application layer with three distinct versions: DP-V0 (periodic communication
and device status), DP-V1 (acyclic communication and alarm) and DP-V2 (slave-to-
slave communication). ProfiBus sees itself as a mediator between real-time critical
systems running, for example, ASI, and high-level integrators that communicate
with top floor applications through Ethernet.
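The trade-off between segment length and data rate can be captured in a small lookup. The figures below follow the commonly cited RS-485 derating table for ProfiBus DP and should be read as approximate guidance, not as normative values from IEC 61158:

```python
# Approximate relation between ProfiBus DP segment length and the
# maximum usable data rate, after the commonly cited RS-485 derating
# table for ProfiBus; treat the numbers as guidance only.

SEGMENT_TABLE = [  # (max segment length in m, max data rate in kbit/s)
    (100, 12000),
    (200, 1500),
    (400, 500),
    (1000, 187.5),
    (1200, 93.75),
]

def max_rate_kbit(segment_length_m):
    for max_len, rate in SEGMENT_TABLE:
        if segment_length_m <= max_len:
            return rate
    return None   # segment too long; a repeater is needed

print(max_rate_kbit(150))    # -> 1500
print(max_rate_kbit(1100))   # -> 93.75
```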

There is a last ProfiBus variant, ProfiBus PA (Process Automation), for applications in hazardous areas with explosion risk. ProfiBus PA installations use
the same protocol as ProfiBus DP, but on the physical layer control the number of
power supplies that can be used and the maximum current that can be drawn over
the interface. This concept is also known as Fieldbus Intrinsically Safe Concept
(FISCO), a standard also adopted by a variety of other fieldbuses.

Controller Area Network

The Controller Area Network (CAN) was introduced by Bosch in 1983 for automotive communication, and saw first deployments in vehicles in 1988. After standardization of various CAN variants by ISO in the 1990s as ISO 11898-x, CAN has become the de facto standard for in-vehicle communication, but has also seen adoption in
building and industrial automation.
The core CAN specification describes OSI layers 1 and 2 of a half-duplex linear serial bus, with application-specific adaptations like CANopen or DeviceNet
on higher levels. The physical layer resembles RS-485, often with the typical 9-pin
D-sub adapter, although only two data lines termed CANH and CANL participate in
differential signaling and are the only lines used in space-constrained installations.
In a neutral state, called recessive or logic 1, both lines carry about 2.5 V. Logic 0 is also called dominant, and is encoded by pulling CANH toward 3.5 V and CANL toward 1.5 V.
CAN is a multimaster protocol that allows each device to take the host role for
the duration of a message, provided that the line has been in recessive mode for
a certain duration. CAN adopts CSMA/CD+AMP (AMP, arbitration on message
priority, which is effectively CSMA/CA): each master receives back its own message on the bus, allowing for error and collision detection and correction. Over short distances of less than 40 m, CAN can achieve data rates of 1 Mbit/s, while a low-speed mode with up to 125 kbit/s is specified to work over several hundred meters. In building automation, where distances can easily become larger and the cost of cabling can become an issue, a CAN variant that dispenses with the CANL line exists; however, this standard can only support 40 kbit/s, and it signals by pulling the single data line between V+ and ground.
Each device on the CAN bus has a unique identifier (11 bit in the base frame
format defined in CAN 2.0A, a 29-bit identifier in the extended frame format defined
in CAN 2.0B). This identifier serves for addressing and prioritization of messages.
Each message carries addressing data, error detection bits (for cyclic redundancy
checks), and up to 8 bytes of payload. There are four frame types on OSI layer 2,
a data frame (e.g., a broadcast of a value for general consumption), an information
request frame (i.e., from one device to another), an error frame in case of message
collisions and a spacer frame to introduce a delay on the bus. Since 2012 there
have been preparations for a modified CAN standard termed CAN FD (flexible data rate), which enables 64 bytes of payload while simultaneously reducing bit times.
Among the fieldbuses discussed in this section, CAN provides the highest
degree of reliability in terms of successful transmissions. With sub-millisecond response times for high-priority messages in high-speed mode, it is the fastest bus in this comparison.
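The arbitration scheme that makes CAN both collision-free and priority-aware can be sketched as a bitwise simulation: identifiers are sent MSB first, a dominant 0 overwrites a recessive 1 on the wire, and any node that reads back a dominant level while transmitting recessive withdraws. A minimal Python model (illustrative only):

```python
# Bitwise CAN arbitration (CSMA/CD+AMP): the frame with the lowest
# identifier survives arbitration undamaged; all other senders back
# off and retry later.

def arbitrate(identifiers, bits=11):
    """Return the identifier that wins arbitration (identifiers are
    unique on a CAN bus, so exactly one contender remains)."""
    contenders = set(identifiers)
    for bit in range(bits - 1, -1, -1):        # MSB first
        levels = {(ident >> bit) & 1 for ident in contenders}
        if levels == {0, 1}:                   # dominant 0 wins the slot
            contenders = {i for i in contenders if not (i >> bit) & 1}
    (winner,) = contenders
    return winner

print(hex(arbitrate([0x1A5, 0x0F3, 0x2C1])))   # -> 0xf3 (lowest ID wins)
```

Because arbitration is resolved bit by bit without destroying the winning frame, the highest-priority message loses no time to the collision, which is the basis for CAN's deterministic latency for high-priority traffic.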

19.2.1.3 Industrial Ethernet-Based Buses

In contrast to the fieldbus systems discussed in the previous section, which brought
about their own definitions of the physical and the link layer (OSI layers 1+2),
industrial Ethernet-based buses focus on purpose-built networking and applications
(although the typical RJ-45 connector is occasionally replaced by a 4-pin M8
connector in industrial settings). There are generally two strategies for these fieldbus
systems:
• One strategy is strictly speaking not a fieldbus anymore: it uses the well-defined Ethernet connection as a physical layer and provides high-level abstractions that are useful for the integration of data across the entire shop floor.
• Alternatively, modifications on all OSI layers facilitate real-time applications over standard Ethernet. Here, bus update cycles in the microsecond range are possible.
These real-time Ethernet variants deserve a closer look, as they often provide inter-
operability with standard Ethernet while bringing the necessary transmission rates
for time-critical applications. Without anticipating the discussion of conventional
Ethernet (see Section 19.2.2) and the IP/TCP/UDP stack (see Chapter 22) too much,
it should be noted that their key priority is reliable transmission and efficient routing
of messages over large networks — but not speed. This places industrial Ethernet-based buses, on all communication levels above OSI layer 1, in parallel to these technologies (see Figure 19.7). In the example of three fieldbuses that are using
Ethernet, it becomes clear how the dissociation from the standard communication
stack can yield the necessary data rates. Again, as in the previous section, this serves
to highlight the challenges that are addressed with particular technologies, and not
to provide the blueprint for implementations.

Ethernet Industrial Protocol

The publication of the Ethernet Industrial Protocol (Ethernet/IP) standard dates back to the year 2000 and represents a joint development of Allen-Bradley and
the Open DeviceNet Vendor Association (ODVA). Their aim was the adaptation of the Common Industrial Protocol, an OSI layer 4–7 suite of communication standards, to Ethernet. In simple terms, Ethernet/IP provides guidelines for the use of standard TCP and UDP communication for peer-to-peer (explicit messages) and broadcasting (implicit messages). While the protocol was initially designed for non-time-critical applications, improvements to Ethernet itself (from Mbit/s to Gbit/s) along with network services such as the Precision Time Protocol (PTP) now allow for response times in the millisecond range.

ProfiNet

ProfiNet (process field net) was proposed by the ProfiBus Foundation, and is part of
IEC 61158/61784 standards. However, except for concepts on the application layer,
it is completely independent of ProfiBus. There are two subsystems called ProfiNet
IO (for peripherals) and ProfiNet CBA (Component Based Automation, for the field
layer). Within ProfiNet IO there are three device classes (conformance classes, CC A–C),
reflecting the degree of conformity with the standard and their utility for certain
applications (A: infrastructure I/O, no time constraints; B: industry automation
and processing; C: highly time-critical I/O). This is complemented by three device
types, field device, controller or supervisor, which can take different priority on the
network.
On the basis of a standard Ethernet connection, ProfiNet CBA can utilize
TCP/IP to connect to top floor applications over the Internet. However, depending
on their need, ProfiNet IO devices can either utilize standard TCP/IP or UDP (CC
A), or the ProfiNet real-time control protocol, which itself knows two modes: the normal mode can operate at millisecond cycle times (for CC B), while the isochronous real-time (IRT) mode utilizes the aforementioned PTP mechanisms and can facilitate cycle updates in the microsecond range (for CC C). We will see how such short cycle times can be achieved in our discussion of the EtherCAT standard (the standards and specific terminology are different, but the approaches are conceptually similar).
248 The Technical Foundations of IoT

Ethernet for Control Automation Technology

Ethernet for Control Automation Technology (EtherCAT) was introduced by Beckhoff
Automation in 2003 and is standardized as IEC 61158. It is one of the most
rigorous implementations of real-time Ethernet, entering the OSI stack at layer 2.
EtherCAT is transparent to normal Ethernet traffic (called Ethernet-over-EtherCAT,
EoE), but requires specialised hardware (called EtherCAT Slave Controller, ESC)
for devices with real-time requirements. EtherCAT installations can assume most
topologies and combinations thereof, with up to 65,535 devices in each network
segment. The communication is full-duplex like any other Ethernet, but for perfor-
mance purposes maximum wire lengths of 100 m between devices are suggested.
The two key differences between EtherCAT and Ethernet are (a) on-the-fly
processing of frames passing through a device and (b) isochronous real-time with
significant sub-µs precision established through PTP and feedback mechanisms
between distributed clocks. The on-the-fly messaging works as follows: Unlike in
normal Ethernet, where a data frame (an OSI layer 2 structure) is received by a
device in its entirety, processed, and then the message is either sent in response
or forwarded, EtherCAT processes the data frame as it comes through. The bits that
have already arrived at the device are immediately copied into memory (through
DMA, direct memory access). If there is any response, this data can be copied into
empty slots of the data frame as it leaves the device onwards to the next, even before
the frame has been read and processed to its end. To facilitate these high speeds, the
entire protocol stack is implemented in hardware. While Gbit/s-Ethernet could be used
in principle, the current EtherCAT standard runs entirely on 100-Mbit/s-Ethernet
infrastructure, as the net performance is still significantly higher than that of normal
Ethernet with an IP/TCP/UDP stack.
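The on-the-fly principle can be caricatured in a few lines of Python. This is a deliberately simplified toy model, not the real protocol (which operates at bit level in dedicated ESC hardware): a frame is a list of byte slots, and each slave reads its input slot and fills a reserved empty slot as the frame passes through, without ever holding the whole frame back.

```python
# Toy model of EtherCAT-style on-the-fly processing (illustrative sketch only;
# the real protocol works at bit level in dedicated ESC hardware).
# The frame is a list of byte slots; each slave copies "its" input slot and
# writes its response into a reserved empty slot as the frame passes.

def pass_frame(frame, slaves):
    """Propagate a frame through a chain of slaves; each slave reads its
    input slot and fills its output slot while the frame keeps moving."""
    for slave in slaves:
        slave["received"] = frame[slave["read_slot"]]   # copy on the fly (DMA-like)
        frame[slave["write_slot"]] = slave["response"]  # fill an empty slot
    return frame

slaves = [
    {"read_slot": 0, "write_slot": 2, "response": 0xAA, "received": None},
    {"read_slot": 1, "write_slot": 3, "response": 0xBB, "received": None},
]
frame = [0x01, 0x02, None, None]  # two command bytes, two empty response slots
print(pass_frame(frame, slaves))  # [1, 2, 170, 187]
```

Note how the response of each slave ends up in the very frame that carried the command, which is what allows a single telegram to serve an entire device chain per cycle.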

19.2.2 Ethernet

Ethernet is the mother of modern wired data communication. Originally developed
as part of a research project at Xerox PARC, where a cable running across the
building was considered the Ether, Ethernet has been the umbrella term for both
cable connection systems and a data transfer protocol for the past 40 years. In
fact, the Ethernet used today to provide wired connections for anything from office
networks to huge switches operating in Internet backbones (at 100 Gbit/s) bears
little resemblance to the experimental 1-Mbit/s solution invented at Xerox PARC
in 1973. The first commercially available Ethernet products entered the market in
1980, followed by standardization as IEEE 802.3 in 1983.

OSI layer      default Ethernet    Ethernet/IP    ProfiNet        EtherCAT

5-7            e.g. HTTP           CIP            ProfiNet app    EtherCAT app
3-4            TCP/UDP over IP     TCP/UDP over IP  ProfiNet RTC  EtherCAT RTC
2              standard Ethernet   standard Ethernet  standard Ethernet  modified Ethernet
1              standard Ethernet (IEEE 802.3) for all variants

Figure 19.7 Options for industrial Ethernet. Four types of utilization of the Ethernet physical connec-
tion (from left to right): standard TCP/IP connectivity (e.g., for HTTP over the Internet); the Ethernet/IP
protocol, which utilizes most of the previous stack but brings industry automation-specific modifications
to OSI layers 5–7; the ProfiNet approach, which allows utilization of the IP/TCP/UDP stack but
introduces a packet and routing stack optimized for real-time applications; and EtherCAT,
which modifies the Ethernet data link layer (and everything above) for even shorter update cycles.

19.2.2.1 Ethernet Physical Layer: Hardware

Ethernet was promoted as a standard that operates over a shared medium, i.e., where
the network infrastructure does not belong to a particular device. While appearing
trivial from a modern perspective, conceptually moving from a wired peer-to-peer
connection to the Ether as a shared resource helped shape the idea of ubiquitous con-
nectivity in the 1970s. However, even the Ether requires infrastructure to maintain
signals over long runs of wire (a repeater), route information in the most efficient
way (a router, or switch), and ultimately connect to a device. Over its long period
of existence, facing ever-increasing demands on data transfer rates, the physical
layer of Ethernet has changed significantly. Using the examples of an early Ethernet
installation, modern home/office infrastructure and Ethernet in high-performance
computing, the following paragraphs discuss a selection of physical layer solutions
and the evolution of the standard.

Early Ethernet: Coax Cables and 10BASEx Adapters

Very early commercially available Ethernet following the 10BASE5 standard used
half-inch thick RG-8/U coaxial cable with 50 Ω impedance. The conductive core
of this cable transmitted data in half-duplex mode over distances of up to 500 m.
In order to connect devices, holes were drilled into the RG-8/U and wires were
threaded into these holes to feed into Ethernet transceivers, at distances of at least
2.5 m along the main bus to minimize signal reflection and interference between
devices. 10BASE2 simplified some of the installation effort. The thin Ethernet RG-
58 cable was cheaper than RG-8/U and despite its shorter reach of 185 m, it had a
few convenient properties for wiring (e.g., ability to bend, shorter distance between
clients). The standard adapter for RG-58 was the Bayonet Neill-Concelman (BNC)
connector, for which T-adapters allowed simple branching off the main Ethernet bus
to network interfaces or signal repeaters. For both 10BASE5 and 10BASE2, the 5-
4-3 rule applied: 5 segments (main cables) can be connected over 4 repeater units,
of which only 3 segments should carry active network participants.

Modern Ethernet: Twisted Pair and RJ-45 Jacks

Standards occasionally termed fast Ethernet, most prominently represented by the
100BASE-TX variant, are still commonplace in home and office environments with de-
mands in the 100 Mbit/s range. The coax cable is here replaced with twisted pair
cables (often called Cat cable) and the characteristic RJ-45 jack, correctly named
a modular connector following TIA/EIA-568 8P8C (8 position, 8 contact). Only 4
of the 8 pins (pins 1, 2, 3, 6) are used for data in a standard Ethernet cable (see
Section 13.2.3 for comparison with Power-over-Ethernet, which utilizes additional
wires for power supply). It is noteworthy that ruggedized Ethernet connectors used
in industrial applications often have only these four data pins, like the M8 adapter.
Twisted pair cables are categorized (hence Cat) into applications with different re-
quirements. Cat cables exist in various lengths, with 100 m being the longest dis-
tance that can be bridged over a single cable before additional infrastructure (i.e., a
repeater or switch, creating a star topology) has to be used. 100-Mbit/s-Ethernet is
possible with Cat5 cables, supporting frequencies of up to 100 MHz over
unshielded twisted pairs. Gbit/s-Ethernet requires Cat6 cables with 4 twisted pairs
(all 8 lines used for data), some of which shield both the pairs and the cable separately
to support even higher frequencies. The biggest advantage of Cat cables is their price
and easy handling, while some of their weaknesses in terms of signal stability are
counteracted by the differential signaling that is used for low-level data encoding. While
Cat5 cables utilize an Rx/Tx pair for full-duplex serial communication, Cat6 cable
features both improved conductors with better performance, as well as four data
pairs that can dynamically assume both the Rx and Tx role. As both cable types fit
into the standard RJ-45 socket, an important Ethernet feature is autonegotiation for
devices to recognize which method of communication is desired.

High-End Ethernet: Optical Cables

Optical cables can support much higher signal frequencies than electrical wire. For
applications with transfer rates exceeding 10 Gbit/s (e.g., to connect computers
for high-performance computing in data centers or to feed Internet backbones),
specialized hardware that interfaces electrical to optical cables is available. The
realities of the market have here replaced standardization by independent third
parties like the IEEE. Multi-source agreements (MSA) between vendors have
established standards such as Small Form-factor Pluggable (SFP), XFP or QSFP
(Quad SFP), all of which take in an electrical digital signal, feed it into a laser that
generates modulated light signals, and after transmission over a fiber-optic cable
regenerate the electric signal on the other side. Depending on the type of cable
and the wavelength of the laser, distances from a few hundred meters up to tens of
kilometers can be bridged. It should be noted that optical cables and fiber-optic
transmission are not restricted to these very fast applications. It is not uncommon
to use standards such as 10BASE-F for Mbit/s-Ethernet in areas with significant
electromagnetic interference, such as factory floors. With 10BASE-F, individual
connection distances in the kilometer range are possible.

19.2.2.2 Ethernet Physical Layer: Encoding

Bit encoding (see Section 18.3) varies widely between the different Ethernet stan-
dards. Traditional Ethernet over coax cable utilized Manchester encoding with 2
voltage levels and accepted the significant overhead of this method. Faster Ethernet
standards introduced the Physical Coding Sublayer (PCS, within OSI layer 1) that
arranges how data is to be encoded. The 100BASE-TX standard moved to the 4b/5b
standard on three voltage levels: 4b/5b utilizes a lookup table (not shown) to encode
4 data bits (a nibble) in 5 transfer bits. The additional bit of information is used for
synchronisation, as the table is designed to minimize long runs on a particular logic
level, and for error correction. The three voltage levels follow the MLT-3 (multilevel
transmit) encoding that resembles an NRZI encoding.
However, in contrast to NRZI, a logic 1 does not automatically yield the
transfer of a fixed level on the line, but advances the output to the next value of the
fixed series 0, +1, 0, −1, while a logic 0 keeps the current value. The series can be
understood as a cycle of voltage levels.

Bit series:
0111 | 0100 | 0010 | 0000 | 1011
4b5b code:
01111 | 01010 | 10100 | 11110 | 10111
MLT-3 output:
0+0-0 | 0++00 | --000 | +0-00 | ++0-0

Gbit/s-Ethernet and faster utilize 8b/10b or 64b/66b encoding following the
same principle. Also at the level of the PCS are conventions that allow the efficient
crossing of data from Ethernet to SONET or SDH (see Section 17), long-range
technologies used in Internet backbones.

19.2.2.3 Ethernet Link Layer

In contrast to many systems introduced in Section 19.2.1 that are completely
agnostic as to how data is transferred over the wire, Ethernet comes with many
features on the link layer (OSI layer 2) that resemble mechanisms also in place for
data routing over the Internet. The key concepts for data transfer over Ethernet are:
• On the Media Access Control (MAC) layer, every device on the network
has a unique address (the MAC address) and all data is sent in the form of
data telegrams (data frames). A data frame carries addressing information,
payload and error checking information. On the MAC layer, there are also
mechanisms for concurrent use of the network by utilizing CSMA/CD (see
an explanation in Section 19.2.1) to avoid data loss in case of collisions on
the network.
• The Logical Link Control (LLC) layer mediates between the MAC and the
network layer. That is, it arranges the handover between the Ethernet data
frames and the IP packets detailed on OSI layer 3.
Both the concepts of MAC and LLC have been standardized in IEEE 802. A
cross section of an Ethernet frame on the physical and the link layer is shown in
Figure 19.8.
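The layout of the link-layer fields can be illustrated with a minimal sketch that assembles destination MAC, source MAC, EtherType and payload, and appends a 32-bit CRC. Python's zlib.crc32 is used here as a stand-in for the Ethernet frame check sequence, glossing over the bit- and byte-ordering details of the real FCS; the MAC addresses and payload are made up for illustration:

```python
# Minimal sketch of the Ethernet link-layer fields: destination MAC,
# source MAC, EtherType, payload and a trailing 32-bit CRC
# (zlib.crc32 stands in for the exact on-the-wire FCS encoding).
import struct
import zlib

def build_frame(dst, src, ethertype, payload):
    header = (bytes.fromhex(dst.replace(":", "")) +
              bytes.fromhex(src.replace(":", "")) +
              struct.pack("!H", ethertype))
    body = header + payload
    return body + struct.pack("!I", zlib.crc32(body))  # append check sum

frame = build_frame("7c:c3:a1:b2:c8:ee", "00:11:22:33:44:55",
                    0x0800,  # EtherType 0x0800 indicates an IPv4 payload
                    b"hello")
dst = frame[0:6].hex(":")                        # first 6 bytes: receiver MAC
ethertype = struct.unpack("!H", frame[12:14])[0]  # bytes 12-13: EtherType
print(dst, hex(ethertype))  # 7c:c3:a1:b2:c8:ee 0x800
```

The receiver can verify integrity by recomputing the CRC over everything but the last four bytes and comparing it against the transmitted check sum.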
When two devices are connected over Ethernet, frame collisions can only
happen when the physical connection supports just half-duplex communication.
Fast Ethernet is generally full-duplex, thus rendering collisions a problem of the
past. In larger networks, there is generally a network switch that makes
pairwise connections between the end devices and the switch in a star topology,
thereby controlling the flow of information across the network. Managed switches
allow the intelligent prioritization of traffic across the network with some degree of
manual control, while unmanaged switches generally work out of the box and route
traffic following a simple queue. On some older networks it is still possible to see
so-called Ethernet hub devices. These are simple repeaters that take data on one port
and forward it to all other directly connected devices. This means that there is still
potential for collisions on the network, as these rather mechanical devices create a
funnel effect by concentrating data from various devices onto the main network. In
other words, while a hub simply deals with OSI layer 1 data and copies it across,
switches build device tables that map physical ports to MAC addresses of attached
devices (touching upon OSI layer 2). In higher-level communication between
computers, addresses (e.g., IP addresses) are then mapped to MAC addresses, and
the switch directs traffic only to the device with the appropriate MAC address.

OSI layer 2: receiver address (6) | sender address (6) | 802.1Q tag (4) | EtherType (2) | payload (1500) | check sum (4)
OSI layer 1: preamble (12) | start (1) | payload (1522) | gap (12)

The receiver address is the MAC address of the receiving device (e.g.,
7c:c3:a1:b2:c8:ee) and the sender address the MAC address of the sender; the
802.1Q tag carries quality of service and network priority; the EtherType indicates
the type of payload (IP traffic, Wake-on-LAN packet, etc.); the payload carries the
OSI layer 3 data, e.g., an IP packet; and the check sum is a 32-bit CRC for the
frame. On the physical layer, the preamble indicates the start of a new frame, at
least 64 bytes are required for a minimal data frame, and the gap introduces a
mandatory pause between frames.

Figure 19.8 Ethernet frame. Every transmission over Ethernet begins with a 12-byte preamble (a bit
pattern that allows sender and receiver to synchronize their clocks; that is, to negotiate the boundaries
between bits) and a start byte. This is followed by at least 64 bytes of payload encapsulating information
relevant for all higher OSI layers. After the payload, there is a 12-byte gap. On the link layer, the
Ethernet frame communicates which device should receive the data and where it was sent from. The
802.1Q tag and EtherType bytes organize priority on the network, as well as an indication of what type of
payload follows (there is a vast range of codes ranging from IPv4 traffic to audio formats to EtherCAT).
Following the 1,500-byte payload comes a 32-bit cyclic redundancy check code.
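The difference between a hub and a learning switch can be sketched as a toy model (a hypothetical class, not any real switch firmware): the switch records which port each source MAC address was seen on, forwards frames with a known destination to exactly one port, and floods like a hub only while the destination is still unknown.

```python
# Toy model of a learning switch: it maps source MAC addresses to the
# port they were seen on (OSI layer 2) and forwards a frame only to the
# destination's port, or floods it like a hub if the destination is unknown.

class LearningSwitch:
    def __init__(self, num_ports):
        self.ports = list(range(num_ports))
        self.mac_table = {}  # MAC address -> port number

    def handle(self, in_port, src_mac, dst_mac):
        self.mac_table[src_mac] = in_port               # learn the sender's port
        if dst_mac in self.mac_table:                   # known: one output port
            return [self.mac_table[dst_mac]]
        return [p for p in self.ports if p != in_port]  # unknown: flood

sw = LearningSwitch(4)
print(sw.handle(0, "aa:aa", "bb:bb"))  # [1, 2, 3]  (flood: bb:bb unknown)
print(sw.handle(1, "bb:bb", "aa:aa"))  # [0]        (aa:aa learned on port 0)
print(sw.handle(0, "aa:aa", "bb:bb"))  # [1]        (bb:bb now known on port 1)
```

After the first exchange in both directions, traffic between the two devices no longer reaches any other port, which is exactly why collisions disappear in a switched star topology.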

19.2.3 Powerline

Section 19.2.1 briefly mentioned that some standards (e.g., the ASI bus) aim
to deliver power and data over the same wire to minimize cabling effort and
to provide added value over just a data connection. A similar but reverse logic
applies for powerline: Given that most buildings have electrical cables for mains
electricity anyway, one can use the 50-Hz or 60-Hz A/C as a carrier
for data transfer. However, there are also entry points for data over electricity
at the grid level. In fact, proprietary narrowband powerline in the sub-megabit/s
range has been in use for decades as a means of telemetry and supply-and-demand
management between grid infrastructure. Meanwhile, there are at least half a
dozen attempts for standardization, e.g. Distribution Line Carrier (DLC), Real-time
Energy Management via Powerlines and Internet (REMPLI), PoweRline Intelligent
Metering Evolution (PRIME) or G3-PLC, all aiming to establish powerline as
a means for load balancing, automated meter reading and infrastructure monitoring.
In principle the technology is the same for domestic/building use: Across the
periodic change of alternating current, there can be many minute amplitude changes
of much higher frequency during the same period to encode digital information
(compare with Section 1.2.3.4 on electromagnetic wave modulation). However, the
standards around powerline communication differ considerably between narrowband
uses in grid management and building automation, and broadband uses.
The building automation field has known X10 since the 1970s as a standard
to communicate one of a set of 16 different commands like lights on, lights off,
etc. At the change from upper to lower amplitude (and vice versa), a 1-ms burst of
a 120-kHz signal indicates a logic 1, its absence a logic 0. With a range of safety
functions (e.g., the repetition of commands), X10 yields a net transfer rate of about
20 bit/s and is therefore only useful for very simple automation tasks.
Modern-day powerline is often used to extend the range of Ethernet in
buildings. There are many products in the consumer market that combine a plug
with a simple Ethernet adapter (e.g., following multiple versions of the HomePlug
standard). These devices are fed by Ethernet, transfer the information over the
electricity wiring, and feed back into Ethernet at a remote site on the same electricity
circuit. These standards achieve data rates in the Gbit/s range under optimal
conditions. Unfortunately, powerline communication is a potential source of
electromagnetic interference, while also being susceptible to it. Hence, these ideal
data rates are rarely achieved in practice.

19.3 WIRELESS STANDARDS

One might argue that a section on wireless communication is a repetition of
the previous one: simply replace the wire and the series of electric pulses that
characterize a wire-based standard on the OSI physical layer with air and radio
waves. While there is some validity to the overall reasoning (and the reader will
note that, for example, the concept of a MAC address that we have seen for
Ethernet communication is also used for wireless devices), radio technology is
different as it enables IoT devices to operate away from wired infrastructure: it is the
independence from a static position, and thereby often the lack of mains electricity,
that drives many discussions around the use of wireless standards in small IoT devices.
In practice, the choice of radio always depends on the balance between
constraints around
• Data rate
• Signal range
• Power demands
These differences are not academic; the right choice of wireless technology influ-
ences the success of an entire IoT project. Even in contemporary standards, data
rates range from bit/s to Gbit/s, signal ranges reach from centimeters to hundreds
of kilometers, and the power consumption of sender devices in active operation
spans several orders of magnitude (see Figure 19.9). Unfortunately, none of these
standards are optimal for all requirements; physics dictates that one can pick any
two, so careful consideration is important.
Some radio standards such as Bluetooth and WiFi, as well as many genera-
tions of cellular data services, have already seen widespread adoption in the con-
sumer market and industry, at a time when “putting things on the Internet” was not
their primary purpose. Today, the development of novel technologies for wireless
connectivity specifically for IoT and the lobbying for new standards are both very
active fields that are far from consolidation. That is, while many current household
IoT devices may use Bluetooth or WiFi, there are good technical reasons to seek
alternatives that are more specific to particular IoT applications. The advantages
of wireless communication also do not stop at factory floors and municipal
infrastructure, and long-range low-power options are widely discussed. However,
with a plethora of competing standards, interoperability on the hardware level is
an important but distant goal. The problems thus reflect those that Section 19.2.1
discussed for fieldbuses. It has been humorously noted before that one day we may

[Figure 19.9 schematic: axes for range (vertical) and throughput/power demand
(horizontal); from high range and throughput to low: cellular technology (3G, LTE,
etc.) and low-power WAN (LoRa, SigFox, etc.); local area network (e.g., WiFi) and
body area network (e.g., Bluetooth LE); passive and near-field radio (e.g., RFID, NFC).]

Figure 19.9 Wireless standards. The three axes: range, data rate (throughput) and power demand.
As throughput and power demand correlate, they are displayed as parallel axes in this representation.
Standards shown as extremes are passive radio tags that require close contact and transfer only a few
bit/s on the lower right, and cellular technology that can communicate over hundreds of kilometers and
has peak rates of several hundred Mbit/s.

require one wireless gateway for every device, as no two IoT devices share the same
radio standard to communicate with one another.

IP-to-the-Edge

One architectural decision when devising an IoT solution is whether end devices
(i.e., sensors and actuators) should have their own identity on the Internet (i.e., an
IP address), or whether they can remain agnostic to the Internet and simply interact
with a gateway device that relays any information into the network. For example, it
is possible to implement a command chain to an actuator device so that the gateway
can communicate across the Internet, but the command has to be extracted and sent
over the end device-specific protocol stack to be executed. Alternatively, in a setup
in which the Internet traffic can be routed all the way to the end device, as it has its
own IP address and Internet stack implementation, the gateway can be omitted and
the end device can be treated like any other network device, independent of whether
it is a computer, printer or coffee maker. This simplifies software development and
counteracts architectural complexity, and at the same time allows for the securing
of the entire end-to-end connection using encryption and/or secure tunnels (see Part
VIII for security aspects). Unfortunately, the opportunity to do so comes at the
cost of power, as establishing and holding an Internet connection requires more
computing and uptime than sending a raw value over a proprietary radio connection
that is only standardized on the OSI physical layer. However, there are applications
where energy efficiency is not a primary concern. Here, the relevant technologies
are advertised with their ability to do IP-to-the-edge.

[Figure 19.10 schematic: three protocol stacks side by side. ZigBee, Z-Wave, etc.:
IEEE 802.15.4 physical layer (2.4 GHz, PSK) and MAC layer; on OSI layers 4-6
either the classic ZigBee stack (network, transport, protocol) with ZigBee Standard
Profiles on OSI layer 7, or the Thread Standard (UDP/DTLS over IPv6, via 6LoWPAN)
with generic application protocols. Bluetooth LE 4.x: IEEE 802.15.1 physical layer
(2.4 GHz, FSK) and MAC layer, with IPv6 via BLE 4.x/6LoWPAN. WiFi: IEEE 802.11
physical layer (2.4/5 GHz, PSK) and MAC layer, with IPv4/IPv6 on top.]

Figure 19.10 ZigBee: Proprietary protocol or broad standards. Traditionally, ZigBee devices support
the buildup of a mesh network with message forwarding and so forth, including some standard function-
alities summarized in application profiles. The 6LoWPAN standard provides an interoperability layer
between IEEE 802.15.x (ZigBee, Z-Wave, or Bluetooth) and higher-level Internet protocols, such as
the Thread standard introduced by Google/Nest. The WiFi standard is shown to provide a well-known
reference point.
The ZigBee standard can be used to exemplify both proprietary protocols and
IP-to-the-edge. On the physical layer, ZigBee uses the IEEE 802.15.4 standard just
like a few other radio standards (like Z-Wave). The classic ZigBee stack features
functionality for packet forwarding within a mesh network, including use-case
specific messages. However, it can also be adapted to the 6LoWPAN standard
(IPv6 over Low-Power Wireless Personal Area Networks) that interfaces ZigBee
devices with the Internet (see Figure 19.10). Interestingly, 6LoWPAN also provides
integration with other radio standards of the IEEE 802.x family.

Default Frequencies and Communication Standards

Chapter 1 on the physical foundations of electromagnetic waves introduced the
frequency intervals (bands) used by the ITU to regulate use of particular frequencies
(see Table 1.2). With the exception of passive and near-field radio communication
that use low frequencies (LF, ITU band 5), most other standards including television
broadcasts, mobile phone infrastructure, domestic cordless phones, WiFi and long-
range radio share overlapping or adjacent frequencies in the ultrahigh frequency
(UHF) range, ITU band 9.

While radio data communication can generally happen across the entire
spectrum within ITU band 9, the industrial, scientific, medical (ISM) bands centered
around 2.4 GHz (worldwide) and 915 MHz (United States and Australia), and the
short range device (SRD) band at 868 MHz (Europe) provide commonly used base
frequencies. ISM and SRD are unlicensed, meaning that everyone can use them, but
they are not unregulated, meaning there are rules as to how these frequencies are to
be used. As many devices from many manufacturers compete for radio frequencies
and airtime, there are limits to how ISM and SRD frequencies can be occupied (see
Figure 19.11):
• maximum power: in order to prevent an arms race with higher and higher
transmission power as a means to dominate the communication on a channel, there
are strong limitations on the maximum power that can be used to send a
signal: in the 868-MHz SRD band, for example, this is 25 mW, or 14 dBm.
• channel bandwidth: while communication channels with larger frequency
intervals can in principle encode more information (see FSK modulation,
Section 1.2.3.4), this would come at the cost of the number of parallel channels
that can be used by a larger number of devices.
• duty cycle: to allow all devices a fair share of airtime that enables anticolli-
sion strategies such as CSMA/CA to work, no device can send data for longer
than, for example, a cumulative 1% per hour, often dependent on the rel-
ative strength of the signal used.
That is, many technologies send and receive radio waves that share very sim-
ilar physical characteristics. At the same time, given that no electrical system is
perfect (sensitivity to thermal noise, random electromagnetic emission) and radio
waves themselves are shaped by the medium they travel through, there is a con-
siderable danger of crosstalk between different device classes: the unwanted inter-
play of microwave oven, television receiver and mobile phone is almost proverbial.

[Figure 19.11 schematic: a drill-down from the electromagnetic spectrum (microwave
around 1 THz down to shortwave around 1 MHz) into the 2.4-GHz and 915-MHz ISM
bands and the 868-MHz SRD band. Two vertical frequency scales (2.40-2.49 GHz on
the left, 863-870 MHz on the right) show per-window duty cycle (DC) and power
limits, e.g., 100% DC at 10 mW, 0.1% DC at 25 mW, 1% DC at 25 mW, 3 channels of
25 kHz at 0.1% DC and 10 mW, 2 channels of 25 kHz at 10% DC and 500 mW, and
100% DC at 5 mW; the 2.4-GHz band holds 14 WiFi channels (IEEE 802.11b/g/n),
79 Bluetooth channels and 16 channels of various IEEE 802.15 standards.]

Figure 19.11 Exemplary frequency use. Starting from the general overview of the electromagnetic
spectrum introduced in Figure 1.13, this schematic provides a drill-down into the ISM bands at 2.4 GHz
and 915 MHz, as well as the SRD band around 868 MHz. The two vertical scales have different
resolutions, the left specifying the frequency use across the 2.4-GHz ISM band and the right showing
the 863-870-MHz SRD range. The 2.4-GHz ISM band is shared by 14 20-MHz wide channels of the
IEEE 802.11b/g/n WiFi standard, 79 Bluetooth channels, as well as 16 channels common to IEEE 802.15
(e.g., ZigBee). The SRD band is separated into smaller fragments for various use cases. For example, at
100% duty cycle (continuous sending), one can utilise a window with 10-mW output power. A slightly
larger window allows only 0.1% duty cycles, but at 25 mW. Other windows define distinct, smaller
channels of 25 kHz each.

However, even within a particular technology, a large number of participants can
negatively impact the performance of a wireless standard.
Many well-known wireless communication standards have been normed in vendor-
independent fashion as IEEE 802.11 (wireless LAN, WiFi) and IEEE 802.15 (wireless
PAN, e.g., Bluetooth, ZigBee). These standards primarily deal with OSI layers 1 and 2,
although, for example, the Bluetooth Low Energy standard has evolved to incorporate
use case-specific profiles that dictate the content of messages, reminiscent of OSI
layer 7, while ZigBee has traditionally supported a use case-specific application
layer and is slowly opening up to allow generic IP-based traffic. Wireless standards
are neither advertised nor discussed in isolation, and thus it can often be difficult
to assess how a technology maps onto the layers of the OSI model and what
constitutes the standard and what does not. For example, the WiFi standard is almost
exclusively used in connection with the IP stack, although IEEE 802.11 does not
enforce any particular use of the physical connection.

Physics and Technical Constraints of Wireless Communication

Although in principle also relevant for data communication through the wire, two
issues deserve special consideration when discussing radio communication:
• Link budget
• Concurrency
The link budget describes the amount of power loss between a sender and
a receiver, and how individual system components between them contribute to the
increase or decay of the signal along the way. In a simple formalism, for a wireless
system, that is:

P_RX = P_TX − L_TX + G_TX − L_FS − L_M + G_RX − L_RX

with P_RX being the received power (in dBm), P_TX the transmitted output power
(dBm), L_TX various losses through connectors on the transmitter side (dB), G_TX
the transmitter antenna gain (dBi), L_FS the path loss through the medium (dB), L_M
other losses along the way (dB), G_RX the receiver antenna gain (dBi) and L_RX
various losses on the receiver side (dB). It is worth noting that dB is a relative
measurement, dBm an absolute measurement with a power reference (0 dBm = 1 mW),
and dBi the relative change compared to an isotropic radiator, as discussed in
Section 1.2.3.3.
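A minimal sketch of this calculation, with made-up numbers purely for illustration:

```python
# Link budget sketch following the formalism above; powers in dBm,
# antenna gains in dBi, losses in dB. All numbers are illustrative.

def received_power_dbm(p_tx, l_tx, g_tx, l_fs, l_m, g_rx, l_rx):
    """P_RX = P_TX - L_TX + G_TX - L_FS - L_M + G_RX - L_RX"""
    return p_tx - l_tx + g_tx - l_fs - l_m + g_rx - l_rx

p_rx = received_power_dbm(p_tx=14,  # a 25-mW (14-dBm) transmitter
                          l_tx=1, g_tx=2, l_fs=110, l_m=3, g_rx=2, l_rx=1)
print(p_rx)  # -97
```

The resulting figure is then compared against the receiver's sensitivity: as long as P_RX stays above that threshold, the link closes.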
The free space path loss (L_FS) is by far the single most important contributor
to signal decay. The graphical representation of the Poynting vector (see Figure
1.16) demonstrates intuitively how an electromagnetic wave front of a certain
energy becomes diluted the further it travels, as the gray surface area becomes
wider and wider. The signal loss over distance roughly follows an inverse-square
law, but the absolute loss is also dependent on the frequency of the electromagnetic
wave: doubling the signal path adds about 6 dB to L_FS, and for a given distance
the loss at 2.4 GHz is higher than at 868 MHz. Calculations of L_FS are not robust
to multipath propagation, effects of which are often approximated in L_M as a
contribution from multipath fading.
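The free-space component can be computed with the standard Friis-style formula; this sketch covers only the idealized free-space term, not the multipath effects discussed above:

```python
# Free-space path loss in dB (standard Friis form): distance in meters,
# frequency in Hz. Doubling the distance always adds about 6 dB.
import math

def fspl_db(distance_m, freq_hz):
    c = 299_792_458.0  # speed of light, m/s
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / c)

print(round(fspl_db(100, 868e6), 1))  # 71.2
print(round(fspl_db(200, 868e6), 1))  # 77.2  (+6 dB for doubled distance)
```

Evaluating the same distances at 2.4 GHz instead of 868 MHz shifts both numbers upward, which is one reason sub-GHz bands are popular for long-range, low-power IoT links.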
The sensitivity of a wireless communication system is also dependent on the
bandwidth that can be used in a channel. While precise calculations exist, here the
focus is on an intuitive explanation: radio signals are subject to perturbation by
thermal noise. Over a large interval of frequencies, the signal-to-noise ratio shifts
towards the signal (i.e., the noise component is averaged out, making signal
recovery easier than in cases where the message is encoded over a smaller frequency
range). This, however, leads to the issue of concurrency.
Section 19.2.1 on fieldbus systems introduced the issue of concurrency. In
other words, when more than two participants share the same medium, there is a
risk of data loss if two or more participants start sending at the same time and
the receiver cannot separate the overlapping messages, similar to being spoken to
by several individuals at once. Mechanisms to deal with concurrency (time slots,
collision detection) have been discussed in the respective section. Interestingly,
the issue of concurrency is even more severe in wireless communication. While
in the design of wired infrastructure it is possible to actively control the number
and behavior of devices on a network, electromagnetic waves do not care about
organizational considerations, physical boundaries or applications. A single chatty
sender that is not compliant with the regulations of the ISM and/or SRD bands can
easily sabotage an entire telecommunications system. The careful utilization of a
frequency band through channels and duty cycles is therefore important to achieve
certification of a wireless standard. In order to avoid continuous blocking, many
standards also implement predetermined or stochastic channel hopping or frequency
adaptation (see Figure 19.11).
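The arithmetic behind such duty-cycle limits is straightforward. The 1% cap and the 50-ms frame airtime below are illustrative values, not taken from any specific regulation.

```python
def frames_per_hour(duty_cycle, frame_airtime_ms):
    """How many frames a sender may emit per hour under a duty-cycle cap."""
    allowed_airtime_ms = duty_cycle * 3600 * 1000  # permitted airtime per hour
    return int(allowed_airtime_ms // frame_airtime_ms)

# Illustrative: a 1% duty cycle with 50-ms frames allows 720 frames per hour.
print(frames_per_hour(0.01, 50))  # 720
```

A stricter 0.1% cap with the same frame length would already reduce this to 72 frames per hour, which shows why duty-cycled bands favor short, infrequent messages.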

19.3.1 Passive and Near-Field Radio

Passive radio senders receive their power to transmit a message from an external
interrogation signal. That is, the tag converts an electromagnetic signal into a
minute amount of electric energy, which then drives the transmission of
a preformed, outgoing message from an integrated circuit over the antenna. This
is then received and processed by the interrogation device.

262 The Technical Foundations of IoT

While invented several times in the history of technology, the first patent to refer
to this principle of passive backscatter as radio frequency identification (RFID)
dates to 1983. Throughout
the 1980s and 1990s several application areas for RFID-like technologies were
commercially exploited, ranging from building security to asset tracking (see also:
RFID as predecessor of the Internet of Things, Section 4.3).

19.3.1.1 Radio Frequency Identification

Proprietary passive radio systems have given way to standardized RFID tags that are widely
used in modern goods traffic and monitoring applications. These tags store data
from single bits to kilobytes, allowing the unique identification of any asset. As
such, RFID is covered by a range of standards detailing physical specifics of
transmissions and the content of RFID messages in different verticals.
On the technical level one distinguishes purely passive RFID tags that use just
backscatter (drawing from inductive or capacitive coupling), and active devices
that can utilize a separate power supply to amplify the message upon activation.
While passive RFID tags operate at close range (with a balanced interrogation
device to minimize the number of responding units), active RFID tags can have
a reach of hundreds of meters. Semiactive RFID, by contrast, modulates the
incoming interrogation signal and can be fine-tuned to match the distance between
tag and reader.
RFID operates in different frequency bands (see Table 19.1), all governed
by substandards of ISO/IEC 18000. While price is likely the key driver for many
applications of ISO/IEC 18000-6 (UHF), a reason for ISO/IEC 18000-2 (LF) may
be the reliability of operation near metal structures and in wet environments that
negatively affect signal strength. At the same time, the majority of tags operating
in HF (ISO/IEC 18000-3) have significantly larger storage capacity, driving their
application where large amounts of data need to be transmitted. Active tags exist
across all frequency bands. Their advantage is reach, making them the technology
of choice, for example, for vehicle tracking from bridges. In many cases, RFID tags
use planar antennas (see Figure 1.23) that are activated with a signal that shows
circular polarization. This is to maximize the effect on the tag, since depending
on the application the orientation of the reader to the tag is usually unknown and
may be hard to control. Alternative antenna designs use a ferrite core and coil.
These RFID tags often have a characteristic pill form, sometimes small enough to
be transplanted under the skin of animals or very enthusiastic humans.
The simplest RFID tags are 1-bit devices that simply indicate presence. They
are used, for example, for securing goods in retail stores. Most other tags have

Table 19.1
RFID Frequency Bands

Band                 LF                  HF                  UHF                 Active Tag
Base frequency       120–135 kHz         13.56 MHz           860–960 MHz         433 MHz to 2.4 GHz
Maximum range        Centimeters         Up to ~1 m          Several meters      Hundreds of meters
Power source         Passive, inductive  Passive, inductive  Passive,            Integral battery
                     coupling            or capacitive       capacitive
                                         coupling            storage
Penetration          ++                  +                   0                   ++
Harsh environments   +                   0                   -                   -
(e.g., metal, water)
Relative cost        Lower $ range       Cents (cheap)       Cents (cheapest)    Several $
Memory               Bytes               Kilobytes           Bytes               Bytes
Relative data rate   0                   +                   ++                  ++
Bulk reading         -                   +                   ++                  +
Applications         Livestock (farm     Asset tracking,     Bulk item tracking  Asset tracking
                     animals),           access control      (inside             (vehicles), asset
                     industrial                              containers)         localization
                     applications                                                (large items)

storage capacity in the range of several bytes up to kilobytes. While the earliest
RFID tags were read-only and received a unique ID at the time of fabrication,
modern chip designs allow several thousand rewrite cycles and nearly no restriction
on the number of reads, provided that the rewrite is sufficiently powered. It is
noteworthy that the activation signal must be about a thousand-fold stronger than
what is required to power the chip, meaning that especially for passive tags, short
distances and comparatively strong interrogation signals have to be used.
The encoding of the bitstream (e.g., Manchester encoding, see Section 18.3), the
modulation method used (e.g., phase shift), the encryption of the message and the
handling of message collision when multiple tags respond to the interrogation signal
— all this is very much dependent on the particular use case and specified differently
in standards for various verticals. For example, both the MIFARE contactless travel
card and many European ID cards build upon ISO/IEC 14443 to define the physical
layer of an RFID tag embedded in a credit card form factor for personal identification.
As a key technology for international trade, animal tracking and monitoring
in regulated markets, there are estimates of a turnover of about 10 billion RFID tags
per year.

19.3.1.2 Near-Field Communication

Near-field communication (NFC) is a technology that builds upon the RFID princi-
ple of electromagnetic induction over small distances, particularly the communica-
tion on the 13.56-MHz (HF) band. While RFID means mostly static messages and
very simple data exchange, NFC is designed to allow complex interactions between
communication devices (such as payments or other commercial transactions) in
which both devices can take the role of both sender and receiver, at very close range
of usually just a few centimeters. NFC transceivers are now commonly part of mobile
phones. Integrated circuits or development boards encapsulating NFC functionality
and providing data access via common hardware interfaces are now a commodity.
While being able to read HF RFID tags and adhering closely to ISO/IEC
18000-3, NFC is based on ISO/IEC 14443 and 18092, and also governed by the
Near Field Communication Forum, established by Nokia, Philips and Sony in 2004.
While the ISO/IEC standard concerns mostly OSI layer 1 specifications, the NFC
Forum specifies message formats and so-called Tag Types for particular NFC use
cases. As a key architectural difference from RFID, the following communication modes
are valid:

• NFC card emulation: the NFC device can mimic the functionality of other
close-contact tags, including MIFARE and FeliCa. This allows mobile
phones to act as access cards, payment devices or travel tickets.
• NFC reader/writer: this is effectively HF RFID tag emulation, as well as
acting as an interrogation device for tags.
• NFC peer-to-peer: for the unconstrained exchange of data between two NFC
devices at rates ranging from 106 kbit/s to 424 kbit/s.
On the technical level, NFC has an active and a passive communication mode.
The active mode assumes that both devices are powered; when sending, each
generates its own field, while an idle or receiving device deactivates its field. In
passive mode, the initiator generates a field that is modulated by the secondary
device (this is similar to the semiactive RFID mode). In both cases, the data is
Manchester-encoded and modulated via ASK (see Section 1.2.3.4).
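Manchester encoding, used here and in the RFID systems above, maps every data bit to a pair of half-bit levels so that each bit period contains a transition. The sketch below uses the IEEE 802.3 convention (a 1 becomes a low-to-high pair); note that the opposite convention also exists in the field.

```python
def manchester_encode(bits):
    """IEEE 802.3 convention: 0 -> (1, 0) high-to-low, 1 -> (0, 1) low-to-high."""
    out = []
    for b in bits:
        out.extend((0, 1) if b else (1, 0))
    return out

def manchester_decode(symbols):
    """Each half-bit pair maps back to one data bit; invalid pairs are rejected."""
    bits = []
    for i in range(0, len(symbols), 2):
        pair = tuple(symbols[i:i + 2])
        if pair == (0, 1):
            bits.append(1)
        elif pair == (1, 0):
            bits.append(0)
        else:
            raise ValueError("invalid Manchester pair: %r" % (pair,))
    return bits

data = [1, 0, 1, 1, 0]
line = manchester_encode(data)
assert manchester_decode(line) == data
print(line)  # [0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
```

Because every bit period is guaranteed to contain a level change, the receiver can recover the clock from the signal itself, which is exactly what makes the scheme attractive for backscatter links without a shared clock.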

19.3.2 Data Radio

In contrast to passive and near-field communication in which the responder signal is


largely driven by a physical response to an interrogation signal, the radio standards
in this section all comprise active, powered senders and active, powered receivers
with signal amplification circuits. This also means that their range and data rate are
directly dependent on the energy budget they have available; there is a physical limit
to the strength and duration of the radio signal that can be sent using the maximum
current of a battery. Single bursts of a few bytes every few hours impact battery
lifetime differently than the continuous exchange at several Mbit/s, and there are
standards more or less suited to either type of application. It is difficult to
achieve consensus as to which radio standards are optimal for different use cases,
but one might roughly categorize the following examples:
• Low-throughput, low-energy industrial or home automation
– RF modules
– ZigBee
– Z-Wave
– ANT
– EnOcean

– Bluetooth Low Energy


• High-throughput consumer and office radio
– WiFi
– Bluetooth
• Low-throughput, low-energy long-distance radio, e.g., for smart cities
– LoRa(WAN)
– Sigfox
– Weightless
As with the fieldbus standards, this list is far from comprehensive, and not all of the
above standards can be covered in detail. Large transfer rates over long distances
are the realm of cellular data services, which are highly regulated, which happen in
licensed frequency bands and for which infrastructure is typically not in the hands
of individuals or small and medium-size enterprises. While effectively using some
sort of radio, these standards are covered in the next section.
In this section, the focus is on radio standards that are commonly
available and that leave the governance of the infrastructure in the hands of the
respective user or organization. As before, the role of this book is not to provide
the depth that would be required to implement a hard- or software solution based
on the standards in question. Rather, coming from first principles, interesting details
about proprietary radio modules, ZigBee, Bluetooth, WiFi and long-range standards
are picked up to give the reader an appreciation of the problems that have to be
addressed in wireless digital communication.

19.3.2.1 Proprietary RF Modules

Cheap home automation systems and remote sensor units in the consumer market
(including radiators and weather stations) often build on proprietary radio frequency
modules. Their diversity in the market is huge, primarily fuelled by many companies
in Asian markets, with little or no attempt to make different modules compatible
between manufacturers. On the most basic level, a sender/receiver pair establishes
a communication channel over the air that is mirrored through a serial interface.
However, the base frequency (if outside a standard ISM and/or SRD band), the
bandwidth, the channel, the modulation, the encoding and any encryption of
the data are typically not known to the user.

There are some attempts to make radio devices interoperable, at least for
devices from one particular manufacturer, and the RFM series (e.g., RFM12B)
from HopeRF or modules with the CC1101 chipset from Texas Instruments have
seen widespread adoption. Commonly, both sender and receiver are combined
into a single device, the transceiver. Most RF modules available for the developer
market are indeed transceivers, as the price advantage for having just sending or
receiving capabilities only becomes relevant when moving into mass production.
The aforementioned radio modules exist in various form factors from integrated
circuit to development board. In combination with low-spec microcontrollers that
facilitate the communication with the host, they are often available as system-on-
chip, complete with antenna connection (hardware) and basic message handling (in
software).
Most proprietary RF modules manage only short ranges. If they use
OOK (on-off keying), a type of ASK where the logic levels are encoded by sending or
not sending, these modules are effectively mimicking a slow, wired serial connection.
On a technical level, this means that they have to deal with the same issues as
serial modules, such as clock recovery. As with all communication standards, it can
therefore be beneficial to reduce the data rate under difficult conditions to allow the
radio chipset to recover individual bits in noisy environments. A key advantage of
proprietary RF modules is that they do not require much overhead in terms of protocols,
which can make them cheap and fast, but also unreliable and inconvenient
to use for particular higher-level protocol stacks. At the same time, their low current
consumption when active and their quiescent current in the sub-µA range make
them attractive options for battery-driven devices that require just episodic data
transfer and small message sizes.
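On-off keying can be sketched as energy detection over a bit period: the sender emits a carrier burst for a 1 and silence for a 0, and the receiver integrates the energy in each bit slot. Sample counts and the decision threshold below are arbitrary illustration values.

```python
import math

def ook_modulate(bits, samples_per_bit=8, carrier_cycles=2):
    """On-off keying: a sine carrier burst for 1, silence for 0."""
    samples = []
    for b in bits:
        for i in range(samples_per_bit):
            phase = 2 * math.pi * carrier_cycles * i / samples_per_bit
            samples.append(math.sin(phase) if b else 0.0)
    return samples

def ook_demodulate(samples, samples_per_bit=8, threshold=0.1):
    """Energy detection: average squared amplitude per bit period."""
    bits = []
    for i in range(0, len(samples), samples_per_bit):
        chunk = samples[i:i + samples_per_bit]
        energy = sum(s * s for s in chunk) / samples_per_bit
        bits.append(1 if energy > threshold else 0)
    return bits

msg = [1, 0, 1, 1, 0, 0, 1]
assert ook_demodulate(ook_modulate(msg)) == msg
```

Because the receiver only measures energy, it needs no carrier phase reference, which keeps OOK hardware simple; the price is exactly the clock-recovery problem mentioned above, since a long run of zeros is indistinguishable from a silent channel.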

19.3.2.2 ZigBee

ZigBee is based on the IEEE 802.15.4 standard. Both the IEEE standard and ZigBee
were conceived toward the end of the 1990s for use in industrial or building automation
settings, where lower data rates than those of WiFi (IEEE 802.11) or Bluetooth
(IEEE 802.15.1) are required, but power saving is essential. One of the aims of the
standard was to allow devices that undergo long periods of inactivity to participate
in the network in the most energy efficient way.
Most ZigBee devices come with a link budget of 90 to 103 dB, depending
on the chip manufacturer. The standard allows the use of base frequencies at
868 MHz, 915 MHz and 2.4 GHz, with increasing capacity both in terms of the
number of channels that are supported as well as data rate (see Table 19.2). Primarily

Table 19.2
IEEE 802.15.4 Frequency Bands

Frequency    Offset    Bandwidth    Channels    Data Rate

868 MHz      —         —            1           20 kbit/s
915 MHz      —         —            10          40 kbit/s
2.4 GHz      —         —            16          250 kbit/s

for reasons of data rate, it is not surprising that the majority of devices today operate
at 2.4 GHz, although at the lower frequencies the same overall line-of-sight distances
can be achieved with less power consumption. The choice of base
frequency is thus dependent on the particular use case. In all cases the data is phase
modulated. In order to reduce sensitivity to noise, IEEE 802.15.4 implements Direct
Sequence Spread Spectrum (DSSS) to distribute the payload over a maximally wide
proportion of the available channel bandwidth. Conceptually, this is achieved by
overlaying the payload with additional data; it is thus similar to Manchester encoding
(see Section 18.3), but employs a pseudorandom rather than a temporally fixed
bit pattern. This bit pattern can also serve as a code division (CDMA) mechanism to
determine who is currently actively sending, as the pseudorandom pattern is unique
to a particular device.
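A toy version of DSSS spreading can illustrate the idea: each data bit is XORed with an entire chip sequence, and the receiver recovers the bit by correlating against the known sequence, tolerating a few corrupted chips. The chip sequence below is arbitrary, not the one defined by IEEE 802.15.4.

```python
CHIPS = [1, 0, 1, 1, 0, 1, 0, 0]  # illustrative pseudorandom chip sequence

def spread(bits, chips=CHIPS):
    """DSSS spreading: XOR every data bit with the whole chip sequence."""
    out = []
    for b in bits:
        out.extend(c ^ b for c in chips)
    return out

def despread(chipstream, chips=CHIPS):
    """Correlate each chip-length block against the known sequence."""
    n = len(chips)
    bits = []
    for i in range(0, len(chipstream), n):
        block = chipstream[i:i + n]
        matches = sum(1 for c, r in zip(chips, block) if c == r)
        bits.append(0 if matches > n // 2 else 1)  # majority vote tolerates errors
    return bits

data = [1, 0, 1]
stream = spread(data)
stream[2] ^= 1  # flip one chip to simulate noise
assert despread(stream) == data
```

The majority vote is what buys the noise resistance: a single flipped chip leaves seven of eight chips intact, so the correlation still decides for the correct bit.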
In contrast to nonstandardized RF modules mentioned before, the physical
layer of IEEE 802.15.4 devices is thoroughly defined and allows interoperability
across the devices of about ten manufacturers, including a definition of 64-bit MAC
addresses that give devices a unique identity on the network and a MAC feature set
on OSI layer 2 that regulates connection initiation, concurrency mechanisms via
CSMA/CA or guaranteed time slots, and device security. ZigBee data frames are
relatively complex, with a header comprising up to 35 bytes for secure applications.
Each sender keeps a count of data frames already sent, information which is also
encapsulated in secure payloads. At the receiving end, this information can be used
to evaluate a genuine transmission. Thus, ZigBee aims to guarantee both security
(by encryption) and data authenticity.
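The frame-counter check can be sketched as follows. The class is a hypothetical illustration of the receiver-side logic only, not ZigBee's actual implementation, which embeds the counter in the secured payload.

```python
class ReplayGuard:
    """Reject frames whose counter does not increase monotonically."""

    def __init__(self):
        self.last_seen = -1

    def accept(self, frame_counter):
        if frame_counter <= self.last_seen:
            return False  # replayed or stale frame
        self.last_seen = frame_counter
        return True

guard = ReplayGuard()
assert guard.accept(0) and guard.accept(1) and guard.accept(5)
assert not guard.accept(5)  # exact replay is rejected
assert not guard.accept(3)  # old frame is rejected
```

Combined with encryption of the counter itself, this simple monotonicity rule is what lets the receiver distinguish a genuine new transmission from a recorded and replayed one.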
An important feature of ZigBee devices is their ability to create mesh net-
works (see Figure 10.1) with a theoretical capacity of up to 64k devices. This is
achieved by an OSI layer 3/4 protocol defined by the ZigBee Alliance, dealing with
mechanisms for message forwarding and implementing self-healing and rerouting

in case of node failure. ZigBee devices are often integrated with sensors or actuators,
thus the ZigBee protocol stack extends to OSI layer 7 with sensor- or actuator-
specific profiles (see Figure 19.10). This stack is still the prevalent use of ZigBee in
the field, with ZigBee IP and more recently 6LoWPAN offering an IP-to-the-edge
alternative.
On the technical level ZigBee manufacturers achieve their link budget with
different strategies: Some optimize receiver sensitivity (up to -100 dBm, where
-85 dBm is required by IEEE 802.15.4), while others keep sensitivity close to
the requirements but instead invest in higher transmit power (up to 2 mW [3
dBm], where 0.5 mW [-3 dBm] is required). From the perspective of the raw link
budget, a combination of a strong sender and a sensitive receiver is ideal.
However, if energy consumption is critical, design choices favoring a star topology
with mains-powered, highly sensitive receivers in combination with remote senders
using standard transmit power may be preferable. The ZigBee standard mandates
a minimum device lifetime of 2 years when battery operated. This is achieved by
low peak currents when sending, and sleep currents in the sub-µA range.
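The dBm figures used throughout this section convert to linear power with the 1-mW reference (0 dBm = 1 mW):

```python
import math

def dbm_to_mw(dbm):
    """dBm is a logarithmic power level relative to 1 mW."""
    return 10 ** (dbm / 10)

def mw_to_dbm(mw):
    """Inverse conversion: linear milliwatts back to dBm."""
    return 10 * math.log10(mw)

print(round(dbm_to_mw(3), 1))   # 2.0  (3 dBm is about 2 mW)
print(round(dbm_to_mw(-3), 1))  # 0.5  (-3 dBm is about 0.5 mW)
print(mw_to_dbm(1))             # 0.0  (the 1-mW reference)
```

The handy rule of thumb that every 3 dB doubles or halves the power falls directly out of the logarithm.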
For stand-alone applications (i.e., where ZigBee connectivity is not part of
a sensor/actuator), the boat-shaped XBee module with a 20-pin
footprint and standard serial connectivity on pins 2 (TX) and 3 (RX) has become
a commonly encountered sight. These modules communicate with host computers
via simple AT commands and are relatively transparent when used as a replacement
for a wired serial connection. The setup of an entire ZigBee network with nontrivial
topology is more involved, as the definition of roles and functions within
the network must be preconfigured.

19.3.2.3 Bluetooth and Bluetooth Low Energy

Bluetooth was invented by mobile phone company Ericsson in 1994 and was the
first personal area network standard to be accepted in IEEE 802.15 (802.15.1). It
was originally meant to replace wired connections between a mobile phone and
audio devices such as headsets, or for short-range data transfer between two phones
(e.g., to exchange contact details). The standard has since been developed further by
the Bluetooth Special Interest Group (Bluetooth SIG), which has grown to several
thousand member organizations. Bluetooth has since also become a standard option
for connecting certain wireless peripherals like keyboards and mice to computers,
as well as for the communication of wearable devices and beacons with their
environment.

Features

Bluetooth 1.0 entered the market in 1999. Back then, difficulties with interoper-
ability between phones and peripherals as well as severe security and privacy flaws
saw frequent recommendations to disable Bluetooth in phones altogether. Between
then and 2011, various versions of what is now known as Bluetooth Classic were
introduced (1.1, 1.2, 2.0+EDR, 2.1+EDR, 3.0+HS, 3.0+EDR; EDR = enhanced
data rate, HS = high speed). These standards continuously improved security and
interoperability issues of the earlier versions, and increased the data rate in steps
from 732 kbit/s to 3 Mbit/s (in addition, 3.0+HS utilizes Bluetooth to negotiate the
temporary use of WiFi at 25 Mbit/s).
A major pain point of earlier Bluetooth versions was the pairing process between
master and slave devices, originally requiring the user to enter a PIN on all participants,
a process overhauled in Bluetooth 2.1 by introducing the so-called secure simple
pairing (SSP) mechanism. SSP offers a range of options that also work with devices
with no or very limited input-output capabilities, including out-of-band (OOB) iden-
tification, which mediates pairing by exchanging information between both devices
over NFC.
The current base version of Bluetooth is 4, Bluetooth Low Energy, with
Bluetooth Smart (4.2) being the most current subversion. BLE has been around
since 2011 and is currently one of the standard options for connecting wearable
devices to mobile phones as well as a base for various beacon technologies (e.g.,
location beacons; see Section 15.2.2). Key features of BLE are transfer rates of 1
Mbit/s, very short times to reestablish connections, strong encryption
by default and considerably lower power use than Bluetooth Classic (see Table 19.3
for a comparison between classic Bluetooth and BLE). With the final definition
of Bluetooth 5 on the horizon, it is advertised that the standard will quadruple
range and double speed of low-energy connections while increasing the capacity
of connectionless data broadcasts by 800 percent. It can thus be assumed that
Bluetooth is furthering its position as prime connectivity option for IoT, which
is also exemplified by improvements like Bluetooth Mesh or integration with IoT
communication standards such as Thread (see Figure 19.10).

Frequency Use

As shown in Figure 19.11, Bluetooth Classic uses 79 channels in the 2.4-GHz ISM
band, or 40 channels of double bandwidth with BLE. The frequency hopping mech-
anism allows device pairs to switch channel in sub-millisecond intervals. The frequency

hopping as well as encryption contribute to the relative security of Bluetooth. The


base specification for Bluetooth Classic defined power classes with transmission
power of up to 20 dBm, although these particularly strong transceivers (class 1)
are usually used only where power is not a concern, and come occasionally in
combination with external antennas to achieve ranges of hundreds of meters. Class 2
(4 dBm) and class 3 (0 dBm) Bluetooth transceivers are much more widely used in
mobile phone or computer peripherals, with typical ranges between one and ten
meters. The BLE standard does not know these power classes. While BLE hardware
is typically backward compatible with Bluetooth Classic, it can usually not operate
both Classic and BLE at the same time.
The improvements in data rate over the various iterations of Bluetooth Classic
are in part owed to quite fundamental changes on the physical layer: some
variants of phase shift modulation (quadrature phase-shift keying) allow for a higher
information density per unit of time. However, for reasons of energy efficiency the
BLE standard returned to more basic modulation methods.
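The effect of such modulation choices on throughput follows directly from the number of symbol states: an M-ary scheme carries log2(M) bits per symbol, so at a fixed symbol rate the bit rate scales accordingly. The 1-Msymbol/s rate below is purely illustrative.

```python
import math

def bit_rate(symbol_rate, m):
    """Bit rate of an M-ary modulation: symbol rate times log2(M)."""
    return symbol_rate * math.log2(m)

# At 1 Msymbol/s: BPSK (M=2), QPSK (M=4) and 8-PSK (M=8)
for m in (2, 4, 8):
    print(m, bit_rate(1e6, m) / 1e6, "Mbit/s")  # 1.0, 2.0, 3.0
```

Quadrupling the symbol alphabet only doubles the bit rate, and denser constellations are harder to demodulate in noise, which is part of the trade-off behind BLE's return to simpler modulation.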

Networking

Bluetooth Mesh promises to become a major improvement over the current master-
slave architecture. The mesh standard combines the advantages of BLE and ZigBee
and provides functionalities such as message forwarding. In Bluetooth Classic
networks, up to 7 slave devices can associate with one master in a star configuration
referred to as piconet. This master then determines the use of 625-µs-long airtime
slots (i.e., it can request information from individual piconet participants and trigger
their response). A modification of the star network is the so-called scatternet, which
allows devices to act as master in combination with one set of devices, and as slave
in combination with another set. The identity of devices on these networks is
encoded in a 48-bit identifier, of which 24 bit are specific to the manufacturer as
an organization unique identifier (OUI), and the remainder for individual identity.
Furthermore, each Bluetooth device can carry a device name of up to 248 bytes.

Protocol Stack and Profiles

There are several protocol stacks for Bluetooth that manage the different steps in a
connection. (1) In the Inquiry phase, a device will request identification from other
Bluetooth devices. If the Inquiry is received by a device that is ready to pair, it will
respond with its unique ID and device name. Which of the devices is sending the
Inquiry and which one responds is largely dependent on the use case and the role

Table 19.3
Bluetooth

                       Bluetooth                      Bluetooth LE

Channels               79                             40
Bandwidth              1 MHz                          2 MHz
Noise mitigation       Frequency hopping              Direct-sequence spread spectrum
Data rate              3 Mbit/s                       1 Mbit/s
Range                  —                              —
Initiate connection    —                              —
Complete handshake     —                              —
Relative power         —                              —
consumption
Peak current           —                              —
Voice capable          By design                      No
Security               Weak, user defined on          Strong, user defined on
                       application layer              application layer
Topology               Star (255, up to 7 slaves      Peer-to-peer, more complex on
                       active over shared medium)     application layer

(master/slave) in the network. (2) If two devices have been paired once before, they
can go into Paging mode, which forms a transient logical connection between the
two. (3) Ultimately, two devices can be in Connection mode. This is where they can
potentially exchange payload, provided that both devices are actively participating
and listening. Although Connection mode implies data transfer, it is possible
that one or both devices are in one of the many Bluetooth sleep modes to conserve
energy.
A receiver device can be in Sniff mode, meaning that it is asleep for the
majority of time but can wake up at regular intervals to check for messages. The
length of these periodic intervals is determined by the device itself. In Hold mode,
the master can request the slave device to sleep for a predetermined length of time,
i.e., the control is fully in the domain of the master. In Park mode, the master
initiates the slave’s sleep until further notice; that is, the slave wakes up for very
short amounts of time to check for the master’s request to return from sleep.
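The impact of such sleep modes on battery life can be estimated by weighting the active and sleep currents with the fraction of time spent awake. All figures below are illustrative, not taken from any particular chipset.

```python
def battery_life_days(capacity_mah, active_ma, sleep_ma, awake_fraction):
    """Average the active and sleep currents by awake duty cycle."""
    average_ma = awake_fraction * active_ma + (1 - awake_fraction) * sleep_ma
    return capacity_mah / average_ma / 24  # mAh / mA gives hours

# Illustrative: 220-mAh coin cell, 15 mA awake 0.1% of the time, 1 µA asleep
print(round(battery_life_days(220, 15.0, 0.001, 0.001)))  # 573
```

The estimate shows why both numbers matter: at a 0.1% awake fraction, the sleep current contributes a comparable share of the average drain, so a chipset with a poor quiescent current squanders the benefit of aggressive sleeping.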
The application layer of Bluetooth Low Energy is well defined. While a
Bluetooth connection can in principle replace a wired serial connection, the data
typically has to fit into what is known as GATT: Generic Attribute Profile, or Profile
for short. Currently Bluetooth knows more than 30 profiles with a specified format.
A few common ones are:
• Serial Port Profile (SPP), to mimic a serial interface
• Hands-free (HFP) and Headset (HSP) Profile, to facilitate mobile phone
functions over peripherals
• Human Interface Device (HID), to attach input devices such as keyboards
and computer mice
• Advanced Audio Distribution Profile (A2DP), for audio streaming
Importantly, there are entire groups of Bluetooth profiles centered around
particular verticals, such as health care (with individual profiles, e.g., for blood
pressure monitoring, body heat thermometers), sports and fitness (profiles for
running, cycling) or environmental sensors.
There are also profiles around proximity sensing (FMP, find me profile, to
home in on misplaced Bluetooth tags or PXP, proximity profile, to detect devices
within a particular range). The vendor-specific beacon technologies like Eddystone
(Google) or iBeacon (Apple) relate closely to PXP.

19.3.2.4 WiFi

The WiFi standard on the basis of IEEE 802.11 was proposed in 1997. As the most
important standard for wireless data communication in the consumer market, in
industry and mobile applications, WiFi and the more generic term wireless LAN
(WLAN) have become interchangeable.
IEEE 802.11 defines the physical and link layer for communication in the
2.4-GHz, 5-GHz and, practically not yet relevant, 60-GHz as well as 900-MHz
bands. The first WiFi standards that appeared in 1999 (IEEE 802.11b and IEEE
802.11a) managed data rates up to 11 Mbit/s (at 2.4 GHz) and 54 Mbit/s (at
5 GHz), respectively. Since then, WiFi has evolved at an enormous pace, with sub-
standards now in the two-character range. And while the IEEE 802.11ac standard
with up to 1.2 Gbit/s over multiple input/multiple output channels (MIMO) is now
widely available, the IEEE 802.11ad standard in the 60-GHz band can be considered
experimental with nearly 7 Gbit/s. However, it is clear that the push into the market
is just a question of time.
WiFi supports a range of network architectures including mesh, but practically
the ad hoc peer-to-peer mode and the infrastructure mode with a central router
in a star network are commonly used topologies. In infrastructure mode, a router
can manage up to 255 parallel connections. The identification of devices in the
network happens via the 48-bit MAC address (12 hexadecimal characters) and the
Service Set Identifier (SSID), a human-readable clear text name. Access to airtime
is determined by a CSMA/CA mechanism, and receivers can request information
to be re-sent if a frame check sequence is not in agreement with the content of a
data frame. WiFi knows three different frame types for device management (e.g.,
device discovery), traffic management (e.g., request to send) and payload
frames (which encapsulate OSI layer 3–7 content, e.g., TCP segments). WiFi supports
a range of encryption standards: Wired Equivalent Privacy (WEP, not considered
safe anymore) and two versions of WiFi Protected Access (WPA), which provide
safety on the lower OSI layers.
Even in its early incarnations, IEEE 802.11 was almost fourfold faster than
later versions of Bluetooth, or almost 100 times faster than ZigBee, all of which
use channels in the 2.4-GHz band. At the same time, WiFi is the most energy-
demanding technology in this comparison, so that the use of WiFi has to be carefully
considered when designing for the IoT. An obvious reason that makes WiFi potentially
energy hungry is the permitted maximum sender power, which is considerably
higher than that of the previously discussed standards and depends on local
legislation. However, even in the absence of strong physical signals, maintaining

and using a WiFi connection is expensive; while in particular BLE has means to
quickly reestablish connections between previously communicating devices, WiFi
requires considerably more protocol exchange for two devices to re-engage after a
certain timeout.
A key feature and difference from other radio standards is the use of sophisti-
cated modulation methods that increase the density of information within a channel.
For example, orthogonal frequency-division multiplexing (OFDM) adds multiple
streams of phase shift-modulated data (see Figure 1.27) into a single channel. As
a result, in the time domain, at each point there is a different signal amplitude on
a particular frequency. Over all frequencies in a particular channel, using computational
methods such as the fast Fourier transform (FFT), the different streams
can be extracted from the overlaid signal. While this increases the spectral efficiency
of the transmission dramatically, the multiplexing and demultiplexing of the
data costs compute time and thus energy. When considering that modern MIMO
approaches utilize several of such multiplexed streams at once, it becomes clear
why WiFi routers today require multiple processing units.
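The principle can be sketched with a naive discrete Fourier transform. A real OFDM modem uses the FFT and adds cyclic prefixes, pilots and equalization, all omitted here; the four QPSK symbols are arbitrary illustration values.

```python
import cmath

def idft(X):
    """Inverse DFT: what an OFDM sender applies to place symbols on subcarriers."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def dft(x):
    """Forward DFT: what the receiver applies to separate the subcarriers again."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

# Four QPSK symbols, one per subcarrier
symbols = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
time_signal = idft(symbols)   # one composite waveform on the channel
recovered = dft(time_signal)  # subcarriers separated again
assert all(abs(r - s) < 1e-9 for r, s in zip(recovered, symbols))
```

The naive transforms here cost O(N²) operations; the FFT brings this down to O(N log N), but even so the transform has to run continuously on every frame, which is where the compute (and energy) cost of OFDM comes from.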
Simple WiFi modules that support only the older IEEE 802.11b/g/n standards
often provide a compromise for use in battery-operated sensor equipment, with
comparatively modest peak transmit currents. Importantly, some WiFi modules such
as the ESP8266 make excess compute power available and allow the running of
user-specific code (e.g., for interfacing with sensor or actuator components).

Summary

In summary, for short-range wireless data connectivity, this section has introduced
the ZigBee, Bluetooth (Classic and BLE) and WiFi standards. Table 19.4 recapitu-
lates some of the information to provide a direct comparison. The range given for
WiFi is a relatively conservative estimate for standard indoor deployments. When
combined with directed antenna technology and with appropriate signal amplification,
WiFi can bridge line-of-sight distances of dozens of kilometers.

19.3.2.5 Long-Range Low-Power Radio Standards

Many devices participating in the IoT may be battery-driven and will benefit
from data communication with features as described for Bluetooth Low Energy
(see Section 19.3.2.3): a standard with comparatively low transfer rates, but with
great energy efficiency. However, without much signal amplification (e.g., through
276 The Technical Foundations of IoT

antennas), the range of BLE is rather limited.

Table 19.4
Personal and Local Area Network Wireless

                  ZigBee           Bluetooth   Bluetooth LE   WiFi
IEEE standard     802.15.4         802.15.1    802.15.1       802.11, e.g. b/g/n
Frequency range   868/915 MHz,     2.4 GHz     2.4 GHz        2.4 GHz, 5 GHz
                  2.4 GHz
Data rate         250 kbit/s       3 Mbit/s    1 Mbit/s       11 Mbit/s (b), 54 Mbit/s (g), 600 Mbit/s (n)
Maximum range
Battery lifetime  Months           Days        Months         Hours
Network capacity  64k              7           n/a            255

This is the domain of long-range
low-power radio standards, such as
• LoRa(WAN)
• Sigfox
• Weightless
as also shown in Figure 19.9.
In this section, the principles of long-range radio are exemplified by the LoRa
transmission standard, and its use in wide-area networks in the form of LoRaWAN.

LoRa

As specified by the LoRa Alliance in 2015, a key feature of LoRa is the proposed
operation time of end devices of 10 years on a single AA battery, at least with
respect to pure communication ability. The technology can achieve this degree of
power efficiency and its suggested range of a few kilometers in urban areas and tens
of kilometers in rural areas only by compromising on the data rate, which is between
300 bit/s and 5.5 kbit/s depending on the environment (as well as two
Industry Standards 277

high-speed modes featuring 11 kbit/s and 50 kbit/s). The specification is designed
to support networks with hundreds of thousands of devices within urban areas.
LoRa utilizes the ISM and SRD bands at 433 MHz (Asia), 868 MHz (Europe)
and 915 MHz (North America), among others. Due to regulation, the maximum
output power for LoRa in Europe is therefore limited to 25 mW or 14 dBm (versus
21 dBm in North America). The available channel bandwidth ranges from 125 kHz
to 500 kHz, with 125 kHz being a default option. Over such a 125-kHz-wide channel,
the output power yields a considerable link budget. As described in the
introduction to this section, at this frequency and transmission power, the devices
are only allowed 0.1% of accumulative airtime. With a maximum allowed antenna
gain of 2.15 dBi, how can LoRa overcome issues of range and signal strength?
First, by using frequencies in the sub-GHz range, LoRa can achieve deeper
building penetration than, for example, WiFi. Second, by adaptively adjusting
transmission power, LoRa can dynamically adjust its power use to the currently
required data rate and environmental parameters (that is the so-called adaptive data
rate, ADR). Third, by using a special form of spread spectrum modulation, LoRa
transceivers can extract payload from radio signals just above the noise band.
Many radio standards focus the signal over a relatively small bandwidth. That
means that while a connection is stable as long as the sender is the most dominant
one at the particular frequency, as soon as stronger interferers come into play, this
connection is jeopardized. This is clearly not compatible with a radio standard
that aims to save transmission power in order to extend battery life. Instead, LoRa
spreads the signal over the entire bandwidth (i.e., literally bits of information even
within a particular channel are encoded at different frequencies during the process
of transmission). This is the spread spectrum modulation. LoRa uses the so-called
chirp spread spectrum (CSS) method, which means that the frequency continuously
increases or decreases while a particular payload is being sent. While going through
the chirp, the data is encoded by smaller frequency shifts, but always along the
general chirp trajectory. Thus with spread spectrum modulation, the message gets
sent over a wide frequency range, and even if that signal is just above background
noise, it is difficult to deliberately or accidentally destroy the message in its entirety.
The more difficult the environmental conditions are, the more the signal needs to be
spread over the frequency range. One of the key parameters of a LoRa connection
is therefore the spread factor, which provides a balance between signal robustness
and data rate. Given the duty cycle limitations in the LoRa bands, depending on the
spread factor and the adaptive data rate, the maximum payload of a single LoRa
message is between 11 and 242 bytes.
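A toy version of chirp spread spectrum can illustrate the principle: a symbol is encoded as a cyclic shift of a base up-chirp, and the receiver "de-chirps" by multiplying with the conjugate base chirp, which turns the symbol into a single tone that a Fourier transform can locate. This is a sketch of the idea, not the LoRa physical layer; the phase formula and parameters are simplified assumptions.

```python
import cmath

def chirp_symbol(symbol, sf=7):
    """One CSS symbol: a base up-chirp cyclically shifted by the symbol value.
    With spreading factor sf there are 2**sf possible symbols (sf bits per symbol)."""
    n = 2 ** sf
    return [cmath.exp(2j * cmath.pi * (((t + symbol) % n) ** 2) / (2 * n)) for t in range(n)]

def demodulate(samples, sf=7):
    """Multiply by the conjugate base chirp ('de-chirp'), then find the dominant tone."""
    n = 2 ** sf
    base = chirp_symbol(0, sf)
    dechirped = [s * b.conjugate() for s, b in zip(samples, base)]
    # naive DFT magnitude search stands in for the FFT used in real receivers
    best, best_mag = 0, 0.0
    for k in range(n):
        mag = abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(dechirped)))
        if mag > best_mag:
            best, best_mag = k, mag
    return best

# demodulate(chirp_symbol(42)) recovers the symbol value 42
```

Because the symbol energy is smeared over the whole chirp, a narrowband interferer only corrupts a fraction of it, which is why the tone still stands out after de-chirping even near the noise floor.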
Together with the adaptive data rate, security is another important component
of LoRa media access control. Individual devices as well as the network are secured
by (a) a 128-bit device-specific key, (b) a 64-bit application-specific key and (c) the
64-bit network key. This means that there are measures in place both against device
identity theft as well as eavesdropping on the network.

LoRaWAN

LoRa gateways interface LoRa radio devices with the Internet. LoRa gateways cover
all spreading factors on each of the channels they support. There are limits to the
number of messages on different spreading factors they can receive simultaneously;
however, this limit can be overcome in practice by adding more gateways into the
network. At the same time, increasing the number of gateways in a geographic
area also allows devices closer to a gateway to send at a higher bitrate through
the ADR. The ADR mechanism evaluates the signal-to-noise ratio of an incoming
data package at the gateway, and then recommends an appropriate spread factor to
the end device (see Table 19.5).
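The ADR decision can be sketched as a lookup against per-spreading-factor demodulation floors. The SNR limits below are values commonly cited for LoRa (cf. Table 19.5), and the 10-dB margin is an illustrative assumption, not a specified default:

```python
# Demodulation-floor SNRs per spreading factor (commonly cited LoRa values;
# treat them as illustrative assumptions)
SNR_FLOOR = {7: -7.5, 8: -10.0, 9: -12.5, 10: -15.0, 11: -17.5, 12: -20.0}

def recommend_sf(measured_snr_db, margin_db=10.0):
    """ADR sketch: choose the fastest (lowest) spreading factor that still leaves
    margin_db of headroom above its demodulation floor; fall back to SF12."""
    for sf in sorted(SNR_FLOOR):
        if measured_snr_db >= SNR_FLOOR[sf] + margin_db:
            return sf
    return 12
```

A device close to the gateway (high SNR) is thus steered toward SF7 and short airtimes, while a device at the edge of coverage ends up at SF12 with the most robust, slowest modulation.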
LoRa knows three different device classes, which define the sleep and energy
modes of end devices:
• A: Sensor downlink only after sending
• B: Downlink at defined time slots
• C: Continuous downlink
Class A devices wake up following their own sleep pattern, and stay awake for a
small amount of time after sending a ping (or measurement) to the gateway. Class
B devices wake up at defined time slots to evaluate whether there is data for them.
Class C devices support a continuous downlink, which obviously comes at the cost
of energy and may only be useful for mains-powered devices.
In deployments with many thousands of devices per gateway, air time is a precious
currency. As a rule of thumb, an end device should not use more than 30 seconds
of air time each day, which translates into between 20 and 500 10-byte messages per
day, depending on the spreading factor. Hence, LoRa devices can also trade smaller
payloads against a larger number of messages.
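Under a daily airtime budget this trade-off can be estimated from the net bitrates in Table 19.5. The 30-second budget and the 13-byte header overhead below are assumptions for illustration; real LoRa airtime also includes preamble and coding overhead, so actual message counts come out lower than this rough sketch suggests.

```python
# Approximate net bitrates per spreading factor (from Table 19.5)
BITRATE = {7: 5469, 8: 3125, 9: 1758, 10: 977, 11: 537, 12: 293}

AIRTIME_BUDGET_S = 30.0   # assumed daily fair-use airtime budget per device

def messages_per_day(payload_bytes, sf, overhead_bytes=13):
    """Rough message budget: payload plus a nominal 13-byte header (an assumption),
    divided by the net bitrate of the chosen spreading factor."""
    airtime_s = (payload_bytes + overhead_bytes) * 8 / BITRATE[sf]
    return int(AIRTIME_BUDGET_S / airtime_s)
```

The spread between spreading factors is striking: the same 10-byte payload permits roughly an order of magnitude more messages per day at SF7 than at SF12, which is exactly the payload-versus-frequency trade the text describes.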
LoRaWAN gateways come in form factors often resembling WiFi routers for
home use, with connections for external antennas to further boost the maximum
distance the technology can facilitate. The majority of current LoRaWAN gateways
therefore comes in ruggedized cases for outdoor installation, as antenna connections
contribute to signal decay and the gateways may be deployed on rooftops and
so forth. For use in end devices, there are development boards and ICs for LoRa
communication that interface to the host through generic serial connections.

Table 19.5
Spreading Factor and Data Rate

Spread Factor   Signal/Noise   11-Byte Net Airtime   Bitrate
7               −7.5 dB                              5469 bit/s
8               −10 dB                               3125 bit/s
9               −12.5 dB                             1758 bit/s
10              −15 dB                               977 bit/s
11              −17.5 dB                             537 bit/s
12              −20 dB                               293 bit/s

19.3.3 Cellular Data Services

Cellular data services use radio communication in licensed frequency bands (i.e.,
the frequency range is exclusively allocated to particular providers to facilitate
operation without interference from third parties). In order to achieve maximum
coverage, gateways with partially overlapping reach define so-called cells (hence
cell phone), in which end devices can transiently connect to individual gateways
and experience a spatially continuous network. As such, end devices in a cellular
network behave as if they are moving within a mesh network.
The gateways themselves are connected to the providers’ infrastructure for
call management and so forth, and represent the interface into the public Internet
(see Figure 19.12). One such gateway typically features a sector antenna that can
cover an angle of up to 120°, and the directly connected base transceiver station
(BTS). A prominent installation with all-around coverage features three BTS, all of
which are coordinated through a base station controller (BSC). While the BTS gives
identity to a particular cell in terms of the frequency it operates on and facilitates
physical layer functionality (time division multiple access; TDMA, or forms of
CDMA), the BSC determines the frequency of the BTS and controls the handover
between them. The BSC also acts as a funnel to the mobile switching center (MSC),
computational infrastructure that provides a lookup of the BTS at which a mobile
phone was last seen and where routing information is generated.

Figure 19.12 Mobile phone and cellular data infrastructure. As a phone’s location changes it may
connect to different BTSs, coordinated through the BSC. Location changes are also registered at the
MSC, so that a new call to the phone can be routed through the network in the most efficient way.
While the first incarnations of mobile telephony were based on analog radio,
digital cellular data is the underlying principle of modern communication. Over
the past three decades multiple standards have evolved to provide more stable
connections, better voice quality and higher data rates. These are commonly referred
to as mobile data generations, denoted by the letter G:
• 1G: analog network, telephony only
• 2G: digital network, following the GSM standard, with short messaging
service (SMS), multimedia messaging service (MMS), and data services
including substandards
– GPRS (General Packet Radio Service), up to 120 kbit/s
– EDGE (Enhanced Data Rates for GSM Evolution), up to 240 kbit/s
• 3G: digital network, following the UMTS (Universal Mobile Telecommuni-
cations System) standard, including substandards
– WCDMA (Wideband Code Division Multiple Access), up to 384 kbit/s
– HSPA (High Speed Packet Access), up to 14 Mbit/s
• 4G: digital network, either
– WiMAX (Worldwide Interoperability for Microwave Access), up to 50
Mbit/s
– LTE (Long Term Evolution), up to 50 Mbit/s
• 5G: currently under development

19.3.3.1 3rd Generation Partnership Project

Historically, and on a global scale, there were many more mobile telephony stan-
dards, such as the Advanced Mobile Phone System (AMPS) developed by Bell Labs
in the 1980s, or Personal Digital Cellular (PDC), introduced by the Japanese Association
of Radio Industries and Businesses in 1991. However, most of these early standards
either disappeared or were absorbed into newer standards, so they do not play a role
in more recent telecommunication. Work around the European Telecommunications
Standards Institute’s (ETSI) GSM standard culminated in the 3rd Generation Partnership
Project (3GPP), the main body driving the later UMTS and LTE standards.
While the IEEE’s WiMAX standard plays a role for some 4G equipment, recent
developments suggest that the 3GPP standards will dominate until a mature
definition for 5G evolves.

19.3.3.2 Frequencies and Signal Range

Second generation mobile data services following the GSM specification have central
frequencies around 850 MHz (North America), 900 MHz (Europe), 1,800 MHz
(Europe, North America) and 1,900 MHz (North America), coining the terms dual-,
triple-, or quad-band, depending on which frequencies and in which localities a
mobile end device could be used.
The band around the central frequency can be relatively wide; for example, the
GSM 900-MHz base frequency denotes the interval between 890 MHz and 915 MHz
for the uplink from device to BTS, and between 935 MHz and 960 MHz for the
downlink from the BTS to the device. This division is characteristic for all 3GPP
standards.
The UMTS frequency band centered around 2,100 MHz, which is used
in Europe and many parts of the world except North America, spans 1,920 MHz
to 1,980 MHz (uplink) and 2,110 MHz to 2,170 MHz (downlink). Comparable base
frequencies in North America are 850 MHz, 1,700 MHz and 1,900 MHz, among another
half-dozen larger intervals with a total of more than 40 uplink/downlink pairs used
around the world. The LTE specification knows even more frequency interval pairs
for uplink and downlink; the most relevant blocks in Europe span the range from
800 MHz to 2.6 GHz.
The alternative 4G transmission method is WiMAX, which follows a different
technical paradigm than the mobile networks. The IEEE 802.16 standard resembles
a long-distance WiFi with speeds of up to several hundred Mbit/s under ideal con-
ditions. However, to achieve such transfer rates, high carrier frequencies and clear
line-of-sight connections between a base station and a spatially fixed client are
necessary for IEEE 802.16-2004 (fixed WiMAX). Slower WiMAX modes such as
IEEE 802.16e-2005 support handover between base stations and non-line-of-sight
connections, although at the cost of speed ( 5 Mbit/s) with transmissions in fre-
quencies between to . These limitations have led mobile phone man-
ufacturers to largely deprecate support for WiMAX, as in particular in metropolitan
areas with tall building structures the standard cannot play out its advantages.
While a fixed WiMAX connection can bridge distances of over 50 km, mobile
data cells in urban areas are rarely larger than a kilometer in diameter, although several
kilometers are possible. In order to achieve longer battery life time for the mobile
devices, 3G and 4G networks have often been designed for even shorter distances
between BTS and device, as the transmission power can automatically be reduced
to lower power consumption. The base station itself can send signals with a power
of up to several tens of watts.

19.3.3.3 Technology Overview

2G: GSM, GPRS, EDGE

The first digital networks following the GSM standard rolled out in 1992. As
mentioned in the previous section, transmission is divided into separate up- and
downlink frequency ranges, each of which is further divided into channels. In the
case of GSM 900, there are 124 channels of 200 kHz each. These channels are
shared between subscribers, for example through TDMA with time slots of approximately
577 µs, repeating about 200 times per second. This allows the phase-shifted
encoding of about 14 kbit/s of information, meaning that even for voice transmission
relatively strong data compression must be used. The brevity of the TDMA time
slots also negatively impacts the reach of the GSM network, as the differing signal
travel times of near and far devices must be compensated for.
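This compensation is GSM's timing advance: a handset is told to transmit early by a whole number of bit periods so that its burst still lands in its slot at the BTS. Since the timing advance is signalled in just six bits, the mechanism itself caps the cell size, as the following back-of-the-envelope sketch shows (the constants are the commonly documented GSM values):

```python
C = 299_792_458.0          # speed of light, m/s
BIT_PERIOD_S = 48 / 13e6   # GSM bit duration, about 3.69 microseconds
MAX_TA_STEPS = 63          # timing advance is signalled in 6 bits (0..63)

def max_cell_radius_km():
    """The timing advance shifts a handset's transmit window by whole bit periods
    to compensate the round-trip signal delay; 63 steps bound the cell radius."""
    round_trip_s = MAX_TA_STEPS * BIT_PERIOD_S
    return round_trip_s * C / 2 / 1000
```

The result comes out near the roughly 35 km often quoted as the maximum GSM cell radius, illustrating how a purely digital framing parameter translates into a physical coverage limit.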
The primary data service over GSM is Circuit Switched Data (CSD), with
a net data rate of 9.6 kbit/s. While this is comparatively slow, many asset tracking
solutions based on GSM utilize this service for the transmission of small amounts of
positional data. The CSD can be understood as a time-slotted serial connection. In
contrast, GPRS, which started in 2000, used a packet switching approach to deliver
data. The CSD is based on the allocation of a time slot in the TDMA mechanism,
and this slot is occupied whether or not an end device has data to send or receive.
The GPRS is demand-driven; that is, the TDMA mechanism may allow an end
device to use four of the eight slots of a TDMA frame to transfer data packets,
but only if there is no concurrent use. While there is also the High Speed Circuit
Switched Data (HSCSD) service that combines up to eight time slots for transfer
rates of up to 115.2 kbit/s, this service is less reliable than GPRS, especially for
nonstatic end devices. Thus, while slower, the packet switching method employed
by GPRS is preferable, allowing net transfer rates of 53.6 kbit/s.
In its most advanced version, EDGE, an alternative modulation method to
standard GSM, allowed higher spectral efficiency and enabled net download rates
of 220 kbit/s and upload rates of 110 kbit/s. Strictly speaking, these data services
are GPRS over EDGE, or HSCSD over EDGE.

3G: UMTS, WCDMA, HSPA

The 3G standard was officially introduced into the market in 2003 and uses packet-
based data transmission by default. Further to the packet-based mechanisms in the
GSM standard with discrete pairwise interactions between an end device and a BTS,
here every reachable BTS and multiple possible routes are used for packet transfer.
A further boost in transfer rates stems from the 3.84-MHz wide bandwidth of the
individual UMTS channels. Duplex activity is facilitated by both frequency (FDD,
frequency division duplex) and time division (TDD, time division duplex).
This initial UMTS standard, W-CDMA, supports up to 384 kbit/s. Access
to the shared medium over CDMA means that different devices use the same
frequency, but only authorized pairwise connections between a device and a BTS
are able to recover a particular message from the data stream. In combination with
the packet switching approach, this means that although no individual packet can
be guaranteed to reach its target, eventually a complete message can be recovered.
CDMA dynamically maximizes the amount of information that can be transferred
over the medium for any one user.
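Code division can be illustrated with a pair of orthogonal Walsh codes: both users transmit simultaneously in the same band, and each receiver recovers its own bits by correlating the combined signal against its code. This is a toy model that ignores noise, power control and synchronization; the codes and bit sequences are illustrative.

```python
def spread(bits, code):
    """Spread each data bit (+1 or -1) over the chip sequence of the user's code."""
    return [b * c for b in bits for c in code]

def despread(channel, code):
    """Correlate the combined channel signal with one user's code to recover its bits."""
    n = len(code)
    out = []
    for i in range(0, len(channel), n):
        corr = sum(x * c for x, c in zip(channel[i:i + n], code))
        out.append(1 if corr > 0 else -1)
    return out

# Two users share the channel via orthogonal (Walsh) codes -- an illustrative pair
code_a, code_b = [1, 1, 1, 1], [1, -1, 1, -1]
a_bits, b_bits = [1, -1, 1], [-1, -1, 1]
channel = [x + y for x, y in zip(spread(a_bits, code_a), spread(b_bits, code_b))]
# despread(channel, code_a) recovers a_bits; despread(channel, code_b) recovers b_bits
```

Because the codes are orthogonal, each correlation cancels the other user's contribution entirely, which is how an authorized device-BTS pair extracts its message from the shared frequency.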
An improvement to W-CDMA is HSPA, which became available in 2006,
with up to 14 Mbit/s. HSPA covers subprotocols for High Speed Downlink Packet
Access (HSDPA) and High Speed Uplink Packet Access (HSUPA). A follow-up
standard is HSPA+, supporting up to 40 Mbit/s. Both HSPA and HSPA+ are based
on improvements to signal modulation.

4G: LTE

The now prevalent 4G technology is LTE, with deployments since 2010, or its
more recent incarnation LTE-A (LTE Advanced). Depending on the baseline, these
standards promise data rates of up to 100 Mbit/s. Like UMTS, the LTE standard is
based on packet switching. The LTE specification is fully IP-compliant so that LTE
devices essentially support IP-to-the-edge by default.
In contrast to UMTS with 5-MHz channel bandwidths (3.84 MHz net), LTE
channels can be up to 20 MHz wide. However, while these blocks are divided by CDMA
for multiple UMTS subscribers, in LTE a method called Orthogonal Frequency-
Division Multiple Access (OFDMA) is being used. Within a particular channel,
there can be hundreds of subchannels (termed carriers) that can be individually mod-
ulated. This allows LTE devices connected to a particular BTS to share bandwidth
more efficiently, as both code and frequency domains are distributed between those
devices. In addition, LTE supports MIMO channels when devices carry more than
one antenna, thus leveraging the capacity of multiple channels.
While not a completely separate standard, NarrowBand IoT (NB-IoT) is a
3GPP family specification for the transfer of small amounts of data over the LTE
network. It is also referred to as Cat-NB1, denoting an equipment category within
the LTE specification. In contrast to other LTE communication, NB-IoT is restricted
to 170 kbit/s for downlink and 260 kbit/s for uplink over just 200-kHz wide
channels. This constraint facilitates long battery lifetimes comparable to those of
other LPWAN standards such as LoRa (i.e., ten years on two AA batteries). NB-
IoT can either be operated by the provider as stand-alone service, in-band along
with other LTE transfer, or as extra payload in the gap between individual LTE
channels.

5G

Devices following the 5G standard are expected to arrive in the market between
2020 and 2030. Currently, the standard is more like a wish list defined by the
Next Generation Mobile Networks Alliance, taking into account the following
developments:
• The expectation of a significantly larger number of devices requiring wireless
connectivity, acknowledging the origins of 5G in a collaboration between
NASA and the M2M/IoT community
• Increased demand for fast data communication of up to several Gbit/s
• The need for better energy efficiency of devices on the network (e.g., through
low latencies in the 1-ms range)
The spectrum for 5G communication is currently expected to center around the 28-
GHz, 37-GHz and 39-GHz frequency bands.

19.3.3.4 Modules for 2G, 3G and 4G Connectivity

GSM connectivity is available in many form factors, from development boards for
the most common microcontroller platforms to system-on-chip with software SIM,
thus providing footprints even smaller than the canonical SIM cards. The current
demands of these modules are significant compared to other connectivity options
discussed in this chapter. For example, at typical operating voltage, a GSM module
in standby mode, ready to receive a connection, may draw tens of milliamperes, and
while engaged in a data connection, peak currents can approach the ampere range.
This suggests that end devices utilizing GSM connectivity should remain in sleep
mode as long as possible and restrict sending activity to a minimum. While some
GSM modules behave like serial modems and interact with their host through serial
connections using AT commands, others allow low-level access to parameters like
channel selection.
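Host interaction with such a modem can be sketched as follows. The AT+CSQ command and its response format are standardized (3GPP TS 27.007), while the ATModem class and the injectable port object are purely illustrative; in practice the port would be, for example, a pyserial Serial instance.

```python
class ATModem:
    """Minimal AT-command wrapper; `port` is anything with write() and readline()
    (e.g., a serial port object -- a fake is used for illustration and testing)."""
    def __init__(self, port):
        self.port = port

    def command(self, cmd):
        self.port.write((cmd + "\r\n").encode())
        return self.port.readline().decode().strip()

    def signal_quality(self):
        """AT+CSQ replies '+CSQ: <rssi>,<ber>'; rssi 0..31 maps to about -113..-51 dBm."""
        reply = self.command("AT+CSQ")
        rssi = int(reply.split(":")[1].split(",")[0])
        return -113 + 2 * rssi   # standard RSSI-to-dBm conversion
```

Because the transport is injected rather than hard-wired, the same wrapper works against a real serial port on the device and against a scripted fake in unit tests, which matters when the hardware is only available in the field.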
While occasionally present on GSM modules, hardware modules for 3G and
4G more often support data transfer over SPI and I2C or even high-speed USB.
However, the necessary quick input-output to and from the module itself costs
energy, and peak currents in the ampere range are not unusual for some of these
modules when in full operation.

19.3.4 Satellite Communication

In areas that are very distant from wireless telecommunication infrastructure like
cellular towers (e.g., in deserts, the rainforest or in the ocean), Internet may only be
available through satellites. Geostationary satellites orbit Earth once per day at an
altitude of roughly 36,000 km. This means that these satellites stay at approximately
the same position relative to any point on this planet, allowing them to serve
as a relay station for radio communication. Technical setups utilizing geostationary
satellites typically make use of directed parabolic antennas from fixed or mobile
installations to send and receive information to and from the satellite. The satellite
in turn communicates this information to fixed base stations on the ground, which
are connected to conventional Internet infrastructure.
While in rural areas with basic telephone infrastructure hybrid solutions have
been used in domestic and industrial settings since the 1990s (e.g., to request a
page from a website via a slow modem but receive that data over a fast satellite
connection), IoT communication in truly remote areas may require the satellite
connection for both sending and receiving data. In any case, the setup may include
very small aperture terminals (VSAT), antennas that are oriented toward the satellite
to concentrate the signal over just a small width. If VSATs are used on mobile assets,
especially ships with many degrees of freedom in their movement, continuous
correction and reorientation of the antenna are essential. Given the distance between
Earth and the satellites, although electromagnetic waves travel at the speed of light,
a round-trip connection has a latency of about half a second. This means that even though
satellite connectivity may be the last resort for an IoT solution, there are certain
real-time requirements that are not compatible with this type of communication,
and even standard TCP-based connections may experience an intolerable lag.
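The latency floor follows directly from geometry: a request travels up to the satellite and down to the ground station, and the response makes the same trip in reverse, so the signal traverses the Earth-satellite distance four times. A quick sketch (ignoring ground-segment and processing delays):

```python
C_KM_S = 299_792.458       # speed of light in km/s
GEO_ALTITUDE_KM = 35_786   # geostationary orbit altitude above the equator

def geo_round_trip_ms():
    """Minimum request/response round trip via a geostationary relay:
    four traversals of the Earth-satellite distance at the speed of light."""
    return 4 * GEO_ALTITUDE_KM / C_KM_S * 1000
```

The result lands just under 500 ms, which is why interactive protocols and chatty TCP handshakes perform so poorly over geostationary links regardless of the available bandwidth.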
An alternative to geostationary satellites are networks of satellites (e.g., from
the providers Globalstar or Iridium) that orbit at much lower altitudes of around
1,000 km (LEO, low Earth orbit constellation). These networks are required to
achieve good coverage of the Earth’s surface, with satellite trajectories intersecting
at particular waypoints that are distributed over the globe. The benefit of low-
flying satellites is that they allow much lower latencies. However, in contrast
to being a single reference point as in the case of geostationary satellites with
a wider geographic reach, here the regular handover between satellites means
that connections need to be interrupted and reestablished several times per hour.
Communication with LEO satellites may also require connections with more than
one satellite at once, hence antennas are often omnidirectional, which in turn means
less signal strength and lower signal-to-noise ratio. To compensate for this, data
rates of LEO satellites are often lower than those with VSAT setup for geostationary
satellites.
In practice, data transfer contracts from satellite communication providers
typically come with the necessary hardware (e.g., in the form of satellite modems
with RS-232 interface). In most cases, under optimal conditions, data rates of about
500 kbit/s can be achieved (although consumer-grade Internet with large VSATs
may support much higher downlink rates). Communication with satellites typically
runs over small frequency bands, often separated into an Earth-to-satellite direction
(e.g., Inmarsat, 1,626.5 MHz to 1,660.5 MHz) and a satellite-to-Earth direction
(1,525 MHz to 1,559 MHz). The Iridium network, equally active in the 1,600-MHz
range, operates
240 subscriber channels of 31.5-kHz bandwidth that are phase shift modulated
and which allow time-slotted access to the medium. It becomes clear that there
is nothing special about satellite data communication — the technology is much
the same as in any other wireless data transfer. Data exchange with geostationary
satellites requires directed and concentrated radio signals of often several watts
over a clear line-of-sight to and from the satellite to perform optimally. It is worth
mentioning that subscriber communication primarily uses the so-called L band
with frequencies as listed before, while the K band supports higher data
rates, but is reserved for the backbone between satellite and ground station. At
these very high frequencies, the quality of the connection even depends on minute
atmospheric details like moisture, and thus specialized equipment including large
parabolic antennas is necessary.
Subscription fees for satellite Internet are significantly higher than for any
other form of mobile connectivity. Its use therefore has to be carefully
considered.
Part VII

Software

Chapter 20

Introduction

The previous parts V and VI of this book introduced various complex forms of
hardware that may be found in IoT devices, ranging from intelligent sensors to
microcontroller platforms and telecommunications modules. It has become clear
that today the majority of these devices build on some sort of digital logic, thus
blurring the boundary between hardware and software. While the functioning of
many of the previously discussed devices (the humble mobile phone is just one
example) is just as much based on algorithms, code and the execution of binary
logic as it is on physical matter, the book has so far generously accepted their
software component as given.
This book is not a programming guide, but the following chapters aim
to highlight some challenges in developing software for an IoT solution. These
challenges range from technical issues such as the response time that microcon-
trollers require for measuring a sensor signal to conceptual questions around device
identity, and they do not stop when addressing data analysis strategies and how
to integrate information in an IoT ecosystem. Here, we will focus on just a few
concepts that practitioners may come across when doing software development for
IoT, namely:
• Common issues around distributed systems, or why writing software that
integrates processes over several separate physical entities is hard.
• Constraints of embedded systems and the different meanings of real-time in
computing.
• Mechanisms that are involved in sending data over the Internet, and the
advantages and disadvantages some IoT protocols bring about.
• Data plumbing (i.e., message distribution and storage) in the backend.
• Core concepts of data analytics for IoT.
• Software components and frameworks to facilitate interoperability in the IoT.

20.1 COMMON ISSUES OF DISTRIBUTED SYSTEMS

Conceptually, a distributed system is a computer system in which different processes
are executed on physically separated machines (i.e., typically without shared memory
and simultaneous access to state information). This vague definition allows us to
recognize a variety of situations that face the issues of distributed systems; these
issues are intimately linked to any type of networked computing. At its very core,
even the lack of a common clock, discussed in the sections on device communication
(see Chapters 18 and 19), is part of and contributes to the problems of distributed
systems.
The following scenarios and associated questions serve to highlight some of
the challenges that affect us on a macroscopic level (e.g., devices in the IoT) as well
as on the microscopic level (e.g., distributed processes in a computing cluster):
• Suppose there is a network of sensor devices in which all devices measuring
a critical value should signal an alarm in case at least two other devices can
confirm the measurement.
– Is there a lead device? Or do all devices take the same precedence in
the decision making?
– Which is the first device to ring the alarm? Or in other words, what is
the series of events that lead up to an alarm?
– What happens if the value has fallen below the threshold when confir-
mation of the initial value is returned from the shadow devices?
• Suppose two people each carry a sensor device and an alarm is to be sent to
a base station exactly once in case one of them takes a measurement above a
certain threshold.
– Do they necessarily have to communicate between themselves before
contacting the base station? Or should the base station block any
additional messages after receipt of the first alarm?
Introduction 293
– If they do communicate between them, and both take measurements
above the threshold, what is the mechanism that ensures only one of them
indicates this to the base station?
• Now consider both aforementioned scenarios, but take into account means of
communication that exhibit random availability.
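One of the strategies from the second scenario, letting the base station block any additional messages after receipt of the first alarm, can be sketched as deduplication by event identifier. All class and field names here are illustrative; a real system would also need to expire old identifiers and decide when a repeated identifier represents a genuinely new event.

```python
class BaseStation:
    """Sketch of the 'block after first receipt' strategy: the base station
    deduplicates alarms by event identifier, so each alarm fires at most once."""
    def __init__(self):
        self.seen = set()
        self.alarms_raised = 0

    def receive(self, device_id, event_id):
        if event_id in self.seen:
            return False            # duplicate -- suppressed
        self.seen.add(event_id)
        self.alarms_raised += 1
        return True

bs = BaseStation()
# both sensors report the same over-threshold event, e.g. due to retries
results = [bs.receive("sensor-1", "evt-42"), bs.receive("sensor-2", "evt-42")]
# results == [True, False], and exactly one alarm was raised
```

Note how this shifts the coordination burden away from the two sensors: they no longer need to talk to each other at all, at the price of requiring the base station to be reachable and to keep state.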

Discussing the strategies that would optimally address these challenges is
beyond the scope of this book. Even so, while these examples are quite closely
related to what one may encounter in real-life IoT scenarios, and a failing Internet
connection is surely not an unlikely threat, the same issues can be encountered in
computing centers where databases are replicated over multiple physical computers:
When incoming data is saved to computer A, it cannot yet be served when computer
B is queried straight thereafter. For the short moment that is required to copy across
the information from computer A to computer B and possibly vice versa to achieve
consistency, both systems have to become unavailable for further write-access. If
one adds additional computers (also called nodes in the vocabulary of distributed
systems), and the breakdown of communication is a real possibility, there is the
danger of partitioning, also referred to as split brain; the system fragments into
two subnetworks that are consistent amongst themselves, but not with each other.
There are measures to add partition tolerance; however, the so-called CAP theorem
(consistency, availability, partition tolerance) posits that this necessitates compromises
regarding consistency and availability. Eric Brewer’s theorem is sometimes
incorrectly reduced to “you may choose any two of”:

C: every read delivers the most recent information, or it is marked invalid
A: every request immediately receives some information, but it may be invalid
P: the system is fault-tolerant and can lose a number of messages between nodes
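The trade-off can be made tangible with a toy two-replica store: during a partition a write only reaches one node, and a read served by the other node must either be refused (favoring consistency) or answered with possibly stale data (favoring availability). The simulation below is omniscient about version numbers, which a real stale node cannot be; in practice staleness would be inferred from missed heartbeats or replication acknowledgments.

```python
class Replica:
    def __init__(self):
        self.value, self.version = None, 0

class ReplicatedStore:
    """Toy two-node store illustrating the CAP trade-off during a partition."""
    def __init__(self, prefer_consistency):
        self.a, self.b = Replica(), Replica()
        self.partitioned = False
        self.prefer_consistency = prefer_consistency

    def write(self, value):
        self.a.value, self.a.version = value, self.a.version + 1
        if not self.partitioned:                 # replication message gets through
            self.b.value, self.b.version = self.a.value, self.a.version

    def read_from_b(self):
        # omniscient staleness check -- stands in for heartbeat/ack detection
        if self.partitioned and self.b.version < self.a.version:
            if self.prefer_consistency:
                return None                      # refused / marked invalid, not stale
        return self.b.value

cp = ReplicatedStore(prefer_consistency=True)
cp.write("x"); cp.partitioned = True; cp.write("y")
# cp.read_from_b() refuses the read (None) rather than serve the stale "x"
```

Flipping prefer_consistency to False makes the same read return the outdated "x" instead: the node stays available, but the answer may be invalid, exactly the A of the list above.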

The different failure modes that may lead to partitioning are interesting. The loss
of messages can be attributed to either the loss of communication, or the failure of
nodes to send and/or receive the messages. From the perspective of one particular
node, both situations are not distinguishable. While there is extensive academic
work around strategies to document communication in a distributed system in order
to understand weaknesses in an architecture, including receipts for messages and
receipts for these receipts and so forth, this important practical issue remains. It is
thus easy to comprehend why device management has become a standard function-
ality in many IoT backends. With historical data on battery level, last appearances
294 The Technical Foundations of IoT

on the network and most recent reported locations, it is much easier for a human
operator to discern general network failure from device failure.
Debugging and testing distributed systems such as IoT solutions can be
difficult, especially if issues only occur in production but cannot be replicated in the
laboratory environment. An especially feared class of node failure is the Byzantine
fault, in which the node exhibits different behavior and delivers different responses
to different communication partners. While the complete loss of a node is relatively
easy to diagnose, the identification of a complex fault and robustness to a traitor
node can be extremely challenging and hard to achieve.
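The consistency/availability trade-off can be made concrete with a toy quorum system. The sketch below (class and parameter names are invented for illustration, not taken from any particular database) uses N replicas with W write and R read acknowledgments; strong consistency requires R + W > N, and a partition that leaves fewer than W reachable nodes forces the store to refuse writes, sacrificing availability:

```python
# Toy replicated register: consistency requires R + W > N, so a
# partition that isolates too many nodes must block writes (CAP).

class QuorumStore:
    def __init__(self, n, w, r):
        assert r + w > n, "R + W > N needed for strong consistency"
        self.n, self.w, self.r = n, w, r
        self.replicas = [{} for _ in range(n)]   # key -> (version, value)
        self.up = [True] * n                     # reachability per node

    def write(self, key, value, version):
        alive = [i for i in range(self.n) if self.up[i]]
        if len(alive) < self.w:                  # not enough nodes: refuse
            raise RuntimeError("unavailable: write quorum lost")
        for i in alive[:self.w]:
            self.replicas[i][key] = (version, value)

    def read(self, key):
        alive = [i for i in range(self.n) if self.up[i]]
        if len(alive) < self.r:
            raise RuntimeError("unavailable: read quorum lost")
        # the newest version among R replicas includes the latest write,
        # because R + W > N forces every read quorum to overlap every
        # write quorum in at least one replica
        versions = [self.replicas[i].get(key, (0, None)) for i in alive[:self.r]]
        return max(versions)[1]

store = QuorumStore(n=3, w=2, r=2)
store.write("temp", 21.5, version=1)
print(store.read("temp"))          # 21.5
store.up[1] = store.up[2] = False  # partition: only node 0 reachable
try:
    store.write("temp", 22.0, version=2)
except RuntimeError as e:
    print(e)                       # unavailable: write quorum lost
```

With n=3, w=2, r=2, every read quorum overlaps every write quorum in at least one replica, which is what guarantees that a read returns the latest version.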

20.1.1 The Fallacies of Distributed Computing

The CAP theorem is thus not only relevant for designing distributed backend
systems (e.g., with several message brokers and database instances), but quite
clearly extends into the initial scenarios and helps to appreciate the challenges of
every IoT system. A valuable set of implicit recommendations was proposed
by Peter Deutsch in the early 1990s, who compiled the following list of false
assumptions that programmers tend to make, known as the Fallacies of Distributed Computing:
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn’t change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous
It is worthwhile for all programmers to periodically remind themselves that they are
facing the fallacies of distributed computing when developing for the IoT.
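Fallacy 1 in particular translates directly into code: any network call may fail and should be wrapped defensively. A minimal retry-with-exponential-backoff sketch, in which flaky_send merely simulates a lossy network call:

```python
# Defensive wrapper acknowledging fallacy 1 ("the network is reliable"):
# retry a flaky operation with exponential backoff instead of assuming
# the first attempt succeeds.
import time

def with_retries(op, attempts=4, base_delay=0.01):
    for attempt in range(attempts):
        try:
            return op()
        except ConnectionError:
            if attempt == attempts - 1:
                raise                                 # give up: caller must handle it
            time.sleep(base_delay * 2 ** attempt)     # 10 ms, 20 ms, 40 ms, ...

calls = {"n": 0}
def flaky_send():
    calls["n"] += 1
    if calls["n"] < 3:          # simulate two lost packets, then success
        raise ConnectionError("packet lost")
    return "ack"

print(with_retries(flaky_send))  # ack
```

Real systems additionally add jitter to the delays and an upper bound on the total waiting time, so that many retrying devices do not hammer a recovering backend in lockstep.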

20.1.2 Identity and Openness of IoT Systems

Identity is an important concept for distributed systems (e.g., to uniquely identify
a sender and a receiver in a communication). In a computing center environment
in which nodes are systematically added to provide scalability, upon creation that
identity is either assigned to the computer (e.g., by means of an IP address) and/or
to a particular software process (e.g., representing another instance of a distributed
database). It is conceivable how communication between the different nodes at the
time of creation plays a role when device identity is established in such a setting.
In contrast, sensor networks and other IoT installations may grow more
organically, and potentially without communication between end devices. When
we think of an IoT product for the consumer market, it is almost essential that it
exists in total ignorance of devices deployed in the networks of other users, yet
software in the backend needs to be aware of their different identities. This means
that device identity must be established and that the uniqueness of that identity
must be guaranteed without knowledge of the other devices. Common strategies
involve hash functions that map time stamps of the first switch-on and/or long serial
numbers (e.g., from 1-wire devices, see Section 19.1.3.3) to shorter identification
phrases. While random numbers may be an option, their reliable generation poses
difficulties on very basic processors, unless stochastic noise from components can
be used as a seed.
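The hashing strategy described above can be sketched as follows; the serial number and timestamp are hypothetical examples, and SHA-256 stands in for whatever hash function the device can afford:

```python
# Sketch: derive a short device ID by hashing a long factory serial
# number together with the timestamp of the first power-on. All input
# values below are invented for illustration.
import hashlib

def device_id(serial: str, first_boot_unix: int, length: int = 12) -> str:
    digest = hashlib.sha256(f"{serial}:{first_boot_unix}".encode()).hexdigest()
    return digest[:length]    # collision risk grows as length shrinks

# e.g. a 1-wire ROM code plus a first-boot timestamp
print(device_id("28-000006b4e1a3", 1489572000))
```

Truncating the digest shortens the identifier at the cost of a higher collision probability, which must be weighed against the expected fleet size.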
Chapter 21
Embedded Software Development

Chapter 16 on embedded systems described the constraints that programmers face
with these small devices: low execution speed and little memory often mean
that only basic business logic can be implemented, and that even housekeeping
operations such as encryption may require a considerable proportion of the available
computing resources. Hence, there is a decision to be made when designing an IoT
architecture: which parts of the overall information flow should be executed on an
embedded device (e.g., an edge device, such as a sensor or actuator), and which
parts can be deferred to processing on a more powerful gateway or in the backend
(e.g., the cloud or other larger infrastructure).
Without reiterating the information already provided in the respective chapter,
the performance of a microcontroller must be put into perspective relative to a
contemporary processor for use in desktop or laptop computers. The clock speed,
which to a first approximation determines the execution speed, is between 10 and
100 times slower for an average 16-bit controller when compared to a single core of
a desktop computer CPU. At the same time, the random access memory of standard computers is more
than 100,000 times larger than that of an average microcontroller. In addition, there is
typically no persistent memory or disk drive to buffer results for later use. Complex
mathematical operations, including sophisticated data analytics, are therefore largely
impractical on IoT end devices. It is noteworthy that the computing units used in
modern smartphones are conceptually and in terms of performance a lot closer to
laptop or desktop computers than they are to the microcontrollers that are used in
the majority of dumb devices, such as home appliances.


21.1 POWER SAVING AND SLEEP MANAGEMENT

As outlined in Chapter 16, embedded systems are often at the center of battery-
powered end devices. While modern operating systems have power management
functions at their very core, the lack of such underlying infrastructure brings with it
the necessity to identify opportunities and actively save power as part of embedded
application software. This imposes some responsibilities on the programmer, which
often require detailed knowledge of the respective hardware platform:
• Low-level access to a microcontroller enables the lowering of clock speed
and/or operating voltage. This allows a software developer to use the full
capability of a CPU for demanding computation, but to execute relatively simple
business logic at a much lower speed.
• Specialized input-output chips or respective modules in the controller can
be switched off if not needed; for example, ADC/DAC units that require
a reference voltage can be physically detached from power.
• Many microcontrollers also require a reference voltage to detect fluctuations
in their operating voltage (e.g., to prevent short-term brownouts). While this
measure increases the stability of the system, as the controller can actively
skip a few clock cycles in the presence of voltage ripples, the reference
voltage and the comparison come at the cost of additional current.
• When a device is not dependent on cyclic execution of code but can rely on
external triggers, it is possible to switch off internal timers that keep ticking
while a CPU is otherwise asleep. Conversely, hardware interrupts can be
disabled if they are not needed.
• Even pulling unused GPIO pins to ground (and not keeping them floating)
can save a few microamperes.
Importantly, even when a CPU is not busy because it may be idling in a
while { ... } loop waiting for an external signal to arrive, most processor
architectures execute what is known as a NOP or NOOP: no operation. The CPU may
not actively do anything, but even NOP consumes power. It is therefore essential
for a programmer to recognize phases in which code may not get executed, and
manually send the microcontroller into an energy-saving sleep mode. Some of these
sleep modes are comparatively light, and the processor can resume quickly when required,
while other modes are deeper and require a stepwise wake-up of modules
to enable different levels of functionality.
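The sleep modes themselves are hardware specific, but the underlying rule, namely blocking until an event instead of spinning through NOPs, can be illustrated with a host-side analogy: a thread that waits on an event object consumes no CPU until the trigger fires, whereas a polling loop would keep the processor busy:

```python
# Host-side analogy of "sleep until an event": a thread blocked on an
# Event consumes no CPU (the OS can idle the core), in contrast to a
# busy loop that burns cycles like a NOP loop on a microcontroller.
import threading

data_ready = threading.Event()
result = []

def worker():
    data_ready.wait()          # blocks without consuming CPU time
    result.append("processed")

t = threading.Thread(target=worker)
t.start()
# ... some external trigger eventually fires:
data_ready.set()
t.join()
print(result)                  # ['processed']
```

On a microcontroller the equivalent is entering a sleep mode and letting a hardware interrupt perform the wake-up; the principle of trading busy waiting for event-driven blocking is the same.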

21.2 REAL-TIME REQUIREMENTS AND INTERRUPTS

Embedded systems are often utilized at the interface to electrical or mechanical sys-
tems, and microcontrollers may have to react to such input as fast as possible. This
is often referred to as real-time. There is a somewhat academic differentiation into
soft real-time and hard real-time requirements, the latter meaning a potentially fatal
outcome if the response to an event arrives later than a defined threshold
(think of the triggered inflation of an airbag versus the jitter in a videoconference;
both require real-time processing of data). In this context it is also worth mentioning
that real-time means different time scales to different stakeholders: while
an embedded engineer may think of latency involving multiple clock cycles, or
microseconds, for a web backend developer every time scale less than a second
may be real-time enough.
General purpose computers typically run a basic software environment, the
operating system (OS, see also Chapter 16). Computer systems with a real-time
operating system (RTOS) employ sophisticated algorithms to schedule the
execution of vital code. There are different types of multitasking that allow a
real-time response to an event even though other code is executed simultaneously.
There is:
• Cooperative multitasking, in which processes voluntarily yield control at
suitable points so that other processes get time for execution.
• Preemptive multitasking, in which the scheduler assigns time slices to
processes and can interrupt them, thereby guaranteeing their execution.
The communication between such asynchronous processes is complicated and
resembles in parts the issues of distributed systems (see Section 20.1). It is important
to note that while most modern OSs support some form of multitasking, they are often
far from suitable for real-time applications: they aim to provide the smooth
running of user-facing applications, which is often not compatible with the fast
memory swapping required for microsecond response times.
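The cooperative variant can be sketched with generators acting as tasks that voluntarily yield control back to a round-robin scheduler (a deliberately minimal model, with no priorities or kernel involvement); a task that never yields would starve all others, which is the classic weakness of cooperative multitasking:

```python
# Minimal cooperative scheduler: each task is a generator that yields
# control back to the scheduler, which resumes tasks round-robin.
from collections import deque

def scheduler(tasks):
    queue = deque(tasks)
    order = []                        # trace of task steps, for illustration
    while queue:
        task = queue.popleft()
        try:
            order.append(next(task))  # run until the task yields
            queue.append(task)        # re-queue behind the others
        except StopIteration:
            pass                      # task finished, drop it
    return order

def blink(n):
    for i in range(n):
        yield f"blink{i}"             # yield = voluntarily give up the CPU

def sample(n):
    for i in range(n):
        yield f"sample{i}"

print(scheduler([blink(2), sample(2)]))
# ['blink0', 'sample0', 'blink1', 'sample1']
```

A preemptive scheduler, by contrast, would not rely on the yield points: it would interrupt a task after its time slice expires, regardless of where the task is in its execution.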
Smaller computers without an OS, for example programmable logic controllers,
often use simple time-sharing multitasking strategies with relatively generous
time slices that guarantee that a piece of code can be executed within
the duration of such a slice. The periodic updates of some industrial fieldbus systems are
therefore carefully aligned with the update cycle of the PLC (see Section 19.2.1).
In contrast, on simple microcontrollers whose code execution is not governed by
POLLING strategy (in software):

    repeat forever {
        status = 0
        while (status = 0) {
            complex_operation_1
            complex_operation_2
            status = read_GPIO
        }
        if status = 1 {
            do_X;
        }
    }

INTERRUPT strategy (in software):

    enable_interrupt(do_X)
    repeat forever {
        complex_operation_1
        complex_operation_2
    }

INTERRUPT strategy (in hardware), if the GPIO pin changes state:

    1. HALT:    save current registers to buffer
                save current execution point
    2. EXECUTE: load do_X into registers
                execute do_X
    3. RESTORE: restore registers from buffer
                restore execution point
                continue...

Figure 21.1 Polling versus interrupt. An interrupt service routine (ISR) is executed when an interrupt
request (IRQ) is triggered. The conventional way to assert whether do_X should be executed on
the basis of some GPIO trigger is called polling: after each execution of complex_operation_1 and
complex_operation_2, the status of the GPIO pin is read and, if status = 1, do_X is executed. In contrast,
a hardware interrupt can be configured such that the interrupt vector is pointing to do_X (i.e., do_X
becomes the ISR). If the GPIO pin changes from 0 to 1, the current program is stopped, its data secured,
do_X executed, before finally the contents of the registers are restored again.

an operating system and that need to react to external stimuli, event-driven strate-
gies such as interrupts are more prevalent than periodic polling of information from
hardware modules (see Figure 21.1).
While in a polling strategy it is the programmer's responsibility to ensure
that complex_operation_1 and complex_operation_2 are deterministic and return to
the loop within the real-time requirements of the system, with the interrupt-based
strategy the real-time behavior of the system is ensured even without knowledge of
complex_operation_1 and complex_operation_2. Interrupt-enabled microcontrollers
can often tie the state of one or more GPIO pins to the execution of a routine. That
is, there is a physical module on the chip that asserts the state of the pin, and initiates
the cascade of activities that is required to execute the interrupt service routine.
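On a full operating system, the interrupt strategy of Figure 21.1 can be mimicked with Unix signals: the signal plays the role of the IRQ and its handler the role of the ISR, so the main loop performs its work without ever polling a status flag (a host-side analogy only; on a microcontroller the mechanism is implemented in hardware):

```python
# Host-side analogy of Figure 21.1 (Unix only): a SIGALRM plays the
# role of the hardware IRQ and its handler the role of the ISR. The
# main loop runs its "complex operations" without polling any flag.
import signal
import time

events = []

def do_x(signum, frame):        # the "interrupt service routine"
    events.append("do_X")

signal.signal(signal.SIGALRM, do_x)
signal.setitimer(signal.ITIMER_REAL, 0.05)   # one-shot "IRQ" in 50 ms

deadline = time.monotonic() + 0.2
while time.monotonic() < deadline:           # complex_operation_1/2
    time.sleep(0.01)                         # no GPIO polling here

print(events)                                # ['do_X']
```

As in the hardware case, the interpreter suspends the main loop, runs the handler, and transparently resumes where the loop left off.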
Chapter 22
Network Protocols: Internet and IoT

While the previous sections on IoT software development have only touched the
surface of such broad topics, the concepts and protocols behind Internet and IoT
messaging deserve a more detailed look, which is the purpose of this chapter. For
obvious reasons there is a considerable overlap with the Chapter 17 on communi-
cation models, which provides a basis for both this chapter and the discussion of
communication hardware.
Network protocols are the software building blocks for communication be-
tween computers. Specialized IoT application protocols designed for the Internet
of Things build upon the variety of network protocols that are in use today. These
chapters discuss the most important and most accepted standard protocols that are
used for Internet communication today: IP, TCP, and UDP.
As mentioned in Chapter 19, we have witnessed the rise and fall of many in-
dustry standards. There is no exception when it comes to competing IoT protocols:
some protocols have experienced significant industry adoption and remain estab-
lished standard protocols, while others have not achieved significant adoption in the
IoT space. While the standard network protocols at the core of Internet commu-
nication are well-defined and well understood, application layer protocols for IoT
communication are still competing for widespread adoption and market dominance.
The IoT protocols chapter discusses IoT protocols with significant industry
adoption that are, on the surface, competing against each other. The chapter will
demonstrate that there is no one-size-fits-all protocol, and the reader will see that
the competition between protocols is not a real competition, as most of them are
designed for different IoT communication styles, architectures, and use cases.


22.1 NETWORK PROTOCOLS

When discussing computer communication, the term protocol is used for describing
communication behavior among multiple physical or logical entities. Surprisingly,
there is no single established definition of the term. In Chapter 19 we referred to
protocols as conventions. In most cases, a protocol describes rules, algorithms,
message representations, and mechanisms for communication.
It is important to remember that while there is no single ubiquitous protocol
for computer communication, there are predefined protocol suites (which means
a well-defined collection of multiple single protocols that build upon each other)
available, like the Internet Protocol Suite we discuss in Section 22.3. The protocols
in these suites are known to work together seamlessly and simplify communication
between devices. Entities in a network can only communicate if they share the same
communication protocols on the layer of the OSI model (see Figure 17.1) at which
they operate.
As with the hardware standards discussed in Chapter 19, all networking and
IoT protocols of significance are open and standardized by independent committees
and working groups like the IETF or ISO. An open standardization process ensures
that challenges of new networking and communication requirements are solved
jointly between all major companies in particular industries, and that new or
enhanced protocols are interoperable between different vendors.

22.2 NETWORK PROTOCOLS IN THE CONTEXT OF THE OSI MODEL

Historically, the OSI model was created as a joint framework for network interoper-
ability between major hardware and software manufacturers. The main idea in 1984
was to have a foundation for a complete protocol suite that enables interoperable
networking between computers. The OSI protocol suite implementation itself was
eventually superseded by the now popular Internet Protocol Suite (see Section 22.3),
which gave rise to the alternative view also discussed in Section 17.2. However, the
model of the OSI protocol suite has gained popularity for educational purposes and
today network protocols are typically structured and described using the OSI refer-
ence model. Not all concepts of other protocol suites like TCP/IP map seamlessly
to the OSI model, whose layers are summarized again in the following:

1. Physical layer
2. Data link layer

3. Network layer
4. Transport layer
5. Session layer
6. Presentation layer
7. Application layer

Each layer has its own set of communication protocols that build upon the
communication protocols in lower layers. One of the main ideas is that different
protocols on a single layer are exchangeable without the need to replace or modify
the protocols on other layers; see for example the replacement of an Ethernet
connection with a WiFi connection for the purpose of using the Internet.

22.2.1 Advantages of a Layered Communication Protocol Model

Although the mapping of complex protocol implementations to the OSI model lay-
ers can get tricky, the advantages of a layered protocol outweigh the inconvenience
of having to modularize communication concepts and software along predefined
boundaries. The advantages of layered communication protocols in the OSI stack
are:
• Abstraction: Higher-level protocols do not need to understand the function-
ality of lower-level protocols. A message routing protocol, for instance, does
not need to know about the specifics of the underlying physical level. It is
completely irrelevant for the message routing protocol what the physical
layer of a connection looks like.
• Exchangeability of protocols: It is possible to replace single protocols in one
of the layers of the OSI model without the need to replace the whole network
stack.
• Simplification: As Chapters 17, 18, and 19 suggest, communication encom-
passes many levels requiring specialist knowledge from microwave physics
to distributed systems programming. The OSI model allows developers to
focus on specific parts of the communication stack without knowing in de-
tail how all underlying or overarching layers work. This is especially im-
portant for the IoT, because many different underlying technologies can be
used while the majority of software developers stay focused on the actual
application layer.

22.2.2 Vertical and Horizontal Communication within the OSI Model

We have seen that the OSI reference architecture (see Figure 17.1) defines layers
that have their distinct responsibilities. When data needs to be sent via the network,
these layers pass messages between them. The layers are connected to each other
via defined interfaces that connect adjacent layers, each following specific message
formats. This concept allows higher-level protocols to pass messages to underlying
layers without knowing the specifics of the implementations of these layers. In
practice, some layers may be skipped and a higher-level protocol can pass a message
directly to a lower-level layer’s interface. For example, OSI layers 5 and 6 are
completely optional and a layer 7 protocol can use a layer 4 protocol directly.
Messages are passed down vertically between the logical layers of the OSI
model until they reach layer 1. Layer 1 protocols transmit data horizontally to other
machines, where the messages are passed up until they reach the destination layer.
Layers 2–7 of the OSI model are logical layers. This means that these
protocols do not pass messages across machines, but only vertically. Only OSI
layer 1 protocols are required to transmit data to other machines on the network,
following conventions as laid out in Chapter 19. When the layer 1 protocol of
a sending party writes the data to the physical link, the receiving party needs to
interpret the incoming data. The receiving party now passes the data up its own
vertical protocol stack, and if the stack is compatible with the sender’s stack, the
message gets unwrapped on each layer until the application layer protocol receives
the actual data that was sent by the application of the sender.

22.2.3 Data Encapsulation

When messages are passed between vertical layers, the actual data from the higher
layer needs to be encapsulated, otherwise receiving parties won’t be able to pass the
messages up in their own stack. These encapsulations contain the so-called headers
required by the protocol and the actual data from the upper protocol (see Figure
22.1). A header is simply extra data added to the start of a message in a well-
defined way. As soon as data is sent by one party, each layer wraps the contents of
the upper layer when the message is passed down the stack and the receiver unwraps
the message on each layer until the actual data is received on the application layer.
An example of encapsulation between OSI layers 1 and 2 in case of Ethernet can be
seen in Figure 19.8.

Layer 7:                          [ H7 | data ]
Layer 6:                     [ H6 | H7 | data ]
Layer 5:                [ H5 | H6 | H7 | data ]
Layer 4:           [ H4 | H5 | H6 | H7 | data ]
Layer 3:      [ H3 | H4 | H5 | H6 | H7 | data ]
Layer 2: [ H2 | H3 | H4 | H5 | H6 | H7 | data | footer ]
Layer 1: physical transmission

Figure 22.1 Data encapsulation in the OSI reference architecture. Each protocol encapsulates the data
and headers from the upper layers when the message is passed down the stack. The receiving party is
able to unwrap the message on each layer when it is passed up the stack. This figure illustrates that a
header is added on each protocol layer. Layer 2 (typically Ethernet in the IoT context) also adds a footer
to the message.
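The wrapping and unwrapping of Figure 22.1 can be sketched in a few lines; the header contents are invented placeholders, and a simple delimiter stands in for the fixed-length binary headers real protocols use:

```python
# Sketch of header encapsulation: each layer prefixes its own header as
# the message travels down the stack, and the peer strips the headers
# in reverse order on the way up. Header contents are placeholders.

def wrap(payload: bytes, headers):
    for h in headers:                 # application -> physical direction
        payload = h + b"|" + payload
    return payload

def unwrap(frame: bytes, n_layers):
    for _ in range(n_layers):         # physical -> application direction
        _, _, frame = frame.partition(b"|")
    return frame

headers = [b"L7:HTTP", b"L4:TCP", b"L3:IP", b"L2:ETH"]
frame = wrap(b"hello", headers)
print(frame)                         # b'L2:ETH|L3:IP|L4:TCP|L7:HTTP|hello'
print(unwrap(frame, len(headers)))   # b'hello'
```

The outermost header ends up belonging to the lowest layer, which is exactly why a receiver can process a frame bottom-up without knowing anything about the layers above.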

22.2.4 Indirect Connection and Message Routing

In typical Internet network architectures, including common IoT solutions, there
is seldom a point-to-point connection between two communication partners without
intermediaries. Especially if Internet communication is used, there are many
intervening systems with different responsibilities involved. TCP/IP uses packets
for communication and messages are routed between many machines in a trans-
parent way. Internetworks, or networks of networks, require protocol translation
mechanisms between the individual networks in order to communicate. For Inter-
net of Things end-to-end architectures, wireless and wired networks are often used
together and sometimes even network or transport protocols are transparently trans-
lated by intermediate machines. For example, IoT devices based on Bluetooth and
mobile phone connections communicate with the corresponding backend system in
the cloud with transparent translation between protocols (i.e., invisible to the user).
These intermediate devices (often routers) do not necessarily implement all
layers in the OSI stack, but only the specific layers that are required to route
or translate messages. Figure 22.2 shows how messages are passed between two
endpoints with a router in between.

Endpoint A                      Router                       Endpoint B

7 Application  <---------------- L7 protocol ---------------->  7 Application
6 Presentation <---------------- L6 protocol ---------------->  6 Presentation
5 Session      <---------------- L5 protocol ---------------->  5 Session
4 Transport    <---------------- L4 protocol ---------------->  4 Transport
3 Network      <-- L3 protocol -->  3  <-- L3 protocol ----->   3 Network
2 Data Link    <-- L2 protocol -->  2  <-- L2 protocol ----->   2 Data Link
1 Physical     <-- L1 protocol -->  1  <-- L1 protocol ----->   1 Physical

Figure 22.2 Message routing with intermediate devices. This figure shows message routing with an
intermediate device (router). The intermediate device does not implement all OSI layers and redirects
the message to the correct destination.

22.2.5 OSI Layers Revisited

The layers of the OSI model (see Figure 17.1) are often categorized into upper
and lower layers. The upper layers consist of level 5 to 7 protocols, which are
often jointly referred to as peer-to-peer or application layer protocols. Most IoT
application software developers will interact with these protocols. The lower layers
are responsible for transport, routing, and physical translation of the upper layers,
and thus more important to understand for IoT hardware developers or embedded
programmers.
TCP/IP, the most common protocol suite for IoT applications, treats layers 5
through 7 essentially as a single layer, so the borders between individual levels of
the upper layers are blurred and not easy to distinguish.
While the OSI model was presented from bottom up in Chapter 17, one aim in
reiterating the model is to highlight important aspects from a top down perspective:
looking at it from the software side of communication.

22.2.5.1 Layer 1: Physical

The physical layer, often referred to as PHY, transfers data physically and sends
it over the network. It translates the logical messages from the upper layers into
hardware-specific operations that form electronic signals. Layer 1 also defines
possible network topology (e.g., bus, ring, mesh, star; see Figure 9.1) and the
transmission mode (simplex, half-duplex, or full-duplex).

22.2.5.2 Layer 2: Data Link

The data link layer is responsible for encapsulating upper layer information into
data frames, which are ultimately sent over the connection in the physical layer.
Layer 2 mainly concerns addressing and delivery of these frames within the same
network, for example using physical addresses such as MAC addresses (media
access control addresses; see also Section 19.2.2 on Ethernet communication). These
addresses are required to distinguish devices accessing the network medium, since
multiple physical devices for network access can be used.
The data link layer is also responsible for error detection and error handling.
The underlying physical layers can be error-prone and transmission errors occur
on a regular basis; this is especially true for wireless networks. Using mechanisms
like cyclic redundancy checks (CRC) of the data in the frame, the data link layer
validates that the received data is correct and can be passed to the upper layers,
or else requests and waits for a retransmission.
It is noteworthy that the most common data link layer protocol used today is
Ethernet (see Section 19.2.2).
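The CRC mechanism can be illustrated with the CRC-32 algorithm that Ethernet uses for its frame check sequence; the framing below (payload plus a trailing 4-byte checksum) is a simplified sketch, not the actual Ethernet frame layout:

```python
# Simplified frame check: the sender appends a CRC-32 of the payload,
# the receiver recomputes it and rejects the frame on mismatch.
import struct
import zlib

def frame_with_fcs(payload: bytes) -> bytes:
    return payload + struct.pack("<I", zlib.crc32(payload))

def check_frame(frame: bytes):
    payload, (fcs,) = frame[:-4], struct.unpack("<I", frame[-4:])
    return payload if zlib.crc32(payload) == fcs else None   # None = discard

frame = frame_with_fcs(b"sensor reading: 21.5C")
print(check_frame(frame))                    # b'sensor reading: 21.5C'
corrupted = b"X" + frame[1:]                 # simulate a flipped byte
print(check_frame(corrupted))                # None
```

A CRC detects corruption but cannot repair it; recovering the frame is then a matter of retransmission, handled either on this layer or further up the stack.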

22.2.5.3 Layer 3: Network

The main responsibility of the network layer is message routing for internetworks,
which are networks of networks. This means data can be moved between inter-
connected networks that are not necessarily geographically located close to each
other. While Layer 2 deals with addressing in local networks, this layer is used for
routing across WANs like the Internet. Layer 2 uses physical addresses (MAC addresses)
and Layer 3 uses logical addresses. The most common logical addresses used in the
Internet are IP addresses.
The network layer protocols encapsulate the higher-level messages in data-
grams that can be split or reassembled if they exceed the maximum frame length of
the Layer 2 protocol. The messages can get fragmented before they are sent to other
machines and are reassembled on these machines before they are passed to upper
layers.
Besides routing and encapsulation, Layer 3 protocols also include diagnosis
and error handling. Protocols like ICMP are used to detect whether logically connected
machines are online and messages are routable to these machines.

Between machine communication endpoints, there are typically a variety of
different machines that route traffic, called routers. These interconnect the actual
networks with each other, and packets can traverse an arbitrary number of networks
by hopping between the different networks via these devices.

22.2.5.4 Layer 4: Transport

The transport layer connects upper and lower layers of the OSI model, but is
typically considered a lower layer protocol, although it is responsible for end-to-
end data transmission. It relies on the lower layers for packaging, addressing, and
routing functionality. The transport layer also provides higher-level abstractions
than lower layers, which can be used by the actual application protocols.
Some transport layer protocols contain mechanisms for process-level addressing
to allow concurrent access to the transport layer for multiple applications.
Segmentation and reassembly are also part of most transport layer protocols: these
processes create chunks of data that are passed to the lower layers. Connection-orientated
Layer 4 protocols like TCP also provide additional features like flow control and
retransmissions.

22.2.5.5 Layer 5: Session

The session layer is the first unambiguous upper layer in the OSI stack. In the
Internet Protocol Suite (see Section 22.3), the concepts of Layer 5–7 are blurred
and a sharp distinction is not always possible. Layer 5 is designed for managing a
session, which is a logical connection between two software programs. This
includes setting up and closing a session. Often, these mechanisms are exposed to
higher-level protocols via programming interfaces like Unix sockets.
The TCP/IP stack does not define an explicit session layer protocol, and connection-oriented
protocols like TCP are hard to distinguish from sessions. The session layer
is one of the examples where the mapping of TCP/IP to the OSI reference
model is not optimal and allows for different interpretations.

22.2.5.6 Layer 6: Presentation

The presentation layer is concerned with modification and preparation of data for
the uppermost layer: the application layer. It is not required and often not even used,
because most of the functionality resides in the actual application layer protocols.
Typical functionality that resides on layer 6 is encryption and compression. Among
the most common examples of encryption are SSL and TLS, although strictly
speaking these encryption mechanisms are present on multiple OSI layers.
Similar to Layer 5, the TCP/IP stack does not map very well to the OSI model,
thus different interpretations are common. Most often the presentation layer is not
part of a protocol discussion within the Internet Protocol Suite.

22.2.5.7 Layer 7: Application

The application layer is responsible for actual application-level protocols. There is a
vast number of application layer protocols available for different purposes. Popular
Layer 7 protocols include: HTTP (for web information; see Section 22.4), SMTP
(for email), Telnet (remote access) or one of the various IoT messaging protocols
such as MQTT (see Section 22.8). Non system-level programmers seldom use lower
OSI layer protocols to interact with the network, but focus on these application
layer protocols. This includes IoT software developers who are not concerned with
hardware, most of whom will work with Layer 7 protocols exclusively.

22.3 INTERNET PROTOCOL SUITE

The TCP/IP protocol suite, often referred to as Internet Protocol Suite, is a collection
of protocols that form the foundation for the global Internet and thus the Internet
of Things. It is important to understand that the Internet Protocol Suite consists
of many different individual protocols that can depend on each other, while other
protocols in the suite are mutually exclusive in their usage. Although the Internet
Protocol Suite is often referred to by the term TCP/IP, the use of TCP as a transport
protocol is not compulsory. The book discusses the individual transport protocols
in detail in this section.
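As a small preview of the transport protocols discussed below, the following loopback sketch shows the connectionless model: a UDP datagram is handed to the network without any handshake or delivery guarantee (port and payload are arbitrary):

```python
# Connectionless (UDP) messaging on the loopback interface: the sender
# transmits a datagram without any connection setup or delivery
# guarantee; on loopback it arrives reliably for demonstration purposes.
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))              # let the OS pick a free port
receiver.settimeout(2.0)                     # avoid blocking forever
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"temperature=21.5", ("127.0.0.1", port))   # no connect()

data, addr = receiver.recvfrom(1024)
print(data)                                  # b'temperature=21.5'
sender.close()
receiver.close()
```

A TCP exchange would instead begin with connect() and accept(), establishing a connection that provides ordering and retransmission, which is precisely the trade-off between the two transport protocols.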

22.3.1 TCP/IP and the OSI Model

The previous sections reiterated that the OSI reference model is a formal educational
tool that is used to separate concerns of specific networking functionality, and that
the somewhat mismatching layers of the OSI model and TCP/IP were historically
competing concepts.
The layers of TCP/IP are simpler from a conceptual perspective and are tightly
coupled to the protocol implementations of the Internet Protocol Suite. In fact, it
is hard to discuss the workings of the Internet and modern computer networking

in general without discussing those actual implementations. As briefly outlined in
Section 17.2, the TCP/IP model only consists of four layers:
1. Link layer
2. Internet (IP) layer
3. Transport layer
4. Application layer

Although the OSI model is not an optimal way of describing TCP/IP and
networking in general, in practice most people tend to use the OSI layering instead
of the TCP/IP layers for discussion about protocol stacks.

22.3.2 Layers of TCP/IP Messaging

The Internet Protocol Suite consists of the following layers:


1. Link layer: This layer is similar to the OSI Data Link, OSI layer 2 (see
Section 22.2.5.2). It operates directly on the physical layer and does not run a
core TCP/IP protocol in most cases, since Ethernet is most often used on this
layer. Historically, protocols such as PPP or SLIP were used on this layer.
For most typical Internet applications including IoT, Ethernet is by far the
most popular protocol on this layer.
2. Internet layer: This layer contains the packet routing protocols that are used
for internetwork communication by using logical addresses. Besides routing,
the protocols on this layer have different responsibilities like packaging, data
manipulation and message delivery. In the classic TCP/IP model, both IPv4
and IPv6 are the key protocols on this layer (along with support protocols
like ICMP). The Internet layer is similar to the Network Layer, OSI layer 3
(see Section 22.2.5.3).
3. Transport layer: The Transport layer adds an important end-to-end abstrac-
tion layer with a variety of responsibilities. These responsibilities include
segmentation and reassembly of application messages. The most common
transport protocols in the Internet Protocol Suite, TCP, and UDP, provide
either a connection-oriented or connectionless communication model with
varying delivery guarantees.
Network Protocols: Internet and IoT 311

4. Application layer: The Application layer is similar to OSI layers 5 to 7. Most
IoT application protocols reside on this level, including MQTT (see Section
22.8), CoAP (see Section 22.5), and XMPP (see Section 22.6). Typically it is
not feasible to implement lower layer protocols for application-specific use
cases, so most custom and proprietary IoT protocols are developed on the
application layer by relying on TCP or UDP. Most IoT software developers,
who are not involved with addressing networking close to the hardware level,
implement their applications using protocols on this layer.

22.3.3 Internet Protocol

The Internet Protocol (IP) is arguably the most important protocol of the Internet
Protocol Suite. It is the foundation of modern computer networking, and conse-
quently the fundamental communication protocol for the Internet of Things. A key
takeaway for application developers is that there are multiple IP versions in use
currently: IPv4 and IPv6. While the main ideas are the same for both, it is impor-
tant to realize the differences between these protocols. This section discusses the
importance and implications of IPv4 and its successor, IPv6, for IoT applications.
The change from IPv4 (which most people refer to as IP by default) to IPv6 goes
along with fundamental architectural changes to the network. IPv4 and IPv6 are
therefore often deployed in parallel until the ongoing final global transition toward
IPv6 is finished. Both IP versions are still important and unfortunately not always
interoperable.
The IP protocols reside on OSI layer 3, the network layer, or TCP/IP layer 2
(Internet layer). The main functions of the IP protocols are:
• Addressing: IP provides logical addresses, called IP addresses. These ad-
dresses are used to interconnect machines on the same or different networks.
• Routing: If the communicating machines do not reside on the same network,
IP provides a routing mechanism to deliver datagrams through intermediate
machines, the routers. In practice, all Internet communication traverses many
different routers until the destination for a datagram is reached. In most
cases, the destination is not directly routable and thus the delivery is indirect
(compared to direct delivery on the same network).
• Fragmentation and reassembly: The various OSI network layer and physical
layer implementations do not necessarily understand arbitrarily large data-
grams. Thus IP includes a mechanism to split datagrams on the sender side

[Figure: a client (192.168.1.1) on the local network communicates across the Internet with a server (91.198.174.192) on a remote network.]
Figure 22.3 Interconnection of networks by using IP addresses. This figure illustrates the interconnec-
tion of networks via IP addresses. The client on the left local network connects to the server on the right
(remote) network. All datagrams are routed via intermediate routers until the final destination is reached.

and a mechanism to reassemble datagrams on the receiver side if the datagrams
exceed a predefined size.
The main purpose of IP is the delivery of data over interconnected networks
by using logical addresses. This interconnection represents the foundation for the
global Internet, and subsequently the IoT. Figure 22.3 illustrates the interconnection
of networks by using IP addresses:
The need for higher-level protocols on top of IP, like TCP, becomes clear
when looking at the characteristics of the Internet Protocol. A characteristic of IP is
an unreliable and unacknowledged delivery. IP does not keep track of datagrams that
were sent nor does it retry the delivery of datagrams. This means that IP is unreliable
and operates on a best-effort basis. If any problem occurs somewhere along the
transmission path (hardware or software failures on any intermediate machines),
the data is inevitably lost. The receiver does not send an acknowledgment for any
received datagram to the source of the data. IP does provide error detection by
using checksums, though. In case of transmission errors the packet gets dropped
(i.e., a router can verify the validity of a packet and discard it). The second
characteristic of IP is that it is connectionless: IP does not have a concept of a logical
connection, which means there is no way for the protocol to ensure the destination is
actually reachable. IP just sends the datagram to the destination address, regardless
of the destination’s capability to receive data. Due to the connectionless nature
of IP, packets can arrive out of order on the receiver side and are not reordered
automatically.
Achieving reliable network communication requires higher-level protocols
like TCP. Reliable communication has its price in terms of latency and bandwidth,
so the efficiency of IP is a key advantage if no reliable communication is required.

22.3.3.1 IPv4

The initial specification of IPv4 was released in 1981 with RFC 791. At the time
IPv4 was specified, providing a technology that scales to such a colossal number
of interconnected machines as we face today with the global Internet was not a
goal. In fact, most people did not expect at that time that the global Internet would
ever grow to billions of machines in only 30 years. Nevertheless, IPv4 proved
to be a solid backbone for communication at scale, although many additions and
new technologies were invented to overcome some of the limitations, for example
network address translation (NAT) at the transition between wide and local area
networks. Today IPv4 is still the standard protocol for interconnecting devices and
machines over the global Internet and is one of the foundations of the IoT.
It is important to realize that IPv4 was designed with a much smaller number
of connected machines in mind. This is why IPv4 has serious shortcomings that
led to the development of its successor, IPv6. The most important shortcomings of
IPv4 are:
• The fixed 32-bit length of the IPv4 address (written as four decimal numbers
between 0 and 255, xxx.xxx.xxx.xxx) means that the maximum number of
available IP addresses is approximately
4.3 billion. This is far less than the current population on Earth, and repre-
sents an issue considering that many people in developed countries currently
own dozens of devices that are all requiring an IP address. With the advent
of the IoT and those technologies penetrating ever further domains of human
existence, the IP address exhaustion is a serious challenge.
• The global IP addresses were initially distributed in classes. These classes
were big blocks of IP address spaces that were reserved for companies and
public institutions who participated in the early Internet. Even after classful
networking was replaced by classless inter-domain routing (CIDR) in 1993,
companies that received a Class A network still hold many unique IPv4
addresses, while IPv4 addresses remain a scarce resource for most Internet
users.
• The advent of small and private networks at home and in companies caused
the development of mechanisms to mitigate the IPv4 address depletion,
like NAT. Today most private networks are connected via one router to the
Internet, while the machines in the network itself communicate within their
own subnets (typically in the private address ranges 10.x.x.x, 172.16.x.x to
172.31.x.x, or 192.168.x.x).
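The private ranges mentioned above can be checked with Python's standard ipaddress module; this is a minimal sketch, and the addresses used are purely illustrative:

```python
import ipaddress

# RFC 1918 private ranges commonly used behind NAT routers
private_nets = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

addr = ipaddress.ip_address("192.168.1.23")

# is_private covers the RFC 1918 blocks (plus a few other reserved ranges)
print(addr.is_private)                            # → True
print(any(addr in net for net in private_nets))   # → True

# A public address, such as one of the wikipedia.org servers, is not private
print(ipaddress.ip_address("91.198.174.192").is_private)  # → False
```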

IPv4 Addresses

An IPv4 address is a unique 32-bit identifier and is primarily used for routing
messages. Each network interface requires an IP address that is unique in the
particular network. For machines on private networks that are not accessible from
the outside, a unique address within a local address range is sufficient.
An IPv4 address looks like the following (in this case, this is the IP address for
one of the wikipedia.org servers): 91.198.174.192. For local networks, an example
IPv4 address looks like the following: 192.168.0.1.
With a length of 32 bits, there is a theoretical maximum of 4,294,967,296
IPv4 addresses. In reality, there are additional reserved IP addresses that must not
be used, such as the 127.0.0.0/8 block or the 240.0.0.0/4 block. Thus, there are
approximately 3.7 billion public IPv4 addresses available.
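The duality between the dotted notation and the underlying 32-bit number can be illustrated with Python's standard ipaddress module; the following is a short sketch using the wikipedia.org address from above:

```python
import ipaddress

# An IPv4 address is just a 32-bit number; the dotted notation is for humans.
wiki = ipaddress.IPv4Address("91.198.174.192")
print(int(wiki))          # the raw 32-bit value
print(wiki.packed.hex())  # the four bytes as they appear on the wire

# Round-trip: the integer maps back to the same dotted form
print(ipaddress.IPv4Address(int(wiki)))  # → 91.198.174.192

# Theoretical maximum number of 32-bit addresses
print(2 ** 32)  # → 4294967296
```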

22.3.3.2 IPv6

In order to address the challenges with IPv4 for the growing global Internet, the
development of IPv6 was finished in 1998 with the release of RFC 2460. IPv6 is the
official successor of IPv4 and is seen as one of the key enabling technologies for the
Internet of Things.
The changes to the IP protocol are extensive and many new mechanisms
were developed alongside IPv6. The most significant improvements over the older
Internet Protocol are:
• IPv6 now uses 128-bit addresses. That means there are approximately
3.4 × 10^38 IP addresses available. Such a huge address space means that the
routable address depletion problem is solved, even for the most overexaggerated
IoT device number estimations. To give this number a perspective, a
common comparison says that 128-bit addresses are enough to assign about
100 unique IP addresses to every atom on the surface of planet Earth.
• The protocol headers were significantly simplified to increase performance
on router implementations. IPv6 does not use checksums, because the un-
derlying protocols (e.g., Ethernet) and upper protocols (e.g., UDP/TCP) are
expected to perform checksum validation.
• IPSec, a security protocol suite for secure communication over insecure
Internet Protocol networks (like the global Internet) was developed alongside
and is built into the IPv6 protocol.

• Mobile IP, a key technology for the mobile Internet of Things, is part of IPv6.
• Additional features and protocol extensions like quality of service support,
IPv4 transition capabilities, and a privacy mode were developed.
The trade-off for the advantages is that the protocol is significantly bigger in terms
of data usage. The overall network protocol overhead is higher than with IPv4.
The IPv4 protocol uses (without optional header fields) 20 bytes and IPv6 has a fixed
header length of 40 bytes (even without header extensions). However, in practice,
IPv6 often performs better than IPv4, because the header is significantly simplified
and no checksum validation is performed. In addition, IPv6 does not perform
any fragmentation itself as these responsibilities are transferred to the upper layer
(which is typically TCP or UDP). For IoT use cases over mobile networks, where
costs are determined by data volume, the usage of IPv6 may be a prohibiting factor
that needs to be considered.
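A quick back-of-the-envelope calculation illustrates both points, the address space and the per-datagram overhead; the traffic scenario at the end is hypothetical:

```python
# Address space: 128-bit addresses give an astronomically large pool
print(2 ** 128)  # → 340282366920938463463374607431768211456 (about 3.4e38)

# Fixed header overhead per datagram (without options/extensions):
ipv4_header = 20  # bytes
ipv6_header = 40  # bytes
extra_per_datagram = ipv6_header - ipv4_header

# Hypothetical metered IoT link: one small reading per minute for 30 days
datagrams = 60 * 24 * 30
print(datagrams * extra_per_datagram)  # extra bytes per month → 864000
```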

IPv6 Addresses

IPv6 addresses are globally unique per device, which makes it possible to address all devices
on the Internet directly. This renders techniques like NAT obsolete for many use
cases. Unfortunately the longer addresses are also harder to handle for humans and
this is why various shortcut notations exist for the textual representation of IPv6
addresses.

A typical IPv6 address: 2620:0000:0862:ed1a:0000:0000:0000:0001

In most IPv6 addresses, several of the 16-bit fields are padded with zeros or
contain only zeros. A shorthand form for representing IPv6
addresses is available. The rather lengthy IPv6 address can be represented as the
following shorthand notation: 2620:0:862:ed1a::1.
Leading zeros of a field can be omitted, and one run of consecutive all-zero
16-bit fields can be replaced with a double colon ("::") in order to generate a
shorthand IPv6 form.
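These notation rules can be verified with Python's standard ipaddress module, using the example address from above:

```python
import ipaddress

addr = ipaddress.IPv6Address("2620:0000:0862:ed1a:0000:0000:0000:0001")

# Shorthand: drop leading zeros, replace the longest zero run with "::"
print(addr.compressed)  # → 2620:0:862:ed1a::1

# The shorthand parses back to the same full (exploded) form
same = ipaddress.IPv6Address("2620:0:862:ed1a::1")
print(same.exploded)    # → 2620:0000:0862:ed1a:0000:0000:0000:0001
print(addr == same)     # → True
```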

22.3.4 TCP

Above IP, TCP is one of the main protocols in the Internet Protocol Suite. Its
significance becomes clear when looking at the alternative name of the Internet
Protocol Suite: TCP/IP. Transmission control resides on OSI layer 4 and provides
a connection-oriented, reliable, and acknowledged communication channel for data

streams over IP. TCP is the foundation of most Internet communication and is used
as underlying protocol for many important application protocols like HTTP (see
Section 22.4), MQTT (see Section 22.8), FTP, SMTP, and SSH.
TCP is a complex protocol and although the protocol itself has been around
since 1981 when RFC 793 was released, many optimizations and extensions were
added. Over the past thirty years the protocol has demonstrated maturity and
robustness. Some implementations deliver only core TCP functionality, while others
support the various extensions. This section aims to give an overview of the most
important characteristics of TCP as underlying protocol for IoT communication.
The popularity of TCP as transport protocol for communication between
machines and computers stems from the following characteristics and functions:
• Transmission reliability: TCP provides reliable communication over poten-
tially unreliable communication channels. The protocol itself ensures that
data, which needs to be sent from one machine to another, gets delivered
and no data is lost. This means TCP handles retransmission of data and
deduplicates data in case of redundantly received packets. Error detection
mechanisms with checksums are also in place to ensure data has not been
corrupted during the transmission process.
• Connection management: TCP is connection-oriented, which means there is
an established logical communication channel between two communication
partners. Before actual application data can be sent, TCP ensures via a three-
way handshake that the communication partners are available and ready for
sending and receiving data. If one of the communication partners determines
that it does not require this channel anymore, the connection will be closed,
which means both ends can detect reliably that the other communication
partner is not available anymore.
• Stream-oriented: All data sent over a TCP connection is sent as a continuous
flow of data. The protocol ensures that the data is received in correct order
and no data is lost in the transmission. Since TCP is not message-oriented
but stream-oriented, this also means that data packaging is not necessarily
the same on the receiver side and sender side. TCP only makes sure that data
is reliably received.
• Bidirectional: TCP is bidirectional and supports full-duplex communication
(given that underlying protocols also support full-duplex communication).
Data can be sent and received over the same connection simultaneously.

• Multiplexing: On a given system, different software processes can send and
receive data by utilizing the same TCP implementation. To differentiate the
producer and consumer applications, port addressing is used. Ports are used
to multiplex data from different applications and pass it to the IP layer. When
data is received, demultiplexing takes place by forwarding the data to the
correct consumer applications.
• Flow control and congestion avoidance: TCP is a stream-oriented protocol
and has mechanisms to regulate the flow of data, since network speed and
bandwidth can vary between different consumers and producers. The flow
control mechanism in TCP ensures that a receiver is not overwhelmed by
too much data and the sender only transmits data if the other party is
ready to receive. Besides these flow control mechanisms, different congestion
avoidance algorithms are available to ensure that no congestion collapse
occurs even when network traffic is a problem. This typically happens if one
of the routers between communication endpoints is overloaded. TCP ensures
the slowdown of the sender to recover the connectivity.
The characteristics of TCP allow application developers to use the network as
a reliable resource. There is no need to handle complex aspects like data loss pre-
vention or network congestion control from the perspective of an application. The
application layer can treat the network as a stable resource with TCP. The protocol
is a popular choice as transport protocol, especially for communication over the
Internet in IoT contexts. However, TCP is not lightweight and embedded devices
may have to dedicate considerable computing resources to establish and maintain a
TCP connection. For local and reliable networks, UDP may be a suitable alternative.
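The connection-oriented model described above can be sketched with the operating system's socket API; this minimal Python example runs an echo exchange entirely on the loopback interface (the payload is illustrative):

```python
import socket
import threading

# Listen/accept on one side, connect on the other, then both sides read and
# write a reliable byte stream over the established connection.

def echo_server(server_sock):
    conn, _addr = server_sock.accept()   # completes the three-way handshake
    with conn:
        data = conn.recv(1024)           # read from the byte stream
        conn.sendall(data)               # echo it back; delivery is reliable

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))            # port 0: let the OS pick a free port
server.listen(1)
threading.Thread(target=echo_server, args=(server,), daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())     # initiates the handshake
client.sendall(b"hello over TCP")
reply = client.recv(1024)
print(reply)                             # → b'hello over TCP'
client.close()
server.close()
```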

22.3.5 UDP

The Internet Protocol Suite defines a second layer 4 protocol: UDP. Compared to
TCP, the UDP protocol is rather simple and has only one duty: pass application
messages to IP. In contrast to TCP, it completely relies on IP without any additional
protocol mechanisms, which means all characteristics of IP for message delivery
detailed in Section 22.3.3 apply. UDP is connectionless, stateless, and nonreliable.
Although UDP does not provide guarantees that are fundamental for some
application protocols, there are use cases where UDP is a good protocol choice.
One of the advantages of UDP is its performance. The protocol overhead is very
low compared to TCP. UDP does not require a stable connection and data can be
sent without any acknowledgments. This saves bandwidth and improves speed at

the cost of reliability and message ordering. An important use case for UDP is
video or music streaming. Missing a single datagram is often not as harmful as high
latencies in such use cases. If multicast capabilities are needed for local network
communication, UDP is a good choice. For constrained IoT devices that do not have
their own TCP stack, UDP may be a lightweight alternative to TCP if unreliable
and unordered delivery is suitable for the use case.
All application protocols that rely on UDP as layer 4 protocol such as CoAP
(see Section 22.5) are required to implement reliability features themselves. UDP
often does not play well with NAT and firewalls, so TCP may be an alternative for
communication over the public Internet. For local networks and networks that are
under a developer’s control, UDP might be a very efficient and performant choice.
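For contrast with the TCP stream model, here is a minimal loopback UDP exchange using the standard socket API; no connection is established and each datagram stands alone (the payload is illustrative):

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# No handshake: sendto() simply emits one self-contained datagram
sender.sendto(b"reading=21.5", receiver.getsockname())

# On a real network this datagram could be lost, duplicated, or reordered;
# on loopback it arrives, and message boundaries are preserved (unlike TCP).
data, addr = receiver.recvfrom(1024)
print(data)                              # → b'reading=21.5'
sender.close()
receiver.close()
```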

22.3.6 Ports

On typical computers there are different applications that need to communicate
over TCP/IP. On most modern operating systems, the TCP/IP implementation is
provided by the operating system itself. A mechanism is required to distinguish
between applications that want to send and receive data, since the same IP address
is used for all these applications. TCP (and also UDP) use so-called ports, particular
subchannels of the connection that are identified by a 16-bit integer (ranging from
0 to 65535) to distinguish between connections from different applications.
When a TCP/IP connection is established, it consists of the following logical
4-tuple:
1. Source IP: The IP address of the party that establishes the connection.
2. Source port: The initiating party’s port number on which the connection is
established.
3. Destination IP: The IP address of the other communication party.
4. Destination port: The port number on the other communication party.
It is worth noting that a client, which initiates a connection, typically uses
ephemeral ports that are chosen randomly (typically port numbers higher than
32768 on Linux systems and higher than 49152 on most other operating systems). A
server application binds to a specific and well-known port like 80 for HTTP servers
or 1883 for MQTT brokers.
Figure 22.4 demonstrates how the concept of multiplexing and demultiplexing
with ports works in order to separate application traffic.
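The logical 4-tuple can be observed directly with Python's standard socket API; in this loopback sketch the server listens on an OS-assigned port while the client receives an ephemeral source port:

```python
import socket
import threading

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_ip, server_port = server.getsockname()

accepted = {}
def accept_one():
    conn, peer = server.accept()
    accepted["conn"], accepted["peer"] = conn, peer

t = threading.Thread(target=accept_one, daemon=True)
t.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((server_ip, server_port))
t.join()

source_ip, source_port = client.getsockname()   # client side: ephemeral port
dest_ip, dest_port = client.getpeername()       # server side: listening port

print((source_ip, source_port, dest_ip, dest_port))  # the connection's 4-tuple
print(dest_port == server_port)                      # → True

client.close()
accepted["conn"].close()
server.close()
```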

[Figure: the client side runs MQTT (TCP port 1883), HTTP (TCP port 80), and CoAP (UDP port 5683) applications over a single IP stack; the server side mirrors the same ports and protocols.]

Figure 22.4 TCP/IP multiplexing and demultiplexing via ports. Multiple applications and protocols
are running on each computer in this example. All applications on the left client are multiplexed in order
to use the same IP address and physical communication channel of the machine. The server on the right
demultiplexes and passes the traffic to the correct applications. Both IP and the transport protocols (TCP,
UDP) provide a way to multiplex and demultiplex.

22.4 HTTP AND HTTP/2

HTTP is arguably the most important protocol for the Internet as we know it today.
It powers the World Wide Web and is supported by virtually any platform and
programming language. HTTP has become very flexible over the years and various
workarounds have been invented to overcome shortcomings of the protocol, which
today makes HTTP a popular choice for IoT communication.
HTTP uses a classic request/response communication model (see Figure
22.5), which allows the client to send data to the server and receive a response after
the processing finished on the server side. While this model is a perfect fit for the
World Wide Web (a client can request a web page and the server delivers the page),
event-driven IoT communication with HTTP is not optimal for server-initiated
messages, because there is no standard way for the server to proactively send
messages. Workarounds like long polling and new protocols such as WebSockets
were invented over the years to solve some of these issues.
An advantage of HTTP is the flexibility of the protocol. This is due to the
ability to add arbitrary headers to any request and response. There are many stan-
dardized headers that are universally accepted like authentication, cache control,
and client preferences for the encoding and the language of the response. HTTP

An example HTTP/1.1 request (sent by the client):

GET / HTTP/1.1
Host: www.example.org
User-Agent: Mozilla/5.0
Accept: */*

And the corresponding response (sent by the server):

HTTP/1.1 200 OK
Date: 14 Mar 01:49:26 GMT
Server: Apache

<html>
<head><title>News</title></head>
<body><h1>Headlines</h1>Some stories</body>
</html>

Figure 22.5 Request/response pattern. The client initiates a request and the server sends a response as
soon as it processes the request. The figure illustrates an example request and an example response for
HTTP/1.1.

headers are extensively used for the World Wide Web by web browsers and web
servers (e.g., for cookies and user agent indications).
Each HTTP response contains universally accepted and standardized response
codes that indicate if the request was successful or if an error occurred. These error
codes indicate the root cause of the issue and the client can take appropriate action
after interpreting the response. The response codes are categorised and the first digit
of the response code represents a category. Table 22.1 shows the categorization of
response codes.
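Because only the first digit carries the category, classifying a response code is a one-line computation; the helper below is a small sketch, not part of any standard library:

```python
# Map the first digit of an HTTP status code to its category (see Table 22.1).
CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirect",
    4: "Client Error",
    5: "Server Error",
}

def status_category(code: int) -> str:
    # Integer division by 100 isolates the first digit of a three-digit code
    return CATEGORIES.get(code // 100, "Unknown")

print(status_category(200))  # → Success
print(status_category(404))  # → Client Error
print(status_category(503))  # → Server Error
```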

22.4.1 HTTP Methods

Each HTTP request contains an HTTP method that causes the server to take a specific
action. The most common HTTP methods that are used for the World Wide Web
and for HTTP APIs are arguably GET and POST. The remaining HTTP methods
are more commonly used in RESTful APIs (see Section 22.4.4). It is important to
note that not all web servers allow all HTTP methods for a specific resource. Table
22.2 provides an overview of the available HTTP methods that are in use today.
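The request/response model and two of these methods can be sketched with Python's standard library; the example below runs a tiny server on the loopback interface (the handler and payload are invented for illustration) and shows that HEAD returns headers without a body:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from http.client import HTTPConnection

class Handler(BaseHTTPRequestHandler):
    def _respond(self, send_body: bool):
        body = b"<h1>Headlines</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if send_body:
            self.wfile.write(body)

    def do_GET(self):
        self._respond(send_body=True)

    def do_HEAD(self):
        self._respond(send_body=False)

    def log_message(self, *args):
        pass  # keep the example quiet

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/")
resp = conn.getresponse()
body = resp.read()
print(resp.status, body)        # → 200 b'<h1>Headlines</h1>'

conn.request("HEAD", "/")
head = conn.getresponse()
head_body = head.read()
print(head.status, head_body)   # → 200 b''  (headers only, empty body)
conn.close()
server.shutdown()
```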

22.4.2 HTTP/2.0

HTTP/2.0 was released in 2015 and is the most current version of HTTP. It is the
first major update of the HTTP specification since 1997. The advancement of the
World Wide Web and HTTP REST APIs in the 2000s changed the demands for

Table 22.1
Categorization of HTTP Response Codes

Format Category Meaning

1xx Informational General information that does not represent a success or a failure
2xx Success The request was successful and the response code provides additional information about the nature of the success
3xx Redirect The requested resource is not available and redirection is needed
4xx Client Error The client sent an invalid request and the operation could not be completed
5xx Server Error An error occurred on the server side

Internet communication with HTTP. The text-based HTTP/1.1 (and its predecessor,
HTTP/1.0) had shortcomings that were addressed in the HTTP/2.0 specification.
The main areas of improvement are:
• Binary communication: HTTP/2.0 is a binary protocol, which is easier to
parse and more compact. A potential disadvantage of the binary nature is
that it is harder to debug for humans than the textual HTTP/1.x protocols.
• Multiplexing: Multiple requests and responses can be carried over the same
TCP connection concurrently. This is a significant advantage over older
versions of HTTP that required one TCP connection per request (or the reuse
of a TCP connection).
• Header compression: With HTTP/1.1 all headers must be transmitted for
every request. Web browsers often send multiple headers that sum up to
multiple kilobytes. HTTP/2 allows the headers to be compressed to reduce
overhead.
• Server push: A web server can push data to the client proactively if resources
have changed.

Table 22.2
HTTP Methods

Method Description

GET Used to retrieve a specific resource on the server.
POST Used to send data to the server (e.g., contents of web forms) that the server needs to process.
HEAD Used to retrieve the headers of a resource but not the actual contents of the resource.
OPTIONS Used to get information about the communication options of a server or a specific resource (e.g., which methods are available).
PUT Used to send data to the server that represents a new version of the actual resource. Sometimes used for uploading data when POST is not appropriate.
DELETE Used to delete a resource.
TRACE Used to return the request back to the client for diagnostic purposes.
CONNECT Used for establishing a tunnel. Typically only implemented by proxy servers.
PATCH Used to update parts of a resource. Not widely used except for RESTful APIs.

These improvements result in a largely reduced overhead and can decrease
latency significantly. HTTP/2 is a good candidate for a request/response protocol in
the IoT context.

22.4.3 HTTP Authentication

HTTP itself is a layer 7 protocol and relies on the TLS standard (see Chapter
27) for securing the communication between client and server. Two authentication
mechanisms are specified in the HTTP standard itself; both use the standardized
Authorization HTTP header. A third, token-based approach is layered on top of
HTTP.

Basic Authentication

This is the most common authentication mechanism and sends the username and
password in the form user:password as Base64-encoded text to the server. This
authentication method requires TLS; otherwise attackers could intercept the
username and password in plain text.
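A sketch of how a client builds this header with Python's standard library; the credentials are illustrative:

```python
import base64

# Base64 is an encoding, not encryption: anyone who can read the traffic can
# trivially decode the credentials, hence the need for TLS.
username, password = "user", "password"  # illustrative values
token = base64.b64encode(f"{username}:{password}".encode("ascii")).decode("ascii")
header = f"Authorization: Basic {token}"

print(header)  # → Authorization: Basic dXNlcjpwYXNzd29yZA==

# Decoding is trivial for an eavesdropper:
print(base64.b64decode(token))  # → b'user:password'
```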

Digest Access Authentication

This authentication type requires a so-called nonce (a randomly generated string)
that is sent by the server to the client. The client calculates an MD5 hash value of
username, password, and other information (including the nonce), and sends the
hashed value to the server. The server can reconstruct the hash value and is able to
determine if a correct username and password combination was used for calculating
the hash value. The username and password information of a client is never sent
to the server in clear text. Attackers are not able to read the plain text value of
the credentials. If this method is used without transport encryption like TLS (see
Chapter 27), man-in-the-middle attacks are still possible.
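The computation can be sketched as follows; this is the classic digest scheme without the optional quality-of-protection extensions, and all values (realm, nonce, credentials, URI) are illustrative:

```python
import hashlib

def md5_hex(s: str) -> str:
    return hashlib.md5(s.encode("utf-8")).hexdigest()

# Simplified digest computation: the password itself never crosses the wire,
# only this hash does. The server recomputes the same value and compares.
username, realm, password = "user", "iot@example.org", "secret"
method, uri = "GET", "/sensor/1"
nonce = "dcd98b7102dd2f0e8b11d0f600bfb0c093"  # random string from the server

ha1 = md5_hex(f"{username}:{realm}:{password}")   # hash of the credentials
ha2 = md5_hex(f"{method}:{uri}")                  # hash of the request
response = md5_hex(f"{ha1}:{nonce}:{ha2}")        # value sent to the server

print(response)  # 32 hexadecimal characters
```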

Token-Based Authentication

This authentication type is layered onto existing HTTP interactions and is not de-
fined as part of the core specification. Token-based authentication builds a federated
model where a single authentication server can provide authentication for multiple
resources. Token-based approaches are often more suitable for IoT networks than
simple username/password schemes. A widely used token authentication model for
HTTP is the OAuth2 specification. For more information, see the later chapters on
security, especially Chapter 26.

22.4.4 RESTful APIs

With the success of web service APIs in the 2000s, many standards like the Simple
Object Access Protocol (SOAP) family emerged. Although SOAP mostly uses HTTP
for its communication, it is not considered particularly lightweight because of
the use of XML and complex protocol details. An HTTP and web-centric approach
for designing web APIs was suggested by Roy Fielding, one of the main authors
of the HTTP specification, in 2000 by introducing the term representational state
transfer (REST).
REST defines an architectural style of APIs that allows clients to access
and modify web resources by using predefined stateless operations. Today, most
RESTful APIs are based on HTTP and are extensively using the HTTP methods,
also known as verbs in the context of REST. There is no single standard for
RESTful APIs, but different encodings such as JavaScript Object Notation (JSON)
or Extensible Markup Language (XML) are used on different conceptual levels,
along with HTTP, Uniform Resource Identifiers (URI) and TLS.
Modern REST APIs are sometimes built with the Hypermedia As The Engine
Of Application State (HATEOAS) principle in mind, which means the API itself
allows the client to discover available actions of the API without the need for
hardcoding actions on the client side.
RESTful APIs differentiate between URIs for collections and resources. A
resource is a particular individual endpoint with a single, logical element that is
typically represented in JSON, XML, HTML, or any other format. A collection is
an accumulation of individual resources.
The following examples show a RESTful API that represents a user dictionary:
• A URL that represents a collection of users: http://my.api.com/users
• A URL that represents an individual user with the name abc:
http://my.api.com/users/abc
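A minimal dispatch sketch for the user dictionary API above: the verb plus the URI path select the operation on the collection (/users) or on an individual resource (/users/abc). The handle function, status tuples, and id scheme are invented for illustration; a real service would of course speak HTTP:

```python
users = {"abc": {"name": "abc"}}

def handle(method: str, path: str):
    parts = [p for p in path.strip("/").split("/") if p]
    if parts[:1] != ["users"]:
        return 404, None
    if len(parts) == 1:                      # collection: /users
        if method == "GET":
            return 200, sorted(users)        # ids of the collection members
        if method == "POST":
            new_id = f"user{len(users)}"     # naive id generation
            users[new_id] = {"name": new_id}
            return 201, new_id
    elif len(parts) == 2:                    # resource: /users/<name>
        name = parts[1]
        if method == "GET":
            return (200, users[name]) if name in users else (404, None)
        if method == "DELETE":
            users.pop(name, None)
            return 204, None
    return 405, None                         # method not allowed here

print(handle("GET", "/users"))         # → (200, ['abc'])
print(handle("GET", "/users/abc"))     # → (200, {'name': 'abc'})
print(handle("DELETE", "/users/abc"))  # → (204, None)
print(handle("GET", "/users/abc"))     # → (404, None)
```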
Table 22.3 lists the meaning of the most common HTTP verbs in the context
of REST.

22.4.5 HTTP for IoT Communication

HTTP is arguably one of the most important standards for Internet communication
and is more popular than ever. If an IoT use case requires a request/response model
and there is no need for push messaging, HTTP is one of the best options. If
bandwidth is a limiting factor, consider using HTTP/2 and JSON (or binary formats)

Table 22.3
Common HTTP Verbs and their REST Meaning

Verb Meaning on Collections / Meaning on Resources

GET On collections: returns the URIs of the collection members. On resources: returns the element.
POST On collections: creates a new entry in the collection and returns the URI of the newly created resource. On resources: typically not used.
PUT On collections: replaces the entire collection with a new one (bulk operation). On resources: replaces or creates the resource.
DELETE On collections: deletes the entire collection (bulk operation). On resources: deletes the resource.

as data encoding to avoid the verbosity of HTTP 1.1. Although HTTP/2 provides
push capabilities, WebSockets and/or MQTT (see Section 22.8) may be a better
alternative if push communication is needed.
Even if other protocols are used for IoT messaging, HTTP in conjunction with
RESTful architectural styles is often useful for delivering historic data to clients
that request it on an infrequent basis. HTTP should be part of your IoT protocol
toolbox.

22.5 CoAP

The Constrained Application Protocol (CoAP) is a web transfer protocol that is
designed for constrained devices and networks. It is an open IETF standard (RFC
7252) and supports different protocol extensions, which are also recognized by
the IETF. CoAP is based on the request/response communication model similar to
HTTP (see Section 22.4), and supports additional protocol features that are useful in
IoT scenarios. The most popular use case of CoAP is in wired and wireless sensor
networks. It is based on UDP, but supports alternative transport mechanisms via
extensions. Due to its frequent usage in constrained and local networks, CoAP is
often proxied over HTTP to be suitable for Internet wide data transfer.
326 The Technical Foundations of IoT

22.5.1 UDP as Transport Protocol

CoAP is based on UDP, which allows a variety of unique features while being slim
and efficient. UDP is not always ideal for Internet communication or communica-
tion between multiple networks due to its unreliable nature: messages may arrive
out of order or get lost entirely. CoAP implements simple mechanisms that add
optional reliability in order to mitigate these issues:
• Simple stop-and-wait retransmission: A CoAP message can be marked as a
confirmable message by adding a protocol flag. This enables an acknowledg-
ment mechanism between sender and receiver. The sender of a message has
the guarantee that the message is received at least once. To avoid congestion,
exponential back-offs are implemented.
• Deduplication: CoAP has a built-in deduplication mechanism based on
message identifiers. This mechanism is in place for all CoAP messages.
These basic mechanisms add reliability to the unreliable UDP transport. They are
not intended to replace more sophisticated features for reliable transport, which
are offered by TCP natively (see Section 22.3.4 for more information about the
reliability features of TCP). CoAP is capable of using other transport protocols
like TCP or Short Message Service (SMS), as specifications for these additional
transport protocols exist.
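The stop-and-wait retransmission described above can be sketched in a few lines of Python. This toy simulation over a loopback UDP socket is not real CoAP framing; the two-byte message ID and the ACK payload are invented for illustration:

```python
import socket
import threading

# Toy ACK responder standing in for a CoAP server.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))

def respond_once():
    data, addr = server.recvfrom(1024)
    server.sendto(b"ACK" + data[:2], addr)     # acknowledge by echoing the message ID

threading.Thread(target=respond_once, daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
message = b"\x12\x34 temperature?"             # 2-byte message ID + payload
ACK_TIMEOUT, MAX_RETRANSMIT = 0.5, 4           # names borrowed from RFC 7252
timeout, acked = ACK_TIMEOUT, False

for attempt in range(MAX_RETRANSMIT + 1):
    client.sendto(message, server.getsockname())
    client.settimeout(timeout)
    try:
        reply, _ = client.recvfrom(1024)
        acked = reply == b"ACK\x12\x34"        # matching ACK ends the loop
        break
    except socket.timeout:
        timeout *= 2                           # exponential back-off before retrying
```

The sender keeps retransmitting the confirmable message, doubling its timeout each round, until an acknowledgment with the matching message ID arrives or the retransmission budget is exhausted.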

22.5.2 Protocol Features

One of the main design principles of CoAP is to use minimal resources on the devices
and networks. The protocol offers many features that make it useful for a variety of
IoT applications:
• Lightweight: CoAP has a fixed 4-byte header for all its messages. The header
may be followed by compact binary options and a payload, which is the
application message. Requests and responses both share the header concept.
As discussed, CoAP uses the datagram-oriented UDP, which is lighter than
the stream-oriented TCP.
• RESTful semantics: CoAP uses RESTful semantics, similar to HTTP, and
defines the verbs GET, PUT, POST, and DELETE. The similarity of HTTP
and CoAP in their RESTful design allows the conversion of CoAP to HTTP
and vice versa via cross-protocol proxy applications.
• Asynchronous message exchange: Due to the datagram-oriented UDP transport
protocol, messages can be sent and received out of order. Requests and
responses are processed in an asynchronous fashion. This characteristic eases
the implementation of client and server software with high message throughput.
• Security with DTLS: CoAP is based on UDP and thus TLS, the most common
transport security protocol, cannot be used as it relies on TCP. For estab-
lishing a secure and encrypted communication channel between client and
server, CoAP leverages the Datagram Transport Layer Security (DTLS) protocol,
an alternative security protocol for UDP based on TLS, with a similar feature
set that deals with packet reordering, loss of datagrams, and large data
transfers. Section 27.2.6 covers DTLS in detail.
• Built-in discovery: The CoAP protocol has a built-in resource discovery
feature. This enables clients to automatically detect all resources provided
by a CoAP server. The discovery mechanism is especially useful for M2M
scenarios without human interaction.
• Observable resources: RFC 7641 describes a CoAP extension that allows
clients to observe resources. After the client sends an Observe request and re-
ceives the corresponding grant, all updates to the resource are automatically
pushed by the server to the client. As a result, the client does not need to poll
the resource for updates. Observable resources are an efficient mechanism if
resources are updated frequently.
• Group communication and multicast: CoAP supports sending a message as a
UDP multicast request to an IP multicast group. This results in one message
being sent to a group of clients and enables an easy-to-implement group
communication. RFC 7390 describes the group communication capabilities
of CoAP in detail. Not all servers support this feature, and it cannot be
used together with encryption via DTLS.
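The fixed 4-byte header mentioned in the first bullet packs the protocol version, message type, token length, request/response code, and a 16-bit message ID. A minimal Python sketch of the encoding, following the layout in RFC 7252:

```python
import struct

# CoAP message types and a few request codes from RFC 7252
CON, NON, ACK, RST = 0, 1, 2, 3
GET, POST, PUT, DELETE = 1, 2, 3, 4

def coap_header(msg_type, code, message_id, token=b""):
    """Fixed 4-byte CoAP header (version 1) followed by the optional token."""
    first = (1 << 6) | (msg_type << 4) | len(token)   # Ver(2 bits) | Type(2) | TKL(4)
    return struct.pack("!BBH", first, code, message_id) + token

# A confirmable GET request with message ID 0x1234 -> exactly 4 bytes on the wire
header = coap_header(CON, GET, 0x1234)
```

The entire request metadata fits into four bytes; compare this with the dozens of bytes a typical HTTP request line and headers require.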

22.5.3 Use Cases

CoAP is an IoT protocol that is suitable for a variety of use cases. A prominent
use case for CoAP is constrained devices (e.g., microcontrollers or small embedded
systems) in local networks that lack the ability to run a full-fledged TCP stack.
These devices have significant limitations in terms of memory and computing
resources. Due to the lightweight nature of CoAP, the protocol is often used for
sensor and actuator communication in local networks. The communication pattern
is based on request/response principles, and the RESTful design makes it possible to create
REST APIs on devices that are too constrained for HTTP.
CoAP is suitable for wired and wireless networks that are deployed locally.
Deployments that require communication over the Internet or other interconnected
networks may need to use HTTP proxies due to NAT traversal issues with UDP.

22.5.4 CoAP Discovery

CoAP is designed for M2M communication, which means little to no human inter-
action is required. For this reason, CoAP specifies a resource discovery mechanism.
RFC 6690 defines the Constrained RESTful Environments (CoRE) link format,
which specifies a well-known interface resource for CoAP. It is also possible to use
filter queries to reduce the number of entries returned, based on the client’s
capabilities.
A CoAP client can issue a GET request to the standardized /.well-known/core
resource, and the server returns a response with links to the available resources and
their attributes. The discovery mechanism also works with UDP multicast, which
makes it possible to detect all available resources in a network.
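A response from /.well-known/core is a comma-separated list of links in the CoRE link format. The following simplified Python parser (it ignores quoting edge cases such as commas inside attribute values) shows the general shape of such a payload:

```python
def parse_core_links(payload):
    """Parse a (simplified) CoRE link-format payload into (uri, attributes) pairs."""
    resources = []
    for entry in payload.split(","):
        parts = entry.split(";")
        uri = parts[0].strip().lstrip("<").rstrip(">")   # "</sensors/temp>" -> "/sensors/temp"
        attrs = {}
        for attribute in parts[1:]:
            key, _, value = attribute.partition("=")
            attrs[key] = value.strip('"')
        resources.append((uri, attrs))
    return resources

# Hypothetical discovery response from a CoAP server
payload = '</sensors/temp>;rt="temperature";if="sensor",</sensors/light>;rt="light"'
links = parse_core_links(payload)
```

Each link names a resource URI and attaches attributes such as the resource type (rt), which a client can use to pick the endpoints it cares about.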

22.5.5 Comparison to HTTP

CoAP is not intended as a replacement for HTTP. RESTful HTTP interfaces can
be easily migrated to the more compact and efficient CoAP protocol. In practice,
CoAP APIs are more often implemented with those protocol capabilities in mind
that are unique to CoAP (like discovery, asynchronous messaging and multicast).
The CoAP protocol is designed for M2M communication and is more lightweight
than HTTP. A significant difference is that CoAP uses UDP and HTTP uses TCP as
underlying transport protocol.
Both HTTP and CoAP are excellent choices for creating RESTful services,
and both define the common verbs GET, PUT, POST, and DELETE. See Table 22.3 for
details about the verbs. In CoAP environments, both communicating parties often
act as client and server.
HTTP and CoAP are interoperable with cross-proxy support. The CoAP
specification defines how protocol cross-proxies should be implemented. HTTP is
intended for web communication and CoAP for M2M communication, so CoAP is
often used in local networks and is translated to HTTP for Internet communication.

22.6 XMPP

The eXtensible Messaging and Presence Protocol (XMPP) is an open, modular,
and XML-based messaging protocol designed for chat applications. It was initially
released in 1999 as Jabber and gained popularity in the early 2000s for powering
many chat applications. The built-in protocol features were specifically designed for
implementing instant messaging functionality. XMPP is standardized at IETF and
has multiple extensions available under the XMPP Standards Foundation (XSF).
The protocol strives to provide a standard for real-time messaging and is based
on XML stanzas, which are small XML snippets. One of the fundamental features of XMPP
is to stream XML over a network. Besides the focus on instant messaging and chat
applications, XMPP also provides functionality designed for IoT use cases via ex-
tensions.
An XML stanza that requests the current roster from the server:

<iq to="user@example.org" type="get" id="123">
  <query xmlns="jabber:iq:roster"/>
</iq>
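Since stanzas are ordinary XML, they can be assembled with any XML library. A minimal Python sketch producing the roster request above (the attribute values are the example ones; real XMPP libraries additionally manage streams and namespaces for you):

```python
import xml.etree.ElementTree as ET

# Build the <iq> roster request stanza from the example above.
iq = ET.Element("iq", {"to": "user@example.org", "type": "get", "id": "123"})
ET.SubElement(iq, "query", {"xmlns": "jabber:iq:roster"})
stanza = ET.tostring(iq, encoding="unicode")
```

The resulting string is exactly the kind of snippet that an XMPP client streams over its long-lived TCP connection to the server.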

22.6.1 Protocol Features

XMPP was originally designed for chat applications and has built-in features that
are ideal for instant messaging applications. There are a large number of XMPP
Extensions (XEP) that specify new optional features that are not available for all
client and server implementations. Noteworthy features of XMPP are:

• Decentralized architecture: Due to the decentralized nature of XMPP, anyone
can operate XMPP servers, which is similar to e-mail technologies like
SMTP. Federation mechanisms are built into the protocol, so servers can be
chained and messages can be forwarded between servers.
• Extensible: The XML structure of XMPP is the enabler for modularity and
extensibility since custom functionality can be built on top of the core
protocols by defining new XML stanzas. The most common extensions
are published as XEPs, although it is possible to define use case specific
extensions that are implemented by the client and server implementation
used for a concrete scenario.
• Presence and contact lists: XMPP has a contact list feature built in, the
so-called roster. Applications can subscribe to status changes (e.g., online /
do not disturb) of other users or applications. This is similar to the friend list
feature of social networks or popular chat applications.
• One-to-one messaging: XMPP is built for one-to-one messaging between
participants. It is not a peer-to-peer protocol and requires an XMPP server
for the participants to communicate. Extensions for serverless messaging and
multiuser chat are also available.
• Jabber IDs: All XMPP clients have addresses in the form username@domain/
resource like clientid@xmppserver.com/phone1 for distinguishing commu-
nication partners. These Jabber IDs must be globally unique as clients are
addressable via these identifiers.
• Push communication: XMPP communication opens a TCP connection that
stays active as long as a client is online. This allows the server to push
messages directly to the communication partner with minimal latency.
• Security: Security standards like StartTLS and Simple Authentication and
Security Layer (SASL) are built into the protocol. A server can require
the use of encrypted communication or authentication.

A variety of additional features for XMPP are available via XEPs, including:
• Multiuser chat
• Service discovery
• Publish/subscribe messaging
• IoT extensions
The XEPs are extensions to the XMPP core protocols, standardized at the
XMPP Standards Foundation (XSF), and their modularity renders the XMPP
protocol very flexible and powerful. The XEPs
are not required to be supported by server and client implementations, so often
the client libraries and server products are fragmented in terms of their feature
sets, since not all XEPs may be supported by all implementations. All client and
server implementations involved in the communication need to implement the XEPs
desired for the use case.
XMPP was designed from the ground up for chat applications, at a time when
the term IoT had not yet been coined. There are XEPs available that fill the functionality
gap for IoT use cases. Not all servers and clients support these XEPs and they do
not have significant industry adoption yet.

If a use case requires additional functionality that is not available via XEPs,
client code libraries and server products may be modified or extended to support
custom and use case specific extensions.

22.6.2 XMPP as an IoT Protocol

Although XMPP is at its core a chat protocol and not a protocol designed for the
Internet of Things, the messaging functionality is suitable for many IoT use cases.
The use of XML may be a problem for IoT scenarios with constrained hardware, but
the use of the Efficient XML Interchange (EXI) extension may help in such cases.
In addition, there are IoT protocol extensions available that can be used if the server
and client implementations both support these XEPs. Core protocol features like
friend lists or the presence functionality for determining which clients are online
may not be needed for IoT scenarios, although they are part of the core protocol.
The use of globally unique Jabber IDs may also not be needed for all IoT use cases.
XMPP is a good protocol for the Internet of Things if a concrete IoT project
requires some of the core protocol features, such as presence. An example would be
a project where chat needs to be implemented. For scenarios that involve transfer
of data over mobile networks, XMPP may be too verbose by default and may need
considerable modification and use of extensions like EXI. If the core strengths of
XMPP and its built-in features are not needed, other IoT messaging protocols like
MQTT (see Section 22.8) may also be worth evaluating.

22.6.3 Use Cases

The unique characteristics and flexibility of XMPP make it a good candidate
as a protocol choice for some Internet of Things applications. The protocol shines
for use cases that are similar to instant messaging and for use cases where
bandwidth is not at a premium. Built-in features like presence and the contact
lists are useful for implementing chat applications with XMPP. The modularity
with extensions allows the tailoring of use case specific implementations. IoT
applications can profit from the flexibility and modularity of XMPP as long as the
server and clients support the extensions needed. If constrained hardware is in the
mix, XMPP’s use of XML may be too heavyweight. Voice over IP (VoIP) is also
supported by XMPP with the Jingle XEPs, so XMPP is a natural choice for all
chat-related IoT scenarios.

22.7 AMQP

The Advanced Message Queuing Protocol (AMQP) is a protocol for messaging
between applications. It is popular for backend applications and message exchange
in nontrivial application landscapes in order to decouple the systems via messaging
patterns. AMQP is a standardized wire protocol (ISO/IEC 19464:2014) that has
also gained popularity for IoT applications.
AMQP 1.0 is the most current standardized version. As a protocol, AMQP 1.0
— in contrast with older versions — does not specify how server applications (e.g.,
brokers) need to be implemented, but it defines the mechanics of how messages can
be passed between applications via various messaging patterns. This approach adds
flexibility but may also add complexity in the design of the backend application
topology. Multiple enterprise integration patterns can be implemented with AMQP
on the backend side, as well as simple device-to-cloud communication.
The price of AMQP’s flexibility is complexity in client implementations, which
may not be suitable for constrained devices. AMQP is better suited to complex IoT
messaging use cases between more powerful devices and the cloud.

22.7.1 Characteristics of AMQP

AMQP has unique characteristics that are optimal for IoT backend communication
and sometimes IoT device communication:

• Secure: AMQP optionally supports TLS for transport encryption and SASL
as authentication framework out-of-the-box. AMQP clients can upgrade to
TLS communication or initiate the TLS communication directly.
• Bidirectional: Both partners in an AMQP communication can send and
receive messages over the same connection. In the IoT context, device-to-cloud
communication is supported just as well as cloud-to-device communication.
• Multiplexed: Multiple logical sessions between communication partners can
be established over the same connection. An application could for example
write to different queues or consume from different topics or queues over the
same connection.
• Portable: AMQP defines its own extensive type system for data representa-
tions that can also be used for application data. The protocol is not platform-
specific and libraries exist for multiple programming languages.
Figure 22.6 AMQP building blocks. An AMQP communication consists of different building blocks:
containers with multiple nodes, a connection between the containers with one or more sessions, and links
between the nodes.

• Compact: Due to the binary protocol format of AMQP, it is compact compared
to text-based protocols. The efficient type system allows encoding
metadata as well as application data with minimal overhead. Despite being
compact, bandwidth requirements are often higher for AMQP compared to
other protocols like MQTT.
• Flexible: AMQP is designed to support multiple communication patterns
and messaging topologies like peer-to-peer, client-to-broker, and broker-to-
broker. Many different messaging patterns can be implemented with AMQP,
such as publish/subscribe, point-to-point communication via queues, and
request/response. AMQP allows implementations to add domain-specific
queuing models.
• Reliable: Multiple reliability guarantees are possible for messages, from fire-
and-forget to exactly-once delivery. Recovery strategies are defined by the
protocol for all failure cases.

22.7.2 Basic Concepts

AMQP is designed to be a flexible messaging protocol and has basic abstractions
and concepts that need to be understood. Figure 22.6 shows the basic elements of
an AMQP communication.
The building blocks of AMQP are:

• Containers: A container is an application that wants to communicate via
AMQP. A container usually contains one or more nodes.

• Node: A node is a communicating entity inside a container. A queue or a
topic (for publish/subscribe messaging) would be such an entity.
• Connection: A connection is an abstraction for the actual transport. Before
any further communication can happen, a connection needs to be established
between the containers. One container establishes the connection (in IoT
contexts most often a client instead of a server). Connections are handled
as precious resources and stay open as long as possible. Typically TCP is
used for connection establishment.
• Session: A connection can have multiple sessions. A session is basically a
sequential communication channel abstraction that can hold multiple links.
A session provides a window-based flow control model to limit the number of
transfer frames a sender and receiver can handle (which is defined by their
respective buffer sizes).
• Link: A link is a communication path between containers. It is created over a
session that enables the transfer of messages. A link is unidirectional, which
means one container acts as receiver and one container acts as sender. Links
can be created by either container at any time. The links are established
between a node on each container. An example would be to establish a link
to a queue on the server from a client.

AMQP offers different levels of abstraction for high flexibility. These concepts
are the basis of complex messaging scenarios. Decoupled backend applications
in particular profit from the flexibility that AMQP brings, but this comes at the
cost of complexity. The basic building blocks of AMQP are suitable for backend
systems as well as device-to-cloud communication.
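The nesting of these building blocks can be made concrete with a toy model. The Python classes below are purely illustrative, modeling the hierarchy described above; they are not the API of any real AMQP library:

```python
from dataclasses import dataclass, field

@dataclass
class Node:              # a communicating entity, e.g., a queue or topic
    name: str

@dataclass
class Link:              # unidirectional path between two nodes
    source: Node
    target: Node

@dataclass
class Session:           # sequential channel over a connection, holding links
    links: list = field(default_factory=list)

@dataclass
class Connection:        # one transport (usually TCP) between containers
    sessions: list = field(default_factory=list)

# A client container opens one connection with one session, and a sending
# link from its local node to a queue node on the server container.
telemetry_queue = Node("telemetry")
session = Session(links=[Link(source=Node("device-1"), target=telemetry_queue)])
connection = Connection(sessions=[session])
```

Walking the object graph from connection to session to link to node mirrors exactly how an AMQP client sets up its communication before the first message is transferred.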

22.7.3 Protocol Features

AMQP aims to be a versatile protocol for all application messaging needs. It is
not surprising that the protocol offers many features that are vital for business
application messaging as well as IoT messaging:

• Security support built into the protocol: AMQP uses TLS for transport en-
cryption. Applications that initiate the connection can either use plain TCP,
start with a TLS handshake, or connect via plain TCP and then upgrade to
TLS. The protocol supports the flexible SASL standard for authentication,
including complex authentication scenarios.

• Flow control: AMQP has flow control built into the protocol to prevent
application overload from too many messages. Two flow control models are
supported: session flow control, which protects the infrastructure of a container
from being overloaded by too many messages, and link flow control,
which protects a single application against overload. These flow control
mechanisms enable the creation of robust messaging infrastructures.
• Type system: The protocol has a full type system available for AMQP
messages. This type system defines platform agnostic and custom encodings
for data representations. This is useful for adding metadata to the message
envelope of AMQP messages. The application data may be encoded as well
with this type system. Alternatively, it is possible to use application-specific
payload formats like JSON, XML, or any other representation that suits the
use case.
• Brokered or peer-to-peer communication style: Containers can communicate
directly in a peer-to-peer fashion. The flexible building blocks of AMQP en-
able completely decoupled communication (e.g., by using publish/subscribe
mechanisms) via message brokers. For IoT applications, a typical use case
is to use a brokered model instead of peer-to-peer communication if cloud
communication is in the mix.
• Multiple messaging patterns possible: The flexible nature of AMQP allows
different messaging patterns. AMQP supports classic decoupled messaging
via queues and also allows publish/subscribe communication. Most enterprise
integration patterns can be implemented via AMQP, so even the most complex
messaging use cases are possible. The protocol includes support
for transactions across containers.

22.7.4 AMQP for the Internet of Things

AMQP is a flexible protocol that has its roots in (backend) business application
messaging but has recently gained traction as an IoT protocol. The abstractions and
basic building blocks of the protocol are suitable for the implementation of most
IoT messaging use cases, from device-to-cloud and back.
The feature-richness comes at the price of complexity: many client library
implementations have a significant overhead if they support all AMQP features,
which may lead to challenges for constrained devices. The protocol overhead on
the wire is, despite AMQP being a binary protocol, higher than other IoT protocols,

such as CoAP (see Section 22.5) or MQTT (see Section 22.8). For devices with suf-
ficient computing power and bandwidth, AMQP may be a good choice, especially if
complex messaging patterns need to be supported by the IoT application. Advanced
features like flow control are useful for making sure constrained devices are not
overwhelmed by messages they can’t handle. Implementing horizontally scalable
cloud backends with AMQP may be challenging, depending on which AMQP soft-
ware is used.

22.7.5 AMQP 0.9.1 vs 1.0

AMQP 1.0 was a big step towards a standardized AMQP version that is
universally applicable. The popular version 0.9.1 is still in use by many message
brokers. Versions 0.9.1 and 1.0 are not compatible and need to be considered as
completely different protocols, both feature-wise and on the wire. AMQP 0.9.1
defines a brokered communication model, while 1.0 allows both brokered and
peer-to-peer communication. All information in this chapter is applicable to AMQP 1.0
as this is the official successor. Greenfield IoT projects that consider using AMQP
need to decide on the version of the protocol. The server and client implementations
are not compatible across AMQP versions.

22.7.6 Use Cases

The flexibility of the AMQP protocol in version 1.0 is useful for IoT communication
and as a backend communication protocol. AMQP has additional complexity and
overhead compared to other protocols like CoAP or MQTT. It may be a good choice
for use cases like the following:

• Creating complex decoupled backend system architectures with different
messaging semantics
• IoT applications that require maximum flexibility on the protocol layer
• Custom application libraries that are built on top of AMQP
• As a wire protocol for messaging APIs like JMS

22.8 MQTT

The MQTT protocol is a publish/subscribe messaging transport protocol, with
version 3.1.1 defined as the ISO/IEC 20922:2016 standard. The protocol is designed for
maximum efficiency and lossless communication over lossy networks, a common
IoT use case. MQTT is a mature protocol that has been powering many SCADA and
Internet of Things applications for years. Although a first version of the protocol was
created in the late 1990s, it was not until 2010 that a first royalty-free version of the
MQTT specification was released to the public, which served as the foundation for
the standardized and open MQTT 3.1.1 specification in 2014.
MQTT is a messaging protocol that is very simple and flexible. It is suitable
for constrained devices and is also popular for mobile apps. The only requirement
for MQTT is TCP as transport protocol (see Section 22.3.4). TLS can be optionally
used for encrypting the communication channel (see Chapter 27). The rich feature
set, scalability characteristics, active ecosystem, and wide support on open and
proprietary IoT platforms positioned MQTT as an important IoT protocol for a
variety of use cases.

22.8.1 Publish/Subscribe

MQTT uses the publish/subscribe pattern (see Figure 22.7) that decouples the
sender of a message from the receivers of a message. A message broker delivers
all messages to the respective clients that are interested in receiving messages. In a
publish/subscribe system like MQTT, the sender of a message and its receivers do
not necessarily know of each other’s existence, as all communication partners
communicate directly with the broker. A single MQTT client can act as both
publisher and subscriber for true bidirectional and decoupled communication.
The MQTT protocol uses subject-based filtering for message delivery to
the correct client by using topics as metadata for every published message. Each
published MQTT message contains a topic that is specified by the sender. The
MQTT broker ensures that all subscribers which are interested in that particular
topic receive the message.
An MQTT topic is a hierarchical, treelike textual data structure that separates
its topic levels with a slash (/) as separator. These topics are highly dynamic
and it is a common use case for MQTT that different devices publish on different
topics. Subscribing clients are able to select the messages via fine-grained topic
subscriptions that can also contain wildcards for multiple selections of the topic tree.
Figure 22.7 illustrates example topics and how they are represented hierarchically.
Figure 22.7 Publish/subscribe pattern. (A) The publish/subscribe pattern decouples message producers
(publishers) and message consumers (subscribers). A central broker is responsible for forwarding
messages to the correct subscribers. A consumer indicates its interest in certain messages by subscribing
to certain topics on the broker. (B) MQTT topics can be represented as tree data structures: the example
topics a/sensors/heat, a/sensors/humidity, and a/example form a tree rooted at a, with a sensors branch
holding the leaves heat and humidity, and a leaf example.
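The subject-based filtering described above, including the wildcard rules detailed in Section 22.8.3, can be sketched as a small matching function. This Python sketch is illustrative and skips corner cases (such as topics beginning with $) that the MQTT specification defines:

```python
def topic_matches(subscription, topic):
    """Check an MQTT topic against a subscription filter with + and # wildcards."""
    sub_levels = subscription.split("/")
    top_levels = topic.split("/")
    for i, level in enumerate(sub_levels):
        if level == "#":                       # multilevel: matches the whole subtree
            return True
        if i >= len(top_levels):
            return False
        if level != "+" and level != top_levels[i]:
            return False                       # + matches exactly one topic level
    return len(sub_levels) == len(top_levels)

# Using the example topics from Figure 22.7
matches = [topic_matches("a/sensors/+", "a/sensors/heat"),
           topic_matches("a/#", "a/sensors/humidity"),
           topic_matches("a/sensors/+", "a/example")]
```

A broker evaluates essentially this check for every published message against every subscription to decide which clients receive a copy.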

22.8.2 Protocol Characteristics

The MQTT protocol has many unique characteristics that are useful for IoT appli-
cations:

• Binary: The MQTT protocol is a binary protocol. Binary protocols are
optimal for transmission and parsing speed, as they bear less overhead
than text-based protocols.
• Efficient: MQTT is designed for minimal protocol overhead. The smallest
MQTT packets (heartbeat requests and responses) consist of 2 bytes, which
is negligible protocol overhead. Many protocol headers are represented as
bitmasks and even the indicated message size uses an optimized algorithm
for length encoding. The protocol does not bear any unnecessary overhead.
• Bidirectional: MQTT clients establish a TCP connection to the broker and
after the connection is successful, the broker is able to send messages to the
device and vice versa. A single MQTT client can publish and subscribe to an
arbitrary number of topics via the same connection and is able to send and
receive messages simultaneously.
• Data agnostic: MQTT does not prescribe a format for the payload in any
way. The payload format of MQTT is binary and a message can contain up to
256 MB of data. Popular choices for applications are JSON, XML, Protocol
Buffers, or custom encodings of the data.
• Scalable: The protocol has proven to be extremely scalable. Broker imple-
mentations exist for handling multiple millions of concurrent MQTT con-
nections simultaneously with very high message throughput. The decoupling
via publish/subscribe mechanisms is the foundation for scaling from a few
devices to a vast number without any modifications of the applications that
are using MQTT.
• Push communication: MQTT uses persistent TCP connections that are ini-
tiated by a client. The broker forwards MQTT messages to interested sub-
scribing clients instantaneously. The whole communication is event-driven
and designed for minimum latency.
• Suitable for constrained devices: The protocol is designed to be as easy
to implement on the client side as possible. There is a huge ecosystem, with
client software libraries available for everything from MCUs to high-level
programming languages like Java, Python, or JavaScript.
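The "optimized algorithm for length encoding" mentioned above is the variable-length remaining-length field of the MQTT fixed header: seven payload bits per byte, with the high bit flagging that another byte follows. A minimal Python sketch:

```python
def encode_remaining_length(n):
    """MQTT 'remaining length' encoding: 7 bits per byte, MSB = continuation."""
    out = bytearray()
    while True:
        byte = n % 128
        n //= 128
        if n > 0:
            byte |= 0x80          # continuation bit: more length bytes follow
        out.append(byte)
        if n == 0:
            return bytes(out)

encoded = encode_remaining_length(321)
```

Lengths up to 127 need just one byte, so the tiniest packets carry almost no framing overhead, while up to four bytes suffice for the maximum payload size of 256 MB.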

22.8.3 Features

Besides plain publish/subscribe based communication, MQTT has a rich feature set
that has proven to be useful for a variety of Internet of Things applications.

• 3 QoS levels: MQTT defines three message delivery guarantees, called
quality of service (QoS) levels, which can be configured for each message
that is sent by a client. These message delivery assurances are particularly
useful if an MQTT client session spans multiple TCP connections, which is
the case for clients that disconnect and reconnect frequently. On a protocol
level, MQTT ensures that QoS 0 messages are received at most once, QoS 1
messages are received at least once, and QoS 2 messages are received exactly
once.
• Retained messages: A publishing client can send a message as a retained
message, which marks this message as last known good value and the broker
retains this message for the topic. As soon as a new client subscribes to a
topic that contains a retained message, it will receive that message imme-
diately after the subscription is successful. This is useful if new subscribers
should immediately receive a message for a given topic instead of waiting
for new messages to occur.
• Topic wildcards: MQTT message topics are structured as trees (see Figure
22.7). In order to select whole parts of the topic tree to subscribe to a subset
of topics, wildcards are available. Multilevel wildcards (#) are available as
well as single-level wildcards (+). Multilevel wildcards select a whole topic
subtree while single-level wildcards match only one topic level in the topic
tree hierarchy.
• Last will and testament: A client can define a last will and testament (LWT)
message when connecting to the broker. The MQTT broker sends this LWT
message on behalf of the client in case it disconnects unexpectedly. Use cases
that require the notification of other clients if a specific client is disconnected
are easy to implement with this functionality.
• Persistent sessions: Clients can connect to the broker with either a clean or
persistent session. Clean sessions are invalidated on the broker as soon as
the client disconnects. Persistent sessions for MQTT clients can be resumed
as soon as the client reconnects. It is the duty of the broker to remember
all session details like existing topic subscriptions and unfinished message
delivery flows for that particular client.
Network Protocols: Internet and IoT 341

• Message queuing for offline clients: An MQTT broker queues all QoS 1 and
QoS 2 messages that a client with persistent session missed while offline.
• Dynamic Topics: Topics are lightweight and many thousands or even millions
of topics can exist in an MQTT system. All topics are created dynamically by
publishing or subscribing to a specific topic without any prior administrative
action.
• Heartbeats: Unreliable networks often create half-open socket problems,
which happen when one end of a TCP connection is no longer available
without the other end being notified. Besides a stalled communication channel,
this can waste resources on the client or broker. To solve this
problem, MQTT has application-protocol-level heartbeats that are configurable
by each client. If a communication partner has not sent a heartbeat
message (or any other MQTT message) within the configured time frame, the
broker or client assumes that the communication partner is offline.
• MQTT over websockets: Most MQTT brokers support MQTT over websock-
ets in addition to the standard TCP transport mechanism, which is useful for
web applications as this allows the use of MQTT for push communication to
the web apps. A true device-to-browser push and vice versa is possible and
virtually every web application can behave as an MQTT client.
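The wildcard semantics described above can be sketched as a small matching function. This is a simplified illustration (the function name `topic_matches` is invented, not part of any MQTT library, and corner cases such as topics starting with `$` are ignored):

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """Check a topic against a subscription pattern.

    '+' matches exactly one topic level; '#' matches the whole
    remaining subtree and must be the last level of the pattern.
    """
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for i, p in enumerate(p_levels):
        if p == "#":
            return True                      # multilevel: rest of subtree matches
        if i >= len(t_levels):
            return False                     # topic is shorter than the pattern
        if p != "+" and p != t_levels[i]:
            return False                     # literal level mismatch
    return len(p_levels) == len(t_levels)    # no unmatched trailing topic levels

# topic_matches("home/+/temperature", "home/kitchen/temperature") -> True
# topic_matches("home/#", "home/kitchen/temperature")             -> True
# topic_matches("home/+/temperature", "home/kitchen/humidity")    -> False
```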

22.8.4 Use Cases

The main use cases for MQTT are classic IoT scenarios that require an event-driven
experience with push messaging, especially if reliability over unreliable networks
(e.g., mobile networks) is needed. MQTT shines for connecting constrained devices
to the Internet. The protocol is popular for use cases with low bandwidth and high
latencies as well as connecting devices with mobile applications and web appli-
cations over a central communication channel. There are broker implementations
available that scale to millions of concurrent MQTT devices.

22.9 OTHER PROTOCOLS

In addition to the protocols that were covered in this chapter, a plethora of other IoT
application layer protocols exist for specialized use cases.
In the context of the Industrial Internet of Things and Industrie 4.0, protocols
like the Data Distribution Service (DDS) and OPC Unified Architecture (OPC
UA) are widely used for M2M communication. These are not general-purpose
protocols for IoT communication, but have their strengths
in industry-specific scenarios. DDS and OPC UA are comparatively complex and
have a steep learning curve compared to the protocols discussed in this chapter,
owing to their domain-specific and rich feature sets, including semantics for data.
However, both are industry standards in their respective fields and have gained
significant traction.
Other protocols without widespread adoption also exist, like Blocks Extensi-
ble Exchange Protocol (BEEP) or Very Simple Control Protocol (VSCP), as well
as MQTT for Sensor Networks (MQTT-SN). These protocols can be considered as
IoT protocols for specialized use cases.

22.10 CHOOSING AN IOT PROTOCOL

Application developers for IoT applications face the challenge of having to choose
between a variety of suitable protocols for their projects. All of the protocols dis-
cussed in this chapter have a place in the IoT world and depending on requirements,
one of these protocols will be the more natural choice. None of these protocols is a
silver bullet; hence, it is a good thing that all these protocols are available as choices
in the IoT toolbox. Table 22.4 summarizes key facts about each protocol.

Table 22.4
IoT Protocol Comparison

                  HTTP       CoAP        AMQP              XMPP           MQTT

Architecture      Req/resp   Req/resp    Point-to-point,   Req/resp,      Pub/sub
                                         Pub/sub           Pub/sub
Representation    Text,      Binary      Binary            Text           Binary
                  binary
Transport         TCP        UDP         TCP               TCP            TCP
Security          TLS        DTLS        TLS               TLS            TLS
(Transport)
Discovery         No         Yes         No                No             No
                                                           (XEPs exist)
Scope             D2C        D2D         D2C, C2D          D2C, C2D       D2C, C2D
Suitable for      Possibly   Yes         Possibly          Possibly       Yes
Constrained
Devices
Server Updates    Pull       Pull,       Push              Push           Push
                             Push

Abbreviations: C2D = Cloud-to-Device; D2C = Device-to-Cloud; D2D = Device-to-Device.


Chapter 23
Backend Software

The backend of an IoT solution provides the integration layer for connected end
devices. That is, the backend collects data from sensor devices, it routes control
commands back to actuators, it saves data for long-term storage, and it forwards
relevant information (either raw or processed data) to frontend applications that
run in web browsers or on mobile devices. No matter what the actual technical
implementation looks like, the backend is the hub in the center of a star network
that mediates between all other players of an IoT solution. Conceptually, there
is no difference between an IoT backend that runs in the cloud, on on-premises
infrastructure, or directly in a local network (see also Section 8.4). However, with
the overarching aim of integrating information across otherwise isolated silos, a
cloud-based IoT platform may be preferred over a local installation.

23.1 IOT PLATFORM SERVICES

Most developers of IoT solutions may opt for a commercially available IoT platform
service. From a customer perspective, there are two different types of services,
highlighting that the term IoT platform often means different things to different
people:
• PaaS, from established cloud providers such as Amazon, Microsoft, etc.
• SaaS, often built on PaaS, such as Xively, Evrythng or Opensensors, etc.
The first type provides platform-as-a-service (PaaS, see Figure 8.2) for other
software developers and makes accessible functions for ingestion, storage and
processing of data from IoT devices. In contrast to building IoT software from


scratch using open-source components (e.g. with the SMACK stack consisting of
Apache Spark, Mesos, Akka, Cassandra and Kafka), these PaaS offerings wrap a
considerable amount of convenience code around core functionality, so that their
users can develop IoT backends with less effort. What these IoT backends comprise
is entirely dependent on the requirements of the user, and many developers may
prefer a PaaS over a software-as-a-service (SaaS) solution for the ability to write a
completely self-contained piece of software.
The second type, SaaS, is turn-key ready IoT software that can be adjusted
to new sensor devices and graphical representations in a web browser. However,
in contrast to PaaS, these solutions often attempt to provide one-size-fits-all func-
tionality. On this level of abstraction, there are certain compromises to be made.
While PaaS-based software can be built toward a specific use case, especially in
terms of a data model, the look and feel or the interface to frontend applications,
a SaaS-based backend often has one particular data flow, one look and feel, and
relatively inflexible APIs. The obvious advantage of SaaS is that these platforms
can be configured and used by lay people, in contrast to PaaS offerings that require
considerable programming skills.
The principal functions of an IoT backend are input, storage, processing and
output. These are the most abstract building blocks, which are implemented
with the level of detail required to represent the properties of the devices as
comprehensively as necessary. This is most intuitive when considering storage: When
designing an IoT backend for a connected thermostat on the basis of PaaS, one could
think about writing to a disk table with the columns thermostat ID and temperature.
With a SaaS-based solution, thermostat identity and temperature remain on the ab-
straction level device ID and measurement (and those terms would be used when
accessing the device through an API), and some of the business logic in a frontend
app needs to be dedicated to map these generic terms back to the requirements of
the application.
At the time of writing, there are at least a dozen major cloud providers
with specific IoT PaaS offerings in the market, along with over one hundred SaaS
platforms that can be adjusted to particular use cases. This set is complemented by
domain- and vertical-specific platforms for building control, asset tracking, and so
forth, totaling approximately 700 IoT platforms with a huge range in scope.

23.2 FUNCTIONS OF AN IOT BACKEND

The precise implementation of the IoT backend depends on the use case. Figure
23.1 puts the building blocks input (message broker), processing (e.g., analytics)
and output (storage, interface to apps) in relation to each other. It should be
noted that this schematic represents just one possible flow of data, and that not all
backends require these building blocks, but may need others instead.

23.2.1 Message Handling

A message broker is event-oriented middleware for the asynchronous routing of
messages, typically fed by a queue, from one source process to many destination
processes. This message distribution can be a simple one-to-many replication,
but can also involve routing based on content: the diversion of device-specific
information in Figure 23.1 would be an example of this. Depending on the protocol
used by the broker, the destination processes do not necessarily have to reside on the
same hardware, but can be distributed in the entire accessible address range within
a network. As such, the broker is at the core of many distributed backend systems
such as IoT platforms. This decoupling (also called command query responsibility
segregation [CQRS]) is useful for a variety of reasons. First, machines running slow
processes involving disk input-output (such as databases) can be built differently
from those that do not require disk operations, but fast computation. This is also
referred to as loose coupling between the broker and services that subscribe to
it. Second, with a broker one can implement redundancy and disaster recovery
strategies and dynamically switch between machines. Third, a set of brokers on
different machines can also contribute to scalability, as a logical funnel consisting
of many brokers (for example, 5 → 3 → 2 → 1) can distribute the data streams of
many incoming processes to machines with the highest remaining capacity. This is
a way of load balancing.
In the context of the IoT, a message broker is not only a module for internal
communication, but primarily responsible for data ingestion. That is, rather than
taking messages from local processes, end devices outside the network represent
endpoints and send data to the broker, which then distributes them internally. While
local message passing knows a variety of protocols and mechanisms, all of which
take into account the difficulties of developing for distributed systems (see Section
20.1), in the context of complex backends for the IoT the publish/subscribe pattern
is the most prevalent. IoT protocols are outward-facing (i.e., accepting data from
devices), as discussed in Chapter 22.
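The decoupling described here can be illustrated with a toy in-process broker. `ToyBroker` and its callbacks are invented for illustration and omit everything a real broker needs (network transport, queues, persistence, QoS):

```python
from collections import defaultdict

class ToyBroker:
    """Minimal in-process publish/subscribe hub: publishers and
    subscribers only share a topic name, never a direct reference."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # One-to-many replication: every subscriber of the topic is notified.
        for callback in self._subscribers[topic]:
            callback(topic, message)

broker = ToyBroker()
received = []
broker.subscribe("sensors/temperature", lambda t, m: received.append(("store", m)))
broker.subscribe("sensors/temperature", lambda t, m: received.append(("analyse", m)))
broker.publish("sensors/temperature", 23.6)
# received == [("store", 23.6), ("analyse", 23.6)]
```

Because the storage and analytics callbacks never reference each other, either could be moved to a different machine behind a network-capable broker without the publisher noticing, which is the loose coupling described above.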

[Figure 23.1 (schematic): end devices (sensors, actuators) connect through a firewall to the 'core platform' with message broker, stream processing, long-term storage, current data, real-time analytics, device info, device control, and user and access management; a RESTful API behind a second firewall serves web and mobile apps.]
Figure 23.1 Functions of an IoT platform. To highlight the different functions of IoT end devices,
both sensor and actuator identities have been kept separately. First, crossing security measures such as
a hardware and/or software firewall, data is ingested at the message broker, the input component of
the platform. Data is then routed toward multiple blocks with distinct functionality. A long-term storage
component may just save raw measurements from a sensor device to disk. Disk operations are potentially
slow; hence the same data is also routed toward a stream processing component. Data from the stream
may just be held so that frontend applications can request a most recent measurement without the need to
query the comparatively slow disk storage. Simultaneously, real-time analytical functions may provide
basic descriptive statistics over a rolling time frame, or run more sophisticated computational methods
on it. In a predictive analytics scenario, the output of the real-time analytics component may be relevant
for device control. The device can also be controlled via a web or mobile app using the platform API.
The device control unit is also aware of the most recent device diagnostics (e.g., most recent time stamp,
battery status, IP address). From the perspective of a frontend app, once requests have passed through
a firewall, API queries are authenticated against a user and access management system. If sufficient
privileges are available, the API can access end devices via the device control unit, retrieve current or
processed data from real-time analytics, or request historical device data from the long-term storage.

23.2.2 Storage

Much of the data ingested from IoT devices may only be of immediate value (i.e.,
it may only be useful when used directly). However, other data may only become
useful when agglomerated and analyzed over a longer period of time, if its value is
clear at all. The most basic tuple of data comprises:
• Information on the sensor that a measurement originated from
• A unique identification that distinguishes the sensor from other sensors
• The time stamp of the measurement
• As well as the measurement itself
Knowledge about the type of sensor allows a later inference of the reliability and the
unit of the measurement. The identity of the sensor marks it as a valid data point and
may allow the localization of the measurement. The time stamp clarifies either
when the data has been taken and/or when it has been received at the server, and the
measurement stands for itself.
A trivial, nonredundant way of storing this information on a computer system
would be to maintain three text files: one keeping the actual measurements, and
two auxiliary files with information that can be associated with that main table (see
Figure 23.2). However, consider that the main table alone could be several million
lines long, featuring thousands of different sensor identities and several hundred
instrument types. While writing the data to disk in a flat file may be fast, searching
for the data points that fulfill certain criteria requires parsing every line of data.
Hence, most typically there is a database system employed in the backend of an IoT
solution.
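The cost of the flat-file approach can be made concrete. The rows follow the example in Figure 23.2; `scan` and `lookup` are invented names, with a plain dictionary standing in for a database index:

```python
from collections import defaultdict

# Rows as in Figure 23.2: (sensor, sensorID, time stamp, measurement).
rows = [
    ("thermometer", "IDtmp0001", "15/08/2016 15:45:12.57", 23.6),
    ("microphone",  "mph3452",   "15/08/2016 15:46:21.11", 16),
    ("thermometer", "IDtmp0003", "15/08/2016 15:52:00.74", 21.2),
]

# Flat-file style: every query parses every row, O(n) per search.
def scan(sensor_id):
    return [r for r in rows if r[1] == sensor_id]

# Index: built once, each subsequent lookup is O(1) on average.
index = defaultdict(list)              # sensorID -> row positions
for pos, r in enumerate(rows):
    index[r[1]].append(pos)

def lookup(sensor_id):
    return [rows[pos] for pos in index[sensor_id]]

assert scan("IDtmp0001") == lookup("IDtmp0001")
```

With millions of lines, `scan` touches every row on every query, whereas the index jumps directly to the relevant positions, which is essentially what a database index does against the storage file.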

23.2.2.1 Relational Databases

Relational databases (or relational database management systems, RDBMS) borrow
conceptually from the simple table solution outlined in Figure 23.2. However, the
data is not saved to disk in raw text format, but more often in compressed and
structured binary formats that allow quick access to particular data points. This is
primarily facilitated by additional indices, optimized data structures that associate
particular values or value combinations with the respective data points’ positions in
the large storage file. When a search is executed on a table, the RDBMS can quickly
locate the relevant data points from a lookup in the index, and directly retrieve these
values by jumping to the respective position in the storage file.

a simple table of measurements:

sensor        sensorID    time stamp               measurement
thermometer   IDtmp0001   15/08/2016 15:45:12.57   23.6
microphone    mph3452     15/08/2016 15:46:21.11   16
thermometer   IDtmp0003   15/08/2016 15:52:00.74   21.2

from a table on instruments:

thermometer
  type: room temperature
  measures: temperature
  unit: C
  precision: +/- 0.5 C
  manufacturer: XYZ
  model: 123

from a table on assets:

IDtmp0001
  asset type: thermometer
  location: kitchen
  installed: 01/01/2012
Figure 23.2 A set of tables describing measurements of several IoT devices and their associated
information. Arrows refer to relationships between these entities.

The term relational stems from the design feature that tables can be conceptu-
ally linked or related to each other. In the exemplary case, the instruments and assets
tables would be linked to the measurements table via so-called keys (i.e., sensorID
in the assets table refers to sensorID in the measurements table and so on). Good
relational database design follows the ACID rules: atomicity, consistency, isolation
and durability. Atomicity ensures that there is no redundancy in the database and
that tables are designed to represent the smallest level of granularity. In our exam-
ple, this means that instrument and asset values should not be repeated in the main
table. Consistency and isolation mean that even though the RDBMS may introduce
changes on several tables simultaneously, it occurs to the user as if the system had
been halted to sequentially and consistently update the tables. In the example, when
updating the sensor name, this should happen both in the main and the instruments
tables. Durability refers to strategies that ensure that data is consistent even if a fatal
crash should occur during the update process.
Commonly, RDBMS are also referred to as Structured Query Language
(SQL) databases. SQL is an ISO 9075 standard to operate RDBMS, featuring
commands ranging from database creation to controlling indices, importing and
updating data, and most importantly, querying the data. A simple, exemplary
retrieval statement may look like

SELECT sensorID, time_stamp FROM measurements
WHERE sensor = 'thermometer' AND measurement = 20

SQL features the JOIN command to concatenate tables for complex queries, and
allows the retrieved data to be agglomerated (GROUP BY) or ordered (ORDER
BY) for display.
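These concepts can be tried out with SQLite from Python's standard library. The schema, the `time_stamp` column name (in place of the spaced column name of the running example) and all table contents are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE measurements (sensor TEXT, sensorID TEXT,
                               time_stamp TEXT, measurement REAL);
    CREATE TABLE assets (sensorID TEXT, location TEXT);
    CREATE INDEX idx_sensor ON measurements (sensor);
""")
con.executemany("INSERT INTO measurements VALUES (?, ?, ?, ?)", [
    ("thermometer", "IDtmp0001", "2016-08-15 15:45:12", 23.6),
    ("thermometer", "IDtmp0003", "2016-08-15 15:52:00", 21.2),
    ("microphone",  "mph3452",   "2016-08-15 15:46:21", 16.0),
])
con.execute("INSERT INTO assets VALUES ('IDtmp0001', 'kitchen')")

# JOIN relates measurements to assets via the sensorID key.
row = con.execute("""
    SELECT m.measurement, a.location
    FROM measurements m JOIN assets a ON m.sensorID = a.sensorID
""").fetchone()
# row == (23.6, 'kitchen')

# GROUP BY agglomerates, here into a per-sensor-type average.
avg = con.execute("""
    SELECT sensor, AVG(measurement) FROM measurements GROUP BY sensor
""").fetchall()
```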

23.2.2.2 NoSQL Databases

More recently, NoSQL (not only SQL) has gained popularity for a number of
applications. NoSQL databases differ from RDBMS in that they do not follow the
concept of interlinked tables, but represent the data as alternative data structures;
for example
• Key : value store (a hash, or dictionary)
• Graph (key → relationship → value)
• Tuple (more complex structures)
For example, the so-called key : value (KV) storage aims to provide very fast access
to values on the basis of a key word. Although using different internal mechanisms
than RDBMS indices, their function is comparable. One aim of the KV paradigm
is to represent all data so that it can be retrieved with a key. The side effect of this
strategy is redundancy, as many NoSQL databases do not support the notion of
a JOIN for the sake of speed. In order to play out their main advantage of rapid
retrieval even further, many KV stores also attempt to keep as much as possible of
their data in memory. As the physical random access memory of a single computer
is more limited than disk space, this often requires partitioning the entire data set
across multiple computers. In reference to the CAP theorem (see Section 20.1),
many NoSQL systems compromise on consistency in favor of absolute availability.
That is, a query may be processed by any computer holding parts of the KV system,
providing an answer without the guarantee that it is consistent. Eventual consistency
refers to the concept that in the absence of further writes, all nodes of the distributed
database infrastructure will eventually receive all of the data. In analogy to ACID
in RDBMS, there is the notion of BASE (Basically Available, Soft State, Eventual
Consistency).
NoSQL databases are often schemaless. Traditionally, the schema referred to
the structure that RDBMS enforced on a database designer. As the data in NoSQL
databases is not represented in tabular format and a key could be associated with
any given number of arbitrary objects (some NoSQL databases indeed even support
a form of key → document mapping), they are thus schemaless.
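A toy illustration of the schemaless KV idea, using a plain Python dictionary as a stand-in for a KV store; all keys and fields are invented:

```python
# Each key can hold an arbitrarily shaped object; no table definition
# forces all entries into the same columns.
store = {}
store["device:IDtmp0001"] = {"type": "thermometer", "unit": "C",
                             "location": "kitchen"}
store["device:mph3452"] = {"type": "microphone",
                           "calibration": [0.1, 0.4, 0.2]}  # different fields

# Retrieval is always by key; a query like "all devices in the kitchen"
# needs either a scan or a second, redundant key (denormalization in
# place of a JOIN):
store["location:kitchen"] = ["device:IDtmp0001"]
```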

23.2.2.3 Time-Series Databases

Time-series databases often lend themselves to storing sensor measurements in an
IoT backend, as every incoming value is associated with a time stamp (and if not
from the sensor itself, then the time of receipt). While date/time data types are
supported by all major RDBMS, their use can be rather cumbersome when it comes
to retrieval of data. This is where dedicated time-series databases shine: In contrast
to RDBMS where date/time queries often require additional effort; for example, to
include terms along the lines of NOW() - INTERVAL 7 DAY for the retrieval of the
past week’s data, concepts of time points, time intervals, and even geographic time
zones are much better represented in the query language of time-series databases.
Furthermore, the data structures used internally are optimized for searches involving
the various time concepts, often including summary statistics for derived concepts
such as peak time for measurement, and so forth. In practice, many time-series
databases are built on NoSQL principles.
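As an illustration of the convenience gap, a "past week" window query over a sorted in-memory series can be sketched with the standard library; the series values and the `window` helper are invented:

```python
from bisect import bisect_left
from datetime import datetime, timedelta

# (time stamp, measurement), kept sorted by time stamp, which is the
# ordering that time-series databases exploit internally.
series = [
    (datetime(2016, 8, 8, 12, 0), 20.1),
    (datetime(2016, 8, 14, 9, 30), 22.4),
    (datetime(2016, 8, 15, 15, 45), 23.6),
]

def window(series, start, end):
    """All measurements with start <= time stamp < end."""
    times = [t for t, _ in series]
    return series[bisect_left(times, start):bisect_left(times, end)]

# "The past week's data", relative to a reference point in time:
now = datetime(2016, 8, 15, 16, 0)
last_week = window(series, now - timedelta(days=7), now)
# last_week holds the 22.4 and 23.6 readings; 8 August falls outside
```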
Chapter 24
Data Analytics

The computational analysis of data, or data analytics, is not intrinsic to IoT but a
field of general importance. The statistical and algorithmic methods used for IoT
data analytics are by no means special and can be found in other areas as well,
ranging from digital signal processing to business intelligence. Five issues often
invoked in the discussion of IoT data analytics are the
• Volume
• Velocity
• Variety
• Veracity
• Value
of the incoming data. Although data points from individual sensors may be small,
accumulating the data of thousands or millions of end devices yields a considerable
volume. At the same time, this data comes in at the backend concurrently; that
is, sophisticated ingestion strategies may be required (see Section 23.2.1 for the
role of the message broker in an IoT backend). To leverage the data across many
different device classes, maybe even across different application domains, one has
to engage with a huge variety of data. This extends into semantic differences, and
a thermometer for a blast furnace delivers data of a different quality and quantity
than one a medical doctor would use (see Section 25.2 on how ontologies can help
address these issues). With end devices deployed in regions with shaky Internet
connectivity, questions of veracity arise: Is the data recent? Is the data reliable? And
finally, while the data may be of value now, how fast does that value decay if it is


[Figure 24.1 (schematic): questions from a drone flight ('am I falling?', 'should I land?', 'battery level: how many times did I stall?', 'what is the best weather for flying?') mapped against time frames from microseconds to weeks and against computation on device, on stream, or in batch, with methods ranging from signal processing (e.g., Kalman filter) and rules engines to summary statistics and machine learning, yielding operational, performance, and strategic insight.]

Figure 24.1 Scope and time frames of data analytics. The various sensors delivering data on a drone
have primary, secondary and higher-order functions. At the first level, a sudden change in location or a
steep rise in barometric pressure may indicate that the drone is falling. As the sensors involved constantly
fluctuate around the true value (for example, see Section 15.2.1 for the limitations of GPS), the signals
from such sensors must be processed in real-time within microseconds to seconds to prevent the drone
from crashing. This can only be achieved if the calculation is executed directly on the drone, as the
latency of an Internet connection would certainly prohibit a feedback loop including calculations in the
cloud over such a short response time. Also, given the amount of primary data generated by certain
sensors, their transmission may be entirely impracticable. In the context of the IoT, calculations on the
end device are also referred to as edge computing.

not immediately processed? There is only so much delay a preventive maintenance
solution can afford before becoming obsolete. However, while these are all valid points, the
five Vs listed here are exactly those often associated with big data in general. At this
point, it is important to mention though that IoT does not automatically yield big
data, and that even smaller data contributes to the value of an IoT solution.

24.1 WHY, WHEN AND WHERE OF IOT ANALYTICS

Analytics is typically required to react to conditions that are not immediately
obvious from the observation of raw data alone. The time between when a data
point is generated to when it requires analysis largely depends on the type of insight
that is required. This has a direct impact on where such analysis must be performed:
the flight of a drone provides an example of how analytical questions, time frame
and computational environment are linked (see Figure 24.1). Suppose the drone is
remote controlled using a mobile phone, which is able to receive the data from some
sensors (either raw or preagglomerated at the edge) and forward it to the cloud.
The phone thus represents a gateway device. Computation at this level, halfway
between the edge and the cloud, is occasionally referred to as fog computing, an
active area of product development at the moment. In the example, on the phone,
but more commonly in the backend, the data stream can be processed while it is in
transit (i.e., before it is stored into a database). This is also called stream analytics.
In this case, by simple thresholding, the drone pilot can be informed of a low-
running battery and respond accordingly. Other operational insight gained by stream
analytics may include warnings if the drone leaves a geofenced area. This is insight
relevant (actionable, as we can do something about it) within seconds to minutes.
Some data may only be useful if agglomerated over a longer period of time.
While the flight of the drone continues, over a time frame of minutes to hours,
performance insight may be gained by processing small batches (microbatches)
of data already stored in the database. The average number of near-crashes over
a period of time is an example, executed as a running mean of overlapping or
nonoverlapping windows.
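The running mean over overlapping or nonoverlapping windows can be sketched as follows (function names are illustrative):

```python
from collections import deque

def rolling_mean(values, window):
    """Mean over a sliding (overlapping) window of the given size."""
    buf, out = deque(maxlen=window), []
    for v in values:
        buf.append(v)
        if len(buf) == window:
            out.append(sum(buf) / window)
    return out

def batch_means(values, window):
    """Mean over consecutive nonoverlapping windows (microbatches)."""
    return [sum(values[i:i + window]) / window
            for i in range(0, len(values) - window + 1, window)]

# rolling_mean([1, 2, 3, 4], 2) -> [1.5, 2.5, 3.5]
# batch_means([1, 2, 3, 4], 2)  -> [1.5, 3.5]
```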
Ultimately, one can gain strategic insight by more sophisticated computa-
tional methods, such as unsupervised or supervised machine learning. These meth-
ods are typically applied to large amounts of data and allow the extraction of key
information by the integrative analysis of many, often diverse data. The assessment
of which weather conditions are best for flying a drone would be an example.

24.2 EXEMPLARY METHODS FOR DATA ANALYTICS

A comprehensive overview of the wealth of computational methods for data analytics
is beyond the scope of this book. Data analytics is the realm of data scientists,
and as the job title suggests, the scientific method and an experimental approach
is key to successful data analysis. However, to give a conceptual idea of how po-
tential data analytical methods work when used in an IoT solution, examples from
signal, stream and batch processing are presented. It is important to note that the
mechanisms shown here are exemplary for the respective layer of an IoT solution;
that is, while in this book supervised learning is associated with batch processing in
the backend, there are viable solutions that use a pretrained classifier (see Section
24.2.3.3 for the language of machine learning) on a data stream.

Table 24.1
Pros and Cons of Computing on the Edge or in Fog or Cloud

Edge (response time: immediate)
  Pro: immediate compression from raw data to actionable information,
       cuts down traffic, fastest possible response
  Con: loses potentially valuable raw data; developing analytics on
       embedded systems requires specialists; compute costs valuable
       battery life

Fog (response time: milliseconds)
  Pro: many of the pros of edge computing, closer to normal
       development work, gateway often mains-powered
  Con: loses potentially valuable raw data

Cloud (response time: seconds)
  Pro: computing power, scalability, familiarity for developers,
       integration center across all data sources, cheapest
       real-time option
  Con: traffic

24.2.1 Exemplary Methods for Edge Processing

The Kalman filter (after Rudolf Kalman, ca. 1960) and the Bloom filter (after
Burton Bloom, ca. 1970) are both useful to preprocess information on the edge.
While the Kalman filter helps to determine a true value from a series of erroneous
measurements as quickly as possible, the Bloom filter is per se not an analytical
method, but helps to determine with limited memory if a value has been previously
measured.

24.2.1.1 Bloom Filter

In an IoT context, the Bloom filter has advantages if only the extent of values
is of relevance in the backend and if data transmission is very expensive: even
without the massive storage of historical data, using a Bloom filter on an embedded
system can decide whether transmitting a particular value is necessary to achieve
complete backend coverage. In the simplest terms, the Bloom filter is a specialized
hashing technique that is more memory-efficient than a standard dictionary. It works
by taking a probabilistic model into account, rather than applying just a simple
key → value mapping, which is more accurate but impractical on systems with limited
memory.
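A minimal sketch of the idea; the bit-array size and the salted-hash scheme are illustrative choices, not an optimized implementation:

```python
class BloomFilter:
    """Probabilistic set membership: answers 'definitely not seen' or
    'probably seen'; no false negatives, a small rate of false positives."""

    def __init__(self, size=1024, hashes=3):
        self.size, self.hashes = size, hashes
        self.bits = [False] * size

    def _positions(self, value):
        # Derive k bit positions by salting a single hash function.
        return [hash((salt, value)) % self.size for salt in range(self.hashes)]

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = True

    def __contains__(self, value):
        return all(self.bits[pos] for pos in self._positions(value))

seen = BloomFilter()
reading = 23.6
if reading not in seen:   # definitely not transmitted before...
    seen.add(reading)     # ...so send it and remember it
```

The memory cost is fixed (here 1024 bits) regardless of how many values have been seen, which is exactly the property that makes the filter attractive on an embedded system.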

24.2.1.2 Kalman Filter

The Kalman filter is an estimation method that can be used to converge quickly
on the true value from a series of noisy measurements, much quicker than the
comprehensive post hoc analysis of all data points would allow.
The exemplary use of the Kalman filter for determining the actual position of a
GPS receiver on the basis of noisy measurements will serve to explain the principles
behind its function. For reasons of simplicity, suppose it is only the position along
a line that is of relevance. Over time, t, we move along the line. However, since
GPS localization is prone to error, at each t our measurement MEA_t is
going to deviate from the actual position x. Starting at a position somewhere
off our true x, over the duration of t, the Kalman filter determines a model of our
trajectory (our x) by an iterative, weighted averaging of an estimated location EST_t
and the updated but noisy measurement MEA_t (see Figure 24.2A). The Kalman
gain (Figure 24.2B), K, is the ratio between the error of the previous estimate and
our assumed error in the measurement (which we could, say, fix to a constant until
we converge). If K is 0, we neglect the current measurement MEA_t and give
358 The Technical Foundations of IoT

[Figure 24.2, panels A–C. Panel A: the iterative loop of calculating the Kalman
gain, calculating the current estimate from the measurement at t = k and the
estimate from t = k−1, and calculating the new error in estimate for k+1. Panel B:
Kalman gain = (error in estimate) / (error in estimate + error in measurement);
a gain near 0 means the estimate is reliable or the measurement unstable, a gain
near 1 means the measurement is reliable and the estimate unstable. Panel C:
probability distributions for x at t = k−1 and at t = k (taking the Kalman gain
at k−1 into account), shown against the actual measurement and the true course.]
Figure 24.2 Kalman filter. (A) The Kalman filter is an iterative process that takes into account a current
measurement as well as a previous estimate and its associated error. By providing an intuitive weight,
the Kalman gain (shown in (B)), a new estimate can be calculated on the basis of a weighted average
between the previous estimate and the current measurement. (C) A probabilistic view of the Kalman
filter highlights how the Kalman gain can bring two probability distributions for the previous estimate
and the current measurement to convergence.

the previous estimate EST more weight, or vice versa. Formally, this means our
estimate for x at t is

ESTt = ESTt−1 + K · (MEAt − ESTt−1)

We also calculate the new error in the estimate, to be used in the next iteration,
from the assumed error of the measurement and the error of the previous estimate:

E(ESTt) = (E(MEA) · E(ESTt−1)) / (E(MEA) + E(ESTt−1))

This means effectively that, over a number of iterations, the estimated EST
converges on the true x, robust to momentary fluctuations of MEAt. Figure 24.2C
shows a probabilistic take on the Kalman filter.
The Kalman filter is of great practical relevance, both as an on-board
implementation in sophisticated sensor devices that internally compensate for error
and in cases where sensor values need to be interpolated due to signal loss, and so
forth.
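The iterative scheme described above can be sketched in a few lines of Python; the readings, the initial estimate, its error, and the fixed measurement error are invented values for illustration:

```python
def kalman_1d(measurements, initial_estimate, initial_error, measurement_error):
    """One-dimensional Kalman filter as described in the text: blend the
    previous estimate with each noisy measurement, weighted by the gain."""
    est, e_est = initial_estimate, initial_error
    for mea in measurements:
        gain = e_est / (e_est + measurement_error)      # Kalman gain K
        est = est + gain * (mea - est)                  # updated estimate
        # new error in estimate for the next iteration
        e_est = (measurement_error * e_est) / (measurement_error + e_est)
    return est

# Noisy GPS-like readings scattered around a true position of 50.0; the
# deliberately wrong starting estimate of 60.0 converges within a few steps.
readings = [49.0, 51.2, 50.4, 48.9, 50.6, 50.1]
position = kalman_1d(readings, initial_estimate=60.0,
                     initial_error=4.0, measurement_error=2.0)
```

Note how each iteration shrinks the error in the estimate, so later measurements are weighted less heavily than early ones once the filter has converged.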

24.2.1.3 Other Filters

Without going further into their function, DSP methods such as wavelet and Fourier
transformation play a role in cases where the spectral analysis of frequencies is
key to extracting information from a sensor. For example, DSP methods are useful
for the processing of audio and video data to enhance particular sounds or visual
features, as well as dissecting relevant information from high-frequency vibration
sensors. While the traditional Fourier transformation (FT) requires a large number
of summations of trigonometric functions, the so-called fast FT (FFT) exploits the
power-of-two (binary) arithmetic of digital computers to compute the same result
with far fewer operations.
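A recursive radix-2 Cooley–Tukey FFT, which exploits the power-of-two splitting just mentioned, can be sketched as follows (pure Python, for illustration only; production code would use an optimized library):

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; the input length must be a
    power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    # Split into even- and odd-indexed samples and transform recursively.
    even = fft(x[0::2])
    odd = fft(x[1::2])
    twiddled = [cmath.exp(-2j * cmath.pi * k / n) * odd[k]
                for k in range(n // 2)]
    return ([even[k] + twiddled[k] for k in range(n // 2)] +
            [even[k] - twiddled[k] for k in range(n // 2)])

# A pure tone of one cycle over eight samples: its energy lands in bin 1.
signal = [math.sin(2 * math.pi * k / 8) for k in range(8)]
spectrum = fft(signal)
```

The magnitude of `spectrum[1]` carries essentially all the signal energy, which is how spectral analysis isolates a vibration frequency of interest.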

24.2.2 Exemplary Methods for Stream Processing

This section covers methods that operate on data in motion. In principle, most
algorithms do not and cannot differentiate between data retrieved from storage
and data coming from a live Internet connection. However, while the Kalman
filter benefits from more data over time and thus works well for streams, many other
computational methods are designed with complete data in mind. That means that,
although every single value may be equally important, some methods can only
function once they have seen the data at least once in its entirety. This makes stream
processing challenging for a variety of applications. As mentioned before, it is
feasible to apply a machine learning classifier on a data stream, although its training
is going to depend on access to a significant proportion (a batch) of previously seen
(and stored) data.

24.2.2.1 Rules Engines and Rule-Based Systems

Rules engines and rule-based systems make decisions upon incoming data. In the
simplest case, this may be an IF statement that leads to an alarm mechanism if
certain criteria are fulfilled. However, while not specific to IoT applications, the
streaming characteristics of IoT data and the buildup of information from raw data
over time (see Figure 25.1) are key challenges for rules engines: while simple
incarnations that operate on complete datasets only need to consider the priority
of statements (e.g., IF this1 THEN that1, EXCEPT IF this1 AND this2 THEN
that2), the execution of a complicated rules set that is further dependent on a
temporal component can require considerable engineering effort that includes
time-out thresholds, and so forth. For example, this1 and this2 do not
necessarily arrive in this order, and matters are more complicated if the system
allows for longer time frames and permutations of conditions.
Also, conflicting decisions must be resolved. Suppose the rule for battery
low in the example shown in Figure 24.1 is land immediately, while the rule for
location outside geofence indicates to fly back to the original position. One way of
resolving such issues is the rigorous application of Boolean algebra to large sets
of rules built from the smallest logical atoms. Furthermore, more probabilistic
action selection methods can be used.
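A minimal sketch of such a prioritized rules engine, with the drone's conflicting battery and geofence rules resolved by rule priority; the rules, priorities, and actions are invented for illustration:

```python
# Hypothetical rules for the drone example; a lower priority number wins
# when several rules fire at once.
rules = [
    (1, lambda s: s["battery"] < 10, "land immediately"),
    (2, lambda s: not s["inside_geofence"], "fly back to original position"),
    (3, lambda s: s["battery"] < 25, "reduce motor power"),
]

def decide(state):
    """Return the action of the highest-priority rule whose condition holds."""
    matching = [(prio, action) for prio, cond, action in rules if cond(state)]
    return min(matching)[1] if matching else "continue"

# Battery low AND outside the geofence: two rules fire, priority resolves
# the conflict in favor of landing immediately.
decide({"battery": 8, "inside_geofence": False})
```

Encoding conflict resolution as an explicit priority keeps the engine cheap enough to run on a constrained gateway, at the price of dogmatic behavior in unusual situations.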

24.2.2.2 Action Selection

Directly related to rules engines are action selection algorithms. These conceptually
borrow from behavioral science as well as machine learning and operations
research. Suppose there are a number of different possible paths of action on the
basis of a current observation, and each path is associated with a cost (e.g., battery
run time). For a given situation, there might be an optimal path of response. In the
example initiated in the previous section, while battery low suggests the immediate
shutdown of a motor, the drone may still be able to steer back into a geofenced area
if motor power is reduced by just half. Whether this is a sensible response can be
modeled by delineating various actions and outcomes in the form of a graph. Each
node in this graph is associated with a cost, and the transition to other nodes can
be favored or prohibited by the current data (e.g., the current position). While such
network algorithms require more calculations than a Boolean rules engine, they are
naturally less dogmatic and more flexible in response to uncommon situations. This
also means that rules engines can easily be executed even on IoT gateways with
limited power, while sophisticated action selection methods have their place in the
backend.
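Delineating actions and outcomes as a weighted graph turns the selection into a shortest-path problem; here is a sketch using Dijkstra's algorithm, with hypothetical drone states and battery costs as edge weights:

```python
import heapq

def cheapest_path(graph, start, goal):
    """Dijkstra's algorithm over a graph of states whose transition costs
    could model, e.g., battery consumption."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, step_cost in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue,
                               (cost + step_cost, neighbour, path + [neighbour]))
    return None   # goal unreachable from start

# Invented drone states; edge weights are illustrative battery costs.
graph = {
    "outside geofence": {"land here": 2, "half power return": 6},
    "half power return": {"inside geofence": 3},
    "inside geofence": {"land at base": 1},
}
cheapest_path(graph, "outside geofence", "land at base")
```

Transitions that the current data prohibits can simply be dropped from the graph before the search, which is how the current position can favor or forbid certain actions.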

24.2.3 Exemplary Methods for Batch Processing

Batch processing works on batches of data (i.e., amounts that have been accumu-
lated and stored over a longer time frame). Once data is available in its entirety (or
at least providing a reasonably representative snapshot), the statistical properties of
these data can be exploited for human and computational decision making.
On the simplest level this includes basic descriptive statistics; that is, mea-
surements of
• Average
• Median or mode
• Standard deviation
and other metrics of univariate (or multivariate in case of many potentially interde-
pendent variables) analysis.

24.2.3.1 Descriptive Statistics

The average (mean, μ) of a series of measurements {x1, ..., xn} is the sum of these
measurements divided by the number of observations in this series, or

μ = (1/n) · Σi=1..n xi
As the average can be skewed by extreme outliers, frequently a median or a mode
is also reported. The median is the most central value when all observations are
presented in an ordered list, while the mode is the most common value that can
be observed. In the absence of a histogram or table of percentiles that display the
distribution of the data, the standard deviation (SD, σ),

σ = √( (1/n) · Σi=1..n (xi − μ)² )
helps to show how widely values are distributed around the mean. If data is normally
distributed (i.e., all values fall around a mean in a bell shape or Gauss curve), the 68-
95-99 rule says that 68% of all observations fall within one standard deviation
(SD) of the mean, 95% within two SDs of the mean, and almost all
data (99.7%) within three SDs of the mean (Figure 24.3A). Many statistical
tests, among other criteria, indicate a significant deviation from such a normal
distribution if values fall outside the three-SD boundary.
Common ways to determine whether two or more variables (e.g., measurements from
two different sensors) are correlated or independent of each other are the visual
representation of data in the form of a scatter plot, the calculation of various
correlation coefficients, and the analysis of conditional probabilities (see Figure
24.3C). A further explanation and calculation of these metrics is not difficult, but
beyond the scope of this book.
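The descriptive statistics above can be computed in a few lines; the temperature readings below are invented, chosen so that one extreme outlier skews the mean while barely moving the median and mode:

```python
import math
from collections import Counter

def describe(values):
    """Mean, median, mode and (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    ordered = sorted(values)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    mode = Counter(values).most_common(1)[0][0]
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return mean, median, mode, sd

temperatures = [21.0, 21.5, 21.0, 22.0, 35.0]   # one extreme outlier
mean, median, mode, sd = describe(temperatures)
# The outlier pulls the mean (24.1) well away from median (21.5) and mode (21.0).
```

This uses the population standard deviation (divisor n, as in the formula above); for small samples the divisor n−1 is often preferred.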

24.2.3.2 Statistical Reasoning

Statistical reasoning allows making decisions on the basis of data. While descriptive
statistics primarily aims to highlight the properties of data, inferential and inductive
statistics help to understand the differences between groups of data, and whether
these differences can be explained by chance. Interpolation and extrapolation are
statistical methods to infer a dependent variable: for example, using the regression
model shown in Figure 24.3B, one can calculate y for an x that lies between (inter-)
or outside (extra-) the set of previously observed values of x. These models are
inferred by regression.
Statistical reasoning is part of the toolset that data scientists require to infer
and extend a conceptual model behind data. For example, this may be as simple as
comparing the distributions of a particular machine parameter between a functional
and a faulty state using a statistical test. If the difference is statistically significant
(often communicated with a p-value), this parameter alone may be indicative of the
machine failure. Again, it is important to note that a significant finding indicates a
certain probability that our hypothesis is true, but it is not an actual observation.
Regression analysis attempts to find a mathematical model that explains the
dependence of two variables, x and y. In the simplest case, as shown in Figure
24.3B, this is a linear relationship following y = a + b · x. However, more complex
polynomials (e.g., y = a + b · x + c · x²) are also possible. The coefficients a, b,
c, and so forth are determined to minimize the sum (or often squared sum) of
differences between the actual data and the model. For very complex data, it may be
necessary to separate the data points into intervals of x and infer different models
[Figure 24.3, panels A–C. Panel A: a Gaussian distribution with the mean at the
center and bands at 1, 2 and 3 standard deviations covering 68%, 95% and 99.7%
of the data, with a boxplot shown alongside. Panel B: a scatter plot of x versus y
with correlation coefficients cc = 0 and cc = 0.9 and regression fits with R² = 0.9
and R² = 0.7. Panel C: the number of data points versus temperature for the entire
data set and for the subset under filter criteria.]
Figure 24.3 Descriptive statistics. (A) Mean and standard deviation in the context of the Gaussian
distribution. The mean is the center of the distribution and values are distributed symmetrically within
the boundaries of several standard deviations. The boxplot, also called box-and-whiskers plot, is an
alternative graphical representation. In the absence of outliers, mean, median and mode all fall onto the
same line. (B) A scatter plot of two variables, x and y. Shown are three different observations (triangles,
black and gray circles). The values of the triangular series are independent of each other: changes in x
have no impact on y; the correlation coefficient, cc, is 0. In contrast, values of the series of black
circles fall mostly on the diagonal (with a slight offset on the y-axis, the a component in the equation
below), suggesting a correlation of x and y, as indicated by a cc of 0.9. When fitting a linear regression
model to the data, there is good agreement between our generalized model y = a + b · x and the data,
shown as R² = 0.9. If the measurements of x and y were noisier, with values scattering around our
model (indicated by additional gray circles), then R² would be lower (R² = 0.7), making the goodness-
of-fit questionable. (C) Conditional probabilities help making sense of data. The example shows the
distribution of temperature values across the entire data set (black line). When filtering the data set for
those that belong to a particular geographic area, the mean and standard deviation change dramatically.
Data conditioning can therefore have significant impact on further data analysis.
for each interval. The process of curve fitting is mathematically well understood and
implementations can be found in virtually all programming languages.
In the context of the IoT, these models are of great utility to detect outliers
(i.e., values that do not fall onto an idealized curve). If absolute differences between
the model and the data are sufficient to predict, for example, a device failure,
statistical reasoning in combination with a rules engine can be used to trigger
warnings, and so forth.
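A sketch of this idea: an ordinary least-squares fit of y = a + b · x in pure Python, with residuals against the model used to flag potential outliers. The data points and the threshold are invented for illustration:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x (pure Python sketch)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) /
         sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

def outliers(xs, ys, a, b, threshold):
    """Flag points whose absolute residual against the model exceeds the
    threshold; such deviations could feed a rules engine to raise warnings."""
    return [(x, y) for x, y in zip(xs, ys) if abs(y - (a + b * x)) > threshold]

xs = [0, 1, 2, 3, 4, 5]
ys = [0.1, 1.9, 4.2, 6.1, 8.0, 30.0]   # the last reading looks anomalous
a, b = fit_line(xs, ys)
suspects = outliers(xs, ys, a, b, threshold=8.0)
```

Note that a single extreme point also drags the fitted line toward itself, which is why robust regression variants refit after excluding flagged points.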

24.2.3.3 Machine Learning

Machine learning methods are an extension of statistical reasoning to understand,
model and classify data. The difference between traditional statistical reasoning
and classification is best exemplified as follows: while in the previous paragraph a
manually inferred absolute difference would be sufficient to distinguish between a
normal and a faulty device state, in machine learning the statistical properties of the
data itself allow the computer to infer such thresholds (i.e., which delta would count
as critical to trigger an alarm).

Unsupervised Learning

Unsupervised learning aims to explore multivariate data and provide data scientists
with summaries and graphical representations that can be used to gain an overview.
However, anomaly detection that, unlike classification, does not require previous
training also falls into the category of unsupervised machine learning methods.

Unsupervised Learning: Clustering

Clustering approaches are the most widely used methods for visual data exploration
(see Figure 24.4). Given data in some matrix form, for example items (rows)
versus properties (columns), clustering reorders the data by similarity, so that the
most similar items and properties are the closest in the matrix, and the most
dissimilar items and properties are the furthest away from each other. In the
case of hierarchical clustering, the relationship between particular groups of data
(the clusters) can be visualized using a tree-shaped dendrogram, often along with
a heatmap of the data to further highlight such groups graphically. Alternative
clustering strategies such as k-means group data into a fixed number (here: k) of
categories, so that items would be assigned to the statistically most obvious groups
of arbitrary property combinations. Only indirectly related to clustering but often
[Figure 24.4, panels A–C. Panel A: hierarchical bi-clustering (by average distance)
of a matrix of vehicles (scooter, bike, car, bus, train, lorry, rocket) against
properties (passengers, windows, booster, wheels, power), with dendrograms and
the actual values shown. Panel B: k-means clustering with k = 2, separating
{bus, train} from {scooter, bike, car, lorry, rocket}. Panel C: principal component
plot of the components PC1 and PC2.]

Figure 24.4 Clustering and data exploration. (A) Example of the hierarchical bi-clustering of a matrix
featuring vehicles as items and their respective properties. The distance between items (and between
items and other clusters, or between clusters) is determined by the correlation coefficient of their property
vectors. It becomes clear that while hierarchical clustering and the display of a heatmap (indicated here
by the actual values) can help in understanding the structure of data, it cannot completely reflect the
relationships between the items. The same goes for the clustering of the properties. (B) k-means
clustering with a deliberately small k, dividing the items into two classes on the basis of the most
dominant differences. (C) Principal component plot of the two main components. This separates the
items by an abstract measure of similarity and helps to visualize the homogeneity of the data.

used in combination is the principal component analysis (PCA), a method that
aims to recognize the variance of the properties and orders them according to their
decreasing power to differentiate between items. The plot of the first versus the
second component of a data matrix often helps to visualize the homogeneity of a
dataset.
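A plain k-means sketch on two-dimensional points illustrates the idea; the data and the naive deterministic initialization from the first k points are illustrative choices (real implementations typically use random or k-means++ initialization):

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on 2-D points (illustrative sketch). For reproducibility,
    this naive version seeds the centroids with the first k points."""
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k),
                      key=lambda i: (p[0] - centroids[i][0]) ** 2 +
                                    (p[1] - centroids[i][1]) ** 2)
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:   # keep the old centroid if a cluster runs empty
                centroids[i] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    return centroids, clusters

# Two obvious groups, e.g. "small vehicles" versus "mass transport":
points = [(1, 2), (2, 1), (1, 1), (30, 28), (29, 30), (31, 29)]
centroids, clusters = kmeans(points, k=2)
```

After a few iterations the two centroids settle at the group averages, even though the initial centroids both started inside the first group.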

Unsupervised Learning: Anomaly Detection

Anomaly detection in the absence of a previously trained pattern, as used in
supervised learning, can be implemented in a variety of ways. Some methods are based
on the clustering of data, testing each item or property in the matrix for membership
of a cluster. Hierarchical clustering in particular offers intuitive distance measures
[Figure 24.5: schematic of a replicator neural network with five layers. An input
sequence (positions 1 to n) enters network layer 1, is forced through a compressed,
abstract form in the middle layer, and emerges as an output sequence (outputs 1
to n) resembling the input; weights w4;1 ... w4;n are indicated for layer 4.]

Figure 24.5 Artificial neural network. A simple neural network featuring input neurons (network layer
1), hidden layers (2–4) and output neurons (network layer 5). Data from the input is passed from layer to
layer until it reaches the output. At each layer, the input is the weighted sum of the respective crosswise
inputs. The output is determined by the activation function, which determines whether or not a neuron
should fire if its output is required. The aim of a replicator neural network is, after some training involving
back-propagation in which the weights for each layer and position (w_l;p) are determined, to represent
a sequence of length n on the output that is similar to the input sequence. Because of the bottleneck in
hidden layer 3, some information contained in the raw data is discarded, phenotypically leading to an
averaging and denoising of data. This also means that the network is robust to minor changes in the input,
but would require retraining for larger changes. This in turn means that if the output cannot be reliably
represented with a pretrained network, there is likely an anomaly or unusual input that deserves closer
inspection.

between items and clusters, and items or properties that fall outside their usual group
can be flagged for manual inspection.
Strictly speaking belonging to supervised learning techniques, simple neural
networks referred to as replicators internally represent input data in a compressed
form. This is conceptually very similar to a hashing function, but instead of a simple
key, a complex combination of input values is used to feed the mapping. In the case
of the replicator network, the result of the mapping would be a series of data that
is very similar (though not necessarily identical) to the input. A neural network
accustomed to a certain pattern on the basis of a data stream would detect an
outlier by the need to redistribute weights inside the network (see Figure 24.5).
Simple networks with just a few layers contrast so-called deep artificial neural
networks with often hundreds of layers and feedback loops, which are heavily
trained supervised learning techniques and have applications in image recognition,
and so forth.
Supervised Learning

Supervised learning differs from unsupervised learning in that the approaches in


this category are divided into a separate training and application phase. That is,
in training or under supervision, the method learns the properties (called features)
of those items that belong to a particular class, and in which numerical range the
respective features fall to allow the assignment to the class. The result of such a
learning step is called model or classifier. The result of a classifier is assessed by
a variety of different metrics. An intuitive pair of metrics is sensitivity (how many
items in the class the classifier correctly assigns to it) versus specificity (how many
items that do not belong to the class are correctly kept out of it). An optimal classifier
is both sensitive and specific. When the training phase is concluded, the successful
classifier is ready for its
application on novel data (i.e., in production). The overall workflow including both
training phase and productive use are shown in Figure 24.6.

Supervised Learning: Learning Phase

The training phase is the most complex part of a machine learning workflow, as the
computer aims to learn rules and makes statistical inference from a large amount of
input data, whereas in production a pretrained classifier is only applied to the current
set of data points. To stay with the example used in the introduction of unsupervised
learning methods, a classifier would be able to decide whether a vehicle is a bike,
car or rocket on the basis of seeing the values for relevant features (e.g., the number
of wheels or boosters). Which features are relevant and which thresholds apply is
the result of the machine-learning algorithm.
In practice, most training begins with a data matrix of items (belonging
to certain classes) versus features. Following the cleanup of raw data (removal
of empty lines or columns with incomplete data), the most relevant features are
selected. It is often advisable to remove features that can intuitively be neglected
(e.g., color of the vehicle), while feature extraction may utilize methods to reduce
the complexity of the input matrix (e.g., by mathematically combining otherwise
separate features, or by translating categorical values into numerical data). Once a
well-formed matrix is available, a variety of machine-learning algorithms can be
used to rank, weigh and determine thresholds for features, so that a classification
task can be successful.
Without going into the algorithmic details of these methods, this list provides
a rough overview of a few of them:
[Figure 24.6: the upper half sketches model learning and model selection. Training
data (items of classes 1 ... n versus features 1 ... n) feeds a machine learning
method that produces a ranked list of features, feature weights and value thresholds;
candidate classifiers are then compared in a plot of sensitivity ("true positives")
against 1-specificity ("false positives"), relative to the diagonal of a random guess.
The lower half shows the pipeline from training data recording through model
building and model testing (with further development if necessary) to a deployed
classifier, with continuous evaluation of predictions on production data.]

Figure 24.6 Supervised learning workflow. The bottom half of the schematic shows the steps of the
training phase as well as the logical steps of applying a learned classifier on real data in order to obtain
computational predictions. Model building is the most complex part of the training phase, reaching from
data cleanup to learning and selecting models to assessing various quality metrics.
• Regression: The training method prepares a linear or complex model for
inter- or extrapolation of a class on the basis of the training data. That is,
a model of the form f(feature1, feature2, ..., featuren) = weight1 · feature1 +
weight2 · feature2 + · · · + weightn · featuren = class 1 ... n is used to
determine the weights (and implicitly a ranking) such that f(feature1, feature2, ...,
featuren) maps to appropriate classes.
• Naive Bayes: In a naive Bayes approach, we would classify a vehicle as a truck
with a likelihood of P(passengers = 1 | power > 0); that is, the probability of
catering to only one passenger under the condition of having a power rating
greater than zero.
• Random forest: The naive Bayes approach gives rise to complex decision
trees. While naive Bayes is indeed occasionally based on an intuitive manual
solution, decision trees build on often random series of yes-or-no decisions
(Is the number of boosters less than five? Is the number of passengers greater
than one? Is the number of windows less than four? It is a car...). The random
forest approach builds on very large numbers of decision trees to derive an
optimal decision path.
• Support vector machine: The support vector machine (SVM) projects input
data into a so-called hyperplane, a dimensionality-reduced version of the
input, which is on the abstraction level of the principal component plot
shown in Figure 24.4C. The support vector is a geometric concept that helps
to separate the items based on their position in the plane. Once the SVM
has found appropriate support vectors, the machine can be used to separate
items on the basis of their input features. However, while being an often
successful machine-learning approach, SVMs suffer from the abstraction of
the hyperplane and it can often not be explained why or on the basis of which
features an SVM has worked. If classification success is not important, but
the identity of relevant features is (e.g., to inform an engineer why predictive
maintenance of a device may be necessary), alternative methods such as the
random forest or regression may be more suitable.
• Neural network: The anomaly detection algorithm exemplified in Figure 24.5
provides insight into the function of neural network-based methods.
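The conditional probability in the naive Bayes bullet above can be estimated from a frequency table; a hedged sketch with an invented vehicle table (not the book's actual data matrix):

```python
# The vehicle rows and values here are illustrative assumptions.
vehicles = [
    {"name": "scooter", "passengers": 1,  "power": 0},
    {"name": "bike",    "passengers": 1,  "power": 0},
    {"name": "truck",   "passengers": 1,  "power": 300},
    {"name": "car",     "passengers": 5,  "power": 100},
    {"name": "bus",     "passengers": 30, "power": 200},
]

# P(passengers = 1 | power > 0): among powered vehicles, how many seat one?
powered = [v for v in vehicles if v["power"] > 0]
single_seaters = [v for v in powered if v["passengers"] == 1]
p = len(single_seaters) / len(powered)
```

With this toy table, one of the three powered vehicles seats a single passenger, so the estimated conditional probability is 1/3; a full naive Bayes classifier combines such per-feature conditionals across all features.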
A common strategy to assess the utility of a classifier is the 80:20 split. That
is, 80% of the data used in the training period is used to build a classifier, and 20%
is used to validate the classifier against data with known class labels. One important
analysis is the sensitivity-versus-specificity plot: which percentage of false
positives do we have to accept in order to capture at least a certain percentage of the
true positives in the set?
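The 80:20 idea can be sketched with a toy threshold classifier; the data, the every-fifth-item test split, and the threshold rule are all invented for illustration:

```python
# Temperature readings labeled "faulty" when above 50 (invented ground truth).
data = [(temp, temp > 50) for temp in range(100)]

# 80:20 split: every fifth item is held out for validation.
test = data[::5]
train = [d for i, d in enumerate(data) if i % 5]

# "Training": choose the largest non-faulty temperature seen as the threshold.
threshold = max(t for t, faulty in train if not faulty)

tp = sum(1 for t, faulty in test if faulty and t > threshold)
fn = sum(1 for t, faulty in test if faulty and t <= threshold)
fp = sum(1 for t, faulty in test if not faulty and t > threshold)
tn = sum(1 for t, faulty in test if not faulty and t <= threshold)

sensitivity = tp / (tp + fn)    # fraction of true faults caught
specificity = tn / (tn + fp)    # fraction of healthy readings kept out
```

Here the classifier catches every fault in the held-out data but misclassifies one healthy reading, which is exactly the false-positive versus true-positive trade-off the text describes.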

Supervised Learning: Production Phase

A good classifier that has been built and selected in the training phase provides
predictions on real data. Depending on the use case, it is often worthwhile to
continuously monitor the performance of a classifier to detect signs of drift or bias
in the input data. That is, an increasing number of false alarms on the basis of a
pretrained predictive maintenance solution may not be due to using an immature
classifier, but a change in the input data per se (e.g., consider systematic changes
in sensor values that are linked to seasonal changes and that have nothing to do
with device status). Also, overfitting is a common problem in supervised machine
learning and the classifier may be overtrained and too specialized, so that it can only
deal with data as presented during the training phase but not the variation seen in
real deployments.
Chapter 25
Conceptual Interoperability

One of the big unsolved issues in the IoT is conceptual interoperability, exemplified
by the lack of
• Widely accepted standards that mediate device discovery across networks
• Automated ways to explain device capabilities and device relationships
• Data integration beyond the boundaries of particular vendor solutions
This chapter aims to dissect the technical challenges that need to be addressed
to facilitate an all-encompassing IoT as outlined in Chapter 6. In simple terms: how
can software learn in a way that devices are able to exchange relevant
information between each other and with people on demand, rather than because of
a fixed and predetermined workflow? This is where the future IoT should be different
from contemporary M2M solutions in which data producers, processes and data consumers
are defined right from the outset. That is, today even very complex business logic
involving large numbers of very different end devices can be part of an IoT solution.
However, without breaking data from these devices out of conceptual and actual
silos, and without appropriate consideration in the software layer, there is no way
for these devices to participate in a true IoT. These are connected devices, but they
are not part of the Internet of Things as defined in the introduction of this book.
Chapter 24 on data analytics defined actionable insight as information that
we can use to react to circumstances. While intuitive human reaction is based on
the interpretation of raw data, the extraction of relevant key information, and the
contextualization of this information into the framework of prior knowledge (see
Figure 25.1), computational decision-making is far more constrained. Surprisingly,
while machine learning and action selection are well established approaches to


[Figure 25.1: schematic from many inputs to output. Raw data (barometric pressure,
temperature, GPS coordinates, schedule) yields information ("snow storm imminent,
airport hotel, need to travel") through structure and rules; knowledge ("flying and
heavy snow are mutually exclusive") is learned from previous experience; actionable
insight ("rebook flight") follows from action selection out of a catalog of options.]

Figure 25.1 From data to knowledge. On the conceptual level, knowledge is the condensed form of
prior information that has been extracted from raw data. All new data is evaluated following structures
and rules, and the key information is extracted and fed into an ever-expanding knowledge base. The
information is compared against this knowledge base to decide upon an appropriate reaction from a
catalog of options. For example, basic weather data, a set of GPS coordinates and a schedule represent
raw data. This data carries the information that a snow storm is imminent, that a user is at an airport hotel,
and that their flight is scheduled in the near future. From previous experience the user knows that there is
a high chance of the flight being canceled, and would thus choose proactively to rebook the flight. Other
options might be to cancel the journey or to accept the risk of a possible delay.

mimic learning from information (and building up knowledge) and for choosing
the most likely (in terms of probability) optimal solution to a problem, the
contextualization of data to provide information is far more reliant on manual
curation and human intervention. For example, a database query may return a
timestamp and temperature value, but without explicitly clarifying whether the value
is from an indoor or outdoor thermometer (in reference to the example in Figure
25.1), the result can be misleading. Hence, even when a software system has access
to certain IoT data through an API, there is no simple way to use this data other
than explicitly knowing how.
The following sections detail existing options to find out which data sources
or end devices are available in the IoT, and, once found, how their data can be
interpreted in an IoT context.
25.1 DEVICE CATALOGS AND INFORMATION MODELS

Device catalogs address the question of which devices can be found in a network,
ideally across the entire Internet, and what the devices’ capabilities are. Querying
such a catalog should return a list of devices, just like a Web search retrieves a
list of relevant pages. In a Web search the focus is on pages that feature particular
key words, whereas a device catalog might be used to explore all available devices
within a particular network, or devices that exhibit a certain characteristic (e.g., list
of all temperature sensors in a geographic area).
In a first approximation, the hardware characteristics of an IoT end device
do not matter to the data integration layer. Device discovery mechanisms can
in principle be similar to those proposed to make data service discovery easier
on the World Wide Web, such as the simple object access protocol (SOAP),
the web services description language (WSDL), or the hypermedia as the engine
of application state (HATEOAS) concept. While intrinsically following different
logical concepts and design principles, all three discovery mechanisms share an
explorative paradigm. That is, rather than having to know the exact API end points
for certain interactions, either the queries themselves or an explicit lookup return
potentially relevant further end points. For example, if we consider a RESTful API
(see Section 22.4.4) to interact with a database over the Internet, we would not
need to know the exact field names and which operations (query, update, delete)
are supported; upon interacting with the end point for the database, we would also
get a list of schemas, and upon interacting with a schema, we would learn field
names, permitted operations, and so forth. In an ideal case, both client application
and service discovery mechanism complement each other and react dynamically to
each other. This contrasts models in which client applications only interact with a
fixed set of API end points. While standards for service exploration do continuously
evolve, their basic functionality has been around since the early 2000s.
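The explorative paradigm can be illustrated with a small Python sketch in which the client starts at a root resource and follows whatever links each response advertises, rather than hard-coding end points. The in-memory dictionary here stands in for HTTP responses and is purely illustrative:

```python
# Hedged sketch of hypermedia-style (HATEOAS) exploration: the client
# discovers resources by following advertised links instead of knowing
# exact end points in advance. The "API" is an in-memory dictionary
# standing in for JSON documents returned by HTTP GET requests.

api = {
    "/": {"links": {"schemas": "/schemas"}},
    "/schemas": {"links": {"sensors": "/schemas/sensors"}},
    "/schemas/sensors": {
        "fields": ["id", "type", "value"],
        "operations": ["query", "update", "delete"],
        "links": {},
    },
}

def fetch(path):
    """Stand-in for an HTTP GET returning a parsed JSON document."""
    return api[path]

def explore(path="/"):
    """Recursively follow advertised links and collect every resource."""
    doc = fetch(path)
    found = {path: doc}
    for target in doc["links"].values():
        found.update(explore(target))
    return found

resources = explore()            # the client learns all three resources
sensor_schema = resources["/schemas/sensors"]
```

Starting only from the root, the client learns the schema's field names and permitted operations, exactly as described above for the database example.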
The principles of service discovery have been picked up by a variety of
IoT projects. At the time of writing, there is no strong contender for a leading
standard, and in practice their use is either experimental or highly community
specific. Examples are:
• Machine Hypermedia Toolkit
• IPSO Smart Objects
• HyperCat
• Vorto
374 The Technical Foundations of IoT

These four standards have in common that they describe device properties and
concepts for service discovery. The actual search, however, remains largely unaddressed. That means while, for example, HyperCat and Vorto suggest ways to model information, how to organize and communicate device capability, and how
software should be able to explore these, they do not provide a search engine that
can be directly used. As such, they are mostly IoT information model standards that
represent a machine-readable specification of the device (see Figure 25.2), much
like the meta-tags in HTML documents should represent their content.
In practice, device discovery currently works only in isolated silos. For ex-
ample, devices from particular vendors or those using vendor-specific standards can
be discovered by these vendors’ software. In the industrial context, OPC UA (see
Section 22.9) is one standard currently gaining ground in the ecosystems of major
manufacturers. Major mobile phone manufacturers are pushing for standardization
in the home automation market, allowing their software to find and control devices
in the home of their users. However, these systems are typically constrained to a particular local network, and home automation devices not supporting these standards are excluded. This is a major interoperability issue.
A few websites that attempt general device discovery are mostly manually
curated or rely on the import of information from other services. Sadly, one of the
more comprehensive websites for IoT service discovery has a focus on security
and trawls the Internet for open ports. Neither the general nor the security-focused websites return data on device capabilities, and therefore they are not useful for the automatic
inference scenario that is laid out in the introduction of this chapter.

25.2 ONTOLOGIES

Semantic interoperability also extends into the meaning of terms. Even if a reliable device catalog is available, how can we interpret the data from a thermometer and assess whether it is suitable for an intelligent IoT system deciding whether or not to heat a building? What does a thermometer read? Body temperature? Room temperature? Where is it located, and how does it relate to our radiator?
The conceptual link between such terms is established by so-called ontologies.
These usually comprise a controlled vocabulary with defined relationships between
entities. Unfortunately, there is no general and widely used ontology for the IoT
yet, although there are several efforts which go into that direction, such as draft
ontologies (e.g., the W3C Semantic Sensor Network Ontology) or considerations
that are part of the previously discussed information model standards.

Device_Model_XYZ (temperature sensor)

  property
    is_sensor       boolean: true, immutable
    sensor_type     set (physical, chemical): physical, immutable
    sensor_subtype  set (force, temperature, ...): temperature, immutable
    temperature     float, valid: > -25 && < +75
    unit            string: "Celsius", immutable
    sleep_mode      boolean

  function
    temperature     get: delivers current temperature
    sleep           set: 1: puts the sensor to sleep

Figure 25.2 Information model. An information model can be interpreted like a programming
language-agnostic object-oriented description of a device. Shown here is an example of a fictitious tem-
perature sensor with properties and functions that can be made available through its software interface.
The Vorto code generator can take such information models and generate code templates for use in
supported programming languages. The reverse rewriting of code into an information model is typically not straightforward and is often ambiguous. Hence, information models facilitate better communication between
development teams that use a variety of programming languages, or that focus on different aspects of a
device. For example, code for the graphical representation of the sensor may only require the temperature
property, while code dealing with sensor settings would include the sleep mode property. Based on these
code fragments, it would be difficult to describe all device capabilities appropriately.
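As a hypothetical illustration of what such a generator does, the following Python sketch turns a much-simplified version of the Figure 25.2 model into a class skeleton. The model layout and the generated template are assumptions made for this example, not actual Vorto formats:

```python
# Illustrative sketch of an information-model-driven code generator:
# a device description (here a plain dictionary) is turned into a
# language-specific class skeleton. The model layout and output format
# are invented; real generators such as Vorto define their own.

model = {
    "name": "Device_Model_XYZ",
    "properties": {
        "is_sensor": "boolean",
        "temperature": "float",
        "unit": "string",
        "sleep_mode": "boolean",
    },
    "functions": ["temperature", "sleep"],
}

def generate_class(model):
    """Emit a Python class skeleton for the given information model."""
    lines = [f"class {model['name']}:"]
    for prop, typ in model["properties"].items():
        lines.append(f"    {prop}: {typ}  # property")
    for func in model["functions"]:
        lines.append(f"    def {func}(self): ...  # function stub")
    return "\n".join(lines)

template = generate_class(model)
```

A team writing a graphical front end and a team writing device firmware could each generate skeletons in their own language from the same model, which is exactly the communication benefit described above.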

25.2.1 Structure and Reasoning

The utility of ontological reasoning becomes clear with the example of three subontologies (function: what can a device do?; process: when is a device typically used?; localization: where is a device located?). Note that these subontologies are
conceptually borrowed from Gene Ontology, a controlled vocabulary used in the
biomedical sciences. A concrete example: a thermometer inside a smart thermostat
can be tested for its suitability in a home automation workflow by looking at its most
granular annotation: function: measurement of room temperature; process: domestic
temperature control; localization: living room. The user-specific term living room
could be a result of overloading a generic IoT ontology term domestic setting, see
Figure 25.3.
Software can reason with the help of ontologies. While in Figure 25.3 we only
see that there is a relationship between two terms, these relationships can be refined.
In the example, the implicit relationship was is a: measurement of room temperature
is a detection of ambient temperature, which is a detection of temperature, which
is a detection of a physical stimulus and so forth. By adding further relationships
such as part of and can regulate, more complex scenarios can be interpreted with
an ontology (see Figure 25.4).
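A minimal reasoner over is a relationships can be sketched as follows. The terms mirror the example above; the data structure and function names are assumptions for illustration:

```python
# Minimal sketch of ontological reasoning over "is a" relationships:
# the reasoner answers whether one term is a (transitive) specialization
# of another. The relationship table mirrors the chapter's example.

is_a = {
    "measurement of room temperature": "detection of ambient temperature",
    "detection of ambient temperature": "detection of temperature",
    "detection of temperature": "detection of physical stimulus",
}

def subsumes(general, specific, is_a):
    """True if 'specific' is (transitively) a kind of 'general'."""
    term = specific
    while term in is_a:
        term = is_a[term]          # walk up the "is a" chain
        if term == general:
            return True
    return specific == general

# A room-temperature reading qualifies as detection of a physical stimulus:
ok = subsumes("detection of physical stimulus",
              "measurement of room temperature", is_a)
```

Adding further relationship types such as part of and can regulate would require a richer graph structure, but the principle of walking relationships to derive implicit facts stays the same.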
However, there is also a meta-layer of information that is not as clear-cut
and factual as function, process and localization. It is the proxy or avatar function
a device can have, and this is highly dependent on the individual user: Some
cars are just cars, but your partner’s car in your driveway may be an indicator of
their presence. The temperature sensor in your smart heating system can, under
certain circumstances, be indicative of a fire. These are relationships that depend
entirely on context, and it can be anticipated that a considerable amount of machine learning or extensive manual curation is required to learn such contexts. The proxy
subontology would likely exist as a branch reflecting frequently inferred functions
and processes, as well as a branch specific to the user which could be overloaded
similar to the domestic term in the example.

25.2.2 Building and Annotation

Successful ontologies are very often manually curated, and at this stage it is entirely
unclear whether a computer-guided and machine learning-based approach would be
suitable for building an IoT ontology. The choice of ontology endpoint, degree of
granularity and generality of terms therefore has to be pragmatic. Such pragmatism
would allow the ontology to grow in small steps rather than exploding with each

[Figure 25.3 shows the three subontology trees below a common root: function (e.g., detection of physical stimulus > detection of temperature > detection of ambient temperature > room temperature), process (e.g., home automation > ambient control > temperature control), and localization (static > domestic - MY home > first floor / second floor > living room).]

Figure 25.3 Ontology. Annotation of a connected thermometer with the three subontologies function,
process and localization. The term domestic would allow the injection of a user-specific subontology MY
home. This operation would only affect the user’s active copy of the ontology, not a generally applicable,
public version. The necessary reannotation of devices with domestic localization would only be possible
for those owned by the user.

[Figure 25.4 shows a small relationship graph: a temperature sensor is a sensor; a thermostat is a temperature sensor; a radiator is an actuator and part of a building control system; the thermostat regulates the radiator. Inference: some sensors regulate elements of building control systems and are thus parts of them.]

Figure 25.4 Ontological reasoning. An ontological reasoner can make inferences about indirect
relationships between terms.

and every new device or device type. It would also allow community efforts and crowd-sourced activities for device annotation, as carefully balanced, more general terms are easier to use than vendor-specific, fine-grained ones.
Although there are going to be billions of connected devices in the future, many
of these devices are going to be conceptually the same. Also, it is conceivable
that the number of applications/workflows that users require is even smaller. It is
therefore feasible to derive an ontology that caters well for all workflows without
the need to annotate each device with the maximum number of fine-grained details.
Certain meta-information as detailed in the Semantic Sensor Network Ontology
(such as the survival range for a device), which is not relevant for the majority
of workflows, would still be accessible through cross-references. An IoT ontology
would require the annotation of devices with appropriate ontology terms. This is
where information models come in as useful additions, as the ontological annotation
is just another device property that needs to be reflected.
Part VIII

Security
Chapter 26
Security and the Internet of Things

In late September 2016, most of the East Coast of the United States suffered almost
complete Internet downtime in an attack by a group of systems called the Mirai
botnet. The Mirai botnet was made up of 100,000 IoT devices, mainly CCTV cameras, that were taken over by attackers to create a distributed denial of service (DDoS) attack of at least 620 Gbit/s. Digital video recorders and routers were also recruited. The compromised devices were spread all over the world. This attack
demonstrates a few of the core concerns of IoT security — although there are
other concerns as well. Let us quickly outline the key points of this attack as an
introduction to the rest of the chapter.
Firstly, the devices were attacked because they had default or easily guessed login credentials (user id and password) that could be used by the attackers to reprogram the devices. This is a common model for Internet-attached devices and
is a clear problem. The second aspect was that the devices were identifiable from
the network. In other words, the attack system could scan for these devices and
then attack them. The third key aspect was that this attack demonstrates an escalation attack. In this case, the attackers were interested not in attacking the
CCTV cameras themselves, but in using the broadband network to which those
cameras were attached. This is a common type of Internet attack where the weakest
part of a system is attacked, and then the attack uses that foothold to continue the
attack elsewhere. Another simple example of this was when hackers attacked energy
meters inside a major U.S. retail chain and used those unprotected systems to listen in on the network and steal credit card details. These three simple aspects allowed
criminals to bring down a large part of the Internet. To put this into perspective, a
study by Checkpoint Software (a firewall and security vendor) in 2014 identified 12 million routers potentially attackable through a single flaw (known as CVE-2014-9222). The Mirai botnet is not a complex attack, targeting systems through the simplest of attack vectors. A more serious attack using 12 million devices could have produced an attack of greater than 70 Tbit/s, which could potentially have disrupted the whole Internet worldwide.
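The scaling figure follows from a back-of-envelope calculation (rough estimates, not measurements):

```python
# Back-of-envelope check of the figures above: the per-device bandwidth
# implied by the Mirai attack, scaled up to the 12 million vulnerable
# routers identified by Checkpoint. Rough estimates only.

mirai_devices = 100_000
mirai_gbps = 620

per_device_mbps = mirai_gbps * 1000 / mirai_devices      # about 6.2 Mbit/s each
scaled_tbps = per_device_mbps * 12_000_000 / 1_000_000   # about 74 Tbit/s
```

At roughly 6 Mbit/s of attack traffic per device, 12 million devices would indeed yield on the order of 70 Tbit/s.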
However, when looking at this another way, the Mirai attack was not the worst
IoT security problem that we can imagine: there was no invasion of privacy, and
people did not (as far as we know) come to any harm, which is certainly conceivable.
The security section is physically placed at the end of this book. However,
we recommend that you read this section before you build real IoT systems. There
are two key concepts that we promote: security by design and privacy by design.
What these concepts mean is that security and privacy should be built into the core
of any system and not added on as an afterthought. Later in this section, we describe
an IoT system that is designed with privacy and security in mind and we show how
this affects the design choices and approaches.
The rest of this part of the book is laid out as follows. This chapter outlines
the core tenets of security. Chapter 27 provides a beginner’s guide to encryption.
Those already familiar with encryption, certificates, hashes, and TLS may skip to
Section 27.3 which looks at encryption specifically on constrained IoT devices.
Chapter 28 provides a structured overview of the specific threats against IoT
devices, networks, and cloud systems. Finally, Chapter 29 provides approaches to
avoid these threats and discusses the design of a system that aims to enhance privacy
and security for IoT systems.

26.1 BOUNDARIES

Traditionally, many people have thought of security as a boundary issue: we try to keep the attackers out of our secure fortress. For example, we use firewalls to keep our personal machines and internal networks secure. We put locks on the outside doors of our houses. The outside edge becomes what is known as the attack surface. The size of the attack surface is measured by the number of ways in which a system can be attacked. One key aspect is that the security of the whole is equal to the security of the weakest point. It doesn’t matter how strong the lock on your door is if there is a poorly secured window that someone can climb through. This is related to the escalation point we described earlier.
The Internet of Things is enlarging the attack surface, and in many cases
introducing weaker points of attack. It is already estimated that there are more IoT devices than humans, and growth forecasts predict vast numbers of cheap devices
connecting the physical world to the Internet. With this enhanced attack surface we
need to change to a different model where security is based on identity and trust,
not on boundaries.
A key aspect of concern with IoT is that of privacy. Privacy is the control of
personal information. Privacy also has security implications: if a felon can infringe
your privacy and understand your working pattern, they may be able to use this to know when is a good time to burgle your home, or they may use that information as the basis for social engineering attacks. For example, if an attacker had access to
health trackers they could deliberately phone people when they are already stressed
in order to capitalize on that person’s vulnerability.
IoT is a significant threat to privacy because IoT devices can be used to collect
personal information: many of them are designed to do that (e.g., health monitoring
devices) and others may do so as a side effect (e.g., connected cars, smart homes).
Another key concern around IoT, privacy, and security is big data. One major
aspect of big data processing is deanonymization: the ability to put together multiple
databases and by merging them, understand more than was perhaps intended. For
example, researchers linked anonymous data from Netflix with IMDB profiles and
reviews to deanonymize a vast number of users. In another example of misuse of
big data, Uber made a system available to all of its employees called the Uber God
View that allowed employees to track anyone who had the Uber app installed on
their phone, and it was found that employees had been using it for virtual stalking
of boyfriends and girlfriends, potential dates, and many other people.
The rest of this chapter is laid out as follows. Firstly, we will examine
some other IoT attacks that have been made in order to understand the challenges
further. Then we will look at the fundamentals of security and privacy and try to understand the lessons learned from over twenty years of Web security challenges. From this basis we will then examine what is the same and
what is different in an IoT world. We will look at the current best practices and
examine some systems that aim to implement these. Finally we propose a set of
core principles for IoT security and privacy that aim to help guide us toward better
systems.

26.2 OTHER ATTACKS

We have seen how default passwords and open ports caused the Mirai attack. Let us
look at some other attacks to understand potential problems.

In 2015, security researchers demonstrated that they could remotely brake a Jeep Cherokee without any physical access to the car. The main reason for this
was the direct attachment of an Internet-connected processor to the CAN bus (see
Section 19.2.1 on fieldbuses) on the car. The CAN bus is a real-time network used by
multiple parts of the car. In earlier cars, this only carried diagnostic information, but
in modern cars the CAN bus is used to signal actuators within the car. For example,
sending messages on the CAN bus can cause modern cars to brake, steer, accelerate
and perform other drastic actions. This attack (and others that have been made on
further cars) shows three key design problems. The first problem we can see is the
extension from sensors to actuators without adding further security. While attacks on sensor data can be damaging, invading privacy and enabling actions based on that information, it is clear that once a system digitizes access to actuators, the potential consequences are much more severe.
The second clearly identifiable problem is the assumption when creating the
CAN bus that there is a boundary: only well-behaving systems will be attached to
the bus. The CAN bus itself has no security. It does not validate the authenticity
or validity of systems attached to it. It performs no encryption or authentication. A
number of attacks are possible by simply inserting a small device into the car which
attaches to the CAN bus and offers WiFi, Bluetooth, or other radio access.
The third problem is when we add a network connection to the CAN bus. This
is where we create an escalation point. By directly connecting the CAN bus to the
IoT device inside the car, we are creating an attack point for remote attacks.
In late 2014, Checkpoint Software identified a vulnerability in specific management software embedded in home routers and other devices. Using remote scanning, they identified 12 million devices containing this flaw, which allows a remote
attacker to gain control of the device with administrative privileges. The attack,
known as Misfortune Cookie, shows two key problems that need attention. The first
problem is that this attack is based on an undocumented open port on the routers that
is used for management. Good Internet security practice has always been to disable as many ports as possible and minimize the functions available on Internet-connected systems in order to reduce the attack surface. This clearly shows a lack of awareness of this mantra.
The second more insidious aspect of this attack is that the attack relies on a problem
that has already been identified and fixed. The problem lies in a third-party module
that many manufacturers embed into their devices to add management functions.
The third party identified this problem and issued a fix in 2005. You did read
that correctly. The fix was issued nearly 10 years before the problem was seen by
Checkpoint. This highlights a massive problem in IoT security — the application
of patches and updates. We already know it is very hard to get users to update their firmware, and a number of attacks have been based on this. What this attack
demonstrates is that manufacturers also have this problem, and in this case were still
shipping insecure versions of third-party software years after updated versions had
been issued.
Our final example of an IoT issue is a different class of attack and demonstrates a different concern. In 2011, researchers found that a simple Google search
string identified hundreds of users of the Fitbit health tracker based on their sexual
activity.
While this mildly amusing incident was quickly resolved by Fitbit, it highlights a less obvious concern about IoT. We may not be aware of the way in which personal data can be used. We can be pretty sure that Fitbit’s terms and conditions made this legal somewhere in the many pages of small print, but the difference between users’ expectations and the reality is of real concern. Even if companies
do not make such data available, it may also be that they are hacked. Imagine the
number of divorce cases we might see if every incident of sexual activity recorded
by Fitbit was suddenly made available by hackers.
To conclude this section: there are real concerns about the safety, security, and privacy of IoT systems. Let us now look at how to address these.

26.3 THE FUNDAMENTALS OF SECURITY

The classic precepts of computer security are confidentiality, integrity and availability — known as CIA. This is often known as the security triad.

26.3.1 Confidentiality

Confidentiality is the protection of information from others accessing it. Using a real-world analogy, the confidentiality of a letter is provided by the opacity of the envelope: you cannot see the writing. In general we require much stronger guarantees in the electronic world than we do in the real world, because the digitization of information makes the attacker’s job easier. However, there are many counterexamples. Probably the best counterexample is standard e-mail, which doesn’t have any confidentiality.
In the digital world, we gain confidentiality by encrypting messages. Encryption is the hiding of information in a reversible way: in an ideal world, the intended recipient can decrypt messages, and others cannot. Encryption is a key technology in Internet and IoT security and so we will cover it in much more detail below (see
Chapter 27). However, for the moment let’s assume that this is possible.

26.3.2 Integrity

Integrity is the protection of information by preventing attackers from modifying data. In the real world there are ways this is done: for example, making sure all
pages are signed or initialed and contain no corrections, marking blank spaces and
pages, and so forth. In the electronic world, we have a much more secure model
called a signature. Signatures are based on hashes. A hash is a number computed from the content of a document that, for practical purposes, identifies it uniquely and captures any changes: even a small change in the document will result in a completely different hash. The hash is then signed by the originator of the document. This ensures that only the originator could have produced the signature, and therefore guarantees the integrity of the overall document. Signatures are based on the same technology as modern encryption and
we will explore that in more detail.
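The role of hashes can be illustrated with Python's standard library. The HMAC used here is a symmetric stand-in for a real public-key signature (covered in Chapter 27); the messages and key are invented:

```python
# Sketch of hash-based integrity using SHA-256: any change to a document
# yields a completely different hash, and an HMAC binds the document to
# a key only the originator holds. Real digital signatures use
# public-key cryptography; HMAC is a simplified symmetric stand-in.

import hashlib
import hmac

document = b"Pay Alice 100 EUR"
tampered = b"Pay Alice 900 EUR"

h1 = hashlib.sha256(document).hexdigest()
h2 = hashlib.sha256(tampered).hexdigest()   # differs completely from h1

key = b"originator-secret"                  # known only to the originator
signature = hmac.new(key, document, hashlib.sha256).hexdigest()

def verify(key, message, signature):
    """Recompute the keyed hash and compare in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

A recipient who recomputes the keyed hash over a tampered document gets a different value, so the modification is detected even though only one character changed.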

26.3.3 Availability

The last of the traditional security triad is availability — making sure that a system is running, can serve its customers, and cannot be brought down. Availability is obviously important as a goal in itself: confidentiality and integrity protect our information from being stolen or modified, but attackers can also damage us by making our
systems unavailable. Sometimes they can just break into a system and completely
destroy it, but in other cases they can bombard the system so it is too busy to
respond. This type of attack is called a “denial of service” (DoS) attack. We already
mentioned a distributed DoS attack (DDoS), which is when multiple machines
together try to make a system unavailable.
A second more subtle aspect of DoS and availability attacks is that they are a
key part of impersonation. A common Internet attack is called “spoofing” which
means pretending to be a system that you are not. For example, suppose I can
replace a system like https://ebay.com with my own site that looks very similar.
Then I can steal many credit card details. To do this, I might need to make the
original eBay site unavailable first. A real example of this is one of the earliest
Internet attacks on record. On Christmas Day 1994, Kevin Mitnick broke into
Tsutomu Shimomura’s computer by hijacking an existing connection between that
computer and a server. Without going into the full detail, this attack consisted of
two main steps: a DoS attack on the server (so it could no longer respond) and

spoofing of the server. This caused Tsutomu’s computer connection to be taken over by Mitnick’s attacking computer instead of the server, and allowed Mitnick to
issue compromising commands onto Tsutomu’s system. This clearly demonstrates
the combination of DoS and spoofing.

26.3.4 CIA+

In addition to the core aspects of Confidentiality, Integrity and Availability there are
some further aspects that have been called out, which we will refer to as CIA+. To
the classic three we can add authentication, access control and non-repudiation.

26.3.5 Authentication

We cannot truly support encryption or integrity without a robust concept of identity and authenticity. In other words, to encrypt a message to a specific person or entity,
we need to know that we are encrypting it for the correctly intended identity,
otherwise we are probably just sending it to an attacker. This was one of the main causes of the Mirai attack — the compromised devices misidentified the attack network as being their owner because default or easily guessed user ids and
passwords were used.
In the real world, we commonly use signatures and hard-to-forge documents
to prove our identity. For example, we might show a passport and other documents
along with a signature. In the Internet world, we need more secure models and
these are provided in two ways. One way is via certificates. A certificate is a secure
electronic file that has been signed by a third party who has validated your identity.
A certificate can prove the identity of a server or a client (i.e., an organization or a
person). Once again, this uses encryption technology which we will explore later.
The other option for people is to use usernames and passwords or other
identification technologies such as two-factor (2FA) or multi-factor authentication
(MFA). Authentication of people is based on different factors. A password is an
example of something that the person knows. A fingerprint is based on inherence
— something inherent about the person. A passport or identity card is something
that the person has that proves their identity. These are all different factors. A user id
and password is therefore single-factor. Common two-factor authentication includes
adding an SMS validation (something you have like your mobile phone) or using
retinal scanners or voice recognition (something you inherently provide).
This kind of authentication can work in different ways. The most common is
that the server you are working with maintains its own database of identities and

[Diagram: user, identity provider, and website, connected by arrows labeled (1) login (user id, password), (2) token, (3) token, and (4) validate token.]

Figure 26.1 Token-based identity is a multistep process: (1) The user requests a token from the identity
provider, and (2) gets in return a token. (3) The token is passed to a website, which (4) validates the token
against the identity provider.

authentication data (e.g., passwords). This approach has not scaled well, and we all know the pain of having to create a new password for every site. The alternative to this
is called federated identity. In this model the identity is validated and provided by
a different entity to the website. Federated identity is also known as token-based identity, because the model works on a concept known as tokens (see Figure 26.1).
For example, Facebook and Google both provide federated login, which allows
users to log into third-party sites using their existing credentials. The benefit of
this is threefold. First, users don’t need to create a new credential. Second, if the
third-party website gets hacked, the users’ passwords are not available to be stolen.
Finally, there is a big advantage in supporting multi-factor authentication. MFA
is expensive and complex to set up. If users can use a single MFA system (e.g.,
Google) to log into multiple services then that makes life easier. The idea of a token
is that the identity provider (e.g., Google) — known as an IdP — creates a unique
token. The user passes this token to the third-party website (e.g., thirdparty.org) and
the website can look at the token. Inside the token is a signed message from Google
to say that this token belongs to John Smith and is valid for the next 30 minutes
(for example). Because the token is a signed message guaranteed by the identity provider, the website trusts it and allows the user to log in.
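The four steps of Figure 26.1 can be sketched with a toy HMAC-signed token. Real identity providers use standardized token formats and public-key signatures; the key, token layout, and function names here are assumptions for illustration:

```python
# Toy sketch of token-based identity: the identity provider (IdP) mints
# a token stating who the user is and when it expires; the website
# verifies the IdP's keyed signature before trusting it. The shared key
# and "user|expiry|signature" layout are invented; real systems use
# OAuth2/OIDC tokens with public-key signatures.

import hashlib
import hmac
import time

IDP_KEY = b"idp-secret"   # assumption: a key shared by IdP and website

def issue_token(user, ttl_seconds=1800, now=None):
    """Steps (1) and (2): mint a signed, time-limited token."""
    now = time.time() if now is None else now
    claim = f"{user}|{int(now + ttl_seconds)}"
    sig = hmac.new(IDP_KEY, claim.encode(), hashlib.sha256).hexdigest()
    return f"{claim}|{sig}"

def validate_token(token, now=None):
    """Steps (3) and (4): verify signature and expiry; return the user."""
    now = time.time() if now is None else now
    user, expiry, sig = token.rsplit("|", 2)
    claim = f"{user}|{expiry}"
    good = hmac.new(IDP_KEY, claim.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(good, sig):
        return None               # forged or altered token
    if now > int(expiry):
        return None               # expired token
    return user                   # identity the website may now trust

token = issue_token("John Smith")
user = validate_token(token)
```

The website never sees the user's password; it only checks that the token carries a valid, unexpired statement from the identity provider.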
There are two leading federated identity protocols in use at the time of writing.
The first is used in many enterprises and is called SAML (in fact SAML2 is the most
current and widely used version). The Security Assertion Markup Language is an
XML syntax that enables single sign-on (SSO) by capturing a token as an XML
document. Because of the overhead of XML, SAML is not widely used in IoT
scenarios, although some IoT cloud systems allow SAML for users to access Web
systems. In addition, there are some technologies that allow multilayer federation
and in those cases, SAML can play a part. A similar technology that is based on
SAML is Shibboleth, which is widely used in the educational sector.

A more widely used technology across the Web is called OAuth, and the
most commonly deployed version is OAuth2, which is used by Facebook, Google,
Twitter, Github, Yahoo, and many others. OAuth is not an acronym, though it could
be interpreted as a mixture of open and authorization. OAuth2 is a general token
system, which was originally designed for federated authorization but has been co-opted to be used for authentication as well. Without going into too much detail,
if you authorize a third party to access your profile information, then the third party can trust that profile information to be yours. Therefore we can use an
authorization system to perform single sign-on login.
The user login or SSO based on OAuth2 has been formalized in a specification
called OpenID Connect (OIDC). OIDC is still early in the adoption phase, as
many sites support login using OAuth2 without fully supporting OIDC. With
OAuth2 SSO, the website needs to explicitly support each identity provider. For
example, you may go to a website and it will say that it supports Google, Twitter,
Facebook and Yahoo logins. OIDC extends this so that once a website fully supports
OIDC login, then any OIDC identity provider can be used, not just ones that are
preprogrammed. In this model, users type in their email address and a discovery service then lists all the OIDC-capable logins they have, offering them a choice based on that.
There is definitely discussion about using OIDC with IoT scenarios, but up
until now most of the federated identity approaches for IoT are based on the simpler
OAuth2. For example, IBM Watson IoT and Amazon’s IoT service both utilize
OAuth2 tokens for device identity. The first published work on IoT and OAuth2
was a blog post published in 2013 (http://pzf.fremantle.org/2013/11/using-oauth-20-with-mqtt.html),
later followed up in the paper Federated Identity and Access
Management for the Internet of Things, by Fremantle et al. There are now a number
of systems supporting OAuth2 with IoT, including with MQTT and CoAP protocols
(see Chapter 22). We will look at one such system in more detail below.

26.4 ACCESS CONTROL

Access control is the allowing or disallowing of actions to systems and people based
on some form of rule or policy. In the real world there are plenty of examples. A
student may be able to access parts of the school, while the teachers may have keys
to access even more areas. The principal or custodian may have even more access.
390 The Technical Foundations of IoT

We need to allow some people and organizations access to certain data and
systems. In the IoT world, the main access control needs are to sensors and actuators
and therefore we can think about data (from sensors) and commands (to actuators).
Traditionally access control was managed in a hierarchical fashion using a
concept known as roles. For example, the admin role would have all access and
could also grant and deny permissions to other roles. Each role had access to a
particular type of data or to certain functions. This kind of access control has two
problems in the IoT world. The first is that it is typically based on the type of data,
not on anything more specific. For example, if I have access to blood sugar levels, I can see
all blood sugar levels. This kind of approach has been a significant problem with
personal data and privacy. For example, there have been many cases of police staff
incorrectly using databases for their own benefit, because their permissions allowed
them to.
The second problem with this model for IoT is that it does not scale well
across very large, heterogeneous, and distributed systems, which is a requirement
for IoT.
More modern systems allow policy-based access controls that may take into
account specifics of the data, the person or system taking the action, and the context.
For example, a policy might state that a doctor can look at a patient’s record if the
doctor is currently in charge of the patient’s care. These rules can be coded in a
language such as the eXtensible Access Control Markup Language (XACML). Just to
be confusing, XACML now supports JSON as well as XML. While this
approach is effective for the server aspect of IoT, it is often too heavyweight for the
gateway or device, although there has been research into effective use of XACML
in gateways as part of the EU-funded Webinos project.
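The doctor-patient rule above can be sketched as an attribute-based check in plain Python. This is an illustration of the idea only, not XACML; the function and field names are invented for this example:

```python
from dataclasses import dataclass

@dataclass
class Request:
    subject: str    # who is asking, e.g. a doctor's identifier
    action: str     # e.g. "read"
    resource: str   # e.g. "record:patient42"
    context: dict   # contextual attributes, e.g. current care assignments

def doctor_in_charge_policy(req: Request) -> bool:
    """Permit a read of a patient record only if the requesting doctor
    is currently in charge of that patient's care."""
    if req.action != "read" or not req.resource.startswith("record:"):
        return False
    patient = req.resource.split(":", 1)[1]
    in_care = req.context.get("patients_in_care", {})
    return patient in in_care.get(req.subject, set())

care = {"dr_jones": {"patient42"}}
allowed = doctor_in_charge_policy(
    Request("dr_jones", "read", "record:patient42", {"patients_in_care": care}))
denied = doctor_in_charge_policy(
    Request("dr_smith", "read", "record:patient42", {"patients_in_care": care}))
```

Note how the decision depends on the context (who is in charge right now), not merely on a static role, which is exactly what the role-based model above cannot express.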
Another key aspect of access control in IoT spaces is the concept of consent.
Consent is the use of dynamic access control rights that are specifically authorized
by people to their own data. Consent-based access control is also part of the OAuth2
(and inherently therefore OIDC) specifications. The way this works in OAuth2 is
that a user grants access to a scope. The scope refers to part of a user’s data. For
example, I might grant Facebook access to my name, date of birth, and average
daily weight from my IoT scales because I wanted to share that with my friends as
part of a weight-loss attempt.
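The scope idea can be sketched in a few lines; the scope names here are hypothetical (real providers define their own):

```python
# Scopes the user has explicitly consented to (hypothetical names).
granted_scopes = {"profile:name", "profile:birthdate", "scales:daily_weight"}

def can_access(requested_scope: str) -> bool:
    """OAuth2-style consent check: access is allowed only for scopes
    the user has explicitly granted."""
    return requested_scope in granted_scopes

assert can_access("scales:daily_weight")
assert not can_access("scales:raw_readings")  # never consented to
```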

26.5 NON-REPUDIATION

The final extension to the CIA triad is non-repudiation. This is the ability of a system
to provide sufficient audit and proof that a particular person or system made an
action. A simple real-world example of this would be getting a receipt from a shop
to prove that they sold you an item. In the digital world, we have more specific ways
of performing non-repudiation. For example, a signed document from one party
proves they wrote it. If the receiving party sends back a countersignature then that
proves it was received and both parties have proof that the transaction happened.
Let us take stock of where we are. Firstly, we have identified a need for
security of the Internet of Things, driven by three factors: privacy and personal data,
the ability to affect the world around us, and using weakly protected IoT devices as
a stepping-stone to attack other systems. We have looked at the fundamental aspects
of security and why they are important. The next step is to understand the threats
in more detail, and to do that we will examine a matrix. But first we present a short
cryptography primer.
Chapter 27
A Beginner’s Guide to Encryption

In this chapter we will describe the basics of encryption, digital signatures, hashes,
and the general cryptographic frameworks that we will be using. If you are already
aware of these concepts and how they work, then we suggest you jump straight to
the end of this chapter where we describe the challenges of cryptography on IoT
devices.
There are two main categories of encryption: shared key encryption (also
known as symmetric encryption) and public key encryption (also known as asym-
metric encryption).

27.1 SHARED KEY ENCRYPTION

Shared key or symmetric key encryption is where both parties share a secret key,
which is used to both encrypt and decrypt messages. We call this converting from
plaintext to ciphertext and back again.
Up until the second half of the twentieth century, all encryption was based
on this model. There are many such encryption algorithms, ranging from the very
simple and insecure to the more complex. We will look at some of these algorithms —
some historical examples, such as the German Enigma cipher from World War II,
and a currently used encryption system.
One of the simplest symmetric key cipher systems is the Caesar cipher. Caesar
used this to communicate with his army. It is a simple substitution cipher where
each letter in the plaintext is substituted with another. In Caesar’s case, he shifted
each letter by three: for example, A would become D, M would become Q, and Z
would become C. The key in this case is the number (3) that we shift by. There
are 25 possible keys, because shifting by 26 returns us to the plaintext, and shifting
by 27 is equivalent to shifting by 1. We can characterize the key as having less than
5 bits of information (as log2 25 ≈ 4.6 — see Chapter 3). This encryption is easily broken: we can simply try
all 25 keys and see if we get a readable answer. This is known as a brute force
attack. This attack works easily in this case because the keyspace (the potential of
all possible keys) is very small.
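The Caesar cipher and its brute-force break can be sketched in a few lines of Python (a toy illustration, not production cryptography):

```python
import string

ALPHABET = string.ascii_uppercase

def caesar(text: str, key: int) -> str:
    """Shift each letter of the (uppercase) text by `key` positions."""
    return "".join(ALPHABET[(ALPHABET.index(c) + key) % 26] for c in text)

ciphertext = caesar("ATTACKATDAWN", 3)

# Brute force: with only 25 possible keys, simply try them all and
# look for the candidate that reads as plausible plaintext.
candidates = [caesar(ciphertext, -k) for k in range(1, 26)]
assert "ATTACKATDAWN" in candidates
```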
We can imagine a more complex substitution cipher where each letter can be
substituted with any other. This gives us a key which is a string of 26 characters,
where each letter appears just once. A classic way to build such a key is to take a
pangram like “The quick brown fox jumps over the lazy dog” and drop every repeated
letter, leaving each of the 26 letters exactly once; most such keys, of course, will not
make any sense in the English language. In this new cipher, we substitute A with T,
B with H, C with E, D with Q, and so on. There are 26! (roughly 4 × 10^26) possible
keys, making it much harder to evaluate all possibilities. This equates to approximately
88 bits of information in the key. Of course a brute force attack is still possible in this
case, but quite expensive.
We define a break of a cipher to be any attack that is more efficient than the
brute force attack. This new substitution cipher is easily broken, as it is subject to a
simple attack. We can make use of the fact that some letters occur more commonly
in English (or even Latin!) than others. The frequency of letters in the ciphertext is
the same as in the plaintext because each letter is always substituted by the same
letter in the ciphertext. If we have enough ciphertext, we can analyze it for common
letters and quickly identify the substitutions.
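The frequency-counting step can be sketched in Python; the ciphertext string here is a made-up example:

```python
from collections import Counter

def frequency_order(text: str) -> str:
    """Return the letters of the text, most common first."""
    counts = Counter(c for c in text.upper() if c.isalpha())
    return "".join(letter for letter, _ in counts.most_common())

# In a simple substitution cipher the letter frequencies of the plaintext
# survive into the ciphertext, so the most common ciphertext letter likely
# stands for E (in English text), the next for T or A, and so on.
sample = "XLIJVIUYIRGCSJPIXXIVW"  # hypothetical ciphertext
most_common = frequency_order(sample)[0]
```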
To protect against such attacks, new ciphers were developed that changed
the substitution with each letter. A well-known example of this is the German
Enigma machine. The Enigma consisted of three rotors, each of which performed
a substitution. Each rotor could be in one of twenty-six positions. The letter passed
through the first substitution (first rotor) then the next, and finally the third rotor, and
then was reflected back through the three rotors in reverse order, finally making the
letter of the ciphertext. The clever part was that the rotors moved after each letter.
The first rotor moved with each letter, the second moved once for every twenty-six
rotations of the first rotor, and the third once for every twenty-six rotations of the
second rotor.
In order to use the Enigma, both sides needed to have the machines set with
the initial setting. This was the weakness of Enigma, and is the general weakness
with shared key ciphers. In the Enigma case, the settings were based on a daily
schedule that was distributed to all stations in advance.

Symmetric ciphers today are usually split into two varieties: stream ciphers
and block ciphers. Stream ciphers work on each bit in turn, while block ciphers col-
lect together a block of bytes that need encrypting and work on the blocks. If there
are not enough bytes to fill a block then it is padded. One of the most commonly
used symmetric ciphers at the time of writing is the Advanced Encryption Standard
(AES) cipher, also known as Rijndael. This is a cipher that is approved by the U.S.
National Institute of Standards and Technology (NIST). AES has a block size of 128
bits and key sizes of 128, 192, and 256 bits. For example, AES-256 refers to the use
of a 256-bit key. The overall cipher process is too complex to describe in this text,
but there are a number of good descriptions available on the Web. At the time of
writing, there are some theoretical attacks on AES. Remember, a break in security
terms is anything that can work faster than a brute force attack. The current state is
that there are some related-key attacks, which are not practical attacks but can be
seen as theoretical weaknesses in the cipher.
For example, the best key recovery attack on AES-128 is equivalent to a brute force
attack on AES-126, and also requires a large amount of data storage.
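The padding step mentioned above can be illustrated with PKCS#7-style padding, a common scheme for block ciphers. This is a sketch of the padding idea only, independent of any particular AES library:

```python
BLOCK_SIZE = 16  # the AES block size in bytes (128 bits)

def pad(data: bytes, block: int = BLOCK_SIZE) -> bytes:
    """PKCS#7: append n bytes, each of value n, so the length becomes a
    multiple of the block size. A full extra block is added when the data
    already fits exactly, so unpadding is never ambiguous."""
    n = block - (len(data) % block)
    return data + bytes([n] * n)

def unpad(data: bytes) -> bytes:
    n = data[-1]
    if n < 1 or n > len(data) or data[-n:] != bytes([n] * n):
        raise ValueError("bad padding")
    return data[:-n]

padded = pad(b"IoT telemetry")  # 13 bytes -> one full 16-byte block
assert len(padded) % BLOCK_SIZE == 0
assert unpad(padded) == b"IoT telemetry"
```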
We will return to the subject of stream and block ciphers and their applicability
to the Internet of Things later in the chapter. For now we will continue
the exploration of encryption and confidentiality.
The biggest challenge in symmetric key cryptography is the distribution of
the keys. In an ideal world the two parties would be co-located at some point, se-
curely exchange cipher keys, and then communicate. However, this is impractical
in modern terms — we often wish to communicate securely with others without
ever having met them. Suppose Alice sends Bob a symmetric key via an e-mail.
However, Charlie manages to intercept the email. He can now read all the
communications between Alice and Bob, and neither of them is aware of it.
This leads us on to the development of public key cryptography (PKC).

27.2 PUBLIC KEY CRYPTOGRAPHY

In the early 1970s, three British cryptographers working at GCHQ (the United
Kingdom’s Government Communications Headquarters — who are the official
government codebreakers and cryptographers in Britain) discovered a new model
for cryptography. The initial idea was proposed by James Ellis in 1970, but without a
workable model. In 1973, Clifford Cocks implemented the first working algorithm,
and in 1974, Malcolm Williamson invented what is now known as Diffie-Hellman
key exchange. All of this work was classified and only made public by the British
Government in 1997. Not long after the original discoveries, various independent
researchers in the United States came up with similar (or the same) concepts. In
particular, in 1977 Rivest, Shamir and Adleman described the use of large numbers
with prime factors as the basis of public key cryptography (the same approach as
Cocks; the method is now known as RSA after the initials of its three U.S. inventors),
and Diffie and Hellman described the key exchange protocol that Williamson had
discovered a few years earlier. These protocols form the basis of almost all Internet
confidentiality and integrity.

Figure 27.1 Public key encryption. Suppose Alice would like Bob to send her some important
information that needs to be kept confidential. Alice sends an open padlock to Bob. In fact she can
send many of these out (they are cheap). Because the padlock is unbreakable, Charlie can examine as
many padlocks as he likes without being able to create a key to open the padlock. Therefore, Bob can
lock his secret in a box and send it to Alice. Once the box is locked, Charlie cannot do anything to
intercept and read the message. (We also assume that the boxes are secure once locked.)
To understand PKC, let us imagine that a scientist has created an unbreakable
padlock that is very cheap to produce. What do we mean by unbreakable? Firstly, it
cannot be picked. Only the key that is designed for it can open it. Just as importantly,
even when you are in possession of an open padlock, you cannot reverse-engineer
a key for it.
We call the mathematical equivalent of the padlock a public key, because
anyone can inspect it without breaking the secrecy of the protocol. The key is the
private key, because Alice needs to keep that secret — if Charlie has the key to the
padlock he could open it, read the message, and then relock it (see Figure 27.1 for
an example).

Figure 27.2 Electronic signature. This is an aspect of integrity: the message cannot have been altered
by Charlie.
Before we move on to understand more of how this works in real life, let us
explore the analogy a bit further. Suppose that the unbreakability of the padlocks
is extended: not only can you not reverse-engineer the key from the padlock, but it
also works the other way around. Given a key, you cannot fashion a padlock that
opens with that key. Stick with this: it sounds unlikely but mathematically there are
such things out there!
Now suppose Alice creates many keys, but this time she keeps the padlocks
to herself. Alice sends a key to Bob. She can, in fact, send as many keys as she likes
out. Now Alice sends a message (say an invoice) to Bob that is locked with her
padlock. Bob can open the padlock (and so could Charlie), but this time, only Alice
can create the padlock and lock up the message. Bob then knows that this message
must have been written by Alice. This is known as a signature (see Figure 27.2).
(In reality, only a short unique digest of the message known as a hash is actually
signed, not the whole message, but the principle is the same).
We have seen how we can send messages that are both secret and assured
to have come from us. Given the right sizes of boxes, Bob could put an assured
message into a secret box and get both characteristics — signature and encryption
(integrity and confidentiality).

27.2.1 Prime Numbers and Elliptic Curves

The original formulation of PKC was based on very large prime numbers. It turns
out that it is very hard to identify the prime factors of very large numbers, so if you
take two very large primes and multiply them together, it is a very slow process
to go back to the original primes. More recently, many of the prime number based
systems have been replaced with a different type of mathematical problem based
on so-called elliptic curves. These offer similar levels of difficulty to break (as far
as is known), with faster initial processing. However, in 2015 the U.S. National
Security Agency (NSA) issued an unprecedented warning against elliptic curves.
As of the time of writing, we do not know why, although there are many theories.
One known issue with both prime numbers and elliptic curves
is quantum computing. In quantum computers, instead of each bit being 0 or 1,
each qubit allows a superposition of both 0 and 1, allowing quantum computers
to solve problems that are very slow for classical computers in a fraction of the
time. At the moment general-purpose quantum computers are very simple and
confined to laboratories, but they are increasing in power and reliability. In 1994,
Peter Shor identified an algorithm for quantum computers that performs prime
factorization in polynomial time, which effectively means that most existing PKC
will be broken once sufficiently powerful quantum computers come online. Given
that most quantum computers are as yet ineffective, there is some speculation that
the NSA’s concern with elliptic curve cryptography (ECC) is actually based on
classical computing exploits, but nothing has been confirmed. One thing that we do
know is that ECC is much easier to do on IoT devices, and especially on low-power,
8- or 16-bit systems. Therefore this warning is worrying for IoT developers.
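Prime-based PKC can be sketched with the classic textbook RSA example. The primes here are absurdly small and the scheme omits real-world padding; this is an illustration of the arithmetic only, never usable cryptography:

```python
# Toy RSA with tiny textbook primes -- insecure, illustration only.
p, q = 61, 53
n = p * q                 # 3233, the public modulus
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # public exponent, coprime with phi
d = pow(e, -1, phi)       # private exponent: (e * d) % phi == 1

message = 65                             # a message encoded as a number < n
ciphertext = pow(message, e, n)          # encrypt with the public key (e, n)
assert pow(ciphertext, d, n) == message  # decrypt with the private key (d, n)

# Signing is the reverse: "lock" with the private key, check with the public one.
signature = pow(message, d, n)
assert pow(signature, e, n) == message
```

The security rests on the fact that recovering p and q (and hence d) from n alone is slow for large moduli; here n is trivially factorable, which is exactly why real keys use primes of hundreds of digits.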
For more information on all these topics (in much greater detail) you can
do no better than to read the classic book Cryptography Engineering by Ferguson,
Schneier, and Kohno.

27.2.2 Man-in-the-Middle Attacks

So far we have demonstrated that PKC can simplify the distribution of crypto keys
by having public and private keys. However, there is still a major problem: a man-
in-the-middle attack (see Figure 27.3).
The result is that we still need a way of ensuring that keys can be securely
distributed. Of course one way is for two people to physically meet, assure each
other of their identity, and then swap keys and padlocks. Unfortunately, in the
Internet realm, that just does not scale.
Figure 27.3 Man-in-the-middle attack. Suppose Alice and Bob wish to communicate, so they need
to exchange their public keys. Now imagine a third party, Charlie, can intercept and replace every
communication between Alice and Bob. Now Charlie can replace Alice’s padlock with his own, and
keep the original. Bob thinks he has Alice’s padlock, but in fact he has Charlie’s. Similarly, Charlie
replaces Bob’s key with his own and sends it on to Alice. Alice thinks she has Bob’s key, but she has
Charlie’s. Now Charlie intercepts a box from Bob to Alice. The box has an outer padlock from Bob. Bob
thinks it is Alice’s padlock, but actually it is Charlie’s! Charlie uses his own key to remove that. The
inner padlock shows that it is Bob’s message. He uses Bob’s key (which he has) to remove that. Now he
can read the message and replace it. He locks up the inner box with his padlock (which Alice believes is
Bob’s), and then locks the outer box with Alice’s padlock. Alice now receives the message and believes
it has been securely sent by Bob. Oh no!
Before we move on to solve this problem, let us translate our analogy
(padlocks and keys) into the reality of electronic keys and PKC. Real PKC
is based on public keys and private keys. A public key is like the padlock in our
first story. Anyone can make their public key known, and then other parties can
encrypt messages to that person that no one else can read, because the issuer keeps
their private key (e.g., the actual key) safe and does not disclose it. Similarly, if the
public key is available, any message that is encrypted with the private key can be
decrypted by anyone, proving that the owner of the private key encrypted it. We call
this first step encryption, and the second step signature.

27.2.3 Certificates and Certificate Authorities

The solution to this is that we need a distributed approach to validating that we have
the right key. The most common way of solving this is to use a certificate authority
(CA). Before we can understand a CA, we need to understand the certificate. A
certificate is a way of one participant validating the public key of another. Basically,
a certificate is a signed public key. If Alice signs Bob’s public key, it is a way of
saying that Alice trusts that this really is Bob’s public key and she has verified it.
Now if Charlie trusts Alice, and also has a copy of Alice’s public key, then Charlie
can verify the signature. Now Charlie understands that Alice is vouching that this is
Bob’s key. Assuming Alice is trustworthy, Charlie can now trust Bob’s key too.
If Alice does this a lot, then she is basically acting as a certificate authority.
We can also chain these certificates: David can sign Charlie’s certificate, Charlie
signs Bob’s, and Bob signs Alice’s. If we trust David and everyone behaves properly,
we can trust Alice. In real life, the root certificates (i.e., the public keys) of these CAs
are distributed widely. For example, your browser will have hundreds of these
included which will be used to certify the certificates issued to secure websites.
One problem with certificates is that they can end up being expensive. Recently the
Electronic Frontier Foundation (EFF) has tried to simplify that by helping to create
a system (http://letsencrypt.org) that provides free certificates to websites that need them.
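The chain-of-trust idea can be sketched with a toy signature scheme (tiny textbook RSA over a truncated hash). Every name and number here is illustrative and utterly insecure; real certificates carry far more structure (validity dates, extensions, revocation data):

```python
import hashlib

def toy_rsa_key(p, q, e=17):
    """Return a ((e, n), (d, n)) keypair from two tiny primes."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)

def digest(data: bytes, n: int) -> int:
    # Sign a short hash of the data, reduced to fit the tiny modulus.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data: bytes, priv) -> int:
    d, n = priv
    return pow(digest(data, n), d, n)

def verify(data: bytes, sig: int, pub) -> bool:
    e, n = pub
    return pow(sig, e, n) == digest(data, n)

ca_pub, ca_priv = toy_rsa_key(61, 53)        # the certificate authority
alice_pub, alice_priv = toy_rsa_key(47, 59)  # Alice's own keypair

# A "certificate" is just Alice's identity and public key, signed by the CA.
alice_cert = (b"alice", alice_pub)
cert_sig = sign(repr(alice_cert).encode(), ca_priv)

# Anyone who trusts ca_pub can now check that the CA vouches for alice_pub.
assert verify(repr(alice_cert).encode(), cert_sig, ca_pub)
```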
Before we finish discussing public key encryption, there is one final twist to
explain. When we encrypt data over the Internet, most systems use Transport Layer
Security (TLS), which is a generic way of encrypting TCP connections (see Chapter
22). TLS is what makes HTTPS secure for example, and we also use it with MQTT
to make MQTTS. CoAP uses the related standard — Datagram Transport Layer
Security (DTLS) — because CoAP is based on UDP instead of TCP.
As we discussed above, PKC relies on complex mathematics that is much
slower in one direction (e.g., factoring) than in the other (e.g., multiplying). This
means PKC is slow in general and therefore most crypto uses a two-stage process.
The PKC is used to bootstrap the communication. As part of this bootstrap, one of
the parties creates a new random symmetric encryption key, known as the ephemeral
key. Remember, symmetric encryption is our old fashioned but very fast encryption.
The PKC is used to securely exchange the ephemeral key, and then the two parties
use the ephemeral key to communicate.
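The bootstrap step can be sketched with a toy Diffie-Hellman exchange. The numbers are textbook-sized and insecure; real deployments use moduli of thousands of bits (or elliptic curves):

```python
# Toy Diffie-Hellman exchange -- insecure, illustration only.
p, g = 23, 5   # public parameters: a small prime modulus and a generator

a = 6          # Alice's private value, never transmitted
b = 15         # Bob's private value, never transmitted

A = pow(g, a, p)   # Alice sends A to Bob (an eavesdropper may see it)
B = pow(g, b, p)   # Bob sends B to Alice (an eavesdropper may see it)

# Each side combines its own secret with the other's public value...
alice_shared = pow(B, a, p)
bob_shared = pow(A, b, p)
assert alice_shared == bob_shared  # ...and both arrive at the same secret
```

An observer who sees only p, g, A, and B cannot feasibly recover the shared value for realistically sized parameters; the shared secret can then seed a fast symmetric cipher, which is exactly the two-stage pattern described above.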

27.2.4 Transport Layer Security

Transport Layer Security (TLS) originally started life as a protocol called Secure
Sockets Layer (SSL). In fact it is often still referred to as SSL in popular parlance,
despite the fact that TLS is now over 15 years old. Many protocols run over TLS: for
example HTTPS is HTTP over TLS, MQTTS is MQTT over TLS. TLS has many
benefits. It is widely understood. It has been tested a lot, and many problems have
already been identified and then overcome, making it more secure than less tested
algorithms. TLS is also very interoperable: many different implementations work
together billions of times a day. However, it does have significant challenges for
IoT. Firstly, it is a complex protocol designed for systems that are capable enough
to store many certificates and perform different cryptography algorithms. Secondly,
it has many different options and different flows, depending on the choices taken by
the client and server. This complexity adds up to a headache for IoT designers who
want small, compact, predictable code with as few options as possible. Despite this,
there are small cheap devices that support TLS. For example, we will later look at
how MQTT over TLS can be implemented on a cheap embeddable ESP8266 chip.
Let’s first look at the overall plan for TLS and then we will look at the most
common flow. TLS supports three main approaches.
• The first, most common, approach is that the server has a certificate and
the client does not. In this model, the client will authenticate itself after the
encrypted flow is initiated, using some other authentication mechanism such
as a username/password or a token.
• The second approach is that the client and server both have certificates,
known as mutual TLS. This is challenging because distributing and managing
client certificates is expensive and complex. It is also error-prone. In
IoT scenarios, it means that the client needs to be updated with certificates
when the certificate expires.
• The final approach is to use a Pre-Shared Key (TLS PSK) which effectively
means that instead of using asymmetric PKC, the system reverts to symmet-
ric cryptography. This has a big advantage for IoT in performance: as we
discussed earlier, PKC is expensive. PSK avoids that. However, distributing
unique, secure keys to devices and ensuring they are not stolen is even more
complex and expensive than issuing client certificates.
Given the concerns with the latter two options, we will focus on the first
approach. Even when we have chosen this approach, there are still choices (and
hence complexity) to be made: we still have to choose a cipher suite. Effectively,
this is a choice from many different encryption algorithms. There are three main
parts that need to be chosen: the PKC technology used to bootstrap, the symmetric
key encryption algorithm, and the hash algorithm. For example, the snappily named
TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA cipher suite indicates we are using
Elliptic Curve Diffie-Hellman Ephemeral key exchange (ECDHE), the Elliptic Curve
Digital Signature Algorithm (ECDSA), AES 256-bit symmetric key encryption with
Cipher Block Chaining (AES, CBC) and the Secure Hash Algorithm (SHA). There are
more than 300 combinations of TLS cipher suites, many of which are known to
be insecure. Luckily, TLS allows for negotiation, which means that our small IoT
device can ask the server to support a given cipher suite. The powerful complex
server can then adapt to the requirements of the client.
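The negotiation can be sketched as a simple preference-list intersection; the suite names and server preferences here are illustrative:

```python
# Sketch of cipher suite negotiation: the server picks the first suite
# from its own preference order that the client also offered.
SERVER_PREFERENCE = [
    "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
]

def choose_suite(client_offers: list) -> str:
    offered = set(client_offers)
    for suite in SERVER_PREFERENCE:
        if suite in offered:
            return suite
    raise ValueError("no common cipher suite; handshake fails")

# A constrained IoT client may offer only the one suite it implements.
assert choose_suite(["TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA"]) == \
    "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA"
```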
One more aspect that is more relevant to HTTP than MQTT or other IoT pro-
tocols is that TLS allows a session to be used across several different connections.
Effectively the client and server can cache the agreed master key, which a second
socket connection can then reuse. In MQTT we normally use a persistent, long-
running session, so this is not so important. HTTP connections are often dropped
and restarted, hence the need for this feature.

27.2.5 An Example TLS Handshake

In Figure 27.4 we show a sequence diagram explaining the TLS initiation flow.
The flow starts with the normal TCP socket initiation. Then the client sends
a ClientHello packet. This contains the TLS version used by the client, the client’s
time, a set of random bytes (to be used later), a session ID (if there is an existing
session to be re-established), and the proposed cipher suite or suites. The server
then responds with an agreed version, the server’s time, another set of random bytes
and the agreed cipher suite (i.e., the best available suite out of the intersection of the
suites the client has proposed and those the server supports). Then the server sends its
certificate, which is now validated by the client.

Client → Server: SYN
Server → Client: SYN-ACK
Client → Server: ACK
Client → Server: ClientHello (TLSVersion, Time, ClientRandom, SessionId, CipherSuites)
Server → Client: ServerHello (AgreedVersion, Time, ServerRandom, ChosenSuite)
Server → Client: Certificate
Server → Client: ServerHelloDone
Client → Server: ClientKeyExchange (PreMasterSecret)
Client → Server: ChangeCipherSpec
Client → Server: Finished (encrypted hash of previous messages)
Server → Client: ChangeCipherSpec
Server → Client: Finished (encrypted hash of previous messages)

Figure 27.4 Sample TLS initiation flow. In this case there is a server certificate, but no client certificate.
The flow is using RSA-based key exchange. This is the first time the client has connected to the server,
so no session reinitiation can happen.

In our sample flow, the server then
sends a ServerHelloDone to say that this phase is over. In a mutual TLS scenario, the
server would instead send a request for the client’s certificate at this point. Once this
phase is done, the two parties can agree a symmetric key using different methods,
including RSA and the Diffie-Hellman protocol. Diffie-Hellman is one of the key
ways for two parties to agree on a shared secret without a third party knowing it.
In our example, we use an RSA method of agreeing a shared key. Without going
into the full details of the protocol, the client sends a new random number, known as
the pre-master secret (PMS), which is encrypted using the server’s public key (which
was in the certificate). Both sides now have a random number (the PMS) that is hidden
from observers, together with the random numbers exchanged before, which were
available to any attacker. Using these three things, they can both compute the same
master key, which can be used to encrypt all subsequent data. This is what we referred
to as an ephemeral key above. Now, the client sends a ChangeCipherSpec packet
which indicates that all future packets it sends are encrypted. It now sends the first
encrypted packet, Finished, which contains a hash of all the previous handshake
messages. This is used to ensure that no man-in-the-middle attack has been done.
Then the server responds with the same ChangeCipherSpec and Finished.
For a much more detailed description of the handshake we recommend
reading the excellent article The First Few Milliseconds of an HTTPS Connection
(http://www.moserware.com/2009/06/first-few-milliseconds-of-https.html).
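For the curious, the shape of the master-secret computation in TLS 1.2 (the HMAC-based PRF from RFC 5246) can be sketched as follows. This is a simplified sketch of the derivation step only, not a complete or verified TLS implementation:

```python
import hashlib
import hmac

def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """The P_hash expansion from TLS 1.2 (RFC 5246) using HMAC-SHA256."""
    out, a = b"", seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()           # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]

def master_secret(pre_master: bytes, client_random: bytes,
                  server_random: bytes) -> bytes:
    # master_secret = PRF(pre_master_secret, "master secret",
    #                     ClientRandom + ServerRandom), 48 bytes long.
    seed = b"master secret" + client_random + server_random
    return p_sha256(pre_master, seed, 48)

ms = master_secret(b"\x03\x03" + b"\x00" * 46, b"c" * 32, b"s" * 32)
assert len(ms) == 48
```

This makes concrete why both random values are exchanged in the clear during the hello phase: only the pre-master secret needs to stay hidden for the derived master key to be secret.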

27.2.6 Datagram Transport Layer Security

Datagram Transport Layer Security (DTLS) is TLS modified as little as possible to
support UDP. TCP has both re-delivery of messages and message ordering, neither
of which is provided by UDP. Therefore there are two key issues that need to
be resolved to get TLS to work in a UDP environment. Firstly, handshake messages
may get lost. Secondly, the decryption of packet n in TLS relies on having seen
packet n − 1; if packet n − 1 is lost, then the decrypting side will get confused and
decrypt the data incorrectly. In addition, there is a problem that some of the TLS
handshake messages are larger than a datagram and may get fragmented, which is
also handled by TCP but not by UDP. To solve these issues, the DTLS standard adds
retransmission and message sequence numbers to the packets.
One concern about DTLS is the relative newness of the protocol and its early
adoption. DTLS 1.2 was introduced in 2012 (compared to 1999 for TLS 1.0). In 2013
researchers found an attack on DTLS with certain cipher suites that could retrieve
the plaintext. TLS has had many severe attacks, but these are in fact a good sign: we
know it has been tested and improved, and there is a wealth of knowledge of how
to implement it effectively. The same is not yet true of DTLS.

27.3 CRYPTOGRAPHY ON SMALL DEVICES

This concludes our beginner’s guide to encryption. However, before we leave
encryption, let us examine the unique challenges of crypto on IoT devices.
It is generally thought that cryptography is not feasible on very small (e.g., 8-
bit) devices. In fact this is not strictly true, but there are significant issues. The quick
summary is that there are two key issues in performing crypto on 8-bit devices:
memory and time. Many 8-bit IoT devices have only 32 KB of memory, and even
the most effective elliptic curve libraries take around 10 KB, leaving not much room
for the core logic of the controller. Even worse, there is no room to store
proper certificate chains. The result is that a device must be hard-coded for the
specific certificates at a server, meaning that there is much less flexibility, and it is
more likely we will need to update the device with new certificate details, which
is difficult. Another issue is time: if a device takes 60 seconds to encrypt a small
amount of data, that may be acceptable if it has no user input. However, if it locks
up for 60 seconds while a user is expecting some response, they may think it is
broken.
There are two approaches to solve this. Firstly, there are dedicated hardware
crypto systems that are cheap and effective. For example, the Atmel ATSHA204A
cryptographic chip provides secure password and key storage, a unique identifier,
SHA-256 hashing, and symmetric cryptography, all accessed via the I2C interface
(see Section 19.1 on hardware interfaces). The bulk cost at the time of writing is
approximately 0.5 USD per chip. However, these chips require designers to
understand and integrate them, and there is often a lack of open-source code or
accessible documentation, although examples have recently been made available for
Arduino.
The other approach is to use a more powerful platform that includes crypto,
often a 32-bit device. For example, the ESP8266 and ESP32 chips offer AES- and
TLS-based encryption for a bulk cost starting at about 2 USD per chip (ESP8266
cost at the time of writing). These chips may be used as full IoT controllers or be
configured as communications coprocessors that provide WiFi and encryption to an
existing controller system.
For example, the commonly available ESP8266 supports simple TLS with
most TCP-based protocols, including HTTPS and MQTTS. The use of TLS in
406 The Technical Foundations of IoT

these cases is more limited compared to a normal non-IoT system, because of
the storage requirements. Even though the ESP8266 has a generous amount of
memory and program storage by IoT standards (1 MB of program storage, 80 KB of
RAM and 512 bytes of EEPROM), the certificate chains would take
up most of the available memory. Therefore, instead the TLS library validates only
the server’s certificate fingerprint. This is a unique hash based on the certificate.
This has the advantage of saving a considerable amount of memory, but it ties the
code to a specific certificate issued to the server. In other words, when the server
certificate expires, then the IoT device needs updating. To solve this, the system
needs an effective model to update the device’s code or EEPROM contents without
compromising usability, safety, or security. We will discuss this further later in this
part of the book.
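The fingerprint check amounts to hashing the server’s certificate and comparing against a value baked into the firmware. A minimal sketch follows (Python for illustration; the byte strings stand in for real DER-encoded certificates, and ESP8266 libraries typically pin a SHA-1 or SHA-256 hash):

```python
# Sketch of certificate-fingerprint pinning as used on constrained devices.
import hashlib

def fingerprint(cert_der: bytes) -> str:
    # A certificate fingerprint is simply a hash of the certificate's bytes.
    return hashlib.sha256(cert_der).hexdigest()

PINNED = fingerprint(b"server-cert-2017")   # burned into firmware at build time

def connection_allowed(presented_cert: bytes) -> bool:
    return fingerprint(presented_cert) == PINNED

assert connection_allowed(b"server-cert-2017")
# After the server renews its certificate, the pinned check fails, and the
# device needs a firmware/EEPROM update:
assert not connection_allowed(b"server-cert-2018")
```

This is far cheaper than storing and validating a full certificate chain, which is exactly the trade-off described above.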
Chapter 28
Threats, Challenges, and Concerns for IoT
Security and Privacy
In order to characterize the threats against IoT, we can use a combination of
two different approaches. We have already addressed the high-level categories of
CIA+. Now we will extend that into a wider matrix by dividing the IoT world into
three domains: the device/hardware, the network, and the cloud/server side.
Together, these two different ontologies form a matrix of threats. In each cell of
the matrix we can identify IoT threats, and then we can use this IoT threat matrix
to inform our approach to creating a secure IoT. Our aim in creating this matrix is
to identify threats that are unique to IoT. In some cells of our matrix, the threats
will not differ from existing Internet systems. In those cells we will still outline the
threats, but we will identify these as existing threats.
In Figure 28.1, each cell contains a very high-level summary of the threats,
challenges, and concerns for IoT security and privacy. This is then followed up with
a more detailed description below. For each cell we also look at countermeasures
and approaches to provide security.

28.1 A1: DEVICE CONFIDENTIALITY

Hardware devices have their own challenges for security. There are systems that
can provide tamper-proofing and try to minimize attacks, but if an attacker has
direct access to the hardware, they can often break it in many ways. For example,
there are devices that will copy the contents of flash memory into another system
(known as NAND mirroring). Code that has been secured can often be broken

407
408 The Technical Foundations of IoT

Security Characteristic Device/Hardware Network Cloud/Server Side


A B C
1. Confidentiality A1 hardware attacks B1 encryption with C1 privacy, data leaks,
low capability devices fingerprinting

2. Integrity A2 spoofing: lack of B2 signatures with low C2 no common device


attestation capability devices, identity
Sybil attacks

3. Availability A3 physical attacks B3 unreliable network, C3 DDoS


DDoS, radio jamming

4. Authentication A4 lack of UI, default B4 default passwords, C4 no common device


passwords, hardware lack of secure identities identity, insecure flows
retrieval of secrets

5. Access control A5 physical access, B5 lightweight C5 inappropriate use


lack of local distributed protocols of traditional ACLs,
authentication for access control device shadow

6. Non-repudiation A6 no secure local B6 lack of signatures C6 lack of secure


storage, no attestation, with low capability identity and signatures
forgery devices

Figure 28.1 IoT threat matrix.


Threats, Challenges, and Concerns for IoT Security and Privacy 409

with scanning electron microscopes. Skorobogatov from Cambridge University has


written a comprehensive study of many semi-invasive attacks that can be done on
hardware. Another common attack is called a side-channel attack, where the power
usage or other indirect information from the device can be used to steal information.
This means that it is very difficult to protect secrets on a device from a committed
attacker.
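To make the side-channel idea concrete, consider timing: a naive comparison of a secret returns as soon as one byte differs, so response times reveal how much of an attacker’s guess is correct. A small Python illustration, using the standard library’s constant-time comparison as the countermeasure:

```python
# Timing side channel: early-exit comparison leaks how many leading bytes of a
# guess are correct; constant-time comparison removes that signal.
import hmac

def naive_equal(secret: bytes, guess: bytes) -> bool:
    if len(secret) != len(guess):
        return False
    for a, b in zip(secret, guess):
        if a != b:
            return False       # early exit: running time leaks the match length
    return True

def constant_time_equal(secret: bytes, guess: bytes) -> bool:
    # hmac.compare_digest takes (roughly) the same time for any mismatch.
    return hmac.compare_digest(secret, guess)

assert naive_equal(b"k3y", b"k3y") and constant_time_equal(b"k3y", b"k3y")
assert not naive_equal(b"k3y", b"k4y") and not constant_time_equal(b"k3y", b"k4y")
```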
A specific outcome of this is that designers should not rely on obscurity to
protect devices. A clear example of this was the MIFARE card (see also Section
19.3.1.2) used as the London Oyster card (for cashless payment on public trans-
port) and for many other authentication and smart-card applications. The designers
created their own cryptographic approach and encryption algorithms. Security re-
searchers used a number of techniques to break the obscurity, decode the algorithm,
find flaws in it, and create a hack that allowed free transport in London as well as
breaking the security on a number of military installations and nuclear power plants!
A related issue to the confidentiality of data on the device is the challenge
inherent in updating devices and pushing keys out to them. The distribution and
maintenance of certificates and public keys on embedded devices is complex. In
addition, sensor networks may be only intermittently connected, resulting in
limited or no access to the certificate authority (CA). To address this, the use
of threshold cryptographic systems that do not depend on a single central CA has
been proposed, but this technology is not widely adopted: in any given environment
this would require many heterogeneous devices to support the same threshold
cryptographic approach.
Finally, the use of PKI requires devices to be updated as certificates expire.
Performing updates on IoT devices is hard, especially on
smaller devices where there is no user interface. For example, many devices need
to be connected to a laptop in order to perform updates. This requires human
intervention and validation, and in many cases this is another area where security
falls down. For example, many situations exist where security flaws have been fixed
but because devices are in homes, or remote locations, or seen as appliances rather
than computing devices, updates are not installed.

28.2 B1: NETWORK CONFIDENTIALITY

The confidentiality of data on the network is usually protected by encryption
of the data. There are a number of challenges with using encryption in small
devices. Performing public key encryption on 8-bit microcontrollers has been
enhanced by the use of ECC. ECC reduces the time and power requirements for
the same level of encryption as an equivalent RSA public key encryption by an
order of magnitude: RSA encryption on constrained 8-bit microcontrollers may
take minutes to complete, whereas similar ECC-based cryptography completes
in seconds. However, despite the fact that ECC enables 8-bit microcontrollers to
participate in public key encryption systems, in many cases it is not used. We
can speculate as to why this is: firstly, the encryption algorithms consume a large
proportion of the available program memory (ROM) on small controllers. Secondly,
the complexity of key distribution hinders the development and management of
encrypted IoT systems.
Another key challenge in confidentiality is the complexity of the most com-
monly used encryption protocols. The standard TLS protocol can be configured to
use ECC, but even in this case the handshake process requires a number of message
flows and is sub-optimal for small devices. It is argued that using TLS PSK improves
the handshake. However, there are significant challenges with using PSK with IoT
devices: the fact that either individual symmetric keys need to be deployed onto
each device during the device manufacturing process, or the same key re-used. In
the case of key re-use there is a serious security risk that a single device will be
broken and thus the global key will be available to attackers.
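One common mitigation for the shared-key risk is key diversification: derive each device’s key from a factory master key and the device’s unique identifier, so that one compromised device does not expose the others. A sketch follows (illustrative Python; a real deployment would typically use HKDF and keep the master key in secure hardware, and all names here are invented):

```python
# Key diversification sketch: per-device PSKs derived from one master key.
import hmac, hashlib

MASTER_KEY = b"factory-master-key"   # placeholder; never leaves the factory/HSM

def device_psk(device_id: str) -> bytes:
    # Each device gets a unique key; the server can re-derive it from the ID.
    return hmac.new(MASTER_KEY, device_id.encode(), hashlib.sha256).digest()

k1 = device_psk("device-0001")
k2 = device_psk("device-0002")
assert k1 != k2                          # compromising one device reveals one key
assert device_psk("device-0001") == k1   # deterministic: server re-derives on demand
```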
There is a version of TLS for UDP — DTLS, which provides a lighter weight
approach than TLS with TCP. However, there is still a reasonably large RAM and
ROM size required for this, and this requires that messages be sent over UDP, which
has significant issues with firewalls and home routers, making it a less effective
protocol for IoT applications. There is ongoing work at the IETF to produce an
effective profile of both TLS and DTLS for the IoT.
Many of the same concerns about cryptography apply to B2, where we look
at the use of digital signatures with low-power devices.
Given the lack of cryptography, it is not surprising that many of the attacks
on IoT devices have been based on attacking unencrypted radio transmissions,
which are common. For example, security researchers recently found that they could
fingerprint cars based on transmissions from tire pressure monitors, and in addition
that they could drive behind a car and from up to 40 feet away they could signal to
the driver that the tire pressure was dangerously low when in fact it was not. Such
an attack could easily be used to get a driver to stop and leave their car.
Other local networking approaches also have flaws. For example, Bluetooth
Low Energy 4.0 had a problem in the initial standard where the key exchange was
insecure, and attackers within radio range of the exchange could then
eavesdrop on all further communication. While this has been fixed in new versions,
one of the challenges of IoT is that such functionality is often implemented in
chips that cannot be updated, with the result that many exploitable BLE systems will
remain in use for a long time with no chance of updates or fixes. To give another
example, the Oyster card issue mentioned above is still exploitable as Transport for
London accepts existing broken smart cards more than eight years after the exploit
was first publicized.
Even without concern about the confidentiality of the data, there is one further
confidentiality issue around IoT devices in the network, and that is confidentiality of
the metadata. Many IoT systems rely on radio transmission and in many cases they
can be fingerprinted or identified by the radio signature. For example, Bluetooth and
WiFi systems use unique identifiers (the MAC address; see Chapter 19). These can
be identified by scanning, and there have been a number of systems deployed to do
that, including in airports and in cities. These systems can effectively track users
geographically. If the user then connects to a system, that fingerprint can be
associated with the user and the previously collected location information can be
correlated with that user.
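The tracking risk can be illustrated in a few lines: once sightings are grouped by a stable identifier such as a MAC address, a single identified login retroactively deanonymizes the whole track. The data below is invented for illustration:

```python
# Sketch: passive Wi-Fi/Bluetooth scanning turns a stable MAC address into a
# location history that can later be tied to a person.
from collections import defaultdict

sightings = [                        # (MAC, location) pairs from scanners
    ("aa:bb:cc:11:22:33", "airport"),
    ("aa:bb:cc:11:22:33", "mall"),
    ("de:ad:be:ef:00:01", "mall"),
    ("aa:bb:cc:11:22:33", "office"),
]

track = defaultdict(list)
for mac, place in sightings:
    track[mac].append(place)

# The moment this MAC is linked to a user (e.g., by logging in to a portal),
# their past movements are retroactively linked to them:
assert track["aa:bb:cc:11:22:33"] == ["airport", "mall", "office"]
```

MAC address randomization, now common in phones, is a direct countermeasure to exactly this correlation.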

28.3 C1: CLOUD/SERVER CONFIDENTIALITY

In general, the issues around cloud confidentiality are the same as the issues in non-
IoT systems. There are, however, some key concerns over privacy that are unique to
the Internet of Things. We cover some of these here, as well as others in our discussion
around cloud access control.
One major concern that is exacerbated by the Internet of Things is correlation
of data and metadata, especially around deanonymization. Many researchers have
shown that anonymous metadata can be deanonymized by correlating it with other
publicly available social metadata (e.g., comparing anonymized Netflix viewing
data with IMDB reviews). This is a significant concern with IoT data.
A related issue is the concept of fingerprinting of sensors or data from sensors.
It has been shown that microphones, accelerometers and other sensors within
devices have unique fingerprints that can uniquely identify devices. This has been
extended to such aspects as the discharge curve of batteries. The result is that data
that is assumed to be anonymous or disjoint can often be correlated to provide a
bigger picture that has significant possibilities to infringe on users’ privacy.
A key model for addressing these issues in the cloud is services that filter,
summarize and apply stream processing technologies to the data coming from IoT
devices. For example, if we only publish a summarized coordinate rather than the
raw accelerometer data, we can potentially avoid the fingerprinting.
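As a sketch of this idea, the service below quantizes raw samples into a coarse summary so that device-specific noise patterns are no longer visible. The sample values are invented for illustration:

```python
# Summarize-before-publish: coarse aggregates hide the sensor noise signature
# that would otherwise fingerprint the individual device.
def summarize(samples, step=0.5):
    """Reduce raw sensor samples to one coarse-grained, quantized average."""
    mean = sum(samples) / len(samples)
    return round(mean / step) * step     # quantization discards device-unique noise

device_a = [1.012, 0.987, 1.004, 0.998]  # each device has its own noise pattern
device_b = [1.019, 0.981, 1.008, 0.993]

# After summarization the two devices are indistinguishable:
assert summarize(device_a) == summarize(device_b) == 1.0
```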
In addition, an important concern has been raised in the recent past by the
details of the government-sponsored attacks from the U.S. NSA and British GCHQ
that were revealed by Edward Snowden. These bring up three specific concerns
on IoT privacy and confidentiality.
The first concern is the revelation that many encryption and security
systems have had deliberate backdoors added to them so as to make them less
secure, in an NSA project called BULLRUN. The second concern is the revelation that
many providers of cloud hosting systems have been forced to hand over encryption
keys to the security services. The third major concern is the revelations on the extent
to which metadata is utilized by the security services to build up a detailed picture
of individual users.
The implications of these three concerns, when considered in the light of the
Internet of Things, are clear: a significantly deeper and larger amount of data and
metadata will be available to security services and to other attackers who can utilise
the same weaknesses that the security services have compromised and will continue
to compromise. In one case recently, malware from the NSA was directly leaked
after being left on a staging server.

28.4 A2: HARDWARE INTEGRITY

The concept of integrity refers to maintaining the accuracy and consistency of data.
In this cell of the matrix, the challenges are in maintaining the device’s code and
stored data so that it can be trusted over the life cycle of that device. In particular the
integrity of the code is vital if we are to trust the data that comes from the device or
the data that is sent to the device. The challenges here are viruses, firmware attacks
and specific manipulation of hardware. For example, we have seen worm attacks
on router and IoT firmware, where each compromised system then compromises
further systems, leaving behind a slew of untrustworthy systems.
The traditional solution to such problems is attestation. Attestation is impor-
tant in two ways. Firstly, attestation can be used by a remote system to ensure
that the firmware is unmodified and therefore the data coming from the device is
accurate. Secondly, attestation is used in conjunction with hardware-based secure
storage to ensure that authentication keys are not misused. This technology is known
as a hardware security module (HSM).

In order to preserve the security of authentication keys in a machine where
human interaction is involved, the user is required to authenticate. Often the keys
are themselves encrypted using the human’s password or a derivative of the iden-
tification parameters. However, in an unattended system, there is no human inter-
action. Therefore the authentication keys need to be protected in some other way.
Encryption on its own is no help, because the encryption key is then needed and this
becomes a circular problem. The solution to this is to store the authentication key
in a dedicated hardware storage. However, if the firmware of the device is modified,
then the modified firmware can read the authentication key and offer it to a hacker
or misuse it directly. The solution to this is for an attestation process to validate that
the firmware is unmodified before allowing the keys to be used. Then the keys must
also be encrypted before sending them over any network.
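The interaction between attestation and hardware key storage can be sketched as follows. This is an illustrative Python model, not a real TPM API; the class and method names are invented:

```python
# Attestation-gated key release: a secure element stores the authentication key
# and a known-good firmware hash, and only releases the key when the currently
# running firmware matches that hash.
import hashlib

class SecureElement:
    def __init__(self, auth_key: bytes, good_firmware: bytes):
        self._key = auth_key
        self._expected = hashlib.sha256(good_firmware).digest()

    def release_key(self, running_firmware: bytes) -> bytes:
        if hashlib.sha256(running_firmware).digest() != self._expected:
            raise PermissionError("attestation failed: firmware modified")
        return self._key

se = SecureElement(b"device-auth-key", good_firmware=b"firmware-v1")
assert se.release_key(b"firmware-v1") == b"device-auth-key"
try:
    se.release_key(b"firmware-v1-with-malware")   # modified firmware is refused
    assert False
except PermissionError:
    pass
```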
These attestation models are promoted by groups such as the Trusted Computing
Group and Samsung Knox. These rely on specialized hardware chips such as the
Atmel AT97SC3204, which implement the concept of a trusted platform module
(TPM). There is research into running these for smart grid devices. However,
while there is considerable discussion of using these techniques with IoT, we could
not find evidence of any real-world devices apart from those based on mobile-
phone platforms (e.g., phones and tablets) that implemented trusted computing and
attestation. We believe this is due to the complexity, the lack of public documentation
and open-source code, and other process issues that make it very difficult to
develop IoT devices with attestation capabilities.

28.5 B2: NETWORK INTEGRITY

Maintaining integrity over a network is managed as part of the public key encryption
models by the use of digital signatures. The challenges for IoT are exactly those we
already identified in the section B1 above, where we described the challenges of
using encryption from low-power IoT devices.
However, there is a further concern with IoT known as the Sybil attack. A
Sybil attack (named after the subject of a book about multiple personality disorder)
is where a peer-to-peer network is taken over by an attacker who creates a sufficiently
large number of fake identities to convince the real systems of false data. A Sybil
attack may be carried out by introducing new IoT devices into a locality or by
suborning existing devices. For example, it is expected that autonomous cars may
need to form local ephemeral peer-to-peer networks based on the geography of the
road system. A significant threat could arise if a Sybil attack fed those
cars incorrect data about traffic flows.
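A toy example of a Sybil attack on sensor consensus (all numbers invented): honest nodes agree on the true value until enough fake identities join and shift the result.

```python
# Sybil attack sketch: fake identities outvote honest sensors in a consensus
# based on the median of reported values.
from statistics import median

honest = [30, 31, 29, 30, 31]    # true traffic speed is about 30 km/h
sybils = [90] * 6                # fake identities claiming free-flowing traffic

assert median(honest) == 30              # without the attack: correct consensus
assert median(honest + sybils) == 90     # six fake voices outweigh five real ones
```

The defence is to make identities expensive or verifiable, which is exactly why attestation and strong device identity matter here.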

28.6 C2: CLOUD/SERVER INTEGRITY

The biggest concern in cloud/server integrity is the lack of common concepts and
approaches for device identity. Integrity relies on identity — without knowing who
or what created data, we cannot trust that data. There is some emerging research
on using federated identities for IoT (for example, the work by one of the present
authors in the paper OAuthing: Privacy Enhancing Federation for the Internet of
Things), and we will discuss this approach later in Chapter 29.

28.7 A3: DEVICE AVAILABILITY

One of the significant models used by attackers is to challenge the availability of
a system, usually through a DoS or DDoS attack. DoS attacks and availability attacks
are used in several ways by attackers. Firstly, there may be some pure malicious
or destructive urge (e.g., revenge, commercial harm, share price manipulation) in
bringing down a system. Secondly, availability attacks are often used as a precursor
to an authentication or spoofing attack, as discussed above.
IoT devices have some different attack vectors for availability attacks. These
include resource consumption attacks (overloading restricted devices) and physical
attacks on devices. A simple availability attack on an IoT device might be to force
it to use more power (e.g., by initiating multiple key exchanges over Bluetooth)
and thereby drain the battery. An even more obvious availability attack
would be to simply physically destroy a device.

28.8 B3: NETWORK AVAILABILITY

There are clearly many aspects of network availability that are the same as existing
network challenges. However, there are some issues that particularly affect IoT. In
particular, there are a number of attacks on local radio networks that are possible.
Many IoT devices use radio networking and these can be susceptible to radio
jamming.

28.9 C3: CLOUD/SERVER AVAILABILITY

The challenges with cloud/server availability are not new. Elsewhere we looked at
DoS attacks and DDoS attacks. The biggest challenge here is the use of IoT devices
themselves to create the DDoS attack on the server, as in the Mirai botnet.

28.10 A4: DEVICE AUTHENTICATION

We will consider the authentication of the device to the rest of the world in later
sections. In this cell of the matrix we must consider the challenges of how users
or other devices can securely authenticate to the device itself. These are, however,
related: a user may bypass or fake the authentication to the device and thereby cause
the device to incorrectly identify itself over the network to other parts of the Internet.
Some attacks are very simple: many devices come with default passwords that
are never changed by owners. We already identified this as a root cause of Mirai.
Another well-publicized example was where a security researcher gained access to
full controls of a number of smart homes, and was able to phone up homeowners and
while talking to them, remotely control lighting and heating in their smart homes.
Similarly, many home routers are at risk through insecure authentication.
Such vulnerabilities can then spread to other devices on the same network as
attackers take control of the local area network.
A key issue here is the initial registration of the device. A major issue with
hardware is when the same credential, key, or password is stored on many devices.
Devices are susceptible to hardware attacks (as discussed above) and the result is
that the loss of a single device may compromise many or all devices. In order to
prevent this, devices must either be preprogrammed with unique identifiers and
credentials at manufacturing time, or must go through a registration process at
setup time. In both cases this adds complexity and expense, and may compromise
usability. We propose the use of Web APIs such as Dynamic Client Registration
(DCR), which is part of the OAuth2 specifications, to create unique keys/credentials
for each device.
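As a sketch of this approach, a device could send a DCR registration request such as the following at setup time and receive back a unique client_id and client_secret. The field values here are placeholders, and a real request would go to the provider’s registration endpoint over TLS:

```python
# Sketch of an OAuth2 Dynamic Client Registration (RFC 7591) request body
# that a device sends once, at setup time, to obtain unique credentials.
import json

registration_request = {
    "client_name": "thermostat-7f3a",              # per-device name (placeholder)
    "grant_types": ["client_credentials"],         # machine-to-machine flow
    "token_endpoint_auth_method": "client_secret_basic",
}

body = json.dumps(registration_request)
# The server's response would include a unique client_id/client_secret for
# this device, so no two devices ever share a credential.
assert "client_credentials" in body
```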

28.11 B4: NETWORK AUTHENTICATION

Unlike browsers or laptops where a human has the opportunity to provide authenti-
cation information such as a user id and password, IoT devices normally run unat-
tended and need to be able to power-cycle and reboot without human interaction.
This means that any identifier for the device needs to be stored in the program
memory (usually SRAM), ROM or storage of the device. This brings two distinct
challenges:
• The device may validly authenticate, but its code may have been changed.
• Another device may steal the authentication identifier and may spoof the
device.
We already discussed the Sybil attack, where a single node or nodes may
impersonate a large number of different nodes, thereby taking over a whole network
of sensors. In all cases, attestation is a key defence against these attacks, but as dis-
cussed earlier, there is still not enough real-world use of attestation in IoT devices.
Another defence is the use of reputation and reputational models to associate a trust
value to devices on the network.
Reputation is a general concept widely used in all aspects of knowledge
ranging from humanities, arts, and social sciences to digital sciences. In computing
systems, reputation is considered as a measure of how trustworthy a system is. There
are two approaches to trust in computer networks: the first involves a black and
white approach based on security certificates, policies, and so forth. For example,
a system known as SPINS develops a trusted network. The second approach is
probabilistic in nature, where trust is based on reputation, which is defined as a
probability that an agent is trustworthy. In fact, reputation is often seen as one
measure by which trust or distrust can be built based on good or bad past experiences
and observations (direct trust) or based on collected referral information (indirect
trust).
In recent years, the concept of reputation has shown itself to be useful in
many areas of research in computer science, particularly in the context of distributed
and collaborative systems, where interesting issues of trust and security manifest
themselves. Therefore, one encounters several definitions, models, and systems of
reputation in distributed computing research. This is an ongoing area of research,
and undoubtedly will provide a key aspect to providing trust in IoT networks.
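One common probabilistic formulation, offered here as an illustration rather than as any specific system from the literature, is the beta reputation model, where trust is estimated from counts of good and bad past interactions:

```python
# Beta reputation sketch: trust is the mean of a Beta(good+1, bad+1)
# distribution, i.e., (good + 1) / (good + bad + 2).
def reputation(good: int, bad: int) -> float:
    return (good + 1) / (good + bad + 2)

assert reputation(0, 0) == 0.5    # no history: neutral trust
assert reputation(8, 0) == 0.9    # consistently good behaviour: high trust
assert reputation(1, 7) == 0.2    # mostly bad behaviour: low trust
```

Indirect (referral) trust can then be folded in by adding a neighbour’s good/bad counts, weighted by our trust in that neighbour.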

28.12 C4: CLOUD/SERVER AUTHENTICATION

The IETF has published draft guidance on security considerations for IoT
(https://tools.ietf.org/html/draft-irtf-t2trg-iot-seccons-00). This draft does discuss
both the bootstrapping of identity and the issues of privacy-aware identification.
One key aspect is that of bootstrapping a secure conversation between the IoT device
and other systems, which includes the challenge of setting up an encrypted and/or
authenticated channel such as those using TLS, HIP, or Diet HIP. The Host Iden-
tity Protocol (HIP) is a protocol designed to provide a cryptographically secured
endpoint to replace the use of IP addresses, which solves a significant problem —
IP-address spoofing — on the Internet. Diet HIP is a lighter-weight rendition of the
same model designed specifically for IoT and M2M interactions. While HIP and
Diet HIP solve difficult problems, they face significant barriers to adoption.
Firstly, they require low-level changes within the IP stack to implement (see Chapter
22). Secondly, as they replace traditional IP addressing they require a major change
in many existing systems to work. In addition, neither HIP nor Diet HIP address the
issues of federated authorization and delegation.
As discussed above, one key issue is the lack of good identity models for
IoT. Fremantle has spearheaded the work on using OAuth2 for authentication and
authorisation with MQTT and the IOT-OAS work similarly addresses the use of
OAuth2 with CoAP. Another team has built a secure mobile digital wallet by using
OAuth together with the eXtensible Messaging and Presence Protocol (XMPP). In
Chapter 29 we will take a more detailed look at using OAuth2, DCR, and federated
identity to solve these problems.

28.13 A5: DEVICE ACCESS CONTROL

There are two challenges to access control at the device level. Firstly, devices are
often physically distributed and so an attacker is likely to be able to gain physical
access to the device. The challenges here were already discussed in A1. However,
there is a further challenge: access control requires a concept of identity. We cannot
restrict or allow access without some form of authentication to the device, and as
discussed in A4, this is a significant challenge. To give a real-life example, certain
mobile phones have recently started encrypting data based on the user’s own lock-
screen personal identification number (PIN) code. This guarantees the data cannot be read
without the user’s PIN code. However, using the technique of NAND Mirroring, it
has been demonstrated that the controls that stop repeated attempts at PIN codes
can be overcome, with the result that a 4-digit PIN can be brute-forced within
a reasonable amount of time. This shows the balance of usability versus security.
Asking users for an 8-digit PIN would solve this, but would make it impracticably
annoying to use their phone.
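The arithmetic behind this trade-off is simple. Assuming, purely for illustration, an attacker who can test ten PINs per second once the retry limit is bypassed:

```python
# Back-of-envelope PIN brute-force arithmetic. The guess rate is an assumption
# for illustration; real NAND-mirroring setups will differ.
GUESSES_PER_SECOND = 10

def worst_case_hours(digits: int) -> float:
    """Hours to exhaust the entire PIN space at the assumed guess rate."""
    return 10 ** digits / GUESSES_PER_SECOND / 3600

assert worst_case_hours(4) < 0.3          # 10,000 PINs: under 17 minutes
assert worst_case_hours(8) > 24 * 100     # 100,000,000 PINs: months of guessing
```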
Systems such as Webinos have proposed using policy-based access control
mechanisms such as XACML for IoT devices. However, XACML is relatively
heavyweight and expensive to implement, especially in the context of low-power
devices. To address this, Webinos has developed an engine that can calculate the
subset of the policy that is relevant to a particular device. Despite this innovation,
the storage, transmission, and processing costs of XACML are very high for an IoT
device.

28.14 B5: NETWORK ACCESS CONTROL

There are a number of researchers looking at how to create new lightweight proto-
cols for access control in IoT scenarios. In Identity Authentication and Capability
Based Access Control (IACAC) for the Internet of Things, Mahalle describes a
new protocol for IoT authentication and access control based on ECC,
with a lightweight handshake mechanism to provide an effective approach for IoT,
especially in mobility cases. In Autonomous and Self Controlling Smart Objects
for the Future Internet, Hernandez has proposed a decentralized approach for ac-
cess control that uses ECC once again and supports capability tokens in the CoAP
protocol.

28.15 C5: CLOUD/SERVER ACCESS CONTROL

The biggest challenge for privacy is ensuring access control, at the server or cloud,
over data collected from the IoT. For example, in 2011 the company
Fitbit made data about users’ sexual activity available and easily searchable online
by default. This highlights a number of issues. Firstly, users were not aware of what
was being published by default (demonstrating a lack of informed consent), and
Fitbit had not thought through the implications of making a large amount of data
easily searchable. At the heart of this are social and policy issues regarding who
actually owns the data created by IoT devices.
Existing hierarchical models of access control are not appropriate for the
scale and scope of the IoT. There are two main approaches to address this. The
first is policy-based security models where roles and groups are replaced by more
generic policies that capture real-world requirements, such as "a doctor may view
a patient's record if they are treating that patient in the emergency room." The
second approach to support the scale of IoT is user-directed security controls. The
Kantara Initiative has made a strong case for ensuring that users can control access
to their own resources and to the data produced by the IoT that relates to those users.
The User-Managed Access (UMA) protocol from the Kantara Initiative enhances the OAuth
specification to provide a rich environment for users to select their own data-sharing
preferences. We would argue strongly that this overall concept of user-directed
access control to IoT data is one of the most important approaches to ensuring
privacy.
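The doctor example above can be sketched as a policy function evaluated against attributes of the subject, action, resource, and context. The data structures are invented for illustration; real systems express such policies in a language like XACML rather than in code:

```python
# Policy-based access control sketch: the real-world condition is captured
# directly, instead of enumerating static roles and groups.
def doctor_er_policy(subject, action, record, context):
    return (subject["role"] == "doctor"
            and action == "view"
            and context["location"] == "emergency_room"
            and record["patient"] in subject["treating"])

alice = {"role": "doctor", "treating": {"patient-42"}}
record = {"patient": "patient-42"}

assert doctor_er_policy(alice, "view", record, {"location": "emergency_room"})
assert not doctor_er_policy(alice, "view", record, {"location": "cafeteria"})
```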
In Privacy and the Emerging Internet of Things, Winter and others have
argued that contextual approaches must be taken to ensure privacy with the IoT.
Many modern security systems use context and reputation to establish trust and
to prevent data leaks. Context-based security defines this approach, which is now
implemented by major Web systems including Google and Facebook.
One interesting approach here is the concept of a device shadow. A device
shadow is a cloud or network construct that captures all the data from the device and
then passes this onto any systems that need it. There are some significant functional
benefits to this model — e.g., the device shadow can be available even if the device
itself is offline intermittently. It also can provide significant access-control benefits,
by acting as a gatekeeper for IoT data and metadata. A simple example is that often
the metadata of systems does not have good access control: for example, an IP
address may be used to locate a device geographically. The device shadow can hide
this from the rest of the world, preventing this kind of attack.
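A minimal device-shadow sketch follows (illustrative Python; class and field names are invented): the shadow caches the last reported state for offline reads and exposes only whitelisted fields, hiding metadata such as the IP address.

```python
# Device shadow sketch: serve reads while the device is offline and act as a
# gatekeeper that withholds sensitive metadata from consumers.
class DeviceShadow:
    PUBLIC_FIELDS = {"temperature", "battery"}   # only these reach consumers

    def __init__(self):
        self._state = {}

    def report(self, **state):
        # Device pushes updates whenever it happens to be connected.
        self._state.update(state)

    def read(self):
        # Consumers read the cached state even if the device is offline,
        # and never see non-whitelisted metadata.
        return {k: v for k, v in self._state.items() if k in self.PUBLIC_FIELDS}

shadow = DeviceShadow()
shadow.report(temperature=21.5, battery=0.83, ip="10.0.0.17")
assert shadow.read() == {"temperature": 21.5, "battery": 0.83}   # ip withheld
```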

28.16 A6: DEVICE NON-REPUDIATION

The biggest challenge for non-repudiation with IoT devices is the
challenge of using attestation for small devices. Attestation is discussed in detail
above. Without attestation, we cannot trust that the device system has not been
modified and therefore it is not possible to trust any non-repudiation data from the
device.

28.17 B6: NETWORK NON-REPUDIATION

The same challenges apply for network non-repudiation as discussed in B1 and
B2. Non-repudiation on the wire requires cryptographic techniques, and these are
often hindered by resource restrictions on small devices. PKASSO is a proposed
authentication and non-repudiation protocol for restricted devices.
420 The Technical Foundations of IoT

28.18 C6: CLOUD/SERVER NON-REPUDIATION

Cloud/server non-repudiation is unchanged by the IoT, so we do not discuss it any
further.

28.19 SUMMARY OF THE THREAT MATRIX

In this matrix we have created a widened ontology for evaluating the security issues
surrounding the Internet of Things, and examined the existing challenges, threats,
concerns and approaches.
One area that crosses most or all of the cells in our matrix is the need for
a holistic and studied approach to enabling privacy in the IoT. As discussed in a
number of cells, there are significant challenges to privacy with the increased data
and metadata that is being made available by IoT-connected devices. An approach
that has been proposed to address this is privacy by design (PBD). This model
suggests that systems should be designed from the ground up with the concept of
privacy built into the heart of each system. Many systems have added security or
privacy controls as add-ons, with the result that unforeseen attacks can occur.
In reviewing these areas, we identified a list of security properties and capa-
bilities that are important for the security and privacy of IoT. We will use this list in
the second part of this discussion as columns in a new table where we evaluate a set
of middleware on their provision of these capabilities.
Integrity and confidentiality
The requirement to provide integrity and confidentiality is an important
aspect of any network, and as discussed in several cells, there are a number of
challenges in this space for IoT.

Access control
Maintaining access control to data that is personal or can be used to extract
personal data is a key aspect of privacy. In addition, it is of prime importance
with actuators that we do not allow unauthorized access to control aspects of
our world.

Policy-based security
Managing security at the scale of IoT is infeasible with a centralized approach.
As we discussed, access control and identity models need to be based on
policies such as XACML or OAuth2 scopes rather than built in a traditional
hierarchical approach.

Authentication
Clearly, in order to respect privacy, IoT systems need a concept of authenti-
cation.

Federated identity
There is a clear motivation for the use of federated models of identity for
authentication in IoT networks.

Attestation
Attestation is an important technique to prevent tampering and hence issues
with integrity of data as well as confidentiality in IoT.

Summarisation and filtering
The need to prevent deanonymization is a clear driver for systems to provide
summarisation and filtering technologies such as stream processing.

Privacy by Design
An important approach to ensuring privacy is to build this into the design of
the systems.

Context-based security and reputation
Many modern security models adapt the security based on a number of factors,
including location, time of day, previous history of systems, and other aspects
known as context. Another related model is that of the reputation of systems,
whereby systems that have unusual or less-than-ideal behavior can be trusted
less using probabilistic models. In both cases there is clear application to IoT
privacy as discussed above.
Chapter 29
Building Secure IoT Systems

In this chapter we will explore approaches to hardware, software, network and cloud
security that provide a better result. We can never expect to be 100% secure — it is
worth bearing in mind that security is always a balance between cost and benefit.

29.1 HOW TO DO BETTER

Based on the previous section, it is possible to think a little negatively about IoT
security. We have seen many examples of hacks, a huge list of concerns, and many
areas where work needs to be done. However, this is not to say that we are not
positive about the future of IoT security. We believe that if systems implement the
best available practice for IoT security today, then many of the existing hacks would
never have happened.
To bolster this view, we will take a more detailed look at an IoT system called
OAuthing that implements many of these techniques to build a more secure ap-
proach to IoT (see Figure 29.1). OAuthing is an open source research project that
provides some of the key aspects that we need to protect the system. The code
for OAuthing is available at https://github.com/pzfreo/oauthing, and there is more
information on OAuthing available in the research paper OAuthing: Privacy Enhancing
Federation for the Internet of Things, available at http://freo.me/oauthing-ciot-paper.
Let us start with a quick overview of the system. The system consists of three
main pieces: the device, the identity system, and the data-sharing system. In addition
there are two other parties: the manufacturer and the third-party service provider —
we’ll come back to these shortly.


[Figure 29.1 is a diagram of the entities in OAuthing (Manufacturer, Device Identity
Provider (DIdP), User Identity Provider (UIdP), Device, Intelligent Gateway (IG),
Third-Party App (TPA), Personal Cloud Middleware (PCM) and User) and the
one-to-one and one-to-many relationships between them.]

Figure 29.1 OAuthing. Concepts and entities in OAuthing. Arrows denote relationships, 1: one, *:
many.

The aim of the OAuthing model is to create a secure system, with secure
devices and a secure cloud service, together with a secure network between the two.
The security starts with the device. There are two key aspects of the device
that help solve the problems we have brought up earlier: the first is obviously
encryption, and the second is identity and registration. Our sample device is based
on the ESP8266, which costs around 2 USD at the time of writing, and has a 32-bit
processor, 1 MB of program memory, 80 KB of variable memory, and built-in
WiFi. We coded the device’s bootloader and logic in Processing, which is the C-like
language used by Arduino devices.

29.1.1 Device Registration

One of the issues we brought up before is that of secure registration. To solve this,
we created a three-stage process for creating a secure and managed device. The
first stage is in the factory, where each device is initialized with a secure random
identifier and secret. Aren't these just user ids and passwords? The difference is
simply that these are truly random: at 30 random alphanumeric characters each,
they are not susceptible to dictionary attacks. A dictionary attack
is where the words in a dictionary are used to try to break a user id/password
combination. This id and secret are provided by an entity we call the Device Identity
Provider (DIdP), which is a server on the Internet that manages the security of
devices and their identities. The device also needs the fingerprint of the DIdP’s
certificate. This is because — as discussed above — the device doesn’t have
sufficient storage to manage a normal TLS certificate chain. This combination gives
the device and the DIdP mutual authentication: the device uses its id/secret to
identify itself to the DIdP, and the DIdP is authenticated by validating the TLS certificate.
Thereby we have already ticked off confidentiality, integrity, and authentication
from our list! Encoding each device with a unique id and secret is definitely more
expensive than giving each device a standard userid and password, so to minimize
the cost we built an automated process to do this. As part of this process, we create
a QR code that is printed onto the device, but this could also be done by printing
a short URL or embedding an NFC chip — the QR code is simply a cheap and
effective equivalent.
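The factory step can be sketched as follows (illustrative Python; `provision_credentials` is our own name, not a function from the OAuthing codebase):

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters

def provision_credentials(length=30):
    """Generate a truly random device id and secret. A 30-character value
    drawn uniformly from a 62-character alphabet has roughly 178 bits of
    entropy, so it is not susceptible to dictionary attacks."""
    device_id = "".join(secrets.choice(ALPHABET) for _ in range(length))
    device_secret = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return device_id, device_secret
```

The essential point is the use of a cryptographically secure random source (here Python's `secrets`), never a human-chosen or predictable value.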
The second phase of the registration process is the user registration. This gives
control of the device to a given user. We built a process where users scan the QR
code. They are then taken to a federated login page. This allows them to use their
existing federated logins (e.g., OAuth2, SAML2, Shibboleth) to identify themselves
to the DIdP. The DIdP then asks for the user’s consent to own this device. If the user
consents, then the device is issued with a security token, which works in conjunction
with the id and secret. This is completely based on the OAuth2 specification. This
token (called the refresh token) now identifies the device as being owned by the user.
Once a user owns the device, it cannot be re-registered until that user revokes
ownership (e.g., when selling the device), so further scans of the QR code will
not allow an attacker to take ownership.
The way OAuth2 works is that the device’s id, secret and refresh token are
only ever sent to the DIdP, and always using TLS. When the device boots up or
when a timeout happens, the device uses those three parameters to request a new
ephemeral token from the DIdP. This token — called an OAuth2 bearer token —
can then be shared with other systems. Because it has a short lifetime, even if
the other systems are compromised and the bearer token is stolen, it only allows
limited access to the attackers. At the same time as the refreshed bearer token is
passed back, the DIdP can also pass back other information to the device, enabling
secure updates to happen. We also can pass back secure configuration information,
including server certificate fingerprints and server addresses, allowing the device to
communicate effectively with other servers.
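The token refresh is the standard OAuth2 refresh-token grant (RFC 6749, Section 6). The sketch below builds the request body the device would POST over TLS to the DIdP's token endpoint; the field names are fixed by OAuth2, while the endpoint itself is deployment-specific:

```python
from urllib.parse import urlencode

def refresh_grant_body(device_id, device_secret, refresh_token):
    """Form-encoded body of an OAuth2 refresh-token grant. The server
    responds with a JSON document containing a short-lived bearer token
    (access_token) and its lifetime (expires_in)."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": device_id,
        "client_secret": device_secret,
    })
```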
Once the device has its bearer token, it enters stage three of the process:
runtime. In this mode, it now sends data to data-sharing middleware. It also supports
receiving commands from the middleware. All the communication in this version
of OAuthing happens over MQTT, but the same principles could be applied to other
protocols such as CoAP or HTTP (see Chapter 22).
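At runtime the bearer token takes the place of a password in the MQTT CONNECT packet. The exact convention is gateway-specific; the sketch below assumes, as many OAuth2-aware brokers do, that the token is carried in the password field:

```python
def mqtt_connect_options(device_id, bearer_token):
    """Options for an MQTT-over-TLS session authenticated with a
    short-lived OAuth2 bearer token (the username marker and the
    password convention are assumptions, not a fixed standard)."""
    return {
        "client_id": device_id,
        "username": "bearer",      # marker for the gateway (assumption)
        "password": bearer_token,  # the ephemeral OAuth2 bearer token
        "tls": True,
        "port": 8883,              # IANA-registered port for secure MQTT
    }
```

Because the token expires quickly, a stolen copy is far less damaging than a stolen long-lived credential.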
What else could we do to enhance the security of the device? The main
aspect we have not implemented is to support a hardware security manager and
trusted platform. This would allow us to store encryption keys directly in the device
and improve over our use of certificate fingerprints. A second benefit would be
to prevent the device’s secrets being stolen. Currently, a determined hacker with
physical access to the device could steal the identity, secret and refresh token from
the device. However, that would only allow them to emulate that device.
Even more importantly, adding trusted platform support to the device would
allow us to verify at the server side (e.g., in the DIdP) that the device’s code has
not been modified using device attestation. This would prevent hackers from using
those secrets even if they could steal them, because the server would validate that
the device's code is unmodified. However, the extra complexity and expense of this
approach make it prohibitive at the moment. We look forward to when this becomes
affordable and effective.

29.1.2 Device Identity System

We have already described some aspects of the DIdP. There are some other note-
worthy aspects that are worth mentioning. Before we dive into those, let us look
at the overall design of the DIdP. The DIdP is a node.js application that runs in the
Docker infrastructure. This makes it very easy to deploy in a cloud environment. We
have not yet implemented database encryption to protect user and device credentials,
but this would be a logical step for most deployments. However, we do hash all the
secrets using a unique salt per device. This is probably unnecessary, as the secrets
we store are not amenable to dictionary attacks, but the belt and braces approach
does no harm in this case.
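The hashing scheme can be sketched like this (illustrative; OAuthing's actual node.js implementation may differ in algorithm and parameters):

```python
import hashlib
import secrets

def hash_secret(secret, salt=None):
    """Hash a device secret with a unique per-device salt before storage."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", secret.encode(), salt, 100_000)
    return salt, digest

def verify_secret(secret, salt, digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", secret.encode(), salt, 100_000)
    return secrets.compare_digest(candidate, digest)
```

The unique salt ensures that two devices with (improbably) identical secrets would still produce different stored hashes.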
A key design decision we took is to ensure that the system never stores
any user passwords. Instead, we implemented a pattern known as the identity
broker. The DIdP is issued identity tokens by third-party identity systems (such
as Facebook, Google, or Twitter). It then issues its own tokens based on those to
the device and to other parties wishing to communicate with the device. To protect
identities, we coded the DIdP to create random pseudonyms for each user. This
means that the user’s identity is never shared with any other system, including the
device itself.
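The pseudonym mapping can be sketched as follows (illustrative; OAuthing persists this mapping in its own store rather than in memory):

```python
import secrets

class PseudonymDirectory:
    """Identity-broker detail: each federated user identity is mapped to a
    stable random pseudonym, and only the pseudonym ever leaves the DIdP."""

    def __init__(self):
        self._pseudonyms = {}

    def pseudonym_for(self, user_id):
        """Return the existing pseudonym for this user, creating one if needed."""
        if user_id not in self._pseudonyms:
            self._pseudonyms[user_id] = secrets.token_urlsafe(16)
        return self._pseudonyms[user_id]
```

The pseudonym is stable, so the same user always maps to the same broker, but it carries no information about who the user is.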

29.1.3 Personal Cloud Middleware

An orthogonal aspect of the OAuthing model is how data is shared. For example,
earlier iterations of the system had a standard MQTT broker with a security plugin
that validated the OAuth2 credentials to handle authentication and authorization.
This model is good, but in OAuthing, we extended this to support each user having
their own broker. In effect, this is the model mentioned above where a shadow
protects the device from direct access. In this case we can classify the user’s broker
as a user shadow, or a personal cloud middleware (PCM). In effect the device only
ever publishes data to the user’s broker, and that then shares data with any cloud or
Internet service that is authorized to access the data. There are three benefits to this
approach:

• The metadata of the device is protected. For example, a location can be
identified by the IP address of the device, but by intermediating with a PCM,
we can avoid this data leaking out.
• The PCM can implement summarization and filtering. This is very important
in protecting privacy in IoT cases. For example, by only sharing a moving
average of sensor data, we can prevent the fingerprinting mentioned above.
This is not yet implemented in OAuthing but is planned for a future version.
• Trust in the cloud system can be achieved if the user and the third party are
both aware of the code running in the PCM. For example, if the PCM is an
open source codebase, then the user can trust data is shared properly and not
leaked. At the same time, the data recipient can know that the summarization
is accurate and correct. One option to improve this is to use attestation of
the PCM. For example, Intel Software Guard Extensions (SGX) allow cloud
systems to implement remote attestation. This could be used to implement
non-repudiation.
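Summarization of the kind described above can be as simple as a windowed moving average (a sketch; as noted, this is not yet implemented in OAuthing):

```python
from collections import deque

class MovingAverageFilter:
    """Share only a moving average of raw readings: downstream consumers
    still see the trend, but the raw waveform that could fingerprint a
    device or its user never leaves the PCM."""

    def __init__(self, window=5):
        self._window = deque(maxlen=window)

    def update(self, reading):
        """Ingest one raw reading, return the current windowed average."""
        self._window.append(reading)
        return sum(self._window) / len(self._window)
```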

The PCM in OAuthing is implemented using Docker to run the open-source
MQTT broker Really Small Message Broker (RSMB) that is part of the Mosquitto
project of the Eclipse Foundation. A smart gateway known as IGNITE first validates
OAuth2 credentials and then passes on the MQTT packets to the right Docker con-
tainer. If the user has not previously used OAuthing, then IGNITE calls Docker to
instantiate a new container and broker. IGNITE supports both MQTT/TLS (known
as MQTTS) and also MQTT over Secure WebSockets (known as MQTT/WSS),
which allows third parties to effectively connect to the broker as well.

29.1.4 Pseudonymous Data Sharing

We have so far discussed how identities are managed and how a broker is run
for each user. Because the users are protected by pseudonyms, IGNITE does not
know the actual user identity, and neither does the device nor the third-party
data recipient. In order to allow a third party to receive data (or send commands
to the device), the user simply initiates an OAuth2 consent flow. This is usually
done by visiting a webpage provided by the third-party service. This redirects to
the DIdP, which asks the user’s consent. This then issues a token back to the third-
party service. Then the service can make an MQTTS or MQTT/WSS connection to
IGNITE, which validates the token and then routes the request to the user’s PCM.
The token itself is just a random string that IGNITE can map to a pseudonym. In
this way a user can share data from a device without sharing their identity.
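The routing step can be sketched as a pair of lookups (illustrative; IGNITE's real implementation manages live broker connections rather than returning addresses):

```python
class TokenRouter:
    """Map an opaque access token to a pseudonym, and the pseudonym to the
    address of that user's personal cloud middleware. The third party holds
    only the token and never learns the user's identity."""

    def __init__(self):
        self._token_to_pseudonym = {}
        self._pseudonym_to_pcm = {}

    def grant(self, token, pseudonym, pcm_address):
        """Record a consent: this token may reach this pseudonym's PCM."""
        self._token_to_pseudonym[token] = pseudonym
        self._pseudonym_to_pcm[pseudonym] = pcm_address

    def route(self, token):
        """Return the PCM address for a valid token, or None."""
        pseudonym = self._token_to_pseudonym.get(token)
        if pseudonym is None:
            return None
        return self._pseudonym_to_pcm.get(pseudonym)
```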
In this section we have described a state-of-the-art federated platform for IoT.
We have seen how we can create secure IoT systems that provide confidentiality,
integrity, secure federated authentication, non-repudiation, protection against fin-
gerprinting, and other key requirements that were identified in previous chapters.

29.2 CONCLUSIONS

In conclusion, across the security chapters we have looked at the core requirements
of security and privacy, and we have identified how these relate to IoT, specifically
to the hardware, network, and cloud systems of IoT networks. We have used this to
identify challenges, threats, and best practices. We then looked at an open-source
system that implements many of these best practices and shows how security and
privacy for IoT can be improved. As we said earlier, it is not effective to bolt on these
approaches. It is far better to implement them from day one as much as possible.

29.3 PRINCIPLES OF IOT SECURITY

We will end this chapter with a set of key principles for IoT security and privacy.
If you have read the rest of this part of the book you should already have a good
concept of these.

• Don’t rely on obscurity.
• Each device needs unique secure credentials.
• Support simple update procedures.
• Only support device update if you can secure it.
• Publish just enough data and no more.
• Ensure that there is consent for all data sharing or command sharing.
• Don’t provide raw data from sensors that can be fingerprinted.
• Protect metadata as much as data.
• Separate security flows from data flows.
• Beware of radio and other scanning attacks.
• Give users a clear view of what data you are publishing and storing on their
behalf.
About the Authors

Boris Adryan is Head of IoT & Data Analytics at Zühlke Engineering in Germany.
Boris has a doctoral degree in developmental genetics, and spent more than 15
years in the fields of bioinformatics and computational biology before rekindling
his love for Internet technology. As a Royal Society University Research Fellow at
the University of Cambridge (UK), Boris led a research group focusing on genomic
data analysis and machine learning. He has co-authored about 50 peer-reviewed
publications in the biomedical sciences and contributed various book chapters.
Following his interests in the Internet of Things, he founded the data analytics
consultancy firm Thingslearn and collaborated with companies in the London IoT
startup scene before returning to Germany. A geek at heart and a keen enthusiast
with a soldering iron, Boris put his first “thing” on the Internet while still in
secondary school in 1995: not surprisingly, a microscope.
Dominik Obermaier is CTO and Co-Founder of dc-square (Germany) that
specializes in IoT and MQTT product development, and provides architecture con-
sulting for ambitious projects with millions of connected IoT devices. Dominik’s
responsibilities include the success of professional service projects for international
customers in various industries such as automotive, military, logistics and telecom-
munications. He leads the product development of the HiveMQ MQTT broker and
designed the product’s fundamental architecture. HiveMQ scales in projects to tens
of millions of concurrent connected IoT devices with millions of delivered MQTT
messages per second. Dominik holds a B.Sc. in computer science from the Uni-
versity of Applied Sciences Landshut, where his Bachelor thesis received an IHK
Award for outstanding theses. He actively participates in the standardization of the
MQTT protocol as member of the OASIS MQTT technical committee. Dominik is
a member of the German Informatics Society and is a frequent conference speaker,
conference expert committee member for various conferences, and author for mul-
tiple German computer science magazines.
Paul Fremantle is a doctoral researcher at the University of Portsmouth, with
a focus on the security and privacy of the Internet of Things. Paul was Co-Founder


and CTO at WSO2, where he was instrumental in the creation of the award-winning
WSO2 Carbon platform. WSO2 technology powers over 2 trillion transactions per
day in production. Paul was named one of the world’s top 25 CTOs by Infoworld in
2008. Before founding WSO2, Paul was a Senior Technical Staff Member at IBM,
where he led the creation of the WebSphere Web Services Gateway, arguably the
world’s first API gateway. Paul is a visiting lecturer at Oxford University, where
he teaches service oriented architecture and big data. Paul is a Member of the
Apache Software Foundation, where he was VP of the Apache Synapse project.
He jointly chaired the WSRM working group in OASIS, leading to the creation
of an International Standard, and he has also participated in the AMQP and MQTT
working groups. Paul has two patents and he has co-authored three books, alongside
over 20 peer-reviewed publications. Paul has both B.A. and M.Sc. degrees from
Oxford University, where he studied mathematics, philosophy and computing.
Index

1-wire, 232, 242
2FA, see security
3GPP, see cellular data services
4-20 mA current loop, 238, 242
6LoWPAN, see IPv6 over Low-Power Wireless Personal Area Networks
802.3at Type 1/2, 162
AC, see electric current
access control, 389
actionable insight, 97, 355
actuator, 129–131, 161, 177, 225, 240, 243, 256, 269, 275, 297, 328, 345, 348, 384, 390
    7-segment display, 178
    e-ink paper, 179
    electrical igniters, 185
    electromechanical buzzer, 177
    in-plane switching display, 179
    light emitting diode, 178
    liquid crystal display, 178
    loudspeaker, 177
    matrix display, 178
    motor, 180
    piezoelectric buzzer, 177
    relay, 183
    solenoid, 180
    sound, 177
    thin-film transistor display, 179
ADC, see analog-to-digital conversion
Address Resolution Protocol, 218
ADR, see LoRa
Advanced Message Queuing Protocol, see AMQP
advanced planning and scheduling, 109, 242
Advanced Research Projects Agency Network, 88
AES, see security
agricultural IoT, 111
AM, see modulation
American National Standards Institute, 170
American Standard Code for Information Interchange, 221
AMPS, see cellular data services
AMQP, 332–336
    connection, 334
    container, 333
    flow control, 334, 336
    link, 334
    node, 334
    session, 334
analog-to-digital conversion, 74, 77, 187, 211, 298
analytics, 134, 136, 291, 292, 297, 347, 353–355, 371
    accuracy-versus-sensitivity plot, 369
    action selection, 360
    anomaly detection, 365
    batch processing, 361
    Bloom filter, 357
    boxplot, 363
    classifier, 355, 367, 369
    clustering, 364, 365
    data clean-up, 367
    data science, 355
    deep artificial neural networks, 366
    distance measure, 365
    feature, 367
    heatmap, 364, 365
    hierarchical clustering, 364, 365
    k-means clustering, 364, 365
    Kalman filter, 190, 357–359
    Kalman gain, 357, 358
    machine learning, 360, 364, 371
    machine learning model, 367
    naive Bayes, 369
    neural network, 366
    operations research, 360
    outlier detection, 364
    performance insight, 355
    post hoc analysis, 357
    principal component analysis, 365
    principal component plot, 365
    random forest, 369
    regression, 367
    regression analysis, 362
    regression model, 363
    replicator network, 366
    rules engine, 360, 364
    scatter plot, 362
    strategic insight, 355
    stream analytics, 355
    streaming analytics, 355, 359
    supervised learning, 355, 365, 367
    support vector machine, 369
    unsupervised learning, 364
ANSI, see American National Standards Institute
ANT, 265
antenna, 27, 31, 32, 38, 131, 146, 176, 237, 262, 267, 275, 278, 286
    antenna array, 191
    base geometry, 32
    dipole, 38
    dipole rod, 32
    director, 38
    helical antenna, 37
    inverted F, 37
    near- and far-field, 37
    parabola antenna, 287
    reflector, 38
    sector antenna, 279
    wire trace, 37
    wire whip, 37
    Yagi, 38
API, see application programming interface
application programming interface, 94, 134, 150, 320, 346, 348, 372, 373, 415
APS, see advanced planning and scheduling
Arduino, 94, 207, 208, 405
arithmetic unit, 70
ARP, see Address Resolution Protocol
ARPANET, see Advanced Research Projects Agency Network, 90
ASCII, see American Standard Code for Information Interchange
ASI, see fieldbus
ASK, see modulation
asset tracking, 101, 110, 111, 123, 262
assisted living, 104, 118
asynchronous communication, 227, 233
Asynchronous Transfer Mode, 216
ATM, see Asynchronous Transfer Mode
atom, 4, 10
    Bohr-Rutherford model, 5
    neutron, 4
automated irrigation, 111
autonomous driving, 117
backend, 136, 332, 345–347, 353, 355, 357
BACnet, see fieldbus
BAN, see network architecture
bandwidth, 294
base64, 323
battery, 46, 107, 130, 131, 151, 156, 160, 163, 188, 207, 212, 269, 276, 282, 298, 355
    accumulator, 168
    alkaline battery, 165, 167
    alkaline-manganese battery, 171
    chemistry, 163, 167, 168
    coin cell, 171
    conditioning phase, 168
    design, 163, 168, 171
    discharge rate, 167
    electrolyte, 168
    fire safety, 169
    form factor, 170
    lithium polymer, 168
    memory effect, 168
    polymer electrolyte, 165
    primary cell, 165
    rechargeable battery, 167, 168, 176
    secondary cell, 167
    thermal runaway, 169
    trickle charging, 169
    zinc-carbon battery, 171
beacon, 110, 188, 190, 269, 270
Beaglebone, 208
BEEP, see Blocks Extensible Exchange Protocol
bidirectional communication, 137, 230
big data, 354
binary addition, 66, 221
binary code, 66
bit, 66, 221
bit banging, 211
BJT, see transistor
BLE, see Bluetooth
Blocks Extensible Exchange Protocol, 342
Bluetooth, 42, 114, 132, 140, 143, 144, 188, 190, 255, 257, 259, 260, 266, 269, 274, 275, 384, 411, 414
    application layer, 273
    Bluetooth Classic, 270, 271
    Bluetooth Low Energy, 260, 265, 270, 273, 410
    Bluetooth Mesh, 270, 271
    Bluetooth Smart, 270
    Bluetooth Special Interest Group, 269
    channels, 270
    connection mode, 273
    enhanced data rate, 270
    Generic Attribute Profile, 273
    high speed, 270
    hold mode, 273
    Human Interface Device, 273
    inquiry mode, 271
    master-slave architecture, 271
    organisation unique identifier, 271
    out-of-band identification, 270
    paging mode, 273
    pairing, 270
    park mode, 273
    physical layer, 271
    piconet, 271
    power class, 271
    protocol stack, 271
    secure simple pairing, 270
    Serial Port Profile, 273
    sniff mode, 273
BNC connector, 250
Boolean algebra, 63, 360, 361
Boolean logic, 63
bootloader, 71, 74, 212, 424
breadboard, 156, 208
BSC, see cellular data services
BTS, see cellular data services
building automation, 118, 142, 225
business logic, 98
business models, 103
byte, 66, 221
CA, see security
CAN, see network architecture, see fieldbus
CAP theorem, 293
capacitor, 30, 31, 46, 48, 58, 81, 192, 194, 195, 204, 233
    (metal) film capacitor, 49
    ceramic capacitor, 49
    electrolytic capacitor, 49
    super capacitor, 48
    tantalum capacitor, 49
    time constant, 48
Carnot’s theorem, 175
carrier sense multiple access/collision avoidance, 239, 245, 258, 268, 274
carrier sense multiple access/collision detection, 239, 252
CAS, see Chemical Abstract Service
CE, see Conformité Européenne
cellular data services
    Global System for Mobile Communications, 145, 146
cellular data, 132, 145
cellular data services, 146, 266, 279
    1G, 280
    2G, 280
    3G, 144, 280, 282, 283
    3rd Generation Partnership Project, 281
    4G, 281, 282, 284
    5G, 281, 284
    5G frequency, 285
    Advanced Mobile Phone System, 281
    base station controller, 279, 280
    base transceiver station, 279–284
    Cat-NB1, 284
    cell, 279, 282
    circuit switched data, 282
    digital cellular data, 280
    Enhanced Data Rates for GSM Evolution, 280, 283
    frequency, 281
    frequency division duplex, 283
    General Packet Radio Service, 280, 283
    Global System for Mobile Communications, 280
    GSM, 144, 281, 282
    GSM channel bandwidth, 282
    GSM frequency band, 281
    GSM module, 285
    High Speed Circuit Switched Data, 283
    High Speed Packet Access, 280, 283
    Long Term Evolution, 281
    LTE, 144, 145, 281, 284
    LTE Advanced, 284
    LTE carrier, 284
    LTE channel bandwidth, 284
    LTE frequency band, 282
    mobile switching center, 280
    multimedia messaging service, 280
    Next Generation Mobile Networks Alliance, 284
    Personal Digital Cellular, 281
    short messaging service, 280
    software SIM, 285
    time division duplex, 283
    UMTS channel bandwidth, 283
    UMTS frequency band, 281
    Universal Mobile Telecommunications System, 280
    Wideband Code Division Multiple Access, 280, 283
    WiMAX, 281, 282
    WiMAX frequency band, 282
    Worldwide Interoperability for Microwave Access, 281
central processing unit, 70, 131
    types, 70
Certificate of Conformity, 158
certification, 37, 155, 156
character string, 221
charlieplexing, 178
Chemical Abstract Service, 201
CIDR, see classless inter-domain routing
CIP, see Common Industrial Protocol
classful networking, 313
classless inter-domain routing, 313
clock recovery, 224, 267
clock signal, 227
clock stretching, 232
cloud, 129, 134–136, 297, 345, 346, 356, 407, 411, 418, 419, 423, 426, 427
    platform-as-a-service, 345
    software-as-a-service, 346
CMOS, see integrated circuit
CoAP, 318, 325–328, 336, 425
    confirmable message, 326
    Constrained RESTful Environments link format, 328
    de-duplication, 326
    group communication, 327
    observable resources, 327
    resource discovery, 327
coil, 30, 50, 161, 176
collision detection, 245, 261
command query responsibility segregation, 347
common clock, 292
Common Industrial Protocol, 240, 247
computability, 81
computer
    Colossus, 70
    electromechanical computer, 70
    Electronic Numerical Integrator and Computer, 70
    multipurpose computer, 45, 70, 73, 74, 207, 211, 212, 297, 299
    von Neumann architecture, 70
    Z2, 70
conceptual model, 148, 151
condition-based monitoring, 103, 107, 109
conductivity, 8, 16, 203
conductor, 10, 164
conductors, 9
Index 437

doping, 9 Datagram Transport Layer Security, 327, 404,


semiconductor, 9, 10 410
Conformité Européenne Conformité Européenne, DC, see electric current
158 DCR, see security
connected health, 104 DCS, see distributed control system
connected product, 109, 188 DDoS, see security
consumer IoT, 118, 121, 125, 129, 133, 150, DDS, see Data Distribution Service
156 demand-and-supply management, 101, 110
consumer product, 143, 162, 163, 168, 174, demand-based infrastructure, 113
255, 266, 274, 295 Deutsche Industrienorm, 158
Coulomb, 12, 17 device catalog, 373
Coulomb’s law, 13, 19 device discovery, 374
Coulter effect, 202 device identity, 232, 245, 268, 271, 278, 291,
CPF, see fieldbus 294, 389, 414, 415, 417, 424, 426
CPU, see central processing unit digital image processing, 187
CQRS, see command query responsibility seg- digital logic, 63, 69, 291
regation digital signal processing, 80, 353, 359
CRC, see cyclic redundancy check fast Fourier transformation, 359
crystal, 61, 176, 177, 179, 188, 209 Digital Subscriber Line, 218
CSD, see cellular data services digital-to-analog conversion, 74, 77, 211, 298
CSMA/CA, see carrier sense multiple access/collision DIN, see Deutsche Industrienorm
avoidance diode, 46, 50, 155, 197
CSMA/CD, see carrier sense multiple access/collision breakdown, 51
detection crystal diode, 50
CSS, see LoRa current rating, 51
cyclic redundancy check, 245, 253, 307 peak inverse voltage, 51
photo diode, 53, 59, 174, 196, 202
D-Sub, 228, 244 power rating, 51
DAC, see digital-to-analog conversion Schottky diode, 51
DALI, see fieldbus Zener diode, 51, 170
dashboard, 134 DIP, see electric component
data authenticity, 268 direct memory access, 248
Data Distribution Service, 341 Direct Sequence Spread Spectrum, 268
data frame, 216, 253 directionality, 223
database, 136, 293–295, 347, 349, 355, 372, distributed control system, 93, 241, 242
373 distributed system, 291–294, 299, 347, 416
ACID rules, 350 Byzantine fault, 294
BASE rules, 351 CAP theorem, 351
key : value storage, 351 fallacies, 294
NoSQL, 351 Distribution Line Carrier, 254
relational database management system, DLC, see Distribution Line Carrier
349, 350 domain name, 91
schemaless, 351 doping, 53
SQL database, 350 n-type, 9
time-series database, 134, 352 p-type, 9
datagram, 307 DSL, see Digital Subscriber Line
438 Index
DSP, see digital signal processing brushless motor, 181
DSSS, see Direct Sequence Spread Spectrum commutator, 181
DTLS, see Datagram Transport Layer Security DC motor, 181
duplex, 137 inchworm motor, 180
dynamic range, 77 microstepper motor, 183
servo motor, 181
ECC, see security stepper motor, 181, 183
EDGE, see cellular data services electric socket, 159
edge computing, 354, 356 electric vehicle, 107, 113, 117
edge device, see end device electrical safety, 156
EEP-ROM, see memory electricity, 4, 17, 19, 45, 131, 159
EFF, see Electronic Frontier Foundation bandgap, 9, 197
electric circuit, 16, 45, 46, 55, 59, 63, 80 capacitance, 15–17, 30
1-bit full adder, 66 electric band, 8, 164
amplification circuit, 58 electric band model, 8
design, 156 electric potential, 14
high-pass filter, 49 electric potential difference, 15, 17, 18, 30,
integrated circuit, 45, 46, 54, 183 54, 159, 164, 174
LC circuit, 30, 31, 204 electrochemical potential, 12, 164, 165
low-pass filter, 49 free electron gas, 8
recharging circuit, 170 voltage, 15, 32, 49
rectifier bridge, 51, 161 electroluminescence, 53
electric component, 45, 46, 155, 177 electromagnetic compatibility, 158
active components, 46, 61 electromagnetic interference, 157, 158, 160,
dual inline package, 58 161, 237, 251, 254
passive components, 45 electromagnetic spectrum, 24
small outline integrated circuit, 58 electromagnetic wave, 23–26, 31, 32, 37, 40,
small outline transistor, 58 45, 215, 258, 261, 286
surface-mount, 156, 208 diffraction, 25, 28, 37
electric current, 8, 12, 18, 22, 32, 45, 50, 53, interference, 28
54, 168, 169, 174, 176, 180, 200, multipath propagation, 28, 261
203, 204, 206 reflection, 25, 27, 38
AC ripple, 161 refraction, 28
alternating current, 45, 48, 50, 159, 254 electromagnetism, 4, 19, 45
continuous current, 160 electromechanical device, 61
direct current, 159 electromechanical switch, 92
directed current, 45, 50 electron, 5, 6, 8, 10, 16, 19, 21, 24, 164, 165,
directional current, 60 167, 174, 205
leak current, 49 electric band model, 9
peak current, 48, 107, 160, 178, 233 electron diffusion, 9
electric field, 8, 12, 13, 25, 30, 32 excitation, 24
field constant, 13, 16 free electrons, 8
strength, 14, 15, 19, 38 magnetic quantum number, 8, 19
triboelectric effect, 12 orbital angular momentum quantum number,
electric motor, 19, 21, 58, 61, 180 6
AC motor, 181 potential energy, 6
principal quantum number, 6 density, 167, 168
quantum numbers, 24 electric, 18
shell, 6 emission, 6
spin quantum numbers, 8 energy level, 6
valence electron, 8 ionisation energy, 8
electron configuration, 4, 5 potential electric energy, 14
Electronic Frontier Foundation, 400 production, 23
electrostatic forces, 5 Energy Efficiency Directive, 124
elements, 4, 164 energy harvesting, 61, 160, 174
aluminum, 9 inductive charging, 176
beryllium, 9 RF harvesting, 176
cadmium, 168 Energy Star, 158
chlorine, 8, 164, 167 ENIAC, see computer
copper, 4, 8, 164, 165 EnOcean, 265
gallium, 9, 175 enterprise resource planning, 107, 242
hydrogen, 5 environmental standards, 157
indium, 175 EP-ROM, see memory
iridium, 4 ERP, see enterprise resource planning
iron, 164 error correction, 224
lead, 4, 167, 168 ESR, see resistance
lithium, 4, 9, 164, 167, 168 Ethernet, 132, 139, 141, 160, 162, 218, 237,
manganese, 165 246–252, 254, 303, 307, 310
nickel, 168 100BASE-TX, 251
phosphorus, 4, 9 100BASE-TX cable, 250
platinum, 198 10BASE-F cable, 251
potassium, 168 10BASE2 cable, 250
properties, 5 10BASE5 cable, 250
silicon, 4, 10, 61, 174, 197 5-4-3 rule, 250
sodium, 164 Cat cable, 250
strontium, 167 data frame, 248, 252, 253
tantalum, 50 encoding, 251
zinc, 4, 164, 165 EtherCAT, 248, 249
email, 90, 329 Ethernet hub, 253
embedded system, 131, 187, 195, 201, 203, Ethernet/IP, 247, 249
207, 208, 210, 225, 228, 291, 297– EtherType, 253
299, 357, 409 fast Ethernet, 250, 252
EMC, see electromagnetic compatibility industrial Ethernet, 237, 238, 242, 246,
EMF, see electromagnetic interference 248–250
encryption, 131, 133, 145, 266, 270 link layer, 252
end device, 132, 143, 146, 160, 252, 256, 278, Logical Link Control, 252
279, 283, 285, 295, 297, 298, 345, MAC address, 252, 253, 307
347, 348, 353, 354, 371, 373, 405, managed switch, 252
407, 409, 412, 413, 415, 417, 419, optical cable, 251
424, 425 payload, 253
energy, 6 Physical Coding Sublayer, 251
absorption, 6 physical layer, 249
repeater, 249, 253 Local Operating Network, 239
RG-58 cable, 250 Media Oriented Systems Transport, 239
RG-8/U coaxial cable, 249 ModBus, 240
router, 249 ProfiBus, 240, 244, 245
switch, 249 ProfiNet, 242, 247, 249
unmanaged switch, 253 firewall, 318, 348, 382
ETSI, see European Telecommunications Stan- firmware, 412
dards Institute FISCO, see fieldbus
European Telecommunications Standards In- fleet management, 116
stitute, 281 flip-flop, 63, 71
EXI, see XMPP clocked set-reset, 66
Extensible Markup Language, 324, 329, 331, JK, 66
335, 339 other types, 66
eXtensible Messaging and Presence Protocol, set-reset, 63
see XMPP, 417 flow control, 228
fluorescence, 24
F-layer, 27 FM, see modulation
fast Fourier transformation, 275 fog computing, 355, 356
FCC, see Federal Communications Commis- FPGA, see field-programmable gate array
sion Fraunhofer zone, 37
FDD, see cellular data services frequency, 24, 25, 32, 38, 40, 159, 258, 261,
Federal Communications Commission, 158 277, 279
FET, see transistor frequency band, 258, 279
FFT, see fast Fourier transformation frequency bands, 40
field level, 241 Fresnel zone, 28, 38
field-programmable gate array, 71 FSK, see modulation
fieldbus, 132, 142, 237–241, 245, 246, 252, full-duplex, 230, 245, 248, 252, 307, 316
255, 261, 299
A/S interface, 243 galvanic cell, 164, 165, 203
BitBus, 240, 243 gateway, 129, 132, 133, 141, 143, 145, 148,
Building Automation and Control Networks, 155, 159, 161, 207, 256, 279, 297,
239 355, 356
CANopen, 240, 245 GCHQ, see Government Communications Head-
Communication Profile Families, 240 quarters
Controller Area Network, 239, 245, 384 general purpose input output, 74, 211, 226,
ControlNet, 240 298, 300
DeviceNet, 240, 245 generator, 46, 61, 62, 107
Digital Addressable Lighting Interface, 239 GLONASS, see localization technology
DyNet, 240 GNSS, see localization technology
EtherCAT, 242 Gopher, 91
Fieldbus Intrinsically Safe Concept, 245 Government Communications Headquarters, 395,
FlexRay, 239 412
Foundation Bus, 240 GPIO, see general purpose input output
Interbus, 240 GPS, see localization technology
KNX, 240 GRPS, see cellular data services
Local Interconnect Network, 239 GSM, see cellular data services
half-duplex, 137, 138, 142, 233, 234, 243, 244, ICMP, see Internet Control Message Protocol
249, 252, 307 ICSP, see In-Circuit Serial Programming
Hall effect, 200 IEC, see International Electrotechnical Com-
handshake, 216, 230 mission
hardware development, 155 62026-2, 243
hardware interface, 225, 238, 243 14443, 264
hash function, 295, 357, 386, 393 18000, 262
HATEOAS, see Hypermedia As The Engine 18092, 264
Of Application State 61158, 244, 247, 248
health care, 273, 383, 385 61784, 244, 247
Hertzian dipole, 31 IEEE, see Institute of Electrical and Electron-
HIP, see security ics Engineers
home automation, 266 802.11, 260, 267, 274
household appliance, 124, 131 802.11a, 274
HSCSD, see cellular data services 802.11ac, 274
HSM, see security 802.11ad, 274
HSPA, see cellular data services 802.11b, 274
HTML, see Hypertext Markup Language 802.11b/g/n, 259
HTTP, see Hypertext Transfer Protocol 802.15, 259, 260
hub, 129, 132, 133, 140, 143, 345 802.15.1, 267, 269
Hund’s rule, 6 802.15.4, 257, 267–269
HVAC, see building automation 802.15.x, 257
Hypermedia As The Engine Of Application 802.16, 282
State, 324, 373 802.3, 248
Hypertext Markup Language, 91, 324 802.x, 257
Hypertext Transfer Protocol, 91, 219, 309, 1118, 243
319–325, 328, 425 IETF, see Internet Engineering Task Force
basic authentication, 323 IIoT, see industrial Internet of Things
cookie, 320 IMP, see Interface Message Processor
digest access authentication, 323 impedance, 32
HTTP headers, 319 IMT, see International Mobile Telecommuni-
HTTP/2.0, 320 cations
HTTPS, 400, 401, 405 IMU, see sensor
long polling, 319 In-Circuit Serial Programming, 209
methods, 320, 322, 324, 325 inductance, 30
response codes, 321 induction, 195, 200
token based authentication, 323 induction loop, 114
inductor, 30, 46, 50
I²C, 178, 183, 196, 211, 230, 232, 233, 285, industrial control system, 73, 97, 103, 107,
405 123, 244
SCL, 230, 232 industrial Internet of Things, 93, 104, 123, 341
SDA, 230, 232 Industrie 4.0, 104, 341
IAB, see Internet Architecture Board industry automation, 225
IC, see integrated circuit Industry Standard Architecture, 229
ICANN, see Internet Corporation for Assigned information
Names and Numbers encoding, 221, 223
quantities, 222 International Organization for Standardization,
unclocked encoding, 223 216, 302, 332
information model, 374, 375 International Standardization Organization, 337
information system, 101, 103, 113 International Telecommunication Union, 40,
information theory, 75, 215, 221, 225 216
entropy, 76 Internet Architecture Board, 91
Nyquist-Shannon sampling theorem, 77 Internet Control Message Protocol, 218, 307,
sender/receiver model, 75 310
Ingress Protection, 158, 243 Internet Corporation for Assigned Names and
innovation, 123 Numbers, 91
input, processing, output, 129 Internet Engineering Task Force, 141, 302,
Institute of Electrical and Electronics Engi- 325, 329, 410
neers, 141 Internet Protocol, 218, 311–315
integrated circuit, 58, 63, 69, 81, 261, 264 packet, 218
analog IC, 58 Internet Protocol Suite, 302, 309–318
Atmel AT97SC3204 Trusted Platform Mod- Internet service provider, 91, 139
ule, 413 interoperability, 98, 124, 132–134, 136, 150,
Atmel ATSHA204A crypto chip, 405 240, 255, 267, 268, 270, 292, 371,
CC1101 radio transceiver, 267 374
comparator, 60, 77, 197 semantic interoperability, 374
complementary metal-oxide semiconduc- interrogation signal, 261
tor, 58 interrupt, 74
digital IC, 58 ion, 5, 8, 168
GNSS chipset, 190 IoT protocol, 291, 301, 327, 335, 337, 347, 402
GPS receiver, 189 IP, see Ingress Protection
i8044 BitBus communication, 243 IP (protocol), see Internet Protocol
LM317 voltage regulator, 170 IP address, 256, 295, 311, 314
MAX1674 DC-DC converter, 176 IPv6, 315
micro-electro-mechanical system, 192, 193, IPv6 shorthand notation, 315
196, 207 IPS, see actuator
NE555 timer, 58, 60 IPSec, 314
operational amplifier, 58, 63 IPv4, 311, 313
optocoupler, 59 IPv6, 311, 314
quad-constellation chipset, 189 IPv6 over Low-Power Wireless Personal Area
real-time clock, 188 Networks, 257, 269
transistor-transistor logic, 58 IPv6-to-the-edge, 132, 141, 256, 257
UART chip, 228 IRNSS, see localization technology
voltage regulator, 60 IRQ, see microprocessor
integrated development environment, 212 IRT, see isochronous real-time
Integrated Services Digital Network, 216 ISA, see Industry Standard Architecture
Inter-Integrated Circuit, see I²C ISDN, see Integrated Services Digital Network
Interface Message Processor, 88 ISFET, see transistor
International Electrotechnical Commission, 171, ISM, see radio communication
222 ISO, see International Organization for Stan-
International Mobile Telecommunications, 145 dardization
11898-x, 245
ISO 7498, 216 Global Positioning System, 189, 354, 357
isochronous real-time, 247, 248 iBeacon, 191, 273
isolator, 8, 10 Indian Regional Navigation Satellite Sys-
isotropic radiator, 260 tem, 189
ISP, see Internet service provider indoor, 188, 190
ISR, see microprocessor indoor positioning systems, 188
ITU, see International Telecommunication Union multilateration, 188, 190
Navigational Satellite Timing and Rang-
Jabber, 329, 330 ing, 189
JavaScript Object Notation, 324, 335, 339 NMEA sentence, 189
JFET, see transistor proximity sensing, 273
Joint Test Action Group, see JTAG time-of-flight difference, 190
JSON, see JavaScript Object Notation WGS84, 190
JTAG, 233, 234 logic function, 63, 69
TCK, 234 AND, 65
TDI, 234 NAND, 65
TDO, 234 NOR, 65
TMS, 234 NOT, 65
OR, 65
knowledge pyramid, 372 logic gate, 63, 66
logical false, 223
LAN, see network architecture logical true, 223
latch, 63 LON, see fieldbus
latency, 146, 294, 299 loose coupling, 347
LCD, see actuator LoRa, 276, 284
LDR, see resistors adaptive data rate, 277
least significant bit first, 222 channel bandwidth, 277
LED, see light emitting diode chirp spread spectrum, 277
Lenz’s law, 26 data rate, 276
light emitting diode, 24, 53, 59, 151, 202 device classes, 278
common anode, 178 gateway, 278
common cathode, 178 link budget, 277
red-green-blue LED, 178 LoRa Alliance, 276
LIN, see fieldbus output power, 277
LLC, see Ethernet payload, 277
load balancing, 347 range, 276
local area network, 130, 313, 328, 345 spread factor, 277
localization technology, 188 spread spectrum modulation, 277
BeiDou Navigation Satellite System, 189 LoRaWAN, 132, 144, 145, 266, 276
Eddystone, 191, 273 Lorentz force, 23
fingerprinting, 190 LSBF, see least significant bit first
Galileo, 189 LTE, see cellular data services
geofence, 355
global, 188 M12, 244
Global Navigation Satellite System, 189 M8, 246, 250
global navigation satellite systems, 188 MAC, see media access control
MAC address, see also media access control architecture, 210
machine-to-machine communication, 87, 97, ARM Cortex, 208
129, 146, 327, 328, 342 ATmega 2560, 211
magnetic field, 6, 13, 19, 21–23, 25, 31, 32, 50, ATmega 328P, 208, 209, 211
176, 180, 183, 195, 200 embedded programming, 208
dipole moment, 21 key properties, 207
field constant, 13, 21, 22 PIC18F2520, 210
magnetic field vector, 21 microgrid, 105
magnetic flux line, 20, 22 microprocessor, 54, 58, 73, 198, 210–212, 233,
strength, 21, 38 275, 295, 298
magnetic moment, 8 bus width, 210
magnetic switch, 50 command set, 210
magnetism cryptographic functions, 212, 297
antiferromagnetism, 19 execution speed, 211
ferrimagnetism, 19 input/output capability, 211
ferromagnetism, 19 interrupt, 211, 300
Weiss domain, 19 interrupt request, 300
mains electricity, 130, 163, 169, 174, 176 interrupt service routine, 300
mains power, 159, 162 no operation, 298
MAN, see network architecture reduced instruction set computer, 210
Manchester encoding, 224, 251 speed, 211, 297, 298
manufacturing execution system, 107, 242 microtransaction, 107
material requirements planning, 107 MIFARE, 264, 409
Maxwell’s equations, 13, 19, 23, 25 MIMO, see multiple input/multiple output
media access control, 218, 252, 255, 268, 278, MMS, see cellular data services
307 mobile IP, 314
memory, 58, 63, 69, 70, 73, 81, 131, 207, 208, modulation, 40, 266, 271, 275, 284
210–212, 241, 292, 297, 299, 357, amplitude modulation, 40
405, 410, 416, 424 amplitude-shift keying, 40
erasable programmable read only memory, chirp spread spectrum, 277
73 digital modulation, 40
flash, 73 frequency modulation, 40
MEMS, see integrated circuit frequency-shift keying, 40, 258
MES, see manufacturing execution system on/off keying, 267
message broker, 134, 136, 294, 332, 347, 348, orthogonal frequency-division multiplex-
426, 427 ing, 275
message handling, 238, 241, 267, 292, 347 phase modulation, 40
message routing, 305, 307 phase-shift keying, 40
metal, 8, 50 spread spectrum modulation, 277
transition metals, 8 MOSFET, see transistor
MFA, see security MOST, see fieldbus
microcontroller, 73, 74, 77, 92, 94, 161, 174, MPP, see solar panel
177, 178, 181, 185, 188, 201, 203, MQTT, 309, 325, 331, 333, 336–341, 417, 425
207–212, 215, 222, 225, 228, 229, clean session, 340
231, 237, 242, 267, 285, 291, 297, heartbeats, 341
298, 300, 327, 405 last will and testament, 340
message queuing, 341 reader/writer, 265
MQTT over websockets, 341 Tag Types, 264
MQTT-SN, 342 NIST, see National Institute of Standards and
MQTTS, 401, 405 Technology
persistent session, 340 NMEA, see localization technology
publish / subscribe, 337 non-return to zero level, 223, 224
quality of service, 340 non-return to zero level, inverse, 223, 224, 227,
retained messages, 340 228, 236, 251
topic, 337, 338, 340, 341 NRZI, see non-return to zero level, inverse
wildcards, 337, 340 NRZL, see non-return to zero level
MRP, see material requirements planning NSA, see National Security Agency
MSC, see cellular data services
multiband device, 146 OAuth2, 323, 417
multiple input/multiple output, 274, 275, 284 OFDM, see modulation
mutual TLS, 401 OFDMA, see Orthogonal Frequency-Division
Multiple Access
Nabaztag, 94 Ohm’s law, 3, 17, 30, 159, 161, 200
NarrowBand IoT, 284 on-prem, 136, 345
NAT, see network address translation ontology, 374, 376
National Institute of Standards and Technol- ontological reasoning, 376
ogy, 395 OPC UA, see OPC Unified Architecture
National Security Agency, 398, 412 OPC Unified Architecture, 341, 374
NAVSTAR, see localization technology Open Systems Interconnection Reference Model,
NB-IoT, see NarrowBand IoT 216
near-field communication, see NFC operating system, 212, 299
NetBIOS, see Network Basic Input/Output Sys- real-time operating system, 299
tem operational amplifier, 58
network address translation, 313, 315, 318, orbital, 6, 8, 24
328 Orthogonal Frequency-Division Multiple Ac-
network architecture, 129, 133, 137, 141, 219, cess, 284
238, 240, 256, 269, 274, 294, 297, oscillator, 227
305 OSI, see Open Systems Interconnection Refer-
body area network, 140 ence Model
campus area network, 140 OSI model, 240, 302–304, 306
local area network, 139 application layer, 218, 241, 244, 260, 269,
metropolitan area network, 140 303, 309
personal area network, 140 data encapsulation, 304, 305
wide area network, 139, 307 data link layer, 216, 219, 302, 307
Network Basic Input/Output System, 219 link layer, 245, 246, 260, 274
network protocol, 301 network layer, 218, 219, 303, 307
network switch-off, 146 physical layer, 216, 244–246, 257, 260,
newsgroups, 90 274, 302, 306
NFC, 37, 140, 258, 264, 270 presentation layer, 218, 303, 308
card emulation, 265 session layer, 218, 303, 308
encoding, 265 transport layer, 218, 303, 308
peer-to-peer, 265
packet switching, 88, 283, 284 regulated, 161
PAN, see network architecture switched-mode power supply, 161
parallel communication, 226, 227 unregulated, 161
parity bit, 227 Power-over-Ethernet, 160, 250
partition tolerance, 293 power-over-the-data-cable, 160, 162
pay-per-use, 103, 109, 143, 146 powerline, 237, 254
PCA, see analytics PoweRline Intelligent Metering Evolution, 254
PCB, see printed circuit board Poynting vector, 27, 260
PCI, see Peripheral Component Interconnect PPP, see Point-to-Point Protocol
PCM, see security Precision Time Protocol, 247
PCS, see Ethernet predictive maintenance, 101, 103, 109, 111,
PDC, see cellular data services 115, 116, 120, 370
peer-to-peer, 238, 241, 249, 274, 333, 413 PRIME, see PoweRline Intelligent Metering
Peltier, 46, 61 Evolution
Peltier element, 62, 204 printed circuit board, 37, 156, 208, 229
perfboard, 156 privacy, see security
periodic table, 4 privacy by design, 382, 420
Peripheral Component Interconnect, 229 Processing, 94, 424
phase shift, 159 product design, 131, 155, 156
photon, 6, 25, 53 programmable logic controller, 92, 241, 243,
photovoltaic effect, 62 299
photovoltaic system, 174 proton, 4, 167, 205
pick-and-place machine, 156 prototyping, 155, 156, 208
piezo, 46, 61 provenance, 124, 125
piezo effect, 61, 193 PSK, see modulation, see also security
piezo element, 61, 176, 180, 188 PTP, see Precision Time Protocol
piezo generator, 176 publish / subscribe pattern, 330, 333, 335, 338,
piezoceramic disc, 176 339, 347
piezoceramic film, 176 pulse width modulation, 74, 80, 161, 181, 211
piezoelectric motor, 180 PWM, see pulse width modulation
piezoelectric readout, 192, 195
PIN, see security QoS, see quality of service
PIV, see diode quality of service, 340
PKC, see security quantum computing, 398
platform, 129, 134, 136, 345, 346
PLC, see programmable logic controller rack server, 135
PMS, see security radio communication, 19, 24, 28, 32, 42, 51,
Point-to-Point Protocol, 216, 310 130, 132, 139, 142, 145, 160, 176,
polarization, 26, 32, 38, 262 215, 225, 255, 258, 285, 384, 410
port, 318 carrier signal, 40
ephemeral ports, 318 channel, 258, 259, 266
multiplexing, 318 channel bandwidth, 258, 261, 266, 268
power, 18, 46, 58, 159, 177, 211, 298 channel hopping, 261
power adapter, 160 concurrency, 261
power over Ethernet, 162 crosstalk, 258
power over USB, 162 data rate, 255, 256, 265, 268, 274
duty cycle, 258, 259 representational state transfer, 320, 324, 326,
frequency hopping, 270 328
ISM band, 258, 259, 261, 266, 270, 277 request / response pattern, 319, 324, 325, 328,
link budget, 260 333
maximum power, 258 Request for Comments, 90
output power, 260 resistance, 18, 46, 159
passive backscatter, 262 equivalent series resistance, 49
passive communication, 258, 261, 262 joint resistance, 17
path loss, 260 total resistance, 18
power demand, 255, 256 resistor, 46, 155, 197
radio frequency generators, 50 axial resistor, 48
radio jamming, 414 force-dependent resistor, 48
radio waves, 25, 258 humistor, 48
receiver antenna gain, 260 light-dependent resistors, 46, 48, 53, 196,
signal range, 255, 256, 265 198
SRD band, 258, 259, 261, 266, 277 photoresistor, 196
transmitter antenna gain, 260 power rating, 48
unlicensed band, 258 pull-down resistor, 211
radio frequency identification, see RFID pull-up resistor, 211, 230, 232
radio frequency modules, 266, 267 Steinhart-Hart equation, 198
CC1101, 267 temperature-dependent resistor, 197
current consumption, 267 thermistor, 197
RFM12B, 267 transfer resistor, 53
random numbers, 295, 401 REST, see representational state transfer
Raspberry Pi, 94, 134, 208 Restriction of Hazardous Substances Direc-
RDBMS, see database tive, 158, 196
Real-time Energy Management via Powerlines RFC, see Request for Comments
and Internet, 254 354, 90
real-time energy trading, 105 524, 90
real-time processing, 74, 211, 237, 239, 241, 561, 90
244, 246, 286, 299, 348, 354, 356 791, 90, 313
multitasking, 299 792, 90
received signal strength indication, 190 793, 90, 316
redox reaction, 164, 205 850, 90
standard electrode potential, 165 1036, 90
standard hydrogen electrode, 205 2460, 314
register, 70, 300 6690, 328
relay, 69, 177 7252, 325
normally closed, 183 7390, 327
normally open, 183 7641, 327
solid-state relay, 183 RFID, 31, 93, 107, 111, 140, 262, 263
switch time, 184 active tags, 262
remote access, 101, 103, 120, 150 encoding, 264
REMPLI, see Real-time Energy Management frequency, 263
via Powerlines and Internet passive tags, 262
right-hand rule, 21
RISC, see microprocessor authentication, 323, 384, 387–389, 409,
RJ-45, 162, 246, 250 413–416, 421
roaming SIM, 146 authenticity, 387
RoHS, see Restriction of Hazardous Substances availability, 386, 414
Directive backdoor attack, 412
ROM, see memory best practices, 383
router, 163, 188, 305, 308, 312, 382, 384, 415 big data, 383
RS-232, 223, 228, 229, 238, 286 block cipher, 395
RS-485, 228, 243–245 brute force attack, 394
RSA, see security Caesar cipher, 393
RSSI, see received signal strength indication certificate, 387, 400, 405, 406, 409, 425,
RTC, see integrated circuit 426
RTOS, see operating system Certificate Authority, 400, 409
CIA+, 387, 407
SAGE, see semi-automatic ground environ- cipher suite, 402
ment ciphertext, 393, 394
Sagnac effect, 193 confidentiality, 385, 386, 397, 409–412,
SAML, see security 420
sampling rate, 77 confidentiality, integrity, availability, 385
SASL, see Simple Authentication and Security consent-based access, 390
Layer context-based security, 419
satellite, 146, 188, 190, 286 cryptography, 393, 405
geostationary satellite, 188, 285–287 Datagram Transport Layer Security, 404
Low Earth Orbit constellation, 286 deanonymisation, 411, 421
satellite communication, 132, 146, 285, 286 deanonymization, 383
base station, 286 denial of service, 386, 414
frequencies, 287 device identity provider, 424
Globalstar, 286 device shadow, 419
Iridium, 286 dictionary attack, 424
round-trip connection, 286 Diffie-Hellman, 404
very small aperture terminals, 286 Diffie-Hellman key exchange, 395
SCADA, see supervisory control and data ac- distributed denial of service attack, 381,
quisition 386, 414
scalability, 136, 347 Dynamic Client Registration, 415
Schrödinger equation, 6 elliptic curve cryptography, 398, 410
wave function, 6 encryption, 385–387, 393, 397, 405
SCSI, see Small Computer System Interface Enigma cipher, 393, 394
SDH, see Synchronous Digital Hierarchy ephemeral key, 401, 404
Secure Sockets Layer, 219, 309 federated identity, 388, 414, 421
security, 124, 136, 212, 374 fingerprinting, 411
access control, 387, 390, 417, 418, 420 hardware crypto system, 405
Advanced Encryption Standard, 395 Hardware Security Manager, 412, 425
asymmetric cryptography, 395 Host Identity Protocol, 417
attack surface, 382, 384 identity, 387
attestation, 412, 413, 416, 419, 421, 427 identity broker, 426
identity provider, 388, 389
integrity, 386, 387, 397, 412, 413, 420 social engineering attack, 383
key principles for IoT security, 428 software update, 384
keyspace, 394 SPINS, 416
man-in-the-middle attack, 323, 398, 404 spoofing, 386, 414
metadata, 412, 419, 420, 427 stalking, 383
microcontroller, 409 stream cipher, 395
Mirai botnet, 381, 382, 387, 415 Sybil attack, 413, 416
Misfortune Cookie, 384 symmetric key encryption, 393, 395, 401,
multi-factor authentication, 387, 388 410
NAND mirroring, 407 threat matrix, 407
non-repudiation, 387, 391, 419, 427 threshold cryptographic system, 409
OAuth, 389 token-based identity, 388
OAuth2, 389, 390, 415, 420, 425, 427 trusted network, 416
OAuthing, 414, 423, 425–427 Trusted Platform Module, 413, 426
OpenID Connect, 389 two-factor authentication, 387
personal cloud middleware, 427 User Managed Access, 419
personal data, 383, 385, 391 XML Access Control Markup Language,
personal identity number, 417 390
PKASSO, 419 security by design, 382
plaintext, 393 Seebeck effect, 199
policy-based access, 390, 418 Semantic Sensor Network Ontology, 374, 378
pre-master key, 404 semi-automatic ground environment, 87
pre-shared key, 402, 410 semiconductor electronics, 10, 46, 53, 54, 57,
prime numbers, 398 174
privacy, 124, 125, 136, 382–385, 391, 411, n-p block, 54
412 n-p junction, 55, 62
privacy by design, 382 n-p-n, 10
private key, 396, 400 p-n block, 10, 54, 174
public key, 396, 400, 409 p-n junction, 10, 51, 53, 55, 57
public key cryptography, 395, 396, 398 p-n-p, 10
public key encryption, 400 sensor, 45, 61, 107, 129–131, 161, 187, 207,
radio transmission, 410 225, 240, 243, 256, 269, 275, 291,
related key attack, 395 297, 328, 348, 354, 355, 359, 384,
remote scanning, 384 390, 411
reputation, 416, 419 accelerometer, 192, 194
Rijndael cipher, 395 acidity, 201, 205
role-based access, 390 altitude, 190, 193
RSA public key encryption, 396, 410 ammeter, 200
Security Assertion Markup Language, 388 barcode, 197
security by design, 382 barometric pressure, 192, 193
Security Triad, 385 bioanalytical instruments, 201, 205
shared key encryption, 393 carbon-monoxide, 202
Shibboleth, 388, 425 chemical sensor, 201
side-channel attack, 409 color, 196
signature, 386, 387, 393, 397, 400 current, 191, 200
single-sign on, 388, 389 dew point, 204
flex sensor, 195 standard platinum resistance thermometer,
gyroscope, 192, 194 198
Hall effect sensor, 200 temperature, 191, 197
humidity, 203 tensiometer, 204
hygrometer, 203 thermocouple, 197, 199
inertial measurement unit, 192 tilt switch, 191
kinetic force, 191 translation, 192
lab-on-chip, 201, 205 vibration, 193, 196
laser scanner, 197 serial communication, 226–229, 238, 243, 266,
laser scanning, 196 267, 269, 279, 283
light, 191, 195, 196 Serial Line Internet Protocol, 216, 310
local time, 188 Serial Peripheral Interface, see SPI
localization, 188, 192 SHA, see security
luminosity, 196 shared medium, 249, 283
magnetic field, 200 SHE, see redox reaction
magnetometer, 192, 194 shop floor, 107, 123, 237, 239, 240, 243, 244,
measuring chemical properties, 187 255
measuring physical properties, 187, 191 Short Message Service, 326
microphone, 195 Sigfox, 132, 145, 266, 276
motion, 191 SIM, see subscriber identity module
negative temperature coefficient, 198 Simple Authentication and Security Layer, 330,
nine degrees of freedom, 192 332, 334
optoacoustic measurement, 196 Simple Mail Transfer Protocol, 309, 329
organic compounds, 205, 206 Simple Object Access Protocol, 324, 373
particle quantitation, 202 sleep mode, 163, 298
pH meter, 205 SLIP, see Serial Line Internet Protocol
pollution, 202 Small Computer System Interface, 229
positive temperature coefficient, 198 smart building, 104, 118
potentiometer, 181 smart city, 104
pressure, 193, 195 smart energy, 105, 107
pyrometer, 199 smart grid, 112
qualitative information, 187 smart home, 118, 383, 415
quantitative information, 187 smart lighting, 113
real-time clock, 188 smart metering, 107
relative time, 188 SMC, see electric component
resolution, 187 SMPS, see power adapter
rotary encoder, 181 SMS, see cellular data services
rotation, 192 SMTP, see Simple Mail Transfer Protocol
sensing of fundamental dimensions, 187 SOAP, see Simple Object Access Protocol
sensitivity, 187 SoC, see system-on-chip
sensor drift, 193 social media websites, 92, 101
sensor fusion, 192 SOIC, see electric component
six degrees of freedom, 191 solar panel, 62, 163, 170, 174
smoke, 202 joint voltage, 175
soil humidity, 204 maximum power point, 174
sound, 191, 195 solar cell, 174
Swanson’s law, 175 synchronous communication, 227
solenoid, 183 Synchronous Data Link Control, 243
SONET, see Synchronous Optical Networking Synchronous Digital Hierarchy, 218, 252
SOT, see electric component Synchronous Optical Networking, 218, 252
sound, 196 system-on-chip, 73, 207
spectral analysis, 359
SPI, 178, 183, 196, 211, 229, 230, 285 TCP, 218, 274, 310, 315–317, 326, 330, 334,
clock phase, 230 337, 339, 341
clock polarity, 230 flow control, 308, 317
CS, 230 handshake, 316
DI, 230 retransmission, 308, 316
DO, 230 TCP/IP, 88, 90, 93, 218, 237–239, 246, 305,
MISO, 230 309, 315, 318
MOSI, 230 TCP/IP model, 216, 219
SCLK, 230 internet layer, 310
SS, 230 internet layer, 310
split brain, 293 link layer, 310
SPRT, see sensor transport layer, 310
SQL, see database TDD, see cellular data services
SRD, see radio communication TDMA, see time division multiple access
SSID, see WiFi telemetry, 254
SSL, see Secure Sockets Layer, 401 Telnet, 309
SSO, see security terminator, 138, 237
start bit, 227 TFT, see actuator
statistics, 353 thermoelectric effect, 62
average, 361 Thread, 257, 270
correlation coefficient, 363 time division multiple access, 239, 279, 282
descriptive statistics, 361, 362 TLS, see Transport Layer Security, 332, 334,
Gaussian distribution, 363 337
median, 361 see Transport Layer Security, 401
standard deviation, 361 token passing, 244
statistical distribution, 362 top floor, 107, 244
statistical properties, 361 topology, 137, 141, 239, 294
statistical test, 362 bus, 138, 229, 244, 245
statistical significance, 362 mesh, 138, 140–142, 257, 268
Stokes shift, 24 ring, 138, 230, 238
stop bit, 227 scatternet, 271
stream processing, 348 star, 139, 252, 269, 271, 274, 345
subscriber identity module, 145, 146 TPM, see security
subscription-based service, 117 traffic management, 114, 123
supervisory control and data acquisition, 93, transformer, 159, 160
337 transistor, 10, 46, 53, 63, 155, 170, 197
supply and demand, 120 base, 54
supply chain management, 101, 107, 110, 123 bipolar junction, 10
SVM, see analytics bipolar junction transistor, 53, 69
switch, 163 collector, 54
collector-to-base voltage, 57 User Datagram Protocol, see UDP
complementary metal-oxide semiconduc- user experience, 147
tor, 69 user interface, 147, 149, 150
depletion mode, 57 UTF, see Unicode Transformation Format
emitter, 54 UV radiation, 24, 196
emitter-to-base, 57 UX, see user experience
enhancement mode, 57
field effect transistors, 53 VDE, see Verband der Elektrotechnik
forward bias, 54 Verband der Elektrotechnik, 158
gate voltage, 57 Verilog, 71
ion-sensitive field-effect transistor, 205 Very Simple Control Protocol, 342
metal-oxide-semiconductor, 54, 181 virtual power plant, 105
NPN, 54, 55 virtual private network, 146
phototransistor, 57 VNA, see computer
PNP, 54, 55 voice-activated assistant, 150
reverse bias, 54 voltage drop, 18, 51, 53, 77, 238
reverse voltage, 58 voltage regulator, 58
transistor-transistor logic, 69 VPN, see virtual private network
transition zone, 38 VSAT, see satellite communication
Transmission Control Protocol, see TCP VSCP, see Very Simple Control Protocol
Transport Layer Security, 219, 309, 323, 324,
327, 400–404, 410 wall socket, 160
Trojan Room Coffee Pot, 92 WAN, see network architecture
TTL, see integrated circuit waste management, 111
Turing-completeness, 70, 82 water management, 113
wave-particle duality, 25
UART, see Universal Asynchronous Receiver WCDMA, see cellular data services
Transmitter wearable device, 140, 269, 270
UDP, 218, 246, 310, 325, 328, 404 Web 2.0, 92
UI, see user interface web browser, 91, 320, 321, 341
UL, see Underwriters Laboratory web server, 320, 321
UMA, see security Web Services Description Language, 373
UMTS, see cellular data services, 281, 284 websockets, 319, 325, 341
Underwriters Laboratory, 158 Weightless, 266, 276
Unicode Transformation Format, 221 WEP, see WiFi
unidirectional communication, 137, 230 wide area network, 313
Uniform Resource Identifiers, 324 WiFi, 42, 110, 131, 132, 139, 141, 143, 145,
universal asynchronous receiver transmitter, 176, 188, 190, 255, 257, 259, 260,
228 266, 270, 274, 275, 277, 303, 384,
Universal Serial Bus, 160, 233, 234 411
USB adapter/plug system, 234 ad hoc network, 274
USB OTG, 235 ESP32, 405
Unix sockets, 308 ESP8266, 275, 401, 405, 424
unsupervised learning, 367 frame types, 274
URI, see Uniform Resource Identifiers infrastructure mode, 274
USB, see Universal Serial Bus MAC address, 274, 411
Index 453

peak current, 275


range, 275
router, 274, 275
sender power, 274
Service Set Identifier, 274
spectral efficiency, 275
WiFi Protected Access, 274
Wired Equivalent Privacy, 274
WiMAX, see cellular data services
wireless communication, see radio communi-
cation
wireless LAN, see WiFi
wireless sensor network, 325
WLAN, see WiFi
word size, 69
work (physics), 23
World Wide Web, 319
WPA, see WiFi
WSDL, see Web Services Description Lan-
guage

X10, 254
XACML, see security
XEP, see XMPP
XML, see Extensible Markup Language
XMPP, 329–331
Efficient XML Interchange, 331
presence, 331
roster, 329
XML stanza, 329
XMPP Extension Protocols, 330
XMPP Extensions, 329
XMPP Standards Foundation, 329, 330
XSF, see XMPP

Z-Wave, 257, 265


ZigBee, 132, 141, 144, 145, 257, 260, 265–
268, 274, 275
data frame, 268
link budget, 267, 269
payload, 268
peak current, 269
self-healing mesh, 268
Xbee module, 269
ZigBee Alliance, 268
ZigBee IP, 269
Recent Titles in the Artech House
Mobile Communications Series
William Webb, Series Editor

3G CDMA2000 Wireless System Engineering, Samuel C. Yang


3G Multimedia Network Services, Accounting, and User Profiles,
Freddy Ghys, Marcel Mampaey, Michel Smouts, and
Arto Vaaraniemi
5G Spectrum and Standards, Geoff Varrall
802.11 WLANs and IP Networking: Security, QoS, and Mobility,
Anand R. Prasad and Neeli R. Prasad
Achieving Interoperability in Critical IT and Communications
Systems, Robert I. Desourdis, Peter J. Rosamilia,
Christopher P. Jacobson, James E. Sinclair, and
James R. McClure
Advances in 3G Enhanced Technologies for Wireless
Communications, Jiangzhou Wang and Tung-Sang Ng, editors
Advances in Mobile Information Systems, John Walker, editor

Advances in Mobile Radio Access Networks, Y. Jay Guo


Applied Satellite Navigation Using GPS, GALILEO, and
Augmentation Systems, Ramjee Prasad and Marina Ruggieri
Artificial Intelligence in Wireless Communications,
Thomas W. Rondeau and Charles W. Bostian
Broadband Wireless Access and Local Network: Mobile WiMax
and WiFi, Byeong Gi Lee and Sunghyun Choi

CDMA for Wireless Personal Communications, Ramjee Prasad


CDMA Mobile Radio Design, John B. Groe and Lawrence E. Larson
CDMA RF System Engineering, Samuel C. Yang
CDMA Systems Capacity Engineering, Kiseon Kim and Insoo Koo

CDMA Systems Engineering Handbook, Jhong S. Lee and


Leonard E. Miller
Cell Planning for Wireless Communications, Manuel F. Cátedra
and Jesús Pérez-Arriaga

Cellular Communications: Worldwide Market Development,


Garry A. Garrard
Cellular Mobile Systems Engineering, Saleh Faruque
Cognitive Radio Interoperability through Waveform
Reconfiguration, Leszek Lechowicz and Mieczyslaw M. Kokar
Cognitive Radio Techniques: Spectrum Sensing, Interference
Mitigation, and Localization, Kandeepan Sithamparanathan
and Andrea Giorgetti
The Complete Wireless Communications Professional: A Guide for
Engineers and Managers, William Webb
Digital Communication Systems Engineering with
Software-Defined Radio, Di Pu and Alexander M. Wyglinski
EDGE for Mobile Internet, Emmanuel Seurre, Patrick Savelli, and
Pierre-Jean Pietri
Emerging Public Safety Wireless Communication Systems,
Robert I. Desourdis, Jr., et al.
The Future of Wireless Communications, William Webb
Geographic Information Systems Demystified, Stephen R. Galati
Geospatial Computing in Mobile Devices, Ruizhi Chen and
Robert Guinness
GPRS for Mobile Internet, Emmanuel Seurre, Patrick Savelli, and
Pierre-Jean Pietri

GPRS: Gateway to Third Generation Mobile Networks,


Gunnar Heine and Holger Sagkob
GSM and Personal Communications Handbook,
Siegmund M. Redl, Matthias K. Weber, and
Malcolm W. Oliphant
GSM Networks: Protocols, Terminology, and Implementation,
Gunnar Heine
GSM System Engineering, Asha Mehrotra
Handbook of Land-Mobile Radio System Coverage, Garry C. Hess
Handbook of Mobile Radio Networks, Sami Tabbane
High-Speed Wireless ATM and LANs, Benny Bing

Inside Bluetooth Low Energy, Second Edition, Naresh Gupta


Interference Analysis and Reduction for Wireless Systems,
Peter Stavroulakis
Internet Technologies for Fixed and Mobile Networks,
Toni Janevski
Introduction to 3G Mobile Communications, Second Edition,
Juha Korhonen
Introduction to 4G Mobile Communications, Juha Korhonen

Introduction to Communication Systems Simulation, Maurice


Schiff
Introduction to Digital Professional Mobile Radio,
Hans-Peter A. Ketterling
Introduction to GPS: The Global Positioning System,
Ahmed El-Rabbany
An Introduction to GSM, Siegmund M. Redl, Matthias K. Weber,
and Malcolm W. Oliphant
Introduction to Mobile Communications Engineering,
José M. Hernando and F. Pérez-Fontán
Introduction to Radio Propagation for Fixed and Mobile
Communications, John Doble
Introduction to Wireless Local Loop: Broadband and Narrowband
Systems, Second Edition, William Webb
IS-136 TDMA Technology, Economics, and Services,
Lawrence Harte, Adrian Smith, and Charles A. Jacobs
Location Management and Routing in Mobile Wireless Networks,
Amitava Mukherjee, Somprakash Bandyopadhyay, and
Debashis Saha
LTE Air Interface Protocols, Mohammad T. Kawser

Metro Ethernet Services for LTE Backhaul, Roman Krzanowski


Mobile Data Communications Systems, Peter Wong and
David Britland
Mobile IP Technology for M-Business, Mark Norris
Mobile Satellite Communications, Shingo Ohmori,
Hiromitsu Wakana, and Seiichiro Kawase
Mobile Telecommunications Standards: GSM, UMTS, TETRA, and
ERMES, Rudi Bekkers

Mobile-to-Mobile Wireless Channels, Alenka Zajić


Mobile Telecommunications: Standards, Regulation, and
Applications, Rudi Bekkers and Jan Smits
Multiantenna Digital Radio Transmission, Massimiliano Martone

Multiantenna Wireless Communications Systems,


Sergio Barbarossa
Multi-Gigabit Microwave and Millimeter-Wave Wireless
Communications, Jonathan Wells

Multipath Phenomena in Cellular Networks, Nathan Blaunstein


and Jørgen Bach Andersen
Multiuser Detection in CDMA Mobile Terminals, Piero Castoldi
OFDMA for Broadband Wireless Access, Slawomir Pietrzyk

Personal Wireless Communication with DECT and PWT,


John Phillips and Gerard MacNamee
Practical Wireless Data Modem Design, Jonathon Y. C. Cheah

Prime Codes with Applications to CDMA Optical and Wireless


Networks, Guu-Chang Yang and Wing C. Kwong
Quantitative Analysis of Cognitive Radio and Network
Performance, Preston Marshall

QoS in Integrated 3G Networks, Robert Lloyd-Evans


Radio Engineering for Wireless Communication and Sensor
Applications, Antti V. Räisänen and Arto Lehto
Radio Propagation in Cellular Networks, Nathan Blaunstein

Radio Resource Management for Wireless Networks, Jens Zander


and Seong-Lyun Kim
Radiowave Propagation and Antennas for Personal
Communications, Third Edition, Kazimierz Siwiak and
Yasaman Bahreini
RDS: The Radio Data System, Dietmar Kopitz and Bev Marks
Resource Allocation in Hierarchical Cellular Systems,
Lauro Ortigoza-Guerrero and A. Hamid Aghvami
RF and Baseband Techniques for Software-Defined Radio,
Peter B. Kenington
RF and Microwave Circuit Design for Wireless Communications,
Lawrence E. Larson, editor
RF Positioning: Fundamentals, Applications, and Tools,
Rafael Saraiva Campos and Lisandro Lovisolo
Sample Rate Conversion in Software Configurable Radios,
Tim Hentschel
Signal Processing Applications in CDMA Communications, Hui Liu
Signal Processing for RF Circuit Impairment Mitigation,
Xinping Huang, Zhiwen Zhu, and Henry Leung
Smart Antenna Engineering, Ahmed El Zooghby
Software Defined Radio for 3G, Paul Burns
Spread Spectrum CDMA Systems for Wireless Communications,
Savo G. Glisic and Branka Vucetic
Technical Foundations of the Internet of Things, Boris Adryan,
Dominik Obermaier, and Paul Fremantle
Technologies and Systems for Access and Transport Networks,
Jan A. Audestad
Third-Generation and Wideband HF Radio Communications,
Eric E. Johnson, Eric Koski, William N. Furman, Mark Jorgenson,
and John Nieto
Third Generation Wireless Systems, Volume 1: Post-Shannon
Signal Architectures, George M. Calhoun
Traffic Analysis and Design of Wireless IP Networks, Toni Janevski

Transmission Systems Design Handbook for Wireless Networks,


Harvey Lehpamer
UMTS and Mobile Computing, Alexander Joseph Huber and
Josef Franz Huber

Understanding Cellular Radio, William Webb


Understanding Digital PCS: The TDMA Standard,
Cameron Kelly Coursey
Understanding GPS: Principles and Applications, Second Edition,
Elliott D. Kaplan and Christopher J. Hegarty, editors
Understanding WAP: Wireless Applications, Devices, and Services,
Marcel van der Heijden and Marcus Taylor, editors

Universal Wireless Personal Communications, Ramjee Prasad


WCDMA: Towards IP Mobility and Mobile Internet, Tero Ojanperä
and Ramjee Prasad, editors
Wireless Communications in Developing Countries: Cellular and
Satellite Systems, Rachael E. Schwartz
Wireless Communications Evolution to 3G and Beyond,
Saad Z. Asif
Wireless Intelligent Networking, Gerry Christensen,
Paul G. Florack, and Robert Duncan
Wireless LAN Standards and Applications, Asunción Santamaría
and Francisco J. López-Hernández, editors
Wireless Sensor and Ad Hoc Networks Under Diversified Network
Scenarios, Subir Kumar Sarkar
Wireless Technician’s Handbook, Second Edition, Andrew Miceli

For further information on these and other Artech House titles, including
previously considered out-of-print books now available through our
In-Print-Forever® (IPF®) program, contact:

Artech House
685 Canton Street
Norwood, MA 02062
Phone: 781-769-9750
Fax: 781-769-6334
e-mail: artech@artechhouse.com

Artech House
16 Sussex Street
London SW1V 4RW UK
Phone: +44 (0)20 7596-8750
Fax: +44 (0)20 7630-0166
e-mail: artech-uk@artechhouse.com

Find us on the World Wide Web at: www.artechhouse.com
