Safety, Availability & Security
242 Software for Automation
Safety Integrity
Special care must be taken when using software in applications with
safety implications. The complexity of workstation and server oper-
ating systems, such as Windows, makes it impossible to prove them
functionally safe. In most applications the basic automation system
is adequate to handle the safety aspects, but some tasks have a few
critical functions that require safety with even greater integrity.
Safety-related functions may include, for example, machinery that
cuts and stamps, and processing dangerous chemicals.
Safety Related
Many automation applications also have safety-related aspects. These
may include machine safety and process safety in factories, or fire
fighting in buildings. The safety-related aspects are a very small
part of the overall automation system, first because most processes
and buildings are designed to be as inherently safe as reasonably
possible, and second because most hazards are prevented by the
controls and alarms in the basic automation system. Remaining
Chapter 10 – Safety, Availability & Security 243
Communication
Do not make OPC part of the shutdown chain. However, OPC can
be used to display SIS status to operators in the operator
visualization software. Consider what action the logic solver needs
to take if operators lose visibility, for example because the
communication link is severed or because software such as the OPC
server or its clients stops functioning. A handshake mechanism
between the operator visualization software and the logic solver may
be required to detect such situations. In special cases where writing
to parameters in the SIS is permitted and writes have been enabled,
perform a read-back check to make sure that the value has changed
and that only the intended variable has changed. When the operator
makes a change, the value is typically read back while the operator
waits to see that it was indeed accepted. However, in the case of
scripts and other automated functions, special provisions are
required to read back the value and ascertain that it indeed changed.
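The read-back check for scripts and other automated functions can be sketched as follows. This is only an illustration: the `read_all` and `write` callables and the tag names are hypothetical stand-ins for whatever interface the logic solver's OPC server actually exposes, and the safety manual of the logic solver determines what is really permitted.

```python
from typing import Callable, Dict


def write_with_readback(
    read_all: Callable[[], Dict[str, float]],
    write: Callable[[str, float], None],
    tag: str,
    value: float,
    tolerance: float = 1e-6,
) -> bool:
    """Write one parameter, then confirm that it changed and nothing else did."""
    before = read_all()   # snapshot of the parameters before the write
    write(tag, value)
    after = read_all()    # read back after the write

    # The intended variable must now hold the requested value.
    if abs(after[tag] - value) > tolerance:
        return False
    # No other variable may have changed as a side effect.
    return all(after[name] == before[name] for name in before if name != tag)


# Simulated parameter store standing in for the SIS (illustration only).
store = {"TRIP_LIMIT_A": 80.0, "TRIP_LIMIT_B": 95.0}
ok = write_with_readback(lambda: dict(store), store.__setitem__, "TRIP_LIMIT_A", 75.0)
```

A script would treat a `False` result as a failed write and raise an alarm rather than continue.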
Operator Display
There are several ways to improve the safety of the system by
employing system features and designing operator graphics to be
user friendly. Operators must be able to see whether a trip in the
SIS has actually occurred, receive alarm notifications, and get an
overview of the overrides and the health of system components.
Alarms
Operating a fully automated plant is boring, as very little human
intervention is required. This results in lowered alertness and vigi-
lance among operators. Refer to the alarm management discussion
in Chapter 4.
Bypass/Override
Bypass or override refers to the ability to ignore the actual signal
(e.g., from a sensor or the logic solver) and instead use a manually
entered value or state. It should not be possible to enable or
disable overrides in the SIS through OPC or DDE. Overrides should
only be possible from the SIS engineering console. However, operators
at consoles of the basic automation system must be alerted to any
override in the SIS. Overrides shall be indicated on the screens, and
operators should also be notified by means of alarms. It is also
important that, during override, it is possible to monitor the actual
value from the field so the operator can know the true state of the
process and machines.
Write
It should not be possible to modify the operation of the SIS from
the basic automation system’s operator console through OPC or
DDE. SIS logic changes should only be possible from the SIS engi-
neering console. Possible exceptions to this rule are certain batch
process applications that must permit certain trip limits to be set
according to product, batch size, and so on. Therefore, writing to a
SIS through OPC is extremely rare. If writes are permitted at all, a
mechanism must be configured to first write a parameter to
unlock the logic solver before the actual parameter of interest is
written, and the logic solver must subsequently be write-locked
again. Refer to the safety manual and user manual of the logic
solver to determine what is permitted and how it can be done. In
Feedback
Operators need feedback to see that a tripped valve has actually
moved. Therefore feedback from valve instrumentation should
show actual valve position to alert operators if the valve does not
fully close. It may be a good idea to use intelligent valves, possibly
networked with a safety fieldbus in conjunction with OPC to
bring feedback information such as actual position, health, mode,
and override status all the way to the operator. Check if the OPC
server for the logic solver supports feedback for such information.
Status
The operator visualization software must show if a SIS subsystem,
component, or network has failed or is degraded. Check if the
OPC server for the logic solver supports feedback for such infor-
mation. It may be a good idea to use field instruments networked
with a safety fieldbus in order to access status of intelligent field
devices such as transmitters and valves. If communication
networks in the SIS have failed or are degraded, this must be
indicated by the status in the logic solver OPC server. The OPC
client may use the OPC status to show an “invalid” symbol in a
specific color instead of the value, and to issue an alarm.
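As a rough sketch of how an OPC client might turn the status into a display decision, the example below assumes the classic OPC Data Access quality word, in which the top two bits of the low byte carry the major quality (good, uncertain, or bad). The placeholder symbol, the colors, and the `render` function are illustrative assumptions, not taken from any particular product.

```python
from typing import Tuple

QUALITY_MASK = 0xC0                 # top two bits of the low byte: major quality
GOOD, UNCERTAIN, BAD = 0xC0, 0x40, 0x00


def render(value: float, quality: int) -> Tuple[str, str, bool]:
    """Return (display text, color, raise_alarm) for one SIS value."""
    major = quality & QUALITY_MASK
    if major == GOOD:
        return (f"{value:.1f}", "green", False)
    if major == UNCERTAIN:
        # Show the value, but flag it and alert the operator.
        return (f"{value:.1f}?", "yellow", True)
    # Bad (or undefined) quality: hide the value behind an invalid symbol.
    return ("####", "magenta", True)
```

For example, a quality of `0x18` (bad, communication failure) would replace the value with the invalid symbol and raise an alarm.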
Access
Access should be limited to authorized persons as a means of
protection. The software must therefore have security built in,
preferably integrated with the operating system. Authentication and
authorization are explained further in the software security
subsection.
Others
There are a few other points of consideration for displaying SIS
data on the consoles of the basic automation system. First, make
sure the update time is fast enough even during emergency condi-
Availability
Open systems based on OPC interfaces and other open technolo-
gies must meet the same availability requirements as proprietary
systems. If automation systems stop functioning, the result could
be production downtime, people stuck in lifts, etc. In other words,
automation system failures mean loss of revenue, reduced produc-
tivity, frustration, and so on. Failure of the automation system may
also result in lost data, such as production records, which may
have regulatory implications. Hot-standby redundancy may there-
fore be implemented for networking, OPC servers, and disk drives.
Other measures include backup power, industrially hardened or
fault tolerant computers, as well as using solid-state hard disks.
Network Redundancy
To provide full benefits, automation software at the automation
system level requires uninterrupted communication with the
underlying automation hardware. Ring topology or redundancy
can be used for control networking. Redundancy at the execution
and business levels is rare, but this may change in the future as
supply chain management, etc. requires a continuous flow of data
through all levels. Using Web services also requires high connec-
tion availability.
Dual Ethernet
Another simple approach to redundancy in control networking is
to use automation hardware such as controllers that have dual
network ports, while computers only use single network ports.
Most automation hardware such as controllers is easily available
with two network ports. By forming two independent networks
using two LAN switches, with individual sets of OPC servers and
Full redundancy means any one component can fail and the
automation system is still able to operate and be supervised. Full
LAN redundancy requires the controller networking to have an
application layer protocol that provides this redundancy scheme.
Not all Ethernet-based industrial networking protocols have this.
An example of an industrial application layer protocol that
supports full LAN redundancy is FOUNDATION™ Fieldbus HSE.
Full LAN redundancy can also be combined with ring topology
forming two rings. For more information on redundancy, refer to
“Fieldbuses for Process Control: Engineering, Operation and
Maintenance”.1
Make sure to use OPC servers that support two Network Interface Cards
(NIC).
RAID Drives
Servers require high availability for the data. High availability for
data storage can be achieved using fault tolerant hard disk archi-
tectures such as RAID (Redundant Array of Independent Disks).
Using RAID controllers for disks on the servers, fault tolerance
can be achieved using regular hard disks. There are many
different RAID architectures available: 0, 1, 2, 3, 4, 5, 6, 10, 50, and
0+1. However, the most common architectures are RAID 1 and
RAID 5. RAID 0 is not fault tolerant. RAID 1 uses disk mirroring,
requiring a one-for-one duplication of disks. RAID 5 uses a parity
scheme requiring one additional disk, but requires a minimum of
three disks. Figure 10-7 illustrates that RAID 1 mirroring needs
eight disks to hold four disks of data while RAID 5 parity only
needs five disks to hold four disks of data.
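The disk counts quoted above follow from simple arithmetic, sketched here. The function name is an illustrative assumption; only RAID 1 and RAID 5 are covered.

```python
def disks_required(data_disks: int, level: int) -> int:
    """Total disks needed to hold `data_disks` worth of data."""
    if data_disks < 1:
        raise ValueError("at least one disk of data is required")
    if level == 1:
        # Mirroring: one-for-one duplication of every data disk.
        return 2 * data_disks
    if level == 5:
        # Parity: one additional disk, with a minimum of three disks in total.
        return max(data_disks + 1, 3)
    raise ValueError("only RAID 1 and RAID 5 are covered in this sketch")


raid1 = disks_required(4, level=1)
raid5 = disks_required(4, level=5)
```

With four disks of data, `raid1` is 8 and `raid5` is 5, matching the figures in the text.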
RAID 1 (Mirroring)
A RAID 1 controller implements mirroring by writing the infor-
mation to two drives in a mirrored pair. If a drive fails, the
mirrored drive still contains all the data. Once the failed drive is
replaced, it is immediately rebuilt using data from the mirror
drive. Most RAID 1 solutions are implemented in hardware,
permitting this to be done online. The mirroring scheme is expen-
sive, as it requires twice the number of disks.
RAID 5 (Parity)
A RAID 5 controller implements parity by computing recovery
data and distributing it across the disks. If a drive fails, its
data can be regenerated from the remaining data and the parity
stored on the other drives. The parity scheme costs less because it
requires only one additional disk. However, RAID 5 writes data more
slowly and is not as fault tolerant as RAID 1.
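The parity idea can be illustrated with a bytewise XOR, which is the operation RAID 5 parity is based on. This is a simplified sketch for a single stripe: a real controller rotates the parity blocks across all disks, and the block contents here are made up.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: four data blocks plus one parity block.
data = [b"PUMP", b"TANK", b"FLOW", b"TEMP"]
parity = xor_blocks(data)

# Simulate losing the third disk, then rebuild it from the survivors.
survivors = data[:2] + data[3:] + [parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, combining the surviving data blocks with the parity block yields the missing block again.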
Power
Uninterrupted operation of software requires continuous power
supply to the computers. Several measures can be taken to ensure
clean and uninterrupted power.
UPS
An uninterruptible power supply (UPS) sits between the line
power outlet and the computer. It includes a battery and an
inverter that can be dimensioned to power a computer for about
15 minutes if line power is lost. The UPS also contains power
conditioning that carries the server through short power
disturbances such as brownouts, sags, and over-voltage conditions.
While the server operates on power from the UPS, data can be saved
or even backed up before the server is shut down in an orderly
manner. A sophisticated UPS can be linked to a communication port
of the server to provide unattended file saving followed by
automatic shutdown in the event of a sustained power outage.
Power Conditioning
Surges and harmonics present on the utility power line caused by
nearby lightning strikes and switching of heavy electrical loads –
both common in industrial environments – may damage
computers and other equipment. Transient voltage surge suppres-
sors can be installed for each computer for line conditioning.
Industrial Computers
Industrial grade computers are hardened to survive a much
harsher environment and may even be used on a factory floor.
Industrial computers are typically installed in a 19-inch rack in a
control panel or cabinet. Carefully designed ventilation creates a
slight excess pressure inside the chassis. This, together with filters,
prevents dust from entering and extends the operating tempera-
ture range. A rugged chassis with holders for plug-in cards makes
industrial computers less sensitive to shocks and vibrations. Solid-
Cyber Security
In the past, most automation systems were isolated islands not
connected to other networks or systems outside the control room, or
in some cases connected to the PIMS and LIMS, which were also
isolated islands. From a security point of view, this is far easier
to manage than a system permanently connected to other networks,
To make the task as difficult for the hacker as possible, limit the
distribution of system documentation to those who really need it.
Network Security
Because security measures such as a firewall between the automation
network and the information network make integration more difficult
and costly, it is tempting not to use any form of security. However,
security measures are necessary to protect the automation network
from problems that may be tolerable in the execution and business
environment, but not in automation. The risks associated with
connecting the automation system to the outside world include a
hacker destroying valuable data or falsifying operator commands,
which could jeopardize assets and possibly even the environment and
human life. Another risk is denial-of-service attacks that could
make system operation unacceptably slow or even force a stop.
for higher speed, and application proxy firewall for security. The
features that routers, firewalls, and proxies support vary consider-
ably. All of them, however, have the basic functionality of disabling
response to the PING command. Typical functionality is explained
below, although from time-to-time you may find that the function-
ality is available in another device. In this book the following
definitions are used, in order of increasing sophistication:
The primary task of a router is not security, but the router is a first
line of cyber security defense. Even simple routers and some LAN
switches have basic security features built in, such as discarding
PING and performing packet filtering (see Figure 10-12). These
are simple measures that make it much more difficult for the
hacker to locate and access the network and may be sufficient to
make the casual hacker choose another, easier target instead.
However, they are not sufficient to deter a more determined hacker.
Packet Filtering
Packet filtering is communication screening based only on IP
address and port number. Each port number essentially corresponds
to one application layer protocol – for example, HTTP uses
port 80. Enabling and disabling ranges of addresses and port
numbers is done using simple fill-in-the-blank fields (see Figure
10-13). Therefore, using packet filtering, it is possible to enable
and disable access from different computers and prevent using
specific protocols, such as FTP and Telnet, etc. Typically, all IP
addresses except those that really need access are disabled. Simi-
larly, all ports are blocked, except for those few protocols neces-
sary. For example, only a few computers on the Intranet in a
Block FTP and Telnet if they are not needed. If FTP is used, do not
permit anonymous login.
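The default-deny approach to packet filtering described above can be sketched in a few lines. The address range and the port set are hypothetical examples; in a real router or firewall they are entered in its configuration fields rather than in code.

```python
from ipaddress import ip_address, ip_network

# Hypothetical policy: only a handful of Intranet computers, only Web ports.
ALLOWED_SOURCES = [ip_network("192.168.10.0/29")]
ALLOWED_PORTS = {80, 443}


def permit(source_ip: str, destination_port: int) -> bool:
    """Everything is blocked unless both address and port are enabled."""
    address = ip_address(source_ip)
    source_ok = any(address in network for network in ALLOWED_SOURCES)
    return source_ok and destination_port in ALLOWED_PORTS
```

With this policy an FTP attempt on port 21 or a Telnet attempt on port 23 is rejected even from an enabled address, because those ports were never opened.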
The packet filtering scheme would not work in cases where the
same port or same set of ports is used to perform several diverse
functions – for instance, whenever one protocol is used for many
different things. When a single protocol is used for many
functions, it becomes impossible to make fine distinctions: it is
necessary to permit either all or nothing. A case in point is the
powerful DCOM protocol, which Windows uses for a wide range of
diverse functions. OPC uses DCOM, but permitting DCOM would be
a security risk because many other functions also use DCOM and
opening up a range of ports for DCOM may permit hackers to
perform a whole range of tricks. This is why DCOM is not suit-
able for communication between the automation system and the
rest of the enterprise.
Access List
An access list is a filtering mechanism found only in advanced
routers and in firewalls. An access list is configured as a list of
statements used to build filters that permit or deny inbound and
outbound traffic between the internal and external networks. It is
possible to have different access criteria for each Ethernet port on
the router or firewall. Filtering is primarily based on IP
source addresses but can also include protocol and destination
address. Options include address wildcards, checking whether the
message is a response to a TCP request, priority and type,
discarding PING requests, and suppressing error messages, as well as
logging and remarks. Thus, access list configuration is more
programmatic in
nature and therefore more difficult than simple packet filtering.
However, it is also more flexible and powerful.
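The list-of-statements nature of an access list can be sketched as follows. The sketch assumes that statements are evaluated top-down, that the first match decides, and that unmatched traffic is denied, which is a common convention but not something the text specifies. The rule contents and addresses are hypothetical, and CIDR prefixes stand in for address wildcards.

```python
from ipaddress import ip_address, ip_network
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    action: str                        # "permit" or "deny"
    source: str                        # source network in CIDR form
    protocol: Optional[str] = None     # None matches any protocol
    destination: str = "0.0.0.0/0"     # default matches any destination
    remark: str = ""


def evaluate(rules: List[Rule], src: str, dst: str, protocol: str) -> str:
    """Return the action of the first matching statement, else deny."""
    for rule in rules:
        if (ip_address(src) in ip_network(rule.source)
                and ip_address(dst) in ip_network(rule.destination)
                and rule.protocol in (None, protocol)):
            return rule.action
    return "deny"                      # assumed implicit deny at the end


rules = [
    Rule("deny", "0.0.0.0/0", "icmp", remark="discard PING requests"),
    Rule("permit", "10.1.0.0/16", "tcp", "172.16.5.10/32", "execution level to portal"),
]
```

Because order matters, moving the `permit` statement above the `deny` statement could change the outcome, which is part of why access lists are harder to configure than simple packet filters.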
Application Proxy
A proxy server is intermediary software somewhere between the
clients and the ultimate source servers. A proxy server includes
features such as an application filter firewall for security and
cache providing accelerated Web access. Proxy servers can work
in both directions. By blocking outgoing messages, the proxy can
prevent internal users from accessing certain Web sites or servers
– preventing, for example, Web browsing or file download. By
blocking incoming messages, the proxy server prevents many
forms of attack.
DMZ
Connecting the automation system to the Intranet and Internet
requires tight security and it should only be done if really neces-
sary. If connected, consider permitting only read access. If external
The basic rule for establishing a DMZ is that ports opened in the
two firewalls are mutually exclusive so a single protocol cannot
go right through both. For example, if the firewall facing the
external network has port 80 open, the firewall facing the internal
network should not have port 80 open. The Web portal must
access data on the internal private network through another port,
such as 30080, or use HTTPS on port 443.
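The mutual-exclusion rule lends itself to a simple configuration check, sketched below. The function name is an illustrative assumption and the port sets are taken from the example in the text.

```python
from typing import Set


def dmz_violations(external_open: Set[int], internal_open: Set[int]) -> Set[int]:
    """Ports open in both firewalls, which would let one protocol pass straight through."""
    return external_open & internal_open


good = dmz_violations({80}, {30080})     # portal reaches internal data on another port
bad = dmz_violations({80, 443}, {80})    # port 80 is open on both sides
```

An empty result means the rule holds; any port in the result is open in both firewalls and should be closed on one side.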
Make sure to use well-proven Web server software, and keep the
software current by applying security patches as soon as possible
after they are released.
Figure 10-16. DMZ between Automation and Execution for Inner MES Portal
Figure 10-17. DMZ between Business LAN and the Internet for Outer ERP Portal
VPN
Encryption is not required on the automation network, since it is
securely contained within the plant perimeter, factory, or building.
However, data transmitted over a public network such as the
Internet should be encrypted to ensure the data remains confiden-
tial and to prevent any attempt to tamper with the data. Virtual
Private Network (VPN) is a means to provide secure access to a
private network for a limited number of well-known clients or
networks. It permits one private network or client to securely
connect to another private network across a public network such
as the Internet using encrypted communication. VPN also controls
access by requiring authentication such as user name, password,
and, optionally, an additional factor such as a smart card.
Authentication and authorization are explained in the software
security section.
There are a few different protocols for VPN, but the most common
are Point-to-Point Tunneling Protocol (PPTP) and IP Security
Protocol (IPSec). Single client computers typically use PPTP when
dialing into a Remote Access Server (RAS), whereas IPSec is used
when connecting LAN to LAN through routers. Although VPN provides
good security, a drawback is that client software must be configured
on the client computer when using the PPTP mode, and connecting LAN
to LAN requires network administrator skills. VPN cannot be used ad
hoc from any computer. This limits the flexibility for customers to
gain access when required. It may also be a hindrance if access is
urgently required in a crisis.
An air gap application firewall is split into one external server and
one internal server with a switched buffer in between. The switch
is connected to only one of the servers at any one time, never to
both at the same time (Figure 10-18). Technically, there is never any direct
connection from the external through to the internal network. The
firewall connects to the external network through the external
server. External messages are terminated at the external server,
which removes the message header from the message. The switch
transfers only the application data, taking away the ability to
address specific resources on the internal network. Filtered appli-
cation data is sent to a designated Web server on the internal
network by the internal server. The internal server generates a
totally new IP session to pass only the application data to the
internal Web server.
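The flow through an air gap firewall can be modeled in a few lines. This is a toy model under stated assumptions: the class and function names, the message layout, and the server address are all hypothetical, and the application-level filtering step is omitted.

```python
from typing import Optional

DESIGNATED_SERVER = "10.0.0.20"   # hypothetical internal Web server


class SwitchedBuffer:
    """Buffer that is attached to exactly one side at any one time."""

    def __init__(self) -> None:
        self.side = "external"
        self._payload: Optional[bytes] = None

    def put(self, payload: bytes, side: str) -> None:
        if side != self.side:
            raise RuntimeError("buffer is not connected to this side")
        self._payload = payload

    def flip(self) -> None:
        self.side = "internal" if self.side == "external" else "external"

    def take(self, side: str) -> Optional[bytes]:
        if side != self.side:
            raise RuntimeError("buffer is not connected to this side")
        payload, self._payload = self._payload, None
        return payload


def relay(message: dict, buffer: SwitchedBuffer) -> dict:
    # External server: terminate the message and discard its header.
    buffer.put(message["data"], "external")
    buffer.flip()                      # disconnect external, connect internal
    data = buffer.take("internal")
    buffer.flip()                      # switch back to the external side
    # Internal server: new session addressed only to the designated server.
    return {"header": {"dst": DESIGNATED_SERVER}, "data": data}
```

Whatever destination the external sender puts in the header never crosses the buffer, so the sender cannot address other resources on the internal network.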
Other Features
Several other features in firewalls and some routers are available
and additional measures can be taken to improve network security.
Alert
Firewalls can alert administrators to a possible attack when DoS
or intrusion is discovered. This can be done through Email or
pager. This permits the administrator to act.
Logging
Firewalls can log activity to create an audit trail. This includes
logging external DoS and intrusion attacks as well as logging
legitimate access made from the internal network, such as sites
visited. The network administrator configures triggers in the fire-
wall that log events as they occur. Triggers are selected and
configured based on specific site needs. The log generates an
audit trail that can alert a network administrator to an ongoing
attack and may later be helpful to analyze a successful or unsuc-
cessful attack.
Anti-Virus Scanning
A careless user can inadvertently transfer viruses and other
malicious software to the automation system by Email, Web browsing,
or file transfer. Therefore, Email, Web browsing, and file
Physical Access
Routers and firewalls provide the network security, and LAN
switches are the basis for the network infrastructure. If these
devices are switched off or damaged, the entire network is
brought down. Similarly, if routers or firewalls are tampered with
or bypassed, security is compromised. Therefore, it is also neces-
sary to consider the physical security of the network hardware.
Keep firewalls, routers, switches, etc. locked in a closet. Similarly
proxies and other servers should also be locked up.
SNMP Considerations
Managed LAN switches and many Ethernet devices can be moni-
tored and configured using the Simple Network Management
Protocol (SNMP). The first versions of SNMP can pose a security
risk because they do not have encrypted authentication, making it
possible to sniff out the “community string” used as a password.
After gaining access, a hacker can use SNMP to discover network
properties and even reconfigure the network. Some network
devices that use early versions of SNMP, therefore, do not imple-
ment the configuration commands. Version 3 of SNMP supports
authentication and encryption, making it more difficult to break.
Either way, it may be a good idea to disable SNMP in the devices
and block it through packet filters, or to use access lists to limit
which users can access devices using SNMP. Make sure to change the
default community name to a proper password.
WEP
Wireless networks enable several attractive possibilities using
mobile handheld computers for data entry and interrogation
while moving about the plant, factory, or building. However, in
A serious problem with wireless access is that the user does not
really know to which Web server or network the computer has
connected. It is possible that an unsuspecting user will unwittingly
enter a secret user name and password on a Web page served from a
server set up right outside the premises by a hacker with the
purpose of tricking users into revealing secret information. The
hacker then uses the credentials to gain access to network resources.
Software Security
Software security primarily includes authentication and authori-
zation. Software security is built into computer operating systems,
Web server software, and other software applications. Software
security is required in addition to network firewall measures.
Password Authentication
Authentication means a user is identified. The simplest form of
authentication requires logon with user name and password as
the credentials. If passwords are not managed properly, they are
surprisingly easy to break. A password policy is an essential part
of the overall security policy. Login and password is also used in
Passwords in Automation
In an office environment, there are lots of computers, people with
different roles and responsibilities, and many outside visitors.
Therefore, computers in an office require strict security. However,
operator workstations in many installations remain permanently
logged on, and all operators share the same pool of computers.
Often individual passwords are not used; instead, all operators
share one login name and password, and the password may never
expire, or there may not even be a password.
Strong Authentication
Because security built on static reusable passwords has proven
easy for hackers to beat, simple passwords are not sufficient for
remote access to the automation system. Strong authentication
uses two factors of identification: something that you know and
something that you have. One is typically a password, the other
Authorization
Authorization means a logged-in user is granted certain rights
and access. Windows NT security can be based on either the domain
or the workgroup network access model, depending on the system
philosophy. The domain model has a single access control list and
is therefore easy to manage. Single Sign-On (SSO) enables a user
to log in once to gain access to the operating system and many
applications, provided the applications have security integrated
with the operating system.
To enhance security:
• Remove all unused accounts, especially those with adminis-
trator rights
• Disable the guest and anonymous accounts
• Review and disable or remove accounts for unauthorized
users, such as staff that leave
• Minimize the number of accounts with administrator
rights; use non-privileged accounts as much as possible
• Use non-privileged accounts for services and continuously
running applications
• Minimize the number of accounts for services and applications
that are not associated with a person
• Remove default accounts and accounts created by system
integrators, etc.
Audit Trail
By enabling the audit trail functionality, both successful and failed
attempts to make changes can be logged, whether they are made by
legitimate users or by a hacker. Logs are therefore useful to see
what damage may have been done and assist in repairing it, as
well as to trace the culprit.
Focus
To prevent operators from launching, switching and shutting
down applications, as well as installing or modifying applications
that could interfere with the automation system, many of the
operating system functions have to be disabled for certain users or
groups of users. In an operating system such as Windows, it is
possible to lock out functionality to prevent a user from doing
things that may disable the computer or system, or distract atten-
tion. Security may be used to ensure operators cannot switch to
other applications having no security, preventing security from
being circumvented.
For example, the taskbar “Start” button may have to be disabled,
as may special keys and key combinations such as the Windows key,
ALT+TAB, CTRL+ESC, and ALT+ESC, while CTRL+ALT+DEL may have to be
limited (Figure 10-27). This ability to selectively disable the keys
is usually
built into the operator software but generally no disabling is done
by default. Therefore, based on the plant’s security policy, disabling
must be configured. The administrator can define for which users
or groups of users certain key combinations should be disabled.
Others
There are several other measures that can be taken to keep the
system as secure as possible. These include keeping software and
anti-virus definitions up to date.
Anti-Virus Scanning
Not every automation system vendor permits users to install
anti-virus software on servers and workstations, since
incompatibilities may occur when the virus definitions and the
different components of the anti-virus software are updated.
Moreover, the updating process requires connecting to the
Internet or using removable disks, thus exposing the system
(Figure 10-28). Therefore, if Email, Web browsing, or file transfer
Component Security
DCOM provides security for distributed applications, even
though these applications are not specifically designed to be
secure. The security for any application component, such as the
OPC servers, can be customized. DCOM security configuration is
explained in Chapter 3.
Software Considerations
Many operator visualization software applications have been
designed with 21CFR11 in mind, making it easier for system inte-
grators and users to build a system that meets the requirements.
Audit Trail
Operator actions during production can have a significant impact
on the product. When operators make entries, this must be logged
in a time-stamped audit trail. The application may either log the
currently logged-in user as performing the change or, for critical
points, it may pop up a dialog box requiring the user to authenti-
cate one more time to confirm the change before the action is
completed. Optionally, a second person may be required to elec-
tronically sign (see Figure 10-29).
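A time-stamped audit trail with optional re-authentication and a second signature can be sketched as below. The record fields, the `reauthenticate` callback, and the tag and user names are hypothetical; a real system would obtain identities from the operating system or the visualization software and store the trail in a protected log.

```python
from datetime import datetime, timezone
from typing import Callable, List, NamedTuple, Optional


class AuditEntry(NamedTuple):
    timestamp: str
    user: str
    tag: str
    old_value: float
    new_value: float
    witness: Optional[str]      # second person signing, if required


def log_change(
    trail: List[AuditEntry],
    user: str,
    tag: str,
    old_value: float,
    new_value: float,
    critical: bool = False,
    reauthenticate: Optional[Callable[[str], bool]] = None,
    witness: Optional[str] = None,
) -> AuditEntry:
    """Append a time-stamped record; critical points need fresh authentication."""
    if critical and (reauthenticate is None or not reauthenticate(user)):
        raise PermissionError("re-authentication required for critical point")
    stamp = datetime.now(timezone.utc).isoformat()
    entry = AuditEntry(stamp, user, tag, old_value, new_value, witness)
    trail.append(entry)
    return entry
```

For a non-critical point the currently logged-in user is simply recorded; for a critical point the change is refused unless the callback confirms the user's identity again.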
Figure 10-30. Audit Trail in A&E Log ActiveX Viewer (Screenshot: SMAR
SYSTEM302)
Alarm Log
When an operator acknowledges an alarm it may be sufficient to
merely record that the operator action was an alarm acknowledge-
ment. However, it may be helpful to record some additional
comments regarding the reason and circumstances of the alarm.
Therefore it may be a good idea to use operator visualization that
permits operators to enter such comments explaining the alarm
when acknowledging alarms (see Figure 10-31).
Sequencing Enforcement
Some FDA-regulated products require a specific sequence of steps
and phases for manufacturing. Software used in manufacturing, such
as operator visualization software and batch management, must
include mechanisms to enforce the correct execution order in the
special circumstances when this is done manually. This ensures the
product is manufactured as per specification and with a minimum
of variance. A simple method is to disable or hide controls
associated with the next step until the previous step in the phase
has been completed. More advanced schemes may include the use of
VBA scripting to assure operator commands are valid.
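The simple method of enabling only the control for the next step can be sketched as follows. The text mentions VBA for scripting; Python is used here purely for illustration, and the class and step names are hypothetical.

```python
from typing import List


class ManualSequence:
    """Enables only the control for the next step in a manual phase."""

    def __init__(self, steps: List[str]) -> None:
        self.steps = list(steps)
        self.completed = 0            # number of steps finished so far

    def enabled_controls(self) -> List[str]:
        # Only the next pending step is enabled; all others stay disabled.
        return self.steps[self.completed:self.completed + 1]

    def complete(self, step: str) -> None:
        if [step] != self.enabled_controls():
            raise ValueError(f"step {step!r} is not allowed at this point")
        self.completed += 1


sequence = ManualSequence(["charge", "heat", "hold", "discharge"])
sequence.complete("charge")
```

After `charge` is completed only `heat` is enabled, and an attempt to run `discharge` out of order is rejected.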
Closed System
21CFR11 distinguishes between “open system” and “closed
system.” A closed system means system access is controlled by
persons who are responsible for the content of electronic records
on the system. When the automation system administrator
controls access, it is a closed system. This is the most straightfor-
ward approach. A “closed system” in the 21CFR11 sense of the
word does not mean proprietary technologies.
Procedures
Several requirements in 21CFR11 cannot be met using software.
Therefore the user is responsible for making sure these require-
ments are met, such as by putting procedures in place. These addi-
tional requirements include getting applications validated, for
Exercises
1. Is it permitted to set OPC to write to a SIS logic solver?