Troubleshooting
The world of software has its own set of problems. A few years
ago, using Windows was a very frustrating experience with lots of
crashes. Today, Windows is very stable and handles anomalies
much more gracefully. However, getting a specific combination
of software, networking, and security to work can still be
problematic at times. Modern automation systems have a large
software content and are therefore subject to some of these IT
problems. Fortunately, some diagnostic tools are available.
DCOM Troubleshooting
IP (Internet Protocol), combined with DCOM, is the software
communication platform upon which the automation system is
built. Software tools exist that check computers on the network to
see whether DCOM is configured so that remote OPC works
properly (Figure 6-1). The same tools contain basic network
utilities such as PING to check that basic communication is
established.
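As a complement to PING, a basic reachability check can be sketched in a few lines of Python. This is a minimal sketch, not a DCOM configuration checker: it only tests whether a TCP connection can be opened to the Windows RPC endpoint mapper (TCP port 135), which DCOM uses to start a conversation. The function name and the host name in the usage comment are hypothetical.

```python
import socket

# TCP port of the Windows RPC endpoint mapper, the first contact point for DCOM.
DCOM_ENDPOINT_MAPPER_PORT = 135


def tcp_port_open(host: str, port: int = DCOM_ENDPOINT_MAPPER_PORT,
                  timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, time-outs, and name resolution failures.
        return False


# Usage (hypothetical host name):
#   tcp_port_open("opc-server-01")
```

A True result only shows that the machine answers on that port. It says nothing about whether the DCOM security settings allow the OPC connection.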
DCOM Security
DCOM provides the ability for an application on one computer to
start and stop applications running on another machine. This is
clearly a critical function and is therefore surrounded by tight
security. Using DCOM, for example when an OPC server or OLE DB
database resides on a computer different from the display console,
requires proper configuration of the Windows DCOM security
settings. If the settings are incorrect, the connection will
not work. In order not to reveal progress to a potentially
malicious intruder, no error message states what the fault is
when a connection cannot be established. This is a security
feature, but it makes legitimate troubleshooting difficult.
It is therefore important to follow the DCOM setup procedure
systematically. The procedure is explained in Chapter 3 and usually
in manuals for OPC servers and clients.
No Data Updates
An easy three-step test of DCOM settings is to browse the OPC
server from a remote OPC client. If you can see the list of remote
OPC servers, this means you have access to read the registry.
Further, if you can see the tags, this means you have access to the
application. If you see the server and the tags but get no updates,
this means the OPC server does not have sufficient rights to call
back to the client with the subscribed values.
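The three-step test above can be written down as a small decision function. This is a minimal sketch: the function name and the wording of the messages are illustrative, and the three inputs are simply what the person at the remote OPC client observes.

```python
def diagnose_dcom(sees_servers: bool, sees_tags: bool, gets_updates: bool) -> str:
    """Map the three observations from a remote OPC client to a likely cause."""
    if not sees_servers:
        # Step 1 failed: the list of remote OPC servers comes from the registry.
        return "No server list: the client lacks access to read the registry."
    if not sees_tags:
        # Step 2 failed: browsing tags requires access to the server application.
        return "No tags: the client lacks access to the OPC server application."
    if not gets_updates:
        # Step 3 failed: updates are callbacks from the server to the client.
        return "No updates: the OPC server lacks rights to call back to the client."
    return "All three steps pass: DCOM settings appear to be in order."
```

For example, `diagnose_dcom(True, True, False)` points to the callback rights of the OPC server, which is the "No Data Updates" case this section describes.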
DCOM Inter-domain
Try as far as possible to keep OPC servers and OPC clients in the
same Windows domain. If this is not possible, some additional
steps are necessary: the user account under which the OPC server
runs must also be created on the client machine, and the user
account under which the client runs must also be created on the
server machine. Each account must have the identical name and
password on both machines. The client account and the server
account can, and preferably should, use different passwords from
each other. Verify that the accounts have been set up properly by
trying to connect to the server machine from the client machine’s
“Network Neighborhood” and vice-versa. Additionally, DCOM
access, launch permissions, etc., must also be set.
Talk to your network administrator or read further about Domain
Trust Relationships in just about any book on Windows
NT/2000/XP. Another way to deal with DCOM across domains
is to use Web tunnelling, as explained in Chapter 3.
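The account rule for machines that are not in the same domain can be checked with a short script once the local account names on each machine are known. This is a minimal sketch under stated assumptions: the account lists are typed in by hand (the code does not query Windows), and all names are hypothetical. It checks account names only. Matching passwords still have to be verified separately.

```python
def missing_mirror_accounts(client_machine_accounts, server_machine_accounts,
                            client_user, server_user):
    """Return (account, machine) pairs that still have to be created."""
    missing = []
    # The account the OPC server runs under must also exist on the client machine.
    if server_user not in client_machine_accounts:
        missing.append((server_user, "client machine"))
    # The account the OPC client runs under must also exist on the server machine.
    if client_user not in server_machine_accounts:
        missing.append((client_user, "server machine"))
    return missing


# Usage with hypothetical account names:
#   missing_mirror_accounts({"operator"}, {"opcsvc", "operator"},
#                           client_user="operator", server_user="opcsvc")
#   reports that "opcsvc" must still be created on the client machine.
```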
DCOM Time-out
If the network is slow and unreliable, such as across the public
Internet, you may experience time-out problems. In this case, you
may need to use Web tunnelling techniques explained in Chapter 3.
Be sure to use OPC clients that probe servers for status and report problems.
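How a client might probe a server and report a time-out can be sketched as a simple watchdog. This is an illustrative sketch, not the mechanism of any particular OPC client: `probe` is a hypothetical callable supplied by the user that returns True when the server answers a status request, and `timeout_s` is the longest silence that is tolerated.

```python
import time
from typing import Callable


class ServerWatchdog:
    """Declare a server failed when no probe has succeeded within timeout_s."""

    def __init__(self, probe: Callable[[], bool], timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self._probe = probe
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_ok = clock()  # treat the moment of creation as the last contact

    def poll(self) -> bool:
        """Probe once; return True while the server is still considered alive."""
        try:
            ok = self._probe()
        except Exception:
            ok = False  # a probe that raises counts as a failed probe
        now = self._clock()
        if ok:
            self._last_ok = now
        return (now - self._last_ok) <= self._timeout_s
```

With this design, the time it takes to detect a failed server is bounded by the time-out plus the interval between calls to `poll`, so both values should be chosen with the network quality in mind.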
OPC Troubleshooting
OPC reports several error codes for problems with the OPC server
as a whole, as well as an individual status for each item (tag). It is
also possible to check the OPC server state. When an OPC client is
unable to display values from an OPC server, a simple test is to
see if the OPC client is able to get data from another OPC server,
and if another OPC client is able to get data from the OPC server.
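The swap test just described gives four possible outcomes. The following sketch spells them out. The conclusions are a reasonable reading of the test, not a rule from the OPC specification, and the function name is hypothetical.

```python
def isolate_fault(client_reads_other_server: bool,
                  other_client_reads_server: bool) -> str:
    """Interpret the swap test for a client that cannot display server values."""
    if client_reads_other_server and other_client_reads_server:
        # Each component works with a different partner.
        return "Suspect the pairing: configuration or compatibility between this client and server."
    if client_reads_other_server:
        # The client is fine elsewhere, and nobody can read the server.
        return "Suspect the OPC server."
    if other_client_reads_server:
        # The server is fine for others, and the client fails everywhere.
        return "Suspect the OPC client."
    return "Inconclusive: suspect the network, DCOM settings, or both components."
```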
Chapter 6 – Troubleshooting 187
Figure 6-2. The OPC Client Shows Server Error Messages (Screenshot:
SMAR SYSTEM302)
Figure 6-3. OPC Client Indicates the Quality (Screenshot: SMAR SYSTEM302)
The OPC specification does not make clear whether the OPC status
for an OPC item (tag) should reflect the status of the value as it
exists in the underlying hardware, or the status of the OPC
communication itself. In other words, should the status indicate
the health of the underlying device and sensor, or of the network
communication and the OPC server? Both implementations may
exist in different OPC servers used in the same system.
It is a good idea to find out exactly what the status in the OPC
server indicates. Since it is necessary from a troubleshooting point
of view to know if a fault is due to the communication and OPC
server, or due to the underlying device and sensors, most OPC
servers have been implemented in such a way that the OPC item
status indicates the health of the networking and OPC server. For
those parameters that have an associated status in the device, an
additional OPC item is created for this status.
Figure 6-4. Advanced OPC Clients Distinguish between OPC Status and
Parameter Status (Screenshot: SMAR SYSTEM302)
Bad Quality
The “Bad” quality generally indicates the OPC server is not able
to communicate with underlying hardware. It may be a sign of a
network problem or complete device failure.
Uncertain Quality
The “Uncertain” quality, in most implementations, only indicates
a communications failure.
Good Quality
The “Good” quality indicates that OPC communication is fine.
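OPC clients often show the quality as a raw number, such as 192 or 0. In OPC Data Access, the low byte of the quality word is laid out as QQSSSSLL: two quality bits, four substatus bits, and two limit bits. The following minimal sketch decodes that byte. The function name is hypothetical, and what a given substatus value means should be looked up in the specification or the server manual.

```python
QUALITY_MASK = 0xC0    # bits 7-6: Bad, Uncertain, or Good
SUBSTATUS_MASK = 0x3C  # bits 5-2: reason within the quality
LIMIT_MASK = 0x03      # bits 1-0: limit information

_QUALITY_NAMES = {0x00: "Bad", 0x40: "Uncertain", 0xC0: "Good"}


def decode_quality(quality_word: int):
    """Return (quality name, substatus number, limit number) for a quality word."""
    low_byte = quality_word & 0xFF
    name = _QUALITY_NAMES.get(low_byte & QUALITY_MASK, "Reserved")
    substatus = (low_byte & SUBSTATUS_MASK) >> 2
    limit = low_byte & LIMIT_MASK
    return name, substatus, limit


# Usage: decode_quality(192) gives ("Good", 0, 0),
# and decode_quality(0x18) gives ("Bad", 6, 0).
```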
OPC Compatibility
All OPC specifications define some mandatory and some optional
features; in fact, each feature is an individual interface. Therefore,
not all OPC servers and clients are the same. From time to time you will
find that an OPC client that could perform a particular function
with one OPC server is unable to do it with another, or that one
client does not have a particular OPC feature supported in
another. Note that an OPC client should never require an optional
192 Software for Automation
The OPC Foundation provides a test kit that OPC server vendors
use to test and certify their OPC products. This tester ensures all
mandatory features are supported and the specification has been
implemented correctly, thus ensuring compatibility between
different OPC products.
It may be a good idea to use only certified OPC products to ensure that they
support the required features (interfaces) and that those features work.
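Because each feature is an interface, a compatibility question comes down to comparing two lists: the interfaces a client needs and the interfaces a server supports. The sketch below assumes both lists have been taken from the product documentation. The interface names in the usage comment are examples of OPC Data Access interfaces, and the function name is hypothetical.

```python
def missing_interfaces(required_by_client, supported_by_server):
    """Return the interfaces the client needs that the server does not offer."""
    return sorted(set(required_by_client) - set(supported_by_server))


# Usage with example interface names:
#   missing_interfaces(
#       ["IOPCServer", "IOPCItemMgt", "IOPCBrowseServerAddressSpace"],
#       ["IOPCServer", "IOPCItemMgt"])
#   returns ["IOPCBrowseServerAddressSpace"]
```

An empty result means that, on paper, the server offers everything the client asks for. It does not replace an actual test of the two products together.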
Figure 6-6. OPC Server State and Other Diagnostics (Screenshot: SMAR
OPC DataSpy)
The server state can reveal what is wrong with the OPC server.
NetDDE
The most common problem when doing remote DDE using the
NetDDE functionality is that the required network services have
not been started. Make sure the Windows Network DDE and
Network DDE DSDM services are running. Configure Network DDE
DSDM to start manually and Network DDE to start automatically.
This is done from the Windows Control Panel/Administrative
Tools/Services.
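The same check can be scripted with the Windows `sc query` command. This is a minimal sketch under stated assumptions: it runs only on Windows, and the short service names `NetDDE` and `NetDDEdsdm` are assumed to be the names behind "Network DDE" and "Network DDE DSDM" on the Windows versions discussed here. Confirm them in the Services console before relying on the script.

```python
import subprocess

# Assumed short service names; verify them on the target machine.
NETDDE_SERVICES = ("NetDDE", "NetDDEdsdm")


def parse_sc_state(sc_output: str) -> str:
    """Extract the state word (for example RUNNING) from `sc query` output."""
    for line in sc_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("STATE") and ":" in stripped:
            # Typical line: "STATE              : 4  RUNNING"
            fields = stripped.split(":", 1)[1].split()
            if len(fields) >= 2:
                return fields[1]
    return "UNKNOWN"


def service_state(name: str) -> str:
    """Ask Windows for the state of one service (Windows only)."""
    result = subprocess.run(["sc", "query", name],
                            capture_output=True, text=True)
    return parse_sc_state(result.stdout)


# Usage on a Windows machine:
#   for name in NETDDE_SERVICES:
#       print(name, service_state(name))
```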
DLL Hell
These days we have learned to live with software as a “living
organism.” Applications require what all too often feels like
endless patches, service packs, and upgrades. When applications
share DLLs, a phenomenon known as “DLL Hell” may occur.
When a new application is installed or an existing application or
the operating system is upgraded, one or more shared DLL files
are upgraded with a new version. Because an existing DLL is
replaced, some of the existing applications stop functioning or
Memory Leak
A memory leak is a bug characterized by a continuing loss of
available memory caused when an application does not free up
unused memory. Eventually, all memory is used up, and the
application fails. The memory leak problem is especially severe
for automation applications because the programs often run for
years, and components can be started and stopped thousands of
times without restarting the operating system. This is a hard-to-
detect bug, even with special software tools. Users generally
cannot do anything other than report it to the supplier.
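A report to the supplier is more useful when it includes evidence. One simple form of evidence is the growth rate of an application's memory use over time. The sketch below fits a straight line through memory readings and returns its slope. The readings are assumed to have been collected by hand or with a monitoring tool, and the function name is hypothetical.

```python
def memory_growth_rate(samples):
    """Least-squares slope of (time_s, memory_bytes) samples, in bytes per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    if denominator == 0:
        raise ValueError("samples must span more than one point in time")
    numerator = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return numerator / denominator


# Usage: readings taken once per hour, converted to seconds and bytes:
#   memory_growth_rate([(0, 50_000_000), (3600, 50_400_000), (7200, 50_800_000)])
```

A slope that stays positive over days of steady operation is a hint of a leak, not proof. Memory use also grows for legitimate reasons, such as caches filling up after start-up.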
License Restrictions
License restrictions can cause a number of different problems.
License issues include such concerns as number of tags and
number of users. OPC servers may, for example, stop updating
after an evaluation period, ranging from a few hours to a month,
has expired. Similarly, additional tags or clients may not be updated
when the licensed number has been reached.
Exercises
1. How long does it take for an OPC client to detect that an
OPC server has failed?