
Isilon

OneFS
Version 7.2.0.0 - 7.2.0.4

Release Notes

Copyright 2015 EMC Isilon. All rights reserved. Published in USA.


Published October 1, 2015
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change
without notice.
The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with
respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a
particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable
software license.
EMC and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United States and other
countries. All other trademarks used herein are the property of their respective owners.
For the most up-to-date regulatory document for your product line, go to EMC Online Support (https://support.emc.com).
EMC Corporation
Hopkinton, Massachusetts 01748-9103
1-508-435-1000 In North America 1-866-464-7381
www.EMC.com

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

CONTENTS

Chapter 1  OneFS Release Notes

OneFS 7.2.0 Release notes..............................................................................8

Chapter 2  Upgrading OneFS

Target Code...................................................................................................10
Supported upgrade paths..............................................................................10

Chapter 3  New features, software support, logging, and controls...............................13

New and changed in OneFS 7.2.0 - Highlights............................................... 14


Authentication................................................................................. 14
Cluster configuration........................................................................14
File system.......................................................................................15
Hardware......................................................................................... 15
HDFS................................................................................................15
Networking...................................................................................... 15
NFS.................................................................................................. 16
OneFS API........................................................................................ 16
Security............................................................................................16
SMB.................................................................................................16
New and changed in OneFS 7.2.0.4............................................................... 16
Authentication................................................................................. 16
File system.......................................................................................17
Hardware......................................................................................... 17
HDFS ............................................................................................... 18
Networking...................................................................................... 18
Security............................................................................................18
Upgrade and installation..................................................................19
New and changed in OneFS 7.2.0.3 (Target Code)......................................... 20
Cluster configuration........................................................................20
Hardware......................................................................................... 20
HDFS................................................................................................21
Networking...................................................................................... 21
Security............................................................................................21
SMB.................................................................................................22
New and changed in OneFS 7.2.0.2............................................................... 22
Antivirus.......................................................................................... 22
Authentication................................................................................. 22
Cluster configuration........................................................................23
File system.......................................................................................23
HDFS................................................................................................24
Security............................................................................................25
New and changed in OneFS 7.2.0.1............................................................... 25
Authentication................................................................................. 25
Cluster configuration........................................................................26
Diagnostic tools............................................................................... 26
File transfer...................................................................................... 26
HDFS................................................................................................26
Security............................................................................................27

SmartLock........................................................................................27
SmartQuotas....................................................................................27
SMB.................................................................................................27

Chapter 4  New hardware and firmware support........................................................29

New hardware and firmware support in OneFS 7.2.0.4...................................30


New hardware and firmware support in OneFS 7.2.0.3 (Target Code)............. 30
New hardware and firmware support in OneFS 7.2.0.2...................................30
New hardware and firmware support in OneFS 7.2.0.1...................................30
New hardware and firmware support in OneFS 7.2.0.0...................................32

Chapter 5  Resolved issues.........................................................................................33

Resolved in OneFS 7.2.0.4............................................................................ 34


Antivirus.......................................................................................... 34
Authentication................................................................................. 34
Backup, recovery, and snapshots.....................................................34
Cluster configuration........................................................................39
Diagnostic tools............................................................................... 39
Events, alerts, and cluster monitoring.............................................. 39
File system.......................................................................................41
Hardware......................................................................................... 43
HDFS................................................................................................44
Migration......................................................................................... 44
Networking...................................................................................... 45
NFS.................................................................................................. 46
OneFS API........................................................................................ 48
OneFS web administration interface.................................................48
SmartQuotas.....................................................................49
SMB.................................................................................................49
Resolved in OneFS 7.2.0.3 (Target Code)....................................................... 51
Antivirus.......................................................................................... 51
Authentication................................................................................. 51
Backup, recovery, and snapshots.....................................................53
Cluster configuration........................................................................54
Diagnostic tools............................................................................... 55
Events, alerts, and cluster monitoring.............................................. 56
File system.......................................................................................57
File transfer...................................................................................... 60
Hardware......................................................................................... 60
HDFS................................................................................................62
Job engine........................................................................................64
Migration......................................................................................... 64
Networking...................................................................................... 65
NFS.................................................................................................. 65
SmartLock........................................................................................68
SmartQuotas....................................................................................68
SMB.................................................................................................68
Upgrade and installation..................................................................70
Resolved in OneFS 7.2.0.2............................................................................ 71
Antivirus.......................................................................................... 71
Authentication................................................................................. 72
Backup, recovery, and snapshots.....................................................75
Cluster configuration........................................................................77
Diagnostic tools............................................................................... 78

Events, alerts, and cluster monitoring.............................................. 79


File system.......................................................................................80
Hardware......................................................................................... 82
Job engine........................................................................................86
Migration......................................................................................... 86
Networking...................................................................................... 87
NFS.................................................................................................. 89
OneFS web administration interface.................................................90
SmartLock........................................................................................91
SMB.................................................................................................91
Upgrade and installation..................................................................93
Virtual plug-ins................................................................................ 94
Resolved in OneFS 7.2.0.1............................................................................ 95
Antivirus.......................................................................................... 95
Authentication................................................................................. 95
Backup, recovery, and snapshots.....................................................96
Cluster configuration........................................................................98
Command-line interface................................................................... 98
Events, alerts, and cluster monitoring.............................................. 98
File system.......................................................................................99
Hardware....................................................................................... 101
HDFS..............................................................................................102
Job engine......................................................................................104
Migration....................................................................................... 104
Networking.................................................................................... 105
NFS................................................................................................ 106
OneFS API...................................................................................... 109
OneFS web administration interface...............................................109
SmartLock......................................................................................110
SmartQuotas..................................................................................110
SMB...............................................................................................110
Virtual plug-ins.............................................................................. 112
Resolved in OneFS 7.2.0.0.......................................................................... 112
Antivirus........................................................................................ 112
Authentication............................................................................... 113
Backup, recovery, and snapshots...................................................114
Cluster configuration......................................................................116
Events, alerts, and cluster monitoring............................................ 116
File system.....................................................................................117
File transfer....................................................................................119
Hardware....................................................................................... 119
HDFS..............................................................................................121
Job engine......................................................................................121
Migration....................................................................................... 122
Networking.................................................................................... 122
NFS................................................................................................ 124
OneFS web administration interface...............................................124
SmartLock......................................................................................125
SmartQuotas..................................................................................125
SMB...............................................................................................125
Upgrade and installation................................................................130
Virtual plug-ins.............................................................................. 130

Chapter 6  Isilon ETAs and ESAs related to this release.............................................131

ETAs related to OneFS 7.2.0........................................................................ 132



ESAs related to OneFS 7.2.0........................................................................ 133

Chapter 7  OneFS patches included in this release....................................................135

Patches included in OneFS 7.2.0.4.............................................................. 136


Patches included in OneFS 7.2.0.3 (Target Code)........................................ 136
Patches included in OneFS 7.2.0.2.............................................................. 137
Patches included in OneFS 7.2.0.1.............................................................. 139

Chapter 8  Known issues...........................................................................................141

Target Code known issues...........................................................................142


Antivirus..................................................................................................... 142
Authentication............................................................................................ 142
Backup, recovery, and snapshots ............................................................... 143
Cluster configuration...................................................................................145
Command-line interface.............................................................................. 146
Diagnostic tools.......................................................................................... 146
Events, alerts, and cluster monitoring......................................................... 146
File system.................................................................................................. 149
File transfer................................................................................................. 151
Hardware.................................................................................................... 151
HDFS........................................................................................................... 153
iSCSI........................................................................................................... 154
Job engine...................................................................................................154
Migration.................................................................................................... 156
Networking..................................................................................................156
NFS............................................................................................................. 157
OneFS API................................................................................................... 159
OneFS web administration interface............................................................ 160
Security.......................................................................................................160
SmartQuotas...............................................................................................161
SMB............................................................................................................ 161
Upgrade and installation............................................................................. 162
Virtual plug-ins............................................................................................163

Chapter 9  OneFS Release Resources........................................................................165

OneFS information and documentation....................................................... 166


Functional areas in the OneFS release notes................................................167
Where to go for support...............................................................................171
Provide feedback about this document....................................................... 171


CHAPTER 1
OneFS Release Notes

The OneFS release notes contain information about new features, changes in
functionality, issues that are resolved, support for new hardware and firmware, and
known issues and limitations in the Isilon OneFS 7.2.0 operating system.

OneFS 7.2.0 Release notes......................................................................................8


OneFS 7.2.0 Release notes


The OneFS 7.2.0 release notes contain descriptions of all of the enhancements,
functionality changes, new features, support for hardware, support for firmware, and
resolved issues that are included in the release.

OneFS 7.2.0.4 released: October 1, 2015 (General Availability)

OneFS 7.2.0.3 released: July 22, 2015 (Target Code)

OneFS 7.2.0.2 released: May 6, 2015

OneFS 7.2.0.1 released: February 18, 2015

OneFS 7.2.0.0 released: November 20, 2014

The new features, functionality changes, resolved issues, and known issues listed in the
release notes are categorized by functional area. For a list of the functional areas used to
categorize the release notes and a brief description of what each functional area typically
contains, see the Functional areas in the OneFS release notes section in the OneFS release
resources section at the end of this document.
For a list of available OneFS releases and information about target code releases and
general availability (GA) releases, see Current Isilon Software Releases on the EMC Online
Support site.


CHAPTER 2
Upgrading OneFS

OneFS upgrades comprise a full operating system upgrade and require that the Isilon
cluster be rebooted. To help ensure that the version of OneFS to which you upgrade
contains all of the resolved issues included in the version you are upgrading from,
upgrades are supported only from designated previous releases of OneFS.
Before upgrading OneFS, review the Supported upgrade paths section of this document to
verify that the cluster can be upgraded from your current version of OneFS directly to this
release.
See the OneFS Upgrade Planning and Process Guide on the EMC Online Support site for
detailed upgrade instructions and additional upgrade information.
To download the installer for this maintenance release, see the OneFS Downloads page
on the EMC Online Support site.

Target Code........................................................................................................... 10
Supported upgrade paths......................................................................................10


Target Code
OneFS 7.2.0.3 is the current 7.2.0.x target code version. A OneFS release is designated as
Target Code after it satisfies specific criteria, which includes production time in the field,
deployments across all supported node platforms, and additional quality metrics. For
information about upgrading to OneFS Target Code, see Upgrading to OneFS Target Code
on the Isilon EMC Community Network (ECN) pages.

Supported upgrade paths


To ensure that the version of OneFS you are upgrading to contains all of the bug fixes
included in the version of OneFS you are upgrading from, upgrades are only supported
from specified versions of OneFS. If the cluster is not running a supported version of
OneFS, contact EMC Isilon Technical Support before attempting an upgrade.
Upgrade resources
For more information about simultaneous and rolling upgrades and other important
details about the OneFS upgrade process, see the OneFS Upgrades - Isilon Info Hub.
Upgrades to OneFS 7.2.0.4 (General Availability)
Simultaneous upgrades to OneFS 7.2.0.4 are supported from the following OneFS
versions:

OneFS 7.2.0 through OneFS 7.2.0.3

OneFS 7.1.1 through OneFS 7.1.1.7

OneFS 7.1.0 through OneFS 7.1.0.6

OneFS 7.0.2 through OneFS 7.0.2.13

OneFS 7.0.1 through OneFS 7.0.1.10

OneFS 7.0 (7.0.0.0)

Rolling upgrades to OneFS 7.2.0.4 are supported from the following OneFS versions:

OneFS 7.2.0 through OneFS 7.2.0.3

OneFS 7.1.1 through OneFS 7.1.1.7

OneFS 7.1.0 through OneFS 7.1.0.6

Upgrades to OneFS 7.2.0.3 (Target Code)


Simultaneous upgrades to OneFS 7.2.0.3 are supported from the following OneFS
versions:

OneFS 7.2.0 through OneFS 7.2.0.2

OneFS 7.1.1 through OneFS 7.1.1.5

OneFS 7.1.0 through OneFS 7.1.0.6

OneFS 7.0.2 through OneFS 7.0.2.13

OneFS 7.0.1 through OneFS 7.0.1.10

OneFS 7.0 (7.0.0.0)

Rolling upgrades to OneFS 7.2.0.3 are supported from the following OneFS versions:

OneFS 7.2.0 through OneFS 7.2.0.2

OneFS 7.1.1 through OneFS 7.1.1.5

OneFS 7.1.0 through OneFS 7.1.0.6


CHAPTER 3
New features, software support, logging, and
controls

This section contains descriptions of new features, new software support, new protocol
and protocol version support, additional logging, and new controls such as command-line options and sysctl parameters.
New features enable you to perform tasks or implement configurations that were
previously unavailable.
These new features include:

New software and protocol support

Updated software and protocol version support

New logging

New controls such as command options, sysctl parameters, and OneFS web
administration controls

Functionality changes include modifications and enhancements to OneFS that enable you
to perform preexisting tasks in new ways, or that improve underlying OneFS functionality
or performance. These changes also include removing support for deprecated protocols
and software.
The functionality changes documented in the release notes include:

Changes to the formatting or syntax of a pre-existing command

Changes to underlying code to improve performance

Updates to integrated OneFS components such as OpenSSL and GNU bash

Changes to enable functionality in the OneFS web administration interface that was
previously available only from the command-line interface

Changes to remove support for old and deprecated protocols

New and changed in OneFS 7.2.0 - Highlights....................................................... 14


New and changed in OneFS 7.2.0.4....................................................................... 16
New and changed in OneFS 7.2.0.3 (Target Code)................................................. 20
New and changed in OneFS 7.2.0.2....................................................................... 22
New and changed in OneFS 7.2.0.1....................................................................... 25



New and changed in OneFS 7.2.0 - Highlights


Authentication
Improved usability for MIT Kerberos
The MIT Kerberos authentication method has been completely revamped to make it
consistent with the other authentication methods. You can now manage Kerberos
authentication through a Kerberos provider, similar to the Active Directory provider.
A Kerberos provider can be included in various access zones similar to the other
providers.
OneFS defaults to LDAP paged search
OneFS now defaults to LDAP paged search if both paged search and Virtual List View
(VLV) are supported. If paged search is not supported and VLV is enabled on the
LDAP server, OneFS will use VLV when returning the results from a search.
Note

In most cases, bind-dn and bind-password must be enabled in order to use VLV.
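
As a brief, hedged illustration of setting these values on an LDAP provider (the provider name, bind DN, and password below are placeholders, and the option names are assumed from the general isi auth ldap syntax rather than stated in this release note):

isi auth ldap modify ldap-provider-1 --bind-dn="cn=proxyuser,dc=example,dc=com" --bind-password="<password>"

Once a bind identity is configured, OneFS can issue VLV searches against LDAP servers that support them.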

Cluster configuration
New protection policy
To ensure that node pools made up of new Isilon HD400 nodes can maintain a data
protection level that meets EMC Isilon guidelines for meantime to data loss (MTTDL),
OneFS offers a new requested protection option, +3d:1n1d (3 drives or 1 node and 1
drive). This setting ensures that data remains protected in the event of three
simultaneous drive failures, or the simultaneous failure of one drive and one node.
This protection policy can also be applied to node pools that do not contain HD400
nodes.
Suggested protection
OneFS now includes a function to calculate a recommended protection level based
on cluster configuration. This capability is available only on new clusters. Clusters
upgraded to OneFS 7.2 do not have this capability. Although you can specify a
different requested protection on a node pool, the suggested protection level strikes
the best balance between data protection and storage efficiency. In addition, as you
add nodes to your Isilon cluster, OneFS continually evaluates the protection level
and alerts you if the cluster falls below the suggested protection level.
Node equivalency
OneFS now enables nodes of different generations to be compatible based on
certain criteria and constraints. You can specify compatibilities between Isilon S200
and similarly configured Isilon S210 nodes, and between X400 and similarly
configured X410 nodes. Nodes must have compatible RAM amounts and identical
HDD and SSD configurations. Compatibilities allow newer generation nodes to be
joined to existing node pools made up of older generation nodes. After you add
three or more newer generation nodes, you can delete the compatibility so that
OneFS can autoprovision the new nodes into their own node pools. This enables you
to take advantage of the speed and efficiency characteristics of the newer node
types in their own node pools.


Zone-aware ID mapping
OneFS now supports management of ID mapping rules for each access zone. ID
mapping associates Windows identifiers to UNIX identifiers to provide consistent
access control across file sharing protocols within an access zone.

File system
L3 cache stores metadata only on archive platforms
For Isilon NL400 and HD400 nodes that contain SSDs, L3 cache is enabled by default
and cannot be disabled. In addition, L3 cache stores only metadata in SSDs on
archive platforms, which feature mostly data writing events. By storing metadata
only, L3 cache optimizes the performance of write-based operations.

Hardware
Automatic drive firmware updates
OneFS now supports automatic drive firmware updates for new and replacement
drives. This is enabled through drive support packages.
Improved InfiniBand stability
The stability of back-end connections to the cluster has been improved by
addressing a number of issues that were encountered when one or more InfiniBand
switches were rebooted. In some cases, the issues that were addressed occurred if
one or more InfiniBand switches were rebooted manually. In other cases, one or
more InfiniBand switches unexpectedly rebooted due to an issue such as a memory
leak or a race condition. If any of these issues occurred, the affected nodes typically
lost connectivity to the cluster and, in some cases, had to be manually rebooted in
order to reestablish a connection.

HDFS
Increased Hadoop support

- OneFS now supports additional Hadoop distributions including Cloudera CDH5, Hortonworks Data Platform 2.1, and Apache Hadoop 2.4.
- WebHDFS now supports Kerberos authentication. Users connecting to the EMC Isilon cluster through a WebHDFS interface can be authenticated with Kerberos.
- HDFS supports secure impersonation through proxy users that impersonate other users with Kerberos credentials to perform Hadoop jobs on HDFS data.
- OneFS now supports an Ambari agent that allows you to monitor the status of HDFS services in each access zone through an external Ambari interface.

Networking
Source-based routing
OneFS now supports source-based routing, which selects which gateway to direct
outgoing client traffic through based on the source IP address in each packet
header.


NFS
NFS service improvements
OneFS incorporates a number of improvements to the NFS service, including support
of NFS v4 and NFS v3 (NFS v2 is no longer supported). Other improvements include
moving the service from the operating system kernel into userspace for increased
reliability; supporting audit features for NFS events; incorporating access zone
support for NFS clients; autobalancing across all nodes to achieve performance
parity and ensure continuous service; and the ability to create aliases to simplify
client connections to NFS exports.

OneFS API
RESTful interface for object storage
OneFS introduces Isilon Swift, an object storage application for Isilon clusters based
on the object storage API provided by OpenStack Swift. The Swift RESTful API, an
HTTP-based protocol, allows Swift clients to execute Swift API commands directly
with Isilon to execute object storage requests. Accounts, containers, and objects
that form a basis for the object storage can be accessed through the NFS, SMB, FTP,
and RAN protocols in addition to the Swift RESTful API. The following Swift RESTful
API calls are supported: GET, PUT, POST, HEAD, DELETE, and COPY.
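
As a brief illustration of the RESTful style described above (the host name, port, account, container, and token values are placeholders, not values documented in this release), an object could be stored and then retrieved with commands such as:

curl -X PUT -H "X-Auth-Token: <token>" -T report.txt "https://cluster.example.com:28080/v1/<account>/<container>/report.txt"

curl -X GET -H "X-Auth-Token: <token>" "https://cluster.example.com:28080/v1/<account>/<container>/report.txt"

The first request creates or overwrites the object in the named container; the second downloads it.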

Security
Telnet_d support disabled on upgrade
Telnet service, which was removed in OneFS 7.0.0, will stop functioning on upgrade
to 7.2.0. SSH should be used for all shell access.

SMB
Support for SMB2 symbolic links
Beginning in OneFS 7.2.0, OneFS natively supports translation of SMB2 symbolic
links. This change might affect the behavior of SMB2 symbolic links in environments
that rely on them. For more information, see article 193808 on the EMC Online
Support site.

New and changed in OneFS 7.2.0.4


Authentication

A user that attempts to connect to the cluster over SSH, through the OneFS API, or through a serial cable can no longer be authenticated on clusters running in compliance mode if any of the following identifiers are assigned to the user as either the user's primary ID or as a supplemental ID: UID: 0 or SID: S-1-22-1-0. (156600)

The message that is logged in the /var/log/lsassd.log file when a trusted Active Directory domain is offline now includes the name of the domain that cannot be reached. In the example below, <domain_name> is the name of the domain that is offline: (151058)

[lsass] Domain '<domain_name>' is offline

File system

If you run the stat command to view information about a file, the Snapshot ID of the file is now included in the output. This information appears in the st_snapid field. (147333)

Hardware

Wear life thresholds were added for the system area on the following Sunset Cove Plus SSD drive models: (156892)

HGST HUSMM1620ASS200
HGST HUSMM1640ASS200
HGST HUSMM1680ASS200
HGST HUSMM1680ASS205
HGST HUSMM1616ASS200

The addition of these thresholds enables OneFS to generate alerts and log events if the wear life of the system area on these SSD drive models reaches 88 percent (warn), 89 percent (critical), or 90 percent (smartfail).

New control: Options were added to the isi_dsp_install command to enable you to display the version number of the most recently installed drive support package (DSP) or to display a list of previously installed DSPs. (154222)

To display the version number of the most recently installed DSP, run the following command:

isi_dsp_install --latest

Output similar to the following is displayed:

2015-06-22 15:02:21 || Drive_Support_v1.7.tar

To display a list of previously installed DSPs, run the following command:

isi_dsp_install history

Output similar to the following is displayed:

2015-06-22 15:00:20 || Drive_Support_v1.5.tar
2015-06-22 15:01:36 || Drive_Support_v1.6.tar
2015-06-22 15:02:21 || Drive_Support_v1.7.tar

The error that appears if you run the isi_dmilog command on a platform that does not support the command was changed from: (150724)

dmilog functions not supported on this platform

to:

dmilog functions not supported on this platform - please consult 'isi_hwmon -h'

For more information about the isi_hwmon command, see article 199270 on the EMC Online Support site.

HDFS

Support for Ambari 2.0.2 was added. (157860)

1.7.0_IBM HDFS was added to the list of supported Ambari servers. (154873)

Networking

The default network flow control setting for Isilon nodes that contain Intel network interface cards (ixgbe NICs) was changed. The default flow control setting is now 1. The ixgbe NIC can receive pause frames but does not send pause frames. This configuration is consistent with Isilon nodes that contain Chelsio NICs. (151707)

Note

Ethernet flow control in a full-duplex physical link provides a mechanism that will allow an interface or switch to request a short pause in frame transmission from a sender by issuing a media access control (MAC) control message and PAUSE specification as described in the 802.3x full-duplex supplement standard.

Security

On clusters running in compliance mode, you can no longer run the su command to assume the privileges of a user with root-level (UID=0) access to the cluster. If you attempt to run the su command to assume the privileges of a user with root-level privileges, the following message appears on the console: (157417)

su: UID 0 denied by compliance mode

Note

Due to this change in behavior, beginning in OneFS 7.2.0.4, clusters running in compliance mode cannot be reimaged by running the sudo isi_reimage command.

The network time protocol (NTP) service was updated to version 4.2.8P1. For more information, see ESA-2015-154 on the EMC Online Support site. (154655)

The version of OpenSSL that is installed on the cluster was updated to version 0.9.8zg. (145892)

Upgrade and installation

Beginning in OneFS 7.1.0, the file in which cluster configuration information is stored was changed from a plain text file (gconfig) to a database file (isi_gconfig.db). In conjunction with this change, the maximum allowed size of the configuration information for an SMB share was limited to 8192 bytes (8 KB). Beginning in OneFS 7.2.0.4, the OneFS pre-upgrade check checks the size of the configuration information for an SMB share prior to upgrading the cluster, and the cluster is prevented from being upgraded if the configuration size is greater than 8 KB. (156585)

The pre-upgrade check can be run alone, or as part of the upgrade process. In either case, if the configuration size of an SMB share exceeds the maximum size allowed, a message similar to the following appears on the console during the pre-upgrade check:

Error: The 'share_name' share has too many access permissions and it cannot be upgraded.
The suggested resolution for this issue is:
1. Remove those users from the share permissions.
2. Add those users to a group.
3. Add that group to the share permissions.
4. Retry the upgrade.

If the pre-upgrade check detects that the configuration size of an SMB share exceeds the maximum size allowed when it is running as part of the default upgrade process, the pre-upgrade check portion of the upgrade completes; however, the OneFS upgrade is not started, and a message similar to the following appears on the console and in the SMB upgrade log file located in the /ifs/.ifsvar/tmp directory:

Error: The 'share_name' share has too many access permissions and it cannot be upgraded.
The suggested resolution for this issue is:
1. Remove those users from the share permissions.
2. Add those users to a group.
3. Add that group to the share permissions.
4. Retry the upgrade.

Under these conditions, the upgrade process cannot be completed until the SMB share configuration information is reduced in size. In most cases, this can be accomplished by following the resolution suggested during the pre-upgrade check. If you encounter this limitation and cannot reduce the size of the SMB configuration information by following these steps, contact EMC Isilon Technical Support for assistance.

Note

Prior to the addition of this check, if the configuration size of an SMB share on a cluster that was being upgraded to OneFS 7.1.0 or later exceeded the maximum size allowed, some of the share information might not have been preserved during the upgrade process, and an error similar to the following might have appeared in the /var/log/isi_gconfig_d.log file:

Update error: value for key 'share_name' has size (12324) greater than max allowed value size (8192)

Although the isi pkg command was not intended to be used to install a drive support package (DSP), it was possible to install a DSP by running the isi pkg command. If a DSP was installed using the isi pkg command, the cluster might have exhibited unexpected behavior until the DSP was removed. Beginning in OneFS 7.2.0.4, if you attempt to install a DSP using the isi pkg command, the installation fails and a message similar to the following appears in the /var/log/isi_pkg log file: (153429)

Package <PACKAGE NAME> must be installed with isi_dsp_install.

New and changed in OneFS 7.2.0.3 (Target Code)


Cluster configuration

The output from the sysctl efs.bam.disk_pool_db command now shows the equivalence_id for pool groups. The ID helps Isilon Technical Support to identify internal data structure values when troubleshooting issues related to storage pools. (150558)

More detailed logging was added to help diagnose issues that occur when SmartPools are upgraded during a OneFS upgrade and to help diagnose issues that occur after running the smartpools-upgrade command. (149686)

Note

This new information appears in the /var/log/messages file.

Hardware

A new version of the QLogic BXE driver was incorporated into this release. (152083)

Adds a check to the OneFS software event 400120001 to detect boot drives that are missing mirror components. (145967)

Improves the node format command so that the progress of the node format operation is reported in percentage complete. Prior to this change, dots were displayed on the console until the operation was complete. (142241)

Removed redundant requests for a node's sensor data from the isi_hw_status command, to improve the response time on A100, S210, X410, and HD400 nodes. (142147)

HDFS

Support for Ambari 2.0.1 and 2.1.0 was added. (153925)

Support for the HDFS truncate remote procedure call was added. (143461)

Support for Ambari 2.0.0 was added. (140053)

Networking

Support for PTR record lookup for SmartConnect zone member addresses was added. (149662)

New control: The following parameters were added to the isi networks command: (145012)

--disable-dns-tcp-support
--enable-dns-tcp-support

The first parameter can be used to disable TCP support for SmartConnect; the second parameter can be used to enable TCP support for SmartConnect. By default, TCP support is enabled and SmartConnect works as expected. If TCP support is disabled, SmartConnect doesn't listen for TCP connections on the DNS port (53), and clients that attempt a DNS query over TCP receive a connection refused error.
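
One way to verify the behavior from a client, assuming the standard dig utility is available (the SmartConnect service IP address and zone name below are placeholders):

dig +tcp @10.1.1.100 cluster.example.com

With TCP support enabled (the default), the query over TCP returns an answer; with TCP support disabled, the connection to port 53 is refused.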

Security

The version of Apache that is installed on the cluster was updated to version 2.2.29. For more information, see ESA-2015-093 on the EMC Online Support site. (136994)

SMB

Improves logging operations performed by the SRVSVC process, as follows: (149826, 149776)

- The default logging level for the srvsvc process was changed from WARNING to INFO.
- The user name and domain name for the user performing an action is logged in the /var/log/srvsvcd.log file, in addition to the SID.
- The action of modifying or deleting an SMB share via the Microsoft Management Console (MMC) snap-in is logged in the /var/log/srvsvcd.log file, including the user name.

An example of the new logging output appears below, where <USER SID info> is the name and SID of the user and <SMB_SHARE_NAME> is the name of the share:

Log level changed to INFO
DOMAIN_NAME\USER_NAME <USER SID info> set info on share SMB_SHARE_NAME
DOMAIN_NAME\USER_NAME <USER SID info> deleted share SMB_SHARE_NAME

Adds support for the SMB2_CREATE_QUERY_ON_DISK_ID (QFid) SMB CREATE Request value. (149777)

Note

Support for the SMB 2 QFid SMB CREATE Request value allows a file opened from an SMB share to be temporarily cached on an SMB 2 client, reducing some network traffic associated with opening and closing the file.

New and changed in OneFS 7.2.0.2


Antivirus

The MCP virus_scan parameter was added to the isi_rpc_d configuration file. (142083)

Authentication

The number and type of actions that are logged when a machine password change triggers a configuration update were increased. Beginning in OneFS 7.2.0.2, if a machine password is updated, the following activities are logged: (138759)

- The time at which an lsass thread starts the machine password update
- The result of the attempt to update the password on a domain controller
- The result of the LDAP confirmation of the password version
- The result of updating the /ifs/.ifsvar/pstore.gc file
- The success or failure of the password update attempt

Cluster configuration

Logging was added to help identify issues that are caused by applying restrictive permissions to the /usr/share/zoneinfo directory or its subdirectories. (138729)

Note

It is possible to apply permissions to the /usr/share/zoneinfo directory and


its subdirectories that will prevent the isi_papi_d process from reading necessary
files. If the isi_papi_d process cannot access these files, the OneFS web
administration interface cannot start, and lines similar to the following appear in
the /var/log/messages file:
/boot/kernel.amd64/kernel:[kern_sig.c:3349](pid
10953="isi_papi_d")
(tid=100317) Stack trace:
/boot/kernel.amd64/kernel:
Stack:
-------------------------------------------------/boot/kernel.amd64/kernel:
/lib/libc.so.7:strlcpy+0x15
/boot/kernel.amd64/kernel:
/usr/lib/libisi_config.so.1:
arr_dev_type_parse+0xb23
/boot/kernel.amd64/kernel:
/usr/lib/libisi_config.so.1:
_arr_config_load_from_impl+0x174
/boot/kernel.amd64/kernel:
/usr/lib/libisi_platform_api.so.1:
_ZN24cluster_identity_handler8http
_getERK7requestR8response+0x39
/boot/kernel.amd64/kernel:
/usr/lib/libisi_rest_server.so.1:
_ZN11uri_handler19execute
_http_methodERK7requestR8response+0x57d
/boot/kernel.amd64/kernel:
/usr/lib/libisi_rest_server.so.1:
_ZN11uri_manager15execute
_requestER7requestR8response+0x100
/boot/kernel.amd64/kernel:
/usr/lib/libisi_rest_server.so.1:
_ZN14request_thread7processEP12fcgi_request+0xbd
/boot/kernel.amd64/kernel:
/usr/lib/libisi_rest_server.so.1:
_ZN14request_thread6on_runEv+0x1b
/boot/kernel.amd64/kernel:
/lib/libthr.so.3:_pthread_getprio+0x15d
/boot/kernel.amd64/kernel:
-------------------------------------------------/boot/kernel.amd64/kernel: pid 10953 (isi_papi_d), uid 1:
exited on signal 11 (core dumped)

File system

New control: The --reserved option was added to the isi get command, and the isi get command was modified so that it runs only on specific, reserved logical inodes (LINs) when the command is run with both the --reserved option and the -L option. (141959)

Logging similar to the following was added to the /var/log/messages file if the NVRAM journal cannot be read: (139667)

Bad type: 0

Logging was added to improve diagnosis of issues that can occur if a necessary OneFS python file fails to load. If this condition is encountered, a message similar to the following appears in the /var/log/messages file, where <python_file> is the name of the python file that failed to load: (138733)

python: Failed to import isi.app.lib.cluster in <python_file>

Note

In addition to the messages described above, if you run the isi stat command or the isi events list -w command, a bad marshal error appears on the console. If you encounter the issue that this new logging is intended to help diagnose, contact EMC Isilon Technical Support for assistance. For more information about this issue, see article 197403 on the EMC Online Support site.

HDFS

Support for Ambari 1.7.1 was added. (145759)

Support for the getEZForPath and checkAccess HDFS RPC calls was added. (142558, 140040)

Note

In previous versions of OneFS, if an HDFS client sent a request to the HDFS server that contained one of these RPC calls, the call failed, and messages similar to the following were returned to the client:

org.apache.hadoop.ipc.RemoteException (org.apache.hadoop.ipc.RpcNoSuchMethodException): Unknown rpc: getEZForPath

and

org.apache.hadoop.ipc.RemoteException (org.apache.hadoop.ipc.RpcNoSuchMethodException): Unknown rpc: checkAccess

Support for Ambari version 1.7.0 was added. (140051)

Security

The version of GNU bash installed on the cluster was updated to version 4.1.17. For more information, see ESA-2014-146 on the EMC Online Support site. (143337)

User input that is passed to a command line is now escaped using quotation marks. For more information, see ESA-2015-112 on the EMC Online Support site. (140931)

An update was applied to address a denial of service vulnerability in Apache HTTP Server. For more information, see ESA-2015-093 on the EMC Online Support site. (137884)

New and changed in OneFS 7.2.0.1


Authentication

Adds the ability to enable Telnet on the cluster. For more information, see article 198100 on the EMC Online Support site. (137111)

Adds a setting to the OneFS registry that enables you to configure the maximum amount of memory that can be allocated to the lsass process. (134439)

Note

Without this setting, the maximum amount of memory that can be allocated to the lsass process is set to a default of 512 MB. If the system approaches that limit, LDAP connections are closed, and lines similar to the following appear in the lsassd.log file:

Error code 40286 occurred during attempt 0 of a ldap search. Retrying.
Error code 40286 occurred during attempt 1 of a ldap search. Retrying.
Error code 40286 occurred during attempt 2 of a ldap search. Retrying.
Error code 40286 occurred during attempt 1 of a ldap search. Retrying.
Error code 40286 occurred during attempt 1 of a ldap search. Retrying.

Work with EMC Isilon Support to determine whether you need to configure the amount of memory allocated to the lsass process. The memory limit must be at least 512 MB, and no more than 1024 MB. If the memory limit is set outside that range, the system will restore the default value of 512 MB. For more information, see article 195564 on the EMC Online Support site.


Cluster configuration

Updates the time zone database that OneFS relies on when you configure the cluster time zone to Time Zone Data v. 2014h. This database is made available by the Internet Assigned Numbers Authority (IANA). (135492)

Diagnostic tools

New control: The following options were added to the isi_gather_info command: (135226)

- --dump and --cores to collect the associated files for diagnosis.
- --no-cores and --no-dumps if the associated files are not needed.
- --clean-all, --clean-cores, and --clean-dumps to delete the associated files from the /var/crash directory after successful compression of the package.

Note

dump refers to files that are logged when the node stops responding, and core refers to files that are logged when the node unexpectedly restarts.
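
For example, using only the options listed above, a gather that skips core and dump collection, or one that collects them and then removes the files after the package is compressed, could be run as follows:

isi_gather_info --no-cores --no-dumps
isi_gather_info --cores --dump --clean-all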

File transfer

The throughput calculation performed by the vsftpd process was improved so that the total throughput perceived by FTP clients is more precisely controlled by configuring the local_max_rate option in the /etc/mcp/templates/vsftpd.conf file. (134432)

Note

Prior to implementing this fix, after configuring the local_max_rate option, the total throughput perceived by FTP clients was lower than expected.
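
As a brief illustration (the value shown is an example, not a recommended setting), local_max_rate is expressed in bytes per second and is set in the /etc/mcp/templates/vsftpd.conf file, for example:

local_max_rate=1048576

This would cap the throughput of each client session at roughly 1 MB/s.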

HDFS

Support for Cloudera 5.2 was added. (138484)

Support for Ambari 1.6.1 was added. (133358)

Security

The version of OpenSSL installed on the cluster was updated to 0.9.8zc. (137904)

The versions of the Network Time Protocol daemon (NTPD) and Apache were updated as follows: (137895)

- The version of Apache that is installed on the cluster was updated from 2.2.21 to 2.2.25.
- The version of NTPD that is installed on the cluster was updated from 4.2.4p4 to 4.2.6p5.

The version of ConnectEMC installed on the cluster was updated from version 3.2.0.4 to 3.2.0.6. This upgrade changes the behavior of the ConnectEMC component so that it no longer uses an internal version of OpenSSL and instead relies on the version of OpenSSL installed on the Isilon cluster. For more information, see ESA-2015-038 on the EMC Online Support Site. (134760)

SmartLock

Adds commands to the sudoers file, which is a file that defines which commands a user with sudo privileges is permitted to run. These additional commands enable EMC Isilon Technical Support staff to troubleshoot clusters that are in compliance mode. (133285)

SmartQuotas

New control: The efs.quota.allow_remote_root sysctl parameter was added to allow a root user who is connected to the cluster remotely to make changes to files and directories within a SmartQuota domain, even if those changes would exceed or further exceed the quota domain's hard threshold. (131283)

For more information about sysctls, see article 89232 on the EMC Online Support site.
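
A minimal illustration of enabling the behavior on a node, assuming the standard sysctl syntax (persisting the setting cluster-wide may require the isi_sysctl_cluster procedure described in article 89232):

sysctl efs.quota.allow_remote_root=1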

SMB

New control: The following option was added to the gconfig file: (136296)

registry.Services.lwio.Parameters.Drivers.onefs.FileAttributeEncryptedIgnored

If this option is enabled, Windows offline encrypted files will be synchronized in unencrypted format when an affected user reconnects to the cluster. To enable this option, run the following command:

isi_gconfig registry.Services.lwio.Parameters.Drivers.onefs.FileAttributeEncryptedIgnored=1

New control: The SMB 1 maximum buffer size can now be configured to meet the requirements of Kazeon applications. (134448)

To configure the SMB 1 maximum buffer size:

1. Open an SSH connection on any node in the cluster and log on using the root account.
2. Run the following command from the command line, where <max_buffer> is the desired maximum buffer size:

isi_gconfig registry.Services.lwio.Parameters.Drivers.srv.MaxBufferSizeSMB1=<max_buffer>

Note

For optimal interoperability with Kazeon, the maximum buffer size should be set to 16644.
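
For example, to apply the Kazeon-recommended value from the note above, substitute 16644 for <max_buffer> in the command already shown:

isi_gconfig registry.Services.lwio.Parameters.Drivers.srv.MaxBufferSizeSMB1=16644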

CHAPTER 4
New hardware and firmware support

The following sections list new support for hardware and firmware revisions that was
added in the specified OneFS releases.

New hardware and firmware support in OneFS 7.2.0.4...........................................30


New hardware and firmware support in OneFS 7.2.0.3 (Target Code)..................... 30
New hardware and firmware support in OneFS 7.2.0.2...........................................30
New hardware and firmware support in OneFS 7.2.0.1...........................................30
New hardware and firmware support in OneFS 7.2.0.0...........................................32


New hardware and firmware support in OneFS 7.2.0.4


Support for SMART iSATA SG9SLM3B8GBM11ISI 8GB boot flash drives was added. (156892)
  Model number: SG9SLM3B8GBM11ISI
  Drive type: Boot flash
  Compatible nodes: IQ108NL, NL400, S200, X200, X400
  Firmware: Ver7.02k or Ver7.02w

New hardware and firmware support in OneFS 7.2.0.3 (Target Code)

Adds support for the Sunset Cove Plus 800 GB drives with D252 firmware. (146915)
Model Number: HGST HUSMM1680ASS205
Drive Type: SED SSD
Compatible Nodes: HD400, NL400, X200, X400, X410
Firmware: D252

Adds support for D254 firmware, which is installed on HGST Ultrastar 800 GB drives. (146915)
Model Number: HGST HUSMM8080ASS205
Drive Type: SED SSD
Compatible Nodes: S210, X200, X400, X410, NL400
Firmware: D254

Fixes upgrade path from MKAOA580 firmware, which is installed on 3 TB drives. (146915)
Model Number: HGST HUA723030ALA640
Drive Type: HDD
Compatible Nodes: X200, X400, NL400, IQ 108NL, IQ 108000X
Firmware: MKAOA580

New hardware and firmware support in OneFS 7.2.0.2

Adds support for Sunset Cove 800 GB SED SSDs with D252 firmware. (144840)
Model Number: HGST HUSMM1680ASS205
Drive Type: SED SSD
Compatible Nodes: X200, X400, NL400, X410, HD400
Firmware: D252

New hardware and firmware support in OneFS 7.2.0.1

Adds support for Sunset Cove Plus 1.6 TB drives with A204 firmware. (134055)
Model Number: HGST HUSMM1616ASS200
Drive Type: SSD
Compatible Nodes: S200, S210, X200, X400, X410
Firmware: A204

Adds support for A204 firmware for Sunset Cove Plus 800 GB drives. (134055)
Model Number: HGST HUSMM1680ASS200
Drive Type: SSD
Compatible Nodes: S200, S210, X200, X400, X410, NL400
Firmware: A204

Adds support for A204 firmware for Sunset Cove Plus 400 GB drives. (134055)
Model Number: HGST HUSMM1640ASS200
Drive Type: SSD
Compatible Nodes: S200, S210, X200, X400, X410
Firmware: A204

Adds support for A204 firmware for Sunset Cove Plus 200 GB drives. (134055)
Model Number: HGST HUSMM1620ASS200
Drive Type: SSD
Compatible Nodes: S200, S210, X200, X400, X410
Firmware: A204

Adds support for new 32 GB Smart Modular boot flash drives with an A19 controller. (134055)
Model Number: SHMST6D032GHM11EMC 118000100
Drive Type: Boot SSD
Compatible Nodes: X410, S210, HD400
Firmware: S8FM08.0

Adds support for 1EZ firmware, which is installed on HGST 6 TB drives. (134055)
Model Number: HGST HUS726060ALA640
Drive Type: HDD
Compatible Nodes: HD400, NL400
Firmware: 1EZ

Adds support for A006 firmware, which is installed on 3 TB Seagate Mantaray SEDs. (134055)
Model Number: ST33000652SS
Drive Type: HDD
Compatible Nodes: X200, X400, NL400
Firmware: A006

Adds support for firmware revision MFAOABW0, which is installed on 4 TB Hitachi Mars-K Plus SATA drives. (134055)
Model Number: HGST HUS724040ALA640
Drive Type: SATA
Compatible Nodes: X200, X400, NL400, IQ72000X, IQ72NL
Firmware: MFAOABW0

Adds support for firmware revision MF8OABW0, which is installed on 3 TB Hitachi Mars-K Plus SATA drives. (134055)
Model Number: HGST HUS724030ALA640
Drive Type: SATA
Compatible Nodes: X200, X400, NL400, IQ108NL
Firmware: MFAOABW0

Adds support for firmware revision MF6OABW0, which is installed on 2 TB Hitachi Mars-K Plus SATA drives. (134055)
Model Number: HGST HUS724020ALA640
Drive Type: SATA
Compatible Nodes: X400, NL400
Firmware: MFAOABW0


New hardware and firmware support in OneFS 7.2.0.0


No new hardware or firmware support was added in this release.


CHAPTER 5
Resolved issues

This section contains the following topics:

Resolved in OneFS 7.2.0.4
Resolved in OneFS 7.2.0.3 (Target Code)
Resolved in OneFS 7.2.0.2
Resolved in OneFS 7.2.0.1
Resolved in OneFS 7.2.0.0


Resolved in OneFS 7.2.0.4


Antivirus
Antivirus issues resolved in OneFS 7.2.0.4

ID

The OneFS web administration interface did not list any files in the Detected
Threats section of the Antivirus > Reports page if any ASCII special characters
(for example, an ampersand (&)) were in the path name of any infected file.

153117

The OneFS antivirus client could not connect to some ICAP servers if the ICAP URL
that you configured on the cluster was not in the following format:

144726

icap://<hostname>:<port>/avscan
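
For example, assuming an ICAP server named antivirus.example.com that listens on the standard ICAP port (1344), a URL in the required format would be:

icap://antivirus.example.com:1344/avscan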

Authentication
Authentication issues resolved in OneFS 7.2.0.4

ID

A local user who did not have root privileges could not change their password by
running the UNIX passwd command. As a result, if an affected user's password
expired, they were unable to log on to the cluster until the password was reset
through another method.

155570

If an SMB client sent a request to apply an invalid security identifier (SID) to a file or
directory on the cluster, the cluster returned a STATUS_IO_TIMEOUT response.
Depending on the application that was used to send the request, a message similar
to the following might have appeared on the client:
The specified network name is no longer available

154257

If the cluster was not joined to a Microsoft Active Directory (AD) domain, and you
attempted to change the access control list (ACL) of a file on the cluster from a
Windows client, the operation failed, and a message similar to the following
appeared on the client:

150915

The program cannot open required dialog box because it cannot
determine whether the computer named "10.0.1.1" is joined to a
domain. Close this message, and try again.

Under these conditions, ACLs could only be modified through the OneFS command-line interface.

Backup, recovery, and snapshots


Backup, recovery, and snapshots issues resolved in OneFS 7.2.0.4

ID

While synchronizing data between source and target clusters in compliance mode,
if the file flags applied to a file on the source cluster differed from the file flags
assigned to the file on the target cluster, SyncIQ attempted to update the file
attributes of WORM committed files on the target cluster even if the retention date
for those files had not yet passed. As a result, the synchronization failed. If this
issue occurred, lines similar to the following appeared in the /var/log/messages file:

bam_ads_setmode error: 30
Local error : syncattr error for <path_to_WORM_file>: chfal

157106

During an initial SyncIQ data replication, Access Control Lists (ACLs) applied to
symbolic links, pipes, block devices, and character devices were not replicated
from the SyncIQ source cluster to the SyncIQ target cluster. As a result, following an
initial synchronization, applications and users were prevented from accessing
these file system objects and were also prevented from accessing files and
directories on the cluster through symbolic links.

155965

When performing an NDMP restore, OneFS verifies the end of the data stream by
detecting two consecutive blocks of zeroes. In rare cases, in OneFS 7.2.0.0 through
7.2.0.3, if the second block of zeroes was stored in a different buffer than the first
block of zeroes, OneFS did not read the second block of zeroes from the other
buffer, and instead read the data that followed the first block of zeroes. If this
occurred, the restore operation was immediately stopped, and data that was in the
process of being restored might have been incompletely restored.
This issue did not occur if the RESTORE_OPTIONS NDMP environment variable was
set to 1, specifying that a single-threaded restore operation be performed.

Note
By default, restore operations are multi-threaded.

155782


If you attempted to run a SyncIQ job from OneFS 5.5 to OneFS 7.x and the job did
not have a valid policy ID, the job stopped without dispatching a failure message,
and an error similar to the following appeared in the /var/log/messages file:

154830

Stack: ------------------------------------------------/lib/libc.so.7:__sys_kill+0xc
/usr/lib/libisi_util.so.1:isi_assert_halt+0xa0
/usr/lib/libisi_migrate_private.so.2:get_lmap_name+0x54
/usr/bin/isi_migr_sworker:work_init_callback+0xacd
/usr/bin/isi_migr_sworker:old_work_init4_callback+0x16f
/usr/lib/libisi_migrate_private.so.2:generic_msg_unpack+0x8bc
/usr/lib/libisi_migrate_private.so.2:migr_process+0x2f1
/usr/bin/isi_migr_sworker:main+0xafa
/usr/bin/isi_migr_sworker:_start+0x8c
-------------------------------------------------/boot/kernel.amd64/kernel: pid 24302 (isi_migr_sworker), uid 0:
exited on signal 6 (core dumped)

Note

Starting in OneFS 7.2.0.4, the following message will appear in the /var/log/
isi_migrate.log file:
Source version unsupported. 'sync_id' must contain a valid policy
id.

If the force_interface option was enabled on a SyncIQ policy, the SyncIQ
scheduler process, isi_migr_sched, leaked memory. If this occurred, scheduled
policies stopped running, and the following message appeared in the
/var/log/isi_migrate.log file:

Cannot allocate memory

154326

If you set the BACKUP_OPTIONS NDMP environment variable to a value of 7 to run


incremental, token-based backups, OneFS created entries in the dumpdates file
for all levels of backup, rather than creating dumpdates entries only for level 10,
incremental backups. As a result, NDMP snapshots never expired.

154311

If you used the snapshot-based incremental backup feature during a backup
operation and if multiple snapshots were created between backups, the feature
might have failed to recognize that data had changed during the backup procedure.
As a result, some changed files were not backed up.
For more information, see ETA 203815 on the EMC Online Support site.

154269

If you configured SyncIQ policies to run when source files were modified by setting
the Run Job option to Whenever the source is modified, a memory
leak could have occurred in the SyncIQ scheduler (isi_migr_sched). If this issue
occurred, new SyncIQ jobs were not scheduled, some data was unavailable, and a
message similar to the following appeared in the /var/log/isi_migrate.log file:

Could not allocate parser read buffer: Cannot allocate memory

154259

If you performed an NDMP direct access recovery (DAR) or selective restore on an


Isilon cluster, OneFS assigned ownership of the restored directories to the root
account. Because clusters in compliance mode do not have a root account, the
restored directories were inaccessible on clusters in compliance mode, unless the
compadmin user was logged on to the compliance cluster.

154250

Although multiple IPv4 and/or IPv6 addresses were defined, NDMP listened to only
one IPv4 and/or one IPv6 IP address. For example:

154248

If a node had multiple IPv4 addresses defined, NDMP listened to only one IPv4
address.

If a node had multiple IPv6 addresses defined, NDMP listened to only one IPv6
address.

If a node had both IPv4 addresses and IPv6 addresses defined, NDMP listened
to only one IPv4 address and only one IPv6 address.

During a snapshot-based incremental backup, a Write Once Read Many (WORM) file
might have been backed up as a regular file. If this occurred, and the files were
restored, the files were restored as regular files, and they could have been modified
after they were restored.

154246


If the isi_ndmp_d process was stopped, the NDMP process ID file was still locked
by one or more NDMP child processes. As a result, the mcp process could not
restart the isi_ndmp_d process, and no new NDMP connections could be
established. If this occurred, a Failed to spawn NDMP daemon message
appeared in the /var/log/isi_ndmp_d.log file.

154244

If you queried for the date on which a SyncIQ policy would next be run using the
next_run OneFS API property, the date and time that was returned was incorrect.

154211


During Data Management Application (DMA) polling, if no tape was loaded in a
backup drive, OneFS set the drive to unbuffered mode. As a result, if a non-Isilon
backup initiator did not set the tape drive to buffered mode before starting a
backup-to-tape, the backup-to-tape performance by non-Isilon initiators might have
been very slow.

Note
This was not an issue if the tape drive was used only by Isilon backup accelerators.

153451

While a SyncIQ policy was running, if a SyncIQ primary worker (pworker) process on
the source cluster sent a list of directories to delete to a secondary worker
(sworker) on the target cluster, and then the pworker process unexpectedly
stopped, the pworker's work range was transferred to another pworker. The other
pworker then sent the list of directories to another sworker. This action resulted in
two sworker processes on the target cluster trying to delete the same directory at
the same time. If this issue occurred, the SyncIQ job stopped, and lines similar to
the following appeared in the /var/log/messages file:

153446

/boot/kernel.amd64/kernel:
[kern_sig.c:3349](pid 70="isi_migr_sworker")(tid=2) Stack
trace:
/boot/kernel.amd64/kernel:
Stack:-------------------------------------------------/boot/kernel.amd64/kernel: /usr/bin/isi_migr_sworker:move_dirents
+0x1b6
/boot/kernel.amd64/kernel: /usr/bin/isi_migr_sworker:delete_lin
+0x279
/boot/kernel.amd64/kernel: /usr/bin/
isi_migr_sworker:delete_lin_callback+0x143
/boot/kernel.amd64/kernel: /usr/lib/libisi_migrate_private.so.
2:generic_msg_unpack+0x8bc
/boot/kernel.amd64/kernel: /usr/lib/libisi_migrate_private.so.
2:migr_process+0x2f1
/boot/kernel.amd64/kernel: /usr/bin/isi_migr_sworker:main+0xa18
/boot/kernel.amd64/kernel: /usr/bin/isi_migr_sworker:_start+0x8c
/boot/kernel.amd64/kernel:
-------------------------------------------------/boot/kernel.amd64/kernel: pid 70 (isi_migr_sworker), uid 0:
exited on signal 10 (core dumped)

Additionally, the following error might have appeared in the /var/log/


isi_migrate.log file:
Error : Unable to open lin 0:Invalid argument: Invalid argument
from remove_entry_from_parent (utils.c:1516)
from remove_single_entry (utils.c:1595)
from remove_all_parent_dirents (utils.c:1680)
from delete_lin (stf_transfer.c:784)

If a SyncIQ policy designated a target directory that was nested within the SyncIQ
target directory of a pre-existing policy, an error occurred during SyncIQ protection
domain creation which caused the SyncIQ policy's protection domain to be
incomplete. If this occurred, the following message appeared in the /var/log/
isi_migrate.log file:

153444

create_domain: failed to ifs_domain_add

In addition, if you ran the isi domain list -lw command, the Type field for
the affected SyncIQ target was marked Incomplete.


If you ran a full SyncIQ data replication to a target directory that contained a large
number of files that no longer existed in the source directory, it was possible for the
process that removes extra files from a target directory to conflict with the process
that created the domain for the target directory. If this occurred, the SyncIQ job
failed and had to be restarted.

153437

If the --skip_bb_hash option of an initial SyncIQ policy was set to no (the
default setting), and if a SyncIQ file split work item was split between pworkers, the
pworker that was handling the file split work item might have attempted to transfer
data that had already been transferred to the target cluster. If this occurred, the
isi_migr_pworker process repeatedly restarted and the SyncIQ policy failed. In
addition, the following lines appeared in the /var/log/messages file:

153377

isi_migrate[45328]: isi_migr_pworker: *** FAILED ASSERTION
cur_len != 0 @ /usr/src/isilon/bin/isi_migrate/pworker/
handle_dir.c:463:
/boot/kernel.amd64/kernel: [kern_sig.c:3376](pid
45328="isi_migr_pworker")(tid=100957)
Stack trace:/boot/kernel.amd64/kernel: Stack:
-------------------------------------------------/boot/kernel.amd64/kernel:
/lib/libc.so.7:__sys_kill+0xc
/boot/kernel.amd64/kernel
/usr/lib/libisi_util.so.1:isi_assert_halt+0xa0
/boot/kernel.amd64/kernel:
/usr/bin/isi_migr_pworker:migr_continue_file+0x1507
/boot/kernel.amd64/kernel:
/usr/bin/isi_migr_pworker:migr_continue_generic_file+0x9a
/boot/kernel.amd64/kernel:
/usr/bin/isi_migr_pworker:migr_continue_work+0x70
/boot/kernel.amd64/kernel:
/usr/lib/libisi_migrate_private.so.2:migr_process+0xf
/boot/kernel.amd64/kernel:
/usr/bin/isi_migr_pworker:main+0x606
/boot/kernel.amd64/kernel:
/usr/bin/isi_migr_pworker:_start+0x8c
/boot/kernel.amd64/kernel:
-------------------------------------------------/boot/kernel.amd64/kernel: pid 45328 (isi_migr_pworker), uid
0:exited on signal 6 (core dumped)

If a SyncIQ job was interrupted during the change compute deletion phase
(STF_PHASE_CC_DIR_DEL), the Logical Inodes (LINs) could have been incorrectly
removed from the SyncIQ job work list. If this occurred, the SyncIQ job failed, and
messages similar to the following appeared in the /var/log/
isi_migrate.log file:

150613

Unable to update metadata (inode changes) information for Lin


Operation failed while trying to detect all deleted lins in


If you viewed the details of a snapshot alias in the OneFS web administration
interface, the Most Recent Snapshot Name was always No value, and the
Most Recent Snapshot ID was always 0.

145938

If you started a restartable backup with a user snapshot, after the backup was
completed and the BRE context was removed, the expiration time of the snapshot
was changed. As a result, the snapshot might have been deleted prematurely.

144427


Cluster configuration
Cluster configuration issues resolved in OneFS 7.2.0.4

ID

If you ran the isi_ntp_config command to exclude a particular node from
contacting an external Network Time Protocol (NTP) server, subsequent attempts to
exclude another node failed, and, after running the command to exclude another
node, a message similar to the following appeared on the console:

'str' object has no attribute 'gettext'

As a result, only one node could be excluded from contacting an external NTP
server.

154322

Diagnostic tools
Diagnostic tools issues resolved in OneFS 7.2.0.4

ID

Because the following ESRS log files were not listed in the newsyslog.conf file
(a configuration file that manages log file rotation), over time the files could have
grown in size and could have filled the /var partition:

/var/log/GWExt.log
/var/log/GWExtHTTPS.log

154107

Note
If the /var partition on a node in the cluster is 90% full, OneFS logs an event
warning that a full /var partition can lead to system stability issues. Depending on
how the cluster is configured, an alert might also be issued for this event.

When EMC Secure Remote Services (ESRS) was configured on the cluster, the ESRS
process automatically selected the first available IP address, rather than selecting
an IP address from an IP address pool in the System access zone. Since only the
System zone allows a user SSH access for remote management, if the selected IP
address was not in the System access zone, EMC Isilon Support could not monitor
the cluster remotely.

153455

Events, alerts, and cluster monitoring


Events, alerts, and cluster monitoring issues resolved in OneFS 7.2.0.4

ID

Because isi_rest_server, a component of the Platform API, did not check for the
correct error codes when interacting with the OneFS auditing system's queue
producer library (QPL), if configuration auditing was enabled and there was an error
in the QPL, the error was not handled correctly. If this issue occurred, it might have
prevented system configuration changes from being audited.

156400

If auditing is enabled, the audit filter waits for a response from the queue producer
library (QPL) before sending audit events to the auditing process (isi_audit_d).
In OneFS 7.2.0.0 through 7.2.0.3, if the QPL became disconnected from the
auditing process, isi_audit_d, while the auditing process was waiting for a
response, the QPL failed to send a response to the auditing process. If this
occurred, auditing events continued to collect in the auditing process until the
queue became full. If the auditing process queue became full, processes related to
events that were being audited (for example, processes related to file system
protocols and configuration changes) might have stopped working. Depending on
which related processes were affected, various cluster operations could have been
disrupted by this issue; for example, if configuration auditing was enabled, you
might have been prevented from making configuration changes through the OneFS
web administration interface.

156398

Under some circumstances, multiple isi_papi_d process threads might have called
the same code at the same time. If this occurred, the isi_papi_d process might
have unexpectedly restarted.

154324

If file system auditing was enabled and you configured the system to audit events
in which a user renamed a file, if the user renamed the file from a Mac client
connected to the cluster through a virtual private network (VPN), the complete path
to the file was not always captured in the audit log. If this occurred, applications
that relied on the file paths in the audit logs might have been adversely affected.
Beginning in OneFS 7.2.0.4, if a user attempts to rename a file and the complete
file path to the renamed file is not captured in the audit log, the file is not renamed
and an error appears in the audit log.

153463

Only the root user was permitted to run the isi_audit_viewer command. This
limitation prevented other users, including users with sudo privileges, from
viewing configuration audit logs and protocol audit logs on the cluster.

153439

If you enabled auditing on the cluster, only nodes that had the primary external
interface (em0) configured could communicate with the Common Event Enabler
(CEE) server, even if a secondary interface, such as em1, was configured and active
on the node. As a result, the audit logs from these nodes were not collected on the
CEE server.

153432

If you configured OneFS to send syslog messages to a remote syslog server, the
HOSTNAME of the cluster was not included in the messages. The absence of the
HOSTNAME entry made it difficult to distinguish messages sent from multiple
clusters to the same syslog server.

153417

Because the OneFS auditing system did not correctly convert a POSIX path with
multiple path separators (/) into a Microsoft UNC path, if NFS protocol auditing was
enabled, incorrect paths could have been recorded in the audit log and
applications that rely on the information in the audit log might have been adversely
affected.

150920

If file system protocol auditing was enabled and a client opened a parent directory
and then opened a subdirectory or file within the parent directory, the auditing
system might have incorrectly appended the subdirectory or file path to the parent
directory path. If this occurred, the incorrect path might have caused an error in the
auditing process and file system protocol events that were in the process of being
logged might not have been captured. If the incorrect path was logged,
applications that relied on file paths in the audit log might have been adversely
affected.


150918


File system
File system issues resolved in OneFS 7.2.0.4

ID

If a node ran for more than 497 days without being rebooted, an issue that affected
the OneFS journal buffer sometimes disrupted the drive sync operation. If this issue
occurred, OneFS reported that the journal was full, and as a result, resources that
were waiting for a response from the journal entered a deadlock state. Any cluster that
contains a node that has run for more than 497 consecutive days with no downtime
might unexpectedly reboot as a result of this issue.
For more information, see ETA 202452 on the EMC Online Support site.

158417

If a node ran for eight months or longer without a reboot and the node's internal
clock rolled over, the universal memory allocator (UMA) processed an invalid value,
which prevented the UMA from reclaiming any of the memory it had allocated. If
this issue occurred, the affected node might have run out of memory, causing the
node to unexpectedly reboot.

157489

On a compliance mode cluster, if either the retention period or the DOS Read
Only flag that was applied to a file on a SyncIQ source cluster was changed after
the initial synchronization, subsequent incremental SyncIQ jobs failed, and
messages similar to the following appeared in the /var/log/messages file,
where <path> was the path to the file on the target cluster:

156270

Local error : syncattr error for <path>: Readonly file system

This issue occurred because, under these conditions, an unnecessary chown


command was also sent to the target cluster.
If you installed the drive support package (DSP) 1.5 firmware update on a cluster
that contained a node with solid-state drives (SSDs) that were configured for use as
L3 cache, the node might have rebooted unexpectedly. If a node rebooted for this
reason, messages similar to the following appeared in the /var/log/messages
file:

154266

Stack: -------------------------------------------------kernel:sched_switch+0x125
kernel:mi_switch+0x12e
kernel:sleepq_wait+0x3a
kernel:_sleep+0x37a
efs.ko:l3_mgmt_drive_state+0x9bd
efs.ko:drv_change_drive_state+0x178
efs.ko:drv_down_drive_prepare+0x1c2
efs.ko:drv_down_drive+0x81
efs.ko:drv_unmount_drive+0x176
efs.ko:drv_modify_drive_state_down+0x1d4
efs.ko:ifs_modify_drive_state+0x35a
efs.ko:_sys_ifs_modify
cpuid = 28
Panic occurred in module efs.ko loaded at 0xffffff87bde5a000:

If OneFS was not mounted on a node and you ran the isi_flush --l3-full
command on that node, the node restarted unexpectedly and messages similar to
the following appeared in the /var/log/messages file:

154264

Stack: -------------------------------------------------kernel:trap_fatal+0x9f
kernel:trap_pfault+0x386
kernel:trap+0x303
efs.ko:mgmt_finish_super+0x4e


efs.ko:l3_mgmt_nuke+0x70
efs.ko:sysctl_l3_nuke+0xcb
kernel:sysctl_root+0x132
kernel:userland_sysctl+0x18f
kernel:__sysctl+0xa9
kernel:isi_syscall+0x39
kernel:syscall+0x28b
--------------------------------------------------

If you attempted to smartfail multiple nodes that were holding user locks, the lock
was held by LK client entries but not present in lock failover (LKF) entries. As a
result of this inconsistency, future lock attempts failed, and a manual release of the
lock was required to grant the desired access.

153436

If you exceeded the number of recommended snapshots on a cluster, nodes in the
cluster might have rebooted unexpectedly. If this issue occurred, lines similar to
the following appeared in the /var/log/messages file:

152660

/boot/kernel.amd64/kernel:
Stack:-------------------------------------------------/boot/kernel.amd64/kernel:kernel:isi_assert_halt+0x42
/boot/kernel.amd64/kernel:efs.ko:pset_resize+0x107
/boot/kernel.amd64/kernel:efs.ko:pset_add+0x50
/boot/kernel.amd64/kernel:efs.ko:bam_data_lock_get_impl+0x1c8
/boot/kernel.amd64/kernel:efs.ko:bam_data_lock_get+0x2b
/boot/kernel.amd64/kernel:
efs.ko:ifm_read_op_init+0xa8
/boot/kernel.amd64/kernel:efs.ko:bam_mark_file_data+0xfd
/boot/kernel.amd64/kernel:efs.ko:ifs_mark_file_data+0x373
/boot/kernel.amd64/kernel:efs.ko:_sys_ifs_mark_file_data+0x166
/boot/kernel.amd64/kernel:kernel:isi_syscall+0x53
/boot/kernel.amd64/kernel:kernel:syscall+0x1db
/boot/kernel.amd64/kernel:-------------------------

If you ran a SmartPools job on a file with an alternate data stream (ADS), the job
sometimes failed, and continued to fail, even if the job was manually started. If the
SmartPools job failed for this reason, the SmartPools process eventually stopped
running scheduled jobs, and this might have caused node pools to become full,
degrading cluster performance. If this occurred, the SmartPools job reported an
error similar to the following in the job history report:

151619

Node 6: pctl2_set_expattr failed: No such file or directory

In some environments, where there was a heavy workload on the cluster, a node
could run out of reserved kernel threads. This condition could have caused the
node to restart unexpectedly. If this issue occurred, client connectivity to that node
was interrupted, and lines similar to the following appeared in the /var/log/
messages file:
panic @ time 1422835686.820, thread 0xffffff0248243000: ktp: No
reserved threads left
cpuid = 6
Panic occurred in module efs.ko loaded at 0xffffff87b7c84000:
Stack: -------------------------------------------------efs.ko:ktp_assign_reserve+0x29f
efs.ko:dfq_reassign_cb+0x9b
kernel:_sx_xlock_hard+0x276
kernel:_sx_xlock+0x4f
efs.ko:lki_unlock_impl+0x306
efs.ko:lk_unlock+0xbe
efs.ko:bam_put_delete_lock_by_lin+0x36
efs.ko:_bam_free_free_store+0x34

efs.ko:dfq_service_thread+0x139
efs.ko:kt_main+0x83
kernel:fork_exit+0x7f

143399

Hardware
Hardware issues resolved in OneFS 7.2.0.4

ID

In rare cases, a failing dual in-line memory module (DIMM) caused a burst of
correctable error correcting code (ECC) errors. If this burst of errors was extreme
(for example, if it occurred tens of thousands of times per hour), the performance of
the node and the cluster might have been degraded. If this issue occurred, a
message similar to the following appeared tens of thousands of times per hour in
the /var/log/messages file and on the console:

156345

RDIMM P1-DIMM1A (cpu 0, channel 0, dimm 0) non-fatal


(correctable) ECC error

This issue continued until the DIMM was replaced.


If the hardware abstraction layer (HAL) could not detect the network interface card
(NIC) in an Isilon node, the HAL assigned an empty string to the related nic name
attribute in the lni.xml file, instead of returning an empty list. As a result, when
the flexnet configuration file (flx_config.xml) was updated with this
information, the related <nic-name> element in the flx_config.xml file was also
empty. The empty element was an invalid entry in the file and it rendered the
flx_config.xml file unusable by the node. Because an updated
flx_config.xml file is propagated to all nodes in the cluster, this issue could
have caused all nodes in the cluster to have a flx_config.xml file with invalid
entries. If this occurred, client connections to the cluster might have been
disrupted until the unusable flx_config.xml file was replaced.

155333

If you ran the isi firmware status command on a cluster that contained
S210 nodes with common form factor power supply units (PSUs) that had part
number 071-000-022-00, and if firmware package version 9.3.1 or later was not
installed on the cluster, messages similar to the following appeared on the
console:

154596

CFFPS1_Blastoff CFFPS 09.05 2,7


CFFPS1_Blastoff_DC CFFPS <CFFPS1_Blastoff_DC> 2,7
CFFPS1_Optimus CFFPS <CFFPS1_Optimus_Acbel> 2,7
CFFPS2_Blastoff CFFPS 09.05 2,7
CFFPS2_Blastoff_DC CFFPS <CFFPS2_Blastoff_DC> 2,7
CFFPS2_Optimus CFFPS <CFFPS2_Optimus_Acbel> 2,7

This issue occurred because earlier versions of OneFS, and earlier versions of the
firmware package did not recognize PSU part number 071-000-022-00.
Note

This issue can be resolved in earlier versions of OneFS 7.2.0.x by installing


firmware package version 9.3.1 or later.


If a node with a LOX NVRAM card was unable to communicate with the NVRAM card
because the NVRAM card controller was unexpectedly reset, the cluster became
unresponsive to all client requests and data on the cluster was unavailable until
the affected node was rebooted.

153693

Note

Beginning in OneFS 7.2.0.4, if this issue is encountered, the affected node will be
rebooted automatically to prevent the cluster from becoming unresponsive.

HDFS
HDFS issues resolved in OneFS 7.2.0.4

ID

Because OneFS treated query strings from WebHDFS clients as case-sensitive,
some valid queries or operations might have failed. For example, OneFS expected
operations such as GETFILESTATUS to be upper case, while Boolean arguments
and strings were expected to be lower case. As a result, queries similar to the
following might have failed because GetFileStatus is entered in mixed case:

http://isilon_ip:8082/webhdfs/v1/?op=GetFileStatus&user.name=root

156921
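
For comparison, a request that uses the expected casing (an upper-case operation name and a lower-case user.name value) could be sent with a generic HTTP client such as curl; the host name and user shown here are placeholders:

curl "http://isilon_ip:8082/webhdfs/v1/?op=GETFILESTATUS&user.name=root"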

If multiple threads attempted to simultaneously update the stored list of blocked IP
addresses, the HDFS service restarted and client sessions were disconnected. The
service was automatically restored after a few seconds.

156306

Because the WebHDFS CREATE operation does not explicitly instruct the system to
create parent directories, if OneFS received a WebHDFS request to create a file or
directory within a parent directory that did not yet exist, the request failed.
Beginning in OneFS 7.2.0.4, OneFS will automatically create parent directories if it
receives a WebHDFS create request that requires them.

154404
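
As an illustration of the changed behavior, a WebHDFS CREATE request that targets a path whose parent directories do not yet exist, such as the hypothetical request below, should now succeed because OneFS creates the missing parent directories (the host, path, and user are placeholders):

curl -i -X PUT "http://isilon_ip:8082/webhdfs/v1/tmp/newparent/file1?op=CREATE&user.name=root"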

Migration
Migration issues resolved in OneFS 7.2.0.4

ID

If you restarted a full or incremental isi_vol_copy migration three or more
times, and if a specific file was in the process of being copied to the target cluster
each time the isi_vol_copy migration was restarted, the file was not
successfully copied to the target cluster.

Note
You might still encounter this issue if you restart an isi_vol_copy migration of a
single, large file three or more times.

154335

Networking
Networking issues resolved in OneFS 7.2.0.4

ID

If an X410, S210, or HD400 node was configured to communicate through a 10
GigE network interface card that was using the Broadcom NetXtreme Ethernet (BXE)
driver, the node could have encountered an issue where the output of the
ifconfig command reported no carrier for the link. Toggling the interface up
and down did not resolve the issue and the node had to be rebooted to reestablish
the link.

154455

In some cases, the Mellanox InfiniBand driver waited for a hardware status register
to be cleared, which caused the driver to enter a read and retry loop. If the retry
loop timed out, the driver attempted to print out a significant amount of system
data three times. Since printing the system data output was enabled by default,
and because there was a significant amount of data to be processed, the driver
eventually triggered several Software Watchdog time outs. After five of these time
outs, the software watchdog rebooted the affected node and the following lines
appeared in the /var/log/messages file:

153425

Consecutive swatchdog state warnings: 5


Opt-in swatchdog state warnings: 5
Memory pressure swatchdog warnings: 0
Majority of swatchdog warnings by opt-in threads!
panic @ time 1394782550.534, thread 0xffffff06ebc8a000: Software
watchdog timed out
cpuid = 3
Panic occurred in module kernel loaded at 0xffffffff80200000:
Stack: -------------------------------------------------kernel:isi_swatchdog_panic+0x15
kernel:isi_swatchdog_hardclock+0x1ea
kernel:hardclock_cpu+0xd9
kernel:lapic_handle_timer+0x15c
kernel:spinlock_exit+0x32kernel:putcons+0x3e
kernel:putchar+0x7akernel:kvprintf+0xa3b
kernel:__vprintf+0x5bkernel:printf+0x70
kernel:_fmt_flush+0x3d
kernel:fmt_append+0x47
kernel:fmt_print_num+0x1f7
kernel:fmt_vprint+0x302
kernel:fmt_print+0x5f
mthca.ko:_mthca_mst_dump+0xc7
mthca.ko:mthca_print_mst_dump+0x56
mthca.ko:check_time+0x1c
mthca.ko:mthca_cmd_poll+0x105
mthca.ko:mthca_cmd_box+0x65
mthca.ko:mthca_MAD_IFC+0x1cd
mthca.ko:mthca_query_port+0x107
kernel:port_active_handler+0x31
kernel:sysctl_root+0xd6
kernel:userland_sysctl+0x15c
kernel:__sysctl+0xa9
kernel:isi_syscall+0x53
kernel:syscall+0x1db
--------------------------------------------------

Note

Beginning in OneFS 7.2.0.4, the system data is not printed by default, allowing the
read and retry loop to complete more quickly and minimizing the chance of
software watchdog timeout events.
If Source Based Routing (SBR) was enabled on the cluster, client connections that
were handled by SBR were disconnected if the MAC address (ARP entry) for the
relevant subnet gateway expired. This issue occurred because nodes in the cluster
did not send an ARP request to refresh the MAC address and, as a result, attempted
to send network traffic to an incorrect destination MAC address for the gateway.

Note
The default expiration time for an ARP entry is 10 minutes.

150647


In rare cases, a race condition between the networking service and the
SmartConnect service caused the SmartConnect service IP to be assigned to a node
before the network addresses were updated in the IP pool. If this issue occurred,
connection requests to the cluster failed until the dynamic IP addresses in all
network pools were manually rebalanced by running the isi networks
command with the --sc-rebalance-all option.

148736

The error messages that are logged if the flx_config.xml file cannot be read or
loaded were updated to facilitate diagnosis of the issue. Beginning in OneFS
7.2.0.4, if the flx_config.xml file cannot be read or loaded (for example, if the
file cannot be read because a node's network interface card is not accessible),
lines similar to the following might appear in the /var/log/messages file and
the /var/log/isi_flexnet_d.log file:

141789

isi_smartconnect[15482]: Error processing subnet in flexnet
config: 7
isi_smartconnect[15482]: parameter member iface-class of member
nonexistant
isi_smartconnect[15482]: /ifs/.ifsvar/modules/flexnet/
flx_config.xml is corrupt
(configuration errno 7: [/ifs/.ifsvar/ modules/flexnet/
flx_config.xml] parameter 'member iface-class' of 'member'
nonexistant)
isi_smartconnect[15482]: Corrupt config found on /ifs
isi_smartconnect[15482]: Unable to load FlexNet configurations.

NFS
NFS issues resolved in OneFS 7.2.0.4

ID

If an NFS operation failed because the NFSv3 client that attempted to perform the
operation did not have adequate access permissions and then the same NFSv3
client sent a request for file system information, the NFS server unexpectedly
restarted and an error message similar to the following was logged in the
/var/log/nfs.log file:

156109

[lwio] ASSERTION FAILED: Expression = (0), Message = 'Got access denied on stat-only open!'

If all of the following conditions were met, users connected to an NFS export
received Permission denied errors when they attempted to access file system
objects to which they should have had access:

The --map-lookup-uid option was enabled (set to yes) for the affected
NFS export.

The group owner of the affected file system object was one of the user's
supplemental groups rather than the user's primary group.

The cluster-side lookup for the user's supplemental groups failed.

This issue occurred because, when the lookup for the user's UID failed, OneFS did
not correctly apply supplemental group permissions to the user. As a result, the
user was denied access to the file system object.

154927

If an NFSv3 or NFSv4 client attempted to move a subdirectory from one directory to
another within a parent directory to which a directory SmartQuota was applied, the
file could not be moved and messages similar to the following appeared on the
console:

154910

cannot move `directory_name1' to a subdirectory of itself, `directory_name2'

OR

cannot move `directory_name1' to `directory_name2': Input/output error

This issue occurred even if the efs.quota.dir_rename_errno sysctl
parameter was set to 18.

Note
For more information about setting the efs.quota.dir_rename_errno sysctl
to a value of 18, see article 90185 on the EMC Online Support site.
For more information about configuring sysctl parameters in OneFS, see article
89232 on the EMC Online Support site.
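
For reference, the following is a sketch of applying that value cluster-wide, assuming the isi_sysctl_cluster utility covered in article 89232 is used; consult articles 90185 and 89232 before changing sysctl parameters:

isi_sysctl_cluster efs.quota.dir_rename_errno=18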
In environments with NFS exports rules that referenced hundreds of unresolvable
hostnames, the isi nfs exports list --verbose command consumed
too many reserved privileged socket connections when it was interacting with the
isi_netgroup_d process. As a result, commands that used isi_rdo for intra-node
communications (for example, isi_gather_info or isi_for_array) failed to
complete for a few seconds. If this occurred, a message similar to the following
appeared on the console:

isi_rdo: [Errno 13] TCPTransport.bind_to_reserveport: Unable to bind to privileged port.

153457

If an NFS client attempted to send an NLM asynchronous request to lock a file and
received an error in response to the request, a socket was opened but was not
closed. Over time, it was possible for the maximum number of open sockets to be
reached. If this occurred, processes could not open new sockets on the affected
node. As a result, affected nodes might have been slow to respond to file lock
requests, or lock requests sent to an affected node might have timed out. If lock
requests timed out, NFS clients could have been prevented from accessing files or
applications on the cluster.

153453

If NFSv4 clients mounted NFS exports on the cluster through NFS aliases, it was
possible to encounter a race condition that caused the NFS service to unexpectedly
restart. This issue was more likely to occur when many NFSv4 clients were
simultaneously mounting exports through NFS aliases. If this race condition was
encountered, the NFS service on the affected node unexpectedly restarted, NFS
clients connected to the node might have been disconnected, some NFS clients
might have been prevented from mounting an export, and the following lines
appeared in the /var/log/messages file:

152337, 151697

/lib/libc.so.7:thr_kill+0xc
/usr/likewise/lib/lwio-driver/nfs.so:NfsAssertionFailed+0xa4
/usr/likewise/lib/lwio-driver/nfs.so:Nfs4OpenOwnerAddOpen+0x112
/usr/likewise/lib/lwio-driver/nfs.so:NfsProtoNfs4ProcOpen+0x2567
/usr/likewise/lib/lwio-driver/nfs.so:NfsProtoNfs4ProcCompound
+0x5fe
/usr/likewise/lib/lwio-driver/nfs.so:NfsProtoNfs4Dispatch+0x43a
/usr/likewise/lib/lwio-driver/nfs.so:NfsProtoNfs4CallDispatch+0x3e
/usr/likewise/lib/liblwbase.so.0:SparkMain+0xb7

If the Deny permission to modify files with DOS read-only
attribute over Windows File Sharing (SMB) ACL policy option was
enabled, files to which the DOS READ-ONLY flag was applied might have
appeared writeable to NFS clients. As a result, a process on an NFS client might
have attempted to write a change to a read-only file. If this occurred, the write to
the file might have been rejected by the NFS server without sending an error to the
client, or a permissions error might have appeared on the client when the file was
closed or when the system attempted to move the file's data to persistent storage.

150347

Although the correct ACLs were assigned to a file (for example, std_delete or
modify), NFSv3 and NFSv4 clients could not delete, edit, or move the file unless the
delete_child permission was set on the parent directory.
For more information, see ETA 204898 on the EMC Online Support site.

149743

OneFS API
OneFS API issues resolved in OneFS 7.2.0.4

ID

In OneFS, a numeric request ID is included in API client requests that are generated
by a script or application that relies on the isi.rest python module to
communicate with the OneFS API. After 1431 request IDs had been generated, the
formula that was used to generate the API request ID generated an ID of zero, which
is an invalid value, and the next API request failed.
The impact of the failed request depended on how the application or script that
sent the request was designed to handle this type of failure. If the request was
retried, a new request ID was generated and the request succeeded.

157487

OneFS web administration interface


OneFS web administration interface issues resolved in OneFS 7.2.0.4

ID

In the OneFS web administration interface, if the path to the shared directory for an
SMB share was long enough to exceed the width of the SMB shares page, the
shared directory Edit link was sometimes not visible.

144423

Note

The Edit link was accessible if you used the Tab key to move to the link.


SmartQuotas
SmartQuotas issues resolved in OneFS 7.2.0.4

ID

If you edited the usage limits of an existing directory quota in the OneFS web
administration interface, the Show Available Space as: Size of hard
threshold and Size of cluster options were missing from the Set a hard
limit section. This issue occurred if you chose the Size of cluster option
when you created the directory quota with a hard limit.

154331

If a SmartQuota threshold was exceeded and then files were moved or deleted to
correct the issue, an alert was sometimes sent after the issue was corrected, even
though the threshold was no longer exceeded. If this occurred, a false alert similar
to the following was generated, where /ifs/<path> was the path of the directory
that temporarily exceeded the configured threshold:

Your root quota under /ifs/<path> has been exceeded.
Your quota is 12 TB, and 6.7 TB is in use. You must delete files
to bring usage below 12 TB before you can create or modify files.
Please clean up and free some disk space.

149570

SMB
SMB issues resolved in OneFS 7.2.0.4

ID

If an SMB share on the cluster was configured with the Impersonate Guest
security setting set to Always, and if a large number of SMB sessions to the share
were being opened and closed, an extra cred file was opened for each SMB
session. However, when the SMB session ended, the extra cred file was not
correctly closed and, over time, it was possible for the number of open cred files to
reach the maximum number of open files allowed. If this occurred, new SMB
sessions to the affected node could not be established, and messages similar to
the following appeared in the /var/log/lwiod.log file:

157030

Failed to accept connection due to too many open files

If you used Microsoft Management Console (MMC) to configure an SMB share on


the cluster from a Windows client and the file path to the share was invalid--for
example, if the file path did not exist on the cluster--the share was not created but
no error was returned to the Windows client.
Beginning in OneFS 7.2.0.4, if you attempt to create an SMB share with an invalid
file path through MMC, the following error appears on the client:

155057

The device or directory does not exist.

Due to a race condition that could occur when multiple SMB 1 sessions were being
opened on the same connection, the lwio process sometimes unexpectedly
restarted. If the process restarted, SMB clients connected to the affected node were
disconnected from the cluster.

154962

If SMB auditing was enabled and you set the --max-cached-messages
parameter to 0 (zero) to disable message caching, the SMB client session and
negotiate requests that were waiting to be audited might have prevented new SMB
session and negotiate requests from being processed. If this occurred, SMB clients
might have been prevented from establishing new connections to the cluster until
the backlog of audit messages was processed.

Note
Beginning in OneFS 7.2.0.4, if you set the --max-cached-messages parameter
to 0 to disable message caching, and the Common Event Enabler (CEE) server
becomes unavailable, some audit messages that have not yet been logged might
be discarded. This behavior prevents a backlog of requests from disrupting SMB
client requests and connections.

154271

If a symbolic link was migrated from a Microsoft Windows client to the cluster, and
the tool that was migrating the data attempted to update the attributes of a
symbolic link that had already been migrated, the attributes could not be updated,
and the migration of the symbolic link failed. For example, if you attempted to
migrate data to an Isilon cluster using the EMCopy tool, and if the data contained a
symbolic link, the symbolic link was initially migrated but EMCopy could not apply
attributes to the symbolic link, and an error similar to the following appeared on
the EMCopy client:

ERROR (5) : \path_to_target\symbolic_link -> Unable to set access time

In addition, if the EMCopy tool attempted to retry the failed operation, the retry
failed and an error similar to the following appeared on the EMCopy client:

ERROR (4392) : \path_to_target\symbolic_link -> Unable to open, Failed after 1 retries.

153972

If you attempted to migrate a directory symbolic link from a Microsoft Windows


client to an Isilon cluster, OneFS returned a response to the Windows client
indicating that the operation was not supported, and the symbolic link was not
migrated. Depending on the application that was being used to migrate the data,
error messages might have appeared on the client. For example, if you attempted
to migrate data to an Isilon cluster using the EMCopy tool, the symbolic links were
not migrated, and an error similar to the following appeared on the EMCopy client:

153366

ERROR (50) : \path_to_target -> symbolic_link : symlink creation


failure

Under some circumstances, after an SMB2 client attempted to access a file on the
cluster through a symbolic link, OneFS returned an ESYMLINKSMB2 error (an
internal error that is not seen on the client). If this error was returned, the symbolic
link was resolved; however, some kernel memory that was allocated in order to
complete the process of resolving the symbolic link was not deallocated after the
link was resolved. As a result, over time a node's kernel processes might have run
out of memory to allocate. If this occurred, the affected node rebooted
unexpectedly, and messages similar to the following appeared in the /var/log/
messages file on the affected node:
/boot/kernel.amd64/kernel: Pageout daemon can't find enough free
pages.
System running low on memory. Check for memory pigs

152404

If you queried the contents of a directory in an SMB share from a Microsoft
Windows command prompt, and if you included the search string *.*
(asterisk-dot-asterisk) immediately after other search characters in the query
(for example, dir do*.*), the search results did not include the expected files or
directories. This issue occurred because OneFS treated the dot as a character
rather than as a wildcard.

149841

Note

Searches with only the *.* string listed the entire contents of the directory, as
expected.

Resolved in OneFS 7.2.0.3 (Target Code)


Antivirus
Antivirus issues resolved in OneFS 7.2.0.3

ID

If you configured an antivirus scan of a directory in the OneFS web administration
interface or from the command-line interface, the forward slash (/) at the end of the
designated path was removed from the search string. As a result, the antivirus
scanner might have scanned more directories than expected. For example, if the
file system included both an /ifs/data directory and an /ifs/data2
directory, and if you configured the antivirus scanner to scan the /ifs/data/
directory, because the forward slash (/) was not included in the path, the antivirus
scanner would have scanned both the /ifs/data directory and the /ifs/data2
directory.

149763

Authentication
Authentication issues resolved in OneFS 7.2.0.3

ID

Due to a file descriptor (FD) leak that occurred when SMB clients listed files and
directories within an SMB share, it was possible for OneFS to eventually run out of
available file descriptors. If this occurred, an ACCESS_DENIED or
STATUS_TOO_MANY_OPENED_FILES response was sent to SMB clients that
attempted to establish a new connection to the cluster or SMB clients that were
connected to the cluster that attempted to view or open files. As a result, new SMB
connections could not be established, and SMB clients that were connected to the
cluster could not view, list, or open files. If this issue occurred, messages similar to
the following appeared on the Dashboard > Event summary page of the OneFS
web administration interface, and in the command-line interface when you ran the
isi events list -w | grep -i descriptor command:

152809

System is running out of file descriptors

In addition, messages similar to the following appeared in the
/var/log/lwiod.log file:

Could not create socket: Too many open files
Failed to accept connection due to too many open files

In environments that relied on Kerberos authentication, if a machine password was


changed while there were many active SMB connections to the cluster, a race
condition could have taken place. If this occurred, the lwio process restarted
unexpectedly, and lines similar to the following appeared in the /var/log/
messages file:

149810

Stack: -------------------------------------------------/usr/lib/libkrb5.so.3:krb5_copy_principal+0x33
/usr/lib/kt_isi_pstore.so:krb5_pktd_get_next+0xe6
/usr/lib/libkrb5.so.3:krb5_dyn_get_next+0x5e
/usr/lib/libkrb5.so.3:krb5_rd_req_decoded_opt+0x4a4
/usr/lib/libkrb5.so.3:krb5_rd_req_decoded+0x1d
/usr/lib/libkrb5.so.3:krb5_rd_req+0xc1
/usr/lib/libgssapi_krb5.so.2:krb5_gss_accept_sec_context+0x8fd
/usr/lib/libgssapi_krb5.so.2:gss_accept_sec_context+0x22c
/usr/lib/libgssapi_krb5.so.2:spnego_g
/boot/kernel.amd64/kernel: ss_accept_sec_context+0x3d6
/usr/lib/libgssapi_krb5.so.2:gss_accept_sec_context+0x22c
/usr/likewise/lib/lwio-driver/srv.so:SrvGssContinueNegotiate+0x2c5
/usr/likewise/lib/lwio-driver/srv.so:SrvGssNegotiate+0xd3
/usr/likewise/lib/lwio-driver/
srv.so:SrvProcessSessionSetup_SMB_V2+0x6c6
/usr/likewise/lib/lwio-driver/
srv.so:SrvProtocolExecute_SMB_V2+0x1324
/usr/likewise/lib/lwio-driver/srv.so:SrvProtocolExecuteInternal
+0x51b
/usr/likewise/lib/lwio-driver/
srv.so:SrvProtocolExecuteWorkItemCallback+0x28
/usr/likewise/lib/liblwbase.so.0:WorkThread+0x1f7
/lib/libthr.so.3:_pthread_getprio+0x15d
--------------------------------------------------

If an LDAP server was configured to handle Virtual List View (VLV) search instead of
paged search, and if LDAP users were listed, a memory leak occurred when
returning more than one page of information. If users were listed a sufficiently large
number of times, the lsass process could run out of memory and restart
unexpectedly. As a result, SMB users could not be authenticated for the several
seconds it took for the lsass process to restart.

149797

Microsoft Active Directory (AD) users in trusted domains were allowed a higher level
of access to EMC Isilon clusters by default if RFC 2307 was enabled on the cluster,
and if Windows Services for UNIX (SFU) was not configured on the trusted domain.

149795

If the lsassd process was not able to resolve user and group IDs, a message was
logged to the /var/log/messages file. In rare and extreme cases, excessive
logging could decrease the wear life of the boot disks on the affected node. If this
occurred, lines similar to the following appeared in the /var/log/messages
file:

149769

Failed to map token token={UID:10116, GID:100, GROUPS={GID:100,


GID:20042}, zone id=-1 }: Failed
to lookup uid 10116: LW_ERROR_NO_SUCH_USER

If you configured public key SSH authentication on a cluster running OneFS 7.1.1.2
through OneFS 7.1.1.5 or OneFS 7.2.0.1 through OneFS 7.2.0.2, and then you
upgraded to OneFS 7.2.0.x, the root user could no longer log in to the cluster
through SSH without entering their password.

138180

Backup, recovery, and snapshots


Backup, recovery, and snapshots issues resolved in OneFS 7.2.0.3

ID

A secondary worker process incorrectly attempted to remove extended user
attributes from a WORM-committed file before updating the file retention date. As a
result, incremental SyncIQ jobs failed and error messages similar to the following
appeared in the /var/log/isi_migrate.log file, where <ATTR> was the name
of the specific attribute:

Error : Failed to delete user attribute <ATTR>: Read-only file system

154102

Reduces lock contention by changing the lock type used by the SyncIQ coordinator
when reading the siq-policies.gc file from an exclusive lock to a shared lock.

149818

During a SyncIQ job, if the rm command that was run during the cleanup process of
the temporary working directory on the target cluster exited with an error, the
SyncIQ policy went into an infinite loop, and data could not be synced to the
cluster. If this occurred, a message similar to the following appeared in
the /var/log/isi_migrate.log file:

Unable to cleanup tmp working directory, error is

149771

If you configured or displayed a SyncIQ performance rule in the OneFS web
administration interface, the bandwidth limit was described as kilobytes per
second (KB/sec). This output did not match the kilobits per second (kbps) value
seen in the command-line interface. The web interface and command-line interface
now show the bandwidth limit value measured in kilobits per second.

149668

SyncIQ consumed excessive amounts of CPU during the phase when SyncIQ was
listing the contents of snapshot directories. This caused SyncIQ policies to take
longer to complete.

148431

If the Deny permission to modify files with a DOS read-only
attribute over both UNIX (NFS) and Windows File Sharing
(SMB) ACL policy option was enabled on the cluster, SyncIQ jobs failed when
SyncIQ attempted to synchronize a file or a folder to which the DOS read-only
attribute was applied. If a SyncIQ job failed for this reason, an Operation not
permitted error message appeared in the /var/log/isi_migrate.log
file.

147200
If there was a group change on the source cluster while a SyncIQ job was in the
process of starting, the SyncIQ scheduler might have stopped unexpectedly and
then automatically restarted. If this issue occurred, lines similar to the following
appeared in the /var/log/messages file:

Stack: --------------------------------------------------
/lib/libc.so.7:__sys_kill+0xc
/usr/lib/libisi_util.so.1:isi_assert_halt+0xa0
/usr/lib/libisi_migrate.so.2:siq_job_summary_save_new+0x200
/usr/bin/isi_migr_sched:sched_main_node_work+0xf3f
/usr/bin/isi_migr_sched:main+0xf13
/usr/bin/isi_migr_sched:_start+0x8c
--------------------------------------------------

146395

When performing a SyncIQ job, in certain cases the target sworker would not
acknowledge completing some tasks. Furthermore, if a SyncIQ job was very large, a
source pworker could have accumulated a large number of un-acknowledged tasks
and then waited for the target worker to acknowledge work that was already
completed. If this occurred, the SyncIQ job would run indefinitely.

142966

If a directory was renamed to a path that had been excluded from a SyncIQ job, the
SyncIQ state information for the directory and its children remained stored.
However, the directory and its children tree were removed from the SyncIQ target.
Any future changes that were made to the directory or its children were treated as
changes to included paths. If this occurred, a SyncIQ target error similar to the
following appeared in the /var/log/isi_migrate.log file:

Error : Unable to open Lin <LIN>: No such file or directory

If all directories that had been excluded from the SyncIQ job were removed in an
incremental SyncIQ job, that incremental SyncIQ job could have failed while trying
to delete an excluded directory. If this occurred, an error similar to the following
appeared in the /var/log/messages or /var/log/isi_migrate.log files:
FAILED ASSERTION found == true

141584

All SyncIQ System B-Trees were protected at 8x mirroring, unnecessarily consuming
disk space.

141176

Note

Beginning in OneFS 7.2.0.3, the protection policy for SyncIQ System B-Trees is set
to the system disk pool default, which enhances SyncIQ performance. If you want
to change the default protection policy for SyncIQ System B-Trees, contact EMC
Isilon Technical Support.
If SyncIQ encountered an issue when processing an alternate data stream for a
directory, an incorrect directory path appeared in the error message that was
logged in the /var/log/isi_migrate.log file.

132233

Cluster configuration

Cluster configuration issues resolved in OneFS 7.2.0.3

ID

When adding preformatted drives to a node, the drive did not get properly
repurposed for the pool that it was being added to. If this issue occurred, data was
not written to the drive, the drive remained unprovisioned until it was reformatted,

and messages similar to the following were logged in the /var/log/messages
file:
isi_drive_repurpose_d[6008]: STORAGE drive (devid:x, lnum:y,
bay:z)is not
part of any DiskPool. Skipping this drive.

150040

If the isi_cpool_rd driver was enabled and the FILE_OPEN_REPARSE_POINT flag was
also enabled, then, if an SMB client attempted to open a symbolic link, the
symbolic link was inaccessible, and the following error appeared on the console:
STATUS_STOPPED_ON_SYMLINK

149010

If a file on the cluster was deleted or modified, and the most recent snapshot of
that file was deleted, any changes to SmartPools policies might have silently failed
to propagate to some snapshot files.

147958

Available space remaining on SSDs that are deployed as L3 cache was incorrectly
reported in the OneFS web administration interface.

141931

Diagnostic tools

Diagnostic tools issues resolved in OneFS 7.2.0.3

ID

When you selected Help > Help on This Page or Help > Online Help from the
General Settings page of the web administration interface, a page appeared with
the following message:

Not Found The requested URL /onefs/help/GUID-E395ABA6B63A-4F40-8281-3574CCF6C8B1.html was not found on this server.

Note

This issue did not affect the SNMP Monitoring and SupportIQ general settings
pages.

146846
If you ran the isi_gather_info command with the --ftp-proxy-port and
--save-only options or with the --ftp-proxy and --save-only options,
the specified FTP proxy port or FTP proxy host values were not saved. As a result,
the desired FTP proxy settings had to be specified each time the
isi_gather_info command was run.

142784
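For reference, a command of the following general form was affected; the proxy host and
port shown here are placeholder values, not recommendations:

isi_gather_info --ftp-proxy proxy.example.com --ftp-proxy-port 8021 --save-only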

If you ran the isi_gather_info command on a node that was encountering


back-end network issues, the operation timed out after 3 minutes, and a message
similar to the following appeared on the console:

isi_rdo: [Errno 60] Operation timed out
isi_gather_info: FAILED to make required directories on 1 nodes.

75677


Events, alerts, and cluster monitoring


Events, alerts, and cluster monitoring issues resolved in OneFS 7.2.0.3

ID

If you ran the isi statistics client or isi statistics heat
command with the --csv option, the following error appeared instead of the
statistics data:
unsupported operand type(s) for %: 'NoneType' and 'tuple'

153565
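For example, requesting CSV output with commands such as the following (shown here only
as an illustration) previously returned the error instead of statistics data:

isi statistics client --csv
isi statistics heat --csv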

The stated storage capacities for /, /var, and /var/crash were reported 8 times
too high in the OneFS statistics system. This sometimes caused incorrect capacity
sizes to appear in the web administration interface, SNMP queries, or in Platform
API-enabled applications.

151651
The following event message did not automatically clear after the boot drive was
replaced:

Drive at Internal <drive_location> wear_life threshold exceeded:
xx (Threshold:
xx). Please schedule drive replacement.

150730

If memory allocated to the clusterwide event log monitoring process


(isi_celog_monitor) became very fragmented, the isi_celog_monitor process
stopped performing any work. As a result, no new events were recorded, alerts
regarding detected events were not sent, and messages similar to the following
were repeatedly logged in the /var/log/isi_celog_monitor.log file:

isi_celog_monitor[5723:MainThread:ceutil:92]ERROR: MemoryError
isi_celog_monitor[5723:MainThread:ceutil:89]ERROR: Exception in
serve_forever()

Note

Allocated memory is considered fragmented when it is not stored in contiguous
blocks. Memory allocated to the CELOG process is more likely to become
fragmented in environments with frequent configuration changes and in which
many CELOG events are being generated.

150625


If the CELOG notification master node went down, delivery of event notifications
stopped until the down node returned to service or until the CELOG notification
subsystem (isi_celog_notification) was restarted, at which point the subsystem
would elect a new notification master with the updated group information.

149682

If phase 2 of an FSAnalyze job took longer than 100 minutes to complete, the job
sometimes stopped progressing, might have progressed very slowly, or might have
failed and then resumed. This issue occurred because, during phase 2, the
FSAnalyze job updated an SQLite index, and while the job was updating this index,
it could not handle other job engine requests, which prevented the job from
progressing. In addition, if, while the SQLite index was being created, the number
of requests waiting to be handled grew to more than 100 (the maximum allowed),
the job was terminated and then resumed from a point before the 100 minutes had
elapsed.

147009

The isi_papi_d process did not properly handle CELOG events that referenced a
path name that contained special characters or multibyte characters. If this issue

occurred, a message similar to the following appeared in the /var/log/


isi_papi_d.log file:
isi_papi_d[37840]: [0x80a403500]: ERROR Event 5.705 specifier
parse error:
"enforcement": "advisory", "domain": "directory /ifs/data/
\xe8\xa9\xa6\xe9\xa8\x93&dios",
"name": "exceeded", "val": 0.0, "devid": 0,"lnn": 0}

If the cluster was being monitored by an InsightIQ server, this issue might also
have resulted in a lost connection between the InsightIQ server and the cluster.

144742
The physIfaces object identifier (OID) was incorrectly named in the ISILON-
TRAP-MIB.txt file, available in the General Settings > SNMP Monitoring tab
of the OneFS web administration interface. As a result, it was not always possible
to monitor the cluster through SNMP.

144382
Protocol event logging in the /var/log/audit_protocol.log file always
showed a value of 0 bytes written for a write event, and close events did not have
bytes written or bytes read fields.

138957


If the snmpd process failed to load the /etc/mcp/sys/lni.xml
file, /etc/ifs/local.xml file, or /etc/ifs/array.xml file, a memory leak
could occur. A memory leak in the snmpd process could have caused SNMP
monitoring to be interrupted until the snmpd process was manually stopped and
then restarted.

138691
Temporary SQLite files were created in the /var/tmp directory more frequently
than was necessary. Because writes to the /var partition can decrease the wear
life of boot disks on an affected node, an index was added to
the /ifs/.ifsvar/db/celog/events.db SQLite database file to reduce the
frequency with which these files are written to the /var/tmp directory.

135108

If you attempted to run InsightIQ 3.1.x to monitor a cluster, disk statistics were not
collected because the Platform API disk statistics query returned an error. As a
result, InsightIQ could not be used to collect drive statistics from the cluster.

129187

File system

File system issues resolved in OneFS 7.2.0.3

ID

If SMB2 symbolic link translation was disabled on the cluster by running the
following command:
isi_gconfig registry.Services.lwio.Parameters.Drivers.onefs.SMB2Symlinks=0

symbolic links to directories might have failed, and an error similar to the following
might have appeared on the client:
The symbolic link cannot be followed because its type is disabled.

150833
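If translation was disabled in this way, setting the same registry key back to 1 should
re-enable it; this is an assumption based on the command shown above, not a documented
procedure:

isi_gconfig registry.Services.lwio.Parameters.Drivers.onefs.SMB2Symlinks=1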

If L3 cache was enabled in a cluster environment using self-encrypting drives (SEDs) that
previously had it disabled, the SSDs were smartfailed but not re-added as L3
devices. As a result, if you ran the isi_devices command, it was possible to see
that the SSDs never automatically transitioned from the [REPLACE] back to the
[PREPARING] state, and false drive replacement alerts were generated.

149778


If you copy configuration files while the isi_mcp process is running, by design, the
MD5 command validates the files in question. If two files with the same file
name were copied almost simultaneously, and processing of the second file started, the
MD5 process on the first file could have been truncated. As a result, an infinite loop
occurred whereby the isi_mcp child process would stop responding. In the example
below, 93.0 was the CPU usage, and the process had been running for more than
6400 minutes (106 hours):
isi_for_array -s 'ps auwxxxHl | grep isi_mcp | grep -vi grep'
4284 93.0 0.0 55744 8176 ?? R
2Mar14 6425:30.28 isi_mcp:
child (isi_mcp)

149759

Isilon A100 nodes might have restarted unexpectedly during a group change,
resulting in data unavailability. If this issue occurred, lines similar to the following
appeared in the /var/log/messages file:

Software Watchdog failed on CPU 1 (82353: kt: rtxn_split [-])
Stack: --------------------------------------------------
kernel:isi_hash_resize+0x31f
efs.ko:lki_handle_async_reacquire+0x262
efs.ko:lki_group_change_commit+0x727
efs.ko:lk_group_change_commit_initiator+0x32
efs.ko:rtxn_sync_locks_done+0x12e
efs.ko:rtxn_split+0x4e9
efs.ko:rtxn_split_courtship_thread+0x388
efs.ko:kt_main+0x83
kernel:fork_exit+0x77
--------------------------------------------------

149687

Due to a race condition that could occur while file metadata was being upgraded
following an upgrade from OneFS 6.5.5.x to OneFS 7.2.0.x, a node might have
unexpectedly restarted. If this issue occurred, the following lines appeared in
the /var/log/messages file on the affected node:
panic @ time 1406566983.500, thread
0xffffff07b80ae560:
Assertion Failure
Stack:
--------------------------------------------------
kernel:isi_assert_halt+0x42
efs.ko:ifm_di_get_current_protection+0x61
efs.ko:ifm_get_parity_flag+0x33
efs.ko:bam_read_block+0x5f
efs.ko:bam_read_range+0xd8
efs.ko:bam_read+0x613
efs.ko:bam_read_uio+0x36
efs.ko:bam_coal_read_wantlock+0x37a
efs.ko:ifs_vnop_wrapunlocked_read+0x2c6
nfsserver.ko:nfsvno_read+0x58b
nfsserver.ko:nfsrvd_read+0x55c
nfsserver.ko:nfsrvd_dorpc+0x4d3
nfsserver.ko:nfs_proc+0x243
nfsserver.ko:nfssvc_program+0x7b1
krpc.ko:svc_run_internal+0x3c6
krpc.ko:svc_thread_start+0xa
kernel:fork_exit+0x7f
--------------------------------------------------
*** FAILED ASSERTION ifm_di_getinodeversion(dip)
== 6 @/build/mnt/src/sys/ifs/ifm/ifm_dinode.c:
397:ifm_di_get_current_protection: wrong
inode

149669

It was possible for a race condition between the group change and the deadlock
probe (a mechanism that attempts to detect and correct deadlock conditions) to
cause a node to restart unexpectedly.

149667

If a cluster had run for more than 248.5 consecutive days, an issue that affected
the OneFS journal buffer could sometimes disrupt the drive sync operation. When
this issue occurred, OneFS reported that the journal was full, and as a result,
resources that were waiting for a response from the journal entered a deadlocked
state. When the journal was in this state, nodes that were affected rebooted to
clear the deadlock. In addition, a message similar to the following appeared in
the /var/log/messages file:

/boot/kernel.amd64/kernel:efs.ko:rbm_buf_timelock_panic_all_cb
+0xd0

148960

Under rare circumstances, the lock subsystem did not drain fast enough, causing
an assertion failure. When this issue occurred, the node restarted, and the
following stack was logged to the /var/log/messages file:

Stack: --------------------------------------------------
kernel:isi_assert_halt+0x2e
kernel:lki_lazy_drain+0xf76
kernel:_lki_split_drain_locks+0xa8
kernel:kt_main+0x15e
kernel:fork_exit+0x75
--------------------------------------------------
<3>*** FAILED ASSERTION must_drain ==> !pool->lazy_queue_size || !
li->mounted @ /b/mnt/src/sys/ifs/lock/lk_initiator.c:13270:
lki_lazy_drain_pool on LK_DOMAIN_DATALOCK took 302454934. lazy
queue 1870 -> 11. li->llw_count = 0, iter_count=11087431
chk_space_time = 0, chk_space_iters = 0 llw_time = 880073
llw_iters = 2503 reject_drain_time = 1550050 reject_drain_iters =
1 yield_time = 282713930 yield_iters = 11084926
shrink_lazy_queue_count = 11087431

148123

If an SMB client changed the letter case of the name of a file or directory stored on
the cluster, the file or directory's ctime (change time) value was not updated. As a
result, the affected file or directory was not backed up during incremental backups.

147606
If SmartCache write caching was enabled and if clients were performing
synchronous writes to the cluster, it was possible to encounter a runtime assert
that caused an affected node to unexpectedly restart. If this issue occurred, lines
similar to the following appeared in the /var/log/messages file:

Stack: --------------------------------------------------
kernel:cregion_issue_write+0xdcb
kernel:_cregion_write+0x1f5
kernel:cregion_write+0x24
kernel:cregion_flush+0xf6
kernel:coalescer_flush_overlapping+0x219
kernel:coalescer_flush_local_overlap+0x275
kernel:bam_coal_flush_local_overlap+0x2d
--------------------------------------------------

146541

While running an initial SyncIQ job, the target root directory and its contents
remained in a read-write state instead of read-only until the SyncIQ job completed.
As a result, files could be deleted or modified in the target cluster.

145714


SNMP monitoring with Nagios failed when using an Isilon-specific Nagios
configuration file. The following error appeared in Nagios when querying the
cluster:
External command error: Timeout: No Response from <IPaddress>:161

144278

In rare cases, an SMB client released its lease on a file before OneFS received a
request to release the lease. If this occurred, the lwio process restarted
unexpectedly, SMB clients connected to the affected node were disconnected, and
lines similar to the following appeared in the /var/log/messages file:

Stack: --------------------------------------------------
/lib/libc.so.7:thr_kill+0xc
/usr/likewise/lib/liblwiocommon.so.0:LwIoAssertionFailed+0x9f
/usr/likewise/lib/lwio-driver/
onefs.so:OnefsOplockBreakFillBuffer_inlock+0xbf
/usr/likewise/lib/lwio-driver/onefs.so:OnefsOplockComplete_inlock
+0x7e
/usr/likewise/lib/lwio-driver/onefs.so:OnefsOplockBreakToRH+0x187
/usr/lib/libisi_ecs.so.1:oplocks_event_dispatcher+0xf3
/usr/likewise/lib/lwio-driver/onefs.so:OnefsOplockChannelRead+0x8c
/usr/likewise/lib/liblwbase.so.0:EventThread+0x333
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xec
/lib/libthr.so.3:_pthread_getprio+0x15d
--------------------------------------------------

139833

File transfer
File transfer issues resolved in OneFS 7.2.0.3

ID

If a client was connected to the cluster through vsftpd and ran the ls or dir
commands for directories that contained more than 100,000 files, the vsftpd
process reached its memory limit, and a memory allocation error occurred. As a
result, the files in the affected directories could not be listed.

149665

Hardware

Hardware issues resolved in OneFS 7.2.0.3

ID

The isi firmware status command did not report the firmware version of the
Mellanox IB/NVRAM card. This issue affected the S200, X200, X400, and NL400
series nodes.

150725


The LED on the chassis turned solid red for a drive prior to completion of the
smartfail process. As a result, the drive might have been replaced prematurely,
possibly causing data loss.

145348

If you installed a new drive support package (DSP) on a node that already had a
DSP installed and you then attempted to update a drive whose update was
included only in the new DSP, the fwupdate command did not update the drive
unless either the isi_drive_d process or the affected node was restarted. If this
issue occurred, and you ran the isi devices -a fwupdate command before
restarting the isi_drive_d process or the node, the following error appeared on the
console:
'fwupdate' action complete, 0 drives updated, 0 updates failed

145268

If you attempted to install a node firmware package that did not have support for
the Chassis Management Controller (CMC) component on a node that contained a
CMC (for example, an S210, X210, X410, NL410, or HD400 node), the installation
failed and an unhandled exception error similar to the following appeared on the
console:
FAILED : Unhandled exception in safe.id.cmc ('empty_fw_object'
object has no attribute 'update')

Note

Beginning in OneFS 7.2.0.3, if the preceding conditions exist, the following
message appears on the console, where <partnumber> is the part number of the
CMC:
FW archive does not have support for PN <partnumber>

144708

The isi_sasphymon process could potentially close a valid 0 file descriptor. If this
issue occurred, any drive associated with the file descriptor would no longer be
monitored by the isi_sasphymon process. This issue would also cause excessive
logging in the /var/log/isi_sasphymon.log file similar to the following:

isi_sasphymon[3979]: Can't get SCSI Log Sense page 0x18 from Bay
2 - scan 6
isi_sasphymon[3979]: cam_get_inquiry: error from cam_send_ccb: 9
isi_sasphymon[3979]: scsi_get_info: error from scsi_get_inquiry

143042

If you ran the isi_reformat_node command on a node containing self-encrypting
drives (SEDs), sometimes the SEDs could not be released from
ownership, and when the node rebooted, the unreleased SEDs came up in a
SED_ERROR state.

141983

Note

Beginning in OneFS 7.2.0.3, if you run the isi_reformat_node command on a


node containing self-encrypting drives (SEDs) that cannot be released from
ownership, the following messages appear on the console where <affected drives> is
a list of the affected drives:
isi_wipe_disk has failed in isi_reformat_node
Failed to wipe the following drives:
<affected drives>
Opening zsh to allow user to revert these drives using the
'/usr/bin/isi_hwtools/isi_sed revert' command. To continue with
the reformat, enter 'exit' in the shell.

Note

If the reformat process continues without reverting the listed drives, it is likely they
will be in a SED_ERROR state on the next node boot.


After replacing boot flash drives in a node and running the gmirror status
command, the correct number of active components was displayed, but a status of
DEGRADED was incorrectly returned for some components in the output. In the
example below, the keystore and mfg mirrors were affected:

Name                     Status
mirror/root0             COMPLETE
mirror/keystore          DEGRADED
mirror/var-crash         COMPLETE
mirror/mfg               DEGRADED
mirror/journal-backup    COMPLETE
mirror/var1              COMPLETE
mirror/var0              COMPLETE
mirror/root1             COMPLETE

Components: ad7p4, ad4p4, ad7p11, ad4p12, ad7p10, ad7p9, ad4p10, ad7p8,
ad4p8, ad7p7, ad4p7, ad7p6, ad4p6, ad7p5, ad4p51

Although the operation of the node was unaffected, the incorrect status sometimes
led to unnecessary service calls for hardware exchanges.

128304

HDFS
HDFS issues resolved in OneFS 7.2.0.3

ID

If the maximum number of HDFS client connections to the cluster was reached, all
worker threads remained busy during processing. As a result, no further cluster
connections could be established, namenode remote procedure calls (RPCs) were
queued for long periods of time, and the HDFS server incorrectly appeared to be
unavailable.

154175

If you tried to change ownership of files or directories through the WebHDFS REST
API by setting only the owning user or the owning group of a file or directory (but
not both), an exception error similar to the following might have appeared in the
command-line interface:

"RemoteException":
{
"exception"
: "SecurityException",
"javaClassName": "java.lang.SecurityException",
"message"
: "Failed to get id rec: 1:"
}
}

Additionally, Ambari 2.1 might have failed to install Hortonworks Data Platform 2.3
through the WebHDFS REST API.

153786
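For reference, setting only the owner (without the group) through the standard WebHDFS
REST API uses a request of the following form; the host name, port, path, and user shown
here are placeholders, and the port on which WebHDFS is exposed depends on your HDFS
configuration:

curl -X PUT "http://mycluster.example.com:8082/webhdfs/v1/tmp/testfile?op=SETOWNER&owner=hdfsuser"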


The datanode port that HDFS listens on was changed from 1021 to 585 to avoid
conflicts with other processes that might have been listening on the same port.

152933

If the maximum number of HDFS client connections to the cluster was reached, all
worker threads remained busy during processing. As a result, no further cluster
connections could be established, namenode remote procedure calls (RPCs) were
queued for long periods of time, and the HDFS server incorrectly appeared to be
unavailable.

147723

The isi_hdfs_d process no longer unnecessarily logs the following message to
the /var/log/isi_hdfs_d.log file:
RPC getDatanodeReport raised exception: Could not parse
'GetDatanodeReport'

146753

When Kerberos authentication was used with HDFS, the isi_hdfs_d process could
eventually run out of memory and unexpectedly stop. If this issue occurred, an
isi_hdfs_d.core file was created in the /var/log/crash/ directory, and the
following lines appeared in the /var/log/messages file:

isi_hdfs_d: isi_hdfs_d: *** FAILED ASSERTION cv->members @ s11n.c:
137: oom
[kern_sig.c:3376](pid 27685=""isi_hdfs_d"")(tid=102752) Stack
trace:
Stack: --------------------------------------------------
/lib/libc.so.7:thr_kill+0xc
/usr/lib/libisi_util.so.1:isi_assert_halt+0xa0
/boot/kernel.amd64/kernel: /usr/bin/
isi_hdfs_d:file_status_array_append+0x9b
/boot/kernel.amd64/kernel: /usr/bin/
isi_hdfs_d:util_make_directory_listing+0x90d
/boot/kernel.amd64/kernel: /usr/bin/
isi_hdfs_d:_rpc2_getListing_ap_2_0_2+0xbf
/boot/kernel.amd64/kernel: /usr/bin/isi_hdfs_d:rpc_ver2_2_execute
+0x21c
/boot/kernel.amd64/kernel: /usr/bin/isi_hdfs_d:_asyncrpctask+0x3a
/boot/kernel.amd64/kernel: /usr/bin/isi_hdfs_d:_workerthr+0x257
/boot/kernel.amd64/kernel: /lib/libthr.so.3:_pthread_getprio+0x15d
/boot/kernel.amd64/kernel:
--------------------------------------------------

146026

Java class names were not included for remote exceptions in WebHDFS. The
exclusion of Java class names might have caused unexpected errors, similar to the
following, when creating and writing a file through WebHDFS:

mkdir: The requested file or directory does not exist in the
filesystem.

142056

If a Hadoop client tried to export data in Hive to a directory that already existed,
and the client did not have permissions on the directory to make the change, the
mkdir command failed. If the mkdir command failed, an error similar to the
following appeared on the client:

FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.CopyTask

In addition, the following line appeared in the /var/log/isi_hdfs_d.log file
on the node:
pfs_mkdir_p failed in mkdirs with unusual errno: Operation not
permitted

142049

The Ambari server sent a check_host command instead of a host_check
command. If this issue occurred, the following message was logged to
the /var/log/isi_hdfs_d.log file:
Ambari: Tried to access an undefined component name, which is
most likely unsupported: check_host

139269


Job engine
Job engine issues resolved in OneFS 7.2.0.3

ID

If you tried to start a PermissionRepair job from the Cluster Management > Job
Operations > Job Types > Start Job dialog, and you set the Repair Type to
Clone: copy permissions from the chosen path to all files and directories
or Inherit: recursively apply an ACL, the Template File or Directory field did
not appear. As a result, you could not configure a PermissionRepair job to
perform a Clone-type or Inherit-type repair.

154094


If a MediaScan job detected an ECC error in a file's data, the job did not properly
restripe the file away from the ECC error. As a result, the file was underprotected
and was at risk for data loss if further damage occurred to the data (for example, if
a device containing a copy of the data failed). If this issue occurred, a message
similar to the following appeared in the /var/log/isi_job_d.log file:
mark_lin_for_repair:1331: Marking for repair: 1:0001:0003::HEAD

148016

In the web administration interface, the Edit Job Type Details page for jobs that
had a schedule set to Every Sunday at 12:00am displayed Close and Edit Job
Type buttons instead of Cancel and Save Changes buttons.

144692

Migration

Migration issues resolved in OneFS 7.2.0.3

ID

During a full or incremental migration, if mid-file checkpoints were enabled or if the
WINDOW_MAX_SIZE > 0 environment variable was set, an error similar to the
following appeared in the /var/log/isi_vol_copy.log file and on the
console, and the migration had to be restarted from the beginning:
createleaves() - ./file19: not found on tape first = 11988,
curfile.ino = 19619

149816

During an incremental migration through the isi_vol_copy utility, if a socket file
needed to be extracted or migrated, the migration failed and an error similar to the
following appeared on the console:
./f2: cannot create file: Operation not supported

149815

If you renamed or deleted a directory on the source cluster prior to performing an
incremental migration, and if you then created a hard link file with the original
name of the deleted or renamed directory, the incremental migration failed. If this
occurred, errors similar to the following appeared in the /var/log/
isi_vol_copy.log file and also on the console:
[INFO] [isi_vol_copy stdout]: Error:
[INFO] [isi_vol_copy stdout]: Failed to create hardlink ./
HL_PREFIX_DIR4->./DIR4: err:Operation not permitted[1]
[INFO] [isi_vol_copy stdout]: *** FAILED
ASSERTION !"fixupentrytype()" @ /b/mnt/src/isilon/lib/isi_emctar/
updated.c:431:

149814


If the isi_vol_copy_vnx tool was used to migrate data from a VNX array to a OneFS
cluster, and if the data contained any NULL SIDs, the migration process stopped,
and a message similar to the following appeared in the /var/log/messages
file:

/boot/kernel.amd64/kernel:
[bam_acl.c:190](pid 83648="isi_vol_copy_vnx")(tid=101308)
ifs_verify_acl:
Failed verifying
security_ace on lin:1:02df:da06. Ace#3. An ACE cannot have a NULL
identity type.

149760

Networking
Networking issues resolved in OneFS 7.2.0.3

ID

S210 and X410 nodes that were configured to communicate through a 10 GigE
network interface card that was using the BXE driver, and that were also configured
to use aggregate interfaces with the link aggregation control protocol (LACP),
experienced connectivity issues with those interfaces if the node was rebooted or if
the MTU on those interfaces was reconfigured.

150883,
152083
If you performed an extended link flapping test on a node containing a Chelsio
network interface card (NIC), the NIC eventually became unresponsive and had to
be manually disabled and then re-enabled before it resumed normal operations.
While the NIC was unresponsive, external clients could not communicate with the
node; however, because the node's back-end communication was unaffected, data
on the node was still available to clients connected to the cluster through other
nodes.

149767
If the cluster contained X410, S210, or HD400 nodes that had BXE 10 GigE NIC
cards and any external network subnets connected to the cluster were set to 9000
MTU, an error similar to the following appeared in the /var/log/messages file,
and the affected nodes rebooted:
ERROR: mbuf alloc fail for fp[01] rx chain (55)

For more information, see ETA 200096 on the EMC Online Support site.

148695,
152083
A memory leak in the networking process, isi_flexnet_d, might have caused the
process to stop running, and could have damaged the /etc/ifs/
flx_config.xml file. If the file was damaged, all clients could have lost their
connections to the cluster.

141822

NFS

NFS issues resolved in OneFS 7.2.0.3

ID

Because OneFS 7.2.0 and later returned 64-bit NFS cookies, some older, 32-bit NFS
clients were unable to correctly handle read directory (readdir) and extended read
directory (readdirplus) responses from OneFS. In some cases, the affected 32-bit
clients became unresponsive, and in other cases, the clients could not view all of
the directories in an NFS export. In the latter cases, the client could typically view
the current directory (".") and its parent directory ("..").
For more information, see ETA 205085 on the EMC Online Support site.

153737
Because NFSv3 Kerberos authentication requires all NFS procedure calls to use
RPCSEC_GSS authentication, some older Linux clients (for example, RHEL 5 clients)
that started the FSINFO procedure call with AUTH_NULL authentication before
attempting the FSINFO procedure call with RPCSEC_GSS authentication were
prevented from mounting an NFS export if the export was configured with the
Kerberos V5 (krb5) security type. Newer clients that started the FSINFO procedure
call with RPCSEC_GSS were not affected.

151582

If the lsass process was not running when NFS configuration information was
refreshed on the cluster, it was possible for empty netgroups to be propagated to
some or all of the cluster nodes. If this issue occurred, NFS clients were unable to
mount NFS exports.

149781

If you created a hard link that contained a colon (:) from an NFSv3 client, the colon
and any characters that followed it were removed from the hard link name. As a
result, the hard link on the cluster did not have the correct name.
If removing the colon and the following characters resulted in changing the hard link
name to a file name that was already in use in the destination directory on the
cluster, a file name conflict resulted, and a "File exists" error appeared on the NFS
client.

148001

If a client held a read lock on a file and an NFS4 client checked the lock status of
the file, the response from the cluster incorrectly reported that the original client
was holding a write lock on the file.
This issue might have caused the program that the NFS client was using to work
improperly.

147638

If an NFS client attempted to list a file or directory at the root of an NFS export
mount point directory that began with two dots (for example, /mnt/
nfs_export/..my_folder), and the requested file or directory did not exist,
OneFS returned the contents of the NFS export instead of a file not found error
message.

147404

A memory leak in the isi_papi_d process might have caused an out-of-memory error
when running isi nfs exports commands.

145209
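The affected commands are those in the isi nfs exports namespace; for example, a
listing command such as the following (shown here only as an illustration; available
subcommands depend on the release):

isi nfs exports list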
Because the nfs and onefs_nfs drivers (and the flt_audit_nfs driver, if you enabled
protocol auditing) share the same process ID, if one of these drivers failed to start,
the MCP process did not always detect the failure and did not always restart the
stopped drivers.

144485

On the NFS Export Details page, if you added a secondary group for either the
Map Root User or the Map Non Root User, the value field did not display until you
refreshed the web administration interface page.

142343

If the NFS server shut down in the middle of an NFS export refresh, it was possible for
an NFS resolver thread to be in use when the NFS server was attempting to shut
down. If this issue occurred, a core file might have been created, and lines similar
to the following appeared in the /var/log/messages file:
Stack: --------------------------------------------------
/lib/libthr.so.3:_umtx_op_err+0xa
/usr/likewise/lib/liblwbase.so.0:WaiterSleep+0xe0
/usr/likewise/lib/liblwbase.so.0:LwRtlMvarTake+0x69
/usr/likewise/lib/lwio-driver/nfs.so:NfsLockMvar+0x19
/usr/likewise/lib/lwio-driver/
nfs.so:NfsExportManagerResolveCallback+0x5f8
/usr/likewise/lib/liblwbase.so.0:SparkWorkItem+0x56
/usr/likewise/lib/liblwbase.so.0:WorkThread+0x256
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xee
/lib/libthr.so.3:_pthread_getprio+0x15d
--------------------------------------------------

142296

It was possible for two NFS threads to create a race condition when the threads
were inserting NFS export information into the hash table. This race condition could
damage the hash table, causing the NFS process to restart. When this race
condition occurred, lines similar to the following appeared in the /var/log/
messages file:
/boot/kernel.amd64/kernel: [kern_sig.c:3376](pid 7997="nfs")
(tid=100859) Stack trace:
/boot/kernel.amd64/kernel: Stack:
--------------------------------------------------
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase_nothr.so.
0:HashLookup+0x31
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase_nothr.so.
0:LwRtlHashTableInsert+0x5a
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase_nothr.so.
0:LwRtlHashTableResize+0xaf
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase_nothr.so.
0:LwRtlHashTableResizeAndInsert+0x2e
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase_nothr.so.
0:LwRtlHashMapInsert+0x6f
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
nfs.so:NfsExportManagerResolveCallback+0x66
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase.so.
0:SparkWorkItem+0x563
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase.so.
0:WorkThread+0x256
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase.so.
0:LwRtlThreadRoutine+0xee
/boot/kernel.amd64/kernel: /lib/libthr.so.3:_pthread_getprio+0x15d
/boot/kernel.amd64/kernel:
--------------------------------------------------
/boot/kernel.amd64/kernel: pid 7997 (nfs), uid 0: exited on
signal 11 (core dumped)

139673

If there was a group change in the cluster, it was possible that the NFS server would
not shut down after a set period of time. After the set period of time elapsed, the
NFS server was forcefully signaled to stop. When the NFS server was forcefully
stopped, a core file was created and lines similar to the following appeared in
the /var/log/messages file:
Stack: --------------------------------------------------
/lib/libc.so.7:_kevent+0xc
/usr/likewise/lib/liblwbase.so.0:EventThread+0x964
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xee
/lib/libthr.so.3:_pthread_getprio+0x15d
--------------------------------------------------

131197


SmartLock
SmartLock issues resolved in OneFS 7.2.0.3

ID

If the compadmin user on a compliance mode cluster ran the sudo
isi_gather_info command, the command successfully
gathered all of the expected files on the local node, but was unable to gather all of
the expected files on remote nodes. This issue occurred because some files on the
cluster can be read only by the root user, and the sudo command did not enable
the compadmin user to run commands as root on remote nodes.

139167

SmartQuotas issues resolved in OneFS 7.2.0.3

ID

If you configured a storage quota on a directory with a pathname that contained a
single, multibyte character, and if a quota notification email was sent for that
directory, the multibyte character in the pathname that appeared in the quota
notification email was replaced with an incorrect character, such as a question
mark.

149758

If you changed a quota's soft or hard limit through the web administration
interface, the Enforced parameter changed from Yes to No, making the quota
accounting-only. Any usage limit that was set was not enforced.

148807

If a quota was created with a hard, soft, or advisory threshold that included a
decimal point (for example, isi quota quotas create --hard-threshold=4.5T),
the operation failed, and a message similar to the following
appeared on the console:
Unknown suffix '.5T'; expected one of ['b', 'K', 'M', 'G', 'T',
'P', 'B', 'KB', 'MB','GB', 'TB', 'PB']

145943
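For example, a command of the following form, which previously failed with the error
above, should now be accepted; the path and quota type shown are placeholders, and the
required arguments depend on the kind of quota being created:

isi quota quotas create /ifs/data/projects directory --hard-threshold=4.5T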

In the web administration interface, after clicking View details for a quota on the
Quotas & Usage page, the %Used value under Usage Limits did not always
correctly match the percentage value displayed under %Used in the top summary
row for the quota.

123355

SMB issues resolved in OneFS 7.2.0.3

ID

If you created an SMB share and then created a single user or group with run-as-
root permissions to the share, the user or group could not be deleted, and the user
or group's run-as-root permission could not be modified. If you attempted to delete
the user or group, the command appeared to complete successfully; however, the
user or group was not deleted. If you attempted to modify the user or group's
permissions, the command appeared to complete successfully; however, the
original permissions entry was not removed, and an additional entry, with the
modified permissions, was added to the share. In the example below, the domain
admins group displays the duplicate entries created when the group's run-as-root
permission was modified:

Account                  Account Type  Run as Root  Permission Type  Permission
--------------------------------------------------------------------------------
EXAMPLE\domain admins    group         True         allow            full
EXAMPLE\domain users     group         False        allow            change
EXAMPLE\domain admins    group         False        allow            full

146616

SMB clients were unable to display alternate data stream information for files on
the cluster that contained alternate data streams.

153666

During an upgrade to OneFS 7.2.0.x, an upgrade script did not properly interpret an
empty string value for the HostAcl parameter in the /ifs/.ifsvar/
main_config.gc file. This caused SMB shares to be inaccessible after the
upgrade was complete, and as a result, the SMB shares had to be re-created. If this
occurred, output similar to the following appeared after running the isi_gconfig
registry.Services.lwio.Parameters.Drivers.srv.HostAcl
command:
registry.Services.lwio.Parameters.Drivers.srv.HostAcl (char**) =
[ "" ]

150658

If the OneFS file system quota was exceeded, an incorrect
STATUS_QUOTA_EXCEEDED error was returned during SMB1 and SMB2 write
operations instead of STATUS_DISK_FULL. As a result, the client ignored the
error and write requests continued, but were not applied, because they were over
quota. Any binary files, such as PST files, would become unusable.

149811

In OneFS 7.2.0.x clusters, the SMB2 connection was sending invalid share flags. As
a result, if the inheritable-path ACL was set while creating a share, files on the
cluster that were accessed through UNC path hyperlinks in Microsoft Outlook emails
failed to open.

149796
If you ran the isi statistics client command to view information about
some SMB1 and SMB2 read and write operations (for example, the
namespace_write operation), the word UNKNOWN appeared in the UserName
column instead of a valid user name. As a result, if you ran scripts to filter read/
write operations per user, the scripts did not work correctly.

149683
If you attempted to override the default Windows ACL settings that were applied to
an SMB share by adding custom ACLs to the /ifs/.ifsvar/smb/isi-share-
default-acl/ template directory, the overrides were not implemented. As a
result, actual access permissions on the SMB share did not match expected
results.

149664
If the FILE_OPEN_REPARSE_POINT flag was enabled, and an SMB client opened
an alternate data stream (ADS) through a symbolic link, the ADS was inaccessible,
and the following error appeared on the console:
STATUS_STOPPED_ON_SYMLINK

148734
If you ran the EMCopy application to migrate data containing symbolic links to the
cluster, the SMB process unexpectedly restarted because of an lwio process
assertion failure. When the SMB process restarted, clients were disconnected from
the cluster and the following error message appeared in the /var/log/
lwiod.log file:
ASSERTION FAILED: Expression = (pFcb->bIsDirectory ==
bIsDirectory)

In addition, the following lines appeared in the /var/log/messages file:


/lib/libc.so.7:thr_kill+0xc
/usr/likewise/lib/liblwiocommon.so.0:LwIoAssertionFailed+0xa4
/usr/likewise/lib/lwio-driver/onefs.so:OnefsCreateFCB+0x896
/usr/likewise/lib/lwio-driver/onefs.so:OnefsCreateFileCcb+0x3b0
/usr/likewise/lib/lwio-driver/onefs.so:OnefsCreateInternal+0x90e
/usr/likewise/lib/lwio-driver/onefs.so:OnefsCreate+0x28d
/usr/likewise/lib/lwio-driver/onefs.so:OnefsProcessIrpContext
+0x12b
/usr/likewise/lib/liblwbase.so.0:CompatWorkItem+0x16
/usr/likewise/lib/liblwbase.so.0:WorkThread+0x256
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xee
/lib/libthr.so.3:_pthread_getprio+0x15d

On the client, EMCopy might have displayed the following error message:
ERROR (50) : \\TARGET\symlink ->

folder:symlink creation failure

145612

Upgrade and installation


Upgrade and installation issues resolved in OneFS 7.2.0.3

ID

If the Disable access logging option was set in the OneFS web administration
interface, and then you upgraded your cluster from OneFS 6.5.x to OneFS 7.x, the
apache2 service failed to start, and an error similar to the following appeared
repeatedly in the /var/log/isi_mcp file:
FAILED on action list 'start': action 1/1
SERVICE apache2 (pid=3840) returned exit status 1

As a result, client access to HTTP was denied.

149812


If you attempted to upgrade a SmartPools database that was not successfully
upgraded due to empty node pools, an error similar to the following appeared in
the OneFS web administration interface and on the console.

Storage Pool Settings Changes Failed
The edit to the existing storage pool settings did not save due
to the following error:
Changing settings disallowed until SmartPools DB is fully upgraded

As a result, the upgrade did not complete.

149695


Because OneFS 7.2.0.x does not support file pool policy names that begin with a
number, if you upgraded from OneFS 6.5.5.x (a version that supported file pool
policies with names that began with a number), and if any of your preexisting file
pool policies began with a number, following the upgrade, SmartPools jobs failed,
and file pool policies could not be created or modified.
Beginning in OneFS 7.2.0.3, a pre-upgrade check will halt an upgrade if the cluster
configuration being upgraded contains file pool policies that begin with a number.

149684


Resolved in OneFS 7.2.0.2


Antivirus
Antivirus issues resolved in OneFS 7.2.0.2

ID

If you attempted to scan an infected file from the OneFS web administration
interface, and if the file name or the path name where the file was located
contained the apostrophe (') character, the web interface displayed an HTTP 500
Internal Server Error page, and an error similar to the following
appeared in the /var/log/webware-errors/ file:
File "/usr/local/share/webware/WebKit/HTTPContent.py", line
105,in _respond self.handleAction(action)
File "webui/Is2CorePage.py", line 80, in handleAction
Page.handleAction(self, action)
File "/usr/local/share/webware/WebKit/HTTPContent.py", line 213,
in handleAction getattr(self, action)()
File "webui/AVScanDetectedThreats.py", line 138, in rescan
self.jsonRet['error'] = '%s %s' % (str(e), ACTION_STATE_ERROR)
SystemError: 'finally' pops bad exception

141960

If the job that was running an antivirus scan policy was terminated, either by
another process or due to a software failure, the antivirus scan policy continued to
be listed as running in the OneFS web administration interface, and the job could
not be manually cancelled or cleared from the list of running jobs. The correct
status of the policy was displayed when viewed from the command-line interface.

141954

Because some antivirus scan reporting fields accepted invalid characters from
SQLite queries, running or completed antivirus scan policies were not listed in the
OneFS web administration interface, and messages similar to the following
appeared in the webware_webui.log file where <policy_ID> was the ID of the
affected policy:

OperationalError: unrecognized token: "<policy_ID>"

138754

Under some circumstances (for example, if antivirus scanning was not correctly
configured), messages regarding the isi_avscan_d process were repeatedly logged
in the /var/log/isi_avscan_d.log file.

135097

Note

Because repeated logging to the /var partition can adversely affect the wear life of
a node's boot flash drives, to reduce logging under the previously described
circumstances, if a large number of duplicate messages are logged within a short
period of time, some of the messages are suppressed and a message similar to the
following appears in the /var/log/isi_avscan_d.log file:
isi_avscan_d[1764]: Suppressed 152 similar messages!


Authentication
Authentication issues resolved in OneFS 7.2.0.2

ID

If Microsoft Security Bulletin MS15-027 was installed on a Microsoft Active
Directory server that authenticated SMB clients that were accessing an Isilon
cluster, and if the server used the NTLMSSP challenge-response protocol, the SMB
clients could not be authenticated. As a result, SMB clients could not access data
on the cluster.
For more information, see article 199379 on the EMC Online Support site.

147221

If you configured HDFS with Kerberos authentication, WebHDFS requests sent to
access zones other than the System Zone were not correctly authenticated and the
client that sent the request received the following message:
503 Service Temporarily Unavailable

145590

If an LDAP provider returned a UID or a GID that was greater than 4294967295 (the
maximum value that can be assigned to an unsigned 32-bit integer), an incorrect
UID or GID was assigned to the associated user or group. This issue could have
affected a user's ability to access data on the cluster.

144002

Note

Beginning in OneFS 7.2.0.2, if an LDAP provider returns a UID or a GID that is
greater than 4294967295, affected users will not be authenticated, and a No
such user error will be returned. Additional logging was also added to
the /var/log/lsassd.log file to help identify these issues.
If the selective authentication setting was enabled for a Windows trusted domain,
and if a user who was a member of the domain was assigned to a group to which
the ISI_PRIV_LOGIN_SSH or ISI_PRIV_LOGIN_PAPI role-based access privilege was
assigned, the user was denied access to the cluster when attempting to log in
through an SSH connection or through the OneFS web administration interface.
This issue occurred because the selective authentication setting prevented OneFS
from resolving the user's group membership.

142088

If a DNS server became unavailable while the lsass process was sending RPC
requests to a domain controller, the lsass process might have restarted
unexpectedly. If this issue occurred, authentication services were temporarily
unavailable, and a message similar to the following appeared in the /var/log/
messages file:
Stack: --------------------------------------------------
/usr/likewise/lib/liblsaonefs.stat.so:LsaOnefsGetIpv4Address+0x9
/usr/likewise/lib/liblsaonefs.stat.so+0xee4:0x807315ee4
/usr/likewise/lib/liblsaserverstats.so.0:LsaSrvStatisticsRelease
+0x82
/usr/likewise/lib/lsa-provider/
ad_open.so:AD_NetLookupObjectSidsByNames+0x3bc
/usr/likewise/lib/lsa-provider/
ad_open.so:AD_NetLookupObjectSidByName+0x1b1
/usr/likewise/lib/lsa-provider/ad_open.so:LsaDmConnectDomain+0x205
/usr/likewise/lib/lsa-provider/
ad_open.so:LsaDmWrapNetLookupObjectSidByName+0x76
/usr/likewise/lib/lsa-provider/
ad_open.so:LsaDmEngineGetDomainNameWithDiscovery+0x6a5
/usr/likewise/lib/lsa-provider/
ad_open.so:AD_ServicesDomainWithDiscovery+0x79


/usr/likewise/lib/lsa-provider/ad_open.so:AD_AuthenticateUserEx
+0x418
/usr/likewise/lib/liblsaserverapi.so.
0:LsaSrvAuthenticateUserExInternal+0x436
/usr/likewise/lib/liblsaserverapi.so.0:LsaSrvAuthenticateUserEx
+0x4be
/usr/likewise/lib/libntlmserver.so.0:NtlmValidateResponse+0xeb1
/usr/likewise/lib/libntlmserver.so.
0:NtlmServerAcceptSecurityContext+0x10a
/usr/likewise/lib/libntlmserver.so.
0:NtlmSrvIpcAcceptSecurityContext+0x325
/usr/likewise/lib/liblwmsg.so.0:lwmsg_peer_assoc_call_worker+0x20
/usr/likewise/lib/liblwbase.so.0:CompatWorkItem+0x16
/usr/likewise/lib/liblwbase.so.0:WorkThread+0x256
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xec
/lib/libthr.so.3:_pthread_getprio+0x15d
--------------------------------------------------

142073

If an LDAP or NIS provider attempted to authenticate a user with a user ID (UID) of


4294967295, the isi_papi_d process unexpectedly restarted, and lines similar to
the following appeared in the /var/log/messages file:

/usr/lib/libisi_persona.so.1:persona_get_type+0x1
/usr/lib/libisi_auth_cpp.so.
1:_ZN4auth15json_to_personaERKN4Json5ValueERKNS_14lsa_connectionER
KSs+0xc08
/usr/lib/libisi_auth_cpp.so.
1:_ZN4auth15persona_to_jsonERKNS_7personaERKNS_14lsa_connectionEb
+0x62
/usr/lib/libisi_platform_api.so.
1:_ZN4auth15sec_obj_to_jsonERKNS_7sec_objERKNS_14lsa_connectionEbb
+0x178
/usr/lib/libisi_platform_api.so.
1:_ZN18auth_users_handler8http_getERK7requestR8response+0x4c4
/usr/lib/libisi_rest_server.so.
1:_ZN11uri_handler19execute_http_methodERK7requestR8response+0x56e
/usr/lib/libisi_rest_server.so.
1:_ZN11uri_manager15execute_requestER7requestR8response+0x100
/usr/lib/libisi_rest_server.so.
1:_ZN14request_thread7processEP12fcgi_request+0x112
/usr/lib/libisi_rest_server.so.1:_ZN14request_thread6on_runEv+0x1b
/lib/libthr.so.3:_pthread_getprio+0x15d

141947

If a machine password was changed by a node while the lwreg process on another
node was refreshing that node's lsass configuration, the lsass process on the
second node could have cached both the old and new machine passwords. If this
occurred, the lsass process unexpectedly restarted, and clients connected to the
affected node could not be authenticated. In addition, lines similar to the following
appeared in the /var/log/messages file:

/lib/libc.so.7:thr_kill+0xc
/usr/likewise/lib/lsa-provider/
ad_open.so:LsaPcachepEnsurePasswordInfoAndLock+0x9b6
/usr/likewise/lib/lsa-provider/
ad_open.so:LsaPcacheGetMachineAccountInfoA+0x28
/usr/likewise/lib/lsa-provider/
ad_open.so:AD_MachineCredentialsCacheInitialize+0x38
/usr/likewise/lib/lsa-provider/ad_open.so:AD_Activate+0x9d5
/usr/likewise/lib/lsa-provider/ad_open.so:LsaAdProviderStateCreate
+0xb22
/usr/likewise/lib/lsa-provider/
ad_open.so:AD_RefreshConfigurationCallback+0x792
/usr/likewise/lib/liblsaserverapi.so.0:LsaSrvRefreshConfiguration
+0x432
/usr/likewise/lib/lw-svcm/lsass.so:LsaSvcmRefresh+0x209
/usr/likewise/lib/liblwbase.so.0:RefreshWorkItem+0x24
/usr/likewise/lib/liblwbase.so.0:WorkThread+0x256


/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xec
/lib/libthr.so.3:_pthread_getprio+0x15d

141940

If a cluster that was joined to a Microsoft Active Directory (AD) domain was also
configured with an IPv6 subnet, and if the AD domain controller was configured to
use an IPv6 address, the netlogon process on the cluster repeatedly restarted and
members of the Windows AD domain could not be authenticated to the cluster. If
the netlogon process restarted as a result of this issue, Windows clients might
have received an Access Denied error when attempting to access SMB shares
on the cluster, or they might have received a Logon failure: unknown
user name or bad password message when attempting to log on to the
cluster. In addition, the following lines appeared in the /var/log/messages
file:
Stack: --------------------------------------------------
/lib/libc.so.7:thr_kill+0xc
/lib/libc.so.7:__assert+0x35
/usr/likewise/lib/libnetlogon_isidcchooser.so:IsiDCChooseDc
+0xbb3
/usr/likewise/lib/lw-svcm/netlogon.so:LWNetChooseDc+0x27
/usr/likewise/lib/lw-svcm/netlogon.so:LWNetSrvPingCLdapArray
+0x1187
/usr/likewise/lib/lw-svcm/
netlogon.so:LWNetSrvGetDCNameDiscoverInternal+0x72a
/usr/likewise/lib/lw-svcm/netlogon.so:LWNetSrvGetDCNameDiscover
+0x111
/usr/likewise/lib/lw-svcm/netlogon.so:LWNetSrvGetDCName+0xb20
/usr/likewise/lib/lw-svcm/netlogon.so:LWNetSrvIpcGetDCName+0x4f
/usr/likewise/lib/liblwmsg.so.0:lwmsg_peer_assoc_call_worker
+0x20
/usr/likewise/lib/liblwbase.so.0:CompatWorkItem+0x16
/usr/likewise/lib/liblwbase.so.0:WorkThread+0x256
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xec
/lib/libthr.so.3:_pthread_getprio+0x15d
--------------------------------------------------

140851

An issue sometimes occurred that prevented OneFS from retrieving Service
Principal Name (SPN) keys from the cluster's machine password configuration file,
pstore.gc. If this issue occurred, authentication requests failed with an
Access Denied error, and continued to fail until the lwio process restarted.

139654

If the isi_vol_copy_vnx utility, the PermissionsRepair job, SyncIQ, or the
isi_restill utility attempted to replicate an access control entry (ACE) that
contained a Security identifier (SID) with a subauthority of 4294967295, the utility
or job failed. If this occurred, lines similar to the following appeared in
the /var/log/messages file:

138738

Stack:
-------------------------------------------------/boot/kernel.amd64/kernel: /usr/lib/libisi_persona.so.
1:persona_len+0x1
/boot/kernel.amd64/kernel: /usr/lib/libisi_acl.so.1:cleanup_sd
+0x506
/boot/kernel.amd64/kernel: /usr/lib/libisi_acl.so.1:sd_from_text
+0x1f1

Although an LDAP or NIS file provider was configured with a list of unfindable users
through the --unfindable-users option of the isi auth create or isi
auth modify command, a user's groups were still queried for through the LDAP
or NIS provider.
137897

137743

If an update to Microsoft Active Directory (AD) succeeded, but the subsequent LDAP
query for the new password failed, OneFS did not update the cluster's machine
password configuration file, pstore.gc. As a result, there was a mismatch
between the machine password registered with Active Directory and the machine
password being used by the cluster, and clients attempting to connect to the
cluster could not be authenticated.

Backup, recovery, and snapshots


Backup, recovery, and snapshots issues resolved in OneFS 7.2.0.2

ID

During a parallel restore operation, if only a portion of the restore operation's file
data write was written to disk, the remaining file data from that write could have
been discarded. Because a restore operation writes a maximum of 1 MB of data at
a time, it was extremely unlikely that only a portion of the data would be written to
disk.

142339

Under some circumstances, the NDMP process might have failed to correctly
142075
account for the number of isi_ndmp_d instances running on a node, and the
number of running instances might have exceeded the maximum number allowed.
In some cases, the running instances might have consumed all available resources,
causing a node to unexpectedly reboot, and the running NDMP job to fail. If this
issue occurred, clients connected to the node were disconnected, and lines similar
to the following appeared in the /var/log/messages file:
/boot/kernel.amd64/kernel: pid 56071(isi_ndmp_d), uid 0 inumber
2111 on /tmp/ufp: out ofinodes
isi_ndmp_d[56071]: ufp copy error: failed to open destination
for /tmp/ufp/isi_ndmp_d/4675/gc ==>/tmp/ufp/isi_ndmp_d/.56071.tmp/
gc: No space left on device
isi_ndmp_d[56071]: ufp error: Failed to initialise failpoints for
isi_ndmp_d/56071

If a snapshot's expiration time was extended or changed to zero (indicating that the
snapshot never expires) while the snapshot was being deleted, the isi_snapshot_d
process could have missed the expiration change, and, as a result, the snapshot
might have been deleted.

142072
If the --skip_bb_hash option of a SyncIQ policy was set to no (the default
setting) and if a SyncIQ file split work item was split between pworkers, it was
possible for the pworker that was handling the file split work item to attempt to
transfer data that had already been transferred to the target cluster. If this
occurred, the isi_migr_pworker process repeatedly restarted and the SyncIQ policy
failed. In addition, the following lines appeared in the /var/log/messages file:

142058

isi_migrate[45328]: isi_migr_pworker: *** FAILED ASSERTION


cur_len != 0 @ /usr/src/isilon/bin/isi_migrate/pworker/
handle_dir.c:463:
/boot/kernel.amd64/kernel: [kern_sig.c:3376](pid
45328="isi_migr_pworker")(tid=100957) Stack trace:
/boot/kernel.amd64/kernel: Stack:
-------------------------------------------------/boot/kernel.amd64/kernel:
/lib/libc.so.7:__sys_kill+0xc
/boot/kernel.amd64/kernel:
/usr/lib/libisi_util.so.1:isi_assert_halt+0xa0
/boot/kernel.amd64/kernel:

/usr/bin/isi_migr_pworker:migr_continue_file+0x1507
/boot/kernel.amd64/kernel:
/usr/bin/isi_migr_pworker:migr_continue_generic_file+0x9a
/boot/kernel.amd64/kernel:
/usr/bin/isi_migr_pworker:migr_continue_work+0x70
/boot/kernel.amd64/kernel:
/usr/lib/libisi_migrate_private.so.2:migr_process+0xf1
/boot/kernel.amd64/kernel:
/usr/bin/isi_migr_pworker:main+0x606
/boot/kernel.amd64/kernel:
/usr/bin/isi_migr_pworker:_start+0x8c
/boot/kernel.amd64/kernel:
-------------------------------------------------/boot/kernel.amd64/kernel: pid 45328 (isi_migr_pworker), uid 0:
exited on signal 6 (core dumped)

If a Collect job had not been run for a long time, snapshots were not processed,
and, over time, they accumulated. As a result, it took longer than expected to
delete files associated with a large number of accumulated snapshots.

141968

It was possible for a successful DomainMark job to leave a SyncIQ domain or a
SnapRevert domain incomplete. If this occurred, the SnapRevert job (which might
run during the SyncIQ Prepare Resync job phase) failed, and the following status
message appeared in the SyncIQ job report:

141935

Snapshot restore domain is not ready (unrunnable)

In the OneFS web administration interface, the View Details hyperlink on the Data
Protection > SnapshotIQ > Snapshot Schedules page displayed only one line of
the snapshot schedule settings. As a result, the full details of the schedule were
not available unless the user's mouse hovered outside of the browser window.

141933

Although configuring an NDMP backup job with both the BACKUP_FILE_LIST
environment variable and the BACKUP_MODE=SNAPSHOT environment variable
negated the effect of setting the BACKUP_MODE=SNAPSHOT environment variable
(faster incremental backups), it was possible to configure a job with both
environment variables. Beginning in OneFS 7.2.0.1, if you configure both
environment variables, the job does not run, and the following message appears
on the Data Management Application (DMA), on the console, and in
the /var/log/ndmp_debug.log file:

141928

File list and backup_mode(snapshot)is not supported
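For illustration, a job configured on the DMA with both of the following environment
variables set does not run and produces the preceding message (the file list path
shown here is hypothetical):
BACKUP_FILE_LIST=/ifs/data/filelist.txt
BACKUP_MODE=SNAPSHOT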

Under normal circumstances, the retention period applied to WORM-committed
files might differ between SyncIQ source and target clusters. However, if the
retention period applied to a file on a SyncIQ source cluster ended on an earlier
date than the retention period applied to the related file on the target cluster,
incremental SyncIQ jobs failed, and messages similar to the following were logged
in the /var/log/messages file, where <path> is the path to the file on the target
cluster:
Local error : syncattr error for <path>: Read-only file system

This issue occurred because the SyncIQ process attempted to decrease the
retention period of a WORM-committed file, which is not permitted.

Beginning in OneFS 7.2.0.2, if the retention date applied to a file on the source
cluster predates the retention date on the target cluster, no attempt is made to
update the retention date on the target cluster during synchronization.

138935

If a SnapRevert job was run on a directory to which both a SyncIQ domain and a
SnapRevert domain were applied, and if the SyncIQ domain was set to read/write
mode, the SnapRevert job failed, and lines similar to the following appeared in
the /var/log/messages file and in the /var/log/isi_migrate.log file :

138780

isi_job_d[20805]: Man
Working(manager_from_worker_stopped_handler, 2012):
Error from worker 2:14-12-03 12:16:50
SnapRevert[409] Node 1 (1) task 2-1:
Snaprevert job finished with status failed: Unable to create and
getfile descriptor for tmp working directory:
Read-only file system(unrunnable)
from snap_revert_item_process(/usr/src/isilon/bin/isi_job_d/
snap_revert_job.c:730)
from worker_process_task_item(/usr/src/isilon/bin/isi_job_d/
worker.c:940)
isi_job_d[20805]:snap_revert_item_process:743: Snap revert job
finished with status failed: Unable to create and
get file descriptor for tmp working directory: Read-only file
system (unrunnable)
isi_job_d[1910]: SnapRevert[409]Fail

Due to a memory leak in the isi_webui_d process, while viewing SyncIQ reports
through the OneFS web administration interface, the isi_webui_d process
unexpectedly restarted. As a result, the OneFS web administration interface
stopped responding, and users who were logged into the OneFS web
administration interface were disconnected and returned to the log-in screen. In
addition, messages similar to the following appeared in the /var/log/
webware-errors file:

138731

isi_webui_d: siq_gc_conf_load: Failed to gci_ctx_new: Could not
allocate parser read buffer: Cannot allocate memory

Cluster configuration

Cluster configuration issues resolved in OneFS 7.2.0.2

ID

143453

If you attempted to reconfigure an existing file pool policy from the OneFS web
administration interface without selecting the disk or node pool in the Storage
Settings section again, an error similar to the following appeared, and the file pool
policy change was not saved:
File Pool Policy Edit Failed The edit to the file pool policy did
not save due to the following error: Invalid storage pool
'<storage-pool-name> (node pool)'

After a cluster that was configured with manual node pools was upgraded, it was
possible for the drive purpose database file (drive_purposing.db) to contain
incorrect node equivalence information for the nodes in the manual node pools.
Because OneFS relies on the information in the drive_purposing.db file when
provisioning nodes, if this issue was encountered, it might have prevented new
nodes from being provisioned.

142026

Diagnostic tools
Diagnostic tools issues resolved in OneFS 7.2.0.2

ID

141922

If you ran the isi_gather_info command with the --ftp-port <alt-port> --save-only
options, where <alt-port> was the name of the alternate FTP port to set as the new
default, the isi_gather_info command ignored the request, and used the default
FTP port (port 21) instead. As a result, the alternate FTP port number had to be
specified each time the isi_gather_info command was run.
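For reference, an invocation of the form shown below was affected; the port number
used here is purely illustrative:
isi_gather_info --ftp-port 2121 --save-only
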
Because the following isi_gather_info command options were processed
immediately before all other command options, the options that followed these
options were sometimes ignored:
--verify-upload
--save
--save-only
--re-upload

135541

As a result, the .tar file that is created when the isi_gather_info command
is run might not have been uploaded to Isilon Technical Support, and running the
command sometimes had unexpected results. For example, if you ran the following
command, the --ftp-proxy-host option was ignored:
isi_gather_info --verify-upload --ftp-proxy-host=x

If you ran the isi_gather_info command with the -f option (an option that
enables you to designate a specific directory to gather) and if you specified that
the /ifs/data/Isilon_Support directory should be gathered, the .tar file
that was created by the command could have been extremely large. This issue
occurred because /ifs/data/Isilon_Support is the default temporary
directory that is used to store the .tar files that are created when the
isi_gather_info command is run, and, as such, this directory might contain
previous .tar files that are large in size. In addition, the isi_gather_info -f
command gathers the contents of the /ifs/data/Isilon_Support directory
from each node in the cluster, multiplying the size of the resulting .tar file <x>
times, where <x> is the number of nodes in the cluster.
Note

Beginning in OneFS 7.2.0.1, if you run the isi_gather_info command with the
-f option, and if you specify that the /ifs/data/Isilon_Support directory
should be gathered, the following message appears on the console and the
command does not run:
WARNING: ignored path /ifs/data/Isilon_Support
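For example, the following invocation, which names the directory discussed above,
is rejected with the preceding warning:
isi_gather_info -f /ifs/data/Isilon_Support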

135540

Events, alerts, and cluster monitoring


Events, alerts, and cluster monitoring issues resolved in OneFS 7.2.0.2

ID

In some cases, a race condition between the I/O request packet (IRP) cancellation
callback function and the IRP dispatch function caused the lwio process to restart.
If the process restarted as a result of this issue, client connections to the cluster
were disrupted, and the following lines appeared in the /var/log/messages
file:

147471

/boot/kernel.amd64/kernel: /lib/libc.so.7:thr_kill+0xc
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwiocommon.so.
0:LwIoAssertionFailed+0xa3
/boot/kernel.amd64/kernel: /usr/likewise/lib/libiomgr.so.
0:IopFltContextReleaseAux+0x79/boot/kernel.amd64/kernel:
/usr/likewise/lib/libiomgr.so.0:IoFltReleaseContext+0x2f
/boot/kernel.amd64/kernel: /usr/lib/libisi_flt_audit.so.1:_init
+0x3b37
/boot/kernel.amd64/kernel: /usr/likewise/lib/libiomgr.so.
0:IopFmIrpCancelCallback_inlock+0x2af

146609

In OneFS 7.2.0.1, if a file whose name contained multibyte characters was audited,
the isi_audit_cee process did not decode the file name correctly when it forwarded
audit events to the EMC Common Event Enabler (CEE). As a result, the name of a file
that contained multibyte characters was incorrect within the auditing software.
Some information regarding NFS clients that were being audited, such as the
userID, was omitted from the audit stream. As a result, NFS clients could not be
correctly audited.

138945

Note

Although all of the necessary information regarding NFS clients is now included in
the audit stream, NFS clients might not be correctly audited by some auditing
software.
If memory allocated to the CELOG monitoring process (isi_celog_monitor) became
very fragmented, the isi_celog_monitor process stopped performing any work. As a
result, no new events were recorded, alerts regarding detected events were not
sent, and messages similar to the following were repeatedly logged in
the /var/log/isi_celog_monitor.log file:

138874

isi_celog_monitor[5723:MainThread:ceutil:92]ERROR: MemoryError
isi_celog_monitor[5723:MainThread:ceutil:89]ERROR: Exception in
serve_forever()

Note

Allocated memory is considered fragmented when it is not stored in contiguous
blocks. Memory allocated to the CELOG process is more likely to become
fragmented in environments with frequent configuration changes and in which
many CELOG events are being generated.
On the Cluster Status tab under Monitoring, the Cluster size pie chart did not
display Virtual Hot Spare (VHS) reserved space. VHS reserved space could be
viewed by running the isi status command from the command-line interface.

138737

Due to an error in the newsyslog.conf.1000MB and the
newsyslog.conf.500MB files, the /var/log/nfs_convert.log file was not rotated.

138675

Note

Log files that are not correctly rotated can grow in size, and might eventually fill
the /var partition, which can affect cluster performance.
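For reference, entries in these configuration files follow the standard FreeBSD
newsyslog format of path, mode, count, size, interval, and flags; an illustrative
entry for this log (the values shown are hypothetical) is:
/var/log/nfs_convert.log   640  5  1000  *  Z
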
Because commas were not correctly escaped in the output of the isi
statistics --csv command, if the data returned from the command contained
commas, the commas were treated as separators, and the data could not be
accurately interpreted by third-party monitoring tools.

138613
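For example, a field that itself contains a comma is normally enclosed in double
quotation marks so that the embedded comma is not treated as a separator (an
illustrative line, not actual command output):
"ext-1, node 1",1520,0.5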

If users attempted to access a file under an audited SMB share and the attempt
failed, the failed access attempts were not recorded in the audit log. As a result,
these events could not be tracked.

138068

File system

File system issues resolved in OneFS 7.2.0.2

ID

147475

If L3 cache was enabled on a cluster running OneFS 7.2.0.1, it was possible for
OneFS to erroneously report that the journal on one or more nodes was invalid. This
issue was more likely to affect S210 and X410 nodes.
Note

Although OneFS reported that a node's journal was invalid, the journal was actually
intact. This issue occurred because a OneFS script erroneously detected that the
journal was invalid.
If this issue occurred, the affected node or nodes could not boot, and the following
message appeared on the console:
Checking Isilon Journal integrity...
Attempting to save journal to default location
Warning: /etc/ifs/journal_bad exists. Saving bad journal.
OneFS is unmounted
A valid backup journal already exists. Not saving.
NVRAM autorestore status: Not performed...
Attempting to restore journal from disk backup...
Restore from disk failed
Attempting to save and restore journal to clear any ECC errors in
unused DRAM blocks...
Restore failed
Could not recover journal. Contact Isilon Customer Support
immediately.

On clusters with L3 cache enabled, if you updated SSD firmware by using an Isilon
Drive Support Package (DSP), it was possible to encounter an issue that could
cause data loss. If this issue occurred, data integrity (IDI) issues were reported as
an IDI event, and a critical event notification similar to the following was sent:

146182

Detected IDI failure on LIN 1:001c: 4758::HEAD, lbn 1005511 (fec)
2,12,2760679424:8192 (type user data)

For more information, see article 200097 on the EMC Online Support site.
When a node joins an Isilon cluster, the file system acquires a merge lock in order
to postpone joining the node until running file system operations are complete. In
rare cases, if an AutoBalance, FlexProtect, or MediaScan job was running while a
node was joining the cluster, the merge lock was not released in a timely manner,
and the merge lock timed out. If this occurred, the file system could not be
accessed until the issue was resolved. In addition, messages similar to the
following appeared in the /var/log/messages log file, where <time> was the
number of milliseconds that the merge lock was held before timing out:
error 85 from rtxn_exclusive_merge_lock_get after <time> ms

144214

If the lwio-device-srv symbolic link located in the /var/lib/likewise
directory became damaged, the srv service could not start on any nodes in the
cluster. If this occurred, SMB services were unavailable and SMB clients were
unable to connect to the cluster.

142835

Note

When a node is rebooted, srv (an lwio driver) creates a symbolic link named
lwio-device-srv in the /var/lib/likewise directory. Beginning in
7.2.0.2, if this symbolic link is damaged, the damaged symbolic link is overwritten
with a functioning copy.
142087

Although the UseDNS parameter was set to no in the /etc/ssh/sshd_config
file, if you connected to a node through SSH, establishing a connection to the node
took longer than expected, approximately 15 seconds. This issue occurred because
the UseDNS no parameter was not enforced.
Note

By default, the UseDNS parameter is set to yes. Setting the parameter to no
specifies that reverse DNS lookups should not be performed. It is typically used to
decrease the length of time it takes to establish an SSH connection to the cluster.
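For example, reverse DNS lookups for SSH connections are disabled by the
following line in the /etc/ssh/sshd_config file (shown here for illustration only):
UseDNS no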
In rare cases, while installing a drive firmware update on a node that contained
SSD drives that were configured to be used for L3 cache, data was sometimes
moved from the SSD drives too slowly, a condition that caused the node to reboot
unexpectedly. If this occurred, the following lines appeared in the /var/log/
messages file:

140906

panic @ time 1418749697.371, thread
0xffffff013ba275b0: l3 slow drain
cpuid = 0
Panic occurred in module kernel loaded at
0xffffffff80200000:
Stack:
--------------------------------------------------
kernel:drive_drain_timeout_cb+0x1ca
kernel:softclock+0x2ee
kernel:ithread_loop+0x208
kernel:fork_exit+0x75
--------------------------------------------------

If either L1 or L2 prefetch was disabled for a 4TB file, nodes that handled the file
unexpectedly rebooted while reading the last block of the file. If this issue
occurred, the following FAILED ASSERTION message appeared in
the /var/log/messages file:
*** FAILED ASSERTION end_l1 <= max_lbn @
/build/mnt/src/sys/ifs/bam/bam_file.c:1128

140639


Note

L1 and L2 prefetch is disabled on a file by default if the file is managed by a file
pool policy configured with the Optimize for random access data access pattern. L1
and L2 prefetch are also disabled by default if you run the isi set -a command
to configure the random or disabled file access patterns. L1 and L2 prefetch can
also be configured manually through the use of specific sysctl commands. For more
information about configuring file access patterns, see the OneFS 7.1.1 Web
Administration Guide and the OneFS CLI Administration Guide.
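For example, the random access pattern can be applied to a directory from the
command-line interface; the command below is an illustrative sketch, and the path
shown is hypothetical:
isi set -a random /ifs/data/example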
138487

While a node was disconnecting from the cluster (for example, while a node was
rebooting), it was possible for the node to encounter a race condition that caused a
deadlock between a transaction that was performing a batch operation and the
disconnect operation. If this issue occurred, the node unexpectedly rebooted.

137911

If you deleted a snapshot, the subsequent SnapshotDelete job might not have
deleted all the files. This issue was more likely to occur if the snapshot contained a
very large number of files (for example, 200,000 or more). If this issue occurred and
if you ran the isi job reports view command to view a job report for an
affected SnapshotDelete job, the output showed that only a portion of the logical
inodes (LINs) were deleted during phase two of the SnapshotDelete job. In
addition, subsequent SnapshotDelete jobs might have taken longer to complete
than expected until all of the residual files were eventually deleted.
141927

Because the Simple Network Management Protocol (SNMP) process is
single-threaded, and because the default behavior of the snmpget function used by SNMP
was to time out after one second and retry up to five times over six seconds, it was
possible for the snmp process to appear to stop responding to requests from
applications such as Nagios and Cacti.
Note

Beginning in OneFS 7.2.0.1, the snmpget function will time out after ten seconds
and will retry the affected request once.
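For comparison, when a cluster is queried with the standard net-snmp client, the
timeout and retry behavior can be set explicitly; the following command is an
illustrative sketch, and the host name and community string are hypothetical:
snmpget -v 2c -c public -t 10 -r 1 cluster.example.com SNMPv2-MIB::sysDescr.0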

Hardware
Hardware issues resolved in OneFS 7.2.0.2

ID

148129

If you ran the isi_inventory_tool command with the --startUp option on
an S210 node or an X410 node that contained components with EMC part
numbers, a CTO exception similar to the following appeared on the console and the
node might not have booted correctly:
WARNING: softAVL is missing Isilon P/N(s) for vendor
ID="303-409-000A-00", firmware ID="rp180b04",hwType="nvram",
hwName="LOx NVRAM"
CTO exception --P/N="105-575-001-01" is not valid

This issue occurred because the isi_inventory_tool --startUp command
was configured to handle part numbers that followed an NNN-NNNN-NN format (for
example, 123-4567-89), and EMC part numbers follow a different format.


145424

If an InfiniBand host channel adapter (HCA) was unresponsive, data transmission
between nodes in the cluster might have slowed. This condition could also have
caused OneFS to split one or more nodes from the cluster, adversely affecting client
connections to the cluster.
If a drive in an HD400 node was replaced while the drive was in the process of
being smartfailed, and if the node that contained the replaced drive was rebooted
before the smartfail process was complete, the affected node failed to mount
the /ifs partition. If this occurred, a message similar to the following appeared in
the /var/log/messages file:

142946

ifsd[2159]: ifs_work_request: IFS is umounted; exiting
isi_group_change_d: A mounted /ifs is required.
mount_efs: Reporting missing logical drive 55 with guid
54de6e460001d4de 578b6ef62a39f3e1 as DOWN
mount_efs: driveConfIdentifySSD: Error mapping lnum 55 to bay
/boot/kernel.amd64/kernel: [bam_vfsops.c:260](pid
1742="mount_efs")(tid=100145) too many drives: 61
root[2200]: IFS failed to mount. Aborting boot.

This issue occurred because the value assigned to the maximum number of logical
drives allowed was not updated to fully accommodate HD400 nodes. For more
information, see article 198924 on the EMC Online Support site.
On older nodes running OneFS 7.2.0.0 through 7.2.0.1, if the
getNumBatteries() function was called to count the number of NVRAM
batteries in the node, the function did not return the correct number. As a result,
processes that relied on this information might not have performed correctly. For
example, battery tests might not have been correctly configured.

142159

If you ran the isi firmware update command to update node firmware on an
X210 or an X410 node, the update failed and the following error appeared on the
console, where <X> was the number of the node on which the update failed:

142141

ERROR: Node <X>: failed to cold reset car and unable to get
completion code and bit flag

Note

This issue occurred only on X210 and X410 nodes with CMC firmware version 00.0f
or earlier. You can confirm your version of the CMC firmware by logging on to any
node in the cluster and running the following command:
isi firmware status

Under some circumstances, if a node containing self-encrypting drives (SEDs) that
could not be released from ownership was reimaged by using a USB flash drive,
after the USB flash drive was removed, the node failed to boot.

141986

Hardware

83

Resolved issues

Hardware issues resolved in OneFS 7.2.0.2

ID

Note

Beginning in OneFS 7.2.0.2, if a node containing SEDs that cannot be released from
ownership is reimaged by using a USB flash drive, before the node shuts down, the
following messages appear:
Failed to release SEDs, one or more drive(s) will be in SED_ERROR
state after reimage is complete and will require a PSID revert.
This may result in /ifs being unable to mount.
Press Enter to continue

If you press ENTER, the reimage process completes, and the node shuts down.
When the node is subsequently booted, you might be required to manually revert
the affected SEDs to restore the node to normal operation.
If the /var/db/hwmon/isi_hwmon.p file was damaged and you attempted to
start the isi_hwmon service, the service failed to start. In addition, lines similar to
the following appeared in the /var/log/isi_mcp file, confirming repeated
attempts to restart the isi_hwmon service:

141929

isi_mcp[1894]: FAILED on action list 'start': action 1/1 SERVICE
isi_hwmon (pid=3722) returned exit status 1
isi_mcp[1894]: Action list 'start' has completed. Releasing
shared lock 0x80393e690 (pid=3722)
isi_mcp[1894]: Executing 'start' actions of service 'isi_hwmon'.
isi_mcp[3937]: Executing '/usr/bin/isi_hwmon -m' command of
actionlist 'start'.
isi_mcp[1894]: FAILED on action list 'start': action 1/1 SERVICE
isi_hwmon (pid=3937) returned exit status 1
isi_mcp[1894]: Action list 'start' has completed. Releasing
shared lock 0x80393e690 (pid=3937)

Note

Beginning in OneFS 7.2.0.2, if a damaged /var/db/hwmon/isi_hwmon.p file is
encountered, the file is recreated and the following message is logged in
the /var/log/isi_hwmon.log file:
Exception while opening /var/db/hwmon/isi_hwmon.p. Reinitializing
file.

Under rare circumstances, a failing drive caused a node to restart unexpectedly. If
this occurred, the following lines appeared in the /var/log/messages file:
Stack:
--------------------------------------------------
kernel:trap_fatal+0x9f
kernel:trap_pfault+0x293
kernel:trap+0x323
kernel:devstat_end_transaction+0x3c
kernel:g_disk_ioctl+0x20e
kernel:g_part_ioctl+0x94
efs.ko:drv_sync_drive_cache+0xbd
efs.ko:jdr_do_sync_one+0x101efs.ko:kt_main+0x80
kernel:fork_exit+0x7f
--------------------------------------------------

139718

139697

If the isi_bootdisk_read_test was run on a node, messages related to the test,
including extraneous messages similar to the following, appeared in
the /var/log/messages file on all nodes in the cluster:
isi_bootdisk_read_test: Running bootdisk read test on ad2.
isi_bootdisk_read_test: Running bootdisk read test on ad2.
isi_bootdisk_read_test: Running bootdisk read test on ad2

Beginning in OneFS 7.2.0.1, the preceding messages are no longer logged, and the
relevant messages related to this test appear only on the node on which the
isi_bootdisk_read_test is run.
Even though a drive was smartfailed, physically removed, and replaced, the old
drive appeared in the output of the isi devices list command in a
suspended, smartfailed, or erased state. For example:
Unavailable drives:
Lnum 40  [SUSPENDED]  Last Known Bay N/A
Unavailable drives:
Lnum 40  [SMARTFAIL]  Last Known Bay N/A
Unavailable drives:
Lnum 40  [ERASE]      Last Known Bay N/A

138207

If you then ran the isi devices -a smartfail -d <device> command to
smartfail the drive in question, where <device> is the drive to be smartfailed, an
error similar to the following appeared on the console:
isi: error: Unknown drive: '<device>' does not map to a valid bay
or lnum

137271

After you installed an Isilon Drive Support Package (DSP) on a cluster, the year and
month of the date recorded in the /var/log/isi_dsp_tool.log file was
overwritten. Because the day of the month was not also overwritten, it was possible
for the resulting date to be invalid. For example, the date could have been changed
to February 31st. If this occurred, an error similar to the following appeared on the
console during the post-install verification phase of the installation:
ValueError: day is out of range for month

Although an error appeared, the DSP was successfully installed. For more
information, see article 194343 on the EMC Online Support site.
If the LCD server process was unable to communicate with the LCD on the front
panel of a node, extraneous messages were repeatedly logged in the /var/log/
messages file.

136603

Note

Beginning in OneFS 7.2.0.2, if this condition is encountered, informative messages
will continue to be logged; however, the following message will no longer be
logged:
ERROR lcd.daemon: server Traceback (most recent call last):
File "/usr/local/lib/python2.6/site-packages/isi/ui/lcd/
daemon.py", line 209, in serve_forever
File "/usr/local/lib/
python2.6/site-packages/isi/ui/lcd/display.py", line 849, in
open
File "/usr/local/lib/python2.6/site-packages/isi/ui/lcd/
noritake.py", line 247, in open


File "/usr/local/lib/python2.6/site-packages/isi/ui/lcd/
noritake.py", line 357, in verifyModel
File "/usr/local/lib/python2.6/site-packages/isi/ui/lcd/
noritake.py", line 350, in waitForResponse
LCDError: LCD did not respond

If you ran the isi firmware status command, OneFS might have
encountered an error while attempting to log a value that was too large. If this
occurred, the following error appeared on the console after running the isi
firmware status command:

135083

File "/usr/local/lib/python2.6/logging/handlers.py",line 804, in


emiterror:
[Errno 40] Message too long

Job engine
Job engine issues resolved in OneFS 7.2.0.2
When certain jobs were run, the isi_job_d process created temporary files in
the /var/tmp directory. Files written to this directory are stored on the cluster's
boot flash drives. In rare cases, writing to the boot flash drives could cause
excessive wear and premature boot flash drive failure.
Beginning in OneFS 7.2.0.2, the temporary files are created in the /
ifs/.ifsvar/tmp/jobengine directory.

141951

If a snapshot, or the first of a set of snapshots, was empty when the snapshot
delete job ran, the isi_job_d process failed, and lines similar to the following
appeared in the /var/log/messages log file:

140865

Note

The job recovered after 30 to 60 minutes.


Stack:
--------------------------------------------------
/lib/libc.so.7:thr_kill+0xc
/usr/lib/libisi_util.so.1:isi_assert_halt+0xa0
/usr/bin/isi_job_d:sdl_lin_range_job_finalize+0x4a7
/usr/bin/isi_job_d:job_virt_finalize+0x51
/usr/bin/isi_job_d:job_phase_done+0xb68
/usr/bin/isi_job_d:coord_task_checkpoint+0x31d
/usr/bin/isi_job_d:coord_from_director_task_done_handler+0x292
/usr/bin/isi_job_d:coord_from_director_handle_task_done+0x55
/usr/bin/isi_job_d:handle_msg+0x15f5
/usr/bin/isi_job_d:coord_main+0xad6
/usr/bin/isi_job_d:main+0xcbf
/usr/bin/isi_job_d:_start+0x8c

Migration
Migration issues resolved in OneFS 7.2.0.2
147197

After an initial VNX data migration, if a source file was replaced by a file that was a
Block Device file or a Character Device file with the same name, the new file was
not copied to the target during the next or subsequent incremental data migrations.
As a result, the Block Device file or Character Device file was not backed up and the
target cluster contained some files that no longer existed on the source cluster.

147004

The isi_vol_copy_vnx utility did not copy alternate data streams to the target
cluster. As a result, operations that relied on the alternate data streams of migrated
files failed on the target cluster.
After an initial VNX migration, if a file on the source array was replaced by a
symbolic link with the same name, during the next incremental migration, the
symbolic link was not migrated to the target. As a result, the data on the target
cluster did not precisely match the data on the source array.

146151

Networking
Networking issues resolved in OneFS 7.2.0.2
On the Cluster Management > Network Configuration page in the OneFS web
administration interface, if you enabled the int-b interface and the InfiniBand (IB)
internal failover network and specified a valid subnet mask, and then assigned the
same IP address range or overlapping IP address ranges to the int-b network and
the IB failover network, a Subnet overlaps error appeared and you could not
edit the configuration.

142889


Although it is a valid configuration, if the same static route was assigned to
different SmartConnect node pools, messages similar to the following were
repeatedly logged in the isi_flexnet_d.log file:

142068

isi_flexnet_d[1399]: Adding static route <IP address> on
interface: lagg1 via <IP address>

If you configured the auto-unsuspend-delay parameter to prevent
automatically unsuspended nodes from serving requests to a designated IP pool
for a specified period of time, and if a node that was serving requests to that IP
pool was rebooted, the affected node might have remained suspended for a period
of time that was longer than the time period specified by the auto-unsuspend-delay
parameter. As a result, DNS replies did not provide the IP address of the
affected node for a longer period of time than was expected.

142065

Note

This issue did not affect nodes that were rebooted following an upgrade.
141924

A race condition sometimes occurred when the isi_flexnet_d and isi_dnsiq_d
processes were both configuring IP addresses. If this condition occurred, the nodes
restarted unexpectedly, and lines similar to the following appeared in
the /var/log/messages file:
Stack:
--------------------------------------------------
kernel:trap_fatal+0x9f
kernel:trap_pfault+0x287
kernel:trap+0x313
kernel:sysctl_iflist+0x1e7
kernel:sysctl_rtsock+0x200
kernel:sysctl_root+0x121
kernel:userland_sysctl+0x18f


kernel:__sysctl+0xa9
kernel:isi_syscall+0x64
kernel:syscall+0x26e
--------------------------------------------------

If the isi networks --dns-servers and the isi networks dnscache
disable commands were run to update the DNS configuration, the updates were
written to the /etc/nsswitch.conf.tmp temporary file before being moved to
the /etc/nsswitch.conf file. Because an error in isi_dns_update prevented
the temporary file from closing, the updated information was not moved to
the /etc/nsswitch.conf file. As a result, messages similar to the following
were repeatedly written to the /var/log/isi_flexnet_d.log file:

141920

isi_flexnet_d: /usr/bin/isi_dns_update caught '<type
'exceptions.AttributeError'>'; traceback:
File "/usr/bin/isi_dns_update", line 240, in main
setDnsInfo(domains, servers, options)
File "/usr/bin/isi_dns_update", line 195, in setDnsInfo nssDirty
= processNsswitchConf(dnsON)
File "/usr/bin/isi_dns_update", line 177, in processNsswitchConf
nnsf.close()
isi_flexnet_d[933]: DNS update script did not exitcleanly (0x4600)

141917

Although the auto-unsuspend-delay timeout parameter was enabled, if the
cluster was configured with dynamic IP address allocation and IP failover, it was
possible for SmartConnect to rebalance IP addresses to a node before the specified
auto-unsuspend-delay timeout period had elapsed. If this occurred, it was
possible for the IP address that clients were using to connect to the cluster to be
moved to the node before all of that node's services were available. Affected
clients might have been disconnected from the cluster or temporarily prevented
from performing tasks related to those services.
If you ran the isi network dnscache statistics command to view the
DNS cache statistics, the DNS cache statistics were not displayed, and an error
message similar to the following appeared on the console:

141587

show statistics
^
error: expecting {cache,cluster,debug,dns,parameters,server}

On the Cluster Management > Network Configuration page of the OneFS web
administration interface, it was possible to configure multiple subnets with the
same gateway priority value, even though gateway priority values must be unique.
If multiple subnet gateways were configured with the same priority value, users
were unable to access the cluster from a client in one subnet, but could
successfully connect to the same cluster from a client in a different subnet.

140368

Note

It is not possible to configure multiple subnet gateways with the same priority value
from the command-line interface.
For more information, see article 88862 on the EMC Online Support site.
If a client used statically assigned cluster IP addresses to mount the cluster, and if
that client was connected to the cluster through SMB 2, the client could be

88

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

139170

Resolved issues

Networking issues resolved in OneFS 7.2.0.2


disconnected if the node was rebooted or shut down, for any reason. If this issue
occurred, the client was unable to reconnect to the cluster for 45 to 90 seconds.
Although you could configure three DNS servers through the OneFS web
administration interface, information about the third server was not added to the
local host entry of the /etc/resolv.conf file. As a result, only two of the
configured DNS servers were available, and queries failed if both of those DNS
servers were unavailable.

139044

If a node was configured so that both of its interfaces responded to traffic on a
VLAN and then one interface was later removed from all pools associated with that
VLAN, the interface was not always immediately removed from the VLAN
configuration, and IP addresses were not always immediately disassociated from the
removed interface. As a result, clients could temporarily continue to connect to the
affected node through IP addresses assigned to the removed interface.

138727

If you removed a gateway from a subnet, either through the OneFS web
administration interface or the command-line interface, the IP address for the
gateway remained in the routing table. As a result, if you ran the netstat
command to view information about the network configuration, the IP address that
was removed continued to appear in the output.

133973

If source-based routing (SBR) was enabled and static routes were also configured,
it was possible for SBR to override the static routes.

123581

Note

Beginning in OneFS 7.2.0.2, if SBR is enabled and static routes are also
configured, SBR excludes the static routes from SBR management.

NFS
NFS issues resolved in OneFS 7.2.0.2
When an NFSv4 client initiated a request to mount the pseudo file system, the
information that OneFS returned about the file system indicated that the maximum
file size allowed within the system was zero. As a result, some NFSv4 clients (for
example, AIX 6.1 clients) did not attempt to mount the file system.

143912

142269

While OneFS was closing an idle client connection to an NFS export, it was possible
to encounter a race condition. If this race condition was encountered, the NFS
server unexpectedly restarted and NFSv4 clients were disconnected from the
cluster. In addition, the following lines appeared in the /var/log/messages
file:
/usr/likewise/lib/lwio-driver/nfs.so:__svc_zc_clean_idle+0x1f7
/usr/likewise/lib/lwio-driver/nfs.so:rendezvous_request+0x7f6
/usr/likewise/lib/lwio-driver/nfs.so:svc_getreq_xprt+0x120
/usr/likewise/lib/lwio-driver/nfs.so:NfsListenerProcessTask+0x3b
0x800f15e5c (lookup_symbol: error copying in Ehdr:14)
0x800f1da9e (lookup_symbol: error copying in Ehdr:14)
0x8014f56bd (lookup_symbol: error copying in Ehdr:14)

If an NFS client that had placed an advisory lock on a system resource
unexpectedly shut down, the lock might not have been released when the client

rebooted and reconnected to the cluster. As a result, the locked resources might
have been inaccessible until the lock was manually released.

142074

141533

If you ran a command from an NFSv3 or NFSv4 client to query for files or directories
in an empty folder, and if you included the asterisk (*) or question mark (?)
characters in the command, the query failed and an error message appeared on the
console. For example, if you ran the ls * command, the command failed and the
following error appeared on the console:
ls: cannot access *: Too many levels of symbolic links

If an NFSv4 client sent a request to the cluster while the file system was
unavailable (for example, while nodes were rebooting), OneFS returned the wrong
response and did not correctly disconnect the client. If this occurred, lines similar
to the following appeared in the /var/log/messages file:

140511

nfs[8962]: [nfs] SERVERFAULT on v4 operation 9, ntStatus
0xefff0066 (UNKNOWN)

Note

Beginning in OneFS 7.2.0.2, if an NFSv4 client sends a request to the cluster while
the file system is unavailable, the client is disconnected from the cluster and an
informative message is logged in the /var/log/messages file.
140372

Under some circumstances, although an NFS export was configured to return 32-bit
file IDs for files created within the export, 64-bit file IDs were instead sent to the
client. As a result, the client could not access files on the cluster.
139910

In environments where many NFSv4 clients were reading from and writing to the
cluster, it was possible to encounter a condition that enabled a memory resource to
be over-allocated. If this issue occurred, the following lines appeared in
the /var/log/messages file:
/lib/libc.so.7:thr_kill+0xc
/lib/libc.so.7:__assert+0x35
/usr/likewise/lib/lw-svcm/nfs.so:xdr_iovec_allocate+0x191
/usr/likewise/lib/lw-svcm/nfs.so:svc_zc_getrec+0x1db
/usr/likewise/lib/lw-svcm/nfs.so:svc_zc_recv+0xa1
/usr/likewise/lib/lw-svcm/nfs.so:svc_getreq_xprt+0x11e
/usr/likewise/lib/lw-svcm/nfs.so:NfsSocketProcessTask+0x415
/usr/likewise/lib/liblwbase.so.0:EventThread+0x6b0
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0x100
/lib/libthr.so.3:_pthread_getprio+0x15d

The isi_cbind command did not parse numbers correctly. As a result, the
command could not be used to change settings that required a numeric value.

139008

OneFS web administration interface


OneFS web administration interface issues resolved in OneFS 7.2.0.2
If the name of your cluster started with a capital letter or a lowercase letter a or
letter b, and you clicked Start Capture on the Cluster Management >
Diagnostics > Packet Capture page of the OneFS web administration interface,

the resulting .tar file did not contain the expected network packet capture
(pcap) file, and the .tar file also contained some incorrect content.

141970

SmartLock
SmartLock issues resolved in OneFS 7.2.0.2
On clusters running in compliance mode, the compadmin user did not have
permission to run the newsyslog command. As a result, the compadmin user could
not manually rotate OneFS log files.

141953

SMB
SMB issues resolved in OneFS 7.2.0.2
147473

In some cases, while the lwio process was shutting down on a node (because it
was manually or automatically restarted), the lwio SRV component waited
indefinitely for a file object to be freed and did not shut down. If this occurred, after
5 minutes, the SRV service was stopped by the lwsm process and then
automatically restarted. SMB clients were unable to connect to the affected node
until the SRV service restarted.
147470

Distributed Computing Environment (DCE) Remote Procedure Calls (RPCs) that were
sent to the cluster in big-endian byte order were not correctly handled. As a result,
clients with CPUs designed to format RPCs in big-endian byte order (including
PowerPC-based clients) were unable to communicate with the cluster. For
example, PowerPC-based clients running Mac OS 10.5 and earlier were unable to
connect to SMB shares. If a packet capture was gathered to diagnose this issue, an
nca_invalid_pres_context_id RPC reject status code appeared in the
packet capture.
144100

Although path names that are up to 1024 bytes in length are supported in OneFS
7.2.0.x, if a user who was connected to the cluster from an SMB client attempted to
rename a file on the cluster in Windows Explorer, and if the full path to the renamed
file was greater than 255 bytes in length, the file was not renamed and the
following error appeared:
The file name(s) would be too long for the destination folder.
You can shorten the file name and try again, or try a location
that has a shorter path.

If you ran the isi smb settings shares modify command with the
--revert-impersonate-user option to restore the --impersonate-user
option applied to a share to the default value, the command did not take effect
until the lwio process was restarted.

142066

After upgrading a cluster to OneFS 7.2.0.0 through OneFS 7.2.0.1, Linux and Mac
clients connecting to the cluster through SMB 1 were unable to view or list SMB
shares. If an affected Linux client attempted to list shares, the following error
appeared:

142060

NT_STATUS_INVALID_NETWORK_RESPONSE


If an affected Mac client attempted to view shares in the Finder, an error similar to
the following appeared:
There was a problem connecting to the server.

As a result, SMB shares were not accessible to those Linux and Mac clients.
If an SMB2 client sent a compound request to the cluster, OneFS did not send the
correct response. As a result, the client was disconnected from the cluster.

141961

In rare instances, if an SMB1 echo request was received on an SMB2 connection,
the lwio process restarted unexpectedly. If the lwio process restarted, SMB clients
connected to the cluster were disconnected, and messages similar to the following
appeared in the /var/log/messages file:

141943

/boot/kernel.amd64/kernel: [kern_sig.c:3376](pid 30325="lwio")
(tid=100436) Stack trace:
/boot/kernel.amd64/kernel:
Stack:
--------------------------------------------------
/boot/kernel.amd64/kernel: /lib/libc.so.7:thr_kill+0xc /boot/
kernel.amd64/kernel: /usr/likewise/lib/liblwbase_nothr.so.
0:__LwRtlAssertFailed+0x13c
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
srv.so:SrvProtocolExecute2+0x115f
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
srv.so:SrvProtocolTransport2DriverDispatchPacket+0x2f2
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
srv.so:SrvProtocolTransportDriverNegotiateData+0xe4a
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
srv.so:SrvSocketProcessTaskReadBuffer+0x485
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
srv.so:SrvSocketProcessTaskRead+0x36
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
srv.so:SrvSocketProcessTask+0x53f
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase.so.
0:RunTask+0x8d
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase.so.
0:ProcessRunnable+0x95
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase.so.
0:EventLoop+0xeb
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase.so.
0:EventThread+0x3f
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase.so.
0:LwRtlThreadRoutine+0x8e
/boot/kernel.amd64/kernel: /lib/libthr.so.3:_pthread_getprio+0x15d

141323

If the SMB2Symlinks option was disabled on the cluster and a Windows client
navigated to a symbolic link that pointed to a directory, under some circumstances,
the system returned incorrect information about the symbolic link. If this occurred,
the symbolic link appeared to be a file, and the referenced directory could not be
opened.
In addition, because OneFS 7.2.0.1 did not consistently check the OneFS registry to
verify whether the SMB2Symlinks option was disabled, in some cases, although
the SMB2Symlinks option was disabled, the lwio process attempted to handle
symbolic links when it should have allowed them to be processed by the OneFS file
system. If this occurred, the following error appeared on the client:
The symbolic link cannot be followed because its type is disabled.

138763
If both the antivirus Scan files when they are opened option and the SMB
Performance Settings Oplocks option were enabled, and a file was opened,
modified, and closed multiple times through an application such as Microsoft
Excel, it could take 30 seconds longer than expected for the system to save
changes to the file.

138594

If you attempted to create an SMB share of the /ifs/.snapshot directory or one
of its subdirectories through the OneFS web administration interface or the
command-line interface, an error similar to the following appeared:
'/ifs/.snapshot' is under '/ifs/.snapshot': Invalid argument

If an SMB client attempted to access an application through a symbolic link that
contained Unicode characters, a backslash (\) followed by a zero (0) was
sometimes appended to the symbolic link. As a result, the symbolic link did not
lead to its intended target, and the application could not start.

137822

In Microsoft Windows, if you ran the mklink command to create a symbolic link to
a file or directory in an SMB share on the cluster, the command failed and the lwio
process sometimes unexpectedly restarted, if the name of the symbolic link began
with a colon (:). In addition, the following error appeared on the console:

137820

The specified network name is no longer available

An issue sometimes occurred that prevented access to absolute paths to files
through symbolic links. If this issue occurred, the link failed to return the file, and
the requested file could not be opened.

137772

Because OneFS did not respond correctly to a specific Local Security Authority
(LSA) request made by Mac OS 10 clients running Mac OS 10.6 through 10.10, the
ACLs and POSIX owner applied to an affected share could not be viewed from Mac
OS 10 clients running those versions.

135560

Upgrade and installation


Upgrade and installation issues resolved in OneFS 7.2.0.2
During a OneFS upgrade, there was a window of opportunity during which the
array.xml file on some nodes in the cluster could have contained out-of-date
version information. If a node whose array.xml file was out-of-date sent
messages to a node whose array.xml file was current, the affected node
exhibited unexpected behavior, such as random group changes.

146937

Note

Although the array.xml file on the affected node contained out-of-date
information about the version of OneFS installed on the node, the node was
successfully upgraded. The unexpected node behavior was resolved when the
array.xml file was eventually updated.
If this issue occurred, messages similar to the following appeared on the console:
/boot/kernel.amd64/kernel: [gmp_rtxn.c:2636](pid 5052="kt: gmpconfig")(tid=100178) gmp config took 0s
/boot/kernel.amd64/kernel: [gmp_info.c:1734](pid 5052="kt: gmpconfig")(t/boot/kernel.amd64/kernel: id=100178) group change:
<8,1787394> [up: 6 nodes, down: 123 nodes, shutdown_read_only: 3
nodes] (no change)


/boot/kernel.amd64/kernel: [gmp_info.c:1735](pid 5052="kt: gmpconfig")(tid=100178) new group: <8,1787394>: { 8:0-11,13-22,
11:0-23, 46,55,70,93:0-11, down: 2, 4-7, 9-10,12-45, 47-54,
56-69, 71-92, 94-131, shutdown_read_only: 84, 91, 126, diskless:
100-108, 119-120, 123 }
/boot/kernel.amd64/kernel: [gmp_rtxn.c:2636](pid 5052="kt: gmpconfig")(tid=100178) gmp config took 0s
/boot/kernel.amd64/kernel: [gmp_info.c:1734](pid 5052="kt: gmpconfig")(tid=100178) group change: <8,1787395> [up: 6 nodes,
down: 123 nodes, shutdown_read_only: 3 nodes] (no change)

If a OneFS upgrade was performed while nodes were down, the SmartPools portion
of the upgrade failed without presenting an error or logging a CELOG event. If this
issue occurred, new nodes could not be added to the cluster and nodes that were
removed (for example, nodes that were smartfailed) could not be re-added to the
cluster.
If you encountered this issue, and you ran the following command, the disk pool
version listed was not correct for the version of OneFS to which the cluster was
upgraded:

139285

isi_for_array -s 'sysctl efs.bam.disk_pool_db | grep version'

Note

The correct disk pool version for clusters running OneFS 7.2.0.x is version 8.
If a USB flash drive with a bootable image of OneFS was attached to a node while
the node was being smartfailed, the partition table on the flash drive became
damaged. As a result, the node could not boot from the flash drive after it was
smartfailed, and the image on the flash drive was unusable.

110337

Virtual plug-ins
Functional area

Virtual plug-ins issues resolved in OneFS 7.2.0.2

ID

Virtual plug-ins

Attempts to register a OneFS 7.2.0 cluster as a VASA provider failed if
the cluster had no iSCSI LUNs configured, and, following the failed
registration, portions of the OneFS web administration interface became
inaccessible. In addition, the httpd process unexpectedly restarted and
the following lines appeared in the /var/crash/httpd.log file:

138741

prodisi1-6(id6) /boot/kernel.amd64/kernel: [kern_sig.c:
3349](pid 52204="httpd")(tid=100097)
Stack trace:prodisi1-6(id6)/boot/kernel.amd64/kernel:
Stack:
-------------------------------------------------prodisi1-6(id6) /boot/kernel.amd64/kernel:
/usr/lib/
libisi_vasa_service.so:_ZNK15vasa_db_manager34get_associa
ted_ports_for_processorERKSt6vectorISsSaISsEERKSsPP9isi_e
rror+0xcaprodisi1-6(id6) /boot/kernel.amd64/kernel:
/usr/lib/
libisi_vasa_service.so:_ZNK16vasa_server_impl32query
AssociatedPortsForProcessorEPP9isi_errorP38_ns4__query
AssociatedPortsForProcessorP46_ns4__queryAss
+0x504prodisi1-6(id6)
/boot/kernel.amd64/kernel:


/usr/lib/libisi_vasa_service.so:_Z39__ns5__query
AssociatedPortsForProcessorP4soapP38_ns4__query
AssociatedPortsForProcessorP46_ns4__query
AssociatedPortsForProce+0x102prodisi1-6(id6)
/boot/kernel.amd64/kernel:
/usr/lib/
libisi_vasa_service.so:_Z50soap_serve___ns5__query
AssociatedPortsForProcessorP4soap+0xf7prodisi1-6(id6)
/boot/kernel.amd64/kernel:
/usr/lib/libisi_vasa_service.so:_Z10soap_serveP4soap
+0x58prodisi1-6(id6)
/boot/kernel.amd64/kernel:
/usr/local/apache2/modules/libmod_gsoap.so:_init
+0x1b66prodisi1-6(id6)
/boot/kernel.amd64/kernel:
/usr/local/apache2/bin/httpd:ap_run_handler
+0x72prodisi1-6(id6)
/boot/kernel.amd64/kernel:
/usr/local/apache2/bin/httpd:ap_invoke_handler
+0x7eprodisi1-6(id6)
/boot/kernel.amd64/kernel:
/usr/local/apache2/bin/httpd:ap_process_request
+0x18eprodisi1-6(id6)
/boot/kernel.amd64/kernel:
/usr/local/apache2/bin/httpd:ap_process_http_connection
+0x13dprodisi1-6(id6)
/boot/kernel.amd64/kernel:
/usr/local/apache2/bin/httpd:ap_run_process_connection
+0x70prodisi1-6(id6)
/boot/kernel.amd64/kernel:
/usr/local/apache2/bin/httpd:worker_thread
+0x24bprodisi1-6(id6)
/boot/kernel.amd64/kernel:
/lib/libthr.so.3:_pthread_getprio+0x15d

Resolved in OneFS 7.2.0.1


Antivirus
Antivirus issues resolved in OneFS 7.2.0.1

ID

AVScan reports were deleted from the OneFS system 24 hours after the job
successfully completed because the end date for the reports was incorrectly set to
1970-01-01.

113563

Note

Detected threats could still be viewed through the AVScan database.

Authentication
Authentication issues resolved in OneFS 7.2.0.1

ID

If different nodes in a cluster were connected to different network subnets and if


those subnets were assigned to different Active Directory sites, the site
configuration information on the cluster was repeatedly updated. Because updates

138750

Resolved in OneFS 7.2.0.1

95

Resolved issues

Authentication issues resolved in OneFS 7.2.0.1

ID

to the site configuration information require a refresh of the lsass service, this
behavior caused authentication services to become slow or unresponsive.
On a cluster with multiple access zones configured that was upgraded from OneFS
7.0.x or earlier to OneFS 7.2.0.0, if you attempted to create a local user from the
command line interface or through the OneFS web administration interface in an
access zone other than the System access zone, an error similar to the following
appeared, and the user could not be added to the access zone:

135537

Failed to add user <username>: SAM database error

Intermittently, incoming SMB sessions were successfully authenticated and


received the correct username, but were mapped to the wrong SID. As a result,
audit logs associated the incorrect SID with the affected user and the affected user
was denied access to their files. To resolve the problem, the lsass process had to
be restarted on all nodes in the cluster.

135182

If you ran a recursive chmod command to add, remove, or modify an access control 134860
entry (ACE) to a directory that contained files that were quarantined by an antivirus
scan, the command stopped running when it encountered a quarantined file. As a
result, ACEs were only modified on the files and directories that were processed
before the command stopped running.
In the OneFS web administration interface, if you created a user mapping rule that
134825
contained incorrect syntax related to the use of quotation marks, the following error
appeared when you attempted to save the updated Access Zone Details:
Your access zone edit was not saved
Error #1: Rules parsing failed at ' ': syntax error, unexpected
QUOTED, expecting BINARY_OP or UNARY_OP

In addition, future attempts to create mapping rules sometimes failed.

Backup, recovery, and snapshots


Backup, recovery, and snapshots issues resolved in OneFS 7.2.0.1

ID

A SyncIQ job configured with the --disable_stf option set to true sometimes
failed when an sworkera process responsible for transferring data during
replicationdetected differences between files on the source and target clusters
and then attempted to access and update the linmap database. If a SyncIQ job
failed as a result of this issue, the following error appeared in the
isi_migrate.log file:

132579

A work item has been restarted too many


times. This is usually caused by a network failure or a persistent
worker crash.

If a Multiscan or Collect job was running, it was possible for the job to attempt to
update the snapshot tracking file (STF) for a snapshot at the same time that a write
was made to a file under that snapshot. If this occurred, and if the STF file
contained a large number of files (in the millions), it was possible for the Multiscan
or Collect job to fail to account for some blocks of data in the STF file, or to account

96

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

138403

Resolved issues

Backup, recovery, and snapshots issues resolved in OneFS 7.2.0.1

ID

for some blocks of data more than once. If this issue occurred, errors similar to the
following appeared in the /var/log/idi.log file:
Malformed block history: marking free block

or
Malformed block history: freeing free block

Note

In addition to the errors that were logged, a coalesced event appeared in the list of
new events on the Dashboard > Events > Summary page in the OneFS web
administration interface. The event ID, which can be found by clicking View
details in the Actions column, was 899990001, and the message was as follows:
File system problems detected

The NDMP process ignored the protocol version setting in the config.xml file. As 135187
a result, only NDMP version 4 messages were accepted and sent.
In environments with a large number of configured SyncIQ policies, the
isi_classic sync job report and isi_classic sync list
commands sometimes took several minutes to return a list of SyncIQ reports.

135183

The NDMP process unexpectedly restarted after attempting to back up a symbolic


link that referenced a file whose name contained EUC-JP encoded characters. If the
NDMP process restarted as a result of this issue, the in-progress backup job failed.

134846

If the paths added to the NDMP EXCLUDE or FILES environment variables exceeded 134845
the maximum length allowed1024 charactersthe affected backup job would fail
and an error similar to the following appeared in the ndmp_debug.log file:
ERRO:NDMP fnmmatching.c:413:isi_fnm_is_valid_pattern
Exclude pattern longer than 1024 limit

Note

The maximum length allowed is now handled by the Data Management Application
(DMA).
In rare circumstances, the isi_snapshot_d process failed due to an internal error
but the process would not exit. As a result, it was not possible to create new
scheduled snapshots or to recover previous versions of snapshot files created by
the scheduling system, and the following error message appeared in
the /var/log/isi_snapshot_d.log file, where [####] is the PID for the
isi_snapshot_d service:

134808

isi_snapshot_d[####]: Unable to manage orphaned snapshots:


Socket is not connected

In environments with a large number of configured SyncIQ policies, the isi sync
job report and isi sync list commands sometimes took several minutes
to return a list of SyncIQ reports.

134429

Backup, recovery, and snapshots

97

Resolved issues

Backup, recovery, and snapshots issues resolved in OneFS 7.2.0.1

ID

SmartLock compliant files and directories that were backed up through an NDMP
file list back up could not be restored to a SmartLock domain. This issue occurred
because the selected files were not backed up in SmartLock compliance mode. If
this issue occurred, lines similar to the following appeared in the
ndmp_debug.log file:

134227

Restoring NDMP files from <source_smartlock_domain> to [See line


below]
Restoring NDMP files from [See line above] to
<target_smartlock_domain>
DAR disabled - continuing restore without DAR
Attempting normal restore.
Cannot extract non-Compliant archive entry to a SmartLock
Compliance directory.

Cluster configuration
Cluster configuration issues resolved in OneFS 7.2.0.1

ID

Lwio subscriptions held by the isi_gconfig_d process were not always released in a
timely manner. As a result, the subscriptions sometimes accumulated. If a large
number of subscriptions accumulated, it sometimes took a long time to release
these resources back to the system and it was possible for the isi_gconfig_d
process to become unresponsive until the operation was complete. Because the
isi_gconfig_d process is responsible for maintaining SMB share configuration
information, if this issue occurred, SMB clients were prevented from viewing or
creating shares, and messages similar to the following appeared in
the /var/log/lwiod.log file:

139741

lwio[4814]: StoreChangesWatcherThreadRoutine store error


subscription request failed: could not update local database:
cluster database(revision 0) older than local database (revision
3)lwio[83454]: StoreChangesWatcherThreadRoutine store error did
not
get response from server

Command-line interface
Command-line interface issues resolved in OneFS 7.2.0.1

ID

If you ran the isi status -d -w command in an environment with long pool
names, the pool names broke into multiple lines in the outputas many as were
needed to fit into the table. Because the table was not widened to accommodate
the pool name, this caused issues with scripts that parse the output in the table.

134717

Events, alerts, and cluster monitoring

98

Events, alerts, and cluster monitoring issues resolved in OneFS 7.2.0.1

ID

The safe.id.nvram onsite verification test (OVT) did not include support for the
version 2.1 MLC NVRAM card model. As a result, the safe.id.nvram test failed and

139905

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

Resolved issues

Events, alerts, and cluster monitoring issues resolved in OneFS 7.2.0.1

ID

errors similar to the following appeared on the console and in the /


ifs/.ifsvar/ovt log files:
[safe.id.nvram]
: NVRAM card detected: /dev/mnv0
: NVRAM battery voltages okay
FAILED : NVRAM Rev: 5 (should be 3)

If you edited or added a notification rule, the first six configurable events listed on
the Edit Notification Rule and Add Notification Rule pages were related to
CloudPools, a feature that was not available on the cluster.

136709

If an Simple Network Management Protocol (SNMP) request was sent to a node to


which multiple IP addresses were assigned, the reply to that request could have
been returned from an IP address that differed from the address to which the
request was sent.

135006

Note

In some environments, such as those configured with a firewall, replies received


from an address other than the address to which a request is sent are unrecognized
and rejected. If the reply to an SNMP request is rejected because the IP address
isn't recognized, the SNMP request fails.
On clusters where a large number of events were regularly logged, events were
sometimes logged faster than the EMC CEE Event Forwarder (isi_audit_cee)
was able to forward them. If this occurred, a backlog of events waiting to be
forwarded could have developed and might have continued to grow.

134420

File system issues resolved in OneFS 7.2.0.1

ID

Under rare circumstances, the FlexProtect and FlexProtectLin jobs left pointers to
blocks on a node or a drive that was no longer in the cluster. If a file was partially
truncated during a repair job (the job that is responsible for removing nodes or
drives), there was a narrow window where, if a further unlikely circumstance
occurred (such as a node reboot or a temporary network issue that affected backend network connections between nodes), then some snapshot data might have
been left under-protected. A subsequent mark job (such as MultiScan or
IntegrityScan) would then log attempts to mark blocks owned by a snapshot of the
truncated file on the node or drive that was no longer on the cluster. As a result,
messages similar to the following appeared in the /var/log/idi.log
and /var/log/messages files, where <Node>,<Drive> identified the device that
was no longer in the cluster:

139723

File system

Marking a block on gone node or drive: Marking block


<Node>,<Drive>,98820513792:8192 on a gone drive.

In addition, running the isi events list command displayed messages


similar to the following, where <instanceID> is the instance ID value:
<instanceID> 01/30 16:25 -detected

Filesystem problems

File system

99

Resolved issues

File system issues resolved in OneFS 7.2.0.1

ID

And running the isi events show <instanceID> -w command displayed


coalesced events similar to the following:
ID: <instanceID>
Coalesced events:
(l 1::HEAD b 2,0,311296:8192, Marking a block on gone node or
drive)
(l 1::HEAD b 2,0,311296:8192, Accessing a gone drive on mark)

Note

This information is also available on the Dashboard > Events > Cluster Events
Summary page in the OneFS web administration interface. Contact EMC Isilon
Technical Support immediately if you see these messages on the console or in the
web administration interface.
If protocol auditing was enabled and the NFS auditing service was running, the NFS 136061
service failed to start. As a result, data access through NFS was limited. In addition,
the following NFS statuses appeared in the output after running the lwsm list |
grep nfs command:
flt_audit_nfs
nfs
onefs_nfs

[driver]
[driver]
[driver]

running
stopped
stopped

After adding a node to a large cluster that had L3 Cache enabled, some nodes in
the cluster might have unexpectedly rebooted.

136031

If there were millions of back end batch messages in a single batch initiator on a
node, the counter in the batch data structure sometimes reached the maximum
allowed value. If this occurred, the affected node could have rebooted
unexpectedly, causing clients connected to the node to be disconnected, and a
message similar to the following appeared in the var/log/messages

135828

log file:NULL msg context for rbid

In the OneFS web administration interface, if you increased the size of an existing
iSCSI LUN, OneFS did not include the space already used by the LUN when
calculating how much space the LUN would occupy after the LUN was resized. As a
result, the web administration interface would display a Size exceeds

134851

available space on cluster error even if there was sufficient space to


accommodate the larger LUN. For example, on a 10 GB cluster configured with a 5
GB LUN and 5 GB of available space, if you attempted to increase the size of the 5
GB LUN to 6 GB, OneFS would calculate the amount of space needed for the 6 GB
LUN based on the 5 GB of available space, and would return the error.
If an Integrity Scan was run on a damaged, mirrored file, the node checking the file
unexpectedly rebooted, and lines similar to the following appeared in
the /var/log/messages file and on the console:
Stack:
-------------------------------------------------kernel:isi_assert_halt+0x42
efs.ko:bam_verify_file_data_mirrors+0xdd5
efs.ko:bam_verify_file_data+0x611
efs.ko:bam_mark_file_data+0x6a8
efs.ko:ifs_mark_file_data+0x373

100

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

134725

Resolved issues

File system issues resolved in OneFS 7.2.0.1

ID

efs.ko:_sys_ifs_mark_file_data+0x14c
kernel:isi_syscall+0x53
kernel:syscall+0x1db
--------------------------------------------------

In addition, a FAILED ASSERTION message similar to the following appeared


in the /var/log/messages file and clients connected to the affected node were
disconnected when the node rebooted:
*** FAILED ASSERTION error != 0 @ /build/mnt/src/sys/ifs/bam/
bam_verify.c:1144:

In the OneFS command-line interface, the descriptions of some sysctl options


referred to incorrect time units. For example, the description of the
efs.bam.av.scanner_wait_time sysctl option indicated that the

134217

assigned value represented the number of milliseconds that the scanner thread
would sleep, when the value actually represented the number of operating system
ticks that the thread would sleep. The descriptions of the following sysctl options
have been updated to reflect the correct information:
l

efs.bam.av.scan_on_open_timeout

efs.bam.av.scan_on_close_timeout

efs.bam.av.batch_scan_timeout

efs.bam.av.nfs_request_expiration

efs.bam.av.scanner_wait_time

efs.bam.av.nfs_worker_wait_time

efs.bam.av.av_opd_restart_sleep

Note

To view the description of a sysctl option, run the following command where
<option> is the option whose description you want to view:
sysctl d <option>

Hardware
ID

Hardware issues resolved in OneFS 7.2.0.1

X210 and X410 nodes that were configured to communicate through a 10 GigE
138521
network interface card that was using the Broadcom NetXtreme Ethernet (BXE)
driver that was introduced in OneFS 7.2.0 might have restarted unexpectedly. If this
occurred, a message similar to the following appeared in the var/log/
messages file:
Node panicked with Panic Msg: sleeping thread 0xffffff04692a0000
owns a nonsleepable lock

Because the isi_inventory_tool command could not handle part numbers


with more than 11 digits, if you ran the isi_inventory_tool --

137173

Hardware

101

Resolved issues

Hardware issues resolved in OneFS 7.2.0.1

ID

configCheck command on an HD400 node (a node that uses a new part number
format with more than 11 digits), the part number could not be processed, and
errors similar to the following appeared on the console:
Unexpected exception:

<type 'exceptions.TypeError'>

If you attempted to install a drive support package (DSP) while the /ifs partition
was not mounted, the following lines appeared on the console:

136710

File "/usr/bin/isi_dsp_install", line 730, in <module> rc = main()


File "/usr/bin/isi_dsp_install", line 701, in main installed =
dsp_installed()
File "/usr/bin/isi_dsp_install", line 593, in dsp_installed info
= isi_pkg_info()
File "/usr/bin/isi_dsp_install", line 188, in isi_pkg_info
error("%s: rc=%d%s" % (estring, rc, rc and ':' or ''))
NameError: global name 'estring' is not defined

Note

Beginning in OneFS 7.2.0.1, if you attempt to install a DSP when the /ifs partition
is not mounted, the following error appears:
ERROR: Cannot check if DSP is installed. Please ensure /ifs is
mounted.

If you ran the isi firmware update command on an HD 400 node and it
included updating the Chassis Management Controller (CMC) device firmware
along with other devices, the firmware update process might have failed. If the
process failed, errors similar to the following appeared on the console:

136039

Error uploading firmware block, compcode = d5|


Error in Upload FIRMWARE command [rc=-1]
TotalSent:0x10
Firmware upgrade procedure failed

HDFS
HDFS issues resolved in OneFS 7.2.0.1

ID

If Kerberos is enabled, a Cloudera 5.2 client cannot connect to datanodes that do


138484
not have Simple Authentication and Security Layer (SASL) security enabled, unless
the datanode service is running on a port lower than port 1024. Because OneFS did
not support SASL security for datanodes and because OneFS ran the datanode
service on port 8021, Cloudera 5.2 clients could not connect to the cluster. If a
Cloudera 5.2 client was unable to connect for this reason, errors similar to the
following might have appeared in log files on the client:
java.io.IOException: Cannot create a secured connection if
DataNode listens on unprivileged port (8021) and no protection
is defined in configuration property dfs.data.transfer.protection.

102

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

Resolved issues

HDFS issues resolved in OneFS 7.2.0.1

ID

If a HAWQ client attempted to connect to HDFS over Kerberos, the connection and
authentication process failed and an error similar to the following was logged in
the /var/log/isi_hdfs_d.log file:

137967

Requested identity not authenticated identity.

If an application, such as Cloudera Impala, queried OneFS for information about


137303
support for HDFS ACLs, OneFS did not respond correctly. As a result, the application
that sent the query unexpectedly stopped running and the following message
appeared in the /var/log/messages file:
isi_hdfs_d: Deserialize failed: Unknown rpc: getAclStatus

During read operations, an HDFS client sometimes closed its connection to the
135859
server before reading the entire message received from the server. Although closing
connections in this manner did not cause any issues on the cluster, if this occurred,
the following message appeared multiple times in the isi_hdfs_d.log file:
Received bad DN READ ACK status: -1

If a user ran the hdfs dfs -ls command to view the contents of a directory on
the cluster, files to which the user did not have read access did not appear in the
output of the command.

135858

In a Kerberos environment, applications (including Hive, Pig, and Mahout) that


made multiple and simultaneous HDFS connections through the same user
sometimes encountered authentication errors similar to the following:

135644

Job Submission failed with exception


'org.apache.hadoop.ipc.RemoteException
(Delegation Token can be issued only with kerberos or web
authentication)'

Because OneFS did not properly handle requests from HDFS clients if the requests
contained fields that the OneFS implementation of HDFS did not support, affected
clients were unable to write data to the cluster. If this issue occurred, a
java.io.EOFException error similar to the following appeared on the client:

135568

[user@hadoop-client]$ hdfs dfs -put file.txt


/14/12/19 12:53:36 WARN hdfs.DFSClient: DFSOutputStream
ResponseProcessor exception for block isi_hdfs_pool:
blk_4297916419_1000
java.io.EOFException: Premature EOF: no length prefix
available
at
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.j
ava:2203)
at
org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readField
s(PipelineAck.java:176)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer
$ResponseProcessor.run
(DFSOutputStream.java:867)
put: All datanodes 10.7.135.55:8021 are bad.
Aborting...

HDFS

103

Resolved issues

HDFS issues resolved in OneFS 7.2.0.1

ID

In addition, lines similar to the following appeared in the /var/log/messages


file:
2014-12-19T12:53:34-08:00 <1.3> cluster-1(id1) isi_hdfs_d:
Malformed packet, dropping. DN ver=28, packet seqno=0, payload
len: 1476487168, crc len = 0 data len: 0)
2014-12-19T12:53:34-08:00 <1.3> cluster-1(id1) isi_hdfs_d: Error
while receiving packet #0

Under some circumstances, the isi_hdfs_d process handled the return value of a
135185
system call incorrectly, causing the HDFS process to restart. If this occurred, HDFS
clients were disconnected from the affected node, and the following error appeared
in the isi_hdfs_d.log file:
FAILED ASSERTION pr >= 0

During read operations, an HDFS client sometimes closed its connection to the
135184
server before reading the entire message received from the server. Although closing
connections in this manner did not cause any issues on the cluster, if this occurred,
the following message appeared multiple times in the isi_hdfs_d.log file:
Received bad DN READ ACK status: -1

If a Hadoop Distributed File System (HDFS) client attempted to perform a recursive 134863
operation on a directory tree, a race condition sometimes occurred in the
isi_hdfs_d process which caused the process to restart unexpectedly. This race
condition was most frequently encountered while an HDFS client was recursively
deleting directories. If the isi_hdfs_d process unexpectedly restarted as a result of
this condition, HDFS clients connected to the affected node were disconnected and
messages similar to the following might have appeared in the /var/log/
isi_hdfs_d.log file:
isi_hdfs_d: RPC delete raised exception:
Permission denied from rpc_impl_delete
(/usr/src/isilon/bin/isi_hdfs_d/rpc_impl.c:484)
from _rpc2_delete_ap_2_0_2 (/usr/src/isilon/bin/isi_hdfs_d/
rpc_v2.c:811)

Job engine
Job engine issues resolved in OneFS 7.2.0.1

ID

If a cluster was experiencing heavy client traffic, OneFS might have significantly
limited the amount of cluster resources that job engine jobs were allowed to
consume, causing jobs to run very slowly.

136193

Migration issues resolved in OneFS 7.2.0.1

ID

After performing an initial full migration from a VNX array to an Isilon cluster
through isi_vol_copy_vnx, if a hard link was deleted from the source VNX

135028

Migration

104

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

Resolved issues

Migration issues resolved in OneFS 7.2.0.1

ID

array and a new file with the same name was then created on the source array, it
was possible for the data from the new file to be improperly copied to the hard link
on the target cluster. This issue occurred because the isi_vol_copy_vnx utility
copied data from the new file into the pre-existing hard link when it should have
deleted the hard link from the target cluster, and then created the new file on the
target cluster. If this occurred, the new file was not accessible on the target cluster.
If the isi_vol_copy utility was unable to resolve on-disk identities associated
with data being migrated to a OneFS cluster, the operation timed out. If the
operation timed out, the correct user and group information might not have been
applied to the migrated data, and valid users and groups might not have had
access to the data following the migration. In addition, messages similar to the
following appeared on the console and in the /var/log/messages file:

134715

Warning: Unable to convert security descriptor blob, bytes:328


err:60[Operation timed out] Error after looking up ACL: no sd
aclino 56974197 for ./bde_1.22.0/snapshot/groups/bas/group/
bas.cap, inode 32017462, err:Operation timed out

If you ran the isi_vol_copy utility to migrate files from a NetApp filer to an Isilon 134434
cluster, and the ACL setting Deny permission to modify files with
DOS read-only attribute over both UNIX (NFS) and Windows
File Sharing (SMB) was enabled, incremental migrations might have failed to
transfer some files to which the DOS read-only attribute was applied. If this
occurred, errors similar to the following appeared in the isi_vol_copy.log file:
./dirX/fileY.txt: cannot create file: Operation not permitted

Networking
Networking issues resolved in OneFS 7.2.0.1

ID

The OneFS web administration interface allowed the same IP address range or
overlapping IP address ranges to be assigned to the int-a and int-b interfaces and
the InfiniBand internal failover network. If a cluster was configured with the same
or overlapping IP address ranges, nodes sometimes displayed unexpected
behavior or unexpectedly rebooted.

136888

Note

Beginning in 7.2.0.1, the IP ranges for the int-b interface and the InfiniBand
internal failover network cannot be configured until a valid Netmask has been
specified.
The rate of data transfer to and from nodes that were configured with link
aggregation on their 10GbE network interfaces in combination with a maximum
transfer unit (MTU) of 1500 was sometimes slower than the rate of data transfer to
and from nodes that were not configured in this way.

136887

If SmartConnect zone aliases were configured on a Flexnet pool, a memory leak that 136704
could affect several processes related to the SyncIQ scheduler was sometimes
encountered. If this memory leak occurred, scheduled SyncIQ jobs did not move to

Networking

105

Resolved issues

Networking issues resolved in OneFS 7.2.0.1

ID

the running state, and lines similar to the following appeared in the
isi_migrate.log file:
isi_migrate[6923]: sched: siq_gc_conf_load: Failed to
gci_ctx_new: Could not allocate parser read buffer: Cannot
allocate memory

As a result, SyncIQ jobs in a scheduled state never moved to the running state.
If a new node was added to a cluster that was configured for dynamic IP allocation,
SmartConnect did not detect the configuration change and did not assign the new
node an IP address. As a result, clients could not connect to the affected node. If a
group change occurred after the new node was added, or if IP addresses were
manually rebalanced by running the isi networks --sc-rebalance-all
command, SmartConnect then detected the configuration change and assigned an
IP address to the new node.

136295

Because the driver for the 10 GbE interfaces on the A100 Accelerator nodes was
out-of-date, the interfaces sometimes unexpectedly stopped transferring data. If
you ran the ifconfig command to confirm the status of an affected interface, a
no carrier message appeared, even if a cable in good working order was

136293

connected to the interface. To restore functionality, the affected node had to be


rebooted.
By default, OneFS assigned an IPv6 IP address to the loopback interface, down
interfaces, and ifdisabled interfaces. As a result, AAAA (IPv6) requests were sent to
DNS servers. If AAAA requests were sent to a DNS server that was not configured to
respond to them, the following error was returned:

135193

Server Failure

This affected the performance of applications running on the cluster that performed
large numbers of DNS lookups, such as mountd.
If an IPv4 SmartConnect zone was a subdomain of another SmartConnect zone (for
example, name.com and west.name.com), clients that sent a type AAAA (IPv6) DNS
request for the subdomain zone received an NXDOMAIN (nonexistent domain)
response from the server. This response could have been cached for both type A
(IPv4) and type AAAA requests. If this occurred, future DNS requests for the
subdomain zone (in this example, west.name.com) could also receive an
NXDOMAIN response, preventing access to that SmartConnect zone.

135173

If a network interface that had IP addresses assigned to it by the Flexnet process


failed, the IP addresses were not failed over to another node or interface. As a
result, a Failed to open BPF message would appear in the var/log/

134723

messages file, and the interfaces had to be manually removed from the pool.

NFS
NFS issues resolved in OneFS 7.2.0.1

ID

If all of the following factors were true, a user with appropriate POSIX permissions
was denied access to modify a file:

141210

106

The user was connected to the cluster through NFSv3.

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

Resolved issues

NFS issues resolved in OneFS 7.2.0.1


l

The user was a member of a group that was granted read-write access to the
file through POSIX mode bit permissions, for example, -rwxrwxr-x (775).

The user was not the owner of the file.

ID

Depending on how the file was accessed, errors similar to the following might have
appeared on the console:
Permission denied

or
Operation not permitted

For more information, see article 197292 on the EMC Online Support site.
If users were being authenticated through a Kerberos authentication mechanism,
NFS export mapping rules such as map-root and map-user were not being enforced
for those users. As a result, the file permissions check was not correct, and users
might have had incorrect allow or deny file access permissions.

139001

If the NFS server was unable to look up a user through the expected providerfor
example, if the LDAP provider was not accessiblethe NFS server did not attempt
to look up the user in the local database, but instead mapped the user to the
nobody (anonymous) user account. As a result, some users were denied access to
resources that they should have had access to.

138784

Due to a memory leak, each time an NFS client registered or unregistered through
Network Lock Manager (NLM), some memory was allocated but never returned to
the system. Over time, this behavior could have caused a node to run out of
available memory, which would have caused the affected node to unexpectedly
reboot. If a node unexpectedly rebooted, clients connected to that node were
disconnected.

137261

If an NFS export that was hosting a virtual machine's (VM) file system over NFSv3
became unresponsive, the VM's file system became read-only.

136637

If the OneFS NFS server was restarted, it assigned client IDs to NFS clients
beginning with client ID 1. As a result, in environments with very few NFS clients, it
was possible for a client to be assigned the same client ID before and after the NFS
server was restarted. If this occurred, the NFS client did not begin the necessary
process to recover from the loss of connection to the NFS server, and the NFS client
became unresponsive.

136365

If a network or network provider became unavailable, the LDAP provider might have 135780
evaluated some error conditions incorrectly, causing inaccurate or empty netgroup
information to be cached and distributed to nodes in the cluster. If incorrect or
empty netgroup information was distributed, LDAP users could not be
authenticated and could not access the cluster.
If the isi_nfs4mgmt tool was called to manage clients on a node that had
thousands of NFSv4 clients connected, the NFS service unexpectedly restarted,
causing a brief interruption in service, and lines similar to the following appeared
in the /var/log/messages file:

135690

[ Several possibly unrelated calls ]


/usr/likewise/lib/lwio-driver/nfs.so:xdr_nfs4client+0x45
/usr/likewise/lib/lwio-driver/nfs.so:xdr_reference+0x42

NFS

107

Resolved issues

NFS issues resolved in OneFS 7.2.0.1

ID

/usr/likewise/lib/lwio-driver/nfs.so:xdr_pointer+0x74
/usr/likewise/lib/lwio-driver/nfs.so:xdr_nfs4client+0x114
/usr/likewise/lib/lwio-driver/nfs.so:xdr_reference+0x42
/usr/likewise/lib/lwio-driver/nfs.so:xdr_pointer+0x74
/usr/likewise/lib/lwio-driver/nfs.so:xdr_nfs4client+0x114
/usr/likewise/lib/lwio-driver/nfs.so:xdr_reference+0x42
/usr/likewise/lib/lwio-driver/nfs.so:xdr_pointer+0x74
[ repeats many times ]

While the NFS service was being shut down, it could have attempted to use memory 135528
that was already freed. If this occurred, the NFS service restarted. Because the
service was being shut down, there was no impact to client services.
In environments with NFSv4 connections, the 30-second lease time setting for the
vfs.nfsrv.nfsv4.lockowner_nolock_expiry sysctl was not properly
applied by the OneFS NFS server if locks were held for a very brief duration. As a
result, the server prematurely timed out lock owners, causing the server to send an
NFS4ERR_BAD_STATEID error to the client. In some cases, affected NFS

135467

clients were temporarily prevented from accessing one or more files on the cluster.
Because the NFS refresh time was in the range of 10 minutes per 1000 NFS exports, 135222
if you had thousands of exports, there was a significant delay before changes and
additions became effective. This delay might have adversely affected NFS
workflows.
If you ran the isi nfs exports create command with the --force option
to force the command to ignore bad hostname errors, the command also ignored
export rule conflicts. As a result, it was possible to create two exports on the same
path with different rules. For example, you could create two exports of the /ifs/
data directory where export 1 was set to read-write permissions and export 2 was
set to read-only permissions. If an NFS client connected to the /ifs/data export,
either rule could have been applied, resulting in an inconsistent experience for the
client.

135217

During the NFS export host check, although an IPv6 address (AAAA) was not
135192
configured on the node, AAAA addresses were searched. As a result, during startup,
mountd would be very slow to load export configurations that referred to many
client hosts.
On systems with thousands of NFS exports, it might have taken several minutes to
list the exports with the isi nfs export list command.

135111

If you attempted to modify thousands of exports using the isi nfs export
modify command, the following error appeared:

135107

RuntimeError: Incomplete response from server.

In addition, the export might or might not have been modified.


Note

Increasing the --timeout value did not resolve this issue.


Due to a race condition between an NLM unlock message and a lock completion
134452
callback message, it was possible for the primary delegate to unregister and
destroy LKF client entries that the backup delegates retained, causing the lock data
for the affected client to become inconsistent. If this occurred, lock requests from

108

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

Resolved issues

NFS issues resolved in OneFS 7.2.0.1

ID

NFS clients to the affected node sometimes timed out and messages similar to the
following appeared in the /var/log/messages file:
lkfd_simple_waiter_backup_resp_cb: Unregister for client: 0x<lkfclient-id> failed with error: 16

If the cluster was handling many client requests from clients connected through
different protocols (for example, both SMB and NFS clients), contention for filesystem resources sometimes caused delays in client request processing. If the
processing of client requests was delayed, kernel resources might have been
reserved more quickly than they were released until all resources were eventually
consumed, and then the node restarted unexpectedly.

133963

OneFS API issues resolved in OneFS 7.2.0.1

ID

OneFS API
Because the RESTful Access to the Namespace (RAN) API process was not case136526
sensitive, if you queried for a directory or file name through the RAN API, it was
possible for the query to return the wrong file. For example, if the file system
contained a file named AbC.txt and a file named abc.txt, a query for AbC.txt
might have returned abc.txt instead.
If a user with an RBAC role was deleted from Active Directory and then the role that
the user belonged to was modified, an erroneous entry was added to the sudoers
file. As a result, if a user ran the sudo command, a syntax error similar to the
following appeared:
sudo:
sudo:
sudo:
sudo:

135186

>>> /usr/local/etc/sudoers: syntax error near line 86 <<<


parse error in /usr/local/etc/sudoers near line 86
no valid sudoers sources found, quitting
unable to initialize policy plugin

Consequently, the sudo command could not be used.


Resolves an issue where a Role-based Access Control (RBAC) privilege was
incorrectly applied.

134445

If a namespace API query used the max-depth query parameter to discover the
number of files and subdirectories in the /ifs/home directory, the query
sometimes returned only a portion of the contents of the directory. In other cases,
the query returned the entire contents of the directory. If either result was returned,
the object_d job unexpectedly restarted.

134416

OneFS web administration interface


OneFS web administration interface issues resolved in OneFS 7.2.0.1

ID

In the OneFS web administration interface, on the Cluster Diagnostics > Gather
Info page, if you clicked the Start Gather button to collect and send log files to
EMC Isilon Technical Support and the file upload failed, the Gather Status bar
indicated that the gather succeeded. However, no .tgz file was created and new
gathers could not be started.

134854

OneFS API

109

Resolved issues

OneFS web administration interface issues resolved in OneFS 7.2.0.1

ID

In the OneFS web administration interface, if the cluster time zone was changed,
the new date and time set on the cluster was sometimes incorrect. If the new date
and time set on the cluster was significantly different than the correct date and
time in the selected time zone, the difference could prevent the cluster from
properly communicating or synchronizing with external systems, such as Active
Directory domain controllers.

134426

SmartLock issues resolved in OneFS 7.2.0.1

ID

SmartLock
In compliance mode, the compadmin role did not have read permissions for several 134422
log files, including the isi_papi_d and isi_papi_d_audit log files. As a
result, the log files were not collected during the isi_gather_info process.

SmartQuotas
SmartQuotas issues resolved in OneFS 7.2.0.1

ID

If a default-user quota existed on a directory where the user did not have a linked
quota, and you modified the default-user quota to clear a threshold and then again
to set a threshold, the user quota domain was not created, and the following
message appeared if the isi quota quotas create command was run,
where <username> was the name of the specific user:

135225

Creating:
user:<username>@snaps=no@/ifs/data/ec_workareas FAIL
!! Failed to create domain 'user:<username>@snaps=no@/ifs/data/
ec_workareas': Failed to save
!! domain: Invalid argument

As a result, cluster space could not be allocated to specific user data.


If a quota was created with a hard threshold, and the hard threshold was cleared
during a quota modify operation, the --enforced option remained enabled. As a
result, it was also possible to enable the --container option, although the -container option applies only to hard thresholds.

134213

If you applied a default-user quota to a directory, and then attempted to create a


user quota on the same directory by using the isi quota quotas create
command, the operation failed and the following message appeared:

133641

Failed to save domain: File exists

SMB

110

SMB issues resolved in OneFS 7.2.0.1

ID

If a Windows client that was connected to the cluster through SMB copied a file
from the cluster, the timestamp metadata applied to the file might have become
invalid. This issue occurred because OneFS did not properly interpret the value

142313

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

Resolved issues

SMB issues resolved in OneFS 7.2.0.1

ID

assigned to a file's timestamp metadata if the value was set to -1, which is a valid
value. Workflows that rely on timestamp metadata might have been negatively
affected by this issue.
Note

The SMB protocol specifies that, when file attributes are set, a value of -1 indicates
that the attribute in the corresponding field must not be changed.
For more information, see ETA 198187 on the EMC Online Support site.
On a Microsoft Windows client, if you attempted to delete a file from an SMB share 139852
and the letter case of the file path that you wanted to delete did not exactly match
the actual letter case of the share path, the file was not deleted, and, if lwio logging
was increased to the DEBUG level, the following messages appeared in
the /var/log/lwiod file:
Status: STATUS_OBJECT_NAME_NOT_FOUND

Note

Other file operations (Read, Write, and Rename) work as expected.


After the lwio process unexpectedly restarted, the process could no longer
138694
communicate with the srvsvc service. As a result, SMB clients that were connected
to the cluster could not view or list shares, and SMB shares on the cluster could not
be managed from a Windows client.
After the NetBIOS Name Service (NBNS) was enabled, the service failed to listen on
port 139. As a result, clients that relied on NBNS could not establish a connection
to the cluster.

136889

NetBIOS requests sent over SMB 2 were not properly handled. As a result, the lwio
process unexpectedly restarted and lines similar to the following appeared in
the /var/log/messages file:

135468

/lib/libc.so.7:thr_kill+0xc
/usr/likewise/lib/liblwbase_nothr.so.0:__LwRtlAssertFailed+0x5a
/usr/lib/libisi_ntoken.so.1+0x23d673:0x808490673
/usr/lib/libisi_ntoken.so.1+0x243b4e:0x808 496b4e
/usr/lib/libisi_ntoken.so.1+0x2453c5:0x8084983c5
/usr/lib/libisi_ntoken.so.1+0x2e450f:0x80853750f
/usr/likewise/lib/liblwbase.so.0:EventThread+0x333
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xec
/lib/libthr.so.3:_pthread_getprio+0x15d

If SRV logging was enabled in the OneFS registry, incoming SMB 1 requests caused
the lwio process to unexpectedly restart. If the lwio process restarted, SMB clients
connected to the cluster were disconnected.

134224

OneFS permitted the use of forward slashes in path names, and, in OneFS, forward
slashes within an SMB request were converted to backslashes. This behavior did
not comply with the SMB protocol, which specifies that such a request should fail
and return the following error:

132952

OBJECT_NAME_INVALID

SMB

111

Resolved issues

SMB issues resolved in OneFS 7.2.0.1

ID

Note

OneFS 7.2.0.1 and later versions of OneFS comply with the SMB protocol. If an SMB
request that contains a forward slash is received, an OBJECT_NAME_INVALID error
is returned.
Because OneFS sent an incorrect response to a NetBIOS session request, the
132574
request to connect was closed and the NetBIOS client could not connect to the
cluster. If the session request was closed, lines similar to the following appeared in
the packet capture:
10.0.0.35 10.0.0.100
10.0.0.100 10.0.0.35
10.0.0.35 10.0.0.100
10.0.0.100 10.0.0.35

NBSS 138 Session request


TCP 66 netbios-ssn > 51660 [FIN, ACK]
TCP 66 51660 > netbios-ssn [FIN, ACK]
TCP 66 netbios-ssn > 51660 [ACK]

Virtual plug-ins
Virtual plug-ins issues resolved in OneFS 7.2.0.1

ID

Due to an error that occurred when drive capacity was checked during the creation 133546
of a new OneFS 7.2.0.0 cluster through the OneFS 7.2.0.0 simulator, after creating
a cluster on a system running Microsoft Windows or on a Microsoft Windows virtual
machine, the new cluster did not boot up, and the following messages appeared on
the console:
mount_efs:
mount_efs:
mount_efs:
IFS failed

Reading GUID from da2s1e: No such file or directory


update_ifs_drives: No drive available to mount.
OneFS: Operation not supported by device
to mount. Aborting boot.

Resolved in OneFS 7.2.0.0


Antivirus
Antivirus issues resolved in OneFS 7.2.0.0

ID

If the ID of an antivirus scan report was more than 15 characters long, the OneFS
web administration interface and command-line interface would report the job as
running forever. Any threats detected by the scan would not be associated with the
correct policy.

125535

If you ran the isi avscan report purge command while an antivirus scan
was running, OneFS would sometimes delete the report of the antivirus scan that
was currently in progress.

125534

A syntax error in the .xml file from which AVScan reports are generated caused

125526

reports accessed from the Data Protection > Antivirus > Reports page in the
OneFS web administration interface to not include threats that appeared on the
Data Protection > Antivirus > Detected Threats page.

112

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

Resolved issues

Authentication
Authentication issues resolved in OneFS 7.2.0.0

ID

Users assigned to the admin group were able to reuse a previously used password
immediately even if the Password History Length option was configured to
prevent the reuse of a specified number of previously used passwords.

130656

Users with assigned roles could not access the Cluster Management >
Diagnostics page because permission to access the Diagnostics page was
assigned only to the ISI_PRIV_SYS_SUPPORT privilege.

130342

OneFS now defaults to LDAP paged search if both paged search and Virtual List
View (VLV) are supported. If paged search is not supported and VLV is enabled on
the LDAP server, OneFS will use VLV when returning the results from a search.

130171

Note

In most cases, bind-dn and bind-password must be enabled in order to use VLV.
If a mapping rule contained a username with a space, mapping tokens would fail,
which prevented users from joining.

130024

Because the lsass process could not distinguish between different trust domains
130003
that shared the same NetBIOS name, role-based authentication would fail when
clients that were connected to the cluster through SSH, CIFS, or the web
administration interface tried to access the identically-named domains. As a result,
the identically-named domains were inaccessible.
If the dup() function (a function that duplicates a file descriptor) failed, no error was 128435
returned to the lsass process. As a result, the lsass process attempted to pass a
nonexistent file descriptor to the lwio process. If this condition was encountered,
there was a potential for SMB clients to be temporarily prevented from
authentication on the cluster.
If you changed the machine name for the local provider (system zone) to include
periods or commas, errors similar to the following were logged in the /var/log/
messages file when an administrator attempted to create new users from the
command line:

123878

Failed to add user <usename>: Invalid Ldap distinguished name


(DN)

While the lwio process was in the process of shutting down, it sometimes
referenced a data structure that no longer existed. If this occurred, the following
lines were logged in the /var/log/messages file:

123397

Stack: -------------------------------------------------/lib/libthr.so.3:_pthread_mutex_lock+0x1d
/usr/likewise/lib/lwio-driver/onefs.so:OnefsAsyncTableGet+0x1f
/usr/likewise/lib/lwio-driver/onefs.so:OnefsAsyncUpcallCallback
+0x58
/usr/lib/libisi_ecs.so.1:oplocks_event_dispatcher+0xb9
/usr/likewise/lib/lwio-driver/onefs.so:OnefsOplockChannelRead+0x8c
/usr/likewise/lib/liblwbase.so.0:EventThread+0x333
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xec
/lib/libthr.so.3:_pthread_getprio+0x15d
--------------------------------------------------

Authentication

113

Resolved issues

Authentication issues resolved in OneFS 7.2.0.0

ID

When the RequireSecureConnection over LDAP setting was enabled, connection to 114935
the LDAP server failed because the StartTLS command was not sent from the cluster
to the LDAP server.

Backup, recovery, and snapshots


Backup, recovery, and snapshots issues resolved in OneFS 7.2.0.0

ID

If one or more objects, such as a file or directory, were moved out of the scope of a
SyncIQ policys root path between two sequential snapshots, subsequent
ChangelistCreate jobs for those two snapshots failed and errors similar to the
following appeared in the isi_job_d log file:

133809

Error 70 finding path of 4297707296


Error in task 1-1: Stale NFS file handle
Worker 8 Busy: (worker_process_task, 1092) ChangelistCreate[296].
0: Error in task 1-1: 14-09-05 12:30:17 ChangelistCreate[296]
Node 1 (1) task 1-1: Stale NFS file handle
from
dir_get_dirent_path (/build/mnt/src/isilon/lib/isi_migrate/migr/
utils.c:1891)
from dir_get_utf8_str_path (/build/mnt/src/
isilon/lib/isi_migrate/migr/utils.c:1996)
from
changelist_add_change_entry (/build/mnt/src/isilon/bin/isi_job_d/
changelist_job.c:1369)
from changelist_item_process (/
build/mnt/src/isilon/bin/isi_job_d/changelist_job.c:1142)

Additionally, ChangelistCreate jobs could fail in a similar manner if the SyncIQ


policy root path was set to /ifs.
If you ran a ChangelistCreate job, multiple nodes sometimes unexpectedly
rebooted, and lines similar to either of the following sets appeared in
the /var/log/messages file:

133504

Stack: -------------------------------------------------kernel:isi_assert_halt+0x2e
kernel:btree_leaf_get_key_at_or_before+...
kernel:sbt_txn_get_entry_at+...
kernel:_sys_ifs_sbt_get_entry_at+0x2a7
kernel:isi_syscall+0x7fkernel:syscall+0x325
-------------------------------------------------*** FAILED ASSERTION pct.num < pct.den @ /build/mnt/src/sys/ifs/
btree/btree_leaf.c:2418
Stack: -------------------------------------------------kernel:isi_assert_halt+0x2e
kernel:btree_leaf_get_key_at+0x15c
kernel:sbt_txn_get_entry_at+0x287
kernel:_sys_ifs_sbt_get_entry_at+0x2a7
kernel:isi_syscall+0x7fkernel:syscall+0x325
-------------------------------------------------*** FAILED ASSERTION pct.num < pct.den @ /build/mnt/src/sys/ifs/
btree/btree_inner.c:2532

114

A SyncIQ job sometimes failed while handling a file with a hard link if the hard link
referred to a file that no longer existed.

131302

If you created a Smartlock directory, set the retention date to "forever," and then
attempted to restore the directory through NDMP, the NDMP job failed and the

131138

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

Resolved issues

Backup, recovery, and snapshots issues resolved in OneFS 7.2.0.0

ID

following assertion failure message appeared in the/var/log/messages and


the /var/log/isi_ndmp_d files:
Assertion failed: (date >= 0), function pax_attribute, file
archive_read_support_format_tar.c, line 1644.

A race condition caused parallel restores to fail.

131001

In environments where a large number of SyncIQ policies were configured (several


hundred), SyncIQ policies were not listed in the OneFS web administration
interface and isi sync commands that list policies sometimes failed with the
following error:

130756

CLI timeout exceeded while waiting for the server to respond; the
request still may have completed.

A SyncIQ job configured with the --disable_stf option set to true sometimes
failed when an sworker (a process responsible for transferring data during
replication) detected differences between files on the source and target clusters
and then attempted to access and update the linmap database.
If a SyncIQ job failed as a result of this issue, the following error appeared in the
isi_migrate.log file:

130340

A work item has been restarted too many times. This is usually
caused by a network failure or a persistent worker crash.

If a SyncIQ policy designated a target directory that was nested within the SyncIQ
target directory of a preexisting policy, an error occurred during SyncIQ protection
domain creation which caused the SyncIQ policy's protection domain to be
incomplete. If this occurred, the following message appeared in the /var/log/
isi_migrate.log file:

130337

create_domain: failed to ifs_domain_add

In addition, if you ran the isi domain list-lw command, the Type field for the
affected SyncIQ target was marked Incomplete.
SyncIQ requests from the OneFS command-line interface and web administration
130000
interface repeatedly opened and closed the reports.db SQlite database. As a result,
changes made in the web administration interface would not take effect and
commands run from the command-line interface might not return results and
eventually failed.
If a large number of replication policies existed on the cluster, the isi sync
policies list command might timeout before the command completed.

129999

The pthread-cancel process would sometimes fail without releasing the resources it 129997
contained. As a result, other processes stopped indefinitely.
If the number of file system event snapshots exceeds the amount of space
allocated by the OnefsEnumerateSnapshots buffer, the lwio process will restart on
various nodes, causing clients to be disconnected and then reconnected to the
cluster.

125571

Backup, recovery, and snapshots

115

Resolved issues

Backup, recovery, and snapshots issues resolved in OneFS 7.2.0.0

ID

If your environment included more than 16 interfaces, NDMP backups would


sometimes fail with the following error message:

125536

ERRO:NDMP custom.c:129:createIPList MAX_INTERFACES exceeded.

Replication reports created for the first run of a replication policy sometimes
contained inaccurate values. All other replication reports were accurate.

122906

Due to a memory leak in the isi_papi_d process, the process would sometimes stop 120509
responding. As a result, SyncIQ policies would not be listed in the OneFS web
administration interface or after running the isi sync policy list command
from the command-line interface.

Cluster configuration
Cluster configuration issues resolved in OneFS 7.2.0.0

ID

If an unprovisioned drive was physically removed from a node without first being
smartfailed and the isi_drive_d process was subsequently restarted (either
manually or automatically), OneFS attempted to reprovision the removed drive,
preventing new drives and nodes from being provisioned. As a result, new drives
and nodes could not be added to the cluster.

132913

Default SmartPools jobs incorrectly scanned configuration information for all files
on a cluster. As a result, SmartPools jobs progressed for days, but did not
complete.

132309

Events, alerts, and cluster monitoring


Events, alerts, and cluster monitoring issues resolved in OneFS 7.2.0.0

ID

Duplicate object identifiers (OIDs)in the ISILON-TRAP-MIB.txt file prevented


the use of certain SNMP monitoring tools.

131035

Some events were configured with invalid variable bindings (the association
between a variable name and its value). As a result, SNMP alerts were not sent for
these events.

130621

If a Windows 8.1 client or a Windows server 2012R2 SMB2 client requested file
system volume and attribute information from the cluster and the maximum
response length requested by the client was too small to hold the entire response,
the affected node would return a STATUS_BUFFER_TOO_SMALL response

130589

instead of the expected STATUS_BUFFER_OVERFLOW response. The client


was unable to handle this response, and, as a result, the request failed. This issue
was typically encountered while attempting to open a file with Notepad.
On nodes that support the new non-disruptive drive firmware updates (NDFU)
feature, if the CELOG process checked the state of a drive while a drive firmware
update was in progress, erroneous drive is ready to be replaced
alerts were sometimes issued. NDFU is supported on the following nodes: S200,
S210, X200, X400, X410, 108NL, and NL400.

116

OneFS 7.2.0.0 - 7.2.0.4 Release Notes

130155

Resolved issues

Events, alerts, and cluster monitoring issues resolved in OneFS 7.2.0.0

ID

While upgrading or reimaging a node, the


isi_first_post_ifs_merged_configs file was not generated during
system start due to a condition requiring the existence of the
drive_gconfig.gc file. As a result, Drive Support Package (DSP) firmware
CELOG alerts were not generated.

130128

The flt_audit lwio filter driver would fail to audit SMB traffic on files with non-ASCII
characters in their names. As a result, these files were not audited, and Failed

130011

to allocate memory for a path component errors were displayed


in the /var/crash/lwiod.log file, even when memory was available.
Scripts that called isi commands would sometimes cause a large number of the
following error message to appear in the /var/log/messages file:

129143

expander_inquiry: attempt xx got no ses0 inquiry


data

Access to files with non-ASCII characters in their names was not audited. A client
could access and modify such a file without problem, but the action would cause
an error in the audit filter driver and the following message would display in the
lwiod.log file:

ERROR:flt_audit:0x805c02560:SyncGetFileName():audit_info_util.cpp:
563: Failed to allocate memory for a path: UNKNOWN'

127508

If you ran the isi statistics heat command with the --events option, the
output was not filtered correctly.

125549
Incorrect SNMP traps were sent for some alerts. The alerts were sent with the
correct alert level but indicated that the wrong threshold had been exceeded. For
example, when a high temperature threshold was exceeded, a critical SNMP trap
was sent; however, the trap stated that the low temperature threshold was
exceeded.

125541

SNMP traps were not sent if two or more SNMP recipients were defined in an event
notification rule.

125537

File system
File system issues resolved in OneFS 7.2.0.0

ID
Whenever the asynchronous delete operation (an operation which deletes files in
the background while the user can run other OneFS operations) finished before all
the data was deleted, the synchronous delete path reverted a file back to
asynchronous delete. As a result, the asynchronous delete operation became stuck
in an endless loop, and multiple nodes attempted to delete the file at the same
time. This resulted in performance issues for the user.

132921
A race condition could be encountered if Network Lock Manager (NLM) received a
would_block lock request from an NFS client just before a group change began. If
the race condition was encountered, a node could have been prevented from
leaving the group and the node that prevented the group change could become
unavailable. If a node became unavailable, client connections to the affected node
timed out or were unresponsive.

131724


If a node was leaving the cluster at the same time that the node received a lock
request from an NFS client, a lock failover (LKF) waiter might be created. If this
occurred, the affected node was prevented from leaving the cluster and would
unexpectedly restart.

131198

It was possible to configure the overcommit limit below the low and high
overcommit thresholds (which is an invalid configuration). If the overcommit
thresholds were configured incorrectly, nodes sometimes ran out of memory and
unexpectedly rebooted.

130363

Note

The overcommit thresholds are set through the following sysctl settings (an
example of inspecting them follows the list):

l vfs.nfsrv.rpc.request_space_overcommit
l vfs.nfsrv.rpc.request_space_high
l vfs.nfsrv.rpc.request_space_low
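The sysctl names listed above can be checked directly from a node's shell. The
following is a minimal sketch only; the commands simply read the current values,
and the output is environment-specific:

sysctl vfs.nfsrv.rpc.request_space_low
sysctl vfs.nfsrv.rpc.request_space_high
sysctl vfs.nfsrv.rpc.request_space_overcommit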

When reads were attempted at 4 MB or larger that crossed a 16 GB boundary, the
following assertion failure appeared in the /build/mnt/src/sys/ifs/ifm/
ifm_pg_cache.c:915 file for certain read size/offset combinations. As a result,
the node could restart unexpectedly.

FAILED ASSERTION lk_range_is_contained(&pg_range, lock_range)

129562

If you were running a data-width-changing restripe on a 4 TB file, a node might


unexpectedly reboot.

128750

If the filename related to a change notify event was not a valid UTF-8 string, an
assertion error would sometimes occur, resulting in the lwio process restarting.

128077

If the event count for change events exceeded the 32-bit counter limit, multiple
nodes might reboot unexpectedly and lines similar to the following would appear in
the kernel stack trace:
kernel:isi_assert_halt+0x42
efs.ko:bam_event_synchronize+0x7f5
efs.ko:ifs_vnop_wrapunlocked_write_mbuf+0x612
kernel:recvfile+0x6e7
kernel:isi_syscall+0x64
kernel:syscall+0x26e

124720

If a customer-issued API call was used to look up the lock status of any given file in
the cluster at the exact same time that a node was being taken off the cluster, and
that node was the one tasked with the API verification, the entire cluster might have
become unable to serve any NFS or SMB connections for about five minutes.

122759


File transfer
File transfer issues resolved in OneFS 7.2.0.0

ID

If an HTTP client sent a request to the cluster through the Apache WebUI service,
the following message appeared repeatedly in the /var/log/apache2/
webui_httpd_error.log file:

Requested service
"WK" doesn't match authenticated session services.

130648

If the httpd process was handling a large number of client connections, the process
sometimes unexpectedly restarted while accepting a connection from an HTTP
client. If the process restarted, HTTP clients connected to the affected node were
disconnected from the cluster.

127190

Hardware
Hardware issues resolved in OneFS 7.2.0.0

ID

In some cases, when an X410 or S210 node was configured for the first time,
during the initial boot-up process the node did not boot completely, and the
following error messages appeared on the console:

Bad Magic in superblock: 0
Failed to read journal during scan: No such file or directory
Test journal exited with an error.
Aborting boot.

For more information about this issue, see KB 190590 on EMC Online Support.

131674
If a battery that supplies power to the mt25208 NVRAM failed, the LED on that
battery remained green instead of turning red, even though the CELOG alert
correctly indicated the battery's failure. This issue affected the following node
types: S200, X200, X400, and NL400.

130683

If the PCIe connection between the motherboard and the NVRAM/IB card was
disrupted, the affected node stopped responding. If the unresponsive node was
subsequently powered down, the NVRAM/IB card failed to set Fast Self Refresh
(FSR). If FSR was not set when the node was powered down, the NVRAM journal was
not preserved and the following message appeared on reboot:
Could not recover journal. Contact Isilon Customer Support
Immediately.

In addition, the following entry appeared in the /var/log/messages file:

fsr=0 pwr=0

129914

Installing a power supply unit (PSU) firmware update on a common form factor (CFF)
PSU in a node with only one working PSU caused the node to shut down. This
occurred because the working PSU was rebooted as part of the PSU firmware
update process.

129810
Attempting to SmartFail a SED would sometimes fail, even after the drive had been
manually removed and successfully replaced.

129326


If the QLogic 10 Gig Ethernet card experienced a timeout from a request initiated by
the Direct Memory Access Engine (DMAE), lines similar to the following appeared in
the /var/log/messages file:
Stack: --------------------------------------------------
if_bxe.ko:bxe_write_dmae+0xd0
if_bxe.ko:bxe_write_dmae_phys_len+0x78
if_bxe.ko:ecore_init_block+0x122
if_bxe.ko:bxe_init_hw_common+0x7d5
if_bxe.ko:bxe_init_hw_common_chip+0x18
if_bxe.ko:ecore_func_hw_init+0xd7
if_bxe.ko:ecore_func_state_change+0x10c
if_bxe.ko:bxe_init_hw+0x41
if_bxe.ko:bxe_nic_load+0x726
if_bxe.ko:bxe_init_locked+0x18c
if_bxe.ko:bxe_handle_chip_tq+0x86
kernel:taskqueue_run_locked+0x9a
kernel:taskqueue_thread_loop+0x48
kernel:fork_exit+0x7f
--------------------------------------------------

As a result, the affected node had to be rebooted to restore node functionality.

129252


Nodes that contained bootflash drives would sometimes reboot unexpectedly
during a disk firmware update if there was not enough memory available to meet
the requirements of the disk firmware update process.

129012

If you added an unsupported boot drive to a node, a CELOG alert was properly
generated, but a traceback blocked the addition of the entry to the baseboard
management controller's (BMC) system events log. As a result, there was no report
of the unsupported drive in the events log.

128984

When installing the drive support package (DSP) that contained firmware for
upcoming hardware models, the installer reported errors similar to the following. In
addition, although installation was successful, an error message indicating that the
installation had failed appeared on the console and in the isi_dsptool.log
file:
- ERROR: Found 2 error messages in isi_dsptool logfile
- ERROR Gconfig parse warnings for file /dsp_staging/config/
models/HGST_HUS726060ALA640.gc; dropping unrecognized entries
- ERROR Gconfig parse warnings for file /ifs/.ifsvar/modules/
hardware/drives/config/models/HGST_HUS726060ALA640.gc.2; dropping
unrecognized entries
DSP Install Failed

127958

Note

These messages are now logged in the isi_dsptool.log file as warnings.


The CTO upgrade process did not complete on clusters in compliance mode.

126895

If you added a node to a cluster that was configured to synchronize its time with an
external NTP server, the cluster would sometimes synchronize its time with the
node that was added. As a result, the cluster time might have been so different
from the time on the NTP server that the cluster would not automatically correct
itself.

126652

Unprovisioned nodes could not be added to a manual node pool. If you added
nodes in your cluster to one or more manual node pools, and then attempted to
add one or two nodes to the cluster, OneFS would not be able to add those nodes
to a node pool, and so those nodes would be unprovisioned, and you would be
unable to add those unprovisioned nodes to the manual node pools.

126363
A100 nodes reported warning-level sensor errors when a power supply was
removed or failed, rather than reporting a critical-level redundant power supply
failure.

126321

If you removed a power cable from the power supply unit (PSU) on an A100 node,
the isi_hw_status command incorrectly displayed the following output:

126240

Power Supplies OK

After a drive-down operation completed, nodes would sometimes panic with the
following error message:

126239

Fatal trap 12: page fault while in kernel mode

If you shut down a node with a failed boot drive, the node would sometimes stop
responding during the shut down process because the journal for the node could
not be saved to the failed boot drive.

126219

If the InfiniBand card in a node ran out of memory, the affected node might have
been disconnected from the cluster.

124325

HDFS
HDFS issues resolved in OneFS 7.2.0.0

ID

Due to an issue in the Hadoop Distributed File System (HDFS) code, HDFS-1497, the
sequence numbers assigned to HDFS data packets were not always consecutive. If
OneFS received an HDFS data packet with a sequence number that was not
consecutive, the affected HDFS client connection was closed.

127983

Job engine
Job engine issues resolved in OneFS 7.2.0.0

ID

The isi_job_d process might fail, causing jobs to briefly pause and then resume.

136028

Data reliability issues could occur after the job engine ran Collect or MultiScan
jobs.

132695,
132696,
132697,
132698

Job engine logging always ran at trace level, a level used to gather detailed
information about job engine processes. As a result, job engine performance was
adversely affected and the job engine log file, isi_job_d.log, was
unnecessarily flooded with messages.

132895

If the cluster was being monitored by InsightIQ, the FSA job could fail with the
following error message:

130999

[TRACE_SQL_ERRS]: database is locked


If you ran the FlexProtect job while the impact of the job was set to high, nodes that
contained SSD drives would sometimes panic.

129445
When smartfailing a drive with very little data on it, a FlexProtect or FlexProtectLin
job could pause in phase 2 for as long as two hours, causing the job to be
cancelled by the system. Until a FlexProtect or FlexProtectLin job successfully
completed, no other jobs could run. In addition, the cluster could fall below the
configured protection level.

129349

In rare cases, restriping a 4 TB file could cause a node to panic.

126675

After creating a custom job policy with isi job policy create, the values for
the job impact policies were incorrectly set and jobs could not be run. Errors similar
to the following appeared:
Parse warnings from defaults:
Multiple errors:
Repeated disk record:
old={ivar:impact.policies {token:0, version:
1, flags:---I---} = (read: write:)}
new={ivar:impact.policies {token:0, version:1, flags:---I---} =
(read: write:)}
Repeated disk record:
old={ivar:impact.policies {token:0, version:1, flags:---I---} =
(read: write:)}
new={ivar:impact.policies {token:0,
version:1, flags:---I---} = (read:HIGH write:HIGH)}

125544

If you upgraded your cluster from OneFS 6.5.5, you could not modify job impact
policies.

125543

Migration
Migration issues resolved in OneFS 7.2.0.0

ID

The isi_vol_copy_vnx utility did not properly handle new files that were added to
new directories between incremental copies. As a result, incremental copies failed.

131728

Networking
Networking issues resolved in OneFS 7.2.0.0

ID

Some changes to VLAN tagging pools, such as adding a Network Interface Card
(NIC) or rebalancing dynamic IPs, caused the SmartConnect process to stop
responding to DNS queries until the cluster was rebooted or until the isi_dnsiq_d
service was restarted.

132022

Due to issues in the failover code path in the Sockets Direct Protocol (SDP), failover
to the backup InfiniBand (IB) fabric could fail. If failover was unsuccessful, the
Isilon cluster was unavailable until the IB switches were rebooted. The potential for
encountering these issues was limited, but the potential increased in proportion to
the number of nodes in the cluster.

131544
Under some conditions, the flx_conf.xml file could not be accessed
immediately after a group change occurred on a cluster. If this issue was
encountered, the SmartConnect process, isi_dnsiq_d, unexpectedly restarted on
one or more nodes and the following lines appeared in the /var/log/messages
file of the affected nodes:
/usr/lib/libisi_flexnet.so.1:flx_config_get_kevent+0x13
/usr/sbin/isi_dnsiq_d:update_flx_config_kevent+0x25
/usr/sbin/isi_dnsiq_d:realloc_ips+0xf9
/usr/sbin/isi_dnsiq_d:main+0xbf0
/usr/sbin/isi_dnsiq_d:_start+0x8c

130702

If you ran the isi networks support sc_put command to manually assign
a dynamic IP address to an interface on a specific node, the command failed and a
FAILED ASSERTION message similar to the following appeared:

FAILED ASSERTION config->_lock_state == O_EXLOCK @ /build/mnt/src/
isilon/lib/isi_flexnet/flexnet.c:2309

130652

If the SmartConnect service IP address was the only IP address assigned to the
external network interface of a node, Flexnet would not populate the subnet
gateway. As a result, the affected node did not respond to DNS queries.

130642

If a static route was assigned to a storage pool (by running the isi networks
modify pool command with the --add-static-routes option), Flexnet
checked each node in the pool for UP interfaces to which the route could be
assigned. If Flexnet did not detect any UP interfaces, the following informational
message was sometimes repeatedly logged in the isi_flexnet_d.log file:

isi_flexnet_d[8079]: No available UP interfaces for route:

130343

Note

This message will now be logged only if the Flexnet logging level is set to debug or
higher.
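For reference, a hypothetical invocation of the command described above might
look like the following. The pool name, subnet, and gateway values are
placeholders, and the exact value format that the --add-static-routes option
expects should be confirmed in the OneFS command-line reference for your
release:

isi networks modify pool --name subnet0:pool0 --add-static-routes=192.0.2.0/24-198.51.100.1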
SmartConnect did not pause for 10 seconds between rebalance operations and
thus rebalanced IP addresses more frequently than necessary.

130318

If you added a static route and incorrectly set the gateway, the affected node
sometimes became unresponsive and the OneFS software watchdog rebooted the
node.

130263

Note

The following IP addresses should not be assigned to the external gateway on a
node:

l An IP address that falls within the cluster's internal network IP range
l An IP address that is assigned to a node
l An unreachable IP address
l The broadcast IP address

After an IB switch was rebooted, the FlexNet process running on each node
updated the flx_config.xml file, causing the SmartConnect process to lock the
file. As a result, SmartConnect would fail to respond to new DNS requests for up to
two minutes on large clusters.

125193


If an administrator added a static route that would send traffic across different
interface types using the IP address of a node as the route destination, the affected
node rebooted unexpectedly.

88072

NFS
NFS issues resolved in OneFS 7.2.0.0

ID

The isi nfs exports list command sometimes timed out, preventing users
from viewing or configuring NFS exports from the command-line interface or the
OneFS web administration interface.

130270

If an inherit_only Access Control Entry (ACE) was applied to the owner of a file, and
the Access Control List (ACL) was modified, the inherit_only ACE was mapped to the
NFSv4 entry OWNER@. If the OWNER@ entry was subsequently remapped, the entry
was re-mapped to creator_owner rather than the original owner of the file, which
could prevent the original owner from accessing the file.

130253
If either of the following conditions existed, the lockd process would stop
responding, preventing some NFS clients from accessing files on the cluster
because they could not be granted file locks:

l A lock was granted to an NFS client while the client was unregistering from the
LKF system.

l The isi_classic nfs client rm command was run while there were
several lock waiters on the NFS client.

129900

If a cluster received NLM requests that included the AUTH_NONE credential, OneFS
would return a locking error instead of the correct error message.

125482

During the OneFS boot process, a race condition prevented some sysctl parameters
that were required for NFS Kerberos authentication from being read. This issue
caused Kerberos authentication to be unavailable to NFS clients; if this issue
occurred, messages similar to the following were logged in the nfs.log file:
Kerberos not available: gss_acquire_cred: Key table entry not
found;

125479

If an NFS client sent lock requests with security type AUTH_NONE, the client
received an incorrect error message that did not indicate the reason for failure.

123567

If the cluster was configured with the overcommit limit below the low and high
settings (which is an invalid configuration), nodes could run out of memory and
unexpectedly reboot.

116133

OneFS web administration interface


OneFS web administration interface issues resolved in OneFS 7.2.0.0

ID

In the OneFS web administration interface, on the Snapshot Schedules page,


after clicking View details to view detailed information about a snapshot
schedule, some of the details were not displayed or the details were displayed
after a noticeable delay.

130720


On the Cluster Management > Access Management > LDAP page in the OneFS
web administration interface, if the length of the Bind to value exceeded the width
of the page, the corresponding edit link was not available.

130336

The OneFS web administration interface was not accessible to clients using
Microsoft Internet Explorer 8, 9 , or 10 in compatibility view. In addition, if a client
attempted to access the web administration interface using Internet Explorer in
compatibility view, the IE console displayed the following error: :

119315

SCRIPT1028: Expected identifier, string or number all-classes.js?


70100b00000002a, line 13717 character 5

You could not set a netmask of 0.0.0.0 through the OneFS web administration
interface.

96604

SmartLock issues resolved in OneFS 7.2.0.0

ID

If a file was committed to a WORM directory through the RESTful Namespace API,
the file permissions were altered and, as a result, the file was accessible to
everyone.

130319

On clusters running in compliance mode, the compadmin user did not have access
to core files that were created when system processes stopped running. This
prevented the compadmin user from analyzing the cause of a failure if a system
process unexpectedly restarted. This also prevented the compadmin user from
deleting the files.

130284

If a cluster was running in SmartLock compliance mode, you could not renew the
SSL certificate of the Isilon web administration interface.

128443

The CTO upgrade process did not complete on clusters in compliance mode.

118428

SmartQuotas issues resolved in OneFS 7.2.0.0

ID

When a soft quota was modified, if the --soft-grace option was modified but
the --soft-threshold option was not modified, the command-line interface
ignored the configuration change.

130640
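As a point of reference, a hedged sketch of a soft quota modification that changes
both values in the same command is shown below. The path, threshold, and grace
period are placeholder values, and any option names other than --soft-threshold
and --soft-grace are assumptions that should be verified against the isi quota
quotas modify help for your release:

isi quota quotas modify --path=/ifs/data/projects --type=directory --soft-threshold=100G --soft-grace=7D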

SMB issues resolved in OneFS 7.2.0.0

ID

Because OneFS relied on a function that could handle only file descriptors with a
maximum value of 1024, the lsass process unexpectedly restarted when it
attempted to process file descriptors assigned a value higher than 1024. As a
result, SMB users could not be authenticated for the few seconds it took for the
process to restart.

132043


While the lwio process was handling a symbolic link (a file that acts as a reference
to another file or directory) a memory allocation issue could occur in the lwio
process. If this issue was encountered, the lwio process unexpectedly restarted
and SMB clients that were connected to the affected node were disconnected.

131751

While executing a zero-copy system call, the lwio process could attempt to access
memory that was previously released to the system (also known as freed memory).
If the lwio process attempted to access freed memory, the lwio process
unexpectedly restarted and SMB clients that were connected to the affected node
were disconnected.

131748

The lwio process sometimes attempted to read data from a socket connection that
was not ready to be read from. If this occurred, the lwio process unexpectedly
restarted and the following ASSERTION FAILED message appeared in the

131745

lwiod.log file:
[lwio] ASSERTION FAILED: Expression = (pConnection->readerState.pRequestPacket->bufferUsed <= (maxHeader
+sizeof(NETBIOS_HEADER)))

Under some circumstances, the lwio process reported the length of a file name in
bytes when a different value type was expected. As a result, the lwio process
attempted to access memory that wasn't allocated to it, causing the lwio process to
crash. If the lwio process crashed, SMB clients that were connected to the affected
node were disconnected.

131711
If an SMB2 client experienced connection issues at the same time that it attempted
to place a lease on a file, a race condition could occur that resulted in the client
being disconnected from the cluster.

131681
Under rare circumstances, if a subprocess of the lwio process opened a new file
handle on an existing lease at the same time that another subprocess was breaking
the lease, the lwio process unexpectedly restarted. If the lwio process restarted,
SMB clients that were connected to the affected node were disconnected.

131586
When upgrading to OneFS 7.1.1.0, if any share names contained an invalid
character (for example, a bracket, colon, asterisk, or question mark), or if a share
path did not start with /ifs, the SMB configuration could not be upgraded. In
addition, no SMB shares would be visible after the cluster was upgraded and SMB
clients could not connect to the cluster until the invalid shares were removed and
the SMB configuration was successfully upgraded.

131364

Note

In OneFS 6.0 and earlier, an SMB share name could contain invalid characters and
shares could be created outside of the /ifs directory (an invalid share
configuration). On an upgrade to OneFS 6.5 through 7.0, an SMB share
configuration that contained shares with an invalid character or share paths that
did not start with /ifs could be successfully upgraded; however, the invalid
shares were inaccessible. Although the shares were inaccessible in OneFS 6.5 and
later, the existence of these shares could adversely affect upgrades to OneFS
7.1.1.0.
Under some circumstances, the lwio process reported the length of a file name in
bytes when a different value type was expected. As a result, the lwio process
attempted to access memory that wasn't allocated to it, causing the lwio process to
crash. If the lwio process crashed, SMB clients that were connected to the affected
node were disconnected.

130641
While a network socket was being closed, contention between process threads
could cause data structures referencing the socket to be prematurely freed. If the
freed structures were then accessed by another thread, the lwio process
unexpectedly restarted and SMB clients connected to the affected node were
disconnected.

130353

In environments with more than 12,000 SMB shares, the isi_webui_d process
sometimes ran out of memory and stopped running. If the isi_webui_d process
stopped running, the OneFS web administration interface was unavailable until the
process restarted and existing connections to the web administration interface
were unresponsive or were disconnected.

130267

Note

12,000 SMB shares exceeds the maximum number of shares supported by OneFS.
A work item could be scheduled in SRV and then freed before it could run. As a
result, crashes could occur.

130132

Because some SMB2 functions used an incorrect value type to manage SMB
message sequence numbers, SMB sometimes incorrectly returned a
STATUS_INVALID_PARAMETER error in response to SMB2 client requests. If
the STATUS_INVALID_PARAMETER error was returned, the affected SMB
client was disconnected from the cluster.

Note

Sequence numbers associate requests with responses and determine what
requests are allowed for processing.

130130
If an SMB1 session setup request and the Tree Connect process simultaneously
attempted to access the security context in-memory object, a race condition would
occur that stopped the lwio process and closed connections to the node.

130032

If a user requested access to a file to which they had write access and the file was
located in a share to which they had Read-Only access, the user might be
incorrectly denied access to the file if the create disposition of the request was
FILE_OPEN_IF.

130030

Note

The create disposition of the file specifies the action the system will take in
response to a request to access a file, based on whether the requested file does or
does not already exist.
SMB2 message IDs larger than 64kb in size were incorrectly displayed as zero,
which caused Active Directory domain controller connections to be reset.

130021

When snapshot data was queried with the SMB2_FIND_ID_BOTH_DIRECTORY_INFO


command after being deleted, the system incorrectly reported the file as found and
"." as the filename.

130010


Clients connected to the cluster through SMB were disconnected if the lwiod
process crashed. When the process crashed, the following lines were logged in the
stack trace:

130001

/lib/libthr.so.3:pthread_rwlock_init+0x117
/usr/likewise/lib/lwio-driver/srv.so: SrvConnection2SetInvalidEx
+0x22
/boot/kernel.amd64/kernel:
/usr/likewise/lib/lwio-driver/
srv.so:SrvProtocolTransport1DriverSendDone +0x6e
/usr/likewise/lib/lwio-driver/srv.so:SrvSocketProcessTaskWrite
+0x2dc
/usr/likewise/lib/lwio-driver/srv.so: SrvSocketProcessTask
+0x3d0
/usr/likewise/lib/liblwbase.so.0:EventThread+0x333
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xec
/lib/libthr.so.3:_pthread_getprio+0x15d


If auditing was enabled and a directory was accessed, the isDirectory flag was
sometimes incorrectly set to false. As a result, the audit log incorrectly indicated
that the item accessed was a file rather than a directory.

129455

A race condition would occur wherein an SMB1 session setup request and the Tree
Connect process simultaneously tried to access the security context in-memory
object. As a result, the lwio process would stop, and existing connections to the
node would close.

128076

The DC connection would reset when the MessageID wrapped to zero at 64 KB,
although it should have continued incrementing up to 0xFFFFFFFF (64 bits).

127778

If you created a snapshot of a directory through the OneFS web administration


interface, deleted the files, and then attempted to restore those files and folders in
Windows through the Restore Previous Versions option by right clicking the
directory, the deleted files were never restored to that directory.

127010

If there was a mismatch between share names stored in memory and share
names stored in the registry, an assert would sometimes occur and lwio might
restart unexpectedly with a signal 6 error.

127005

Applications that required a volume label could not establish a connection to an


SMB share on the cluster.

126496

You could create a share for a path that didn't exist through MMC. If you did
this, you could view the share through the OneFS web administration interface.
However, you could not access the share, because the path did not exist.

125888

When the number of file system event snapshots exceeded the amount of space
allocated by the OnefsEnumerateSnapshots buffer, the lwio process restarted on
various nodes, causing clients to be disconnected and then reconnected to the
cluster.

125570

If OneFS received an SMB request that contained a filepath, OneFS would convert
any forward slashes (/) to backslashes (\) before processing the request. This was
contrary to SMB standards, which specify that requests containing file paths that
include forward slashes return a STATUS_OBJECT_NAME_INVALID error.

125566

If a user had permission to access a shared directory, but the user was not granted
access to the parent directory that contained the shared directory, the user could
not rename files or folders contained in the shared directory.

125036


Clients connected to the cluster over SMB were disconnected when the lwio
process crashed. When the process crashed, the following lines were logged in
the /var/log/messages file:

124981

/lib/libthr.so.3:pthread_rwlock_init+0x117
/usr/likewise/lib/lwio-driver/srv.so: SrvConnection2SetInvalidEx
+0x22
/boot/kernel.amd64/kernel:
/usr/likewise/lib/lwio-driver/
srv.so:SrvProtocolTransport1DriverSendDone +0x6e
/usr/likewise/lib/lwio-driver/srv.so:SrvSocketProcessTaskWrite
+0x2dc
/usr/likewise/lib/lwio-driver/srv.so: SrvSocketProcessTask
+0x3d0
/usr/likewise/lib/liblwbase.so.0:EventThread+0x333
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xec
/lib/libthr.so.3:_pthread_getprio+0x15d

If a client sent an oplock or lease break acknowledgment for an oplock or lease that
was never requested, a crash would occur with the following stack trace:
/boot/kernel.amd64/kernel: /lib/libc.so.7:thr_kill+0xc
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase_nothr.so.
0:__LwRtlAssertFailed+0x5a
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
srv.so:SrvPrepareOplockStateAsync_SMB_V2+0x57
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
srv.so:SrvOplockBeginPolling+0x36
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
srv.so:SrvExecContextContinue2+0x1c7
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
srv.so:SrvProtocolExecute2+0xdf
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
srv.so:SrvExecuteCreateAsyncCB_SMB_V2+0x6a
/boot/kernel.amd64/kernel: /usr/likewise/lib/libiomgr.so.
0:IopIrpCompleteInternal+0x324
/boot/kernel.amd64/kernel: /usr/likewise/lib/libiomgr.so.
0:IoFmIrpDispatchContinue+0x8c4
/boot/kernel.amd64/kernel: /usr/likewise/lib/libiomgr.so.
0:IoIrpComplete+0x33
/boot/kernel.amd64/kernel: /usr/likewise/lib/lwio-driver/
onefs.so:OnefsCompleteIrpContext+0xa9
/boot/kernel.amd64/kernel: /usr/likewise/lib
/boot/kernel.amd64/kernel: /lwio-driver/
onefs.so:OnefsProcessIrpContext+0x18b
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase.so.
0:CompatWorkItem+0x16
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase.so.
0:WorkThread+0x256
/boot/kernel.amd64/kernel: /usr/likewise/lib/liblwbase.so.
0:LwRtlThreadRoutine+0xec
/boot/kernel.amd64/kernel: /lib/libthr.so.3:_pthread_getprio+0x15d

123747

If OneFS received a request from an SMB client whose Kerberos service ticket could
not be decrypted, OneFS returned a STATUS_LOGON_FAILURE response to
the SMB client that sent the request. If this response was sent, the affected SMB
client might have experienced issues accessing files or applications that were
stored on the cluster.

114524


Note

In OneFS 7.2.0.0 and later, if OneFS receives a request from an SMB client whose
Kerberos service ticket cannot be decrypted, a
STATUS_MORE_PROCESSING_REQUIRED response is returned. This
response prompts the affected SMB client to search for a secondary cluster. If the
search for a secondary cluster fails, the affected SMB client could still experience
issues accessing files or applications on the cluster.

Upgrade and installation


Upgrade and installation issues resolved in OneFS 7.2.0.0

ID

If all of the following conditions were met and you upgraded to OneFS 7.0 or
later, the SMB configuration was not successfully upgraded and one or more
services were sometimes disrupted following the upgrade:

l The cluster was being upgraded from OneFS 6.5.0 or earlier.

l The file path to one or more SMB shares on the cluster contained a multibyte
character.

130266

The upgrade process might detect events that do not appear in the OneFS web
administration interface or the output of the isi events list command.
Because these events are older than 30 days, they are not displayed by default.

125551

During a OneFS upgrade, the crontab file was not updated with data from the
crontab.smbtime file. As a result, crontab overrides that were configured before
the upgrade were not applied after the cluster was upgraded.

125550

Virtual plug-ins
Virtual plug-ins issues resolved in OneFS 7.2.0.0

ID

Due to a capacity checking error, if you created a new cluster through the OneFS
Simulator on Windows, Windows VMs, or VMWare Fusion workstations, the cluster
failed to mount /ifs and the following error message appeared:

IFS failed to mount. Aborting boot. Please contact Isilon
Customer Support.

133546


CHAPTER 6
Isilon ETAs and ESAs related to this release

The following section provides a list of EMC Technical Advisories (ETAs) and EMC
Security Advisories (ESAs) that describe issues that affect the latest 7.2.0 release
or previous 7.2.0 releases.
For the most up-to-date list of Isilon ETAs and ESAs, see the Notifications section of the
Isilon Uptime Info Hub on the EMC Isilon Community Network site. You can also subscribe
to receive ETAs and ESAs related to OneFS via email by visiting the EMC Isilon OneFS
product page on the EMC Isilon Support site and clicking the Manage Advisory
Subscriptions link under Advisories.

ETAs related to OneFS 7.2.0................................................................................ 132


ESAs related to OneFS 7.2.0................................................................................ 133


ETAs related to OneFS 7.2.0

Functional area: Authentication
ETA: 199379
Description: If Microsoft Security Bulletin MS15-027 was installed on a Microsoft
Active Directory server that authenticated SMB clients that were accessing an
Isilon cluster, and if the server used the NTLMSSP challenge-response protocol,
the SMB clients could not be authenticated.
Status: Resolved in OneFS 7.2.0.2
ID: 147221

Functional area: Backup, recovery, and snapshots
ETA: 203815
Description: If you used the snapshot-based incremental backup feature during a
backup operation and multiple snapshots were created between backups, the
feature might have failed to recognize that data had changed during the backup
procedure. As a result, some changed files were not backed up.
Status: Resolved in OneFS 7.2.0.4
ID: 154269

Functional area: File system
ETA: 202452
Description: If a node ran for more than 497 consecutive days without being
rebooted, an issue that affected the OneFS journal buffer sometimes disrupted the
drive sync operation. If this issue occurred, OneFS reported that the journal was
full, and, as a result, resources that were waiting for a response from the journal
entered a deadlock condition. In addition, clusters that contained a node that ran
for more than 497 consecutive days with no downtime could have unexpectedly
rebooted as a result of this issue.
Status: Resolved in OneFS 7.2.0.4
ID: 158417

Functional area: Hardware
ETA: 198924
Description: If a drive in an HD400 node was replaced while the drive was in the
process of being smartfailed, and if the node that contained the replaced drive was
rebooted before the smartfail process was complete, the affected node failed to
mount the /ifs partition.
Status: Resolved in OneFS 7.2.0.2
ID: 142946

Functional area: NFS
ETA: 197460
Description: Under specific conditions, a user with appropriate POSIX permissions
was denied access to modify a file.
Status: Resolved in OneFS 7.2.0.1
ID: 141210

Functional area: NFS
ETA: 204898
Description: Although the correct ACLs were assigned to a file (for example,
std_delete or modify), NFSv3 and NFSv4 clients could not delete, edit, or move the
file unless the delete_child permission was set on the parent directory.
Status: Resolved in OneFS 7.2.0.4
ID: 149743

Functional area: NFS
ETA: 205085
Description: Because OneFS 7.2.0 and later returned 64-bit NFS cookies, some
older, 32-bit NFS clients were unable to correctly handle read directory (readdir)
and extended read directory (readdirplus) responses from OneFS. In some cases,
the affected 32-bit clients became unresponsive, and in other cases, the clients
could not view all of the directories in an NFS export. In the latter cases, the client
could typically view the current directory (".") and its parent directory ("..").
Status: Resolved in OneFS 7.2.0.3
ID: 153737

Functional area: Networking
ETA: 200096
Description: If the cluster contained X410, S210, or HD400 nodes that had BXE 10
GigE NIC cards and any external network subnets connected to the cluster were set
to 9000 MTU, the affected nodes rebooted, and an error similar to the following
appeared in the /var/log/messages file:
ERROR: mbuf alloc fail for fp[01] rx chain (55)
Status: Resolved in OneFS 7.2.0.3
ID: 152083, 148695

Functional area: SMB
ETA: 198187
Description: If a Windows client that was connected to the cluster through SMB
copied a file from the cluster, the timestamp metadata applied to the file might
have become invalid.
Status: Resolved in OneFS 7.2.0.1
ID: 142313

ESAs related to OneFS 7.2.0


ESA: ESA-2015-154
Description: The network time protocol (NTP) service was updated to version
4.2.8P1.
Status: Resolved in OneFS 7.2.0.4
ID: 154655

ESA: ESA-2015-114
Description: The version of Apache that is installed on the cluster was updated to
version 2.2.29.
Status: Resolved in OneFS 7.2.0.3
ID: 136994

ESA: ESA-2015-112
Description: User input that is passed to a command line is now escaped using
quotation marks.
Status: Resolved in OneFS 7.2.0.2
ID: 140931

ESA: ESA-2015-093
Description: An update was applied to address a denial of service vulnerability in
Apache HTTP Server.
Status: Resolved in OneFS 7.2.0.2
ID: 137884

ESA: ESA-2014-146
Description: The version of GNU bash installed on the cluster was updated to
version 4.1.17.
Status: Resolved in OneFS 7.2.0.2
ID: 143337

ESA: ESA-2015-015
Description: Because SSL v3 was vulnerable to some man-in-the-middle attacks,
support for SSL v3 for HTTPS connections to the cluster was removed beginning in
OneFS 7.2.0.1.
Status: Resolved in OneFS 7.2.0.1
ID: 137904

ESA: ESA-2015-038
Description: The version of ConnectEMC installed on the cluster was updated from
version 3.2.0.4 to 3.2.0.6. This upgrade changes the behavior of the ConnectEMC
component so that it no longer uses an internal version of OpenSSL and instead
relies on the version of OpenSSL installed on the Isilon cluster.
Status: Resolved in OneFS 7.2.0.1
ID: 134760

CHAPTER 7
OneFS patches included in this release

The following section provides a list of patches that address issues that are now fixed in
OneFS. If you previously installed one or more of the listed patches, and you upgrade to a
release that includes the fix for the issue the patch addressed, you do not need to
reinstall those patches after you upgrade.
After upgrading, see Current Isilon OneFS Patches on the EMC Online Support site to find
out if any new patches were released that might apply to the version of OneFS you
upgraded to.

Patches included in OneFS 7.2.0.4...................................................................... 136


Patches included in OneFS 7.2.0.3 (Target Code)................................................ 136
Patches included in OneFS 7.2.0.2...................................................................... 137
Patches included in OneFS 7.2.0.1...................................................................... 139


Patches included in OneFS 7.2.0.4


Functional
area

Description

Patch ID

Authentication

Functionality change:
Users that attempt to connect to the cluster over SSH,
through the OneFS API, or through a serial cable, can no
longer be authenticated on clusters running in compliance
mode if any of the following identifiers are assigned to the
user as either the user's primary ID or as a supplemental ID:

UID: 0
SID: S-1-22-1-0

patch-156748
HDFS

This patch addresses multiple issues that affect the HDFS


protocol. For more information about the issues addressed
by this patch, review the patch README.

patch-159065

HDFS

Adds 1.7.0_IBM HDFS to the list of supported Ambari servers.

patch-157202

NFS

This patch addresses multiple issues that affect the NFS


protocol. For more information about the issues addressed
by this patch, review the patch README.

patch-158509

NFS

This patch addresses multiple issues that affect the NFS


protocol. For more information about the issues addressed
by this patch, review the patch README.

patch-156230

SMB

This patch addresses multiple issues that affect SMB2


symbolic links. For more information about the issues
addressed by this patch, review the patch README.

patch-154603

Patches included in OneFS 7.2.0.3 (Target Code)


Functional
area

Description

Patch ID

Events, alerts,
and cluster
monitoring

If you ran the isi statistics client or isi


statistics heat command with the --csv option, the
following error appeared instead of the statistics data:

patch-153659

unsupported operand type(s) for %: 'NoneType'


and 'tuple'

Job engine

If a MediaScan job detected an ECC error in a file's data, the
job did not properly restripe the file away from the ECC error.
As a result, the file was underprotected, and was at risk for
data loss if further damage occurred to the data (for example,
if a device containing a copy of the data failed). If this issue
occurred, a message similar to the following appeared in
the /var/log/isi_job_d.log file:
mark_lin_for_repair:1331: Marking forrepair:
1:0001:0003::HEAD

patch-156835

NFS

Because NFSv3 Kerberos authentication requires all NFS
procedure calls to use RPCSEC_GSS authentication, some
older Linux clients (for example, RHEL 5 clients) that started
the FSINFO procedure call with AUTH_NULL authentication
before attempting the FSINFO procedure call with
RPCSEC_GSS authentication, were prevented from mounting
an NFS export if the export was configured with the Kerberos
V5 (krb5) security type. Newer clients that started the FSINFO
procedure call with RPCSEC_GSS were not affected.

patch-151610

SMB

Due to a file descriptor (FD) leak that occurred when SMB
clients listed files and directories within an SMB share, it was
possible for OneFS to eventually run out of available file
descriptors. If this occurred, an ACCESS_DENIED or
STATUS_TOO_MANY_OPENED_FILES response was sent to
SMB clients that attempted to establish a new connection to
the cluster or SMB clients that were connected to the cluster
that attempted to view or open files. As a result, new SMB
connections could not be established and SMB clients that
were connected to the cluster could not view, list, or open
files. If this issue occurred, messages similar to the following
appeared on the Dashboard > Event summary page of the
OneFS web administration interface, and in the command-line
interface when you ran the isi events list -w |
grep -i descriptor command:
System is running out of file descriptors

In addition, messages similar to the following appeared in
the /var/log/lwiod.log file:
Could not create socket: Too many open files
Failed to accept connection due to too many
open files

patch-154168

Patches included in OneFS 7.2.0.2


Functional area

Description

Authentication

If a cluster that was joined to an Active Directory (AD)
domain was also configured with an IPv6 subnet, and
if the AD domain controller was configured to use an
IPv6 address, the netlogon process on the cluster
repeatedly restarted and members of the Windows
AD domain could not be authenticated to the cluster.
If the netlogon process restarted as a result of this


issue, Windows clients might have received an
Access Denied error when attempting to access
SMB shares on the cluster, or they might have
received a Logon failure: unknown user

name or bad password message when


attempting to log on to the cluster. In addition, the
following lines appeared in the /var/log/
messages file:
Stack:
------------------------------------------------/lib/libc.so.7:thr_kill+0xc
/lib/libc.so.7:__assert+0x35
/usr/likewise/lib/
libnetlogon_isidcchooser.so:IsiDCChooseDc
+0xbb3
/usr/likewise/lib/lw-svcm/
netlogon.so:LWNetChooseDc+0x27
/usr/likewise/lib/lw-svcm/
netlogon.so:LWNetSrvPingCLdapArray
+0x1187
/usr/likewise/lib/lw-svcm/
netlogon.so:LWNetSrvGetDCNameDiscoverInter
nal+0x72a
/usr/likewise/lib/lw-svcm/
netlogon.so:LWNetSrvGetDCNameDiscover
+0x111
/usr/likewise/lib/lw-svcm/
netlogon.so:LWNetSrvGetDCName+0xb20
/usr/likewise/lib/lw-svcm/
netlogon.so:LWNetSrvIpcGetDCName+0x4f
/usr/likewise/lib/liblwmsg.so.
0:lwmsg_peer_assoc_call_worker+0x20
/usr/likewise/lib/liblwbase.so.
0:CompatWorkItem+0x16
/usr/likewise/lib/liblwbase.so.
0:WorkThread+0x256
/usr/likewise/lib/liblwbase.so.
0:LwRtlThreadRoutine+0xec
/lib/libthr.so.3:_pthread_getprio+0x15d
-------------------------------------------------

patch-143372

NFS

If you ran a command from an NFSv3 or NFSv4 client


to query for files or directories in an empty folder,
and if you included the asterisk (*) or question mark
(?) characters in the command, the query failed and
an error message appeared on the console. For
example, if you ran the ls * command, the
command failed and the following error appeared on
the console:

patch-142630

ls: cannot access *: Too many levels of


symbolic links

SMB

If Microsoft Security Bulletin MS15-027 was installed
on a Microsoft Active Directory server that
authenticated SMB clients that were accessing an
Isilon cluster, and if the server used the NTLMSSP
challenge-response protocol, the SMB clients could
not be authenticated. As a result, SMB clients could
not access data on the cluster.
For more information, see article 199379 on the EMC
Online Support site.

patch-147684
SMB

If Microsoft Security Bulletin MS15-027 was installed
on a Microsoft Active Directory server that
authenticated SMB clients that were accessing an
Isilon cluster, and if the server used the NTLMSSP
challenge-response protocol, the SMB clients could
not be authenticated. As a result, SMB clients could
not access data on the cluster.
This patch is deprecated by patch-147684.

patch-145051

SMB

If the SMB2Symlinks option was disabled on the
cluster and a Windows client navigated to a symbolic
link that pointed to a directory, under some
circumstances, the system returned incorrect
information about the symbolic link. If this occurred,
the symbolic link appeared to be a file, and the
referenced directory could not be opened.
In addition, because OneFS 7.2.0.1 did not
consistently check the OneFS registry to verify
whether the SMB2Symlinks option was disabled, in
some cases, although the SMB2Symlinks option
was disabled, the lwio process attempted to handle
symbolic links when it should have allowed them to
be processed by the OneFS file system. If this
occurred, the following error appeared on the client:
The symbolic link cannot be followed
because its type is disabled.

patch-143767

Security

Updates the version of GNU bash installed on the


cluster to version 4.1.17.
For more information, see ESA-2014-146 on the EMC
Online Support site.

patch-139164

Patches included in OneFS 7.2.0.1


Functional area

Description

Patch ID

Networking

If different nodes in a cluster were connected to


different network subnets, and if those subnets were
assigned to different Active Directory sites, the site
configuration information on the cluster was
repeatedly updated. Because updates to the site
configuration information require a refresh of the
lsass service, this behavior caused authentication
services to become slow or unresponsive.

Patch-138767

NFS

If all of the following factors were true, a user with
appropriate POSIX permissions was denied access to
modify a file:

l The user was connected to the cluster through
NFSv3.

l The user was a member of a group that was
granted read-write access to the file through
POSIX mode bit permissions. For example, rwxrwxr-x (775).

l The user was not the owner of the file.

Depending on how the file was accessed, errors
similar to the following might have appeared on the
console:
Operation not permitted.

or
Permission denied.

Patch-141322

SMB

If a Windows client that was connected to the cluster


through SMB copied a file from the cluster, the
timestamp metadata applied to the file might have
become invalid. This issue occurred because OneFS
did not properly interpret the value assigned to a
file's timestamp metadata if the value was set to -1,
which is a valid value. Work flows that rely on
timestamp metadata might have been negatively
affected by this issue.
Note

The SMB protocol specifies that, when file attributes


are set, a value of -1 indicates that the attribute in
the corresponding field must not be changed.


Patch-142418

CHAPTER 8
Known issues

Unless otherwise noted, the following issues are known to affect OneFS 7.2.0.0 through
OneFS 7.2.0.4.

Target Code known issues................................................................................... 142


Antivirus............................................................................................................. 142
Authentication.................................................................................................... 142
Backup, recovery, and snapshots ....................................................................... 143
Cluster configuration...........................................................................................145
Command-line interface...................................................................................... 146
Diagnostic tools.................................................................................................. 146
Events, alerts, and cluster monitoring................................................................. 146
File system.......................................................................................................... 149
File transfer......................................................................................................... 151
Hardware............................................................................................................ 151
HDFS................................................................................................................... 153
iSCSI................................................................................................................... 154
Job engine...........................................................................................................154
Migration............................................................................................................ 156
Networking..........................................................................................................156
NFS..................................................................................................................... 157
OneFS API........................................................................................................... 159
OneFS web administration interface.................................................................... 160
Security...............................................................................................................160
SmartQuotas.......................................................................................................161
SMB.................................................................................................................... 161
Upgrade and installation..................................................................................... 162
Virtual plug-ins....................................................................................................163


Target Code known issues


The issues in the following table are known to affect OneFS 7.2.0.3 (Target Code). In
addition, unless otherwise noted, issues that are resolved in OneFS 7.2.0.4 and later are
known to affect OneFS 7.2.0.3. For a complete list of issues that are known to affect
OneFS 7.2.0.3, you should also review the OneFS 7.2.0.4 and later resolved issues
topics.

Antivirus
Antivirus known issues

ID

In the OneFS web administration interface, on the Antivirus Policies page, if you
double-click the Start link for a policy, multiple instances of the AVScan job start.

54477

Authentication known issues

ID

If there are files or directories on the cluster with ACLs that include SIDs for which
no corresponding UID is found, for example, ACLs that include SIDs for users that
were deleted from an Active Directory (orphaned SIDs), OneFS queries external
authentication providers in an attempt to map the SID to an authoritative UID.
If the attempt to map an orphaned SID to an authoritative UID fails, OneFS
continues to query external authentication providers for the missing UID, and, in
environments where a large number of orphaned SIDs exist, the volume of queries
sent to the providers might adversely affect the performance of the external
authentication provider. If this occurs, users might be prevented from being
authenticated to the cluster.

158867

If the alternate security identities attribute is enabled for an LDAP provider on a


cluster running OneFS 7.2.0.0 through 7.2.0.4, the lsass process fails to look up
alternate security identities, and, as a result, the affected users cannot be
authenticated to the cluster.

158243

Note

Enabling the alternate security identity setting is not a typical configuration.

In some cases, if the security mode of the SMB file sharing service is unchanged
from the default configuration in OneFS 6.5.5.x, and if another SMB share setting
(for example, Change Notify) is also changed from the default setting in OneFS
6.5.5, then, during an upgrade to OneFS 7.1.1.x, the Impersonate guest security
parameter is changed from Always to Never. If this issue occurs, following the
upgrade, SMB clients might not be able to access shares on the cluster until the
Impersonate Guest value for the share is manually set to Always.

154826

The OneFS SMB server might fail to respond to NetrWkstaGetInfo remote procedure
calls at info level 102 that a client (typically, embedded systems in a printer or
scanner) might make when establishing a connection to the cluster. This could
cause the client to fail to establish a connection to the cluster.

134324


The lwio process cannot rename files to a name longer than 255 bytes.

134304

Incorrect sequence numbers during SMB2 traffic could cause the lwsmd process to
fail, resulting in a temporary loss of SMB service.

134247

The lsass process might fail while running the NtlmGetDomainNameFromResponse
function due to an incorrectly formed request during NTLM authentication, resulting
in a temporary loss of authentication service.

134239

The lsass process might fail while running the NtlmValidateResponse function due
to an incorrectly formed request during NTLM authentication, resulting in a
temporary loss of authentication service.

134238

The lsass process might fail while running the AuthenticateNTLMv2 function due to
an incorrectly formed request during NTLM authentication, resulting in a temporary
loss of authentication service.

134237

The lsass process might fail while running the NtlmGetCStringFromUnicodeBuffer
function due to an incorrectly formed request during NTLM authentication, resulting
in a temporary loss of authentication service.

134236

When lsass is sequentially restarted across nodes in the cluster, the lwio process
might restart unexpectedly on a node, causing all SMB and NFS clients on that
node to be disconnected. If this occurs, the following lines appear in the stack
trace:
/lib/libc.so.7:recvfrom+0xc
0x807f0f2d5 (lookup_symbol: symtab/strtab not found:2)
0x807f0f417 (lookup_symbol: symtab/strtab not found:2)
0x807f0a819 (lookup_symbol: symtab/strtab not found:2)
0x807f0ab44 (lookup_symbol: symtab/strtab not found:2)
0x80bb2e474 (lookup_symbol: symtab/strtab not found:2)
/lib/libthr.so.3:_pthread_getprio+0x15d

131835

When you run the isi auth local user modify command with the
--password-never-expires option on one of the default service accounts, you
receive an Invalid Parameter error. For example, running the following
command attempts to set the password to never expire for the insightiq user
account:
isi auth local users modify --name=insightiq
--password-never-expires

For more information, see article 89092 on the EMC Online Support site.

83444

Backup, recovery, and snapshots


Backup, recovery, and snapshots known issues

ID

If you configure a snapshot schedule expiration time by running the isi
snapshot schedules command or through the OneFS web administration
interface, the expiration time is not always configured correctly and might not
display correctly in either interface. For example, if you configure a snapshot to
expire in 88 weeks, the web interface might display a snapshot expiration time of
14953 Hours. In addition, the snapshot schedule expiration time that is displayed
in the web interface or at the command-line interface might not accurately reflect
the actual configured expiration time. This issue occurs because the expiration
time is interpreted differently by the web interface, the command-line interface,
and the isi_papi_d process.
Workaround: Use the following commands to configure snapshot schedule
expiration times from the command-line interface. Values set by running these
commands are not interpreted by the isi_papi_d process and are, therefore, not
affected by this issue:
isi_classic snapshot schedule create
isi_classic snapshot schedule modify
For more information about the preceding commands, run the following
commands:
isi_classic snapshot schedule create -h
isi_classic snapshot schedule modify -h

139186

The isi_migr_sched process might fail when it is not possible to run replication
jobs, such as while a node is shutting down.

135744

Restoring an especially large amount of data (more than 50 TB) might fail due to a
memory allocation error with the following error message:
<3.3> scdepot-4(id4) isi_ndmp_d[94807]:
isi_ndmp_d: *** FAILED ASSERTION rb->datap @
/build/mnt/src/isilon/lib/isi_ndmpbrm2/fast_restore.c:1088:
Failed to allocate memory - Cannot allocate memory

133591

Parallel NDMP restores might fail while the cluster is under heavy load.

130693

If you set the read-only DOS attribute to deny modification of files over both UNIX
(NFS) and Windows File Sharing (SMB) on a target directory of a replication policy,
the associated replication jobs will fail.

127652

While a Backup Accelerator is running multiple NDMP sessions, memory
exhaustion or a crash might occur, and false sessions might appear on the NDMP
session list.
Workaround: Open an SSH connection on any node in the cluster, log in using the
root account, and run the following command:
rm -rf /ifs/.ifsvar/modules/ndmp/sessions/*

This will remove all stale files while retaining current sessions.

125897

The following message might appear in the /var/log/messages file:
isi_migrate[98488]: coord[cert2-long123-d0b]:
Problem reading from socket of (null):
Connection reset by peer

Workaround: Ignore the error message. This is a transient error that OneFS will
recover from automatically.

124767


Backing up large sparse files takes a very long time because OneFS must build
sparse maps for the files, and OneFS cannot back up data while building a map.
OneFS might run out of memory while backing up a sparse file with a large number
of sparse regions.

124216

File list backups are not supported with dir/node file history format.

113999


The SyncIQ scheduler service applies UTF-8 encoding even if the cluster is set with
a different encoding. As a result, DomainMark and SnapRevert jobs, which apply
cluster encoding, might fail to run.

99383

If you revert a snapshot that contains a SmartLock directory, the operation might
fail and leave the directory partially reverted.

99211

When SyncIQ and SmartQuotas domains overlap, a SyncIQ job might fail with one
of the following errors:
Operation not permitted
unable to delete
failed to move
unable to rename

For more information, see article 88602 on the EMC Online Support site.

97492

If you are using the CommVault Simpana data management application (DMA), you
cannot browse the backup if the data set has file names with non-ASCII characters.
As a result, you cannot select single files to restore. Full restoration of the dataset
is unaffected.
For more information, see article 88714 on the EMC Online Support site.

96545

If you use SyncIQ to synchronize data and some data is freed on the source cluster
because a file on the source decreased in size, the data is not freed on the target
cluster when the file is synchronized. As a result, the space consumed on the target
cluster might be greater than the space consumed on the source.

94614

SyncIQ allows a maximum of five jobs to run at a time. If a SnapRevert job starts
while five SyncIQ jobs are running, the SnapRevert job might appear to stop
responding instead of pausing until the SyncIQ job queue can accept the new job.

93061

After performing a successful NDMP backup that contains a large number of files
(in the tens of millions), when you restore that backup using Symantec NetBackup,
the operation fails and you receive the following error message:
error db_FLISTreceive: database system error 220

For more information, see article 88740 on the EMC Online Support site.

87092

Cluster configuration
Cluster configuration known issues

ID

If a user is assigned only the ISI_PRIV_AUDIT privilege, the user can view the
controls to delete file pool policies on the File System > Storage Pools > File
Pool Policies page.

Note

Although the ISI_PRIV_AUDIT privilege does not allow a user to delete file pool
policies, a user who is assigned the ISI_PRIV_AUDIT privilege can view the controls
to delete file pool policies on the File System > Storage Pools > File Pool
Policies page.

134378

The isi_cpool_io_d process might fail while attempting to close a file, generating
"bad file descriptor" errors in the log. This is due to leaving a stale descriptor for
the cache header.

132397

The command-line wizard requires a default gateway to set up a cluster. You may
not have a default gateway if your network uses a local DNS server.
Workaround: Enter 0.0.0.0 for your default gateway.

24621

Command-line interface
Command-line interface known issues

ID

If you run an isi command with the --help option to get more information about
the command, the text that is displayed might provide information about the
related isi_classic command instead of providing information about the
command that you typed. For example, if you run the isi storagepools
command with the --help option, the following information appears:
'isi_classic smartpools health' options are:
--verbose, -v Print settings to be applied.
--help, -h Print usage help and exit

129637

The isi version osreldate command returns a random number rather than
the expected OneFS release date.

98452

Diagnostic tools
Diagnostic tools known issues

ID

On the Gather Info page in the OneFS web administration interface, the Gather
Status progress bar indicates that the Gather Info process is complete while the
process is still running.

103906

Events, alerts, and cluster monitoring

Events, alerts, and cluster monitoring known issues

ID

If an NFS request specifies an inode rather than a file name, and more than one
hard link to the specified inode exists, OneFS auditing will be unable to determine
which hard link was intended by the NFS client. If this happens, OneFS auditing
might select the incorrect hard link, which can cause client permissions to be
misrepresented in audit logs.

136038

The isi_papi_d process might fail while InsightIQ begins monitoring a cluster that
contains 80 or more nodes.

135767

The isi_stats_hist_d process might fail when the cluster is under heavy load, with
the following lines in the stack trace:
/lib/libc.so.7:thr_kill+0xc
/lib/libc.so.7:__assert+0x35
/usr/sbin/isi_stats_hist_d:_ZN15stats_hist_ring4initEitb+0x506
/usr/sbin/isi_stats_hist_d:_ZN10ring_cache3getEiiiiii+0x228
/usr/sbin/isi_stats_hist_d:_ZN11db_mgr_impl5queryER20stats_timeseries_setP10stats_impltRK11query_timesRK14stats_hist_pol+0x33d
/usr/sbin/isi_stats_hist_d:_ZN16database_manager5queryER20stats_timeseries_setP10stats_impltRK11query_timesRK14stats_hist_pol+0x28
/usr/sbin/isi_stats_hist_d:_ZN20ecd_query_timeseries8query_dbEP10stats_impltRK11query_timesRK14stats_hist_pol+0x3d
/usr/sbin/isi_stats_hist_d:_ZN20ecd_query_timeseries12proc_commandEl+0x56c
/usr/sbin/isi_stats_hist_d:main+0xbcd
/usr/sbin/isi_stats_hist_d:_start+0x8c

135641

The isi_celog_coalescer process fails when the garbage collector reaches across
multiple threads/connections and attempts to clear out objects that it deems
unreferenced.

132398

The SNMP daemon might restart after a drive is smartfailed and then replaced.

129711

If you have auditing with NFS enabled on your cluster, the NFS service might restart
unexpectedly. If this occurs, lines similar to the following appear in
the /var/log/messages file:
Stack: --------------------------------------------------
/usr/lib/libstdc++.so.6:_ZNSs6assignERKSs+0x1e
/usr/lib/libisi_flt_audit.so.1:_init+0x3b60
/usr/lib/libisi_flt_audit.so.1:_init+0x4092
/usr/likewise/lib/libiomgr.so.0:IopFmIrpStateDispatchPostopExec+0x16a
/usr/likewise/lib/libiomgr.so.0:IoFmIrpDispatchContinue+0x74d
/usr/likewise/lib/libiomgr.so.0:IopIrpDispatch+0x317
/usr/likewise/lib/libiomgr.so.0:IopRenameFile+0x117
/usr/likewise/lib/libiomgr.so.0:IoRenameFile+0x22
/usr/lib/libisi_uktr.so.1+0x167873:0x8082f2873
/usr/lib/libisi_uktr.so.1+0x194a17:0x80831fa17
/usr/lib/libisi_uktr.so.1+0x18fc90:0x80831ac90
/usr/lib/libisi_uktr.so.1+0x169b2c:0x8082f4b2c
/usr/likewise/lib/liblwbase.so.0:SparkMain+0xb7
--------------------------------------------------

Workaround: Disable auditing with NFS.

129098


When alert traffic is high, running the isi events quiet all command might
time out. As a result, some events might not be quieted and the following error
might be displayed:
Error marking events: Error while getting
response from isi_celog_coalescer (timed out)

Workaround: Run the isi events quiet all command on the master node.

113689,
112774


If the email address list for an event notification rule is modified from the command
line, the existing list of email addresses is overwritten by the new email addresses.
For more information, see article 88736 on the EMC Online Support site.

89086

Although SNMP requests can reference multiple object IDs, the OneFS subtree
responds only to the first object ID.

81183

If you have a large number of LUNs active, the event processor might issue a
warning about open file descriptors held by the iSCSI daemon.
You can safely ignore this warning.

79341

On the Cluster Overview page of the OneFS web administration interface, clicking
the ID of a node that requires attention, as indicated by a yellow Status icon, does
not provide details about the status.
Workaround: In the list of events, sort the nodes by the Scope column or by the
Severity column, and then click View details.
Alternatively, run the isi events list --nodes <id> command to view the
events.
For more information, see article 16497 on the EMC Online Support site.

77470

If you run the isi status command, the value displayed for the sum of per-node
throughput might differ from the value displayed for the sum of cluster throughput.
This occurs because some data is briefly cached. The issue is temporary.
Workaround: Re-run the isi status command.
For more information, see article 88690 on the EMC Online Support site.

73554

Reconfiguring aggregate interfaces can leave active events for inactive interfaces.
Workaround: Cancel the events manually.

72200

Event system databases that store historical events might fail to upgrade correctly.
If the databases fail to upgrade, they are replaced by an empty database with a
new format and historical events are lost.

71840

A network interface that is configured as a standby without an IP address triggers
an interface down event.
Workaround: Quiet the event manually.

71399
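
For example, to quiet the event from the command-line interface (a hedged
illustration; replace <event ID> with the ID reported by isi events list):
isi events list
isi events quiet <event ID>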

Monitoring with SNMP, InsightIQ, or the isi statistics command can fail
when a cluster is heavily loaded.

68559

While a cluster processes a heavy I/O load, graphs in the OneFS web
administration interface might display the following message:
Warning: Unreliable Data

Workaround: Run the isi statistics command.

62736
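
For example, the system view of the statistics command can be used to check the
same data from the command line (a hedged illustration; another isi statistics
subcommand might be more appropriate for your case):
isi statistics system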


When using Simple Network Management Protocol (SNMP) to report on aggregated
interfaces, for example, LACP, LAGG, and fec, the interface speed is displayed as
100 MB instead of 2 GB.
For more information, see article 89363 on the EMC Online Support site.

55247

You might receive an alert that a temporary license is expired even though a
permanent license is installed.
Workaround: Use the command-line interface or the web administration interface
to quiet the alert.

24504

File system
File system known issues

ID

If you create or open an Alternate Data Stream (ADS) with the Permission to
Delete option enabled at open time, a memory resource leak on the virtual file
system can result. This might degrade overall cluster performance.

153312

If a dedupe job is running on a file that is also in the process of being deleted, the
workers for the job can be delayed long enough to generate a hangdump file. The
dedupe job will continue afterwards. If this issue is encountered, messages similar
to the following appear in the /var/log/messages file:
isi_hangdump: Lock timeout: 720.008538 from
efs.lin.lock.initiator.oldest_waiter
isi_hangdump: LOCK TIMEOUT AT 1421800091 UTC
isi_hangdump: Hangdump timeout after 0 seconds: Received HUP
isi_hangdump: Lock timeout: 725.018597 from
efs.lin.lock.initiator.oldest_waiter
isi_hangdump: Lock timeout: 730.028656 from
efs.lin.lock.initiator.oldest_waiter
isi_hangdump: Lock timeout: 735.038715 from
efs.lin.lock.initiator.oldest_waiter
isi_hangdump: Lock timeout: 740.048774 from
efs.lin.lock.initiator.oldest_waiter
isi_hangdump: END OF DUMP AT 1421800091 UTC

141028

A node might fail to shut down or reboot if the shutdown process is unable to stop
the lwsm process in less than 2 minutes. If this issue occurs, the following error
appears in the /var/log/messages file:
rc.shutdown: 120 second watchdog timeout expired. Shutdown
terminated.

If you encounter this issue, wait 5 minutes and then try to reboot the node by
running the reboot command. If the node fails to reboot, contact EMC Isilon
Technical Support for assistance.

140822

The lwio process might fail while a node is being shut down.

135869

The lwio process might fail while the cluster is under heavy load, causing clients to
become disconnected. If this occurs, the following lines appear in the logs:
/lib/libc.so.7:thr_kill+0xc
/usr/likewise/lib/liblwiocommon.so.0:LwIoAssertionFailed+0xb6
/usr/likewise/lib/libiomgr.so.0:IopFmIrpStateDispatchFsdCleanupDone+0x26
/usr/likewise/lib/libiomgr.so.0:IoFmIrpDispatchContinue+0x36c
/usr/lib/libisi_cpool_rdr.so:_Z16cprdr_pre_createP21_IO_FLT_CALLBACK_DATAP23_IO_FLT_RELATED_OBJECTSPPvPPFvS0_S3_ES4_+0x646
/usr/lib/libisi_cpool_rdr.so:_Z19process_pre_op_itemP13_LW_WORK_ITEMPv+0x54
/usr/likewise/lib/liblwbase.so.0:WorkThread+0x256
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xee
/lib/libthr.so.3:_pthread_getprio+0x15d

134343

During the upgrade process, an MCP error might prevent the last node on a cluster
from upgrading and corrupt the /etc/mcp/mlist.xml file.
Workaround: Delete the /etc/mcp/mlist.xml file and restart MCP. MCP will
autogenerate a new mlist.xml.

133115

When processing a restart request, MCP service configuration scripts that call isi
services might result in a recursive service stop request, and this second
request might cause the MCP to simultaneously stop a service while starting
another that depends upon it. This will result in unnecessary service restarts.
Workaround: Manually stop the processes in the reverse order of their dependency.

131924

If a node crashes on a three-node cluster and it is not re-added to the cluster, and
then you add a node, one of the remaining nodes might unexpectedly reboot. You
might need to wait for a significant amount of time before you can add the node to
the cluster successfully.
Workaround: Add the node to the cluster while no writes are being made to the
cluster. This will prevent the issue from occurring.

124603

LDAP user and group ownership cannot be configured in the OneFS web
administration interface.
Workaround: Use the command-line interface to configure LDAP user and group
ownership.

103983

An Alternate Data Stream (ADS) block-accounting error might cause the Inode
Format Manager (IFM) module to fail, causing the following message to be logged
to the stack trace:
kernel:isi_assert+0xde
kernel:isi_assert_mayhalt+0x70
efs.ko:ifm_compute_new_ads_summary+0x9a
efs.ko:ifm_update_ads_summary+0x15b
efs.ko:ifm_end_operation+0x11ad
efs.ko:txn_i_end_all_inode_ops+0x11d
efs.ko:txn_i_end_operations+0x5e
efs.ko:txn_i_end+0x3d
efs.ko:bam_remove+0x198
efs.ko:ifs_vnop_wrapremove+0x1bf
kernel:VOP_REMOVE_APV+0x33
kernel:kern_unlinkat+0x2a6
kernel:isi_syscall+0x49
kernel:syscall+0x26e

Workaround: Ignore this error message. This is a transient error that OneFS will
recover from automatically.

100118


Nodes without subpools appear in the per-node storage statistics, but are not in
the cluster totals because you cannot write data to unprovisioned nodes.

86328

The OneFS web administration interface does not prevent multiple rolling upgrades
from being started simultaneously. If multiple rolling upgrades are running
simultaneously, the upgrades fail.

84376

Some configuration changes cannot be made if the cluster is 99 percent full. As a
result, the cluster might stop responding until sufficient free space is made
available. See Best Practices Guide for Maintaining Enough Free Space on Isilon
Clusters and Pools on the EMC Online Support site.

74272

When you attempt to create a hard link to a file in a WORM (Write Once Read Many)
directory, the following incorrect error message displays:
Numerical argument out of domain

73790

When FlexProtect is run with verify upgrade check enabled and one or more drives
are down, OneFS occasionally reports false data corruption. If this issue occurs,
contact EMC Isilon Technical Support.

73276

If you run an incorrectly formatted shutdown command, a node might be placed
into read-only mode and could fail to shut down. In some cases, the node is
inaccessible through the network but is still accessible through a serial connection.
For more information, see article 89544 on the EMC Online Support site.

54120

In the OneFS web administration interface, file names that contain characters that
are not supported by the character encoding that is applied to the cluster do not
display correctly when viewed through File System Explorer.
Workaround: Rename the files using characters supported by the character
encoding that is applied to the cluster.

18901

File transfer
File transfer known issues

ID

The FTP output of the isi statistics command might be inaccurate.

129599

By default, the Very Secure FTP Daemon (vsftpd) service supports clear-text
authentication, which is a possible security risk.

Note

For more information about this issue, see the Protocols section of the OneFS 7.2
Security Configuration Guide.

127738

In the OneFS web administration interface, on the Diagnostics > Settings page, if
you enter an invalid address in the HTTP host or FTP host field, Connection
Verification Succeeded is displayed when you click Verify.

70448

Hardware
Hardware known issues

ID

If you run the isi devices -a smartfail -d <device> command to
smartfail a drive that failed before it was purposed by OneFS, an error similar to the
following appears on the console:
!! Error: the smartfail action is invalid for a missing drive.

Note

In the command example above, <device> is the drive to be smartfailed.

159412


If you reboot or shut down a node with a Broadcom 10 GbE network interface card
that is configured for legacy fec aggregation, the node might stop responding until
it is manually powered off.

136915

If the power supply fan in an HD400 node fails, the power supply indicator light
turns yellow, but no alert is sent. If this condition is not addressed, the power
supply will eventually fail and an alert will be sent for the power supply failure.
Contact EMC Isilon Technical Support if you encounter this issue.

135814

If a node encounters a journal error during an initial boot, OneFS allows the user to
continue booting the node through the following text:
Test Journal exited with error - Checking Isilon Journal
integrity...
NVRAM autorestore status: not performed...
Could not restore journal. Contact Isilon Customer Support
Immediately.
Please contact Isilon Customer Support at 1-877-ISILON.
Command Options:
1) Enter recovery shell
2) Continue booting
3) Reboot

If the node is booted in this state, and then joined to a cluster, it will remain in a
down state and might affect cluster quorum.
Workaround: Do not continue booting the node. Contact Isilon Technical Support.

135354

If an SED SSD drive is set to SED_ERROR, and the drive is formatted while L3
cache is enabled on the cluster, the drive will be formatted for storage and will
report a status of HEALTHY.
Workaround: SmartFail the SED SSD that has been formatted for storage and then
format the drive again.

133696

The isi firmware update command might incorrectly report that a firmware
update has failed because OneFS requires nodes to be rebooted after a firmware
update, but the command performs a shutdown -p command instead.

133606

The isi firmware update command might incorrectly report that a firmware
update has failed on a remote node.

133317

Node firmware updates will fail if HPM downloads return error code D5 during the
upgrade process.
Workaround: Retry updating the node firmware. If this issue persists, contact EMC
Isilon Technical Support.

132523

Chassis Management Controller (CMC) firmware update procedures might fail.
Workaround: Run the following command and then retry the update.
/usr/bin/isi_ipmicmc -c -a cmc

123303

An internal sensor that monitors components might not correctly detect the source
of a hardware component failure, such as the I2C bus. If this occurs, the wrong alert
or no alert might be generated.

73050


Nodes with invalid system configuration numbers are split from the cluster after
joining.
Workaround: Use smartfail to remove the node from the cluster. Contact Isilon
Technical Support to apply a valid system configuration number to the node and
then add the node to the cluster again.

71354

A newly created cluster might not be visible to unconfigured nodes for up to three
minutes. As a result, nodes will fail to join the cluster during that time period.

69503

If the /etc/isilon_system_config file or any /etc VPD file is blank, an
isi_dongle_sync -p operation will not update the VPD EEPROM data.

67932

There are multiple issues with shutting down a node incorrectly that can potentially
lead to data loss.
Workaround: Follow instructions about shutting down nodes exactly.
For more information, see article 16529 on the EMC Online Support site.

35144

HDFS
HDFS known issues

ID

DataNode connections can potentially experience a memory leak in the data path.
Over time, this can result in an unexpected restart of the HDFS server. As a result,
clients connected to that node are disconnected.
Workaround: The HDFS server will automatically be operational again within a few
seconds and no further action is necessary.

158083

If the Hadoop datanode services are left running on Hadoop clients that are
connected to a cluster, the isi_hdfs_d process will continuously log the following
message to /var/log/messages and /var/log/isi_hdfs_d.log as it
receives the requests:
org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol
from verify_ipc_protocol
(/build/mnt/src/isilon/bin/isi_hdfs_d/protoutil.c:18)
from parse_connection_context
(/build/mnt/src/isilon/bin/isi_hdfs_d/protoutil.c:100)
from ver2_2_parse_connection_context
(/build/mnt/src/isilon/bin/isi_hdfs_d/protocol_v2_2.c:388)
from process_out_of_band_rpc
(/build/mnt/src/isilon/bin/isi_hdfs_d/protocol_v2_2.c:1000)

135993

If the cluster is under heavy HDFS load, it might cause the isi_hdfs_d process to
restart. If this occurs, the following lines appear in the stack trace:
/lib/libc.so.7:__sys_kill+0xc
/usr/lib/libisi_util.so.1:isi_assert_halt+0xa0
/usr/lib/libisi_hdfs.so.1:hdfs_enc_mkdirat_p+0x2b1
/usr/lib/libisi_hdfs.so.1:hdfs_mkdir_p+0x41
/usr/bin/isi_hdfs_d:config_init_directory+0x13

123802


iSCSI
iSCSI known issues

ID

The iSCSI protocol can log a data digest error message in the iSCSI log.
No workaround is required; the protocol will recover and reconnect.

83537


VSS 32-bit installation succeeds on a Windows initiator, but the provider does not
appear in the list of installed providers. This issue affects Windows Server 2003
only.
Workaround: Install the Microsoft iSCSI Software Initiator.
For more information, see article 88763 on the EMC Online Support site.

74303

In the OneFS web administration interface, the iSCSI Summary page sometimes
loads slowly. When this occurs, the page might time out and the isi_webui_d
process might be consuming a high percentage of CPU resources on one or more
nodes.

73038

If you create a new target after you move iSCSI shadow clone LUNs, the OneFS web
administration interface might become unresponsive.

71919

Job engine
Job engine known issues

ID

In rare instances, if a drive fails while IntegrityScan is running, the IntegrityScan
job can fail. In addition, if you run the isi job events list --job-type
integrityscan command, a message similar to the following appears on the
console, where <x> is the job ID:
2015-02-12T15:35:31 <x> IntegrityScan 1
State change Failed

The job should automatically restart and then run to completion.

139708


In rare instances, if a drive fails while MediaScan is running, the MediaScan job can
fail. In addition, if you run the isi job events list --job-type
mediascan command, a message similar to the following appears on the console,
where <x> is the job ID:
2015-02-12T15:35:31 <x> MediaScan 1 State
change Failed

The job should automatically restart and then run to completion.

139704


The isi_job_d process might fail while a QuotaScan job is running. If this happens,
the QuotaScan job will continually pause and resume, and the following lines will
appear in the stack trace:
/lib/libc.so.7:thr_kill+0xc
/usr/lib/libisi_util.so.1:isi_assert_halt+0xa0
/usr/lib/libisi_job.so.1:tw_opendir+0x207
/usr/lib/libisi_job.so.1:tw_tree_init+0x327
/usr/bin/isi_job_d:treewalk_task_next_item+0x150
/usr/bin/isi_job_d:quotascan_task_next_item+0x4c
/usr/bin/isi_job_d:worker_process_task+0x307
/usr/bin/isi_job_d:worker_main+0x11cd
/lib/libthr.so.3:_pthread_getprio+0x15d

Workaround: Cancel the QuotaScan job.

134301
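
A hedged illustration of cancelling the job from the command line (the job ID comes
from the running jobs list; the syntax is assumed from the OneFS 7.x job engine CLI):
isi job jobs list
isi job jobs cancel <job ID>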


If you queue multiple jobs while smartfailing drives, AutoBalance jobs might fail.

133771

The MediaScan job reports errors for drives that have been removed from the
cluster.
Workaround: Don't fail a drive after a MediaScan job has started, or cancel the job.

132083

If the MultiScan, Collect, or Autobalance jobs are disabled before a rolling upgrade,
the jobs will automatically become enabled after the rolling upgrade completes.
Workaround: If MultiScan, Collect, or Autobalance jobs are disabled before a rolling
upgrade, and you want those jobs to be disabled after the rolling upgrade
completes, manually disable those jobs after the rolling upgrade completes.

124744

If a FlexProtect or FlexProtectLin job is started during a rolling upgrade, OneFS
might cancel the job. The job might not complete until after the rolling upgrade is
complete.
Workaround: If OneFS creates a FlexProtect job because a device failed during a
rolling upgrade, pause the upgrade until the job completes. It is recommended that
you pause the rolling upgrade and do not pause the FlexProtect job.

123167

The isi job status command displays jobs in numerical order by running ID
instead of displaying active jobs before inactive jobs.

114802,
114583

The isi job reports view job command sometimes returns reports twice.

112265

The Dedupe and DedupeAssess jobs can only run with a job-impact level of low.

110129

When you run a DomainMark job after taking a snapshot, and then run a
SnapRevert job with a job impact policy set higher than low, the impact policy has
no effect.
For more information, see article 88597 on the EMC Online Support site.

93603

Job engine operations occasionally fail on heavily loaded or busy clusters. When
the command fails, a message similar to the following is displayed:
Unable to pause integrity scan: pause command failed: Resource
temporarily unavailable.

Workaround: If an operation fails, wait a moment and then retry the operation.

72109

The final phase of the FSAnalyze job runs on one node and can consume excessive
resources on that node.

64854


Migration
Migration known issues

ID

If you migrate ACLs to the cluster through the isi_vol_copy_vnx command and
then attempt to read those ACLs over NFSv4, the read will fail with the following
error message:
An NFS server error occurred

131299

If you migrate FIFO files using the isi_vol_copy utility, the following message
displays:
Save checkpoint error:
Could not match file history.

100366

If the isi_vol_copy command is run twice, with different source paths but the
same target path, the second run fails without migrating any files.

100365

Networking
Networking known issues

ID

If a network socket is already closed when sbflush_internal is called, the affected
node might unexpectedly reboot. If a node reboots as a result of this issue, an error
similar to the following appears in the /var/log/messages file:
Software Watchdog failed (userspace is starved!)

150739

In clusters with a large number of nodes, after an InfiniBand switch is rebooted, the
cluster might experience a high level of group change activity for approximately two
hours. Because, by default, a single Device Work Thread (DWT) is handling all node
transitions to the new InfiniBand connections, some requests are not handled in a
timely manner. As a result, nodes might not successfully failover to the new
InfiniBand connection, and, in some cases, might fail to rejoin the cluster.
Workaround: To increase the number of DWT threads handling requests to failover
to a new InfiniBand connection, set the following sysctl value:
sysctl efs.rbm.dwt_threads=4

For more information about viewing and setting sysctl options, see article 89232 on
the EMC Online Support site.

Note

Increasing the number of DWT threads might affect CPU performance, depending
on the number of processors in the node.

134665
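
If you want the same value applied on every node rather than only the node where
you run sysctl, the isi_sysctl_cluster utility shown elsewhere in these release notes
can be used; this is a hedged sketch, not part of the documented workaround:
isi_sysctl_cluster efs.rbm.dwt_threads=4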


The OpenSM process might fail, causing cluster-wide actions to slow for a short
period of time. If this occurs, the following lines appear in the stack trace:
/lib/libc.so.7:thr_kill+0xc
/lib/libc.so.7:__assert+0x35
/usr/lib/libcomplib.so.1:cl_spinlock_acquire+0x53
/usr/libexec/opensm:osm_log+0xef
/usr/libexec/opensm:umad_receiver+0x55b
/usr/lib/libcomplib.so.1:__cl_thread_wrapper+0x18
/lib/libthr.so.3:_pthread_getprio+0x15d

132546

Ixgbe interfaces might report a status of inactive, even if the cable and the port that
the cable is plugged into are functioning correctly.

127706

If a port on an A100 node has IP addresses assigned to it, the port will reinitialize
when the node is booted up.

126464

After a group change, the dnsiq_d process might fail. After this, the following
message is logged to the stack trace:
/usr/sbin/isi_dnsiq_d:vip_configured+0x54
/usr/sbin/isi_dnsiq_d:vip_ifconfig_down+0x18
/usr/sbin/isi_dnsiq_d:apply_flx_subnet+0x7c
/usr/sbin/isi_dnsiq_d:gmp_group_changed+0x122
/usr/sbin/isi_dnsiq_d:main+0x660
/usr/sbin/isi_dnsiq_d:_start1+0x80
/usr/sbin/isi_dnsiq_d:_start+0x15

78588

When a node with a static IP address is smartfailed, the IP address is assigned to
another node. In some cases, the IP address that is moved might be moved to a
node that already has an IP address assigned to it, replacing the IP address on that
node.

71687

If an IPv6 subnet includes two or more NICs, one NIC might become unresponsive
over IPv6.

57880

NFS
NFS known issues

ID

If you run the rmdir command to remove a directory from an NFS export that is
configured with character encoding other than the default encoding (for example,
CP932 or ISO-8859-1 encoding), and if the name of the directory you want to
remove contains a special character, the directory is not removed and a message
similar to the following appears on the console:
failed to remove `\directory_path': Invalid argument

159373

On occasion, when OneFS is shutting down the NFS server, a system call made by
the server does not return a response within the allowed 15-minute grace period.
As a result, the NFS server is forcibly shut down and lines similar to the following
appear in the /var/log/messages file:
/lib/libc.so.7:syscall+0xc
/usr/likewise/lib/lw-svcm/onefs.so:OnefsQuerySetInformationFile+0xa7
/usr/likewise/lib/lw-svcm/onefs.so:OnefsSetInformationFile+0x3b
/usr/likewise/lib/lw-svcm/onefs.so:OnefsIrpSpark+0x109
/usr/likewise/lib/lw-svcm/onefs.so:OnefsIrpWork+0xfa
/usr/likewise/lib/lw-svcm/onefs.so:OnefsAsyncStart+0x55
/usr/likewise/lib/lw-svcm/onefs.so:OnefsDriverDispatch+0x6f
/usr/likewise/lib/libiomgr.so.0:IopFmIrpStateDispatchFsdExec+0x9d
/usr/likewise/lib/libiomgr.so.0:IoFmIrpDispatchContinue+0x56c
/usr/likewise/lib/libiomgr.so.0:IopIrpDispatch+0x1d0
/usr/likewise/lib/libiomgr.so.0:IopQuerySetInformationFile+0x1fc
/usr/likewise/lib/libiomgr.so.0:IoSetInformationFile+0x44
/usr/likewise/lib/lw-svcm/nfs.so:Nfs4SetattrSetInfoFile+0x5a2
/usr/likewise/lib/lw-svcm/nfs.so:Nfs4Setattr+0x3bd
/usr/likewise/lib/lw-svcm/nfs.so:NfsProtoNfs4ProcSetAttr+0x178
/usr/likewise/lib/lw-svcm/nfs.so:NfsProtoNfs4ProcCompound+0x87e
/usr/likewise/lib/lw-svcm/nfs.so:NfsProtoNfs4Dispatch+0x486
/usr/likewise/lib/lw-svcm/nfs.so:NfsProtoNfs4CallDispatch+0x3e
/usr/likewise/lib/liblwbase.so.0:SparkMain+0xb7

136358

The NFS process might fail if you attempt to shut down the NFS process while the
cluster is under heavy NFS load.

135529

OneFS might report that NFS clients are still connected to the cluster after the
clients have disconnected.

135376

The NFS process might core, causing all NFS clients to be disconnected. If this
occurs, the following lines appear in the stack trace:
/lib/libc.so.7:thr_kill+0xc
/lib/libc.so.7:__assert+0x35
/usr/likewise/lib/libiomgr.so.0:IoFileSetContext+0x32
/usr/likewise/lib/lwio-driver/onefs.so:OnefsStoreCCB+0x20
/usr/likewise/lib/lwio-driver/onefs.so:OnefsNfsCreateFile+0xf4b
/usr/likewise/lib/lwio-driver/onefs.so:OnefsCreateInternal+0x1209
/usr/likewise/lib/lwio-driver/onefs.so:OnefsSemlockAvailableWorker+0x92
/usr/likewise/lib/lwio-driver/onefs.so:OnefsAsyncUpcallCallbackWorker+0x1dd
/usr/likewise/lib/lwio-driver/onefs.so:OnefsAsyncUpcallCallback+0xe8
/usr/lib/libisi_ecs.so.1:oplocks_event_dispatcher+0xb9
/usr/likewise/lib/lwio-driver/onefs.so:OnefsOplockChannelRead+0x56
/usr/likewise/lib/liblwbase.so.0:EventThread+0x6dc
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xee
/lib/libthr.so.3:_pthread_getprio+0x15d

129684

If an SMB client has an opportunistic lock (oplock) on a file and the file is renamed
or deleted by an NFS client, the SMB client does not relinquish its oplock, and the
file data on the SMB client is not updated. This issue is caused by an extremely rare
race condition that might occur in OneFS 6.0 or later.
For more information, see article 88591 on the EMC Online Support site.

94168


After a node restarts, the mountd process starts before authentication. As a result,
immediately after the node restarts, NFS clients might experience permission
problems or receive the wrong credentials when they mount a directory over NFS.
Workaround: On the NFS client, unmount and remount the directory.

73090
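
For example, on a Linux NFS client, the remount workaround might look like the
following; the export path and mount point are hypothetical:
umount /mnt/isilon
mount -t nfs cluster.example.com:/ifs/data /mnt/isilon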

Moving files between exports in an NFSv4 overriding-exports scenario may cause
unforeseen consequences.
Workaround: Configure exports so that they are not exporting similar paths or
mapping to two different credentials.

70616

When you add a node to the cluster, the master control program (MCP) loads the
sysctl.conf file after the external interfaces have IP addresses. As a result, NFS
clients that require 32-bit file handles might encounter issues connecting to newly
added nodes.
Workaround: On NFS clients that encounter this issue, unmount and then remount
the directory.

70413

The default number of NFS server threads was changed to address a potential issue
in which the NFS server monopolizes node resources. NFS performance might be
lower than expected.
Workaround: Adjust the number of nfsd threads by running the following
commands. Modify the minimum number of threads by running the following
command, where <x> is an integer:
isi_sysctl_cluster vfs.nfsrv.rpc.threads_min=<x>
Modify the maximum number of threads by running the following command, where
<x> is an integer:
isi_sysctl_cluster vfs.nfsrv.rpc.threads_max=<x>
We recommend that you set threads_min and threads_max to the same value.
Increasing the number of threads can improve performance, but can also cause
node stability issues.

69917
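
For example, to set both values to 16 threads (16 is purely an illustrative value;
choose and test a value that is appropriate for your workload):
isi_sysctl_cluster vfs.nfsrv.rpc.threads_min=16
isi_sysctl_cluster vfs.nfsrv.rpc.threads_max=16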

OneFS API
OneFS API known issues

ID

The lwswift process might fail if a large number of clients retrieve large files that
have not been previously accessed by Swift. If this occurs, the following lines
appear in the stack trace:
/lib/libc.so.7:thr_kill+0xc
/usr/likewise/lib/liblwbase_nothr.so.0:LwRtlMemoryAllocate+0x9e
/usr/likewise/lib/liblwbase.so.0:LwIovecCreateMemoryEntry+0x22
/usr/likewise/lib/liblwbase.so.0:LwIovecPullupCapacity+0x1ae
/usr/likewise/lib/lwio-driver/lwswift.so:_Z12HttpProtocolPN5swift10_LW_SOCKETEP9_LW_IOVECiPvPj+0x165
/usr/likewise/lib/liblwswift_utils.so.0:_ZN5swift12LwSocketTaskEP8_LW_TASKPv19_LW_TASK_EVENT_MASKPS3_Pl+0x634
/usr/likewise/lib/liblwbase.so.0:EventThread+0x6dc
/usr/likewise/lib/liblwbase.so.0:LwRtlThreadRoutine+0xee

135252

If you attempt to write to a read-only file, OneFS does not log an error message to
the /var/log/lwswift.log file.

134770

In the RESTful Access to the Namespace (RAN) API, when a file is created through
the PUT operation, a temporary file of the same name with a randomly generated
suffix is placed in the target directory. Under normal circumstances, the temporary
file is removed after the operation succeeds or fails. However, the temporary file
may remain in the target directory if the server crashes or is restarted during the
PUT operation.

104388


OneFS web administration interface


OneFS web administration interface known issues

ID

If you run the isi devices fwupdate command on a node that contains SSDs
configured for use as L3 cache, and that node is in read-only mode, the node might
restart unexpectedly and an error similar to the following appears in
the /var/log/messages file:
login: panic @ time 1436325569.184, thread 0xffffff01a8c175b0:
Assertion Failure
cpuid = 2
Panic occurred in module kernel loaded at 0xffffffff80200000:
Stack: --------------------------------------------------
kernel:isi_assert_halt+0x2e
kernel:jio_journal_write_sync+0x60
kernel:j_write_l3_super+0x104
kernel:mgmt_finish_super+0x4b
kernel:mgmt_remove_from_sb+0x18b
kernel:l3_mgmt_drive_state+0x7ec
kernel:drv_change_drive_state+0x111
kernel:ifs_modify_drive_state+0x16cb
kernel:_sys_ifs_modify_drive_state+0x83
kernel:isi_syscall+0xaf
kernel:syscall+0x325
--------------------------------------------------
*** FAILED ASSERTION j_can_continue_write(j) @ /b/mnt/src/sys/ifs/
journal/journal_io.c:186: jio_journal_write_sync: attempt to
write to read-only journal

155489

If you attempt to upload cluster information through the OneFS web administration
interface and the upload fails, the web interface for uploading information ceases
to function. If you attempt to upload information again, OneFS will display
Gather Succeeded. However, no cluster information will be uploaded.

133974

If you have not uploaded cluster information to Isilon Technical Support yet, on the
Cluster Management > Diagnostics Info page, the Gather Status bar appears
gray or black.

133972

The default SSL port (8080) for the web administration interface cannot be
modified.
For more information, see article 88725 on the EMC Online Support site.

94026

If you use the SmartConnect service IP or hostname to log in to the OneFS web
administration interface, the session fails or returns you to the login page.
Workaround: Connect to the cluster with a static IP address instead of a hostname.

75292

Security
Security known issues

ID

Beginning in OneFS 7.2.0.1, SSL v3 is no longer supported for HTTPS connections
to the cluster. As a result, HTTP clients cannot connect to the OneFS web
administration interface through a connection that relies on SSL v3.
Workaround: Enable TLS 1.x for HTTP connections to the web administration
interface.
For more information, see ESA-2015-015 on the EMC Online Support site.

137904

SmartQuotas
SmartQuotas known issues

ID

If you configured a storage quota on a directory with a pathname that contained a
single, multibyte character, and if a quota notification email was sent for that
directory, the multibyte character in the pathname that appeared in the quota
notification email was replaced with an incorrect character, such as a question
mark.

138115

Quota configuration import and export functionality is missing from the isi
quotas command.
Workaround: To export or import quota configuration files, run the isi_classic
quota list --export or the isi_classic quota --import --fromfile <filename>
command from the command-line interface, where <filename> is the name of the
file to be imported.
To export a file from the OneFS web administration interface, click SmartQuotas >
Generated Reports Archive > Generate a quota report.

94797


Writing files past the quota limit over NFSv4 generates an I/O error.

69816

SMB
SMB known issues

ID

On the Protocols > Windows Sharing > SMB Shares tab in the OneFS web
administration interface, if you click Reset or Cancel in the Add a User or Group
dialog box while adding or viewing an SMB share, the Add a User or Group dialog
becomes inoperable.
Workaround: Refresh the OneFS web administration web page.

139712

If you shut down a node while a cluster is under heavy load, the following lines
might appear in the stack trace:
/lib/libc.so.7:recvfrom+0xc
/usr/lib/libisi_gconfig_c_client.so.1:gconfig_connection_flush+0x375
/usr/lib/libisi_gconfig_c_client.so.1:gconfig_connection_read_message+0x47
/usr/lib/libisi_gconfig_c_client.so.1:gconfig_client_update_entries_count+0x799
/usr/lib/libisi_gconfig_c_client.so.1:gconfig_client_wait_for_config_change+0x274
/usr/likewise/lib/lwio-driver/srv.so:StoreChangesWatcherThreadRoutine+0xf3
/lib/libthr.so.3:_pthread_getprio+0x15d

134661

If an application sends OneFS a request for alternate data streams, but specifies a
buffer size that is too small to receive all of the alternate data streams, OneFS will
report that the streams do not exist, instead of reporting that the buffer size was
too small.

134299

Alternate data streams might be inaccessible through Windows PowerShell.

134250

The isi_papi_d process might fail while there is a large amount of SMB traffic and
multiple threads call the same code at the same time. However, in rare cases, the
port can suddenly become inactive.
Workaround: If a port becomes inactive, you must reboot the node to resolve this
issue.

130692

Some SMB 1 clients send a Tree Connect AndX request using ASCII to specify a
path. The cluster rejects the connection with STATUS_DATA_ERROR.

84457

When you add a new Access Control Entry (ACE) that grants run-as-root permissions
to an Access Control List (ACL) on an SMB share, OneFS adds a duplicate ACE if
there is already an entry granting full control to the identity. The extra ACE grants no
extra permissions.
Workaround: Remove the extra ACE by running the isi smb permissions
command.

72337

Upgrade and installation


Upgrade and installation known issues

ID

Beginning in OneFS 7.2.0.1, the network port range used for back-end
communications was changed. As a result, in rare cases, if you perform a rolling
upgrade from a supported version of OneFS to OneFS 7.2.0.1 or later, and if the
upgrade process fails or is paused before all of the nodes in the cluster have been
upgraded, commands sent from nodes that have not yet been upgraded might be
sent to an upgraded node through an unsupported port.
If this issue occurs, affected nodes are not upgraded, the command that was sent
fails, and messages similar to the following might appear on the console:
ERROR Client connected from an unprivileged port number 50230.
Refusing the connection
[Errno 54] RPC session disconnected

Note

You can avoid this issue by performing a simultaneous upgrade. If you encounter
this issue, see article 198906 on the EMC Online Support Site.

143408

If you initiate a simultaneous upgrade through the OneFS web administration
interface, OneFS incorrectly reports that a rolling upgrade is occurring through the
following message:
A rolling upgrade is currently in progress. Some changes to
configuration may be disallowed.

133409

When running the sudo isi update command, you might encounter warnings
that the cluster contains unresolved critical events, that certain drives are ready to
be replaced, or that devices in the carrier boards are not supported boot disks. You
can disregard these messages because they have no adverse effects.

131929


After a rolling upgrade is complete, the OneFS web administration interface might
report that a rolling upgrade is still in progress.
Workaround: Restart the rolling upgrade.
For more information, see article 186845 on the EMC Online Support site.

126799

If a node is rebooted during a rolling upgrade, and the node fails, the upgrade
process might continue to run indefinitely, even after all other nodes have been
upgraded.

125320

If Collect or MultiScan jobs are in progress when either a rolling upgrade or cluster
reboot is initiated, the job will fail instead of being cancelled.

Note

If the Collect or MultiScan jobs continue to fail after the rolling upgrade is
complete, it is unlikely that the failure was caused by this issue.

123903

During a rolling upgrade, if you are logged in to a node that has not been upgraded
yet, and you view job types, the system displays several disabled job types with IDs
of AVScan.
These job types are new to OneFS 7.1.1 and have been mislabeled during the
rolling upgrade process. The IDs of the job types will resolve to the correct IDs after
the rolling upgrade is complete.

123842

Jobs that are running when a OneFS upgrade is started might not continue running
after the upgrade completes.
Workaround: Cancel all running jobs before upgrading or manually restart jobs that
did not restart automatically following the upgrade.

98341

If an upgrade job is started on a cluster containing a node with a degraded boot
drive, the upgrade engine crashes on initialization, preventing the upgrade from
proceeding.
For more information, see article 88746 on the EMC Online Support site.

98072

Virtual plug-ins
Virtual plug-ins known issues

ID

Adding an Isilon vendor provider might fail when you enable VASA support.
Additionally, the VASA information that appears in vCenter might be incorrect.
These issues can occur if you create a data store or virtual machine through the
VMware vSphere PowerCLI.
Workaround: You can resolve this issue by creating data stores through either the
VMware vCenter graphical user interface or the VMware ESXi command-line
interface.

97735


CHAPTER 9
OneFS Release Resources

Sources for information about and help with the OneFS operating system.
OneFS information and documentation............................................................... 166
Functional areas in the OneFS release notes........................................................167
Where to go for support.......................................................................................171
Provide feedback about this document............................................................... 171


OneFS information and documentation


EMC Isilon channels
You can access OneFS information through the following channels.
Channel

Description

EMC Isilon OneFS Product Page

Visit the EMC Isilon OneFS product page on the EMC Online Support site to
download Isilon product documentation and current software releases.

Help on This Page

Select Help on this Page from the Help menu in the OneFS web
administration interface to see information from the OneFS Web
Administration Guide and the OneFS Event Reference. The Help on This Page
option does not require internet connectivity.

Online Help

Select Online Help from the Help menu in the OneFS web administration
interface to see information from the OneFS Web Administration Guide and the
OneFS Event Reference. The Online Help contains the latest available versions
for these guides. The Online Help option requires internet connectivity.

ISI Knowledge

You can visit the ISI Knowledge blog weekly for highlights and links to Isilon
support content, including announcements of content availability, product tips,
and information about new ID.TV videos.

EMC Isilon YouTube playlist

You can visit the EMC Isilon YouTube playlist on the EMC Corporate YouTube
channel for Isilon how-to videos, information about new features,
information about Isilon hardware, and technical overviews.

Available documentation
OneFS documentation is available across the following channels.


Document

Channel

OneFS 7.2.0 Release Notes


Information about new features, operational changes, enhancements, and
known issues for OneFS 7.2.0.

EMC Online Support

OneFS 7.2 Web Administration Guide

Information about the OneFS web administration interface, which enables
you to manage an Isilon cluster outside of the command line interface or
LCD panel.

EMC Online Support

Help on this page

Online Help

OneFS 7.2 CLI Administration Guide

Information about the OneFS command-line interface (CLI), which includes
commands that enable you to manage an Isilon cluster outside of the web
administration interface or LCD panel.

EMC Online Support

OneFS 7.2 Event Reference

Information about how to monitor the health and performance of your EMC
Isilon cluster through OneFS event notifications.

EMC Online Support

Help on this page

Online Help

OneFS 7.2 Backup and Recovery Guide

Information about backup and recovery procedures with NDMP and SyncIQ.

EMC Online Support

OneFS 7.2 API Reference

Information about how to access cluster configuration, management, and
monitoring functionality, and also how to access directories and files on the
file system through an HTTP-based interface.

EMC Online Support

OneFS 7.2 Security Configuration Guide
Information about the security features in OneFS.

EMC Online Support

OneFS Site Preparation and Planning Guide


Information for system administrators and facility managers about how to
plan and implement an Isilon cluster in an optimal data center
environment.

EMC Online Support

OneFS Upgrade Planning and Process Guide


Information that users should take into account when deciding to upgrade
the OneFS operating system and information about tasks that users should
perform to prepare the cluster for the upgrade.

EMC Online Support

OneFS CLI Mappings

Command syntax changes that were implemented between OneFS versions.

EMC Online Support

OneFS 7.2 Upgrade Readiness Checklist

A checklist to help users ensure that their cluster is ready to upgrade to
OneFS 7.2.

EMC Online Support

OneFS 7.2 Migration Tools Guide


Information about how to migrate data to an Isilon cluster through OneFS
migration tools.

EMC Online Support

OneFS 7.2 iSCSI Administration Guide


Information about how to manage block storage on an Isilon cluster
through the OneFS iSCSI software module.

EMC Online Support

OneFS 7.2 Swift Technote


Information about how to store content and metadata as objects on a
cluster through RESTful APIs.

EMC Online Support

Functional areas in the OneFS release notes


This section contains a list of the functional areas that are used to categorize content in
the OneFS release notes and descriptions of the types of content that each category
contains.
Antivirus
This functional area is used to categorize new features, changes, and issues that
affect the way OneFS interacts with antivirus software.


Authentication
This functional area is used to categorize new features, changes, and issues that
affect authentication on the cluster. This includes, but is not limited to:
- Access control lists (ACLs)
- LDAP
- NIS
- Role-based access control (RBAC)

Backup, recovery, and snapshots
This functional area is used to categorize new features, changes, and issues that
affect backup, recovery, and snapshots. This includes, but is not limited to:
- NDMP
- Snapshots
- SyncIQ
- Symantec NetBackup

Cluster configuration
This functional area is used to categorize new features, changes, and issues that
affect cluster configuration. This includes, but is not limited to:
- CloudPools
- Licensing
- NTP
- OneFS registry (gconfig)
- SmartPools

Command-line interface
This functional area is used to categorize new features, changes, and issues that
affect the OneFS command-line interface.
Diagnostic tools
This functional area is used to categorize new features, changes, and issues that
affect tools that are used to research and diagnose cluster issues. This includes, but
is not limited to:
- EMC Secure Remote Services (ESRS)
- Gather info (isi_gather_info)
- Help in the OneFS web administration interface

Events, alerts, and cluster monitoring
This functional area is used to categorize new features, changes, and issues that
affect utilities that are used to detect and record system events and utilities that
are used to monitor cluster health and general statistics. This includes, but is not
limited to:
- Alerts
- Protocol auditing
- Cluster event log (CELOG)
- File system analytics (FSA)
- Onsite Verification Test (OVT)
- Simple Network Management Protocol (SNMP)
- Statistics
- Status

File system
This functional area is used to categorize new features, changes, and issues that
affect the OneFS file system. This includes, but is not limited to:
- Cluster group management
- File system coalescer
- File system events (not CELOG)
- FreeBSD
- L3 cache
- MCP
- Network Lock Manager (NLM)
- OneFS Kernel

File transfer
This functional area is used to categorize new features, changes, and issues that
affect FTP and HTTP connections to the cluster.
Hardware
This functional area is used to categorize new features, changes, and issues that
affect Isilon hardware in a OneFS cluster.
HDFS
This functional area is used to categorize new features, changes, and issues that
affect the HDFS protocol.
iSCSI
This functional area is used to categorize new features, changes, and issues that
affect the iSCSI protocol and iSCSI devices connected to a OneFS cluster.
Note: Support for the iSCSI protocol is deprecated in this version of OneFS.

Job engine
This functional area is used to categorize new features, changes, and issues that
affect the OneFS job engine and deduplication in OneFS.
Migration
This functional area is used to categorize new features, changes, and issues that
affect migration of data from a NAS array or a OneFS cluster to a OneFS cluster
through the isi_vol_copy utility or the isi_vol_copy_vnx utility.
Networking
This functional area is used to categorize new features, changes, and issues that
affect the OneFS external network and the OneFS back-end network. This includes,
but is not limited to:
- Fibre Channel
- Flexnet
- InfiniBand
- SmartConnect
- TCP/IP

NFS
This functional area is used to categorize new features, changes, and issues that
affect NFS connections to the cluster.
OneFS API
This functional area is used to categorize new features, changes, and issues that
affect the OneFS Platform API and Swift.
OneFS web administration interface
This functional area is used to categorize new features, changes, and issues that
affect the web administration interface.
Performance
This functional area is used to categorize new features, changes, and issues that
affect cluster performance.
Security
This functional area is used to categorize new features, changes, and issues that are
related to security fixes and vulnerabilities.
Security Profiles
This functional area is used to categorize new features, changes, and issues that
affect hardened profiles such as the Security Technical Implementation Guides (STIGs).
SmartQuotas
This functional area is used to categorize new features, changes, and issues that
affect SmartQuotas.
SMB
This functional area is used to categorize new features, changes, and issues that
affect SMB connections to the cluster.
Upgrade and installation
This functional area is used to categorize new features, changes, and issues that
affect OneFS upgrades, installation of OneFS patches, and the reformatting and
reimaging of Isilon nodes by using a USB flash drive.
Virtual plug-ins
This functional area is used to categorize new features, changes, and issues that
affect virtual plug-ins. This includes, but is not limited to:
- Isilon for vCenter
- OneFS Simulator
- Storage Replication Adapter (SRA)
- vStorage APIs for Array Integration (VAAI)
- VMware vSphere API for Storage Awareness (VASA)

vOneFS
This functional area is used to categorize new features, changes, and issues that
affect vOneFS.

Where to go for support

You can contact EMC Isilon Technical Support for any questions about EMC Isilon
products.

Online Support
- Live Chat
- Create a Service Request

Telephone Support
- United States: 1-800-SVC-4EMC (800-782-4362)
- Canada: 800-543-4782
- Worldwide: +1-508-497-7901
- For local phone numbers in your country, see EMC Customer Support Centers.

Help with online support
For questions specific to EMC Online Support registration or access, email
support@emc.com.

Provide feedback about this document

We value your feedback. Please let us know how we can improve this document.
- Take the survey at http://bit.ly/isi-docfeedback.
- Send your comments or suggestions to docfeedback@isilon.com.
