
IBM Spectrum Scale

Protocols & Multisite

Ash Mate
WW Senior Solutions Architect
mate@us.ibm.com

Spectrum Scale / ESS Solution & Architecture

© 2016 IBM Corporation


Agenda

• What is IBM Spectrum Scale?


• Spectrum Scale Deployment Models
• Spectrum Scale Protocols
• Spectrum Scale Protocols Authentication
• Spectrum Scale Use Cases
• Spectrum Scale Multi-Site
• Spectrum Scale AFM
• Case Studies



Terminology
• GPFS: General Parallel File System
• CES: Cluster Export Service (a.k.a. Protocol Node)
• AFM: Active File Manager (a.k.a. Active Cloud Engine - ACE)
• NFS: Network File System Protocol (Unix)
• SMB: Server Message Block Protocol (Windows) a.k.a. CIFS Common Internet File System Protocol
• Openstack Swift: Open Source Cloud Computing Platform Object Store & Access Protocol
• S3: Amazon Simple Storage Service - Object Store & Protocol (Openstack has an emulation Swift3)
• REST: REpresentational State Transfer API
• AD: Active Directory
• LDAP: Lightweight Directory Access Protocol
• POSIX: Portable Operating System Interface for UniX
• HDFS: Hadoop Distributed File System
• ESS: Elastic Storage Server
• SONAS: Scale Out Network Attached Storage
What is IBM Spectrum Scale?
What it is – Why it matters
• High performance, parallel file system – Parallelism removes data bottlenecks
• POSIX compliant – Wide range of applications supported
• Single global namespace for all data – Easy data management at scale
• Data placement and migration policies – Automated, transparent data tiering
• Shared disk access on commodity HW – Low cost
• Multi-site support – Global collaboration
• Multi-protocol support (NAS, Object, GPFS) – Unified scale-out data ‘lake’
• Data availability, integrity and security – End-to-end checksum, encryption
• Spectrum Scale native RAID – Fast rebuild and increased fault tolerance
Spectrum Scale Deployment Models

Note: SMB, NFS, Object Store Interfaces supported through Cluster Export Services
Simple Cluster Model Overview
Single Cluster / Single Filesystem*

 All Network Shared Disk (NSD) servers export NSDs to all the clients in active-active mode; clients do real-time parallel I/O to all the NSD servers and storage volumes/NSDs
 GPFS stripes files across NSD servers and NSDs in units of the file-system block size
 The NSD client communicates with all the servers, so all clients can access all data in parallel
 File-system load is spread evenly across all the servers and NSDs – no hot spots
 No single-server bottleneck
 Access to data can be shared via NFS, SMB and Swift/S3 – File & Object protocols exported by CES protocol nodes for clients without the GPFS client
 Easy to scale while keeping the architecture balanced
[Diagram: application/NSD clients and CES protocol nodes (NFS, CIFS, S3) connect over a TCP/IP or InfiniBand network to NSD servers – inside an ESS or other optional NSD servers – backed by heterogeneous storage; scalable capacity & performance, optimized heterogeneous resources]
*or can be multiple filesystems if desired


Spectrum Scale Protocols

• Protocol support added in v4.1.1
– In Standard & Advanced Editions, via an additional license
– File access via NFS (Ganesha) & SMB/CIFS (Samba)
– Object access via OpenStack Swift & S3
• Benefit: the ability to create NFS exports, SMB shares, and OpenStack Swift containers backed by data in GPFS file systems, for access by client systems that do not run GPFS – effectively a SONAS replacement.
• Unified File & Object support added in v4.2
– Provides in-place access to data in specific filesets from both file & object protocols
– A file created from the POSIX / NFS / SMB protocols can be accessed as an object using Swift / S3
– An object created from the Swift / S3 protocols can be accessed as a file using POSIX / NFS / SMB
– Support for analytics on objects via the Hadoop connector without data copy


Spectrum Scale Use Cases

[Diagram: a Spectrum Scale server with a single namespace serving General Purpose File (NFS, SMB, POSIX), Big Data & Analytics (GPFS Hadoop connector), and Cloud Computing (Cinder block, Swift object) workloads]
• Enterprise storage on industry-standard hardware
• Data lake with a single namespace
• Linear capacity & performance scale-out

UFO Support – Unified File (POSIX, NFS, CIFS) and Object (Swift, S3) with Integrated Analytics (HDFS)
Spectrum Scale Protocol Nodes

• Protocol (CES) Nodes


– Provide a secure way for users from different clients to get File & Object protocol access to
data stored on a Spectrum Scale cluster or ESS.
– Two Protocol (CES) Nodes recommended for HA.
– Need to have GPFS server license designations (v4.1.1 or later).
– Nodes in a cluster can be designated as Protocol Nodes using
• mmchnode --ces-enable (see the sketch after this list)
– Should be configured with "external" network addresses that will be used to access the
protocol artifacts (shares/exports/containers) from clients.
– All the protocol nodes have to be running the Red Hat Enterprise Linux operating system v7 or
later, and the protocol nodes must be all Power® (in big endian mode) or all Intel (although the
other nodes in the GPFS cluster could be on other platforms and operating systems).
– Protocol Nodes or Cluster Export Services (CES) infrastructure support the integration of the
NFS, SMB and object servers.
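A hedged sketch of designating protocol nodes and assigning CES addresses (node names and the address below are placeholders, not values from this deck):

mmchnode --ces-enable -N prt01,prt02    # designate two nodes as protocol (CES) nodes
mmces address add --ces-ip 192.0.2.10   # add an external CES IP that clients will use
mmces node list                         # verify the CES node designation
mmces address list                      # verify address assignment and distribution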
Spectrum Scale Protocols Versions

• NFS Server supports NFS v3 and the mandatory features in NFS v4.0.
• The SMB server supports SMB 2, SMB 2.1, and the mandatory features of SMB 3.0.
• Object server supports the Kilo release of Openstack Swift along with Keystone v3.
• The CES infrastructure is responsible for
– managing the setup for high-availability clustering used by the protocols
– monitoring the health of these protocols on the protocol nodes and raising events/alerts in the
event of failures
– managing the addresses used for accessing these protocols including failover and failback of
these addresses because of protocol node failures
• For more details see the "Implementing Cluster Export Services" chapter of the IBM
Spectrum Scale: Advanced Administration Guide.



Spectrum Scale Protocols Details
• Considering Protocols
– SMB:
• Limit of 20K total connections
• GPFS/Spectrum Scale supports SMB2 and SMB3.
• RHEL6 only supports SMB1, not SMB2 or SMB3
• RHEL7 has code for SMB2 support, but it is not officially supported by Red Hat
• Limitation of ACL management with LDAP authentication
• Open Directory is not currently tested
– NFS:
• No limit of total connections
• Not all clients support NFS – for example, Windows 7 and Windows 8
– Object:
• Openstack Kilo Release Swift & S3
• Unified File and Object Access



Unified File & Object Access
• Accessing objects using file interfaces (SMB/NFS/POSIX) and accessing files using object interfaces (REST) helps legacy applications designed for file access to start integrating seamlessly into the object world.
• Cloud data stored as objects can be accessed as files by applications designed to process files.
• Multi-protocol access for file and object in the same namespace allows supporting and hosting data oceans of different types with multiple access options.
• A rich set of placement policies for files (using mmapplypolicy) is available with IBM Spectrum Scale™. With unified file and object access, those placement policies can be leveraged for object data.
• To analyze large amounts of data, advanced analytics systems are used. However, porting and copying the data from an object store to the distributed file system that the analytics system requires is complex and time-intensive. For these scenarios, the object data needs to be accessible through a file interface so that analytics systems can use it directly; Unified File and Object Access adds value in this scenario.


IBM Spectrum Scale Protocol Node - Planning
• Download the Advanced Protocols code image for install
• A GPFS cluster and file system are required
• CCR (Clustered Configuration Repository) is enabled (check via the mmlscluster command)
• Requires RHEL 7.0 or higher
• Passwordless ssh access between all nodes within the cluster
• All nodes must be pingable via IP, hostname.
• Reverse DNS lookup is in place
• Protocol nodes cannot be used to serve remote mounted file systems.
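A quick pre-flight check along these lines can confirm the items above (the host name and address are illustrative):

mmlscluster            # cluster is healthy and the repository type shows CCR
ssh protonode1 date    # passwordless ssh returns without prompting
ping -c 1 protonode1   # node reachable by hostname
host 10.0.6.101        # reverse DNS lookup resolves the IP back to a name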



IBM Spectrum Scale Protocol Node - Sizing
• Minimum Hardware Configuration : 1 CPU socket Intel / Power, 64GB Memory,
10Gb Ethernet
• NFS / Object only : Minimum Configuration will suffice
• SMB or Mixed Protocols: add 1 CPU socket and at least 64GB more memory
• Network ports to meet your availability and throughput needs
• At least 1 Protocol Node but 2 recommended for HA
• Protocol Node Sizing Tool
– Download from IBM DeveloperWorks: Protocol Node Sizing Tool
– An interactive Excel spreadsheet that takes workload input and produces a sizing



IBM Spectrum Scale Protocol Node - Number
• By number of connections
– A maximum of 3,000 SMB connections per protocol node with a maximum of 20,000 SMB connections per cluster.
– A maximum of 4,000 NFS connections per protocol node is recommended.
– A maximum 875 concurrent Swift Object user requests per protocol node is recommended. See Object Sizing Details.
• By throughput (see the worked example after this list)
– Base the number of protocol nodes on available network bandwidth: take the total throughput required and divide by the number of Ethernet ports available in each server (as a guideline, expect roughly 800 MiB/s of sequential throughput per 10 Gbit port).
– Since the object protocol is stateless and does not hold persistent connections, consider the bandwidth and I/O capabilities of the server and network.
• Review and ensure you have not exceeded any maximums
– If you are using SMB in any combination of other protocols you can configure up to 16 protocol nodes (Hard limit)
– The recommended limit for number of protocol nodes is 16 if you enable Object.
– If only NFS is enabled you can have 32 protocol nodes.
– Once you have determined the number of protocol nodes you need, consider adding additional nodes to maintain full
performance even when a node is down
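A worked example of the throughput-based sizing above (the workload figures are invented for illustration):

# required aggregate throughput: 8 GiB/s = 8192 MiB/s
# guideline of ~800 MiB/s per 10 Gbit port  ->  8192 / 800 ≈ 11 ports
# with 4 x 10 GbE ports per protocol node   ->  ceil(11 / 4) = 3 nodes
# add one node to keep full performance when a node is down -> 4 protocol nodes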



Spectrum Scale Protocols Authentication
• Authentication and ID mapping
• You can configure NFS & SMB services to authenticate against the most popular
authentication services such as Microsoft Active Directory (AD) and LDAP.
• Keystone V3 internal & external (AD/LDAP) is supported for Object Access
• Mapping Microsoft security identifiers (SIDs) to the POSIX user and group IDs on the
file server can either be done automatically or by using the external ID mapping
service like RFC 2307.
• If none of the offered authentication and mapping schemes match the environmental
requirements, the option to establish a user-defined configuration is available. The
mmuserauth service create command can be used to set up all authentication-
related settings.
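For example, a hedged sketch of configuring file-protocol authentication against an LDAP server (the server name and DNs are placeholders, not values from this deck):

mmuserauth service create --type ldap --data-access-method file --servers ldapserver.example.com --base-dn dc=example,dc=com --user-name cn=manager,dc=example,dc=com
mmuserauth service list    # review the resulting authentication configuration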



File access Authentication
• The following steps are involved in user authentication for file access:
– The user tries to connect to the IBM Spectrum Scale system by using their credentials.
– The IBM Spectrum Scale system contacts the authentication server to validate the user.
– The IBM Spectrum Scale system contacts the ID map server, which provides the UIDs and GIDs of the user and user group to verify the identity of the user.
– If the user credentials are valid, the user gains access to the system.
• ID Mapping – associates the authenticated user or group of users with their unique identifiers (UIDs and GIDs)
– External ID mapping methods
• RFC 2307 when AD-based authentication is used
• LDAP when LDAP-based authentication is used
– Internal ID mapping method
• Automatic ID mapping when AD-based authentication is used



File access Authentication (cont.)
• Other supported authentication
– Netgroups: Spectrum Scale system supports only the netgroups that are stored in NIS and in
Lightweight Directory Access Protocol (LDAP).
– Kerberos: Kerberos is a network authentication protocol that provides secured communication
by ensuring passwords are not sent over the network to the system. The system supports
Kerberos with both AD and LDAP-based authentication.
– Transport Layer Security (TLS): The TLS protocol is primarily used to increase the security and
integrity of data that is sent over the network.



Spectrum Scale Protocols For The Cloud
• Flexible Shared Block, File and Object Storage –
– Spectrum Scale offers block storage for your OpenStack Cinder volumes; supports file access protocols such as POSIX, SMB and NFS and the object access protocols OpenStack Swift and Swift3 (S3 API emulation); and provides HDFS access for analytics.
– Multi-protocol support allows applications to use the right protocol for the job. VMs can access
their Cinder volumes via NFS or POSIX from the hypervisor and then use one or more of the
protocols to access data from inside the VM itself.
– In addition, Manila can use NFS for multi-tenant access to file data. But most importantly,
multi-protocol access allows Spectrum Scale to be used for much more than just providing
storage for VMs. An entire organization can use Spectrum Scale for most if not all of its
storage needs, no matter what they may be.



Use Case: Spectrum Scale as the Enterprise Storage Layer for OpenStack

[Diagram: Nova/Glance (compute) uses the Spectrum Scale placement driver, Cinder (volumes) uses the Spectrum Scale volume driver, and Swift (objects) uses the Spectrum Scale object driver – all on Spectrum Scale, a reliable, scalable, POSIX-compliant enterprise file system that stores compute images, volumes and objects]

 Leverages Spectrum Scale as a common storage layer for images, volumes and objects
– Avoids data copy
– Local access of data
 Adds enterprise storage management features to OpenStack
– Rapid volume and virtual machine provisioning using file system level copy-on-write function
– Scale-out I/O performance through support for shared-nothing clusters
– Resilient volumes through transparent pipeline replication


Use case – Enabling “In-Place” analytics for an Object data repository
• Analytics on a traditional object store: data must be explicitly moved from the object store to the dedicated analytics cluster, the analysis is done there, and the results are copied back to the object store for publishing. (Source: https://aws.amazon.com/elasticmapreduce/)
• Analytics on a Spectrum Scale object store with Unified File and Object Access: data ingested as objects (http) into <SOF_Fileset>/<Device> is available as files in the same fileset; the Spectrum Scale Hadoop connectors allow the data to be leveraged directly for analytics, and results are published as objects with no data movement – in-place, immediate data analytics.
Spectrum Scale in the Cloud: Deploy as part of a complete hybrid or public cloud implementation

[Diagram: client and ISV applications run on Platform LSF / Platform Symphony (SaaS) as part of the Platform Computing Cloud Service, with workload & data on Spectrum Scale on Cloud running on SoftLayer (an IBM company) bare-metal infrastructure with 24x7 cloud ops support; the same applications can also run on-premise on Platform LSF or Platform Symphony clusters with Spectrum Scale on local infrastructure]

• Ready-to-use, high-performance cluster in the cloud, alongside an on-premise Platform LSF or Platform Symphony cluster or grid
• Spectrum Scale servers and storage are isolated inside each organization's private VLAN – no sharing, for maximum security
Additional Industry Use Cases
• Enterprise File Archive Solution: Data from various sources (NAS, DBs, wikis, etc.)
classified by StoredIQ ingested into Spectrum Scale via File Interface can be tiered
to Cleversafe on-prem active archive via Transparent Cloud Tiering
• Enterprise Content Management Solution: Data ingested into Scale via FileNet for tiering to Tape / Cloud.
• Medical Images Archive Solution: MRI, X-Ray, CT Scans ingested via Merge
PACS as files during active use can be later stored as immutable objects with
searchable meta-data.
• Broadcast Production and Archive Solution: Video data ingested as files during
active use can be later stored as tagged objects.
• CCTV Cam / Dash Cam / Body Cam Solution: Images & video data captured,
tagged and stored as evidence for long term retention & e-discovery.
• Genomic Research Solution: Genomic sequencer data stored for analysis & later retained for analytics & research.
Spectrum Scale Multi-Site
• GPFS™ allows users shared access to files in either the cluster where the file
system was created, or other GPFS clusters.
• Ability to access and mount GPFS file systems owned by other clusters in a network
of sufficient bandwidth
• Accomplished using the mmauth, mmremotecluster and mmremotefs commands.
• Each site in the network is managed as a separate cluster, while allowing shared file
system access.
• The cluster owning the file system is responsible for administering the file system and
granting access to other clusters on a per cluster basis.
• After access to a particular file system has been granted to nodes in another GPFS
cluster, the nodes can mount the file system and perform data operations as if the file
system were locally owned.



Global Sharing with Spectrum Scale
Remote Mount
 Allows multiple application groups to share/collaborate on portions of, or all, data
 Single copy of data shared across multiple sites
 I/O performance during remote file-system access is limited by the inter-cluster latency/bandwidth

Synchronous Replication
 File system metadata/data is always in sync across multiple failure domains
 No interruption to client I/O activity during a storage domain failure
 Automatic failover and seamless file-system recovery
 Write I/O performance is limited by inter-cluster latency/bandwidth
[Diagram: a single file system synchronized across two locations – Failure Domain #1 and Failure Domain #2 – connected by a TCP/IP or InfiniBand network, with file system-level metadata and data replication between the storage at each site]
Security considerations of multiple clusters
• Remote shell with ssh
– Commands to communicate across nodes in the cluster
– Specified using mmcrcluster or mmchcluster
– Default is rsh
• Remote cluster: subnet and firewall rules
– Connect multiple clusters within the same data center or across WAN
– Firewall settings (see the firewalld sketch after this list):
• Inbound/outbound: GPFS/1191 TCP (default port; can be changed via mmchconfig tscTcpPort)
• Inbound/outbound: SSH/22 TCP
• SELinux configuration with GPFS
– Need to disable automatic startup for setting the correct security
context
– Change of inode labels may not be visible on other nodes
– Once a file is labeled for SELinux and xattrs are added, TSM incremental backup will pick the file up for backup; data and metadata will be backed up again regardless of any other changes to the file
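On RHEL 7 protocol nodes running firewalld, the firewall rules above translate roughly to the following (a sketch, assuming the default GPFS port 1191 is kept):

firewall-cmd --permanent --add-port=1191/tcp   # GPFS daemon traffic (or the port set via mmchconfig tscTcpPort)
firewall-cmd --permanent --add-service=ssh     # remote shell between nodes/clusters
firewall-cmd --reload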



User access to a file system owned by another cluster
• In a cluster environment that has a single user identity namespace, all nodes have
user accounts set up in a uniform manner
• In multiple cluster environment, the uniform user account infrastructure might no
longer be valid
– IBM Spectrum Scale provides an interface that integrates into an existing user registry
infrastructure, such as the Globus Security Infrastructure (GSI) used by TeraGrid. The interface
is based on a set of user-supplied ID remapping helper functions (IRHF) for performing dynamic
UID/GID remapping.



Accessing file systems from another cluster
Users share access to files in the cluster where the file system was created or from other clusters
• Each node in a cluster that needs access to another cluster's file system must be able to open a TCP/IP connection to every node in the owning cluster
• Nodes in two separate remote clusters mounting the same file system are not required to be able to open a TCP/IP connection to each other
– If cluster A mounts a file system from cluster B, and cluster C wants to mount the same file system, the nodes in A and C do not have to communicate with each other
• To take advantage of fast networks and to use the nodes in Cluster 1 as NSD servers for Cluster 2 and Cluster 3, configure a subnet for each of the supported clusters



Mounting a file system owned and served by another cluster
• The following example summarizes the commands that the administrators of the two clusters need to issue so that the nodes in cluster2 can mount the remote file system fs1, owned by cluster1, assigning rfs1 as the local name with a mount point of /rfs1.
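A sketch of the typical command sequence (node and key-file names are illustrative; beforehand, each cluster generates its key with mmauth genkey and the two administrators exchange the public key files):

# On cluster1 (owning cluster): authorize cluster2 and grant it access to fs1
mmauth add cluster2 -k cluster2_id_rsa.pub
mmauth grant cluster2 -f fs1

# On cluster2 (accessing cluster): define the remote cluster and file system, then mount it
mmremotecluster add cluster1 -n node1_cluster1,node2_cluster1 -k cluster1_id_rsa.pub
mmremotefs add rfs1 -f fs1 -C cluster1 -T /rfs1
mmmount rfs1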



AFM Overview
• Active file management (AFM) uses a home-and-cache model in which a single
home provides the primary storage of data, and exported data is cached in a local
GPFS™ file system
• Users access files from the cache system
– For read requests, when the file is not yet cached, AFM retrieves the file from the home site
– For write requests, writes are allowed on the cache system and can be pushed back to the home system, depending
on the cache types
• AFM is primarily suited for remote caching, but it can also be used to migrate data:
– Configure the cache to retain a copy of every file in the home
– After the copy is completed, break the AFM home/cache relationship
– Transform the cache filesets into the normal data filesets for user access



Global Sharing with Spectrum Scale AFM
• Expands the GPFS global namespace across
geographical distances
• Caches local ‘copies’ of data distributed to one or more GPFS clusters
• Low latency ‘local’ read and write performance
• Automated namespace management
• As data is written or modified at one location, all other locations see that
same data

• Efficient data transfers over the Wide Area Network (WAN)
• Works with unreliable, high-latency connections
• Speeds data access to collaborators and resources around the world
[Diagram: multiple GPFS clusters around the world linked into one namespace by AFM]


AFM Caching Basics
• Sites – two sides for a cache relationship
– A single home cluster
• Presents a fileset that can be cached (export with NFS)
• Can be non-GPFS cluster/nodes
– One or more cache clusters
• Associates a local fileset with the home export
• AFM Fileset
– Independent fileset with per-inode AFM state kept in extended attributes (xattrs)
– Data is fetched into the fileset on access (or prefetched on command)
– Data written to the fileset is copied back to home
• Gateway Node (designation)
– Maintains an in-memory queue of pending operations
– Moves data between the cache and home clusters
– Monitors connectivity to home, switches to disconnected mode on outage, triggers recovery on failure



Cache Types and Migration
• 4 types of caches:
– Read Only (RO) – not updateable; only reflects home
– Local Update (LU) – updates are not pushed back to home
– Single Writer (SW) – only one cache is allowed to write, and its updates are pushed back to home; other caches of the same home may be in RO mode only
– Independent Writer (IW) – multiple caches may independently write to the home export; updates from one cache are pushed to the home export, and other IW caches of this export see these updates
• For migration purpose, configure the cache to retain every file at the home site.
– When data transfer is complete, AFM relationship of home/cache is removed and the cache filesets are
transformed to normal data fileset for user access
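A minimal sketch of the cache-side setup, assuming the home cluster exports /gpfs/fshome/data over NFS from a host called home01 (all names below are placeholders):

mmchnode --gateway -N gw01                             # designate a gateway node on the cache cluster
mmcrfileset fscache cache1 --inode-space new -p afmTarget=home01:/gpfs/fshome/data -p afmMode=single-writer
mmlinkfileset fscache cache1 -J /gpfs/fscache/cache1   # link the cache fileset into the namespace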



Yahoo in Japan
Supporting Global Application with Spectrum Scale –
Cache in Japan. Home in USA.
Internet Company in Japan has multiple sites with around 11 TB data being generated per
day per site by the end application, but these sites have limited disk capacity. They want a
Japan to US archive with high data transfer rate (10 TB/hr) & petabyte class archive system
(initial 15 PB + 5 PB/yr). They also want fail-over & fail-back between the Japan & US sites
for business continuity.

Choice: Spectrum Scale offered the only object-storage solution with ability to use Active File
management (AFM) for object movement between Japan & US sites. Lab Services
collaborated with Development & Research Lab to design a feasible solution and performed
Proof of Concept (PoC) prototyping and benchmarking with IBM Spectrum Scale 4.2, AFM
Caching/Parallel IO, and Protocol Server (Object). IBM met client requirements by combining
multiple IBM Spectrum Scale functions.

Solution:
Phase 1 is implementing Spectrum Scale Object with AFM for 2 cache sites in Japan with 1
home site in US. In phase 2 multiple cache sites will be supported with 1 home and in phase
3 multiple cache sites with multiple home sites will be supported.

Benefits: The Spectrum Scale with AFM Parallel IO solution provides the required high data transfer rates and the ability to expand capacity up to tens of petabytes of objects behind a single endpoint.



A Smarter Storage Approach
The IBM Integrated Storage Portfolio

Thank you!
For more information:
Website: http://www-03.ibm.com/systems/storage/spectrum/index.html



Backup

Spectrum Scale / ESS Solution & Architecture



DATA LIFECYCLE MGMT

Policy-based File Management examples


– Placement policies, evaluated at file creation; example:
• rule 'home' set pool 'gold' for fileset 'home_bosslady'

– Migration policies, evaluated periodically:
• rule 'hsm' migrate from pool 'sata' threshold(90,85) weight(current_timestamp - access_time) to pool 'hsm' where file_size > 1024kb
• rule 'cleansilver' when day_of_week()=Monday migrate from pool 'silver' to pool 'bronze' where access_age > 30 days

– Deletion policies, evaluated periodically:
• rule 'purgebronze' when day_of_month()=1 delete from pool 'bronze' where access_age > 365 days

– List rules generate reports on file system contents:
• rule 'listall' list 'all-files' SHOW(varchar(file_size) || ' ' || varchar(user_id) || ' ' || fileset_name)
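A hedged usage sketch (policy file names are placeholders): placement rules are installed with mmchpolicy, while migration, deletion and list rules are evaluated with mmapplypolicy.

mmchpolicy fs1 placement.pol                  # install placement rules for file system fs1
mmapplypolicy fs1 -P lifecycle.pol -I test    # dry run: report what the rules would do
mmapplypolicy fs1 -P lifecycle.pol -I yes     # apply the migration/deletion/list rules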



Spectrum Scale – The Complete Data Management Solution for Enterprise environments
Single Worldwide Name Space

[Diagram: AFM cache sites and an FPO (shared-nothing) cluster feed a central Spectrum Scale / ESS cluster accessed via POSIX, NFS, SMB/CIFS and OpenStack, with data placed across Flash, fast disk and slow disk and tiered out to TSM / LTFS / HPSS]

A single file-system technology serves home, high-speed scratch, and analytics workloads, and is globally available to a worldwide user community.


Filesystem Layout (Traditional vs Unified File and Object Access)
• One of the key advantages of unified file and object access is the placement and naming of objects when they are stored on the file system. Unified file and object access stores objects following the same path hierarchy as the object's URL.
• In contrast, the default object implementation stores the object following the mapping given by the ring, and its final file path cannot easily be determined by the user.
• Example – ingesting the object URL https://swift.example.com/v1/acct/cont/a.jpg:
– Traditional Swift layout: ibm/gpfs0/object_fileset/o/z1device108/objects/7551/125/75fc66179f12dc513580a239e92c3125/ (hashed path holding a.jpg)
– Unified File and Object Access layout: ibm/gpfs0/<Sof_policy_fileset>/<device>/AUTH_acctID/cont/a.jpg


Easy Access Of Objects as Files via supported File Interfaces (NFS/SMB/POSIX)
• Objects ingested are available immediately for file access via the 3 supported file protocols.
• ID management modes (explained later) give flexibility in assigning/retaining owners, as generally required by file protocols.
• Object authorization semantics are used during object access and file authorization semantics are used during file access of the same data – thus ensuring compatibility of object and file applications.
[Diagram: (1) data is ingested as objects over http into <SOF_Fileset>/<Device> in the Spectrum Scale filesystem; (2) the same objects are accessed as files – file exports are created at the container level (<AUTH_account_ID>/<Container>), or POSIX access is given from the container level]


Objectization – Making Files into Objects (Accessing a File via the Object interface)
• Spectrum Scale 4.2 ships a system service called ibmobjectizer that is responsible for objectization.
• Objectization is the process that converts files ingested from the file interface into a unified file and object access enabled container path so that they become available from the object interface.
• When new files are added from the file interface, they need to be made visible to the Swift database to show correct container listings and container or account statistics.
[Diagram: (1) data is ingested as files over NFS/SMB/POSIX into the unified file and object fileset in the Spectrum Scale filesystem; (2) the ibmobjectizer service objectizes them; (3) the files are then accessed as objects over http]


Unified File and Object Access – Policy Integration for Flexibility

This feature is specifically made available as an “object storage policy”, which gives the following advantages:
• Flexibility for the administrator to manage unified file and object access separately
• Allows coexistence with traditional object and other policies
• Multiple unified file and object access policies can be created, varying based on the underlying storage
• Since policies are applied per container, the end user has the flexibility to create certain containers with a Unified File and Object Access policy and others without it.

• Example: mmobj policy create SwiftOnFileFS --enable-file-access


IBM Spectrum Scale Protocol Node-Installation (1 of 2)
• Install GPFS packages: rpm -ivh gpfs*
• Verify the nodes in the cluster are healthy and CCR is enabled: mmlscluster
• Verify pre-req packages: mmbuildgpl
• Create the cesSharedRoot file system (~10 GB)
• Mount the cesSharedRoot file system: mmstartup and mmmount cesSharedRoot -a
• Add node to the cluster: mmaddnode -N <protocolnode> and mmchlicense server -N <protocolnode>
• Enable CES on protocol nodes: mmchnode --ces-enable -N <protocolnode>
• Change cesSharedRoot config: mmchconfig cesSharedRoot=</gpfs/cesSharedRoot>
• Run the spectrumscale install/deploy script:
– ./spectrumscale setup -s <nsdserverip 10.0.6.100> ; ./spectrumscale node add <protocolnode> -a
– ./spectrumscale node add aus-gpfs -p
– ./spectrumscale config protocols -e <publicip 192.168.56.123>
– ./spectrumscale config protocols -f cesSharedRoot -m /gpfs/cesSharedRoot
– ./spectrumscale enable smb; ./spectrumscale enable nfs
– ./spectrumscale node list
– Check file system for export: mmlsfs all -D -k    # verify -D and -k are set to nfs4


IBM Spectrum Scale Protocol Node-Installation (2 of 2)
– Set authentication method:
mmuserauth service create --type ad --data-access-method file --netbios-name <abc> --user-name <admin> --idmap-role master --servers <xyg.xxxx.com> --idmap-range-size 10000 --idmap-range 10000-650000000

– Export SMB and NFS:
mkdir /ibm/fs1/smb_export1
chown "DOMAIN\\USER" /ibm/fs1/smb_export1
mmsmb export add smb_export1 /ibm/fs1/smb_export1 --option "browseable=yes"
mmsmb export list
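For the NFS side, a hedged sketch of a matching export (the client specification and options are illustrative):

mmnfs export add /ibm/fs1/nfs_export1 --client "*(Access_Type=RW,Squash=NO_ROOT_SQUASH)"
mmnfs export list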



IBM Spectrum Scale Protocol Node-Deployment
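Protocol deployment is driven by the spectrumscale installation toolkit; a hedged sketch of the typical invocation (prechecks first, then the run):

./spectrumscale install --precheck   # validate the cluster definition built with the commands above
./spectrumscale install              # install GPFS and create the defined NSDs/file systems
./spectrumscale deploy --precheck    # validate the protocol configuration
./spectrumscale deploy               # deploy the SMB/NFS/Object services onto the protocol nodes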



IBM Spectrum Scale Protocol Node-Configuration



Managing remote access to file systems
• List all clusters authorized to mount file systems owned by cluster1. On cluster1:
#mmauth show
• Authorize a third cluster, cluster3, to access file systems on cluster1:
#mmauth add cluster3 -k cluster3_id_rsa.pub
#mmauth grant cluster3 -f <filesystem>
• Subsequently revoke cluster3's authorization:
#mmauth deny cluster3 -f <filesystem>
• Permanently delete the cluster authorization:
#mmauth delete cluster3



Other information about file systems accessed by nodes from
another cluster
• A file system is administered only by the cluster where the file system was created. Other clusters may
be allowed to mount the file system, but their administrators cannot add or delete disks, or change
characteristics of the file system
• Since each cluster is managed independently, there is no automatic coordination and propagation of
changes between clusters
• If the names of the contact nodes change, the name of the cluster changes, or the public key file
changes, use the update option of the mmremotecluster command to reflect the changes.
• Use the show option of the mmremotecluster and mmremotefs commands to display the current
information about remote clusters and file systems.
• If the cluster that owns a file system has a maxblocksize configuration parameter that is different from the maxblocksize configuration parameter of the cluster that desires to mount the file system, a mismatch may occur and file system mount requests may fail. Use the mmlsconfig command to display the configuration information and correct any discrepancies with the mmchconfig command.



Options of Read-only Access Mount
• Option 1: mount option from the remote cluster
– Remote cluster mount with read-only access
ldap:~ # mmmount rfs2 -o ro
mmmount: Mounting file systems ...
– Check mount status on the ESS node
chi:~ # mmlsmount fs2 -L
File system fs2 is mounted on 3 nodes:
10.0.6.101 pok-gpfs chi-gpfs (internal mount)
10.0.6.100 chi-gpfs chi-gpfs (read-only mount)
10.0.6.104 ldap-gpfs extra.gpfs.cluster (read-only mount)

• Option 2: Grant read-only access to the file system
a. Grant read-only access on the ESS node
chi:~ # mmauth grant extra.gpfs.cluster -f /dev/fs2 -a ro
chi:~ # mmauth show all
Cluster name: extra.gpfs.cluster
Cipher list: AUTHONLY
SHA digest: b7d3bdb971b34a01733d44b6ae78b29ae8d31e6e330e4a0d2105a64caeb9d999
File system access: cesSharedRoot (rw, root allowed)
                    fs1 (rw, root allowed)
                    fs2 (ro, root allowed) <<<<<< read-only access granted

Cluster name: chi-gpfs (this cluster)
Cipher list: AUTHONLY
SHA digest: 967b9d28e659a916da27707a44a7ba98b9be911ac1339e2d543c61ac9c9a92c2
SHA digest (new): f08fc5fcebf2d0f646dd25a54803574e5cb2c9040c6211569a846033504b1db8
File system access: (all rw)

b. Remote cluster mount (no need to specify read-only access)
ldap:~ # mmmount rfs2
Fri Feb 12 14:33:24 EST 2016: mmmount: Mounting file systems ...
mount: /dev/rfs2 is write-protected, mounting read-only
c. Check mount status on the ESS node
chi:~ # mmlsmount fs2 -L
File system fs2 is mounted on 2 nodes:
10.0.6.101 pok-gpfs chi-gpfs (internal mount)
10.0.6.104 ldap-gpfs extra.gpfs.cluster (read-only mount)
Key Context and Assumptions for migration using AFM
• Both home and cache systems are configured with compatible authentication
methods
• File access is via protocol nodes over SMB or NFS
• The cache system has capacity and performance equal to or better than the home system's
• All DNS names and aliases that are part of the home system can be extended/modified to accommodate the cache system
• Special handling is needed for systems that are TSM- and/or HSM-managed
• During migration, data is accessed via either the cache or the home system, but not both simultaneously
• Prefetch time is affected by network latency, bandwidth, performance capability of the gateway, and the underlying disk speed



Data Transfer using AFM
• Data Transfer method is over NFS
• The new system needs to have one or more configured AFM gateway nodes with
high performance and high speed network
• Network IP routes and the data path from the gateway nodes to the home system should be on the same subnet and over at least 10 GbE links
• A sample script is provided to obtain a full list of files
• For each data set, take a snapshot of the source data and transfer the data
• At the end of the transfer, ensure that the data has been copied fully (e.g., using checksums)
• Repeat the transfer process with fresh snapshots



Migration steps using AFM
• Collect relevant configuration from the home system to be used in the cache system
• Set up the home system for migration phase using AFM
• Create Recovery points/snapshots in the home system
• Set up AFM cache filesets and exports on the cache system using Single Writer (SW) mode
• Disable cache eviction and set the prefetch threshold on each cache
• Create target exports to match the source exports
• Ensure authentication is compatible with the source environment
• Use AFM control tools to pull in all or hot data from the home fileset to the cache system
• Redirect users and applications to the new cache system
• Pull remaining updated data and verify data are migrated to cache
• Convert AFM cache to ordinary filesets (as necessary)

