Each block acts as an individual hard drive and is configured by the storage
administrator. These blocks are controlled by the server-based operating
system, and are generally accessed by Fibre Channel, iSCSI or Fibre
Channel over Ethernet protocols.
Because the volumes are treated as individual hard disks, block storage
works well for a wide variety of applications. File
systems and databases are common uses for block storage because they
require consistently high performance. Email servers such as Microsoft
Exchange use block storage in lieu of file- or network-based storage systems.
RAID arrays are a prime use case for block storage as well. With RAID,
multiple independent disks are combined for data protection and performance.
The ability of block storage to create individually controlled storage volumes
makes it a good fit for RAID.
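To make that pairing concrete, here is a minimal, hypothetical Python sketch of RAID 0-style striping across individually controlled block volumes. The class name, the in-memory "disks" and the 4 KiB block size are all illustrative assumptions, not any vendor's implementation:

```python
# Minimal sketch of RAID 0 striping over individually addressable block
# volumes. Device count and the 4 KiB block size are illustrative only.
BLOCK_SIZE = 4096

class StripedVolume:
    """Stripe logical blocks round-robin across member block devices."""

    def __init__(self, num_disks):
        self.disks = [dict() for _ in range(num_disks)]  # disk -> {index: data}

    def write_block(self, logical_block, data):
        disk = logical_block % len(self.disks)        # which member disk
        stripe = logical_block // len(self.disks)     # block index on that disk
        self.disks[disk][stripe] = data

    def read_block(self, logical_block):
        disk = logical_block % len(self.disks)
        stripe = logical_block // len(self.disks)
        return self.disks[disk][stripe]

vol = StripedVolume(num_disks=4)
vol.write_block(5, b"x" * BLOCK_SIZE)
```

A real array would add parity or mirroring for data protection; the point here is only that striping relies on each volume being independently addressable at the block level.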
Virtual machine file systems are another common use for block-level storage.
Virtualization vendors such as VMware support block storage protocols, which
can improve migration performance and improve scalability. Using a SAN for
block storage also aids virtual machine (VM) management, allowing for non-
standard SCSI commands to be written.
While there are benefits to using block storage, there are also alternatives that
may be better suited to certain organizations or uses. Two options stand out
when it comes to facing off with block-level storage: file storage and object
storage.
If simplicity is the goal, file storage may win out over block-level storage. But while
block storage devices tend to be more complex and expensive than file storage, they
also tend to be more flexible and provide better performance.
File storage provides a centralized, highly accessible location for files, and generally
comes at a lower cost than block storage. File storage uses metadata and directories to
organize files, which makes it a convenient option for an organization looking to
simply store large amounts of data.
The relatively easy deployment of file storage makes it a viable tool for data
protection, and the low costs and simple organization can be helpful for local
archiving. File sharing within an organization is another common use for file storage.
The simplicity of file storage can also be its downfall. While file storage has a
hierarchical organization, the more files that are added, the more difficult and
tedious it becomes to sift through them. If performance is the deciding factor, object
or block-level storage wins out over file storage.
Some products, such as Hewlett Packard Enterprise (HPE) 3PAR's File Persona
service, have converged file and block storage to provide the benefits of both
technologies.
Rather than splitting files into raw data blocks, object storage clumps data together as
one object that contains data and metadata. Blocks of storage do not contain metadata,
so in that regard object storage can provide more context about the data, which can be
helpful in classifying and customizing the files. Each object also has a unique
identifier, which makes quicker work of locating and retrieving objects from storage.
Block storage can be expanded, but object storage is unmatched when it comes to
scalability. Scaling out an object storage architecture only requires adding nodes to
the storage cluster.
The flexibility and scalability of object storage may be appealing, but some
organizations may choose to prioritize performance and choose file or block storage.
While block storage allows for editing incremental parts of a file, object stores must
be edited as one unit. If one part of an object needs to be edited, the entire object must
be accessed and updated, then rewritten, which can negatively affect performance.
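That read-modify-write cycle can be sketched in a few lines of Python. Everything here (the object key, the metadata field, the byte strings) is invented for illustration; the sketch shows only the contrast between updating one block in place and rewriting a whole object:

```python
# Illustrative contrast: a block volume can update one block in place,
# while an object store must rewrite the whole object.
blocks = [b"AAAA", b"BBBB", b"CCCC"]   # block device: per-block updates
blocks[1] = b"XXXX"                    # touch only the changed block

objects = {}                           # object store: whole-object updates

def put(key, data, **metadata):
    objects[key] = {"data": data, "meta": metadata}

def update(key, new_data):
    obj = objects[key]                 # read the entire object...
    put(key, new_data, **obj["meta"])  # ...and rewrite it in full

put("report", b"AAAABBBBCCCC", owner="alice")
update("report", b"AAAAXXXXCCCC")      # a 4-byte change rewrites all 12 bytes
```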
Both object and block-level storage are used in the enterprise, but object storage use
cases lean more toward scenarios dealing with large amounts of data, such as big data
storage and backup archives. Because of this, modern data storage environments such
as the cloud are arguably trending toward object-based storage over file and block
storage options. However, individual needs will always be the determining factor for
which form of storage is used.
Along with HPE, several larger and smaller storage vendors provide block storage.
The largest storage vendors are Dell EMC, HPE, Hitachi Vantara, IBM and NetApp.
Additional vendors include DataDirect Networks, Huawei, Infinidat, Kaminario,
Nutanix, Oracle, Pure Storage, Tintri and Western Digital. The largest vendors all
have several block storage platforms, as well as unified storage that runs block and
file on the same arrays.
OpenStack Block Storage (Cinder) is an open source form of block storage, which
provisions and manages storage blocks. It also provides basic storage capabilities such
as snapshot management and replication. OpenStack Block Storage is supported by
other vendors such as IBM, NetApp, Rackspace, Red Hat and VMware.
Amazon Elastic Block Store (EBS) is persistent block storage for Amazon Elastic
Compute Cloud (EC2). EBS is scalable and designed for workloads such as big data
analytics, NoSQL databases and data warehousing.
Why block storage is gaining momentum
With large vendors like Dell EMC and Amazon on board with block storage products,
it is clearly going to be a supported technology for the foreseeable future. While there
are pros and cons to its use, many of the negatives can be chalked up to features that
are better provided by a different storage system. These needs may vary by
organization, and while file or object storage may be better suited to some cases,
block storage will likely be the right choice for others.
If an organization is looking to incorporate the cloud, then they will find block
storage to be a common partner for cloud computing.
Oracle extents
WhatIs.com
The term extent is also sometimes used in reference to any contiguous space
-- for example, a set of sectors -- on a hard drive that is reserved for a
particular file, folder or application.
Oracle Database stores data in data blocks, which can also be called logical
blocks. A data block is the smallest unit of data within a database; each data
block corresponds to a specific number of bytes of physical database space
on disk.
Extents are made up of groups of contiguous data blocks. The level of logical
database storage above an extent is a database segment.
Oracle Database provides a segment advisor for IT pros to determine whether an
object has any space available for deallocation, based on the amount of space
fragmentation within the object. The extents of a segment do not return to the
tablespace unless the schema object within the segment is dropped.
A database segment is a set of extents that contains all of the data necessary for a
logical storage structure within a tablespace.
For every table that is created, Oracle Database allocates extents to form the table's
data segment. Oracle provides space for segments within extents, so when a segment's
existing extents are full, the software provides another extent for that segment.
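The block-extent-segment hierarchy can be modelled in a short, hypothetical Python sketch. The eight-blocks-per-extent figure is arbitrary (real extent sizes are configurable in Oracle); the sketch shows only the allocation rule described above, where a new extent is allocated once the existing ones are full:

```python
# Toy model of the storage hierarchy described above: data blocks group
# into extents, and extents group into a segment. Sizes are arbitrary.
BLOCKS_PER_EXTENT = 8

class Segment:
    def __init__(self):
        self.extents = []            # each extent is a list of data blocks

    def allocate_block(self):
        # When every existing extent is full, allocate another extent.
        if not self.extents or len(self.extents[-1]) == BLOCKS_PER_EXTENT:
            self.extents.append([])
        self.extents[-1].append("block")

seg = Segment()
for _ in range(20):                  # 20 blocks need 3 extents (8 + 8 + 4)
    seg.allocate_block()
```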
Improving your enterprise data storage management – essentially, better
leveraging your network-attached storage (NAS) data or block storage data,
as well as access to both – is a challenge for many IT organizations.
When considering best practices for improving data storage management, Jeff
Boles, senior analyst and director, validation services at Hopkinton, Mass.-
based Taneja Group, explains how to better leverage NAS data storage
systems via data classification and tiering tools and techniques. When it
comes to block storage, Boles advises users to consider performance
management tools and thin provisioning.
Q. How can users better leverage their existing NAS data storage?
A. When it comes to [NAS data storage, or] file storage, there's always
been data classification and tiering as a possibility for optimizing your storage.
At the heart of the matter is how you understand where you're applying your
storage resources -- if you're storing the right types of files in the right places,
if you're using your storage for data that's actually important to the business.
Data classification with e-discovery behind it has been driven to a whole new
level of maturity, and if you haven't looked in a while, there's a whole new set
of tools out there that you can access to classify your data, figure out what's
going on, and really move and optimize it. In fact, even tools like StorNext
from Quantum Corp. have been working behind the scenes, and they're getting
new capabilities over time because now they're at the heart of data
deduplication. But StorNext was originally a data archiving platform. So,
there are interesting possibilities.
In addition, a lot of the data classification vendors out there have some tools
that you can apply alongside of a NetApp filer, for instance, or EMC Celerra,
and use things like the file mover API to tier some of your data to other
storage systems.
But what you really want to dig into is a tool that can help you understand your
data without too much complexity. You don't want to get into "analysis
paralysis" when it comes to tagging stuff and getting all kinds of metadata
from your existing files. But you want something that understands who's using
that file, how often it's accessed and how important it is to the business.
Q. What advice do you have for end users looking to make better use of
their block data storage systems?
A. Let's talk about identifying how you're using your existing block storage
with an eye toward performance. There are a lot of technologies out there, like
thin provisioning; you either have that or you don't today. Certainly, if you're
acquiring new equipment, you should never overlook thin provisioning, and make
sure the thin provisioning inside the system is built to deliver the
performance capabilities that you expect of it.
But if you're not in that place today, let's talk about using your block storage a
little bit longer than normal and with an eye toward performance. Vendors like
Akorri are bringing performance management tools to the table that can help
you peer into your infrastructure and understand performance requirements in
various applications. [Companies] like Virtual Instruments can give you really
deep, packet-level insight that they can roll up into big dashboard data to help
you understand your environment.
There are even tools out there like Performance Pack for EVA from Hewlett-
Packard (HP) Co. that help you get this kind of visibility as well – these tools
can help you understand on a session basis, an application basis, what kind
of performance resources you require, and maybe you can start differentiating
a little bit better between how you provision storage in your environment and
doing things like restricting bandwidth within your fabric if you have that type
of intelligence within your switches. Or you can reconfigure your LUNs [logical
unit numbers] on the back end so that you're not taking up as many resources
and maybe your RAID volume constructions are a little bit different, your
virtualized volumes are a little bit different, so you're not sucking down the
same performance for every application when not every application needs it.
A. There are big opportunities in block [data storage] – virtualized I/O, for
instance, next-generation fabrics for select sets of equipment, things like Fibre
Channel over Ethernet (FCoE). When it comes to buying new equipment,
don't just keep provisioning host bus adapters [HBAs] and networks
separately and redundant connectivity all over your enterprise and consuming
your Fibre ports if that's a limited resource. [Consider] next-generation fabrics
and things like virtualized I/O, where you're doing I/O to your local servers with
InfiniBand or with Ethernet and running FCoE over it, where you can take this
out over a single wire or two wires from a server to a gateway that only
consumes a couple of ports from your fabric.
Then let's turn an eye toward [improving access to NAS data storage]. You
shouldn't overlook the opportunity to spread your use of wide-area data
services, wide-area data optimization for [NAS data storage]. This can let you
consolidate file storage in single locations. Maybe you have some of that in
your enterprise today, but you're not making full use of it.
Look at getting more of your data back into a central location where you can
apply your time and effort to the management of it better; [make sure] you
have the right wide-area data services/wide-area file services in place so that
users can still access it like a localized resource but keep it in a central
location. So, look for those types of opportunities – compressing bandwidth,
moving data back to a central place, not occupying as many resources, when
it comes to connecting into your existing fabric.
For the pros, this question represents "Beginner's Storage 101." But the
storage tech literature always talks about "block level" versus "file level" data,
without ever clearly explaining the key differences and relevance. Can
someone please do so, once and for all, in layman's language so all us
"uninitiated" can understand? Thank you.
Any two devices communicating over a network have to agree on how they
will communicate. Standard protocols are the implementations of those
communications agreements. There can be and there are many networking
protocols.
Storage devices and subsystems typically are slaves to the filing systems that
write data to them and read from them. The filing systems are typically file
systems or database systems. Examples of filing systems are the NTFS file
system in Windows 2000 and NT, the FAT file system in DOS, the many
flavors of the Unix File System (UFS), the Veritas File System (VxFS), and Oracle,
Informix and Sybase databases.
Filing systems do two things: First, they represent data to end users and
applications. This data is typically organized in directories or folders typically
in some hierarchical fashion. I talk about this in my new book as data
representation. The second thing filing systems do is organize where data is
placed in storage. These filing systems have to scatter the data around the
storage container to make sure that all data can be accessed with reasonable
performance. They do this by directing the storage block addresses where the
data is going to be placed. I refer to this as a data structure function. Today,
these are actually all logical block addresses as the disk drives keep their own
internal block translation tables. That might be more than you need to know
for now, but it could be useful for some.
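A toy Python sketch can make those two roles concrete. The class and its four-byte blocks are invented for illustration; it shows a filing system presenting data by name while privately directing which logical block addresses hold the bytes:

```python
# Sketch of the two filing-system roles described above: presenting a
# name hierarchy to users, and mapping each file to the logical block
# addresses where its data is placed. All details are illustrative.
BLOCK_SIZE = 4            # tiny blocks to keep the example readable

class ToyFileSystem:
    def __init__(self):
        self.blocks = {}              # logical block address -> data
        self.directory = {}           # path -> list of block addresses
        self.next_lba = 0

    def write(self, path, data):
        addresses = []
        for i in range(0, len(data), BLOCK_SIZE):
            self.blocks[self.next_lba] = data[i:i + BLOCK_SIZE]
            addresses.append(self.next_lba)
            self.next_lba += 1
        self.directory[path] = addresses

    def read(self, path):
        return b"".join(self.blocks[a] for a in self.directory[path])

fs = ToyFileSystem()
fs.write("/docs/note.txt", b"hello block world")
```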
It is also possible for systems to request data using the user-level data
representation interfaces (File level storage). This is done by the client using
the data's filename, its directory location, URL, or whatever. This is a
client/server model of communicating. The server in this case receives the
filing request and then looks up the data storage locations where the data is
stored and retrieves it using storing level functions (block level storage). The
server does not send the file to the client as blocks, but as bytes of the file.
File level protocols do not have the capability of understanding block
commands. Likewise, block protocols cannot convey file access requests and
responses.
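The client/server model above can be sketched as follows, with a made-up filename and block layout. The client asks for a file by name (file level); the server resolves the name to block addresses, reads those blocks (block level) and returns the file's bytes, never the blocks themselves:

```python
# File-level request in, bytes out; block access stays server-side.
# The filename, addresses and contents are invented for the sketch.
blocks = {0: b"hello, ", 1: b"world"}       # storage: address -> block
catalog = {"greeting.txt": [0, 1]}          # filing: name -> addresses

def serve_file(filename):
    """Resolve a name to blocks, read them, return the file as bytes."""
    return b"".join(blocks[a] for a in catalog[filename])
```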
One of the confusing things in this is that filing and storing are tightly
integrated. Neither can work without the other. But when it comes to
understanding how storing and filing traffic is transferred over a network; both
are independent of the wiring (networking or bus) that supports their
communications. In other words, storing and filing traffic can exist on the
same network using different storage application protocols.
===================
When it comes to Hyper-V and VMware storage, which is better: block- or file-
based access? The rate of adoption of server virtualisation has accelerated
over recent years, and virtual server workloads now encompass many
production applications, including Tier 1 applications such as databases.
For that reason, it is now more important than ever that Hyper-V and VMware
storage is well-matched to the requirements of the environment. In this article we
will discuss the basic requirements for Hyper-V and VMware storage and
examine the key question of block vs file storage in such deployments.
There are, however, some disadvantages when using NFS with VMware.
Scalability is limited to eight NFS shares per VMware host (this can be
expanded to 64 but also requires TCP/IP heap size to be increased).
Although these NFS shares can scale to the maximum size permitted by
the storage filer, the share is typically created from one group of disks with
one performance characteristic; therefore, all guests on the share will
experience the same I/O performance profile.
For Hyper-V, CIFS allows virtual machines (stored as virtual hard disk, or
VHD, files) to be stored and accessed on CIFS shares specified by a Uniform
Naming Convention (UNC) or a share mapped to a drive letter. While this
provides a certain degree of flexibility in storing virtual machines on Windows
file servers, CIFS is an inefficient protocol for the block-based access required
by Hyper-V and not a good choice. It is disappointing to note that Microsoft
currently doesn’t support Hyper-V guests on NFS shares. This seems like a
glaring omission.
Block protocols include iSCSI, Fibre Channel and FCoE. Fibre Channel and
FCoE are delivered over dedicated host adapter cards (HBAs and CNAs,
respectively). iSCSI can be delivered over standard NICs or using dedicated
TOE (TCP/IP Offload Engine) HBAs. For both VMware and Hyper-V, the use
of Fibre Channel or FCoE means additional cost for dedicated storage
networking hardware. iSCSI doesn't explicitly require additional hardware, but
customers may find dedicated hardware necessary for better performance.
VMware supports all three block storage protocols. In each case, storage is
presented to the VMware host as a LUN. Block storage has the following
advantages.
Each LUN is formatted with Virtual Machine File System, or VMFS, which
is specifically written for storing virtual machines.
ESXi 4.x supports “boot from SAN” for all protocols, enabling stateless
deployments.
SAN environments can use RDM (Raw Device Mapping), which enables
virtual guests to write non-standard SCSI commands to LUNs on the
storage array. This feature is useful on management servers.
Summary
NFS storage is suitable only for VMware deployments and is not supported by
Hyper-V. Typically, NAS filers are cheaper to deploy than Fibre Channel
arrays, and NFS provides better out-of-band access to guest files without the
need to use the hypervisor. In the past NFS had been used widely for
supporting data like ISO installation files, but today it has wider deployments
where the array architecture supports the random I/O nature of virtual
workloads.
We recap the key attributes of file and block storage access and the pros and
cons of object storage, a method that offers key benefits but also drawbacks
compared with SAN and NAS
This article will recap the fundamentals of file and block storage, with the
purpose of highlighting the quite different characteristics of object storage; all
three are forms of shared storage. In the final analysis, we will suggest the use
cases most suited to object storage, as well as to file and block.
The trigger is the rise of object storage, which has become prominent in the
form of array-type products as well as being the basis for cloud-based
protocols such as Amazon’s S3.
To see how object storage differs significantly from SAN and NAS protocols,
let’s first look at those.
The file path is the part we see. But under the bonnet, that file path and the file
system also handle addressing to the physical location of blocks of storage on the
media itself.
The key difference between file access/NAS and block access/SAN is that in
NAS, the file system resides on the array. Here, an application’s I/O requests
go via the file system resident on the NAS hardware, accessed as a volume or
drive. In a SAN, the file system is external to the array and I/O calls are
handled by the file system on the server, with only block-level information
required to access data from the SAN.
NAS is best suited to retention and access of entire files and has locking systems that
prevent simultaneous changes and corruption to files.
Meanwhile, SAN systems allow changes to blocks within entire files and so are
extremely well suited to database and transactional processing.
SAN and NAS are well suited to what they do, but have drawbacks.
For example, NAS can be limited by scale. Historically, organisations put in a NAS
box to service a department, but these proliferated and were unconnected, leading to
silos of data. This issue is overcome with scale-out NAS, where multiple NAS
instances operate a single, highly-scalable parallel file system.
The tree-like file system hierarchy can handle millions of files quite easily, but once
you scale to billions, it can start to slow up.
Massive scalability
Object storage brings massive scalability. That is because it works differently from
the SAN and NAS protocols. It has no file system but, like NAS, changes are made at
the level of the whole file or object.
That means object storage is massively scalable, to billions of objects, because its
flat organisation does not become unwieldy as it grows.
Objects also have metadata, and lots of it, potentially, all definable by the customer.
That means any attribute can be associated with an object in its header metadata: the
application it is associated with, its data protection characteristics, tiering information,
when it should be deleted, and any custom business- or organisation-related attributes.
So, object storage is eminently suited to analytics, being searchable in very large
datasets for potentially almost any attribute.
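As a hypothetical illustration, here is a few-line Python sketch of that idea: each object gets a unique identifier, carries customer-defined metadata, and can be found by searching on any attribute. The attribute names (app, tier) are invented for the example:

```python
# Toy object store: unique IDs plus searchable, customer-defined metadata.
import uuid

store = {}

def put_object(data, **metadata):
    object_id = str(uuid.uuid4())        # unique identifier per object
    store[object_id] = {"data": data, "meta": metadata}
    return object_id

def find(**criteria):
    """Return the IDs of objects whose metadata matches every criterion."""
    return [oid for oid, obj in store.items()
            if all(obj["meta"].get(k) == v for k, v in criteria.items())]

put_object(b"...", app="billing", tier="archive")
put_object(b"...", app="billing", tier="hot")
```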
Object storage is often only eventually consistent across distributed locations, but
that multiple-location attribute can also be an advantage, making object storage
well-suited to an organisation with multi-regional needs.
By contrast, SAN and NAS can be “strongly consistent”, with near real-time mirrors
of datasets possible.
Also, object storage cannot perform as well as SAN and sometimes NAS, mainly
because of the large file header overheads it carries. It also cannot offer the sub-file
block-level manipulation required for database and transactional work that SAN
access can.
For those two key reasons, object storage is best suited to large datasets
of unstructured data in which objects do not change that often.
Outside the pros and cons of the technology per se, object storage has the advantage
of relative cheapness, often running on commodity hardware. That is in contrast to
potentially expensive packaged array-type products from storage box suppliers.
Having said that, costs can come in other areas, such as changes to your software
environment. Not all applications will necessarily be natively compatible with object
storage file calls. Built for NFS, SCSI, and so on, they will need adapting to deal with
the Get, Put, Delete and other commands of object storage.
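The kind of adaptation described can be sketched as a thin shim. The classes below are hypothetical, not any product's API; the point is that file-style read/write/remove calls map naturally, if not always efficiently, onto object-style Get/Put/Delete:

```python
# Hypothetical shim mapping file-style calls onto Get/Put/Delete.
class ObjectStore:
    """Stand-in for any object API; not a specific product's interface."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data        # whole object replaced each time

    def get(self, key):
        return self._objects[key]

    def delete(self, key):
        del self._objects[key]

class FileLikeAdapter:
    """Expose read/write/remove on top of Get/Put/Delete."""

    def __init__(self, store):
        self.store = store

    def write(self, path, data):
        self.store.put(path, data)       # the file path doubles as object key

    def read(self, path):
        return self.store.get(path)

    def remove(self, path):
        self.store.delete(path)

adapter = FileLikeAdapter(ObjectStore())
adapter.write("/logs/app.log", b"started")
```

Note that every `write` re-stores the whole object, which is exactly the performance caveat discussed earlier for frequently changing data.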
To sum up:
NAS: Good at secure file sharing. Can become siloed. Scale-out NAS potentially
good at scale. Bad at extreme scale.
SAN and NAS: Both can come with advanced storage features, such as replication.
Both can be relatively costly compared with object storage on commodity
hardware, although both SAN and NAS software-defined storage are available.
Both lack the rich metadata of object storage.
Object storage: Very scalable, suited to unstructured data and large datasets,
potentially good for analytics via rich metadata. Lacks high-end performance and
data protection is slow across clusters. Can be very cost-efficient, hardware-wise.