Announcements
We have a Grader!
o Anirudh Dhawan (andhawan@asu.edu)
o Office Hours: Thur 10am-12pm; BA Suite 318
[Figure: course topic stack: Map Reduce; Big Data Platforms; Parallel Programming; Cloud Computing; Virtualization]
What is Virtualization?
Virtualization means that Applications can use a resource without
any concern for where it resides, what the technical interface is,
how it has been implemented, which platform it uses, and how
much of it is available
~Rick F. Van der Lans in Data Virtualization for Business Intelligence Systems
[Figure: Type-2 hypervisor stack, bottom to top: Server; Host OS; Hypervisor Type-2; VMs, each containing a Guest OS, Bins/Libs, and an App]
Reference: https://en.wikipedia.org/wiki/Hypervisor
Server Virtualization provides
[Figure: container stack, bottom to top: Server; Host OS; Docker; containers, each containing Bins/Libs and an App]
Reference: https://en.wikipedia.org/wiki/Operating-system-level_virtualization
Striping
o Sequential blocks of data are stored on different physical storage devices in (typically) round-robin fashion.
o Example: Disk1 <A, C, E>; Disk2 <B, D, F>
o Striping is useful when data is requested faster than a single storage device can deliver it. Striping data across multiple storage devices allows for concurrent access to data, thereby improving performance.
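The round-robin placement above can be sketched in a few lines of Python. This is only an illustration: the function names `stripe` and `read_back` are made up for this example, and real storage systems stripe fixed-size blocks rather than single letters.

```python
def stripe(blocks, n_disks):
    """Place sequential blocks on n_disks disks in round-robin order."""
    disks = [[] for _ in range(n_disks)]
    for i, block in enumerate(blocks):
        disks[i % n_disks].append(block)
    return disks


def read_back(disks):
    """Reassemble the original block order by interleaving the disks."""
    blocks = []
    for row in range(max(len(d) for d in disks)):
        for d in disks:
            if row < len(d):
                blocks.append(d[row])
    return blocks


disks = stripe(["A", "B", "C", "D", "E", "F"], 2)
print(disks)             # Disk1 holds A, C, E; Disk2 holds B, D, F
print(read_back(disks))  # original order A..F
```

Because consecutive blocks live on different disks, a request for A and B can be served by both disks at the same time.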
Mirroring
o Replication of data onto separate disks in real time.
o Example: Disk1 <A, B, C>; Disk2 <A, B, C>
o Improves data redundancy and reliability.
Parity
o An extra parity block is stored so that data on a crashed disk can be reconstructed using data on the other disks (using the XOR operation)
o Example: Disk1 <A:11010011>; Disk2 <B:10011001>; Disk3 <PAB: 01001010>
o Essentially, PAB = A XOR B, so if any one disk crashes, you can reconstruct its contents using the XOR operation between the other two
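The parity example can be checked directly in Python. This is a minimal sketch using the bit patterns from the slide; the helper name `xor_blocks` is an assumption made for this illustration.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks):
    """XOR any number of equally sized blocks (here, 8-bit integers)."""
    return reduce(xor, blocks)


A = 0b11010011
B = 0b10011001
PAB = xor_blocks(A, B)          # parity block stored on Disk3
print(format(PAB, "08b"))       # 01001010

# If Disk1 crashes, A is recovered from the two surviving disks.
recovered_A = xor_blocks(B, PAB)
print(recovered_A == A)         # True
```

The same function recovers B from A and PAB, since XOR is its own inverse.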
File System:
o Controls how data is managed, stored and retrieved.
o Without a file system, we would just have a large blob of data with no way to identify different connected pieces of information.
o File systems are organized around groups of data called files, and groups of files called directories or folders.
o Distributed file systems are file systems that are spread across multiple servers.
Reference: Wikipedia
Storage Virtualization
Data is abstracted into what appears to be a single storage unit, while the physical
storage actually spans multiple heterogeneous devices and often locations
RAID Levels
Techniques: Striping (provides excellent performance); Mirroring (provides excellent redundancy); Parity (provides good redundancy)
RAID 0
o Striping: Yes; Mirroring: No; Parity: No
o Minimum Number of Disks: 2
o Example (Disk Blocks): Disk 1 <A, C, E>; Disk 2 <B, D, F>
o Comments: Excellent Performance. No Redundancy. Do not use for critical applications.
RAID 1
o Striping: No; Mirroring: Yes; Parity: No
o Minimum Number of Disks: 2
o Example (Disk Blocks): Disk 1 <A, B, C>; Disk 2 <A, B, C>
o Comments: Good Performance. Excellent Redundancy.
RAID 5
o Striping: Yes; Mirroring: No; Parity: Yes (Distributed Parity)
o Minimum Number of Disks: 3
o Example (Disk Blocks): Disk 1 <A, C, PEF>; Disk 2 <B, PCD, E>; Disk 3 <PAB, D, F>
o Comments: Good Performance. Good Redundancy. Most cost effective. Fast Reads; Slow Writes.
RAID 10
o Striping: Yes; Mirroring: Yes; Parity: No
o Minimum Number of Disks: 4
o Example (Disk Blocks): Disk 1 <A, C, E>; Disk 2 <A, C, E>; Disk 3 <B, D, F>; Disk 4 <B, D, F>
o Comments: Excellent Performance. Excellent Redundancy. Great for mission critical applications.
Reference: https://en.wikipedia.org/wiki/RAID
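The RAID 5 example (parity blocks PAB, PCD, PEF rotating across three disks) can be reproduced with a short sketch. The rotation rule used here, moving the parity block one disk to the left on each stripe, is an assumption chosen to match the slide's example; real controllers support several layouts. The function name `raid5_layout` is hypothetical.

```python
def raid5_layout(blocks, n_disks):
    """Lay out data blocks with one rotating parity block per stripe."""
    per_stripe = n_disks - 1          # data blocks per stripe
    if len(blocks) % per_stripe != 0:
        raise ValueError("this sketch expects only full stripes")
    disks = [[] for _ in range(n_disks)]
    for start in range(0, len(blocks), per_stripe):
        stripe = blocks[start:start + per_stripe]
        stripe_index = start // per_stripe
        # Assumed rotation: parity starts on the last disk and moves left.
        parity_disk = (n_disks - 1 - stripe_index) % n_disks
        data = iter(stripe)
        for d in range(n_disks):
            if d == parity_disk:
                disks[d].append("P" + "".join(stripe))
            else:
                disks[d].append(next(data))
    return disks


for number, contents in enumerate(raid5_layout(list("ABCDEF"), 3), start=1):
    print(f"Disk {number}: {', '.join(contents)}")
```

Spreading the parity blocks over all disks avoids making a single parity disk the bottleneck for every write.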
IP Address
o Higher order bits determine network (indicated by subnet mask), and lower order bits determine host (device)
Subnetting:
Switch:
Local Area Network (LAN):
o A computer network with interconnected devices within a limited geographical area such as a house or building.
Router
o Routers maintain routing tables to determine whether traffic is meant for this LAN, a connected LAN or a different network.
o Example: the home router connects home computers to the internet (these are similar networks since they both use the TCP/IP protocol)
Reference: Wikipedia
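The network/host split and the router's decision can be sketched with Python's standard `ipaddress` module. The addresses below are made-up examples, and `split_address` and `classify` are hypothetical helper names; a real router does a longest-prefix match over a full routing table rather than this three-way check.

```python
import ipaddress


def split_address(cidr):
    """Return (network address, subnet mask, host number) for e.g. '192.168.1.42/24'."""
    iface = ipaddress.ip_interface(cidr)
    network = iface.network
    # The host part is whatever remains after masking out the network bits.
    host = int(iface.ip) & int(network.hostmask)
    return str(network.network_address), str(network.netmask), host


def classify(destination, this_lan, connected_lans):
    """Decide where traffic for a destination address should go."""
    address = ipaddress.ip_address(destination)
    if address in ipaddress.ip_network(this_lan):
        return "this LAN"
    for lan in connected_lans:
        if address in ipaddress.ip_network(lan):
            return "connected LAN"
    return "different network"


print(split_address("192.168.1.42/24"))
print(classify("192.168.2.7", "192.168.1.0/24", ["192.168.2.0/24"]))
print(classify("8.8.8.8", "192.168.1.0/24", ["192.168.2.0/24"]))
```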
Network Virtualization
Creation of logical, virtual networks that are decoupled from the (limitations of) underlying
physical hardware.
Virtual Private Network (VPN):
Securely extends a private network over a public network such as the internet.
Users can remotely communicate with the private network as though they were
directly connected to it, with the same functionality, security and administrative
policies.
Application Virtualization
Application Virtualization separates the Application from the OS, so Applications can
be more easily deployed and delivered.
The application is packaged and streamed from the server down the network to the
client and, instead of being installed on the client device, is executed on the local
device in a virtual bubble that is completely isolated from the client OS.
Only required parts are streamed as and when they are used.
Once the application has been streamed, it is cached on the client device so it doesn't have
to be streamed every time a user uses it on the client. This also means the application can
be used even when the client is not connected to the server.
When an application upgrade is available, the server copy is upgraded, and the upgrades are
streamed down to the clients the next time the application is used on the client.
Reference: http://blogs.msdn.com/b/ianm/archive/2010/06/11/microsoft-virtual-desktop-101-making-sense-of-vdi-rds-app-v-med-v-and-
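The stream-on-demand, cache, and upgrade behavior described above can be sketched as follows. This is a toy model, not how any particular product works: the `AppClient` class and the dictionary standing in for the server are hypothetical names invented for this illustration.

```python
class AppClient:
    """Client-side cache for a streamed application (illustrative only)."""

    def __init__(self):
        self.cache = {}       # part name -> part contents
        self.version = None   # version of the cached parts

    def use_part(self, name, server=None):
        """Return one application part; server is None when the client is offline."""
        if server is not None and server["version"] != self.version:
            # The server copy was upgraded: drop stale parts, re-stream on demand.
            self.cache.clear()
            self.version = server["version"]
        if name not in self.cache:
            if server is None:
                raise LookupError(f"{name!r} is not cached and the client is offline")
            self.cache[name] = server["parts"][name]   # stream only this part
        return self.cache[name]


server = {"version": 1, "parts": {"editor": "editor-v1", "spellcheck": "spell-v1"}}
client = AppClient()
print(client.use_part("editor", server))   # streamed from the server
print(client.use_part("editor"))           # offline: served from the cache
server = {"version": 2, "parts": {"editor": "editor-v2", "spellcheck": "spell-v2"}}
print(client.use_part("editor", server))   # upgrade picked up on next use
```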
Cloud Computing
You typically sign up for the service (free with ads, free trial, or subscription)
You don't need to install application software, and version upgrades are
pushed seamlessly
You rely on the service provider for infrastructure (e.g., you don't set up a mail
server)
*Note: many of these services also come with client apps; we are not considering
that scenario here.
Key enabling technologies include: (1) fast wide-area networks, (2) powerful,
inexpensive server computers, and (3) high-performance virtualization for
commodity hardware.
Source: http://www.nist.gov/itl/cloud/
http://www.intel.com/content/www/us/en/cloud-computing/cloud-101-video.html
Deployment Models
There are 3 basic deployment models in cloud computing:
Private Cloud
o On-Prem Private Cloud: On-Prem Data Center + Network Virtualization + Cloud Orchestration Software
o Externally Hosted Private Cloud (also called Virtual Private Cloud): Logically isolated, user-defined, and user-controlled portion of a third-party hosted cloud (like AWS or Microsoft).
Public Cloud
o Third party provides cloud services (3 different service models: IaaS, PaaS, or SaaS)
o Service provider is held to agreed-upon availability, reliability, privacy and security standards
Hybrid Cloud
Service Models
There are 4 basic service models in cloud computing, based on what
parts of the stack the User controls vs what the Cloud Provider
manages.
Reference: https://en.wikipedia.org/wiki/Cloud_computing
Key Characteristics
On-demand self-service:
A consumer can provision computing capabilities automatically, as needed, without requiring human interaction
with each service provider.
Resource pooling:
Computing resources are pooled to serve multiple consumers, with different physical and virtual resources
dynamically assigned and reassigned according to consumer demand.
Measured service:
Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and
consumer of the utilized service.
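The three characteristics above can be illustrated together with a toy model: consumers provision units from a shared pool without operator involvement, units are reassigned as demand changes, and usage is metered per consumer. The `ResourcePool` class and its unit-hour accounting are assumptions made for this sketch, not a description of any provider's system.

```python
class ResourcePool:
    """Shared pool of identical compute units with per-consumer metering."""

    def __init__(self, total_units):
        self.free = total_units
        self.allocated = {}   # consumer -> units currently held
        self.usage = {}       # consumer -> accumulated unit-hours

    def provision(self, consumer, units):
        """On-demand self-service: succeeds whenever enough units are free."""
        if units > self.free:
            raise RuntimeError("not enough free units in the pool")
        self.free -= units
        self.allocated[consumer] = self.allocated.get(consumer, 0) + units

    def release(self, consumer, units):
        """Return units to the pool so they can be reassigned."""
        if units > self.allocated.get(consumer, 0):
            raise ValueError("cannot release more units than are held")
        self.allocated[consumer] -= units
        self.free += units

    def tick(self, hours=1):
        """Measured service: record unit-hours for every consumer."""
        for consumer, units in self.allocated.items():
            self.usage[consumer] = self.usage.get(consumer, 0) + units * hours


pool = ResourcePool(10)
pool.provision("team-a", 4)
pool.provision("team-b", 6)
pool.tick(hours=2)
pool.release("team-a", 4)
pool.tick(hours=1)
print(pool.usage, pool.free)
```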
Advantages
o Faster Deployment since infrastructure set up is quick, and software integration is easier
o Cost Reduction due to savings on sunk cost of infrastructure, licenses, and maintenance
Risks
o Dependency on the Provider can lead to vendor lock-in and migration challenges
o Downtime of service can occur due to Service Provider outage or network access issues
Reference: https://en.wikipedia.org/wiki/Cloud_computing
Virtualization means that Applications can use a resource without any concern
for where it resides, what the technical interface is, how it has been
implemented, which platform it uses, and how much of it is available.
o Virtualization can occur at different levels of the stack: Server, Storage, Network, Desktop
and Application.
Homework - spend 5-10 minutes on each of these sites: Amazon AWS, Microsoft Azure,
Google Cloud
o Do you now see a number of familiar terms on these sites?
o What deployment models do they cover?
o What service models do they cover?
o Note how they all have very similar competing offers (including free trials to improve adoption).