
Docker

History of Virtualization
What is Containerization?
What is Docker?
Docker Architecture
 Docker Engine
 Docker Images
 Registries
 Docker Containers
Installing Docker on Linux
Basic Docker commands

History of Virtualization
Earlier, the process for deploying a service was slow and painful. First, the developers wrote code;
then the operations team would deploy it on bare-metal machines, where they had to look after
library versions, patches, and language compilers for the code to work. If there were bugs or
errors, the process started all over again: the developers fixed the code, and then, again, the
operations team deployed it.

There was an improvement with the creation of hypervisors. A hypervisor can host multiple virtual machines
(VMs) on the same physical machine, each of which may be running or turned off. VMs significantly decreased
the waiting time for deploying code and fixing bugs, but the real game changer was Docker containers.

What is Virtualization?
Virtualization is the technique of running a guest operating system on top of a host operating system.
This technique was a revelation at the beginning because it allowed developers to run multiple
operating systems in different virtual machines, all running on the same host. This eliminated the need
for extra hardware resources.

The advantages of Virtual Machines or Virtualization are:


Multiple operating systems can run on the same machine
Maintenance and recovery are easy in case of failure
Total cost of ownership is lower due to the reduced need for infrastructure

In the diagram on the right, you can see a host operating system on which three guest
operating systems are running; these are the virtual machines.

As you know, nothing is perfect, and virtualization has some shortcomings too. Running multiple virtual
machines on the same host operating system leads to performance degradation. Each guest OS runs on
top of the host OS and has its own kernel and its own set of libraries and dependencies. This takes up
a large chunk of system resources: hard disk, processor and especially RAM.

Another problem with virtual machines is that they take almost a minute to boot up, which is critical
for real-time applications.

Following are the disadvantages of Virtualization:


Running multiple virtual machines leads to unstable performance
Hypervisors are not as efficient as the host operating system
The boot-up process is long

What is Containerization?
Containerization is the technique of bringing virtualization to the operating system level. While
virtualization brings abstraction to the hardware, containerization brings abstraction to the operating
system. Do note that containerization is also a type of virtualization. Containerization is, however,
more efficient: there is no guest OS. Containers use the host's operating system and share its relevant
libraries and resources as and when needed, unlike virtual machines. The application-specific binaries
and libraries of containers run on the host kernel, which makes processing and execution very fast; even
booting up a container takes only a fraction of a second. Because all containers share the host
operating system and hold only the application-related binaries and libraries, they are lightweight and
faster than virtual machines.
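A quick way to see this sharing in practice, assuming Docker is already installed (installation is covered below): the kernel version reported inside a container matches the host's, because containers run on the host kernel rather than on a guest OS.

```shell
# Kernel version on the host
uname -r

# Kernel version seen inside an Alpine container -- the same as the host's,
# since there is no guest kernel in containerization
sudo docker run --rm alpine uname -r
```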

Advantages of Containerization over Virtualization


Containers on the same OS kernel are lighter and smaller
Better resource utilization compared to VMs
The boot-up process is short and takes only a fraction of a second

In the diagram, you can see that there is a host operating system which is shared by all the containers.
Containers contain only application-specific libraries, which are separate for each container; they are
faster and do not waste any resources.

All these containers are handled by the containerization layer, which is not native to the host operating
system. Hence, software is needed that can enable you to create and run containers on your host
operating system.

What is Docker?
Docker is a containerization platform that packages your application and all its dependencies together in
the form of Containers to ensure that your application works seamlessly in any environment.

As you can see in the diagram, each application runs in a separate container and has its own
set of libraries and dependencies. This also ensures process-level isolation, meaning each
application is independent of the others, giving developers confidence that they can build
applications that will not interfere with one another.

As a developer, I can build a container which has different applications installed on it and give it to my
QA team who will only need to run the container to replicate the developer environment.

Benefits of Docker
Now, the QA team need not install all the dependent software and applications to test the code and this
helps them save lots of time and energy. This also ensures that the working environment is consistent
across all the individuals involved in the process, starting from development to deployment. The number
of systems can be scaled up easily and the code can be deployed on them effortlessly.
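This packaging workflow is driven by a Dockerfile. Below is a minimal sketch for a hypothetical Python web application (the base image, file names and start command are assumptions for illustration, not taken from this text):

```dockerfile
# Start from a minimal base image (assumed here; pick one that fits your app)
FROM python:3.9-slim

# Copy the application and its dependency list into the image
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .

# The command the container runs on start
CMD ["python", "app.py"]
```

Building the image (docker build -t myapp .) bakes the dependencies in, so the QA team only needs docker run myapp to reproduce the developer environment.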

Virtualization vs Containerization
Virtualization and Containerization both let you run multiple isolated environments on a single host machine.

Virtualization creates many complete operating systems in a single host machine. Containerization, on
the other hand, creates multiple isolated containers, one for each type of application as required.
Figure: Virtualization versus Containerization

As we can see from the image, the major difference is that there are multiple guest operating systems
in virtualization, which are absent in containerization. The best part of containerization is that it is
very lightweight compared to heavyweight virtualization.

Docker Architecture
Let's talk about the main components of the Docker architecture.

Docker Engine
Docker is a client-server type of application, which means we have clients that communicate with a
server. The Docker daemon, dockerd, is the Docker engine and represents the server. The Docker
daemon and the clients can run on the same host or on remote hosts, and they communicate through a
command-line client binary as well as a full RESTful API exposed by the daemon, dockerd.

Docker Images
Docker images are the "source code" for our containers; we use them to build containers. They can have
software pre-installed which speeds up deployment. They are portable, and we can use existing images
or build our own.

Registries
Docker stores the images we build in registries. There are public and private registries. Docker, the
company, runs a public registry called Docker Hub, where you can also store images privately. Docker Hub
has millions of images which you can start using right away.

Docker Containers
Containers are the organizational units of Docker. When we build an image and run it, the software
runs inside a container. The container analogy is used because of the portability of the software we have
running in our container. We can move it, in other words "ship" the software, modify it, manage it,
create it or destroy it, just as cargo ships can do with real containers.

In simple terms, an image is a template, and a container is a copy of that template. You can have
multiple containers (copies) of the same image.
Below we have an image which perfectly represents the interaction between the different components
and how Docker container technology works.

Installing Docker on Linux


To install docker, we need to use the Docker team's DEB packages. For that, first we need to install some
prerequisite packages.

Step 1) Adding prerequisite Ubuntu packages

There are certain packages you require in your system for installing Docker. Execute the below
command to install those packages.

$ sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    software-properties-common

*The backslash "\" is not required; it simply continues the command on a new line. If you want, you can
write the whole command on a single line without it.

Step 2) Add the Docker GPG key

Now, import Docker's official GPG key to verify package signatures before installing them with apt-get.
Run the below command in the terminal:

$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -


Step 3) Adding the Docker APT repository

Now, add the Docker repository, which contains the Docker packages and their dependencies, to your
Ubuntu system by executing the below command:

$ sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"

You may be prompted to confirm that you wish to add the repository and have the repository's GPG key
automatically added to your host.
The lsb_release -cs command supplies the codename of your host's Ubuntu release.


Step 4) Update APT sources

Now you need to update the APT package index.

$ sudo apt-get update

Step 5) Installing the Docker packages on Ubuntu

We can now install Docker itself. To install Docker Community Edition, execute the below command:

$ sudo apt-get install docker-ce

The above command installs Docker and the additional packages it requires. Before Docker 1.8.0,
the package name was lxc-docker, and between Docker 1.8 and 1.13, the package name was docker-engine.

NOTE: Docker for Windows requires Windows 10 Pro or Enterprise version 14393, or Windows Server
2016 RTM, to run.

Basic Docker Commands

The most basic command we can run after installing Docker is docker info.
$ sudo docker info

You should see output similar to the following.

As we can see, we get information about Docker containers: how many are running, paused or stopped,
and how many images we have downloaded. So, let's get our first image.

$ sudo docker pull alpine

With this command we are telling Docker to pull the alpine image from the public registry; the latest
version is used by default.
*alpine is a minimal Docker image based on Alpine Linux with a complete package index, and it is only
about 5 MB in size.

If we want to run the image as a container, we will use the following command.

$ sudo docker run -i -t alpine /bin/sh

If we run the command, we will be dropped directly into a shell inside the alpine container. (Note that
the alpine image ships with /bin/sh rather than bash.)


The -i flag keeps STDIN open from the container, even when you are not attached to it. This persistent
standard input is one half of what you require for an interactive shell.
The -t flag is the other half; it instructs Docker to assign a pseudo-TTY to the container.
This offers us an interactive shell in the new container. We exit the container with a simple exit
command.

Now we can try running an Ubuntu image.

$ sudo docker run -it ubuntu /bin/bash

Notice that Docker checks for the image locally, and if it is not there, the image is pulled from the
image library automatically; once again we have an interactive shell running. We can also name
containers as we run them.

$ sudo docker run --name our_container -it ubuntu /bin/bash


and we exit again.

We can also run a container we previously created, without an interactive shell.

$ sudo docker start container_name


And stop the container writing docker stop container_name

$ sudo docker stop container_name


If we want to see all running containers, we just run

$ sudo docker ps
And to list all containers, including stopped ones, we add "-a" at the end of the same command: sudo docker ps -a

This command shows each container's ID, the image it uses, when it was created, its running status, its
exposed ports and a randomly generated name for easier management.

When we run containers, we would also like to know how many resources they are using; for that
purpose we can use the following command.

$ sudo docker stats


You can also see which images we have downloaded locally and info about them.

$ sudo docker images


The command displays each local image with a tag that shows the image version, a distinctive image
ID, when it was created, and the image size.

Summary
Earlier, the process for deploying a service was slow and painful, but VMs significantly decreased the
waiting time for deploying code and fixing bugs
Docker is a containerization platform that packages your application and all its dependencies together
in containers, ensuring it works seamlessly in any environment
Docker is a client-server type of application, with clients that communicate with the server
Docker images are the "source code" for our containers; we use them to build containers
Docker has two types of registries: 1) public and 2) private registries
Containers are the organizational units of Docker. In simple terms, an image is a template, and a
container is a copy of that template. You can have multiple containers (copies) of the same image.

Command Description

docker info Information Command

docker pull Download an image

docker run -i -t image_name /bin/bash Run image as a container

docker start our_container Start container

docker stop container_name Stop container

docker ps List of all running containers

docker stats Container information

docker images List of images downloaded


Hystrix

Overview
A typical distributed system consists of many services collaborating together.
These services are prone to failure or delayed responses. If a service fails, it may impact other services,
affecting performance and possibly making other parts of the application inaccessible, or, in the worst
case, bringing down the whole application.

Of course, there are solutions available that help make applications resilient and fault tolerant – one
such framework is Hystrix.

The Hystrix framework library helps to control the interaction between services by providing fault
tolerance and latency tolerance. It improves overall resilience of the system by isolating the failing
services and stopping the cascading effect of failures.

Simple Example
The way Hystrix provides fault and latency tolerance is to isolate and wrap calls to remote services.
In this simple example we wrap a call in the run() method of the HystrixCommand:

import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

class CommandHelloWorld extends HystrixCommand<String> {

    private final String name;

    CommandHelloWorld(String name) {
        super(HystrixCommandGroupKey.Factory.asKey("ExampleGroup"));
        this.name = name;
    }

    @Override
    protected String run() {
        return "Hello " + name + "!";
    }
}
and we execute the call as follows:

@Test
public void givenInputBobAndDefaultSettings_whenCommandExecuted_thenReturnHelloBob(){
assertThat(new CommandHelloWorld("Bob").execute(), equalTo("Hello Bob!"));
}

Maven Setup
To use Hystrix in a Maven project, we need the hystrix-core and rxjava-core dependencies from
Netflix in the project pom.xml:
<dependency>
<groupId>com.netflix.hystrix</groupId>
<artifactId>hystrix-core</artifactId>
<version>1.5.4</version>
</dependency>

<dependency>
<groupId>com.netflix.rxjava</groupId>
<artifactId>rxjava-core</artifactId>
<version>0.20.7</version>
</dependency>

Setting up Remote Service


Let’s start by simulating a real-world example.
In the example below, the class RemoteServiceTestSimulator represents a service on a remote server. It
has a method which responds with a message after the given period of time. We can imagine that this
wait is a simulation of a time-consuming process at the remote system resulting in a delayed response
to the calling service:

class RemoteServiceTestSimulator {

    private final long wait;

    RemoteServiceTestSimulator(long wait) throws InterruptedException {
        this.wait = wait;
    }

    String execute() throws InterruptedException {
        Thread.sleep(wait);
        return "Success";
    }
}
And here is our sample client that calls the RemoteServiceTestSimulator.

The call to the service is isolated and wrapped in the run() method of a HystrixCommand. It's this
wrapping that provides the resilience we touched upon above:

class RemoteServiceTestCommand extends HystrixCommand<String> {

    private final RemoteServiceTestSimulator remoteService;

    RemoteServiceTestCommand(Setter config, RemoteServiceTestSimulator remoteService) {
        super(config);
        this.remoteService = remoteService;
    }

    @Override
    protected String run() throws Exception {
        return remoteService.execute();
    }
}
The call is executed by calling the execute() method on an instance of the RemoteServiceTestCommand
object.

The following test demonstrates how this is done:

@Test
public void givenSvcTimeoutOf100AndDefaultSettings_whenRemoteSvcExecuted_thenReturnSuccess()
        throws InterruptedException {

    HystrixCommand.Setter config = HystrixCommand
        .Setter
        .withGroupKey(HystrixCommandGroupKey.Factory.asKey("RemoteServiceGroup2"));

    assertThat(new RemoteServiceTestCommand(config, new RemoteServiceTestSimulator(100)).execute(),
        equalTo("Success"));
}
So far we have seen how to wrap remote service calls in the HystrixCommand object. In the section
below let’s look at how to deal with a situation when the remote service starts to deteriorate.

Working with Remote Service and Defensive Programming

Defensive Programming with Timeout


It is general programming practice to set timeouts for calls to remote services.
Let’s begin by looking at how to set timeout on HystrixCommand and how it helps by short circuiting:

@Test
public void givenSvcTimeoutOf5000AndExecTimeoutOf10000_whenRemoteSvcExecuted_thenReturnSuccess()
        throws InterruptedException {

    HystrixCommand.Setter config = HystrixCommand
        .Setter
        .withGroupKey(HystrixCommandGroupKey.Factory.asKey("RemoteServiceGroupTest4"));

    HystrixCommandProperties.Setter commandProperties = HystrixCommandProperties.Setter();
    commandProperties.withExecutionTimeoutInMilliseconds(10_000);
    config.andCommandPropertiesDefaults(commandProperties);

    assertThat(new RemoteServiceTestCommand(config, new RemoteServiceTestSimulator(500)).execute(),
        equalTo("Success"));
}
In the above test, the service delays its response by 500 ms. We set the execution timeout on the
HystrixCommand to 10,000 ms, thus allowing enough time for the remote service to respond.

Now let’s see what happens when the execution timeout is less than the service timeout call:

@Test(expected = HystrixRuntimeException.class)
public void givenSvcTimeoutOf15000AndExecTimeoutOf5000_whenRemoteSvcExecuted_thenExpectHre()
        throws InterruptedException {

    HystrixCommand.Setter config = HystrixCommand
        .Setter
        .withGroupKey(HystrixCommandGroupKey.Factory.asKey("RemoteServiceGroupTest5"));

    HystrixCommandProperties.Setter commandProperties = HystrixCommandProperties.Setter();
    commandProperties.withExecutionTimeoutInMilliseconds(5_000);
    config.andCommandPropertiesDefaults(commandProperties);

    new RemoteServiceTestCommand(config, new RemoteServiceTestSimulator(15_000)).execute();
}
Notice how we’ve lowered the bar and set the execution timeout to 5,000 ms.

We are expecting the service to respond within 5,000 ms, whereas we have set the service to respond
after 15,000 ms. If you notice when you execute the test, the test will exit after 5,000 ms instead of
waiting for 15,000 ms and will throw a HystrixRuntimeException.

This demonstrates how Hystrix does not wait longer than the configured timeout for a response. This
helps make the system protected by Hystrix more responsive.

In the below sections we will look into setting thread pool size which prevents threads being exhausted
and we will discuss its benefit.

Defensive Programming with Limited Thread Pool


Setting timeouts for service calls does not solve all the issues associated with remote services.
When a remote service starts to respond slowly, a typical application will continue to call it.
The application doesn't know whether the remote service is healthy or not, and new threads are spawned
every time a request comes in. These threads pile up on a server that is already struggling.

We don't want this to happen, as we need these threads for other remote calls and processes running on
our server, and we also want to avoid CPU utilization spiking.

Let’s see how to set the thread pool size in HystrixCommand:

@Test
public void givenSvcTimeoutOf500AndExecTimeoutOf10000AndThreadPool_whenRemoteSvcExecuted_thenReturnSuccess()
        throws InterruptedException {

    HystrixCommand.Setter config = HystrixCommand
        .Setter
        .withGroupKey(HystrixCommandGroupKey.Factory.asKey("RemoteServiceGroupThreadPool"));

    HystrixCommandProperties.Setter commandProperties = HystrixCommandProperties.Setter();
    commandProperties.withExecutionTimeoutInMilliseconds(10_000);
    config.andCommandPropertiesDefaults(commandProperties);
    config.andThreadPoolPropertiesDefaults(HystrixThreadPoolProperties.Setter()
        .withMaxQueueSize(10)
        .withCoreSize(3)
        .withQueueSizeRejectionThreshold(10));

    assertThat(new RemoteServiceTestCommand(config, new RemoteServiceTestSimulator(500)).execute(),
        equalTo("Success"));
}
In the above test, we are setting the maximum queue size, the core pool size and the queue rejection
threshold. Hystrix will start rejecting requests once all the threads in the pool are busy and the task
queue has reached a size of 10.
The core size is the number of threads that always stay alive in the thread pool.
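The effect of a bounded pool can be demonstrated with a plain java.util.concurrent.ThreadPoolExecutor (an illustrative sketch of the same idea, not Hystrix's internal implementation; the class name is made up):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPoolDemo {

    // Submits n long-running tasks and counts how many the pool rejects
    // instead of letting them pile up on a struggling server.
    public static int submitTasks(ThreadPoolExecutor pool, int n) {
        int rejected = 0;
        for (int i = 0; i < n; i++) {
            try {
                pool.execute(() -> {
                    try { Thread.sleep(1000); } catch (InterruptedException e) { }
                });
            } catch (RejectedExecutionException e) {
                rejected++;   // pool and queue are full: fail fast
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 3 worker threads and a queue of 10: at most 13 tasks are accepted;
        // anything beyond that is rejected immediately.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            3, 3, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(10),
            new ThreadPoolExecutor.AbortPolicy());
        System.out.println("rejected = " + submitTasks(pool, 14));
        pool.shutdownNow();
    }
}
```

Rejecting the 14th task immediately is exactly the fail-fast behavior the Hystrix thread pool settings above configure per command group.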

Defensive Programming with Short Circuit Breaker Pattern


However, there is still an improvement that we can make to remote service calls.
Let’s consider the case that the remote service has started failing.

We don’t want to keep firing off requests at it and waste resources. We would ideally want to stop
making requests for a certain amount of time in order to give the service time to recover before then
resuming requests. This is what is called the Short Circuit Breaker pattern.
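The essence of the pattern can be sketched in a few lines of plain Java before looking at Hystrix's version (this toy class, its names and its thresholds are made up for illustration and are not Hystrix code):

```java
// Toy circuit breaker: trips open after N consecutive failures and allows
// a single retry (half-open) once the sleep window has elapsed.
class SimpleCircuitBreaker {

    private final int failureThreshold;
    private final long sleepWindowMillis;
    private int consecutiveFailures = 0;
    private long openedAt = -1;

    SimpleCircuitBreaker(int failureThreshold, long sleepWindowMillis) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
    }

    // A request is allowed while the circuit is closed, or once the sleep
    // window has passed (the half-open probe that tests for recovery).
    boolean allowRequest(long nowMillis) {
        if (consecutiveFailures < failureThreshold) return true;
        return nowMillis - openedAt >= sleepWindowMillis;
    }

    void recordSuccess() {            // a success closes the circuit again
        consecutiveFailures = 0;
        openedAt = -1;
    }

    void recordFailure(long nowMillis) {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAt = nowMillis;     // trip the circuit open
        }
    }
}
```

With a threshold of 2 and a 4,000 ms sleep window, two failures trip the circuit, further calls are refused immediately, and once the window elapses one call is let through to probe whether the service has recovered.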

Let’s see how Hystrix implements this pattern:

@Test
public void givenCircuitBreakerSetup_whenRemoteSvcCmdExecuted_thenReturnSuccess()
        throws InterruptedException {

    HystrixCommand.Setter config = HystrixCommand
        .Setter
        .withGroupKey(HystrixCommandGroupKey.Factory.asKey("RemoteServiceGroupCircuitBreaker"));

    HystrixCommandProperties.Setter properties = HystrixCommandProperties.Setter();
    properties.withExecutionTimeoutInMilliseconds(1000);
    properties.withCircuitBreakerSleepWindowInMilliseconds(4000);
    properties.withExecutionIsolationStrategy(
        HystrixCommandProperties.ExecutionIsolationStrategy.THREAD);
    properties.withCircuitBreakerEnabled(true);
    properties.withCircuitBreakerRequestVolumeThreshold(1);

    config.andCommandPropertiesDefaults(properties);
    config.andThreadPoolPropertiesDefaults(HystrixThreadPoolProperties.Setter()
        .withMaxQueueSize(1)
        .withCoreSize(1)
        .withQueueSizeRejectionThreshold(1));

    assertThat(this.invokeRemoteService(config, 10_000), equalTo(null));
    assertThat(this.invokeRemoteService(config, 10_000), equalTo(null));
    assertThat(this.invokeRemoteService(config, 10_000), equalTo(null));

    Thread.sleep(5000);

    assertThat(new RemoteServiceTestCommand(config, new RemoteServiceTestSimulator(500)).execute(),
        equalTo("Success"));

    assertThat(new RemoteServiceTestCommand(config, new RemoteServiceTestSimulator(500)).execute(),
        equalTo("Success"));

    assertThat(new RemoteServiceTestCommand(config, new RemoteServiceTestSimulator(500)).execute(),
        equalTo("Success"));
}

public String invokeRemoteService(HystrixCommand.Setter config, int timeout)
        throws InterruptedException {

    String response = null;

    try {
        response = new RemoteServiceTestCommand(config,
            new RemoteServiceTestSimulator(timeout)).execute();
    } catch (HystrixRuntimeException ex) {
        System.out.println("ex = " + ex);
    }

    return response;
}
In the above test we have set different circuit breaker properties. The most important ones are:

 The CircuitBreakerSleepWindow, which is set to 4,000 ms. This configures the circuit breaker
sleep window and defines the time interval after which requests to the remote service will be
attempted again
 The CircuitBreakerRequestVolumeThreshold, which is set to 1 and defines the minimum
number of requests needed before the failure rate is considered.

With the above settings in place, our HystrixCommand will now trip open after two failed requests. The
third request will not even hit the remote service: Hystrix will short-circuit it, and our method will
return null as the response.

The test then adds a Thread.sleep(5000) to cross the limit of the sleep window that we set. This
causes Hystrix to close the circuit, and the subsequent requests flow through successfully.

Conclusion
In summary, Hystrix is designed to:
 Provide protection and control over failures and latency from services typically accessed over
the network
 Stop the cascading of failures resulting from some of the services being down
 Fail fast and rapidly recover
 Degrade gracefully where possible
 Enable real-time monitoring and alerting on failures

References
 https://www.baeldung.com/introduction-to-hystrix

Zookeeper

What is a Distributed System?


A distributed application is an application which can run on multiple systems in a network, its parts
running simultaneously and coordinating with one another to complete a certain task. Such tasks might
take hours to complete for a non-distributed application.
The time to complete a task can be further reduced by configuring the distributed application to run
on more systems. A group of systems in which a distributed application is running is called a Cluster,
and each machine in a cluster is called a Node.
A distributed application has two parts, the Server and the Client application. Server applications are
distributed and have a common interface, so that clients can connect to any server in the cluster and get
the same result. Client applications are the tools used to interact with a distributed application.
Benefits of Distributed Applications
 Reliability − Failure of a single or a few systems does not make the whole system fail.
 Scalability − Performance can be increased as and when needed by adding more machines with
minor change in the configuration of the application with no downtime.
 Transparency − Hides the complexity of the system and shows itself as a single entity /
application.

Challenges of Distributed Applications


 Race condition − Two or more machines trying to perform a task, which actually needs to be
done only by a single machine at any given time. For example, shared resources should only be
modified by a single machine at any given time.
 Deadlock − Two or more operations waiting for each other to complete indefinitely.
 Inconsistency − Partial failures leave data in an inconsistent state.

What is Zookeeper?
Apache ZooKeeper is an open source distributed coordination service that helps you manage a large set
of hosts. Management and coordination in a distributed environment are tricky. ZooKeeper automates
this process and allows developers to focus on building software features rather than worrying about the
distributed nature of their application.
ZooKeeper helps you maintain configuration information, naming and group services for distributed
applications. It implements different protocols on the cluster so that applications do not have to
implement them on their own. It provides a single coherent view of multiple machines.


The ZooKeeper framework was originally built at “Yahoo!” for accessing their applications in an easy and
robust manner. Later, Apache ZooKeeper became a standard for organized service used by Hadoop,
HBase, and other distributed frameworks. For example, Apache HBase uses ZooKeeper to track the
status of distributed data.

Why Apache Zookeeper?


Here are the important reasons behind the popularity of ZooKeeper:
 It allows for mutual exclusion and cooperation between server processes
 It ensures that your application runs consistently
 A transaction is never partially completed; it is given the status of either success or failure.
The distributed state can be held up, but it is never wrong
 Irrespective of the server it connects to, a client sees the same view of the service
 It helps you encode data according to a specific set of rules
 It helps maintain a standard, hierarchical namespace similar to files and directories
 The machines involved run as a single system, and can be locally or geographically connected
 Nodes can join or leave the cluster, and node status is tracked in real time
 You can increase performance by deploying more machines
 It allows you to elect a node as a leader for better coordination
 ZooKeeper works fast with workloads where reads are more common than writes

ZooKeeper Architecture: How it works?


 Zookeeper follows a Client-Server Architecture
 All systems store a copy of the data
 Leaders are elected at startup

Server: The server sends an acknowledgement when a client connects. If there is no response from the
connected server, the client automatically redirects the message to another server.

Client: A client is one of the nodes in the distributed application cluster. It helps you access
information from the server. Every client sends a message to the server at regular intervals, which lets
the server know that the client is alive.

Leader: One of the servers is designated the Leader. It gives all the information to the clients, as well
as an acknowledgement that the server is alive. It performs automatic recovery if any of the connected
nodes fail.

Follower: A server node which follows the leader's instructions is called a follower.

Client read requests are handled by the ZooKeeper server the client is connected to, while client write
requests are handled by the ZooKeeper leader.
Ensemble/Cluster: A group of ZooKeeper servers is called an ensemble or a cluster. You run the
ZooKeeper infrastructure in cluster mode to keep the system working at an optimal value.

ZooKeeper WebUI: If you want to work with ZooKeeper resource management, you can use a WebUI. It
allows you to work with ZooKeeper through a web user interface instead of the command line, and offers
fast and effective communication with the ZooKeeper application.
The Zookeeper Data Model (ZDM)

The ZooKeeper data model follows a hierarchical namespace in which each node is called a ZNode
(distinct from the machine nodes on which the cluster runs).

Every ZNode has data and may or may not have children.

ZNode paths:
 Canonical, slash-separated and absolute
 Do not use any relative references
 Names may contain Unicode characters

Each ZNode maintains a stat structure and a version number for data changes.

Types of Zookeeper Nodes


There are three types of znodes:
 Persistent znode: This type of znode stays alive even after the client which created it has
disconnected. By default, all znodes are persistent unless otherwise specified.
 Ephemeral znode: This type of znode is alive only as long as the client that created it is alive;
when the client disconnects from ZooKeeper, the znode is deleted. Ephemeral nodes are not
allowed to have children.
 Sequential znode: Sequential znodes can be either ephemeral or persistent. When a new
znode is created as a sequential znode, ZooKeeper sets its path by appending a 10-digit
sequence number to the original name.
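These znode types can be tried out with ZooKeeper's command-line shell, zkCli.sh, which ships with ZooKeeper (the paths and data below are made up for illustration, and a running ZooKeeper server is assumed; `get -w` sets a watch in ZooKeeper 3.5+):

```shell
# Inside zkCli.sh, connected to a running ensemble:

# Persistent znode (the default) -- survives the creating session
create /app_config "db=host1"

# Ephemeral znode (-e) -- removed automatically when this session ends
create -e /app_config/worker "worker-1"

# Sequential znode (-s) -- a 10-digit counter is appended to the name,
# e.g. /app_config/lock-0000000001
create -s /app_config/lock- ""

# Read the data and set a one-time watch that fires when it changes
get -w /app_config
```

Ephemeral-sequential znodes combined like this are the building block for classic ZooKeeper recipes such as distributed locks and leader election.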

ZDM- Watches
In Zookeeper, a watch event is a one-time trigger which is sent to the client that set the watch. It occurs
when the data that the watch was set on changes. A ZDM watch allows clients to get notifications when a
znode changes. ZDM read operations like getData(), getChildren(), and exists() have the option of setting
a watch.

Watches are ordered; the order of watch events corresponds to the order of the updates. A client will be
able to see a watch event for a znode before seeing the new data which corresponds to that znode.
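The one-time-trigger behaviour can be sketched with a small in-process model (a simplification for illustration; real watches are registered on reads and delivered by the ZooKeeper server):

```python
# Simplified model of a ZooKeeper watch: it fires once on the first data
# change and is then discarded; the client must re-register to keep watching.
class WatchedNode:
    def __init__(self, data):
        self.data = data
        self.watchers = []        # callbacks registered by reads that set a watch

    def get_data(self, watch=None):
        if watch is not None:
            self.watchers.append(watch)
        return self.data

    def set_data(self, new_data):
        self.data = new_data
        fired, self.watchers = self.watchers, []   # one-time: clear before firing
        for cb in fired:
            cb("NodeDataChanged")

events = []
node = WatchedNode(b"v1")
node.get_data(watch=events.append)
node.set_data(b"v2")   # the watch fires once
node.set_data(b"v3")   # no watch is registered any more, so nothing fires
print(events)          # ['NodeDataChanged']
```

Note that the second update produces no event, which is exactly why clients that need continuous notifications must set a new watch on every read.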

ZDM- Access Control List

Zookeeper uses ACLs to control access to its znodes. An ACL is made up of pairs of (scheme:id, permission).
Built-in ACL schemes:
 world: has a single id, anyone
 auth: doesn't use any id; it represents any authenticated user
 digest: uses a username:password
 host: uses the client's hostname as the ACL id identity
 ip: uses the client's host IP address as the ACL id identity

ACL permissions:
 CREATE
 READ
 WRITE
 DELETE
 ADMIN

E.g. (ip:192.168.0.0/16, READ)
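A minimal sketch of how such (scheme:id, permission) pairs might be checked follows. This is a toy model for illustration, not ZooKeeper's actual matching logic; in particular the `ip` scheme here uses a simple prefix match instead of real CIDR matching:

```python
# Toy ACL check: an ACL is a list of (scheme, id, permissions) entries; a
# request is allowed if some entry matches the client and grants the permission.
def is_allowed(acl, client, permission):
    for scheme, ident, perms in acl:
        if permission not in perms:
            continue
        if scheme == "world" and ident == "anyone":
            return True
        if scheme == "ip" and client.get("ip", "").startswith(ident):
            return True   # simplified prefix match, not real CIDR matching
        if scheme == "digest" and client.get("auth") == ident:
            return True
    return False

acl = [("ip", "192.168.", {"READ"}),
       ("digest", "admin:secret", {"READ", "WRITE", "ADMIN"})]

print(is_allowed(acl, {"ip": "192.168.0.7"}, "READ"))      # True
print(is_allowed(acl, {"ip": "10.0.0.1"}, "READ"))         # False
print(is_allowed(acl, {"auth": "admin:secret"}, "WRITE"))  # True
```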

The ZKS - Session States and Lifetime

Before executing any request, the client must first establish a session with the service
All operations a client sends to the service are automatically associated with that session
The client may connect to any server in the cluster, but it will be connected to only a single server
The session provides "order guarantees": the requests in a session are executed in FIFO order
The main states for a session are 1) Connecting, 2) Connected, 3) Closed, 4) Not Connected.
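The FIFO order guarantee can be illustrated with a simple queue model (a toy sketch, not the actual client implementation):

```python
from collections import deque

# Toy session: requests submitted on one session are executed strictly in the
# order they were submitted (FIFO), which is the guarantee ZooKeeper provides.
class ToySession:
    def __init__(self):
        self.pending = deque()
        self.executed = []

    def submit(self, request):
        self.pending.append(request)

    def process_all(self):
        while self.pending:
            self.executed.append(self.pending.popleft())  # oldest request first

s = ToySession()
for req in ["create /a", "setData /a", "delete /a"]:
    s.submit(req)
s.process_all()
print(s.executed)   # ['create /a', 'setData /a', 'delete /a']
```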
How to install ZooKeeper
Step 1) Go to this link and click "Continue to Subscribe"
Step 2) On the next page, click "Accept Terms"

Step 3) You will see the following message

Step 4) Refresh the page after 5 minutes and click "Continue to Configure"
Step 5) On the next screen, click "Continue to Launch"

Step 6) You are done!

Apache ZooKeeper Applications

Apache Zookeeper is used for the following purposes:

Managing the configuration
Naming services
Choosing the leader
Queuing the messages
Managing the notification system
Synchronization
Distributed Cluster Management
Companies using Zookeeper
Yahoo
Facebook
eBay
Twitter
Netflix
Zynga
Nutanix
Disadvantages of using Zookeeper
Data loss may occur if you are adding new Zookeeper servers
No migration allowed for users
Does not offer support for rack placement and awareness
Zookeeper does not allow you to reduce the number of pods, to prevent accidental data loss
You can't switch a service to host networking without a full re-installation when the service is deployed on
a virtual network
The service doesn't support changing volume requirements once the initial deployment is over
A large number of nodes is involved, so there could be more than one point of failure
Messages can be lost in the communication network, which requires special software to recover them
Summary
A distributed application is an application which can run on multiple systems in a network
Apache Zookeeper is an open source distributed coordination service that helps you manage a large set
of hosts
It allows for mutual exclusion and cooperation between server processes
Server, Client, Leader, Follower, Ensemble/Cluster, ZooKeeper WebUI are important zookeeper
components
Three types of Znodes are Persistence, Ephemeral and sequential
A ZDM watch is a one-time trigger which is sent to the client that set the watch. It occurs when the data
that the watch was set on changes
Zookeeper uses ACLs to control access to its znodes
Zookeeper is used for managing the configuration, naming services, choosing the leader, queuing
messages, managing the notification system, synchronization, distributed cluster management, etc.
Yahoo, Facebook, eBay, Twitter, and Netflix are some well-known companies using Zookeeper
The main drawback of the tool is that data loss may occur if you are adding new Zookeeper servers
Learn Databases

Redis
A database is a crucial aspect of an application that is often only considered as an afterthought.
However, for many developers, deciding which database to use when building apps is a critical decision.
Among the many popular databases such as MySQL, MongoDB, and Oracle, Redis is slowly gaining
popularity within the NoSQL space. Although it already plays a supporting role for many companies,
including Twitter and GitHub, Redis is now gaining traction as a primary database.

Redis is an open-source data structure server that allows developers to organize data using a key-value
storage method. This powerful database is perfect for high performance jobs such as caching. Redis is a
no-fuss and fast database for many different functions including as a cache or a message broker.

Redis has three main peculiarities that set it apart.


 Redis holds its database entirely in the memory, using the disk only for persistence.
 Redis has a relatively rich set of data types when compared to many key-value data stores.
 Redis can replicate data to any number of slaves.

Redis Advantages
Following are certain advantages of Redis.
 Exceptionally fast − Redis is very fast and can perform about 110000 SETs per second, about
81000 GETs per second.
 Supports rich data types − Redis natively supports most of the datatypes that developers
already know such as list, set, sorted set, and hashes. This makes it easy to solve a variety of
problems as we know which problem can be handled better by which data type.
 Operations are atomic − All Redis operations are atomic, which ensures that if two clients
access a value concurrently, the Redis server will always serve the updated value.
 Multi-utility tool − Redis is a multi-utility tool and can be used in several use cases such as
caching, messaging-queues (Redis natively supports Publish/Subscribe), any short-lived data in
your application, such as web application sessions, web page hit counts, etc.

Redis Versus Other Key-value Stores


 Redis follows a different evolution path among key-value DBs: values can contain more complex
data types, with atomic operations defined on those data types.
 Redis is an in-memory database that is also persisted on disk, hence it represents a different
trade-off: very high write and read speed is achieved, with the limitation that data sets
can't be larger than memory.
 Another advantage of in-memory databases is that the memory representation of complex data
structures is much simpler to manipulate compared to the same data structure on disk. Thus,
Redis can do a lot with little internal complexity.

Install Redis on Ubuntu


To install Redis on Ubuntu, go to the terminal and type the following commands −
$sudo apt-get update
$sudo apt-get install redis-server
This will install Redis on your machine.

Start Redis
$redis-server
Check If Redis is Working
$redis-cli
This will open a redis prompt.
redis 127.0.0.1:6379>
In the above prompt, 127.0.0.1 is your machine's loopback address and 6379 is the default port on which
the Redis server is running. Now type the following PING command.

redis 127.0.0.1:6379> ping


PONG
This shows that Redis is successfully installed on your machine.

Run Commands on the Remote Server


To run commands on Redis remote server, you need to connect to the server by the same client redis-cli

Syntax
$ redis-cli -h host -p port -a password
Example
Following example shows how to connect to a Redis remote server running on host 127.0.0.1, port 6379,
with password mypass.

$redis-cli -h 127.0.0.1 -p 6379 -a "mypass"


redis 127.0.0.1:6379>
redis 127.0.0.1:6379> PING
PONG

Install Redis Desktop Manager on Ubuntu


To install Redis desktop manager on Ubuntu, just download the package from
https://redisdesktop.com/download

Open the downloaded package and install it.


Redis desktop manager will give you UI to manage your Redis keys and data.
Redis - Configuration
In Redis, there is a configuration file (redis.conf) available at the root directory of Redis. You can
also get and set all Redis configuration settings through the Redis CONFIG command.

Syntax
Following is the basic syntax of Redis CONFIG command.
redis 127.0.0.1:6379> CONFIG GET CONFIG_SETTING_NAME
Example
redis 127.0.0.1:6379> CONFIG GET loglevel
1) "loglevel"
2) "notice"

To get all configuration settings, use * in place of CONFIG_SETTING_NAME


Example
redis 127.0.0.1:6379> CONFIG GET *
1) "dbfilename"
2) "dump.rdb"
3) "requirepass"
4) ""
5) "masterauth"
6) ""
7) "unixsocket"
8) ""
9) "logfile"
10) ""
11) "pidfile"
12) "/var/run/redis.pid"
13) "maxmemory"
14) "0"
15) "maxmemory-samples"
16) "3"
17) "timeout"
18) "0"
19) "tcp-keepalive"
20) "0"
21) "auto-aof-rewrite-percentage"
22) "100"
23) "auto-aof-rewrite-min-size"
24) "67108864"
25) "hash-max-ziplist-entries"
26) "512"
27) "hash-max-ziplist-value"
28) "64"
29) "list-max-ziplist-entries"
30) "512"
31) "list-max-ziplist-value"
32) "64"
33) "set-max-intset-entries"
34) "512"
35) "zset-max-ziplist-entries"
36) "128"
37) "zset-max-ziplist-value"
38) "64"
39) "hll-sparse-max-bytes"
40) "3000"
41) "lua-time-limit"
42) "5000"
43) "slowlog-log-slower-than"
44) "10000"
45) "latency-monitor-threshold"
46) "0"
47) "slowlog-max-len"
48) "128"
49) "port"
50) "6379"
51) "tcp-backlog"
52) "511"
53) "databases"
54) "16"
55) "repl-ping-slave-period"
56) "10"
57) "repl-timeout"
58) "60"
59) "repl-backlog-size"
60) "1048576"
61) "repl-backlog-ttl"
62) "3600"
63) "maxclients"
64) "4064"
65) "watchdog-period"
66) "0"
67) "slave-priority"
68) "100"
69) "min-slaves-to-write"
70) "0"
71) "min-slaves-max-lag"
72) "10"
73) "hz"
74) "10"
75) "no-appendfsync-on-rewrite"
76) "no"
77) "slave-serve-stale-data"
78) "yes"
79) "slave-read-only"
80) "yes"
81) "stop-writes-on-bgsave-error"
82) "yes"
83) "daemonize"
84) "no"
85) "rdbcompression"
86) "yes"
87) "rdbchecksum"
88) "yes"
89) "activerehashing"
90) "yes"
91) "repl-disable-tcp-nodelay"
92) "no"
93) "aof-rewrite-incremental-fsync"
94) "yes"
95) "appendonly"
96) "no"
97) "dir"
98) "/home/deepak/Downloads/redis-2.8.13/src"
99) "maxmemory-policy"
100) "volatile-lru"
101) "appendfsync"
102) "everysec"
103) "save"
104) "3600 1 300 100 60 10000"
105) "loglevel"
106) "notice"
107) "client-output-buffer-limit"
108) "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60"
109) "unixsocketperm"
110) "0"
111) "slaveof"
112) ""
113) "notify-keyspace-events"
114) ""
115) "bind"
116) ""

Edit Configuration
To update the configuration, you can edit the redis.conf file directly or update configurations via the
CONFIG SET command.

Syntax
Following is the basic syntax of CONFIG SET command.
redis 127.0.0.1:6379> CONFIG SET CONFIG_SETTING_NAME NEW_CONFIG_VALUE
Example
redis 127.0.0.1:6379> CONFIG SET loglevel "notice"
OK
redis 127.0.0.1:6379> CONFIG GET loglevel
1) "loglevel"
2) "notice"

Redis - Data Types

Strings
Redis string is a sequence of bytes. Strings in Redis are binary safe, meaning they have a known length
not determined by any special terminating characters. Thus, you can store anything up to 512
megabytes in one string.

Example
redis 127.0.0.1:6379> SET name "tutorialspoint"
OK
redis 127.0.0.1:6379> GET name
"tutorialspoint"
In the above example, SET and GET are Redis commands, name is the key used in Redis and
tutorialspoint is the string value that is stored in Redis.
Note − A string value can be at max 512 megabytes in length.

Hashes
A Redis hash is a collection of key value pairs. Redis Hashes are maps between string fields and string
values. Hence, they are used to represent objects.

Example
redis 127.0.0.1:6379> HMSET user:1 username tutorialspoint password
tutorialspoint points 200
OK
redis 127.0.0.1:6379> HGETALL user:1
1) "username"
2) "tutorialspoint"
3) "password"
4) "tutorialspoint"
5) "points"
6) "200"
In the above example, the hash data type is used to store the user's object, which contains basic
information about the user. Here HMSET and HGETALL are Redis commands, while user:1 is the key.
Every hash can store up to 2^32 - 1 field-value pairs (more than 4 billion).

Lists
Redis Lists are simply lists of strings, sorted by insertion order. You can add elements to a Redis List on
the head or on the tail.

Example
redis 127.0.0.1:6379> lpush tutoriallist redis
(integer) 1
redis 127.0.0.1:6379> lpush tutoriallist mongodb
(integer) 2
redis 127.0.0.1:6379> lpush tutoriallist rabbitmq
(integer) 3
redis 127.0.0.1:6379> lrange tutoriallist 0 10

1) "rabbitmq"
2) "mongodb"
3) "redis"
The max length of a list is 2^32 - 1 elements (4294967295, more than 4 billion elements per list).

Sets
Redis Sets are an unordered collection of strings. In Redis, you can add, remove, and test for the
existence of members in O(1) time complexity.

Example
redis 127.0.0.1:6379> sadd tutoriallist redis
(integer) 1
redis 127.0.0.1:6379> sadd tutoriallist mongodb
(integer) 1
redis 127.0.0.1:6379> sadd tutoriallist rabbitmq
(integer) 1
redis 127.0.0.1:6379> sadd tutoriallist rabbitmq
(integer) 0
redis 127.0.0.1:6379> smembers tutoriallist
1) "rabbitmq"
2) "mongodb"
3) "redis"
Note − In the above example, rabbitmq is added twice; however, due to the unique property of a set, it is
added only once.
The max number of members in a set is 2^32 - 1 (4294967295, more than 4 billion members per set).

Sorted Sets
Redis Sorted Sets are similar to Redis Sets: non-repeating collections of strings. The difference is that
every member of a sorted set is associated with a score, which is used to keep the sorted set ordered,
from the smallest to the greatest score. While members are unique, the scores may be repeated.

Example
redis 127.0.0.1:6379> zadd tutoriallist 0 redis
(integer) 1
redis 127.0.0.1:6379> zadd tutoriallist 0 mongodb
(integer) 1
redis 127.0.0.1:6379> zadd tutoriallist 0 rabbitmq
(integer) 1
redis 127.0.0.1:6379> zadd tutoriallist 0 rabbitmq
(integer) 0
redis 127.0.0.1:6379> ZRANGEBYSCORE tutoriallist 0 1000
1) "mongodb"
2) "rabbitmq"
3) "redis"

Redis - Keys

Redis keys commands are used for managing keys in Redis. Following is the syntax for using redis keys
commands.

Syntax
redis 127.0.0.1:6379> COMMAND KEY_NAME
Example
redis 127.0.0.1:6379> SET tutorialspoint redis
OK
redis 127.0.0.1:6379> DEL tutorialspoint
(integer) 1
In the above example, DEL is the command, while tutorialspoint is the key. If the key is deleted, then the
output of the command will be (integer) 1, otherwise it will be (integer) 0.

Redis Keys Commands


Following table lists some basic commands related to keys.

Sr.No Command & Description

1 DEL key
This command deletes the key, if it exists.

2 DUMP key
This command returns a serialized version of the value stored at the specified key.

3 EXISTS key
This command checks whether the key exists or not.

4 EXPIRE key seconds
Sets the expiry of the key after the specified time in seconds.

5 EXPIREAT key timestamp
Sets the expiry of the key at the specified time. Here, time is in Unix timestamp format.

6 PEXPIRE key milliseconds
Sets the expiry of the key in milliseconds.

7 PEXPIREAT key milliseconds-timestamp
Sets the expiry of the key at a Unix timestamp specified in milliseconds.

8 KEYS pattern
Finds all keys matching the specified pattern.

9 MOVE key db
Moves a key to another database.

10 PERSIST key
Removes the expiration from the key.

11 PTTL key
Gets the remaining time in the key's expiry, in milliseconds.

12 TTL key
Gets the remaining time in the key's expiry, in seconds.

13 RANDOMKEY
Returns a random key from Redis.

14 RENAME key newkey
Changes the key's name.

15 RENAMENX key newkey
Renames the key, only if the new key doesn't exist.

16 TYPE key
Returns the data type of the value stored in the key.
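The EXPIRE/TTL/PERSIST behaviour in the table above can be sketched with a small in-memory model. This is a toy illustration using an injected clock so the example is deterministic, not the redis client itself:

```python
# Toy model of Redis key expiry: EXPIRE attaches a deadline, TTL reports the
# seconds remaining (-1 when no expiry is set), PERSIST removes the deadline.
class ToyStore:
    def __init__(self):
        self.data = {}
        self.expires = {}     # key -> absolute deadline in seconds
        self.now = 0          # injected clock for a deterministic example

    def set(self, key, value):
        self.data[key] = value

    def expire(self, key, seconds):
        self.expires[key] = self.now + seconds

    def ttl(self, key):
        if key not in self.expires:
            return -1         # as in Redis: -1 means no expiry is set
        return self.expires[key] - self.now

    def persist(self, key):
        self.expires.pop(key, None)

db = ToyStore()
db.set("session", "abc")
db.expire("session", 10)
db.now = 4                 # four "seconds" later
t1 = db.ttl("session")
db.persist("session")
t2 = db.ttl("session")
print(t1, t2)   # 6 -1
```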

Redis - Strings
Redis strings commands are used for managing string values in Redis. Following is the syntax for using
Redis string commands.

Syntax
redis 127.0.0.1:6379> COMMAND KEY_NAME
Example
redis 127.0.0.1:6379> SET tutorialspoint redis
OK
redis 127.0.0.1:6379> GET tutorialspoint
"redis"
In the above example, SET and GET are the commands, while tutorialspoint is the key.

Redis Strings Commands


Following table lists some basic commands to manage strings in Redis.

Sr.No Command & Description


1 SET key value
This command sets the value at the specified key.

2 GET key
Gets the value of a key.

3 GETRANGE key start end


Gets a substring of the string stored at a key.
4 GETSET key value
Sets the string value of a key and return its old value.

5 GETBIT key offset


Returns the bit value at the offset in the string value stored at the key.

6 MGET key1 [key2..]


Gets the values of all the given keys

7 SETBIT key offset value


Sets or clears the bit at the offset in the string value stored at the key

8 SETEX key seconds value


Sets the value with the expiry of a key

9 SETNX key value


Sets the value of a key, only if the key does not exist

10 SETRANGE key offset value


Overwrites the part of a string at the key starting at the specified offset

11 STRLEN key
Gets the length of the value stored in a key

12 MSET key value [key value ...]


Sets multiple keys to multiple values

13 MSETNX key value [key value ...]


Sets multiple keys to multiple values, only if none of the keys exist

14 PSETEX key milliseconds value


Sets the value and expiration in milliseconds of a key

15 INCR key
Increments the integer value of a key by one

16 INCRBY key increment


Increments the integer value of a key by the given amount

17 INCRBYFLOAT key increment


Increments the float value of a key by the given amount

18 DECR key
Decrements the integer value of a key by one

19 DECRBY key decrement


Decrements the integer value of a key by the given number
20 APPEND key value
Appends a value to a key

Redis - Hashes
Redis Hashes are maps between string fields and string values. Hence, they are the perfect data
type to represent objects.
In Redis, every hash can store up to 2^32 - 1 field-value pairs (more than 4 billion).

Example
redis 127.0.0.1:6379> HMSET tutorialspoint name "redis tutorial"
description "redis basic commands for caching" likes 20 visitors 23000
OK
redis 127.0.0.1:6379> HGETALL tutorialspoint
1) "name"
2) "redis tutorial"
3) "description"
4) "redis basic commands for caching"
5) "likes"
6) "20"
7) "visitors"
8) "23000"
In the above example, we have set Redis tutorials detail (name, description, likes, visitors) in hash
named ‘tutorialspoint’.

Redis Hash Commands


Following table lists some basic commands related to hash.

Sr.No Command & Description


1 HDEL key field1 [field2]
Deletes one or more hash fields.

2 HEXISTS key field


Determines whether a hash field exists or not.

3 HGET key field


Gets the value of a hash field stored at the specified key.

4 HGETALL key
Gets all the fields and values stored in a hash at the specified key

5 HINCRBY key field increment


Increments the integer value of a hash field by the given number

6 HINCRBYFLOAT key field increment


Increments the float value of a hash field by the given amount

7 HKEYS key
Gets all the fields in a hash
8 HLEN key
Gets the number of fields in a hash

9 HMGET key field1 [field2]


Gets the values of all the given hash fields

10 HMSET key field1 value1 [field2 value2 ]


Sets multiple hash fields to multiple values

11 HSET key field value


Sets the string value of a hash field

12 HSETNX key field value


Sets the value of a hash field, only if the field does not exist

13 HVALS key
Gets all the values in a hash

14 HSCAN key cursor [MATCH pattern] [COUNT count]


Incrementally iterates hash fields and associated values

Redis - Lists
Redis Lists are simply lists of strings, sorted by insertion order. You can add elements to a Redis list at
the head or the tail of the list.

The maximum length of a list is 2^32 - 1 elements (4294967295, more than 4 billion elements per list).

Example
redis 127.0.0.1:6379> LPUSH tutorials redis
(integer) 1
redis 127.0.0.1:6379> LPUSH tutorials mongodb
(integer) 2
redis 127.0.0.1:6379> LPUSH tutorials mysql
(integer) 3
redis 127.0.0.1:6379> LRANGE tutorials 0 10
1) "mysql"
2) "mongodb"
3) "redis"
In the above example, three values are inserted in Redis list named ‘tutorials’ by the command LPUSH.

Redis Lists Commands


Following table lists some basic commands related to lists.

Sr.No Command & Description


1 BLPOP key1 [key2 ] timeout
Removes and gets the first element in a list, or blocks until one is available
2 BRPOP key1 [key2 ] timeout
Removes and gets the last element in a list, or blocks until one is available

3 BRPOPLPUSH source destination timeout


Pops a value from a list, pushes it to another list and returns it; or blocks until one is available

4 LINDEX key index


Gets an element from a list by its index

5 LINSERT key BEFORE|AFTER pivot value


Inserts an element before or after another element in a list

6 LLEN key
Gets the length of a list

7 LPOP key
Removes and gets the first element in a list

8 LPUSH key value1 [value2]


Prepends one or multiple values to a list

9 LPUSHX key value


Prepends a value to a list, only if the list exists

10 LRANGE key start stop


Gets a range of elements from a list

11 LREM key count value


Removes elements from a list

12 LSET key index value


Sets the value of an element in a list by its index

13 LTRIM key start stop


Trims a list to the specified range

14 RPOP key
Removes and gets the last element in a list

15 RPOPLPUSH source destination


Removes the last element in a list, appends it to another list and returns it

16 RPUSH key value1 [value2]


Appends one or multiple values to a list

17 RPUSHX key value


Appends a value to a list, only if the list exists
Redis - Sets
Redis Sets are an unordered collection of unique strings. Unique means a set does not allow repetition of
data within a key.
In a Redis set, you can add, remove, and test for the existence of members in O(1) (constant time
regardless of the number of elements contained inside the set). The maximum number of members in a
set is 2^32 - 1 (4294967295, more than 4 billion members per set).

Example
redis 127.0.0.1:6379> SADD tutorials redis
(integer) 1
redis 127.0.0.1:6379> SADD tutorials mongodb
(integer) 1
redis 127.0.0.1:6379> SADD tutorials mysql
(integer) 1
redis 127.0.0.1:6379> SADD tutorials mysql
(integer) 0
redis 127.0.0.1:6379> SMEMBERS tutorials
1) "mysql"
2) "mongodb"
3) "redis"
In the above example, three values are inserted in Redis set named ‘tutorials’ by the command SADD.

Redis Sets Commands


Following table lists some basic commands related to sets.

Sr.No Command & Description


1 SADD key member1 [member2]
Adds one or more members to a set

2 SCARD key
Gets the number of members in a set

3 SDIFF key1 [key2]


Subtracts multiple sets

4 SDIFFSTORE destination key1 [key2]


Subtracts multiple sets and stores the resulting set in a key

5 SINTER key1 [key2]


Intersects multiple sets

6 SINTERSTORE destination key1 [key2]


Intersects multiple sets and stores the resulting set in a key

7 SISMEMBER key member


Determines if a given value is a member of a set

8 SMEMBERS key
Gets all the members in a set

9 SMOVE source destination member


Moves a member from one set to another

10 SPOP key
Removes and returns a random member from a set

11 SRANDMEMBER key [count]


Gets one or multiple random members from a set

12 SREM key member1 [member2]


Removes one or more members from a set

13 SUNION key1 [key2]


Adds multiple sets

14 SUNIONSTORE destination key1 [key2]


Adds multiple sets and stores the resulting set in a key

15 SSCAN key cursor [MATCH pattern] [COUNT count]


Incrementally iterates set elements

Redis - Sorted Sets


Redis Sorted Sets are similar to Redis Sets, with the unique feature that every member of a sorted set
is associated with a score, which is used to keep the sorted set ordered, from the smallest to the
greatest score.
In a Redis sorted set, adding, removing, and updating members run in O(log N), while testing for the
existence of a member is O(1). The maximum number of members in a sorted set is 2^32 - 1
(4294967295, more than 4 billion members per set).

Example
redis 127.0.0.1:6379> ZADD tutorials 1 redis
(integer) 1
redis 127.0.0.1:6379> ZADD tutorials 2 mongodb
(integer) 1
redis 127.0.0.1:6379> ZADD tutorials 3 mysql
(integer) 1
redis 127.0.0.1:6379> ZADD tutorials 3 mysql
(integer) 0
redis 127.0.0.1:6379> ZADD tutorials 4 mysql
(integer) 0
redis 127.0.0.1:6379> ZRANGE tutorials 0 10 WITHSCORES
1) "redis"
2) "1"
3) "mongodb"
4) "2"
5) "mysql"
6) "4"
In the above example, three values are inserted with their scores in the Redis sorted set named
‘tutorials’ by the command ZADD.
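The behaviour above, where re-adding a member updates its score rather than duplicating it, can be modelled with a plain dict. This is a toy sketch of the semantics, not Redis itself:

```python
# Toy sorted set: a dict from member -> score. ZADD returns 1 only when the
# member is new, and a range query orders members by ascending score.
def zadd(zset, score, member):
    added = 0 if member in zset else 1
    zset[member] = score          # re-adding an existing member updates the score
    return added

def zrange_withscores(zset):
    # ties on score are broken lexicographically, as in Redis
    return sorted(zset.items(), key=lambda kv: (kv[1], kv[0]))

tutorials = {}
print(zadd(tutorials, 1, "redis"))    # 1
print(zadd(tutorials, 2, "mongodb"))  # 1
print(zadd(tutorials, 3, "mysql"))    # 1
print(zadd(tutorials, 4, "mysql"))    # 0  (score updated to 4)
print(zrange_withscores(tutorials))
# [('redis', 1), ('mongodb', 2), ('mysql', 4)]
```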

Redis Sorted Sets Commands


Following table lists some basic commands related to sorted sets.

Sr.No Command & Description


1 ZADD key score1 member1 [score2 member2]
Adds one or more members to a sorted set, or updates its score, if it already exists

2 ZCARD key
Gets the number of members in a sorted set

3 ZCOUNT key min max


Counts the members in a sorted set with scores within the given values

4 ZINCRBY key increment member


Increments the score of a member in a sorted set

5 ZINTERSTORE destination numkeys key [key ...]


Intersects multiple sorted sets and stores the resulting sorted set in a new key

6 ZLEXCOUNT key min max


Counts the number of members in a sorted set between a given lexicographical range

7 ZRANGE key start stop [WITHSCORES]


Returns a range of members in a sorted set, by index

8 ZRANGEBYLEX key min max [LIMIT offset count]


Returns a range of members in a sorted set, by lexicographical range

9 ZRANGEBYSCORE key min max [WITHSCORES] [LIMIT]


Returns a range of members in a sorted set, by score

10 ZRANK key member


Determines the index of a member in a sorted set

11 ZREM key member [member ...]


Removes one or more members from a sorted set

12 ZREMRANGEBYLEX key min max


Removes all members in a sorted set between the given lexicographical range

13 ZREMRANGEBYRANK key start stop


Removes all members in a sorted set within the given indexes

14 ZREMRANGEBYSCORE key min max


Removes all members in a sorted set within the given scores

15 ZREVRANGE key start stop [WITHSCORES]


Returns a range of members in a sorted set, by index, with scores ordered from high to low

16 ZREVRANGEBYSCORE key max min [WITHSCORES]


Returns a range of members in a sorted set, by score, with scores ordered from high to low

17 ZREVRANK key member


Determines the index of a member in a sorted set, with scores ordered from high to low

18 ZSCORE key member


Gets the score associated with the given member in a sorted set

19 ZUNIONSTORE destination numkeys key [key ...]


Adds multiple sorted sets and stores the resulting sorted set in a new key

20 ZSCAN key cursor [MATCH pattern] [COUNT count]


Incrementally iterates sorted sets elements and associated scores

Redis - HyperLogLog
Redis HyperLogLog is an algorithm that uses randomization to approximate the number of unique
elements in a set using just a constant, small amount of memory.
HyperLogLog provides a very good approximation of the cardinality of a set using a very small
amount of memory, around 12 KB per key, with a standard error of 0.81%. There is no limit to the
number of items you can count, unless you approach 2^64 items.

Example
Following example explains how Redis HyperLogLog works.
redis 127.0.0.1:6379> PFADD tutorials "redis"
1) (integer) 1
redis 127.0.0.1:6379> PFADD tutorials "mongodb"
1) (integer) 1
redis 127.0.0.1:6379> PFADD tutorials "mysql"
1) (integer) 1
redis 127.0.0.1:6379> PFCOUNT tutorials
(integer) 3
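The idea behind the algorithm can be sketched in a few lines of Python. This is a didactic version with 1024 registers; Redis's implementation is far more refined, with sparse encoding and bias correction:

```python
import hashlib

M = 1024                      # number of registers (2^10)
registers = [0] * M

def pfadd(element):
    # derive a 64-bit hash of the element
    h = int.from_bytes(hashlib.sha1(element.encode()).digest()[:8], "big")
    idx = h & (M - 1)         # low 10 bits select a register
    rest = h >> 10            # remaining bits feed the rank estimate
    rank = 1                  # rank = position of the lowest set bit, 1-based
    while rest & 1 == 0 and rank < 55:
        rank += 1
        rest >>= 1
    registers[idx] = max(registers[idx], rank)

def pfcount():
    alpha = 0.7213 / (1 + 1.079 / M)   # standard HyperLogLog bias constant
    return int(alpha * M * M / sum(2.0 ** -r for r in registers))

for i in range(10000):
    pfadd("user:%d" % i)
estimate = pfcount()
print(estimate)   # close to 10000 (standard error is ~3% with 1024 registers)
```

The memory cost is fixed at 1024 small registers no matter how many elements are added, which is the whole point of the structure.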
Redis HyperLogLog Commands
Following table lists some basic commands related to Redis HyperLogLog.

Sr.No Command & Description


1 PFADD key element [element ...]
Adds the specified elements to the specified HyperLogLog.

2 PFCOUNT key [key ...]


Returns the approximated cardinality of the set(s) observed by the HyperLogLog at key(s).

3 PFMERGE destkey sourcekey [sourcekey ...]


Merges N different HyperLogLogs into a single one.

Redis - Publish Subscribe


Redis Pub/Sub implements a messaging system where the senders (in Redis terminology called
publishers) send the messages while the receivers (subscribers) receive them. The link by which the
messages are transferred is called a channel.

In Redis, a client can subscribe to any number of channels.

Example
Following example explains how the publish/subscribe concept works. In the following example, one
client subscribes to a channel named ‘redisChat’.

redis 127.0.0.1:6379> SUBSCRIBE redisChat


Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "redisChat"
3) (integer) 1
Now, two clients are publishing messages on the same channel named ‘redisChat’, and the above
subscribed client is receiving them.

redis 127.0.0.1:6379> PUBLISH redisChat "Redis is a great caching technique"


(integer) 1
redis 127.0.0.1:6379> PUBLISH redisChat "Learn redis by tutorials point"
(integer) 1
1) "message"
2) "redisChat"
3) "Redis is a great caching technique"
1) "message"
2) "redisChat"
3) "Learn redis by tutorials point"
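The channel mechanics can be sketched with an in-process model (a toy broker for illustration; real Redis Pub/Sub delivers messages over client connections):

```python
# Toy Pub/Sub broker: SUBSCRIBE registers a callback on a channel, and
# PUBLISH pushes the message to every current subscriber, returning the
# number of clients that received it (as Redis PUBLISH does).
class ToyBroker:
    def __init__(self):
        self.channels = {}     # channel name -> list of subscriber callbacks

    def subscribe(self, channel, callback):
        self.channels.setdefault(channel, []).append(callback)

    def publish(self, channel, message):
        subscribers = self.channels.get(channel, [])
        for cb in subscribers:
            cb(channel, message)
        return len(subscribers)

received = []
broker = ToyBroker()
broker.subscribe("redisChat", lambda ch, msg: received.append((ch, msg)))
n = broker.publish("redisChat", "Redis is a great caching technique")
print(n)          # 1 subscriber received the message
print(received)   # [('redisChat', 'Redis is a great caching technique')]
```

As in Redis, a message published to a channel with no subscribers is simply dropped: `publish` would return 0 and nothing would be delivered.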
Redis PubSub Commands
Following table lists some basic commands related to Redis Pub/Sub.

Sr.No Command & Description


1 PSUBSCRIBE pattern [pattern ...]
Subscribes to channels matching the given patterns.

2 PUBSUB subcommand [argument [argument ...]]


Tells the state of Pub/Sub system. For example, which clients are active on the server.

3 PUBLISH channel message


Posts a message to a channel.

4 PUNSUBSCRIBE [pattern [pattern ...]]


Stops listening for messages posted to channels matching the given patterns.

5 SUBSCRIBE channel [channel ...]


Listens for messages published to the given channels.

6 UNSUBSCRIBE [channel [channel ...]]


Stops listening for messages posted to the given channels.
Questions

Docker

1. What is Docker?
Docker is an open-source lightweight containerization technology. It has gained widespread popularity
in the cloud and application packaging world. It allows you to automate the deployment of applications
in lightweight and portable containers.

2. What are the advantages of using Docker containers?

Here are the major advantages of using Docker:
Offers an efficient and easy initial set up
Allows you to describe your application lifecycle in detail
Simple configuration; interacts with Docker Compose
Documentation provides every bit of information

3. What are the important features of Docker?


Here are the essential features of Docker:
Easy Modeling
Version control
Placement/Affinity
Application Agility
Developer Productivity
Operational Efficiencies

4. What are the main drawbacks of Docker?

Some notable drawbacks of Docker are:
Doesn't provide a storage option
Offers poor monitoring options
No automatic rescheduling of inactive nodes
Complicated automatic horizontal scaling set up

5. What is a Docker image?

A Docker image is used to create Docker containers. You can create a Docker image with the build
command; when an image is run, it creates a container. All Docker images are stored in a Docker
registry.

6. What is Docker Engine?

The Docker daemon, or Docker Engine, represents the server. The Docker daemon and the client can
run on the same host or on remote hosts, and they communicate through the command-line client
binary and a full RESTful API.

7. Explain Registries

There are two types of registries:

Public Registry
Private Registry
Docker's public registry is called Docker Hub, which also allows you to store images privately. Docker
Hub hosts millions of images.

8. What command should you run to see all running container in Docker?

$ docker ps
9. Write the command to stop a docker container

$ sudo docker stop <container name>


10. What is the command to run the image as a container?

$ sudo docker run -i -t alpine /bin/bash


11. What are the common instructions in a Dockerfile?

The common instructions in a Dockerfile are: FROM, LABEL, RUN, and CMD.

12. What is the memory-swap flag?

Memory-swap is a modifier flag that only has meaning if --memory is also set. Swap allows the container
to write excess memory requirements to disk when the container has exhausted all the RAM which is
available to it.

13. Explain Docker Swarm?

Docker Swarm is native clustering for Docker. It helps you turn a group of Docker hosts into a single,
virtual Docker host. It offers the standard Docker application program interface.

14. How can you monitor the docker in production environments?

The docker stats and docker events commands are used to monitor Docker in a production environment.

15. What are the states of a Docker container?

Important states of Docker container are:

Running
Paused
Restarting
Exited
16. What is Docker hub?

Docker Hub is a cloud-based registry that helps you link to code repositories. It allows you to build,
test, and store your images in the Docker cloud. You can also deploy images to your hosts from Docker
Hub.

17. What is Virtualization?

Virtualization is a method of logically dividing mainframes to allow multiple applications to run
simultaneously.

However, this scenario changed when companies and open source communities were able to offer a
method of handling privileged instructions, allowing multiple operating systems to run simultaneously
on a single x86-based system.

18. What is Hypervisor?


The hypervisor allows you to create a virtual environment in which the guest virtual machines operate.
It controls the guest systems and checks if the resources are allocated to the guests as necessary.

19. Explain Docker object labels

Docker object labels are a method for applying metadata to Docker objects, including images, containers,
volumes, networks, swarm nodes, and services.

20. Write a Dockerfile that sets a working directory and copies the current directory into it, using a
Python base image, then build it.

FROM python:2.7-slim

WORKDIR /app

COPY . /app

Then build it (the image name here is a placeholder):

docker build --tag <image-name> .

21. Where the docker volumes are stored?

You need to navigate to:

/var/lib/docker/volumes
22. List out some important advanced docker commands

Command         Description
docker info     Display system-wide information
docker pull     Download an image
docker stats    Display container resource usage statistics
docker images   List downloaded images
23. How does communication happen between Docker client and Docker Daemon?

The Docker client communicates with the Docker daemon through a REST API, over a UNIX socket or a
TCP network interface.

24. Explain the implementation method of Continuous Integration (CI) and Continuous Deployment (CD) in
Docker?

You need to do the following:

Run Jenkins on Docker

Run integration tests in Jenkins using docker-compose

25. What are the commands to control Docker with systemd?

systemctl start/stop docker


service docker start/stop
26. How to use JSON instead of YAML compose file?
docker-compose -f docker-compose.json up
27. What is the command you need to give to push the new image to Docker registry?

docker push myorg/img


28. How to include code with copy/add or volumes?

In a Dockerfile, use the COPY or ADD directive to bake code into the image. However, use a volume
instead if you want code changes to be visible without rebuilding the image.

29. Explain the process of scaling your Docker containers

Docker containers can be scaled to any level, from a few hundred to thousands or even millions of
containers. The only condition is that the containers need memory and the OS at all times, and there
should not be a resource constraint while Docker is being scaled.

30. What is the method for creating a Docker container?

You can use any specific Docker image to create a Docker container with the below command (the image
name and command are placeholders):

docker run -t -i <image-name> <command>


This command not only creates the container but also starts it for you.

31. What are the steps for the Docker container life cycle?

Below are the steps of the Docker life cycle:

Build
Pull
Run
32. How can you run multiple containers using a single service?

By using Docker Compose, you can run multiple containers with a single service definition. All
docker-compose files use the YAML language.
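As a minimal sketch, a docker-compose.yml that runs two containers from one file might look like this (the service and image names are illustrative):

```yaml
version: "3"
services:
  web:
    image: nginx:alpine   # web server container
    ports:
      - "8080:80"
  cache:
    image: redis:alpine   # cache container started alongside it
```

Running docker-compose up starts both containers together.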

33. What is CNM?

CNM stands for Container Networking Model. It is a standard or specification from Docker, Inc. that
forms the basis of container networking in a Docker environment. This approach provides container
networking with support for multiple network drivers.

34. Does Docker offer support for IPV6?

Yes, Docker supports IPv6. IPv6 networking is supported only on Docker daemons running on Linux
hosts. To enable IPv6 support in the Docker daemon, modify /etc/docker/daemon.json and set the ipv6
key to true.
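A minimal /etc/docker/daemon.json sketch for this (the IPv6 address block is an illustrative example; restart the daemon after editing):

```json
{
  "ipv6": true,
  "fixed-cidr-v6": "2001:db8:1::/64"
}
```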
35. Can you lose data when the container exits?

No. Any data that your application writes to disk is stored in the container. The file system for the
container persists even after the container halts.

36. What are a different kind of volume mount types available in Docker?

Bind mounts - can be stored anywhere on the host system
Volumes - stored in a part of the host filesystem managed by Docker (/var/lib/docker/volumes)
tmpfs mounts - stored only in the host's memory and never written to the filesystem

37. How to configure the default logging driver under Docker?

To configure the Docker daemon to default to a specific logging driver, set the value of log-driver to
the name of the logging driver in the daemon.json file.
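A daemon.json sketch setting the default logging driver (the rotation options under log-opts are illustrative values):

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```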

38. Explain Docker Trusted Registry?

Docker Trusted Registry is the enterprise-grade image storage tool for Docker. You should install it
behind your firewall so that you can securely store and manage the Docker images you use in your
applications.

39. What are Docker Namespaces?

A namespace in Docker is a technique that provides isolated workspaces called containers. Namespaces
also provide a layer of isolation between Docker containers.

40. What are the three components of Docker architecture?

Client
Docker Host
Registry
41. What is the client?

Docker provides a command-line interface (CLI) tool for the client to interact with the Docker daemon.

42. What is the purpose of the Docker Host?

It contains the containers, images, and Docker daemon. It offers a complete environment to execute and
run your application.

43. How do I run multiple copies of Compose file on the same host?

Compose uses the project name to create unique identifiers for all of a project's containers and other
resources. To run multiple copies of a project, set a custom project name using the -p command-line
option or the COMPOSE_PROJECT_NAME environment variable, for example: docker-compose -p project2 up -d
Random Topics

1. Good and Bad practices of REST

Typically, we use a RESTful design for our web APIs. The concept of REST is to separate the API structure
into logical resources. The HTTP methods GET, DELETE, POST, and PUT are used to operate on the
resources.
These are 10 best practices to design a clean RESTful API:

1. Use nouns but no verbs


For an easy understanding, use this structure for every resource:

GET /cars - returns a list of cars
GET /cars/711 - returns a specific car
POST /cars - creates a new car
PUT /cars/711 - updates car #711
DELETE /cars/711 - deletes car #711

Do not use verbs:

/getAllCars
/createNewCar
/deleteAllRedCars

2. GET method and query parameters should not alter the state
Use PUT, POST, and DELETE methods instead of the GET method to alter state.
Do not use GET for state changes:

GET /users/711?activate
GET /users/711/activate

3. Use plural nouns


Do not mix up singular and plural nouns. Keep it simple and use only plural nouns for all resources:
/cars instead of /car.
4. Use sub-resources for relations
If a resource is related to another resource, use sub-resources: GET /cars/711/drivers returns a list of
drivers for car 711, and GET /cars/711/drivers/4 returns driver #4 for car 711.

5. Use HTTP headers for serialization formats


Both client and server need to know which format is used for the communication. The format must be
specified in the HTTP header.

Content-Type defines the request format.


Accept defines a list of acceptable response formats.

6. Use HATEOAS
Hypermedia as the Engine of Application State is a principle that hypertext links should be used to
create a better navigation through the API.

{
  "id": 711,
  "manufacturer": "bmw",
  "model": "X5",
  "seats": 5,
  "drivers": [
    {
      "id": "23",
      "name": "Stefan Jauker",
      "links": [
        {
          "rel": "self",
          "href": "/api/v1/drivers/23"
        }
      ]
    }
  ]
}
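As a sketch, a client can navigate a payload like the one above by following link relations instead of hard-coding URLs (the helper name is illustrative, not part of any framework):

```python
# Minimal HATEOAS-style link lookup: find the href for a given rel.
def find_link(resource, rel):
    """Return the href of the first link with the requested rel, or None."""
    for link in resource.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None

car = {
    "id": 711,
    "drivers": [
        {
            "id": "23",
            "name": "Stefan Jauker",
            "links": [{"rel": "self", "href": "/api/v1/drivers/23"}],
        }
    ],
}

# The client discovers the driver's URL from the payload itself.
driver_url = find_link(car["drivers"][0], "self")
print(driver_url)  # /api/v1/drivers/23
```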
7. Provide filtering, sorting, field selection and paging for collections
Filtering
Use a unique query parameter for all fields or a query language for filtering, e.g. GET /cars?color=red
returns a list of red cars.

Sorting
Allow ascending and descending sorting over multiple fields, e.g. GET /cars?sort=-manufacturer,+model.

This returns a list of cars sorted by descending manufacturer and ascending model.

Field selection
Mobile clients display just a few attributes in a list; they don't need all attributes of a resource. Give
the API consumer the ability to choose the returned fields, e.g. GET /cars?fields=manufacturer,model,id.
This also reduces network traffic and speeds up usage of the API.

Paging
Use limit and offset, e.g. GET /cars?offset=10&limit=5. This is flexible for the user and common in
leading databases. The defaults should be limit=20 and offset=0.

To send the total entries back to the user use the custom HTTP header: X-Total-Count.

Links to the next or previous page should be provided in the Link HTTP header as well. It is important to
follow these link header values instead of constructing your own URLs.

Link: <https://blog.mwaysolutions.com/sample/api/v1/cars?offset=15&limit=5>; rel="next",


<https://blog.mwaysolutions.com/sample/api/v1/cars?offset=50&limit=3>; rel="last",
<https://blog.mwaysolutions.com/sample/api/v1/cars?offset=0&limit=5>; rel="first",
<https://blog.mwaysolutions.com/sample/api/v1/cars?offset=5&limit=5>; rel="prev",
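The paging scheme above can be sketched as a small helper that slices a collection and computes the neighbouring offsets for the Link header (the function and variable names are illustrative):

```python
def paginate(items, offset=0, limit=20):
    """Return one page plus the header values the API should send."""
    total = len(items)
    page = items[offset:offset + limit]
    headers = {"X-Total-Count": str(total)}
    # Offsets for the rel="prev" / rel="next" links, if those pages exist.
    prev_offset = max(offset - limit, 0) if offset > 0 else None
    next_offset = offset + limit if offset + limit < total else None
    return page, headers, prev_offset, next_offset

cars = [f"car-{i}" for i in range(12)]
page, headers, prev_off, next_off = paginate(cars, offset=5, limit=5)
print(page)               # ['car-5', 'car-6', 'car-7', 'car-8', 'car-9']
print(headers)            # {'X-Total-Count': '12'}
print(prev_off, next_off) # 0 10
```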

8. Version your API


Make the API version mandatory and do not release an unversioned API. Use a simple ordinal number
and avoid dot notation such as 2.5.

We use the URL for API versioning, starting with the letter v, e.g. /blog/api/v1.

9. Handle Errors with HTTP status codes


It is hard to work with an API that ignores error handling. Simply returning an HTTP 500 with a stack
trace is not very helpful.

Use HTTP status codes


The HTTP standard provides over 70 status codes to describe return values. We don't need them all,
but at least 10 of them should be used.

200 – OK – Everything is working


201 – OK – New resource has been created
204 – OK – The resource was successfully deleted

304 – Not Modified – The client can use cached data

400 – Bad Request – The request was invalid or cannot be served. The exact error should be explained in
the error payload. E.g. The JSON is not valid.
401 – Unauthorized – The request requires a user authentication
403 – Forbidden – The server understood the request but is refusing it or the access is not allowed.
404 – Not found – There is no resource behind the URI.
422 – Unprocessable Entity – Should be used if the server cannot process the entity, e.g. if an image
cannot be formatted or mandatory fields are missing in the payload.

500 – Internal Server Error – API developers should avoid this error. If an error occurs in the global catch
block, the stack trace should be logged and not returned as the response.

Use error payloads


All exceptions should be mapped to an error payload. Here is an example of how a JSON payload should
look:

{
  "errors": [
    {
      "userMessage": "Sorry, the requested resource does not exist",
      "internalMessage": "No car found in the database",
      "code": 34,
      "more info": "http://dev.mwaysolutions.com/blog/api/v1/errors/12345"
    }
  ]
}
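A sketch of mapping an exception to such a payload in a global error handler (the exception class and function names are illustrative; the field names follow the example above):

```python
class NotFoundError(Exception):
    """Illustrative domain exception carrying a user-facing message."""
    def __init__(self, user_message, internal_message, code):
        super().__init__(internal_message)
        self.user_message = user_message
        self.internal_message = internal_message
        self.code = code

def to_error_payload(exc):
    """Map an exception to the error payload structure shown above."""
    return {
        "errors": [
            {
                "userMessage": exc.user_message,
                "internalMessage": exc.internal_message,
                "code": exc.code,
            }
        ]
    }

payload = to_error_payload(
    NotFoundError("Sorry, the requested resource does not exist",
                  "No car found in the database", 34)
)
print(payload["errors"][0]["code"])  # 34
```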

10. Allow overriding HTTP method


Some proxies support only POST and GET methods. To support a RESTful API with these limitations, the
API needs a way to override the HTTP method.

Use the custom HTTP Header X-HTTP-Method-Override to override the POST Method.
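A sketch of resolving the effective method from the override header on the server side (the function name and the set of allowed overrides are illustrative, not tied to a specific framework):

```python
# Methods a client may tunnel through POST when a proxy blocks them.
ALLOWED_OVERRIDES = {"PUT", "DELETE", "PATCH"}

def effective_method(method, headers):
    """Return the HTTP method to dispatch on, honouring X-HTTP-Method-Override."""
    override = headers.get("X-HTTP-Method-Override", "").upper()
    # Only POST may be overridden, and only to a whitelisted method.
    if method.upper() == "POST" and override in ALLOWED_OVERRIDES:
        return override
    return method.upper()

print(effective_method("POST", {"X-HTTP-Method-Override": "DELETE"}))  # DELETE
print(effective_method("GET", {"X-HTTP-Method-Override": "DELETE"}))   # GET
```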
