
Managing Applications in AWS

by Jasenko Krejic

Learn how to use new Amazon serverless technologies, presented in a clear and consistent way,
with practical demonstrations in the AWS console.

Description

Amazon Web Services is gaining ground and creating new services almost daily, and technologies such as Elastic Beanstalk, Lambda, and Step Functions are among its cutting-edge products. In this course, Managing Applications in AWS, you will gain the ability to manage applications with five different technologies. First, you will learn how to manage all aspects of serverless application platforms such as Elastic Beanstalk and Lambda. Next, you will discover batch computing with AWS Batch and create workflows with AWS Step Functions. Finally, you will explore how to connect external users to internal AWS services using AWS API Gateway. When you are finished with this course, you will have the skills and knowledge of the above-mentioned serverless technologies needed to manage all aspects of running applications in the cloud with AWS.

Course Overview

Course Overview

(Music) Hi everyone. My name is Jasenko Krejic, and welcome to my course, Managing Applications in AWS. I am head of the System Support department at the postal company of Republika Srpska, Bosnia and Herzegovina, in charge of various server systems and technologies. In this course, you are going to learn five technologies that are part of an amazing new thing the whole IT world is talking about: serverless. Some of the major topics that we will cover include complete management of AWS Elastic Beanstalk, creating and running Lambda functions, creating and configuring AWS Batch jobs, working with AWS Step Functions, and using Amazon API Gateway. By the end of this course, you'll know everything you need to work with and manage Elastic Beanstalk, Lambda functions, Batch jobs, Step Functions, and API gateways. Before beginning the course, you should be familiar with working in the AWS console from the browser, and previous experience with the Linux command line would be beneficial because we will use it in a few topics. I hope you will join me on this journey to learn AWS serverless technologies with Managing Applications in AWS at Pluralsight.

Deploying Applications with AWS Elastic Beanstalk

Introduction

Hello everyone. Welcome to this module. It's called Deploying Applications with AWS Elastic
Beanstalk. In this module, I am first going to introduce you to Elastic Beanstalk, and to give you a feeling for how easy it is to create an application with Elastic Beanstalk, I will show you how to deploy a sample application in Python. A lot of things happen when you create an application in Elastic Beanstalk, so I will shed some light on what happens behind the scenes. Then I will show you how to upload a new version of an application and how to manage versions in general. Elastic Beanstalk applications need some external way to store their data permanently, and we will configure Elastic Beanstalk to create an AWS RDS database that you can use to achieve that. And the last thing in this module is to go through the configuration of another platform, this time one that supports PHP.

Elastic Beanstalk Overview

First, let's say a thing or two about what Elastic Beanstalk is and what it provides for us. It is an
AWS or Amazon Web Services service that makes running applications easier for us. It runs in the
AWS Cloud, which means that it is running somewhere else and Amazon manages the hardware
infrastructure for us. It reduces the complexity of setting up, as well as maintaining, our infrastructure in the AWS Cloud without sacrificing, well, much of the control. We know that these two things, complexity and control, usually sit at opposite ends of the spectrum. With Elastic Beanstalk, your job is to prepare the application and upload it. When you upload it, Elastic Beanstalk automatically does server provisioning, load balancing, scaling, and health monitoring of the application, and some more things behind the scenes. You may wonder which application languages it supports. Well, at the moment, it supports applications written for Go, Java, .NET, PHP, Python, Ruby, and Node.js, as well as custom Docker containers. Your main job is to prepare the application and upload it to Elastic Beanstalk. What Elastic Beanstalk does is provision resources for you, such as creating the appropriate EC2 instance or instances with software that supports the chosen language. With the appropriate resources provisioned, all that is left for you is to upload new versions and potentially manage your Elastic Beanstalk environment configuration. If you are happy with the version you uploaded and don't need to change the configuration, you can sit back and let AWS take care of a potential shutdown of the server by replacing it with an identical one.

Installing a Sample Application

In this demo, I'm going to show you how to quickly create a sample application and connect to it
with your browser. To quickly create an application and to see the power and simplicity of Elastic
Beanstalk, from the initial AWS dashboard, choose Elastic Beanstalk from the Compute section to
go to that service dashboard. Once there, click on the dashboard's Get Started button. Now we
have to type in the application name. I will type here Globomantics, which is a fictional company
we will be working with in this course. After that, we should choose a platform on which our
application will be running. Platforms that are available at this moment are preconfigured
programming language platforms like Go, .NET, Java, Node.js, Ruby, PHP, Python, and Tomcat web
server. You also have generic Docker and multi-container Docker, as well as some preconfigured
Docker environments like Glassfish, Go, and Python. Throughout this course I will use Python so if
you use some other programming language, you can do this, but you should know that you will be
on your own. I propose that you follow my choices until you finish this course, and only then make
some modifications. So let us choose Python and leave the default selection, Sample application. Scroll down and click Create Application. Once we click it, the process of creating an environment will
start and we will be warned that it will take a few minutes so I will pause the recording until it is
finished. When it is finished, we will have an application and an environment that is green,
meaning that it is healthy and functional. We have an application created and we have a link over
here which points you to the URL of the application. Click on it and the new tab will open, which
shows you a sample application web page. I will soon explain what happened in the background,
but just know that you have a fully functional Python web application. It doesn't do much at the moment, but it does show you how easy it is to deploy an application to the Elastic Beanstalk platform.

Behind the Scenes of Elastic Beanstalk

When you create a new application in Elastic Beanstalk, it does the heavy lifting for us. During the
application creation, the AWS resources that are created include, but are not limited to: EC2 instances that the application runs on; an Elastic Load Balancer, if there are multiple instances; an Amazon RDS database, if the application uses one; an Auto Scaling group to keep the number of instances that we need; a security group for our instance or instances; and an S3 bucket to store some data. In this demo, I will show you what AWS resources were created for us when we created a
sample Python application. We have made just a few mouse clicks and entered an application
name. As you know, the application has to run somewhere, and that somewhere is a server,
whether a virtual or physical one. So when we created an application, Elastic Beanstalk created an
EC2 instance for us with the appropriate software installed, and a few other things as well. Let me
show you that. Now go to Services, EC2, and from EC2 dashboard, go to EC2 Instances here. We
have a new instance here that we didn't have before. Its name is Globomantics-env, and we can
see that it has a DNS name and an IP address that corresponds to it. Let's copy any of those in the
new tab of our browser. Paste and go, and look. We have our sample application here that is the
same as this tab so it really is running on this newly created instance. Let's take a look at what
other resources were created by Elastic Beanstalk. In EC2 service, let's go to Security Groups and
here we can see there is Globomantics-env security group, which was created by Elastic Beanstalk.
If we take a look at its inbound rules, we can see it's allowing only port 80, which is HTTP port, and
we can't log in via SSH since port 22 is not allowed. If we take a look at Elastic IPs here, we can see
a new Elastic IP, and if we take a look at the instance it is associated with, we can see that it
belongs to Globomantics-env. There is also one more thing that is created. If we go to Services, S3,
you can see here a bucket that is called elasticbeanstalk-us-east-1-something. This is the bucket
that holds configuration data and application versions for applications in the us-east-1 region. It is created when the first application in the region is created, and even if we delete the application, it stays there. If we delete the application and decide we don't need the bucket anymore, we won't be able to delete it; we will get a Permission Denied error. That is because the policy on this bucket is set to prevent accidental deletion, so even the root account can't delete it by default. To delete it, we need to modify the bucket policy: in the statement covering the Delete Bucket action, change the effect from Deny to Allow and, of course, save it. Now you will be able to delete the bucket normally.
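
If you prefer to make that policy change with the SDK rather than the console editor, a minimal sketch could look like the following. It assumes the policy contains a Deny statement for the s3:DeleteBucket action, and the bucket name shown is a hypothetical placeholder.

    import json
    import boto3

    s3 = boto3.client("s3")
    bucket = "elasticbeanstalk-us-east-1-123456789012"  # hypothetical bucket name

    # Load the existing bucket policy as a Python dict
    policy = json.loads(s3.get_bucket_policy(Bucket=bucket)["Policy"])

    # Flip the Deny on s3:DeleteBucket to Allow, if such a statement exists
    for statement in policy["Statement"]:
        action = statement.get("Action", [])
        if "s3:DeleteBucket" in action and statement.get("Effect") == "Deny":
            statement["Effect"] = "Allow"

    s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))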

Managing Application Versions

In this demo, I will show you how to download sample application code, modify and upload new
version, deploy application version, and delete application version. We have deployed our sample
Python application and have seen how to access it from our browser. Suppose we decide that we
don't like it anymore and want to change it. We don't have a source code of our application since
it is a sample one, but I will show you how to obtain it. We will modify it and deploy it to Elastic
Beanstalk. Let's go to our sample application page, choose Elastic Beanstalk overview, and on the
left side of the page, Tutorials and Samples, find Python sample application, Python version 1,
download it. Now go to the Download folder and unzip it. Now open application.py with your text
editor. Inside it, find the line, Your first AWS Elastic Beanstalk Python application, and change it to
something like You have successfully uploaded a new version of a sample application, save and
close. Now with your archive manager, create a zip file that's called python-v2. Now we can delete
these files; we don't need them anymore. Now back to Globomantics environment, there is a
button called Upload and Deploy. Choose the file python-v2 and when you choose it, you can see
that the version label is automatically filled. Click Deploy, and the message appears that the
environment is being updated. You can click Events to see that the new version is being deployed.
Once it is finished, let's visit the application's web page, and now you can see the text, You have successfully uploaded a new version of a sample application. Now let's go to the Globomantics application again.
Choose Application versions, and we can see a Python version 2 application here. If we don't need
it anymore, we can select it and go to Actions and click Delete. The default action is to also delete it from the S3 bucket. Click Delete, and you can see that it's gone, but that is still not enough, because we still need to deploy the old version. Select it, click Actions, and Deploy. Select the environment, click
Deploy. Now it may take a while until it is deployed, but at the end, you will get the old version
back with familiar text, your first AWS Elastic Beanstalk, etc.
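For orientation, here is a minimal sketch of what the edited application.py could look like. It is not the exact AWS sample, which carries more HTML markup, just a bare WSGI callable (which is what the Python platform expects) returning the new message.

    # application.py -- a simplified stand-in for the AWS Python sample
    welcome = "You have successfully uploaded a new version of a sample application"

    def application(environ, start_response):
        # Elastic Beanstalk's Python platform looks for a module-level
        # callable named "application" (a standard WSGI entry point).
        body = welcome.encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]

When you zip the files, make sure application.py sits at the root of the archive, not inside a subfolder, and name the file python-v2.zip before uploading it.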

Persistent Storage
The problem with Elastic Beanstalk is that in its default form it is essentially stateless so if an
instance goes down and is replaced, it starts from scratch. There are, however, ways to save data
using Elastic Beanstalk, and these are Amazon S3, DynamoDB, Elastic File System, Elastic Block
Store, and Amazon RDS. In this demo, I will show you how to create an RDS instance from Elastic
Beanstalk environment, what are configuration options for such RDS database, and how to access
the database through environment variables. Let's now go to Globomantics, Environments,
Configuration here, and at the bottom of the screen find Database, click Modify. Now we can
restore existing database snapshot if it exists. We don't have it so we'll create a new one. We have
a choice of MySQL, Oracle SE, PostgreSQL, and Microsoft SQL Server, in various editions, engine versions, instance classes, and storage sizes. We'll leave the defaults. Then we should type in a username. Let's type
master, and the password, and then choose Retention, whether the database is saved to a
snapshot when the environment is terminated. I will leave it as Delete, and now for Availability we
can choose between Low and High. Now click Apply. After clicking Apply, we are redirected to a
dashboard where we have a message that the new RDS database is being created, and it will take
a few minutes. I will pause the recording and come back when the RDS is created. Now with the
database created, go to Configuration, Database, click Modify, and you will be able to see the
details of the database here. There is another place where we can see our database details. Go to
Services, type rds, and click RDS to go to the RDS AWS service. Once finished, click Databases on
the left portion of the screen, and here we have our database. One thing I must warn you about: do not delete the database outside your Elastic Beanstalk environment, not even from this RDS service, because you will end up with an environment that cannot be deleted, except through CloudFormation. And let me tell you about one final advantage of creating an RDS database from Elastic Beanstalk. You don't need to hard-code the hostname, database name, username, and other details into your Python, or any other programming language, program; they are accessible to your code as environment variables: RDS_HOSTNAME, RDS_PORT, RDS_DB_NAME, RDS_USERNAME, and RDS_PASSWORD.
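
As a rough illustration, a Python program running in the environment could pick up those variables like this. This is a minimal sketch assuming the MySQL engine was chosen and that a driver such as pymysql is listed in the application's requirements; any other driver is wired up the same way.

    import os
    import pymysql  # assumption: MySQL engine and the pymysql driver

    # Elastic Beanstalk injects the RDS_* values, so nothing is hard-coded here
    connection = pymysql.connect(
        host=os.environ["RDS_HOSTNAME"],
        port=int(os.environ["RDS_PORT"]),
        db=os.environ["RDS_DB_NAME"],
        user=os.environ["RDS_USERNAME"],
        password=os.environ["RDS_PASSWORD"],
    )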

Configuring Different Platforms

In this demo we are going to create a new environment with PHP and then compare PHP and
Python configuration options. We have our application that is running on Python environment.
Applications can have multiple environments for production and test purposes, for example, but
they don't have to support the same language. So let's create another one. Click Actions, Create
environment, Web server environment, and now for the environment name, leave Globomantics-env-1; for the platform choose PHP, sample application, and click Create environment. Since it will take
some time, I will pause the video and come back in a few minutes. Now we have two
environments, one with Python sample application and another one with PHP. Let's open
configuration of both these environments in separate tabs so we can compare the options and
values found there. At first glance, you can see that most of the configuration options are the
same, except Software configuration and Instances. Instances that are created for these
environments are of the same instance family, but as you might expect, their base images, specified by their AMI IDs, are different. Software configuration options for these environments are
substantially different. Let's click on Modify, and you can see that on Python environment we have
settings such as the WSGI path, that is, which application we are serving, the number of processes, the number of threads for each process, and then AWS X-Ray, S3 log storage, and CloudWatch, all of which are disabled by default. You have the ability to serve static files directly from the file system, and environment
properties that can be accessed by Python. On the other hand, in PHP configuration options, you
have things such as document root, which is the default directory, memory limit, whether
compression is enabled or not, the value of allow_url_fopen, whether errors are shown, and the maximum execution time. All these options are written to the PHP configuration file called php.ini, and you
can see that the number of options you can change is pretty restricted. You can see that different
environments have different configuration options and not only related to their configuration files.
At the end, if you don't need the PHP environment anymore, terminate it, because the EC2 instance it created will keep incurring cost. Also, if you don't need the RDS database that we created with the Python environment, please terminate that whole environment, because once the RDS is created this way, you can't delete it and keep the environment.

Administering AWS Elastic Beanstalk Applications

Introduction

Hello everyone. Welcome to this module called Administering AWS Elastic Beanstalk Applications.
In this module, you are going to learn how to configure various environment options, including the
creation of a load balancer, and an auto-scaling group. There is a cool new feature called a worker
environment, and I will introduce you to this concept and show you how to create and test one.
The next thing I will show you is what rolling updates are and how to perform them; they allow you to apply changes to your Beanstalk environment configuration in batches of instances instead of all at once. And we
will finish this module and Elastic Beanstalk story with a bit of theory about managing multiple
environments and benefits that they provide.

Configuring Beanstalk Environment

In our previous module, we started an environment, Globomantics-env, with a rather limited choice of options. These were the environment name, the software platform, and a sample application. An Elastic Beanstalk environment is a complex system with a lot of moving parts and, consequently, a lot of configuration options. These configuration options are organized into namespaces, which are sections for Elastic Beanstalk resources such as AWS auto-scaling groups. Configuration options can be applied before environment creation, during environment creation, and after the environment is created. There are four locations for these options, and guess what? They have an order of precedence. That means if there are multiple options with different values, the option in the location with the higher precedence will be applied. Locations from higher to lower precedence are: configuration options applied directly during environment creation or updates, no matter how they are applied, whether from the console, the EB CLI, the AWS CLI, or an SDK; then saved configurations; after that, .ebextensions files, which are shipped with your application bundle and processed in alphabetical order; and finally, with the lowest precedence, the default values. Saved configurations are actually YAML files that contain configuration. They can be saved from the console, they can be loaded to replace the current configuration from the console, and they can be viewed from the Elastic Beanstalk command line. A thing or two about .ebextensions files. These files are JSON or YAML format files located in the .ebextensions folder of your application bundle. They must have the .config file extension, and if there are multiple files, the ones that come first alphabetically are applied first, meaning that they have the lowest precedence. For example, if your number of threads option during creation is 10, your application bundle specifies 12, and the default value is something like 16, what would the resulting number of threads be? Well, you guessed it: it's 10.
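
To make the highest-precedence location concrete, here is a minimal boto3 sketch that applies an option directly to the environment with an API call; options applied this way override saved configurations, .ebextensions files, and defaults. The environment name is the one from our demo, and the namespace shown is the Python platform's.

    import boto3

    eb = boto3.client("elasticbeanstalk")

    # Options applied through the API (or console/EB CLI) live in the
    # highest-precedence location, overriding .ebextensions and defaults.
    eb.update_environment(
        EnvironmentName="Globomantics-env",
        OptionSettings=[
            {
                "Namespace": "aws:elasticbeanstalk:container:python",
                "OptionName": "NumThreads",
                "Value": "10",
            }
        ],
    )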

Adding Load Balancer and Autoscaling

You have seen how to create an environment with Python in a sample application. This setup uses
a single EC2 instance, and users that access the application via its symbolic or DNS name, get the
application from this instance. A more complex setup includes a load balancer, which distributes
user requests across multiple identical EC2 instances, each instance having the same setup and
application, only different IP addresses. Users access load balancer, which in turn contacts the
instance, fetches a response, and serves it to the user. In addition to that, EC2 instances can be a
part of auto-scaling group, which takes care of maintaining a wanted number of instances. If one
instance goes down, it can create a new instance. If some condition is met, for example, excessive
network traffic over a specific threshold, it can create a new instance to serve the traffic, and if the
traffic goes down again, below some other threshold, it can terminate that instance. In this demo
I'm going to demonstrate to you how to configure existing Elastic Beanstalk environment to add a
load balancer and an auto-scaling group and show you how to configure it. In our Globomantics
environment, click Configuration and in the Capacity section, click Modify. Change the
environment type from Single instance to Load balanced. Now we can see here that the default
configuration wants to have minimum one instance and maximum four. So let's change the
minimum to two instances. We can leave Availability Zones, Placement, and Scaling cooldown with default values, and scroll down to another interesting part, the scaling triggers. Here we have different criteria for when to add or remove an instance. The default here is to add an instance when average network output exceeds 6MB per second over a 5-minute period, and remove an instance if average network output falls below 2MB per second. Scroll down and click Apply, and now we
need to confirm that we really want all our instances replaced, because that is what happens in
the background. And now when you click Confirm, the process starts. You might hit Refresh a few
times. After the process is over, let's see what happened to our EC2 instance. Go to EC2
dashboard, click Instances, and look; there are two instances now. Let's see their names; they're
both named Globomantics. Let's get back to EC2 dashboard and take a closer look at load
balancer. Here we can see instances that the load balancer is serving by clicking Edit Instances. We
can see that it serves our two newly created instances. Let's close this now and go back to EC2
dashboard because I have one more thing to show you. I told you that if an instance goes down, it
will be replaced by an identical one created by the auto-scaling group. So let's simulate this by
clicking on Actions, Instance State, Terminate, and Confirm it. Instance is going down. After a few
minutes, the state is terminated, but if we click Refresh here, we can see that another instance is created to replace the failed instance. So auto-scaling has done its magic and we have two
healthy instances again, ready to serve traffic through a load balancer.
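
For reference, the same capacity change we just made in the console could be applied programmatically. A minimal boto3 sketch, using the environment name from our demo, that switches to a load-balanced setup with a minimum of two and a maximum of four instances might look like this:

    import boto3

    eb = boto3.client("elasticbeanstalk")

    # Switch to a load-balanced environment with 2-4 instances; Elastic Beanstalk
    # then creates the load balancer and auto-scaling group behind the scenes.
    eb.update_environment(
        EnvironmentName="Globomantics-env",
        OptionSettings=[
            {"Namespace": "aws:elasticbeanstalk:environment",
             "OptionName": "EnvironmentType", "Value": "LoadBalanced"},
            {"Namespace": "aws:autoscaling:asg",
             "OptionName": "MinSize", "Value": "2"},
            {"Namespace": "aws:autoscaling:asg",
             "OptionName": "MaxSize", "Value": "4"},
        ],
    )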

Rolling Updates

As you have seen, certain changes of environment configuration require replacement of existing
instances. This means terminating instances, and then replacing them with newly created ones. It
is very fast; however, it is not instantaneous. During that replacement process, all of the instances
that need to be replaced are unavailable. Imagine that you have a lot of instances, for example,
100, and that they all become unavailable at the same time. Imagine that for some reason you
have to change the configuration very often, like daily. There is, however, a remedy, and it's called
rolling update. With rolling updates, you don't update all instances at once, but instead divide
them into batches of configurable size and do updates one batch at a time. Only when one batch is
finished updating, another can be updated and so on. So if we have for example, 16 instances and
we specify that the batch size is four, only four instances or 25% will be unavailable at any given
time. So that's quite acceptable, keeping also in mind that there is a load balancer that can
distribute a load to healthy instances. In this demo, I will show you how to configure rolling
updates to replace one instance at a time. From our Globomantics environment let's go to
Configuration, Rolling Updates and Deployments, and click Modify. Scroll to Configuration Updates
and you can see that the default rolling update type is disabled. Click on it and we can see that we
can choose rolling updates based on health, based on time, and immutable. The health-based type starts a new batch when the previous batch's instances are healthy. The time-based type additionally waits a certain amount of time between batches, and the immutable type creates a full set of replacement instances ready to immediately replace the old ones. Now let us choose the health-based rolling type with a batch size of 1 and a minimum capacity of 1. That means one instance at a time will be replaced and there will
always be one available instance. Click Apply. Let's now make some changes that will require
instance replacement. For example, let's change the instance type from t2.micro to t1.micro. Click Apply, and after that we have to confirm the configuration change, so we'll click Confirm. Let us now go to the EC2 dashboard to see what is happening with our two existing instances. At first we still see no change, as we have two instances of the t2.micro type, but if we come back a few minutes later, we can see that our two old t2.micro instances have been shut down and two new t1.micro instances have replaced them. You may want to follow this process yourself, refreshing frequently, to see that the instances are replaced one at a time.
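
The same rolling update policy we just selected in the console can also be expressed as configuration options; a hedged boto3 sketch of the health-based, one-instance-at-a-time setting could look like this:

    import boto3

    eb = boto3.client("elasticbeanstalk")

    # Health-based rolling updates, one instance per batch,
    # always keeping at least one instance in service.
    eb.update_environment(
        EnvironmentName="Globomantics-env",
        OptionSettings=[
            {"Namespace": "aws:autoscaling:updatepolicy:rollingupdate",
             "OptionName": "RollingUpdateEnabled", "Value": "true"},
            {"Namespace": "aws:autoscaling:updatepolicy:rollingupdate",
             "OptionName": "RollingUpdateType", "Value": "Health"},
            {"Namespace": "aws:autoscaling:updatepolicy:rollingupdate",
             "OptionName": "MaxBatchSize", "Value": "1"},
            {"Namespace": "aws:autoscaling:updatepolicy:rollingupdate",
             "OptionName": "MinInstancesInService", "Value": "1"},
        ],
    )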

Worker Environments
We have previously created a web tier environment. You may remember that Elastic Beanstalk
offered two types of environment, the web tier and the worker tier. Web environment, as we have
seen, serves clients HTTP requests. Worker environment serves to process tasks that are not
normally associated with front-end applications. These kinds of tasks could be image processing,
audio/video processing, and performing various calculations. These tasks are usually too CPU or
memory intensive for a standard web server. We can choose to outsource these tasks to different
environments to process them, perform some action, and return the result, usually a success or a
fail message. So how does it work? When there is a need for a background task, code from the web environment generates a message that is sent to SQS. SQS, or Simple Queue Service, is an AWS managed service for message queues. Once the message is in the queue, it is delivered
to a worker environment. It is done through a worker daemon on the worker environment that
regularly checks and pulls messages from the queue. Once the message is delivered, it is deleted
from the queue. You don't even have to create a queue; it is done automatically when the worker
environment is created. Your duty is to create code that reads the message that is delivered to the worker environment and takes appropriate action. In this demo, I'm going to show you how to configure a worker environment and manually send a message to its queue from the AWS console.
From our Globomantics application, click Actions, Create environment, now choose Worker
environment, and Select. We can leave Environment name as it is. Platform, let's choose Python,
and click Create environment. After a while, a worker environment is created. A new instance is
added to a list of existing instances, but one more resource is added as well, the queue. You can
see the queue it has created by clicking View Queue, which is going to redirect us to a Simple
Queue Service dashboard. From the SQS dashboard, we can send a test message by going to Queue Actions, Send a Message, and typing our usual Hello Globomantics. After that, click Send Message and close the window. We can also view or delete messages here; we have to click Start Polling for Messages and wait for messages to show. There are no messages, because they have already been read by the worker instance and deleted from the queue.
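
Sending the same test message from code instead of the console would look roughly like this. The worker queue's name is generated by Elastic Beanstalk, so the name below is a hypothetical placeholder; check the SQS console for the real one.

    import boto3

    sqs = boto3.client("sqs")

    # Hypothetical queue name; the worker environment generates its own
    queue_url = sqs.get_queue_url(
        QueueName="awseb-e-abc123-stack-AWSEBWorkerQueue")["QueueUrl"]

    # The worker daemon will pull this message and hand it to our worker code
    sqs.send_message(QueueUrl=queue_url, MessageBody="Hello Globomantics")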

Multiple Environments

You have gone with me through deployment of a new version of an Elastic Beanstalk environment
so you know that there can be different application versions like Globomantics v1, Globomantics
v2, and Globomantics v3. Different environments that serve different purposes can also coexist
like for example, environments for production, testing, and development. This is pretty much
standard division in the world of software development, and since it is so easy to set up a new
environment in Elastic Beanstalk, you might create a slightly different environment for users that, for example, come from Asia, or an environment for one group of developers, such as new hires, separated from the main group. And what is especially rewarding, you can easily deploy any
version of an application to any environment. The kind of ease with which you can test various
application versions on various environments is simply awesome, and it really speeds up
application development and deployment. With all of that, plus the help you get from AWS in automating the maintenance of your infrastructure, your life as a software developer or DevOps engineer just got much happier.

Module Summary

Thank you for being with me through this module, and I hope that you have learned a lot about
administering Beanstalk. We have seen how a configuration change can trigger a chain of events
including creation of load balancer and an auto-scaling group to keep the number of instances at
the wanted level, and that can even be triggered by a specific metric. Since these configuration
changes can make instances temporarily unavailable, I have shown you how to make these kinds of
changes in batches so not all instances are affected, and to help you alleviate excessive load on
our web tier environments, I have explained the concept and demonstrated how to create a
worker tier environment. And at the end, I have shown you why using multiple environments can
help you speed up and streamline the process of software development.

Running Serverless Applications with AWS Lambda

Introduction

Hello again. Welcome to the module called Running Serverless Applications with AWS Lambda. In
this module, I will show you how to create a sample AWS Lambda application in Python, how to
test a Lambda function, what versions and aliases are, how to invoke Lambda functions
from event sources or triggers, and we will discuss different Lambda configuration options and
how to deploy Lambda functions.

AWS Lambda Overview

Similar to the Elastic Beanstalk, AWS Lambda is an AWS service, which means it runs in AWS Cloud
and Amazon Web Services is responsible for the maintenance and provision of hardware and
software infrastructure. Your only responsibility is to upload or edit the code that is running there.
It supports Go, Java, .NET, Python, Ruby, and Node.js, as well as custom runtimes. Unlike Elastic Beanstalk, it creates no EC2 instances or other long-running resources by itself; instead, a function needs to be triggered in some way, and only then is it charged, so it costs you nothing if it is not used. It has awesome integration with other AWS services, which can be used to trigger it. A Lambda function is the basic Lambda unit; it is code that you can edit with an inline editor in addition to uploading it from your computer.

Creating a Sample Lambda Function


In this demo, I will introduce you to AWS Lambda by creating a sample Lambda function. I have
told you that AWS Lambda allows us to run the code without preoccupying ourselves with the
infrastructure that lies underneath. Now let me show you how easily you can create a Lambda
function. From the AWS dashboard under Compute, click Lambda. We still don't have any
functions here. Lambda is region dependent, and you can see that we are currently in our default
us-east-1 region, or North Virginia. That means that you can have different Lambda functions in
different regions, even with the same function name. To quickly create a function, click Create a
function. Next we have a choice of creating a function from scratch, which creates a simple Hello World example; Use a blueprint, which will create the function from a list of standard use cases; and Browse serverless app repository. We want it as simple as possible so we will use Create from
scratch. We will call our function helloGlobomantics. And now after that, we have to choose a
programming language. If we click here, click again, we can see all those supported languages, the
latest ones at the top. But still we will stick to our Python 2.7. After that there are permissions. You
don't need to worry about the permissions, because a basic execution role will be created and assigned to your function. This execution role allows the Lambda function to execute and to upload logs to CloudWatch. If we want our function, for example, to communicate with Amazon DynamoDB, we will need to create and assign the appropriate role, that is, permissions, to the Lambda function.
Now let's click Create a function, and in a few moments, we are going to be redirected to this
Functions dashboard. Congratulations! You have just created a sample Lambda function in Python.
Don't be perplexed by the number of options in the function dashboard or how to run or modify
this Lambda function. I will explain it all in the next few clips.

Testing Lambda Function

In this demo, we will configure a test event and test invoke a Lambda function. Although Lambda
functions are meant to communicate with various other AWS services, we will begin with testing
our Lambda function using its built-in test option, but first let's orientate ourselves a bit in this
Functions dashboard. It is divided between Configuration and Monitoring tabs. In Configuration
we have a designer window, which we can use to add triggers and assign permissions. For now we
will fold it to make room for other features. Under Function code we have an inline editor. It is very simple, and you will probably find all the features very familiar, like the menus File, Edit, Find, View, Go, Tools, and Window; we can also show and hide the list of files by clicking on
Environment. On some browsers, showing the default file with the code hangs, and you need to
close it with an X and open it from the file list. Let us now do a simple change to the sample code
and instead of Hello from Lambda, make it say Hello from Globomantics. Now you can click File,
Save, to save changes to a file, Save All to save all changes to all files, or click Save in the upper
right corner to save these and all other changes in the whole dashboard. I told you that, unlike Elastic Beanstalk, which is always running, Lambda functions are executed by a specific trigger. It is very useful to test the function before it actually is triggered. We can do this from the dashboard
by clicking Test. Upon clicking it, since it is the very first time, we are prompted to create a new
test event. A test event is a JSON document that defines keys and values which are passed to the Lambda function code in the event object. There are some really sophisticated
templates, but we will stick with our Hello World template at the bottom. Let us now fill Event
name with something like HelloGlobomantics, and we won't change variable names or values since
we won't use them in our code. Scroll down and click Create. To start the test, click Test button.
Now if we scroll up, we will be able to see the results of an execution. You should get a success
message, Execution result: succeeded. You can now unfold the details and see a lot of details
about how the code was executed. First, we have a field where the output of the function is shown. You can see the result in text form, with our Hello from Globomantics message; then you can see the execution time, or duration, and the time that was billed. Lambda is billed by execution duration, rounded up to the next 100-millisecond interval.
Then there is the amount of memory that was configured, as well as the memory that was actually
used. All this information is also available in CloudWatch logs. You can access it from here or from
here. So let's click it to open CloudWatch dashboard. Click on the log stream that corresponds to
this execution of Lambda function. You can see all the information here, and close CloudWatch. So
now you have successfully executed a Lambda function by configuring a test event and have seen
how this execution was recorded in CloudWatch logs.
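
For reference, the edited sample code amounts to little more than the sketch below; the generated handler differs slightly between runtimes, and the test event's key1/key2/key3 values arrive in the event argument, which we simply ignore here.

    def lambda_handler(event, context):
        # "event" carries the test event's key1/key2/key3 values; we don't use them
        return "Hello from Globomantics"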

Versions and Aliases

Most of the software development or deployment platforms have some sort of versioning so
when you are developing your code, you can name a specific snapshot as Version 1, Version 2,
Version 3, and in general, Version n. Probably naming some of the minor versions as Version 1.1
and so on. In Lambda, when you create a new function and you're working on it, you don't have
any versions, except the one working version that is named $LATEST. Once you have done some
changes and decide that you want it to become a version, you publish a version. That means
making a snapshot of the current state of the function. Versions are named automatically 1, 2, 3, 4,
and so on. Current function state is always called $LATEST. With Lambda versioning, you can
switch to any of these created versions or delete any one of them. It is important that you know
that once they are created, versions are immutable, which means they are locked and they cannot
be changed. It may sound strange, but this is where another Lambda function feature comes to
help us, and this feature is called Aliases. Aliases can be thought of as symbolic links to function
versions. For example, these can be experimental, test, and production aliases. Experimental may
point to Version 1 of the application, Test to Version 3, and Production to Version 4 of the
application. Versions are immutable, but the aliases are not. So you can decide, for example, that
the test alias that is pointing to Version 3 of the application needs to point to the Version 2 of the
application. You can freely change that, and it will add the needed flexibility to versioning. So just
remember, versions once created are immutable, but aliases are not, which makes them a sort of
front end for developers to work with. In the next demo, I will show you how to create new
versions, create aliases, and manage versions and aliases. We start from our HelloGlobomantics
Lambda function dashboard. We have changed it to show Hello from Globomantics instead of the default Hello from Lambda. We can consider this to be a new version, so let's publish it. Click
Actions at the top of the screen, Publish new version, and fill in the version description, something
like First Version. Version numbers are auto-incremented and unique, and a version contains not only the function code, but also its configuration. We are now located in Version
1. If you try to edit code, you will notice that you are unable to. There is a message that says Code
and handler editing is available only in $LATEST, and the link to go to that version. If we click the
link, we will position ourselves in the latest version, which is still the same as Version 1, because
we didn't make any changes. If the window hangs, just close it and open it from the file list on the left. Let's change the message to something like Hello from Globomantics again, click Actions, and Publish new
version. We can now add a description like Second Version, and when we click Publish, we are
going to be located in the second version of our function. We can switch back and forth between
Version 1 and Version 2, but beware, they can't be edited. We can also delete any version. For
example, we want to delete Version 2, but before we can delete it, we need to delete any aliases that reference it. In our case, we need to delete the test alias. So we click Actions, Delete alias,
confirm, and it's deleted. You may now delete Version 2, but be aware that version numbers are
not reused. So if you publish yet another version, it will have Version number 3. And even if we
delete a function and create a new one with the same name, version numbers will continue from
the latest version number of the deleted function, and there is no way to get around it in AWS. In case you were wondering why I am not in North Virginia but in Ohio, that's the answer.
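
The same version and alias operations are also available programmatically. A minimal boto3 sketch, using the function name from our demo and assuming an alias called test, looks like this:

    import boto3

    lam = boto3.client("lambda")

    # Snapshot the current $LATEST code and configuration as an immutable version
    version = lam.publish_version(FunctionName="helloGlobomantics",
                                  Description="Second Version")["Version"]

    # Aliases are mutable pointers to versions and can be repointed at any time
    lam.create_alias(FunctionName="helloGlobomantics", Name="test",
                     FunctionVersion=version)
    lam.update_alias(FunctionName="helloGlobomantics", Name="test",
                     FunctionVersion="1")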

Invoking Lambda Function

In this demo, I'm going to show you how to add a trigger to a Lambda function that will trigger its
execution by creation of an object in a specified S3 bucket and how to monitor execution logs. I
have shown you how to test a Lambda function, but Lambda functions are of no production use if you have to log in to the AWS console and start them manually. They excel at integration with other AWS
services. I will show you how to create the trigger, which will execute a Lambda function
automatically. We will use Designer, which we minimized before. We can see a list of services that
can be used as triggers for a Lambda function. I will make it as simple as possible and use object
creation in S3 bucket as a trigger. So click on S3, it will be added as a trigger, and you will be able
to configure it. First, we need to select the bucket that will serve as event source. We will create
one for this purpose. Now let's go to S3 Service, open it in a new tab, here, create a new bucket.
We will name it helloglobomantics123456. Remember, it needs to be unique across the world so
you will probably have to come up with something different and unique. Now click Create bucket
at the end, and it is created. Let's close this tab. After the bucket is created, it appears in the event
source list in S3 trigger configuration. Select it and continue with the configuration. Lambda will be
triggered by a specific event that happens in this bucket. The default is that it will trigger for all
object creation events. Making it as simple as possible, we will stick with the defaults. You can even do some filtering on object key names, using prefixes and suffixes to select objects whose names start or end with specific strings. Leave it blank, which means all objects. You can also create a trigger without actually
enabling it so you can uncheck Enabled, which we won't do, and just click Add. We still need to
save changes. Let's go and click Save. Now we have a really simple, but working system that will
log an event when any kind of object is created in the S3 bucket. To see it in action, we need to go to the S3 bucket and create a simple object inside it. Once we are positioned inside our triggering bucket, we create a folder, call it whatever you want, and click Save. Now that an object has been created inside the triggering bucket, let's go back to the Lambda dashboard to see the effect it had.
Back in Lambda function, if you go to Monitoring tab and select CloudWatch, you will see that a
new stream is added, and this is the execution log of the Lambda function that was triggered by
folder creation in the S3 bucket. So we have gone through a proof of concept for triggering a Lambda function outside the built-in test mechanism. Lambda can be triggered by many other AWS services, and each requires a different configuration. I am sure you will find one that suits your
needs and use it to your advantage.
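
If you want the function to do something useful with the notification, a sketch of a handler that reads the bucket and object key out of the standard S3 event format might look like this:

    def lambda_handler(event, context):
        # S3 delivers one or more records per invocation; log what was created
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            print("New object %s in bucket %s" % (key, bucket))
        return "done"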

Configuring Lambda Functions

In this demo, I am going to introduce you to the available Lambda function settings which we haven't discussed yet. We will configure environment variables, and I will show you how to use them in
your code. Then I will go through execution role and its impact on designer frame, configure
memory and timeout settings. We will go through configuring some monitoring and
troubleshooting options, and we will end with concurrency and how to increase the number of
concurrent sessions. Let's go to our HelloGlobomantics Lambda function. You have seen how to
use event source mapping with S3 and how to use an inline editor to perform basic tasks in code
editing. Let me show you other function configuration options that you can change. Remember,
Lambda function is not only the code, it is the code plus the configuration options, and these are
equally important. Now scroll down to where we have environment variables. These are variables
in key value pairs that you can send to your application, very similar to those we configured in the test event. These are very useful if you want the same function to behave differently with different
parameters. You may have development, testing, production set of variables, all using the same
function, but with different parameter values. Let us now edit it and for example, enter company
as a key and Globomantics as value. Now we need to edit the sample code to use these
parameters as environment variables. To use these variables in Python code, we need to import the os module with, not surprisingly, import os. We also need to change the code to print the value of the environment variable company using os.environ; a short sketch of that change appears at the end of this clip. Remember, this is the Python-specific way to read environment variables. Now click Save, and click Test. Now you can see the Globomantics value here,
and if we unfold the details, here as well. Next item in configuration of Lambda function is tags.
These are also key value pairs, but these are used for organizational purposes. Next we have an
execution role. You can see the role in the Identity and Access Management console by clicking on the link, and let's add some sample permissions, because by default it has only basic execution permissions and permission to upload logs to CloudWatch. For example, let's select
AlexaForBusinessFullAccess. Let's go back to our Lambda function dashboard and refresh it. In the
Designer frame, you can see new permissions that Lambda function obtained. Next, we can fill in
basic description of a function, and we can configure the memory for the execution. It can be
anything from 128MB to roughly 3GB in 64MB intervals. We also have a timeout, which is the
maximum allowed time for our function to run. It can be anything from 1 second to 15 minutes.
You don't see any CPU adjustment, and that is because processing power is configured in
proportion to the memory. You can also select the VPC if you want access to a specific resource
that can be accessed only from a VPC. If a function fails, you can send that information to a dead-letter queue backed by Simple Queue Service or Simple Notification Service. You can also enable tracing with the
X-ray option, which will give you valuable information for troubleshooting of your Lambda
function execution. Then we have concurrency. It represents the number of executions running at the same time. For an AWS account, you are allowed to run one thousand concurrent executions per region. That is the unreserved account concurrency, but you can also specify a number of concurrent executions that are reserved for this specific function. You can observe that the unreserved account concurrency has dropped, and that is what all other functions in the region share. This number is not a hard limit, because you can request a limit increase by going to
the Support Center and opening a support case. The final configuration option enables CloudTrail for logging API activity; it stores compressed logs in an S3 bucket and is used for troubleshooting.
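
The environment variable change described earlier in this clip boils down to a couple of lines; a minimal sketch of the edited handler:

    import os

    def lambda_handler(event, context):
        # "company" is the key we added under Environment variables in the console
        return "Hello from " + os.environ["company"]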

Deploying Lambda Functions

And at the end, just a few words about deployment methods for the AWS Lambda functions. First,
you can edit your Lambda function code and configuration in the AWS console. That is the
simplest way and most appropriate when you are starting to work with Lambda and your function
has only several lines of code. You can also edit your function code using your favorite local IDE,
and when you are happy with the changes, use the AWS CLI to upload your code to the Lambda
service. As far as I know, there isn't integration with any of the most familiar IDEs at the moment I'm recording this, but it might be just around the corner. Next, you
have Cloud9. It is an AWS service that creates an environment on EC2 instance and lets you access
your Lambda function code from within the IDE. This IDE lives in the browser, but it almost looks
and feels as if you were developing on the local machine, and the number of editing options
comes close to local integrated development environments.
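
If you script your uploads with the AWS SDK rather than the CLI, pushing a new deployment package is essentially one call. This is a hedged sketch, using the function name from our demo and an assumed local zip file name:

    import boto3

    lam = boto3.client("lambda")

    # Upload a locally built deployment package to the function's $LATEST version
    with open("function.zip", "rb") as package:  # assumed local file name
        lam.update_function_code(FunctionName="helloGlobomantics",
                                 ZipFile=package.read())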

Module Overview

We have reached the end of yet another module, this one with AWS Lambda as the topic. I hope that
you got a good understanding of what Lambda functions are and what they can do for you. I
explained the basics of Lambda functions first, and to get into Lambda ecosystem as quickly as
possible, we have gone through creation of a sample Lambda application using Python as a
platform. I showed you how easy it is to test a Lambda function and how to work with versioning from the AWS console. Then I showed you how Lambda fits into the AWS services ecosystem by using
S3 as event source for triggering Lambda function execution. Lambda has a lot of configuration
options you can modify and we went through all the important ones, finishing with a bit of theory
about deployment options for Lambda functions.

Running Batch Computing Workloads with AWS Batch

Introduction

Hello again. Welcome to the module called, Running Batch Computing Workloads with AWS Batch.
In this module, I am first going to introduce you to the AWS Batch concepts and components, and
then for the first time, use command line and SSH connection to create a custom Docker container
that we will use in the Batch job we are going to create. All the components of job creation, such as the compute environment, job queue, job definition, and the job itself, will be thoroughly explained. You will also learn how to create array jobs and jobs that depend on the successful completion of other jobs, or job dependencies.

AWS Batch Overview

Similar to the previous services we discussed, AWS Batch is an AWS service that is completely
managed by AWS and runs in the AWS Cloud. It allows its users to run multiple jobs, multiple meaning thousands, without the need to worry about managing infrastructure, because AWS does all the necessary provisioning and scaling for you. There are four components involved in AWS Batch, and these are as follows. A job is the basic unit in AWS Batch; it is a unit of work that needs to be done, packaged as a Docker containerized application. A job needs a blueprint that describes how it runs, and that is the job definition; it specifies how much memory the job gets, how much CPU power, and the permissions to access other AWS components. Jobs are submitted to job queues, which can have different priority levels and are serviced by the AWS Batch job scheduler. Jobs run in EC2-based compute environments; you can create compute environments of your own (unmanaged), or let AWS select the right instances for you (managed). Don't worry if you don't
understand these definitions right now. This service is relatively new, and you will get to
understand it with the help of demos in the next few clips.
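
To make those four pieces a bit more concrete before the demos, here is a hedged boto3 sketch that registers a job definition and submits a job to a queue. The image URI, names, and resource sizes are placeholders, and the compute environment and job queue are assumed to already exist.

    import boto3

    batch = boto3.client("batch")

    # A job definition points at a container image and describes its resources
    batch.register_job_definition(
        jobDefinitionName="globomantics-job",           # placeholder name
        type="container",
        containerProperties={
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/globomantics:latest",
            "vcpus": 1,
            "memory": 128,
            "command": ["python", "s3.py"],
        },
    )

    # A job based on that definition is submitted to an existing job queue,
    # which runs it on its compute environment.
    batch.submit_job(
        jobName="globomantics-run",
        jobQueue="globomantics-queue",                  # placeholder queue name
        jobDefinition="globomantics-job",
    )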

Creating a Custom Docker Image

In this demo, you're going to learn how to start an EC2 instance, connect to it with an SSH client,
and install Docker software in it. Then you will see how to pull an image of a Python application
from a public repository, run a container from this image, then modify a container and save it as a
local image. Then we will continue by creating an IAM user to be able to push an image to AWS ECR, creating an ECR repository to store our image, and finally pushing the image to ECR. During the
course, I have intentionally avoided using the command line, whether the AWS CLI or the AWS Elastic Beanstalk CLI, because it is such a wide topic, and every answer reveals three more questions. Since I am a command line guy, I would have talked more and more about the CLI, which would make this course way longer than it should be. But now, in order to modify a Docker image to suit
our needs, we need to start an EC2 instance, connect to it via SSH client, and install Docker
software on it. Then we will pull a small container image called webapp, which is found in a public repository, modify it through the command line, and, using a special IAM user, push this image to AWS ECR, or Elastic Container Registry, from which we will use it to create the job definition.
Sound scary? Well, we will take it step by step. It would be nice for you to have some Docker
experience, but even if you don't, I hope that you will be able to follow along. Maybe even finding
some information on the internet. This is such a broad topic and it is impossible to cover
everything in one course. First, we need to launch an EC2 instance, so click Launch instance. We can select the first one; it has the Amazon Linux operating system. For the instance size, we can choose the smallest one, click Next, and launch. We can use an existing key or create a new one, which we can call docker.pem, and download it. Just remember the location, because we will use it later. Now we will
wait a minute or so for the instance to create. After it has been created, you should copy either its
public DNS or public IP address to the clipboard to use it to connect to the instance using an SSH
client of your choice. Most people today use PuTTY, but for some reason I don't like it at all, and I use the MobaXterm SSH client, which has a great free version. In MobaXterm, right-click User sessions and create a new session, select SSH, paste the IP address of our instance, for the user choose ec2-user, and in Advanced SSH settings, choose docker.pem. This will connect us to an EC2
instance we created as the default ec2-user. The ec2-user can't install the software we need, but it has permission to switch to the root user, which can. So we execute the sudo su command to become root. We first need to install Docker with yum install -y docker, because Amazon Linux is Red Hat/CentOS compatible and uses yum for package management. Now we need to start the Docker
service with service docker start. Now I won't really bother you with details. I will just pull the webapp image, with the latest tag, from the public repository called training; it is a tiny Docker image with a Python web application. I will map Docker container port 5000, where the Python application is running, to port 80 on the EC2 instance, so we would be able to access the sample Python application on port 80 of the EC2 instance. For that, though, we would need to open that port in the instance's security group, which we won't do now. Also, -d specifies that the container should run detached from the console. When starting, it will run python app.py; app.py is the sample application, which serves Hello World on port 5000, translated to instance port 80.
Let's now press Enter and wait for the image to be downloaded from the internet from the training public repository and used to start the container running the python app.py command. Let's see what containers we have with docker ps. We can see that we have a container with an ID of 05134 and so on, based on the training/webapp:latest image, running the python app.py command, as well as the port mapping and some other information. We can check that the application is working by using the curl
utility on localhost. We should get a Hello World response, which we did. Now I want our container to create an S3 bucket so we know that the job did something, and this bucket will be named globomantics plus a timestamp, so it has a unique name. First we need to connect to the running Docker container using docker exec. We need to enter the ID of the running Docker container as an argument, and the -it options for an interactive console. We also need to add /bin/bash to specify the interactive shell that we want to run. In the list of files in this directory, you can see
app.py. For text editing in the command line, I will install nano with apt-get install nano, but feel free to use any editor you like. Without going into much detail, this Python code, sketched at the end of this clip, will create a bucket named globomantics- followed by the epoch time. Remember, epoch time is the time since January 1, 1970, in seconds, commonly used in the Unix world. Every time it is generated, the bucket gets a different name so there are no naming conflicts. I have made this code as simple as possible so you can see how it works.
This is a rather new technology with a lot of moving parts so doing it right in a simple manner is
very important. In order to have access to S3, our Python application needs to have permissions.
We need to create a user in IAM that will have full S3 access. So in IAM console, create new user,
let's call it globomantics-s3, and make it have programmatic access. Now for permissions, let's
choose Attach existing policy directly, search for S3, and choose S3 full access. Now finish the
creation of a user, and soon after the user is created, we will be presented with access key and the
secret key. Copy those to a safe place; we will be entering them in the command line soon. Back in the command line, we need to update the container and install the AWS CLI with apt-get update and apt-get install awscli. We also need to enter credentials for the IAM user we created, using aws configure with the access key and secret key. I will enter only dummy values for the access key and secret key here. After all, it is a secret. Before we test bucket creation, we need to install one more thing, the boto3 Python module, which is the AWS SDK for the Python programming language. We'll do it with the command pip install boto3. When the installation has finished, you can test bucket creation from the command line with python s3.py. We can now go to the AWS S3 service console and see for ourselves that the bucket has been properly created. Before we go further, we need to create an
ECR or Elastic Container Registry where we are going to store the image that is going to be used
for AWS Batch job. So in ECR AWS service, create repository, and call it globomantics. After it is
created, select it, and it shows the necessary commands to push an image to this repository from the command line. This here is the command to log in to the repository from the instance shell; remember, not inside the Docker container. If we enter this in our command line, we will get an error, unable to locate credentials, so we need to run aws configure with credentials first. We now need to
create an IAM user that has permissions to push Docker images into the ECR repository. In IAM
create user called globomantics-ecr with programmatic access, because we are going to access it
from the command line, choose Attach existing policies directly, and in the search box, enter container and choose the first option, Container Registry Full Access. When the user creation has finished, use the access and secret keys with aws configure, but this time not in the Docker container, but on the instance itself. Now you will be able to execute the login command. When you see
that the login has succeeded, we are ready to push an image to the ECR repository. To save this
container as an image, or a template, that we will soon push to the ECR repository, we use docker commit followed by the running container id. We can see a list of images now; the first one, with repository none and tag none, is our newly created image. In ECR, copy the URI of the repository we created, and then go back to the command line. We need to tag the local image with the repository URI that we copied, followed by :latest, so the command is docker tag, then the image id, then the repository URI plus :latest. We can see with docker images that the image was properly tagged, and finally, we push it to the repository using docker push followed by your repository URI. It might take a minute or so,
depending on your upload speed. When we go to the Globomantics repository, we can see that
we have a new item, a new tag, so we finally got the image which we can use to run a batch job.

Configuring and Running a Batch Job

In this demo, I am going to show you how to create a batch job from the custom Docker container
we created in the previous clip. We are going to create job environment, job queue, job definition,
and finally submit a job itself. What is important to remember is that the job based on a job
definition is submitted to a job queue, which executes jobs on the job environments. We don't
have anything in our Batch dashboard yet, so we are offered to get started. So let us click Get started, but leave the wizard aside and create all the components ourselves. On the left side we have Jobs, Job
definitions, Job queues, and Compute environments. Let us first create a compute environment. It
is where the jobs are actually run. Click Compute environment and Create environment. We want
it to be managed so all resource provisioning will be done by the AWS. We will call it
globomantics-env and select AWS Batch service role and ECS IAM role, which are necessary for
normal functioning of the environment. If you want, you can select the key pair in case you need
to connect via SSH to the instance. Now provisioning can be either On-Demand, which means instant provisioning, or we can choose Spot Instances and wait for the instance price to fall below our specified threshold. For the instance type, you can either choose optimal, which chooses
appropriate instance type from M4, C4, or R4, or select a specific instance type that you want. For
the minimum vCPUs, if we leave it at 0, no instances will be started when there is no job to run, and if we set it to, let's say, 2 minimum and 2 desired, there will always be an EC2 instance with two virtual CPUs running, which means a faster start. So let's select 2 minimum, 2 desired, and 256 maximum vCPUs. Click Create and the environment is being created. You can see its basic parameters, and you can edit it or even disable it. If you go to the EC2 dashboard, you will see that an m4.large instance has been created, and it is there because we have specified a non-zero minimum vCPU value, so it is always running regardless of the fact that there is
no job submitted yet. Next, we need to create the job queue. Click Create, enter globomantics-
queue. For Priority we can enter any integer that we want, because we don't have any other
queues. If there are other queues, ones with higher priority numbers will grab resources from the
same environment faster. So if we have more environments, we can assign queues to multiple
environments with different priorities. Select the environment and click Create Job Queue. After
it's created, it's time to create the job definition. A job definition, as mentioned, is a template for jobs. Click Create and enter globomantics-def. Now we don't need to change job attempts, which
specifies how many times the job is retried, or a timeout, which specifies allowed duration of a job
execution in seconds. We have a job role, and it is an IAM role, which assigns permissions for a
container to access other AWS services. We should choose basic ECS execution role. For a
container image, paste the URI of our custom Docker image in the repository. We need to run a
command, python s3.py, and you should enter it in the command field. For the CPU, instead of
default 16, enter 2; for Memory, enter 1024. There are many, many more configuration options
and we are running out of time so scroll down and click Create. And the final thing to do is actually
create the job based on job definition, send it to the queue, which executes it on an environment.
Click Jobs, Submit job. We can call it globomantics-job. It will be based on globomantics-def, revision 1 (the 1 is a revision number), to be executed on the globomantics queue. It's a single job, as opposed to array
jobs, has no dependencies, and you can see now that you can override the job definition
parameters if you want. Scroll down and click Submit job. You can see now that the job is
submitted. Take a break and when you come back in the next clip, we will follow the path of the
job that has been submitted. Before you do that, visit the AWS S3 service and see for yourself that
the bucket has really been created, which means that the job has finished successfully.
Congratulations for following me through these two, at times rather complicated demos.
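
If you prefer scripting to clicking through the console, the final submission step could also be done with boto3. This is only a sketch, assuming the queue and job definition names used above:

    # A sketch of submitting the same job with boto3 instead of the console.
    # Assumes the job queue and job definition created above already exist.
    import boto3

    batch = boto3.client("batch", region_name="us-east-1")

    response = batch.submit_job(
        jobName="globomantics-job",
        jobQueue="globomantics-queue",
        jobDefinition="globomantics-def:1",   # name:revision
        containerOverrides={
            "command": ["python", "s3.py"],   # same command as in the job definition
        },
    )
    print("Submitted job id:", response["jobId"])
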

Array Jobs and Dependencies

In this demo, I am going to show you how to submit a different type of job, an array job, and also
demonstrate how to make it depend on the successful finish of another job called Dependency.
We will now create another job based on our existing job definition, globomantics-def. Instead of
running python s3.py, it will run the Linux command sleep 40. So it will do nothing, just stay idle for forty seconds, until it exits successfully. Then we will make this job the dependency of another job, this time an array job, which will create three identical instances of a job based on the same definition with the usual python s3.py command. Only after the first job has successfully finished will the dependent array job be able to start. So you will be introduced to the concept of
dependencies and array jobs at the same time. First, let's submit a job. It's called sleep-job. It's
based on definition globomantics-def, and for a command, instead of python s3.py, enter sleep 40.
Scroll down, click Submit job. As soon as it's submitted, copy the id of a job to the clipboard. Now,
submit another job; you can name it array-job with the usual options, and instead of single,
choose array job. Size will be 3, and then there is dependency. Paste the id we copied of our
running job, sleep-job. Now we have the running sleep-job, and only when it has finished executing will our three jobs run. Refresh a few times, wait 40 seconds until the sleep job expires and goes to Succeeded, and maybe some more time until there are enough resources for the array job to run, and you will see newly created S3 buckets, which means array-job has finished successfully.
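
For reference, the same pattern can be expressed with boto3: submit the sleep job, then submit an array job of size 3 that depends on it. A sketch using the names from the demo above:

    # A sketch of submitting a dependent array job with boto3 (names follow the demo above).
    import boto3

    batch = boto3.client("batch", region_name="us-east-1")

    # First job: does nothing but sleep for 40 seconds.
    sleep = batch.submit_job(
        jobName="sleep-job",
        jobQueue="globomantics-queue",
        jobDefinition="globomantics-def:1",
        containerOverrides={"command": ["sleep", "40"]},
    )

    # Array job of size 3 that may only start after sleep-job has succeeded.
    batch.submit_job(
        jobName="array-job",
        jobQueue="globomantics-queue",
        jobDefinition="globomantics-def:1",
        arrayProperties={"size": 3},
        dependsOn=[{"jobId": sleep["jobId"]}],
    )
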

Job States

In this demo, I'm going to show you which job states a submitted job is going through until it has
finished either as succeeded or failed. When submitted to a queue, the job is marked as
submitted. The job is first analyzed by the scheduler, which determines whether it has dependencies that need to be met, that is, successful completion of other jobs. If some dependency still needs to be met, its status is marked as pending. If the condition is met, or the job has no conditions, it is marked as runnable. Runnable jobs are started when there are enough resources or when the resources have been created. When the job image is pulled and the container is being started, the job is marked as starting. After the container has started, the job is marked as running. After the job has completed, it exits with an exit status, either 0 for a successful finish or non-zero for an unsuccessful one. That applies if there are no retry attempts left; otherwise, the job is moved to the runnable state again. If the job has succeeded, its status becomes succeeded; otherwise it becomes failed after all retry attempts.
Usually, failed state is connected to AWS Batch being unable to provision resources.
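
If you want to follow these transitions outside the console, a small polling sketch with boto3 would look like this; the job id is a placeholder for whatever submit_job returned:

    # A sketch that polls a submitted job until it reaches a terminal state.
    import time
    import boto3

    batch = boto3.client("batch", region_name="us-east-1")
    job_id = "00000000-0000-0000-0000-000000000000"  # placeholder: use your own job id

    while True:
        status = batch.describe_jobs(jobs=[job_id])["jobs"][0]["status"]
        print(status)  # SUBMITTED, PENDING, RUNNABLE, STARTING, RUNNING, SUCCEEDED, or FAILED
        if status in ("SUCCEEDED", "FAILED"):
            break
        time.sleep(10)
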

Module Summary

In this module, we have gone a long way from explaining core concepts of AWS Batch jobs. The
demo in this module was quite demanding, because it integrated multiple AWS services, Batch, S3,
ECR, EC2, and IAM, making it by far the most complex demo in this course, but with complexity comes power, and in the demo that followed, we used the already created Docker image to start a dependent array job with only a minimal modification to the job that was submitted previously.

Coordinating Components with AWS Step Functions

Introduction

Hello again. Welcome to this module called, Coordinating Components with AWS Step Functions.
We have gone through a lot of material in this course, and I'm going to use some of it in this
module. In this module, you are going to learn what AWS Step Functions is, what it is used for, what its key components are, tasks and state machines, and how to use an existing Lambda function to create a task. Then we will move on to other key components, activities and activity workers. We will briefly take a look at Step Functions monitoring and troubleshooting, before we finish the module by pointing out some of the limitations imposed upon Step Functions by AWS.

Step Functions Overview

AWS Step Functions is an AWS service, again running in Amazon's managed cloud. It allows you to build an application from individual tasks, which are somewhat similar to AWS Batch jobs but are based on Lambda functions, and not only those. An AWS Step Functions application is called a state machine, and a state machine consists of tasks. State machines are written in the Amazon States Language, or ASL, and the AWS Step Functions console helps you by providing a visual workflow presentation of your state machine's ASL code. Unfortunately, visual workflows are still read-only, meaning that you must create and work with the state machine using the code editor only. We can
combine tasks almost any way we like. For example, task 1 can run and task 2 can accept output of
task 1 to process it. Then we can start task 1 and task 2 in parallel so they are executed
simultaneously. And we can also start task 1 and task 2 in parallel and feed both of their outputs to
task 3 for further processing. Tasks can also make choices based on input or any other parameter
and can decide which state will be executed next. Here, task 1 is a task that, based on some parameter, decides whether to pass execution to task 2 or task 3.
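
To make the branching idea concrete, here is a small sketch, not taken from the course, of how such a choice looks in ASL, written out as a Python dictionary for readability:

    # A sketch of an ASL definition with a Choice state, expressed as a Python dict.
    # Task1 runs first; the Choice state then routes to Task2 or Task3 based on an input field.
    import json

    state_machine = {
        "StartAt": "Task1",
        "States": {
            "Task1": {"Type": "Pass", "Next": "Decide"},
            "Decide": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.size", "NumericGreaterThan": 10, "Next": "Task2"}
                ],
                "Default": "Task3"
            },
            "Task2": {"Type": "Pass", "End": True},
            "Task3": {"Type": "Pass", "End": True}
        }
    }

    print(json.dumps(state_machine, indent=2))
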

Creating a Sample State Machine

In this demo, I am going to create the simplest possible state machine so you can get the feeling
what working with Step functions from the AWS console is and what its code and visual
representation looks like. From the AWS console, go to AWS Step function, which is found in
Application Integration section of AWS Services. To quickly create a state machine, click Get
started, and choose Author with code snippets. Name the state machine globomantics-state and
scroll down. You can see code that is pre-populated, which is the Step function version of Hello
World and visual representation of that code. If you click Generate code snippet, you will see why
it is called like that. You have a wide variety of templates that will generate code snippets if you
enter parameters. For example, sending an SMS requires a phone number, and a message text.
You can see that the code changes as you enter values. The code snippet can be copied to the
clipboard, but we won't do that as we are happy with our basic sample code. Let us go back now
to our original sample code, and I will explain to you what it means. The code is written in the Amazon States Language, ASL for short. Every state machine has to start somewhere, and that is where the StartAt
keyword comes. It defines the state, which is an entry point. Here we have only one state that is
called Hello World, and this state machine is starting there. We can have multiple states and order
doesn't matter. The editor and the visual part have limited syntax checking, so if we change StartAt to HelloWorld2, we will get an error in the code editor, and refreshing the visual part will show the start disconnected from Hello World. This happens because there is no such state as HelloWorld2. Return the previous value of StartAt. There can be multiple states, but here we have only one, Hello World. As usual, I will change it to HelloGlobomantics. Refreshing the visual part reflects the
name change. Let's go to the body of the state. The Type here is Pass, meaning that the state does nothing, just acting as a kind of placeholder. Result is what is sent as a return value; I will change it to Hello Globomantics. At the end, a key named End with the Boolean, not string, value true means that the state machine ends here. Next, we need to assign permissions. There are no special permissions needed for echoing Hello Globomantics; permissions would be needed if we wanted to access other AWS resources. We can select Create an IAM role for me, and for a name
you can enter gm-state-role. Click Create state machine to get started. In the green field, you can
see that the state machine has been created along with the appropriate role. Scroll down and click
Start execution in either of these two buttons. You are offered to name the execution so you can
differentiate between different executions. You may call it whatever you want and you can also
enter input values in JSON format. We may enter company Globomantics. Click Start execution,
and we have a success message. We can see execution details with Input details, and Output
details. Scroll down, and we have a visual workflow and details in color. Scroll down some more
and you will see execution history. So you have seen the creation of a sample state machine. It
doesn't look much at the moment, but we will soon add more functionality to demonstrate its
power and flexibility.
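
For reference, the finished definition from this demo should look roughly like this; it is written here as a Python dictionary for readability, while the console editor expects plain ASL JSON:

    # A sketch of the sample state machine definition after the renames described above.
    import json

    definition = {
        "StartAt": "HelloGlobomantics",
        "States": {
            "HelloGlobomantics": {
                "Type": "Pass",
                "Result": "Hello Globomantics",
                "End": True   # Boolean true, not the string "true"
            }
        }
    }

    print(json.dumps(definition, indent=2))
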
Creating a Lambda State Machine

In this demo, I'm going to show you what steps you need to take to use Lambda function in the
task of a state machine in AWS Step functions console. To use Lambda in a state machine, we must
first create a Lambda function. I like to use code that creates an S3 bucket to demonstrate it is
working so I will do it in this demo as well. Lambda function will require access to S3. So first we
need to create a role for Lambda to have full access to S3 service. From the AWS, click Roles,
Create role, and for the service that will use this role, choose Lambda. Now go to Next
permissions, and in filter policy, search for S3 and select Amazon S3 Full Access. Next Tags, next
Review, and for role name, just enter S3. Click Create role. In Lambda AWS Services click Create
Function, Author from scratch, call it bucket-creation, and for a platform choose Python 2.7. For
execution role choose Select existing roles, and choose S3. Click Create function, and soon we
have a function ready. In the Designer frame, you can see that the right portion contains S3
service. Remember in Lambda module, I mentioned that Designer visually represents triggers and
the services Lambda has permissions to access. Well, you can see it in action again. We now need
to enter the code that will generate a bucket. We can reuse the existing code from our s3.py from the previous module. I have already copied it, so I will paste it here. I'm using the us-east-1 AWS region now, and when you are in this region, not only do you not need to specify a location constraint, but you will get an error if you do. So you may test invoke it to see that it really creates a bucket, or move on to Step Functions. Before you do that, you need to copy the ARN, or Amazon Resource Name, a unique identifier for an AWS resource, of the Lambda function by clicking Copy here. Now go to
AWS Step function, select Create state machine, and this time we will call it globomantics-state-S3.
Generate code snippet, select AWS Lambda, Invoke a function. Here we can select the function,
but we will choose to enter function instead. So we can paste the ARN of a Lambda function we
copied to the clipboard. So let's do that. We can also specify payload, key value pairs that are
available to the Lambda code, but we won't enter anything, we'll just click Copy to clipboard. We
get the message that the snippet was copied to the clipboard. Now the generated snippet should be pasted in place of the Hello World state. Take care not to add extra braces or erase required ones; it won't work otherwise. We need to change two things. First, StartAt should be Invoke Lambda function. Also, our state machine has only one state, so Next: NEXT_STATE should be replaced by End: true. Remember, true is not a string, it's a Boolean. Refresh the visual and you can see the state invoking the Lambda function.
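
After those two edits, the definition should look roughly like this; the Lambda ARN below is a placeholder for the one you copied from your own function:

    # A sketch of the state machine definition after adjusting StartAt and replacing Next with End.
    import json

    definition = {
        "StartAt": "Invoke Lambda function",
        "States": {
            "Invoke Lambda function": {
                "Type": "Task",
                # Placeholder - paste the ARN of your own bucket-creation Lambda function here.
                "Resource": "arn:aws:lambda:us-east-1:123456789012:function:bucket-creation",
                "End": True
            }
        }
    }

    print(json.dumps(definition, indent=2))
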
Click Next to go to permissions. We don't need any special role, just select Create role for me,
enter gm-state-role-s3 and Create state machine. When the state machine is created, we can test
that it is working by starting a test execution and when the execution is finished, we can go to the
S3 service and see for ourselves that the bucket is really created. So you have learned how to
create a state machine that uses a Lambda function to perform a task. I mentioned Lambda
functions are not the only way to perform tasks in state machines, and soon we'll talk about these
as well.
Activities and Activity Workers

In AWS Step Functions, there is a feature called an activity. An activity is called from a task in the state machine, and it decouples the task from the code that performs it: the work is handed off to an external process, and the task waits until that process reports back. Activities use external processes called activity workers to do specific tasks. An activity has an ARN identifier that is exposed and that external activity workers poll. So activities put the tasks assigned to activity workers in a queue while the state machine execution waits at that state. These workers can be located somewhere else; they need not be Lambda functions, and they don't even have to be located in the AWS ecosystem. For example, a worker can be a Python script located on an EC2 instance like the one we created in previous modules. In this demo, I will introduce you to this feature of AWS Step Functions
called Activities and how they work by assigning tasks to activity workers, in this case, the Python
script placed in an AWS EC2 instance. To create an activity, in the Step function dashboard, select
activities, and since we don't have any activities, click Create Activity. We will call this one
external. After it is created, all we need is its ARN. There isn't a copy icon so we'll copy it manually.
The activity, as I mentioned before, contains a queue. When a task calling an activity is executed in an AWS Step Functions state machine, a message is placed in the activity queue. It would stay there forever if nothing polled this queue. I will create a Python script that will get that message
along with potential input values and hopefully do something meaningful with it. This time I won't
create an S3 bucket, I will leave it to you to add if you like. You will have other ways to see that the
activity worker performed the task. We can use existing state machine if we replace the ARN of a
lambda function with an ARN of an activity task so activity can be used as a drop-in replacement
for a Lambda function, except that it's asynchronous. State machines can be copied so let's copy
globomantics-state-s3 to a new state machine using copy to new, and I will call it Globomantics
activity. Replace resource of a Lambda function with an ARN of an activity we created. Also,
change invoking Lambda function to invoking activity in two places. You may now refresh the
visual representation as well. The code is copied, but the permissions are not so create IAM role
for me and we can just call it activity-role. Scroll down, click Create state machine. When it is
created, scroll down and click Start execution. This is the payload that is being sent to the activity
worker, which we still don't have. You can see that the execution state is running and if you scroll
some more in the visual workflow, you can see that the invoke activity task is blue, meaning it is running. It will run until the activity worker polls the queue and returns a success message. Keep it running
while we create the worker on the EC2 instance and start it. I hope that you saved the EC2 instance from previous modules. You will need pip and boto3 installed, as well as AWS CLI credentials configured, which we have also previously covered. Create the file worker.py with the following
code shown in the SSH window. I made fonts a bit larger so you can copy it. Be sure to enter your
ARN, not mine. When you are finished, start the script and quickly go back to the execution in
visual workflow. You can see that in a few seconds, blue becomes green, which means that the
execution was a success.
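
The worker code shown on screen is not reproduced in this transcript. A minimal sketch of such an activity worker, using the Step Functions GetActivityTask and SendTaskSuccess APIs, could look like this; the activity ARN is a placeholder for your own:

    # worker.py - a minimal sketch of an activity worker (not the author's exact script).
    import boto3

    sfn = boto3.client("stepfunctions", region_name="us-east-1")
    ACTIVITY_ARN = "arn:aws:states:us-east-1:123456789012:activity:external"  # placeholder

    # Long-poll the activity queue for a task; this call blocks for up to about a minute.
    task = sfn.get_activity_task(activityArn=ACTIVITY_ARN, workerName="ec2-worker")

    if task.get("taskToken"):
        print("Received input:", task.get("input"))
        # Do the real work here (for example, create an S3 bucket), then report success
        # so the waiting state machine execution can turn green and finish.
        sfn.send_task_success(taskToken=task["taskToken"], output='{"result": "done"}')
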

Monitoring and Troubleshooting


In the following demo, I am going to demonstrate how to monitor AWS Step function state
machines and activities by using CloudWatch, as well as how to troubleshoot and find an error in
the state machine definition. Monitoring is an essential part of having a well-behaved AWS service,
and it is especially true when it comes to services like state machines or Lambda functions where
there are executions and there are things that can be monitored, such as execution time. Step Functions provides multiple CloudWatch measurements, or metrics as they call them, which can help you fine-tune your state machine settings and other resources. There isn't a dedicated link that will take us to CloudWatch from the Step Functions dashboard, so we'll have to go there ourselves. If you select Metrics from the left portion of the screen, among the namespaces, which are the CloudWatch sections, you will notice the States namespace, with the number 136 next to it, which is the number of metrics you can use. Click States and you can see various so-called dimensions,
API usage, activity, service integration, service, and execution metrics. Since we used Activity to
run a worker in the previous chapter, let's take a look at activity dimension. You can see several
metrics, among which we can take for example, activity runtime, which represents time between
activity start and activity end. To see the graph of these metrics, check the checkbox left of the
metric's ARN. The default time interval on the graph is three hours, and it has been a while since I created the activity. Make it three days and you can see the bottom line representing the lowest
value, top line representing highest value, and the line between an average value. So you can see
that there are six values and it's easy to spot the minimum and maximum value and dots that
visually represent individual values. There are a lot of options in CloudWatch, which I won't elaborate on since they're not really the topic of this course. Just make sure to check CloudWatch logs, and if you feel that your Step Function needs to be optimized, they are your best friend in situations where the number of executions is too high to process these parameters manually. A lot
of information for monitoring, as well as for troubleshooting can be found in the execution event
history. Let's take a look at the execution log of Globomantics State S3 after we execute it again
with usual parameters company and Globomantics. Execution events are ordered by their event
id. First you can see when the execution was started and the role we assigned to the state
machine. Then you can see details about the task, its name and input parameters, if any. Then
follow different details about the task to be executed. The task is started, as we can see here, and there is a lot more information in the TaskSucceeded execution event. Then come some details about the end of task execution, as well as the successful execution of the whole state machine. Let us go back to our
Globomantics activity. I will show how with execution history you can detect errors in the ASL code
that you can't see in the editor or in the visual workflow. Change the ARN of an activity worker in
Globomantics activity to externalxy. Click Save, and it doesn't really show anything in the code
editor, and even if we reload the visual workflow, there would be no error shown, but if we go to
Start execution, no parameters needed, start execution, we can see that the execution status is
failed, and if we wish to investigate the cause of the problem, we can see that under id number 3 the event type is a failure, and if we unfold it, you can see an error warning that the specified ARN ending in externalxy doesn't exist. You can see that execution history is really, really useful, but it does have
some limitations, which we will mention in the following clip, the last in the module.
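
Before moving on, note that the metrics discussed above can also be pulled programmatically rather than from the console. A sketch using boto3 and the AWS/States namespace, with a placeholder activity ARN:

    # A sketch of reading the ActivityRunTime metric for the last three days with boto3.
    from datetime import datetime, timedelta
    import boto3

    cw = boto3.client("cloudwatch", region_name="us-east-1")

    stats = cw.get_metric_statistics(
        Namespace="AWS/States",
        MetricName="ActivityRunTime",
        Dimensions=[{
            "Name": "ActivityArn",
            "Value": "arn:aws:states:us-east-1:123456789012:activity:external",  # placeholder
        }],
        StartTime=datetime.utcnow() - timedelta(days=3),
        EndTime=datetime.utcnow(),
        Period=3600,                                  # one-hour buckets
        Statistics=["Minimum", "Average", "Maximum"], # the three lines on the graph
    )
    for point in stats["Datapoints"]:
        print(point["Timestamp"], point["Minimum"], point["Average"], point["Maximum"])
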
Step Functions Limitations

Although AWS builds its public image as a company on being able to infinitely scale out, there are
certain limitations on all the services it provides, and AWS Step function is no exception to the
rule. The first limitation is one we are accustomed to in all IT-related things, naming conventions. In AWS, you can name state machines, activities, and executions. Those three can use only alphanumeric characters, dashes, underscores, plus signs, equals signs, at signs (@), dots, and commas. Non-ASCII characters can be used for names, but you won't be able to use CloudWatch logs, as it does not recognize them. Then there is a limit related to the number of state machines and activities, and it amounts to 10,000 each, which you must admit is pretty generous. The maximum number of simultaneous executions is 1 million. The maximum execution time is one year for
both state machines and tasks, and maximum time state machine or task can be idle is also one
year. It must be smaller than the execution time, of course. The maximum size of an API request is 1 MB, and the maximum input/output size is 32 KB. Then there is a history limit for individual executions, which is the number of items that appear in the history for a specific execution; it is set by default to 25,000.
There is also a history retention period, and it's 90 days after which you won't be able to see
execution history anymore. Some of these limits are hard limits, meaning that they are immutable,
but some of these are soft limits and you can request a limit increase in the AWS support center.
The soft limits are API throttling parameters like the following ones: CreateActivity,
CreateStateMachine, DeleteActivity, DeleteStateMachine, StartExecution, StopExecution,
UpdateStateMachine. I have written half of the soft limits here. The numbers that you see are
bucket size and the refill rate. These are the parameters of so-called token bucket algorithm. For
example, CreateActivity numbers of 100/1 mean that at the start you can issue 100 CreateActivity API calls at once, but then you run out of tokens, and after that you get one bonus token every second that you can spend. Some of these are region dependent, for example,
StartExecution and StopExecution, while most of them have the same values across all AWS
regions.

Module Summary

We have finished yet another module with AWS Step functions as a topic this time. After
introducing you to the Step Functions concepts, you went with me through the usual first step of creating a sample state machine, and then we added the functionality of calling an existing Lambda function from the state machine task. We then moved to another important feature of Step Functions, the activity, which works along with an activity worker, using an existing EC2 instance to run the Python
code that acts as an activity worker. We went through monitoring of state machines and activities
with CloudWatch and troubleshooting with execution history before finishing the module with a
description of soft and hard limitations of AWS Step functions.

Connecting with Amazon API Gateway


Introduction

Welcome to this module, the last in the course. It is time to learn yet another technology, and this
time, it is Amazon API Gateway. In this module you will learn what the AWS API Gateway is, what
it is used for, and its benefits. We will go through setting up a sample API gateway, learn about deployments and stages, and see how to configure API Gateway for proxy as well as for custom integration. You will also learn how to control access to an API gateway with CORS. AWS API Gateway is
an AWS service completely stored in AWS Cloud, which provides management of REST and
WebSockets APIs. It serves as a gateway to various internal AWS services by providing public
endpoints. While you must create and update your API, AWS API Gateway is responsible for traffic
management, end user authorization, performance monitoring, throttling API calls, and caching
API calls. If that sounds like a lot of theory, we will go to the console again and create a sample API
in the next demo.

Creating a Sample API

In this demo, I'm going to show you how to set up a sample API that will invoke the bucket-creation Lambda function, and how to test it from the API Gateway console. From the AWS dashboard, let's go to the AWS API Gateway service. As with most of the services, if you don't have anything created, you are offered to get started, so that is what we are going to do. AWS API Gateway is a regional service, meaning that you can have API Gateways with the same name in different regions. For this demo,
we will choose a REST API. So we will choose a new API, not a sample one nor import from other
systems. You can see that there are quite a few options to choose, but we will choose the simplest
one. I will name this API globomantics-api, and you may enter a description if you like. Now
endpoint type is interesting; Regional, Edge Optimized, or Private. Regional endpoints serve
requests coming from the same region. Edge optimized endpoints use CloudFront to reduce
latency, and Private ones are not accessible from the internet, only from specified VPCs. Now click
Create API. It is quickly created because, unlike the services we worked with in the previous modules, it doesn't require the creation of resources such as EC2 instances or Docker containers. You
are presented with a wide variety of settings. For now, this API is doing nothing so we must
configure it and put it to action. To be able to access the API, we must deploy it. That means
making it accessible to the outside world via its endpoint. In the resources, we have only a root
resource, a slash. Resources in API Gateway are organized in a tree-like fashion so there can be
sub-resources of the root resource. Each resource can have zero or more methods, which correspond to HTTP verbs like GET, PUT, and POST. Deploying the API exposes it to the public, but we must deploy it to a stage, which I will soon talk about. Click Actions, Deploy API. Since we don't have any stages, click New Stage and call it Initial, with descriptions of initial stage and initial deployment. But we can't do that; we are warned that there are no methods, so we must first create a method for this resource.
Click the root resource and choose HTTP verb. For example, choose GET, select checkmark to
confirm. Now we are allowed to set up our GET method. We have different integration points, or integration types, which specify what kind of AWS or non-AWS back-end service API Gateway connects the requesting client to. We'll choose the default, which is Lambda integration. The request is passed to the Lambda function, and if we also choose Lambda proxy integration, we will be able to pass parameters to the Lambda function code and use them for something. We need to specify the region, in our case us-east-1, and type a string contained in the
Lambda function name that we want. Hopefully, you did not delete bucket-creation Lambda
function we created in previous modules so we can use it. We then click Save. API Gateway asks
you to give the API permission to invoke the Lambda function. Now you are presented with a diagram with a lot of information, but it essentially represents the request workflow from the client to the Lambda function via API Gateway, and the response from Lambda via API Gateway back to the client.
To test the method without deploying it, which means not making it public, click on the lightning
icon, and you are presented with an overview of how different settings are configured. Scroll
down and click Test. Nothing happens and the test button has disappeared, but if we go to the S3
service, we can see that the bucket is created, meaning that the Lambda function bucket creation
has been successfully triggered.

Deployment and Stages

To make an API public, we need to deploy it instead of only testing it from an AWS console. APIs
are deployed to stages. A stage can be thought of as a snapshot of an API with its complete configuration. This allows you to have the same API deployed in slightly different versions, so you can have the usual development, testing, and production stages, called dev, test, and prod. When the API is deployed, you can access it by its HTTPS endpoint, which looks like https://api-uri.com/. Those three stages would therefore be accessed at /dev/, /test/, and /prod/. But you're not limited to this. If you have a custom DNS domain name like gm-api.com, you can point its subdomains to individual stages. The first one can be accessed at, for example, dev.gm-api.com, the second one at test.gm-api.com, and so on. In this demo, I'm going to show you how to deploy an
API to a stage, making it accessible through an HTTPS endpoint. From our Globomantics API, click
Actions, and Deploy API. Click on the new stage, name it dev. Description can be development,
and deployment description can also be something like development. After the stage is created,
we have a list of stages, and in it we have the dev stage unfolded, and you can see that it contains the root resource and a GET method. Fold it again and take a look at the invoke URL. This is a public HTTPS
endpoint that you can access from the internet with your browser, or use it programmatically,
which is the point of using an API. If we click an endpoint, it will open a new tab with the message
internal server error. Although we got internal server error, which is there because we did not
configure Lambda response yet, if we go to S3 service, we can see that the bucket is created,
meaning that a Lambda was activated.
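
Outside the browser, the same endpoint can also be called programmatically, which is the whole point of an API. A sketch using the Python requests library; the URL is a placeholder for your own invoke URL:

    # A sketch of calling the deployed stage endpoint programmatically.
    import requests

    # Placeholder - replace with the invoke URL shown for your own dev stage.
    invoke_url = "https://abc123.execute-api.us-east-1.amazonaws.com/dev/"

    response = requests.get(invoke_url)
    print(response.status_code, response.text)  # expect the internal server error message for now
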

Different API Integrations


In the previous clips, we chose Lambda integration type, but this is not the only integration that is
available in AWS API Gateway. Depending on the endpoint that you are connecting to, there are
different integrations. We have already seen how a Lambda integration works, and we chose Lambda proxy integration, the other type being Lambda custom integration. The proxy integration is the simplest one, because the data is sent to the Lambda function unchanged, and we don't need to, and even can't, modify the integration request and integration response. The Lambda
function receives request query strings, path variables, body and header unchanged. Custom
Lambda integration allows us to use complex mappings, even those from previous API systems we
may have used, which is a big plus when migrating from other systems. After that, there is an
option to choose an HTTP integration type, which also offers a choice between HTTP proxy and
HTTP custom integration. Next, there is the AWS type of integration, the only one so far that does not have a proxy subtype. You are solely responsible for the mappings that transform data from the user input to the integration request, as well as from the AWS service output to the data sent back to
the user. There is also mock type integration that implements API Gateway itself as a back end
service, which responds to a user request. An example of this type is a gateway response. And the
last type is the VPC Link type of integration. It allows us to take a private HTTP or HTTPS endpoint, usually not open to the public, and expose it to the general internet.

Configuring Gateway Responses

In this demo, we are going to learn what the Gateway responses are and how you can customize
any of these responses to show your own message and headers. There are certain responses from
the gateway that are sent back to the user if the integration point cannot be reached. You can see
them all if you click Gateway Responses from our Globomantics API. A frequent response is 403,
Missing Authentication Token, which is received when we try to access a resource that does not exist. I copy the dev stage URL into a new tab, appending a nonexistent resource at the end of the URL, and since it doesn't exist, a 403 Missing Authentication Token is returned by the API Gateway. If
we want to customize the response, go to Gateway Response 403, Missing Authentication Token.
Here we can add a new header, for example, header name company, value Globomantics, with
Globomantics enclosed in single quotes. We can also modify the message sent back to user.
Instead of the default message, we can enter Globomantics resource not found between quotes.
Now for this to take effect we must redeploy the API. We can do it to the dev stage that already
exists. Now we can go back and enter the nonexistent URL again, making sure it is not cached. We
can refresh it a few times. Now you can see the message, Globomantics resource not found, and
also header company with value Globomantics. So this is how you can easily customize any
gateway response to show the message and the headers that you want.

Custom Integration
In this demo, you're going to learn how to configure custom Lambda integration by configuring
body mapping templates and use query string parameters in Python Lambda function. We have
configured an API with Lambda proxy integration, meaning that all the data sent is available to the
Lambda function unchanged. With custom Lambda integration, we need to configure API Gateway objects to be able to, for example, get parameters sent from the browser. In the method integration request, we must first uncheck Lambda proxy integration, click Yes several times, and in the new settings that appear, go to Body Mapping Templates, add a template, and type application/json, which is grayed out as a hint. AWS asks you to allow only the defined content type instead of allowing all, which is the default; click Yes. Now paste the template that is available as a template file.
Templates are written using a special syntax called Velocity Template Language or VTL. Now click
Save, and in the method request, you also need to add the parameter hello to the URL query string parameters. Only these parameters will be allowed. Now you need to deploy the API to the existing dev stage. Every time you make a change in the configuration, don't forget to redeploy it. Now back to our bucket-creation Lambda function. You can see that I have commented out the bucket creation; we don't need it anymore, and I am assigning the hello query parameter's value to the variable hello via the event query hello structure. I'm also adding its value to the Hello response. So it should return Hello plus anything that is passed as a parameter in the browser. Now let's invoke our URL. It returns an error because no query parameter is found, since none was sent, but if we add hello=test after the URL, with a question mark of course, it will return the parameter itself: Hello, test.
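
The modified handler is not reproduced in the transcript; a sketch of it, assuming the body mapping template places the query string parameters under an event key named query, could be:

    # A sketch of the modified bucket-creation handler for custom integration.
    # Assumes the mapping template passes query string parameters under event["query"].
    def lambda_handler(event, context):
        # s3.create_bucket(...)  # the original bucket creation is commented out
        hello = event["query"]["hello"]   # fails with an error if the parameter is missing
        return "Hello, " + hello          # e.g. ?hello=test returns "Hello, test"
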

Enabling CORS

In this demo, I will show you how to enable CORS in API Gateway with a simple click of a mouse.
CORS, or Cross-Origin Resource Sharing, is a browser security feature, and it restricts cross-origin HTTP requests that are initiated from scripts running in the browser. Cross-origin means an HTTP request to another domain, subdomain, or protocol. To enable CORS support for our method, go to Resources,
choose a method, and click Actions, Enable CORS. You will be asked to replace existing CORS
headers. You should confirm it, and you will get CORS enabled for your method. It just doesn't get
more simple than that, does it?

Module Summary

In this module, which marks the end of the whole course, you have learned what the API Gateway
is, how it fits into AWS ecosystem, and we started as usual with creation of a sample API. We
moved on to deployment and stages, talking about different API integration types and gateway responses, before showing you how to set up a custom Lambda integration and how to enable
CORS. Thank you for being with me throughout this course, and I wish you all the best!
