
DEFINITIONS:

Autonomous: denoting or performed by a device capable of operating without direct human control
Backpropagation: Backpropagation is a technique used to train certain artificial
neural networks; it is essentially a principle that allows the machine-learning
program to adjust itself by examining its past performance.
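In symbols (a standard formulation, not specific to this source): backpropagation uses the chain rule to compute the gradient ∂J/∂w of the cost J with respect to each weight w, and the weights are then nudged downhill, w ← w − η · ∂J/∂w, where η is the learning rate.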
Big-O notation: A theoretical measure of the execution cost of an algorithm, usually the
time or memory needed, given the problem size n, which is usually the number of
items. Informally, f(n) = O(g(n)) means that f(n) is less than some constant
multiple of g(n) for all sufficiently large n.
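Formally (a standard definition, not from this source): f(n) = O(g(n)) if there exist constants c > 0 and n₀ such that f(n) ≤ c · g(n) for all n ≥ n₀. For example, 3n² + 5n = O(n²), since 3n² + 5n ≤ 4n² for all n ≥ 5.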
Bounding boxes: Bounding boxes are imaginary boxes drawn around objects that
are being checked for collision, such as pedestrians on or near the road, other
vehicles, and signs. Both a 2D coordinate system and a 3D coordinate system
are used. (Figure: a bounding box on a road.)
Brute-force: In cryptography, a brute-force attack consists of an attacker submitting
many passwords or passphrases with the hope of eventually guessing correctly. The
attacker systematically checks all possible passwords and passphrases until the
correct one is found.
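As a minimal illustration (hypothetical example, not from the source), a brute-force search over short lowercase passwords could look like this:

```python
import itertools
import string

def brute_force(target, max_len=4, alphabet=string.ascii_lowercase):
    """Systematically try every candidate until the target is found."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            candidate = "".join(combo)
            if candidate == target:
                return candidate
    return None

print(brute_force("abc"))  # prints "abc" after checking every shorter candidate
```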
Convolutional neural networks (CNNs): A convolutional neural network (CNN) is a
type of artificial neural network used in image recognition and processing that is
specifically designed to process pixel data. A neural network is a system of
hardware and/or software patterned after the operation of neurons in the human
brain.
Cost function: In machine learning, a cost function (or loss function) measures how far
a model's predictions are from the true values; training adjusts the model's parameters
to minimize this cost. (In economics, the same term refers to the cost of producing a
given output at given input prices.)
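A common concrete example (standard, not from the source) is the mean squared error over n training examples:

J(w) = (1/n) · Σᵢ (ŷᵢ − yᵢ)²

where ŷᵢ is the model's prediction for example i and yᵢ is the true value; training searches for the weights w that minimize J.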
Deep learning: Deep learning is a subset of machine learning in which artificial neural
networks, algorithms inspired by the human brain, learn from large amounts of data.
Deep learning allows machines to solve complex problems even when using a
data set that is very diverse, unstructured, and inter-connected.
Dijkstra’s algorithm: Dijkstra's algorithm (or Dijkstra's Shortest Path First algorithm,
SPF algorithm) is an algorithm for finding the shortest paths between nodes in a
graph, which may represent, for example, road networks. It was conceived by
computer scientist Edsger W. Dijkstra in 1956 and published three years later.
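A compact sketch (illustrative; assumes a graph stored as an adjacency dict mapping each node to its neighbours and non-negative edge weights):

```python
import heapq

def dijkstra(graph, source):
    """Return shortest distances from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]  # (distance so far, node)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbour, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return dist

roads = {"A": {"B": 4, "C": 1}, "C": {"B": 2}, "B": {}}
print(dijkstra(roads, "A"))  # {'A': 0, 'C': 1, 'B': 3}
```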
End-to-end learning: End-to-end learning refers to training a possibly complex
learning system by applying gradient-based learning to the system as a whole.
End-to-end learning systems are specifically designed so that all modules are
differentiable.
Feature maps (Activation maps): Feature map and activation map mean exactly
the same thing. It is called an activation map because it is a mapping that
corresponds to the activation of different parts of the image, and a feature map
because it is also a mapping of where a certain kind of feature is found in the image.
Filters (Kernels): In image processing, a kernel, convolution matrix, or mask is a
small matrix. It is used for blurring, sharpening, embossing, edge detection, and
more.
Filter stride: Stride is the number of pixels by which the filter shifts over the input
matrix. When the stride is 1, we move the filter one pixel at a time; when the stride
is 2, we move the filter two pixels at a time, and so on.
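A minimal NumPy sketch of a single-channel convolution with a configurable stride (illustrative only; real frameworks provide optimized versions):

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` in steps of `stride`, no padding."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i * stride:i * stride + kh,
                          j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3))
print(conv2d(image, kernel, stride=1).shape)  # (3, 3)
print(conv2d(image, kernel, stride=2).shape)  # (2, 2): larger stride, smaller map
```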
Greedy algorithm: A greedy algorithm is an algorithmic paradigm that follows the
problem-solving heuristic of making the locally optimal choice at each stage with the
hope of finding a global optimum. In general, greedy algorithms have five
components: a candidate set (from which a solution is created), a selection function
(which chooses the best candidate to add), a feasibility function (which checks whether
a candidate can contribute to a solution), an objective function (which assigns a value
to a solution), and a solution function (which indicates when a complete solution has
been found).
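A classic toy example (illustrative, not from the source) is greedy coin change, which repeatedly makes the locally optimal choice of the largest coin that fits:

```python
def greedy_coin_change(amount, coins=(25, 10, 5, 1)):
    """Repeatedly take the largest coin that fits: a locally optimal choice."""
    result = []
    for coin in coins:  # candidate set, tried from most valuable down
        while amount >= coin:
            amount -= coin
            result.append(coin)
    return result

print(greedy_coin_change(63))  # [25, 25, 10, 1, 1, 1]
```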
Machine learning: Machine learning is an application of artificial intelligence (AI) that
provides systems the ability to automatically learn and improve from experience
without being explicitly programmed. Machine learning focuses on the development
of computer programs that can access data and use it to learn for themselves.
Max-pooling / Pooling: Max pooling is a sample-based discretization process. The
objective is to down-sample an input representation (image, hidden-layer output
matrix, etc.), reducing its dimensionality and allowing assumptions to be made
about the features contained in the binned sub-regions.
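A minimal sketch of max-pooling (illustrative, using 2 × 2 windows):

```python
import numpy as np

def max_pool(feature_map, size=2, stride=2):
    """Down-sample by taking the maximum of each size x size window."""
    h, w = feature_map.shape
    oh = (h - size) // stride + 1
    ow = (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

fm = np.array([[1, 3, 2, 4],
               [5, 6, 1, 2],
               [1, 2, 0, 1],
               [3, 4, 5, 6]], dtype=float)
print(max_pool(fm))  # [[6. 4.] [4. 6.]]: a 4x4 map reduced to 2x2
```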
Multi-layer perceptron (MLP): A multilayer perceptron (MLP) is a class of
feedforward artificial neural network. An MLP consists of at least three layers of nodes:
an input layer, a hidden layer, and an output layer. Except for the input nodes, each
node is a neuron that uses a nonlinear activation function.
Nearest neighbour algorithm: In pattern recognition, the k-nearest neighbors
algorithm (k-NN) is a non-parametric method used for classification and regression.
In both cases, the input consists of the k closest training examples in the feature
space.
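A minimal k-NN classifier sketch (illustrative; assumes small in-memory data and Euclidean distance):

```python
import numpy as np
from collections import Counter

def knn_classify(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    distances = np.linalg.norm(train_X - query, axis=1)  # Euclidean distances
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

train_X = np.array([[0, 0], [0, 1], [5, 5], [6, 5]])
train_y = ["near", "near", "far", "far"]
print(knn_classify(train_X, train_y, np.array([1, 0]), k=3))  # "near"
```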
Overfitting: Overfitting refers to a model that models the training data too well.
Overfitting happens when a model learns the detail and noise in the training data to
the extent that it negatively impacts the performance of the model on new data.
Point clouds: A point cloud is a set of data points in space. Point clouds are
generally produced by 3D scanners, which measure a large number of points on the
external surfaces of objects around them.
Receptive field: The receptive field in a convolutional neural network refers to the
part of the image that is visible to one filter at a time. This receptive field increases
linearly as we stack more convolutional layers or increases exponentially when we
stack atrous convolutions.
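A quick sketch of the linear growth (a standard result, not from the source): with stride-1 convolutions, each k × k layer adds k − 1 pixels to the receptive field:

```python
def receptive_field(num_layers, kernel_size=3, stride=1):
    """Receptive field of stacked conv layers (all same kernel and stride)."""
    rf, jump = 1, 1  # field size and distance between adjacent outputs
    for _ in range(num_layers):
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf

for n in range(1, 5):
    print(n, receptive_field(n))  # 1->3, 2->5, 3->7, 4->9: linear growth
```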
Sensor Fusion: Sensor fusion is software that intelligently combines data from
several sensors for the purpose of improving application or system performance.
Combining data from multiple sensors corrects for the deficiencies of the individual
sensors to calculate accurate position and orientation information.
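As a toy illustration of the idea (a classic complementary filter; an assumption-laden sketch, not any particular product's software):

```python
def complementary_filter(angle, gyro_rate, accel_angle, dt, alpha=0.98):
    """Fuse two imperfect sensors into one orientation estimate.

    The gyro integral is smooth but drifts; the accelerometer-derived
    angle is noisy but drift-free. Blending corrects each one's deficiency.
    """
    return alpha * (angle + gyro_rate * dt) + (1 - alpha) * accel_angle

angle = 0.0
for gyro_rate, accel_angle in [(0.1, 0.01), (0.1, 0.02), (0.1, 0.03)]:
    angle = complementary_filter(angle, gyro_rate, accel_angle, dt=0.1)
print(round(angle, 4))  # ~0.03: a blend of both sensors' information
```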
Society of Automotive Engineers: SAE International, initially established as the
Society of Automotive Engineers, is a U.S.-based, globally active professional
association and standards developing organization for engineering professionals in
various industries.
Shift invariance (Spatial invariance): For a system to be shift-invariant (or
time-invariant) means that a time-shifted version of the input yields a time-shifted
version of the output: if y(t) = T[x(t)], then y(t − t₀) = T[x(t − t₀)].
Vehicle-to-vehicle (V2V) protocol: Vehicle-to-vehicle (V2V) is an automobile
technology designed to allow automobiles to "talk" to each other. V2V
communications form a wireless ad hoc network on the roads. Such networks are
also referred to as vehicular ad hoc networks (VANETs).
Vehicle-to-infrastructure (V2I) protocol: Vehicle-to-infrastructure (V2I) is a
communication model that allows vehicles to share information with the components
that support a country's highway system. Similar to vehicle-to-vehicle (V2V)
communication, V2I uses dedicated short-range communication (DSRC) frequencies
to transfer data.
DSRC: Dedicated Short-Range Communication, the wireless technology used for V2V and V2I data transfer.
FINDINGS:
As of today, automotive engineers across all branches of the industry are aiming to reach
fully autonomous vehicles (SAE Level 5) in the near future. A few examples: Google
planned a release by 2020, Tesla by 2020, Ford by 2021, and so on.
Sven Larsson, the leader of the Levangerstadt project, outlined: “This is an ambitious plan for
building a new town that will incorporate the latest technology in providing an environment
that is safe both for the society that will live here and for the environment in which they will
live. One fundamental policy will be the incorporation of autonomous vehicles as the only
form of transport within the town’s boundaries.”
He later outlined that the cars could be summoned at any time through something along
the lines of an app, showing that the system would be extremely user-friendly, and that
they would run on improvised routes based on traffic, weather, and so on, so that the
optimum ride would be provided for each user. He then went on to distinguish a few of
the several advantages presented by such a system. He also outlined a case in which
human driving could result in a more efficient and safer reaction.
The main aim of this project is to enable a vehicle to determine its exact location, perceive
its environment and surroundings, and make the right decisions to get from point A to
point B safely.
Regarding the cost of the vehicle, it is predicted to be extremely high: sensor fusion,
LIDAR, and the combination of GPS with high-definition mapping are not only hard to
manufacture, but labour costs also increase because minute details must be monitored
at the production site. These components are all vital, as each plays a key part in the
deep learning that enables the car to act on its own within safety regulations and
instructions.
CNN:
The technical team would design the structure of the CNN, determine its initial parameters,
and carry out the training of the network. He emphasized, however, that his technical team
should understand the basic features of a CNN and how each layer is created from the
previous one. He first described the different layers:
• the input layer, which for a colour image could be, for example, a 32 × 32 × 3 pixel input plane (32 × 32 being the resolution of the image, with 3 RGB colour channels);
• the convolution layers, in which filters search for certain features by moving through the image in steps (the length of each step set by the stride used);
• the feature maps (one for each specific feature), which are produced as a result of the convolutions;
• the ReLU layers, which introduce non-linearity into the process; this helps with the training of these networks;
• the pooling layers, in which representative values are taken from one layer to form the next; max-pooling is the specific technique used in this case study;
• the fully-connected layer, which then links all the features together;
• the output layer, which gives a series of probabilities that the original image was a particular object.
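A minimal sketch of such a network in Keras (illustrative; the filter counts, layer sizes, and the 10-class output are assumptions, not values from the case study):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Input: 32 x 32 RGB image; output: probabilities over 10 hypothetical classes.
model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(16, kernel_size=3, strides=1, activation="relu"),  # conv + ReLU
    layers.MaxPooling2D(pool_size=2),                                # max-pooling
    layers.Conv2D(32, kernel_size=3, strides=1, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),      # fully-connected layer
    layers.Dense(10, activation="softmax"),   # output layer: class probabilities
])
model.summary()
```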
The way CNNs capture an image or feature is quite distinctive: instead of just converting
the image into strings of 0s and 1s, they arrange the values into small matrices (kernels).
For example, the 3 × 3 matrix

1 0 1
1 0 1
1 0 0

would be identified as some stored value; it could hold, for example, part of the pixel
pattern for a dog or a car. The final image works somewhat like a projector's: the
original image is grayscale and is later split into channels to be further elaborated on
and refined. These kernels can blur, sharpen, or Gaussian-blur an image, and features
are identified via aspects such as edge detection.
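For instance (an illustrative example, not from the source), a classic 3 × 3 edge-detection kernel can be applied to a toy image with SciPy:

```python
import numpy as np
from scipy.signal import convolve2d

# Classic Laplacian-style edge-detection kernel.
edge_kernel = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]])

# A toy grayscale "image": a bright square on a dark background.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0

edges = convolve2d(image, edge_kernel, mode="same")
print(edges)  # non-zero responses concentrate along the square's border
```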

• Depth: Depth corresponds to the number of filters we use for the convolution
operation. In the network shown in Figure 7, we perform convolution of the
original boat image using three distinct filters, thus producing three different
feature maps. You can think of these three feature maps as stacked 2D
matrices, so the ‘depth’ of the feature map would be three.

• Stride: Stride is the number of pixels by which we slide our filter matrix over the input
matrix. When the stride is 1, we move the filters one pixel at a time; when the
stride is 2, the filters jump 2 pixels at a time as we slide them around. Having a
larger stride will produce smaller feature maps (the exact output size is computed
in the sketch after this list).

• Zero-padding: Sometimes it is convenient to pad the input matrix with zeros around
the border, so that we can apply the filter to the bordering elements of our input image
matrix. A nice feature of zero-padding is that it allows us to control the size of the
feature maps. Using zero-padding is also called wide convolution; not using
zero-padding is a narrow convolution.
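Putting stride and padding together (a standard formula, not from the source), an input of width W, filter of width F, padding P, and stride S give an output width of (W − F + 2P)/S + 1:

```python
def output_size(w, f, p=0, s=1):
    """Width of a feature map after one convolution layer."""
    return (w - f + 2 * p) // s + 1

print(output_size(32, 3, p=0, s=1))  # 30: narrow convolution
print(output_size(32, 3, p=1, s=1))  # 32: zero-padding preserves size
print(output_size(32, 3, p=1, s=2))  # 16: larger stride shrinks the map
```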
CNN PRODUCES FEATURE MAPS BY STACKING IMAGES (VIA FILTERS)

Trolley problem:
A moving trolley is on a track with five people on it, while a second track, onto which
the trolley can be shifted, has one person on it (and the trolley cannot be stopped).
Which track would the AI choose?
Extra research:

The self-driving car requires many sensors on the vehicle that enable the complex software
stack to do its job of replicating the human control function safely.

At the same time that these sensors accurately perceive and understand the vehicle’s
surroundings in real time, they must also simultaneously localize the position and direction
(also referred to as heading) of the vehicle with far greater precision and reliability than
traditional car navigation systems (like GPS) can obtain.

Consider for a moment a truck travelling down an interstate freeway at 80 mph. Staying
safely centred in its lane requires lateral positional control on the order of 30 centimetres. The
truck is also travelling nearly 3,600 centimetres every second. An error as small as 0.2
degrees in heading or direction will result in the vehicle drifting left or right by those 30
centimetres in just seconds.
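A quick check of that arithmetic (illustrative Python):

```python
import math

speed_cm_s = 80 * 160934.4 / 3600      # 80 mph in cm/s (1 mile = 160,934.4 cm)
drift_rate = speed_cm_s * math.sin(math.radians(0.2))  # lateral drift, cm/s
print(round(speed_cm_s), round(drift_rate, 1))  # ~3576 cm/s, ~12.5 cm/s
print(round(30 / drift_rate, 1))  # ~2.4 s to drift the full 30 cm
```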

To continuously maintain these tight tolerances, the navigation function in automated vehicle
control systems relies on a wide range of sensors and data sources.

These sources include the vehicle’s vision sensors (such as LIDAR and cameras),
lower-resolution ranging sensors (like radar and ultrasound), and classic navigation
techniques (like GPS and maps). However, each of these sensors depends on the external
environment and hence can experience data loss or degradation with little warning. For
example, in snowy conditions a LIDAR’s effective range and resolution are reduced. In
downtown urban areas, GPS can suffer from severe multi-path errors and frequent outage.

The function of one or more inertial measurement unit (IMU) sensors on the vehicle is to
provide a source of accurate short-term position and heading information to mitigate these
environmental challenges, ensuring safe control of the vehicle at all times.

Figure 1: Inertial Measurement Unit (IMU) technology helps autonomous vehicles achieve
this precision localization. ACEINNA IMU381ZA 9-Axis Precision IMU.

IMUs consist of three orthogonally-mounted accelerometers and three orthogonally-mounted
rate sensors (gyros) that can directly measure the vehicle’s motion with no external
dependencies. The output of the IMU is used to independently navigate or "dead-reckon" for
short periods of time. Dead-reckoning, also called free-integration, refers to processing the
IMU outputs using Newtonian equations of motion to estimate attitude, heading, velocity,
and position of the vehicle.

Dead-reckoning places tremendous demands on IMU accuracy and requires careful design
choices related to both algorithm design and sensor selection. Traditionally, systems capable
of dead-reckoning have been called Inertial Navigation Systems (INS), and also go by the
names Inertial Reference System (IRS), GPS/INS, or Enhanced GPS/INS (EGI).

Today, these systems are commonly found in aerospace and defense applications. Not only
are these systems typically $10,000 or more per unit, but they generally work exclusively
with GPS for navigation as opposed to the broader set of sensors in today's autonomous
vehicle and ADAS architectures. Hence, many designs require direct use of IMU data in a
sensor fusion algorithm that blends LIDAR, camera, and radar as well as GPS data into a
navigation state estimate.

An Automated Left-Turn

One common use case for the IMU is to help reliably navigate intersections. Street
intersections, with their frequent lack of lane markers and wide-open spaces, can challenge
vision systems. Furthermore, in urban environments, there may not be good GPS data
available, yet crossing or turning through an intersection safely is fundamental to automated
driving. Using this practical use case as motivation, the remainder of this article summarizes
error modeling, simulation, and empirical testing techniques to validate IMU accuracy and
performance for this application.

Conceptually, dead-reckoning is a fairly simple algorithm. It computes the vehicle’s position
directly from IMU data and is like a short-term GPS signal without a GPS radio.
Dead-reckoning works by taking the three-axis rate output and integrating it to orientation.
Orientation is simply the vehicle's combination of heading (yaw) and attitude (roll and pitch).
The orientation is used to rotate the measured body-frame accelerations into the Earth frame,
and then gravity is removed. The resulting Earth-frame accelerations are integrated twice to
obtain position, following Newton's laws of motion. This is diagrammed in Figure 3:

Figure 3: How IMU is used to Dead-Reckon - Free Integration Algorithm. (Image Source:
Researchgate.net)
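A simplified 2D sketch of that free-integration loop (an illustration under strong assumptions: planar motion and gravity already removed; not any vendor's actual algorithm):

```python
import numpy as np

def dead_reckon(samples, dt, heading=0.0):
    """Integrate planar IMU samples (yaw rate, body-frame ax, ay) to a 2D track.

    samples: iterable of (yaw_rate_rad_s, accel_x, accel_y) in the body frame.
    Gravity is assumed already removed for this flat, planar example.
    """
    velocity = np.zeros(2)
    position = np.zeros(2)
    track = [position.copy()]
    for yaw_rate, ax, ay in samples:
        heading += yaw_rate * dt                  # integrate rate to orientation
        c, s = np.cos(heading), np.sin(heading)
        rot = np.array([[c, -s], [s, c]])         # body -> Earth frame rotation
        accel_earth = rot @ np.array([ax, ay])    # rotate measured accelerations
        velocity += accel_earth * dt              # first integration: velocity
        position += velocity * dt                 # second integration: position
        track.append(position.copy())
    return np.array(track)

# Constant forward acceleration, no turning: the track runs straight along x.
print(dead_reckon([(0.0, 1.0, 0.0)] * 5, dt=0.1)[-1])  # ~[0.15, 0.0]
```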
