
# Data Structures and Algorithms

Sorting Algorithms

Tahira Mahboob

## Bubble sort


 Compare each element (except the last one) with its neighbor to the right; if they are out of order, swap them. This puts the largest element at the very end, so the last element is now in its correct and final place.
 Compare each element (except the last two) with its neighbor to the right, swapping as before. This puts the second-largest element next to last, so the last two elements are now in their correct and final places.
 Compare each element (except the last three) with its neighbor to the right, and so on.
 Continue as above until you have no unsorted elements on the left.

## Example of bubble sort

7 2 8 5 4
2 7 8 5 4
2 7 8 5 4
2 7 5 8 4
2 7 5 4 8
2 7 5 4 8
2 7 5 4 8
2 5 7 4 8
2 5 4 7 8
2 5 4 7 8
2 5 4 7 8
2 4 5 7 8
2 4 5 7 8
2 4 5 7 8 (done)

## Code for bubble sort

```c
void bubbleSort(int a[], int size) {
    int outer, inner;
    for (outer = size - 1; outer > 0; outer--) {   // counting down
        for (inner = 0; inner < outer; inner++) {  // bubbling up
            if (a[inner] > a[inner + 1]) {         // if out of order...
                int temp = a[inner];               // ...then swap
                a[inner] = a[inner + 1];
                a[inner + 1] = temp;
            }
        }
    }
}
```

## Analysis of bubble sort



```c
for (outer = size - 1; outer > 0; outer--) {
    for (inner = 0; inner < outer; inner++) {
        if (a[inner] > a[inner + 1]) {
            // code for swap omitted
        }
    }
}
```

Let n = size - 1 (the number of passes). The outer loop is executed n - 1 times (call it n, that's close enough). Each time the outer loop is executed, the inner loop is executed: the inner loop runs n - 1 times at first, dropping linearly to just once. On average, the inner loop executes about n/2 times for each execution of the outer loop.


In the inner loop, the comparison is always done (constant time) and the swap might be done (also constant time). The result is n * n/2 * k for some constant k, that is, O(n²/2) = O(n²).
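To check the n * n/2 comparison count empirically, the bubble sort above can be instrumented to count comparisons. A sketch (the counting wrapper `countingBubbleSort` is mine, not part of the original code); for n elements it always performs exactly n(n-1)/2 comparisons:

```c
/* Bubble sort instrumented to count comparisons.
   Returns the number of comparisons performed. */
long countingBubbleSort(int a[], int size) {
    long comparisons = 0;
    for (int outer = size - 1; outer > 0; outer--) {   /* counting down */
        for (int inner = 0; inner < outer; inner++) {  /* bubbling up */
            comparisons++;
            if (a[inner] > a[inner + 1]) {             /* if out of order... */
                int temp = a[inner];                   /* ...then swap */
                a[inner] = a[inner + 1];
                a[inner + 1] = temp;
            }
        }
    }
    return comparisons;  /* n(n-1)/2 for an array of n elements */
}
```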


Insertion sort works by inserting each element at its appropriate position among the elements before it. The first iteration compares the 1st element with the 0th element. In the second iteration, the 2nd element is compared with the 0th and 1st elements. In general, in every iteration an element is compared with all the elements before it; once the suitable position for the element in question is found, space is created for it by shifting the larger elements one position to the right, and it is inserted there.

## Insertion sort

## Example of insertion sort
5 7 0 3 4 2 6 1 (0)
5 7 0 3 4 2 6 1 (0)
0 5 7 3 4 2 6 1 (2)
0 3 5 7 4 2 6 1 (2)
0 3 4 5 7 2 6 1 (2)
0 2 3 4 5 7 6 1 (4)
0 2 3 4 5 6 7 1 (1)
0 1 2 3 4 5 6 7 (6)

(The number in parentheses is the number of elements shifted in that iteration.)

## Code for insertion sort
```c
void insertionsort(int a[], int size) {
    int i, temp, j;
    for (i = 1; i < size; i++) {
        temp = a[i];
        for (j = i - 1; j >= 0; j--) {
            if (a[j] > temp)
                a[j + 1] = a[j];   // shift larger elements right
            else
                break;
        }
        a[j + 1] = temp;           // insert into the created gap
    }
}
```

## Analysis of insertion sort



We run once through the outer loop, inserting each of n elements; this is a factor of n. On average, there are n/2 elements already sorted, and the inner loop looks at (and moves) half of these; this gives a second factor of n/4. Hence, the time required for an insertion sort of an array of n elements is proportional to n²/4. Discarding constants, we find that insertion sort is O(n²).

## Summary

Bubble sort and insertion sort are both O(n²). As we will see later, we can do much better than this with somewhat more complicated sorting algorithms. Within O(n²), bubble sort is very slow and should probably never be used for anything; insertion sort is usually the faster of the two. In fact, for small arrays (say, 10 or 15 elements), insertion sort is faster than more complicated sorting algorithms, so selection sort and insertion sort are good enough for small arrays.

## Divide and Conquer Algorithms




Base case: if the problem is small enough, solve it directly. Otherwise: divide the problem into two or more similar, smaller subproblems; recursively solve the subproblems; combine the solutions to the subproblems. We will study two divide-and-conquer algorithms.
## Merge sort

Merge sort follows this pattern. Base case: a single element is already sorted. Otherwise, split the array into two subproblems (a FirstPart and a SecondPart), recursively sort each part, then combine the sorted FirstPart and sorted SecondPart by merging them.

Algorithm
```c
void mergesort(int lo, int hi) {
    if (lo < hi) {
        int m = (lo + hi) / 2;   // midpoint
        mergesort(lo, m);        // sort the left half
        mergesort(m + 1, hi);    // sort the right half
        merge(lo, m, hi);        // combine the two sorted halves
    }
}
```
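The slides invoke merge(lo, m, hi) but never show it. Below is a minimal sketch of a merge routine, assuming (as the slide code does) a global array `a`; the temporary buffer `tmp` and the size bound `MAX` are my additions, and the slide's mergesort is repeated so the sketch is self-contained:

```c
#define MAX 100
int a[MAX];            /* the array being sorted (global, as in the slides) */
static int tmp[MAX];   /* scratch buffer for merging (assumption) */

/* Merge the sorted runs a[lo..m] and a[m+1..hi] into one sorted run. */
void merge(int lo, int m, int hi) {
    int i = lo, j = m + 1, k = lo;
    while (i <= m && j <= hi)              /* take the smaller front element */
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= m)  tmp[k++] = a[i++];     /* copy any leftovers */
    while (j <= hi) tmp[k++] = a[j++];
    for (k = lo; k <= hi; k++)             /* copy back into a[] */
        a[k] = tmp[k];
}

void mergesort(int lo, int hi) {
    if (lo < hi) {
        int m = (lo + hi) / 2;
        mergesort(lo, m);
        mergesort(m + 1, hi);
        merge(lo, m, hi);
    }
}
```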

MergeSort (Example)

Merging the sorted runs 14 23 45 98 and 6 33 42 67: at each step the smaller of the two front elements is moved to the output.

6
6 14
6 14 23
6 14 23 33
6 14 23 33 42
6 14 23 33 42 45
6 14 23 33 42 45 67
6 14 23 33 42 45 67 98

Merge-Sort Analysis

Time: most of the work is in the merging. Total time: O(n log n).

Space: O(n); merge sort uses more space than the other sorts.

## Quicksort Algorithm

Given an array of n elements (e.g., integers): if the array contains only one element, return. Otherwise: pick one element to use as the pivot; partition the elements into two sub-arrays, the elements less than or equal to the pivot and the elements greater than the pivot; quicksort the two sub-arrays; return the results.

Quicksort
To sort a[left...right]
1. if left < right
   1.1. Partition a[left...right] such that all of a[left...p-1] are less than a[p], and all of a[p+1...right] are >= a[p]
   1.2. Quicksort a[left...p-1]
   1.3. Quicksort a[p+1...right]
2. Terminate

Partitioning


## A key step in the Quicksort algorithm is partitioning the array

We choose some (any) number p in the array to use as a pivot. We then partition the array into three parts: the numbers less than p, p itself, and the numbers greater than or equal to p.

Partitioning

Choose an array value (say, the first) to use as the pivot. Starting from the left end, find the first element that is greater than or equal to the pivot. Searching backward from the right end, find the first element that is less than the pivot. Interchange (swap) these two elements. Repeat, searching from where we left off, until done.

Partitioning


To partition a[left...right]

1. Set p = a[left], l = left + 1, r = right
2. While l < r, do
   2.1. While l < right and a[l] < p, set l = l + 1
   2.2. While r > left and a[r] >= p, set r = r - 1
   2.3. If l < r, swap a[l] and a[r]
3. Set a[left] = a[r], a[r] = p
4. Terminate
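The pseudocode above translates almost line-for-line into C. A sketch (the function names `partition` and `quicksort` are illustrative, not from the slides):

```c
static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Partition a[left..right] around the pivot a[left];
   returns the pivot's final index. */
int partition(int a[], int left, int right) {
    int p = a[left];
    int l = left + 1, r = right;
    while (l < r) {
        while (l < right && a[l] < p)  l++;   /* first element >= pivot */
        while (r > left  && a[r] >= p) r--;   /* last element  <  pivot */
        if (l < r) swap(&a[l], &a[r]);
    }
    a[left] = a[r];   /* put the pivot between the two parts */
    a[r] = p;
    return r;
}

void quicksort(int a[], int left, int right) {
    if (left < right) {
        int p = partition(a, left, right);
        quicksort(a, left, p - 1);
        quicksort(a, p + 1, right);
    }
}
```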

Example of partitioning

choose pivot:    4 3 6 9 2 4 3 1 2 1 8 9 3 5 6
search:          4 3 6 9 2 4 3 1 2 1 8 9 3 5 6
swap:            4 3 3 9 2 4 3 1 2 1 8 9 6 5 6
search:          4 3 3 9 2 4 3 1 2 1 8 9 6 5 6
swap:            4 3 3 1 2 4 3 1 2 9 8 9 6 5 6
search:          4 3 3 1 2 4 3 1 2 9 8 9 6 5 6
swap:            4 3 3 1 2 2 3 1 4 9 8 9 6 5 6
search:          4 3 3 1 2 2 3 1 4 9 8 9 6 5 6   (left and right have crossed)
swap with pivot: 1 3 3 1 2 2 3 4 4 9 8 9 6 5 6

## Analysis of Quicksort: best case



Suppose each partition operation divides the array almost exactly in half. Then the depth of the recursion is log₂n.


## However, there are many recursions!

How can we figure this out? We note that each partition is linear over its subarray, and all the partitions at one level together cover the whole array.


## Partitioning at various levels

Best case

We cut the array size in half each time, so the depth of the recursion is log₂n. At each level of the recursion, all the partitions at that level together do work that is linear in n. O(log₂n) * O(n) = O(n log₂n). Hence in the best case, quicksort has time complexity O(n log₂n). What about the worst case?

Worst case


In the worst case, partitioning always divides the size-n array into these three parts: a length-one part containing the pivot itself, a length-zero part, and a length n-1 part containing everything else.


We don't recur on the zero-length part. Recurring on the length n-1 part requires (in the worst case) recurring to depth n-1.

## Worst case for quicksort



In the worst case, recursion may be n levels deep (for an array of size n), but the partitioning work done at each level is still n. O(n) * O(n) = O(n²), so the worst case for quicksort is O(n²). When does this happen?


## Typical case for quicksort



If the array is sorted to begin with, quicksort is terrible: O(n²). It is possible to construct other bad cases. However, quicksort is usually O(n log₂n); the constants are so good that quicksort is generally the fastest sorting algorithm known, and most real-world sorting is done by quicksort.

## Picking a better pivot



Before, we picked the first element of the subarray to use as a pivot. If the array is already sorted, this results in O(n²) behavior, and it's no better if we pick the last element. We could do an optimal quicksort (guaranteed O(n log n)) if we always picked a pivot value that exactly cuts the array in half. Such a value is called a median: half of the values in the array are larger, half are smaller. The easiest way to find the median is to sort the array and pick the value in the middle (!)

Median of three
Obviously, it doesn't make sense to sort the array in order to find the median to use as a pivot. Instead, compare just three elements of our (sub)array: the first, the last, and the middle. Take the median (middle value) of these three as the pivot. It's possible (but not easy) to construct cases which will make this technique O(n²). Suppose we rearrange (sort) these three numbers so that the smallest is in the first position, the largest in the last position, and the other in the middle; this lets us simplify and speed up the partition loop.
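A sketch of the median-of-three selection just described, sorting the three sampled values into place as suggested (the names `medianOfThree` and `swapi` are mine, not from the slides):

```c
static void swapi(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Rearrange a[left], a[mid], a[right] so that
   a[left] <= a[mid] <= a[right]; a[mid] is then the median of the
   three sampled values, ready to be used as the pivot. */
int medianOfThree(int a[], int left, int right) {
    int mid = (left + right) / 2;
    if (a[left] > a[mid])   swapi(&a[left], &a[mid]);
    if (a[left] > a[right]) swapi(&a[left], &a[right]);
    if (a[mid]  > a[right]) swapi(&a[mid],  &a[right]);
    return a[mid];   /* the median of the three sampled values */
}
```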

Heap Sort
Heap sort uses a heap as its data structure. It is an in-place, memory-efficient sorting algorithm with time complexity O(n log(n)).

What is a Heap
A heap is also known as a priority queue and can be represented by a binary tree with the following properties:


Structure property: a heap is a completely filled binary tree, with the exception of the bottom row, which is filled from left to right.

Heap order property: for every node x in the heap, the value of the parent of x is greater than or equal to the value of x (this is known as a max-heap).

Example: a max-heap with root 53, whose nodes hold the values 53, 44, 25, 15, 21, 13, 18, 3, 12, 5, 7.

Algorithm


Step 1. Build heap, O(n): build a binary tree from the N input items, ensuring the heap structure property holds; in other words, build a complete binary tree. Then heapify the binary tree, making sure it satisfies the heap order property. Step 2. Perform n deleteMax operations, O(log(n)) each: delete the maximum element in the heap, which is the root node, and place this element at the end of the sorted array.
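The two steps can be sketched compactly in C. This sketch (function names are mine) uses 0-based array indexing, so the child/parent formulas differ by one from the 1-based formulas given later in the slides:

```c
static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Sift a[i] down until the max-heap order property holds.
   0-based indexing: children of i are 2i+1 and 2i+2. */
static void siftDown(int a[], int i, int size) {
    while (2 * i + 1 < size) {
        int child = 2 * i + 1;
        if (child + 1 < size && a[child + 1] > a[child])
            child++;                    /* pick the larger child */
        if (a[i] >= a[child]) break;    /* heap order restored */
        swap(&a[i], &a[child]);
        i = child;
    }
}

void heapSort(int a[], int size) {
    /* Step 1: build the max-heap, O(n) */
    for (int i = size / 2 - 1; i >= 0; i--)
        siftDown(a, i, size);
    /* Step 2: n-1 deleteMax operations, each O(log n) */
    for (int end = size - 1; end > 0; end--) {
        swap(&a[0], &a[end]);   /* move the max to its final position */
        siftDown(a, 0, end);    /* re-heapify the shrunken heap */
    }
}
```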

Simplifying things


For speed and efficiency we can represent the heap with an array: place the root at array index 1, its left child at index 2, its right child at index 3, and so on.

index: 1  2  3  4  5  6  7  8  9  10 11
value: 53 44 25 15 21 13 18 3  12 5  7


## For any node i, the following formulas apply:

 Index of its parent = i / 2 (integer division)
 Index of left child = 2 * i
 Index of right child = 2 * i + 1
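These formulas are plain integer arithmetic; a minimal check of the 1-based index math (function names are illustrative):

```c
/* 1-based heap index arithmetic, as given in the slide. */
int parent(int i)     { return i / 2; }       /* integer division */
int leftChild(int i)  { return 2 * i; }
int rightChild(int i) { return 2 * i + 1; }
```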

Sample Run


## Start with unordered array of data

Array representation:

21 15 25 3 5 12 7 19 45 2 9

Binary tree representation: 21 at the root, with children 15 and 25; below them 3, 5, 12, 7; and 19, 45, 2, 9 on the bottom row.

Sample Run


## Heapify the binary tree

Sifting each internal node down, from the last internal node up to the root, turns the array into a max-heap:

21 15 25 3 5 12 7 19 45 2 9  →  45 21 25 19 9 12 7 15 3 2 5

## Step 2: perform n-1 deleteMax(es)

Swap the root (the maximum) with the last element in the heap, shrink the heap by one, and re-heapify. Each deleted maximum thus lands in its final position at the end of the array:

45 21 25 19 9 12 7 15 3 2 5        (the heap)
25 21 12 19 9 5 7 15 3 2 | 45
21 19 12 15 9 5 7 2 3 | 25 45
19 15 12 3 9 5 7 2 | 21 25 45
15 9 12 3 2 5 7 | 19 21 25 45
12 9 7 3 2 5 | 15 19 21 25 45
9 5 7 3 2 | 12 15 19 21 25 45
7 5 2 3 | 9 12 15 19 21 25 45
5 3 2 | 7 9 12 15 19 21 25 45
3 2 | 5 7 9 12 15 19 21 25 45
2 | 3 5 7 9 12 15 19 21 25 45

and finally: 2 3 5 7 9 12 15 19 21 25 45

Conclusion

1st step: build the heap, O(n) time complexity. 2nd step: perform n deleteMax operations, each with O(log(n)) time complexity. Total time complexity: O(n log(n)). Pros: a fast, memory-efficient sorting algorithm, especially for very large values of n. Cons: the slower of the O(n log(n)) sorting algorithms.