Sorting Algorithms
Tahira Mahboob
Bubble sort
Compare each element (except the last one) with its neighbor to the right. If they are out of order, swap them. This puts the largest element at the very end; the last element is now in its correct and final place.
Compare each element (except the last two) with its neighbor to the right. If they are out of order, swap them. This puts the second largest element next to last; the last two elements are now in their correct and final places.
Compare each element (except the last three) with its neighbor to the right. Continue as above until you have no unsorted elements on the left.
for (outer = size - 1; outer > 0; outer--) {
    for (inner = 0; inner < outer; inner++) {
        if (a[inner] > a[inner + 1]) {
            // code for swap omitted
        }
    }
}
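A self-contained sketch with the omitted swap written out (the function name is mine; the loop structure follows the slide):

void bubblesort(int a[], int size)
{
    int outer, inner, temp;
    for (outer = size - 1; outer > 0; outer--) {       /* unsorted part is a[0..outer] */
        for (inner = 0; inner < outer; inner++) {
            if (a[inner] > a[inner + 1]) {              /* neighbors out of order?      */
                temp = a[inner];                        /* swap them                    */
                a[inner] = a[inner + 1];
                a[inner + 1] = temp;
            }
        }
    }
}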
Let n = the size of the array. The outer loop is executed n-1 times (call it n, that's close enough). Each time the outer loop is executed, the inner loop is executed. The inner loop executes n-1 times at first, linearly dropping to just once. On average, the inner loop executes about n/2 times for each execution of the outer loop.
In the inner loop, the comparison is always done (constant time) and the swap might be done (also constant time). The result is about n * n/2 * k operations for some constant k, that is, k*n^2/2 = O(n^2).
Implemented by inserting a particular element at the appropriate position. The first iteration starts by comparing the 1st element with the 0th element. In the second iteration the 2nd element is compared with the 0th and 1st elements. In general, in every iteration an element is compared with all the elements before it. If the element in question can be inserted at a suitable position, then space is created for it by shifting the elements one position to the right.
Insertion Sort
Example (the number of elements shifted at each step is shown in parentheses):
5 7 0 3 4 2 6 1   (0)
5 7 0 3 4 2 6 1   (0)
0 5 7 3 4 2 6 1   (2)
0 3 5 7 4 2 6 1   (2)
0 3 4 5 7 2 6 1   (2)
0 2 3 4 5 7 6 1   (4)
0 2 3 4 5 6 7 1   (1)
0 1 2 3 4 5 6 7   (6)
void insertionsort(int a[], int size)
{
    int i, temp, j;
    for (i = 1; i < size; i++) {
        temp = a[i];                 /* element to insert */
        for (j = i - 1; j >= 0; j--) {
            if (a[j] > temp)
                a[j + 1] = a[j];     /* shift larger elements one place right */
            else
                break;
        }
        a[j + 1] = temp;             /* insert into the gap */
    }
}
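A small driver (not from the slides) that applies this routine to the array from the trace above:

#include <stdio.h>

int main(void)
{
    int a[] = {5, 7, 0, 3, 4, 2, 6, 1};
    int n = sizeof a / sizeof a[0];
    insertionsort(a, n);             /* assumes the function above is in scope */
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);         /* prints: 0 1 2 3 4 5 6 7 */
    printf("\n");
    return 0;
}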
We run once through the outer loop, inserting each of n elements; this is a factor of n. On average, there are n/2 elements already sorted, and the inner loop looks at (and moves) half of these; this gives a second factor of n/4. Hence, the time required for an insertion sort of an array of n elements is proportional to n^2/4. Discarding constants, we find that insertion sort is O(n^2).
Summary
Bubble sort and insertion sort are both O(n^2). As we will see later, we can do much better than this with somewhat more complicated sorting algorithms. Within O(n^2), bubble sort is very slow and should probably never be used for anything. Insertion sort is usually the faster of the two; in fact, for small arrays (say, 10 or 15 elements), insertion sort is faster than more complicated sorting algorithms. Another O(n^2) algorithm, selection sort, and insertion sort are good enough for small arrays.
Base case: the problem is small enough, solve it directly
Divide the problem into two or more similar and smaller subproblems
Recursively solve the subproblems
Combine the solutions to the subproblems
We will study two divide-and-conquer algorithms
Algorithm
void mergesort(int lo, int hi)
{
    if (lo < hi) {
        int m = (lo + hi) / 2;    /* split point */
        mergesort(lo, m);         /* sort left half  */
        mergesort(m + 1, hi);     /* sort right half */
        merge(lo, m, hi);         /* merge the two sorted halves */
    }
}
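The merge() routine is not shown on the slide. A plausible sketch, assuming (as the lo/hi-only signature suggests) that the array being sorted and a temporary buffer are globals, is:

#define MAXN 100
int a[MAXN];        /* array being sorted           */
int aux[MAXN];      /* temporary buffer for merging */

void merge(int lo, int m, int hi)
{
    int i = lo, j = m + 1, k = lo;

    while (i <= m && j <= hi)             /* take the smaller front element */
        aux[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= m)  aux[k++] = a[i++];    /* copy any leftovers             */
    while (j <= hi) aux[k++] = a[j++];

    for (k = lo; k <= hi; k++)            /* copy the merged run back       */
        a[k] = aux[k];
}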
MergeSort
MergeSort (Example)
The merge step combines two sorted subarrays, 14 23 45 98 and 6 33 42 67, into one sorted array by repeatedly taking the smaller front element:
6
6 14
6 14 23
6 14 23 33
6 14 23 33 42
6 14 23 33 42 45
6 14 23 33 42 45 67
6 14 23 33 42 45 67 98
MergeSort Analysis
Time: Most of the work is in the merging. Total time: O(n log n).
Space: O(n) extra space; merge sort uses more space than the other sorts.
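A standard way to justify the O(n log n) bound (not spelled out on the slide) is the recurrence for the running time T(n), where cn is the cost of merging n elements and each recursive call handles half the array:

T(n) = 2 T(n/2) + cn,  with T(1) = c
     => T(n) = cn log2 n + cn = O(n log n)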
Quicksort
To sort a[left...right]
1. If left < right
   1.1. Partition a[left...right] such that all of a[left...p-1] are less than a[p], and all of a[p+1...right] are >= a[p]
   1.2. Quicksort a[left...p-1]
   1.3. Quicksort a[p+1...right]
2. Terminate
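In C, the recursion might look like the sketch below, assuming a partition() helper (a version is sketched with the partitioning slides) that rearranges a[left..right] and returns the final index p of the pivot:

void quicksort(int a[], int left, int right)
{
    if (left < right) {
        int p = partition(a, left, right);
        quicksort(a, left, p - 1);      /* sort the elements smaller than the pivot */
        quicksort(a, p + 1, right);     /* sort the elements >= the pivot           */
    }
}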
Partitioning
numbers < p
Partitioning
Choose an array value (say, the first) to use as the pivot. Starting from the left end, find the first element that is greater than or equal to the pivot. Searching backward from the right end, find the first element that is less than the pivot. Interchange (swap) these two elements. Repeat, searching from where we left off, until done.
To partition a[left...right]
1. Set p = a[left], l = left + 1, r = right
2. While l < r, do
   2.1. While l < right and a[l] < p, set l = l + 1
   2.2. While r > left and a[r] >= p, set r = r - 1
   2.3. If l < r, swap a[l] and a[r]
3. Set a[left] = a[r], a[r] = p
4. Terminate
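A direct C transcription of this pseudocode might look like the following sketch; it returns the final index of the pivot, matching the partition() helper assumed in the quicksort sketch above:

int partition(int a[], int left, int right)
{
    int p = a[left];                 /* pivot: the first element */
    int l = left + 1, r = right;
    int tmp;

    while (l < r) {
        while (l < right && a[l] < p)  l++;    /* scan right for an element >= pivot */
        while (r > left  && a[r] >= p) r--;    /* scan left  for an element <  pivot */
        if (l < r) {                           /* swap the out-of-place pair         */
            tmp = a[l]; a[l] = a[r]; a[r] = tmp;
        }
    }
    a[left] = a[r];                  /* move the pivot into its final position */
    a[r] = p;
    return r;
}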
Example of partitioning
choose pivot:     4 3 6 9 2 4 3 1 2 1 8 9 3 5 6   (pivot = 4)
search:           4 3 6 9 2 4 3 1 2 1 8 9 3 5 6
swap:             4 3 3 9 2 4 3 1 2 1 8 9 6 5 6
search:           4 3 3 9 2 4 3 1 2 1 8 9 6 5 6
swap:             4 3 3 1 2 4 3 1 2 9 8 9 6 5 6
search:           4 3 3 1 2 4 3 1 2 9 8 9 6 5 6
swap:             4 3 3 1 2 2 3 1 4 9 8 9 6 5 6
search:           4 3 3 1 2 2 3 1 4 9 8 9 6 5 6   (left > right)
swap with pivot:  1 3 3 1 2 2 3 4 4 9 8 9 6 5 6
Suppose each partition operation divides the array almost exactly in half; then the depth of the recursion is log2 n.
Each partition is linear over its subarray, and all the partitions at one level together cover the whole array.
Best case
We cut the array size in half each time, so the depth of the recursion is log2 n. At each level of the recursion, all the partitions at that level do work that is linear in n. O(log2 n) * O(n) = O(n log2 n). The average case behaves much like this best case, so in the average case quicksort has time complexity O(n log2 n). What about the worst case?
Worst case
In the worst case, partitioning always divides the size n array into these three parts:
A length one part, containing the pivot itself
A length zero part, and
A length n-1 part, containing everything else
We don't recur on the zero-length part. Recurring on the length n-1 part requires (in the worst case) recurring to depth n-1.
In the worst case, recursion may be n levels deep (for an array of size n), but the partitioning work done at each level is still n. O(n) * O(n) = O(n^2), so the worst case for Quicksort is O(n^2). When does this happen?
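Equivalently, the worst-case partitioning work can be counted level by level (a standard count, not shown on the slide):

n + (n-1) + (n-2) + ... + 2 + 1 = n(n+1)/2 = O(n^2)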
If the array is sorted to begin with, Quicksort is terrible: O(n^2). It is possible to construct other bad cases. However, Quicksort is usually O(n log2 n), and the constants are so good that Quicksort is generally the fastest sorting algorithm in practice. Most real-world sorting is done by Quicksort.
Before, we picked the first element of the subarray to use as the pivot. If the array is already sorted, this results in O(n^2) behavior, and it's no better if we pick the last element. We could do an optimal quicksort (guaranteed O(n log n)) if we always picked a pivot value that exactly cuts the array in half. Such a value is called a median: half of the values in the array are larger, half are smaller. The easiest way to find the median is to sort the array and pick the value in the middle (!)
Median of three
Obviously, it doesn't make sense to sort the array just to find the median to use as a pivot. Instead, compare just three elements of our (sub)array: the first, the last, and the middle, and take the median (middle value) of these three as the pivot. It is possible (but not easy) to construct cases which will still make this technique O(n^2). Suppose we rearrange (sort) these three numbers so that the smallest is in the first position, the largest in the last position, and the other in the middle; this lets us simplify and speed up the partition loop, as in the sketch below.
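A possible median-of-three step in C (a sketch; the function name is mine). It sorts a[left], a[mid], and a[right] in place, so a[mid] ends up holding the median, which can then be used as the pivot:

void median_of_three(int a[], int left, int right)
{
    int mid = (left + right) / 2;
    int tmp;
    /* order the three sample elements so that a[left] <= a[mid] <= a[right] */
    if (a[mid]   < a[left]) { tmp = a[mid];   a[mid]   = a[left]; a[left] = tmp; }
    if (a[right] < a[left]) { tmp = a[right]; a[right] = a[left]; a[left] = tmp; }
    if (a[right] < a[mid])  { tmp = a[right]; a[right] = a[mid];  a[mid]  = tmp; }
    /* a[mid] now holds the median of the three */
}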
Heap Sort
Uses a heap as its data structure
In-place sorting algorithm, memory efficient
Time complexity O(n log n)
What is a Heap
A heap (commonly used to implement a priority queue) can be represented by a binary tree with the following properties:
Heap Structure property: the tree is a complete binary tree.
Heap Order property: for every node x in the heap, the value of the parent of x is greater than or equal to the value of x (this is known as a maxHeap).
Example: a maxHeap with root 53 and other nodes 44, 25, 15, 21, 13, 18, 12, ...; its array representation is shown in the next section.
Algorithm
Step 1. Build heap, O(n)
Build a binary tree taking the N items as input, ensuring the heap structure property is held; in other words, build a complete binary tree.
Heapify the binary tree, making sure it satisfies the Heap Order property.
Step 2. Perform n deleteMax operations, O(log n) each
Delete the maximum element in the heap (the root node) and place this element at the end of the sorted array.
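A compact C sketch of both steps (function names are illustrative; this version uses 0-based array indexing, whereas the next slides place the root at index 1):

void siftDown(int a[], int i, int n)
{
    while (2 * i + 1 < n) {                  /* while node i has at least one child */
        int child = 2 * i + 1;               /* left child */
        if (child + 1 < n && a[child + 1] > a[child])
            child++;                         /* pick the larger child               */
        if (a[i] >= a[child])
            break;                           /* heap order already holds            */
        int tmp = a[i]; a[i] = a[child]; a[child] = tmp;
        i = child;
    }
}

void heapsort(int a[], int n)
{
    for (int i = n / 2 - 1; i >= 0; i--)     /* Step 1: build the heap, O(n)        */
        siftDown(a, i, n);
    for (int end = n - 1; end > 0; end--) {  /* Step 2: n deleteMax operations      */
        int tmp = a[0]; a[0] = a[end]; a[end] = tmp;   /* move the max to the end   */
        siftDown(a, 0, end);                           /* restore heap order        */
    }
}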
Simplifying things
For speed and efficiency we can represent the heap with an array: place the root at array index 1, its left child at index 2, its right child at index 3, and so on. In general, the children of the node at index i sit at indices 2i and 2i+1.
Index:  1  2  3  4  5  6  7  8  9 10 11
Value: 53 44 25 15 21 13 18  3 12  5  7
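As a small illustration of this layout (helper names are mine, not from the slides):

/* Index arithmetic for a heap stored with the root at index 1 */
int parent(int i)      { return i / 2; }      /* e.g. parent of 44 (index 2) is 53 (index 1) */
int left_child(int i)  { return 2 * i; }
int right_child(int i) { return 2 * i + 1; }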
Sample Run
(Figure: sample run of heap sort on an 11-element maxHeap with root 45. Each deleteMax removes the current maximum and places it at the end of the array, so the sorted tail grows as 45, then 25 45, then 21 25 45, and so on, until the array is fully sorted: 2 3 5 7 9 12 15 19 21 25 45.)
Conclusion
1st step: build the heap, O(n) time complexity
2nd step: perform n deleteMax operations, each with O(log n) time complexity
Total time complexity = O(n log n)
Pros: fast sorting algorithm, memory efficient, especially for very large values of n
Cons: typically the slowest of the O(n log n) sorting algorithms