Introduction to Design and Analysis of Algorithms

CONTENTS

MODULE - I
Lecture 1 - Introduction to Design and Analysis of Algorithms
Lecture 2 - Growth of Functions (Asymptotic Notations)
Lecture 3 - Recurrences, Solution of Recurrences by Substitution
Lecture 4 - Recursion Tree Method
Lecture 5 - Master Method
Lecture 6 - Design and Analysis of Divide and Conquer Algorithms
Lecture 7 - Worst Case Analysis of Merge Sort, Quick Sort and Binary Search
Lecture 8 - Heaps and Heap Sort
Lecture 9 - Priority Queue
Lecture 10 - Lower Bounds for Sorting

MODULE - II
Lecture 11 - Dynamic Programming Algorithms
Lecture 12 - Matrix Chain Multiplication
Lecture 13 - Elements of Dynamic Programming
Lecture 14 - Longest Common Subsequence
Lecture 15 - Greedy Algorithms
Lecture 16 - Activity Selection Problem
Lecture 17 - Elements of Greedy Strategy
Lecture 18 - Knapsack Problem
Lecture 19 - Fractional Knapsack Problem
Lecture 20 - Huffman Codes

MODULE - III
Lecture 21 - Data Structures for Disjoint Sets
Lecture 22 - Disjoint Set Operations, Linked List Representation
Lecture 23 - Disjoint Forests
Lecture 24 - Graph Algorithms - BFS and DFS
Lecture 25 - Minimum Spanning Trees
Lecture 26 - Kruskal's Algorithm
Lecture 27 - Prim's Algorithm
Lecture 28 - Single Source Shortest Paths
Lecture 29 - Bellman-Ford Algorithm
Lecture 30 - Dijkstra's Algorithm

MODULE - IV
Lecture 31 - Fast Fourier Transform
Lecture 32 - String Matching
Lecture 33 - Rabin-Karp Algorithm
Lecture 34 - NP-Completeness
Lecture 35 - Polynomial Time Verification
Lecture 36 - Reducibility
Lecture 37 - NP-Complete Problems (without proofs)
Lecture 38 - Approximation Algorithms
Lecture 39 - Traveling Salesman Problem
MODULE - I

• Lecture 1 - Introduction to Design and Analysis of Algorithms
• Lecture 2 - Growth of Functions (Asymptotic Notations)
• Lecture 3 - Recurrences, Solution of Recurrences by Substitution
• Lecture 4 - Recursion Tree Method
• Lecture 5 - Master Method
• Lecture 6 - Design and Analysis of Divide and Conquer Algorithms
• Lecture 7 - Worst Case Analysis of Merge Sort, Quick Sort and Binary Search
• Lecture 8 - Heaps and Heap Sort
• Lecture 9 - Priority Queue
• Lecture 10 - Lower Bounds for Sorting
Lecture 1 - Introduction to Design and analysis of algorithms
What is an algorithm?
An algorithm is a set of steps to complete a task.
For example,
Task: to make a cup of tea.
Algorithm:
• add water and milk to the kettle,
• boil it, add tea leaves,
• add sugar, and then serve it in a cup.

What is a computer algorithm?
''A set of steps to accomplish or complete a task that is described precisely enough that a computer can run it.''
Described precisely: it is very difficult for a machine to know how much water or milk is to be added, etc., in the above tea-making algorithm.
These algorithms run on computers or computational devices. For example, the GPS in our smartphones and Google Hangouts use such algorithms. GPS uses a shortest path algorithm. Online shopping uses cryptography, which uses the RSA algorithm.
Characteristics of an algorithm:-
• Must take an input.
• Must give some output (yes/no, value, etc.).
• Definiteness – each instruction is clear and unambiguous.
• Finiteness – the algorithm terminates after a finite number of steps.
• Effectiveness – every instruction must be basic, i.e. a simple instruction.
Expectation from an algorithm
• Correctness:-
Correct: Algorithms must produce a correct result.
Produce an incorrect answer: Even if an algorithm fails to give the correct result all the time, there can still be control on how often it gives a wrong result. E.g. the Rabin-Miller primality test (used in the RSA algorithm) does not give the correct answer all the time: roughly 1 out of 2^50 times it gives an incorrect result.
Approximation algorithm: The exact solution is not found, but a near-optimal solution can be found. (Applied to optimization problems.)
• Less resource usage:
Algorithms should use fewer resources (time and space).

Resource usage:
Here, time is considered to be the primary measure of efficiency. We are also concerned with how much computer memory the respective algorithm uses, but mostly time is the resource that is dealt with. The actual running time depends on a variety of factors: the speed of the computer, the language in which the algorithm is implemented, the compiler/interpreter, the skill of the programmers, etc.
So, mainly the resource usage can be divided into: 1. Memory (space) 2. Time

Time taken by an algorithm?
Performance measurement or a posteriori analysis: implementing the algorithm on a machine and then calculating the time taken by the system to execute the program successfully.
Performance evaluation or a priori analysis: done before implementing the algorithm on a system, as follows:
1. How long the algorithm takes: this will be represented as a function of the size of the input.
f(n) → how long it takes if 'n' is the size of the input.
2. How fast the function that characterizes the running time grows with the input size: the "rate of growth of running time".
The algorithm with the lower rate of growth of running time is considered better.
How is an algorithm a technology?
Algorithms are just like a technology. We all use the latest and greatest processors, but we need to run implementations of good algorithms on that computer in order to properly take advantage of the money we spent to have the latest processor. Let's make this example more concrete by pitting a faster computer (computer A) running a sorting algorithm whose running time on n values grows like n² against a slower computer (computer B) running a sorting algorithm whose running time grows like n lg n. They each must sort an array of 10 million numbers. Suppose that computer A executes 10 billion instructions per second (faster than any single sequential computer at the time of this writing) and computer B executes only 10 million instructions per second, so that computer A is 1000 times faster than computer B in raw computing power. To make the difference even more dramatic, suppose that the world's craftiest programmer codes in machine language for computer A, and the resulting code requires 2n² instructions to sort n numbers. Suppose further that just an average programmer writes for computer B, using a high-level language with an inefficient compiler, with the resulting code taking 50·n·lg n instructions.

Computer A (faster): running time grows like n²; 10 billion instructions per second; code takes 2n² instructions.
Computer B (slower): running time grows like n lg n; 10 million instructions per second; code takes 50·n·lg n instructions.

Time taken by A = 2·(10^7)² instructions / 10^10 instructions per second = 20,000 seconds, which is more than 5.5 hours.
Time taken by B = 50·10^7·lg(10^7) instructions / 10^7 instructions per second ≈ 1163 seconds, which is under 20 minutes.

So choosing a good algorithm (an algorithm with a slower rate of growth), as used by computer B, matters a lot.
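A quick back-of-the-envelope check of the arithmetic above, written as a small Python sketch (the instruction counts 2n² and 50·n·lg n and the machine speeds are taken directly from the example):

import math

n = 10**7                                  # 10 million numbers to sort

# Computer A: 10^10 instructions/sec, code needs 2*n^2 instructions
time_a = 2 * n**2 / 10**10                 # seconds

# Computer B: 10^7 instructions/sec, code needs 50*n*lg(n) instructions
time_b = 50 * n * math.log2(n) / 10**7     # seconds

print(f"Computer A: {time_a:.0f} s (about {time_a / 3600:.1f} hours)")
print(f"Computer B: {time_b:.0f} s (about {time_b / 60:.1f} minutes)")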
Lecture 2 - Growth of Functions (Asymptotic notations)

Before going to growth of functions and asymptotic notation, let us see how to analyse an algorithm.

How to analyse an algorithm
Let us form an algorithm for insertion sort (which sorts a sequence of numbers). The pseudo code for the algorithm is given below.
Pseudo code:
for j = 2 to A.length ------------------------------------------------ C1
    key = A[j] --------------------------------------------------------C2
    // Insert A[j] into the sorted array A[1....j-1] ------------------C3
    i = j - 1 ---------------------------------------------------------C4
    while i > 0 and A[i] > key ----------------------------------------C5
        A[i+1] = A[i] -------------------------------------------------C6
        i = i - 1 -----------------------------------------------------C7
    A[i+1] = key ------------------------------------------------------C8
Let Ci be the cost of the ith line. Since comment lines do not incur any cost, C3 = 0. Let tj be the number of times the while-loop test in line C5 is executed for a given j.

Cost    No. of times executed
C1      n
C2      n-1
C3=0    n-1
C4      n-1
C5      ∑ (j=2 to n) tj
C6      ∑ (j=2 to n) (tj - 1)
C7      ∑ (j=2 to n) (tj - 1)
C8      n-1
Running time of the algorithm is:
T(n) = C1·n + C2(n-1) + 0·(n-1) + C4(n-1) + C5·∑(j=2 to n) tj + C6·∑(j=2 to n)(tj - 1) + C7·∑(j=2 to n)(tj - 1) + C8(n-1)

Best case:
It occurs when the array is already sorted. All tj values are 1.
T(n) = C1·n + C2(n-1) + 0·(n-1) + C4(n-1) + C5(n-1) + C8(n-1)
     = (C1 + C2 + C4 + C5 + C8)·n - (C2 + C4 + C5 + C8)
• Which is of the form an + b.
• A linear function of n, so linear growth.

Worst case:
It occurs when the array is reverse sorted, and tj = j.
T(n) = C1·n + C2(n-1) + 0·(n-1) + C4(n-1) + C5·∑(j=2 to n) j + C6·∑(j=2 to n)(j - 1) + C7·∑(j=2 to n)(j - 1) + C8(n-1)
Since ∑(j=2 to n) j = n(n+1)/2 - 1, this is of the form an² + bn + c.
A quadratic function. So in the worst case insertion sort grows like n².
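A runnable Python version of the insertion sort analysed above, instrumented with a comparison counter so the best-case (sorted input) and worst-case (reverse-sorted input) behaviour can be seen; the counter is an illustrative addition, not part of the original pseudocode.

def insertion_sort(A):
    """Sort list A in place; return the number of key comparisons (the total of the tj)."""
    comparisons = 0
    for j in range(1, len(A)):              # corresponds to j = 2..n in the pseudocode
        key = A[j]
        i = j - 1
        while i >= 0:
            comparisons += 1                 # one execution of the "A[i] > key" test
            if A[i] <= key:
                break
            A[i + 1] = A[i]                  # shift the larger element to the right
            i -= 1
        A[i + 1] = key
    return comparisons

print(insertion_sort(list(range(10))))          # sorted input  -> n-1 = 9 comparisons (linear)
print(insertion_sort(list(range(10, 0, -1))))   # reverse input -> n(n-1)/2 = 45 comparisons (quadratic)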
Why do we concentrate on the worst-case running time?
• The worst-case running time gives a guaranteed upper bound on the running time for any input.
• For some algorithms, the worst case occurs often. For example, when searching, the worst case often occurs when the item being searched for is not present, and searches for absent items may be frequent.
• Why not analyze the average case? Because it's often about as bad as the worst case.

Order of growth:
It is described by the highest-degree term of the formula for the running time. (Drop lower-order terms. Ignore the constant coefficient in the leading term.)
Example: We found that for insertion sort the worst-case running time is of the form an² + bn + c.
Dropping the lower-order terms leaves an²; ignoring the constant coefficient results in n². But we cannot say that the worst-case running time T(n) equals n²; rather, it grows like n², but it does not equal n². We say that the running time is Θ(n²) to capture the notion that the order of growth is n².
We usually consider one algorithm to be more efficient than another if its worst-case running time has a smaller order of growth.

Asymptotic notation
• It is a way to describe the characteristics of a function in the limit.
• It describes the rate of growth of functions.
• Focus on what's important by abstracting away low-order terms and constant factors.
• It is a way to compare "sizes" of functions:
    O ≈ ≤
    Ω ≈ ≥
    Θ ≈ =
    o ≈ <
    ω ≈ >
Example: n²/2 - 2n = Θ(n²), with c1 = 1/4, c2 = 1/2, and n0 = 8.
Lecture 3-5: Recurrences, Solution of Recurrences by Substitution, Recursion Tree and Master Method

Recursion is a particularly powerful kind of reduction, which can be described loosely as follows:
• If the given instance of the problem is small or simple enough, just solve it.
• Otherwise, reduce the problem to one or more simpler instances of the same problem.
Recursion is generally expressed in terms of recurrences. In other words, when an algorithm calls itself, we can often describe its running time by a recurrence equation which describes the overall running time of a problem of size n in terms of the running time on smaller inputs.
E.g. the worst-case running time T(n) of the merge sort procedure can be expressed by the recurrence
T(n) = Θ(1)             if n = 1
       2T(n/2) + Θ(n)   if n > 1
whose solution can be found as T(n) = Θ(n log n).
There are various techniques to solve recurrences.
1. SUBSTITUTION METHOD:
The substitution method comprises 3 steps:
i. Guess the form of the solution.
ii. Verify by induction.
iii. Solve for constants.
We substitute the guessed solution for the function when applying the inductive hypothesis to smaller values; hence the name "substitution method". This method is powerful, but we must be able to guess the form of the answer in order to apply it.
E.g. recurrence equation: T(n) = 4T(n/2) + n

Step 1: guess the form of the solution.
Ignoring the added n, T(n) ≈ 4T(n/2); a function f satisfying f(2n) = 4f(n) is f(n) = n², so T(n) should be of the order of n². To be safe, first guess the looser bound T(n) = O(n³).

Step 2: verify by induction.
Assume T(k) ≤ ck³ for k < n.
T(n) = 4T(n/2) + n
     ≤ 4c(n/2)³ + n
     = cn³/2 + n
     = cn³ - (cn³/2 - n)
     ≤ cn³, as (cn³/2 - n) is non-negative for the constants chosen in step 3.
So what we assumed was true: T(n) = O(n³).

Step 3: solve for constants.
cn³/2 - n ≥ 0 holds whenever n ≥ 1 and c ≥ 2.

Now suppose we guess that T(n) = O(n²), which is the tight upper bound.
Assume T(k) ≤ ck²; we should then prove that T(n) ≤ cn².
T(n) = 4T(n/2) + n ≤ 4c(n/2)² + n = cn² + n
which is not ≤ cn², so the induction step fails with this hypothesis. But if we instead take the assumption T(k) ≤ c1·k² - c2·k, then we can show that T(n) = O(n²).
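A small numeric sanity check (an illustrative sketch, not part of the original notes): evaluating T(n) = 4T(n/2) + n directly for powers of two and dividing by n² shows the ratio approaching a constant, which supports the Θ(n²) guess.

def T(n):
    """Evaluate T(n) = 4*T(n/2) + n with T(1) = 1, for n a power of two."""
    if n == 1:
        return 1
    return 4 * T(n // 2) + n

for n in [2**k for k in range(1, 11)]:
    print(n, T(n) / n**2)     # the ratio is 2 - 1/n, which tends to the constant 2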
2. BY ITERATIVE METHOD:
e.g. T(n) = 2T(n/2) + n
          = 2[2T(n/4) + n/2] + n
          = 2²T(n/4) + 2n
          = 2²[2T(n/8) + n/4] + 2n
          = 2³T(n/2³) + 3n
After k iterations, T(n) = 2^k·T(n/2^k) + kn --------------(1)
The sub-problem size is 1 when n/2^k = 1, i.e. k = log n.
So after log n iterations, the sub-problem size will be 1.
Putting k = log n in equation (1):
T(n) = n·T(1) + n·log n = cn + n·log n   (taking c = T(1))
     = O(n log n)
3. BY RECURSION TREE METHOD:
In a recursion tree, each node represents the cost of a single sub-problem somewhere in the set of recursive invocations. We sum the costs within each level of the tree to obtain a set of per-level costs, and then we sum all the per-level costs to determine the total cost of all levels of the recursion.

Constructing a recursion tree for the recurrence T(n) = 3T(n/4) + cn²:
(Figure: parts (a)-(d) progressively expand T(n) to form the recursion tree; the fully expanded tree in part (d) has height log₄n, i.e. it has log₄n + 1 levels.)

Sub-problem size at depth i = n/4^i.
The sub-problem size is 1 when n/4^i = 1, i.e. i = log₄n.
So the number of levels = 1 + log₄n.
Cost of each level = (number of nodes) × (cost of each node).
Number of nodes at depth i = 3^i.
Cost of each node at depth i = c·(n/4^i)².
Cost of each level at depth i = 3^i·c·(n/4^i)² = (3/16)^i·cn².
Cost of the nodes in the last level = 3^(log₄n)·T(1) = c·n^(log₄3)   (at the last level, i = log₄n).

T(n) = ∑ (i=0 to log₄n - 1) (3/16)^i·cn² + c·n^(log₄3)
     ≤ ∑ (i=0 to ∞) (3/16)^i·cn² + c·n^(log₄3)
     = (16/13)·cn² + c·n^(log₄3)
⇒ T(n) = O(n²).
4. BY MASTER METHOD:
The master method solves recurrences of the form
T(n) = aT(n/b) + f(n)
where a ≥ 1 and b > 1 are constants and f(n) is an asymptotically positive function.
To use the master method, we have to remember 3 cases:
1. If f(n) = O(n^(log_b a - ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a)·log n). More generally, if f(n) = Θ(n^(log_b a)·log^k n) for some constant k ≥ 0, then T(n) = Θ(n^(log_b a)·log^(k+1) n).
3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).

e.g. T(n) = 2T(n/2) + n·log n
Ans: a = 2, b = 2, f(n) = n·log n.
Using the extended form of the 2nd case:
f(n) = Θ(n^(log₂2)·log^k n) = Θ(n·log^k n) = n·log n ⇒ k = 1
T(n) = Θ(n^(log₂2)·log²n) = Θ(n·log²n)
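A small helper (an illustrative sketch, not from the notes) that applies the three master theorem cases for the common special case f(n) = Θ(n^d), i.e. a purely polynomial driving function; the extended log-factor form of case 2 used in the example above is not covered by this simplified version.

import math

def master_theorem(a, b, d):
    """Solve T(n) = a*T(n/b) + Theta(n^d) for constants a >= 1, b > 1, d >= 0."""
    crit = math.log(a, b)                    # critical exponent log_b(a)
    if d < crit:                             # case 1: f(n) grows polynomially slower
        return f"Theta(n^{crit:.3g})"
    if abs(d - crit) < 1e-12:                # case 2: f(n) matches n^(log_b a)
        return f"Theta(n^{crit:.3g} * log n)"
    return f"Theta(n^{d})"                   # case 3: f(n) dominates (regularity holds for n^d)

print(master_theorem(3, 4, 2))   # T(n) = 3T(n/4) + n^2 (recursion tree example) -> Theta(n^2)
print(master_theorem(2, 2, 1))   # T(n) = 2T(n/2) + n (merge sort)               -> Theta(n^1 * log n)
print(master_theorem(4, 2, 1))   # T(n) = 4T(n/2) + n (substitution example)     -> Theta(n^2)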
Lecture 6 - Design and Analysis of Divide and Conquer Algorithms

DIVIDE AND CONQUER ALGORITHM
• In this approach, we solve a problem recursively by applying 3 steps:
1. DIVIDE - break the problem into several sub-problems of smaller size.
2. CONQUER - solve the sub-problems recursively.
3. COMBINE - combine these solutions to create a solution to the original problem.

CONTROL ABSTRACTION FOR DIVIDE AND CONQUER ALGORITHM
Algorithm DandC(P)
{
    if Small(P)
        then return S(P)
    else
    {
        divide P into smaller instances P1, P2, ..., Pk
        apply DandC to each sub-problem
        return Combine(DandC(P1), DandC(P2), ..., DandC(Pk))
    }
}
Let the recurrence relation be expressed as
T(n) = Θ(1)                        if n ≤ c
       aT(n/b) + D(n) + C(n)       otherwise
where n = input size, a = number of sub-problems, n/b = input size of each sub-problem, D(n) = time to divide the problem, and C(n) = time to combine the sub-problem solutions.
Lecture 7: Worst case analysis of merge sort, quick sort

Merge sort
It is one of the well-known divide-and-conquer algorithms. It is a simple and very efficient algorithm for sorting a list of numbers.
We are given a sequence of n numbers which we will assume is stored in an array A[1...n]. The objective is to output a permutation of this sequence, sorted in increasing order. This is normally done by permuting the elements within the array A.
How can we apply divide-and-conquer to sorting? Here are the major elements of the Merge Sort algorithm.
Divide: Split A down the middle into two sub-sequences, each of size roughly n/2.
Conquer: Sort each subsequence (by calling MergeSort recursively on each).
Combine: Merge the two sorted sub-sequences into a single sorted list.
The dividing process ends when we have split the sub-sequences down to a single item. A sequence of length one is trivially sorted. The key operation, where all the work is done, is the combine stage, which merges together two sorted lists into a single sorted list. It turns out that the merging process is quite easy to implement.
The following figure gives a high-level view of the algorithm. The "divide" phase is shown on the left. It works top-down, splitting up the list into smaller sublists. The "conquer and combine" phases are shown on the right. They work bottom-up, merging sorted lists together into larger sorted lists.
(Figure: Merge Sort)
Designing the Merge Sort algorithm top-down: we'll assume that the procedure that merges two sorted lists is available to us; we'll implement it later. Because the algorithm is called recursively on sublists, in addition to passing in the array itself we will pass in two indices, which indicate the first and last indices of the subarray that we are to sort. The call MergeSort(A, p, r) will sort the subarray A[p..r] and return the sorted result in the same subarray.
Here is the overview. If r = p, then this means that there is only one element to sort, and we may return immediately. Otherwise (if p < r) there are at least two elements, and we will invoke the divide-and-conquer. We find the index q, midway between p and r, namely q = (p + r)/2 (rounded down to the nearest integer). Then we split the array into subarrays A[p..q] and A[q+1..r]. We call MergeSort recursively to sort each subarray. Finally, we invoke a procedure (which we have yet to write) which merges these two subarrays into a single sorted array.
MergeSort(array A, int p, int r) {
    if (p < r) {                   // we have at least 2 items
        q = (p + r)/2
        MergeSort(A, p, q)         // sort A[p..q]
        MergeSort(A, q+1, r)       // sort A[q+1..r]
        Merge(A, p, q, r)          // merge everything together
    }
}
Merging: All that is left is to describe the procedure that merges two sorted lists. Merge(A, p, q, r) assumes that the left subarray, A[p..q], and the right subarray, A[q+1..r], have already been sorted. We merge these two subarrays by copying the elements to a temporary working array called B. For convenience, we will assume that the array B has the same index range as A, that is, B[p..r]. We have two indices i and j that point to the current elements of each subarray. We move the smaller element into the next position of B (indicated by index k) and then increment the corresponding index (either i or j). When we run out of elements in one array, we just copy the rest of the other array into B. Finally, we copy the entire contents of B back into A.
Merge(array A, int p, int q, int r) {        // merges A[p..q] with A[q+1..r]
    array B[p..r]
    i = k = p                                // initialize pointers
    j = q+1
    while (i <= q and j <= r) {              // while both subarrays are nonempty
        if (A[i] <= A[j]) B[k++] = A[i++]    // copy from left subarray
        else B[k++] = A[j++]                 // copy from right subarray
    }
    while (i <= q) B[k++] = A[i++]           // copy any leftover to B
    while (j <= r) B[k++] = A[j++]
    for i = p to r do A[i] = B[i]            // copy B back to A
}
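A runnable Python version of the MergeSort and Merge procedures above (a sketch using 0-based Python list indices rather than the 1-based indices of the pseudocode):

def merge_sort(A, p=0, r=None):
    """Sort A[p..r] in place by divide and conquer."""
    if r is None:
        r = len(A) - 1
    if p < r:
        q = (p + r) // 2
        merge_sort(A, p, q)          # sort the left half A[p..q]
        merge_sort(A, q + 1, r)      # sort the right half A[q+1..r]
        merge(A, p, q, r)            # merge the two sorted halves

def merge(A, p, q, r):
    """Merge the sorted subarrays A[p..q] and A[q+1..r]."""
    B = []
    i, j = p, q + 1
    while i <= q and j <= r:         # while both subarrays are nonempty
        if A[i] <= A[j]:
            B.append(A[i]); i += 1   # copy from the left subarray
        else:
            B.append(A[j]); j += 1   # copy from the right subarray
    B.extend(A[i:q + 1])             # leftover of the left subarray
    B.extend(A[j:r + 1])             # leftover of the right subarray
    A[p:r + 1] = B                   # copy B back into A

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data)
print(data)                          # [1, 2, 2, 3, 4, 5, 6, 7]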
Analysis: What remains is to analyze the running time of MergeSort. First let us consider the running time of the procedure Merge(A, p, q, r). Let n = r − p + 1 denote the total length of both the left and right subarrays. What is the running time of Merge as a function of n? The algorithm contains four loops (none nested in another). It is easy to see that each loop can be executed at most n times. (If you are a bit more careful you can actually see that all the while-loops together can only be executed n times in total, because each execution copies one new element to the array B, and B only has space for n elements.) Thus the running time to Merge n items is Θ(n). Let us write this without the asymptotic notation, simply as n. (We'll see later why we do this.)
Now, how do we describe the running time of the entire MergeSort algorithm? We will do this through the use of a recurrence, that is, a function that is defined recursively in terms of itself. To avoid circularity, the recurrence for a given value of n is defined in terms of values that are strictly smaller than n. Finally, a recurrence has some basis values (e.g. for n = 1), which are defined explicitly.
Let's see how to apply this to MergeSort. Let T(n) denote the worst-case running time of MergeSort on an array of length n. For concreteness we could count whatever we like: number of lines of pseudocode, number of comparisons, number of array accesses, since these will only differ by a constant factor. Since all of the real work is done in the Merge procedure, we will count the total time spent in the Merge procedure.
First observe that if we call MergeSort with a list containing a single element, then the running time is a constant. Since we are ignoring constant factors, we can just write T(n) = 1. When we call MergeSort with a list of length n > 1, e.g. MergeSort(A, p, r), where r − p + 1 = n, the algorithm first computes q = (p + r)/2. The subarray A[p..q] contains q − p + 1 elements; you can verify that it is of size n/2. Thus the remaining subarray A[q+1..r] has n/2 elements in it. How long does it take to sort the left subarray? We do not know this, but because n/2 < n for n > 1, we can express this as T(n/2). Similarly, we can express the time that it takes to sort the right subarray as T(n/2).
Finally, to merge both sorted lists takes n time, by the comments made above. In conclusion we have
T(n) = 1              if n = 1,
       2T(n/2) + n    otherwise.
Solving the above recurrence we can see that merge sort has a time complexity of Θ(n log n).
QUICKSORT
Worst-case running time: O(n²).
Expected running time: O(n lg n).
Sorts in place.

Description of quicksort
Quicksort is based on the three-step process of divide-and-conquer.
• To sort the subarray A[p..r]:
Divide: Partition A[p..r] into two (possibly empty) subarrays A[p..q − 1] and A[q + 1..r], such that each element in the first subarray A[p..q − 1] is ≤ A[q] and A[q] is ≤ each element in the second subarray A[q + 1..r].
Conquer: Sort the two subarrays by recursive calls to QUICKSORT.
Combine: No work is needed to combine the subarrays, because they are sorted in place.
• Perform the divide step by a procedure PARTITION, which returns the index q that marks the position separating the subarrays.

QUICKSORT(A, p, r)
    if p < r
        then q ← PARTITION(A, p, r)
             QUICKSORT(A, p, q − 1)
             QUICKSORT(A, q + 1, r)
Initial call is QUICKSORT(A, 1, n).
Partitioning
Partition subarray A[p..r] by the following procedure:
PARTITION(A, p, r)
    x ← A[r]
    i ← p − 1
    for j ← p to r − 1
        do if A[j] ≤ x
            then i ← i + 1
                 exchange A[i] ↔ A[j]
    exchange A[i + 1] ↔ A[r]
    return i + 1
• PARTITION always selects the last element A[r] in the subarray A[p..r] as the pivot, the element around which to partition.
• As the procedure executes, the array is partitioned into four regions, some of which may be empty.
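A runnable Python sketch of QUICKSORT with the PARTITION procedure given above (0-based indices instead of the 1-based pseudocode):

def partition(A, p, r):
    """Partition A[p..r] around the pivot A[r]; return the pivot's final index."""
    x = A[r]                          # pivot: last element of the subarray
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort(A, p=0, r=None):
    """Sort A[p..r] in place."""
    if r is None:
        r = len(A) - 1
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)        # elements <= pivot
        quicksort(A, q + 1, r)        # elements > pivot

data = [2, 8, 7, 1, 3, 5, 6, 4]
quicksort(data)
print(data)                           # [1, 2, 3, 4, 5, 6, 7, 8]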
Performance of Quicksort
The running time of Quicksort depends on the partitioning of the subarrays:
• If the subarrays are balanced, then Quicksort can run as fast as merge sort.
• If they are unbalanced, then Quicksort can run as slowly as insertion sort.

Worst case
• Occurs when the subarrays are completely unbalanced.
• There are 0 elements in one subarray and n − 1 elements in the other subarray.
• We get the recurrence
T(n) = T(n − 1) + T(0) + Θ(n)
     = T(n − 1) + Θ(n)
     = O(n²).
• Same running time as insertion sort.
• In fact, the worst-case running time occurs when Quicksort takes a sorted array as input, but insertion sort runs in O(n) time in this case.

Best case
• Occurs when the subarrays are completely balanced every time.
• Each subarray has ≤ n/2 elements.
• We get the recurrence
T(n) = 2T(n/2) + Θ(n) = O(n lg n).

Balanced partitioning
• Quicksort's average running time is much closer to the best case than to the worst case.
• Imagine that PARTITION always produces a 9-to-1 split.
• We get the recurrence
T(n) ≤ T(9n/10) + T(n/10) + Θ(n) = O(n lg n).
• Intuition: look at the recursion tree.
• It's like the one for T(n) = T(n/3) + T(2n/3) + O(n).
• Except that here the constants are different; we get log₁₀ n full levels and log_(10/9) n levels that are nonempty.
• As long as it's a constant, the base of the log doesn't matter in asymptotic notation.
• Any split of constant proportionality will yield a recursion tree of depth O(lg n).
Lecture 8 - Heaps and Heap sort
HEAPSORT
• In-place algorithm
• Running time: O(n log n)
• Based on a complete binary tree
The (binary) heap data structure is an array object that we can view as a nearly complete binary tree. Each node of the tree corresponds to an element of the array. The tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point.
The root of the tree is A[1], and given the index i of a node, we can easily compute the indices of its parent, left child, and right child:
PARENT(i) => return ⌊i/2⌋
LEFT(i)   => return 2i
RIGHT(i)  => return 2i + 1
On most computers, the LEFT procedure can compute 2i in one instruction by simply shifting the binary representation of i left by one bit position. Similarly, the RIGHT procedure can quickly compute 2i + 1 by shifting the binary representation of i left by one bit position and then adding in a 1 as the low-order bit. The PARENT procedure can compute ⌊i/2⌋ by shifting i right by one bit position. Good implementations of heapsort often implement these procedures as "macros" or "inline" procedures.
There are two kinds of binary heaps: max-heaps and min-heaps.
• In a max-heap, the max-heap property is that for every node i other than the root, A[PARENT(i)] ≥ A[i], that is, the value of a node is at most the value of its parent. Thus, the largest element in a max-heap is stored at the root, and the subtree rooted at a node contains values no larger than that contained at the node itself.
• A min-heap is organized in the opposite way; the min-heap property is that for every node i other than the root, A[PARENT(i)] ≤ A[i]. The smallest element in a min-heap is at the root.
The height of a node in a heap is the number of edges on the longest simple downward path from the node to a leaf, and the height of the heap is the height of its root. The height of a heap of n elements, which is based on a complete binary tree, is O(log n).
Maintaining the heap property
MAX-HEAPIFY lets the value at A[i] "float down" in the max-heap so that the subtree rooted at index i obeys the max-heap property.
MAX-HEAPIFY(A, i)
1. l = LEFT(i)
2. r = RIGHT(i)
3. if l ≤ heap-size and A[l] > A[i]
4.     then largest = l
5.     else largest = i
6. if r ≤ heap-size and A[r] > A[largest]
7.     then largest = r
8. if largest ≠ i
9.     then exchange A[i] ↔ A[largest]
10.         MAX-HEAPIFY(A, largest)
At each step, the
largest of the elements A[i], A[LEFT(i)], and A[RIGHT(i)] is
determined, and its index is stored in largest. If A[i] is largest,
then the subtree rooted at node i is already a max-heap and the procedure
terminates. Otherwise, one of the two children has the largest element, and A[i
] is swapped with A[largest], which causes node i and its children to satisfy
the max-heap property. The node indexed by largest, however, now has the original value A[i], and
thus the subtree rooted at
largest might violate the max-heap property.
Consequently,
we call MAX-HEAPIFY
recursively on that subtree.
Figure:
The action of MAX-HEAPIFY
(A, 2), where heap-size =
10. (a) The initial configuration,
with A [2] at node i = 2 violating the max-heap property since it is not larger
than both children. The max-heap property is restored for node 2 in (b) by exchanging A [2] with A[4],
which destroys the max-heap property for node 4. The recursive call
MAX-HEAPIFY
(A,4) now has i = 4. After swapping A[4] with A[9], as shown in (c), node 4 is fixed up, and the
recursive call MAX-HEAPIFY(A,
9) yields no further change to the data structure.
The running time of MAX-HEAPIFY can be described by the recurrence
T(n) ≤ T(2n/3) + O(1)
The solution to this recurrence is T(n) = O(log n).
Building a heap
BUILD-MAX-HEAP(A)
1. for i = ⌊n/2⌋ downto 1
2.     do MAX-HEAPIFY(A, i)
Example input array: A = ⟨4, 1, 3, 2, 16, 9, 10, 14, 8, 7⟩.
We can derive a tighter bound by observing that the time for MAX-HEAPIFY to run at a node varies with the height of the node in the tree, and the heights of most nodes are small. Our tighter analysis relies on the properties that an n-element heap has height ⌊log n⌋ and at most ⌈n/2^(h+1)⌉ nodes of any height h.
The total cost of BUILD-MAX-HEAP is thus bounded by T(n) = O(n).
The HEAPSORT Algorithm
HEAPSORT(A)
1. BUILD-MAX-HEAP(A)
2. for i = n downto 2
3.     exchange A[1] with A[i]
4.     heap-size = heap-size − 1
5.     MAX-HEAPIFY(A, 1)

Sorted output for the example array: A = ⟨1, 2, 3, 4, 7, 8, 9, 10, 14, 16⟩.

The HEAPSORT procedure takes time O(n log n), since the call to BUILD-MAX-HEAP takes time O(n) and each of the n − 1 calls to MAX-HEAPIFY takes time O(log n).
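A compact runnable Python sketch of the heap routines above (0-based array indices, so LEFT(i) = 2i + 1 and RIGHT(i) = 2i + 2 instead of the 1-based formulas):

def max_heapify(A, i, heap_size):
    """Float A[i] down so the subtree rooted at i satisfies the max-heap property."""
    l, r = 2 * i + 1, 2 * i + 2
    largest = i
    if l < heap_size and A[l] > A[largest]:
        largest = l
    if r < heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)

def build_max_heap(A):
    for i in range(len(A) // 2 - 1, -1, -1):   # internal nodes, bottom-up
        max_heapify(A, i, len(A))

def heapsort(A):
    build_max_heap(A)
    for end in range(len(A) - 1, 0, -1):
        A[0], A[end] = A[end], A[0]            # move the current maximum to its final place
        max_heapify(A, 0, end)                 # restore the heap on the reduced prefix

A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(A)
print(A)                                       # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]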
Lecture 10: Lower Bounds For Sorting
Review of Sorting: So far we have seen a number of algorithms for sorting a list of numbers in ascending order. Recall that an in-place sorting algorithm is one that uses no additional array storage (however, we allow Quicksort to be called in-place even though it needs a stack of size O(log n) for keeping track of the recursion). A sorting algorithm is stable if duplicate elements remain in the same relative position after sorting.
Slow algorithms: Include BubbleSort, InsertionSort, and SelectionSort. These are all simple Θ(n²) in-place sorting algorithms. BubbleSort and InsertionSort can be implemented as stable algorithms, but SelectionSort cannot (without significant modifications).
Mergesort: Mergesort is a stable Θ(n log n) sorting algorithm. The downside is that MergeSort is the only algorithm of the three that requires additional array storage, implying that it is not an in-place algorithm.
Quicksort: Widely regarded as the fastest of the fast algorithms. This algorithm is O(n log n) in the expected case, and O(n²) in the worst case. The probability that the algorithm takes asymptotically longer (assuming that the pivot is chosen randomly) is extremely small for large n. It is an (almost) in-place sorting algorithm but is not stable.
Heapsort: Heapsort is based on a nice data structure, called a heap, which is a fast priority queue. Elements can be inserted into a heap in O(log n) time, and the largest item can be extracted in O(log n) time. (It is also easy to set up a heap for extracting the smallest item.) If you only want to extract the k largest values, a heap allows you to do this in O(n + k log n) time. It is an in-place algorithm, but it is not stable.

Lower Bounds for Comparison-Based Sorting: Can we sort faster than O(n log n) time? We will give an argument that if the sorting algorithm is based solely on making comparisons between the keys in the array, then it is impossible to sort more efficiently than Ω(n log n) time. Such an algorithm is called a comparison-based sorting algorithm, and this includes all of the algorithms given above. Virtually all known general-purpose sorting algorithms are based on making comparisons, so this is not a very restrictive assumption. This does not preclude the possibility of a sorting algorithm whose actions are determined by other types of operations, for example, consulting the individual bits of numbers, performing arithmetic operations, or indexing into an array based on arithmetic operations on keys. We will show that any comparison-based sorting algorithm for an input sequence ⟨a1, a2, ..., an⟩ must make at least Ω(n log n) comparisons in the worst case. This is still a difficult task if you think about it. It is easy to show that a problem can be solved fast (just give an algorithm). But to show that a problem cannot be solved fast you need to reason in some way about all the possible algorithms that might ever be written. In fact, it seems surprising that you could even hope to prove such a thing. The catch here is that we are limited to using comparison-based algorithms, and there is a clean mathematical way of characterizing all such algorithms.
Decision Tree Argument: In order to prove lower bounds, we need an abstract way of modeling "any possible" comparison-based sorting algorithm; we model such algorithms in terms of an abstract model called a decision tree. In a comparison-based sorting algorithm only comparisons between the keys are used to determine the action of the algorithm. Let ⟨a1, a2, ..., an⟩ be the input sequence. Given two elements ai and aj, their relative order can only be determined by the results of comparisons like ai < aj, ai ≤ aj, ai = aj, ai ≥ aj, and ai > aj. A decision tree is a mathematical representation of a sorting algorithm (for a fixed value of n). Each node of the decision tree represents a comparison made in the algorithm (e.g., a4 : a7), and the two branches represent the possible results; for example, the left subtree consists of the remaining comparisons made under the assumption that a4 ≤ a7 and the right subtree for a4 > a7. (Alternatively, one might be labeled with a4 < a7 and the other with a4 ≥ a7.) Observe that once we know the value of n, then the "action" of the sorting algorithm is completely determined by the results of its comparisons. This action may involve moving elements around in the array, copying them to other locations in memory, or performing various arithmetic operations on non-key data. But the bottom line is that at the end of the algorithm, the keys will be permuted in the final array in some way. Each leaf in the decision tree is labeled with the final permutation that the algorithm generates after making all of its comparisons.
To make this more concrete, let us assume that n = 3, and let's build a decision tree for SelectionSort. Recall that the algorithm consists of two phases. The first finds the smallest element of the entire list, and swaps it with the first element. The second finds the smaller of the remaining two items, and swaps it with the second element. Here is the decision tree (in outline form). The first comparison is between a1 and a2. The possible results are:
comparison is betweena1 and a2. The possible results are:
a1 <= a2: Then a1 is the current
minimum. Next we compare a1 with a3 whose results might
be either:
a1 <=a3: Then we know that a1 is
the minimum overall, and the elements remain in their
originalpositions. Then we pass to phase 2 and compare a2 with a3. The possible results are:
a2 <=a3: Final output is ha1; a2; a3i.
a2 > a3: These two are
swapped and the final output is ha1; a3;
a2i. a1 > a3: Then we know that a3 is the minimum is the overall
minimum, and it is swapped
witha1. The we
pass to phase 2 and compare a2 with a1 (which is now in the third position
ofthe array) yielding either:
a2 <=a1: Final output is ha3; a2;
a1i.
a2 > a1: These two are swapped and the final output is ha3;
a1; a2i.
a1 > a2: Then a2 is the current
minimum. Next we compare a2 with a3 whose results might be
either:
a2 <=a3: Then we know that a2 is
the minimum overall. We swap a2 with a1, and then pass to
phase 2, and compare the remaining items a1 and a3. The possible results are:
a1 <=a3: Final output is ha2; a1;
a3i.
a1 > a3: These two are swapped and the final output is ha2;
a3; a1i.
a2 > a3: Then we know that a3 is
the minimum is the overall minimum, and it is swapped
witha1. We pass to
phase 2 and compare a2 with a1 (which is now in the third position
of thearray) yielding either:
a2<= a1:
Final output is ha3; a2;
a1i.
a2 > a1: These two are swapped and the final output is ha3;
a1; a2i.
The final decision tree is shown below. Note that there are some nodes that are unreachable. For example, in order to reach the fourth leaf from the left it must be that a1 ≤ a2 and a1 > a2, which cannot both be true. Can you explain this? (The answer is that virtually all sorting algorithms, especially inefficient ones like selection sort, may make comparisons that are redundant, in the sense that their outcome has already been determined by earlier comparisons.) As you can see, converting a complex sorting algorithm like HeapSort into a decision tree for a large value of n will be very tedious and complex, but I hope you are convinced by this exercise that it can be done in a simple mechanical way.
(Decision tree for SelectionSort on 3 keys.)
Using Decision Trees for Analyzing Sorting: Consider any sorting algorithm. Let T(n) be the maximum number of comparisons that this algorithm makes on any input of size n. Notice that the running time of the algorithm must be at least as large as T(n), since we are not counting data movement or other computations at all. The algorithm defines a decision tree. Observe that the height of the decision tree is exactly equal to T(n), because any path from the root to a leaf corresponds to a sequence of comparisons made by the algorithm.
As we have seen earlier, any binary tree of height T(n) has at most 2^T(n) leaves. This means that this sorting algorithm can distinguish between at most 2^T(n) different final actions. Let's call this quantity A(n), the number of different final actions the algorithm can take. Each action can be thought of as a specific way of permuting the original input to get the sorted output. How many possible actions must any sorting algorithm distinguish between? If the input consists of n distinct numbers, then those numbers could be presented in any of n! different permutations. For each different permutation, the algorithm must "unscramble" the numbers in an essentially different way, that is, it must take a different action, implying that A(n) ≥ n!. (Again, A(n) is usually not exactly equal to n! because most algorithms contain some redundant unreachable leaves.)
Since A(n) ≤ 2^T(n), we have 2^T(n) ≥ n!, implying that
T(n) ≥ lg(n!).
We can use Stirling's approximation, n! ≥ (n/e)^n, yielding:
T(n) ≥ lg(n/e)^n = n·lg n − n·lg e = Ω(n log n).
Thus we have the following theorem.
Theorem: Any comparison-based sorting algorithm has worst-case running time Ω(n log n).
This can be generalized to show that the average-case time to sort is also Ω(n log n) (by arguing about the average height of a leaf in a tree with at least n! leaves). The lower bound on sorting can be generalized to provide lower bounds for a number of other problems as well.
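A quick numeric illustration of the bound (a sketch, not from the notes): computing lg(n!) for a few values of n shows that it has the same order of growth as n·lg n, so any comparison sort needs roughly that many comparisons in the worst case.

import math

for n in [8, 64, 1024]:
    lower = math.lgamma(n + 1) / math.log(2)    # lg(n!) via the log-gamma function
    print(n, round(lower), round(n * math.log2(n)))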
MODULE - II

• Lecture 11 - Dynamic Programming Algorithms
• Lecture 12 - Matrix Chain Multiplication
• Lecture 13 - Elements of Dynamic Programming
• Lecture 14 - Longest Common Subsequence
• Lecture 15 - Greedy Algorithms
• Lecture 16 - Activity Selection Problem
• Lecture 17 - Elements of Greedy Strategy
• Lecture 18 - Knapsack Problem
• Lecture 19 - Fractional Knapsack Problem
• Lecture 20 - Huffman Codes
Greedy Method

Introduction
Suppose we are given a problem to sort the array a = {5, 3, 2, 9}. Someone says the array after sorting is {1, 3, 5, 7}. Can we consider the answer correct? The answer is definitely "no" because the elements of the output set are not taken from the input set. Suppose someone says the array after sorting is {2, 5, 3, 9}. Can we admit the answer? The answer is again "no" because the output does not satisfy the objective function, that is, the first element must be less than the second, the second element must be less than the third, and so on. Therefore, a solution is said to be a feasible solution if it satisfies the following constraints:
(i) Explicit constraints: the elements of the output set must be taken from the input set.
(ii) Implicit constraints: the objective function defined in the problem.
The best of all feasible solutions is called the optimal solution. In other words, we need to find the solution which has the optimal (maximum or minimum) value satisfying the given constraints.
The Greedy approach constructs the solution through a sequence of steps. Each step is chosen such that it is the best alternative among all feasible choices that are available. The choice of a step, once made, cannot be changed in subsequent steps.
Let us consider the problem of coin change. Suppose a greedy person has some 25p, 20p, 10p and 5 paise coins. When someone asks him for some change, he wants to give the change with the minimum number of coins. Now, if someone requests change of 70p, he first selects 25p. Then the remaining amount is 45p. Next, he selects the largest coin that is less than or equal to 45p, i.e. 25p. The remaining 20p is paid by selecting a 20p coin. So the demand for 70p is paid by giving a total of 3 coins. This solution is an optimal solution. Now, if someone requests change of 40p, then the Greedy approach first selects a 25p coin, then a 10p coin and finally a 5p coin. However, the same could be paid with two 20p coins. So it is clear from this example that the Greedy approach tries to find the optimal solution by selecting the elements one by one that are locally optimal. But the Greedy method never gives a guarantee of finding the optimal solution.
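A minimal Python sketch of the greedy coin-change rule from this example (a hypothetical helper, not part of the notes), showing that it returns 3 coins for 40p even though two 20p coins would be enough:

def greedy_change(amount, coins=(25, 20, 10, 5)):
    """Repeatedly take the largest coin that still fits; this may be sub-optimal."""
    used = []
    for c in coins:               # coins are listed in decreasing order
        while amount >= c:
            used.append(c)
            amount -= c
    return used

print(greedy_change(70))   # [25, 25, 20] -> 3 coins, optimal here
print(greedy_change(40))   # [25, 10, 5]  -> 3 coins, but 20 + 20 uses only 2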
The choice at each step of a greedy approach is made based on the following:
• It must be feasible
• It must be locally optimal
• It must be unalterable
Fractional Knapsack Problem
Let there be n objects, each object having a weight and a contribution to profit. A knapsack of capacity M is given. The objective is to fill the knapsack in such a way that the profit is maximum. We allow a fraction of an item to be added to the knapsack. Mathematically, we can write:

maximize ∑ (i=1 to n) pi·xi
subject to ∑ (i=1 to n) wi·xi ≤ M,
1 ≤ i ≤ n and 0 ≤ xi ≤ 1,

where pi and wi are the profit and weight of the ith object and xi is the fraction of the ith object to be selected.

For example:
Given n = 3, (p1, p2, p3) = {25, 24, 15}, (w1, w2, w3) = {18, 15, 10}, M = 20.
Solution
Some of the feasible solutions are shown in the following table.

Solution No. | x1 | x2   | x3  | ∑ wi·xi | ∑ pi·xi
1            | 1  | 2/15 | 0   | 20      | 28.2
2            | 0  | 2/3  | 1   | 20      | 31.0
3            | 0  | 1    | 1/2 | 20      | 31.5

These solutions are obtained by different greedy strategies.
Greedy strategy I: In this case, the items are arranged by their profit values. The item with maximum profit is selected first. If the weight of the object is less than the remaining capacity of the knapsack then the object is selected in full and the profit associated with the object is added to the total profit. Otherwise, a fraction of the object is selected so that the knapsack can be filled exactly. This process continues from the most profitable object to the least profitable object till the knapsack is exactly full.
Greedy strategy II: In this case, the items are arranged by their weights. The item with minimum weight is selected first and the process continues like greedy strategy I till the knapsack is exactly full.
Greedy strategy III: In this case, the items are arranged by profit/weight ratio and the item with the maximum profit/weight ratio is selected first; the process continues like greedy strategy I till the knapsack is exactly full.
Therefore, it is clear from the above strategies that the Greedy method generates an optimal solution if we select the objects with respect to their profit to weight ratios, that is, the object with the maximum profit to weight ratio is selected first. Let there be n objects and let object i be associated with profit pi and weight wi. Then we can say that if p1/w1 ≥ p2/w2 ≥ ... ≥ pn/wn, the solution (x1, x2, x3, ..., xn) generated by the greedy method is an optimal solution. The proof of the above statement is left as an exercise for the readers. Algorithm 1 describes the greedy method for finding the optimal solution for the fractional knapsack problem.
Algorithm FKNAPSACK(p, w, x, n, M)
// p[1:n] and w[1:n] contain the profit and weight of n objects, assumed arranged by profit/weight ratio; M is the maximum capacity of the knapsack and x[1:n] is the solution vector. //
{
    for (i = 1; i <= n; i++)
        x[i] = 0;                  // initialize the solution to 0
    cu = M;                        // cu is the remaining capacity of the knapsack
    for (i = 1; i <= n; i++) {
        if (w[i] > cu)
            break;
        else {
            x[i] = 1;
            cu = cu - w[i];
        }
    }
    if (i <= n)
        x[i] = cu / w[i];
    return x;
}
Algorithm 1. Greedy algorithm for the fractional knapsack problem.
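A runnable Python sketch of greedy strategy III (selecting by profit/weight ratio), checked against the example with M = 20 above:

def fractional_knapsack(profits, weights, M):
    """Greedy by profit/weight ratio; returns (total profit, fraction vector x)."""
    order = sorted(range(len(profits)),
                   key=lambda i: profits[i] / weights[i], reverse=True)
    x = [0.0] * len(profits)
    remaining = M
    total = 0.0
    for i in order:
        if weights[i] <= remaining:        # take the whole object
            x[i] = 1.0
            remaining -= weights[i]
            total += profits[i]
        else:                              # take a fraction and the knapsack is full
            x[i] = remaining / weights[i]
            total += profits[i] * x[i]
            break
    return total, x

print(fractional_knapsack([25, 24, 15], [18, 15, 10], 20))   # (31.5, [0.0, 1.0, 0.5])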
Huffman Coding
Each character is represented in 8 bits when characters are coded using standard codes such as ASCII. Characters coded using standard codes therefore have a fixed-length code word representation, and in this fixed-length coding system the total code length is larger. For example, let there be six characters (a, b, c, d, e, f) whose frequencies of occurrence in a message are {45, 13, 12, 16, 9, 5}. In a fixed-length coding system we can use three bits to represent each character. Then the total code length of the message is (45 + 13 + 12 + 16 + 9 + 5) × 3 = 100 × 3 = 300.
Let us encode the characters with a variable-length coding system. In this coding system, a character with a higher frequency of occurrence is assigned fewer bits for representation, while characters having a lower frequency of occurrence are assigned more bits for representation. The variable-length codes for the characters are shown in the following table. The total code length in the variable-length coding system is 1 × 45 + 3 × 13 + 3 × 12 + 3 × 16 + 4 × 9 + 4 × 5 = 224. Hence the fixed-length code requires 300 bits while the variable-length code requires only 224 bits.

Character: a | b   | c   | d   | e    | f
Code:      0 | 101 | 100 | 111 | 1101 | 1100
Prefix (Free) Codes
We have seen that using variable-length code words we minimize the overall encoded string length. But the question arises whether we can decode the string. If a were encoded as 1 instead of 0, then the encoded string "111" could be decoded as "d" or "aaa"; we would get an ambiguous string. The key point to remove this ambiguity is to use prefix codes. A prefix code is a code in which no codeword is a prefix of another codeword.
The decoding process can be represented by a binary tree whose leaves are characters. We interpret the binary codeword for a character as the path from the root to that character, where
⇒ "0" means "go to the left child"
⇒ "1" means "go to the right child"

Greedy Algorithm for Huffman Code:
According to the Huffman algorithm, a bottom-up tree is built starting from the leaves. Initially, there are n singleton trees in the forest, as each tree is a leaf. The greedy strategy first finds the two trees having the minimum frequencies of occurrence. Then these two trees are merged into a single tree whose frequency is the sum of the frequencies of the two merged trees. The whole process is repeated until there is only one tree in the forest.
Let us consider a set of characters S = <a, b, c, d, e, f> with the following frequencies of occurrence P = <45, 13, 12, 16, 5, 9>. Initially, these six characters with their frequencies are considered as six singleton trees in the forest. The step-wise merging of these trees into a single tree is shown in Fig. 6.3. The merging is done by selecting the two trees with minimum frequencies till there is only one tree in the forest.
a : 45    b : 13    c : 12    d : 16    e : 5    f : 9
(Figure: step-wise merging of the singleton trees.)
Now the left branch is assigned a code "0" and the right branch is assigned a code "1". The decode tree is obtained after assigning the codes to the branches.
The binary codeword for a character is interpreted as the path from the root to that character. Hence, the codes for the characters are as follows:
a = 0, b = 101, c = 100, d = 111, e = 1100, f = 1101
Therefore, it is seen that no code is the prefix of another code. Suppose we have the code 01111001101. To decode the binary codeword, we traverse the tree. The first bit is 0 and the tree traversal terminates at the character a. The next bit is 1, for which the tree is traversed to the right. Since a leaf node has not been reached, the tree is next traversed to the right for the next bit 1. Similarly, the tree is traversed for all the bits of the code string. When the tree traversal terminates at a leaf node, the traversal starts again from the root for the next bit of the code string. The character string after decoding is "adcf".
Algorithm HUFFMAN(n, S)
// n is the number of symbols and S is the set of characters; for each character c ∈ S, the frequency of occurrence is f(c). //
{
    Initialize the priority queue;
    Q = S;                       // initialize the priority queue Q with the frequencies of all the characters of set S
    for (i = 1; i <= n-1; i++) {
        z = CREATE_NODE();       // create a node pointed to by z
        x = DELETE_MIN(Q);       // delete the character with minimum frequency from Q and store it in node x
        y = DELETE_MIN(Q);       // delete the character with the next minimum frequency from Q and store it in node y
        z→left = x;              // place x as the left child of z
        z→right = y;             // place y as the right child of z
        f(z) = f(x) + f(y);      // the value of node z is the sum of the values at nodes x and y
        INSERT(Q, z);            // insert z into the priority queue Q
    }
    return DELETE_MIN(Q);
}
Algorithm 2. Algorithm of Huffman coding.
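A compact runnable sketch of the same greedy construction, using Python's heapq module as the priority queue; it reports a total code length of 224 bits for the example frequencies (the individual 0/1 labels may differ from the table above because ties between equal-frequency trees can be broken either way, but the total length is the same):

import heapq

def huffman_codes(freq):
    """freq: dict character -> frequency. Returns dict character -> codeword."""
    # Each heap entry is (frequency, tie-breaker, tree); a tree is a character or a (left, right) pair.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)      # the two trees with minimum frequency
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")      # left branch gets "0"
            walk(tree[1], prefix + "1")      # right branch gets "1"
        else:
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
codes = huffman_codes(freq)
print(codes)
print(sum(freq[c] * len(codes[c]) for c in freq))   # total encoded length: 224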
Activity Selection Problem
Suppose we have a set of activities S = {a1, a2, ..., an} that wish to use a common resource. The objective is to schedule the activities in such a way that the maximum number of activities can be performed. Each activity ai has a start time si and a finish time fi, where 0 ≤ si < fi < ∞. The activities ai and aj are said to be compatible if the intervals [si, fi] and [sj, fj] do not overlap, that is, si ≥ fj or sj ≥ fi.
For example, let us consider the following set S of activities, which are sorted in monotonically increasing order of finish time.

i  :  1  2  3  4  5  6  7   8   9  10  11
si :  1  3  0  5  3  5  6   8   8   2  12
fi :  4  5  6  7  8  9  10  11  12  13  14
For this example, the subsets {a3, a9, a11}, {a1, a4, a8, a11} and {a2, a4, a9, a11} consist of mutually compatible activities. The last two are the largest subsets of mutually compatible activities.
Now, we can devise a greedy algorithm that works in a top-down fashion. We assume that the n input activities are ordered by monotonically increasing finish time, or that they can be sorted into this order in O(n·log₂n) time. The greedy algorithm for the activity selection problem is given below.

Algorithm ACTIVITY-SELECTION(S, f)
{
    n = LENGTH(S);       // n is the total number of activities
    A = {a1};            // A is the set of selected activities, initialized to a1
    i = 1;               // i represents the most recently selected activity
    for (j = 2; j <= n; j++) {
        if (sj ≥ fi) {
            A = A ∪ {aj};
            i = j;
        }
    }
    return A;
}
Algorithm 3. Algorithm for the activity selection problem.
The algorithm takes the start and finish times of the activities, represented as arrays s and f; LENGTH(S) gives the total number of activities. The set A stores the selected activities. Since the activities are ordered with respect to their finish times, the set A is initialized to contain just the first activity a1. The variable i stores the index of the most recently selected activity. The for loop considers each activity aj and adds it to the set A if it is mutually compatible with the previously selected activities. To see whether activity aj is compatible with every activity currently in A, it suffices to check whether the start time of aj is greater than or equal to the finish time of the most recently selected activity ai.
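A runnable Python version of the greedy selection (assuming, as above, that the activities are already sorted by finish time), checked against the 11-activity example:

def activity_selection(s, f):
    """s, f: start/finish times sorted by finish time. Returns selected activity numbers (1-based)."""
    A = [1]                      # always select the first activity
    i = 0                        # 0-based index of the most recently selected activity
    for j in range(1, len(s)):
        if s[j] >= f[i]:         # compatible with the last selected activity
            A.append(j + 1)
            i = j
    return A

s = [1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
print(activity_selection(s, f))   # [1, 4, 8, 11]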
Minimum Cost Spanning Tree
Let G = (V, E) be a graph where V is the set of vertices, E is the set of edges and |V| = n. A spanning tree G′ = (V, E′) is a subgraph of G in which all the vertices of graph G are connected with the minimum number of edges. The minimum number of edges required to connect all the vertices of a graph G is n − 1. Spanning trees play a very important role in designing efficient algorithms.
Let us consider the graph shown in Fig 6.6(a). A number of possible spanning trees are shown in Fig 6.6(b).
If we consider a weighted graph, then the spanning trees generated from the graph have different weights. The weight of a spanning tree is the sum of its edge weights. The spanning tree with minimum weight is called the minimum spanning tree (MST). Fig. 6.7 shows a weighted graph and the minimum spanning tree.
A greedy method to obtain the minimum spanning tree constructs the tree edge by edge, where each edge is chosen according to some optimization criterion. An obvious criterion would be to choose an edge which adds a minimum weight to the total weight of the edges selected so far. There are two ways in which this criterion can be achieved.
1. The set of edges selected so far always forms a tree; the next edge to be added is such that not only does it add a minimum weight, but it also forms a tree with the previous edges. It can be shown that the algorithm results in a minimum cost tree; this algorithm is called Prim's algorithm.
2. The edges are considered in non-decreasing order of weight; the set T of edges at each stage is such that it is possible to complete T into a tree; thus T may not be a tree at all stages of the algorithm. This also results in a minimum cost tree; this algorithm is called Kruskal's algorithm.
Prim's Algorithm
This algorithm starts with a tree that has only one edge, the minimum-weight edge. Edges (j, q) are added one by one such that node j is already included, node q is not yet included and the weight wt(j, q) is the minimum amongst all edges (x, y) for which x is in the tree and y is not. In order to execute this algorithm efficiently, we keep a node index near(j) associated with each node j that is not yet included in the tree. If a node is included in the tree, near(j) = 0. The node j is selected into the tree such that wt(j, near(j)) is the minimum amongst all possible choices for near(j).

Algorithm PRIM(E, wt, n, T)
// E is the set of edges, wt(n, n) is the weight adjacency matrix for G, n is the number of nodes and T(n−1, 2) stores the spanning tree. //
{
    (k, l) = edge with minimum weight;
    minwt = wt[k, l];
    T[1, 1] = k; T[1, 2] = l;
    for (i = 1; i <= n; i++) {
        if (wt[i, k] < wt[i, l])
            near[i] = k;
        else
            near[i] = l;
    }
    near[k] = near[l] = 0;
    for (i = 2; i <= n-1; i++) {
        let j be an index such that near[j] ≠ 0 and wt[j, near[j]] is minimum;
        T[i, 1] = j; T[i, 2] = near[j];
        minwt = minwt + wt[j, near[j]];
        near[j] = 0;
        for (k = 1; k <= n; k++) {
            if (near[k] ≠ 0 and wt[k, near[k]] > wt[k, j])
                near[k] = j;
        }
    }
    if (minwt == ∞)
        print("No spanning tree");
    return minwt;
}
Algorithm 4. Prim's algorithm for finding the MST.
Fig 6.8. The weighted undirected graph to illustrate Prim's algorithm.
Let us consider the weighted undirected graph shown in Fig. 6.8; the objective is to construct a minimum spanning tree. The step-wise operation of Prim's algorithm is described as follows.

Step 1: The minimum-weight edge is (2, 3) with weight 5. Hence, the edge (2, 3) is added to the tree, and near(2) and near(3) are set to 0.
Step 2: Find near of all the nodes that are not yet selected into the tree, and its cost:
near(1) = 2, weight = 16
near(4) = 2, weight = 6
near(5) = -, weight = ∞
near(6) = 2, weight = 11
Node 4 is selected and the edge (2, 4) is added to the tree because weight(4, near(4)) is minimum. Then near(4) is set to 0.
Step 3:
near(1) = 2, weight = 16;  near(5) = 4, weight = 18;  near(6) = 2, weight = 11
As weight(6, near(6)) is minimum, node 6 is selected and the edge (2, 6) is added to the tree. So near(6) is set to 0.
Step 4:
near(1) = 2, weight = 16;  near(5) = 4, weight = 18
Next, the edge (2, 1) is added to the tree as weight(1, near(1)) is minimum. So near(1) is set to 0.
Step 5:
near(5) = 1, weight = 12
The edge (1, 5) is added to the tree. Figs. 6.9(a) to 6.9(e) show the step-wise construction of the MST by Prim's algorithm.
Fig. 6.9 Step wise construction of MST by
Prim’s algorithm
Time complexity of Prim's Algorithm
Prim's algorithm has three for loops. The first for loop finds the near value of all nodes, which requires O(n) time. The second for loop finds the remaining n−2 edges and the third for loop updates near of each node after adding a vertex to the MST. Since the third for loop is nested within the second for loop, it requires O(n²) time. Hence, the overall time complexity of Prim's algorithm is O(n²).
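A runnable Python sketch of Prim's algorithm in the same O(n²) style, using an adjacency matrix with math.inf where there is no edge (the small graph below is a made-up example, not the one from the figures):

import math

def prim_mst(wt):
    """wt: n x n symmetric weight matrix (math.inf where no edge). Returns (total weight, edges)."""
    n = len(wt)
    in_tree = [False] * n
    near = [0] * n                 # nearest tree vertex for each non-tree vertex
    in_tree[0] = True              # start the tree at vertex 0
    edges, total = [], 0
    for _ in range(n - 1):
        # pick the non-tree vertex j whose edge (j, near[j]) into the tree is cheapest
        j = min((v for v in range(n) if not in_tree[v]), key=lambda v: wt[v][near[v]])
        edges.append((near[j], j))
        total += wt[j][near[j]]
        in_tree[j] = True
        for k in range(n):         # update near[] now that j is in the tree
            if not in_tree[k] and wt[k][j] < wt[k][near[k]]:
                near[k] = j
    return total, edges

INF = math.inf
wt = [[INF, 2, 3, INF],
      [2, INF, 1, 4],
      [3, 1, INF, 5],
      [INF, 4, 5, INF]]
print(prim_mst(wt))                # (7, [(0, 1), (1, 2), (1, 3)])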
Kruskal's Algorithm
This algorithm starts with a list of edges sorted in non-decreasing order of weights. It repeatedly adds the smallest edge to the spanning tree that does not create a cycle. Initially, each vertex is in its own tree in the forest. Then, the algorithm considers each edge ordered by increasing weight. If the edge (u, v) connects two different trees, then (u, v) is added to the set of edges of the MST and the two trees connected by the edge (u, v) are merged into a single tree. If an edge (u, v) connects two vertices in the same tree, then the edge (u, v) is discarded.
Kruskal's algorithm for finding the MST is presented as follows. It starts with an empty set A, and selects at every stage the shortest edge that has not been chosen or rejected, regardless of where this edge is situated in the graph. The pseudo code of Kruskal's algorithm is given in Algorithm 5.
The operations on disjoint sets used for Kruskal's algorithm are as follows:
Make_set(v): create a new set whose only member is pointed to by v.
Find_set(v): returns a pointer to the set containing v.
Union(u, v): unites the dynamic sets that contain u and v into a new set that is the union of these two sets.
Algorithm KRUSKAL(V, E, W)
// V is the set of vertices, E is the set of edges and W is the adjacency matrix to store the weights of the links. //
{
    A = Φ;
    for (each vertex u in V)
        Make_set(u);
    Create a min-heap from the weights of the links using the procedure heapify;
    for (each least-weight edge (u, v) in E) {      // the least-weight edge is the root of the heap
        if (Find_set(u) ≠ Find_set(v)) {            // u and v are in two different sets
            A = A ∪ {(u, v)};
            Union(u, v);
        }
    }
    return A;
}
Algorithm 5. Kruskal's algorithm for finding the MST.
Let us consider the graph shown in Fig. 6.10 to illustrate Kruskal's algorithm. The stepwise procedure to construct the MST by following the algorithm above is given below.
Step wise
construction of MST by Kruskal’s algorithm
Time complexity of Kruskal’s Algorithm
Kruskal's algorithm first creates n trees from the n vertices, which is done in O(n) time. Then, a heap is created in O(n) time using the heapify procedure. The least weight edge is at the root of the heap. The edges are deleted from the heap one by one, and each is either added to the MST or discarded if it forms a cycle. This deletion process requires O(n log₂ n) time. Hence, the time complexity of Kruskal's algorithm is O(n log₂ n).
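The following Python sketch is a minimal illustration of Kruskal's algorithm. Sorting the edge list stands in for the min heap described above, and a simple parent-pointer Find_set/Union stands in for the disjoint-set structure (treated in detail in Module III); the edge list at the end is a hypothetical example, not the graph of Fig. 6.10.

def kruskal_mst(n, edges):
    # n vertices numbered 0..n-1; edges is a list of (weight, u, v).
    # Returns (list of MST edges, total weight).
    parent = list(range(n))                  # Make_set for every vertex

    def find(x):                             # Find_set with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):            # edges in non-decreasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                         # u and v are in two different trees
            parent[ru] = rv                  # Union of the two trees
            mst.append((u, v))
            total += w
    return mst, total

edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]   # hypothetical graph
print(kruskal_mst(4, edges))                 # -> ([(0, 1), (1, 3), (1, 2)], 6)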
Shortest Path Problem
Let us consider a number of cities connected by roads, and a traveler who wants to travel from his home city A to a destination B with minimum cost. The traveler will be interested to know the following:
• Is there a path from city A to city B?
• If there is more than one path from A to B, which is the shortest or least cost path?
Let us consider the graph G = (V, E), a weighting function w(e) for the edges in E and a source node v0. The problem is to determine the shortest path from v0 to all the remaining nodes of G. The solution to this problem was suggested by E. W. Dijkstra, and the algorithm is popularly known as Dijkstra's algorithm.
This algorithm finds the shortest paths one by one. If we have already constructed i shortest paths, then the next path to be constructed should be the next shortest path. Let S be the set of vertices to which the shortest paths have already been generated. For z not in S, let dist[z] be the length of the shortest path starting from v0, going through only those vertices that are in S, and ending at z. Let u be the vertex most recently added to S. If dist[z] > dist[u] + w(u, z), then dist[z] is updated to dist[u] + w(u, z) and the predecessor of z is set to u. Dijkstra's algorithm is presented in Algorithm 6.6.
Algorithm DIJKSTRA(v0, W, dist, n)
// v0 is the source vertex, W is the adjacency matrix storing the weights of
// the links, dist[k] is the array storing the shortest path length to vertex k,
// n is the number of vertices.
{
    for (i = 1; i <= n; i++) {
        S[i] = 0;              // initialize the set S to empty, i.e. i is not yet in S
        dist[i] = W[v0, i];    // initialize the distance to each node
    }
    S[v0] = 1;  dist[v0] = 0;
    for (j = 2; j <= n; j++) {
        choose a vertex u not in S such that dist[u] is minimum;
        S[u] = 1;
        for (each z adjacent to u with S[z] == 0) {
            if (dist[z] > dist[u] + W[u, z])
                dist[z] = dist[u] + W[u, z];
        }
    }
}
Algorithm
6.6. Dijkstra’s algorithm for finding shortest path.
Let us consider the following graph. The objective is to find the shortest path from the source vertex 0 to all the remaining nodes.
Iteration | Set of nodes to which the shortest path is found | Vertex selected | Distance to 0, 1, 2, 3, 4
Initial   | {0}                                               | -               | 0, 10, ∞, 5, ∞
1         | {0, 3}                                            | 3               | 0, 8, 14, 5, 7
2         | {0, 3, 4}                                         | 4               | 0, 8, 13, 5, 7
3         | {0, 3, 4, 1}                                      | 1               | 0, 8, 9, 5, 7
4         | {0, 3, 4, 1, 2}                                   | 2               | 0, 8, 9, 5, 7
The time complexity of the algorithm is O(n²).
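A minimal Python sketch of Algorithm 6.6 is given below. The figure of the example graph is not reproduced in this text, so the adjacency matrix in the sketch is an assumed one, chosen so that running the sketch reproduces the distance table above.

import math

def dijkstra(w, v0):
    # w[i][j] is the weight of edge (i, j), math.inf if absent; v0 is the source.
    # Returns dist[], the shortest path length from v0 to every vertex.
    n = len(w)
    in_s = [False] * n                       # S[i] of the pseudocode
    dist = [w[v0][i] for i in range(n)]      # initialise distances from the source
    in_s[v0] = True
    dist[v0] = 0
    for _ in range(n - 1):
        u = min((i for i in range(n) if not in_s[i]), key=lambda i: dist[i])
        in_s[u] = True                       # u has the minimum dist among vertices not in S
        for z in range(n):                   # relax the edges leaving u
            if not in_s[z] and dist[z] > dist[u] + w[u][z]:
                dist[z] = dist[u] + w[u][z]
    return dist

INF = math.inf
w = [[0, 10, INF, 5, INF],                   # assumed adjacency matrix (source is vertex 0)
     [INF, 0, 1, 2, INF],
     [INF, INF, 0, INF, 4],
     [INF, 3, 9, 0, 2],
     [7, INF, 6, INF, 0]]
print(dijkstra(w, 0))                        # -> [0, 8, 9, 5, 7]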
Dynamic Programming
Introduction
Dynamic Programming (DP) is a powerful design technique for solving optimization problems. It was invented by the mathematician Richard Bellman in the 1950s. DP is closely related to the divide and conquer technique, where the problem is divided into smaller sub-problems and each sub-problem is solved recursively. DP differs from divide and conquer in that, instead of solving sub-problems recursively, it solves each of the sub-problems only once and stores the solutions in a table. The solution to the main problem is obtained from the solutions of these subproblems.
The steps of Dynamic Programming technique are:
• Dividing the problem into sub-problems: The main
problem is divided into smaller subproblems. The solution of the main problem
is expressed in terms of the solution for the smaller sub-problems.
• Storing the sub solutions in a table: The
solution for each sub-problem is stored in a table so that it can be referred
many times whenever required.
• Bottom-up computation: The DP
technique starts with the smallest problem instance and develops the solution
to sub instances of longer size and finally obtains the solution of the
original problem instance.
The strategy can be used when the process of obtaining a
solution of a problem can be viewed as a sequence of decisions. The problems of
this type can be solved by taking an optimal sequence of decisions. An optimal
sequence of decisions is found by taking one decision at a time and never
making an erroneous decision. In dynamic programming, an optimal sequence of decisions is arrived at by using the principle of optimality. The principle of optimality states that whatever the initial state and decision are, the remaining decisions must constitute an optimal decision sequence with regard to the state resulting from the first decision.
A fundamental difference between the greedy strategy and dynamic programming is that in the greedy strategy only one decision sequence is generated, whereas in dynamic programming a number of them may be generated. The dynamic programming technique guarantees an optimal solution for a problem, whereas the greedy method gives no such guarantee.
Matrix Chain Multiplication
Let us consider three matrices A1, A2 and A3 with orders (10 x 100), (100 x 5) and (5 x 50) respectively. Then the three matrices can be multiplied in two ways.
(i) First,
multiplying A2 and A3, then multiplying A1
with the resultant matrix i.e. A1(A2 A3).
(ii) First,
multiplying A1 and A2, and then multiplying the resultant
matrix with A3 i.e. (A1A2) A3.
The number of scalar multiplications required in case (i) is 100 * 5 * 50 + 10 * 100 * 50 = 25,000 + 50,000 = 75,000, and the number of scalar multiplications required in case (ii) is 10 * 100 * 5 + 10 * 5 * 50 = 5,000 + 2,500 = 7,500.
To find the best
possible way to calculate the product, we could simply parenthesize the
expression in every possible fashion and count each time how many scalar
multiplications are required. Thus the matrix chain multiplication problem can
be stated as “find the optimal
parenthesisation of a chain of matrices to be multiplied such that the number
of scalar multiplications is minimized”.
Dynamic Programming Approach for Matrix Chain
Multiplication
Let us consider a chain of n matrices A1, A2, …, An, where the matrix Ai has dimensions P[i-1] x P[i]. Let the parenthesisation at k result in two sub-chains A1…Ak and Ak+1…An. These two sub-chains must each be optimal for A1…An to be optimal. The cost of the matrix chain (A1…An) is therefore
cost(A1…An) = cost(A1…Ak) + cost(Ak+1…An) + cost of multiplying the two resultant matrices together.
Here, the cost represents the number of scalar multiplications. The sub-chain (A1…Ak) produces a matrix of dimension P[0] x P[k] and the sub-chain (Ak+1…An) produces a matrix of dimension P[k] x P[n]. The number of scalar multiplications required to multiply the two resultant matrices is P[0] x P[k] x P[n].
Let m[i, j] be the minimum number of scalar multiplications required to multiply the matrix chain (Ai…Aj). Then
(i) m[i, j] = 0 if i = j
(ii) m[i, j] = (minimum number of scalar multiplications required to multiply (Ai…Ak)) + (minimum number of scalar multiplications required to multiply (Ak+1…Aj)) + (cost of multiplying the two resultant matrices), i.e.
m[i, j] = m[i, k] + m[k+1, j] + P[i−1] × P[k] × P[j]
However, we don’t know the value of k, for which m[i, j] is minimum. Therefore, we have to
try all j – i possibilities.
m[i, j] = 0                                                                        if i = j
m[i, j] = min over i ≤ k < j of { m[i, k] + m[k+1, j] + P[i−1] × P[k] × P[j] }     otherwise
Therefore, the minimum number of scalar multiplications required to multiply the n matrices A1 A2…An is
m[1, n] = min over 1 ≤ k < n of { m[1, k] + m[k+1, n] + P[0] × P[k] × P[n] }
The dynamic programming approach for matrix chain
multiplication is presented in Algorithm 7.2.
Algorithm MATRIX-CHAIN-MULTIPLICATION(P)
// P is an array of length n+1, i.e. from P[0] to P[n]. It is assumed that the
// matrix Ai has dimension P[i-1] × P[i].
{
    for (i = 1; i <= n; i++)
        m[i, i] = 0;
    for (l = 2; l <= n; l++) {
        for (i = 1; i <= n-(l-1); i++) {
            j = i + (l-1);
            m[i, j] = ∞;
            for (k = i; k <= j-1; k++) {
                q = m[i, k] + m[k+1, j] + P[i-1] × P[k] × P[j];
                if (q < m[i, j]) {
                    m[i, j] = q;
                    s[i, j] = k;
                }
            }
        }
    }
    return m and s;
}
Algorithm 7.2 Matrix Chain Multiplication algorithm.
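The following Python sketch is a direct rendering of Algorithm 7.2, together with a small routine that recovers the parenthesisation from the s table. Run on the chain P = {5, 10, 3, 12, 5, 50} used in the worked example that follows, it reproduces the hand computation: 1655 scalar multiplications and the grouping ((A1 A2)(A3 A4))(A5).

import math

def matrix_chain_order(P):
    # P[0..n]: matrix Ai has dimension P[i-1] x P[i].
    # Returns (m, s): m[i][j] = minimum scalar multiplications for Ai..Aj,
    # s[i][j] = index k of the optimal split (1-based indices).
    n = len(P) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):                 # l = length of the sub-chain
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = math.inf
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + P[i - 1] * P[k] * P[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s

def print_parens(s, i, j):
    # rebuild the optimal parenthesisation from the split table s
    if i == j:
        return "A%d" % i
    k = s[i][j]
    return "(" + print_parens(s, i, k) + " " + print_parens(s, k + 1, j) + ")"

P = [5, 10, 3, 12, 5, 50]
m, s = matrix_chain_order(P)
print(m[1][5])                                # -> 1655
print(print_parens(s, 1, 5))                  # -> (((A1 A2) (A3 A4)) A5)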
Now let us discuss the procedure and pseudocode of the matrix chain multiplication. Suppose we are given n matrices in the chain, i.e. A1, A2, …, An, where the dimension of matrix Ai is P[i-1] × P[i]. The input to the matrix-chain-order algorithm is the sequence P = {P[0], P[1], …, P[n]} of length n+1. The algorithm first computes m[i, i] = 0 for i = 1, 2, …, n. Then the algorithm computes m[i, j] for j − i = 1 in the first step, up to m[i, j] for j − i = n − 1 in the last step; the values of m[i, j] are calculated for j − i = 1 to j − i = n − 1 in a bottom-up manner. At each step, the calculation of m[i, j] uses m[i, k] and m[k+1, j] for i ≤ k < j, which have already been calculated in previous steps.
To find the optimal placement of parentheses for the matrix chain multiplication Ai, Ai+1, …, Aj, we record the value of k (i ≤ k < j) for which m[i, j] is minimum. Then the matrix chain can be divided into (Ai…Ak) and (Ak+1…Aj).
Let us consider matrices A1, A2, …, A5 to illustrate the MATRIX-CHAIN-MULTIPLICATION algorithm. The matrix chain order is P = {P0, P1, P2, P3, P4, P5} = {5, 10, 3, 12, 5, 50}. The objective is to find the minimum number of scalar multiplications required to multiply the 5 matrices and also to find the optimal sequence of multiplications.
The solution can be obtained by using a bottom-up approach: first we calculate mii for 1 ≤ i ≤ 5, then mij is calculated for j − i = 1 to j − i = 4. We can fill the table shown in Fig. 7.4 to find the solution.
Fig. 7.4 Table to store the partial solutions of the matrix chain multiplication problem
The values of mii for 1 ≤ i ≤ 5 are filled as 0, i.e. the elements in the first row of the table are assigned 0.
Then,
For j − i = 1:
m12 = P0 P1 P2 = 5 x 10 x 3 = 150
m23 = P1 P2 P3 = 10 x 3 x 12 = 360
m34 = P2 P3 P4 = 3 x 12 x 5 = 180
m45 = P3 P4 P5 = 12 x 5 x 50 = 3000
For j − i = 2:
m13 = min {m11 + m23 + P0 P1 P3, m12 + m33 + P0 P2 P3}
    = min {0 + 360 + 5*10*12, 150 + 0 + 5*3*12}
    = min {360 + 600, 150 + 180} = min {960, 330} = 330
m24 = min {m22 + m34 + P1 P2 P4, m23 + m44 + P1 P3 P4}
    = min {0 + 180 + 10*3*5, 360 + 0 + 10*12*5}
    = min {180 + 150, 360 + 600} = min {330, 960} = 330
m35 = min {m33 + m45 + P2 P3 P5, m34 + m55 + P2 P4 P5}
    = min {0 + 3000 + 3*12*50, 180 + 0 + 3*5*50}
    = min {3000 + 1800, 180 + 750} = min {4800, 930} = 930
For j − i = 3:
m14 = min {m11 + m24 + P0 P1 P4, m12 + m34 + P0 P2 P4, m13 + m44 + P0 P3 P4}
    = min {0 + 330 + 5*10*5, 150 + 180 + 5*3*5, 330 + 0 + 5*12*5}
    = min {330 + 250, 150 + 180 + 75, 330 + 300}
    = min {580, 405, 630} = 405
m25 = min {m22 + m35 + P1 P2 P5, m23 + m45 + P1 P3 P5, m24 + m55 + P1 P4 P5}
    = min {0 + 930 + 10*3*50, 360 + 3000 + 10*12*50, 330 + 0 + 10*5*50}
    = min {930 + 1500, 360 + 3000 + 6000, 330 + 2500}
    = min {2430, 9360, 2830} = 2430
For j − i = 4:
m15 = min {m11 + m25 + P0 P1 P5, m12 + m35 + P0 P2 P5, m13 + m45 + P0 P3 P5, m14 + m55 + P0 P4 P5}
    = min {0 + 2430 + 5*10*50, 150 + 930 + 5*3*50, 330 + 3000 + 5*12*50, 405 + 0 + 5*5*50}
    = min {2430 + 2500, 150 + 930 + 750, 330 + 3000 + 3000, 405 + 1250}
    = min {4930, 1830, 6330, 1655} = 1655
Hence, the minimum number of scalar multiplications required to multiply the given five matrices is 1655.
To find the optimal parenthesisation of A1…A5, we note that m15 is minimum for k = 4, so the chain is split into (A1…A4)(A5). Similarly, (A1…A4) is split into (A1A2)(A3A4), because m14 is minimum for k = 2. No further splitting is required, as the sub-chains (A1A2) and (A3A4) each contain only two matrices. So the optimal parenthesisation of A1…A5 is ((A1A2)(A3A4))(A5).
Time complexity of multiplying a
chain of n matrices
Let T(n) be the time complexity of multiplying
a chain of n matrices.
T(n) = Θ(1)                                                      if n = 1
T(n) = Θ(1) + Σ (k = 1 to n−1) [ T(k) + T(n−k) + Θ(1) ]          if n > 1
⇒ T(n) = Θ(1) + Θ(n−1) + Σ (k = 1 to n−1) [ T(k) + T(n−k) ]
⇒ T(n) = Θ(n) + 2[ T(1) + T(2) + … + T(n−1) ]                   ……(7.1)
Replacing n by n−1, we get
T(n−1) = Θ(n−1) + 2[ T(1) + T(2) + … + T(n−2) ]                 ……(7.2)
Subtracting equation 7.2 from equation 7.1, we have
T(n) − T(n−1) = Θ(n) − Θ(n−1) + 2T(n−1)
⇒ T(n) = Θ(1) + 3T(n−1)
       = Θ(1) + 3[ Θ(1) + 3T(n−2) ] = Θ(1) + 3Θ(1) + 3²T(n−2)
       = Θ(1)[ 1 + 3 + 3² + … + 3^(n−2) ] + 3^(n−1) T(1)
       = Θ(1)[ 1 + 3 + 3² + … + 3^(n−1) ]
       = Θ(3^(n−1))
Hence, the time taken by the naive recursive approach grows exponentially in n (in particular it is Ω(2^n)), which motivates the dynamic programming solution.
Longest Common Subsequence
The longest common subsequence (LCS) problem can be formulated as follows: "Given two sequences X = 〈x1, x2, …, xm〉 and Y = 〈y1, y2, …, yn〉, find a longest sequence Z = 〈z1, z2, …, zk〉 that is common to X and Y."
Given two sequences X and Y, we say Z is a common subsequence of X and Y if Z is a subsequence of both X and Y. For example, with X = 〈A, B, C, B, D, A, B〉 and Y = 〈B, D, C, A, B, A〉, the sequence 〈B, C, A〉 is a common subsequence.
Similarly, there are many common subsequences in the two sequences X and Y.
However, in the longest common subsequence problem, we wish to find a maximum
length common subsequence of X and Y, that is 〈 B, C, B, A〉 or 〈 B, D, A, B〉. This section shows that the LCS
problem can be solved efficiently using dynamic programming.
4.7.1 Dynamic programming for the LCS problem
Theorem 4.1 (Optimal Structure of an LCS)
Let X = 〈x1, x2, …, xm〉 and Y = 〈y1, y2, …, yn〉 be sequences and let Z = 〈z1, z2, …, zk〉 be any LCS of X and Y.
Case 1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1.
Case 2. If xm ≠ yn, then zk ≠ xm implies that Z is an LCS of Xm-1 and Y.
Case 3. If xm ≠ yn, then zk ≠ yn implies that Z is an LCS of X and Yn-1.
Proof The proof
of the theorem is presented below for all three cases.
Case 1. If xm = yn and we assume that zk ≠ xm (equivalently zk ≠ yn), then xm = yn could be appended to Z, giving a common subsequence of length k+1 and violating the assumption that Z is a longest common subsequence. Hence zk = xm = yn. If Zk-1 were not an LCS of Xm-1 and Yn-1, then there would exist another common subsequence W of Xm-1 and Yn-1 whose length is more than k−1; appending xm = yn to W would give a common subsequence of X and Y of length greater than k, which again violates our assumption.
Hence, zk = xm = yn and Zk-1 must be an LCS of Xm-1 and Yn-1.
Case 2. If xm ≠ yn, then zk ≠ xm implies that Z is an LCS of Xm-1 and Yn. If there were a common subsequence W of Xm-1 and Y with length greater than k, then W would also be an LCS of Xm and Yn, violating our assumption that Z is an LCS of Xm and Yn.
Case 3. The proof
is symmetric to case-2.
Thus the LCS problem has an optimal structure.
Overlapping Sub-problems
From Theorem 4.1, it is observed that either one or two sub-problems must be examined to find an LCS of Xm and Yn. If xm = yn, then we must find an LCS of Xm-1 and Yn-1. If xm ≠ yn, then we must find an LCS of Xm-1 and Yn and an LCS of Xm and Yn-1; the LCS of X and Y is the longer of these two LCSs.
Let us define c[m, n] to be the length of an LCS of the
sequences Xm and Yn. The optimal structure of
the LCS problem gives the recursive formula
c[m, n] = 0                                  if m = 0 or n = 0
c[m, n] = c[m−1, n−1] + 1                    if xm = yn          ……(7.1)
c[m, n] = max{ c[m−1, n], c[m, n−1] }        if xm ≠ yn
Generalizing equation 7.1, we can write
c[i, j] = 0                                  if i = 0 or j = 0
c[i, j] = c[i−1, j−1] + 1                    if xi = yj          ……(7.2)
c[i, j] = max{ c[i−1, j], c[i, j−1] }        if xi ≠ yj
Computing the length of an LCS
Based on equation (7.1), we could write an exponential recursive algorithm, but there are only m*n distinct sub-problems. Hence, we use dynamic programming to compute the solutions of the m*n distinct sub-problems in a bottom-up manner.
The algorithm LCS_LENGTH(X, Y) takes two sequences X = 〈x1, x2, …, xm〉 and Y = 〈y1, y2, …, yn〉 as inputs and computes c[m, n], the maximum length of a common subsequence of X and Y. It stores c[i, j] and b[i, j] in the tables c and b respectively, which simplifies the construction of the optimal solution.
Algorithm LCS_LENGTH(X, Y)
{
    m = length[X];  n = length[Y];
    for (i = 1; i <= m; i++)
        c[i, 0] = 0;
    for (j = 0; j <= n; j++)
        c[0, j] = 0;
    for (i = 1; i <= m; i++) {
        for (j = 1; j <= n; j++) {
            if (x[i] == y[j]) {
                c[i, j] = 1 + c[i-1, j-1];
                b[i, j] = '↖';
            }
            else if (c[i-1, j] >= c[i, j-1]) {
                c[i, j] = c[i-1, j];
                b[i, j] = '↑';
            }
            else {
                c[i, j] = c[i, j-1];
                b[i, j] = '←';
            }
        }
    }
    return c and b;
}
Algorithm
7.3 Algorithm for finding Longest common subsequence .
Constructing an LCS
The algorithm LCS_LENGTH returns c and b tables. The b table can be used to construct the LCS
of X and Y quickly.
Algorithm PRINT_LCS(b, X, i, j)
{
    if (i == 0 || j == 0)
        return;
    if (b[i, j] == '↖') {
        PRINT_LCS(b, X, i-1, j-1);
        print xi;
    }
    else if (b[i, j] == '↑')
        PRINT_LCS(b, X, i-1, j);
    else
        PRINT_LCS(b, X, i, j-1);
}
Algorithm
7.4 Algorithm to print the Longest common subsequence .
Let us consider the two sequences X = 〈C, R, O, S, S〉 and Y = 〈R, O, A, D, S〉; the objective is to find the LCS and its length. The c and b tables computed by the algorithm LCS_LENGTH for X and Y are shown in Fig. 7.5. The longest common subsequence of X and Y is 〈R, O, S〉 and the length of the LCS is 3.
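A compact Python sketch of the same computation is given below. Instead of storing the b table, it traces back through the c table directly, which yields an equivalent LCS; for X = CROSS and Y = ROADS it returns length 3 and the subsequence ROS, matching the result above.

def lcs(X, Y):
    # Returns (length of an LCS of X and Y, one such LCS).
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # trace back through c to recover one LCS (this plays the role of the b table)
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1]); i -= 1; j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return c[m][n], "".join(reversed(out))

print(lcs("CROSS", "ROADS"))                  # -> (3, 'ROS')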
Reliability Design Problem
In this section, we present the dynamic programming approach to solve a problem with a multiplicative constraint. Let us consider the example of a computer network in which a set of nodes are connected with each other. Let ri be the reliability of node i, i.e. the probability that the node forwards packets correctly. Then the reliability of a path connecting a node s to another node d is the product of ri over the k intermediate nodes, i.e. ∏ (i = 1 to k) ri. Similarly, we can consider a system with n devices connected in series, where the reliability of device i is ri; the reliability of the system is ∏ (i = 1 to n) ri. For example, if there are 5 devices connected in series and the reliability of each device is 0.99, then the reliability of the system is 0.99 × 0.99 × 0.99 × 0.99 × 0.99 ≈ 0.951. Hence, it is desirable to connect multiple copies of the same devices in parallel through the use of switching circuits. The switching circuits determine which devices in a group function properly, and make use of one such device at each stage.
Let mi be the number of copies of device Di in stage i. Then the probability that all mi copies malfunction is (1−ri)^mi. Hence, the reliability of stage i becomes 1−(1−ri)^mi. Thus, if ri = 0.99 and mi = 2, the reliability of stage i is 0.9999. However, in practical situations it is somewhat less, because the switching circuits are not fully reliable. Let us denote the reliability of stage i by φi(mi), 1 ≤ i ≤ n. Thus the reliability of the system is ∏ (i = 1 to n) φi(mi).
Fig. 7.9
The reliability design problem is to use multiple copies of the devices at each stage to increase reliability. However, this is to be done under a cost constraint. Let ci be the cost of each unit of device Di and let c be the cost constraint. Then the objective is to maximize the reliability under the condition that the total cost of the system, ∑ ci mi, does not exceed c. Mathematically, we can write
maximize ∏ (1 ≤ i ≤ n) φi(mi)
subject to ∑ (1 ≤ i ≤ n) ci mi ≤ c
mi ≥ 1 and integer, for 1 ≤ i ≤ n.
We can assume that each ci > 0, so each mi must be in the range 1 ≤ mi ≤ ui, where
ui = ⌊ ( c − ∑ (j = 1 to n, j ≠ i) cj ) / ci ⌋.
The dynamic programming approach finds the optimal solution m1, m2, …, mn. An optimal sequence of decisions, i.e. a decision for each mi, results in an optimal solution.
Let fn(c) be the maximum reliability of the system, i.e. the maximum of ∏ (i = 1 to n) φi(mi) subject to the constraints ∑ (1 ≤ i ≤ n) ci mi ≤ c and 1 ≤ mi ≤ ui for 1 ≤ i ≤ n. Suppose we take a decision on the value of mn from {1, 2, …, un}. Then the values of the remaining mi, 1 ≤ i ≤ n−1, can be chosen in such a way that ∏ φi(mi) for 1 ≤ i ≤ n−1 is maximized under the cost constraint c − cn mn.
Thus the principle of optimality holds and we can write
fn(c) = max over 1 ≤ mn ≤ un of { φn(mn) · fn−1(c − mn cn) }          (8.1)
Generalizing equation 8.1, we can write
fj(x) = max over 1 ≤ mj ≤ uj of { φj(mj) · fj−1(x − mj cj) }          (8.2)
It is clear that f0(x) = 1 for all x, 0 ≤ x ≤ c. Let Si consist of tuples of the form (f, x), where f = fi(x). There is at most one tuple for each different x that results from a sequence of decisions on m1, m2, …, mi. If there are two tuples (f1, x1) and (f2, x2) such that f1 ≥ f2 and x1 ≤ x2, then (f2, x2) is said to be a dominated tuple and is discarded from Si.
Let us design a three stage system with devices D1, D2 and D3. The costs are Rs 30, Rs 15 and Rs 20 respectively, and the cost constraint of the system is Rs 105. The reliabilities of the devices are 0.9, 0.8 and 0.5 respectively. If stage i has mi devices in parallel, then φi(mi) = 1 − (1 − ri)^mi. We have c1 = 30, c2 = 15, c3 = 20, c = 105, r1 = 0.9, r2 = 0.8 and r3 = 0.5. We can calculate the value of ui for 1 ≤ i ≤ 3:
u1 = ⌊(105 − (15 + 20)) / 30⌋ = ⌊70/30⌋ = 2
u2 = ⌊(105 − (30 + 20)) / 15⌋ = ⌊55/15⌋ = 3
u3 = ⌊(105 − (30 + 15)) / 20⌋ = ⌊60/20⌋ = 3
Then we start with S0 = {(1, 0)}. We can obtain each Si from Si-1 by trying out all possible values of mi and combining the resulting tuples together. Let Si(j) denote the set of tuples obtained by using j copies of device Di at stage i.
S1(1) = {(0.9, 30)},  S1(2) = {(0.99, 60)},  hence S1 = {(0.9, 30), (0.99, 60)}
Considering 1 device at stage 2, we can write S2(1) as follows:
S2(1) = {(0.9 × 0.8, 30 + 15), (0.99 × 0.8, 60 + 15)} = {(0.72, 45), (0.792, 75)}
Considering 2 devices of D2 in stage 2, the reliability of stage 2 is φ2(m2) = 1 − (1 − 0.8)² = 0.96 and the cost at stage 2 is 2 × 15 = 30. Hence, we can write
S2(2) = {(0.9 × 0.96, 30 + 30), (0.99 × 0.96, 60 + 30)} = {(0.864, 60), (0.9504, 90)}
The tuple (0.9504, 90) is removed because it leaves only Rs 15, whereas the third stage needs at least Rs 20. Now, considering 3 devices of D2 in stage 2, the reliability of stage 2 is φ2(m2) = 1 − (1 − 0.8)³ = 1 − 0.008 = 0.992. Hence, we can write
S2(3) = {(0.9 × 0.992, 30 + 45), (0.99 × 0.992, 60 + 45)} = {(0.8928, 75), (0.98208, 105)}
The tuple (0.98208, 105) is discarded as it leaves no cost for stage 3. Combining S2(1), S2(2) and S2(3), we get
S2 = {(0.72, 45), (0.864, 60), (0.8928, 75)}
where the tuple (0.792, 75) has been discarded because it is dominated by (0.864, 60).
Now, we can compute S3(1), assuming 1 device at stage 3:
S3(1) = {(0.72 × 0.5, 45 + 20), (0.864 × 0.5, 60 + 20), (0.8928 × 0.5, 75 + 20)}
      = {(0.36, 65), (0.432, 80), (0.4464, 95)}
If there are 2 devices at stage 3, then φ3(m3) = 1 − (1 − 0.5)² = 0.75, and we can write
S3(2) = {(0.72 × 0.75, 45 + 40), (0.864 × 0.75, 60 + 40), (0.8928 × 0.75, 75 + 40)}
      = {(0.54, 85), (0.648, 100)}   (the tuple (0.8928 × 0.75, 115) is discarded as the cost constraint is 105)
If there are 3 devices at stage 3, then φ3(m3) = 1 − (1 − 0.5)³ = 1 − 0.125 = 0.875. Hence, we can write
S3(3) = {(0.72 × 0.875, 45 + 60)} = {(0.63, 105)}
Combining S3(1), S3(2) and S3(3) and discarding the dominated tuples, we can write S3 as
S3 = {(0.36, 65), (0.432, 80), (0.54, 85), (0.648, 100)}
The best design has reliability 0.648 and a cost of Rs 100. Now we can trace back to find the number of devices at each stage. The tuple (0.648, 100) comes from S3(2), i.e. with 2 devices at stage 3; thus m3 = 2. It was derived from the tuple (0.864, 60), which comes from S2(2), i.e. with 2 devices at stage 2; thus m2 = 2. The tuple (0.864, 60) was in turn derived from the tuple (0.9, 30) in S1(1), i.e. with 1 device at stage 1; thus m1 = 1.
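The tuple-based computation above can be sketched in Python as follows. This is a minimal illustration rather than textbook code: it builds the sets Si stage by stage, drops tuples that leave too little budget for the remaining stages, and discards dominated tuples. On the three-stage example it reproduces the design m1 = 1, m2 = 2, m3 = 2 with reliability 0.648 at cost Rs 100.

def reliability_design(c, cost, r):
    # cost[i], r[i]: cost and reliability of device D(i+1); c: total cost constraint.
    # Returns (best reliability, its cost, list of copies m_i per stage).
    n = len(cost)
    u = [(c - (sum(cost) - cost[i])) // cost[i] for i in range(n)]   # upper bounds u_i
    S = [(1.0, 0, [])]                        # tuples (reliability f, cost x, copies so far)
    for i in range(n):
        candidates = []
        for f, x, m in S:
            for mi in range(1, u[i] + 1):
                new_cost = x + mi * cost[i]
                # keep a tuple only if the remaining budget can still buy
                # at least one copy of every later device
                if new_cost + sum(cost[i + 1:]) <= c:
                    phi = 1 - (1 - r[i]) ** mi
                    candidates.append((f * phi, new_cost, m + [mi]))
        # discard dominated tuples: (f2, x2) is dominated if some (f1, x1)
        # has f1 >= f2 and x1 <= x2
        candidates.sort(key=lambda t: (t[1], -t[0]))
        S, best_f = [], -1.0
        for f, x, m in candidates:
            if f > best_f:
                S.append((f, x, m))
                best_f = f
    return max(S, key=lambda t: t[0])

print(reliability_design(105, [30, 15, 20], [0.9, 0.8, 0.5]))
# -> reliability ≈ 0.648 at cost 100 with copies [1, 2, 2]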
Bellman Ford Algorithm
In the previous chapter, it was observed that Dijkstra's algorithm finds the shortest path from one node to another in a graph G = (V, E) where the weights of the links are positive. However, if there are negative weight edges, Dijkstra's algorithm may fail to find the shortest path from one node to another. The Bellman-Ford algorithm solves the single source shortest paths problem in this more general case. Let us consider the graph G = (V, E) shown in Fig. 7.10, and assume that node 1 is the source and nodes 2 and 3 are destinations. Using Dijkstra's algorithm, we would compute the shortest paths to node 2 and node 3 as 5 and 7 respectively, whereas that is not actually the case: the shortest path from 1 to 3 is 1→2→3, with length 2. This can be computed by the Bellman-Ford algorithm.
Before applying the Bellman-Ford algorithm, we assume that negative weight edges are permitted but that there is no negative weight cycle. This is necessary to ensure that the shortest paths consist of a finite number of edges. In Fig. 7.11, the path from 1 to 3 is 1→2→1→2→…→1→2→3 and the path length is −∞. When there are no negative weight cycles, the shortest path between any two nodes of an n-node graph contains at most n−1 edges.
Let dist^l[u] be the length of the shortest path from the source node v to the node u under the constraint that the path contains at most l edges. Then dist^1[u] = cost[v, u] for 1 ≤ u ≤ n. As discussed above, if there is no negative weight cycle, then the shortest path contains at most n−1 edges. Hence, dist^(n−1)[u] is the length of the shortest path from v to u. Our goal is to compute dist^(n−1)[u], and this can be done using the dynamic programming approach. We can make the following observations:
(i) If the shortest path from v to u with at most k (k > 1) edges has no more than k−1 edges, then dist^k[u] = dist^(k−1)[u].
(ii) If the shortest path from v to u with at most k (k > 1) edges has exactly k edges, then it is made up of a shortest path from v to some vertex j followed by the edge <j, u>. The path from v to j has k−1 edges and its length is dist^(k−1)[j]. All vertices i such that <i, u> ∈ E are candidates for j. Since we are interested in a shortest path, the i that minimizes dist^(k−1)[i] + cost[i, u] is the correct value for j.
These observations give the recurrence
dist^k[u] = min{ dist^(k−1)[u], min over <i, u> ∈ E of { dist^(k−1)[i] + cost[i, u] } }.
The Bellman ford algorithm is presented in Algorithm 7.6.
Algorithm BELLMAN_FORD(v, cost, dist, n)
// v is the source, cost is the adjacency matrix representing the cost of the
// edges, dist stores the distance to all nodes, n is the number of nodes.
{
1    for (i = 1; i <= n; i++)
2        dist[i] = cost[v][i];
3    for (k = 2; k <= n-1; k++) {
4        for (each u such that u ≠ v and each edge <j, u> ∈ E)
5            if (dist[u] > dist[j] + cost[j][u])
6                dist[u] = dist[j] + cost[j][u];
7    } // end for k
}
Algorithm 7.6 Bellman-Ford Algorithm.
The Bellman-Ford algorithm starts by computing the one-edge path length from the source to each node i (lines 1-2). Since there are n nodes in the graph, a shortest path can contain at most n−1 edges, so the relaxation step is repeated n−2 more times. In each pass, for every node u ≠ v and every edge <j, u> ∈ E, dist[u] is compared with dist[j] + cost[j][u]; if dist[u] is greater than dist[j] + cost[j][u], then dist[u] is updated to dist[j] + cost[j][u].
Let us consider the example network shown in Fig. 7.12.
Fig. 7.12 The example graph and its adjacency matrix
The time complexity of the Bellman-Ford algorithm is O(ne). Lines 1-2 take O(n) time. If the graph is stored as an adjacency list, then lines 4-6 take O(e) time, so lines 3-7 take O(ne) time. Therefore, the overall time complexity of the Bellman-Ford algorithm is O(ne).
Stepwise working procedure of
Bellman Ford Algorithm
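A minimal Python sketch of Algorithm 7.6 is given below. Since the figures are not reproduced in this text, the sketch uses a small assumed graph whose weights are chosen to match the discussion of Fig. 7.10: Dijkstra's algorithm would report distances 5 and 7 from the source, while the true shortest distance to the last vertex is 2 via the negative weight edge.

import math

def bellman_ford(cost, v):
    # cost[i][j] is the weight of edge <i, j> (math.inf if absent); v is the source.
    # Returns dist[], assuming the graph has no negative weight cycle.
    n = len(cost)
    dist = [cost[v][i] for i in range(n)]
    dist[v] = 0
    for _ in range(2, n):                     # passes k = 2 .. n-1
        for u in range(n):
            if u == v:
                continue
            for j in range(n):                # every edge <j, u> entering u
                if dist[u] > dist[j] + cost[j][u]:
                    dist[u] = dist[j] + cost[j][u]
    return dist

INF = math.inf
cost = [[0, 5, 7],                            # assumed graph with one negative edge
        [INF, 0, -3],
        [INF, INF, 0]]
print(bellman_ford(cost, 0))                  # -> [0, 5, 2]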
Assembly Line Scheduling
Assembly lines for an automobile manufacturing factory
An automobile company has two assembly lines as shown in Fig. 7.13. The automobile chassis enters an assembly line, goes through all stations of the assembly line, and the completed auto exits at the end of the assembly line. Each assembly line has n stations, and the jth station of the ith line is denoted as Sij. The time required at station Sij is aij, and the time required to travel from one station to the next station on the same line is negligible. Normally, a chassis entering a line goes through all the stations of that line. However, in overload situations, the factory manager has the flexibility to switch a partially completed auto from a station of one line to the next station of the other line. The time required to transfer the partially completed auto from station Sij to the other line is tij, where i = 1, 2 and j = 1, 2, …, n−1. The objective is to choose some stations from line 1 and some stations from line 2 so that the total time to manufacture an auto is minimized.
If there are many stations in the assembly lines, a brute force search takes much time to determine the stations through which the auto should be assembled: there are 2^n possible ways to choose stations from the assembly lines, so determining the fastest way by enumeration takes O(2^n) time, which is infeasible when n is large. This problem can be solved efficiently by the dynamic programming technique.
The first step of the dynamic programming technique is to characterize the structure of an optimal solution. Since there are 2 assembly lines with n stations each, the computation of the time to reach the 1st station of either assembly line is straightforward. However, there are two choices for j = 2, 3, …, n in each assembly line. First, the chassis may come from station S1,j-1 and then move directly to S1,j, since the time to move from one station to the next on the same assembly line is negligible. Second, the chassis may come from station S2,j-1 and then be transferred to station S1,j with a transfer time t2,j-1. If the fastest way through station S1,j is through station S1,j-1, then there must be a fastest way from the starting point through station S1,j-1. Similarly, if the fastest way through station S1,j is through station S2,j-1, then the chassis must have taken a fastest way from the starting point through station S2,j-1.
Thus the optimal solution to the problem can be found by solving the optimal solutions of the sub-problems, that is, the fastest way to either S1,j-1 or S2,j-1. This is referred to as the optimal structure of the assembly line scheduling problem.
Suppose we have found the fastest way through station j−1 on both line 1 and line 2. Then the fastest way through station S1,j is either
• the fastest way to S1,j-1 and then directly through station S1,j, or
• the fastest way to S2,j-1, a transfer from line 2 to line 1, and then through station S1,j.
Similarly, the fastest way through station S2,j is either
• the fastest way to S2,j-1 and then directly through station S2,j, or
• the fastest way to S1,j-1, a transfer from line 1 to line 2, and then through station S2,j.
Let fi[j] be the fastest possible time to get a chassis from the starting point through station Si,j of assembly line i. The chassis goes directly to the first station of each line, so
f1[1] = e1 + a1,1    and    f2[1] = e2 + a2,1
Now we can compute fi[j] for 2 ≤ j ≤ n and 1 ≤ i ≤ 2 as
f1[j] = min{ f1[j−1] + a1,j,  f2[j−1] + t2,j−1 + a1,j }
f2[j] = min{ f2[j−1] + a2,j,  f1[j−1] + t1,j−1 + a2,j }
If the chassis goes all the way through station n of either line 1 or line 2 and then exits, we have
f1[n] = e1 + a1,1                                              if n = 1
f1[n] = min{ f1[n−1] + a1,n,  f2[n−1] + t2,n−1 + a1,n }        otherwise
f2[n] = e2 + a2,1                                              if n = 1
f2[n] = min{ f2[n−1] + a2,n,  f1[n−1] + t1,n−1 + a2,n }        otherwise
Let the fastest time to get the chassis all the way through the factory be denoted by f*. Then
f* = min{ f1[n] + x1,  f2[n] + x2 }
Now we can determine the stations through which the chassis must move to deliver the end product in minimum time. This is calculated with a backward approach. Let l* denote the line whose station n is used in a fastest way through the entire process: if f1[n] + x1 ≤ f2[n] + x2 then l* = 1, else l* = 2. Then let li[j] denote the line number (1 or 2) whose station j−1 is used in a fastest way through station Si,j. li[1] need not be calculated, since there is no station preceding station Si,1. If f1[j−1] + a1,j ≤ f2[j−1] + t2,j−1 + a1,j then l1[j] = 1, otherwise l1[j] = 2. Similarly, if f2[j−1] + a2,j ≤ f1[j−1] + t1,j−1 + a2,j then l2[j] = 2, otherwise l2[j] = 1. The algorithm for the assembly line scheduling problem is presented in Algorithm 7.7.
Illustration of the assembly line scheduling procedure:
f1[1] = e1 + a1,1 = 2 + 4 = 6
f2[1] = e2 + a2,1 = 3 + 7 = 10
f1[2] = min{f1[1] + a1,2, f2[1] + t2,1 + a1,2} = min{6 + 7, 10 + 2 + 7} = 13
f2[2] = min{f2[1] + a2,2, f1[1] + t1,1 + a2,2} = min{10 + 2, 6 + 2 + 2} = 10
The times required at all the stations of both assembly lines are shown above. The minimum time required to assemble an auto is 33, and the assembly line stations through which the complete auto is assembled are shown above.
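The recurrences above translate directly into the Python sketch below. The first two stations reuse the values from the illustration (e1 = 2, a1,1 = 4, e2 = 3, a2,1 = 7, a1,2 = 7, a2,2 = 2, t1,1 = t2,1 = 2); the third station and the exit times are hypothetical, so the printed answer applies to this assumed instance and not to the figure's value of 33.

def fastest_way(a, t, e, x):
    # a[i][j]: time at station j of line i; t[i][j]: transfer time after station j of line i;
    # e[i], x[i]: entry and exit times of line i (all indices 0-based).
    # Returns (fastest total time f*, exit line l*, (l1, l2) tables).
    n = len(a[0])
    f1, f2 = [0] * n, [0] * n
    l1, l2 = [0] * n, [0] * n
    f1[0] = e[0] + a[0][0]
    f2[0] = e[1] + a[1][0]
    for j in range(1, n):
        # fastest way into station S1,j: stay on line 1 or transfer from line 2
        if f1[j - 1] + a[0][j] <= f2[j - 1] + t[1][j - 1] + a[0][j]:
            f1[j], l1[j] = f1[j - 1] + a[0][j], 1
        else:
            f1[j], l1[j] = f2[j - 1] + t[1][j - 1] + a[0][j], 2
        # fastest way into station S2,j: stay on line 2 or transfer from line 1
        if f2[j - 1] + a[1][j] <= f1[j - 1] + t[0][j - 1] + a[1][j]:
            f2[j], l2[j] = f2[j - 1] + a[1][j], 2
        else:
            f2[j], l2[j] = f1[j - 1] + t[0][j - 1] + a[1][j], 1
    if f1[n - 1] + x[0] <= f2[n - 1] + x[1]:
        return f1[n - 1] + x[0], 1, (l1, l2)
    return f2[n - 1] + x[1], 2, (l1, l2)

a = [[4, 7, 3], [7, 2, 6]]                    # processing times (third station hypothetical)
t = [[2, 3], [2, 1]]                          # transfer times (second values hypothetical)
e, x = [2, 3], [3, 2]                         # entry times from the text, exit times assumed
print(fastest_way(a, t, e, x))                # -> (17, 1, ([0, 1, 2], [0, 1, 2]))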
MODULE -
III
• Lecture 21 -
Data Structure for Disjoint Sets
• Lecture 22 -
Disjoint Set Operations, Linked list Representation
• Lecture 23 -
Disjoint Forests
• Lecture 24 -
Graph Algorithm - BFS and DFS
• Lecture 25 -
Minimum Spanning Trees
• Lecture 26 -
Kruskal algorithm
• Lecture 27 -
Prim's Algorithm
• Lecture 28 - Single
Source Shortest paths
• Lecture 29 -
Bellmen Ford Algorithm
• Lecture 30 -
Dijkstra's Algorithm
Lecture – 21 Disjoint Set
Data Structure
In computing, a disjoint-set data structure, also called a union–find data structure or merge–
find set, is a data structure that keeps track of a set of elements
partitioned into a number of disjoint (non-overlapping) subsets.
It supports the following useful operations:
• Find: Determine which subset a particular element is in. Find typically returns an item from this
set that serves as its "representative"; by comparing the result of
two Find operations, one can
determine whether two elements are in the same subset.
• Union: Join two subsets into a single subset.
• Make Set, which makes a set containing only a given element (a
singleton), is generally trivial. With these three operations, many practical
partitioning problems can be solved.
In order to define
these operations more precisely, some way of representing the sets is needed.
One common approach is to select a fixed element of each set, called its representative, to represent the set as
a whole. Then, Find(x) returns the
representative of the set that x
belongs to, and Union takes two set
representatives as its arguments.
Examples:
• partitioning of a set.
• the Boost Graph Library uses a disjoint-set structure to implement its Incremental Connected Components functionality.
• implementing Kruskal's algorithm to find the minimum spanning tree of a graph.
• determining the connected components of an undirected graph.
CONNECTED-COMPONENTS(G)
1. for each vertex v ∈ V[G]
2.     do MAKE-SET(v)
3. for each edge (u, v) ∈ E[G]
4.     do if FIND-SET(u) ≠ FIND-SET(v)
5.         then UNION(u, v)

SAME-COMPONENT(u, v)
1. if FIND-SET(u) = FIND-SET(v)
2.     then return TRUE
3.     else return FALSE
Lecture 22 - Disjoint Set
Operations, Linked list Representation
• A disjoint-set data structure maintains a collection S = {S1, S2, …, Sk} of distinct dynamic sets.
• Each set is identified by a member of the set, called the representative.
Disjoint set operations
–
MAKE-SET(x):
create a new set with only x. assume x is not already in some other set.
–
UNION(x,y): combine the two sets containing x and y into one new set. A new representative is selected.
–
FIND-SET(x):
return the representative of the set containing x.
Linked list Representation
• Each set is represented as a linked list with head and tail pointers; each node contains the value, a next-node pointer and a back-to-representative pointer.
•
Example:
•
MAKE-SET costs O(1):
just create a single element list.
•
FIND-SET costs O(1):
just return back-to-representative pointer.
UNION Implementation
• A simple implementation: UNION(x, y) just appends x's list to the end of y's list and updates all back-to-representative pointers in x's list to the head of y.
• Each UNION takes time linear in the length of x's list.
• Suppose n MAKE-SET(xi) operations (O(1) each) are followed by n-1 UNIONs:
– UNION(x1, x2), O(1),
– UNION(x2, x3), O(2),
– …
– UNION(xn-1, xn), O(n-1)
• The UNIONs cost 1 + 2 + … + (n-1) = Θ(n²).
So the 2n-1 operations cost Θ(n²), i.e. Θ(n) each on average.
Lecture 23 - Disjoint Forests
•
Three operations
–
MAKE-SET(x):
create a tree containing x. O(1)
–
FIND-SET(x):
follow the chain of parent pointers until to the root. O(height of x’s
tree)
–
UNION(x,y): let the root of one tree point to
the root of the other. O(1)
• It is possible that n-1 UNIONs result in a tree of height n-1 (just a linear chain of n nodes).
• So n FIND-SET operations can cost O(n²).
Union by Rank & Path Compression
• Union by Rank: each node is associated with a rank, which is an upper bound on the height of the node (i.e., the height of the subtree rooted at the node). During UNION, the root with smaller rank is made to point to the root with larger rank.
• Path Compression: used in the FIND-SET(x) operation; every node on the path from x to the root is made to point directly to the root, thus reducing the tree height.
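A minimal Python sketch of a disjoint-set forest with union by rank and path compression is given below; the element names in the usage lines are arbitrary.

class DisjointSet:
    # Disjoint-set forest with union by rank and path compression.
    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0

    def find_set(self, x):
        # path compression: point every node on the path directly at the root
        if self.parent[x] != x:
            self.parent[x] = self.find_set(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:     # union by rank: smaller rank points to larger
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1

ds = DisjointSet()
for v in "abcde":
    ds.make_set(v)
ds.union("a", "b"); ds.union("c", "d"); ds.union("b", "d")
print(ds.find_set("a") == ds.find_set("c"))   # -> True: a and c are now in the same set
print(ds.find_set("e") == ds.find_set("a"))   # -> False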
Lecture 24 - Graph
Algorithm - BFS and DFS
In graph
theory, breadth-first
search (BFS) is a strategy for searching in a graph when search
is limited to essentially two operations:
(a) visit and
inspect a node of a graph;
(b) gain access
to visit the nodes that neighbor the currently visited node.
• The BFS
begins at a root node and inspects all the neighboring nodes.
• Then for
each of those neighbor nodes in turn, it inspects their neighbor nodes which
were unvisited, and so on.
• Compare BFS with the equivalent, but more memory-efficient, iterative deepening depth-first search.
Historical Background
• BFS was invented in the late 1950s by E. F. Moore, who used it to find the shortest path out of a maze.
• It was discovered independently by C. Y. Lee as a wire routing algorithm (published in 1961).
Example
A BFS search will visit the
nodes in the following order: A, B, C, E, D, F, G
BFS Algorithm
The algorithm uses a queue data structure
to store intermediate results as it traverses the graph, as follows:
1. Enqueue the
root node
2. Dequeue a node
and examine it
•
If the element sought is found in this node, quit the
search and return a result.
•
Otherwise enqueue any successors (the direct child nodes)
that have not yet been discovered.
3. If the queue is
empty, every node on the graph has been examined – quit the search and return
"not found".
4. If the queue is
not empty, repeat from Step 2.
Applications
Breadth-first search can be used
to solve many problems in graph theory, for example:
•
Copying garbage collection (Cheney's algorithm)
•
Finding the shortest path between two nodes u and v (with path length
measured by number of edges)
•
Testing a graph for bipartiteness
•
(Reverse) Cuthill–McKee mesh numbering
•
Ford–Fulkerson method for computing the maximum flow in a flow network
•
Serialization/Deserialization of a binary tree vs serialization in
sorted order, allows the tree to be re-constructed in an efficient manner.
Pseudo Code
Input: A graph G and a root v of G
1    procedure BFS(G, v) is
2        create a queue Q
3        create a set V
4        add v to V
5        enqueue v onto Q
6        while Q is not empty loop
7            t ← Q.dequeue()
8            if t is what we are looking for then
9                return t
10           end if
11           for all edges e in G.adjacentEdges(t) loop
12               u ← G.adjacentVertex(t, e)
13               if u is not in V then
14                   add u to V
15                   enqueue u onto Q
16               end if
17           end loop
18       end loop
19       return none
20   end BFS
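A runnable Python version of the above pseudocode is sketched below. The adjacency lists form an assumed graph (the example figure is not reproduced in this text), chosen so that a BFS started at A visits the vertices in the order quoted in the example above.

from collections import deque

def bfs(graph, root):
    # graph: dict mapping a vertex to the list of its neighbours.
    # Returns the vertices in the order BFS visits them.
    visited = {root}                  # the set V of the pseudocode
    queue = deque([root])             # the queue Q of the pseudocode
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)               # 'examine' t (a real search would test it here)
        for u in graph[t]:
            if u not in visited:      # enqueue only undiscovered successors
                visited.add(u)
                queue.append(u)
    return order

graph = {"A": ["B", "C", "E"], "B": ["A", "D", "F"], "C": ["A", "G"],
         "D": ["B"], "E": ["A", "F"], "F": ["B", "E"], "G": ["C"]}
print(bfs(graph, "A"))                # -> ['A', 'B', 'C', 'E', 'D', 'F', 'G']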
Time and space complexity
The time complexity can be expressed as O(|V| + |E|) [3], since every vertex and every edge will be explored in the worst case. Note that |E| may vary between O(1) and O(|V|²), depending on how sparse the input graph is.
When the number of vertices in the graph is known ahead of time, and additional data structures are used to determine which vertices have already been added to the queue, the space complexity can be expressed as O(|V|), where |V| is the cardinality of the set of vertices. If the graph is represented by an adjacency list it occupies O(|V| + |E|) space in memory, while an adjacency matrix representation occupies O(|V|²).
Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. One starts at the root (selecting some arbitrary node as the root in the case of a graph) and explores as far as possible along each branch before backtracking.
Historical
Background
A version of depth-first search
was investigated in the 19th century by French mathematician Charles Pierre Trémaux
Example
A DFS search will visit the
nodes in the following order: A, B, D, F, E, C, G
Pseudo Code
Input: A graph G
and a vertex v of G
Output: All vertices reachable from v labeled as discovered
A recursive implementation of DFS
1    procedure DFS(G, v):
2        label v as discovered
3        for all edges from v to w in G.adjacentEdges(v) do
4            if vertex w is not labeled as discovered then
5                recursively call DFS(G, w)

A non-recursive implementation of DFS
1    procedure DFS-iterative(G, v):
2        let S be a stack
3        S.push(v)
4        while S is not empty
5            v ← S.pop()
6            if v is not labeled as discovered:
7                label v as discovered
8                for all edges from v to w in G.adjacentEdges(v) do
9                    S.push(w)
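Runnable Python versions of both procedures are sketched below, using the same assumed graph as in the BFS sketch; the recursive version visits the vertices in the order quoted in the DFS example above.

def dfs(graph, v, discovered=None, order=None):
    # Recursive DFS; graph is a dict of adjacency lists.
    # Returns the vertices in the order they are discovered.
    if discovered is None:
        discovered, order = set(), []
    discovered.add(v)                 # label v as discovered
    order.append(v)
    for w in graph[v]:                # all edges from v to w
        if w not in discovered:
            dfs(graph, w, discovered, order)
    return order

def dfs_iterative(graph, v):
    # Stack-based DFS equivalent to the non-recursive pseudocode above.
    discovered, order, stack = set(), [], [v]
    while stack:
        v = stack.pop()
        if v not in discovered:
            discovered.add(v)
            order.append(v)
            for w in graph[v]:
                stack.append(w)
    return order

graph = {"A": ["B", "C", "E"], "B": ["A", "D", "F"], "C": ["A", "G"],
         "D": ["B"], "E": ["A", "F"], "F": ["B", "E"], "G": ["C"]}
print(dfs(graph, "A"))                # -> ['A', 'B', 'D', 'F', 'E', 'C', 'G']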
Applications
•
Finding connected components.
•
Topological sorting.
•
Finding the bridges of a graph.
•
Generating words in order to plot the Limit Set of a Group.
•
Finding strongly connected components.
•
Planarity testing
•
Solving puzzles with only one solution, such as mazes. (DFS can be
adapted to find all solutions to a maze by only including nodes on the current
path in the visited set.)
•
Maze generation may use a randomized depth-first search.
•
Finding bi-connectivity in graphs.