Friday, April 27, 2018

Segment Tree




Segment tree is basically taking the recursion pattern of merge sort and turning it into data structure. So, before we start with segment tree we have to know how merge sort works. If we see the diagram below;



So, what we are doing here is- we are recursing down the array by diving it in half and then we are recursing back up and merging the sorted array. On each level while recursing down it take linear amount of time because we are loop over the array and diving in half which takes linear amount of time, then while recursing up while combing the sorted list in liniear amount of time. So we can use this information to analyze mergesort runtime. If we see the width of the array we get at the end of splitting the array , we get the width as n so, that’a O(n) and the height of the array is log n because we are splitting the array in half each time and log of something basically means how many time we can diving that in half. Now if we combine both width and height we get the area a nlog n that means the complexity cof merge sort is  O(nlog n).
Now that we know how merge sort works the question is -how can we turn this into data structure? What we’ll do is we’ll consider each array of the merge sort as different nodes and use the nodes to form a tree. So, I am taking the sequence of the recursion and turning it into data structure. The tree will look like below:






 
So, as we can see our array has 6 element so the root node consists of all the elements from 0-5 then we are splitting the array in half so the left node consists of elements 0-2 and right node consists of elements 3-5 then keep diving those till the end so the tree looks like below

:


If you notice, I have drawn the same tree as before but here instead of putting the segments of array I’ve just put the range of those segments . So, segment tree is useful when we are solving range query type of problem. For example, If you are given an array and you have to find out minimum element of a particular range, let’s say the array is :

If we try to solve this using linear search the if we have n element and 1 query time complexity will be O(n*1) that means O(n).So, if we have q queries the complexity will be O(nQ) . Now if n<=10^5 and q<=10^5 and the time limit is 1second, with linear search we’ll get a TLE. But if we use segment tree, that we’ll help us reducing the time complexity. For this problem we’ll draw a similar tree like before.



But in each node we’ll save the minimum element of the entire range that node. So, the root node will contain minimum element of the range [0-5] and likewise other nodes will contain minimum element of it’s range. And we’ll start storing  the minimum value from the leaf nodes because the range size of the leaf is 1 from each node, so the element of that index itself. First we’ll store the number 8 node where range is 0-0 so we have to just put the element of 0th index which is 1, then in range 1-1 we’ll store element of index 1 which is 3, likewise we’ll fill all the leaf nodes with there values. Now we’ll move on to the parent nodes of the leafs. First we’ll store the minim of range [0-1] which is the minimum of range [0,0] and [1,1] that is 1, then minimum of range [3-4] is minimum of range[3.3] and [3.4] which is -2. Then we’ll store the minimum of range [3-5] which is -2 , minimum of range [0-2] is 1 and minimum of range [0-5] is -2. Now that we’ve build our tree .At first we’ve filled the leaf nodes and then in a bottom up way we’ve filled the rest of the nodes.
Now we’ll see how to build this tree using code. We will use an array to build this tree. If you have noticed I have numbered the nodes in the tree and where 10 and 11 is missing because node 5 can’t be divided further. The numbering of the nodes is done so that we can make a relationship between parent and child nodes. So, node 1’s children are 2 and 3, node 2’s children are 4 and 5, node 3’s children are 6 and 7. Can you see a pattern here? When we are getting node i it’s children are 2*I and 2*i+1 and if we have children and want to find out it’s parent then if the child node is i, the parent node will be i/2 because for children 2,3 we get parent 1 likewise children 4,5 we get parent 2 . By using these notation we can store the children node in the array. So, the array for our tree we’ll look like this:



The nodes are not in the tree I have kept those indexes empty. So for n elements we’ll need total   2n-1  nodes in the tree but when my allocating array that is larger than 2n-1 because we re also taking some extra indexes which are empty, so the exact size of the array will be 2*(2^ceil(log n))+1   4*n+1, this is the maximum size needed to build a tree of size n.
Function for  building the tree is:






Now that we have made the segment tree, we have to do the query in the tree. In this case we are finding the minimum element in a range. So, when the query comes, there will be three cases .


To know the first case let’s see an example- If user does a query for range [0,5] then in the tree the root node is the range [0,5] and it contains -2 that means we’ve got the output -2. Case is:








1)This is complete overlap because the  node range [0,5] is completely inside the query range[0,5], so in complete overlap we just have to return tree[index], that’s how we get -2 as output

To know the second case and third case let’s see an example- If user does a query for range [0,2] then in the tree the root node is the range [0,5] that means we’ve partial overlap Case is:


2)Partial  overlap because the  query  range [0,2] is partially inside the root node  range[0,5], So, in partial overlap we’ll go to both sides left and right. First we’ll go to left in left we get the node range[0,2]and our query range is also [0,2] this is complete overlap so we’ll return the value which this node contains which is 1.


 Now we’ll go to the right node the range of which is [3-5] and query range is [0,2] so we’ve got the third case that is no overlap.
                                        
3) When the query range is outside the node range we get no overlap, in this case we’ll return infinity
So, the minimum of 1 and infinity we’ll be 1. And if you see our input array, the minimum element between [0,2] is also 1 that means we’re getting the right output.

So, the three cases for segment tree is:
          1) Complete overlap where output is element of that node
          2) Partial overlap where we have to find output by going both sides
          3) No overlap, in this case we’ll return infinity

Now let’s see the code for the query:




No comments:

Post a Comment

Introduction To Binary Search Tree

A tree data structure is a way to hold data that looks like a tree when it's visualized. For example: All data points in a tre...