Binary Search Trees
Binary Search Trees
Why do we need Binary Search Trees in the first place?
Because there is a need for data structures that allow both
fast search and flexible updates - not just fast search or
flexible updates.
- Unsorted double-linked lists support insertion and deletion in O(1) time but search takes linear time in worst case.
- Sorted arrays support binary search and logarithmic query times but at the cost of linear-time update.
Binary search requires that we have fast access to two elements - the median elements above and below the given node.
- To combine these ideas, we need a “linked list” with two pointers per node.
- And this is the basic idea behind binary search trees.
A rooted binary tree is recursively defined as either being
- empty, or
- consisting of a node called the root, together with two rooted binary trees called the left and right subtrees, respectively.
A binary search tree labels each node in a binary tree with a single key such that for any node labeled x,
- all nodes in the left of the subtree of x have keys < x while
- all nodes in the right subtree of x have keys > x.
- So if we need to find if an item is in our tree, we’d start at the root node and then, if the item we are looking for is greater than the root node, we’d go right.
- And we do it until we find the element we are looking for.
- And we find our element in much lesser hops than linear search.
- In a normal list (or a tree, but really unbalanced), even in a very small list or unbalanced tree, it would take us seven hops instead of three to see if an item we are looking for is in there.
This search tree labeling scheme is very special. For any binary tree on n nodes, and any set of n keys, there is exactly one labeling that makes it a binary search tree.
- Each of the nodes can also point to two rooted binary trees called the left and right subtrees (or one, or none).
Can there be duplicate elements in Binary Search trees?
Universal Definition of a Binary Search Tree involves storing and search for a key based on traversing a data structure in one of two directions. In the pragmatic sense, that means if the value is <>, you traverse the data structure in one of two ‘directions’. So, in that sense, duplicate values don’t make any sense at all.