Bucket Sort
Table of Contents
Bucket Sort
Bucket Sort is a non-comparison-based sorting algorithm that distributes elements into a number of “buckets,” sorts each bucket individually (often using another sorting algorithm like insertion sort), and then concatenates the sorted buckets to produce the final sorted array. It works best when the input is uniformly distributed over a range.
Pseudocode for Bucket Sort
BucketSort(arr, n):
// Input: array arr of size n, elements in range [0, 1) or scaled to that range
// Create buckets
buckets = array of k empty lists (k is typically n or a constant)
// Step 1: Distribute elements into buckets
for i from 0 to n-1:
bucket_index = floor(arr[i] * k) // Map element to a bucket
buckets[bucket_index].append(arr[i])
// Step 2: Sort each bucket
for each bucket in buckets:
sort(bucket) // Use a sorting algorithm like insertion sort
// Step 3: Concatenate all buckets
result = empty array
for each bucket in buckets:
result.append(bucket elements)
return result
Example of Bucket Sort
Consider the array: [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
Assume the elements are in the range [0, 1) and we use k = 10 buckets (one for each range [0.0-0.1), [0.1-0.2), ..., [0.9-1.0)).
Step-by-Step Process
-
Create Buckets:
- Initialize 10 empty buckets:
buckets[0], buckets[1], ..., buckets[9].
- Initialize 10 empty buckets:
-
Distribute Elements:
- For each element, compute
bucket_index = floor(element * k). - Example:
0.78:floor(0.78 * 10) = 7→ Add tobuckets[7]0.17:floor(0.17 * 10) = 1→ Add tobuckets[1]0.39:floor(0.39 * 10) = 3→ Add tobuckets[3]- And so on…
- After distribution:
buckets[0] = []buckets[1] = [0.17, 0.12]buckets[2] = [0.26, 0.21, 0.23]buckets[3] = [0.39]buckets[4] = []buckets[5] = []buckets[6] = [0.68]buckets[7] = [0.78, 0.72]buckets[8] = []buckets[9] = [0.94]
- For each element, compute
-
Sort Each Bucket:
- Sort non-empty buckets (e.g., using insertion sort):
buckets[1] = [0.12, 0.17]buckets[2] = [0.21, 0.23, 0.26]buckets[3] = [0.39]buckets[6] = [0.68]buckets[7] = [0.72, 0.78]buckets[9] = [0.94]
- Sort non-empty buckets (e.g., using insertion sort):
-
Concatenate Buckets:
- Combine all buckets in order:
[] + [0.12, 0.17] + [0.21, 0.23, 0.26] + [0.39] + [] + [] + [0.68] + [0.72, 0.78] + [] + [0.94]
- Resulting sorted array:
[0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]
- Combine all buckets in order:
Complexity analysis
- Time Complexity:
- Average case: O(n + k) when elements are uniformly distributed, where
nis the number of elements andkis the number of buckets. - Worst case: O(n²) if all elements fall into one bucket and the bucket sorting algorithm is O(n²).
- Best case: O(n + k) when elements are well-distributed.
- Average case: O(n + k) when elements are uniformly distributed, where
- Space Complexity: O(n + k) for the buckets and result array.
- Stability: Stable if the bucket sorting algorithm (e.g., insertion sort) is stable.
- Advantages: Efficient for uniformly distributed data, especially floating-point numbers in a known range.
- Disadvantages: Requires knowledge of the input range, less effective for non-uniform distributions.
Notes
- For non-floating-point numbers (e.g., integers), scale the input to
[0, 1)by dividing by the maximum value or map to an appropriate range for bucket indexing. - Bucket Sort is often used as a subroutine in algorithms like Radix Sort for sorting integers.
- The number of buckets
kis typically set tonor a constant, depending on the input size and distribution.