Find Median from Data Stream

hard

Design a data structure that efficiently supports adding integers from a data stream and finding the median of all elements added so far. Implement the MedianFinder class with the following methods:

  • addNum(num) — Adds the integer num to the data structure.
  • findMedian() — Returns the median of all elements added so far. If the count of elements is even, the median is the average of the two middle values.

Examples

Example 1

  • Input: addNum(1), addNum(2), findMedian(), addNum(3), findMedian()
  • Output: 1.5, 2.0
  • Explanation: After adding 1 and 2, the median is (1 + 2) / 2 = 1.5. After adding 3, the sorted list is [1, 2, 3] and the median is 2.0.

Example 2

  • Input: addNum(6), findMedian(), addNum(10), findMedian(), addNum(2), findMedian()
  • Output: 6.0, 8.0, 6.0
  • Explanation: After [6], median is 6.0. After [6, 10], median is 8.0. After [2, 6, 10], median is 6.0.

Constraints

  • -10^5 <= num <= 10^5
  • There will be at least one element before findMedian is called.
  • At most 5 * 10^4 calls will be made to addNum and findMedian.

Approaches

2 worked solutions ranked from straightforward to optimal — each with intuition, algorithm, complexity, and verified code in C++, Java, Python, and Go.

Approach 1

Sorted List with Binary Search Insert

Intuition

The simplest approach is to maintain a sorted list of all numbers. When adding a new number, use binary search to find the correct insertion position and insert it to keep the list sorted. Finding the median is then an O(1) lookup of the middle element(s). However, the insertion itself is O(n) because shifting elements in an array is linear.

Algorithm

  1. Maintain a sorted array of all added numbers.
  2. **addNum:** Use binary search to find the insertion index, then insert the number at that position.
  3. **findMedian:** If the array has odd length, return the middle element. If even, return the average of the two middle elements.
TimeO(n) for `addNum` (binary search is O(log n) but insertion is O(n)); O(1) for `findMedian`SpaceO(n) to store all numbers

Code (C++ · Java · Python · Go)

C++17
class MedianFinder {
    vector<int> sorted;

public:
    MedianFinder() {}

    void addNum(int num) {
        auto it = lower_bound(sorted.begin(), sorted.end(), num);
        sorted.insert(it, num);
    }

    double findMedian() {
        int n = sorted.size();
        if (n % 2 == 1) {
            return sorted[n / 2];
        }
        return (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }
};
Java 21
class MedianFinder {
    private List<Integer> sorted;

    public MedianFinder() {
        sorted = new ArrayList<>();
    }

    public void addNum(int num) {
        int idx = Collections.binarySearch(sorted, num);
        if (idx < 0) idx = -(idx + 1);
        sorted.add(idx, num);
    }

    public double findMedian() {
        int n = sorted.size();
        if (n % 2 == 1) {
            return sorted.get(n / 2);
        }
        return (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }
}
Python 3.12
import bisect


class MedianFinder:
    def __init__(self):
        self.sorted = []

    def addNum(self, num: int) -> None:
        bisect.insort(self.sorted, num)

    def findMedian(self) -> float:
        n = len(self.sorted)
        if n % 2 == 1:
            return float(self.sorted[n // 2])
        return (self.sorted[n // 2 - 1] + self.sorted[n // 2]) / 2.0
Go 1.26
import "sort"

type MedianFinder struct {
    sorted []int
}

func Constructor() MedianFinder {
    return MedianFinder{sorted: []int{}}
}

func (mf *MedianFinder) AddNum(num int) {
    idx := sort.SearchInts(mf.sorted, num)
    mf.sorted = append(mf.sorted, 0)
    copy(mf.sorted[idx+1:], mf.sorted[idx:])
    mf.sorted[idx] = num
}

func (mf *MedianFinder) FindMedian() float64 {
    n := len(mf.sorted)
    if n%2 == 1 {
        return float64(mf.sorted[n/2])
    }
    return float64(mf.sorted[n/2-1]+mf.sorted[n/2]) / 2.0
}
Approach 2

Two Heaps

Intuition

The optimal approach uses two heaps: a max-heap to store the smaller half of the numbers and a min-heap to store the larger half. The max-heap's top gives us the largest of the small numbers, and the min-heap's top gives us the smallest of the large numbers. By keeping the heaps balanced (sizes differ by at most one), the median is always at the top of one or both heaps. This gives O(log n) insertion and O(1) median lookup.

Algorithm

  1. Maintain a max-heap (`lo`) for the lower half and a min-heap (`hi`) for the upper half.
  2. **addNum:** Add the number to `lo` (max-heap). Then move the top of `lo` to `hi` to ensure all elements in `lo` are less than or equal to all elements in `hi`. If `hi` becomes larger than `lo`, move the top of `hi` back to `lo`.
  3. **findMedian:** If heaps have equal size, the median is the average of both tops. If `lo` has one more element, the median is the top of `lo`.
TimeO(log n) for `addNum`; O(1) for `findMedian`SpaceO(n) to store all numbers

Code (C++ · Java · Python · Go)

C++17
class MedianFinder {
    priority_queue<int> lo;                              // max-heap
    priority_queue<int, vector<int>, greater<int>> hi;   // min-heap

public:
    MedianFinder() {}

    void addNum(int num) {
        lo.push(num);
        hi.push(lo.top());
        lo.pop();

        if (hi.size() > lo.size()) {
            lo.push(hi.top());
            hi.pop();
        }
    }

    double findMedian() {
        if (lo.size() > hi.size()) {
            return lo.top();
        }
        return (lo.top() + hi.top()) / 2.0;
    }
};
Java 21
class MedianFinder {
    private PriorityQueue<Integer> lo; // max-heap
    private PriorityQueue<Integer> hi; // min-heap

    public MedianFinder() {
        lo = new PriorityQueue<>(Collections.reverseOrder());
        hi = new PriorityQueue<>();
    }

    public void addNum(int num) {
        lo.offer(num);
        hi.offer(lo.poll());

        if (hi.size() > lo.size()) {
            lo.offer(hi.poll());
        }
    }

    public double findMedian() {
        if (lo.size() > hi.size()) {
            return lo.peek();
        }
        return (lo.peek() + hi.peek()) / 2.0;
    }
}
Python 3.12
import heapq


class MedianFinder:
    def __init__(self):
        self.lo = []  # max-heap (negated values)
        self.hi = []  # min-heap

    def addNum(self, num: int) -> None:
        heapq.heappush(self.lo, -num)
        heapq.heappush(self.hi, -heapq.heappop(self.lo))

        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def findMedian(self) -> float:
        if len(self.lo) > len(self.hi):
            return float(-self.lo[0])
        return (-self.lo[0] + self.hi[0]) / 2.0
Go 1.26
import "container/heap"

type MaxHeap []int

func (h MaxHeap) Len() int            { return len(h) }
func (h MaxHeap) Less(i, j int) bool  { return h[i] > h[j] }
func (h MaxHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *MaxHeap) Push(x interface{}) { *h = append(*h, x.(int)) }
func (h *MaxHeap) Pop() interface{} {
    old := *h
    n := len(old)
    x := old[n-1]
    *h = old[:n-1]
    return x
}

type MinHeap []int

func (h MinHeap) Len() int            { return len(h) }
func (h MinHeap) Less(i, j int) bool  { return h[i] < h[j] }
func (h MinHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *MinHeap) Push(x interface{}) { *h = append(*h, x.(int)) }
func (h *MinHeap) Pop() interface{} {
    old := *h
    n := len(old)
    x := old[n-1]
    *h = old[:n-1]
    return x
}

type MedianFinder struct {
    lo *MaxHeap
    hi *MinHeap
}

func Constructor() MedianFinder {
    lo := &MaxHeap{}
    hi := &MinHeap{}
    heap.Init(lo)
    heap.Init(hi)
    return MedianFinder{lo: lo, hi: hi}
}

func (mf *MedianFinder) AddNum(num int) {
    heap.Push(mf.lo, num)
    heap.Push(mf.hi, heap.Pop(mf.lo))

    if mf.hi.Len() > mf.lo.Len() {
        heap.Push(mf.lo, heap.Pop(mf.hi))
    }
}

func (mf *MedianFinder) FindMedian() float64 {
    if mf.lo.Len() > mf.hi.Len() {
        return float64((*mf.lo)[0])
    }
    return float64((*mf.lo)[0]+(*mf.hi)[0]) / 2.0
}

Frequently asked questions

What is the optimal time complexity of Find Median from Data Stream?

The most efficient approach in this guide ("Two Heaps") runs in O(log n) for `addNum`; O(1) for `findMedian` time and uses O(n) to store all numbers extra space. Earlier brute-force approaches are slower; see all 2 approaches above for the full progression.

What pattern does Find Median from Data Stream use?

Find Median from Data Stream is a hard-level Heap / Priority Queue problem involving Heap / Priority Queue, Design, Sorting. Recognizing the pattern is half the work — see the intuition section above for how to spot it.

Can Find Median from Data Stream be solved without extra space?

The most space-efficient approach in this guide uses O(n) to store all numbers extra space (excluding the input). If you're aiming for in-place — see the trade-off in the algorithm section above — the brute-force approach uses O(n) to store all numbers, while the optimized version uses O(n) to store all numbers.

Which programming languages does this Ratta solution support?

Every approach above ships with verified, runnable solutions in C++, Java, Python, and Go. The starter templates in the workspace match the same four languages so you can practice in your interview language of choice.

Original problem: leetcode.com

Loading editor...
operations=
["MedianFinder","addNum","addNum","findMedian","addNum","findMedian"]
arguments=
[[],[1],[2],[],[3],[]]
[null,null,null,1.5,null,2]