Remove Duplicates — The Guest List Cleanup · DSA Quest

6Remove Duplicates — The Guest List Cleanup

Welcome back

You've seen four different flavors of two pointers so far: symmetric (Boss 1), steered (Boss 2), composed (Boss 3), and greedy (Boss 4 and 5). Today we unlock the last big shape in this dungeon.

This one is called fast-slow, or sometimes "reader-writer." Both pointers start at the left. One walks ahead and scans. The other sits back and writes. They chase each other instead of meeting in the middle. It sounds unusual but it's the clean way to rewrite an array in place.

Today's boss: The Guest List Cleanup.

The story

Mira's dinner party is back on after the rainstorm. She hands you the guest list.

guests: ["Ada", "Ada", "Ben", "Cara", "Cara", "Cara", "Dan", "Dan", "Eve"]

It's already alphabetized. But whoever typed it up hit Ctrl+V one too many times and every name got duplicated. Your job is to clean it up in place.

"Do it on this same sheet of paper," she says. "I don't want another list. Just fix this one and tell me how many unique guests there are."

You look at the list. You could write the clean version on a new sheet, but she said no new sheet. You could erase and slide everything down, but that's a mess. There's a better way.

The problem, dressed up properly

Given a sorted integer array nums, remove the duplicates in place such that each unique element appears only once. The relative order of the elements should be kept the same. Return the number k of unique elements.

Do not allocate a new array. You must do this by modifying the input array in place with O(1) extra memory.

The first k positions of nums should contain the unique elements in their original order. The elements beyond position k don't matter.

Key phrases: sorted, in place, O(1) extra memory. Those three together are what force the fast-slow pattern.

The naive way

If the "no new array" rule didn't exist, you'd do this:

def remove_duplicates(nums):
    unique = []
    for x in nums:
        if not unique or x != unique[-1]:
            unique.append(x)
    nums[:len(unique)] = unique
    return len(unique)

Works. But that unique = [] is a second array, and the problem said no. We have to clean the list using only the list itself.

The move: fast-slow pointers

Here's the trick.

Slow pointer (let's call it write) points at the next position to write a unique value into.
Fast pointer (let's call it read) walks ahead looking for the next new value.

Start with both at index 0. Actually, start write = 1 and read = 1, because index 0 is always a keeper — it's the first element and has nothing to be a duplicate of.

Then:

For each read, compare nums[read] to nums[write - 1] (the last unique value we wrote).
If they differ, we found a new unique value. Copy it: nums[write] = nums[read]. Then step write forward.
If they're the same, nums[read] is a duplicate. Skip.
Either way, step read forward.

When read reaches the end, the first write elements of nums are the unique values in order, and write itself is the count.

Why this only works when the array is sorted

If the array isn't sorted, duplicates can appear anywhere, and checking nums[read] == nums[write - 1] won't catch them. The sorted order packs all copies of the same value into consecutive positions. That's what makes the "different from the previous kept value" check sufficient.

Watching it work

nums: [1, 1, 2, 2, 2, 3, 4, 4, 5]

Let's trace it by hand.

nums:  [1, 1, 2, 2, 2, 3, 4, 4, 5]
index:  0  1  2  3  4  5  6  7  8

Start: write=1, read=1

read=1 (val 1), nums[write-1]=nums[0]=1 → same, skip, read++
read=2 (val 2), nums[write-1]=nums[0]=1 → different, write it
   nums[1] = 2
   nums becomes [1, 2, 2, 2, 2, 3, 4, 4, 5]
   write=2, read++
read=3 (val 2), nums[write-1]=nums[1]=2 → same, skip
read=4 (val 2), nums[write-1]=nums[1]=2 → same, skip
read=5 (val 3), nums[write-1]=nums[1]=2 → different
   nums[2] = 3
   nums becomes [1, 2, 3, 2, 2, 3, 4, 4, 5]
   write=3, read++
read=6 (val 4), nums[write-1]=nums[2]=3 → different
   nums[3] = 4
   nums becomes [1, 2, 3, 4, 2, 3, 4, 4, 5]
   write=4, read++
read=7 (val 4), nums[write-1]=nums[3]=4 → same, skip
read=8 (val 5), nums[write-1]=nums[3]=4 → different
   nums[4] = 5
   nums becomes [1, 2, 3, 4, 5, 3, 4, 4, 5]
   write=5, read++

read=9, stop

return write = 5

The first 5 elements of nums are [1, 2, 3, 4, 5]. That's the clean list. The garbage after index 4 doesn't matter — the problem said to ignore it.

The code

def remove_duplicates(nums: list[int]) -> int:
    if not nums:
        return 0
 
    write = 1
    for read in range(1, len(nums)):
        if nums[read] != nums[write - 1]:
            nums[write] = nums[read]
            write += 1
 
    return write

That's the whole function. Seven lines with the guard. Read it twice. The loop walks read through every index, and only write moves when we find something new.

Why is this correct?

Two invariants keep the whole thing honest:

nums[0..write-1] always contains the unique values we've kept so far, in order. True at the start (just one element). True after each step: we either skip (no change) or we write a value that's different from the last kept value, which preserves the "unique and sorted" property.
nums[write..read-1] is garbage we're allowed to overwrite. These are indices we've already read from, so writing to them won't lose information. Importantly, write <= read always, so we never write past where we've read.

The second invariant is what makes the in-place write safe. Since write only advances when read has advanced at least once, write can never overtake read. You're always writing into a slot whose original value has already been processed.

Fast-slow is the in-place sibling of map

If you squint, write is what a functional filter would build up, and read is the iterator. The only difference is fast-slow uses the same array instead of a new one. Same idea, zero allocation.

Why is this faster?

Both versions are O(n) time. The difference is space:

	With a new array	Fast-slow two pointers
Time	O(n)	O(n)
Extra space	O(n)	O(1)
Allocations	1 new list	Zero

For a million-element array, the new-array version allocates a million integers of scratch memory. Fast-slow allocates nothing. Same Big-O, very different real-world cost.

Gotchas

1. Starting both pointers at 0. If you start write = 0, read = 0, the first comparison tries to read nums[write - 1] = nums[-1], which in Python is the last element. You'll get wrong answers. Start at write = 1, read = 1.

2. Forgetting the empty-array case. An empty input should return 0, but the write = 1 initialization assumes at least one element. Guard with if not nums: return 0.

3. Applying this to an unsorted array. Don't. The nums[read] != nums[write - 1] check only catches adjacent duplicates. If the array isn't sorted, you'll miss duplicates that are separated. Sort first, or use a hash set instead.

4. Returning the elements instead of the count. The problem wants the count k. The array is the side effect. Read the problem statement carefully before you hit submit.

5. Trying to "delete" from the middle of a list. Python lets you del nums[i], but doing that inside a loop is O(n) per call and changes the indices underneath you. Don't. Just use fast-slow and let the garbage sit at the end.

The fifth shape of two pointers

Here's every shape we've seen in this dungeon, all in one place:

Symmetric (Boss 1) — both ends, both move every step. Palindromes, reversal.
Steered (Boss 2) — both ends, only one moves per step based on comparison. Sorted-array target sums.
Composed (Boss 3) — an outer loop picks an anchor, inner two pointers handle the rest. Triples and quadruples.
Greedy (Boss 4 and 5) — both ends, and a proof that advancing the weaker side can't miss the optimum. Maximum area, trapped water.
Fast-slow (Boss 6) — both at the left, one reads and one writes. In-place array rewriting.

Every two-pointer problem you'll ever meet is a variation on one of these five shapes. Knowing which shape a problem wants is half the work.

Dungeon cleared

That's it. Dungeon 2 is done. You cleared six bosses and walked away with a weapon that applies to palindromes, target sums, triples, max area, trapped rain, and in-place cleanups. And the shape of the pattern stays consistent across all of them.

Take a minute to appreciate that. Most algorithmic patterns only solve one narrow family of problems. Two pointers solves at least five. When you see an array or a string, "could two pointers help here?" is one of the first questions worth asking.

Boss down. Dungeon down. XP gained.

What you walked away with from Dungeon 2 as a whole:

Five shapes of two pointers, with a clear mental model for when each applies
A handful of classic problems you now understand deeply instead of having memorized
The "sorted input" reflex — when you see sorted, two pointers goes on the list
A habit of asking "which side can I safely skip?" instead of "which pair should I check next?"

Next dungeon: Stack. Different weapon, different shape. Whenever a problem has a "latest thing opened is the first thing closed" flavor, you'll be reaching for a stack. Balanced parentheses, monotonic structures, expression evaluation, and some surprising sorting tricks. The good news: you don't need the two-pointer brain for any of it. The better news: you get to learn a fresh tool from scratch.

See you in Dungeon 3.