Cyclic Sort Programming Pattern

Dev Staging
4 min readMay 20, 2021
Photo by Jerry Zhang on Unsplash

(Still only a 55% complete draft. But this still covers everything you need to know about the pattern. Please ignore typos and grammatical errors.)

Introduction

This pattern describes an interesting way to sort arrays in place with a time complexity of O(N). We can use this pattern in problems which provides us with input arrays containing numbers in a given range.

Example problem description:

You are given an unsorted array containing numbers taken from the range 1 to ’n’. The array can have duplicates, which means that some numbers will be missing. Find all the missing numbers.

To efficiently solve this problem, we can use the fact that the input array contains numbers in the range of 1 to ’n’. For example, to efficiently sort the array, we can try placing each number in its correct place, i.e., placing ‘1’ at index ‘0’, placing ‘2’ at index ‘1’, and so on. Once we are done with the sorting, we can iterate the array to find all indices that are missing the correct numbers. These will be our required numbers.

This article dissects the algorithm to sort and find missing or duplicate numbers (and more) in an array, by breaking the algorithm into two easy to understand steps.

Problems

Level 0 - Tutorial/Intro

  1. Sort 1-n array

Level 1 - Novice

  1. Missing Number
  2. Set Mismatch

Level 2 - Intermediate

  1. Find all numbers disappeared in an array
  2. Find the duplicate number (Also a Fast and Slow Pointer Pattern Problem)
  3. Find all duplicates in an array

Level 3 - Master

  1. First Missing Positive
  2. Corrupt Pair

Level 4 - Leeted

  1. Find the first K missing positive numbers

Pattern Identification

Hints that give away that it’s a cyclic sort pattern problem.

  1. Mention of “numbers in the range [0, n]” or, “contains numbers from 1 to n” in the problem description. (This is a blatant giveaway!)
  2. You’re asked to find the missing or duplicate numbers in the specified range.
  3. O(1) space and O(N) time constraints.

Pattern Approach

The cyclic sort pattern usually contains the following 2 main steps.

1. Sort the input array in place. O(N)

Given below is the code to sort an integer array -> int[] nums

int pos = 0; //start indexwhile(pos < nums.length){
//This condition changes based on problem description
if(nums[pos] < nums.length && nums[pos] != nums[nums[pos]]){
swap(nums, pos, nums[pos]);
}else{
pos++;
}
}

Explanation:

if(nums[pos] < nums.length && nums[pos] != nums[nums[pos]])

This line asks 2 questions and I’ll explain why this is essential for understanding the pattern.

Condition 1

  • nums[pos] < nums.length
    Is the value at nums[pos] is less than the highest index in the array?

We have to account for this condition because the failure of accounting for this condition will result in an ArrayIndexOutOfBoundsException as the index may not exist when we try to swap it.

For ex -> Let’s assume that the input array is [2, 14, 3, 1]

If we try to swap “nums[1] (14)” with the nums[14] then we will get an ArrayIndexOutOfBoundsException as the highest index of the above array is 3.

Condition 2

  • nums[pos] != nums[nums[pos]]
    Is the value at current index the same as the value in the position it should be?
    Also can be understood as asking if the nums[pos] != pos
    However, phrasing it this way nums[pos] != nums[nums[pos]], prevents the scenario where we keep swapping duplicate numbers. Causing the program to timeout.

For ex -> Let’s assume that the input array is [3, 2, 0, 3, 4]

If we swap “nums[0] (3)” with the nums[3] then the resultant array will be the same as before swapping. Hence causing the current index to still be wrong. And we will be stuck in a loop of swapping 3 with 3, until the program timeouts.

Finally, we increment the position “pos++”, when the value at the current index is correctly placed, or cannot be swapped due to duplicate or other conditions etc.

2. Look for the duplicate/missing value. O(N)

Given below is the code to find the duplicate/missing in the sorted array.

for(int j = 0; j < nums.length; j++){
if(nums[j] != j){
return j; (missing)
return nums[j]; (duplicate)
}
}

Explanation:

We loop through the indexes to make sure the value at the position matches.

If the value at the position doesn’t match, then the duplicate value is the value that’s on the incorrect position.
Reason: Because it’s actual position is occupied by the duplicate.

If the value at the position doesn’t match, then the missing value is the index.
Reason: Because the index wasn’t filled by the value that matches the index.

Conclusion

I hope that you have a better idea on how to identify and solve problems which ask you to find the missing/duplicates in a range of numbers.

Go ahead and rock all your interviews that require this pattern!

--

--

Dev Staging

Passionate about making the world a better place by cultivating relationships and developing technologies.