There are some greedy algorithms in this section that could use revision.
Example 1:
Input: [3, 1, 5, 12, 2, 11], K = 3
Output: [5, 12, 11]
Example 2:
Input: [5, 12, 11, -1, 12], K = 3
Output: [12, 11, 12]
Implementation: KLargestElements
We can use a max heap and get the top K.
Each heapify()
uses O(logn)
and there are N
elements. So time complexity, is O(NlogK)
.
O(K)
Example 1:
Input: [1, 5, 12, 2, 11, 5], K = 3
Output: 5
Explanation: The 3rd smallest number is '5', as the first two smaller numbers are [1, 2].
Example 2:
Input: [1, 5, 12, 2, 11, 5], K = 4
Output: 5
Explanation: The 4th smallest number is '5', as the first three small numbers are [1, 2, 5].
Example 3:
Input: [5, 12, 11, -1, 12], K = 3
Output: 11
Explanation: The 3rd smallest number is '11', as the first two small numbers are [5, -1].
Implementation: KthSmallestNumber
We can use a min-heap to store the numbers and keep polling till we get the Kth smallest number.
O(nlogn)
O(n)
Example 1:
Input: points = [[1,2],[1,3]], K = 1
Output: [[1,2]]
Explanation: The Euclidean distance between (1, 2) and the origin is sqrt(5).
The Euclidean distance between (1, 3) and the origin is sqrt(10).
Since sqrt(5) < sqrt(10), therefore (1, 2) is closer to the origin.
Example 2:
Input: point = [[1, 3], [3, 4], [2, -1]], K = 2
Output: [[1, 3], [2, -1]]
Dist = sqrt( (x^2) + (y^2) )
We calculate all the distances, store them in a min-heap, and return the K
closest points to origin.
O(N log K)
O(N)
Example 1:
Input: [1, 3, 11, 5]
Output: 33
Explanation: First connect 1+3(=4), then 4+5(=9), and then 9+11(=20). So the total cost is 33 (4+9+20)
Example 2:
Input: [3, 4, 5, 6]
Output: 36
Explanation: First connect 3+4(=7), then 5+6(=11), 7+11(=18). Total cost is 36 (7+11+18)
Example 3:
Input: [1, 3, 11, 5, 2]
Output: 42
Explanation: First connect 1+2(=3), then 3+3(=6), 6+5(=11), 11+11(=22). Total cost is 42 (3+6+11+22)
Implementation: ConnectRopes
Follow a greedy approach to connect the smallest ropes first.
O(nlogn)
O(n)
Example 1:
Input: [1, 3, 5, 12, 11, 12, 11], K = 2
Output: [12, 11]
Explanation: Both '11' and '12' apeared twice.
Example 2:
Input: [5, 12, 11, 3, 11], K = 2
Output: [11, 5] or [11, 12] or [11, 3]
Explanation: Only '11' appeared twice, all other numbers appeared once.
Implementation: TopKFrequentNumbers
We need to find the frequency of each number, we can do this using a map. Once we have the frequencies, we can use a max-heap to find the top k frequent numbers. We can also sort by the frequency values and return the top k.
O(nlogk)
O(n)
Example 1:
Input: "Programming"
Output: "rrggmmPiano"
Explanation: 'r', 'g', and 'm' appeared twice, so they need to appear before any other character.
Example 2:
Input: "abcbab"
Output: "bbbaac"
Explanation: 'b' appeared three times, 'a' appeared twice, and 'c' appeared only once.
Implementation: StringSort
Similar to previous, we can use a max heap but instead of adding the char to the result-list, we repeat the char the frequency number of time.
O(nlogn)
O(n)
Example 1:
Input: [5, 6, 7, 8, 9], K = 3, X = 7
Output: [6, 7, 8]
Example 2:
Input: [2, 4, 5, 6, 9], K = 3, X = 6
Output: [4, 5, 6]
Example 3:
Input: [2, 4, 5, 6, 9], K = 3, X = 10
Output: [5, 6, 9]
Implementation: KClosestNumbers
X
using binary search.K
closest numbers will be adjacent to the closest number.X
.K
numbers from the heap.O(logn)
for binary search. O(nlogn)
for the heap operations. So total of O(nlogn)
.
O(n)
if we need to store all numbers in the heap.
Example 1:
Input: [7, 3, 5, 8, 5, 3, 3], and K=2
Output: 3
Explanation: We can remove two occurrences of 3 to be left with 3 distinct numbers [7, 3, 8], we have to skip 5 because it is not distinct and appeared twice.
Another solution could be to remove one instance of '5' and '3' each to be left with three distinct numbers [7, 5, 8], in this case, we have to skip 3 because it appeared twice.
Example 2:
Input: [3, 5, 12, 11, 12], and K=3
Output: 2
Explanation: We can remove one occurrence of 12, after which all numbers will become distinct. Then we can delete any two numbers which will leave us 2 distinct numbers in the result.
Example 3:
Input: [1, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5], and K=2
Output: 3
Explanation: We can remove one occurrence of '4' to get three distinct numbers.
Implementation: MaxDistinctElements
The trick to this question is figuring out the greedy approach to the solution.
So, there could be two groups of numbers in the array:
Now, we are allowed K
removals, so we take the first non-distinct number with the min frequency and see if we can remove all but one occurences. If we cant, aka, we dont have enough K's, the current count of distinct numbers is the solution. If we can we add that to our count of distinct numbers and keep doing the same for all non-distinct numbers.
O(n)
for frequency counting. O(nlogn)
for heap operations. So, total of O(nlogn)
.
O(n)
for the map.
Example 1:
Input: [1, 3, 12, 5, 15, 11], and K1=3, K2=6
Output: 23
Explanation: The 3rd smallest number is 5 and 6th smallest number 15. The sum of numbers coming
between 5 and 15 is 23 (11+12).
Example 2:
Input: [3, 5, 8, 7], and K1=1, K2=4
Output: 12
Explanation: The sum of the numbers between the 1st smallest number (3) and the 4th smallest
number (8) is 12 (5+7).
Implementation: SumOfElements
Taking example, Input: [1, 3, 12, 5, 15, 11], and K1=3, K2=6
, the sorted form of the array is [1, 3, 5, 11, 12, 15]
. The third smallest number is 5 and the 6th smallest is 15. Between them, we have 11 and 12 which sums up to 23.
So essentially, we need to keep track of the k1
smallest number, the k2
smallest number and the sum of all numbers in between.
We can take a min heap and keep polling till k1=0
. After that, we sum up the numbers till k2
becomes 0. The result sum is what we want.
O(nlogn)
O(n)
Example 1:
Input: "aappp"
Output: "papap"
Explanation: In "papap", none of the repeating characters come next to each other.
Example 2:
Input: "Programming"
Output: "rgmrgmPiano" or "gmringmrPoa" or "gmrPagimnor", etc.
Explanation: None of the repeating characters come next to each other.
Example 3:
Input: "aapa"
Output: ""
Explanation: In all arrangements of "aapa", atleast two 'a' will come together e.g., "apaa", "paaa".
Implementation: RearrangeStrings
The greedy algorithm here is:
O(nlogn)
O(n)
Example 1:
Input: "mmpp", K=2
Output: "mpmp" or "pmpm"
Explanation: All same characters are 2 distance apart.
Example 2:
Input: "Programming", K=3
Output: "rgmPrgmiano" or "gmringmrPoa" or "gmrPagimnor" and a few more
Explanation: All same characters are 3 distance apart.
Example 3:
Input: "aab", K=2
Output: "aba"
Explanation: All same characters are 2 distance apart.
Example 4:
Input: "aappa", K=3
Output: ""
Explanation: We cannot find an arrangement of the string where any two 'a' are 3 distance apart.
Implementation: RearrangeStringKDistApart
There are two considerations here: 1. Repeating characters. 2. And a certain number of counts till a character cannot be used, once it has been used.
<Char, 0>
.index+k
, when it will be next pulled to be added to the result.O(nlogn)
O(n)
Example 1:
Input: [a, a, a, b, c, c], K=2
Output: 7
Explanation: a -> c -> b -> a -> c -> idle -> a
Example 2:
Input: [a, b, a], K=3
Output: 5
Explanation: a -> b -> idle -> idle -> a
Implementation: SchedulingTasks
We can make the following observations from the question: 1. The tasks may execute in any order. 2. Each task takes 1 CPU interval. 3. If no tasks can run on a particular interval, the CPU simply remains idle for that interval. 4. We need to optimize for the min number of CPU intervals.
Since we need to optimize for the min number of CPU intervals, we need to interleave different tasks so we don't 'waste' CPU cycles. If we didn't needed to optimize for CPU intervals, we could have just picked a task and waited for it to finish, then move onto the next one and so on.
First, we need to know how many times a task needs to execute. This we can do by keeping a map of task -> frequency
.
Next, we need to keep track of a couple of things:
1. The current CPU interval. This we can do with a index
variable.
2. The interval on which the next task should run.
But how do we know which task should run on a particular interval, if any should run at all. The second concern is how do we identify the most immediate task that should be run.
We can keep a min-heap
of the task and the interval on which it should run. There can be two scenarios then:
1. The heap tells us that there is a task whose run interval is the current CPU interval or a CPU interval before the current one.
2. The heap tells us that there is a task whose run interval is greater than the current CPU interval and thus, cannot be run yet.
Thus, our algorithm becomes:
1. Create a frequency map of task->count
of the times the task should be run.
2. Create a min-heap of the task task and the interval at which it should run. We can start all tasks with interval=0
, aka, every task has a shot to run on the first CPU interval.
3. Whenever, we run a task, we put the task and interval back to the heap for the next run, iff, the frequency of the task is >0
.
4. We continue running till the heap runs out of tasks.
O(nlogn)
O(1)
push(int num): Pushes the number ‘num’ on the stack.
pop(): Returns the most frequent number in the stack. If there is a tie, return the number which was pushed later.
Example:
After following push operations: push(1), push(2), push(3), push(2), push(1), push(2), push(5)
1. pop() should return 2, as it is the most frequent number
2. Next pop() should return 1
3. Next pop() should return 2
Implementation: FrequencyStack
1 -> 2
2 -> 3
3 -> 1
5 -> 1
t0 t1 t2 t3 t4 t5 t6
push(1), push(2), push(3), push(2), push(1), push(2), push(5)
Taking the example, the expected output is:
1. Since there are 3 2's on the first pop()
call, we return 2. Now frequency of 2 is reduced to 2.
2. On the second pop()
call, we have both 1 and 2 with frequency 2. Since there's a clash, the input time takes over. We popped the second last 2 in the first pop()
call, and 1 was pushed after the 4th last push for 2, so we return 1.
Thus, we observe: 1. There are two conditions for popping. 2. If frequencies mismatch, we return the one with the higher frequency. 3. If frequencies match, we return the one with the higher sequence number.
We ofcourse are going to use a max heap. But how to design the comparator?
We need to keep track of the frequency, which we can do using a map. So our first comparison would be based on that frequency. Next, we need to keep a linked list of times (or sequence numbers), linked list to maintain order of insertion. As we pop elements, we need to keep removing from this list. So our second comparison (if frequencies are equal), would be based on the last element of the list.
A second, probably simpler, way could be to keep insert the map entries as we get new elements. So for instance, we get
push -> 2, 2, 1, 2, 1
first, map will have key -> 2, freq -> 1, seq -> 1. we insert this combination into heap as well.
next, map will have key -> 2, freq -> 2, seq -> 2. we insert this combination into heap as well.
at this point, the map has one element with key 2 but the heap has 2 elements, making our comparator simpler.
O(nlogn)
for both operations.
O(n)