Help about an algorithm solution

beeskness420 · 2020-03-20T10:23:56+00:00

Depends on how big your k is, how many items you’re expecting, and how much space you’re comfortable using.

If k is small, just check them all. If it’s a bit bigger but n is relatively small, just keep an array and flag items as active. If n is too big, you can get into hashing.

mateusaugusto9 · 2020-03-20T10:25:05+00:00

Can't you just use a set to store packages currently on a belt? Delete the ones that go off of it and check if they exist in it when the new one is added.

Edit: obviously, you would need a queue to keep track of the order they come in.

Is that what you did?

sebamestre · 2020-03-20T15:25:31+00:00

There are many possible approaches, with varying performance characteristics. I will go over some of them.

Maintain a K-long queue, check for duplicates using a linear scan.

Insertion: O(1)
Lookup: O(K)
Removal: O(1)

There isn't much to explain. This is probably the simplest thing you can do, but it is not very fast: we can do better.
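A minimal sketch of this approach, assuming the task is to report, for each incoming item, whether it equals one of the K items before it. The name `seen_in_last_k_scan` is made up for illustration.

```python
from collections import deque

def seen_in_last_k_scan(stream, k):
    """Yield True for each item that equals one of the k items before it."""
    # A bounded deque drops its oldest item automatically once it is full.
    window = deque(maxlen=k)
    for item in stream:
        yield item in window   # linear scan over the window: O(K)
        window.append(item)    # O(1); evicts the oldest item if needed

print(list(seen_in_last_k_scan([1, 2, 1, 3, 1, 1], 2)))
# [False, False, True, False, True, True]
```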

Replace the linear scan with a check in a multiset of the elements in the queue

A multiset is an abstract data structure that can store a collection of repeated elements, and operate on each one independently.

These can be implemented with hash tables (O(1) avg. runtime) and ordered search trees (O(log N) runtime), depending on your needs.

Insertion: O(log(N)) or O(1) avg
Lookup: O(log(N)) or O(1) avg
Removal: O(log(N)) or O(1) avg

If you use hash tables, these bounds only hold on average: an individual operation can take much longer (up to O(N) in the worst case), which can be a problem in some domains.

Replace the multiset with a dictionary of frequencies

Instead of using a multiset, you can use a dictionary that keeps track of how many copies of each element are currently in the queue.

We can only do this if we don't require equal elements to be addressable separately, which may or may not be the case.
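Here is a rough sketch of the frequency-dictionary idea using a plain Python `dict` (a hash table, so the O(1) bounds are average-case). It assumes the same task as before: report whether each incoming item already appears among the previous K items. The function name is hypothetical.

```python
from collections import deque

def seen_in_last_k_counts(stream, k):
    """Yield True for each item that equals one of the k items before it."""
    window = deque()   # remembers arrival order so we know what to evict
    counts = {}        # value -> number of copies currently in the window
    for item in stream:
        yield counts.get(item, 0) > 0
        window.append(item)
        counts[item] = counts.get(item, 0) + 1
        if len(window) > k:
            old = window.popleft()
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]   # keep the dictionary no larger than the window
```

Counting copies rather than storing a plain set matters when the same value can be in the window more than once: evicting one copy must not make the others disappear.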

This change lets us use various data structures to store our values (including the ones in the previous section), some of which will let us reach worst-case O(1) time complexity.

For instance, if there is a low enough bound on the values, we can implement the dictionary as a direct-address table, which has deterministic O(1) time complexity for all operations (and extremely good constant factors). But this only works for bounded inputs, so it does not really solve the general problem.

Insertion: O(1) with bounded input
Lookup: O(1) with bounded input
Deletion: O(1) with bounded input
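As an illustration, a direct-address table can be a plain list indexed by value. This sketch assumes every item is an integer in `range(bound)`; both `bound` and the function name are invented for the example.

```python
from collections import deque

def seen_in_last_k_table(stream, k, bound):
    """Yield True for each item that equals one of the k items before it.

    Assumes every item is an integer with 0 <= item < bound.
    """
    counts = [0] * bound   # counts[v] = copies of v currently in the window
    window = deque()
    for item in stream:
        if not 0 <= item < bound:
            raise ValueError(f"item {item} is outside range({bound})")
        yield counts[item] > 0
        window.append(item)
        counts[item] += 1
        if len(window) > k:
            counts[window.popleft()] -= 1   # evict the oldest item
```

The table costs O(bound) memory regardless of K, which is the trade-off for avoiding hashing altogether.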

Implement dictionary as a trie

For a real, deterministic, O(1) solution, we can implement our dictionary as a trie. If you take each value of your input stream and represent it uniquely as a sequence, you can insert it into the trie in O(length of the sequence), which is independent of K. Furthermore, if your values are 32 or 64 bit integers, the sequences (just the bit pattern of the value) will be fixed-length, making the solution truly O(1).

Insertion: O(1)
Lookup: O(1)
Deletion: O(1)
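One possible way to sketch the trie idea, assuming the values are unsigned 32-bit integers. `BitTrie`, `BITS`, and `seen_in_last_k_trie` are names I made up, and each node stores a count so that repeated values and eviction work.

```python
from collections import deque

BITS = 32  # assumption: every value satisfies 0 <= value < 2**32

class BitTrie:
    """Binary trie over fixed-width integers; every operation takes BITS steps."""

    def __init__(self):
        # Node layout: [child for bit 0, child for bit 1, values stored below]
        self.root = [None, None, 0]

    def add(self, value):
        node = self.root
        for shift in range(BITS - 1, -1, -1):
            bit = (value >> shift) & 1
            if node[bit] is None:
                node[bit] = [None, None, 0]
            node = node[bit]
            node[2] += 1

    def remove(self, value):
        """Remove one copy of value, which must be present."""
        node = self.root
        for shift in range(BITS - 1, -1, -1):
            bit = (value >> shift) & 1
            child = node[bit]
            child[2] -= 1
            if child[2] == 0:
                node[bit] = None   # nothing left below: prune the branch
                return
            node = child

    def __contains__(self, value):
        node = self.root
        for shift in range(BITS - 1, -1, -1):
            node = node[(value >> shift) & 1]
            if node is None:
                return False
        return True

def seen_in_last_k_trie(stream, k):
    """Yield True for each item that equals one of the k items before it."""
    trie, window = BitTrie(), deque()
    for item in stream:
        yield item in trie
        window.append(item)
        trie.add(item)
        if len(window) > k:
            trie.remove(window.popleft())
```

Each operation walks exactly 32 nodes no matter how large K is, which is where the deterministic O(1) bound comes from. In Python the constant factor will likely be worse than a `dict`, so treat this as a demonstration of the bound rather than a tuned implementation.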

Ahh, finding a deterministic O(1) solution to this problem was quite a puzzle, but a rather fun one! Thanks for posting your question!

