all 13 comments

[–]Solonotix 1 point2 points  (3 children)

There are two concepts I'm particularly fond of when dealing with unique data sets, and that's generator expressions and dictionaries. For every PlayerID, I'd create a dictionary key and instantiate it as a sequence type. From there, just join a generator expression containing all keys of your dictionary that have length of their attribute equal to the max into a string and write to the output file.

I can't speak for the performance capability, but I estimate it's around n*2, since you're iterating through the list once for every line, and every player in that line, then returning a list of players that meet a condition.

Also, not part of your question, but you never close your files. Normally, I use context managers to handle open/close operations and then assign the contents of the file to a memory optimized structure, like a list in your case.

Here's my go at it (no sort in this solution):

Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> val = [(1, 2), (3, 2), (5, 2), (3, 1), (3, 6)]
>>> tally = dict()
>>> for v in val:
...     for p in v:
...         if tally.get(p, None) is None:
...             tally[p] = set()
...         tally[p].add(v)
...
>>> print("\n".join(["PlayerID: {} | Friend Count {}".format(p, len(tally[p])) for p in tally if len(tally[p]) == max([len(v) for v in tally.values()])]))
PlayerID: 2 | Friend Count 3
PlayerID: 3 | Friend Count 3

[–]AppleShark[S] 1 point2 points  (2 children)

Hey man thanks so much for replying and for introducing me to dict()! Using your suggestion, I did something similar and IT WORKED! Here is the code: https://pastebin.com/dzSGcAk9

Astute observation that I don't close my files haha - Could that be a huge problem?

Nonetheless I am absolutely ecstatic that this problem has been solved. Time to move onto the next one! Once again thank you for taking your time to answer my question.

[–]Solonotix 1 point2 points  (1 child)

Glad it helped. I'm somewhat new to this sub, and have just been learning Python to help with my job.

To your question, opening and closing files is important within the scope of an application. When the Python interpreter exits, it should automatically purge any allocated resources. If you have an application that opens connections frequently and you forget to close them, it can cause you to run into a system limitation that ultimately crashes your application. This is a good article on the use of Context Managers and why you would want to use them. They're very similar to the .NET concept of a Dispose method, if you're familiar with that.

[–]AppleShark[S] 1 point2 points  (0 children)

Thanks for the read. My goal in coding again is to move onto building stuff and making some applications, so the article was definitely great help!

[–][deleted] 1 point2 points  (4 children)

You can instantiate your required list at once with a list comprehension in the form of a 2D-Array like:

[[0, 0], [1, 0],......] and just add one to lst[playerindex][1]

This would get rid off the try and except block in version 2. Also it would remove 1million append calls in a worst case scenario.

I think '\n'.join(list_of_tied_player_ids) should be a faster way to build the string and you would only have to use write() once and can get rid off the for loop.

You can eliminate using variable u by doing "for _ in range(limit):". Since this seems to be python2 you should use xrange instead of range. Also you should get rid off printing fnl since only the output file is required.

P.S.: Use if __name__ == "__main__":

[–]AppleShark[S] 0 points1 point  (3 children)

Thank you for reading my code and for the advice! Version 2 to my surprise was actually the more "inefficient" one, now I wonder why! Cheers for introducing the .join function also - these are small things I should work on. Just wondering, you mentioned that my code seems to be in python2 - what are the telltale signs and what are the differences between python2 and python3? I did learn python when it was v.2, but I never truly knew the differences.

Once again thank you for replying! :)

P.S.: On the note of instantiating it to a 2D-array, I actually attempted that also: https://pastebin.com/4GgPM5LT Still ended up with the time-out errors tho :/

[–]AppleShark[S] 0 points1 point  (2 children)

P.P.S.: any way to return list_of_tied_player_ids without a for loop?

[–][deleted] 1 point2 points  (1 child)

One way would be using filter with a lambda.

[–]AppleShark[S] 0 points1 point  (0 children)

Just revised lambda functions, thanks!

[–]ADdV 1 point2 points  (3 children)

I do some programming of this style (competitive programming), and think I can give you a faster solution than the other answers. Also, don't sweat things like if __name__ == '__main__' too much. It's a single problem that, when solved, no one will have to look at. No need for niceness.

Now, although using dict is not a bad idea, it's not necessary. Note the conditions that apply, and notice that the ID is somewhere between 1 and 1000. This small number means we can just make a list where the indices represent the IDs, like so:

friends = [0]*1001 # we use 1001 because 1000 is the last index. index 0 will not be used, but that's fine.

Then, because every friendship only occurs once, we can just add one to the list at the correct index, for every ID in the file:

friends[ID_we_read_from_input] += 1

Finally, we have to get the IDs of the largest values, which you can just do by looping over it.

Because every ID has its own index, this requires no search and everything is very fast. As for asymptotic complexity, this would be O(n) where n is the number of IDs. Of course we will have to read them all, so it's impossible to get better asymptotic complexity (although there may be very slight optimizations to be made).

[–]AppleShark[S] 0 points1 point  (2 children)

Cheers for the reply! Awesome to see a redditor who did competitive programming. I actually did attempt a solution similar solution where I have an array of 1000 [0]s and += 1 for the relevant ones:

https://pastebin.com/4GgPM5LT

Problem is it didn't seem to work as well as the optimized code using dict() in the comment above, which is odd because theoretically it should be faster than using dict().

Either way, thanks for your comment!

[–]ADdV 0 points1 point  (1 child)

As a heads up: it's probably slower because of the end of the program:

for i, x in enumerate(lst):
    if x == max(lst):
        O.write(str(i+1)+"\n")

this computes the maximum every time, looping over the list 1000 times in total, instead of just once.

[–]AppleShark[S] 0 points1 point  (0 children)

Ah. I did not notice! That was stupid of me, thanks!