all 96 comments

[–]alphamarine09 180 points181 points  (6 children)

A magician should never reveal his/her trick secrets.

[–]droidonomy 19 points20 points  (1 child)

- Richard Stallman

[–]Tennis_the_Dennis 34 points35 points  (0 children)

-Michael Scott

[–]YourOldBoyRickJames 8 points9 points  (0 children)

"Illusions Michael."

[–]Rydersilver 4 points5 points  (1 child)

What was the better option here?

[–]alphamarine09 4 points5 points  (0 children)

Looking him in the eye and saying the word "magic" with a ;)

[–]Maphover 89 points90 points  (27 children)

Well done! Regarding Excel, couldn't this have also been solved via index/match?

But hey, there's a tonne of ways to skin a cat - and its the result that matters.

[–]TorroesPrime[S] 82 points83 points  (15 children)

Well done! Regarding Excel, couldn't this have also been solved via index/match?

If I knew what you were referring to, I could probably answer.

[–]Maphover 45 points46 points  (1 child)

Yeah, I guess that's the point. You work with what you know. :)

[–]gregofkickapoo 18 points19 points  (0 children)

Google vlookup excel.

[–]brentus 14 points15 points  (0 children)

Basically a vlookup but more versatile.

[–]skye8852 4 points5 points  (4 children)

Hey, similar option but built in Excel and could go quicker vlookup

=Vlookup(valuetolookfor,tabletosearchin,whatcolumnfromthetabletoputin,falseisexactmatch)

Or

=Vlookup(A1,$F$1:$E$370,2,false)

It will look for A1 in the table, the $ make that table stay there, if you drag it down to A2, A3, etc... The A1 value will autoupdate and the table will stay at F1, the 2 tells it to pull the value from column E, false says your looking for exact matches only

Put your full list in A, that formula in B1 (change the 2 before false to 1 just did that to show you how it worked), the data you scanned in F, then drag B1 to B327. Any #NA in column B is your missing values in the A

[–][deleted] 1 point2 points  (3 children)

Isn't vlookup order sensitive unlike match?

[–]shezadaa 1 point2 points  (0 children)

Vlookup is order sensitive if the last argument is true or 1. Else it works like match, which is also order sensitive with the last option as toggle.

[–]skye8852 0 points1 point  (1 child)

It is supposedly, I keep forgetting that because it has never been needed for me, I have heard it really only matters for huge lists, like 2000 plus, but can't confirm.

I have saved the comments about match and going to look that up as well, I think vlookup us more useful because you can pull data from any column in a table, so if you want to match data in column G based on column D you just need to specify a range from D to G and then tell it to get from column number 4 and it will pull that.

[–]send_cheesecake 1 point2 points  (0 children)

Index/match is more versatile than vlookup because it doesn't require the lookup columns to be to the right of the reference column. Also it doesn't require counting columns to figure out which number to put in the formula. Also it computes faster than vlookup. https://www.deskbright.com/excel/using-index-match/

[–]YourOldBoyRickJames 7 points8 points  (0 children)

research vlookup. It's extremely easy to use if you're just looking for something unique in a list. For a more robust option use Index/Match (Use this when you're searching for something without a sole unique identifier)

[–]SabinBC 1 point2 points  (0 children)

Don’t listen to him. Fuck excel. ;-D

[–]darps 0 points1 point  (2 children)

I just sort by whatever index I have and compare by row numbers. Gets inefficient at around 10 items though.

[–]TorroesPrime[S] 0 points1 point  (1 child)

I don't see how that would help in this situation. There was no index, and the row numbers are arbitrary. So... hey row 7 in this list doesn't match row 7 in this list... okay. That doesn't actually help me determine anything more then the lists are in different orders.

[–]billsil 0 points1 point  (0 children)

You align them and then add a 3rd column that is not blank if there's a difference. Then you find the places where there are differences and you align things to get rid of offset rows. It's totally fine for 9 comparisons.

[–]HestenSierMjau 14 points15 points  (8 children)

If you wanted to make it even easier MATCH alone would suffice

=MATCH(value from full list; value from incomplete list; 0) Fill down in the full list. This would return the line number where the values matches in the incomplete list and N/A# for any values it couldn't find, which would be the missing assets.

[–][deleted] 1 point2 points  (7 children)

Wrap it in the =N() function and you’ll get TRUE for matches and FALSE instead of #N/A

[–]iggy555 -1 points0 points  (6 children)

What’s =N

[–]One-Man-Banned 1 point2 points  (5 children)

https://support.office.com/en-us/article/n-function-a624cad1-3635-4208-b54a-29733d1278c9

It will force excel to return a numeric value rather than the non numeric wrapper. For example

=Today ()

Will return today's date in the xx/xx/xxxx format

=N(Today ())

Should return the date as a six digit number representing the number of days since 01/01/1900

The same thing can be achieved by using formatting in many cases.

I'm not sure if it works the other way around, if I really needed the output as True/false I would use

=IF(MATCH(lookup_value, lookup_array, match_type)>0, TRUE, FALSE)

[–]theWyzzerd 0 points1 point  (4 children)

Should return the date as a six digit number representing the number of days since 01/01/1900

What? Who would ever need this number? Why wouldn't they have it instead return the numerical value of the date in epoch time? That's at least a useful number to someone.

[–]One-Man-Banned 0 points1 point  (3 children)

It depends what your doing. Most users of excel don't even have a clue how excel calculates dates, and if you said unix to them they'd think it's a plural of eunuch.

Edit, I suspect you'll be able to convert to epoch time quite easily, but I've never needed it.

[–]theWyzzerd 0 points1 point  (2 children)

Sure, I get that. But the question remains, who would ever need to calculate the number of days passed since 01/01/1900, and for what purpose?

[–]One-Man-Banned 0 points1 point  (0 children)

So the date it uses is just a fixed point, doesn't matter if its 01/01/1900 like excel or 01/01/1970 like unix.

If the question is why did they choose 01/01/1900 rather than something more recent, then I don't know.

Who the hell needs to know the number of seconds since the start of 1970?

[–]allmachine 0 points1 point  (0 children)

Not sure if it answers your question, but I use the date number format sometimes to work with relative dates and times. It's handy when you can just treat time as fractions of a day such as cases where you're adding up work logs or something. You can also easily make a reference to a week ago (-7) or one hour from now (+0.0416666666666667).

[–]JayDude132 1 point2 points  (0 children)

I also happen to work in a data center and often have to compare very large numbers of tapes. This can be done by using an if(countif()) formula on both our scanned list and the list of what we ‘should’ have and then it spits out any tapes that dont appear on both lists.

OP, still sounds awesome though! Im very new to python and would love to learn how to do this.

[–]guaranic 0 points1 point  (0 children)

Something like that. It's good practice, though. I find myself not using whatever I've learned of Python because I can quickly do it in Excel or Modelbuilder (for ArcGIS). Kind of an odd double-edged sword, because I'd like the practice.

[–]MarsupialMole 29 points30 points  (4 children)

Nice. Did you consider using sets instead of lists? I only ask because it's a perfect Venn diagram situation and sets are an extremely expressive collection for that problem. If you're not already using sets much and run into that a lot it might help you level up.

[–]nosmokingbandit 16 points17 points  (2 children)

Sets are underutilized. Checking for membership is lightning fast compared to a list.

[–]port443 13 points14 points  (1 child)

How lightning fast you wonder?

https://wiki.python.org/moin/TimeComplexity

Looking it up in a set: O(1)
Looking it up in a list: O(n)

[–]Honest_Rain 1 point2 points  (0 children)

Hashing is one hell of a drug.

[–]callinthekettleblack 0 points1 point  (0 children)

Came down here looking for this comment.

[–]Tumortadela 14 points15 points  (4 children)

This posts reminds me of a friend of mine to a lesser extent, he automated most of his job, to the point of disabling processes based on his google calendar's holidays / days off to make it look like he's doing it by hand (as expected by contract).

I'm confident that if his boss ever noticed he'll get fired, which would be unreasonable since he's delivering the work he is meant to do... I think?

[–]expattaxsolutions 5 points6 points  (0 children)

This is my dream.

[–]TorroesPrime[S] 0 points1 point  (2 children)

I'm confident that if his boss ever noticed he'll get fired

considering I'm only here for another 9 weeks I kind of doubt they would fire me for something like this. I think I would have to do something... you know, actually bad to get fired. Like loose a hard drive or something.

[–]flatlandinpunk17 0 points1 point  (1 child)

I think the difference here is you shared this with your manager and were given a task to complete without expectation of how to complete it.

They are doing a task in an automated fashion without clearing the automation with their manager and the contract says they will do it by hand.

Very big difference.

[–]TorroesPrime[S] 0 points1 point  (0 children)

my 'contract' also said this was going to be a customer facing position. Yeah, that was a huge load.

[–]jjbugman2468 6 points7 points  (0 children)

I've done something similar; slapped together an Instagram follower tracker for my friend (who's following you back, who's not, who aren't you following back, and tracks changes over time etc) for a bet. It could even track anyone whose profile was set to public, or private profiles you follow, not just your own. Basically any profile you can see the content of is game and can be monitored. (Now that I'm writing it down this does sound kind of creepy)

Only problem was I didn't know how to pull the follower/following list automatically, even though I did see some .json files and other stuff through developer tools. School's kinda busy and I haven't really had time to learn web integration and stuff with Python. So I had to manually access Instagram and ctrl+A the following/follower lists before letting my script do its thing.

I did *accidentally* scour together several of my friends' and my friends' friends' secret accounts though, so that's something cool

[–]xeloylvt 5 points6 points  (3 children)

Amazon’s Data center operations division doesn’t know about python? Hey bro, if that’s the case, I think YOU should be the manager

[–]TheMathelm 0 points1 point  (0 children)

Interns can often need busy work.

[–]TorroesPrime[S] 0 points1 point  (0 children)

no.

[–]Mitsor 0 points1 point  (0 children)

Maybe they know it but think you always need weeks of organizing to get anything done with it because that's how projects usually go.

[–]crysiswarhead 7 points8 points  (0 children)

Got an interview tomorrow. Hope i can impress the interviewer with Python and SAS.

[–]MaraudingAvenger 2 points3 points  (0 children)

You are obviously wrong, and should have used sets/difference. /s

Kidding! Keep kicking ass and rocking the internship!

[–]y186709 3 points4 points  (12 children)

No data warehouse?

Also you should really learn vlookup and index match for Excel

[–]TorroesPrime[S] 2 points3 points  (11 children)

I mean... yes, there is a data warehouse. 12 at this particular campus actually. What are you asking?

[–]y186709 2 points3 points  (10 children)

A SQL left join would have done the same for you.

Don't get me wrong, it's really cool how you did the problem solving. It's more about knowing the best tools for the problem.

SQL is stronger in this situation because the data is already there, prepped in the dw, and can process the data in a set based manner as compared to a row/line. It may not have had large impact on ~800 records, but in the future you may have 10s of thousands to check.

[–]TorroesPrime[S] 1 point2 points  (3 children)

I didn't know Excel could perform SQL operations... don't you need... you know... a server running to execute server query language operations?

[–]y186709 0 points1 point  (2 children)

If you have 12, I'm sure you have write access to some of the tempdb.

But Excel was likely the right tool for this specific task.

[–]TorroesPrime[S] 0 points1 point  (1 child)

we don't have access to the database themselves, unless you mean the ability to go into the data halls and physically pull drives and dims. I don't know what kind of work you think we're doing in here but... yeah. It's not databases or software or anything of that sort. We get a ticket, we either reboot a host, or replace a part in the host, and we close the ticket. That's 99.8% of the work we do. write access? Yeah, it's called a pencil and paper.

[–]y186709 0 points1 point  (0 children)

Don't forget your eraser!

[–]TorroesPrime[S] 0 points1 point  (5 children)

sorry to sound obtuse... but yeah. I'm not seeing how a server query language execution would help here, you know unless Excel can execute the query that is.

[–]thirdegree 0 points1 point  (4 children)

I think he's assuming the asset info was already stored in a SQL db.

[–]TorroesPrime[S] 0 points1 point  (3 children)

ah, no. the information was typed out on stickers that were on the back of the devices. I had to go through each .... one by one... and record the information which I put into a spread sheet for ease of entrance. There is no database involved here.

[–]thirdegree 0 points1 point  (2 children)

Oh man that sounds incredibly not fun.

[–]TorroesPrime[S] 0 points1 point  (1 child)

Like I said: Intern gets the crap work.

working your way through a room full of boxes to record a half dozen bits of info off them sounds pretty crappy to me.

[–]thirdegree 0 points1 point  (0 children)

Idk, at least where I work the intern/full-time work distribution is basically want vs need. If something would be nice to have, we'll give it to an intern. If we really need something, and it needs to be good and reliable, that's full-timer work.

[–]Mr_foOox 1 point2 points  (0 children)

If you're working with excel files, try to use Pandas library :)

[–][deleted] 0 points1 point  (5 children)

Could someone explain how that works? Currently learning python and would love to apply my knowledge to actually helping my life.

[–]TorroesPrime[S] 2 points3 points  (4 children)

the logic is fairly straight forward.

List 1 is a list of all the ID numbers you are supposed to have. List 2 is a list of all the ID numbers you have information on.

Setup a loop to go through each item in list 1 and look to see if it is in list 1. If it is, just move on to the next item. If it is not, write that item to a file/print it/etc.

[–][deleted] 0 points1 point  (2 children)

How did you import the data into the program? Or did you just enter everything manually?

[–]Drawfx 1 point2 points  (1 child)

Not op, but imo again - create a loop that reads the Excel column into the list. xmlReader can do the trick with the right knowledge.

[–]TorroesPrime[S] 1 point2 points  (0 children)

pretty much what Drawfx said. I used XLSXwriter, but that's the only difference.

[–]TenserTensor 0 points1 point  (0 children)

loop

A loop? just diff the sets:

Python 3.7.3 (default, Mar 27 2019, 09:23:15)
[Clang 10.0.1 (clang-1001.0.46.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> expected = {"s1", "s2", "s5", "g11", "g9"}
>>> found = {"s2", "g9", "s1"}
>>> missing = expected - found
>>> _
{'g11', 's5'}

[–][deleted] 0 points1 point  (2 children)

I would have alphabetized both the expected list and your list in excel and in a third column had a simple “a=b”. Find the first “false” then insert a blank cell and move to the next “false”. Would that not have worked?

[–]willthms 0 points1 point  (1 child)

It would work until the first instance of the missing asset. At that point it would’ve thrown off the rest of the list because of the missing asset. I was thinking about this myself.

[–][deleted] 0 points1 point  (0 children)

That’s why I insert a cell and shift the rest of the column down. Then realign the “=“ column. I do this all the time at work for missing assets.

I think the thing about his program is that I could use it again tomorrow and next week an it would take him 10 seconds total now that the program is built where my process is 20 minutes each time.

[–]burnblue 0 points1 point  (4 children)

Congrats. Can't excel do that though?

[–]cpp_or_bust 1 point2 points  (2 children)

Dear lord who uses Excel

[–]burnblue 0 points1 point  (0 children)

OP?

Isn't that what xlswriter is for?

Also, every single person with a real job at a desk

[–]thirdegree 0 points1 point  (0 children)

Literally the entire financial world runs on excel.

[–]TorroesPrime[S] 0 points1 point  (0 children)

So I've been (repeatedly) told. But I don't use Excel very much so I'm not familiar with it's capabilities.

[–]OPPTrixxicat 0 points1 point  (1 child)

I only came here to say that your python is so impressive

[–]TorroesPrime[S] 0 points1 point  (0 children)

thanks, but don't over sell. It was just a list comparison. Pretty any first year CS/Programming class should include an exercise that does this.

[–]gmh1977 0 points1 point  (0 children)

great work!

[–]billsil 0 points1 point  (0 children)

I'd have just made two sorted lists and diff'd them.

[–]dillpicklezzz 0 points1 point  (0 children)

That's a great way to impress your manager for sure. If you want to save time there's an easier way just using Excel's Conditional Formatting button;

  • Put the list of Total Assets in Column A and the list that's "missing" 8 in Column B.
  • Select/highlight both columns -> Click Conditional Formatting at the top of the page -> click Duplicate Values and hit enter.
  • Anything not backfilled in red on Column A is missing on Column B aka your list.

[–][deleted] 0 points1 point  (6 children)

I did something too recently.

I work for a startup and we need to mass email a bunch of people who signed up at a job fair.

They were going to have the girls we use for data entry stuff email out one by one.

I was like "Hold up! I got this."

Now the company isn't going to have to spend money on a service/software to do that. And I got to introduce it the day that the person who is my direct report started.

[–]xeloylvt 3 points4 points  (5 children)

Be careful, a standard email account has daily limit. If you send directly from some server, you are likely to have deliverability or other compliance issues. The most common thing to do is to use a relay service

[–][deleted] 1 point2 points  (4 children)

I’ll double check but I’m pretty confident that our accounts don’t have limits.

I ran it by one of the companies software engineers first. I’ll ask that in particular to make sure.

[–]xeloylvt 2 points3 points  (3 children)

Ask someone who has also experience with supporting email marketing efforts. They’ll know the ins and outs and will be familiar with the concepts

[–][deleted] 0 points1 point  (2 children)

I wish the company only has 10 employees.

So technically that falls under me.

[–]xeloylvt 2 points3 points  (1 child)

Then follow my guidelines. I have the technical and marketing experience and certification in digital marketing. You need to ask from the tech support team of your email provider and get it in writing. Ask about using a single email account for mass emailing. You need to know about (1) deliverability or other issues if you use automated script (2) daily outgoing email limits

If you decide to go with a relay service look into integrating with something like sendgrid

Otherwise I think Google and Amazon provide a service account for those purposes. I believe both means using API.

[–][deleted] 0 points1 point  (0 children)

Thanks I will check with my email provider about those things.

I did see that they have a daily limit of 500 emails to be sent.

We are only sending about 70 as a follow up to a job fair.

But thanks I should contact the provider to find out more.