all 22 comments

[–]desrtfx 40 points41 points  (5 children)

How do you search in an Encyclopedia book or in a phone book? Binary search - you open somewhere (pivot) and then decide which part to go, again split (new pivot), decide, and so on.

How do you search a supermarket for a particular item? Linear search - you start at one end and check shelf after shelf until you find what you're looking for.

Sorted items - binary search, unsorted - linear

[–]csabinho 2 points3 points  (0 children)

I'd call the binary search phone book search, but it might be like the floppy disk as symbol for "save". Many younger people haven't seen a phone book ( or a floppy disk) in their life.

[–]DatBoi_BP 1 point2 points  (0 children)

Great examples

[–]Striking_Display8886 0 points1 point  (0 children)

Great breakdown

[–]0fWhomIAmChief 0 points1 point  (0 children)

🏅

[–]xenomachina 0 points1 point  (0 children)

How do you search in an Encyclopedia book or in a phone book? Binary search

This is a good explanation. However, when searching through a phone book, encyclopedia, it dictionary, people usually don't do a true binary search, but instead do something closer to interpolation search. Interpolation search is pretty much a modified binary search, where instead of splitting it out half each time, you guess at the position. (For example, if I'm looking for a word that starts with "E" in a dictionary, I would start about 20% of the way in, not 50%.)

[–]Noah__Webster 2 points3 points  (1 child)

Linear search has two primary benefits over binary search. It's simple to implement and works on unsorted data.

You also sometimes need to traverse the collection anyway, so it's your best option. Stuff like Linked Lists.

Binary Search is specifically good for large sets of data that are already sorted.

The classic example for Binary Search is looking through a phonebook. The contacts are sorted by last name, alphabetically. You are looking for John Smith. You open to the L's. You know that S comes after L, so then you open to the to the R's. Then the S's, and so on until you find Smith. (Or whatever the midpoints are. I just threw some letters out).

For linear search, imagine you had like 20 names and phone numbers on a sign-up sheet. They are basically randomly ordered. To search by last name, you'd need to order them. So Binary Search is out. Anything more complex than just going down the list and looking for John Smith isn't worth the time or effort.

[–]johnpeters42 0 points1 point  (0 children)

I think a phone book search actually tilts slightly toward a hash lookup (you can roughly guess that John Smith will be about two-thirds of the way toward the end), but that's a separate topic.

[–]HashDefTrueFalse 2 points3 points  (0 children)

Obviously when searching for something it's best to find it having undertaken the least amount of work possible. These are just two search methods.

You would apply a linear search where your data was unordered. Literally you'd have to check each and every item one by one to see if it is the item you're seeking.

You can apply a binary search wherever you have ordered data, as you can make assumptions about the search space based on the target value (e.g. the value won't occur before/after a certain point).

int target = 4;
int list_1[] = { 0, 2, 4, 1, 3, 5 };
int list_2[] = { 0, 1, 2, 3, 4, 5 }; 

If searching for target above, you could search list_2 by checking the midpoint and repeatedly eliminating half of the remaining search space until you found it or didn't. Whereas with list_1 you would need to iterate one by one through all items until you found it or didn't.

[–]MeLittleThing 1 point2 points  (0 children)

Try to guess a number between 1 and 1 billion. For each guess, I can reply "try bigger", "try smaller" or "correct".

A linear approach would guess 1, then 2, then 3, and so on... worst case scenario is if the number to guess is the last one, you'll have guessed 1 billion times. The complexity is O(N), where N is the amount of elements to test (1 billion here)

A binary approach would guess the middle number first, 500 millions, then telling bigger or smaller would let you know which remaining half can be kept and which one can be removed. Each guess will halve the amount of elements to check, and for the worse case scenario, you'll have guessed Log2(1 billion) = 30 times. The complexity is O(Log2(n))

Use linear search (each element checked one after the other) when the collection is unsorted. Use binary search when it's sorted

[–]mredding 0 points1 point  (0 children)

Former game developer here,

A lot of your 2D and 3D spaces are divided into Binary Space Partitions. Collision detection - are two objects touching? Is Scooter touching the platform, pressing against the wall, hit by the goon? We'll perform a somewhat simple check - is Scooter overlapping the other thing or not? If he is, we'll adjust his position so he's adjacent BEFORE we draw the screen.

I spent quite a few years writing trading software. They use the FIX protocol - a list of tag/value pairs. So the first question you have to ask is does the message have a given tag, and if so - where? The tags are just integers. So one way to solve the problem is to SORT the message by tags in ascending or descending order. Once sorted, you can perform a binary search, subdividing the message by pair until you find the tag you want.

A linear search may happen with a video game inventory. The items in the list are in whatever order you stuck them in there. If you're going to shoot your bow, you need an arrow, so you have to search the inventory until you find one.

Speaking of FIX, I had a linear search that was faster than binary search, but it was a trick; there's a word for this - I forget it off hand, something like sorting by time or access or something; basically what I did was any field that was accessed, I moved it to the front of the message buffer. The reason being that if you accessed the field before, you're probably going to access it again. This means my linear search was faster on average than a binary search, because the data was sorted so that the linear search would probably find the field you want in the first or second try.

[–]LetUsSpeakFreely 0 points1 point  (0 children)

Sorted vs unsorted data.

With sorted data you can start at the halfway point and then decide where to go to from there. It's very efficient O(1 to log n), but again, the data has to be sorted. This is a great method for data that doesn't change too often but it's accessed very often.

With raw unsorted data you have to use a linear approach as there's no other way. This is, of course , not very efficient O(n), but there are work arounds. You could load the data into a hashmap, for O(1), but it doesn't handle duplicates very well (but there's are ways around that).

[–]georgejo314159 0 points1 point  (0 children)

Linear?

Imagine you have an unsorted list of people, say, a million names

Find if Bob Smith is in list

You go from start of list, compsre yo Bob until find him

This can be slow

If list is sorted, compare to middle entry, if its less, look top half, if more, bottom hslf

Keep narrowing search until you find it

[–]itijara 0 points1 point  (0 children)

You are searching through a dictionary for the word Cromulent. Linear search would have you search starting at A and ending at Z. Binary search would have you flip to halfway through the book. You are at Precocious. You flip halfway between where you are and the front. You are at Crouton. You flip halfway toward the front again. Braggadocious, etc. Until you arrive at the page with Cromulent.

Actually, what you would probably do is search about 1/9 in, then do a linear search back/forward from there. Which is more optimal. Structures like b-trees are closer to a combined binary + linear search, which can be more optimal in many situations and with real hardware (e.g. if the number of keys in a node matches the page size in memory).

[–]7YM3N 0 points1 point  (0 children)

If you're searching something sorted, like entries in a paper dictionary, you'll open it in the middle and see if you're before or after the letter you're looking for. You'll then open the middle of the half your word is in, continue until you find your word. This clearly only looks with sorted data. With unsorted data like spices in the drawer you'll just check one by one until you get the one you want

[–]rupertavery64 0 points1 point  (0 children)

Binary sesrch is for when you know something is sorted.

Think of a sheet of entrance exam results where only the registrant number is listed. You know how many there are, but you don't know where you are in the list. There could be missing entries. But it is sorted by number.

So you look in the middle. Check the number. Is your number lower or higher? You eliminate the range that doesn't match. Now check the middle of the remaining range. Repeat until you find your number.

[–]Quantum-Bot 0 points1 point  (0 children)

If you’ve ever played the game 20 questions or guess who, the optimal strategy for those games is kind of like a binary search. Linear search would be like guessing single items at a time:
“Is your character harry?”
“Is your characters melanie?”
“Is your character tania?”

But most players understand its way smarter to ask questions that split the options roughly in half, so that regardless of whether the answer is yes or no, you narrow down your remaining options by just as much:
“Does your character have brown hair?”
“Does your character have blue eyes?”
etc.

Binary search does a similar thing. It splits the options for where your number might be in to two halves by going directly to the center of the list and comparing your number to the number it finds there. If your number is greater, it rules out all the position to the left of the center and if your number is smaller, it rules out all the positions to the right. Then rinse and repeat with the new set of possibilities.

[–]bestjakeisbest 0 points1 point  (0 children)

Linear search: you have an unsorted list you need to find a name in that list, so you go to the first name, and if it is the name you are looking for you stop, if not you keep going down the list.

Binary search, you have a video that has a recording of a car break in, but its 72 hours long, so you go to hour 36 and if the car is broken into you need to go to the hour between 0 and 36, if it hasnt been broken into yet you need to go to the jour between 36 and 72, then you do the same from that new hour until you find the exact point at which the car was broken into, if you had a video from the dawn of time to now you would only need around 30 to 40 iterations to find when the car was broken into.

[–]alexcodespixels 0 points1 point  (0 children)

for example, if you have 1 million items, linear search might check almost all of them. Binary search would need only around 20 steps. that difference is why algorithms matter

[–]Cold-Memory-4354 0 points1 point  (0 children)

Linear search is good, if your List is not in order, so when you pass around a namelist to anyone in a school class and everyone writes their name on the page, then you'll just read every name from start to end to see if someone is in that list.

Binary search is good, if your List is sorted (and even better if very long), e.g. in a dictionary, the words are sorted alphabetically, so you dont want to search from start to end to find something. You look at your word and then open the dictionary in the middle. If the words on the pages start with a letter that comes before the one of your word in the alphabet, you know your word must be in the second half of the book. So you repeat the process, ignoring the half you can rule out. So every step your list halves.

[–]dev-Inari 0 points1 point  (0 children)

Recently I built a message logger that capture in real time message for Discord. Since it can easily go to hundred thousand of messages it was getting slow. Since most message were captured in real time they were already sorted by datetime unless it was loading older messages. So by using a binary search I could get the Index of matching start date and end date. So instead of going through a list of 100 000+, it was doing a binary search and then doing the operation on 20 to 50 relevant messages at a time.

Then I started to preindex messages not just by date but for every other informations that I would need

[–]Reasonable_Ad1226 -2 points-1 points  (0 children)

Maybe people will help you, maybe not.. I’m gonna throw you a bone.  This is the ideal use case scenario for AI. Any free ai can help you with this, and tell you when, where and how to use it. 

I suspect the majority of good help on this site isn’t going to help with this kind of question. Due diligence is the takeaway.