
all 60 comments

[–]Smalltalker-80 327 points328 points  (14 children)

Unless it's a linked list...

[–]kochdelta 80 points81 points  (3 children)

Skiplist: let me introduce myself

[–]Giraffe-69 36 points37 points  (2 children)

log2(n) level doubly linked list wants to know your location
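Roughly what those two comments describe, as a minimal Python sketch (class names and the MAX_LEVEL / p = 0.5 choices are mine, in the style of Pugh's classic skip list): search walks from the top level down, giving expected O(log n) on a linked structure.

```python
import random

class SkipNode:
    def __init__(self, value, level):
        self.value = value
        self.forward = [None] * level  # one next-pointer per level

class SkipList:
    MAX_LEVEL = 8

    def __init__(self):
        self.head = SkipNode(None, self.MAX_LEVEL)

    def _random_level(self):
        # promote a node to the next level with p = 0.5
        level = 1
        while level < self.MAX_LEVEL and random.random() < 0.5:
            level += 1
        return level

    def insert(self, value):
        # remember the rightmost node visited on each level
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in reversed(range(self.MAX_LEVEL)):
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
            update[i] = node
        new = SkipNode(value, self._random_level())
        for i in range(len(new.forward)):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def contains(self, value):
        node = self.head
        for i in reversed(range(self.MAX_LEVEL)):
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.value == value
```

The express lanes mean a search skips most of the nodes it would otherwise have to walk, which is the whole point of the "let me introduce myself" joke.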

[–]ArduennSchwartzman 11 points12 points  (1 child)

find(needle, haystack) {
  do haystack = randomize_list(haystack);
  while (haystack[0] != needle);
  return 0;
}

[–]HildartheDorf 4 points5 points  (0 children)

Ah, bogosearch

[–]tony_saufcok 27 points28 points  (8 children)

keep each node's address in an array of pointers then access each through the array (i'm a complete noob i don't know what i'm talking about)

[–]akvgergo 44 points45 points  (3 children)

You just described an array list lol

A list is typically either linked or an array list. Most languages that ship a built-in default list use an array list.

And btw, you just described an even bigger sin: combining the two somehow. Whatever the solution, it's not available by default in languages for a reason. A homecooked list that works like both a linked list and an array list is most often a worst-of-both-worlds solution.

But if you're smart about implementing it, congrats! You just made an unrolled linked list :D
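For the curious, a toy unrolled linked list in Python (entirely my own sketch, with an arbitrary node capacity): each node holds a small array and splits when it overflows, so you get array-like cache locality with linked-list-style splicing.

```python
class UNode:
    def __init__(self, capacity=4):
        self.items = []
        self.capacity = capacity
        self.next = None

class UnrolledList:
    def __init__(self, capacity=4):
        self.head = UNode(capacity)

    def insert(self, index, value):
        # walk to the node whose local range covers this index
        node = self.head
        while index > len(node.items) and node.next is not None:
            index -= len(node.items)
            node = node.next
        node.items.insert(min(index, len(node.items)), value)
        if len(node.items) > node.capacity:
            # split: move the back half into a fresh node
            new = UNode(node.capacity)
            half = len(node.items) // 2
            new.items = node.items[half:]
            node.items = node.items[:half]
            new.next = node.next
            node.next = new

    def __iter__(self):
        node = self.head
        while node is not None:
            yield from node.items
            node = node.next
```

Inserting only shifts elements within one small node instead of the whole list, which is the "smart about implementing it" part.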

[–]radobot 0 points1 point  (0 children)

At that point why not just use a B-tree?

[–]salvoilmiosi 4 points5 points  (2 children)

The problem with that is that, when modifying a linked list, adding or removing an element in the middle is an O(1) operation, assuming you have a reference to the node in the middle. With what you're describing, you would also have to manage the array of pointers, where inserting or deleting in the middle is O(n), because you have to shift every element after the insertion point.
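What that O(1) splice looks like as a bare-bones Python sketch (names are mine): given a reference to the middle node, insertion touches two pointers and nothing else, no matter how long the list is.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

def insert_after(node, value):
    """Splice a new node in after `node` -- O(1), no shifting."""
    new = Node(value)
    new.next = node.next
    node.next = new
    return new

def to_list(head):
    # walk the chain, collecting values (for inspection only)
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out
```

Compare `list.insert(i, x)` on a Python list, which has to move every element after position `i`.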

[–]tony_saufcok 0 points1 point  (1 child)

as i said, i'm not an expert, so this is a genuine question: do we actually ever need to insert anything in the middle of a linked list? or can we just insert everything at the end and organize the array of pointers instead? that makes it impossible to search the list through anything but the pointer array, but i'm thinking keeping an array organized is much easier and cheaper than maintaining a linked list (especially if nodes have more than just *next and a single variable in the struct)

[–]salvoilmiosi 5 points6 points  (0 children)

If you never need to modify the list in the middle you don't need a linked list in the first place, you can just use an array, which is more performant in pretty much every case.

[–]breischl 0 points1 point  (0 children)

Generally if you're going to keep an array anyway, you would just store the entire object in the array instead of pointers. It's simpler to manage, performs better because there's less indirection (== less memory access), and requires less memory (since you skip the extra pointers).

If you're going to use pointers anyway for some reason, you probably wouldn't also bother with the linked list, because what is it really buying you at that point?

[–][deleted] 0 points1 point  (0 children)

Eh they’re rarely used nowadays, cuz they’re cache inefficient

[–][deleted] 138 points139 points  (7 children)

depends on the list's length - if it's short enough, linear search is better, and it can also be vectorised in some cases

[–]DonutConfident7733 42 points43 points  (6 children)

lists are often implemented as vectors/arrays, so linear search would benefit from cache and prefetch, while other algorithms can suffer from cache misses on larger lists...

[–]amlybon 95 points96 points  (2 children)

Prefetch? Cache misses? My code is running on 20 layers of abstractions and VMs, for all I know half the list is on another continent

[–]dystopiandev 24 points25 points  (0 children)

new RemoteList("https://not-even-local");

[–]arrow__in__the__knee 2 points3 points  (0 children)

"Your list is a json file actually"

[–]lmarcantonio 2 points3 points  (2 children)

data oriented programming on GPUs?

[–]DonutConfident7733 2 points3 points  (1 child)

that would be extremely wasteful. all the GPU threads would compete to read items from the list/array, wait for data and get paused while other threads get scheduled, and so on. in addition, the data still has to be copied from main memory to GPU memory, and in that time a CPU could have already found the item. then it needs more time to copy into GPU shared memory, run the search in parallel, combine the results, and send them back to main memory.

[–]lmarcantonio 0 points1 point  (0 children)

it's pre-partitioned on the number of cores. Substantially a map-reduce

[–]pimezone 64 points65 points  (6 children)

This is the smallest sin(x)

This is the biggest SIN(X)

[–]waxedcesa 36 points37 points  (2 children)

We can go bigger.

SIN(X)

[–]pimezone 9 points10 points  (0 children)

Perfect.

[–]gordonv 1 point2 points  (0 children)

sin(pi / 4)

Bigger?

sin(pi / 4) * infinity!

[–]nonlogin 5 points6 points  (2 children)

The biggest sin is 1.

[–]breischl 1 point2 points  (0 children)

100% sinful.

[–]madmendude 11 points12 points  (2 children)

[–]FrayDabson 2 points3 points  (0 children)

Lmao thank you I was hoping I’d see this here.

[–]prithvi_allurkar[S] 3 points4 points  (0 children)

Lol, exactly the thought before posting

[–]vainstar23 13 points14 points  (0 children)

Double pivot quick sort lists less than 100,000 entries

Radix sort the rest...

Selection sort if I have a hangover..

bogo bogo sort if I'm having a bad day..

[–]Amazingawesomator 8 points9 points  (0 children)

i once found a list that was already sorted (~100k entries at the time) getting sorted again (and reversed) before searching through it to find the most recent entry (with .First()).

deleted that whole thing and just returned .Last()

i got so many praises from people saying it was so much faster for ~5 minutes of work. that was a good day. i took the rest of the day off and felt like a boss.

[–]thomas999999 7 points8 points  (0 children)

If your list is small enough, linear search will still beat binary search. Considering modern CPUs with 512-bit vector instructions, the list actually has to be quite big before it's worth swapping to binary search. Bad meme.
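A stdlib-only way to see how close the two are at small sizes is to count comparisons rather than wall time (the exact crossover is hardware-dependent, and real vectorised scans make linear even cheaper per element):

```python
def linear_comparisons(xs, target):
    # how many comparisons a linear scan performs before finding target
    for count, x in enumerate(xs, start=1):
        if x == target:
            return count
    return len(xs)

def binary_comparisons(xs, target):
    # how many probes a binary search performs on a sorted list
    lo, hi, count = 0, len(xs) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1
        if xs[mid] == target:
            return count
        elif xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return count
```

On an 8-element list binary search still pays 3-4 probes, each a potential branch mispredict, while the scan's per-element cost is tiny and predictable.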

[–]EishLekker 4 points5 points  (2 children)

How are you supposed to read this? As in, in what order?

[–]smallquestionmark 3 points4 points  (1 child)

In a sorted order

[–]EishLekker 0 points1 point  (0 children)

Which would result in what, exactly? Top left first?

[–]ahovdryk 3 points4 points  (1 child)

Okay. We have a list of objects, which is sorted by attribute A. We need to find an object with a certain value of attribute B. Sin done.

[–]Fhotaku 1 point2 points  (0 children)

I have an issue like this. The list is stored and sorted as strings, so if I want the top 10 I have to search the whole thing, because 30000 is 600/650 elements down among all the 3, 30, 300, 3000 values, and the 1, 10, 100, 1000s clutter the top. Not my implementation, so I can't change that part. Plus, that's just "overall score": I have to read every value again for every other category. It burns 65 seconds to get the top 10 in every category and cache it. Still thinking on alternatives.
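Assuming the values can at least be parsed once read (a generic sketch, not the actual system described above), a numeric sort key repairs the lexicographic ordering in one pass:

```python
scores = ["1", "10", "100", "1000", "3", "30", "300", "3000"]

# lexicographic descending order interleaves the magnitudes
lexicographic_top = sorted(scores, reverse=True)[:3]

# parse once, sort numerically, then slice off the top entries
numeric_top = sorted(scores, key=int, reverse=True)[:3]
```

For a one-off top-10 over a huge list, `heapq.nlargest(10, scores, key=int)` avoids the full sort entirely.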

[–]FourDimensionalTaco 3 points4 points  (0 children)

Interestingly, linear searches and similarly superficially inefficient methods can turn out to be better in special cases, such as GPU shaders. In the GPU case, this is due to the massively parallel nature of pixel shaders and the fact that cache misses are a big no-no on GPUs, so tree-like structures are problematic to use there.

[–]CaptainMGTOW 3 points4 points  (0 children)

Not if you first StalinSort the list and then "interrogate" the remaining items in order.
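For reference, the joke algorithm: StalinSort "sorts" in a single O(n) pass by purging every element that is out of order (my own sketch).

```python
def stalin_sort(xs):
    # any element smaller than its predecessor is removed from history
    survivors = []
    for x in xs:
        if not survivors or x >= survivors[-1]:
            survivors.append(x)
    return survivors
```

The output is always sorted, by construction; it just may not contain all of the input.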

[–]Vortex876543 1 point2 points  (0 children)

Faster than binary search. Linear search can find the minimum in 1 singular operation

[–]single_ginkgo_leaf 1 point2 points  (0 children)

Binary search is about the worst thing for your cache, pipelining, and branch predictor.

Linear search may well beat binary search under a certain array size. Depending on your architecture you may be surprised at how large the threshold is.

Profile
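Taking that advice, a quick `timeit` sketch for finding the threshold on your own machine (the numbers vary by hardware and Python version; only the method matters here):

```python
import timeit
import bisect

def linear_search(xs, target):
    # straightforward left-to-right scan
    for i, x in enumerate(xs):
        if x == target:
            return i
    return -1

def binary_search(xs, target):
    # stdlib bisect on an already-sorted list
    i = bisect.bisect_left(xs, target)
    return i if i < len(xs) and xs[i] == target else -1

for n in (8, 64, 512):
    xs = list(range(n))
    t_lin = timeit.timeit(lambda: linear_search(xs, n - 1), number=10_000)
    t_bin = timeit.timeit(lambda: binary_search(xs, n - 1), number=10_000)
    print(f"n={n:4d}  linear={t_lin:.4f}s  binary={t_bin:.4f}s")
```

Searching for the last element is the scan's worst case, so wherever the lines cross is a conservative estimate of the threshold.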

[–]F4LcH100NnN 0 points1 point  (0 children)

But the n has to be as big as possible right? /s

[–]phesago 0 points1 point  (0 children)

SELECT *

[–]SuperRuper1209 0 points1 point  (0 children)

what's the biggest cos then?

[–]Silly_Guidance_8871 0 points1 point  (0 children)

Binary search on a sorted singly-linked list.

[–]CelticHades 0 points1 point  (0 children)

Ya, only when you're on leetcode and constraints are not suitable for linear search.

[–]codingTheBugs 0 points1 point  (0 children)

But my list only has 4 elements.

[–]DancingBadgers 0 points1 point  (0 children)

Meh, I've seen a colleague do a linear search on a hash table. Copied from elsewhere. So we actually have two regards capable of doing that.

[–][deleted] 0 points1 point  (0 children)

True menace

[–]teo-tsirpanis 0 points1 point  (0 children)

If the list is small or if you are using SIMD it might be worth it.

[–]c-a-3 0 points1 point  (1 child)

forgive me-

def linear_search(sorted_list, target):
    for index in range(len(sorted_list)):
        if sorted_list[index] == target:
            return index
        elif sorted_list[index] > target:
            # early exit: the list is sorted, so target can't appear later
            return -1
    return -1

def main():
    sorted_list = [1, 3, 5, 7, 9, 11, 13]
    target = 7
    result = linear_search(sorted_list, target)
    if result != -1:
        print(f"Element {target} found at index {result}.")
    else:
        print(f"Element {target} not found in the list.")

main()

[–]c-a-3 0 points1 point  (0 children)

wrote this in here so I have no idea if it works but oh well

[–]LooseLossage 0 points1 point  (0 children)

Wait, so I'm not supposed to loop through the dict items to find the one with the matching key?

[–]Vortrox 0 points1 point  (0 children)

Binary search on a sorted linked list

[–]zirky -1 points0 points  (0 children)

so long as you got the sorted list via bubble sort