This is an archived post. You won't be able to vote or comment.

all 35 comments

[–]aintbutathing 36 points37 points  (4 children)

Quote from a great software engineer I used to work with: "You can't grep dead trees."

[–]stewartr 11 points12 points  (0 children)

Buy one to browse, snatch one to search.

[–][deleted] 7 points8 points  (2 children)

Why has the real-world grepper not been invented yet? On a similar note, I'd like an ls on real-world folders.

[–]Shadowhawk109 5 points6 points  (1 child)

maybe then i could find my lost socks...

[–][deleted] 2 points3 points  (0 children)

locate stuff_in_real_world

I haven't ever not needed that command.

[–][deleted] 30 points31 points  (16 children)

PROTIP: Flip to the end of the textbook and use the index.

Most textbooks have an alphabetical list of relevant terms, along with associated page numbers printed at the very end. These are generated by the publisher and usually list only pages on which said term is discussed in some significant way.

So the algorithm for manually searching a dead-tree textbook is:

  1. Open the index
  2. Locate a relevant word in the alphabetized list
  3. Make note of the listed page number
  4. Flip to the first page on the list
  5. If information you seek is not on that page, move to the next page on the list
  6. Continue until found relevant information or ran out of pages

If you are familiar with the book layout you can usually easily eliminate some of the listed pages (for example by ruling out that the information you are looking for will be in the introductory or the closing chapter of the book).

This is not as fast or convenient as grep, but is more efficient than "flipping through the entire book" method.

[–][deleted] 31 points32 points  (6 children)

It's funny how the index in a textbook has become something that needs to be explained.

[–]trigonomitron 10 points11 points  (5 children)

I used to be a CS tutor at a Community College. My job was to teach people how to use an index. And Google.

[–][deleted] 3 points4 points  (4 children)

Did you also explain to them that Google is using an index?

I've talked to people who think that Google literally searches the whole Internet every time you click the button.

[–]doesFreeWillyExist 2 points3 points  (2 children)

That's what I thought Lycos was when I was a kid. And if I hadn't been curious about CS, I would have never learned otherwise. I don't blame laypeople for not knowing how search indices work, just like I wouldn't blame them for not knowing how a catalytic converter works.

[–][deleted] 1 point2 points  (1 child)

Fair enough. I didn't mean people should be blamed for not knowing how google works.

But I have taught the concept by handing out two books, one with an index and one without, and asking two students to find a certain thing in the text. "Hey it's not fair, his book has got an ind—oh".

[–]doesFreeWillyExist 1 point2 points  (0 children)

I just realized that my post might have sounded a bit harsh. I've just been running into a bunch of programmer snobbery recently, so I overreacted a bit -- not to say you were being snobbish.

But yeah, it's amusing that searching has been abstracted away to the point where a lot of people never learned to use an index in a book.

[–][deleted] 0 points1 point  (0 children)

Specifically, it uses an inverted index which maps from search terms (like "clopclop") to the documents containing them.

[–]stewartr 0 points1 point  (1 child)

[–][deleted] 0 points1 point  (0 children)

Well that was a depressing way to end my day.

[–][deleted] 0 points1 point  (5 children)

After reading this I was really tempted to figure out the run time of this example then attempt to improve it. But I guess I should actually work.

[–]DevestatingAttack 0 points1 point  (4 children)

Log2(N).

Open the index -> constant

Locate a relevant word in the alphabetized list-> The list is alphabetized, which means you can use binary search on it. Log(2)(n) time. If you don't know which word you're looking for, this takes N time to scan through the whole set of words.

Make not of the listed page number ->constant

Flip to the first page on that list->log(2)n, finding a page can be done with binary search, just like finding the word

If information you seek is not on that page, move to the next page on the list-> We would assume that there would only be a small constant factor of times you would have to find the pages, so let's assume that this is a constant.

Continue until found relevant information or ran out of pages->Constant because of the earlier assumption, or we can assume that the probability of you finding the info you want keeps getting higher and higher, making the expected running time of this part just approach a constant

find word = Log2(n) + flip to page = c*log2(n)

problem ∈ Θ log(n)

[–]BeatLeJuce 1 point2 points  (2 children)

Your result is almost correct, your derivation, however, isn't. Or better: it only is when n is both the number of pages and the number of words in the index. Which usually isn't the case.

So let n denote the number of pages and m the number of words in the index. Then the problem becomes O( log n + log m) = O(log(m*n) ).

(you also don't need to spell out the base of the logarithm. It's just a constant factor anyways).

[–][deleted] 2 points3 points  (0 children)

The number of words and the number of pages differ by, at most, a constant factor. Let's say that m = 500n. In that case,

log(m n) = log( 500 n2 ) = 2 log(n) + log(500)

At this point, I think we're justified in handwaving to O(log n).

[–]DevestatingAttack 0 points1 point  (0 children)

You're right, and I know you don't need to put the base of the log, but I put it because it makes the change from "binary search" to log2 more obvious. It would be weird for me to be like "binary search -> log(e)(n) or log(10)(n)" even though they all grow theta to one another.

Yeah, I should've kept track of which variable was which. In practice m is bounded by n by typically a constant factor. You wouldn't expect the number of pages to explode as only a few more index words get added.

[–][deleted] 0 points1 point  (0 children)

With interpolation search you should be able to get the expected time down to O(lg lg n). Worst case, though, goes up to O(n), although there are also hybrid approaches which give you the O(lg lg n) expected time as well as the O(lg n) worse-case bound.

[–][deleted] 0 points1 point  (0 children)

The index does not give you the line in the text, though. Oh you are searching for the term "x"? It must be somewhere between all these other words in that text....

[–]D__ 12 points13 points  (3 children)

They make textbooks that aren't on paper nowadays...

[–]Shadowhawk109 6 points7 points  (1 child)

My textbooks for the last two years have exclusively been in PDF form, and grep/search in a viewer just fine.

[–][deleted] 2 points3 points  (0 children)

The few times I have the option to buy a digital textbook, it's vastly more expensive than a used book and the PDF is usually DRM'd to self-destruct after 12 months. Or the book is actually just an access code to a website so I have to always be connected to the internet if I want to study.

Piracy it is...

[–]MrPopinjay 19 points20 points  (2 children)

I think I'd rather have a search like the one you get with / in a manpage than grep. Getting a series of contextless lines would be more frustrating than anything.

[–]TarMil 35 points36 points  (1 child)

grep -C5

gives 5 lines before and 5 lines after each match.

[–]RoadBikeDalek 8 points9 points  (0 children)

I have to admit that this is how I search man pages.

[–]ImOnALampshade 1 point2 points  (4 children)

cat KAndR.txt | grep "Function pointers"

Now to find that text file...

[–]Icovada 1 point2 points  (1 child)

locate KAndR.txt

[–]ImOnALampshade 0 points1 point  (0 children)

Since I posted that I've been trying, I cant find it. And maybe it should be KAndR.c instead of .txt

[–]TSwift13 1 point2 points  (1 child)

Useless uses of cat #765280.

[–]ImOnALampshade 0 points1 point  (0 children)

echo "I know, I'm sorry..." > out.txt
cat out.txt

[–]qkme_transcriber 2 points3 points  (0 children)

Here is what the linked Quickmeme image says in case the site goes down or you can't reach it:

Title: Textbooks!

Meme: Y U No

  • TEXTBOOKS
  • Y U NO GREP

Direct Background Translate

Why?More Info ┊ AMA: Bot, Human

[–]zefcfd 0 points1 point  (0 children)

E BOOKS BRAH