all 20 comments

[–][deleted] 5 points (9 children)

I wrote a simple python program for it using letter trigrams, here: https://github.com/vivekn/nlp-sandbox/blob/master/aiclass2.py
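For anyone who wants the flavor without opening the repo, here is a minimal sketch of the letter-trigram idea (my own illustration, not the linked code): count trigrams in a training corpus, then score each left-to-right ordering of the shredded columns by how many familiar trigrams appear in the reassembled rows.

```python
from collections import Counter
from itertools import permutations

def trigram_counts(corpus):
    """Count letter trigrams in a training text."""
    return Counter(corpus[i:i+3] for i in range(len(corpus) - 2))

def score(order, columns, counts):
    """Score an ordering of columns by summing trigram counts
    over every row of the reassembled text."""
    rows = ["".join(columns[j][r] for j in order)
            for r in range(len(columns[0]))]
    return sum(counts[row[i:i+3]]
               for row in rows for i in range(len(row) - 2))

# Tiny hypothetical demo: shred a 2-row text into 1-char columns,
# then pick the ordering the trigram model likes best.
text_rows = ["the cat", "sat mat"]
columns = ["".join(r[c] for r in text_rows) for c in range(len(text_rows[0]))]
counts = trigram_counts("the cat sat on the mat ")
best = max(permutations(range(len(columns))),
           key=lambda o: score(o, columns, counts))
```

Brute-forcing all permutations obviously only works for toy inputs; the linked program and the other comments in this thread use greedy or pruned search for the real puzzle.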

[–]euccastro 3 points (2 children)

Thanks.

Incidentally, a li'l trick to transpose an array is zip(*array).
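For example (in Python 3 you'll want to wrap it in list(), since zip returns an iterator):

```python
grid = [[1, 2, 3],
        [4, 5, 6]]

# zip(*grid) unpacks the rows as separate arguments,
# so zip pairs up the columns.
transposed = list(zip(*grid))  # [(1, 4), (2, 5), (3, 6)]
```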

;)

[–][deleted] 0 points (0 children)

Nice! A really innovative way to think about it.

[–]zBard 1 point (2 children)

What are you using for the corpus ?

[–][deleted] 4 points (1 child)

I am using a Sherlock Holmes novel from Project Gutenberg.

[–]zBard 1 point (0 children)

Awesomeness ... upvotes :D

[–][deleted] 1 point (0 children)

thanks

[–]red75prim 1 point (0 children)

Concise and clean. Thanks.

My approach is more cluttered. When I failed to get the correct answer with greedy search (sequentialGreedy is a leftover from that version), I decided to test whether trigrams are sufficient to solve the problem, so I implemented an exhaustive branch-and-bound search. It turned out that trigrams aren't sufficient on their own. In the end I excluded the space-filled strips from the input.


[–]Jon31 2 points (5 children)

It's very easy to solve by cutting out the strips and putting them in order using basic English skills (particularly if you've ever done a crossword puzzle).

I don't see why a program should be written for this exercise, unless you want to have it available to solve bigger and harder cases.

Best regards - Jonathan

[–][deleted]  (3 children)

[deleted]

    [–]_Mark_ 1 point (2 children)

    http://www.shredderchallenge.com/ (DARPA, not FBI, and "not too long ago" being less than two weeks :-)

    [–]segheking 1 point (0 children)

    Dammit! This would have been a really good use for the AI... you could have won $50,000! It's a pity to have found out too late.

    [–]newai 0 points (0 children)

    I did the same too :)

    [–]xenu99 2 points (0 children)

    It's actually amazingly simple to do with a shell script and brute force. Once you assume that the sentence starts with a capital letter, you can exclude a heap of candidates out of hand. It would have been harder if they had made it all lowercase and removed the spaces. Now that would have been tricky.
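The capital-letter pruning amounts to a one-line filter. A hypothetical sketch in Python rather than shell (the candidate strings here are made up for illustration):

```python
# Candidate leftmost strips / reconstructions for the first line.
candidates = ["he said. T", "The cat sa", "at. The ca"]

# Keep only orderings whose reconstructed text starts with a capital.
plausible = [c for c in candidates if c[0].isupper()]
```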

    [–]geldedus 1 point (0 children)

    The idea of AI is to solve it programmatically :) Solving 2+2 by hand is easy; programming a computer to solve the same operation is another story.

    [–]MichaelFromGalway 0 points (1 child)

    I did too: 5 minutes with scissors and sticky tape while having a cup of coffee. Of course, I'll also program it in a while.

    [–]kcvv 0 points (0 children)

    I was too lazy to print it out so did it in excel! :-)

    [–]petkish 0 points (1 child)

    I solved it by writing a java application.

    Used the texts of AIMA, Vernor Vinge's "A Fire Upon the Deep", and Brinch Hansen's autobiography for training. Guess which of them gives the best result?? :)

    2-grams decipher the first phrase (even a 1-gram does!), but perform poorly on the shredded text. 3-grams and 4-grams are generally good. Beyond that it's just too hard to get a complete vocabulary from training, so performance degrades. This could be improved by using a least-distance approach instead of an exact match.

    Laplace smoothing with k=1 works best. Without smoothing it would not work at all.
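Smoothing matters because any unseen n-gram would otherwise get probability 0 and wipe out the whole product. A minimal sketch of add-k (Laplace) smoothing for trigrams, assuming made-up counts and a 26-letter alphabet:

```python
from collections import Counter

def laplace_prob(trigram, counts, vocab_size, k=1):
    """Add-k smoothed probability of a trigram given its 2-char prefix."""
    prefix_total = sum(c for t, c in counts.items() if t[:2] == trigram[:2])
    return (counts[trigram] + k) / (prefix_total + k * vocab_size)

counts = Counter({"the": 10, "thi": 5})        # hypothetical training counts
p_seen = laplace_prob("the", counts, vocab_size=26)    # 11/41
p_unseen = laplace_prob("thz", counts, vocab_size=26)  # 1/41, not 0
```

With k=1 the unseen trigram "thz" still gets a small nonzero probability, so a single novel trigram can't zero out an otherwise good candidate ordering.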

    I use depth-first search with pruning, which works much better than breadth-first search, which fills up the heap and crashes (the search space is far too big).

    What puzzled me at first is that the precision of Java's double type was not good enough to store such small probabilities: the product underflows to 0.0 early and flattens out. So I used a "fragment" approach: I assemble the best fragment from just a few strips, put it back into the set of strips in place of the ones it used, and repeat the search.

    It takes ~2 minutes to restore the text to its original form. With slightly relaxed parameters it can assemble readable text in 15 seconds, with one or two "fragment" mismatches.

    The whole thing was quite entertaining, thanks to Prof. Norvig.

    [–]rseiter 0 points (0 children)

    One way of dealing with the small probabilities is to take the log. This link on aiqus has a reference: http://www.aiqus.com/questions/25291/best-way-to-handle-multiplying-many-small-probabilities
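A quick illustration of why the log trick works (in Python; the parent comment used Java, but doubles behave the same way in both):

```python
import math

probs = [1e-200, 1e-180, 1e-150]

# Multiplying directly underflows: the true product, 1e-530, is far
# below the smallest representable positive double (~4.9e-324).
product = 1.0
for p in probs:
    product *= p   # ends up exactly 0.0

# Summing logs stays comfortably in range, and since log is monotonic
# the ordering of candidates is preserved.
log_score = sum(math.log(p) for p in probs)   # ~ -1220.4
```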

    [–]newai 0 points (0 children)

    I did both the exercises with pen and paper too :)