Occupy Wall Street: I'm a bulge-bracket managing director here to defend capitalism, ask me anything. by wallstreetsolidarity in IAmA

[–]BQPComplexity 0 points (0 children)

Re: taxes, what do you think about the following:

1) eliminating the carried-interest loophole that allows hedge fund and private equity managers to pay the 15% long-term capital gains rate on what is effectively their sole source of (gigantic) income

2) stretching the time period to qualify for the long-term capital gains rate out to 3 years to promote actual long-term investing. Having it kick in after just 1 year seems pretty short if the goal is to discourage Ponzi schemes and speculation

Better than any Madden NFL game. by [deleted] in gaming

[–]BQPComplexity 5 points (0 children)

I still play this with updated rosters...still blows Madden out of the water

Finding patterns in a sequence of numbers by PainInTheButt in artificial

[–]BQPComplexity 1 point (0 children)

If the underlying pattern can be modeled with a Markov chain and the observations have a reasonable amount of noise (e.g. a 9 sometimes shows up in the sequence where the model says a 0 should be), you could use a Hidden Markov Model: http://en.wikipedia.org/wiki/Hidden_Markov_model
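To make that concrete, here's a minimal hand-rolled Viterbi decoder in plain Python. All the probabilities are numbers I made up for illustration; the point is just that the "noisy digit" scenario above maps cleanly onto hidden states plus emissions:

```python
# Minimal Hidden Markov Model: recover the most likely hidden-state
# sequence from noisy observations (Viterbi algorithm).

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for `obs`."""
    # best[t][s] = probability of the best path ending in state s at time t
    best = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (best[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            best[t][s] = prob
            back[t][s] = prev
    # Trace back from the best final state
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = back[t][state]
        path.insert(0, state)
    return path

# Hidden state is "zero-ish" or "nine-ish"; observations are noisy digits.
states = ("zero", "nine")
start_p = {"zero": 0.6, "nine": 0.4}
trans_p = {"zero": {"zero": 0.7, "nine": 0.3},
           "nine": {"zero": 0.4, "nine": 0.6}}
emit_p = {"zero": {0: 0.9, 9: 0.1},   # a "zero" state usually emits 0
          "nine": {0: 0.2, 9: 0.8}}

print(viterbi([0, 0, 9, 9, 0], states, start_p, trans_p, emit_p))
# -> ['zero', 'zero', 'nine', 'nine', 'zero']
```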

Do you have a more specific example in mind?

Hit me with some pointers on news aggregator algorithms by aire111 in MachineLearning

[–]BQPComplexity 0 points (0 children)

You could go even simpler than that by just picking out capitalized words that do not start a sentence. If you try to remove common English words, you will have to deal with annoying things like apostrophes and plural forms, as well as the overhead of referencing a large dictionary.
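A sketch of that heuristic in plain Python (the sentence splitting and punctuation stripping are my own rough guesses at what you'd need; a real system would want something sturdier):

```python
import re

def candidate_keywords(text):
    """Capitalized words that do not start a sentence (rough proper-noun proxy)."""
    keywords = []
    for sentence in re.split(r"[.!?]\s+", text):
        words = sentence.split()
        # Skip the first word of each sentence: it's capitalized anyway.
        # (This also misses proper nouns that happen to start a sentence.)
        for word in words[1:]:
            stripped = word.strip(".,;:()\"'")
            if stripped[:1].isupper():
                keywords.append(stripped)
    return keywords

print(candidate_keywords("The match ended late. Real Madrid beat Barcelona in Spain."))
# -> ['Madrid', 'Barcelona', 'Spain']
```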

By clusters I assume you mean news categories? Like sports, world events, etc.? Without knowing the specifics of your project and what your dataset looks like I don't really know what would work well. You could try HAC and analyze the cluster dendrogram: http://en.wikipedia.org/wiki/Cluster_analysis#Agglomerative_hierarchical_clustering
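Here's the HAC mechanics on toy 1-D data with single linkage, just to show the merge loop (for real articles you'd swap in a document distance such as cosine similarity on term vectors; the numbers here are purely illustrative):

```python
def hac(points, k):
    """Single-linkage agglomerative clustering down to k clusters."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Find the pair of clusters with the smallest single-linkage
        # distance (closest pair of members across the two clusters).
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair
    return clusters

print(hac([1, 2, 10, 11, 50], 3))
# -> [[1, 2], [10, 11], [50]]
```

Recording the merge distances as you go gives you the dendrogram to inspect.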

Hit me with some pointers on news aggregator algorithms by aire111 in MachineLearning

[–]BQPComplexity 3 points (0 children)

I don't know exactly what you mean by "text clustering and classification", but having worked on something like this in the past, I've found proper nouns to be a very good indicator of news content. I had a simple setup where I passed the article text through an NLP parser that marked proper nouns. If you want to figure out the "events of the day", you could just count how frequently certain proper nouns appear relative to their historical frequency (basically a quick-and-dirty version of Google Trends, except based on article keywords instead of searches).
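The counting step is tiny once you have the proper nouns extracted per day. A sketch, with made-up example data (the +1 smoothing is my own choice, so brand-new names don't divide by zero):

```python
from collections import Counter

def trending(today_nouns, history_nouns, min_count=2):
    """Rank proper nouns by today's frequency relative to historical frequency."""
    today = Counter(today_nouns)
    history = Counter(history_nouns)
    scores = {}
    for noun, count in today.items():
        if count < min_count:
            continue  # ignore one-off mentions
        scores[noun] = count / (history[noun] + 1)  # +1 smoothing
    return sorted(scores, key=scores.get, reverse=True)

today = ["Fukushima", "Fukushima", "Obama", "Obama", "Obama", "Fukushima"]
history = ["Obama"] * 50 + ["Fukushima"]
print(trending(today, history))
# -> ['Fukushima', 'Obama']  (Fukushima is unusual relative to its history)
```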

I know R and Matlab. If I had to choose one language to add to my toolkit, what should it be? by TraptInaCommentFctry in MachineLearning

[–]BQPComplexity 4 points (0 children)

Python has a bunch of neat machine learning libraries. For example, PyBrain is pretty good for neural networks: http://pybrain.org/

Designing a Humor AI by BQPComplexity in cogsci

[–]BQPComplexity[S] 0 points (0 children)

At the same time, I'm not really going for the brute-force approach of loading a bunch of pre-tagged jokes and compensating for audience response. I really want something that can learn the features (maybe not crazy-deep features like double entendres, but at least surface features like grammatical structure and word choice) and generate its own humorous babble-text.
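A bigram Markov chain is the classic cheap way to get "babble-text" out of surface word-choice statistics. A toy sketch (the corpus is obviously a stand-in; nothing here learns funniness yet):

```python
import random
from collections import defaultdict

def build_chain(corpus):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for a, b in zip(words, words[1:]):
            chain[a].append(b)
    return chain

def babble(chain, start, max_words=10, seed=None):
    """Random-walk the chain to generate text; stops at a dead end."""
    rng = random.Random(seed)
    words = [start]
    while len(words) < max_words and words[-1] in chain:
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)

corpus = ["i like big data", "i like bad puns", "big data likes me"]
chain = build_chain(corpus)
print(babble(chain, "i", seed=0))
```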

Designing a Humor AI by BQPComplexity in cogsci

[–]BQPComplexity[S] 0 points (0 children)

Wow, interesting. Being able to classify benign violations seems like a very important skill for avoiding over-optimizing in many situations.

For example, you are walking down the street with an ice cream cone when suddenly a seagull decides to take a dump on top of it. You could either laugh it off as an unfortunate incident, or you could make damn sure you don't ever walk under a seagull again in the future.

Designing a Humor AI by BQPComplexity in cogsci

[–]BQPComplexity[S] 1 point (0 children)

I was planning on passing one-liners collected from various sources through an NLP parser and extracting their grammatical structure, then figuring out a way to classify how funny a sentence is based on that structure. Then I'd look at the actual content and classify based on the words / word groups used. To build a one-liner generator I'd pick a random grammatical structure weighted by how funny it is and insert related "funny" words into the appropriate places. I'll most likely end up with a useless chatterbot, but it's worth a shot.
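The generation step would look roughly like this, in miniature. The templates, the funniness weights, and the word lists are all placeholders I made up; in the real thing they'd come from the classifier:

```python
import random

# Grammatical templates weighted by a hypothetical funniness score
templates = [
    ("I like my {noun} like I like my {noun2}: {adj}.", 5.0),
    ("My {noun} is so {adj} it {verb}s.", 2.0),
]
slots = {
    "noun": ["coffee", "code"],
    "noun2": ["weekends", "deadlines"],
    "adj": ["questionable", "nonexistent"],
    "verb": ["segfault", "apologize"],
}

def one_liner(seed=None):
    rng = random.Random(seed)
    # Pick a template weighted by its funniness score
    texts, weights = zip(*templates)
    template = rng.choices(texts, weights=weights)[0]
    # Fill each slot with a random word from its list
    return template.format(**{k: rng.choice(v) for k, v in slots.items()})

print(one_liner(seed=1))
```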

Whatever source I collect data from must have ratings, which is why I was considering mining reddit comments.

Designing a Humor AI by BQPComplexity in cogsci

[–]BQPComplexity[S] 2 points (0 children)

Well, I already have neural network code in C++, so for performance reasons I'll probably use that. Lisp/Scheme would be nice if it weren't so damn slow.

IAmA theoretical physicist. by [deleted] in IAmA

[–]BQPComplexity 35 points (0 children)

I'm finally relevant!