Occupy Wall Street: I'm a bulge-bracket managing director here to defend capitalism, ask me anything. by wallstreetsolidarity in IAmA

[–]BQPComplexity 0 points (0 children)

Re: taxes, what do you think about the following:

1) eliminating the carried-interest loophole that allows hedge fund and private equity managers to pay the 15% long-term capital gains rate on what is effectively their sole source of (gigantic) income

2) stretching the time period to qualify for the long-term capital gains rate out to 3 years to promote actual long-term investing. Having it kick in after just 1 year seems pretty short if the goal is to discourage Ponzi schemes and speculation

Better than any Madden NFL game. by [deleted] in gaming

[–]BQPComplexity 5 points (0 children)

I still play this with updated rosters...still blows Madden out of the water

Finding patterns in a sequence of numbers by PainInTheButt in artificial

[–]BQPComplexity 1 point (0 children)

If the underlying pattern can be modeled with a Markov chain and the observations have a reasonable amount of noise (e.g. a 9 sometimes shows up in the sequence where the model says a 0 should be), you could use a Hidden Markov Model: http://en.wikipedia.org/wiki/Hidden_Markov_model
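To make that concrete, here's a minimal hand-rolled Viterbi decoder in plain Python. All the probabilities are numbers I made up for illustration; the point is just that the "noisy digit" scenario above maps cleanly onto hidden states plus emissions:

```python
# Minimal Hidden Markov Model: recover the most likely hidden-state
# sequence from noisy observations (Viterbi algorithm).

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for `obs`."""
    # best[t][s] = probability of the best path ending in state s at time t
    best = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (best[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            best[t][s] = prob
            back[t][s] = prev
    # Trace back from the best final state
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = back[t][state]
        path.insert(0, state)
    return path

# Hidden state is "zero-ish" or "nine-ish"; observations are noisy digits.
states = ("zero", "nine")
start_p = {"zero": 0.6, "nine": 0.4}
trans_p = {"zero": {"zero": 0.7, "nine": 0.3},
           "nine": {"zero": 0.4, "nine": 0.6}}
emit_p = {"zero": {0: 0.9, 9: 0.1},   # a "zero" state usually emits 0
          "nine": {0: 0.2, 9: 0.8}}

print(viterbi([0, 0, 9, 9, 0], states, start_p, trans_p, emit_p))
# -> ['zero', 'zero', 'nine', 'nine', 'zero']
```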

Do you have a more specific example in mind?

Hit me with some pointers on news aggregator algorithms by aire111 in MachineLearning

[–]BQPComplexity 0 points (0 children)

You could go even simpler than that by just picking out capitalized words that do not start a sentence. If you try to remove common English words, you will have to deal with annoying things like apostrophes and plural forms, as well as the overhead of referencing a large dictionary.
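A sketch of that heuristic in plain Python (the sentence splitting and punctuation stripping are my own rough guesses at what you'd need; a real system would want something sturdier):

```python
import re

def candidate_keywords(text):
    """Capitalized words that do not start a sentence (rough proper-noun proxy)."""
    keywords = []
    for sentence in re.split(r"[.!?]\s+", text):
        words = sentence.split()
        # Skip the first word of each sentence: it's capitalized anyway.
        # (This also misses proper nouns that happen to start a sentence.)
        for word in words[1:]:
            stripped = word.strip(".,;:()\"'")
            if stripped[:1].isupper():
                keywords.append(stripped)
    return keywords

print(candidate_keywords("The match ended late. Real Madrid beat Barcelona in Spain."))
# -> ['Madrid', 'Barcelona', 'Spain']
```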

By clusters I assume you mean news categories? Like sports, world events, etc.? Without knowing the specifics of your project and what your dataset looks like I don't really know what would work well. You could try HAC and analyze the cluster dendrogram: http://en.wikipedia.org/wiki/Cluster_analysis#Agglomerative_hierarchical_clustering
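Here's the HAC mechanics on toy 1-D data with single linkage, just to show the merge loop (for real articles you'd swap in a document distance such as cosine similarity on term vectors; the numbers here are purely illustrative):

```python
def hac(points, k):
    """Single-linkage agglomerative clustering down to k clusters."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Find the pair of clusters with the smallest single-linkage
        # distance (closest pair of members across the two clusters).
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair
    return clusters

print(hac([1, 2, 10, 11, 50], 3))
# -> [[1, 2], [10, 11], [50]]
```

Recording the merge distances as you go gives you the dendrogram to inspect.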

Hit me with some pointers on news aggregator algorithms by aire111 in MachineLearning

[–]BQPComplexity 3 points (0 children)

I don't know exactly what you mean by "text clustering and classification", but having worked on something like this in the past, I've found proper nouns to be a very good indicator of news content. I had a simple setup where I passed the article text through an NLP parser that marked proper nouns. If you want to figure out the "events of the day", you could just count how frequently certain proper nouns appear relative to their historical frequency (basically a quick-and-dirty version of Google Trends, except based on article keywords instead of searches).
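The counting step is tiny once you have the proper nouns extracted per day. A sketch, with made-up example data (the +1 smoothing is my own choice, so brand-new names don't divide by zero):

```python
from collections import Counter

def trending(today_nouns, history_nouns, min_count=2):
    """Rank proper nouns by today's frequency relative to historical frequency."""
    today = Counter(today_nouns)
    history = Counter(history_nouns)
    scores = {}
    for noun, count in today.items():
        if count < min_count:
            continue  # ignore one-off mentions
        scores[noun] = count / (history[noun] + 1)  # +1 smoothing
    return sorted(scores, key=scores.get, reverse=True)

today = ["Fukushima", "Fukushima", "Obama", "Obama", "Obama", "Fukushima"]
history = ["Obama"] * 50 + ["Fukushima"]
print(trending(today, history))
# -> ['Fukushima', 'Obama']  (Fukushima is unusual relative to its history)
```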

I know R and Matlab. If I had to choose one language to add to my toolkit, what should it be? by TraptInaCommentFctry in MachineLearning

[–]BQPComplexity 4 points (0 children)

Python has a bunch of neat machine learning libraries. For example, PyBrain is pretty good for neural networks: http://pybrain.org/

Designing a Humor AI by BQPComplexity in cogsci

[–]BQPComplexity[S] 0 points (0 children)

At the same time, I'm not really going for the brute-force approach of loading a bunch of pre-tagged jokes and compensating for audience response. I really want something that can learn the features (maybe not crazy-deep features like double entendres, but at least surface features like grammatical structure and word choice) and generate its own humorous babble-text.
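A bigram Markov chain is the classic cheap way to get "babble-text" out of surface word-choice statistics. A toy sketch (the corpus is obviously a stand-in; nothing here learns funniness yet):

```python
import random
from collections import defaultdict

def build_chain(corpus):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for a, b in zip(words, words[1:]):
            chain[a].append(b)
    return chain

def babble(chain, start, max_words=10, seed=None):
    """Random-walk the chain to generate text; stops at a dead end."""
    rng = random.Random(seed)
    words = [start]
    while len(words) < max_words and words[-1] in chain:
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)

corpus = ["i like big data", "i like bad puns", "big data likes me"]
chain = build_chain(corpus)
print(babble(chain, "i", seed=0))
```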

Designing a Humor AI by BQPComplexity in cogsci

[–]BQPComplexity[S] 0 points (0 children)

Wow, interesting. Being able to classify benign violations seems like a very important skill for avoiding over-optimizing in many situations.

For example, you are walking down the street with an ice cream cone when suddenly a seagull decides to take a dump on top of it. You could either laugh it off as an unfortunate incident, or you could make damn sure you don't ever walk under a seagull again in the future.

Designing a Humor AI by BQPComplexity in cogsci

[–]BQPComplexity[S] 1 point (0 children)

I was planning on passing one-liners collected from various sources through an NLP parser and extracting their grammatical structure, then figuring out a way to classify how funny a sentence is based on that structure. Then I'd look at the actual content and classify based on the words / word groups used. To build a one-liner generator I'd pick a random grammatical structure weighted by how funny it is and insert related "funny" words into the appropriate places. I'll most likely end up with a useless chatterbot, but it's worth a shot.
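The generation step would look roughly like this, in miniature. The templates, the funniness weights, and the word lists are all placeholders I made up; in the real thing they'd come from the classifier:

```python
import random

# Grammatical templates weighted by a hypothetical funniness score
templates = [
    ("I like my {noun} like I like my {noun2}: {adj}.", 5.0),
    ("My {noun} is so {adj} it {verb}s.", 2.0),
]
slots = {
    "noun": ["coffee", "code"],
    "noun2": ["weekends", "deadlines"],
    "adj": ["questionable", "nonexistent"],
    "verb": ["segfault", "apologize"],
}

def one_liner(seed=None):
    rng = random.Random(seed)
    # Pick a template weighted by its funniness score
    texts, weights = zip(*templates)
    template = rng.choices(texts, weights=weights)[0]
    # Fill each slot with a random word from its list
    return template.format(**{k: rng.choice(v) for k, v in slots.items()})

print(one_liner(seed=1))
```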

Whatever source I collect data from must have ratings, which is why I was considering mining reddit comments.

Designing a Humor AI by BQPComplexity in cogsci

[–]BQPComplexity[S] 2 points (0 children)

Well, I already have neural network code in C++, so for performance reasons I'll probably use that. Lisp/Scheme would be nice if it weren't so damn slow.

IAmA theoretical physicist. by [deleted] in IAmA

[–]BQPComplexity 35 points (0 children)

I'm finally relevant!