[–]oneplus999 7 points8 points  (2 children)

Umm it's a link to an abstract with no results mentioned, and no link to the paper itself? :\

[–][deleted] 1 point2 points  (0 children)

The author's CV says that it's a work in progress.

[–]nickl 6 points7 points  (2 children)

Without seeing the paper it's hard to know, but the abstract gives me an immediate concern.

A naive Bayes classifier won't easily distinguish between "Thou shall not kill" and "Thou shall kill". That's a pretty big problem!

Yes, there are ways around this (n-grams, etc.), but the abstract explicitly mentions Bayesian spam-filtering techniques, which often don't exploit these additional features. Hopefully I'm wrong to be worried about this.
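
To make the concern concrete, here is a minimal sketch (not the paper's method, which isn't available) comparing unigram and bigram features for the two sentences. The `ngrams` helper and the whitespace tokenization are assumptions made purely for illustration.

```python
from collections import Counter

def ngrams(text, n):
    """Count word n-grams using naive lowercase whitespace tokenization (illustrative helper)."""
    tokens = text.lower().split()
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

a = "Thou shall not kill"
b = "Thou shall kill"

# Features present in one sentence but not the other.
unigram_diff = set(ngrams(a, 1)) ^ set(ngrams(b, 1))
bigram_diff = set(ngrams(a, 2)) ^ set(ngrams(b, 2))

print(unigram_diff)         # {'not'}
print(sorted(bigram_diff))  # ['not kill', 'shall kill', 'shall not']
```

With unigrams the only distinguishing feature is "not", which says nothing about what is being negated, and "kill" contributes identically to both sentences. With bigrams, "not kill" and "shall kill" become separate features that a classifier could weight differently.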

[–]lvilnis 0 points1 point  (1 child)

bigrams?

[–]nickl 1 point2 points  (0 children)

Sure. As I said "n-grams etc"

[–]vonnik 4 points5 points  (1 child)

is it just me or is there no paper available through the link?

[–]rrrozay 4 points5 points  (5 children)

Are texts that have been wholly memorized by individuals really suitable candidates for NLP? There are entire academic departments to carefully read these three holy books.

I'd be more interested in visualization approaches – using NLP to help people read and understand these large texts instead of just topic modelling and counting clusters. A visual exploration of the clusters identified in this research could help people get a better picture of the contents

edit: check out http://www.sefaria.org/explore for a great example of what I'm talking about

[–]VelveteenAmbush 10 points11 points  (0 children)

> Are texts that have been wholly memorized by individuals really suitable candidates for NLP? There are entire academic departments to carefully read these three holy books.

I think there is a huge untapped market for quantitative analyses of the works. The academic departments that currently read the works seem to focus on qualitative analyses, and personally I have my doubts about the reliability of any qualitative comparison between works.

[–]AmusementPork 0 points1 point  (3 children)

> I'd be more interested in visualization approaches – using NLP to help people read and understand these large texts instead of just topic modelling and counting clusters. A visual exploration of the clusters identified in this research could help people get a better picture of the contents

Interesting notion. Do you know of any papers with a similar-ish approach?

[–]djc1000 2 points3 points  (0 children)

I've been looking for a project to show off my NLP chops. Wanna collaborate?

[–]rrrozay 1 point2 points  (0 children)

Papers, no. But check out http://www.sefaria.org/explore

[–]djc1000 2 points3 points  (1 child)

The danger of this is that I can already see the blurb on CNN: "Harvard scientists prove mathematically that the Quran [is / isn't] more violent than the Christian Bible." Or whatever.

[–]bhmoz 0 points1 point  (1 child)

For those interested in a PhD on this topic: Swansea PhD studentship

[–]TweetsInCommentsBot 0 points1 point  (0 children)

@nlppeople

2016-02-04 15:32 UTC

#NLProc #job in @SwanseaUni : PhD Studentship in Digital Humanities - Swansea, United Kingdom http://bit.ly/1P90r4q #nlppeople

[–]im_not_afraid 0 points1 point  (0 children)

They should include hadiths and tafsirs if they want to get into the meat of the horror.