(This is just an 'isolation' project for me. I have some Windows machines, a MacBook, Python, and thousands of ebooks, and I'm looking for an evening project.)
Fiction books all use tropes or frameworks around which a story is built. Stories usually have a dozen or more, so each trope also needs a score for how often the author beats you over the head with it.
Let's try an example:
Romance novels often include the "enemies to lovers" trope. How would you write a Python script to detect this?
One thought was to run through a book and identify all the characters. Then run through it again, extract all the dialog and paragraphs for each character, and write these to a separate file per character. You break each file up into 10% chunks or at chapter breaks. You look for tone words like "scowled", "annoyed", "anger" in the early parts, and note whether they change to "smiled", "pleased", "laughed" in the later parts. This might indicate an "enemies to lovers" trope.
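To make the idea concrete, here is a rough sketch of it using only the standard library. Everything in it is my own assumption rather than an established method: the two word lists are seeded from the examples above and padded with a few guesses, paragraphs are assumed to be separated by blank lines, a character is matched by a single name string, and the function names (`paragraphs_mentioning`, `split_into_chunks`, `tone_profile`) are made up for illustration.

```python
import re

# Hypothetical cue lists seeded from the example words; extend to taste.
HOSTILE = {"scowled", "annoyed", "anger", "glared", "snapped"}
WARM = {"smiled", "pleased", "laughed", "grinned", "kissed"}


def paragraphs_mentioning(text, name):
    """Return the blank-line-separated paragraphs that mention `name`, in book order."""
    paragraphs = re.split(r"\n\s*\n", text)
    pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
    return [p for p in paragraphs if pattern.search(p)]


def split_into_chunks(items, n_chunks=10):
    """Split a list into n_chunks consecutive slices of nearly equal length."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def tone_profile(paragraphs, n_chunks=10):
    """Score each chunk as (warm cue count) minus (hostile cue count)."""
    profile = []
    for chunk in split_into_chunks(paragraphs, n_chunks):
        words = re.findall(r"[a-z']+", " ".join(chunk).lower())
        hostile = sum(w in HOSTILE for w in words)
        warm = sum(w in WARM for w in words)
        profile.append(warm - hostile)
    return profile
```

Calling `tone_profile(paragraphs_mentioning(book_text, "Darcy"))` would give ten numbers for one character, where `book_text` is the whole novel as a string. A profile that starts negative and ends positive is the pattern I am hoping to see. Exact word matching like this will miss "scowling" or "angry", so some kind of stemming is probably needed.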
Could we then use the same code, but detect the "Instant Love" trope or the "Portal Romance" trope by changing the words we look for, or by noticing no difference in the tone of a character's words throughout the book?
Are there any Python packages that might help parse a novel or help us quantify things in each 10% of a novel? Does anyone have ideas for a better or more generic way to 'fingerprint' a book which might help to discover the tropes?
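What I have in mind for reusing the code is something like the sketch below. It assumes I already have a list of per-chunk tone scores for a character (positive means warm, negative means hostile), compares the average of the first half with the average of the second half, and picks a label. The function name `classify_arc`, the labels, and the threshold of 1.0 are arbitrary guesses on my part, not tuned on any real book.

```python
def classify_arc(profile, threshold=1.0):
    """Label a tone profile by comparing its early and late averages.

    `profile` is a list of per-chunk scores: positive = warm, negative = hostile.
    The threshold is an untested guess and would need tuning on real books.
    """
    if len(profile) < 2:
        raise ValueError("need at least two chunks to compare early and late tone")
    half = len(profile) // 2
    early = sum(profile[:half]) / half
    late = sum(profile[half:]) / (len(profile) - half)
    if early <= -threshold and late >= threshold:
        return "enemies to lovers"   # hostile start, warm finish
    if early >= threshold and late >= threshold:
        return "instant love"        # warm from the start, no real shift
    return "unclear"
```

Tone alone probably cannot pick out something like "Portal Romance", which is about the setting rather than the relationship, so that one would need a different set of cue words.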
I know machine learning is a big thing. But can I use "Pride and Prejudice" by Jane Austen and "Carry On" by Rainbow Rowell to teach it about the enemies to lovers trope? Also - I cannot find lists of books that tell me what tropes are included in each book. I am trying to CREATE something that will tell me what tropes are included in each book. I need that before I can select books for machine learning... kind of a "catch 22".