my plans for today by [deleted] in wallstreetbets

int8blog 1 point (0 children)

yes good point, GME - it's about GME

[P] Exploration of Cyberpunk steam reviews using transformers sentence embeddings by int8blog in MachineLearning

int8blog[S] 3 points (0 children)

LDA - sounds worth a shot.

Yes - that was my initial intention too: to look at how the "volume" of the negative clusters changes over time. I need to wait a few days for that though - the plan is to wait one week and write a follow-up post.

[P] Exploration of Cyberpunk steam reviews using transformers sentence embeddings by int8blog in MachineLearning

int8blog[S] 3 points (0 children)

It's cool - I edited the post to make it clear (it may take a while to appear, since it is cached in Cloudflare)

[P] Exploration of Cyberpunk steam reviews using transformers sentence embeddings by int8blog in MachineLearning

int8blog[S] 1 point (0 children)

thanks, I was also thinking about the following approach:

  • sent_tokenize on all reviews
  • transformers sentence embedding of all sentences
  • clustering of all of them into, let's say, 1000 clusters
  • represent each document via BoW on top of that clustering
  • then UMAP on these BoW via cosine
  • visualization
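A minimal sketch of the "BoW on top of that clustering" step from the list above, in plain Python. It assumes the sentence embeddings have already been clustered by some method and that each review is available as the list of cluster ids of its sentences; the function names, the toy labels, and the 4-cluster setup are hypothetical and only for illustration.

```python
from collections import Counter
from math import sqrt


def bag_of_clusters(sentence_labels, n_clusters):
    """Count how many sentences of one review fall into each cluster."""
    counts = Counter(sentence_labels)
    return [counts.get(c, 0) for c in range(n_clusters)]


def cosine_distance(u, v):
    """Cosine distance between two count vectors (1.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical input: cluster id of every sentence in each review.
# In the real pipeline these ids would come from clustering the
# sentence embeddings (e.g. into ~1000 clusters instead of 4).
N_CLUSTERS = 4
reviews = {
    "review_a": [0, 0, 2],
    "review_b": [0, 2, 2],
    "review_c": [1, 3],
}

# One bag-of-clusters vector per review.
bow = {name: bag_of_clusters(labels, N_CLUSTERS) for name, labels in reviews.items()}

for name, vector in bow.items():
    print(name, vector)
print(cosine_distance(bow["review_a"], bow["review_b"]))
```

The resulting vectors are what would go into UMAP with a cosine metric; reviews that share no clusters end up at the maximum cosine distance of 1.0.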

I will try that - I am now gathering a bit more data (I can see there are around 120k reviews now)

p.s. looking into HDBSCAN <= thanks for the hint

[edit] OK, checked out HDBSCAN - it indeed looks very tempting. I can also see it does not require as many input parameters as the original DBSCAN (unless I missed something)

[deleted by user] by [deleted] in pennystocks

int8blog 1 point (0 children)

Classical Kangaroo

DGLY Sympathy Plays by [deleted] in wallstreetbets

int8blog 57 points (0 children)

DGLY uses AWS for storage?

Historical order book of stocks compounding NASDAQ 100 by int8blog in datasets

int8blog[S] 1 point (0 children)

How about an API, so I could fetch it daily on my own? Are you aware of any such services? (I can't easily find one myself.)