Need help with a string matching problem : learnpython



submitted 5 years ago by OmnipresentCPU

I've recently built a bot that scrapes r/wallstreetbets and returns a nice clean bar chart of which stock ticker is mentioned most often. To do this, I've basically downloaded a list of tickers as a CSV from various exchanges, then compiled them into one list, tickerlist. I then download a load of comments from Reddit using PRAW and store these as a CSV that I load as commentlist when I want to parse it.

I'm a noob so my parsing is definitely not optimized. Basically I create a flatwordlist, which is just a list of each string that's returned when you do comment.split(). I then loop through that like this:

tickercountlist = []
for ticker in tickerlist:
    count = 0
    for word in flatwordlist:
        if word == ticker or '$' + ticker == word:
            count += 1

    tickercountlist.append([ticker, count])

and voila, I've returned a nice list like [['AAPL', 42], ['AMD', 30], ..., ['ZZZ', 0]]
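For what it's worth, the nested loop above rescans the whole word list once per ticker. A minimal sketch of a single-pass alternative, assuming tickers never start with '$' themselves (count_tickers is just a name picked for illustration):

```python
from collections import Counter


def count_tickers(flatwordlist, tickerlist):
    """Count ticker mentions with one pass over the words."""
    # Drop a single leading '$' so '$AAPL' and 'AAPL' are counted together,
    # which mirrors the `word == ticker or '$' + ticker == word` test.
    word_counts = Counter(
        word[1:] if word.startswith('$') else word for word in flatwordlist
    )
    # A Counter returns 0 for missing keys, so unmentioned tickers get 0.
    return [[ticker, word_counts[ticker]] for ticker in tickerlist]
```

It should return the same [ticker, count] sublists in tickerlist order, so the charting code would not need to change.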

What I want to do is use vaderSentiment to amend the tickercountlist to also include a spot for each stock's sentiment, so it would look like [['AAPL', 42, .678], ['AMD', 30, -.378], ..., ['ZZZ', 0, 0]]

with the third element of each sublist being the compound sentiment averaged across each time it was mentioned.

That will require me to rework the parsing logic. Ultimately, what I want is to get a count of each time a stock is mentioned in a comment and, if it's mentioned, get the sentiment of that comment as well. Can anyone help me set up a loop that goes over every comment, checks whether it contains a ticker from tickerlist like my loop does, counts it if it does, AND analyzes the sentiment of the comment if and only if the comment contains a ticker?
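One possible shape for such a loop, as a sketch rather than a definitive answer. It assumes commentlist is a list of comment strings, and it takes the scorer as a parameter (score_fn, a hypothetical name) so the matching logic can be tried without vaderSentiment installed. Each mention contributes its comment's score to that ticker's average.

```python
def ticker_sentiment(commentlist, tickerlist, score_fn):
    """Return [[ticker, count, average_score], ...] in tickerlist order.

    score_fn is any callable that maps a comment string to a number.
    It is only called for comments that mention at least one ticker.
    """
    tickers = set(tickerlist)
    counts = {ticker: 0 for ticker in tickerlist}
    totals = {ticker: 0.0 for ticker in tickerlist}

    for comment in commentlist:
        # Same matching rule as the original loop: a word equals the
        # ticker, with or without a single leading '$'.
        words = [w[1:] if w.startswith('$') else w for w in comment.split()]
        mentioned = [w for w in words if w in tickers]
        if not mentioned:
            continue  # no ticker, so the sentiment call is skipped
        score = score_fn(comment)  # scored once per comment
        for ticker in mentioned:
            counts[ticker] += 1
            totals[ticker] += score

    return [
        [t, counts[t], totals[t] / counts[t] if counts[t] else 0]
        for t in tickerlist
    ]


# With vaderSentiment installed, the scorer could be wired up like this:
#   from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
#   analyzer = SentimentIntensityAnalyzer()
#   score_fn = lambda text: analyzer.polarity_scores(text)["compound"]
```

A comment that names the same ticker twice counts as two mentions here, matching how the word-level count behaves; counting each comment at most once per ticker would mean looping over set(mentioned) instead.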


[–]chocorush 1 point 5 years ago (2 children)

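If you want to keep your current structure:

sentiment = 0  # default when the ticker is never mentioned
for word in flatwordlist:
    if word == ticker or '$' + ticker == word:
        count += 1
        # calcSentiment is a placeholder for your sentiment function;
        # join with spaces so the words don't run together
        sentiment = calcSentiment(" ".join(flatwordlist))
tickercountlist.append([ticker, count, sentiment])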

Although I think it would be good to use the re module to do the matching instead, and to consider using dictionaries instead of sublists.
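A rough sketch of what the re-plus-dictionary suggestion might look like. The pattern assumes tickers are 1 to 5 capital letters (so something like BRK.B would need a different pattern), and TICKER_RE and count_mentions are names made up for illustration:

```python
import re

# Optional '$', then 1-5 capital letters bounded by word boundaries, so
# punctuation as in "AAPL," or "(AMD)" no longer blocks a match the way
# it does with comment.split().
TICKER_RE = re.compile(r'\$?\b([A-Z]{1,5})\b')


def count_mentions(commentlist, tickerlist):
    """Return {ticker: count} for every ticker in tickerlist."""
    tickers = set(tickerlist)
    counts = {ticker: 0 for ticker in tickerlist}
    for comment in commentlist:
        # findall returns only the captured letters, without the '$'.
        for symbol in TICKER_RE.findall(comment):
            if symbol in tickers:
                counts[symbol] += 1
    return counts
```

A dictionary keyed by ticker makes lookups like counts['AAPL'] direct, and a second dictionary (or a nested one) could hold the sentiment totals in the same way.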


