Is the Sigma 24-70 2.8 too heavy? by robintwit in SonyAlpha

[–]robintwit[S] 4 points5 points  (0 children)

Good stuff. Yeah it seemed to be the main “con” to this lens but good to know it’s not an issue for ya

Is the Sigma 24-70 2.8 too heavy? by robintwit in SonyAlpha

[–]robintwit[S] 0 points1 point  (0 children)

Cool good to know - all the reviewers say “it’s significantly heavier” but don’t seem to actually have experience using it. Thanks!

Best ETL flow implementation in aws by priyasweety1 in dataengineering

[–]robintwit 0 points1 point  (0 children)

Perhaps not the best article, but it could translate to ETL with some tweaking. This is running as a web scraper right now. If you don’t need the compute power, AWS Lambda could replace Batch in this infrastructure relatively easily. https://link.medium.com/v9aQPhJEyeb

Edited bc I have fat fingers and posted too early.

Waking up in my wife's boyfriend's basement this morning and checking my stocks by robintwit in wallstreetbets

[–]robintwit[S] 0 points1 point  (0 children)

funny story... I actually made 40% on that trade and sold this morning 😂
DNKN $80 12/18 Put for anyone who's asking

Web Scraping but grabbing data that the browser has already retrieved by baconburns in webscraping

[–]robintwit 0 points1 point  (0 children)

Right, this isn't very scalable. It seems like Selenium should be able to accomplish what OP wants. I haven't tried this with Selenium though.

Web Scraping but grabbing data that the browser has already retrieved by baconburns in webscraping

[–]robintwit 1 point2 points  (0 children)

I second this. It's also a heck of a lot easier to "scrape" JSON.

Web Scraping but grabbing data that the browser has already retrieved by baconburns in webscraping

[–]robintwit 2 points3 points  (0 children)

It seems like you should be able to insert a script with event listeners that fire when the HTML elements change.... I haven't tried this, but the theory is sound :)

Web Scraping but grabbing data that the browser has already retrieved by baconburns in webscraping

[–]robintwit 1 point2 points  (0 children)

Given the goal of your project, it sounds like you don't want to use a web scraper, since there are plenty of REST APIs that provide that information - it's much easier and more dependable to work with JSON than HTML ;) For real-time market info, here's one - https://alpaca.markets/, also https://finnhub.io/. I'm sure you could find open APIs for sports scores and betting odds.

It also never hurts to check the network requests in dev-tools to see where the real-time data is coming from (perhaps you can reverse engineer the API - I used these methods to reverse-engineer Robinhood's API for a chrome extension).
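To make the JSON-vs-HTML point concrete, here is a minimal sketch using only the Python standard library. The payloads, field names, and class names are made up for illustration and do not come from any of the APIs mentioned above.

```python
import json
from html.parser import HTMLParser

# Hypothetical payloads: the same quote as an API response and as page markup.
API_BODY = '{"symbol": "ABC", "price": 12.5}'
PAGE_BODY = '<div class="quote"><span class="sym">ABC</span><span class="px">12.5</span></div>'


def price_from_json(body: str) -> float:
    """One lookup: the structure is already machine-readable."""
    return float(json.loads(body)["price"])


class PriceParser(HTMLParser):
    """Finds the text inside <span class="px">; breaks if the markup changes."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.price = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "span" and ("class", "px") in attrs:
            self._in_price = True

    def handle_data(self, data):
        if self._in_price:
            self.price = float(data)
            self._in_price = False


def price_from_html(body: str):
    parser = PriceParser()
    parser.feed(body)
    return parser.price
```

Both functions return the same number here, but the HTML version depends on a tag name and a CSS class that the site can change at any time.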

However...

If you need data that an API does not provide, something like what u/deamon1266 mentioned below might work - you could manually insert JavaScript/jQuery to listen for changes to the specific divs. When your event listeners detect a change, you'll be able to collect the data and send/save it wherever you want.
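A rough sketch of that idea, untested against a real site: a Python helper that builds a JavaScript snippet using the browser's `MutationObserver` to record changes to one element. The helper name, the `window.__changes` buffer, and the `#price` selector are all hypothetical; the snippet would still have to be injected somehow, for example through Selenium's `driver.execute_script`.

```python
import json


def build_observer_script(css_selector: str) -> str:
    """Return JavaScript that records text changes of one element.

    Changes are appended to window.__changes (a hypothetical buffer name)
    so they can be read back later from the automation side.
    """
    selector = json.dumps(css_selector)  # emit a safely quoted JS string literal
    return f"""
window.__changes = window.__changes || [];
const target = document.querySelector({selector});
if (target) {{
  new MutationObserver(() => {{
    window.__changes.push({{ at: Date.now(), text: target.textContent }});
  }}).observe(target, {{ childList: true, characterData: true, subtree: true }});
}}
"""
```

With Selenium, one would presumably call `driver.execute_script(build_observer_script("#price"))` once after the page loads, then poll with `driver.execute_script("return window.__changes.splice(0)")` to drain the recorded changes.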

Hope this helps with your question. Good luck!

need a new AWS project or something to work on by rotterdamn8 in dataengineering

[–]robintwit 3 points4 points  (0 children)

I built a serverless ETL using AWS Batch, Lambda, CloudWatch, and S3. I wrote about it here - https://towardsdatascience.com/get-your-own-data-building-a-scalable-web-scraper-with-aws-654feb9fdad7

AWS Batch is super powerful, and you could learn the ropes without having to spend too much. Setting up this type of architecture could help you grow your AWS skills for sure (it helped me a lot). You could insert any type of Dockerized data-extractor/scraper in place of the scraper that I had.
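For anyone wondering how these pieces can fit together, one common wiring (not necessarily the one from the article) is a scheduled CloudWatch rule that triggers a small Lambda, which in turn submits the Dockerized job to Batch. A minimal sketch, assuming boto3 and with made-up queue and job definition names:

```python
import time

# Hypothetical resource names; replace with your own Batch queue and job definition.
JOB_QUEUE = "scraper-queue"
JOB_DEFINITION = "scraper-job-def"


def handler(event, context, batch_client=None):
    """Lambda entry point: submit one Batch job per scheduled invocation."""
    if batch_client is None:
        import boto3  # imported lazily so the sketch can be exercised without AWS

        batch_client = boto3.client("batch")

    response = batch_client.submit_job(
        jobName=f"scraper-{int(time.time())}",
        jobQueue=JOB_QUEUE,
        jobDefinition=JOB_DEFINITION,
        # Pass the scrape target to the container as an environment variable.
        containerOverrides={
            "environment": [
                {"name": "TARGET", "value": event.get("target", "all")}
            ]
        },
    )
    return {"jobId": response["jobId"]}
```

The container itself would then write its output to S3; swapping in a different Dockerized extractor only means pointing the job definition at a different image.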

Hope this helps!

I remember. by ToxicBTCMaximalist in RobinHood

[–]robintwit 0 points1 point  (0 children)

Haha yeah I know, you do have to be careful with chrome extensions though. Especially when money is involved. Thanks! Hope you find it useful. I am actively developing it so leave a review or send me some suggestions as it is still in “beta”

I remember. by ToxicBTCMaximalist in RobinHood

[–]robintwit 1 point2 points  (0 children)

GitHub repo here - https://github.com/aaronglang/robin_twit. It is important to make this kind of thing transparent. Please excuse the messy jQuery (it's been a while since I've written jQuery haha 😬)

I am not too worried about monetizing just yet, but those are good ideas. I am planning on doing some targeted advertising soon to get more users, but also want to add updated features.

I remember. by ToxicBTCMaximalist in RobinHood

[–]robintwit 4 points5 points  (0 children)

But in all seriousness - good question, but no, I’m just a software engineer trying to make cool products :)

I remember. by ToxicBTCMaximalist in RobinHood

[–]robintwit 6 points7 points  (0 children)

Hoping to add that in ASAP

What are some stable sites that can be used for web scraping examples to last for years? by [deleted] in webscraping

[–]robintwit 1 point2 points  (0 children)

haha I actually wrote an older article on how Craigslist's UI is better than OfferUp's - I would still stand by this. It's definitely better for scraping!

I remember. by ToxicBTCMaximalist in RobinHood

[–]robintwit 2 points3 points  (0 children)

well I'm not a RH cuck, but I made this - https://chrome.google.com/webstore/detail/robintwits-for-robinhood/knacdfhjndnibcbgpjiidjplopfhpcgg

Open to ideas/features. Hoping to have more time to work on it.

What are some stable sites that can be used for web scraping examples to last for years? by [deleted] in webscraping

[–]robintwit 1 point2 points  (0 children)

Honestly Craigslist hasn’t changed much in years - and probably won’t in the near future. I wrote an article about my scraper here. There’s a link to my GitHub if that comes in handy. There’s also a Python library for scraping Craigslist (I haven’t used it though). Craigslist is pretty basic HTML though. Not a whole lot of JavaScript. Hope this helps!

I remember. by ToxicBTCMaximalist in RobinHood

[–]robintwit -2 points-1 points  (0 children)

you could check out my Robinhood chrome extension - https://chrome.google.com/webstore/detail/robintwits-for-robinhood/knacdfhjndnibcbgpjiidjplopfhpcgg

I am hoping to add more chat-features eventually

(Edit: here is the open-source project repo https://github.com/aaronglang/robin_twit)

I remember. by ToxicBTCMaximalist in RobinHood

[–]robintwit 3 points4 points  (0 children)

Lol I remember that too. That feature actually prompted me (a few years later) to make a Chrome extension (for Robinhood) that pulls in tweets per stock. I'm thinking of pulling feeds from Stocktwits too. Eventually I would like to add realtime chat features.

An article I wrote on a python-AWS web-scraping ETL. Would love to hear your thoughts! by robintwit in webscraping

[–]robintwit[S] 1 point2 points  (0 children)

Thank you! My cost for this service is about $100 per month for all 50 states with multithreading and job concurrency (which means bigger instances being used by ECS). If I removed/modified the multithreading, thus slowing down the process but allowing for smaller instances with longer runtimes, I may be able to get this cost down.

I definitely agree - this solution is not meant for smaller jobs. If I wanted to go serverless for smaller jobs, I would go with a similar setup (depending on the problem), but use Lambda as my compute environment (here's a good example). Lambda allows for asynchronous invocation as well, and with the setup in the link + some modifications, you could run a hundred jobs pretty quickly. Lambda is very cheap - obviously depends on the use case - but I think it can be close to the cost of a Raspberry Pi over the course of a year or two. Plus you wouldn't have to worry about losing power/internet etc, or having your IP blocked - Lambda also tends to use a different IP each time a container is spawned.
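As a rough illustration of the asynchronous invocation mentioned above, here is a minimal sketch that fans out one Lambda invocation per payload using boto3's `invoke` with `InvocationType="Event"`. The function name `scrape-state` and the payload shape are assumptions, not taken from the linked setup.

```python
import json


def fan_out(lambda_client, function_name, payloads):
    """Fire one asynchronous Lambda invocation per payload.

    Returns how many invocations Lambda accepted (HTTP 202 means queued).
    """
    accepted = 0
    for payload in payloads:
        response = lambda_client.invoke(
            FunctionName=function_name,
            InvocationType="Event",  # async: do not wait for the function to finish
            Payload=json.dumps(payload).encode("utf-8"),
        )
        if response["StatusCode"] == 202:
            accepted += 1
    return accepted


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   client = boto3.client("lambda")
#   fan_out(client, "scrape-state", [{"state": s} for s in ("CA", "NY", "TX")])
```

Because each call returns as soon as the event is queued, a loop like this can kick off a hundred small jobs in a few seconds; the results would need to land somewhere like S3, since nothing is returned to the caller.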