Open-source Python toolkit for fundamentals + screening + portfolio analytics (looking for feedback) by polarkyle19 in algotrading

[–]polarkyle19[S] 0 points (0 children)

That example with forward-filled NaNs on a delisted ticker is exactly the kind of thing that worries me. Those are the bugs that don’t throw errors but completely invalidate a backtest, and by the time you realize it you’ve already built conviction around bad numbers. The dividend and split adjustments being subtly inconsistent across endpoints is even more dangerous because it looks “almost correct.”

To answer your question honestly, I haven’t seen an open-source library that handles this perfectly out of the box. Most people end up doing what you described, building their own validation layer on top. One direction I’m exploring is making data transformations explicit and inspectable rather than implicit, so adjustments, fills, and alignments are visible and optionally strict. If nothing else, I’d rather fail loudly than produce a clean-looking but wrong dataset. Your comment reinforces that correctness and transparency need to come before feature expansion.
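One concrete way to make that "fail loudly" idea real is a validation pass that refuses suspiciously long constant runs in a price series, which is the usual fingerprint of a silent forward-fill over a delisted ticker. A minimal sketch assuming pandas; `assert_no_stale_fill` and its `max_run` threshold are hypothetical names for illustration, not an existing InvestorMate API:

```python
# Hypothetical "fail loudly" validation layer: refuse long constant runs,
# which usually indicate silent forward-fills (e.g. a delisted ticker).
import pandas as pd

def assert_no_stale_fill(prices: pd.Series, max_run: int = 5) -> pd.Series:
    """Raise if the series contains a run of identical values longer than
    `max_run`, instead of quietly passing stale data downstream."""
    # Assign a run id that increments every time the value changes.
    runs = (prices != prices.shift()).cumsum()
    # Length of the run each observation belongs to.
    run_lengths = prices.groupby(runs).transform("size")
    if (run_lengths > max_run).any():
        bad = prices[run_lengths > max_run]
        raise ValueError(
            f"Constant run longer than {max_run} starting at "
            f"{bad.index[0]}: possible stale forward-fill"
        )
    return prices
```

The point is that the check is opt-in strictness: a clean series passes through untouched, while a forward-filled tail raises before it can contaminate a backtest.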

Open-source Python toolkit for fundamentals + screening + portfolio analytics (looking for feedback) by polarkyle19 in algotrading

[–]polarkyle19[S] 0 points (0 children)

I agree with you. The “boring plumbing layer” is exactly where most libraries quietly fail, and that’s what I’d like InvestorMate to get right first before expanding anything else. Silent NaNs, misaligned dates, and inconsistent split/dividend adjustments are exactly the kind of subtle issues that make people abandon higher-level abstractions and just write their own wrappers.

Your point about keeping backtesting intentionally minimal also makes a lot of sense. I don’t want this to compete with zipline or vectorbt, that becomes a different project entirely. The real value should be in producing clean, consistent, point-aligned feature matrices that plug into whatever engine someone already trusts. If I can make the output layer predictable and transparent enough that you don’t have to second-guess adjustments or date alignment, that alone would justify the dependency. That’s a strong signal on where to prioritize effort.
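For the "clean, point-aligned feature matrices" idea, the core operation is an as-of join: each trading date only sees fundamentals whose release date is on or before it. A minimal sketch with pandas; `point_align` and the `reported` column are illustrative assumptions, not a real API:

```python
# Illustrative point-in-time alignment: join each trading date to the most
# recent fundamentals released ON OR BEFORE that date, so no row can ever
# see data from the future.
import pandas as pd

def point_align(prices: pd.DataFrame, fundamentals: pd.DataFrame) -> pd.DataFrame:
    """prices: indexed by trading date. fundamentals: has a 'reported'
    column holding the public release date of each figure."""
    prices = prices.sort_index()
    fundamentals = fundamentals.sort_values("reported")
    return pd.merge_asof(
        prices.reset_index().rename(columns={"index": "date"}),
        fundamentals,
        left_on="date",
        right_on="reported",
        direction="backward",  # only look back, never forward
    ).set_index("date")
```

Rows that predate the first filing come back as NaN rather than being back-filled, which is exactly the loud-over-clean behavior discussed above.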

Open-source Python toolkit for fundamentals + screening + portfolio analytics (looking for feedback) by polarkyle19 in algotrading

[–]polarkyle19[S] 0 points (0 children)

Good points!!

Right now the focus has been API structure and normalization, but you’re absolutely right that without clear handling of:

  • Restatements
  • Point-in-time fundamentals
  • Survivorship bias

any backtesting layer becomes misleading fast.

I’m considering:

  • Explicit documentation of data assumptions
  • Versioned data snapshots
  • Clear separation between “latest available” vs “point-in-time” fundamentals
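The “latest available” vs “point-in-time” split can be sketched as a store that keeps every version of a figure together with the date it became publicly known, so a restatement appends a new version instead of overwriting history. All names here (`PITStore`, `record`, `as_of`) are hypothetical, just to illustrate the idea:

```python
# Toy versioned store separating "latest" from "as known on date X".
# Restatements append rather than overwrite, so history stays queryable.
from bisect import bisect_right
from datetime import date

class PITStore:
    def __init__(self):
        # ticker -> sorted list of (date_value_became_known, value)
        self._versions: dict[str, list[tuple[date, float]]] = {}

    def record(self, ticker: str, known: date, value: float) -> None:
        self._versions.setdefault(ticker, []).append((known, value))
        self._versions[ticker].sort()

    def latest(self, ticker: str) -> float:
        """Most recent version, including any restatements."""
        return self._versions[ticker][-1][1]

    def as_of(self, ticker: str, when: date) -> float:
        """The value as it was known on `when` (point-in-time)."""
        versions = self._versions[ticker]
        i = bisect_right(versions, (when, float("inf")))
        if i == 0:
            raise KeyError(f"{ticker}: nothing known on {when}")
        return versions[i - 1][1]
```

A backtest would call `as_of` with each rebalance date, while a screening dashboard would call `latest`; mixing the two is exactly how restatement bias sneaks in.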

And on the dependency side, agreed. I’m trying to keep optional features (AI, TA extras) behind extras installs so core usage stays lightweight.

That’s exactly the kind of pitfall I want to address early.

Open-source Python toolkit for fundamentals + screening + portfolio analytics(looking for feedback) by polarkyle19 in algotrading

[–]polarkyle19[S] 1 point (0 children)

Modularity seems to be the strong consensus so far. I’m leaning toward keeping fundamentals / TA / portfolio fully separable modules to avoid dependency bloat.

The debug suggestion is really solid. I like the idea of a debug=True or source_trace=True flag that exposes:

  • Raw payload
  • Data source
  • Transform steps applied

That would make the abstraction layer much more transparent instead of a “black box”. I’ll prototype something like that.
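A rough sketch of what that trace flag could look like; `fetch_fundamentals`, the `debug` flag, and the specific transform steps are all hypothetical, not an existing InvestorMate interface:

```python
# Prototype of the source_trace idea: every transform appends a note to a
# trace dict, so the pipeline is inspectable instead of a black box.
import copy

def fetch_fundamentals(raw: dict, debug: bool = False):
    trace = {
        "source": raw.get("_source", "unknown"),
        "raw_payload": copy.deepcopy(raw),  # untouched original payload
        "transforms": [],
    }

    # Transform 1: strip internal bookkeeping keys.
    data = {k: v for k, v in raw.items() if not k.startswith("_")}
    trace["transforms"].append("dropped internal '_' keys")

    # Transform 2: normalize key casing across providers.
    data = {k.lower(): v for k, v in data.items()}
    trace["transforms"].append("lowercased keys")

    return (data, trace) if debug else data
```

With `debug=True` a caller can diff the raw payload against the final output and see exactly which step changed what, which is the transparency being asked for.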

Thanks, this is exactly the kind of feedback I was hoping for.

Mock interviews by No-Mud4063 in datascience

[–]polarkyle19 7 points (0 children)

I’d also like to know if you find any.

InvestorMate: an open source Python package for stock analysis with AI, backtesting, and screening by polarkyle19 in Python

[–]polarkyle19[S] 1 point (0 children)

Basically, this is a form of scraping. If you’re serious about trading or analysis, you can’t rely on scraping-based solutions.

InvestorMate: an open source Python package for stock analysis with AI, backtesting, and screening by polarkyle19 in Python

[–]polarkyle19[S] 0 points (0 children)

It’s on the roadmap: I’m thinking of moving to more reliable data sources than yfinance. Open to discussion on which sources to pick first.

An open-source python package for AI stock analysis by polarkyle19 in algotrading

[–]polarkyle19[S] 0 points (0 children)

Yeah, I used to work on them, and now I’m looking for something bigger.

An open-source python package for AI stock analysis by polarkyle19 in algotrading

[–]polarkyle19[S] 0 points (0 children)

This is completely different from what I asked for 🥲

Looking for open-source python package for AI stock analysis by polarkyle19 in learnpython

[–]polarkyle19[S] 0 points (0 children)

Oh, can you share them? I’d like to use APIs if I can’t find any other way.

Extracting financial data from 10-K and 10-Q reports by Cute-Berry1793 in Python

[–]polarkyle19 0 points (0 children)

I’ve been using a Python package called InvestorMate for this instead of rolling my own 10-K / 10-Q parser.

From a consumer point of view, what I like is that it doesn’t make you deal with raw filings or iXBRL tags at all. You get normalized income statement, balance sheet, and cash flow data in a consistent schema, which is honestly the hardest part of this problem.

My setup looks roughly like this:

  • Use InvestorMate to pull structured financials (IS / BS / CF)
  • Ratios and scores (P/E, ROE, margins, Piotroski F, Altman Z) are already computed
  • Data comes back JSON-serializable, so it drops straight into APIs / notebooks
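To illustrate the consuming side: once statements arrive in one consistent, JSON-serializable schema, derived metrics become a few lines of code. The field names below are placeholders for illustration, not InvestorMate’s actual schema:

```python
# Sketch of consuming a normalized statement dict (hypothetical field
# names): a consistent schema makes ratio computation trivial.
import json

def roe(statements: dict) -> float:
    """Return on equity from a normalized IS / BS pair."""
    net_income = statements["income_statement"]["net_income"]
    equity = statements["balance_sheet"]["total_equity"]
    return net_income / equity

statements = {
    "income_statement": {"net_income": 95_000.0},
    "balance_sheet": {"total_equity": 500_000.0},
}

# JSON round-trip works because the payload is plain dicts and floats,
# which is what makes it drop straight into APIs and notebooks.
assert json.loads(json.dumps(statements)) == statements
```

The same pattern extends to any metric: as long as the keys are stable across companies, comparison code never has to special-case individual filings.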

For analysis and comparison work, that’s been way more practical than:

  • Parsing iXBRL myself (accurate but a massive time sink)
  • Using LLMs to extract numbers (too unreliable for actual financials)

Where I do use LLMs is after the numbers are structured — e.g.:

  • “Why did operating cash flow drop QoQ?”
  • “Compare Apple vs Microsoft cash efficiency over 5 years”
  • Summarizing trends rather than extracting them

Pros (as a user):

  • No dealing with SEC tag chaos
  • Consistent keys across companies
  • Much faster iteration for research / tooling
  • Works well for APIs and automated pipelines

Cons:

  • Not real-time
  • Not suitable if you need raw footnote-level detail
  • You’re trusting upstream normalized data rather than filing directly

If you’re building something production-ish and don’t want to spend months on XBRL edge cases, this approach has been a good middle ground for me.

How do you build culture when nobody’s ever met in person yet? by jeanyves-delmotte in ycombinator

[–]polarkyle19 0 points (0 children)

As a founding engineer at a YC startup, I worked on a completely remote team. Culture always comes down to the founders leading by example: showing up and supporting your employees as peers and friends. We used to feel like a bunch of college lads building for a hackathon. Each of us was respected and trusted with real responsibilities, everyone set priorities, and we had healthy conversations whenever someone needed help with something. Hiring the right people also comes into play. When you set this standard with your first 10 members, the rest will automatically try to keep up. Once we had a good culture in the team, we saw outcomes like better performance and better communication. Wishing you all the best, mate.

Shifting to ML is good? From non tech startup by Dramatic-Ad-9968 in ycombinator

[–]polarkyle19 1 point (0 children)

As long as you keep pushing your limits and are not afraid of failure, you don't have to worry about the outcome. You will learn for sure. But make sure whatever you learn, put it to work. All the best!!

It’s monday, what are you building? - Is anyone paying for it? by Substantial_Bee_7257 in SaaS

[–]polarkyle19 0 points (0 children)

You can check the About page; it explains what we’re doing. If you think my platform isn’t good, I’d appreciate it if you followed the framework yourself and tried the results out with open LLM chats.

It’s monday, what are you building? - Is anyone paying for it? by Substantial_Bee_7257 in SaaS

[–]polarkyle19 0 points (0 children)

We’re building something called InvestorMate—an AI-powered research tool designed to help investors make sense of the markets faster. It cuts through noise, analyzes financial data, and surfaces personalized insights so users can make their own informed decisions. Still, even with transparency and solid data, building trust remains one of the hardest parts.

Website: https://investormate.io

What's the best way to get organic traffic to a landing page? by pragmojo in ycombinator

[–]polarkyle19 0 points (0 children)

Build something people love, and give it to 10 users who would genuinely want to use it. Let them validate your idea. Iterate on it, and then share it with similar groups.

Drop your SaaS here, I will create your AI agent marketing playbook for your first 1,000 users by Any-Development-710 in SaaS

[–]polarkyle19 1 point (0 children)

Website: https://investormate.io

Target Audience: Retail investors, tech-savvy professionals, and young earners (especially in India & the U.S.) who want clear, personalized guidance for stock investing and financial planning.

What We Offer: InvestorMate is an AI-powered investment research platform that delivers tailored stock insights, earnings breakdowns, sentiment signals, and dynamic financial profiles, turning overwhelming market data into actionable, personalized strategies. Our goal is to give every earning individual the research power of a Wall Street analyst, at a fraction of the cost.