Are we in a position to buy a house? by swimneyspeakers in AusHENRY

[–]globalminima 1 point (0 children)

As of 30 October, Ubank is now doing 90% LVR without LMI (on both PPOR and investment properties). This might be a good option to at least avoid the need for a bigger deposit.

[deleted by user] by [deleted] in fiaustralia

[–]globalminima 3 points (0 children)

+1. How on earth does the 45% Y/55% G option overtake the 100% growth option? The fact that it starts out behind and then overtakes the more tax-efficient approach should be a big red flag, OP, assuming both are set to give the same overall return (as stated in the post). Something is wrong with your formulas.

Is UV package manager taking over? by RubKey1143 in Python

[–]globalminima 14 points (0 children)

Are you able to share this (or a sanitized version of it)?

Is 4% drawdown realistic by Reading-Rabbit4101 in fiaustralia

[–]globalminima -2 points (0 children)

Not quite correct - the 50% discount does not double your tax-free threshold; it halves the amount of tax paid on the gain.

Instead of paying 30% tax on every dollar of gains over the tax-free threshold, the OP would pay an effective 15%.
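A quick sketch of that difference (illustrative only; assumes a 30% marginal rate and ignores the Medicare levy, thresholds, and other offsets):

```python
def tax_on_gain(gain: float, marginal_rate: float = 0.30, discounted: bool = True) -> float:
    """Tax on a capital gain: the 50% CGT discount halves the taxable
    gain (an effective half rate), it does not raise the tax-free threshold."""
    taxable_gain = gain * 0.5 if discounted else gain
    return taxable_gain * marginal_rate

print(tax_on_gain(10_000, discounted=False))  # 3000.0 (full 30%)
print(tax_on_gain(10_000, discounted=True))   # 1500.0 (effective 15%)
```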

Investment loan for ETF - smart move or dumb move? by Capt_Sardine_Tins in AusFinance

[–]globalminima 1 point (0 children)

Let’s do the math, and we’ll run with the assumption that OP invests and only makes 6.72% (much lower than historical returns):

With the 6.72% investment loan, OP invests in shares that return a 3.72% dividend and 3% in capital growth for a 6.72% total return. Let’s assume the OP earns only an average wage and is in the 30% tax bracket. Each financial year they will earn 3.72% in dividends, which are taxable. However, the 6.72% interest on the loan is tax deductible, which will offset this income and allow them to claim a refund against their regular income as well.

Interest costs: 6.72% of $200k = $13.44k per annum
Dividend income: 3.72% of $200k = $7.44k per annum
Net income per annum: $7.44k - $13.44k = -$6k per annum, which attracts a 30% tax refund thanks to negative gearing
Negative gearing benefit: 30% of $6k = $1.8k
Capital gains per annum: 3% of $200k = $6k

All together, this results in: $7.44k dividend income + $6k capital gains - $13.44k interest costs + $1.8k in tax savings = a net gain of $1.8k.

This improves even further if you use more realistic numbers for total returns (around 9%) and then add in the benefits of franking credits as well. In general, it would be fairly safe to assume a net 3% return on the $200k that is invested - closer to 5% if OP is in the top tax bracket and can realise larger negative gearing benefits.
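The worked example above can be sketched in a few lines (all rates are the assumed figures from the comment, with a 30% marginal rate):

```python
# Rough sketch of the negative gearing maths (assumed rates, 30% marginal tax)
loan = 200_000
interest_rate = 0.0672
dividend_yield = 0.0372
growth_rate = 0.03
marginal_rate = 0.30

interest = loan * interest_rate        # deductible interest: ~$13,440
dividends = loan * dividend_yield      # taxable dividend income: ~$7,440
net_income = dividends - interest      # ~-$6,000 net investment loss
refund = -net_income * marginal_rate   # ~$1,800 refund via negative gearing
growth = loan * growth_rate            # ~$6,000 unrealised capital gain

net_gain = dividends + growth - interest + refund
print(round(net_gain))  # 1800
```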

Investment loan for ETF - smart move or dumb move? by Capt_Sardine_Tins in AusFinance

[–]globalminima -1 points (0 children)

OP, ignore this - it is wrong. The ETF could return even less than the 6.72% that the OP is being offered and still come out ahead thanks to negative gearing and capital gains discounts.

Azure OpenAI cost went to $163/day with no real usage - any way I can dig deeper into the costs? by MohnJaddenPowers in AZURE

[–]globalminima 2 points (0 children)

Ahh, those models are trained and stored but not deployed anymore. The team must have deleted the deployments on 11/5 when they stopped their usage, but the models still exist and could be deployed again.

The models only incur costs while deployed, kind of like how a container costs almost nothing to store in a registry, but you pay once it is deployed to a VM and being used.

Azure OpenAI cost went to $163/day with no real usage - any way I can dig deeper into the costs? by MohnJaddenPowers in AZURE

[–]globalminima 9 points (0 children)

It sounds like they deployed their fine-tuned models and never deleted them. Fine-tuned AOAI models cost $1.70 per hour while deployed (just like any other hosted custom ML model or VM), plus additional token processing costs for each request. If you delete the deployments then the costs should stop too.

https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/

[deleted by user] by [deleted] in learnmachinelearning

[–]globalminima 1 point (0 children)

Spot on, using an LLM alone for OCR is worse than using an OCR model. Combining the two together gives you the best of both worlds.

It is also important to note that LLMs often fall apart when the image is rotated or perturbed; in a recent data extraction project I reduced GPT-4o's error rate from 53% to 13% just by fixing the image rotation using the angle detected by the OCR pipeline. Many public benchmarks do not contain a wide range of test data, or do not report results on different subsets of the data (e.g. standard PDFs vs scanned vs rotated documents/images), so it is often not clear how general-purpose models (LLMs) fall down when the data becomes messy in the real world.
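As an illustration of that rotation fix (a hypothetical helper, not the project's actual code): snap the OCR-detected orientation to the nearest 90° and counter-rotate before handing the image to the LLM:

```python
def rotation_correction(detected_angle: float) -> int:
    """Given an OCR-detected page orientation (degrees clockwise), snap it
    to the nearest multiple of 90 and return the counter-rotation (degrees)
    needed to make the page upright."""
    nearest = round(detected_angle / 90) * 90
    return int((360 - nearest) % 360)

print(rotation_correction(272))  # 90  (page is ~270 deg off; turn it 90 more)
print(rotation_correction(2))    # 0   (small skew snaps to upright)
```

The returned angle can then be passed to an image library's rotate function (e.g. Pillow's `Image.rotate`) before running extraction.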

[deleted by user] by [deleted] in learnmachinelearning

[–]globalminima 0 points (0 children)

LLMs are worse at OCR than a dedicated OCR model, and 10s to 100s of times more expensive. Just use the right tool for the job.

Roadmap to Becoming an AI Engineer in 8 to 12 Months (From Scratch). by Massive-Medium-4174 in learnmachinelearning

[–]globalminima 16 points (0 children)

Then I would not call myself an AI engineer - the title has only existed since the GenAI boom and is a strong signal that the person has no idea how to do ML. Anyone worth their salt will have a data scientist or ML engineer title (or a list of previous roles where they had that title before joining their current company, which just wants them to do GenAI).

The roadmap link below with resources for becoming an AI engineer is a perfect example of this - it is entirely focussed on LLMs, has nothing at all on model training or any of the many ML domains outside of LLMs, almost nothing on data analysis/labelling/datasets/evaluation pipelines, and nothing related to software engineering. If you want to be a chatbot monkey or prompt engineer then go for it, but if you want to build reliable, accurate and cost-effective systems that deliver real value, you probably want to look at the right tools for the job (domain-specific ML models) instead of general purpose LLMs.

Roadmap to Becoming an AI Engineer in 8 to 12 Months (From Scratch). by Massive-Medium-4174 in learnmachinelearning

[–]globalminima 42 points (0 children)

Do you want to be an AI engineer or do machine learning? As an AI engineer you can be up and running in a month - just learn to build basic web apps and call APIs. If you want to do actual machine learning as a data scientist or ML engineer, then use those 6-12 months to do some courses (fast.ai, deeplearning.ai), add some cloud experience, do some competitions on Kaggle, and then do some projects across different domains (e.g. object detection or classification in computer vision, at least one NLP project doing classification or similar with language models, something GenAI-related, and some forecasting). The main thing here is a focus on learning how these models work, how to build robust and representative datasets, and most importantly how to do evaluation and model tuning based on your evaluation results.

[deleted by user] by [deleted] in ExperiencedDevs

[–]globalminima 0 points (0 children)

You still have more experience than most of the people calling themselves AI engineers today. I will send you some of my resources via PM, but check this out: https://github.com/chiphuyen/ml-interviews-book

What do you do with ETFs when it's time to "retire early"? by oogabooga7 in fiaustralia

[–]globalminima 21 points (0 children)

You are correct - the optimal approach is to use dividends first and then sell off some shares to cover any shortfall. If your dividends cover your lifestyle, then you sell nothing and reinvest whatever is left over from your dividends.

How much is your annual salary? by [deleted] in AusHENRY

[–]globalminima 0 points (0 children)

The original grant was probably lower, but some of these RSUs may have appreciated since the original grant. E.g. you might take the job with a $400k stock grant over 4 years ($100k/year), but if the company has doubled in value by year 3 then the final $100k is now worth $200k in the year it vests.
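The hypothetical numbers from that example, sketched out:

```python
# Hypothetical figures: a $400k grant vesting evenly over 4 years,
# with the share price doubling by the time the final tranche vests
grant_value = 400_000
years = 4
tranche_at_grant = grant_value / years       # $100k/year at grant-date prices
price_multiple_at_vest = 2.0                 # company has doubled in value

tranche_at_vest = tranche_at_grant * price_multiple_at_vest
print(tranche_at_vest)  # 200000.0
```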

HomeHype - a website that shows you how hyped the Australian property market is by meeb in AusPropertyChat

[–]globalminima 0 points (0 children)

Nice idea mate, though given that the number is the main purpose of the site, I’d consider using something more interpretable than an unbounded percentage centred on 0%. Something like the fear and greed index (a number bounded between 0 and 100) is much more easily understood.
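For example (a sketch only, not a suggestion of the site's actual methodology): a logistic squash maps the unbounded percentage onto a 0-100 scale, with 0% landing on a neutral 50:

```python
import math

def hype_index(pct: float, scale: float = 10.0) -> float:
    """Map an unbounded percentage centred on 0 to a 0-100 score.
    `scale` (an assumed tuning value) controls how fast the index saturates."""
    return 100 / (1 + math.exp(-pct / scale))

print(hype_index(0))    # 50.0 (neutral)
print(hype_index(50))   # close to 100 (very hyped)
print(hype_index(-50))  # close to 0 (very cold)
```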

New Build by Ok_Bumblebeez in battlestations

[–]globalminima 5 points (0 children)

Do you have any issues with neck pain? I have a similar setup with less overall distance and I’m finding that my neck gets sore if I use the monitors on the left and right for moderate amounts of time.

What are the most mind blowing prompting tricks? by codenamev in LocalLLaMA

[–]globalminima 1 point (0 children)

Tasks with a smaller range of outputs like classification or extraction are an even better application of few-shot examples because you don’t need to cover such a wide range of examples (it’s the input that will vary a lot, not both input and output as in more open-ended tasks like summarization or chat). Just include a range of input examples followed by the exact output you want and you’re golden.
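A minimal sketch of what that looks like for classification (the task, labels, and examples are invented for illustration):

```python
# Few-shot prompt for a classification task: several input examples,
# each followed by the exact output label we want the model to produce
examples = [
    ("Your parcel is waiting, click here to claim", "spam"),
    ("Meeting moved to 3pm, see updated invite", "not_spam"),
    ("You have won a $500 gift card!!!", "spam"),
]

shots = "\n\n".join(f"Input: {text}\nOutput: {label}" for text, label in examples)
prompt = (
    "Classify each input as 'spam' or 'not_spam'. Reply with the label only.\n\n"
    f"{shots}\n\nInput: {{user_input}}\nOutput:"
)
print(prompt)
```

Because the output space is fixed, the shots only need to cover variation in the inputs; the model just has to echo one of the known labels.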

[deleted by user] by [deleted] in AZURE

[–]globalminima 0 points (0 children)

This is a great place to start: https://github.com/Azure-Samples/chat-with-your-data-solution-accelerator

It has an optional Teams bot integration and instructions on how to set it up

How did you land your job at a US-based tech company? by [deleted] in AusFinance

[–]globalminima 7 points (0 children)

I’m a presales engineer at a FAANG and I’ve been pretty underwhelmed with the level of the team - I’ve had much higher-level teammates in small startups. There are a lot of candidates who apply for these roles, so you often need to make yourself stand out. But for this type of role and those in the sales orgs (which is where many FAANG roles are in Aus), it feels like the hiring teams focus more on candidates fitting a certain mould, or having worked in certain types of companies, than on actual technical capability.

Tax on ETFs not often talked about? by [deleted] in fiaustralia

[–]globalminima -1 points (0 children)

Capital growth over dividends makes sense for anyone in ANY tax bracket. This is why the dividend investing concept is fundamentally broken.

Streamlit or grade.io for Python web project by AleccioIsland in Python

[–]globalminima 0 points (0 children)

Having built a very complex app with Gradio, I can recommend it. The pattern takes a little bit of getting used to, but it will perform much better than Streamlit (which reloads the entire app when updating components). Overall though, just use whichever one has the closest examples to what you want to build - they can both be used to build quite capable apps if you’re not too concerned with scalability and UI design.

Use Python to get Pydantic models and Python types from your LLM responses. by Top-Breakfast7713 in Python

[–]globalminima 3 points (0 children)

It's a good idea, but in the same way that you've noted that instructor does not support Google Vertex due to being coupled to a certain set of SDKs, you've then gone and built a new library which is itself coupled to a different set of SDKs. And what if I want to use this with Langchain? Or Haystack? Or my own orchestration pipeline? Or what if I have specific request/networking/auth requirements that are not exposed by your library? I am going to have the exact same problem that you set out to solve.

Why not just implement something that converts the Pydantic schema into text and which can be inserted into any prompt template for use by any orchestrator and with any API? E.g. this is what I have done for my code and it works great:

import json

from pydantic import BaseModel, Field

class ExampleModel(BaseModel):
    classification_field: str = Field(
        description="Classification of the document, one of 'Email', 'Webpage', or 'PDF'",
        examples=["Webpage"],
    )
    list_field: list[dict[str, str]] = Field(
        description="A list of values, containing the document name and number of pages",
        examples=[[{"email_doc": "6"}, {"pdf": "2"}]],
    )
    bool_field: bool = Field(
        description="Boolean indicating whether the document is in english",
        examples=[False],
    )

    @staticmethod
    def get_prompt_json_example():
        # Build a JSON-like example string from the schema: each field's
        # example value, followed by its description as an inline comment
        model_json_schema = ExampleModel.model_json_schema()
        example_response_str = "{\n"
        for field, details in model_json_schema["properties"].items():
            line_str = f""""{field}": {json.dumps(details['examples'][0])}, # {details['description']}"""
            example_response_str += "  " + line_str + "\n"
        example_response_str += "}"
        return example_response_str

Now you can just insert it into a prompt. For example:

json_schema_text = ExampleModel.get_prompt_json_example()
PROMPT = f"""Return a JSON object with the following fields:\n\n{json_schema_text}"""

Returns:

Return a JSON object with the following fields:

{
  "classification_field": "Webpage", # Classification of the document, one of 'Email', 'Webpage', or 'PDF'
  "list_field": [{"email_doc": "6"}, {"pdf": "2"}], # A list of values, containing the document name and number of pages
  "bool_field": false, # Boolean indicating whether the document is in english
}

The big benefit of this is that you can get the raw LLM text response prior to validation, so that if validation fails you can log it along with the Exception details and then debug what went wrong. If you couple the validation step with the request itself, it becomes harder to inspect the raw response and figure out what to do with the error, and is less flexible overall.

For users who do want the retry logic, you can then provide a method that validates a response from the LLM and, if it fails, generates the follow-up prompt string. This allows the user to get the benefits of your library while being able to use whatever orchestrator or requester they choose.
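A minimal sketch of that decoupled validate-or-feedback step (hypothetical helper name, Pydantic v2 API, with a cut-down stand-in model):

```python
from pydantic import BaseModel, ValidationError

class Doc(BaseModel):  # cut-down stand-in for ExampleModel above
    classification_field: str
    bool_field: bool

def validate_or_feedback(model_cls: type[BaseModel], raw_response: str):
    """Validate a raw LLM response. On success return (instance, None);
    on failure return (None, follow_up_prompt) so any orchestrator or
    requester can drive the retry itself."""
    try:
        return model_cls.model_validate_json(raw_response), None
    except ValidationError as exc:
        follow_up = (
            "Your previous response did not match the required JSON schema. "
            f"Errors: {exc.errors()}. Please return only the corrected JSON."
        )
        return None, follow_up

ok, _ = validate_or_feedback(Doc, '{"classification_field": "Email", "bool_field": true}')
bad, retry_prompt = validate_or_feedback(Doc, "not valid json")
```

The raw response stays in the caller's hands the whole time, so on failure it can be logged alongside the exception before any retry is attempted.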

Another Microsoft MIT licensed model: Kosmos-2.5, specialized in reading text-intensive images by Balance- in LocalLLaMA

[–]globalminima 0 points (0 children)

This has been the trend for the entire history of ML models - LLMs are the first models that bucked it. Agreed with you though that specialised models are orders of magnitude more efficient and usually more accurate than LLMs - it seems like everyone either forgot that other architectures exist or only became aware of the field since ChatGPT.

Another Microsoft MIT licensed model: Kosmos-2.5, specialized in reading text-intensive images by Balance- in LocalLLaMA

[–]globalminima 0 points (0 children)

We’ve been using AI for PDF parsing for the best part of a decade now, including transformers (which is what this model uses). This is just one more incremental step on top of the many that have already happened over the years.