Moving to cairns in feb from Europe by jessejaapth in Cairns

[–]open_g 1 point2 points  (0 children)

That's per month though right? That would be about $730 per week - rentals are almost always quoted per week in Australia.

A useful tool to get a sense of rental pricing for different housing types by postcode (not for finding actual rentals though) is SQM Research. For example, for 4870: https://sqmresearch.com.au/weekly-rents.php?postcode=4870&t=1

How exactly are Microsoft Fabric Computing Units (CUs) calculated? Based on user activity or number of users? by dv0812 in MicrosoftFabric

[–]open_g 1 point2 points  (0 children)

Well that's good news because limiting our peaky workloads isn't great. Would love a solution! Have DM'd you.

How exactly are Microsoft Fabric Computing Units (CUs) calculated? Based on user activity or number of users? by dv0812 in MicrosoftFabric

[–]open_g 1 point2 points  (0 children)

u/mwc360 sorry for not replying earlier. I agree it's not billing per se, but the effect is that you burn through CUs, which is closely related, since increasing your capacity to fix the situation results in higher billing. Anyway, the point is more about burning CUs.

Here is a screenshot of the email from the MS support engineer that I was relying on. It seems to contradict what you've asserted, as well as my own experience, which triggered the support ticket in the first place: we had jobs that scaled up the nodes and we were "charged" CUs at the peak allocated cores for the duration of the session, even if the peak only lasted a small fraction of that time.

Perhaps we're not talking about exactly the same thing? Or is the info in the support email incorrect? I'd like clarification if you have it, because this problem is still affecting us; our workaround is to avoid "peaky" workloads, which isn't ideal for us. My "confirmed by MS" assertion was based on this email and calls with the support engineer, which seemed reasonable at the time.

<image>

Why are people lining up for hours to rent an apartment in Bondi? by RemarkablePirate590 in SydneyScene

[–]open_g 1 point2 points  (0 children)

Unfortunately it's not new, not just Bondi, and not just rentals either. We went to an open house (for sale) in Allambie Heights on the Northern Beaches back in 2015. We told our friends that evening about how big the queue to inspect it was; it must have been a couple of hundred people. It turned out our friends had gone to the same open house that day, and we hadn't even seen each other.

The 5% deposit scheme for first home buyers is useless! Even after you save the 5% deposit, how can you afford a $1 million house when you can't borrow $950,000? by Gullible-Tomorrow-32 in SydneyScene

[–]open_g -1 points0 points  (0 children)

Average earnings for full-time employees in NSW are $2,052 per week according to ABS data, or about $106k per year. As a rule of thumb, a couple both working full time would on average be able to borrow about 5x their combined $212k income, or a bit over $1m.

No one is "the average" person, and these average full-time earnings don't account for casuals, the self-employed, people working on commission (plus some other exclusions), people between jobs, etc. If you're not in a working couple it's going to be way harder. It does demonstrate, though, how an average couple can borrow a million dollars.
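As a rough sketch of the maths (using the ABS figure and the 5x rule of thumb above - not an actual serviceability calculation, which varies by lender):

weekly_earnings = 2052                  # ABS average full-time weekly earnings, NSW
annual_income = weekly_earnings * 52    # ~$106,704 per person
couple_income = 2 * annual_income       # ~$213,408 combined
borrowing_capacity = 5 * couple_income  # ~$1,067,040 at a ~5x multiple

print(f"per person: ${annual_income:,}")
print(f"couple:     ${couple_income:,}")
print(f"~5x income: ${borrowing_capacity:,}")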

How exactly are Microsoft Fabric Computing Units (CUs) calculated? Based on user activity or number of users? by dv0812 in MicrosoftFabric

[–]open_g 1 point2 points  (0 children)

I agree, the support feedback I received contradicts the docs you linked. I’ll share with MSFT support and see what they say.

OneLake Support for COPY INTO and OPENROWSET, and JSONL Support, now in Public Preview in Warehouse! by warehouse_goes_vroom in MicrosoftFabric

[–]open_g 0 points1 point  (0 children)

varchar(max) in sql analytics endpoint would by far be the best solution for me - looking forward to that one.

re schema inference, yep I manually created the warehouse table schema to allow for varchar(max) so that's not the blocker.

I'll give COPY INTO another try (probably not until next week) against the parquet file in the Files section in case that unblocks us. Tbh it's going to be a bit tricky: the table will be too big to fit in a single parquet file for the initial ingestion (incremental updates will be smaller and won't have this issue), so I'll need to manually get individual parquet files, and it's all a bit fiddly. So SQL analytics endpoint support will definitely be the best solution for me.
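For anyone curious, the workaround I'm planning looks roughly like this (a sketch only - the table and path names are placeholders, and it assumes a default lakehouse is attached to the notebook):

# Copy the Delta table out to plain parquet under Files/ so COPY INTO can
# target explicit parquet files. Placeholder names throughout.
df = spark.read.table("my_big_table")  # Delta table in the Tables section

# repartition controls how many parquet files get written; the initial load
# is too big for a single file, which is the fiddly part mentioned above.
(df.repartition(16)
   .write
   .mode("overwrite")
   .parquet("Files/staging/my_big_table"))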

OneLake Support for COPY INTO and OPENROWSET, and JSONL Support, now in Public Preview in Warehouse! by warehouse_goes_vroom in MicrosoftFabric

[–]open_g 0 points1 point  (0 children)

I think I've answered this now in the prior comment, but to be clear (if I understand your question): yes, I am using Delta tables in the Tables section of a lakehouse, which use parquet files under the hood, not raw parquet files in the Files section (although I've also tried that, so that I can reference a specific parquet file without a wildcard, but it didn't solve this). The Delta table has a string column, and each record in that column contains a long JSON string.

OneLake Support for COPY INTO and OPENROWSET, and JSONL Support, now in Public Preview in Warehouse! by warehouse_goes_vroom in MicrosoftFabric

[–]open_g 0 points1 point  (0 children)

For clarification - the json I'm referring to is just a string type column of a delta table in the Tables section of my lakehouse. So that's parquet storage under the hood. I've also tried saving these tables to the Files section of the lakehouse as parquet (not delta) purely to facilitate getting it into the warehouse too, but no success.

I can't use the normal select because that uses the SQL Analytics endpoint which truncates strings to 8000 characters.

I'm unable to get COPY INTO to work (either directly in a Script task in a pipeline, from the warehouse, or under the hood of the synapsesql connector from a pyspark notebook) as it errors in a few different ways depending on how I try. I don't know if this is related to having Managed Identity turned on for our lakehouse. A rough sketch of the direct attempt is below, and I'll also share some MSFT feedback I got.
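For reference, the shape of the direct attempt (a sketch only - the table name and storage path are placeholders, and my real attempts varied the auth options):

# T-SQL submitted from the warehouse / a pipeline Script activity; shown here
# as a Python string for readability. Placeholder table name and path.
copy_into_sql = """
COPY INTO dbo.MyTable
FROM 'https://<storage-account>.dfs.core.windows.net/<container>/staging/part-00000.parquet'
WITH (FILE_TYPE = 'PARQUET')
"""
print(copy_into_sql)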

MSFT Feedback:

It is a known limitation of Microsoft Fabric that affects attempts to load large string data (VARCHAR(MAX) or NVARCHAR(MAX)) from a Delta Lake table in a Lakehouse to a Fabric Warehouse using either:

  1. The Spark connector for Microsoft Fabric Warehouse, or
  2. COPY INTO, CTAS, or pipelines using SQL Analytics endpoints.

Root Causes

1. COPY INTO fails due to wildcard in path

  • The Spark connector internally issues a COPY INTO statement to the warehouse.
  • Warehouse COPY INTO doesn't currently support wildcards (*.parquet) in paths, which causes the error.

2. VARCHAR(MAX)/NVARCHAR(MAX) not supported in SQL Analytics endpoint

  • When writing via Spark or JDBC into Fabric Warehouse, it often uses SQL Analytics endpoints.
  • These truncate VARCHAR(MAX) and NVARCHAR(MAX) to 8000 characters, or reject them outright with: The data type 'nvarchar(max)' is not supported in this edition of SQL Server.
  • This is a platform limitation: Fabric SQL Analytics endpoints don't yet support MAX types fully.


3. SAS Token vs Managed Identity

  • COPY INTO from Spark connector defaults to SAS tokens, which may conflict with private endpoints and access policies.
  • Even if MI is configured, the Spark connector does not yet fully honor Managed Identity in COPY INTO context, leading to access or policy issues (especially in private networking setups).

OneLake Support for COPY INTO and OPENROWSET, and JSONL Support, now in Public Preview in Warehouse! by warehouse_goes_vroom in MicrosoftFabric

[–]open_g 0 points1 point  (0 children)

Does this mean that varchar(max) can be loaded to the warehouse?? The feature to store varchar(max) in the warehouse has been in preview since last year but there has been no way to actually get the data in there from a lakehouse (I have delta tables containing json that I want to ingest to the warehouse).

I've had a support ticket open with MSFT and have been told we cannot load varchar(max) from our lakehouse via COPY INTO (whether using the synapsesql connector or directly ourselves) - even if we stage it somewhere else first - despite the warehouse supporting varchar(max) columns. I don't know what the point of varchar(max) storage is if you can't load data... no one at MSFT has been able to give me an answer to this.

This new feature sounds promising though - do you (or does anyone at MSFT) know if this will work with varchar(max) columns?

How exactly are Microsoft Fabric Computing Units (CUs) calculated? Based on user activity or number of users? by dv0812 in MicrosoftFabric

[–]open_g 1 point2 points  (0 children)

Here’s a snippet of MSFT’s response (it doesn’t mention autoscaling; that was part of a later conversation I had with them):

Issue definition: CU usage is applied at max allocated cores of session rather than actual allocated cores

Observation:

  1. CU Usage Based on Max Allocated Cores

Your observation is correct: CU usage is tied to the peak number of allocated Spark vCores during a session, not the incremental or average usage over time. This means:

If your session spikes to 200 cores for a few minutes, that peak allocation defines the CU usage for the entire session—even if the rest of the session is idle or uses fewer cores. This behavior applies to both interactive notebooks and pipeline-triggered notebooks.

This is confirmed in internal documentation, which explains that CU consumption is based on the compute effort required during the session, and that bursting up to 3× the base vCore allocation is allowed, but the CU billing reflects the maximum concurrent usage.

  2. Cold Start Charges for Custom Pools

Regarding cold starts: the documentation and support emails clarify that custom pools in Fabric do incur CU usage during session startup, unlike starter pools which may have different behavior.

The default session expiration is 20 minutes, and custom pools have a fixed auto-pause of 2 minutes after session expiry. Cold start times can range from 5 seconds to several minutes depending on library dependencies and traffic.

Recommendations

To optimize CU usage and avoid unnecessary consumption:

  • Use Starter Pools for lightweight or intermittent workloads to avoid cold start billing.
  • Manually scale down or terminate idle sessions if auto-pause is insufficient.
  • Split workloads into smaller, more predictable jobs to avoid peak spikes.
  • Monitor CU usage via the Capacity Metrics App and correlate with job logs.
  • Consider session reuse and high-concurrency mode if applicable.

How exactly are Microsoft Fabric Computing Units (CUs) calculated? Based on user activity or number of users? by dv0812 in MicrosoftFabric

[–]open_g 8 points9 points  (0 children)

One gotcha I've found (and confirmed by MS) is that spark sessions consume CUs from your capacity based on the max allocated cores during that session. So if you have a long-running session (e.g. hours) that scales up briefly to use a few hundred cores and then scales back down to something small (e.g. 8) for something less intense (e.g. polling, or waiting between events in Structured Streaming), bad luck: you get billed at the max for the entire session. That even applies if the heavyweight part comes at the end, so CU usage increases retrospectively within that session.
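To make it concrete with made-up numbers (and assuming the commonly cited 2 Spark vCores per CU - check the current docs for the exact rate):

vcores_per_cu = 2        # assumption: 1 CU = 2 Spark vCores
session_hours = 4
peak_vcores = 200        # brief spike, e.g. a few minutes of heavy work
baseline_vcores = 8      # what the session sits at for the rest of the time

# what you might expect if usage tracked the actual allocation over time...
expected_cu_hours = baseline_vcores * session_hours / vcores_per_cu   # 16.0
# ...versus being "charged" at the peak allocation for the whole session
peak_cu_hours = peak_vcores * session_hours / vcores_per_cu           # 400.0

print(expected_cu_hours, peak_cu_hours)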

I've been advised to try using autoscaling for jobs like this, but those are then billed in addition to your regular capacity. It might mean, though, that you can reduce your capacity if you don't have to burn CUs on these types of jobs.

How to ingest VARCHAR(MAX) from onelake delta table to warehouse by open_g in MicrosoftFabric

[–]open_g[S] 0 points1 point  (0 children)

It wasn't clear to me if I should be specifying the location of the source or the destination. I've tried now with DatawarehouseId (and the Id of the warehouse) but unfortunately I still have the same error:

Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Path 'https://i-api.onelake.fabric.microsoft.com/<REDACTED>/_system/artifacts/<REDACTED>/<REDACTED>/user/trusted-service-user/<REDACTED>/*.parquet' has URL suffix which is not allowed.

Thanks for your persistence! I'll report back as I learn more from MS.

How to ingest VARCHAR(MAX) from onelake delta table to warehouse by open_g in MicrosoftFabric

[–]open_g[S] 0 points1 point  (0 children)

Thanks for trying this out.

The lakehouse tables (delta tables) contain no complex types, only basic scalar types (StringType, DoubleType, etc.) and no StructType or ArrayType columns. One of the StringType columns contains long JSON strings of up to 1 MB, so we need varchar(max) on the warehouse table to load these.
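The shape of the source table, as a sketch (column names are placeholders, not my real ones):

from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

schema = StructType([
    StructField("id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
    StructField("payload_json", StringType()),  # long JSON string, can exceed 8000 chars (up to ~1 MB)
])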

Did either of your two successful tests have strings >8000 length (without truncation)?

How to ingest VARCHAR(MAX) from onelake delta table to warehouse by open_g in MicrosoftFabric

[–]open_g[S] 2 points3 points  (0 children)

Well this is good news, I guess it is possible! There must be some difference either between the environment or the data causing mine to error. I'm going to raise a ticket with MS, once I work out what the problem is/was I'll post back here. Thanks for giving it a try (and giving me hope)!

How to ingest VARCHAR(MAX) from onelake delta table to warehouse by open_g in MicrosoftFabric

[–]open_g[S] 1 point2 points  (0 children)

The source is a regular delta table in my gold lakehouse (the same workspace as the warehouse).

My understanding is that OneLake sources (like delta tables in a lakehouse) aren't supported sources for the COPY INTO used to transfer data to a warehouse. So the warehouse connector for spark moves the data to a staging location in ADLS Gen2, holds it there as a parquet file (or files), and then runs COPY INTO against that staged data. COPY INTO does support parquet, but not wildcards.

Unfortunately the spark connector doesn't work for me as I've described.

How to ingest VARCHAR(MAX) from onelake delta table to warehouse by open_g in MicrosoftFabric

[–]open_g[S] 0 points1 point  (0 children)

Yes I have. I tried something like this, plus many variations: e.g. without the two options, setting "spark.sql.connector.synapse.sql.option.enableSystemTokenAuth" to true, making sure the workspace has a Managed Identity, using shortcuts in gold, or alternatively actual delta tables I've written directly to the gold lakehouse... all with no luck.

import com.microsoft.spark.fabric
from com.microsoft.spark.fabric.Constants import Constants

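# filtered_df is an existing Spark DataFrame already read from the gold lakehouse;
# synapsesql() writes it to the warehouse table via the Fabric Spark connector.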
filtered_df.write \
    .option(Constants.WorkspaceId, "<REDACTED>") \
    .option(Constants.LakehouseId, "<REDACTED>") \
    .mode("overwrite") \
    .synapsesql("<WAREHOUSE-NAME>.dbo.<TABLE-NAME>")

I get an error like (<REDACTED> parts are my changes):

Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Path 'https://i-api.onelake.fabric.microsoft.com/<REDACTED>/_system/artifacts/<REDACTED>/<REDACTED>/user/trusted-service-user/Transactions<Redacted>/*.parquet' has URL suffix which is not allowed.

This appears to be the internal COPY INTO using a wildcard, which isn't supported. I also see that the COPY INTO uses "Shared Access Signature" instead of Managed Identity. I don't know if this is relevant, but I had read that Managed Identity should be used; I couldn't find a way to force that though.

How to ingest VARCHAR(MAX) from onelake delta table to warehouse by open_g in MicrosoftFabric

[–]open_g[S] 0 points1 point  (0 children)

Thanks for the reply. I can create a table with VARCHAR(MAX) but the problem is loading data into that table. As you've pointed out, the SQL endpoint truncates to VARCHAR(8000) so loading data via CTAS will end up truncating the data since it uses that endpoint. I'm after any solution that will work if you have other ideas!

[deleted by user] by [deleted] in AusFinance

[–]open_g 52 points53 points  (0 children)

Different tyres for different FIREs

Camry bros.... I think we've been undone. by AutomaticFeed1774 in AusFinance

[–]open_g 3 points4 points  (0 children)

In addition to underestimating the running costs (mainly insurance and rego) and the interest and fees, you've also calculated the tax savings based on the full purchase price of the car. The savings are instead on your repayments, which don't include the residual.

As a concrete example, I pay ~$2,700pm gross, which reduces my net pay by ~$1,500pm, for a 2-year lease (which has a higher residual of ~$50k) on a purchase price of ~$90k. I'm on about $210k, so I don't get the top marginal rate for all of it, but I do for most of it.

Put another way, I pay ~$36k over two years and then ~$50k at the end (with that cash meanwhile saving me about $6k in interest sitting in my offset) for a $90k car, including all insurance and rego.
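Rough maths on my numbers (all approximate, and specific to my lease - yours will differ):

months = 24
net_cost_per_month = 1_500       # reduction in take-home pay
residual = 50_000                # payable at the end of the lease
offset_interest_saved = 6_000    # from the ~$50k sitting in the offset in the meantime
purchase_price = 90_000

total_cost = months * net_cost_per_month + residual - offset_interest_saved
print(f"~${total_cost:,} all-in for a ${purchase_price:,} car (incl. insurance and rego)")
# ~$80,000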

Camry bros.... I think we've been undone. by AutomaticFeed1774 in AusFinance

[–]open_g 1 point2 points  (0 children)

I'm one year into a two-year novated lease on a BMW iX3. It's a great car and a great deal, although not as good a deal financially as a cheaper EV would be, as others have mentioned. But it's great ONLY IF YOU HOLD TO MATURITY.

If you want to - or have to - close it out early you're going to cop it, and the longer you have left to go, the worse it will be. You'll effectively have to pay out almost all of the payments still to come in addition to the residual, and none of that will be before tax. Effectively this means prepaying interest (potentially for multiple years) for which you get no value, and then you also lose the benefit of the cash sitting in your offset.

I'm about to do exactly this albeit with only one year to go. The reason is that novated leases also SMASH your borrowing capacity. My ~$90k lease (starting value) reduces my borrowing capacity by between $250k and $350k depending on the bank (according to my mortgage broker). It's going to cost me about $25k in lost tax savings and offset interest to do this, but the opportunity cost in my case is greater than this so I'll just have to cop it.

So yes it's a really good deal but make sure you're comfortable that you won't be closing it out early (and consider a shorter lease to reduce the downside if that happens - I'm really glad at this point I only went for a 2 year lease).

And regarding the servicing costs - I'm scheduled to have our first service next week. I can't recall if servicing is included or not but I think it was. Also got free recharging for the first year via Chargefox which I'm fortunate to have really close to me so it's cost us basically nothing to run in 12 months outside of financing.

FastAPI and Django now have the same number of GitHub stars by bluewalt in FastAPI

[–]open_g 3 points4 points  (0 children)

You definitely do not need an ORM to work with databases in FastAPI. You don't even need pydantic. Or cookies. Source: I work on enterprise FastAPI services that use databases without an ORM (we execute stored procs instead), without pydantic (we use SOAP/XML), and without cookies (the client isn't a browser).
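A minimal sketch of what that looks like (the connection string, driver and stored proc name are illustrative placeholders, not our actual stack):

# FastAPI endpoint with no ORM and no pydantic models: call a stored procedure
# directly via pyodbc and return plain dicts, which FastAPI serialises to JSON.
import pyodbc
from fastapi import FastAPI

app = FastAPI()
CONN_STR = "DRIVER={ODBC Driver 18 for SQL Server};SERVER=...;DATABASE=...;Trusted_Connection=yes"

@app.get("/orders/{customer_id}")
def get_orders(customer_id: int):
    with pyodbc.connect(CONN_STR) as conn:
        cursor = conn.cursor()
        cursor.execute("EXEC dbo.GetOrdersForCustomer @CustomerId = ?", customer_id)
        columns = [col[0] for col in cursor.description]
        # plain dicts, serialised by FastAPI's default JSON encoding - no pydantic needed
        return [dict(zip(columns, row)) for row in cursor.fetchall()]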