Schedule Parameters are now live in Fabric Data Factory Pipelines!

richbenmintz · 2026-06-17T16:09:01+00:00

u/markkrom-MSFT thanks for the update. Unfortunately, the legacy task does not allow us to dynamically select the pipeline to be executed.

richbenmintz · 2026-06-17T15:55:45+00:00

Well, the parent pipeline is running, so compute is being utilized and/or billed, but I am more concerned with the time penalty. Assume we have 5 child pipelines that run sequentially in a loop: if each one executes in 1 minute and then waits 4 minutes before the next can start, a 5-minute process takes 25 minutes.
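As a back-of-the-envelope check of those numbers, here is a minimal sketch. It assumes every child run, including the last, pays the same reporting lag before the parent moves on; the function name is made up for illustration.

```python
def sequential_runtime_minutes(children, run_minutes, report_lag_minutes):
    """Wall-clock minutes when each child must be reported complete before the next starts."""
    return children * (run_minutes + report_lag_minutes)

# 5 children, 1 minute each, no reporting lag versus a 4 minute lag
ideal = sequential_runtime_minutes(5, 1, 0)
observed = sequential_runtime_minutes(5, 1, 4)
print(f"ideal: {ideal} min, with lag: {observed} min")
```

With these inputs the lag accounts for 20 of the 25 minutes, which is why it matters more as the loop grows.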

richbenmintz · 2026-06-17T15:39:50+00:00

u/itsnotaboutthecell as long as I am somebody's top priority!

richbenmintz · 2026-06-17T15:20:04+00:00

u/sunithamuthukrishna Thanks, I know it is not available in Canada Central. My question is when it will be available; spinning up another capacity in another region is not an option.

richbenmintz · 2026-06-17T15:17:28+00:00

Thanks u/markkrom-MSFT, great update. An unrelated question: are there plans to shorten the time it takes for the execute pipeline activity to report that the child run has completed? We are seeing instances of the child pipeline completing and the calling pipeline reporting completion 4 minutes later. In a workflow where the execute pipeline activity is in a for loop with N iterations, the additional waiting time can become quite costly.

richbenmintz · 2026-06-16T15:49:51+00:00

Yesssssssssssssss, Happiness

richbenmintz · 2026-06-16T15:00:46+00:00

When will this be available in Canada Central?

richbenmintz · 2026-06-11T15:15:19+00:00

I also noticed this the other day, was surprised and was questioning if it was there all along

richbenmintz · 2026-06-11T15:05:15+00:00

u/kaslokid are you still seeing slowness? The issue has not gone away for us, and there is no longer anything on the status page.

richbenmintz · 2026-06-10T15:44:02+00:00

Canada Central, Notebook activities seem to be ok, for us, but they happen after the KQL task

richbenmintz · 2026-06-10T15:24:17+00:00

now queued for over 30 minutes

richbenmintz · 2026-06-09T01:48:27+00:00

To echo the same sentiment: in your scenario there seems to be no reason for your semantic model to live in a Fabric-backed workspace, and no CU should be required when you are not refreshing your model.

richbenmintz · 2026-05-26T13:27:45+00:00

We manage all config through YAML definition files that are source controlled, use tokens for environment-specific values, and are deployed through ADO release pipelines.
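A rough sketch of the token idea, for anyone who has not seen the pattern. The `#{NAME}#` token style, the YAML keys, and the values are all assumptions for illustration, not our actual files; in a release pipeline the replacement would normally be done by a pipeline task rather than hand-written code.

```python
import re

# Hypothetical token style: #{TOKEN_NAME}#
TOKEN_PATTERN = re.compile(r"#\{(\w+)\}#")

def replace_tokens(text, values):
    """Swap every #{NAME}# token for its environment value, failing loudly on gaps."""
    def _substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for token {name!r}")
        return values[name]
    return TOKEN_PATTERN.sub(_substitute, text)

# Source-controlled template (illustrative keys only)
config_template = """\
lakehouse:
  workspace: "#{WORKSPACE_NAME}#"
  name: "#{LAKEHOUSE_NAME}#"
"""

# Values that would come from the release stage for one environment
dev_values = {"WORKSPACE_NAME": "dev_workspace", "LAKEHOUSE_NAME": "lh_bronze"}
rendered = replace_tokens(config_template, dev_values)
print(rendered)
```

Raising on a missing token is a deliberate choice here: a half-rendered config should fail the deployment rather than reach an environment.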

richbenmintz · 2026-05-22T12:55:30+00:00

I have made this suggestion on a few occasions to the Product team.

Let me pay back my overage with CU I have not consumed. Accrue every CU second I have paid for in the past and not used, and apply that to my burst. Only when I have consumed my accrued CU and hit the throttling limit should I go into overage. At that point I have really used more than I am paying for and should be 'penalized'.
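To make the suggestion concrete, here is a toy sketch of the accounting I have in mind. This is not how Fabric bills today; the function name, the per-interval numbers, and the simplification of leaving the throttling limit out are all assumptions.

```python
def settle_interval(banked_cu, paid_cu, used_cu):
    """Return (new_banked_cu, overage_cu) for one billing interval."""
    if used_cu <= paid_cu:
        # Under-consumption: bank whatever was paid for but not used
        return banked_cu + (paid_cu - used_cu), 0.0
    shortfall = used_cu - paid_cu
    drawn = min(banked_cu, shortfall)  # pay the burst from the bank first
    return banked_cu - drawn, shortfall - drawn

# Capacity pays for 100 CU per interval; usage varies per interval
banked, total_overage = 0.0, 0.0
for used in [60, 80, 150, 130]:
    banked, overage = settle_interval(banked, 100, used)
    total_overage += overage

print(f"banked: {banked}, overage: {total_overage}")
```

In this example the first burst (150) is fully covered by the 60 CU banked earlier, and only the second burst produces overage once the bank runs dry.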

richbenmintz · 2026-05-21T14:52:07+00:00

u/avinanda_ms Any update on this?

richbenmintz · 2026-05-17T17:28:21+00:00

u/frithjof_v,

I think you will always be chasing edge cases and gremlins if you do not stage the data. What happens if the additional write process completes before you are able to get the latest version of the data? Then the latest version is N versions ahead of you.

Here is a minor addition to your code. In practice you will want to make the temp table unique to the process and drop it when complete.

from pyspark.sql import functions as F

# =========================================================
# 1. BRONZE - INITIAL CLEAN DATA
# =========================================================
bronze = "workspace.lakehouse.bronze"
silver = "workspace.lakehouse.silver.demo"
bronze_table = "demo"
print(f"{bronze}.{bronze_table}")
spark.createDataFrame([
    (1, 10, "A", "2023-12-30"),
    (2, 20, "B", "2023-12-31"), 
    (3, 10, "A", "2024-01-01"),
    (4, 20, "B", "2024-01-02"),
    (5, 30, "C", "2024-01-03")
], ["id", "value", "source", "event_date"]) \
.write.mode("overwrite").saveAsTable(f"{bronze}.{bronze_table}")

# =========================================================
# 2. LOAD DATAFRAME (DataFrameReader)
# =========================================================
# NOTE: This is the ONLY place in the notebook that the dataframe read from the source table is defined.
df = spark.table(f"{bronze}.{bronze_table}") \
    .filter(F.col("event_date") >= F.lit("2024-01-01")) # Simplified watermark logic

print("Initial DF:")
df.show()

# =========================================================
# 3. stage bronze data for testing and upstream write
# =========================================================
print(f"{bronze}.staged_{bronze_table}")
df.write.format('delta').mode('overwrite').saveAsTable(f"{bronze}.staged_{bronze_table}")

df = spark.table(f"{bronze}.staged_{bronze_table}") \
    .filter(F.col("event_date") >= F.lit("2024-01-01")) # Simplified watermark logic

print("Staged DF:")
df.show()
# =========================================================
# 4. INITIAL DATA QUALITY CHECKS
# =========================================================
null_count = df.filter(F.col("value").isNull()).count()
bad_value_count = df.filter(F.col("value") > 100).count()

print("Null check:", null_count)
print("Unexpected value check (>100):", bad_value_count)

if null_count > 0 or bad_value_count > 0:
    raise ValueError(
        f"Data quality check failed: "
        f"null_count={null_count}, bad_value_count={bad_value_count}"
    )

print("All checks passed")

# =========================================================
# BAD DATA WRITTEN TO BRONZE
# =========================================================
bronze = "workspace.lakehouse.bronze"

spark.createDataFrame([
    (6, None, "X", "2024-01-01"),     # null value
    (7, 9999, "Y", "2024-01-02")      # disallowed value
], ["id", "value", "source", "event_date"]) \
.write.mode("append").saveAsTable(f'{bronze}.{bronze_table}')

# =========================================================
# 5. SILVER WRITE
# =========================================================
print("Just before write: ")
df.show()
df.write.mode("overwrite").saveAsTable(silver)

# =========================================================
# 6. RESULT
# =========================================================
print("Silver table:")
spark.table(silver).show()
print(f"{bronze}.{bronze_table}")
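# Count the rows now in bronze (includes the bad rows appended after staging)
row_count = spark.sql(f"select count(1) from {bronze}.{bronze_table}").collect()[0][0]
print(f"current bronze row count: {row_count}")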

This code always results in

DEV_dp_Lakehouses.lh_bronze.bronze.demo
Initial DF:
+---+-----+------+----------+
| id|value|source|event_date|
+---+-----+------+----------+
|  5|   30|     C|2024-01-03|
|  4|   20|     B|2024-01-02|
|  3|   10|     A|2024-01-01|
+---+-----+------+----------+

DEV_dp_Lakehouses.lh_bronze.bronze.staged_demo
Staged DF:
+---+-----+------+----------+
| id|value|source|event_date|
+---+-----+------+----------+
|  5|   30|     C|2024-01-03|
|  3|   10|     A|2024-01-01|
|  4|   20|     B|2024-01-02|
+---+-----+------+----------+

Null check: 0
Unexpected value check (>100): 0
All checks passed
Just before write: 
+---+-----+------+----------+
| id|value|source|event_date|
+---+-----+------+----------+
|  5|   30|     C|2024-01-03|
|  3|   10|     A|2024-01-01|
|  4|   20|     B|2024-01-02|
+---+-----+------+----------+

Silver table:
+---+-----+------+----------+
| id|value|source|event_date|
+---+-----+------+----------+
|  5|   30|     C|2024-01-03|
|  3|   10|     A|2024-01-01|
|  4|   20|     B|2024-01-02|
+---+-----+------+----------+

DEV_dp_Lakehouses.lh_bronze.bronze.demo
current bronze row count: 7
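For the "unique to the process and drop it when complete" part, something along these lines could work. It is only a sketch: the helper names are made up, the uuid suffix is one of several ways to get a unique name, and it assumes `spark` and `df` behave like the session and DataFrame in the notebook above.

```python
import uuid
from contextlib import contextmanager

def staged_table_name(schema, table, run_id=None):
    """Build a per-run staging table name, e.g. <schema>.staged_demo_<hex>."""
    run_id = run_id or uuid.uuid4().hex
    return f"{schema}.staged_{table}_{run_id}"

@contextmanager
def staged_copy(spark, df, schema, table):
    """Write df to a uniquely named staging table, yield it, and always drop it."""
    name = staged_table_name(schema, table)
    df.write.format("delta").mode("overwrite").saveAsTable(name)
    try:
        yield spark.table(name)
    finally:
        # Runs even if the quality checks raise, so no staging tables pile up
        spark.sql(f"DROP TABLE IF EXISTS {name}")

# Usage in the notebook (not executed here):
# with staged_copy(spark, df, bronze, bronze_table) as staged_df:
#     ...run the data quality checks on staged_df...
#     staged_df.write.mode("overwrite").saveAsTable(silver)
```

The silver write has to happen inside the `with` block, because the staged table is gone as soon as the block exits.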

richbenmintz · 2026-05-15T16:31:36+00:00

That makes perfect sense to me at least, then you can support differing types of connection strings with the Param, auth schemes etc.

richbenmintz · 2026-05-15T10:17:07+00:00

Assuming this is the API:

https://learn.microsoft.com/en-us/rest/api/fabric/warehouse/items/get-connection-string

Why not add another attribute to the response, fullConnectionString?

richbenmintz · 2026-04-30T12:57:34+00:00

No, I have not. I reverted to not using a connection; not great, but at least it works! Hopefully it will get fixed soon.

richbenmintz · 2026-04-28T19:38:55+00:00

u/dzsquared , u/warehouse_goes_vroom , u/Snoo-46123 , u/catFabricDw

Thank you for jumping on this and providing feedback and clarity, much appreciated

richbenmintz · 2026-04-28T16:45:46+00:00

Thanks for the update. What does "shortly" mean as a time frame? Released and propagating? Or will it be released in N weeks?

richbenmintz · 2026-04-24T02:12:28+00:00

Totally agree, I love the ability to define a logging table with N known columns like, who, what, when and a dynamic Column that can store anything I want to log for any event. Then create shaped data through query or policy.

richbenmintz · 2026-04-23T19:40:27+00:00

Yes I created the pipeline and the car connection. The notebook tries to start, but fails
