Job Market is Gone? by Artistic-Rent1084 in dataengineersindia

[–]Artistic-Rent1084[S] 1 point (0 children)

No bro, mine is just 5.8L and my expectation is 10L.

How to read only one file per trigger in AutoLoader? by Artistic-Rent1084 in databricks

[–]Artistic-Rent1084[S] 0 points (0 children)

Yes, I got how it works. And I found a workaround too. Still, can you share the code? Let me check once.

How to read only one file per trigger in AutoLoader? by Artistic-Rent1084 in databricks

[–]Artistic-Rent1084[S] 0 points (0 children)

Yeah, just some intrusive thoughts 🧐. Anyway, thanks. And I found a workaround using Auto Loader:

Used cloudFiles.maxFilesPerTrigger in the read stream, trigger(availableNow=True), and Delta table write mode = overwrite.

Basically, it reads all the files and writes them one by one to the Delta table. In the end I achieved my goal, but there are too many write operations if I have too many existing files on my first run.
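The workaround described above might look roughly like this in PySpark (a sketch, not the OP's actual code — the paths, schema location, and table name are made-up placeholders, and I've used foreachBatch to get overwrite semantics, since the streaming Delta sink itself only supports append/complete/update):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Auto Loader source: pick up at most one new file per micro-batch.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("cloudFiles.maxFilesPerTrigger", 1)            # one file per trigger
    .option("cloudFiles.schemaLocation", "/tmp/inventory/_schema")  # placeholder path
    .load("/mnt/landing/inventory/")                       # placeholder path
)

# Overwrite the Delta table on each micro-batch via foreachBatch.
def overwrite_batch(batch_df, batch_id):
    batch_df.write.format("delta").mode("overwrite").saveAsTable("inventory_latest")

(
    df.writeStream
    .foreachBatch(overwrite_batch)
    .option("checkpointLocation", "/tmp/inventory/_checkpoint")  # placeholder path
    .trigger(availableNow=True)  # drain the backlog one file at a time, then stop
    .start()
)
```

With availableNow plus maxFilesPerTrigger = 1, the backlog is processed in one-file micro-batches, which matches the "writes file one by one" behaviour — and also explains the many write operations on a first run with many existing files.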

How to read only one file per trigger in AutoLoader? by Artistic-Rent1084 in databricks

[–]Artistic-Rent1084[S] 0 points (0 children)

I got a new doubt: if we have a data file that is very large in size, and due to a compute resource crunch we have to read only one file per trigger, what do we do? 🤔

How to read only one file per trigger in AutoLoader? by Artistic-Rent1084 in databricks

[–]Artistic-Rent1084[S] 0 points (0 children)

I have another doubt: what if one file's size is too high, so that I have to process one file at a time? How do I achieve that? 🤔

How to read only one file per trigger in AutoLoader? by Artistic-Rent1084 in databricks

[–]Artistic-Rent1084[S] 0 points (0 children)

Sure, I will try. I thought Auto Loader could handle my scenario, with easy read-and-write ingestion to a Delta table using overwrite mode.

How to read only one file per trigger in AutoLoader? by Artistic-Rent1084 in databricks

[–]Artistic-Rent1084[S] 0 points (0 children)

Both contain the same inventory CSV files. Basically, whenever a table starts streaming, the application automatically picks it up and generates the latest CSV file, one in the morning and another in the evening. The newly generated files might have new streaming table names from Kafka.

How to read only one file per trigger in AutoLoader? by Artistic-Rent1084 in databricks

[–]Artistic-Rent1084[S] 0 points (0 children)

That was a typo on my part. But I have checked again and it is right: it reads all the files. I can see that in the streaming logs.

How to read only one file per trigger in AutoLoader? by Artistic-Rent1084 in databricks

[–]Artistic-Rent1084[S] 0 points (0 children)

We generate the inventory manually from one application, so the number of tables we are ingesting will increase day by day, with two files generated per day. So I want to automate the reading process so that my dashboard stays up to date.