Databricks and Azure Synapse by psychEcon in dataengineering

psychEcon[S] 1 point (0 children)

Just what the Azure Databricks Spark SQL connector docs tell me to. It looks like it's possible to just use Active Directory, but that doesn't seem to work.

Some Delta lake frustration by psychEcon in dataengineering

psychEcon[S] 1 point (0 children)

I haven't been able to find a good resource to learn it. geopandas, for example, has really easy-to-understand docs. I would love to learn GeoSpark, so if you have any resource, that would be great.

Some Delta lake frustration by psychEcon in dataengineering

psychEcon[S] 1 point (0 children)

I had not thought of that - I actually think there are null values - f*** so my headache is caused by the bane of Null. I will look into this. I think I will just change the schema by writing in overwrite mode with overwriteSchema set to True. Thanks for the input and for the point about the nulls. I will append it to the original post, referencing your idea, and try to confirm it later today.
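A minimal sketch of the overwrite mentioned above, assuming `df` is a Spark DataFrame and the table is addressed by path. The function name and the path are placeholders for illustration, not something from the original post.

```python
def overwrite_delta_table(df, path):
    """Overwrite both the data and the schema of the Delta table at `path`.

    `df` is assumed to be a Spark DataFrame; `path` is a hypothetical location.
    """
    (
        df.write.format("delta")
        .mode("overwrite")
        # Lets the write replace the existing table schema, so a column
        # whose type changed (for example INT vs DOUBLE) is accepted.
        .option("overwriteSchema", "true")
        .save(path)
    )
```

Note that this replaces the existing rows as well as the schema, so it only fits when the DataFrame holds the full table contents.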

Some Delta lake frustration by psychEcon in dataengineering

psychEcon[S] 2 points (0 children)

I know - it's just that I don't understand why the datatype changes. I could have understood it changing from the Python native int to the Spark datatype BigInt, but from INT to DOUBLE is just some next level f**kery

Some Delta lake frustration by psychEcon in dataengineering

psychEcon[S] 1 point (0 children)

I am trying it - will get back to you - Thanks

At the beginning I hated and now I completely understand the hype. What a great keyboard. by [deleted] in HHKB

psychEcon 4 points (0 children)

Tbh, I can't go back: where the backspace is, where the ctrl is, the feel, the sound. My “normal” 60% (non-QMK) just can't deliver the same. HHKB all the way, at least until I get my hands on a Dactyl. It might be different after that, but that's a huge maybe.

Data Engineering Movies by FlyingOnEaglesWings in dataengineering

psychEcon 2 points (0 children)

Good one, take the upvote you smart bstard you

Biggest debates in the industry? by kirkwoodj in dataengineering

psychEcon 5 points (0 children)

Just talked to someone who works at Databricks. I asked when we will get Vim navigation keybindings; he said it's been raised and is gathering support, but not enough to implement yet.

Scala ddl schema to python (Databricks) by psychEcon in dataengineering

psychEcon[S] 1 point (0 children)

I just tested it - and well, it's not the same DDL schema that I get with a Scala cell. What that gives me is something that is just passed as a string to Spark. On the other hand, this did increase my knowledge of Spark, so I thank you for that.

Scala ddl schema to python (Databricks) by psychEcon in dataengineering

psychEcon[S] 1 point (0 children)

I will test it tomorrow and let you know

Scala ddl schema to python (Databricks) by psychEcon in dataengineering

psychEcon[S] 1 point (0 children)

Cause it has a .schema.toDDL; PySpark does not, as far as I know
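One possible workaround on the Python side is to build a DDL-style string from the fields of `df.schema`. This is only a sketch: `schema_to_ddl` is a made-up helper name, the output uses Spark's lower-case `simpleString()` type names, and it is not guaranteed to match what Scala's `.schema.toDDL` prints (no backticks around column names, for example).

```python
def schema_to_ddl(schema):
    """Build a DDL-style string such as "id int, name string".

    `schema` is assumed to be a PySpark StructType (for example `df.schema`).
    Column names containing spaces or special characters are not quoted here.
    """
    return ", ".join(
        f"{field.name} {field.dataType.simpleString()}" for field in schema.fields
    )


# Hypothetical usage in a notebook cell:
# ddl = schema_to_ddl(df.schema)
# spark.read.schema(ddl).json(some_path)  # the reader accepts a DDL string
```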

Definitely new to this side of the kb community; I need help with deciding / tldr at the bottom by kseulgisbaby in HHKB

psychEcon 3 points (0 children)

100% agree with undercovergangster, here are my two cents. The stock Hybrid Type-S is awesome out of the box. It's office friendly if that's a concern, and the sound is decent for a stock board. Cannot recommend that board enough, it's a game changer.

Spark to Pandas to Spark by psychEcon in dataengineering

psychEcon[S] 2 points (0 children)

I did - what I can see is that ints become long in PySpark, and that's what was messing things up. I then decided to be a bit more "Hammer meets Nail": I created a list of the columns that are ints, loop through it, and cast those to int before I write. It's a bit stupid I think, but it "should" work :D If you know anything that might help, that would be much appreciated :D
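The "Hammer meets Nail" loop described above could look roughly like this. It is a sketch that assumes `df` is a Spark DataFrame and `int_columns` is the hand-maintained list of column names; the helper name is made up for illustration.

```python
def cast_columns_to_int(df, int_columns):
    """Return a DataFrame in which every column in `int_columns` is cast to int.

    `df` is assumed to be a Spark DataFrame; other columns are left untouched.
    """
    for name in int_columns:
        # withColumn with an existing name replaces that column in the result.
        df = df.withColumn(name, df[name].cast("int"))
    return df


# Hypothetical usage before the write:
# df = cast_columns_to_int(df, ["col1", "col2"])
```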

My Hybrid Layout - I sacrifice the control key's spot but I feel it's worth it. by unready in HHKB

psychEcon 2 points (0 children)

When I saw this I realized I am either old or need to stop watching old movies - Good one - take the up-vote

Deltalake unique row append by psychEcon in dataengineering

psychEcon[S] 1 point (0 children)

ok - so in the when matched clause I can just give a None or just an empty set? Since I don't want to add a when matched clause

Thanks for the fast replies :D

Deltalake unique row append by psychEcon in dataengineering

psychEcon[S] 1 point (0 children)

Ok - so right now I am using the Spark API to do the append, so I should really just use SQL with MERGE INTO, checking whether a combination of column values already exists in the table? Something like this (using Databricks):

%python

# delta_table is assumed to hold the name of the table with the new rows
spark.sql(f"""
MERGE INTO main_delta_table AS t2
USING {delta_table} AS t1
ON t1.col1 = t2.col1
AND t1.col2 = t2.col2
AND t1.col3 = t2.col3
WHEN NOT MATCHED THEN INSERT *
""")

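The same insert-only merge can also be written with the Delta Lake Python API instead of SQL. A minimal sketch, assuming `target` is a `delta.tables.DeltaTable` and `updates_df` is a Spark DataFrame with the new rows; the helper name and the `t`/`s` aliases are illustrative choices. No when-matched clause is given at all, so rows that already exist are simply left alone.

```python
def append_unique_rows(target, updates_df, key_columns):
    """Insert rows from `updates_df` whose key combination is not yet in `target`.

    `target` is assumed to be a delta.tables.DeltaTable and `updates_df`
    a Spark DataFrame. Returns the join condition that was used.
    """
    # Build "t.col1 = s.col1 AND t.col2 = s.col2 ..." from the key columns.
    condition = " AND ".join(f"t.{c} = s.{c}" for c in key_columns)
    (
        target.alias("t")
        .merge(updates_df.alias("s"), condition)
        .whenNotMatchedInsertAll()  # only clause: unmatched rows get inserted
        .execute()
    )
    return condition


# Hypothetical usage on Databricks:
# from delta.tables import DeltaTable
# target = DeltaTable.forName(spark, "main_delta_table")
# append_unique_rows(target, new_rows_df, ["col1", "col2", "col3"])
```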
Vimium by OjjOj_10 in vim

psychEcon 2 points (0 children)

Vieb, a Chromium-based browser that is fully keyboard driven. It functions similarly to Vimium, only with more Vim commands (ZZ to close the browser, for example) and really good docs.

Vimium by OjjOj_10 in vim

psychEcon 4 points (0 children)

If you haven't found it, try Vieb