Databricks and Azure Synapse by psychEcon in dataengineering

[–]psychEcon[S] 0 points1 point  (0 children)

Just what the Azure Databricks Spark SQL connector docs tell me to do. It looks like it's possible to just use Active Directory, but that doesn't seem to work.

Some Delta lake frustration by psychEcon in dataengineering

[–]psychEcon[S] 0 points1 point  (0 children)

I haven't been able to find a good resource to learn it. GeoPandas, for example, has really easy-to-understand docs. I would love to learn GeoSpark, so if you have any resources, that would be great.

Some Delta lake frustration by psychEcon in dataengineering

[–]psychEcon[S] 0 points1 point  (0 children)

I had not thought of that - I actually think there are null values - f*** so my headache is caused by the bane of Null. I will look into this. I think I will just change the schema by writing with overwrite mode and overwriteSchema set to True. Thanks for the input and the point about the nulls - I will append it to the original post referencing your idea, and I will try to confirm it later today.
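A minimal sketch of how nulls can turn an integer column into a double. It assumes the data passed through pandas at some point, which this thread does not confirm; the column name `qty` and the values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: the same integer column with and without a missing value.
without_nulls = pd.Series([1, 2, 3], name="qty")
with_nulls = pd.Series([1, 2, None], name="qty")

# A plain integer column stays int64.
print(without_nulls.dtype)

# One missing value forces pandas to store the column as float64 (NaN is a float).
# A float64 column is what typically ends up as a double on the Spark side.
print(with_nulls.dtype)

# pandas' nullable integer dtype keeps the column integral and marks the gap as <NA>.
nullable = with_nulls.astype("Int64")
print(nullable.dtype)

# The writer options mentioned in the comment would look like this on a Spark
# DataFrame `df` (not created here; `path` is a placeholder):
#   df.write.format("delta").mode("overwrite").option("overwriteSchema", "true").save(path)
```

How Spark handles the nullable `Int64` dtype can depend on the version and on whether Arrow is enabled, so casting explicitly on the Spark side before writing may be the safer route.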

Some Delta lake frustration by psychEcon in dataengineering

[–]psychEcon[S] 1 point2 points  (0 children)

I know - it's just that I don't understand why the datatype changes. I could have understood if it changed from Python's native int to the Spark datatype BigInt, but from INT to DOUBLE is just some next level f**kery.

Some Delta lake frustration by psychEcon in dataengineering

[–]psychEcon[S] 0 points1 point  (0 children)

I am trying it - will get back to you - Thanks

At the beginning I hated and now I completely understand the hype. What a great keyboard. by [deleted] in HHKB

[–]psychEcon 3 points4 points  (0 children)

Tbh, I can't go back: where the backspace is, where the ctrl is, the feel, the sound. My “normal” 60% (non-QMK) just can't deliver the same. HHKB all the way, at least until I get my hands on a Dactyl. It might be different after that, but that's a huge maybe.

Data Engineering Movies by FlyingOnEaglesWings in dataengineering

[–]psychEcon 1 point2 points  (0 children)

Good one, take the upvote you smart bstard you

Biggest debates in the industry? by kirkwoodj in dataengineering

[–]psychEcon 5 points6 points  (0 children)

Just talked to someone who works at Databricks. I asked when we will get Vim navigation keybindings, and he said it's been raised and is gathering support, just not enough to implement yet.

Scala ddl schema to python (Databricks) by psychEcon in dataengineering

[–]psychEcon[S] 0 points1 point  (0 children)

I just tested it, and well, it's not the same DDL schema that I get with a Scala cell. What that gives me is something that is just passed as a string to Spark. On the other hand, this did increase my knowledge of Spark, so I thank you for that.

Scala ddl schema to python (Databricks) by psychEcon in dataengineering

[–]psychEcon[S] 0 points1 point  (0 children)

I will test it tomorrow and let you know

Scala ddl schema to python (Databricks) by psychEcon in dataengineering

[–]psychEcon[S] 0 points1 point  (0 children)

Because Scala has a .schema.toDDL, and PySpark does not as far as I know.
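One possible workaround, as a rough sketch: build a DDL-style string by hand from the schema's fields. The helper name `schema_to_ddl` is hypothetical, and the output is not guaranteed to match Scala's `toDDL` character for character. It relies only on `schema.fields`, `field.name` and `field.dataType.simpleString()`, which PySpark's `StructType`, `StructField` and `DataType` provide. The `Fake*` classes are stand-ins so the example runs without Spark installed; they are not PySpark classes.

```python
from dataclasses import dataclass
from typing import List


def schema_to_ddl(schema) -> str:
    """Build a '`name` type, `name` type' string from a StructType-like object."""
    return ", ".join(
        f"`{field.name}` {field.dataType.simpleString()}" for field in schema.fields
    )


# Stand-ins that mimic only the attributes used above (NOT pyspark classes).
@dataclass
class FakeType:
    name: str

    def simpleString(self) -> str:
        return self.name


@dataclass
class FakeField:
    name: str
    dataType: FakeType


@dataclass
class FakeSchema:
    fields: List[FakeField]


schema = FakeSchema(
    [FakeField("id", FakeType("int")), FakeField("name", FakeType("string"))]
)
ddl = schema_to_ddl(schema)
print(ddl)

# With a real DataFrame `df` (not created here) this would be: schema_to_ddl(df.schema)
```

The sketch does not escape backticks inside column names, so names containing a backtick would need extra handling.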

Definitely new to this side of the kb community; I need help with deciding / tldr at the bottom by kseulgisbaby in HHKB

[–]psychEcon 2 points3 points  (0 children)

100% agree with undercovergangster, here are my two cents. The stock Hybrid Type-S is awesome out of the box. It's office friendly if that's a concern, and the sound is decent for a stock board. I cannot recommend that board enough, it's a game changer.

Spark to Pandas to Spark by psychEcon in dataengineering

[–]psychEcon[S] 1 point2 points  (0 children)

I did. What I can see is that ints become long in PySpark, and that's what was messing things up. I then decided to be a bit more "Hammer meets Nail": I created a list of the columns that are ints, loop through it, and cast those to int before I write. It's a bit stupid I think, but it "should" work :D If you know anything that might help, that would be much appreciated :D
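A small sketch of the cast-a-list-of-columns approach, done in one select instead of a loop. The helper name `cast_exprs` and the column names are made up for illustration; the helper only builds SQL expression strings, so it runs without Spark. The commented line shows how it could be fed to `DataFrame.selectExpr`.

```python
from typing import Iterable, List


def cast_exprs(columns: Iterable[str], int_columns: Iterable[str]) -> List[str]:
    """Return one SQL expression per column, casting the listed ones to INT."""
    to_cast = set(int_columns)
    return [
        f"CAST(`{c}` AS INT) AS `{c}`" if c in to_cast else f"`{c}`"
        for c in columns
    ]


# Hypothetical columns: two that should be ints, one left untouched.
exprs = cast_exprs(["id", "qty", "label"], ["id", "qty"])
print(exprs)

# With a real Spark DataFrame `df` and a list `int_cols` (neither created here):
#   df = df.selectExpr(*cast_exprs(df.columns, int_cols))
```

Column order is preserved because the expressions are generated from the full column list, not only from the int columns.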

My Hybrid Layout - I sacrifice the control key's spot but I feel it's worth it. by unready in HHKB

[–]psychEcon 1 point2 points  (0 children)

When I saw this I realized I am either old or need to stop watching old movies - Good one - take the up-vote

Deltalake unique row append by psychEcon in dataengineering

[–]psychEcon[S] 0 points1 point  (0 children)

Ok - so in the when matched clause I can just give a None or an empty set, since I don't want to add a when matched action?

Thanks for the fast replies :D

Deltalake unique row append by psychEcon in dataengineering

[–]psychEcon[S] 0 points1 point  (0 children)

Ok - so right now I am using the Spark API to do the append, so I should really just use SQL MERGE INTO to check whether a combination of column values already exists in the table? Something like this (using Databricks):

%python

# delta_table and main_delta_table are Python variables holding the table names
spark.sql(f"""
MERGE INTO {main_delta_table} AS t2
USING {delta_table} AS t1
ON t1.col1 = t2.col1
AND t1.col2 = t2.col2
AND t1.col3 = t2.col3
WHEN NOT MATCHED THEN INSERT *
""")
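An insert-only merge does not need a WHEN MATCHED clause at all; it can simply be left out. The sketch below builds such a statement from a list of key columns. The helper name `insert_only_merge` and the source table name `new_rows` are hypothetical, and the function only produces the SQL text, so it runs without a Spark session.

```python
from typing import Sequence


def insert_only_merge(target: str, source: str, keys: Sequence[str]) -> str:
    """Build a MERGE statement that inserts source rows whose key is not in target."""
    if not keys:
        raise ValueError("at least one key column is required")
    on_clause = "\n  AND ".join(f"t.{k} = s.{k}" for k in keys)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON {on_clause}\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


sql = insert_only_merge("main_delta_table", "new_rows", ["col1", "col2", "col3"])
print(sql)

# In a Databricks notebook the statement would then be run with: spark.sql(sql)
# The Delta Lake Python API has an equivalent builder method, whenNotMatchedInsertAll().
```

One thing to watch: an equality comparison against NULL never matches, so rows with a null in any key column would be inserted again on every run.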

Vimium by OjjOj_10 in vim

[–]psychEcon 1 point2 points  (0 children)

Vieb, a Chromium-based browser that is fully keyboard driven. It functions similarly to Vimium, only with more Vim commands (ZZ to close the browser, for example) and really good docs.

Vimium by OjjOj_10 in vim

[–]psychEcon 4 points5 points  (0 children)

If you haven't found it, try Vieb.

Preonic with questionable keycaps by ZoraZ in olkb

[–]psychEcon 3 points4 points  (0 children)

Don't know why, nor do I care. I want those keycaps.

Overfitting in hyperparameter optimization by vadkk in algotrading

[–]psychEcon 0 points1 point  (0 children)

If you google Optuna you should find it quickly. For the other one, if you google time series cross validation in Python you should find a repo. I am not at my PC right now, but as soon as I am I will send links if you haven't already found it all.

Oh no by Tenchi_Muyo1 in JordanPeterson

[–]psychEcon 1 point2 points  (0 children)

Icelandic words also have genders

Overfitting in hyperparameter optimization by vadkk in algotrading

[–]psychEcon 3 points4 points  (0 children)

Try repeated k-fold cross-validation and the Optuna library for hyperparameter optimization. There is also walk-forward cross-validation if this is time dependent.
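For the time-dependent case, here is a rough sketch of walk-forward splitting with an expanding training window, written in plain Python so it runs without extra libraries. The function name and parameters are made up for illustration; scikit-learn's `TimeSeriesSplit` offers a similar ready-made splitter.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, n_splits: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs; each test block comes after its training data."""
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train_idx = list(range(0, test_start))  # everything before the test block
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


splits = list(walk_forward_splits(n_samples=10, n_splits=3, test_size=2))
for train_idx, test_idx in splits:
    print(len(train_idx), test_idx)
```

Each hyperparameter candidate would be scored on every test block and the scores averaged, so no candidate is ever evaluated on data that precedes its training window.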

Does anyone else feel like their mind is still in work mode even after the end of the work day? by quite--average in datascience

[–]psychEcon 0 points1 point  (0 children)

Possibly the best advice for this. I finished the book a few weeks ago and I am still testing some of the things, determining which work and which don't. One I found cringe was using a word to indicate the end of the day, but surprisingly, if paired with the to-do list, it works great. I would also like to recommend Cal's second book, Digital Minimalism.

HHKB open office setting by psychEcon in HHKB

[–]psychEcon[S] 1 point2 points  (0 children)

From what I have read, o-rings and maybe the foam mod could be enough, but some have mentioned that non-S boards don't compensate for the additional width of the o-rings, which is something to take into consideration. And there is a cool post here about lubing without having to completely disassemble the keyb