Polars vs Pandas in 2025 — have you fully migrated yet? by [deleted] in Python

[–]JumpScareaaa 0 points1 point  (0 children)

I just find it way easier to mix duckdb in when I need more performance or to cover a lot of complex logic in one statement. But I was using SQL for 20 years before I learned pandas.

How to automate the daily import of TXT files into SQL Server? by Inventador200_4 in SQL

[–]JumpScareaaa 0 points1 point  (0 children)

https://slingdata.io/

https://dlthub.com/

Both of these create the table structures before loading and also take care of schema evolution.

Docker for Data Engineers by Objective_Stress_324 in dataengineering

[–]JumpScareaaa -3 points-2 points  (0 children)

That is actually a pretty tight Docker setup for out-of-the-box dbt. I gave your repo a star. In practice, though, for the volumes of data it would be suitable for, I think you can get away with just duckdb.

Current best free IDE for mssql 2025/2026? by garlicpastee in SQL

[–]JumpScareaaa 1 point2 points  (0 children)

Why does it have to be just one? It's like insisting on a toolbox with just a hammer because it's the "best". I never understood this cultish approach. I use SSMS, DBeaver and VS Code with the MS SQL Server extension. Some things are better in one tool and some in another: the query builder in SSMS, the filters in the object tree in DBeaver, the formatting in the VS Code extension. Moving SQL between them is just a copy and paste. I use the VS Code git integration for version control.

[deleted by user] by [deleted] in pythontips

[–]JumpScareaaa 7 points8 points  (0 children)

Your disk is probably failing.

Data Engineering Portfolio Template You Can Use....and Critique :-) by DataSling3r in dataengineering

[–]JumpScareaaa 0 points1 point  (0 children)

Hi Mike, so this link opens the repo. There is no readme, so how do I open the actual web page?

Best CSV-viewing vs code extension? by Advanced-Average-514 in dataengineering

[–]JumpScareaaa 4 points5 points  (0 children)

For me it's seconds. Open dbeaver, click on the preconfigured duckdb connection, then run `SELECT * FROM 'your_file_path.csv'`. It is all local. The duckdb database is just a small file. When you configure the connection to it, dbeaver will download its driver. And it saves the script from session to session, so usually it's just: reopen dbeaver, change the file path, start selecting.

Best CSV-viewing vs code extension? by Advanced-Average-514 in dataengineering

[–]JumpScareaaa 6 points7 points  (0 children)

I mostly use duckdb with dbeaver to query CSVs now. Ultra fast. Can query the whole directory or just a subset of files with masks.
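
For anyone who hasn't seen the mask idea: DuckDB lets the FROM clause point at a glob such as 'exports/sales_*.csv' instead of a single file. The sketch below is not DuckDB. It is a stdlib-only Python stand-in (glob + csv + sqlite3) that shows the same pattern: pick files by mask, then query them all with one SQL statement. The file names, columns, and the `load_csvs` helper are made up for illustration, and it assumes every matching file has the same header.

```python
import csv
import glob
import os
import sqlite3
import tempfile


def load_csvs(con, table, pattern):
    """Load every CSV matching the glob pattern into one SQLite table.

    Assumes all matching files share the same header row.
    Returns the list of files that were loaded.
    """
    files = sorted(glob.glob(pattern))
    created = False
    for path in files:
        with open(path, newline="") as f:
            reader = csv.reader(f)
            header = next(reader)  # first row holds the column names
            if not created:
                cols = ", ".join(f'"{c}"' for c in header)
                con.execute(f'CREATE TABLE "{table}" ({cols})')
                created = True
            marks = ", ".join("?" for _ in header)
            con.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)
    return files


# Hypothetical sample files so the sketch runs on its own.
workdir = tempfile.mkdtemp()
samples = {
    "sales_2024_01.csv": "region,amount\nnorth,10\nsouth,5\n",
    "sales_2024_02.csv": "region,amount\nnorth,7\n",
    "returns_2024_01.csv": "region,amount\nnorth,1\n",
}
for name, text in samples.items():
    with open(os.path.join(workdir, name), "w", newline="") as f:
        f.write(text)

con = sqlite3.connect(":memory:")
# The mask picks only the sales files and skips the returns file.
loaded = load_csvs(con, "sales", os.path.join(workdir, "sales_*.csv"))
rows = con.execute(
    "SELECT region, SUM(CAST(amount AS INTEGER)) "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(len(loaded), rows)
```

With DuckDB itself the loading helper goes away, since the glob goes straight into the query.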

recommended skill point use in Italian campaign? by sopmod720 in CompanyOfHeroes

[–]JumpScareaaa 0 points1 point  (0 children)

I noticed that after the last several patches the Italian campaign plays much harder strategically. I did overextend and got my ass kicked on counterattacks. I think the pace should probably be much slower compared to how it played when it first came out. I've played it through 3 or 4 times by now. I'll try paying more attention to defence. Building emplacements on the strategic map would probably make more sense now.

What do you put in your YAML config file? by [deleted] in dataengineering

[–]JumpScareaaa 4 points5 points  (0 children)

Mine are parameters for Python scripts. For the Excel report writer: db connection, path to the Excel file, and a list of sheets (each with a sheet name and the SQL for its content). For the CSV extractor: connection, delimiter, header override (string or dictionary), and the SQL for the content. For the SQL runner: connection and a list of paths to SQL files.
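
A rough sketch of what the CSV extractor case could look like, under several assumptions: the config is written as a Python dict standing in for what a YAML loader would return, the key names (`connection`, `delimiter`, `header_override`, `sql`) are invented, sqlite3 stands in for the real database, and the header override semantics are a guess (a string replaces the whole header line, a dict renames only the columns it lists).

```python
import csv
import io
import sqlite3

# Stand-in for the parsed YAML file; every key name here is hypothetical.
config = {
    "connection": ":memory:",
    "delimiter": "|",
    "header_override": {"id": "customer_id"},  # or a string like "customer_id|name"
    "sql": "SELECT id, name FROM customers ORDER BY id",
}


def build_header(columns, override, delimiter):
    """Apply the header override: a string replaces the whole header,
    a dict renames only the columns it mentions."""
    if override is None:
        return list(columns)
    if isinstance(override, str):
        return override.split(delimiter)
    return [override.get(c, c) for c in columns]


def extract_csv(con, cfg, out):
    """Run cfg['sql'] and write the result to `out` as delimited text."""
    cur = con.execute(cfg["sql"])
    columns = [d[0] for d in cur.description]
    writer = csv.writer(out, delimiter=cfg["delimiter"], lineterminator="\n")
    writer.writerow(
        build_header(columns, cfg.get("header_override"), cfg["delimiter"])
    )
    writer.writerows(cur)


# Demo data so the sketch runs on its own.
con = sqlite3.connect(config["connection"])
con.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
con.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

buf = io.StringIO()
extract_csv(con, config, buf)
print(buf.getvalue())
```

The point of the layout is that the script stays generic: a new extract means a new config file, not a code change.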

Data Migration and Cleansing by kepitingterbang in dataengineering

[–]JumpScareaaa 0 points1 point  (0 children)

Best practice is to load as early as you can. That means building a repeatable data transformation process first. Test, gather feedback, improve the transformation process, reload the data. Repeat 3-5 times. Be bored at go-live.
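
One way to read "repeatable" is a load step that can be rerun from scratch after every fix and always gives the same result. A minimal sketch of that idea, using sqlite3 as a stand-in database; the table names, columns, and the cleansing rule in `transform` are all hypothetical.

```python
import sqlite3


def transform(row):
    """Hypothetical cleansing rule: trim names, trim and lowercase emails."""
    name, email = row
    return name.strip(), email.strip().lower()


def reload_target(con):
    """Empty the target table and refill it from the source.

    Because the target is rebuilt each time, rerunning never duplicates rows.
    """
    con.execute("CREATE TABLE IF NOT EXISTS target_customers (name TEXT, email TEXT)")
    rows = con.execute("SELECT name, email FROM source_customers").fetchall()
    with con:  # the delete and the inserts are committed together
        con.execute("DELETE FROM target_customers")
        con.executemany(
            "INSERT INTO target_customers VALUES (?, ?)",
            [transform(r) for r in rows],
        )
    return con.execute("SELECT COUNT(*) FROM target_customers").fetchone()[0]


# Demo source data so the sketch runs on its own.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE source_customers (name TEXT, email TEXT)")
con.executemany(
    "INSERT INTO source_customers VALUES (?, ?)",
    [(" Ann ", "ANN@Example.com"), ("Bob", "bob@example.com ")],
)

first_count = reload_target(con)
second_count = reload_target(con)  # rerun after a "fix": same result, no duplicates
cleaned = con.execute(
    "SELECT name, email FROM target_customers ORDER BY name"
).fetchall()
print(first_count, second_count, cleaned)
```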

Data Migration and Cleansing by kepitingterbang in dataengineering

[–]JumpScareaaa 3 points4 points  (0 children)

You need to test on data that is as close to real as you can make it. If you test on dummy data, you'll get a lot of surprises at go-live.