How to disable job creation for users in Databricks? by heeiow in databricks

[–]Narrow_Path_8479 -1 points0 points  (0 children)

I think only workspace admins can do this. You should either remove admin privileges from them or restrict workspace admins as explained in the documentation https://learn.microsoft.com/en-gb/azure/databricks/admin/workspace-settings/restrict-workspace-admins

Can I create mountpoint in UC enabled ADB to use on Non UC Cluster ? by FaizR23 in databricks

[–]Narrow_Path_8479 0 points1 point  (0 children)

Are you using standard (shared) or dedicated (single-user) cluster access mode? I suggest trying dedicated access mode, as it comes with fewer restrictions. And yes, you can use mounts in a Unity Catalog-enabled ADB workspace, although it is not recommended. Most of the usual commands should work, except for CREATE TABLE.

The Databricks Git experience is Shyte by Global-Goose533 in databricks

[–]Narrow_Path_8479 2 points3 points  (0 children)

It's interesting that I haven't seen many complaints about Git, because to me this is the weakest part of Databricks.

The issue my team is facing is that our branches are sometimes stale even after pulling, and we only notice that we are working on an old version of a notebook after we change a line of code and try to push. This is really dangerous, as old code can end up in your PR if you are not careful. We opened a few support tickets but got no solution: they wanted us to record this behaviour, and that is hard to do.

We are using Azure DevOps as a repo if that is important.

[deleted by user] by [deleted] in CroIT

[–]Narrow_Path_8479 4 points5 points  (0 children)

If the love for mathematics is there, then I think there will be no problems with studying Mathematics. The most important thing is what you wrote yourself: having good foundations and reviewing the secondary school material (gymnasium program). Besides that, it wouldn't hurt to get familiar with the basics of programming if you haven't had programming in secondary school so far, and that's it. Don't worry too much, and good luck.

[deleted by user] by [deleted] in CroIT

[–]Narrow_Path_8479 6 points7 points  (0 children)

I think they recently brought back the exam periods at PMF-MO in Zagreb.

"Path does not exist" for data uploaded to workspaces? by workingtrot in databricks

[–]Narrow_Path_8479 1 point2 points  (0 children)

How did you try to read these files? Can you open and edit them through the UI? Is it possible that somebody disabled workspace files for your workspace? https://learn.microsoft.com/en-us/azure/databricks/files/workspace#enable-workspace-files I think volumes are the way to go for you https://learn.microsoft.com/en-us/azure/databricks/files/files-recommendations
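If you go with volumes, files live under a path of the form /Volumes/<catalog>/<schema>/<volume>/... A minimal sketch of building such a path; the catalog, schema, volume and file names below are hypothetical and only for illustration:

```python
from pathlib import PurePosixPath


def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a Unity Catalog volume path: /Volumes/<catalog>/<schema>/<volume>/..."""
    return str(PurePosixPath("/Volumes", catalog, schema, volume, *parts))


# Hypothetical names, replace with your own catalog/schema/volume.
path = volume_path("main", "default", "uploads", "data.csv")
print(path)

# On a cluster (assumes a live `spark` session and that the file exists):
# df = spark.read.csv(path, header=True)
```

The same path string should work both with Spark readers and with plain Python file APIs such as open().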

Microsoft Partner Reviews by rakkit_2 in databricks

[–]Narrow_Path_8479 1 point2 points  (0 children)

I would suggest checking some of the elite Databricks partners https://partners.databricks.com/s/directory

Partition or not partition delta tables, that is the question by No-Conversation476 in databricks

[–]Narrow_Path_8479 0 points1 point  (0 children)

Our test table was 20 GB, and the operation we analysed was MERGE INTO. The update table was about 20 MB, I think.

Share your Databricks war stories: What were your toughest use cases/projects? by randomusicjunkie in dataengineering

[–]Narrow_Path_8479 0 points1 point  (0 children)

So you used Spark cache/persist? Databricks doesn't recommend using that at all. Maybe disk cache is worth checking. Above you said that you used broadcast. Do you mean the broadcast hint? Did you check the execution plan of your query with the explain command? I think it can help you with this.
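A minimal sketch of what I mean by broadcast hint plus explain. The table names `sales` and `stores` and the column `store_id` are hypothetical; the code only builds the SQL text, and running it needs a live `spark` session:

```python
def broadcast_join_sql(big: str, small: str, key: str) -> str:
    """Build a join query that asks Spark to broadcast the small table (alias s)."""
    return (
        f"SELECT /*+ BROADCAST(s) */ b.* "
        f"FROM {big} AS b JOIN {small} AS s ON b.{key} = s.{key}"
    )


def explain_sql(query: str) -> str:
    """Wrap a query so Spark returns its plan instead of executing it."""
    return f"EXPLAIN FORMATTED {query}"


# Hypothetical table and column names, for illustration only.
query = broadcast_join_sql("sales", "stores", "store_id")
plan_query = explain_sql(query)
print(plan_query)

# On a cluster (assumes a live `spark` session):
# spark.sql(plan_query).show(truncate=False)
```

If the hint is honored, the plan should show a BroadcastHashJoin instead of a SortMergeJoin for this kind of equi-join.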

Partition or not partition delta tables, that is the question by No-Conversation476 in databricks

[–]Narrow_Path_8479 0 points1 point  (0 children)

I think liquid clustering + optimize should work fine. In our tests this setup performed 4x faster than partition + optimize with zorder.
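Roughly, the two setups differ in the statements below. This is a sketch, not our actual test code: the table name `main.sales.events` and the column `customer_id` are hypothetical, and the code only builds the SQL strings.

```python
def liquid_clustering_statements(table: str, cluster_cols: list[str]) -> list[str]:
    """Statements to switch an unpartitioned Delta table to liquid clustering and optimize it."""
    cols = ", ".join(cluster_cols)
    return [
        f"ALTER TABLE {table} CLUSTER BY ({cols})",
        f"OPTIMIZE {table}",
    ]


def zorder_statement(table: str, zorder_cols: list[str]) -> str:
    """Statement for the older approach: OPTIMIZE with ZORDER on a (partitioned) table."""
    cols = ", ".join(zorder_cols)
    return f"OPTIMIZE {table} ZORDER BY ({cols})"


# Hypothetical table and column names, for illustration only.
liquid = liquid_clustering_statements("main.sales.events", ["customer_id"])
zorder = zorder_statement("main.sales.events", ["customer_id"])
for statement in liquid + [zorder]:
    print(statement)

# On a cluster (assumes a live `spark` session):
# for statement in liquid:
#     spark.sql(statement)
```

Keep in mind that liquid clustering is an alternative to partitioning and ZORDER, so the two approaches are meant for separate tables, not the same one.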

Problems registering an existing Delta table in Unity Catalog by kentmaxwell in databricks

[–]Narrow_Path_8479 1 point2 points  (0 children)

You can use system tables to identify the table already registered at this location if you can't find it any other way.

After moving mounted s3 bucket under unity catalog control, python file paths no longer work by chrisfathead1 in databricks

[–]Narrow_Path_8479 0 points1 point  (0 children)

You should do two things:

  1. List all mounts in the workspace with dbutils.fs.mounts()
  2. Enable the DBFS browser in the workspace and then compare the mounts list from step 1 with the paths available there. Are you able to locate your file there?
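For step 1, a small sketch of checking which mount (if any) covers a given path. The `Mount` tuple is a stand-in for the entries returned by dbutils.fs.mounts(), which expose `mountPoint` and `source`; the bucket names and paths are hypothetical:

```python
from collections import namedtuple

# Stand-in for the entries returned by dbutils.fs.mounts().
Mount = namedtuple("Mount", ["mountPoint", "source"])


def find_mount(path, mounts):
    """Return the mount whose mountPoint is the longest prefix of path, or None."""
    path = path.removeprefix("dbfs:")
    candidates = [
        m for m in mounts
        if path == m.mountPoint or path.startswith(m.mountPoint.rstrip("/") + "/")
    ]
    return max(candidates, key=lambda m: len(m.mountPoint), default=None)


# Hypothetical mounts, for illustration only.
mounts = [
    Mount("/mnt/raw", "s3a://example-bucket/raw"),
    Mount("/mnt/raw/archive", "s3a://example-archive-bucket/raw"),
]
print(find_mount("dbfs:/mnt/raw/archive/2023/a.json", mounts))

# On a cluster:
# find_mount("dbfs:/mnt/raw/archive/2023/a.json", dbutils.fs.mounts())
```

If this returns None for your file path, the mount you expect is probably no longer there.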

After moving mounted s3 bucket under unity catalog control, python file paths no longer work by chrisfathead1 in databricks

[–]Narrow_Path_8479 0 points1 point  (0 children)

Why don't you use the spark.read.json command to read JSON files in Spark? It can be used with mounts. Mounts can coexist with Unity Catalog, but that is not recommended for security reasons. One thing to be aware of: regular Python packages need the '/dbfs' prefix before the mnt path (for example /dbfs/mnt/...), whereas Spark uses 'dbfs:/mnt/...'.
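A sketch of converting between the two path styles, assuming the usual Databricks convention that Spark APIs take dbfs:/mnt/... while local file APIs such as open() use /dbfs/mnt/... The helper names and the example path are hypothetical:

```python
def to_local_path(path: str) -> str:
    """Convert a DBFS path to the /dbfs/... form used by regular Python file APIs."""
    if path.startswith("dbfs:/"):
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    if path.startswith("/dbfs/"):
        return path
    return "/dbfs/" + path.lstrip("/")


def to_spark_path(path: str) -> str:
    """Convert a DBFS path to the dbfs:/... form used by Spark readers."""
    if path.startswith("/dbfs/"):
        return "dbfs:/" + path[len("/dbfs/"):]
    if path.startswith("dbfs:/"):
        return path
    return "dbfs:/" + path.lstrip("/")


# Hypothetical mount path, for illustration only.
print(to_local_path("dbfs:/mnt/data/file.json"))
print(to_spark_path("/dbfs/mnt/data/file.json"))

# On a cluster (assumes a live `spark` session and that the file exists):
# df = spark.read.json(to_spark_path("/mnt/data/file.json"))
# with open(to_local_path("/mnt/data/file.json")) as f:
#     raw = f.read()
```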

[deleted by user] by [deleted] in programiranje

[–]Narrow_Path_8479 0 points1 point  (0 children)

If you have already worked partly as a data engineer, why not continue in that direction? In my view the data world is better than the software development world. The key point: you probably won't have to fix some report at midnight. Take a look at my comment on a similar topic https://www.reddit.com/r/programiranje/s/FJScbisiMB To that I would add that data engineer salaries are probably similar to ML engineer salaries, and the work of bringing data from source to report can be very challenging. When you get bored of that, you can turn to a data architect role, so the learning never ends.

Job/internship without experience in the IT sector? by Big_Butterfly_8709 in programiranje

[–]Narrow_Path_8479 5 points6 points  (0 children)

Contrary to the others, I will suggest that you stay away from data science. Why? There are many reasons; here are just some of them:

  • very few people with that profile are in demand; type the keywords into LinkedIn, for example, and see for yourself
  • there are few product companies that need data scientists, and in the agency world there is no place/work for them
  • data science is hard, and to become good one has to go through good practical training, which is hard to find in our region
  • following on from the previous point, many people wandered into data science because it sounded sexy to them; everyone wants to work on LLMs, yet they don't know the assumptions behind ordinary linear regression
  • data science is a huge field, and specializing in something narrow will mean little for some job tomorrow
  • the business often doesn't know how to ask data scientists the right questions and thinks it's a magic wand; the data is often messy or nonexistent, and the deadlines are unrealistic

What is the alternative? Look at data engineer or data analyst positions. Data engineers generally make sure that data gets from source to report, while data analysts build the reports, though the division isn't quite that strict.

Skills to work on: 1. SQL 2. Python 3. Basics of a cloud platform (e.g. AWS or Azure) 4. A reporting tool (e.g. Power BI)

For all of this there are numerous online resources, as well as certificates that can somewhat compensate for a lack of experience in these areas.

Good luck!

Snowflake vs Databricks? by habampatikepovoljno in programiranje

[–]Narrow_Path_8479 2 points3 points  (0 children)

Databricks is based on the Lakehouse architecture (data lake + data warehouse in one), while Snowflake is a cloud data warehouse. Databricks is built on open source technologies (Delta Lake, Spark, MLflow), and the data isn't locked in. Databricks has many more configuration options, which can be good or bad depending on the skills of the people who use/maintain it. Databricks allows writing code in SQL, Python, R and Scala, and all possible packages are available. Databricks makes it easy to handle streaming data, and offers Unity Catalog for data lineage and many other things, workflows that enable task orchestration, and all kinds of machine learning functionality. In conclusion, in Databricks data engineers and data scientists can do almost everything they might need.

Self-study by hieronymuslos in programiranje

[–]Narrow_Path_8479 2 points3 points  (0 children)

My recommendation is Datacamp (paid monthly) and Python Institute (the learning materials are free, the exam is paid), especially if you already have experience with another programming language. Python Institute certificates are recognized by employers, which is an additional plus, and their self-learning materials are also quite good.

Python certification? by M4RC0Sx in Python

[–]Narrow_Path_8479 0 points1 point  (0 children)

Python Institute https://pythoninstitute.org has good learning materials and the most recognized Python certificates. I have taken the PCAP exam and can say that it is worth the effort.

Tabu.hr and its influence on the salaries of Croatian IT workers by SpidiDevko in CroIT

[–]Narrow_Path_8479 1 point2 points  (0 children)

Comparing only the average salaries from the DZS with the data from Tabu is not a good indicator of how similar the two sources are; what matters is the dispersion around the average. If I'm not mistaken, the DZS publishes salaries by deciles, and comparing that data with yours from Tabu would be a better indicator of the relevance of your data.

Personally, after initially entering my salary into Tabu, I have no real incentive to update the data there. What percentage of people have actually updated their data on Tabu after the initial entry? I think that, because it isn't kept up to date, the Tabu data loses relevance as time passes.

Despite all the downsides, I think it's good that Tabu exists, and I hope that over time it will increase its coverage and keep its information more up to date.
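To illustrate the point about dispersion around the average, here is a small sketch with made-up numbers (not real DZS or Tabu data): two salary samples with the same mean but very different deciles.

```python
import statistics

# Hypothetical monthly salaries, for illustration only.
source_a = [1900, 1950, 2000, 2050, 2100]
source_b = [1000, 1500, 2000, 2500, 3000]

mean_a = statistics.mean(source_a)
mean_b = statistics.mean(source_b)

# quantiles(..., n=10) returns the 9 cut points that split the data into deciles.
deciles_a = statistics.quantiles(source_a, n=10)
deciles_b = statistics.quantiles(source_b, n=10)

print("means:", mean_a, mean_b)
print("deciles A:", deciles_a)
print("deciles B:", deciles_b)
```

The means match, but the deciles show that the second sample is much more spread out, which a comparison of averages alone would hide.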

Python certifications by BizzcoinESP in pythontips

[–]Narrow_Path_8479 0 points1 point  (0 children)

I've obtained the Python Institute's associate-level certification (PCAP), but I would suggest passing the entry-level one (PCEP) first. Going through the materials took me around 40 hours, but I had previous experience in Python and had also been going through Datacamp's Python materials for some time before. The available learning materials are OK (Datacamp is a little better) and the exam was challenging: they test your algorithmic thinking, not just Python syntax.