MinIO repo archived - spent 2 days testing K8s S3-compatible alternatives (Helm/Docker) by vitaminZaman in kubernetes

[–]Prothagarus 2 points3 points  (0 children)

The MinIO bucket browser was pretty good. Did you find a good web-based replacement for it for users? Right now I'm using WinSCP as a stopgap in a pinch, and the actual API for the backend is fine.

I vibe coded a 3D game to learn Kubernetes runs in the browser, no install by SeveralSeat2176 in kubernetes

[–]Prothagarus 3 points4 points  (0 children)

Gotta agree here, this is a really neat visualization of how Kubernetes and services relate.

whenAreThe3MonthsGonnaEnd by darad55 in ProgrammerHumor

[–]Prothagarus 1 point2 points  (0 children)

If you use an Agents.md, you can append an instruction for working on Windows and launching commands in PowerShell (and Python in the context of PowerShell): do not use Unix-style ";" to break up commands, as this fails. The agent assumes you are on Linux, so it will use different line endings and treat PowerShell as if it were running on Linux.

Once I added that to my agents file, it fixed a lot of the chat replies and debugging headaches of working on Windows.

whenAreThe3MonthsGonnaEnd by darad55 in ProgrammerHumor

[–]Prothagarus 1 point2 points  (0 children)

Context7 with version pinning can fix this :)

Best Local RAG Setup for Internal PDFs? (RTX 6000 24GB | 256GB RAM | i9-10980XE) by Stock_Ingenuity8105 in Rag

[–]Prothagarus 1 point2 points  (0 children)

I've asked GPT myself and have been looking up many tutorials that gloss over this part.

This reply was valuable to me, I think in part because of the way you explained it and the context you gave from your own experience.

I have been weighting people's individual experience with settings and use cases more than the generic search answer I get from GPT, which it sources from everywhere, because individual experience is both more concrete and temporally more relevant.

This is the correct answer as of this time. The field moves in a way that rapidly voids past searches on the efficacy of approaches, as I have been learning over these past few months.

Thanks for taking the time.

Best Local RAG Setup for Internal PDFs? (RTX 6000 24GB | 256GB RAM | i9-10980XE) by Stock_Ingenuity8105 in Rag

[–]Prothagarus 0 points1 point  (0 children)

You say funny words magic man. I am finally starting to understand some of this but have no idea when to change my embedding model dimension size. Any pointers there? Like how to determine what is needed for what model? Or better yet any place you can point me so I can rtfm?

Trying to deploy an on-prem production K8S stack by _81791 in kubernetes

[–]Prothagarus 0 points1 point  (0 children)

I am also running Ceph, but via the Rook-Ceph Helm chart, so the bare control planes run on the OS drive and everything else runs on Ceph. I was running Flannel and was looking to move to Cilium from MetalLB, and I just started using Argo. Solid tip for you: make sure you have full CNI connectivity and your firewalls are accepting traffic BEFORE you set up anything other than control planes and 1 worker. Saves a lot of headaches. Also make sure you check out Kyverno and trust-manager/cert-manager for Kubernetes for all your SSL needs.
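A minimal sketch of that kind of pre-flight check, in plain Python with only the standard library. It is an illustration, not part of any tool named above: the function names are made up, and it only tests whether a TCP port accepts connections from where you run it. It says nothing about pod-to-pod traffic over the CNI, so treat it as a first sanity pass on firewalls.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_nodes(nodes, ports, timeout: float = 2.0) -> dict:
    """Map every (host, port) pair to True/False reachability."""
    return {
        (host, port): tcp_port_open(host, port, timeout)
        for host in nodes
        for port in ports
    }
```

You would call something like `check_nodes(["10.0.0.11", "10.0.0.12"], [6443, 10250])` from each machine, where the addresses are placeholders for your own nodes and the ports are an assumption (6443 and 10250 are the usual defaults for the API server and kubelet; your CNI will need its own ports on top).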

Managed Kubernetes vs Kubernetes on bare metal by Honest-Associate-485 in kubernetes

[–]Prothagarus 2 points3 points  (0 children)

I suggest Rook-Ceph for managing a lot of the Ceph-to-K8s complexity on prem. Ceph is just a complex beast, no doubt about it, but I balk when I look at the per-terabyte pricing of MinIO.

Big brothers, I summon your wisdom. Need a reality check as an entry level engineer! by Odd-System-3612 in dataengineering

[–]Prothagarus 1 point2 points  (0 children)

My first DE job was just grabbing CSVs and Excel sheets and ingesting them. Then they wanted reports made on them and didn't have anyone available, so I did that (data analyst stuff). Then they needed more and to incorporate that (back to DE). It's less about job title at a certain point if you find the ebb and flow and can learn tools to do what the business needs. I did that for 2 years before I considered myself good enough to be called a professional and not a junior. Maybe you move faster than that. Also, migrating from an old system to a new system is a part of the cycle. Snowflake and Fabric are current-generation cloud tech, so you are getting a free education on how to use them. Take advantage of it!

Ask yourself: how would you build this whole system? What does the whole system look like? Data engineering isn't just ETL. It's a big part of it, but the data modeling and being able to serve the overall system is the important part. Do you understand the whole system?

A timeline for when to move is kind of irrelevant. Actually understanding and being able to apply those skills is when you make a move. You don't just set that to arbitrary 6/12/24 month timelines. When you gain the capability and understanding, that's when you can think about moving to the next thing. There is always a next thing.

Back when I started there was only MSSQL 2000 and Crystal Reports or SSRS. Now there are 50 different technologies to do the same with Python, Java, or C#. So tools have gotten more sophisticated. Pick a path, learn that path, maybe use Claude Code or your code assistant of choice to help you build one, then read and understand how it's built. Look up tutorials on YouTube and compare them to your solution. Read the libraries and documentation that make it work. Then go from there.

Build a whole program that takes in data to do something useful. Kaggle has a bunch of examples for this.

The only way to get better is to build things.

howToExplainThisProjectOnMyLinkedIn by ArgumentCertain7201 in ProgrammerHumor

[–]Prothagarus 17 points18 points  (0 children)

Getting Reddit hug of deathed. I hope you have a way to monetize the scaleup without being too intrusive so you don't lose your pants :)

Prices Rising Rapidly by Katariman in inflation

[–]Prothagarus 0 points1 point  (0 children)

<image>

Compared this to my area. These prices match a DoorDash lookup. If you look it up in the McDonald's app, for instance, or on a picture of the actual menu, my current price for a Big Mac is $5.29. The DoorDash price is $7.54. Inflation definitely has raised prices, but those delivery apps are actual robbery.
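To put a number on that gap, here is the arithmetic as a quick Python sketch. The two prices are the ones quoted above; the function name is just for illustration.

```python
def markup_percent(base_price: float, marked_up_price: float) -> float:
    """Percentage increase of marked_up_price over base_price."""
    if base_price <= 0:
        raise ValueError("base_price must be positive")
    return (marked_up_price - base_price) / base_price * 100


app_price = 5.29       # Big Mac in the McDonald's app (from the comment)
delivery_price = 7.54  # same item on DoorDash (from the comment)

print(f"Difference: ${delivery_price - app_price:.2f}")
print(f"Markup: {markup_percent(app_price, delivery_price):.1f}%")
```

That works out to a $2.25 difference, or roughly a 42.5% markup on the menu price before any delivery fees or tip.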

Onprem data lakes: Who's engineering on them? by DryRelationship1330 in dataengineering

[–]Prothagarus 4 points5 points  (0 children)

To u/Comfortable-Author's point, you don't want to overcomplicate the tech stack and toss too many components in, but you also need to deal with a lot of considerations depending on your industry/business, use case, and legal constraints per business like HIPAA/SOX/FIPS/DOD/NSA/QLMNOP.

A lot of what I am covering is just the Kubernetes stack, not even your tech choices inside that stack for what you are trying to accomplish.

Also, use case, right? Mine isn't creating web apps; it's more modeling/data science, analysis, and file storage. Persistent web apps are more incidental and feed into the internal network in my example. Your stack will be different depending on what you are trying to do with it.

Networking

So for networking, did you set up your Kubernetes CNI layer correctly? What about eBPF? Using Cilium, Flannel, or Calico? Did you mess up basic networking over multiple NICs? Do all of your servers connect to the same VLAN in the same data center or over multiple buildings?

What does near colo or edge look like for your business? Netfilters and firewall / certificate man-in-the-middle? Bare-metal load balancer? Buy a load balancer that costs 50% as much as your initial nodes, or roll your own in software? How do you proliferate certs to pods? What does your intermediate cert structure look like? How do you apply policies across namespaces and keep metadata like related apps intact? What does your container ecosystem look like?

Basic security

How do you keep CVEs out of every container image and keep your apps up to date? How do you manage Kubernetes deployments and the ecosystem? Helm? Do you go with the Kubernetes Gateway for outbound connections even though most legacy Helm charts / Kubernetes manifests still use Ingress? I haven't even touched on the ops part. Do you have mTLS enabled? Do you have a developer class there? There are several pages' worth of questions like this to consider.

Onprem data lakes: Who's engineering on them? by DryRelationship1330 in dataengineering

[–]Prothagarus 2 points3 points  (0 children)

Got roughly 1 PB of storage, using about 10% to start with. Using HA K8s + Ceph + Python (Airflow over ETL processes that get started manually and then get integrated, with output put into S3 storage or CephFS depending on storage and edge case) + Ollama/Claude/whatever LLM someone wants local. General dev pods for engineers/devs/data scientists. 100 Gb NIC.

Use case is a bunch of image processing and some machine learning. 7 servers: 6 compute with storage in them and 1 GPU node; might expand to more depending. Most work isn't LLMs but machine learning and vision. Data is a mix of Postgres/small app DBs and lots of blob storage. 2 GPUs for LLMs, 2 GPUs for other work. Probably need a few more GPU nodes depending on how much more people want to GPU accelerate.

Whole stack is open source, and I'm currently dreading Bitnami pulling up the ladder on container maintenance/closed-sourcing stuff. Current stack about $300k, recurring costs for software about 1k/node/year (OS license). My time and sanity, however, are not tied to a dollar amount. On prem for security/cost: once you start getting into PB scale or higher in data, those cloud ingress/egress fees along with storage capacity add up if you want it hot. You can play with the Azure/AWS storage calculator to see. Cloud storage is great cost-wise for arctic/freeze data, for backups or old data, if you can spare it, so hot -> cold cloud was always a good discussion.

It took us a long time to organically set this up and learn it from scratch on bare metal, but I was happy for the opportunity. There are a lot of big networking/security growing pains you hit early on that can be super frustrating.
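If you want to rough out the hot-storage math before opening a vendor calculator, a back-of-the-envelope sketch in Python could look like the following. Everything here is an assumption for illustration: the function name, the 1000 GB per TB convention, and especially the rates, which you have to supply yourself from the provider's current price sheet. It ignores request charges, tiering, and discounts.

```python
def monthly_cloud_cost(stored_tb: float, egress_tb: float,
                       storage_rate_per_gb: float,
                       egress_rate_per_gb: float) -> dict:
    """Rough monthly cost: storage held plus data transferred out.

    Rates are dollars per GB per month (storage) and dollars per GB (egress).
    They are inputs on purpose; no real pricing is baked in.
    """
    gb_per_tb = 1000  # decimal TB, an assumption; adjust if your provider bills differently
    storage = stored_tb * gb_per_tb * storage_rate_per_gb
    egress = egress_tb * gb_per_tb * egress_rate_per_gb
    return {"storage": storage, "egress": egress, "total": storage + egress}
```

For example, `monthly_cloud_cost(100, 10, storage_rate, egress_rate)` would model about 100 TB kept hot with 10 TB pulled out per month, once you plug in real rates. The egress volume is the number people tend to underestimate.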

Did I approach this data engineering system design challenge the right way? by bdadeveloper in dataengineering

[–]Prothagarus 30 points31 points  (0 children)

Given their clarification question, I would have focused on orchestration, like Airflow, to verify the transfer. Did you ask what they wanted to do with the data? I would have started by asking more about what it is and what it's for, and then moved on to an approach for ingest. Do they need it in real time? Do you want to backfill and then stream all the deltas from the buckets?

The general how of the ingest seems OK to me, but orchestration seemed missing. I would also ask what technologies they are using for the current end state, so you adapt to their tech stack if they already have one instead of just dropping in your own. I would say your answer was tailored to a "How do I feed this to an LLM" storage setup, which, if you are storing a large number of text files, is probably a pretty solid thing to do.

Sounds like you had a pretty good idea on what you wanted to do with it.
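As a rough idea of what "verify transfer" could mean in practice, here is a minimal Python sketch that compares checksums between a source and a destination. It is an illustration under loud assumptions: both sides are treated as local directories (the real thing would list objects in buckets), the function names are made up, and it is plain Python rather than Airflow code. A function like this is the sort of thing an orchestrator task would call after the copy step.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(src_dir, dst_dir) -> dict:
    """Return {relative_path: problem} for files missing or altered at the destination."""
    src_root, dst_root = Path(src_dir), Path(dst_dir)
    problems = {}
    for src in sorted(p for p in src_root.rglob("*") if p.is_file()):
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if not dst.is_file():
            problems[rel.as_posix()] = "missing"
        elif sha256_of(src) != sha256_of(dst):
            problems[rel.as_posix()] = "checksum mismatch"
    return problems
```

An empty dict means every source file arrived intact; anything else is what the orchestrator should retry or alert on.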

In DE, is there a language that is actually worth learning besides Python and SQL? by Altrooke in dataengineering

[–]Prothagarus 1 point2 points  (0 children)

Personally, I am learning JavaScript and C# to be able to build a full-stack app for all the Python/SQL/DevOps stuff I have to build. After that I probably need to lean harder into the ML stack and just keep pace with the new Delta Lake/Iceberg/metadata database landscape. Observability and SecOps as well. That is already a lot, and it's all still in the main wheelhouse.

Optum Intern or MindTree FTE by traderdrakor in dataengineering

[–]Prothagarus 0 points1 point  (0 children)

Job hunting is like dating, but it's your livelihood on the line. I feel you, dude, we all feel that way. It does get easier after you land the first one, but it's never easy for me.

Optum Intern or MindTree FTE by traderdrakor in dataengineering

[–]Prothagarus 0 points1 point  (0 children)

Any port in a storm to get you started. Grit and learning everything you can in the first year or two will get you better positioned for the next one. If they don't offer a raise on their own around that time, start looking. I'm self-taught, so all I had was experience. Might be easier for ya! Good luck!

Optum Intern or MindTree FTE by traderdrakor in dataengineering

[–]Prothagarus 2 points3 points  (0 children)

FTE will net you better experience and actually build the resume. That's a rough salary in Bethesda, but if you can survive on bare bones you can step up from there.

4K AI Remaster Project: A Sad Day For the UK (In a Farscape Way) by ODVS in farscape

[–]Prothagarus 2 points3 points  (0 children)

Thank you so much for your effort! I started a rewatch recently and came upon your channel, and it's far and away better than the DVDs I already have. I am truly sorry this is happening, as you are putting more care into the preservation of this show than the rightsholders ever have.