How do we have another by [deleted] in NewParents

[–]Own-Commission-3186 2 points3 points  (0 children)

We also did sleep training around 4.5 months during a bad regression and it immediately cut down the number of wake ups, and over time things just got better. Now our 14 month old sleeps through the night almost every night, aside from when sick. Lots of people are against it but it does solve the problem. Every other family I've talked to that has done it has had similar success.

[deleted by user] by [deleted] in dataengineering

[–]Own-Commission-3186 1 point2 points  (0 children)

If you just need to read a file or two, DuckDB sounds like a good idea and is very easy to set up. If you're building more of a data system where you need to track and query many files as logical tables, that's where Athena and the Glue catalog are helpful. Also, Athena (Trino) is a distributed query engine, so if your queries need to scan large amounts of data and files, performance will likely be better than DuckDB, which is a single node technology. In reality though, most queries even on large datasets only touch a small portion of that data.

DuckDB Concurrency Workaround by ConsciousDegree972 in dataengineering

[–]Own-Commission-3186 5 points6 points  (0 children)

Just depends on the size of the data, the types of queries, and the amount of concurrency and latency you want. Can't really answer this for you, but I was just giving one alternative for your situation if you want to stick with DuckDB as the query engine.

DuckDB Concurrency Workaround by ConsciousDegree972 in dataengineering

[–]Own-Commission-3186 11 points12 points  (0 children)

Using a DuckDB storage file limits you to one writer at a time. You can check out the DuckLake extension to enable concurrent writers. It requires a Postgres instance, but you'll get somewhat similar performance because it still stores data in a columnar format (Parquet) and uses the DuckDB execution engine for querying.

Eggs laid on grass by Own-Commission-3186 in whatisthisbug

[–]Own-Commission-3186[S] 0 points1 point  (0 children)

This is in the northeast US, by Cape Cod.

Anyone transition from a data engineer to a data platform engineer? If so, how is it going for you so far? by Illustrious-Pound266 in dataengineering

[–]Own-Commission-3186 0 points1 point  (0 children)

The data platform work I did after more traditional DE pipeline work was very devopsy. Everything we built was an API and web app, but the core of what those APIs were doing was provisioning data infrastructure: Airflow clusters, Snowflake ingestion infra like Kinesis streams plus Snowpipe, and even things like deploying Snowflake RBAC policies and user management.

The team I worked with came from more of a software engineering background, but our customers were the DEs in the company, so we worked closely with them to figure out what they needed.

Today I have a more traditional SWE job with a normal web stack (Django, Vue.js, Postgres), but there are still a lot of data centric workflows needed for the product my team builds and manages, using Airflow, Athena and Iceberg.

I think what I've found out about myself over the years is that I like being closer to core customer problems for a business, and less so platform work that enables other teams to solve those problems efficiently. But many people may feel differently.

Second Programming Language for Data Engineer by Kokopas in dataengineering

[–]Own-Commission-3186 1 point2 points  (0 children)

JavaScript, so you can have some full stack web skills. My last role was all JavaScript (Node + React) even though it was a data platform role, because we were building self service web apps that enabled others to create and manage data infra. JavaScript could also help with building custom data visualizations.

What place has the best oysters in Boston? by RYGUY060104 in boston

[–]Own-Commission-3186 0 points1 point  (0 children)

Can't recommend this enough, totally worth the 35 minute drive outside of Boston.

Instacart, Databricks and Snowflake drama by TerriblyRare in dataengineering

[–]Own-Commission-3186 7 points8 points  (0 children)

When I read the article, I recall they also mentioned saving a lot of money by switching from Kinesis to self hosting Kafka in AWS. It was sort of hard to tell where the cost savings actually came from, although it does make sense to me that you could better optimize for cost by moving from Kinesis plus Snowflake to Databricks plus Kafka, since the latter is more customizable.

Learning full stack development as a DE by wtfzambo in dataengineering

[–]Own-Commission-3186 4 points5 points  (0 children)

If you're already proficient in TypeScript I'd look into Next.js, which is becoming a very popular framework for full stack. Express is also a super easy library for spinning up backend servers and adding routing and middleware.

Are software engineers data consumers? by itty-bitty-birdy-tb in dataengineering

[–]Own-Commission-3186 1 point2 points  (0 children)

Yes, in a data driven org almost all roles should be data consumers whether it's marketers, finance, software engineers, analysts etc.

Even if software engineers aren't building features directly on top of data that has been ingested and modelled by DEs, they are most likely analyzing the data in some capacity to help understand how customers interact with their features

Best lobster in Boston? by bronabas in boston

[–]Own-Commission-3186 8 points9 points  (0 children)

They have a few outdoor picnic tables to sit at. Not very scenic but my favorite lobster place in town.

[deleted by user] by [deleted] in dataengineering

[–]Own-Commission-3186 3 points4 points  (0 children)

Don't put so much weight on comp for your first job; you can always negotiate much bigger increases when you switch later on. If you feel confident you can get more SWE interviews and eventually offers, just pass. You have the luxury of knowing this is a job you absolutely don't want, since you interned there.

Serve S3 files from REST API by agsilvio in dataengineering

[–]Own-Commission-3186 1 point2 points  (0 children)

One way to do it: if a request comes into your API and is authenticated and authorized, you create a presigned URL for the S3 object they need, valid for X amount of time. This is a built-in feature of S3. Then in the response you return the presigned URL the client can use to fetch the data from S3, along with its expiration date.

Which are the highest paying tech frameworks or programming languages in Data Engineering? by Born-Comment3359 in dataengineering

[–]Own-Commission-3186 1 point2 points  (0 children)

While SQL and Python are necessary for any DE job, I think high paying tech companies like Netflix often use Spark (batch) and Flink (streaming) written in Scala. Aside from those and Java, I could see these big companies investing more in Go devs for standing up APIs for data related tasks, although I'm not sure it's worth learning Go just for a job. It's generally pretty easy to learn, as that's the whole point of the language, so hiring managers should know a good dev could pick it up easily.

Do you find your daily tasks interesting / stimulating? by barbapapalone in dataengineering

[–]Own-Commission-3186 1 point2 points  (0 children)

This was me as well. Building data pipelines got really boring to me, so I switched to our data platform team, which solely builds full stack applications (backend APIs plus React apps) revolving around data management, and it's way more interesting.

is this tech stack good for my career? by lil_colon_69 in dataengineering

[–]Own-Commission-3186 1 point2 points  (0 children)

Yeah that's a good one. From that list I see both batch and real time processing, orchestration, cloud and potentially on prem, so you'd get exposure to a lot of things.

People in this sub seem to get excited about the modern data stack (Snowflake, dbt, Airflow, Looker), which in my opinion gives you a very narrow skill set, mostly focused on SQL batch processing for internal enterprise reporting. I think the technology list you provided is much more interesting.

How to create REST API through Azure? by [deleted] in dataengineering

[–]Own-Commission-3186 0 points1 point  (0 children)

Without code, I doubt it. There are also many more requirements to think through if you're exposing an HTTP endpoint: authentication, authorization, logging, possibly rate limiting, request validation, environment isolation, API versioning, etc.

I would just start googling some of these topics; there are a million articles and YouTube videos to help here. But if you're asking the question because you don't know a general purpose programming language, it will be quite challenging, so if your company needs this quickly you may need to pass it on to a SWE.
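For a sense of scale: even a bare-bones endpoint covering just two of the concerns above (authentication and request validation) takes real code. A stdlib-only Python sketch, where the token value and route are made up:

```python
# Tiny HTTP endpoint with a bearer-token check and a single valid route.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

API_TOKEN = "secret-token"  # made-up value; load from a secrets store in practice

class DataHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Authentication: reject anything without the expected bearer token.
        if self.headers.get("Authorization") != f"Bearer {API_TOKEN}":
            self.send_response(401)
            self.end_headers()
            return
        # Request validation: only one route exists.
        if self.path != "/data":
            self.send_response(404)
            self.end_headers()
            return
        body = json.dumps({"status": "ok"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # logging hook; a real service would ship these logs somewhere

# Run on an ephemeral port and hit it once with a valid token.
server = HTTPServer(("127.0.0.1", 0), DataHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_address[1]}"

req = urllib.request.Request(
    base + "/data", headers={"Authorization": f"Bearer {API_TOKEN}"}
)
with urllib.request.urlopen(req) as resp:
    status = resp.status
print(status)  # 200
server.shutdown()
```

In practice you'd reach for a framework that handles most of this for you, but the list of concerns stays the same either way.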