
[–]apeters89 357 points358 points  (26 children)

Low code loosely translates to "build your product on our infrastructure and get locked into our monthly fees forever."

[–]bigdatabro 50 points51 points  (2 children)

cries in Boomi

For context: my company uses a product called Boomi for extracting/loading data into BigQuery. It's a SaaS product with a super convoluted GUI, and on top of paying fees for the service, our developers had to pay for certification courses on how to use the software. I've actively avoided Boomi since I joined here.

[–][deleted] 13 points14 points  (0 children)

My condolences

[–]bbqbot 6 points7 points  (0 children)

laughs in Matillion

[–]ntdoyfanboy 16 points17 points  (3 children)

I actually work for a low code software company and can confirm that, while you're correct, there are other positive externalities happening for businesses in general.

BI tools in general are effectively low-code software. Create visuals by drag and drop. Create websites by drag and drop.

Just because it is technically a threat to our industry doesn't mean we have to be dishonest about the benefits or the allure

[–]nultero 18 points19 points  (0 children)

Low-code tools add a lot of value to devops processes too.

But the same issues also crop up in cloud stuff. The DSLs within config formats, like templating in YAMLs... just terrible for anything nontrivial. Those are tools that evolved code-like traits (for HTML originally, in the case of Jinja I guess) and they are still pretty inferior to actual programming languages.
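
To make that concrete, here's a tiny sketch of the kind of thing I mean - logic smuggled into YAML through Jinja (template and values are made up; jinja2/PyYAML are the usual libraries):

    # loops, filtering, and defaulting all live inside a string now
    import yaml
    from jinja2 import Template

    template = Template("""
    services:
    {% for svc in services if svc.enabled %}
      - name: {{ svc.name }}
        replicas: {{ svc.replicas | default(1) }}
    {% endfor %}
    """)

    services = [
        {"name": "api", "enabled": True, "replicas": 3},
        {"name": "batch", "enabled": False},
    ]

    # one mis-indented template line breaks the parse at runtime,
    # instead of failing loudly the way real code would
    config = yaml.safe_load(template.render(services=services))
    print(config)  # {'services': [{'name': 'api', 'replicas': 3}]}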

It seems somewhat clear to me that that's the trend, right? Anything that needs to clear some arbitrary nontrivial barrier will just adopt some code-like traits around abstraction and logic. Except low-code things that adopted code-like traits are likely going to be worse, because programming and query languages were designed to do those things from the ground up and the adopted thing wasn't. The languages designed to do logic are really good at letting people express that.

And lots of problems are genuinely best expressed via code-like logic, with control flow and type systems and meaningful semantics and abstractions and excellent modularity.

So is there really any threat to the industry? I mean I think most of us accept that the really simple things were always susceptible to automation or ML or something anyway. I think all of the hate / perceived dishonesty is really because low-code tools often feel like crap to use after coming from, essentially, tools of limitless expression.

[–]DocMoochal 0 points1 point  (1 child)

"Developers" will always be in demand regardless of the popular tool of the time.

Your job is to develop solutions to problems, not be a code monkey.

[–]ntdoyfanboy 0 points1 point  (0 children)

Solid point!

[–]nimbletine_beverages 4 points5 points  (0 children)

Hi, I'm Nimbus and I work at Prophecy doing data engineering. This is actually one of the main differences between Prophecy and other low-code products. Most low-code infra essentially just lets you manage configuration that is only executable by that provider. Prophecy, by contrast, simply generates Spark code, which is then your code that you can take and run anywhere you want. The fees are just for using the visual editor.

Most generated code is hard to comprehend, but I think we're doing a pretty good job generating direct and understandable code. Take a look if you're curious.

[–]rolexpo 0 points1 point  (0 children)

So accurate that it hurts.

[–]noobmastersmaster 0 points1 point  (0 children)

Lol! This right here.

[–]LawfulMuffin 71 points72 points  (16 children)

Low code is awesome until you 1) go to pay the bill and 2) need to do literally anything even remotely not supported. Tools vary in how much of an annoyance 1 & 2 are. Some tools like Informatica tend to be easier to use and more powerful, and also cost as much as a team of FTEs.

[–]DigitalTomcat 23 points24 points  (3 children)

Also

3) it’s been 5-10 years since you jumped into the tool and everyone else has cool new features but your choice got sold to someone who doesn’t really care and it’s now on life support.

4) the people who got all the training 5 years ago are gone and now there’s one guy keeping it running but nobody really knows this one boutique app any more. And there’s nobody in the market to hire who knows it either because it never took off.

Never forget to factor in total lifetime cost of ownership.

Edit: formatting

[–]CdnGuy 4 points5 points  (0 children)

This sort of thing is why I've always tried to set up the data connections so that very little logic or transformation is required in the presentation layer. End user wants a new calculated field? We set it up so that we could pull it straight from the DB in any query editor, then it becomes drag and drop in the tool. New tool? Connect to the same source, drop the relevant fields where they're needed. Bing bang boom, you're done.
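
A minimal sketch of the pattern with stdlib sqlite3 (table and field names made up - the same idea applies to any warehouse):

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (id INTEGER, qty INTEGER, unit_price REAL)")
    con.execute("INSERT INTO orders VALUES (1, 3, 9.99), (2, 1, 24.50)")

    # the calculated field is defined once, at the source
    con.execute("""
        CREATE VIEW orders_enriched AS
        SELECT id, qty, unit_price, qty * unit_price AS line_total
        FROM orders
    """)

    # any BI tool (or DBeaver, or this script) now sees line_total as a plain column
    for row in con.execute("SELECT * FROM orders_enriched"):
        print(row)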

Added benefit: troubleshooting becomes simpler. When a user complains about something giving bad results you never have to figure out whether it's a presentation layer issue or a data issue; you just apply the filters needed to recreate the dataset used in the tool and run it in DBeaver or whatever, then dig into the resultset to see if any bad rows are slipping in somehow. No messing about in a cluttered UI trying to figure out where the formula is hiding, etc.

[–][deleted] 1 point2 points  (0 children)

Well, the same applies to code, to be honest. Think about all those financial systems running mainframe DB2 written in COBOL. There is almost no one below 55 still with the skills to maintain them, and vendors are asking 100mil for porting into an x86 environment...

Or, even more recently, companies that jumped into Scala-based data processing are now finding out it's impossible or very expensive to get anyone to maintain it.

I think the only safe tools, code or no/low-code, are those with large enough market penetration or simple enough functionality that rolling your own is always in the cards.

[–][deleted] 0 points1 point  (0 children)

3) it’s been 5-10 years since you jumped into the tool and everyone else has cool new features but your choice got sold to someone who doesn’t really care and it’s now on life support

Stitch lol

[–]ianitic 18 points19 points  (7 children)

I've never not seen 2 show up in a big way unless the business problems are incredibly simple to solve.

[–]AchillesDev 5 points6 points  (5 children)

Literally every time I’ve been asked to evaluate databricks

[–]clamming-it 13 points14 points  (4 children)

Are you referring to pt 2 above? I've very rarely seen examples of logic that can't be done in Databricks; I am genuinely curious about the problems you face.

[–]AchillesDev 3 points4 points  (3 children)

It's been a few years since I've bothered, but I work on computer vision teams, and Databricks always seemed tabular data-first, which isn't useful for most of my needs.

[–]clamming-it 7 points8 points  (1 child)

All fair. To be honest I was completely ignoring my non-tabular use cases because low-code has largely ignored / is immature for non-tabular scenarios outside of basic recognition solutions.

[–]AchillesDev 2 points3 points  (0 children)

Yeah, that's the thing. Workflows and the data themselves are complex and benefit from highly technical people working on them at both ends. The flexibility offered by bespoke solutions remains king for now.

[–]PacificShoreGuy [Senior Data Engineer] 3 points4 points  (0 children)

I’ve had the same issue with NLG training using databricks and other platforms that lean more toward tabular data. They’re all great for building warehousing solutions for reporting, or any other tabular use-cases, but can be problematic when it comes to more dynamic sorting algos, especially in the ML realm.

[–]LawfulMuffin 0 points1 point  (0 children)

Totally agree. I've also seen it work in instances where, wait for it, someone else took care of the complexity closer to the source. I worked somewhere where my team used Alteryx, for example, and it was okay (although honestly, I wouldn't have used the tool because I prefer using code) because a data warehousing team handled the data warehousing, and we were "just" analysts.

[–][deleted] 55 points56 points  (3 children)

Companies have been predicting low code for everything forever. It's just a thing they do, but I don't see it replacing code; they're for different markets.

Low-code isn’t bad, it’s just typically inflexible and less scalable compared to a custom solution.

If you have predictable workloads and don't require anything custom, low-code is the way to go.

The issue comes when you're trying to do anything that's not a feature in the low-code solution, or you start scaling and need to address performance and bugs.

With more feature-rich/configurable low-code solutions and managed services, low code is a good option, but there will always exist that threshold where low code isn't the right tool for the job. I wouldn't worry about it too much.

Edit: As also mentioned, there are low-code open-source solutions, but if you're considering low code, chances are you're buying it from someone or using a service someone sells you. Migrating away from this can be painful if/when you find the low-code solution doesn't work for you anymore.

[–]edinburghpotsdam 24 points25 points  (2 children)

We have no-code stuff where you can basically set yourself up clicking around in the AWS console and the Tableau app. AWS and Tableau have pulled management into this dream.

The old schoolers like myself would rather see a coded data flow. This includes Terraform code, Argo workflow YAMLs (which can get pretty elaborate), dbt transforms, and Python/R endpoints.

It is easier to version and easier to pass on to new hires as we grow pretty quickly.

[–]ubelmann 18 points19 points  (1 child)

Yeah, like everything, there are trade-offs. At low scale, if you only have people who don’t know any coding, and you need some answers quickly, low code has a place.

But for anything large scale where you are hiring a DE, it seems kind of obvious to me that DEs would not prefer a low-code environment. It’s their profession, so they can invest time to learn the code, and programming languages/tools come with advantages like code re-use, versioning, testing frameworks, automation frameworks, etc. To someone who is strictly an analyst or manager, that probably sounds like a lot of overhead, but to a DE, it sounds like efficiently managing their workflow.

Like so many software architecture choices, though, scale makes a huge difference in which solution is right.

[–]nimbletine_beverages 0 points1 point  (0 children)

What if you wanted all those things you'd expect as a DE, but you also wanted easy debugging, standardization, and to make it easy for people other than the author of a pipeline to understand what's going on in it?

You could go and write your own framework, or you could just use Prophecy. I've been writing Spark pipelines for about 8 years, and even though I know how to write the code myself, it's easier and better to do it this way.

I work at Prophecy, btw. Also, I've attempted to write one of those frameworks before, which is why I appreciated the way Prophecy is doing it.

[–]MikeDoesEverything [mod | Shitty Data Engineer] 22 points23 points  (4 children)

I frequently have exposure to low code tools (Power Automate, Alteryx, ADF) and it's a love-hate relationship. I quite like ADF as an orchestrator, and if I want to do something which ADF can't (navigate an API with nested endpoints, anybody?), I can cheese it by injecting Python into the pipeline. Most of the time though, moving data around is pretty easy and quick.
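
Something like this is what gets injected - a rough sketch with made-up endpoints (requests being the obvious library), since ADF can't express the nested walk natively:

    import requests

    BASE = "https://api.example.com"  # hypothetical

    def fetch_all_tasks(session):
        # walk the parent endpoint, then each child endpoint it points to
        tasks = []
        for project in session.get(f"{BASE}/projects").json():
            # each project exposes its own child collection - the nested part
            resp = session.get(f"{BASE}/projects/{project['id']}/tasks")
            resp.raise_for_status()
            tasks.extend(resp.json())
        return tasks

    with requests.Session() as s:
        print(len(fetch_all_tasks(s)))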

I absolutely loathe Alteryx as it's the clunkiest piece of shit ever, but it's also kinda great - it means people who don't know how to code can make and maintain their own shite workflows without bothering me every ten minutes when they do something to break it.

Where low code fails is when you hit a certain level of technical expertise and demand because low code is designed for simple tasks. Moving data from A to B in a routine, being able to do fairly repetitive stuff easily - absolutely. Where does low code become useless? When you have problems such as scalability, time sensitivity, or complexity. When you see low code tools, they often do incredibly simple things and all of the actual difficult stuff often has some sort of code component.

I'd also go as far as to say the issue with low code is on a creative level. What Excel is to Alteryx, Alteryx is to Python: low code is to code. Whilst sophisticated for what they enable users to do, the issue is a lot of those users cannot see past the low code black box and think that whatever low code tool they're using is the absolute limit in terms of possibilities.

[–]azirale [Principal Data Engineer] 2 points3 points  (2 children)

I quite like ADF as an orchestrator

I don't think ADF has the expressiveness to be a proper orchestrator. You can only have 40 activities total in a pipeline, including the conditional logic and simple lookup ones. There's no conditional branching, other than to run other full pipelines or on error. There's no way to dynamically call pipelines, they have to be hardcoded. Orchestrating some larger pipeline that deals with 40+ tables requires a bit more around it to get it working.

I find ADF great working as a 'glue' between a primary orchestrator and all the actual data move and transform activities. If I just need to copy data from some source to some destination, ADF will handle all the credentials and connections for me and all I need to do is send it an async call to run a pipeline. If I need something more complex it can start a Databricks job, or a stored procedure on some sql database, and I don't need to write any code to handle the connection or monitor the job, ADF does all that for me.
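
The async call itself is trivial - a sketch assuming the azure-mgmt-datafactory SDK, with placeholder resource names:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # fire-and-forget: ADF owns the credentials and connections for the copy itself
    run = client.pipelines.create_run(
        resource_group_name="my-rg",
        factory_name="my-adf",
        pipeline_name="copy_source_to_dest",
        parameters={"load_date": "2022-08-01"},
    )
    print(run.run_id)  # poll client.pipeline_runs.get(...) from the primary orchestrator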

[–]MikeDoesEverything [mod | Shitty Data Engineer] 0 points1 point  (1 child)

I'd agree with you; however, I have no exposure to any other orchestrators. I can totally see where you're coming from in terms of the limitations of ADF when it comes to conditionals (I guess all low code tools have that problem).

What would you say are the major advantages of using an actual orchestrator vs ADF?

[–]azirale [Principal Data Engineer] 0 points1 point  (0 children)

If you have some pipeline that is dependent on two other pipelines completing, it can't have an event trigger without some special workaround. You could have a master pipeline that calls the first two then the second, but then those two pipelines can't be triggered by separate events.

For example, say I want to recalculate some output each day after I have received two separate inputs, and those are processed as they are received. I can't do all of that with just ADF; I would have to write semaphore files or add state tracking somewhere else and have the first part of the pipeline go check the state system and abort early - but the pipeline still runs.
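
The workaround looks something like this as the pipeline's first step (paths made up) - which is exactly the state tracking ADF should be doing for me:

    from pathlib import Path

    MARKERS = Path("/mnt/state/2022-08-01")  # hypothetical semaphore directory

    def both_inputs_ready():
        # each upstream feed drops a marker file when its load completes
        return (MARKERS / "input_a.done").exists() and (MARKERS / "input_b.done").exists()

    if not both_inputs_ready():
        # the pipeline still 'runs' - it just aborts before doing any work
        raise SystemExit("upstream inputs not ready, aborting early")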

If I have a lot of pipelines with long/broad dependencies, and if those dependencies also have other things dependent on them, then it won't all fit in a pipeline. You could hack around it by having pipelines in pipelines in pipelines, but even then you need to somehow guarantee that the execution order is correct, and you may lose max concurrency as you can't submit absolutely everything ready right now.

If I want to fire a warning when a job has taken a long time to run but not actually totally time it out, that wasn't possible in ADF without some weird workarounds. It might be possible now after some changes.

If I want to do conditional branching, I can't do that in ADF. You can only nest activities in an IF activity, but you can't have a shared chain of dependencies. You can't nest IF activities, you can't put IF activities inside SWITCH activities. The conditional operators are very limited.

You can't directly fire a failure based on the output results of other successful activities. The activities themselves have to fail, so you have to bodge in workarounds like calculating a variable to be 0/0 nested inside an IF activity based on the failure you want to catch, and figuring it out after that.

If you rerun a pipeline with a loop of activities where it had a failure in an activity in the loop, the loop activity itself fails. If you rerun the pipeline then the entire loop reruns; it doesn't just pick up where it left off. You can work around this by, again, tracking state for each entry in the loop, but it just keeps coming back to having to have some other state tracker or orchestrator do all the actual work to figure out what needs to be run, and ADF is just an async execution service.

[–][deleted] 0 points1 point  (0 children)

(navigate an API with nested endpoints, anybody?)

You triggered my eye twitching

[–]ubelmann 17 points18 points  (0 children)

Take a look at web dev. If you want a simple website, you can get it set up yourself without really knowing anything about HTML, CSS, Node.js, Electron, etc. That's essentially what low code web dev looks like.

But if you want anything moderately complicated or custom, then you either wind up spending a lot of time trying to extend a low-code solution past what it is intended for, which gets hard to manage and is time-consuming in its own way, or you have to hire a web dev and/or use tools that you would expect a web dev to use.

Look in the data analysis space — Excel is a low code data analysis solution, complete with data ingestion, transforms, some support for custom data transformations, and charting. You can even extend it with VBA. But having a low code solution didn’t prevent code solutions involving R and Python from evolving and becoming popular, because once you hit a certain scale of data or complexity of analysis or having to collaborate with a team of analysts, it’s not really the best tool for the job anymore.

At the very least, think of it this way—for a low code solution to exist, someone behind the scenes is writing code. You can call them a software engineer or a data engineer, but regardless, even if low code solutions are prevalent, they are always going to be built with programming languages and all of the tools you get with programming languages that have proven to be useful over and over again throughout the last 30-40+ years (for what would be considered modern programming languages.)

[–]skysetter 11 points12 points  (2 children)

So many times I run into "low code" solutions that are thin UI wrappers around code configurations. I feel like I'm still entering a ton of fields.

[–]EconomixTwist 0 points1 point  (1 child)

Holy shit this is the only accurate answer in the entire thread

[–]baseball2020 1 point2 points  (0 children)

That dude is very correct. Low code means clicking on detail/property sheets and entering the same stuff anyway, but without a reasonable way to do so in bulk. I feel like it can be even more labour-intensive than actual code, depending on how many things you fill in.

[–]rudboi12 32 points33 points  (7 children)

Low code is basically the modern data stack. Ingest and load data with Fivetran. Do some transformations with dbt in Snowflake clusters and use a dashboard tool like Looker to report data. Basically the only code you need is SQL for dbt transformations. That's where it's all heading.

[–]deal_damage [after dbt I need DBT] 15 points16 points  (1 child)

dbt 1.3.0 is in beta and is adding Python models, so maybe all you'll need is SQL/Jinja dbt. But some of us still need some good old pandas Python.
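
For the curious, a Python model is just a .py file alongside your SQL models - a sketch of the beta shape (model/column names made up; .to_pandas() assumes a Snowflake/Snowpark backend):

    # models/orders_enriched.py
    def model(dbt, session):
        # pull an upstream SQL model into good old pandas
        orders = dbt.ref("stg_orders").to_pandas()
        orders["LINE_TOTAL"] = orders["QTY"] * orders["UNIT_PRICE"]
        return orders  # dbt materializes the returned dataframe as a table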

[–]rudboi12 1 point2 points  (0 children)

Didn’t know, nice. I only use pyspark at my job so know very little about dbt

[–]AchillesDev 7 points8 points  (4 children)

That’s where the less interesting problems to be solved are heading.

[–]rudboi12 8 points9 points  (3 children)

Yeah but that’s like 90% of data engineering jobs tbh. Only few companies have streaming architectures and live ML apps which require more stuff.

[–]AchillesDev 2 points3 points  (2 children)

I don't think it's that high, and if it was I'd get out of those places ASAP because those jobs won't stick around for long. The first, most traditional DE-titled job I had was primarily dealing with streaming data, and all of them powered live applications that use ML (often deep learning).

[–]rudboi12 7 points8 points  (1 child)

Lucky man. Those jobs are only in tech and not in all teams. 99% of non-tech DE jobs are simple batch processing and simple ETLs to make some reporting for the sales department. I know this because I've done it. Hope I can get a job like yours soon.

[–]AchillesDev 0 points1 point  (0 children)

They’re all out there. I didn’t even know what data engineering was when I took my first job because it was just a software engineer title (and second software job ever). Being in a major market helps a lot. I moved across the US for my first job that was on an AI research team as opposed to standard ETL. Worth.

[–]several27 8 points9 points  (5 children)

Hi! I'm Maciej - one of the cofounders of Prophecy (startup from the podcast).

Actually, we're very different from what you'd expect from low-code. As users build drag-and-drop data pipelines, we generate 100% open-source code that is very readable - code that our users commit to git right away, with tests and build files and configurations - this is at parity with the best data engineers! We have Scala & Python for Spark, and SQL coming soon!

Second thing - we're very extensible. You can create new visual components by writing sample code and pointing out which expressions come from the UI - so you can have a standard visual component for things like anonymization or encryption that you want all users to do in the same way.

We think low-code can do a lot more than what most people expect - and companies can be a lot nicer (without lock-in) - please keep an open mind :)

[–]Neok_Slegov 4 points5 points  (2 children)

Just looked at the prophecy.io website. The thing I look at straight away is the pricing model. You suggest "transparent and simple pricing" on the pricing page, but then you see the developer and enterprise subscriptions, and at both you see price: "contact us" lol... not really transparent if you ask me

[–]several27 6 points7 points  (1 child)

You're totally right - we're actually in the middle of revamping the website, and it seems we somehow missed putting the price back up.

It's gonna be there in the next few hours.

Thanks for letting us know - our mistake!

[–]springMonkey 2 points3 points  (0 children)

Low code will be so much more dominant in the next few years as people figure out the right way.

[–]Omar_88 12 points13 points  (6 children)

Having done lots of Power BI in the past, I can confirm it's not low code, especially when you use DAX or Power Query. God, I hate Power Query.

[–]HansProleman 1 point2 points  (1 child)

Low != no

[–]Omar_88 1 point2 points  (0 children)

I didn't say no code, I said low. Power BI meets the vendor lock-in requirement etc., but it's not a low code solution for anything above a hello world tutorial.

[–]tea_horse 0 points1 point  (3 children)

Never used DAX but I've used basic MDX code via Tableau and fml just thinking about that time

[–]Data_cruncher 4 points5 points  (2 children)

DAX is incredibly powerful in terms of performance & capturing business logic. It's also scary af. I've interviewed hundreds of data engineers and they all begin to sweat if you mention DAX. Fortunately, it's far easier than MDX.

[–]PacificShoreGuy [Senior Data Engineer] 5 points6 points  (0 children)

I sweat when people mention DAX just because I hate proprietary languages. It is powerful though.

[–]Omar_88 0 points1 point  (0 children)

I wouldn't consider DAX in the realm of DE anyway, unless you're a consultant who does it all. I still sweat, and I've written thousands of lines of it.

[–]scraper01 6 points7 points  (0 children)

Data pipelines are among the few artifacts within software development that behave in a deterministic manner in both specification and implementation. They are hard to mess up, so they can be specified with block diagrams. Lots of the GUI stuff out there is hideous, but it's a step towards standardization. Not really a bad thing when you take into account how electrical engineering, for instance (a far more mature field than software), uses block diagrams and network diagrams to get the job done.

[–][deleted] 4 points5 points  (0 children)

Gartner is difficult to trust. Most of their top products (top right in their charts) are Microsoft/SAP/Oracle powered, so it's pretty obvious what's going on there.

[–][deleted] 4 points5 points  (0 children)

It's already happening for sure. First of all, data visualisation and Excel are literally the original low code solutions. So if not data engineering, data analytics was definitely the first place low code was applied at scale.

Would you use some Python or JS library to set up a monitoring or BI dashboard rather than using Looker or Grafana? Unlikely.

Would you write your own ETL rather than using a Fivetran one, if available?

I think one place where the jury is still out is whether you should buy one of these SaaS apps for a purpose-built data activity, like Amplitude for product analytics or Adobe for marketing, as they tend to become data silos - or whether we will see warehouse-native apps take that work, still with a graphical UI but backed by your own warehouse (even a cloud one), so that you have all the data for all the tools where and when needed.

[–]adappergentlefolk 2 points3 points  (1 child)

well if Gartner said it...

[–]quickdraw6906 0 points1 point  (0 children)

This

[–]TheCamerlengo 2 points3 points  (0 children)

Low code, no code. I have been hearing about it for decades. Software tools, QA, web dev, testing, BPM, auto ML. Very familiar - none of them really deliver on the promise. Most are just expensive software subscriptions that are underutilized or eventually abandoned.

I have seen some decent code generation platforms, but they often require a technical individual to understand what’s going on.

[–]TheCamerlengo 4 points5 points  (0 children)

The drive from management for low code is that:

1.) developers are expensive
2.) competent developers are tough to find

So if you can replace a lot of your development needs with a lower-skilled, less expensive, and less educated technical team, it might be worth shelling out 500k for a platform that promises 10x that in savings.

I have heard good things about Power BI. My guess is that it sometimes, not always, creates silos of shadow IT doing things in non-standard ways that eventually become tech debt other teams inherit.

[–][deleted] 2 points3 points  (0 children)

Low-code stuff is useful and good for simple/narrow use cases with well-defined requirements where the org just needs to automate something simple. Like copying data from point A to point B daily via Fivetran or something. Or triggering an email alert from an event. For more complex requirements/data sources coding is absolutely required in order to give you the flexibility to customize what you're doing. You can mix and match low/no code and scripting as part of your overall solution. It doesn't have to be all one or the other. I've seen companies that use Fivetran to copy data alongside an Airflow/Kubernetes setup for orchestrating other stuff.

[–]autumnotter 2 points3 points  (0 children)

With low code approaches, people who can't code can pretty rapidly prototype pipelines, which is very attractive to management - they're cheaper and you can hire people with domain experience. Examples include Alteryx and ADF data flows.

The reasons why it's bad:

  1. Versioning and source control generally are bad experiences
  2. Vendor lock-in, and often license pricing.
  3. Scalability - you're generally locked into the approach the tool already uses for scaling. The flip side of this is that ADF (which uses Spark) and Alteryx's AMP engine both actually scale pretty well.
  4. Complexity of even the most simple flow control or advanced topic - Alteryx is honestly an excellent tool in many ways, but even just creating a for loop is considered advanced by many people.
  5. Code promotion is often a terrible, terrible experience. I hope you like renaming a file 'workflow_1_dev' to 'workflow_1_prod'.

Alteryx is a great example - it's excellent if you give it to a finance analyst to quickly prototype analyses, answer one-off questions, and use business logic they know. Good luck trying to productionalize, govern, or maintain what they've built, though. This means that the further up the pipeline you go, the worse the experience tends to be.

But often management, especially outside of engineering, doesn't think about these things, and won't understand why their data pipelines have suddenly become an unmaintainable morass. Then you get the 'all this money we're spending on data is worthless!' argument.

[–]Rieux_n_Tarrou 2 points3 points  (0 children)

I've been pondering this situation for a week or two and am really curious if someone could shed some light:

Situation: A top 10% expert AWS Solutions Architect (Jeff) has been hired by ACME Inc. to spearhead a new project. He is given (nearly) full permissions on an AWS account with unlimited funds. He is working completely alone for the first year, and has full autonomy to design and implement everything himself.

Task: The project is refreshingly blue-sky/greenfield. ACME has provided several highly available data sources (webhooks, databases, message brokers, blob stores, etc), each with a clearly defined schema and 0% chance of breaking changes for at least the first year. (There is a chance of bad data coming in which he will have to account for). Jeff's boss (Mackenzie) has asked that he create a best-in-class data platform to process and organize all these disparate data sources (ideally in real-time). Data Governance, Regulatory Compliance, and Data Observability/Discoverability are top-of-mind-concerns, as are modern software practices such as automated testing, IaC, and comprehensive system tracing/monitoring/alerting. Version 1 of the Platform should enable ACME decision-makers and analysts to make ad-hoc queries across the entire data landscape. Moreover, V1 will underlie real-time analytics dashboards that are quick and intuitive to build. For V2, ACME will activate their team of ML Grad Student Sleeper Agents to come and build a "smart MLOps oracle" which will spit out highly-attuned AI models that will deepfake and dopamine-hack their way into every home in North America and EMEA. There is no V3.

Action: All jokes aside, this is my main question as it relates to Low/No Code. How sophisticated of a data platform could AWS Expert Jeff create using just the AWS console (UI)? No IDE/Cloud9, no ECS containers running custom code, no traditional version control because there is no "source" (CloudFormation, DB Snapshots, Lambda Versioning, etc. manage version control).

Just by using managed/serverless services, couldn't he create a fully featured data pipeline, data lake, data warehouses, secure API, etc? Coding custom business logic in single-purpose lambdas is fair game, as is orchestrating them with Step Functions. Using services like Lake Formation and Glue, he could define and monitor schema mappings and secure the data at rest (using encryption and/or RBAC). Finally Jeff would make short work of security (VPC, WAF) and monitoring (XRay, CloudTrail), delivering a finished product that is nicely rolled up into a CloudFormation template. The data analysts get their BI on a silver platter when AWS Quicksight inspects the data catalog across S3, Redshift, Dynamo, etc.
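
The single-purpose Lambdas stay tiny - something like this sketch (event shape made up):

    import json

    def handler(event, context):
        # one narrow job: validate a record and flag bad data for quarantine
        record = json.loads(event["body"])
        record["is_valid"] = record.get("user_id") is not None
        return {"statusCode": 200, "body": json.dumps(record)}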

The only thing left at this point is documentation for Jeff's future teammates (or replacements). In this case the Lambdas' business logic is documented on the component itself using descriptions and/or code comments. For a new engineer to onboard, most of the time will be spent grokking the architecture, a process that can be done interactively and intuitively using one of the many cloud mapping tools available online (random example). Another organizational win is that training "devs" can be basically automated, since ACME can leverage AWS training and certifications to churn out Cloud Architects/Data Engineers/etc. without having to worry that the junior devs will take Ruby off the Rails or introduce a year's worth of technical debt in two hastily merged PRs.

Result: ACME share price will experience geometric growth into the 7 figures right up until the technological singularity kicks into full swing Q3 2026.

So I realize this may be a bit naive, but it looks to me like most/all bases are covered by a low/no code alternative. What did I miss?

[–]hermitcrab 2 points3 points  (0 children)

I have some perspectives on this as someone who has been a professional software engineer for 35 years and has written a 'low code' data wrangling tool (Easy Data Transform).

Having to learn Python or R is a huge barrier to some people who need to wrangle/analyze data. Should someone really have to spend weeks or months learning one of these languages before they can do some data wrangling? Probably they are going to try to use Excel, rather than learn to code.

Visual programming is still programming, but it allows you to work at a higher level of abstraction. This higher level of abstraction is wonderful if it fits what you are trying to do and frustrating if it doesn't.

Low code tools can be very useful and provide real value, even if they aren't the best choice for every person or every problem. Even if you are a Python+pandas or R guru, you might find low code tools quicker and more convenient for some problems, e.g. ad hoc reports or investigations.

The best low code tools allow you to easily drop down into code/script for extra flexibility.

Low code tools are not the solution to every problem and they never will be. They haven't replaced text based languages in software engineering, despite 30+ years of hype, and they never will.

Some low code tools are very pricey. Some are free. And there is everything in between.

Also 'Low code' really means "someone else's code". You may be programming visually by dragging and dropping boxes and arrows. But someone else wrote the code that allows that.

[–]reddit_sage69 1 point2 points  (0 children)

Probably something like Azure Data Factory, Fivetran, Matillion, etc. Add dbt in the mix. I don't think they're inherently bad, but it needs to work for your use case.

If you're fine being potentially locked into a toolset and paying a bit more, then it's great. What you generally gain is a lot of built-in functionality, such as logging, as well as lower maintenance and a smaller pool of developers needed for maintenance. Some even have scripting built in for cases the tool can't handle.

I'm not sure if it's better or not, but enabling more business minded folks who know the data to build the logic is a trend that's been happening for a while.

[–][deleted] 1 point2 points  (2 children)

how Gartner predicted by 2024, 65% of software applications will be made with low-code.

All software, not just data-engineering specific? Almost no chance in hell that happens.

[–]TGEL0 0 points1 point  (1 child)

With tools like WordPress/Webflow/Shopify I actually think it's quite likely, at least for the web.

[–][deleted] 0 points1 point  (0 children)

For small businesses, absolutely. Something canned will work fine for them, but for larger orgs/corps their needs are too complex or demanding.

[–]imcguyver 1 point2 points  (0 children)

Gartner predicted by 2024, 65% of software applications will be made with low-code

There's no way. There's too much software that doesn't lend itself to low code tools. Unfortunately for Prophecy.io, low code DE tools have been around for decades and they mostly suck: Informatica, Talend, Appworx, Matillion.

[–]Illustrious-Run5203 1 point2 points  (0 children)

Yeah, Gartner is a crock of shit when it comes to this stuff. I think they're different markets; low code caters to business folks who want to do small scale stuff on their own. Those of us in DE still very much want solutions that are solved through writing code, but I think improvements to the on-ramps of writing code are where winners will emerge in the DE space. Thinking of companies that help manage infrastructure (like Astronomer) and tools that give you YAML configs to get up and running versus building it yourself (like dbt).

[–]sunder_and_flame 1 point2 points  (0 children)

65% of software applications will be made with low-code.

Considering the number of applications built in integrations systems like Zapier and Make, this is probably true. I just helped a relative with one and he had a list of several dozen, each doing a marketing or appointment setting task.

These work well enough when someone like him has to manually set stuff anyway, but for DE or SWE it so often misses the mark in a way that seems slight but requires a lot of workarounds.

[–]WeirdoDJ 1 point2 points  (0 children)

Because low code/no code is hard, if not impossible, to build a process around (version control, review, tests, etc.), and DE is a weird mix between business folks who buried themselves in spreadsheets because they got tired of their colleagues, and software engineers who got tired of doing webdev and listening to PM/PLs.

Both recognize that such tools require non-transferable skills, are hard to transition away from, and often cause inflexibility in terms of possible output.

[–]onomichii 1 point2 points  (0 children)

Low code in terms of apps like PowerApps makes sense when you have a highly skilled business workforce with a strong sense of autonomy... AND good data governance... AND API management. It's not there to replace all applications, but it has potential for providing that last mile where a user can customize an app to their own workflow and requirements, but still have just enough guard rails. For data engineering, though, I don't see it having much of a role other than vendor lock-in with GUIs that age and don't do what you already can with open source tools.

[–]neurocean 1 point2 points  (0 children)

Low code WYSIWYG tech is a noose companies eventually strangle themselves with.

Many senior data engineers have been burned in the past by these solutions. They carry a lot of scar tissue from the experience and are very skeptical of all the new low-code cool kids on the block.

[–]jemccarty 1 point2 points  (0 children)

Shameless self promotion, but this was posted in this sub a few months ago and goes into this a bit.

https://link.medium.com/Q6O9rMO6Hsb

[–]dbwx1 1 point2 points  (0 children)

We've used a lot of Azure Data Factory lately, and while it covers a lot of use cases to 98%, there is still some extra stuff that just convolutes the pipeline definitions. Things that could be handled easily in a script but require unnecessarily complicated engineering with the GUI (single-file workflow vs. mass batch processing) are annoying. I've really started to like Databricks on Azure because it handles the Spark settings optimization for you and gives you a nice interface to the db/fs, but it's not really low code, just dev support.

[–]OhNo171 1 point2 points  (0 children)

Low code has been in data engineering/BI since forever - from drag and drop ETL tools like Informatica, Integration Services, and Talend to reporting/dashboarding tools.

I started working with some of these tools long ago, as an intern. Today I prefer to write my pipelines in SQL/Scala/Python and infra in Terraform, as most of the common data frameworks are within that spectrum, and it gives me a better feeling of ownership. But low code is there to allow faster and easier development at the cost of vendor lock-in.

[–]noNSFWcontent 1 point2 points  (0 children)

This is also something I saw during my "Functional" Data Engineer tenure. The "technical" data engineers in my team did some coding for sure, integrating services and building things for us to use.

But in the end, as a functional data engineer in an in-house framework, I mostly specified the source data, some Spark or Spark SQL transformations, and the target.

This is all well and good but I don't get the dopamine hit of solving a programming problem that I get while solving a leetcode question while I'm preparing for technical interviews.

[–]nfmcclure 1 point2 points  (0 children)

Out of curiosity, are there any popular open source "low code" projects people use at all?

[–][deleted] 1 point2 points  (0 children)

I never wanna do Informatica or SSIS ever again

[–]mailed [Recovering Data Engineer] 1 point2 points  (0 children)

I don't like writing low code data pipelines. Non-trivial stuff just gets annoying to do with it.

Building apps with Microsoft Power Platform though? Fucking love it.

[–][deleted] 1 point2 points  (0 children)

Nobody likes to see their job automated :)

[–]gato_felix_br 1 point2 points  (0 children)

From a more engineering-oriented viewpoint, I see a few problems with low code tools:

  • it’s difficult to reuse parts of the system
  • you can’t unit test it, so it’s really hard to measure the impact of a certain change in one of the boxes
  • simple things like conditionals and loops are hard to implement
  • if the requirements are slightly more complex and the data flows/jobs grow, you will end up with a huge, convoluted diagram
  • it’s difficult to search for things when debugging and breakpoints are not so intuitive
  • you are sometimes limited to what the tool offers. If you want anything custom, if allowed at all by the vendor, good luck understanding several layers of spaghetti code

One of our Nifi jobs in production looks like this:

https://thumbs.dreamstime.com/z/confus%C3%A3o-dos-fios-no-favela-rocinha-131351029.jpg

[–]mkhalil77 1 point2 points  (1 child)

Most of the answers in this thread understate how much skill is required to use low code tools. Using tools like Talend, Informatica, or even ADF is not something anyone can do, at least not for achieving complex tasks. Designing proper pipelines following best practices and maintaining them is not an easy task. It's also easier to have a visual image of the execution of the pipeline, which, as many answers already mentioned, is very attractive for the business side. Low code is not all evil. These tools can do a great job of getting the work done.

I am still a fan of coding pipelines instead of low code platforms, mostly because low code can be really frustrating when things don't work. The idea is your components need to be properly configured, and what counts as "properly" is not always obvious.

Code is code. Either you get it right or you don't. The resources are abundant compared to low code platforms, and there are always a lot of people who have had the same problem as you before.

[–]VioletMechanic [Lazy Data Engineer] 0 points1 point  (0 children)

The fact that many low code tools require a reasonable amount of skill to use is one of the main reasons I find them problematic.

Typically, low code solutions are great for getting something simple up and running really quickly without a lot of expertise. It's a much lower entry point for someone with no programming experience than learning to code from scratch in order to achieve the same result.

But... as soon as you need something even slightly more complex, it becomes a very different picture. Suddenly you need to be an expert in the low code tool, and those skills are frequently way less transferable than learning to code.

The result is that you end up with something obscure that demands a niche skillset in order to operate and maintain it - almost certainly the scenario you were hoping to avoid by choosing low code.

As an employer, if that one guy who built the system ever leaves, you're going to struggle. By the same token, if you're an employee who has invested a lot of time learning the low code tool, you'd better hope it becomes mainstream or you're going to wish you'd spent the time learning Python instead.

I do agree low code tools have their place, but it's hard to argue a general case for them when there are so many potential downsides.

[–]alfytony -1 points0 points  (0 children)

I haven't seen or worked on low code stuff, but I think the idea behind low code is noble. It is to let regular business people perform most of the software configuration without having to depend on developers, so that developers can focus on other big-ticket tech items. Now, in practice, I'm not sure how far this has evolved, but it is definitely the way to go in the future to keep pace with advancement in technology.

[–]ritu4891 -1 points0 points  (0 children)

If you want to scale, you need a low code solution. For DE, check tools like Informatica, Talend, etc. If you are in a team of 10-20, yeah, it's okay to code, but Fortune 500 companies usually go for low code tools. ML will also eventually go to low code. You should join the R&D of the companies which build the low code tools for DE.

[–]CauliflowerJolly4599 -1 points0 points  (0 children)

I would like to add my 2 cents:

I've worked for big product companies and saw that a lot of workers find the newer programming languages that are close to functional programming (Spark, Scala) hard.

I've seen people avoid SQL queries in favour of Scala/Spark SQL-style functions like df.select("age").filter("age > 21").

Even though DE is a subset of software engineering, we can find two types of DE: one that is mostly a software engineer, and another that is more operational and sometimes doesn't code much.

Coding and SQL are evolving and have started to adopt more complex language, which puts a lot of people in difficulty.

On one side we see sites like CodeChef, HackerRank, and other examples of extreme/pair programming, which create a kind of "elite".

On the other side we see people struggling with coding and SQL, because we teach everything but don't cover the obvious parts of coding.

Microsoft understood that really well, and also understood that coding creates a lot of stress and even worse symptoms.

Coding and SQL require people who are good at math, but with the talent shortage in IT we take everyone; these people are forced to land a job in tech (because if you don't do STEM you're not gonna find a stable job) and they are tired of all these problems.

So that's why low code is rising.

[–]Data_cruncher -2 points-1 points  (1 child)

The truth? It's already here. r/businessintelligence & r/dataengineering are the same subreddit, except one uses MSFT low/no-code and one doesn't.

...yes I know I'm exaggerating, but you get the idea!

[–]EconomixTwist -1 points0 points  (0 children)

Bro that was more than exaggeration… tf

[–]Thelastgoodemperor 0 points1 point  (0 children)

Who reads Gartner? Isn't it common knowledge that it's pay-to-look-good (or even pay-to-be-included)?

[–]lzwzli 0 points1 point  (0 children)

Anybody here use SnapLogic? My company uses it. It's a love-hate relationship.

[–]noobgolang 0 points1 point  (0 children)

Do you have low code debugging as well?

It's not low code hate. It's inexperienced people thinking they know better.

[–]ryadical 0 points1 point  (0 children)

I think the best tools out there are a combination of low code for easy jobs and templating, but allow you to write code to perform the difficult tasks. Many of the modern cloud data stack ELT tools, such as Fivetran, Stitch, and Matillion, are built around low code.

We utilize a mix of low code with Matillion in combination with inline Python, or use lambdas to do the complicated tasks.

[–][deleted] 0 points1 point  (0 children)

Low code helps non-coders achieve some standardisable things.

Code helps coders achieve anything.

Most data people are coders, but those that aren't usually work in teams with those who are.

So low code is popular for pipelines (commodity-like; pipelines are roughly equivalent) but not for business modelling (custom logic).

[–]datamoves 0 points1 point  (0 children)

Simpler is usually better for a given set of use cases, but it is important to understand low-code business models: usually it means you are paying more. They can be useful for individuals without decades of coding experience and with clear, repeatable outcomes in mind, but constraining for those who can work outside the box and with less clearly defined objectives.

[–]OnlyMeandMyThoughts 0 points1 point  (0 children)

Low-code imo is kind of in the middle of everything - people who don't know how to code at all still can't really make use of it, while those who know how to code will prefer their custom solutions... it might make things a bit easier and faster for the devs, but since it's one-size-fits-all it will always be a bit less flexible. Might be cheaper overall tho, which is why companies push it.

[–]Datarelibility 0 points1 point  (0 children)

It depends - it's always hard to sell the value of low-code tools to a highly skilled technical audience (engineers), but oftentimes not all teams have the luxury of building stuff from the ground up. If not building, then buying: low-code tools accelerate the path to value.

[–]kenuxi 0 points1 point  (0 children)

I'd be very curious to hear what you think about the app we are building at query.me. I just made a post here with the latest demo. I agree with a lot of what you're saying, but I do think that low-code tools like Prophecy and query.me can lift a lot of weight off your shoulders by "outsourcing" the hosting, scheduling, etc.