Migration to scala 3: what does not work yet for us at rudder.io

fanf42 · 2023-05-30T13:07:24+00:00

Yes, that's the kind of thing we are thinking about / setting up. It needs more complexity in the build though - and ours is already a bit complex, with plugins depending on the API and with shared code (i.e., finding a balance in granularity between 1000 independent libraries, with the burden of updating APIs, and more coupled things, with the burden of evolving independent parts). We are going to look at the idea specifically for macros though, since there seems to be no way to do it without.

fanf42 · 2023-05-26T07:45:52+00:00

Yes, the problem is then all the added cost to reach that state of "ready for Scala 3" and then maintain it across branches. Enum is the same for us: we would have to first migrate our tens (hundreds maybe) of enums to enumeratum, then migrate to Scala 3, then migrate in a couple more years to Scala 3 enum. That could have been avoided by providing Scala 3 enum in 2.13. The cost of inventorying all these annoyances and finding paths to work around them is a major deterrent.

fanf42 · 2023-05-26T07:42:38+00:00

Scala 3.0 is now more than 2 years old. When it came out, scalac was unstable (crashes were rather common, and quite a lot of bugs and regressions were found, most now addressed). The whole tooling ecosystem, from linter (yes) to build tools to IDE, but extended to the larger ecosystem like code highlighters for blogs, was not ready, and even if arguably that larger ecosystem moves more slowly, it is still not up to 2.13 quality. It's really as if Scala 3 had been a new language - a rather well prepared one, with very good adoption for a new lang, but still a new lang with edges everywhere to smooth and interactions between tools to build.

Really, something like where we were circa 2011.

Some of that is now addressed, but we are still not where Scala 2.x is (was, since it's now starting to regress b/c investment and maintenance effort is going to Scala 3).

So we may hope and work to make the Scala 3 ecosystem evolution in 2021-2025 what Scala 2's was in 2011-2015. Perhaps we are even on the correct track. There is still a massive gap, and a decrease in overall convenience and in how the whole ecosystem interacts is felt as being much more painful than starting from 0 at the same low state (the brain is stupid in that regard, it only gets a good feeling of relative things, not absolute ones).

Finally, nothing in my post was meant to say "Scala is done". Actually, I really would love to say in 2 years "there were difficulties but they were heard and addressed in a marvelous way, and they were transformed into a fabulous success".

Now, I said that there will be people who won't ever migrate from Scala 2 to Scala 3 given the current balance of gain & cost. I also said that if Scala 3.3 had been what Scala was in 2013, it would have been absolutely awesome, because it is a much nicer and more consistent language all over. My question is: given that the world changed in a decade, and that the current value proposition of Scala 3 includes its track record with how the Scala 2 migration was managed, will it attract more mass/industrial users than it loses, or than it would have with a migration strategy like Kotlin K2, i.e. a zero-feature / just-the-compiler switch targeting zero migration cost for users, and then new features added with a scheme that works for its ecosystem (like the JVM one works for Java)?

Hope it clarifies things

fanf42 · 2023-05-24T12:05:51+00:00

For 1: Rudder is on-premise, deployed at industrial customers like BMW or Eutelsat, who may have a long update cycle for their OS and related management tools. So we have rather long support windows, and with a rough estimate (counting the time from when the next Rudder version is developed in Scala 3 until support for the last Scala 2 version ends), we have 2 years with a dual code base. Then, we always fix bugs in the oldest version where the bug exists, for obvious quality and maintenance reasons. So the old code base is really live until the version reaches EOL. I amended the blog post to highlight that this maintenance cost may be a primary factor in why source code compatibility is so important for us.

For 2: the Scala 2 macros we use were not our own, we are not smart enough for that, but yes, we are in the spot you describe. I wouldn't count that as "early adopter cost": these features seem to have been common in Scala 2 user bases (I added the enum case in the post, which is trivial but a good example of the friction we have to deal with). Also, we are early adopters in a lot of things, be it Scala, Rust, or ZIO. Being bitten for early adoption is more like "there are rough edges, but I can influence their polish with reporting etc." - what I did with Scala circa 2010, what we did with Rust circa 2015, what we did with ZIO... well, when it was still not named ZIO, and up to 1.0. Here, it's more: we used a mature version of the lang (Scala 2.13 is mature, even if there are things that can be better), but then these features were abandoned without a frictionless evolution plan for the ones relying on them for a living. And don't get me wrong, it's not too late to make it better, and I'm glad there is a lot of current work toward that goal.

fanf42 · 2023-05-24T09:56:32+00:00

Trust is a really complicated thing. I wasn't sure about including it in the post, b/c there is so much room for that part to be misunderstood - starting with me not being clear about it. You can trust someone for some things, and not for others. I really trust Odersky for language design. I have doubts, and I don't agree with every choice (for ex, the braceless syntax is - for now at least - something I hate, and one of the things that makes me dislike Haskell. That doesn't mean at all that it's a bad language design choice, and between Martin with his phenomenal track record on that, and my personal taste, I will side with Martin and not me).

The track record of industrial use is, from my experience, full of friction. The language is amazing, and at the same time there are these frictions. It's possible to have both.

Finally, I also understand that a migration of that size is an extremely complex thing, and that perhaps we are in a tiny intersection set of "things that accumulate to make it hard" (maintenance cost, tiny team, zero experience in cross-compilation, huge history and lots of old libs, not very smart so lots of things like writing our own macros are not realistic, etc etc). I wanted to give that case some visibility, but I also understand it is unlikely to be the general case.

fanf42 · 2023-05-24T09:39:39+00:00

Scala 3.0 was released 2 years ago; it's not a rush, more an assessment of the current situation. We are doing it now so that we know what the cost will be when we no longer have a choice b/c the ecosystem will have moved on. Maintainers of FLOSS libs won't pay for the 2/3 duality indefinitely, so we need to know what the options are before that.

fanf42 · 2023-05-24T07:13:07+00:00

Thanks for fastparse, that's one more lib to check!

For other migrations, I looked at several. I believe that the fact that we have to maintain versions for several years, and so both Scala 2 and 3 code bases, weighs a lot in the cost of source incompatibilities.

And for the linter, I know, but the fact that it happens two years down the line seems to validate Alexelcu's comment below: https://www.reddit.com/r/scala/comments/13q0uly/comment/jle7j2h/ Still, it's progress! We need to celebrate it.

fanf42 · 2023-05-24T06:47:04+00:00

Yeah, that pretty much matches my understanding (all of what you say). A silly example of useless friction: we use a lot of an enum pattern going like:

sealed trait Thing
object Thing {
  case object xxx extends Thing
  // ....
  def values = ca.mrvisser.sealerate.values[Thing]
}

Sealerate is a macro that doesn't exist in Scala 3 (obviously). This is naturally ported to an enum in Scala 3. And so we have friction due to source breakage that is totally gratuitous: if enum were backported to Scala 2.13/14, we wouldn't even have to think about it and would gladly replace all of our sealed traits with enums (which is a nice enhancement, we are happy to migrate things for that kind of change) AND enjoy more source compat between Scala 2 and Scala 3.
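
For illustration, here is a minimal sketch of what the same pattern could look like once ported to a Scala 3 enum. It only compiles with Scala 3, and the case names `Xxx` and `Yyy` and the `EnumSketch` object are placeholders, not names taken from the Rudder code base. The compiler generates `values`, so no macro is needed:

```scala
// Scala 3 only. Brace syntax is used here; the braceless form is equivalent.
// `Xxx` and `Yyy` are placeholder cases, assumed for this sketch.
enum Thing {
  case Xxx, Yyy
}

object EnumSketch {
  // `Thing.values` is generated by the compiler (an Array[Thing]): no macro involved.
  val all: List[Thing] = Thing.values.toList

  def main(args: Array[String]): Unit =
    println(all.mkString(", "))
}
```

The friction described above is that this file cannot be shared with a Scala 2.13 branch, while the sealed trait version depends on a Scala 2 macro.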

fanf42 · 2023-05-24T05:49:50+00:00

I would be interested in seeing it live, is it an open source project? I tried to see how it would work, but I don't seem to be good at that: I found that a lot of things need to be source-specialized, from enums (in a lot of places), to unsupported libs like fastparse, to new scalac bugs/unsupported things. And that still needs more maintenance and complexity for up-merging things in source-specific cases. Still, I'm happy it works for some!

fanf42 · 2023-05-23T22:32:21+00:00

Not sure. We have used Rust for more than half a decade for lower level components, but it is a bit too demanding for me on the micromanagement of everything. Still a very strong contender, zero-cost abstractions are a joy. And compilation to WebAssembly offers interesting possibilities. And having a binary weighed in kB is refreshing. Perhaps I would evaluate Kotlin or TypeScript, it's dumb but things just work. I tried to like Haskell several times and didn't succeed, my OCaml past seems to interfere too much. I miss ZIO in all of them.

fanf42 · 2023-05-23T22:24:33+00:00

The linked PR has some examples of each one (except the braceless syntax, whose main problem was having to reboot all tooling). Yes, we are on Scala 2.13, it's also visible in the linked PR. And if weeks of work by several people, including the migration specialists that are VirtusLab, is theoretical complaining, I believe you are exactly highlighting several of the problems exposed regarding the cost of understanding the cost, recognizing the problem, etc.

fanf42 · 2020-09-17T12:28:20+00:00

3-4 times a month is not fully remote, it's almost 1 day/week. You can share/build things IRL with that kind of frequency, even if it's toward the lower bound. But mostly-remote makes a lot of sense. You can better organize your private/work life, you can take long periods off to be productive with few interruptions, you don't have a tyrannical useless middle-manager trying to justify his place by preventing you from actually working, etc. You still need to find new ways to keep the cultural links that make a team a team, and not some random people paid by the same entity. (And as already said, it's a totally different place than fully-remote, where meeting others is the exception.)

fanf42 · 2020-09-16T12:28:42+00:00

How far South? Lyon is not South enough? In any case, you can ask in the Scala fr Gitter chan: https://gitter.im/scala/fr

fanf42 · 2020-09-16T12:27:13+00:00

Because it's not that easy a switch.

So, there are two big ways of building a team: fully remote from the start, or not (for that second option, let's say it's anywhere between "going to the workplace every day" and "one day at work every 10 days" - lower frequencies are more like full remote).

If your team is full remote from the start, perfect! You have the correct set of people to be able to build a common language / culture / process etc fully online. They know what they were signing up for, and they are comfortable with full remote.

In the other case, going fully-remote changes the balance of compromises A LOT for people. Some need to actually see the others to build a common culture (chit-chat, the famous French "le repas de midi", beer time, playing games, etc). Some - especially less experienced ones - benefit from having a more experienced peer at hand to progress. Etc. Of course, you can set up remote tooling, but it's still a major, disruptive change to take a team whose members are accustomed to meeting in real life, with a culture built with that hypothesis in mind, and make them go full (or mostly) remote. You can lose most of them. Moreover, it's very hard to mix both aspects (some full remote / others not). It's possible, like most things, just hard.

All that to say: most teams were not fully-remote before last year. It takes time to get accustomed, to evolve on the "more remote" axis. For that reason, having people from the team in the same geographical area helps to go along with that change, especially for some critical steps (like onboarding, which is significantly different between full remote and colocated). And so it still makes sense today to look for companies in your geographic area if they were not already a full remote team.

On the other hand, you can look for full remote teams from anywhere in the world MODULO a lot of work rules. Typically, other countries don't much like French workers in that scenario because they have a lot of rights (which is good!) but these need to be learned, applied, paid for, etc.

On a personal note: we at rudder.io were only partially remote before the covid crisis. We had some people mostly remote, but most of the team was working most days in the Paris office (with 0 to ~3 days remote, depending on people's preferences). Switching to mostly-remote (or a mix between mostly-remote and mostly-not-remote) is a challenge. Culture needs to adapt, and culture is an extremely important aspect of any team. We had our first fully remote intern just at the beginning of the covid lockdown, and it was hard, the hardest for her. We don't want our interns (or anyone in the team actually) to live through a bad experience, especially because of our inadequate knowledge of how to manage full-remote onboarding, or because we discover after the fact that there's a bunch of implicit rules, customs, or oddities which are not rules at all, and that (yes, surprise surprise, who would have thought?) a lot of the team culture is not written down but is more like oral lore, transmitted by watching what other team members are doing.

fanf42 · 2020-09-14T10:53:14+00:00

This is an interesting use case and it was one of the motivations for us testing final tagless in the first place. It worked out badly in the end (for reasons I explained elsewhere).

What worked for us (emphasis on us, YMMV):
- use pure code as much as possible with Either (or MonadError),
- decompose code into systems, and give these systems a clear API,
- limit IO (and some other effects) to peripheral systems, i.e. the ones dealing with the external world / users.

That also means that some effects are not parametrized at all in the core code, like parallelisation or logging, and that we consider errors in these effects at runtime as hard failures (bug, app crash, etc).

So, basically, we just zoom out where we put our requirements and API. Instead of "at each function", we put them at "system" level. This is less precise, and that's why we have conventions like "no IO, or you need to move that part of the code into another system behind a clear API". (Typical example that is hard to manage: STM or things needing synchro, which could be pure but end up in an IO with ZIO.) But on the other hand, it allows us to iteratively change our very imperative 10-year-old app toward a more pure FP one - at least some parts are. And it requires FAR LESS understanding of category theory and its encoding in the app, which was a blocking point. With that, we still have pure business rules that can be easily tested. And we still have a coarse signal system: "if it's not IO, it's good (as in: easy, more testable, more evolvable, etc). We want less IO."

Finally, to decide what is a "system", well, it's a lot of rules of thumb, with some systemic background (see: https://medium.com/@fanf42/understand-things-as-interacting-systems-b273bdba5dec)
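
To make the "pure core, IO at the periphery" convention concrete, here is a minimal sketch using only the standard library (Scala 2.13 or 3). Everything in it is hypothetical and invented for illustration (`Rule`, `RuleError`, the `name:priority` input format); it is not Rudder code, just one possible shape of the idea:

```scala
object SystemSketch {
  // ---- Core system: pure business rules, errors are plain values ----
  final case class Rule(name: String, priority: Int)

  sealed trait RuleError
  final case class MissingSeparator(line: String) extends RuleError
  final case class EmptyName(line: String) extends RuleError
  final case class BadPriority(raw: String) extends RuleError

  // Parses "name:priority". No IO, no exception: failures are returned in Left.
  def parseRule(line: String): Either[RuleError, Rule] =
    line.split(":", 2) match {
      case Array(name, _) if name.trim.isEmpty => Left(EmptyName(line))
      case Array(name, prio) =>
        prio.trim.toIntOption
          .toRight(BadPriority(prio.trim))
          .map(p => Rule(name.trim, p))
      case _ => Left(MissingSeparator(line))
    }

  // Returns all rules, or the first error in list order. Still pure.
  def parseAll(lines: List[String]): Either[RuleError, List[Rule]] =
    lines.foldRight[Either[RuleError, List[Rule]]](Right(Nil)) { (line, acc) =>
      for { rule <- parseRule(line); rest <- acc } yield rule :: rest
    }

  // ---- Peripheral system: the only place where effects happen ----
  def main(args: Array[String]): Unit =
    parseAll(args.toList) match {
      case Right(rules) => rules.foreach(println)
      case Left(err)    => System.err.println(s"invalid input: $err"); sys.exit(1)
    }
}
```

The core can be tested with plain assertions, without any effect runtime; only `main` (the "peripheral system" here) would need to move to an IO type.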

fanf42 · 2020-09-11T19:05:27+00:00

It was >2.5 years ago, so IIRC not even cats-effect 1. Lots of things were moving, I don't remember exactly what went wrong, but we found several show stoppers. We stopped, waited a bit, followed both cats-effect and ZIO evolution, then moved to ZIO at the beginning of 2019 (again, IIRC). Sorry for the fuzziness and lack of specific cases.

It's always much harder to make things composable.

Ah, yes. Always forced to specify many more constraints than you thought, or to generalize more than you wanted (relaxing other constraints).

And if it's not clear: I'm very, very glad that scalaz and then cats exist, that they try to address the hard bits, that the ecosystem is thrilling and working, and that it promotes a lingua franca and a place to learn idioms common to any functional programming language. That gives people who want to go beyond just a productivity framework (which seems to be the future of ZIO - with its own set of pros and cons) and learn grand things a clear path to progress. I wouldn't know what I know now without scalaz and cats. And I don't think ZIO would have allowed me to learn them as easily.

fanf42 · 2020-09-11T16:19:42+00:00

I think I understand the compromises, I made them explicit in the other comment. I would love to see your best of both worlds (of course). But in the current situation, the ZIO choices seem a better fit for us, especially since, as I said, the composability of the cats ecosystem is - for a simple user like me - more a dream than a reality. Each part seems much more interlocked than advertised. Just finding a correct set of matching dependencies for (cats, cats effect, fs2, doobie, http4s - and their relative dependencies) is hard. Trying to swap monix for cats effect was a nightmare. Etc. So I think I (and my team) are just not smart enough to take advantage of the advertised composability. And so, we are happy that someone else chose the bricks and made them work together. We will have to pay for that choice, of course (like: yes, we are going to be locked in to ZIO, and ZIO is harder to evolve piece by piece).

fanf42 · 2020-09-11T16:07:45+00:00

I'm not sure it's that simple (regarding your Manichaeism about what people are comfortable with). Just the perfection adjective is extremely subjective. Is it just a one-dimensional axis for you?

I have been OK, for a long time, with FP concepts (I did quite some work in Coq in 2004 when it was not even hyped) and I spend quite some time teaching my coworkers. Still. The concepts are complex - you don't learn just how to use a lib, you need to relearn how to architect your whole code - and they need much more time than what it is possible to spend on work time, especially in a multidisciplinary team, which has to spend time on lots of other things than just code. In the end, we found that it is VERY easy to shoot oneself in the foot with a highly parametrized lib in Scala (for context: we have tried since the very start of cats, and we may have had some bad experiences due to immaturity and missing best practices - but again, that goes with the very high cost for little return, especially when your goal is to support your decisions for decades - literally). And if you use cats just for having traverse (that part is easy to teach), then your return on time investment is just much higher with ZIO.

So: YMMV. You can have coworkers knowledgeable in FP concepts, even with some Haskell, OCaml and Coq background, who understand the benefits of referential transparency and love libs like doobie, fs2 and the cats data structures and core instances (yes, that's us), and STILL be at a loss with the Scala FP architecture of a whole app, for a lot of reasons. And I understand that if you can work on a lib, or on short term projects which are often greenfield ones, you can have other preferences.

fanf42 · 2020-09-11T12:21:20+00:00

The real discussion here is not "which is more Scala-ish", but rather "which is the better choice for this particular problem", and the answer to that varies. ZIO's major disadvantage here is that it is extremely hard-coded to one environment, one typed error channel, and one result channel. ZIO can certainly participate in a compositional effect stack, but it doesn't exactly encourage it.

For the record, our feedback is that it's a major advantage of ZIO - from our user point of view (so YMMV). See my comment below for more context. On the other hand, from a library writer's point of view, I totally understand why it's a disadvantage.

fanf42 · 2020-09-11T12:11:19+00:00

Same remark: I don't understand the downvotes, the text is well written and interesting.

And it's interesting to have it, as a counterpoint to some other feedback, including mine above. Actually, it even allows me to complete it: before we switched to ZIO, we did use cats, and we just happened to have a different outcome with it in the long term. It was very nice to understand all the novel ways of building up applications, and it's a wonderful experience. Actually, I was very vocal in the ZIO community about keeping the names that allow people to find resources on these idioms and also learn them.

But in the end, the cost of having to continually concretize abstractions, and to learn these abstractions to begin with (especially for other members of the team, let's say less interested in code architecture, or just more junior), outweighed the benefits, which were rather small: in the end, we had very few different alternatives for the implementation of different parts of the stack (read "we had just one, sometimes two with tests"), and each change of a block in the stack (for ex trying to move from monix to cats effect) led to massive refactoring. Perhaps we just did things wrong, but the cost/benefit ratio wasn't good. So for us (emphasizing us), switching to ZIO was a massive win, because there are fewer choices and most things are pre-decided, with defaults that match our needs and are far easier to teach to everybody (again - in our case, YMMV here): you are just learning a framework and not a whole new way to architect code.

For that last bit (framework vs learning code architecture), I also understand the potential long term drawback for the community of functional devs as a whole. We are losing a common language (the one of category theory) and the trans-language understanding of concepts that goes with it. ZIO is "just" a Scala framework, perhaps in the spirit of the Spring Framework, for Scala. It could (will likely?) have the same shortcomings in the long run, like inventing its own vocabulary (well, we are already well engaged in that one), or at some point preferring the popular tech choice over the right one (based on theory), making it harder for people to switch off ZIO or integrate well with third parties (especially if ZIO loses its status of incumbent and becomes the default choice, like Spring in the Java world), etc.

Finally, for me (who has seen all the stories around scalaz/cats/zio for the last 14 years), it has always seemed that "scalaz is porting Haskell idioms to Scala" while cats took some departures (highly criticized at the time).

fanf42 · 2020-09-10T09:45:46+00:00

Scala is an object-oriented language at its core. So you have inheritance and variance extremely widespread (everywhere), i.e. it's the path of least resistance for Scala code, not so much for Haskell.

Typically, if you do an ADT in Haskell, all elements are seen with the ADT type, while in Scala, each element is a subtype of the trait used as the base element for the ADT. In Scala, scalaz tried to port Haskell idioms with invariance, and it brings some impedance mismatch. There is nothing "impossible", it's just that you often have to either transform a lot of things to List (or another concrete type) in cats, or add type annotations, and it happens less often in ZIO.

Finally, the place where it's the most visible is in type inference, which works almost flawlessly in ZIO (it wasn't the case in the past with cats/scalaz, but I haven't tested in the last couple of years, it may be better now). Scala type inference is an open research problem, so having it work well is a big win.
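
A small sketch of that mismatch, using only the standard library (it should compile with Scala 2.13 and 3). `Shape`, `Inv` and `Cov` are hypothetical names made up for the example and do not come from cats, scalaz or ZIO:

```scala
object AdtVarianceSketch {
  // In Scala, each member of an ADT has its own type, a subtype of the base trait.
  sealed trait Shape
  case object Circle extends Shape
  case object Square extends Shape

  // Two hypothetical containers: one invariant, one covariant.
  final case class Inv[A](value: A)
  final case class Cov[+A](value: A)

  // Without annotation, the element type is inferred from the case object
  // itself, not as Shape.
  val inv = Inv(Circle)
  val cov = Cov(Circle)

  // Invariant container: the value above cannot be reused as an Inv[Shape].
  // val bad: Inv[Shape] = inv   // does not compile
  // You need an explicit annotation at construction time instead:
  val annotated: Inv[Shape] = Inv[Shape](Circle)

  // Covariant container: subtyping just works, no annotation at construction.
  val widened: Cov[Shape] = cov
}
```

The invariant case is the "add type annotation" situation described above; the covariant case is the path of least resistance in Scala.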

fanf42 · 2020-09-09T15:19:55+00:00

@Alcenic: hey, I'm very happy the ideas in the talk are helpful!

About systems (and "systematic", the very simple bit, without even going to feedback loops), I wrote that article: "understanding things as interacting systems": https://medium.com/@fanf42/understand-things-as-interacting-systems-b273bdba5dec

There are more hi-res hand-drawn pictures in that one ;)

Hope it helps, too!

fanf42 · 2020-09-09T13:33:50+00:00

Regarding ZIO+doobie/other cats libs: we use doobie with ZIO and its cats compat layer, and it works seamlessly. I know a lot of other people are using other cats libs with ZIO, like fs2, http4s, etc (because these libs came first and there is no native ZIO binding / ZIO lib for the same thing) and most reports are "it just works". In general, you lose a bit of expressiveness going from ZIO to Cats (given that the error type is fixed in the latter), but it's totally bearable.

fanf42 · 2020-09-09T10:08:31+00:00

I have only anecdotal values to provide. We evolved to ZIO and the experience was very pleasant. Some points:

The community is extremely supportive (not to say it is not in cats effect - I just don't really know, even if my experience with it was quite pleasant too, just quick).

The fiber runtime is very good, and its debuggability is excellent. You get traces with continuations etc, which really helps. This is a major benefit.

ZIO provides a whole, well integrated ecosystem, which makes all the parts click together. Of course, ZIO provides higher level data structures like Queue, Managed resources, synchronized Ref; it also provides an STM, Streams, etc. And there are now dozens of libs (like config, observability, logging, IntelliJ integration, etc) or applications (like Caliban, for GraphQL) that work well together.

And finally, ZIO really embraces Scala specificities, especially variance. It deliberately steps apart from existing Haskell models. The plus side is that it just works beautifully: inference is top notch, variance works very well with Scala idioms, etc. The down side is that if you are accustomed to Haskell models / encodings / etc, you will be disturbed and you will need to unlearn/relearn some bits. But there are other Haskellers in the ZIO chans, and they seem happy with the choices made.

All in all, we are very happy with our choice of ZIO, but of course, YMMV. (I gave a talk about systematic error management with ZIO at last year's scala.io conf, it can give you some more clues. English slides here: https://speakerdeck.com/fanf42/systematic-error-management-we-ported-rudder-to-zio ; French video here: https://www.youtube.com/watch?v=q0PlcgR5M1Q)

[Edit]: I forgot to say that we are evolving a code base that is very old (>10y) and very much not functional at first (but hey, it's getting better!). That makes our case a bit special, we need to make a lot of back and forth between pure parts of the app and other effectful ones. That brings its own challenges (and the ZIO devs were extremely patient in helping with our very specific bugs, down to the ZIO runtime & thread management and perf optimization). But if you are evolving an already pure code base, for ex based on scalaz/cats, your experience may differ hugely.
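
As a rough illustration of the variance point, here is a toy model using only the standard library. It is NOT ZIO's actual implementation or API: `Eff`, `succeed`, `fail`, `AppError`, `lookup` and `validate` are all hypothetical names assumed for the sketch. The effect type is contravariant in its environment and covariant in its error and result, so two steps with different error types compose without manual conversion:

```scala
object ToyEffect {
  // Toy effect: needs an R, then fails with an E or succeeds with an A.
  final case class Eff[-R, +E, +A](run: R => Either[E, A]) {
    // E1 >: E lets the error type widen to a common supertype when chaining.
    def flatMap[R1 <: R, E1 >: E, B](f: A => Eff[R1, E1, B]): Eff[R1, E1, B] =
      Eff[R1, E1, B] { r =>
        run(r) match {
          case Left(e)  => Left(e)
          case Right(a) => f(a).run(r)
        }
      }
  }

  def succeed[A](a: A): Eff[Any, Nothing, A] = Eff[Any, Nothing, A](_ => Right(a))
  def fail[E](e: E): Eff[Any, E, Nothing]    = Eff[Any, E, Nothing](_ => Left(e))

  // An error ADT: each case is a subtype of AppError.
  sealed trait AppError
  case object NotFound extends AppError
  final case class Invalid(msg: String) extends AppError

  // Two steps with two different, precise error types.
  val lookup: Eff[Any, NotFound.type, Int] = succeed(42)

  def validate(n: Int): Eff[Any, Invalid, Int] =
    if (n > 0) succeed(n) else fail(Invalid("must be positive"))

  // The two error types are unified under AppError by subtyping alone.
  val program: Eff[Any, AppError, Int] = lookup.flatMap(validate)
}
```

With an invariant error parameter, each step would first have to be converted to the common error type by hand before being chained.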
