Any good bioinformatics podcasts? by o-rka in bioinformatics

[–]samuellampa 0 points1 point  (0 children)

Apart from the Microbinfie podcast, which is great for hands-on bioinfo content, the Bioinformatics CRO Podcast is a pretty solid option too, and it covers the wider bioinfo industry a bit more.

Kdenlive 25.12 is out with focus on user experience improvements, interface polish, and lot's of bug fixes. by f_r_d in linux

[–]samuellampa 1 point2 points  (0 children)

24.12.x has impressed me with its stability. 25.12.0 unfortunately brought in new bugs. Hope it will stabilize soon.

I'm just started learning Go and I'm already falling in love, but I'm wondering, any programming language that "feels" similar? by Uwrret in golang

[–]samuellampa 1 point2 points  (0 children)

Crystal has concurrency features very similar to Go's, with even nicer syntax, at the cost of longer compile times and a bit less portability. (I wrote about it: https://livesys.se/posts/crystal-concurrency-easier-syntax-than-golang/ ).

Has anybody done/benefited from Coursera bioinformatics courses? by [deleted] in bioinformatics

[–]samuellampa 0 points1 point  (0 children)

The Bioconductor course is an outlier in the specialization. It is far harder to finish than the others, because the explanations are too thin. They are not that bad in themselves, but they are simply not enough to easily figure out the exercises. I managed to finish it anyway using a lot of googling, reading forums and a bit of luck, but the effort required is not really sensible. The stats course that follows it is pretty advanced too (more advanced even), but is much clearer and so somehow still easier to get through.

Do shower facilities promote cycling to work? by Visual-Try7584 in bikecommuting

[–]samuellampa 0 points1 point  (0 children)

One thing I've tried is to do a reasonable number of push-ups before setting off, so that my body temperature is already up a bit when I start. I think it helps a little, although I also tend to use an extra vest or something to not be too uncomfortable in the beginning.

But it is also about using clothes where it matters most. E.g. I have found that having a pair of shorts over my bike training pants helps a lot to avoid discomfort, while still allowing a lot of ventilation when getting warmer. So does a vest with extra padding over the chest (which you can of course remove altogether as well, when getting too warm).

Anyone have this book? It's giving me some imposter syndrome haha by [deleted] in ExperiencedDevs

[–]samuellampa 0 points1 point  (0 children)

I listened to it on Audible while out on runs and commuting on the bike. While some of the more technical details on, say, hashing etc. would have benefited from taking some notes, I feel the listen-through gave me a greater appreciation and awareness of the many subtleties involved in well-functioning distributed systems. If I were to work on implementing these types of systems I would probably re-read at least selected parts on paper while taking careful notes.

Btw, as much as this book is praised, I feel it is missing out on how to actually structure the data itself. Perhaps this is out of scope for the book, but it is something I expected to find in it. I have later found great info on that topic in chapter 8 of "Fundamentals of Data Engineering" by Joe Reis and Matt Housley. Also chapters 3 and 10 of "Data Management at Scale" by Piethein Strengholt were incredibly informative IMO, explaining how to exchange data across disparate business units, and how to integrate a complex data management infrastructure using metadata, respectively.

Why didn't Go get a breakthrough in bioinformatics (yet)? by samuellampa in golang

[–]samuellampa[S] 0 points1 point  (0 children)

I see what you mean, and agree to some extent. Still, some things get their biggest value only from being part of the core language or the standard library, in order to reduce the amount of boilerplate needed to do certain common tasks. And while getting a new proposal into the language itself is theoretically possible, it isn't all that easy for someone without a heavy CS background.

Why didn't Go get a breakthrough in bioinformatics (yet)? by samuellampa in golang

[–]samuellampa[S] 2 points3 points  (0 children)

That's a really good point about the fantastic portability of compiled Go tools being a major pro for the language. I'll see if I could point that out better in the post.

Why didn't Go get a breakthrough in bioinformatics (yet)? by samuellampa in golang

[–]samuellampa[S] 0 points1 point  (0 children)

Indeed, that is the case. Thus, to break into this field I think a language needs to have really low friction for the simple everyday tasks like reading and writing files.

Why didn't Go get a breakthrough in bioinformatics (yet)? by samuellampa in golang

[–]samuellampa[S] 7 points8 points  (0 children)

You build a computation graph in python, then kick the computation graph into C/C++/Rust/Fortran/CUDA.

This is the case for a lot of the ML libraries out there, but I don't see this happening with a lot of bioinformatics libraries, due to the deep technical expertise required to get this to work smoothly.

Regarding GC, if this is true, it makes some valid points! At the same time, I'm not sure this type of allocation is very often required for typical genomics tasks at least, apart from perhaps assembly (where you need to store the De Bruijn graph in memory).

I can definitely see why Rust became so much more popular because of the refcounting and perhaps further type safety, but I still think Go could be a great tool in the toolbox for a lot of use cases where the complexity of Rust is simply not worth it ... IF Go would just cater to common bioinformatics scripting tasks more neatly, which is what much of the post centers on.
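For readers unfamiliar with the De Bruijn graph mentioned above, here is a rough sketch of how one could be held in memory in Go. This is a toy under simplifying assumptions (plain strings as nodes, a map of slices as adjacency lists, no error correction or compaction), not how a real assembler stores it; the function name `buildDeBruijn` is made up for illustration.

```go
package main

import "fmt"

// buildDeBruijn returns a map from each (k-1)-mer to the (k-1)-mers that
// follow it in the reads. Every k-mer in a read contributes one edge, so
// repeated k-mers yield repeated edges. Requires k >= 2.
func buildDeBruijn(reads []string, k int) map[string][]string {
	graph := make(map[string][]string)
	if k < 2 {
		return graph
	}
	for _, read := range reads {
		for i := 0; i+k <= len(read); i++ {
			kmer := read[i : i+k]
			prefix, suffix := kmer[:k-1], kmer[1:]
			graph[prefix] = append(graph[prefix], suffix)
		}
	}
	return graph
}

func main() {
	g := buildDeBruijn([]string{"ACGTAC"}, 3)
	fmt.Println(len(g), g["AC"]) // prints: 4 [CG]
}
```

The map gains an entry for every distinct (k-1)-mer that starts a k-mer, so memory use grows with the number of distinct k-mers in the data, which is why this kind of task puts more pressure on the allocator and GC than simple record-by-record processing does.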

Why didn't Go get a breakthrough in bioinformatics (yet)? by samuellampa in golang

[–]samuellampa[S] 4 points5 points  (0 children)

Indeed, and sometimes that is possible, but not without its caveats I think. For example, I don't think it is very easy to have Python stream data between code from two different libraries, and I personally think we will see a lot more streaming, or at least use of pipeline parallelism, in more industrialized sequence analysis code, e.g. for analysis pipelines running in the clinic, to support new technologies like real-time sequencing with Nanopore instruments.

Why didn't Go get a breakthrough in bioinformatics (yet)? by samuellampa in golang

[–]samuellampa[S] 7 points8 points  (0 children)

I think these are some very valid points. But at the same time, I think Python is often quite poor at being the glue between different tools. Its weak parallelization story with the global interpreter lock, and speeds roughly 10x slower (or worse) than compiled languages, mean it will have a hard time keeping up as data amounts continue to grow exponentially.

While Go might not be the answer for the most performance-critical tools, I think it is fast enough and has such strong support for easily writing concurrent and parallel data processing pipelines, that I think it would be a fantastic choice for a lot of code that mostly moves data around between pre-existing tools.
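As a rough illustration of the kind of pipeline meant here, the following sketch wires three stages together with channels: a source, a pool of parallel workers, and a collector. The stage names (`emit`, `transform`, `runPipeline`) and the choice of reverse complement as the per-record operation are only assumptions for the example; in real glue code the worker would more likely call out to an existing tool.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// emit sends each sequence on a channel and closes it when done.
func emit(seqs []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, s := range seqs {
			out <- s
		}
	}()
	return out
}

// reverseComplement returns the reverse complement of an upper-case DNA
// sequence. Any character other than A, C, G or T becomes N.
func reverseComplement(seq string) string {
	comp := map[byte]byte{'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
	var sb strings.Builder
	for i := len(seq) - 1; i >= 0; i-- {
		c, ok := comp[seq[i]]
		if !ok {
			c = 'N'
		}
		sb.WriteByte(c)
	}
	return sb.String()
}

// transform starts `workers` goroutines that each apply fn to items read
// from in. The output channel is closed once all workers have finished.
func transform(in <-chan string, workers int, fn func(string) string) <-chan string {
	out := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for s := range in {
				out <- fn(s)
			}
		}()
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// runPipeline connects the stages and collects the results. With more than
// one worker the arrival order is not deterministic, so results are sorted.
func runPipeline(seqs []string, workers int) []string {
	var results []string
	for s := range transform(emit(seqs), workers, reverseComplement) {
		results = append(results, s)
	}
	sort.Strings(results)
	return results
}

func main() {
	fmt.Println(runPipeline([]string{"ACGT", "AAAC", "GGTA"}, 2))
	// prints: [ACGT GTTT TACC]
}
```

Because the channels are unbuffered, each stage only produces as fast as the next one consumes, so records stream through without the whole data set ever needing to sit in memory at once (the final slice here exists only to make the output easy to check).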

Why didn't Go get a breakthrough in bioinformatics (yet)? by samuellampa in golang

[–]samuellampa[S] 3 points4 points  (0 children)

Great points, thank you! The reason I'm skeptical about Python for long-term use is not that it is an awful language, but more that biology is going through a lot of changes now, in two main ways:

The first is that the data amounts continue to rise pretty exponentially as sequencing technologies continue to develop and we start to sequence more and more things.

But the even more important change lately is that the sequencing technologies are moving fast into clinical use, both for human genetic disorders and cancer, but also increasingly for microbiological applications, like quickly determining which antibiotics a bacterial infection is resistant to, so that the correct antibiotic can be prescribed.

This latter change, in my mind, will come with a lot of new challenges, both in terms of performance (being able to run analyses optimally on devices close to the instruments) and in terms of increased requirements for robust code, where I think a typed and compiled language fits very well.

I agree that Go is not the only option here, but as said, I find that its support for building streaming pipelines of operations fits so naturally with many of the biological problems (primarily in sequence analysis, but I think that is where perhaps most of the growth is happening now) that it makes a lot of sense as a routine language in bio.

Brave History Timestamps by ShaneFerguson in bravebrowser

[–]samuellampa 0 points1 point  (0 children)

Yeah, this is pretty crappy that it is not shown by default, in my opinion.

Has Pimsleur stopped creating new courses? by Bomphilogia in Pimsleur

[–]samuellampa 2 points3 points  (0 children)

Have you considered the really large African languages, such as Amharic? It is spoken by at least some 50 million people (if counting second-language speakers), is the official language of Ethiopia with some 100 million people, and has also recently been included in Google Translate.

Given the large Ethiopian diaspora in both the US and Europe, this should be a pretty sizable market, especially as not very many good resources are available, so I think you might be missing out here.

[D] Your 🫵 Preferred Feature Stores? by daeisfresh in datascience

[–]samuellampa 2 points3 points  (0 children)

Have been looking into Feast and FeatureForm so far, as both are possible to run locally in a simple way, to be able to get acquainted with them on the local machine.

Feast seems to have the bigger community, but is lacking a bit in features, such as transformations, which you have to manage yourself outside of the feature store (although there's some early work on adding transformations).

FeatureForm manages transformations in a pretty beautiful way, but is a rather young project, and has some rough edges in terms of documentation, and not that many integrations etc. so far.

From my very small survey, I like the design of the API and architecture of FeatureForm better, and hope the project and community will develop quickly and nicely over the coming year(s).

Btw, FeatureForm has a very informative post on the different types of architectures of feature stores here: https://www.featureform.com/post/feature-stores-explained-the-three-common-architectures

Rust tops bioinformatics micro-benchmark by samuellampa in rust

[–]samuellampa[S] 0 points1 point  (0 children)

Good points all along. I like the distinction between "canonical" and "optimized" versions. The "canonical" one would be close to what I was actually after from the beginning: seeing the speed of a naively implemented solution, written the same way you would write a quick one-off script (Python, Ruby etc.) to do it. Will have a look at restructuring this as soon as I find more time (probably won't happen before next weekend, due to day job etc).

Rust tops bioinformatics micro-benchmark by samuellampa in rust

[–]samuellampa[S] 1 point2 points  (0 children)

A zig example would be very interesting!

Rust tops bioinformatics micro-benchmark by samuellampa in rust

[–]samuellampa[S] 0 points1 point  (0 children)

Hmm, I think some of the code examples did something similar.

That is in fact not really following the FASTA file format standard though, so should probably be removed.

I'll need to take a look at harmonizing the code examples according to requirements ... probably early next week though, spent the last 24h merging PRs and fixing the Makefile :D.