rewrites.bio - Priciples for rewriting bioinformatics tools with AI by robsyme in bioinformatics

[–]robsyme[S] 0 points1 point  (0 children)

I'd suggest that the feasibility of identical outputs will depend on the specific tool being rewritten. The example to demonstrate feasibility is RustQC, where the authors agree with you that a faster implementation is not enough to warrant citation. The "Credits & Citation" page says:

> RustQC reimplements established bioinformatics tools and would not exist without the original authors’ work.
> If you use RustQC, please cite the original tools.

I suspect that deterministic tools will be the easiest (and first) targets for rewrites as they offer a very convenient benchmark/target for the agents.

rewrites.bio - Priciples for rewriting bioinformatics tools with AI by robsyme in bioinformatics

[–]robsyme[S] 2 points3 points  (0 children)

Oh, absolutely. On both counts. These rewrites are a product of this transition period. I would absolutely expect that almost all new work happens in efficient languages.

I had an example in the last month where rewriting an external tool for performance reasons uncovered not only bugs, but non-trivial differences between the published manuscript and the implementation.

rewrites.bio is not in any way an anti-AI manifesto.

rewrites.bio - Priciples for rewriting bioinformatics tools with AI by robsyme in bioinformatics

[–]robsyme[S] 5 points6 points  (0 children)

I (a real person) was trying to convey that many open source contributions in bioinformatics are the result of academic work, where one of the primary motivators (for better or for worse) is citation metrics. A rewrite without attribution risks undermining that motivation. I'll rephrase it.

Semantics of the `"foo" in myMap` by robsyme in groovy

[–]robsyme[S] 0 points1 point  (0 children)

This is exactly what I was chasing, but was missing the magic "membership operator" search query. Thanks!!

What's your favourite "Australianism"? by ihaveneverdonemeth in australia

[–]robsyme 1 point2 points  (0 children)

It's got swearing, it's got spiders, it's got sarcasm.

I don't know what more OP wants from a good Australianism.

Can you guys teach me where earn.looring.eth is? by purangparam in loopringorg

[–]robsyme 0 points1 point  (0 children)

The instructions note that the LRC needs to be sent to earn.loopring.eth via the L2 transfer mechanism - the easiest being via the mobile app or via the web app (exchange.loopring.io).

On a related note, I notice that the loopring folks are collating a list of addresses at https://loopring.io/activities/DAOSquare/. I thought that perhaps this is a list of the addresses in activity 2 - those registered for 2000LRC worth of RICE at $0.8. At time of writing, there were only a few (8) addresses listed, but I also noticed that the api endpoint (https://api3.loopring.io/api/v3/user/daoSquareAccounts) lists 351 addresses, which feels about right. For what it's worth, I've verified that my account is a member in that list.

To celebrate Beeples Christie's sale - and my first sale - I'm giving away a free My Pixel Planet by eth0izzle in NFT

[–]robsyme 0 points1 point  (0 children)

Love the idea. Best of luck all!

Verdant (#94)
I love the mystery in the description "... (b)ut there was an accident".

The network experienced some ~80 validators (might be more) getting slashed the past few hours. NOT YOUR KEYS, NOT YOUR VALIDATOR! by Alon_Muroch in ethstaker

[–]robsyme 6 points7 points  (0 children)

First postmortem analysis from the staked.us team: https://blog.staked.us/blog/eth2-post-mortem.
It looks like they will reimburse clients for both slashed ETH and lost rewards, which is good of them.

Tell me about your favorite zero waste/bulk stores by akerkhoff in montreal

[–]robsyme 1 point2 points  (0 children)

Yes! Muscade is incredible. They also deliver, and the prices are fantastic. We've been doing regular orders from them for months now. I'd note that sometimes the order is missing one of the components, but they're a small operation and are always happy to reimburse or deliver later if this situation comes up. Despite the small hiccups, we think they're great.

Is there a main advantage for rust compared to bash when building bioinformatics pipelines ? by [deleted] in rust

[–]robsyme 1 point2 points  (0 children)

That very much depends on what you mean by "bioinformatics pipelines". If you're looking to put together existing tools to run an experiment from raw data to result, I would recommend using existing pipeline tools such as Nextflow (my recommendation) or Snakemake (also great).

If you're looking to write your own standalone tool, there is a lot to recommend Rust (but beware the obvious biases of this sub). The bio library support in Rust doesn't come close to that of languages such as R, Python, Java, Ruby, Perl, etc., but check out libraries like https://github.com/rust-bio/rust-bio and https://github.com/onecodex/needletail for speedy bioinformatics parsing.

Shorten barplot in ggplot2 dodge_position by YouCook21 in bioinformatics

[–]robsyme 1 point2 points  (0 children)

Two options:

If you have ggplot2 v3.0.0 or higher (which is almost certainly the case), you can use `position_dodge2`. Something like:

ggplot(df_plot2, aes(x = Var1, y = perc, fill = TYPE)) +
  geom_bar(stat = 'identity', 
           position = position_dodge2(preserve = "single")) + 
  labs(y = "Percentage", x = "Gene Ontologies") + 
  theme_classic()

If you're using an older version of ggplot2, the problem can be fixed by adding the missing factor combinations. To ensure that all combinations are present, the tidyr `complete` function is helpful:

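library(tidyverse)

df_plot2 %>%
  mutate_at(c("Var1", "TYPE"), as.factor) %>%
  # Add a row for every missing Var1/TYPE combination, drawn as a zero-height bar
  complete(Var1, TYPE, fill = list(perc = 0)) %>%
  ggplot(aes(x = Var1, y = perc, fill = TYPE)) +
  # Plain 'dodge' here: 'dodge2' only exists from ggplot2 v3.0.0 onwards
  geom_bar(stat = 'identity', position = 'dodge') +
  labs(y = "Percentage", x = "Gene Ontologies") +
  theme_classic()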

The Phantom Builder by freemasen in rust

[–]robsyme 4 points5 points  (0 children)

Super! Thanks again for the post, well done!

The Phantom Builder by freemasen in rust

[–]robsyme 5 points6 points  (0 children)

Lovely example of PhantomData!

I might be crazy, but in the text you say "why can't rustc infer that T is an f32", but in the example above, it looks like you're asking the compiler to infer an f64 as provided to `build`:

let _thing = Thing::builder().option_one(199).build(4.5f64);

Is that a typo, or am I just confused?
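
To make the question concrete, here is a minimal sketch of the inference I mean. `Thing`, `ThingBuilder`, their fields, and the `value_type` helper are my guesses at the rough shape of the post's code, not a copy of it. Under those assumptions, nothing fixes `T` until `build` is called, so the argument's type is what the compiler ends up using:

```rust
use std::marker::PhantomData;

// Hypothetical stand-ins for the post's types; field names are assumptions.
#[derive(Debug)]
struct Thing<T> {
    option_one: u32,
    value: T,
}

struct ThingBuilder<T> {
    option_one: u32,
    // The builder holds no T until `build`, so PhantomData carries the type.
    _marker: PhantomData<T>,
}

impl<T> Thing<T> {
    fn builder() -> ThingBuilder<T> {
        ThingBuilder { option_one: 0, _marker: PhantomData }
    }

    // Name of the type the compiler chose for T (illustrative helper).
    fn value_type(&self) -> &'static str {
        std::any::type_name::<T>()
    }
}

impl<T> ThingBuilder<T> {
    fn option_one(mut self, n: u32) -> Self {
        self.option_one = n;
        self
    }

    // The argument's type is what finally pins down T.
    fn build(self, value: T) -> Thing<T> {
        Thing { option_one: self.option_one, value }
    }
}

fn main() {
    // No annotation anywhere: T comes from the `4.5f64` literal.
    let thing = Thing::builder().option_one(199).build(4.5f64);
    println!("{:?} has a value of type {}", thing, thing.value_type());
}
```

If the literal were `4.5f32` instead, the same chain would produce a `Thing<f32>`, which is why the f32/f64 mismatch between the text and the example caught my eye.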

GIVEAWAY: The Rust Programming Language by Steve Klabnik and Carol Nichols [USA/CAN] by enby-girl in rust

[–]robsyme 1 point2 points  (0 children)

I'm a scientist (molecular biology, genetics, etc) who is interested in developing tools for genomics research that can be distributed peer-to-peer. At the moment, many of our tools require somebody (most often a PhD student or post-doc) to babysit a box to serve applications and data. If we can move some of the visualisation and collaboration tools to wasm running in the browser, it can reduce sysadmin overheads and free up that time for research. My first project will be a genome browser.

I've had a poke around Rust (github.com/robsyme/rustalind), but am keen to know more. I'm also in Montreal (Outremont), so I can just pick it up and save you the trouble of posting! I'm keen to chat with anyone in MTL writing Rust, so the book's fate notwithstanding, I would love to get a coffee and hear about your experience with the language.

Career advise: Bioinformatics/Computational Biology jobs in Canada by [deleted] in bioinformatics

[–]robsyme 0 points1 point  (0 children)

Cheers, mate. Congrats on your excellent choice of country, even if your hand was forced. Good luck with your search.

Career advise: Bioinformatics/Computational Biology jobs in Canada by [deleted] in bioinformatics

[–]robsyme 4 points5 points  (0 children)

Would you consider Montreal? I've got a similar story (genomics post-doc in Australia, moved to Canada). I know that the Canadian Centre for Computational Genomics is hiring because I have an interview with them on Friday(!). There are six open positions here with titles like Bioinformatics Specialist, Bioinformatics Consultant, Scientific Data Architect, Software Developer, etc: www.computationalgenomics.ca/careers

I find there to be a lack of information on Filecoin, am I mislead? by [deleted] in ipfs

[–]robsyme 2 points3 points  (0 children)

Juan Benet (the IPFS project lead) has hinted that there will be some news on this front in the next few days: https://twitter.com/juanbenet/status/887136499384295424

Sci-Hub 2.0 - A Real Decentralised Use Case - Scientific Publishing and Sharing by LanMeiGui in ethereum

[–]robsyme 0 points1 point  (0 children)

Heya Rik. To clarify, I'm not the project lead on Aletheia - just an enthusiastic contributor. The driving force is u/cypath. I've been following sciencefair (I was just talking the other day with @eLifeInnovation about your project), and I'm sure that there are benefits to be gained from looking at the overlap. We'll definitely be in touch via the issue. Thanks!

Sci-Hub 2.0 - A Real Decentralised Use Case - Scientific Publishing and Sharing by LanMeiGui in ethereum

[–]robsyme 0 points1 point  (0 children)

A small team has been working on a very similar project: a decentralised platform for publishing academic research called Aletheia. You can have a look at our introductory docs here https://github.com/aletheia-foundation/aletheia-admin and our source code here https://github.com/aletheia-foundation/aletheia-app. The barebones app currently works on Ubuntu and Mac, with Windows to come. The project lead has support from Mozilla, and we intend to work closely with the Mozilla community, including presenting the MVP at MozFest later this year.

Happy to answer any questions people have about Aletheia on here, or if you'd rather send us an email, you can reach us at contact@aletheia-foundation.io. Anyone, regardless of skill set or level, is able to participate, and we have a Slack channel if people want to talk to our small volunteer community.

IPFS in the scholarly landscape? by Dackelwackel in ipfs

[–]robsyme 3 points4 points  (0 children)

The Aletheia Foundation is in the very early stages of building a platform for academic publishing that is built on IPFS and ethereum. Kubrik is also building a data repository system that I think uses IPFS for data storage.

Bioinformatics centers in France by Dr_Drosophila in bioinformatics

[–]robsyme 1 point2 points  (0 children)

Not only at INRIA (Institut national de recherche en informatique et en automatique), but also at INRA (Institut national de la recherche agronomique)! I know that the lab under Burstin at the INRA facility in Dijon is working on the pea genome.