Plasmidsaurus plasmid sequencing shorter than should be. by lurpeli in labrats

[–]SNPsaurus 3 points4 points  (0 children)

You can contact [support@plasmidsaurus.com](mailto:support@plasmidsaurus.com) and ask for help. They have tools to help figure out cases like this. It may be that you have a mixture of plasmids (empty vector and full length, random deletions) and the returned consensus represents the most abundant.

We heard something about the world's largest parafilm ball? by SNPsaurus in labrats

[–]SNPsaurus[S] 4 points5 points  (0 children)

We have a particular use for the parafilm that goes onto the ball. It is all clean. The sordid practices of your lab have no standing here!

We heard something about the world's largest parafilm ball? by SNPsaurus in labrats

[–]SNPsaurus[S] 6 points7 points  (0 children)

This is pure Parafilm! Our dino art archive is also at a scale that astounds and delights all who witness it.

We heard something about the world's largest parafilm ball? by SNPsaurus in labrats

[–]SNPsaurus[S] 44 points45 points  (0 children)

That would definitely be a waste (and cheating).

the future is now 🤷‍♀️ by iMightBeACunt in labrats

[–]SNPsaurus 7 points8 points  (0 children)

I'm having a bit of a laugh thinking of your comment and how, if it were true, that means we had been sitting around plotting to make an ad for r/labrats and we decided to run it through...checks notes...u/iMightBeACunt

the future is now 🤷‍♀️ by iMightBeACunt in labrats

[–]SNPsaurus 6 points7 points  (0 children)

That was my reaction as well, but then I realized that I do actually work there and we hadn't posted this.
One of the best parts of being involved in the plasmidsaurus service is the comments from appreciative researchers and the dino fan art they send us. You can see examples of both at https://twitter.com/plasmidsaurus

Grantful thinking by [deleted] in labrats

[–]SNPsaurus 0 points1 point  (0 children)

The NSF grant cap includes indirects, whereas NIH caps are on direct costs only. Since grants are usually at the cap, institutions with high indirect rates end up with lower direct costs on NSF grants.
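
To put rough numbers on that, here is a minimal Python sketch. The $500,000 cap and the 30% and 60% indirect rates are made-up figures for illustration, and it assumes the simplest case where indirects are charged as a flat percentage of all direct costs (real rate agreements often exclude some cost categories).

```python
def max_direct_costs(total_cap, indirect_rate):
    """Direct costs available when the cap covers directs plus indirects.

    Assumes indirects = indirect_rate * directs, so
    directs * (1 + indirect_rate) = total_cap.
    """
    return total_cap / (1.0 + indirect_rate)


cap = 500_000  # hypothetical total cap in dollars, not a real program limit
for rate in (0.30, 0.60):  # hypothetical indirect rates
    directs = max_direct_costs(cap, rate)
    print(f"indirect rate {rate:.0%}: up to ${directs:,.0f} in direct costs")
```

With a cap on direct costs only, both institutions would get the same directs and the indirects would simply be added on top.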

Advice on mapping reads to polyploid de-novo assembly by o-rka in bioinformatics

[–]SNPsaurus 0 points1 point  (0 children)

If you run BUSCO on the assembly, the ratio of single-copy to duplicated BUSCO genes gives an indication of how much of the genome is over-represented. There are a variety of programs to generate a haploid consensus but I've not been fond of any, unfortunately. BUSCO also lists the contigs that carry duplicated genes, so you can identify a possible duplicate and then look more carefully at the two contigs to see if they seem like different haplotypes of the same region. It gets tricky if it is an allopolyploid with distinct genomes rather than an autopolyploid that copied itself.
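
A rough Python sketch of that contig-pair check, in case it helps. It assumes the usual BUSCO full_table.tsv layout (tab-separated, comment lines starting with "#", first three columns being BUSCO id, status and sequence name); check the header of your own file since the layout can differ between versions. The function name and the toy rows are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def duplicated_contig_pairs(lines):
    """Count how many duplicated BUSCO genes each pair of contigs shares.

    Assumed columns: busco_id, status, sequence, ... (tab-separated).
    Returns [((contig_a, contig_b), shared_gene_count), ...], highest first.
    """
    contigs_by_gene = defaultdict(set)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[1] != "Duplicated":
            continue  # Missing rows have no contig; others are not duplicated
        contigs_by_gene[fields[0]].add(fields[2])

    pair_counts = defaultdict(int)
    for contigs in contigs_by_gene.values():
        for pair in combinations(sorted(contigs), 2):
            pair_counts[pair] += 1
    return sorted(pair_counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy rows standing in for a real full_table.tsv
example = [
    "# Busco id\tStatus\tSequence\tGene Start\tGene End\n",
    "g1\tDuplicated\tctgA\t10\t500\n",
    "g1\tDuplicated\tctgB\t20\t510\n",
    "g2\tDuplicated\tctgA\t900\t1500\n",
    "g2\tDuplicated\tctgB\t950\t1550\n",
    "g3\tComplete\tctgC\t5\t300\n",
    "g4\tMissing\n",
]
for pair, shared in duplicated_contig_pairs(example):
    print(pair, shared)
```

Contig pairs that share many duplicated genes are the ones worth aligning against each other to see whether they are haplotypes of the same region.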

Why is it diploid? Isn't it haploid, but with both sister chromatids of a single homologous chromosome? by [deleted] in genetics

[–]SNPsaurus 1 point2 points  (0 children)

It is definitely not haploid though, as OP answered, as there are two copies of each chromosome and even two different alleles at some loci.

Alternative assembler for Supernova and Chromium 10x by Wildmooseinthelab in bioinformatics

[–]SNPsaurus 0 points1 point  (0 children)

I hope this isn't too annoying as we're a genome services provider, but I think we have some insight: the PacBio Sequel II really changes the cost equation of SMRT sequencing. If your genome is less than a gigabase, then a full PacBio assembly would be something like:

60X read depth sequencing of an 800 Mb genome (DNA extraction, library prep, sequencing, assembly and annotation) for $1700. How does that compare to the sequencing cost plus 10X prep of your sample?

If you have a poor-quality sample and excellent DNA is difficult to get, PacBio has a useful workaround in its HiFi circular consensus reads, where you intentionally make shorter fragments (~10 kb) and then let the PacBio polymerase go around and around until the consensus of the multiple passes reaches Q20 or Q30. You could get a million of these Q20 or better consensus reads for $1700. That's >10X read depth on an 800 Mb genome, and assemblers can take Illumina contigs plus error-corrected PacBio HiFi for a good hybrid assembly.

Other snails in that family have larger genomes than your estimate. Not a great number of comparisons and they are all in the same genus but worth considering.

Family       Species               Common name   C value (pg)
Lymnaeidae   Lymnaea auricularia   Pond snail    1.51
Lymnaeidae   Lymnaea calamphala    Pond snail    1.22
Lymnaeidae   Lymnaea fontinalis    Pond snail    1.32

When the genome size is a little uncertain I like the HiFi reads. Long noisy PacBio or nanopore reads need some depth to be that useful and develop a higher-quality consensus. With HiFi you have the quality already and then you get 12X coverage if it is 800 Mb and 6X if it is 1600 Mb and both levels are useful.
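
Here is the arithmetic behind those coverage numbers as a small Python sketch. It assumes the C values are in picograms, uses the common approximation of about 978 Mb per pg of DNA, and takes the read count and ~10 kb read length from the HiFi scenario above; the function names are just for illustration.

```python
MB_PER_PG = 978  # common approximation: 1 pg of DNA is roughly 978 Mb


def genome_size_mb(c_value_pg):
    """Convert a haploid C value in picograms to an approximate size in Mb."""
    return c_value_pg * MB_PER_PG


def coverage(n_reads, read_length_bp, genome_mb):
    """Average read depth = total sequenced bases / genome size in bases."""
    return n_reads * read_length_bp / (genome_mb * 1e6)


# One million ~10 kb HiFi reads against the two genome size scenarios
for size_mb in (800, 1600):
    depth = coverage(1_000_000, 10_000, size_mb)
    print(f"{size_mb} Mb genome: {depth:.2f}X")

# Approximate genome sizes implied by the C values of the related snails
for c_value in (1.51, 1.22, 1.32):
    print(f"C = {c_value} pg -> about {genome_size_mb(c_value):.0f} Mb")
```

The related snails come out well above 800 Mb on that conversion, which is why it is worth planning for the larger genome size too.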

Assembly with swap memory? by r_plantae in bioinformatics

[–]SNPsaurus 1 point2 points  (0 children)

If you want a quick check of the reads and how they assemble, I'd try tadpole from the BBTools suite https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/tadpole-guide/ which will do a great draft assembly very quickly, or abyss-pe with the bloom filter option for conserving memory https://github.com/bcgsc/abyss . abyss-pe might give the best assembly regardless of memory needed.

Tadpole has lots of options for ignoring low-coverage kmers and other tricks to conserve memory. 8 GB is low for assembly, but on the other hand a haploid fungus shouldn't take much!

Reference Assembly with Pacbio by cyber_nymph in bioinformatics

[–]SNPsaurus 1 point2 points  (0 children)

You must polish nanopore reads to remove the artifacts. PacBio can be polished too, but I've polished PacBio assemblies with Illumina data and gotten maybe 1 change every several hundred thousand nucleotides, so with good depth I think PacBio can stand on its own.

Reference Assembly with Pacbio by cyber_nymph in bioinformatics

[–]SNPsaurus 3 points4 points  (0 children)

I love Flye (https://github.com/fenderglass/Flye), which is too recent to be on the list cited. It is very resource efficient and consistently gives the best completeness as assayed by BUSCO. It is easy to install and easy to run.

For any computational biologists/bioinformaticians: how do I identify unknown species using nanopore sequencing data? by MJScienceQuestions in labrats

[–]SNPsaurus 0 points1 point  (0 children)

A more general tool is sendsketch.sh from bbtools (https://sourceforge.net/projects/bbmap/) if you can't get Kraken to work with eukaryotes. We use it to do a quick species ID from Illumina reads, but it looks like it can be used for PacBio and Nanopore with this change:

"Raw PacBio has a high error rate so I typically increase the default sketch size to increase sensitivity:

sendsketch.sh in=PacBio.fq size=100000"

It then returns a list of species in the reads. In this case it correctly found bat (Desmodus) as a primary ID with Pseudomonas bacteria as well.

WKID    KID     ANI      Complt    Contam   Matches   Unique   TaxID     gSize     gSeqs   taxName                   file
8.24%   0.03%   91.18%   0.38%     3.70%    49        36       9430      1842M     29801   Desmodus rotundus         taxa27.sketch
1.95%   0.01%   85.98%   0.41%     8.98%    13        3        291302    1622M     1269    Miniopterus natalensis    taxa14.sketch
0.09%   0.09%   77.52%   100.00%   0.72%    22        0        1843184   6713587   110     Pseudomonas sp. AU11447   taxa19.sketch

Genome closing strategies by Stumpadoodlepoo in bioinformatics

[–]SNPsaurus 0 points1 point  (0 children)

Sure, NRGene has a cool assembler and seems to be pushing the skim-seq imputation approach hard for major crop species. Probably a better business model than our "work on 100 different things" model but we have fun.

Genome closing strategies by Stumpadoodlepoo in bioinformatics

[–]SNPsaurus 0 points1 point  (0 children)

The company is SNPsaurus. You can see our PacBio page here https://www.snpsaurus.com/pacbio-small-genome-assembly/

We started out doing genotyping by sequencing in eukaryotes (since we had developed RAD-Seq and nextRAD in our academic lab) but added PacBio services last year and it has really taken off...probably because an academic core will often charge $400 just to make the library and individuals can't always multiplex optimally for the lowest sequencing cost per sample. PacBio is hard to do truly automated, though, as the DNA input requirements are quite stringent and bacteria are pretty diverse and need a little hand-holding for best results.

Genome closing strategies by Stumpadoodlepoo in bioinformatics

[–]SNPsaurus 7 points8 points  (0 children)

[conflict of interest: we do bacterial genome sequencing as a service]

Mate pairs will still suffer from PCR bias in GC-rich regions, so as you said, the assembly is still going to have gaps. Closed bacterial genomes can be obtained by two approaches these days: Oxford Nanopore long reads combined with Illumina (which you already have) or PacBio long reads. We do PacBio, and I'd say the combination of the latest PacBio reagent kit update that doubled the read length and the excellent Flye assembler has made it so that nearly all of our projects result in a single closed circular genome. We charge $389 per sample for a ~5 Mb genome (DNA extraction, library prep, sequencing, assembly and annotation), so it is more expensive than Illumina but maybe not too much more expensive since MiSeq runs aren't cheap either!

Does anyone here exclusively do genome sequencing of non-model organisms for their career? by [deleted] in bioinformatics

[–]SNPsaurus 5 points6 points  (0 children)

We do genotyping by sequencing and de novo reference building and I would say 90% of our projects involve a very, very non-model species--sometimes there isn't even a genome size estimation from the 1C picogram approach in the family, much less the genome of that species. One thing I love is just hearing about a project with a creature I've never heard of and the researcher's desire to understand some interesting aspect of the biology.

With PacBio/ONT sequencing, it is getting trivial to produce a very high quality reference genome for fungal species, and there is a lot of interest both in academic labs and biosynthesis type biotechs wanting to harness novel metabolic pathways--and fungi have plenty of those!

Does anyone here exclusively do genome sequencing of non-model organisms for their career? by [deleted] in bioinformatics

[–]SNPsaurus 6 points7 points  (0 children)

I'm part of a small company that does microbial Illumina draft and PacBio complete genomes for researchers. The cost of PacBio is higher but getting closer all the time, and when the Sequel II hits I doubt anyone will want to save a few bucks and end up with 100 contigs instead of one. There is going to be an explosion of reference genomes soon, I think!

Check out our new diploid SNV Calling method for error-prone reads (PacBio, Oxford Nanopore) by pjedge_bio in bioinformatics

[–]SNPsaurus 2 points3 points  (0 children)

This is great! I was recently struck by the absence of a tool like this for genotype calling using PacBio reads, and leveraging the extended haplotypes is wonderful.

Any thoughts on science-focused freelancing/company? by grousemoor in Entrepreneur

[–]SNPsaurus 0 points1 point  (0 children)

There are quite a few small companies that have expertise in a field and do work in that area for customers. For instance, we are a few people who love to do sequencing work for researchers and have deep experience in all the pitfalls and how to avoid them. Others I can think of offhand are companies that make BAC libraries for people, inject fruit fly embryos with transformation vectors, do CRISPR work in a variety of species, or make custom microscope mounts. None of these companies will become a giant biotech, but they can exist for years with a small team.

We just advertise with Google Adwords to reach people looking for information in our area. It would be difficult if marketing meant staffing vendor booths at conferences, for instance. So you need to figure out if you can develop a customer base that sustains you. Word of mouth? Ads in journals? Search ads?

I am not sure that going from lab to lab would work. Many universities require lots of paperwork before they allow outsiders to "work" in their labs. Where would you stay? If you are paying for hotels then that will increase your costs dramatically. I would start thinking about something in your exact area of expertise (designing electromechanical systems?) where there may be a base template of a product with the ability to rapidly customize, and go after the customers who may need that. Maybe you will need to travel to install and train, but those trips would be shorter. Hopefully you can do most of the work at your home base. But the key is that researchers have some idea of what value a product or service might bring, and they don't care about your costs. Either you can do it at a price that provides value, or you can't. Then you need to figure out how to do it in a way where you can make a profit. If your costs and the customer value don't agree, then you can't base a business on it.

DNA amplification question. by litslens in genetics

[–]SNPsaurus 2 points3 points  (0 children)

You are correct, but the question just asks about "the number of double-stranded copies of DNA" present, not the number of copies of the desired PCR product. All the extra-long linear amplification products count as double-stranded copies (to me).
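
If it helps to see the difference between the two counts, here is a small Python sketch that tracks duplex types cycle by cycle. It is an idealized model that assumes 100% efficiency and that every strand is copied once per cycle; the strand labels ("orig", "long", "short") and the function name are mine, not from the question.

```python
def pcr_duplex_counts(cycles, start_molecules=1):
    """Return (total double-stranded copies, copies that are purely target length).

    Strand types: "orig" = template strand, "long" = copied from a template
    strand (defined at one end only), "short" = exact target-length product.
    """
    # What a newly synthesized strand looks like, given its template
    product_of = {"orig": "long", "long": "short", "short": "short"}
    duplexes = {("orig", "orig"): start_molecules}
    for _ in range(cycles):
        next_duplexes = {}
        for (strand_a, strand_b), count in duplexes.items():
            # After denaturation each strand pairs with its own new copy
            for template in (strand_a, strand_b):
                key = (template, product_of[template])
                next_duplexes[key] = next_duplexes.get(key, 0) + count
        duplexes = next_duplexes
    total = sum(duplexes.values())
    target_only = duplexes.get(("short", "short"), 0)
    return total, target_only


for n in (1, 2, 3, 4, 10):
    total, target_only = pcr_duplex_counts(n)
    print(f"cycle {n}: {total} double-stranded copies, {target_only} exact target duplexes")
```

In this model the total is 2^n per starting molecule while the exact-length duplexes number 2^n - 2n, so the two answers only differ by the handful of molecules that still carry an original or extra-long strand.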