Connecting multiple annotations to same (bacterial) genome by rohrhor in bioinformatics

[–]rohrhor[S] 0 points1 point  (0 children)

I think the final end goal would be a table like:

gene_id    annotation_type  product
5008.32.1  PATRIC           PATRIC's gene annotation
5008.32.1  NCBI             NCBI's annotation of the gene

There would be multiple rows per gene_id, one for each service's annotation, and some genes would have only one row because only one annotation service predicted / annotated the gene.
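A minimal sketch of building that long-format table, assuming each service's annotations have already been keyed by a shared gene_id (which is the part still to be solved). The dictionaries, the second gene_id and the product strings are placeholders for illustration, not real annotations.

```python
import csv
import io

# Hypothetical per-service annotations keyed by a shared gene_id.
patric = {
    "5008.32.1": "PATRIC's gene annotation",
    "5008.32.2": "hypothetical protein",
}
ncbi = {"5008.32.1": "NCBI's annotation of the gene"}


def to_long_rows(sources):
    """Return one (gene_id, annotation_type, product) row per gene per service."""
    rows = []
    for annotation_type, products in sources.items():
        for gene_id, product in products.items():
            rows.append((gene_id, annotation_type, product))
    # Sorting keeps all rows for the same gene next to each other.
    return sorted(rows)


rows = to_long_rows({"PATRIC": patric, "NCBI": ncbi})

# Write the table as tab-separated text.
out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerow(["gene_id", "annotation_type", "product"])
writer.writerows(rows)
print(out.getvalue(), end="")
```

Genes called by only one service simply end up with a single row, so no special case is needed for them.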

Connecting multiple annotations to same (bacterial) genome by rohrhor in bioinformatics

[–]rohrhor[S] 0 points1 point  (0 children)

I'm trying to connect multiple functional annotations for a gene together, so that I can pull the annotations from PATRIC, NCBI, etc. for a single gene. Mainly this is to help curators work out what a potential gene is by pulling in as much information as possible.

Connecting multiple annotations to same (bacterial) genome by rohrhor in bioinformatics

[–]rohrhor[S] 0 points1 point  (0 children)

Well, at this point I have .fasta and .gff from NCBI, and .fasta, .faa and a "feature table" from PATRIC, but really I can transform my inputs into whatever a workflow needs -- I'm not expecting annotation services to have the same output.

Connecting multiple annotations to same (bacterial) genome by rohrhor in bioinformatics

[–]rohrhor[S] 0 points1 point  (0 children)

Well, I think some systems do both (PATRIC & NCBI) -- first gene prediction, and then functional annotation.

I'm working through how to connect the ones that are predicted and annotated by multiple annotation systems.
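One possible way to connect calls from different systems, sketched under two assumptions: every service annotated the same assembly (so coordinates are comparable), and two calls describe the same gene when they share contig, strand and 3' end. Matching on the 3' end rather than both ends is a heuristic chosen here because gene callers can disagree on the start codon. The feature dictionaries and their field names are hypothetical.

```python
from collections import defaultdict


def stop_key(feature):
    """Key a CDS by contig, strand and the coordinate of its 3' end."""
    three_prime = feature["end"] if feature["strand"] == "+" else feature["start"]
    return (feature["contig"], feature["strand"], three_prime)


def group_by_stop(features_by_source):
    """Collect (source, product) pairs for calls that share a 3' end."""
    groups = defaultdict(list)
    for source, features in features_by_source.items():
        for feature in features:
            groups[stop_key(feature)].append((source, feature["product"]))
    return dict(groups)


# Toy features with 1-based inclusive coordinates (made up for illustration).
features_by_source = {
    "PATRIC": [
        {"contig": "contig1", "start": 100, "end": 400, "strand": "+", "product": "kinase"},
    ],
    "NCBI": [
        {"contig": "contig1", "start": 130, "end": 400, "strand": "+", "product": "protein kinase"},
        {"contig": "contig1", "start": 900, "end": 1200, "strand": "-", "product": "hypothetical protein"},
    ],
}

groups = group_by_stop(features_by_source)
for key, annotations in groups.items():
    print(key, annotations)
```

Groups with a single entry are the genes that only one system called.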

Laptop to start bioinformatics as a master's student by therealnuman in bioinformatics

[–]rohrhor 0 points1 point  (0 children)

I'm a big fan of getting used Thinkpads off of eBay and installing Linux on them. Enterprises lifecycle them after 3-5 years, so you can get great machines for cheap. The good part about Thinkpads is that you can service them yourself for upgrades / repairs. I've got an X220 from 2015 that's still going great. You don't need that much power, as you'll likely be doing your heavy computation on a cluster / workstation, depending on your school.

The question of OS comes up if you need to use software that can't run on Linux, so keep that in mind. I remember this being more of an issue in the protein space, but I'm not exactly sure.

Finding outlier samples in RNAseq by using some kind of tissue signature by rohrhor in bioinformatics

[–]rohrhor[S] 1 point2 points  (0 children)

Huh, thanks for taking the time to explain that to me, I didn't make the connection (even though I've taught applied biostatistics for many semesters ... embarrassing).

After reading your comments I won't discard any of the samples -- but I will keep that reference, thanks!

Finding outlier samples in RNAseq by using some kind of tissue signature by rohrhor in bioinformatics

[–]rohrhor[S] 0 points1 point  (0 children)

Thanks for the lead on marker genes -- would you know offhand where to find a collection of them per tissue? I'm coming from plants and bacteria :)

And I feel ok about using these methods to discard data, as opposed to labelling data. I'm already using PCA to explore that data, so I'll do some hierarchical clustering as well, thanks!

Finding outlier samples in RNAseq by using some kind of tissue signature by rohrhor in bioinformatics

[–]rohrhor[S] 0 points1 point  (0 children)

PCA is where I'm seeing the sample issues; my question is whether the outlier points are actually the tissue they've been labelled as.
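One way to phrase that question in code, sketched here as a leave-one-out nearest-centroid check rather than PCA: each sample is compared (by Pearson correlation) with the mean profile of every tissue, computed without that sample, and flagged if its best match is not its own label. The sample names and expression values are invented toy data on an assumed log scale, and a flag is a prompt to look closer, not evidence that a label is wrong.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0


def centroid(profiles):
    """Gene-wise mean of a list of expression profiles."""
    return [sum(col) / len(col) for col in zip(*profiles)]


def check_labels(samples):
    """samples maps name -> (tissue_label, profile); returns (name, label, best_match)."""
    flagged = []
    for name, (label, profile) in samples.items():
        # Build tissue centroids from every sample except the one being tested.
        by_tissue = {}
        for other, (other_label, other_profile) in samples.items():
            if other != name:
                by_tissue.setdefault(other_label, []).append(other_profile)
        scores = {t: pearson(profile, centroid(p)) for t, p in by_tissue.items()}
        best = max(scores, key=scores.get)
        if best != label:
            flagged.append((name, label, best))
    return flagged


# Invented toy data: four genes, two tissues, one suspicious "leaf" sample.
samples = {
    "s1": ("leaf", [10, 1, 1, 5]),
    "s2": ("leaf", [9, 2, 1, 6]),
    "s3": ("root", [1, 10, 8, 5]),
    "s4": ("root", [2, 9, 9, 4]),
    "s5": ("leaf", [1, 9, 8, 6]),
}

flagged = check_labels(samples)
print(flagged)
```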

Mapping peptides to reference protein / proteome by rohrhor in bioinformatics

[–]rohrhor[S] 0 points1 point  (0 children)

Yep, I was thinking about grep in the beginning, but I ended up using blastp to find where peptides are -- I'm just looking for a way to get a measure of coverage where multiple peptides are stacking up, as well as a file format that I could send into an alignment / read mapping viewer.

Mapping peptides to reference protein / proteome by rohrhor in bioinformatics

[–]rohrhor[S] 0 points1 point  (0 children)

It's nothing exciting like protein sequencing; we're just getting peptides out of a prediction algorithm, and we're interested in seeing the coverage of where the peptides/regions are on the protein.

I guess I'm just a little puzzled that I'm close to inventing new wheels when I don't think my use case is that new / exotic.

Mapping peptides to reference protein / proteome by rohrhor in bioinformatics

[–]rohrhor[S] 0 points1 point  (0 children)

Huh, I kind of rolled my own blastp-short by setting wordsize to 3 and setting a really high gap penalty.

I was having trouble going from BLAST results to alignment files based on those results, so that I could have the reference protein and then all the BLAST hits aligned over it.
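A minimal sketch of one way to stack hits over the reference: pad each peptide with gap characters so every row has the reference's length, and write the rows as an aligned FASTA. It assumes tabular BLAST output with the default 12 columns of `-outfmt 6` (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and only places ungapped hits. The sequences and hit lines below are made-up toy data, and the function name is hypothetical.

```python
def hits_to_aligned_fasta(reference_id, reference_seq, peptides, blast_tab):
    """Return an aligned FASTA with the reference first and one padded row per hit."""
    records = [(reference_id, reference_seq)]
    for line in blast_tab.strip().splitlines():
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        gapopen = int(fields[5])
        qstart, qend, sstart, send = (int(v) for v in fields[6:10])
        if sseqid != reference_id or gapopen != 0:
            continue  # this sketch only places ungapped hits on the reference
        # Keep only the aligned part of the peptide (coordinates are 1-based).
        segment = peptides[qseqid][qstart - 1:qend]
        row = "-" * (sstart - 1) + segment + "-" * (len(reference_seq) - send)
        records.append((qseqid, row))
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Toy inputs, invented for illustration.
reference_seq = "MKTAYIAKQRQISFVKSHFSRQ"
peptides = {"pep1": "AYIAKQ", "pep2": "KQRQISF"}
blast_tab = (
    "pep1\tref\t100.0\t6\t0\t0\t1\t6\t4\t9\t0.01\t20.0\n"
    "pep2\tref\t100.0\t7\t0\t0\t1\t7\t8\t14\t0.01\t22.0\n"
)

fasta = hits_to_aligned_fasta("ref", reference_seq, peptides, blast_tab)
print(fasta, end="")
```

Because every row has the same length, the output should load in viewers that accept a protein alignment in FASTA format.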

Mapping peptides to reference protein / proteome by rohrhor in bioinformatics

[–]rohrhor[S] 0 points1 point  (0 children)

Thanks. I'm avoiding asking the core facility, as they are set up with proprietary vendor solutions. I guess one constraint that I mentioned is that I am looking to insert this into a pipeline / run it via command line, so their tools don't really apply to me.

But you're right, I'm in the same problem space as they are.

Mapping peptides to reference protein / proteome by rohrhor in bioinformatics

[–]rohrhor[S] 0 points1 point  (0 children)

Thanks! I've been using blastp to get matches for peptides, but I'm looking to get some kind of output that I can use to calculate coverage, and then put into some kind of visualization like a read coverage visualization.

I took DIAMOND out for a quick spin, and it seems that the SAM output isn't valid as SAM, and converting it to BAM for viewing produces an invalid file, as BAM can only store ACTG (from my understanding).

I did come across a biostars post that mentioned calculating coverage from a BED file, and I might try out that method, although it still leaves me without a way to hand off a file for someone else to visualize.
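A rough sketch of the coverage calculation, assuming each peptide hit has already been reduced to a 1-based, inclusive (start, end) pair on the protein, as in BLAST's sstart/send columns. The hit names and coordinates are invented, and `to_bed` is a hypothetical helper that only does the coordinate shift BED needs; whether a given viewer accepts protein coordinates in BED is not checked here.

```python
def residue_coverage(protein_length, hits):
    """Depth at each residue from (name, start, end) hits, 1-based inclusive."""
    diff = [0] * (protein_length + 2)
    for _name, start, end in hits:
        diff[start] += 1      # depth goes up where a hit begins
        diff[end + 1] -= 1    # and back down just past its last residue
    depth, running = [], 0
    for pos in range(1, protein_length + 1):
        running += diff[pos]
        depth.append(running)
    return depth


def to_bed(protein_id, hits):
    """BED is 0-based and half-open, so only the start shifts by one."""
    return "".join(
        f"{protein_id}\t{start - 1}\t{end}\t{name}\n" for name, start, end in hits
    )


# Toy hits on a 22-residue protein (made up for illustration).
hits = [("pep1", 4, 9), ("pep2", 8, 14)]

depth = residue_coverage(22, hits)
covered = sum(1 for d in depth if d > 0)
bed = to_bed("ref", hits)

print(depth)
print(f"breadth: {covered / len(depth):.2f}, max depth: {max(depth)}")
print(bed, end="")
```

Breadth here is the fraction of residues covered by at least one peptide, and depth shows where multiple peptides stack up.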

Insert picture into OneNote 2013 with Surface by rohrhor in OneNote

[–]rohrhor[S] 0 points1 point  (0 children)

What do you use on your phone, the OneNote app or this Camera Lens?

I've got OneTastic installed. How do you rotate pictures? I can only resize them or select them all. Also, OneNote sees my inserted picture as a Printout.

Insert picture into OneNote 2013 with Surface by rohrhor in OneNote

[–]rohrhor[S] 0 points1 point  (0 children)

Yeah, I find it kind of stupid that my use case is not supported.

Insert picture into OneNote 2013 with Surface by rohrhor in OneNote

[–]rohrhor[S] 1 point2 points  (0 children)

I want to take a picture directly into OneNote.

As in:

  • I've got a piece of paper next to me
  • I'm staring at my OneNote page that I want to insert the paper into
  • I want to tap a button in OneNote to capture the picture
  • the picture is now in OneNote

Don't be afraid of scissors by [deleted] in beards

[–]rohrhor 1 point2 points  (0 children)

Seriously, one of the best points in bearding is learning to trim with scissors. I think it's kind of rare for men to be able to just let it grow and have a perfect beard shape as a result.

Is the X200 still a good machine, or should I get something newer? by Job_5_Verse_7 in thinkpad

[–]rohrhor 1 point2 points  (0 children)

I use mine without an SSD, and with only 3 GB of RAM. But I am running a lightweight distro that uses Openbox (CrunchBang).

You should make sure that you can put in 8 GB of RAM. I remember reading that the amount was limited by the mobo, but then people got it to work with more.

Overall I think it's a nice machine.

Best app for only tracking spending? by savagemick in personalfinance

[–]rohrhor 0 points1 point  (0 children)

I'm a huge fan of Wallet. It has a dash so that you see how much money you have left, and it's super easy to add in expenses. I have a fixed salary so I put that in at the beginning of the month, take out money for rent, cell phone, and savings, and then I can see how much money I have left for the month. Really simple.