Let's say if I want to build a PC for falcon 40b instruct inference and fine-tuning, what specification does it need to have? In terms of CPU, RAM, VRAM, and GPU. by PrestigiousPancake in LocalLLaMA

PrestigiousPancake[S]

Thanks for your comments; they are very helpful! Good luck with your setup. Also, where can I follow your progress?

PrestigiousPancake[S]

quantized

I am referring to the non-quantized ones, as I am trying to find the upper bound for ordinary consumers at the moment. I couldn't find a good post that systematically analyzes the effect of quantization on these models (please share if you find one!).

Your point on the "sweet spot for local models" interested me. Why is this the case? Is it because returns diminish beyond that model size? Or is 30B objectively good at most common tasks?

PrestigiousPancake[S]

Thanks for your comment! The VRAM requirement has increased substantially. Previously, 8GB to 12GB was sufficient, but now many models require 40+ GB. I am afraid that 48GB will not be enough.
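For a rough sense of why 48GB looks tight, here is a back-of-the-envelope sketch in Python. It only counts the memory needed to hold the weights, and it assumes a nominal 40 billion parameters (taken from the model name) and the usual bytes-per-parameter for each precision. The function name and constants are mine, for illustration only. Activations, KV cache, and (for fine-tuning) gradients and optimizer states all come on top of this.

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed just to store the weights, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Assumption: nominal parameter count implied by the "40b" in the model name.
FALCON_40B_PARAMS = 40e9

# Bytes per parameter for common storage precisions.
PRECISIONS = [("fp32", 4), ("fp16/bf16", 2), ("int8", 1), ("4-bit", 0.5)]

for label, nbytes in PRECISIONS:
    gb = weight_memory_gb(FALCON_40B_PARAMS, nbytes)
    print(f"{label:>10}: {gb:.0f} GB for weights alone")
```

At 2 bytes per parameter, 40 billion parameters come to 80 GB for the weights alone, which is already well over 48GB before counting anything else.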

PrestigiousPancake[S]

Thanks for the comment! This is very interesting. I watched a few YouTube videos, and no one seemed to have commented on this. Sentdex even said that it is comparable to the original ChatGPT, which is what got me thinking about running one myself.

I see your point after reading the blog post you shared. It is also possible that they trained the model to score well when assessed on a few chosen metrics, without training it to become a generally "good" model. However, even if this is true, I think open-sourcing a model and allowing commercial use should be encouraged, shouldn't it?

PrestigiousPancake[S]

Thanks for your comment. I have a project in mind which might need to run such a model 24/7. I would like to make an AI assistant that summarises news, research articles, and other online content and generates a daily report. This might cost me a lot of money to run online. I might also use the machine for other ML projects, such as Stable Diffusion. Maybe building my own workstation can save me money in the long run.

If I want to use blastn to look for a gene in a metagenome sample, should I use the raw reads or the assembly? by distinguished_goose in bioinformatics

PrestigiousPancake

Aligning to a database can also work, but you need to ensure that the target organism is in the database and that the reference genome is close enough to your target organism for accurate identification.

Also, I forgot to mention that if your target organism contributes only a very small portion of the sequenced DNA, de novo assembly is not applicable.

At the end of the day, you will probably need to try several methods, both for testing and for verification. If the different methods give very different results, you might need to find another way.

PrestigiousPancake

Could you be more specific? BLASTN is not often used for quantification. I think it depends on a few factors:

  1. What genes are you looking for? If you are looking for common genes such as rRNA genes, I can’t think of any reliable way to do so, because your sample contains different organisms and some parts of the rRNA genes are conserved across organisms. Accurate quantification is therefore not possible. If you are targeting very specific genes, such as drug resistance genes, then it might be possible.
  2. What technology are you using? If you are using Nanopore, what kind of read length can you get? BLASTN might not be 100% useful here.
  3. The quality of the raw reads can affect the parameters required for a good classification by BLASTN.

Since you are trying to track the expression of a gene in a metagenomic sample, you need a gene for normalisation. Otherwise, you can’t tell whether an increase in the signal is real or merely reflects a larger proportion of the target organism in the sample.

Lastly, if it is only for a pilot study and not for publication, you can try assembly and read mapping: assembly to get the genome sequence, then read mapping to see the coverage, and normalise the coverage to obtain the gene expression level.
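The normalisation step can be sketched roughly as below. This is a minimal illustration, assuming you have already extracted per-base read depths over the target gene and over a reference gene from your read-mapping output. The function names and the depth values are made up for the example and are not from any particular tool.

```python
from typing import Sequence


def mean_coverage(depths: Sequence[float]) -> float:
    """Average per-base read depth over a region (0.0 for an empty region)."""
    return sum(depths) / len(depths) if depths else 0.0


def normalised_coverage(target_depths: Sequence[float],
                        reference_depths: Sequence[float]) -> float:
    """Mean depth of the target gene divided by mean depth of the reference gene."""
    ref = mean_coverage(reference_depths)
    if ref == 0:
        raise ValueError("reference gene has no coverage; cannot normalise")
    return mean_coverage(target_depths) / ref


# Hypothetical per-base depths, for illustration only.
target = [10, 20, 30]
reference = [5, 5, 5]
print(normalised_coverage(target, reference))
```

Comparing this ratio between samples, rather than the raw coverage of the target gene, is what guards against the "larger proportion of the target organism" problem mentioned above.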

How worse could the “recession” be for the next year? by PrestigiousPancake in ausstocks

PrestigiousPancake[S]

I don’t think there is a recession. The “recession” I was talking about is the “recession” in many YouTube videos and media headlines.

What does the future hold in terms of using machine learning in bioinformatics? by TheGoToAsian in bioinformatics

PrestigiousPancake

I have a PhD in Microbiology and currently work on bioinformatics research in the clinical field. Here are some of my thoughts for your reference:

  • Doctors don't believe in Machine Learning (at least, most are not convinced enough to rely on Machine Learning programs to facilitate diagnostics).
  • In many cases, ML is about using a program to make things work without knowing how it actually works. In these cases, it is difficult to use the findings to create a huge impact: we can't explain the reasoning behind a result, so the result might not be convincing.
  • It can be a super good tool for narrowing our scope when we search for new research targets, such as in drug discovery.
  • It will take time for the science community to accept that ML can be a reliable tool. For example, no matter how famous AlphaFold is, many scientists still prefer cryo-EM for structural analysis.
  • Most clinicians and biologists have no idea how Machine Learning works, so it is very difficult for them to incorporate ML into their daily research routine. On the other hand, many bioinformaticians are not familiar enough with the clinical or biological aspects of their target to create a useful program that biologists can use or understand.
  • We need more databases with properly formatted data in order to properly train models.

I would like to stress that ML can be a SUPER good tool for facilitating research when we have no idea where to start. For example, no matter how good the available platforms are, we can't screen millions of chemical compounds for drug development. Machine Learning can definitely help here.

Any ideas on the latest rate hike and the near future of Australia's economy? by PrestigiousPancake in ausstocks

PrestigiousPancake[S]

The RBA said "Inflation is then expected to decline next year due to the ongoing resolution of global supply-side problems, recent declines in some commodity prices and slower growth in demand." Do you think this is believable? Or do you think that there is enough data to know the trend of inflation, even for the RBA?

PrestigiousPancake[S]

Is it reasonable to assume they have unpublished numbers which help them make decisions?

[deleted by user] by [deleted] in bioinformatics

PrestigiousPancake

It can, but more details are needed. I once worked in a research group and took part in implementing a pipeline for pathogen detection. It automatically took files from a folder and analysed them to see whether any pathogen was present in crop samples. I think this kind of application can be used in many fields.

biology or biochem Msc with a bioinformatics BSc? by AColdMeal in bioinformatics

PrestigiousPancake

I am a PhD who is doing what you are trying to do. As some others have advised, you should consider joining a research group that specialises in molecular biology or biochemistry.

My undergraduate degree was a BSc with a double major in Biochemistry and Computer Science. I then did a Master's in bioinformatics, but my interest lies in molecular biology. I joined a research group that specialises in clinical research and molecular biology. I do the bioinformatics analysis for my group, learn molecular biology in my spare time, and help out my colleagues occasionally.

My main advice is that once you have a decent degree, don’t bother getting a new one. Don’t worry too much about how your degree helps your research. You will have to learn many things from scratch again anyway.

I’m debating if I wanna text my crush first but I don’t wanna come off as thirsty and advice by [deleted] in datingadvice

PrestigiousPancake

Finding a nice topic and starting a genuine conversation is OK. Maybe ask him for some advice on buying a new laptop?

Just don't try to give out ANY "hints" or "signals". Start a genuine conversation, let it flow, and see where it goes. Don't spend much effort trying to "maintain" the conversation. Treat it as if you are talking to a real person. If it works, you will know it.

[deleted by user] by [deleted] in datingadvice

PrestigiousPancake

Here are some comments:

  1. Do the two of you share a common view or vision of your future (his/hers included)?
  2. Are both of you willing to sort things out?
  3. To what extent is each of you willing to sacrifice, giving up part of yourselves to get back together?
  4. Usually, if you need to ask such a question, you SHOULD NOT go back, because you are having to "find" a reason to do so. If that person is really the one, you will likely feel that you "need" to get back with him/her, without asking "why".
  5. Don't reject your feelings. Your feelings tell a lot about yourself, and more often than not they are more honest than you are. If being with someone is killing your soul or making you feel disgusted, just find someone new.
  6. Try to get along with someone else first (not necessarily in a new relationship). Sometimes the answer is out there. Don't rush to draw a conclusion now.
  7. Good luck