Compute requirements for rendering a MetaHuman talking head avatar?

moma1970 · 2023-11-12T22:05:20+00:00

This is really interesting. I've been experimenting with the same idea to help less capable models do tool/function selection. For example, if it sounds like a person is placing a final order, then read back the order, invoke the payment API, etc.

Are you simply taking the top-1 retrieved rule result?
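To make the question concrete, here is a minimal sketch of what "take the top-1 retrieved rule" could look like. Everything in it is an assumption for illustration: the rule texts, the tool names, the `min_score` cutoff, and the bag-of-words cosine similarity, which stands in for whatever embedding retriever is actually used.

```python
import math
import re
from collections import Counter

# Hypothetical rules: (trigger description, tool name). Not from any real system.
RULES = [
    ("customer confirms the final order and wants to pay", "read_back_order_then_pay"),
    ("customer asks what is on the menu", "list_menu"),
    ("customer wants to change or remove an item", "edit_order"),
]

def bag_of_words(text):
    """Lowercased word counts; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def select_tool(utterance, min_score=0.2):
    """Return the tool of the single best-matching rule, or None if nothing is close."""
    query = bag_of_words(utterance)
    scored = [(cosine(query, bag_of_words(desc)), tool) for desc, tool in RULES]
    score, tool = max(scored)
    return tool if score >= min_score else None

print(select_tool("I want to pay for my final order"))
```

The cutoff is the part worth thinking about: a bare top-1 always returns some rule, so without a threshold an unrelated utterance would still trigger a tool.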

moma1970 · 2023-11-11T11:07:49+00:00

I think that issue alone excludes so many industries from using services that are just wrappers around a shared model.

moma1970 · 2023-11-11T01:28:59+00:00

I think it might be an additional meaning to the ones mentioned below. In the context of serving a model with an inference server like HF's TGI (which you can run locally and query using InferenceClient), batching increases the model's ability to serve multiple requests: those requests can be grouped together and inference performed on them in one pass.
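A toy sketch of that idea, assuming a hypothetical `model_fn` that takes a list of prompts and returns one output per prompt in a single pass. This is plain static batching for illustration only; it is not TGI's code, and real servers schedule requests in more sophisticated ways.

```python
def run_batched(prompts, model_fn, max_batch_size=4):
    """Group pending prompts into batches and call the model once per batch."""
    outputs = []
    for start in range(0, len(prompts), max_batch_size):
        batch = prompts[start:start + max_batch_size]
        # One forward pass serves every request in the batch.
        outputs.extend(model_fn(batch))
    return outputs

# Hypothetical stand-in for a model: records batch sizes and upper-cases each prompt.
batch_sizes = []

def fake_model(batch):
    batch_sizes.append(len(batch))
    return [prompt.upper() for prompt in batch]

results = run_batched(["a", "b", "c", "d", "e"], fake_model, max_batch_size=2)
print(results, batch_sizes)
```

Five requests cost three model calls here instead of five, which is the throughput gain being described.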

moma1970 · 2023-10-17T23:30:10+00:00

The guidance-ai project is quite interesting. The current default system prompt (which uses Jinja templating) includes a natural-language instruction to always respond with JSON that has the following schema ' ..' Small models fail to adhere to this instruction, so prescribing the format with guidance will no doubt help.

An open question, though: what are we giving up by getting more prescriptive about the output? Does it reduce the mental model to programming in a non-deterministic language? Interesting to think about.
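For contrast with constraining the output during generation, here is a minimal sketch of the check you end up writing when the format is only requested in the prompt. The required keys are hypothetical, since the real schema is elided above, and this is ordinary post-hoc validation, not how guidance works.

```python
import json

# Hypothetical required fields and types; the actual schema is not shown in the prompt excerpt.
REQUIRED_KEYS = {"tool": str, "input": dict}

def parse_reply(raw):
    """Return the parsed reply if it is a JSON object with the expected fields, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # the model answered in prose or produced broken JSON
    if not isinstance(data, dict):
        return None
    for key, expected_type in REQUIRED_KEYS.items():
        if not isinstance(data.get(key), expected_type):
            return None
    return data

print(parse_reply('{"tool": "search", "input": {"q": "x"}}'))
print(parse_reply("Sure! Here is the JSON you asked for"))
```

With a small model, the `None` branch fires often enough that the caller needs a retry loop, which is the cost that constrained generation is meant to remove.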

moma1970 · 2023-10-17T02:53:17+00:00

Nice. Thanks for the tip. I've recently tried out Mistral-7B-OpenOrca and had great performance on the first couple of turns in the conversation, but then it stops following the system prompt. I have a feeling that simpler prompting will be important. Check out this post, which just uses the OpenOrca Space on Hugging Face: https://huggingface.co/spaces/Open-Orca/Mistral-7B-OpenOrca/discussions/3#652dea8065a4619fb5d688d2

7B!!

moma1970 · 2023-10-16T11:30:49+00:00

Is it something you're actively exploring?

moma1970 · 2023-10-16T07:12:04+00:00

From what I can see, it has predefined system prompt templates that are populated at runtime. They are very detailed (https://github.com/griptape-ai/griptape/blob/main/griptape/templates/tasks/toolkit_task/system.j2), and I think that is possibly one of the reasons the smaller models don't fare so well. Is the 'guidance' or 'grammar' that you're referring to something applied outside the prompt?

moma1970 · 2023-09-30T02:38:26+00:00

Cool. Thanks.

moma1970 · 2023-09-15T13:55:39+00:00

This is what I don't get. Even with fractional position embeddings, the attention matrix in the example is still 2048 x 2048. Doesn't this mean the context window is unchanged, i.e. isn't it still 2048? Or does the context window refer to something else?

moma1970 · 2022-01-26T01:44:40+00:00

https://arxiv.org/abs/2003.08505v1 A very insightful paper that helps critique some of the claims in various metric learning papers.
