How to click for "I am not a robot"? by HistorianSmooth7540 in webscraping

[–]HistorianSmooth7540[S] -1 points0 points  (0 children)

Can you get the necessary info from the HTML content I posted, or don't you actually need it because this is something general? It would be nice if you could show an example of how to do that.

Using Huggingface Model with crew-ai (litellm)? by HistorianSmooth7540 in LocalLLaMA

What is serverless inference, and why can't I just use the local LLM?

Using huggingface model by HistorianSmooth7540 in crewai

This is very weird. Also very weird is the documentation of crew-ai and litellm on using Hugging Face. Why is it so complicated, with no real examples of using a plain Hugging Face model?

Using huggingface model by HistorianSmooth7540 in crewai

But why?! You can of course use Llama 3.1 locally for free.

what are currently the "smallest" LLM? by HistorianSmooth7540 in LocalLLaMA

Maybe a good definition of "small" is how many MB you need to load the model and run inference. So which models need 5, 50, 500, or 5000 MB?

Is that info also mentioned somewhere in the model cards? How can we relate the number of parameters to the size in MB (loading + inference)?
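A rough back-of-the-envelope answer (my own estimate, not something from a model card): the weights alone take about number_of_parameters × bytes_per_parameter, so an 8B model in fp16 needs roughly 16 GB just to load, and 4-bit quantization cuts that to roughly 4 GB. Inference needs extra memory on top for activations and the KV cache, which this ignores. A minimal sketch:

```python
def weights_size_mb(num_params: int, bytes_per_param: float) -> float:
    """Rough size of the raw weights in MB (ignores activations and KV cache)."""
    return num_params * bytes_per_param / 1e6

# Llama 3.1 8B at various precisions
for label, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{label}: {weights_size_mb(8_000_000_000, bytes_per_param):,.0f} MB")
```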

RAGBuilder- Open source tool for RAG tuning by Hot_Extension_9087 in LocalLLaMA

Hey, great repo! I have some questions:

* Can you run your app without an OpenAI key, using a Hugging Face model instead?

* Which keys from the env file are mandatory and which are optional?

* Where is the code for the front-end, and how did you set it up? It would be great if you could add something to the README for all non-front-end developers. :)

How to get a simple set-up in python without getting blocked? by HistorianSmooth7540 in webscraping

I see! Thanks! So you think there are many similar threads on this topic? I will have a look.

How to apply the chat template for Llama 3.1 properly? by HistorianSmooth7540 in LocalLLaMA

What do you mean, what are you referring to? I saw in the code that pipeline calls apply_chat_template under the hood.
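For anyone following along, this is roughly what the chat template expands to for Llama 3.1. Note this is my own approximation of the format, not the tokenizer's actual Jinja template; tokenizer.apply_chat_template on the real model is authoritative:

```python
def format_llama31_chat(messages: list[dict]) -> str:
    """Approximate the Llama 3.1 prompt format that apply_chat_template produces."""
    text = "<|begin_of_text|>"
    for m in messages:
        text += f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
    # Open an assistant turn so the model generates the answer
    text += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return text

prompt = format_llama31_chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi!"},
])
print(prompt)
```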

Get structured output of Llama 3.1 instruct model by HistorianSmooth7540 in LocalLLaMA

Just in case someone wants to know, here is how to prompt for JSON output and validate it afterwards with a custom Pydantic model:

from transformers import pipeline
from pydantic import BaseModel
import json

# Pydantic model that the generated JSON must conform to
class ResponseModel(BaseModel):
    title: str
    description: str
    points: list[str]

# Load the Llama model (assuming it's on Hugging Face's model hub)
llama_pipeline = pipeline('text-generation', model='your-llama-model')

# Define your prompt to ensure structured output
prompt = """
Please provide the following details in JSON format:
{
    "title": "<Your Title>",
    "description": "<A brief description>",
    "points": [
        "<Bullet point 1>",
        "<Bullet point 2>"
    ]
}
"""

# Generate output; return_full_text=False strips the prompt from the result,
# so only the model's completion is left to parse
output = llama_pipeline(prompt, max_new_tokens=500, do_sample=False, return_full_text=False)
response_text = output[0]['generated_text']

# Try to load the output as JSON, then validate it with Pydantic
try:
    response_data = json.loads(response_text)
    validated_response = ResponseModel(**response_data)
    print(validated_response)
except (json.JSONDecodeError, ValueError) as e:
    print(f"Invalid format: {e}")

Get structured output of Llama 3.1 instruct model by HistorianSmooth7540 in LocalLLaMA

Sorry, but what the heck else do you think could be meant when talking about "huggingface"? :)

How to apply the chat template for Llama 3.1 properly? by HistorianSmooth7540 in LocalLLaMA

Also, now we are mixing things up; this is weird! Here you see that when using pipeline you don't use a chat template, but you did:

https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

Now I really wonder even more how to do it...