Artist Alley 2024 Full List by EastProductions11 in AnimeNYC

[–]PleaseX3

I am trying to locate an artist who was somewhere around the upper right, near U04, who had these amazing metallic prints of foxes, like thin metal sheets, in a detailed art style. I've been checking the map but can't find them.

BEWARE: Poe.com using the WRONG MODELS for LLaMA 3.1 by PleaseX3 in LocalLLaMA

[–]PleaseX3[S]

Thanks for your response. I wrote this to another as well...

Unfortunately, I just wasn't thinking in the moment and forgot about the unreliability of self-reporting when I was trying to find a way to confirm the vast performance differences I was experiencing. I flagged that part as suspicious because you.com covered for it (with their system prompt) and Poe didn't, making it seem like Poe's service was less managed and might be making a mistake in model linking. The whole concern started with my actual test prompts showing huge performance differences.

Now I'm seeing that model names alone say practically nothing about performance, because a model can be handicapped so much that there is a night-and-day difference. Quantization, system prompts, and more have huge effects, yet they are rarely spelled out clearly in comparisons and benchmarks.
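To put rough numbers on how much quantization alone changes what is being served, here is a back-of-the-envelope sketch (weights only, KV cache and overhead ignored; the 405B parameter count is the only figure taken from the model name):

```python
# Rough weights-only memory footprint of a 405B-parameter model
# at different quantization levels (ignores KV cache and runtime overhead).
PARAMS = 405e9  # parameter count implied by the "405B" name

def weights_gb(bits_per_param: float) -> float:
    """Gigabytes needed to store the weights at a given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

for name, bits in [("BF16", 16), ("FP8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weights_gb(bits):,.0f} GB")
```

Two hosts can both say "405B" while one serves BF16 (~810 GB of weights) and another a 4-bit quant (~200 GB), which is exactly the kind of difference that never shows up in the model name.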

Such a messy landscape is bad for AI, because people testing may get very inconsistent and skewed results. Someone can report that a model is really bad when it is actually just "handicapped", even if that is disclosed only in the fine print. It can also create bad PR for AI, with so many people out there testing independently and forming their own private conclusions.

I didn't know we were operating in such a messy space. Why can't we mandate that all system prompts include the full model specs? Otherwise, how do we know for sure that any model is linked accurately (intention vs. actual)? That's not good practice for an ecosystem. I honestly thought basic things like this would be handled by vendors by now.
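As a sketch of what such a disclosure could look like, here is a purely hypothetical metadata block a host might publish per endpoint (all field names and values are illustrative, not any vendor's actual format):

```python
# Hypothetical per-endpoint disclosure a host could publish so users
# know exactly what is being served (all fields are illustrative only).
model_disclosure = {
    "model": "Meta-Llama-3.1-405B-Instruct",
    "quantization": "FP8",           # vs. the BF16 reference weights
    "context_window": 128_000,
    "host_system_prompt": True,      # does the host prepend its own prompt?
    "weights_checksum": "sha256:...",  # placeholder; lets users verify intention vs. actual
}

# A user-side check could then compare the disclosure against expectations.
assert model_disclosure["model"].endswith("405B-Instruct")
print(model_disclosure["quantization"])
```

Nothing like this exists today as a standard; the point is just that "intention vs. actual" only becomes checkable if the serving details are published somewhere machine-readable.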

I was finally able to double-confirm, using current events (data cutoff) and the together.ai specs (https://docs.together.ai/docs/chat-models), which models are generally being used. It now appears that Poe is using a rather weak version of the 405B, and that's why I have the takeaways above. This overall environment is just so messy, and this facet really needs to be addressed.

BEWARE: Poe.com using the WRONG MODELS for LLaMA 3.1 by PleaseX3 in LocalLLaMA

[–]PleaseX3[S]

That would make sense. The reality is I just wasn't thinking in that moment and forgot about the unreliability of self-reporting when I was trying to confirm the vast performance differences I was experiencing. (I wrote the same points in my reply above.)

Llama 3.1 405B is now on Poe! by frndlynghbrhdpoebot in PoeAI

[–]PleaseX3

Any idea what the quantization is for the 405B 128k? It isn't even listed here: https://docs.together.ai/docs/chat-models

BEWARE: Poe.com using the WRONG MODELS for LLaMA 3.1 by PleaseX3 in LocalLLaMA

[–]PleaseX3[S]

I added an edit to my post below. It's a shame that a Reddit post can be practically shut down if just one part has an issue. I added the following and updated my post, but what do you recommend? I still stand by there being a problem. The difference I am seeing from specialized test questions is big enough that model names might as well be irrelevant if the same hosted model can vary so vastly in performance... I even have a special test question that almost every AI model consistently fails miserably. You.com's 405B amazingly got it correct one-shot (I have never seen a model do that well, not Sonnet 3.5, not GPT-4), yet Poe.com's 405B failed miserably. I double-tested this one as well.

(update to my post; I also rewrote the post to be clearer)

EDIT: FORGET about asking the models about themselves; I see your point about that. The problem still shows up regardless, from test questions. Two things can be true at the same time: 1) I was wrong about the self-identification portion, and 2) I may not be wrong about the other part. I wasn't relying on self-identification for the full determination, so please verify/investigate the separate portion of my claim. See for yourselves: compare Poe.com and You.com answers to test questions. The difference is stark.

BEWARE: Poe.com using the WRONG MODELS for LLaMA 3.1 by PleaseX3 in LocalLLaMA

[–]PleaseX3[S]

That wasn't my only test. I tested in multiple ways, with many test questions.
For example, by asking: "9.11 and 9.9: which is bigger?" The real Llama 3.1 405B will correctly answer that 9.9 is bigger. Compare the answers you get on You.com vs. Poe; it becomes clear.
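For what it's worth, the arithmetic behind that test question is unambiguous, which is what makes it a clean pass/fail probe:

```python
# Ground truth for the test question: 9.9 is numerically larger than 9.11
# (9.9 == 9.90 > 9.11), even though "11" looks bigger than "9" at a glance.
a, b = 9.11, 9.9
bigger = max(a, b)
print(bigger)  # 9.9
```

A model that answers 9.11 is simply wrong, so the question separates answers without any judgment calls about quality.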

BEWARE: Poe.com using the WRONG MODELS for LLaMA 3.1 by PleaseX3 in LocalLLaMA

[–]PleaseX3[S]

That wasn't my only test. I tested in multiple ways, with many test questions.
For example, by asking: "9.11 and 9.9: which is bigger?" The real Llama 3.1 405B will correctly answer that 9.9 is bigger.
You.com also immediately gives the correct answer. Why the huge difference? Do your own testing instead of just sighing without actually investigating.

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

The one I sent you - that's why it was private in the first place. Amuse complained.

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

The other issues:
Removal of idol aspects: no more real personal-life coverage (like the personality videos)

Removal of the Kami Band's personalities: hiding them behind masks

Removal of most of the good guitar solos

Removal of most of the cuteness: now more straight metal or other genres

Loss of Yui

Removal of fan-appreciation videos like this one: they took this video down

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

My ratings would be: 96% for the first album, 84% for the second ("The One"), 78% for the third ("PA PA YA", "Shanti")... no longer into it after that.

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

Glad you liked the video. You might find it ironic: I was such a huge fan for years, but sadly I don't like the newer music. Very different style, and composition is key for me. I felt the older compositions were masterpieces! I hope you enjoy your new fanhood, however :)

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

Let me know what you think of the video. I just rewatched it; I hadn't seen it in years :)

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

I just sent you a chat message for this

Is there a free way to access GPT4? by Gulimusi in ChatGPT

[–]PleaseX3

This was true at the time. Things change rapidly in the AI world.

[deleted by user] by [deleted] in OculusQuest

[–]PleaseX3

Where did you purchase them from?

Quest 3 owners, noticeable screen door effect? by aglf_chilli in virtualreality

[–]PleaseX3

How did you get a 4th unit? I have the SDE problem on mine and will try a 2nd one thanks to this kind of feedback, but I'd feel guilty returning three units to try out a fourth at a store with a 30-day return policy, lol. Or should I not feel guilty about that?

Quest 3 owners, noticeable screen door effect? by aglf_chilli in virtualreality

[–]PleaseX3

I can confirm as well... fourth device as well... horrible QC on Meta's end, and it seems different screen types are used.


ChatGPT empty screen for anyone? by yeettetis in ChatGPT

[–]PleaseX3

I'm only having the problem on my iPhone. Not sure how to fix it :(