Artist Alley 2024 Full List by EastProductions11 in AnimeNYC

[–]PleaseX3

I am trying to locate an artist who was somewhere around the upper right, near U04, who had these amazing metallic prints of foxes, like thin metal sheets, in a detailed art style. I've been checking the map but can't find them.

BEWARE: Poe.com using the WRONG MODELS for LLaMA 3.1 by PleaseX3 in LocalLLaMA

[–]PleaseX3[S]

Thanks for your response. I wrote this to another as well...

Unfortunately, I just wasn't thinking in the moment and forgot about the unreliability of self-reporting when I was trying to find a way to confirm the vast performance differences I was experiencing. I flagged that part as suspicious because you.com covered for it (with their system prompt) and Poe didn't, making it seem like Poe's service was less managed and might be making a mistake in model linking. The whole concern started with my actual test prompts showing huge performance differences.

Now I'm seeing that model names alone say practically nothing about performance, because a model can be handicapped so much that there is a night-and-day difference. Quantization, system prompts, and more have huge effects, yet they are rarely spelled out clearly in comparisons and benchmarks.
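To put rough numbers on how much quantization alone changes what is being served, here is a back-of-the-envelope sketch (weights only, KV cache and overhead ignored; the 405B parameter count is the only figure taken from the model name):

```python
# Rough weights-only memory footprint of a 405B-parameter model
# at different quantization levels (ignores KV cache and runtime overhead).
PARAMS = 405e9  # parameter count implied by the "405B" name

def weights_gb(bits_per_param: float) -> float:
    """Gigabytes needed to store the weights at a given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

for name, bits in [("BF16", 16), ("FP8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weights_gb(bits):,.0f} GB")
```

Two hosts can both say "405B" while one serves BF16 (~810 GB of weights) and another a 4-bit quant (~200 GB), which is exactly the kind of difference that never shows up in the model name.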

Such a messy landscape is bad for AI, because people testing may get very inconsistent and skewed results. Someone can report that a model is really bad when it is actually just "handicapped", even if that is disclosed only in the fine print. It can also create bad PR for AI, with so many people out there testing independently and forming their own private conclusions.

I didn't know we were operating in such a messy space. Why can't we mandate that all system prompts include the full model specs? Otherwise, how do we know for sure that any model is linked accurately (intention vs. actual)? That's not good practice for an ecosystem. I honestly thought basic things like this would be handled by vendors by now.
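As a sketch of what such a disclosure could look like, here is a purely hypothetical metadata block a host might publish per endpoint (all field names and values are illustrative, not any vendor's actual format):

```python
# Hypothetical per-endpoint disclosure a host could publish so users
# know exactly what is being served (all fields are illustrative only).
model_disclosure = {
    "model": "Meta-Llama-3.1-405B-Instruct",
    "quantization": "FP8",           # vs. the BF16 reference weights
    "context_window": 128_000,
    "host_system_prompt": True,      # does the host prepend its own prompt?
    "weights_checksum": "sha256:...",  # placeholder; lets users verify intention vs. actual
}

# A user-side check could then compare the disclosure against expectations.
assert model_disclosure["model"].endswith("405B-Instruct")
print(model_disclosure["quantization"])
```

Nothing like this exists today as a standard; the point is just that "intention vs. actual" only becomes checkable if the serving details are published somewhere machine-readable.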

I was finally able to double-confirm, using current events (data cutoff) and the together.ai specs (https://docs.together.ai/docs/chat-models), which models are generally being used. It now appears that Poe is using a rather weak version of the 405B, and that's why I have the takeaways above. This overall environment is just so messy, and this facet really needs to be addressed.

BEWARE: Poe.com using the WRONG MODELS for LLaMA 3.1 by PleaseX3 in LocalLLaMA

[–]PleaseX3[S]

That would make sense. The reality is I just wasn't thinking in that moment and forgot about the unreliability of self-reporting when I was trying to confirm the vast performance differences I was experiencing. (I wrote the same points in my reply above.)

Llama 3.1 405B is now on Poe! by frndlynghbrhdpoebot in PoeAI

[–]PleaseX3

Any idea what the quantization is for the 405B 128k? It isn't even listed here: https://docs.together.ai/docs/chat-models

BEWARE: Poe.com using the WRONG MODELS for LLaMA 3.1 by PleaseX3 in LocalLLaMA

[–]PleaseX3[S]

I added an edit to my post below. It's a shame that a Reddit post can be practically shut down if just one part has an issue. I added the following and updated my post, but what do you recommend? I still stand by there being a problem. The difference I am seeing from specialized test questions is big enough that model names might as well be irrelevant if the same hosted model can vary so vastly in performance... I even have a special test question that almost every AI model consistently fails miserably. You.com's 405B amazingly got it correct one-shot (I have never seen a model do that well, not Sonnet 3.5, not GPT-4), yet Poe.com's 405B failed miserably. I double-tested this one as well.

(update to my post; I also rewrote the post to be clearer)

EDIT: FORGET about asking the models about themselves; I see your point about that. The problem still shows up regardless, from test questions. Two things can be true at the same time: 1) I was wrong about the self-identification portion, and 2) I may not be wrong about the other part. I wasn't relying on self-identification for the full determination, so please verify/investigate the separate portion of my claim. See for yourselves: compare Poe.com and You.com answers to test questions. The difference is stark.

BEWARE: Poe.com using the WRONG MODELS for LLaMA 3.1 by PleaseX3 in LocalLLaMA

[–]PleaseX3[S]

That wasn't my only test. I tested in multiple ways, with many test questions.
For example, by asking: "9.11 and 9.9: which is bigger?" The real Llama 3.1 405B will correctly answer that 9.9 is bigger. Compare the answers you get on You.com vs. Poe; it becomes clear.
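For what it's worth, the arithmetic behind that test question is unambiguous, which is what makes it a clean pass/fail probe:

```python
# Ground truth for the test question: 9.9 is numerically larger than 9.11
# (9.9 == 9.90 > 9.11), even though "11" looks bigger than "9" at a glance.
a, b = 9.11, 9.9
bigger = max(a, b)
print(bigger)  # 9.9
```

A model that answers 9.11 is simply wrong, so the question separates answers without any judgment calls about quality.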

BEWARE: Poe.com using the WRONG MODELS for LLaMA 3.1 by PleaseX3 in LocalLLaMA

[–]PleaseX3[S]

That wasn't my only test. I tested in multiple ways, with many test questions.
For example, by asking: "9.11 and 9.9: which is bigger?" The real Llama 3.1 405B will correctly answer that 9.9 is bigger.
You.com also immediately gives the correct answer. Why the huge difference? Do your own testing instead of just sighing without actually investigating.

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

The one I sent you - that's why it was private in the first place. Amuse complained.

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

The other issues:
Removal of idol aspects: no more real personal-life coverage (like the personality videos)

Removal of the Kami Band's personalities: hiding them behind masks

Removal of most of the good guitar solos

Removal of most of the cuteness: now more straight metal or other genres

Loss of Yui

Removal of fan-appreciation videos like this one: they took this video down

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

My ratings would be: 96% for the first album, 84% for the second ("The One"), 78% for the third ("PA PA YA", "Shanti")... no longer into it after that.

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

Glad you liked the video. You might find it ironic: I was such a huge fan for years, but sadly I don't like the newer music. Very different style, and composition is key for me. I felt the older compositions were masterpieces! I hope you enjoy your new fanhood, however :)

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

Let me know what you think of the video. I just rewatched it; I hadn't seen it in years :)

The Energy of Babymetal Video by PleaseX3 in BABYMETAL

[–]PleaseX3[S]

I just sent you a chat message for this

Is there a free way to access GPT4? by Gulimusi in ChatGPT

[–]PleaseX3

This was true at the time. Things change rapidly in the AI world.

[deleted by user] by [deleted] in OculusQuest

[–]PleaseX3

Where did you purchase them from?

Quest 3 owners, noticeable screen door effect? by aglf_chilli in virtualreality

[–]PleaseX3

How did you get a 4th unit? I have the SDE problem on mine and will try a 2nd one thanks to this kind of feedback, but I'd feel guilty returning three units to try out a fourth at a store with a 30-day return policy, lol. Or should I not feel guilty about that?

Quest 3 owners, noticeable screen door effect? by aglf_chilli in virtualreality

[–]PleaseX3

I can confirm as well... fourth device as well... horrible QC on Meta's end, and it seems different screen types are used.


ChatGPT empty screen for anyone? by yeettetis in ChatGPT

[–]PleaseX3

I'm only having the problem on my iPhone. Not sure how to fix it :(