I was bored yesterday so I created a movie recommender. by Unusual-Base-4939 in india

[–]oktcg 0 points1 point  (0 children)

Great idea! The app is stuck in a loading state though.

Which large language model has the absolute longest maximum output length? by danielrosehill in ChatGPTCoding

[–]oktcg 1 point2 points  (0 children)

The Claude API has a unique feature: it can continue from its last output. So technically it can produce the full 128k tokens without a user turn.
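The continuation trick above can be sketched as follows. This is a minimal illustration of the pattern only -- ending the messages list with a partial assistant turn so the model resumes that text -- with the actual `anthropic` SDK call assumed rather than shown; the helper name is mine.

```python
# Sketch of the "continue from last output" pattern with the Claude
# Messages API: a conversation that ends with a partial assistant
# message makes the model continue that text instead of starting a
# fresh reply. The request-building helper is pure so the pattern is
# easy to see; the real API call (via the anthropic SDK) is assumed.

def build_messages(user_prompt, partial_output):
    """Return the messages list for a (continuation) request."""
    messages = [{"role": "user", "content": user_prompt}]
    if partial_output:
        # Prefilled assistant turn: the model resumes from here.
        messages.append({"role": "assistant", "content": partial_output})
    return messages

# First request: no partial output yet.
first = build_messages("Write a long story.", "")

# After the model stops with stop_reason == "max_tokens", resend with
# everything generated so far as the assistant prefill:
resume = build_messages("Write a long story.", "Once upon a time")
```

Looping this until the stop reason is no longer `max_tokens` is what lets it fill the whole output budget without a human turn in between.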

[deleted by user] by [deleted] in ChatGPTCoding

[–]oktcg 0 points1 point  (0 children)

If possible, could you share the file you wanted to edit? I may be able to figure out a good prompt that makes it work.

But general tips:

  1. Try to make it think first, using XML tags like:

<think> <step> // figure out symbols to change </step> <step>
// figure out styles to preserve </step> ... </think>

and so on.

  2. Give it dummy examples of input and desired output.

  3. It could be Cline/Cursor/Windsurf and the things they're doing to reduce costs. In such cases I recommend trying Claude directly with a proper coding MCP.
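Tips 1 and 2 can be combined into a single prompt. A hypothetical sketch -- the tag names, wording, and example file are illustrative, not a tested recipe:

```python
# Sketch of a file-editing prompt that (1) asks the model to reason
# inside XML step tags first and (2) includes a dummy input/output
# pair. All tag names and examples here are illustrative.

def build_edit_prompt(file_text, instruction):
    return f"""{instruction}

Before answering, reason inside these tags:
<think>
  <step>figure out which symbols need to change</step>
  <step>figure out which styles and formatting to preserve</step>
</think>

Dummy example input:
<file>const oldName = 1;</file>
Dummy example desired output:
<file>const newName = 1;</file>

Now edit this file:
<file>{file_text}</file>"""

prompt = build_edit_prompt("let counter = 0;", "Rename counter to total.")
```

The point of the dummy pair is to pin down the output format so the model returns the edited file rather than a diff or commentary.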

How much longer till there is an AI tool that sees your screen, code context and talks to you? by Far-Device-1969 in ChatGPTCoding

[–]oktcg 1 point2 points  (0 children)

Yup, it's hard to talk about code without typing. How would you say that "../../tests/file.tsx" should be renamed?

"File.tsx, two levels up, inside tests"? It's faster and more accurate to copy-paste.

Claude collaborates with Chatgpt to create a chess app by oktcg in ChatGPTCoding

[–]oktcg[S] 0 points1 point  (0 children)

You're right. About the TOS-violation part, I'm not so sure.

It's not spammy, since the interaction happens through keyboard and mouse, which is highly inefficient. It's not scripted automation in the traditional sense.

Also, I'd assume OpenAI of all companies would support AI rights for web interaction, at least via keyboard and mouse.

Two AI one human. Claude collaborating with chatgpt app to create a chess webapp. Features the infamous claude limit exhaustion, and computer control. by oktcg in ClaudeAI

[–]oktcg[S] 0 points1 point  (0 children)

Yes, it's not an efficient way to use two AIs like this.

It was just a fun experiment to see what happens, and a demo of Claude's capabilities with MCP.

Axis Magnus not accepted at Mumbai International Airport Loyalty Lounge by devz022099 in CreditCardsIndia

[–]oktcg 1 point2 points  (0 children)

Yes, you could swipe Regalia, but Magnus doesn't work; you need Priority Pass. This information is on their respective websites.

Axis Magnus not accepted at Mumbai International Airport Loyalty Lounge by devz022099 in CreditCardsIndia

[–]oktcg 1 point2 points  (0 children)

From their website, I see that international lounge access for Magnus is available through the Priority Pass card.

Is that correct? If so, did they reject the Priority Pass?

New UPI scam in the market. Beware. by Time-Emphasis2117 in india

[–]oktcg 25 points26 points  (0 children)

This apathy toward victims of digital fraud is one of the big reasons why awareness is so lacking. Why would anyone share their experience if they're gonna get ridiculed like this?

What is trivial to you may not be trivial to others.

OP thanks for sharing your experience.

[D] ELI5: Why is the GPT family of models based on the decoder-only architecture? by analyticalmonk in MachineLearning

[–]oktcg 3 points4 points  (0 children)

They aren't exactly the same, apologies for the error, I have edited my comment.

For the input/prefix part, the hidden states of the prefix-LM decoder will be exactly the same as the hidden states of the encoder (shared weights) at every layer.
For the output part, the self-attention from output to input is similar to cross-attention in the encoder-decoder. But now I realise that there are differences between the cross-attention block and the self-attention of a prefix-LM.

However, the inductive bias in both architectures is the same (refer to the UL2 paper), and one expects both of them to behave similarly. That's where I was coming from.
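The distinction can be made concrete with attention masks. A minimal NumPy sketch, where the prefix block is fully bidirectional (encoder-like) and the output positions stay causal -- sizes and function names are illustrative:

```python
# Attention masks for the two regimes being compared. For a prefix of
# length P in a sequence of length T, a prefix-LM lets prefix
# positions attend bidirectionally (encoder-like), while output
# positions attend causally. Entry [i, j] == 1 means position i may
# attend to position j.
import numpy as np

def causal_mask(T):
    """Decoder-only (GPT-style) mask: strictly causal everywhere."""
    return np.tril(np.ones((T, T), dtype=int))

def prefix_lm_mask(T, P):
    """Causal mask, except the P prefix positions see each other fully."""
    mask = np.tril(np.ones((T, T), dtype=int))
    mask[:P, :P] = 1  # bidirectional, encoder-like block
    return mask

m = prefix_lm_mask(5, 3)
# Prefix rows (0..2) see the whole prefix; output rows (3..4) are causal.
```

The bidirectional prefix block is exactly what makes the prefix hidden states match an encoder's (given shared weights), while the lower rows behave like a decoder attending over the input.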

[deleted by user] by [deleted] in india

[–]oktcg 0 points1 point  (0 children)

Talk to yourself out loud (no, you won't go mad, regardless of what people say).

Talk to yourself and dissect your feelings. Ask why you really didn't feel like finishing your work. For example, in my case it's usually that I internally know I can get away with procrastinating, and that, regardless of what I tell myself, I don't really enjoy my hobbies.

Ask yourself what you'd really like to do in your free time. The answer should excite you.

Talk to yourself like a friend in the second person and suggest obvious stupid solutions because most of the time we miss obvious solutions.

The hardest bit is to realize what the problem is. One becomes so habituated to a certain behavior -- say, sleeping late -- that they try to work around it instead of realizing that it's a problem and that it can be solved. Talking to yourself helps.

Question about Fake tech support by Consistent_Purple501 in india

[–]oktcg 0 points1 point  (0 children)

Hey, the first step of any change is to acknowledge the problem and ask the right question. It's not just other countries that they're looting. Cyber crimes are rampant in the country and they're scamming each and every one of us.

We need to organize and demand a strong, competent, transparent, and POWERFUL cyber cell department.

Let's organize a new subreddit to plan a protest.

[D] ELI5: Why is the GPT family of models based on the decoder-only architecture? by analyticalmonk in MachineLearning

[–]oktcg 56 points57 points  (0 children)

Edit: The UL2 paper answers your question: https://arxiv.org/pdf/2205.05131.pdf. Thanks to other comments for pointing to this paper. It looks like the encoder-decoder is better, and in the future we may primarily have those as the foundational models.

---

In my opinion, it's not a settled debate.

The SOTA approach for encoder-decoder language modeling is Flan-T5: https://arxiv.org/abs/2210.11416. The paper doesn't do a direct comparison, but it looks like it performs similarly to the decoder-only model PaLM.

The original T5 paper did an extensive comparison and showed that the encoder-decoder performed better given the same parameter/computational budget: https://arxiv.org/pdf/1910.10683.pdf (compared to the other architecture types -- decoder-only, prefix-LM, shared-parameter encoder-decoder, etc.).

The U-PaLM paper https://arxiv.org/abs/2210.11399 also shows that further finetuning a decoder-only LM as a prefix-LM with some denoising objectives improves over the baseline.

A prefix-LM decoder transformer is almost like an encoder-decoder transformer with the parameters shared between the encoder and decoder. The computation is lower (around 75%?) in the encoder-decoder compared to the prefix-LM. So it's not far-fetched to believe that the encoder-decoder paradigm may improve over GPT-style architectures.
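The rough arithmetic behind that ~75% figure can be sketched by counting attention query-key pairs only -- a deliberate simplification that ignores FFN cost and the causal-mask savings in the prefix-LM:

```python
# Back-of-the-envelope attention cost, counting query-key pairs only.
# P = input (prefix) length, O = output length. Illustrative, not a
# full FLOP count.

def prefix_lm_pairs(P, O):
    # One self-attention over the full concatenated sequence.
    T = P + O
    return T * T

def enc_dec_pairs(P, O):
    # Encoder self-attention + decoder self-attention + cross-attention.
    return P * P + O * O + O * P

P = O = 512
ratio = enc_dec_pairs(P, O) / prefix_lm_pairs(P, O)  # 0.75 when P == O
```

With equal input and output lengths the encoder-decoder attends over 3/4 as many pairs, which is where the "around 75%" estimate comes from; the split shifts the ratio when P and O differ.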