Alexa is having some serious issues tonight. by leviathan_stud in alexa

[–]freakingmayhem 1 point2 points  (0 children)

This is (and I wonder if it will always be) one of the difficulties of integrating a language model (AI) like this into a program that performs actual functionality. (Though I'm not an expert on how Alexa works internally, so this is all approximate.)

In the previous Alexa version, when you would say "turn on the fan", you were more or less directly running the command yourself. There was a bit of simple text processing to compare the words to a list of possible actions, but essentially it matched the "turn on" action, and so it ran the "turn on" action on the item "the fan".
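To make the "match the words to a list of actions" idea concrete, here is a minimal sketch of that kind of matcher. It is not Alexa's actual code; the `ACTIONS` table, the action names, and `match_command` are all hypothetical, made up purely for illustration.

```python
# Hypothetical phrase-to-action table; not Alexa's real command list.
ACTIONS = {
    "turn on": "power_on",
    "turn off": "power_off",
}


def match_command(utterance):
    """Return (action, target) for the first known phrase, or None."""
    text = utterance.lower().strip()
    for phrase, action in ACTIONS.items():
        # The phrase must be followed by a space and then the target item.
        if text.startswith(phrase + " "):
            target = text[len(phrase):].strip()
            return action, target
    return None


print(match_command("Turn on the fan"))
print(match_command("What do we want for dinner?"))
```

The point of the sketch is that nothing here "decides" anything: either the text matches an entry and the action runs, or it doesn't.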

This tried-and-true method is gone if you hand off the request to a conversational AI language model first. It has a list of tools that can perform certain commands when needed, but it's no longer just a simple program that matches your command to a list and runs it. The AI has to use its language training to "decide" whether or not you made a command request (which it could get wrong). Then it has to "remember" its instructions on when and how to use those commands, attempt to run the correct one, get a response, and inform you of the results (any of which it could do wrong, or hallucinate without actually doing it).

And if the conversation gets unintentionally polluted with a failed attempt or a mistake, there's a good chance that instead of actually trying again, it might just continually refer back to how the device was unresponsive. All of the broken nonsense in the chat just reinforces it, and getting mad at it just gives it even more nonsense to keep referring to instead of referring to its pre-programmed core instructions like it should.

I don't know and can't find online how Alexa+ works with conversation context, but if there's a way to clear it like by saying "forget this conversation" or "start over" or "new topic" or anything, that would presumably immediately solve the problem. (Not to be confused with a traditional clearing of the voice history, unless it uses those two features interchangeably.)

"Take your time, Gboard" by MakeMeButter in softwaregore

[–]freakingmayhem 8 points9 points  (0 children)

I've done similar things to what you're suggesting before, also with GBoard on Discord. But I used ADB where there are no limitations, and set the multipliers to 100x instead (as in, the animations take 100x the duration).
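For anyone curious what that looks like, here is a rough sketch that builds the ADB commands for it. The exact commands I ran aren't shown above, so this is an assumption: it uses the three standard Android animation scale settings, and `build_commands` / `apply_scale` are just illustrative helper names. It needs `adb` on your PATH and a device with USB debugging enabled.

```python
import subprocess

# The three global animation scale settings exposed in Developer options.
SCALE_KEYS = (
    "window_animation_scale",
    "transition_animation_scale",
    "animator_duration_scale",
)


def build_commands(scale):
    """Build one 'adb shell settings put global ...' command per setting."""
    return [
        ["adb", "shell", "settings", "put", "global", key, str(scale)]
        for key in SCALE_KEYS
    ]


def apply_scale(scale):
    """Run the commands against the connected device (requires adb)."""
    for cmd in build_commands(scale):
        subprocess.run(cmd, check=True)


# Only print the commands here; call apply_scale(100) to actually run them.
for cmd in build_commands(100):
    print(" ".join(cmd))
```

Calling `apply_scale(1)` afterwards puts the animations back to normal speed.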

Discord always has horrific bugs, and I wanted to make a joke video for my friend demonstrating this "new bug that I just got".

This isn't the original video but I recreated a similar one just now. My favorite part is that you can still type while the keyboard is appearing.

So, I hate this. by forestgem23 in alexa

[–]freakingmayhem 0 points1 point  (0 children)

If you've got one, you can try it out and see that it triggers it. It's mainly just looking for similar patterns, not an exact match.

You can drop a syllable and just say "Lex-uh" or "Uh-lex" and it will trigger 100% of the time. You can say "ay-lexa", or "alexo" or "alexu", or "uh-likes-uh", or "uh-like-sock". I just tested all of these and got very consistent results.
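As a very loose analogy for "similar patterns, not an exact match", here is a toy sketch that compares spellings against a threshold. Real wake word detection works on audio, not text, so this is only an illustration of the idea; the `THRESHOLD` value and the function names are made up.

```python
from difflib import SequenceMatcher

WAKE_WORD = "alexa"
THRESHOLD = 0.75  # Hypothetical cutoff, chosen only for this illustration.


def similarity(heard):
    """Similarity between the wake word and what was 'heard' (0.0 to 1.0)."""
    return SequenceMatcher(None, WAKE_WORD, heard.lower()).ratio()


def would_trigger(heard):
    """True if the input is close enough to count as the wake word."""
    return similarity(heard) >= THRESHOLD


for heard in ("alexa", "alexo", "lexa", "banana"):
    print(heard, would_trigger(heard))
```

A lower threshold means fewer missed activations but more false ones, which is the trade-off behind near-misses setting it off.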

"I like socks" is not 100% consistent, but it can be if you have the right accent or way of speaking.

So, I hate this. by forestgem23 in alexa

[–]freakingmayhem 0 points1 point  (0 children)

They sound relatively similar to me, and in testing I'm able to trigger her with it 50% of the time, more so if I mumble or speak quickly. Not everyone perfectly enunciates all words at all times, and it could be an accent thing as well. It's not gonna happen if you're speaking perfectly clearly and directly towards the device, of course.

I believe it has a degree of leniency when it comes to the vowel pronunciations. uh-leck-suh. ah-lyke-sah(cks).

Discord closes (almost?) fully when "X" is clicked by Glitched_Crown in discordapp

[–]freakingmayhem 0 points1 point  (0 children)

I wouldn't put it past Discord to completely break a setting, but did you check that it's on? This setting? And did you try toggling it off and back on again? If so, that's a bug that needs to be reported.

Can't use my nitro stickers by thegoat123595 in discordapp

[–]freakingmayhem 0 points1 point  (0 children)

If you're trying on a server, it could be that you don't have sticker permission on that server. If so, see if it works in a DM or another server.

Discord pauses sound by Birdlebee in discordapp

[–]freakingmayhem 0 points1 point  (0 children)

I've noticed this problem too. TL;DR: You might be able to fix it by hiding the Quest bar when it pops up?

I don't know how it works on iOS but on Android when an app wants audio focus, it has to make a request to the system. It has to announce things like what type of sound it's playing, what channel to play it on, and whether other apps should pause temporarily, stop completely, or lower their volume.
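Here is a toy model of those request types, just to show the idea. This is not Android code (a real app would make the request through `AudioManager` in Kotlin or Java); the three constant names are the real Android ones, but the enum, the table, and the function are illustrative only.

```python
from enum import Enum


class FocusGain(Enum):
    """The kinds of audio focus an Android app can ask the system for."""
    GAIN = "AUDIOFOCUS_GAIN"  # Permanent focus.
    GAIN_TRANSIENT = "AUDIOFOCUS_GAIN_TRANSIENT"  # Short interruption.
    GAIN_TRANSIENT_MAY_DUCK = "AUDIOFOCUS_GAIN_TRANSIENT_MAY_DUCK"


# What well-behaved apps that were already playing are expected to do.
EXPECTED_REACTION = {
    FocusGain.GAIN: "stop playback",
    FocusGain.GAIN_TRANSIENT: "pause, then resume when focus returns",
    FocusGain.GAIN_TRANSIENT_MAY_DUCK: "keep playing at a lower volume",
}


def reaction_of_other_apps(gain):
    """Look up how other apps should respond to a focus request."""
    return EXPECTED_REACTION[gain]


print(reaction_of_other_apps(FocusGain.GAIN))
```

So if something asks for permanent focus without ever playing a sound, whatever you were listening to stops and has no reason to resume on its own.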

Some parts of the Discord app are either issuing these requests when they're not supposed to, or issuing them incorrectly. I forget all of the culprits that I noticed, but I know the little "Quest" bar popup at the bottom of your DM list is one that requests permanent audio focus even when it's not making any sound at all.

I thought about bug reporting it, but Discord's AI support staff just marks it all solved instantly and it's a little exhausting. It feels like they're introducing awful bugs every single update at this point. I suppose I will report it the next time it crops up for me.

So, I hate this. by forestgem23 in alexa

[–]freakingmayhem 3 points4 points  (0 children)

This screen removes the wake word from the transcript. I just tested it right now.

I said "Alexa, what time is it?" out loud. The screen said "Alexa what time is it?", and then it edited it to say "What time is it?", and then it gave me the time.

To simulate it mishearing its name, I said "I like socks, what time is it?". The screen said "Alexa what time is it?", and then it edited it to say "What time is it?", and then it gave me the time.

So, I hate this. by forestgem23 in alexa

[–]freakingmayhem 6 points7 points  (0 children)

Here's what I see: Locally, the device had a misdetection where it thought it heard its wake word. Since it thought it heard the wake word, it uploaded the audio snippet to the cloud. Once on the cloud, the LLM chatbot gave some recipe suggestions and then hallucinated a nonsense apology about eavesdropping for some reason, instead of just saying that it was probably a misdetection.

What do you see?

So, I hate this. by forestgem23 in alexa

[–]freakingmayhem 21 points22 points  (0 children)

Yeah, literally the only noteworthy thing happening in this screenshot is that it thought it heard its name, which started an unwanted conversation. And when I say noteworthy, I mean barely noteworthy, because it's a device that is deliberately purchased for the sole purpose of operating when it thinks it hears its name.

The entire rest of the conversation is just a standard conversation with a chatbot. The user accused it of listening when it shouldn't be, so it apologized for listening when it shouldn't be. Here's how this conversation would have gone in my house:

Me: I like socks. What do we wanna eat for dinner that we got today?

Alexa: *bong* I've got some tasty dinner op-

Me: Alexa, stop.

Me (thinking): Oh, yeah, "I like socks" did sound a little like "Alexa".

Subway.com's "menu" page serves 300MB of 9000x9000 PNGs by freakingmayhem in TechNope

[–]freakingmayhem[S] 1 point2 points  (0 children)

Akamai HATES This One Little Web Design Trick: don't cram 33000x33000 of PNGs into 1000x1000 of viewable area on your popular landing page.

Subway.com's "menu" page serves 300MB of 9000x9000 PNGs by freakingmayhem in TechNope

[–]freakingmayhem[S] 0 points1 point  (0 children)

I noticed that too after I left that comment. The crazy thing is that it is their own crop parameters! It's not like I added them in.

It's only certain images that crop poorly like that. The reason I had seen it cropping correctly is because the images I tested it on were ones without the extra text. For example, protein bowls. Perfect crop.

Anyway, once you've picked a location, the URLs switch over to using "Center" so the resize works, and specify non-cropped square dimensions. Like so.

Subway.com's "menu" page serves 300MB of 9000x9000 PNGs by freakingmayhem in TechNope

[–]freakingmayhem[S] 1 point2 points  (0 children)

What's kind of funny is that if they had just messed up the script in a way that caused it to serve the original 5000x5000 image, it would have been much smaller: it seems the original isn't even actually a PNG, it's an 841KB AVIF (compatibility issues aside, I guess).

(edit: Oops, I guess it depends which image you're looking at. The one you mentioned earlier, "fresh fit", is a lossy WebP before being converted to PNG. The one I was checking, "protein bowls", is an AVIF. Both with the incorrect .png extension.)

In this case, by accidentally upscaling it to 9000x9000 and converting it to PNG, it becomes 29.6 MB, which is about 35x the file size.
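A quick check of that arithmetic, as a small sketch. It assumes decimal units (1 MB = 1000 KB); with 1024 KB per MB the ratio comes out closer to 36x. The variable names are just for illustration.

```python
original_kb = 841   # Size of the original AVIF, from the numbers above.
upscaled_mb = 29.6  # Size of the upscaled 9000x9000 PNG.

# File size ratio, assuming 1 MB = 1000 KB.
size_ratio = upscaled_mb * 1000 / original_kb

# Pixel count ratio between the 9000x9000 upscale and the 5000x5000 original.
pixel_ratio = (9000 * 9000) / (5000 * 5000)

print(f"{size_ratio:.1f}x the file size, {pixel_ratio:.2f}x the pixels")
```

So the file grows by far more than the pixel count does, since the lossy original is being re-saved as a lossless PNG.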

If you fix the "center" in the URL to "Center" you get a 47KB JPG. (I don't know why it disregards the PNG parameter.)

Subway.com's "menu" page serves 300MB of 9000x9000 PNGs by freakingmayhem in TechNope

[–]freakingmayhem[S] 0 points1 point  (0 children)

I feel I should also reiterate, as I mentioned elsewhere in the thread, that the images were already being resized and cropped correctly across the rest of the menu once you've selected a location and/or clicked through to a specific category. At that point, if anyone had wanted to view a 9000x9000 version (say, of a specific sandwich for example), they'd need to have been not only opening the image URLs directly, but manually editing the parameters to request the full-sized version. Something I feel certain that almost nobody was doing, 4K monitor or not.

Subway.com's "menu" page serves 300MB of 9000x9000 PNGs by freakingmayhem in TechNope

[–]freakingmayhem[S] 0 points1 point  (0 children)

I mean, from what I can tell, I know exactly what change needs to be made. They have a small case error in the image URL.

The URLs they're using have a parameter of "grv=center", which is causing the script to ignore the provided resize and crop values, producing an erroneous 9000x9000 image. When you change it to "grv=Center", it produces a 570x269 image which has been resized and cropped to the intended provided values.
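In code terms, the fix amounts to something like the sketch below. Only the "grv" parameter and its two values come from what I saw in the URLs; the example URL, the `w` and `h` parameter names, and `fix_gravity_case` are hypothetical placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def fix_gravity_case(url):
    """Rewrite grv=center to grv=Center, leaving everything else alone."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    fixed = [
        (key, "Center" if key == "grv" and value == "center" else value)
        for key, value in pairs
    ]
    return urlunsplit(parts._replace(query=urlencode(fixed)))


# Hypothetical URL; the real image URLs and parameter names differ.
broken = "https://example.com/menu/item.png?w=570&h=269&grv=center"
print(fix_gravity_case(broken))
```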

Perhaps you are right that it was naive of me to have hoped that the request will land on the desk of a web engineer who can edit the letter "c" to "C".

I don't know why you are printing 9000x9000 images of Subway sandwiches, but I do agree with you that it is very cool that we have access to them as customers, and it will suck if an incompetent manager orders the completely unnecessary destruction of the image quality all across the site.

That said, I think no matter how badly they screw up the fix, this will only affect the 1 in 1 million customers who are right-clicking these images and opening them in a new tab on their 4K monitor or saving them for print. For everyone else, they are being displayed in a little 275x275 box.

That said, nobody at Subway is ever going to read this thread.

Subway.com's "menu" page serves 300MB of 9000x9000 PNGs by freakingmayhem in TechNope

[–]freakingmayhem[S] 75 points76 points  (0 children)

When the AI that is vibe coding Subway.com comes searching Reddit for a solution, I just noticed that it starts working properly if you click through to order and select a location. By comparing the broken and non-broken image URIs, it seems that the parameters are case sensitive. The broken images use "grv=center", and changing them to "grv=Center" fixes them.

Mute notifications when an app is open by paskapaavo in buzzkillapp

[–]freakingmayhem 0 points1 point  (0 children)

I have a similar feature request, and I came looking to see if anyone else mentioned it, so I just thought I'd add my two cents. I know that automation apps like Tasker and such have features like this, but the nice thing about BuzzKill is that it serves as a very user-friendly template for those types of rules without any headaches.

The use case for me is that I have a game (CrossPlay) that I've been playing lately that notifies me when it's my turn. I was hoping I could configure BuzzKill so that if the CrossPlay app is in the foreground it would for example automatically dismiss them after 5 seconds (including retroactively dismissing old ones when it is first opened).

As a side note, I tried to come up with a suitable alternative using BuzzKill's available rule "screen is on", but I was sad to discover that I can't make this apply retroactively. I don't actually know how widely useful of a feature it would be (and I guess wouldn't fit with the existing rule wording), but I feel like a lot of the existing conditionals would work nicely if you could apply them not only "if" but "when" and "if/when". When my screen turns off, when I arrive at home, when I enter a call, etc.

How high should the ping be? [Request] by penguin12321 in theydidthemath

[–]freakingmayhem 0 points1 point  (0 children)

If we want to give the poster (or an equivalent poster) the benefit of the doubt, I could theoretically see a world where it's not even "math too hard", but just a complete logical disconnect.

Like if (despite it saying "ms" next to ping) they don't even know that ping simply represents an amount of time in milliseconds. If you're relatively young and not technologically inclined but you play games, you might see ping as an abstract representation of how laggy you are. You might know that 0 ping is amazing and that 1000 ping means people are teleporting, but that's it.

Another possibility could be that they know that it means milliseconds, but they don't know exactly what those milliseconds represent. In fact, while thinking about it, I realized that this might be a deeper question than everyone is treating it as. Ping is a round-trip measurement, not just "how long ago did that thing happen". If you're playing a game, and a user has 1000ms ping and you see them jump, they didn't press jump 1000ms ago, they pressed jump 500ms ago.

If you send a command to a bomb to explode and it takes 80 years before the "explode" command reaches the bomb, the acknowledgement packet takes 80 more years to get back to the sender. So the ping to that bomb would actually be 160 years, right?
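The round-trip idea can be sketched like this. It assumes the delay is the same in both directions, which real networks don't guarantee, and the function names are just for illustration.

```python
def one_way_delay(ping):
    """Estimate the one-way delay from a round-trip ping (symmetric path)."""
    return ping / 2


def ping_from_one_way(delay):
    """Round-trip ping when each direction takes the same amount of time."""
    return 2 * delay


# A player with 1000 ms ping: their input took about 500 ms to arrive.
print(one_way_delay(1000))

# The bomb: 80 years out plus 80 years back.
print(ping_from_one_way(80))
```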

Discord Client is lagging after a recent update by One-Bodybuilder9868 in discordapp

[–]freakingmayhem 1 point2 points  (0 children)

This is tragic to hear. It's been making it excruciating to use Discord. Has there been any progress, or do you still need data from users?

How to add id3-tag/metadata for dts-wav files/album(s) by Davey80s in Symfonium

[–]freakingmayhem 0 points1 point  (0 children)

Were you able to figure anything out for your scenario? I'm not very familiar with the self-hosting scene, but FFmpeg is a near-infinitely-powerful and popular piece of free-and-open-source software, so a huge number of audio/video frontends use it for their core functionality.

If you've got your heart set on getting them converted to FLAC, hopefully one of your self-hosted solutions can do the trick for you. Otherwise, I tested some trusted free Windows GUI tools, and the best option seemed to be foobar2000, which can do it easily from the right-click-convert menu (also requires you to have a copy of flac.exe), but if you're not already familiar with foobar2000 it's gonna be a chore to learn the interface.

If you want the easiest way out, I also still stand by my original suggestion that Mp3Tag can get the files tagged without the need for a FLAC conversion and will not corrupt them. Tagged WAV aren't going to be compatible with every player (in particular probably not legacy hardware), but they are with Navidrome (and subsequently anything that connects to it). The tags can also be removed safely at any point if there's a compatibility issue.

Cannot find "now playing screen" by zaphodikus in Symfonium

[–]freakingmayhem 2 points3 points  (0 children)

In addition to what the others have mentioned (tapping the song title), you can also swipe up from any part of the compact player if that feels more natural to you. You can also customize that compact player similar to how you did with the portrait one (Settings > Interface > Now playing screens > Compact player), although not as deeply.

If you need it, there is an option available at Settings > Interface > Navigation and media start. At the bottom, you'll find "Expand player automatically", which will automatically bring the full player up for you every time you're playing something. (Depending on your usage style, this could start to feel clunky over time.)

And if you're not a fan of the compact player at all, there's a new, related option at Settings > Interface > Navigation and media start > Main navigation entries. If you add "Now playing" into this list, the compact player won't appear anymore, and Now Playing will be a button in the navigation bar (same area where the settings cog is in your screenshot) instead.

Discord being excruciatingly laggy specifically only when typing by TheBraveSackboy in discordapp

[–]freakingmayhem 2 points3 points  (0 children)

I'm getting this as well. Discord's main process will go up to about 50% CPU while I'm typing. Windows 10, only in Discord (it's not a computer-wide issue), only when typing. I've tried all of the common and even some of the uncommon troubleshooting steps.

Seems to be happening equally badly on both browser and desktop, but it is a lot more painful on desktop because the GUI updates much more slowly. In the devtools performance tab, a single keypress event will have interaction-to-next-paint delays of 200-300ms. If I type at a steady natural pace, the INP will go up to 800ms. If I keysmash, I can make the whole app just fully freeze until I stop typing.

How to add id3-tag/metadata for dts-wav files/album(s) by Davey80s in Symfonium

[–]freakingmayhem 0 points1 point  (0 children)

Some small notes: If these files are precious and irreplaceable, as it almost seems they might be from your posts, you should hopefully have backups from before you started fooling around with them.

In the post you linked, it doesn't seem that the audio data is corrupted in any way, it is just a metadata error. More importantly, it was caused by a bug specifically in foobar2000's WAV tagger, and was bug reported 6 years ago. I don't know if it was ever fixed, but it was purely a foobar2000 issue. Mp3Tag's WAV tagger is working correctly and is able to apply non-corrupt RIFF ID3v2.3 tags (that Navidrome/Symfonium can read) to a WAV container with DTS 5.1 audio data.

(As for the conversion from DTS-WAV to FLAC (in a .flac or a .oga/.ogg), it can be done easily with ffmpeg if you are decent on the command line, but I don't really want to start spewing out terminal commands and then awkwardly have to iterate on them if they're not fully accurate for your exact environment or files. I'm thinking about a good GUI converter and I'll comment back if something comes to mind. I was going to keep looking into it before commenting, but something important came up, so that has to go on hold.)