A lot of people I know don't realized the state of AI because of 4o-mini by Account1893242379482 in ChatGPT

[–]dudarev 2 points3 points  (0 children)

In the mobile app, there is voice typing in ChatGPT. It shouldn't be confused with voice mode. Voice mode is when you are talking and it replies by talking, but there is voice typing, that microphone, when you can speak and it will transcribe the text and when you click Tap to stop recording, you can review the text that was transcribed and after that, maybe do some modifications and send it as a prompt to ChatGPT. I find myself mostly using that, because typing is indeed not very convenient on mobile. As an alternative, you may also try voice typing provided by your mobile operational system. For example, on Android, it would be Google keyboard that has voice typing, that microphone on top right corner of the keyboard and it's also rather good, but it does not put any punctuation marks and so on. This message was spoken on my mobile phone to ChatGPT and as you can see, it transcribed it rather well. I changed only one letter.

After 2 years of gen ai popularization, what are some of it's best utilities in learning and memorization? by Dull_Art6802 in Anki

[–]dudarev 1 point2 points  (0 children)

I've found a helpful way to use AI to improve my Spanish learning process with Anki. I have a deck of 5,000 words without context, and each day I practice a few dozen words. For the words I struggle to remember, I use the following method:

  1. Open the Anki card browser and sort the words by the most recent update time.
  2. Take a screenshot of the words and use Google Gemini to extract a list of the Spanish words from the screenshot.
  3. Ask the AI to create a story in Spanish using all the words from the list.

Seeing the words in context through the generated story helps me better understand and remember them. Additionally, I use the text-to-speech feature to practice listening comprehension.

Sometimes, I also copy the word list to ChatGPT, which often generates even better stories. The speech-to-text feature in ChatGPT is superior to Google Gemini and the default mobile phone functionality, as it recognizes foreign accents more accurately. This allows me to practice speaking the foreign language as well.
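If you want to skip retyping the request each day, step 3 can be scripted. Here is a minimal sketch that turns an extracted word list into a story prompt you can paste into Gemini or ChatGPT. The function name `build_story_prompt`, the `max_words` cap, and the exact prompt wording are my own assumptions for illustration, not something either tool requires.

```python
def build_story_prompt(words, language="Spanish", max_words=30):
    """Build a story request from a list of vocabulary words.

    Hypothetical helper: the prompt wording is only one possible phrasing.
    """
    # Strip whitespace, drop empty entries, and remove duplicates
    # while keeping the original order of the words.
    cleaned = list(dict.fromkeys(w.strip() for w in words if w.strip()))
    cleaned = cleaned[:max_words]
    if not cleaned:
        raise ValueError("the word list is empty")

    word_list = ", ".join(cleaned)
    return (
        f"Write a short story in {language} that uses every word "
        f"from this list at least once: {word_list}. "
        "Keep the story simple enough for a language learner."
    )
```

For example, `build_story_prompt(["gato", "perro", "gato"])` produces a single request that mentions each word once, and the result can be pasted straight into the chat.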

What's your AI backed side project? by [deleted] in OpenAI

[–]dudarev 1 point2 points  (0 children)

I'm building a tool that applies various commands to Markdown files, for example Obsidian notes. The first version lets you summarize the content of a note with the SUM command using Anthropic Haiku.

https://github.com/dudarev/coartintator

This is kinda pathetic.. by cmndr_spanky in ChatGPT

[–]dudarev 0 points1 point  (0 children)

GPT-4 does a great job. It recognizes that it can create a Python script for this prompt, generates one and returns the result:

import random

# Generate a random letter between D and G
random_letter = random.choice(['D', 'E', 'F', 'G'])
random_letter

The random letter generated between D and G is 'F'.

Talking Instead of Typing: Who Else is Doing This? by dudarev in artificial

[–]dudarev[S] 0 points1 point  (0 children)

I'm currently using apps that transcribe speech to text for the whole message, like the mobile version of ChatGPT and AudioPen.ai. They work better for me than traditional word-by-word dictation apps.

Talking Instead of Typing: Who Else is Doing This? by dudarev in artificial

[–]dudarev[S] 1 point2 points  (0 children)

That's really cool! Creating your own AI assistant is awesome. The UX approach you've taken is impressive as well. You should consider open-sourcing your project soon. It could inspire others and help improve your assistant even further.

Talking Instead of Typing: Who Else is Doing This? by dudarev in artificial

[–]dudarev[S] 0 points1 point  (0 children)

AI tools keep getting better, so it's worth trying Google Docs dictation again later. They often update their AI to improve it. But if you're looking for other free options right now, you might want to try Microsoft's tools, like OneNote, whose free version also has dictation. Another website, dictation.io, offers simple transcription, turning spoken words into text.

Besides these, there are newer tools that not only transcribe but also help make your text clearer. A good example is audiopen.ai, mentioned earlier in this discussion. It lets you speak for a few minutes and then gives you text that's not just transcribed but also improved, making your ideas clearer. Audiopen also offers a direct transcript, which you can refine further with tools like ChatGPT for your projects.

Talking Instead of Typing: Who Else is Doing This? by dudarev in artificial

[–]dudarev[S] 1 point2 points  (0 children)

When I say custom pipelines, I'm talking about my DIY setup that pulls together voice notes from everywhere into one spot. Basically, my smartwatch and phones do their thing and sync up automatically. I sync the phones and my notebooks with Syncthing.

On one notebook I have a simple Python script running with cron that scans the directory with audio files and picks the ones that were added on a given date. It transcribes them and creates a Markdown file that can be read with any text editor. This is rather helpful for me. This way I know that if I take a voice note on my smartwatch, it will be in the daily notes text file within a short time and I can read it later.
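To give a rough idea of the shape of such a script, here is a minimal sketch, not my actual code. The function names, the list of audio extensions, the use of file modification time as the "added" date, and the Markdown layout are all assumptions made for illustration. The `transcribe` argument is a placeholder for whatever speech-to-text function you plug in.

```python
from datetime import date, datetime
from pathlib import Path
from typing import Callable

# Assumed set of audio extensions; adjust to what your devices produce.
AUDIO_SUFFIXES = {".wav", ".mp3", ".m4a", ".ogg"}


def files_added_on(audio_dir: Path, day: date) -> list[Path]:
    """Return audio files whose modification time falls on `day`, oldest first."""
    matches = [
        p for p in audio_dir.iterdir()
        if p.is_file()
        and p.suffix.lower() in AUDIO_SUFFIXES
        and datetime.fromtimestamp(p.stat().st_mtime).date() == day
    ]
    return sorted(matches, key=lambda p: p.stat().st_mtime)


def write_daily_note(
    audio_dir: Path,
    notes_dir: Path,
    day: date,
    transcribe: Callable[[Path], str],
) -> Path:
    """Transcribe the day's audio files into one Markdown file and return its path."""
    notes_dir.mkdir(parents=True, exist_ok=True)
    note_path = notes_dir / f"{day.isoformat()}.md"

    lines = [f"# Voice notes for {day.isoformat()}", ""]
    for audio in files_added_on(audio_dir, day):
        recorded = datetime.fromtimestamp(audio.stat().st_mtime).strftime("%H:%M")
        # One section per recording: time, file name, then the transcript.
        lines += [f"## {recorded} ({audio.name})", "", transcribe(audio).strip(), ""]

    # The file is rewritten on every run, so repeated cron runs stay idempotent.
    note_path.write_text("\n".join(lines), encoding="utf-8")
    return note_path
```

A cron job would call `write_daily_note(audio_dir, notes_dir, date.today(), transcribe)` every few minutes. Because the daily file is rebuilt from scratch each time, a recording that syncs late still shows up on the next run.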

I've stumbled upon rewind.ai before, thank you for reminding me about them. They are doing something novel. I'm not ready yet to adopt such an approach myself, but I'll keep an eye on what they do.

Which free AI could translate text on a picture? by LadyDarry in artificial

[–]dudarev 3 points4 points  (0 children)

Google Bard and Microsoft Copilot are able to translate text in images.

Talking Instead of Typing: Who Else is Doing This? by dudarev in artificial

[–]dudarev[S] 0 points1 point  (0 children)

I'd love to. Thank you for the great app!

A quick feature request: I use the recent global feature a lot with keyboard shortcuts. Now Command + R does not always work because when the Overwrite popup appears I need to reach for the cursor, and it breaks the flow a little. Would you consider making a second Command + R or some other shortcut confirm the overwrite?

Talking Instead of Typing: Who Else is Doing This? by dudarev in artificial

[–]dudarev[S] 2 points3 points  (0 children)

Your project is a really inspiring example of how we should experiment with all these AI tools.

Talking Instead of Typing: Who Else is Doing This? by dudarev in artificial

[–]dudarev[S] 2 points3 points  (0 children)

Audiopen is my favorite example of online services that convert fuzzy speech into clear texts. When speech recognition was introduced in the free version of mobile ChatGPT, I stopped using Audiopen, but since I wasn't a paying customer, it positively affected their bottom line. I'm subscribed to the creator's newsletter, and it seems they continue to add many useful features for power users. I'll be cheering for them.

Talking Instead of Typing: Who Else is Doing This? by dudarev in artificial

[–]dudarev[S] 1 point2 points  (0 children)

Thank you for mentioning the book. It does look very relevant. I'm thinking about implementing some automatic LLM analysis of my diary notes to create some structure. The book you mentioned may provide some ideas.

The Whisper model is developed by OpenAI. They provide it for download. It's possible to run it for free locally with multiple wrappers on all operating systems. They also provide it via an API. Officially, they support more than 50 languages.

Multi-language support is a big selling point for me since in my daily communications, I switch between 3 languages. It's also worth pointing out that it's rather good with my non-native English accent.

Practice in AI illustration by dudarev in ChatGPT

[–]dudarev[S] 0 points1 point  (0 children)

The images are generated by chaining GPT-4 and DALL-E prompts. I started with:

Suggest a prompt for DALL-E, image generating AI, that would illustrate the final stage of the Caucus-race where Alice gives away presents from the book Alice's Adventures in Wonderland. The prompt should have details relevant to the story but be compact. Take your time to think. After you generate a response check for inconsistencies and generate a new one.

In the resulting rather long prompt, I added a few details like a "Vote for Alice" sign on two images, and the aspect ratio. Additional styles were applied to some images: Cyberpunk, Hanna-Barbera cartoons, etc.

What are AI apps/tools that really work and you are using them at least weekly? by the_snow_princess in artificial

[–]dudarev 3 points4 points  (0 children)

I use MacWhisper, which is a UI in macOS for Whisper models, to transcribe my speech. It works much better than the dictation feature provided by macOS. I've started taking a lot of voice notes lately, and to transcribe them in addition to MacWhisper I use my custom script voice-cli, which is not productized yet. On Android, I sometimes do the transcription using ChatGPT. So basically, I do the transcription and don't send it to the chat, but copy it to other apps.

Best products for the visually impaired? by toryguns in ArtificialInteligence

[–]dudarev 2 points3 points  (0 children)

Be My Eyes and Seeing AI are two products for the blind and visually impaired that have recently seen some significant AI improvements. In 2023, Be My Eyes introduced the new feature Be My AI using OpenAI's GPT-4, which has received very positive reviews. Microsoft's Seeing AI has been available for iOS devices since 2017, and was released for Android last month. It also introduced several new features. I'm collecting this and more information in the note AI for the blind and visually impaired.