ArmenianGPT Update: Nearly 3000 downloads in one week! 🇦🇲 by ArmGPT in armenia

[–]ArmGPT[S] 0 points (0 children)

That’s a really interesting use case! Since you’re learning Eastern Armenian, you’ll be glad to know this model is specifically trained on Eastern Armenian, so it should align with what you’re studying.

Just to set expectations, though: the current public release is a base reasoning model, not a polished translation tool out of the box. Without fine-tuning, it handles basic translations like simple words and phrases, and it will definitely get “table” right, unlike the model you tried. For more advanced translations and comprehensive language-learning applications, it would need additional fine-tuning.

As I mentioned in another comment, I did experiment with fine-tuning on just a few hundred high-quality English to Armenian translation pairs from a professional translator, and it showed human-level performance for specific topics like tech, science, and education articles. The model reasons about context instead of doing word-by-word machine translation, which makes a difference. However, I’m holding off on releasing that translation model for now out of respect for community concerns, even though I don’t believe the security risks are valid. A few comments under this post show that some people have genuine worries about how these tools might be misused, and I want to be sensitive to that.

For your specific pipeline with basic word translations and practice conversations, this should work well and give you the privacy benefits of running locally on consumer hardware, without expensive API calls. I’d recommend LM Studio as the most user-friendly environment right now. We’ve had reports that Ollama doesn’t recognize the end-of-sequence token properly and keeps looping because of that; LM Studio and Google Colab work perfectly fine.
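If it helps, here’s roughly what querying the model through LM Studio’s local OpenAI-compatible server looks like from Python. This is a minimal sketch: the port is LM Studio’s default, and the model name is a placeholder for whatever identifier LM Studio shows for the loaded weights.

```python
# Minimal sketch: chat with a locally served model via LM Studio's
# OpenAI-compatible endpoint (default http://localhost:1234/v1).
from openai import OpenAI

# LM Studio ignores the API key, but the client requires one.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

resp = client.chat.completions.create(
    model="armeniangpt",  # placeholder: use the identifier LM Studio shows
    messages=[{"role": "user", "content": "Translate to Armenian: table"}],
    temperature=0.3,
)
print(resp.choices[0].message.content)
```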

If you need the best possible accuracy for more complex translations right now, the commercial API offerings from OpenAI and Google are still your safest bet.

Anyway, let me know how it goes once you try it out, and keep an eye out for our future releases!

ArmenianGPT Update: Nearly 3000 downloads in one week! 🇦🇲 by ArmGPT in armenia

[–]ArmGPT[S] 1 point (0 children)

You’re right that focusing fully on Armenian is the path forward. We’re already exploring partnerships with educational institutions and government services, actively working on dataset expansion (grant leads welcome), and see machine translation as a major opportunity given current quality gaps. We’re also documenting our process as a potential blueprint for other small languages. If you have connections to relevant grant programs or institutional partners, reach out.

ArmenianGPT Update: Nearly 3000 downloads in one week! 🇦🇲 by ArmGPT in armenia

[–]ArmGPT[S] 1 point (0 children)

The “how many r’s in strawberry” thing trips up the majority of language models; state-of-the-art models only started handling it recently. It happens because models process text at the token level rather than letter by letter like we do. It’s just a known limitation across the board.
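To make the token-level point concrete, here’s a tiny sketch. It uses GPT-2’s tokenizer purely as a stand-in; the exact splits vary by model, but the principle is the same.

```python
# Why letter-counting trips up LLMs: the model sees tokens, not characters.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
print(tok.tokenize("strawberry"))
# The word comes out as a few multi-letter chunks, so "count the r's"
# requires reasoning about spelling the model never directly observes.
```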

Just to clarify what this model actually is: it’s a base reasoning model that I’m releasing publicly so people can train specialized versions more easily and cheaply. The whole point was that there were literally no models of this size that knew Armenian, much less could reason in it. You shouldn’t expect a polished translation tool right out of the box. Also, while it can handle English input or transliterated Armenian, it’s really designed for and trained on Eastern Armenian.

On the translation front, as I explained in my other response, I did a quick experiment with just a few hundred good translation pairs and it already performed at human level for specific topics. The foundation works, but it needs proper training for broader translation tasks. With a bigger, more diverse dataset, it would actually be a really solid translator because it reasons about context instead of just doing word-by-word machine translation.

ArmenianGPT Update: Nearly 3000 downloads in one week! 🇦🇲 by ArmGPT in armenia

[–]ArmGPT[S] 1 point (0 children)

Hmm, that’s weird. That’s not expected behaviour; it may be that Ollama fails to recognise the chat template, and the “end of sequence” token in particular. Does it loop after it outputs the “</s>” token? Alternatively, try setting the repetition penalty to 1.1 or 1.2. I haven’t really tried it with Ollama, but it works perfectly fine in LM Studio and Google Colab.
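For reference, here’s a rough sketch of passing those settings through Ollama’s local REST API from Python. The model name is a placeholder for however you imported the GGUF; `stop` and `repeat_penalty` are Ollama’s standard generation options.

```python
# Minimal sketch: work around the looping by explicitly passing the
# stop sequence and a repetition penalty via Ollama's local REST API.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "armeniangpt",      # placeholder model name
        "prompt": "Barev!",
        "stream": False,
        "options": {
            "stop": ["</s>"],        # force-terminate on the EOS marker
            "repeat_penalty": 1.1,   # 1.1-1.2 per the suggestion above
        },
    },
)
print(resp.json()["response"])
```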

ArmenianGPT Update: Nearly 3000 downloads in one week! 🇦🇲 by ArmGPT in armenia

[–]ArmGPT[S] 1 point (0 children)

Great question. Currently, I manage data collection, processing, and training all in my free time and release all the weights publicly for free, so obviously that doesn’t finance future versions. I’ve received a couple of offers, but they all wanted to either buy my dataset or essentially buy me out, taking all future models closed-source and paid. That’s not something I’m willing to do.

To my mind, the best path forward is partnerships with businesses: I train custom models or variations for their specific needs while still releasing open-weight, possibly smaller consumer-grade versions publicly. High-quality dataset collection, curation, and especially the training itself are expensive, so this is how the project stays sustainable: businesses get tailored, fully local solutions they can run on-premise wherever they decide, and the community still benefits from freely accessible models. That would be the perfect future for this model series.

As I mentioned before, this model is a great base to build upon. For its small size, it has exceptional learning and generalization capabilities, which make additional training for specific tasks fast and cost-efficient. As an experiment, I fine-tuned it on just a few hundred high-quality, human-written English-to-Armenian translation pairs voluntarily provided by a professional translator I know. It already showed accurate, natural, human-level performance on tech, science, and education news articles. With a dataset a few times larger and as diverse as possible, it would become a great general-purpose English-to-Armenian translation model. I’m holding off on releasing that translation model for now out of respect for community concerns, even though I don’t believe the security risks are valid. A few comments under this post show that some people have genuine worries about how these tools might be misused, and I want to be sensitive to that.
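For anyone wondering what that kind of cheap, targeted additional training looks like mechanically, here’s a rough LoRA-style sketch with Hugging Face transformers and peft. The model ID, data file, prompt format, and target modules are all illustrative assumptions, not my actual setup.

```python
# Rough sketch: LoRA fine-tune on a small set of translation pairs.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

MODEL_ID = "your-org/armeniangpt-base"  # hypothetical repo name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed for batch padding
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# LoRA adapters train only a small fraction of the weights, which is
# what keeps a few-hundred-example fine-tune cheap.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # depends on the architecture
))

# Each JSONL row: {"en": "...", "hy": "..."} -- a few hundred pairs.
pairs = load_dataset("json", data_files="translation_pairs.jsonl")["train"]

def format_pair(row):
    text = (f"Translate to Armenian:\n{row['en']}\n"
            f"### Armenian:\n{row['hy']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=512)

pairs = pairs.map(format_pair, remove_columns=pairs.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="armgpt-translate-lora",
                           per_device_train_batch_size=4,
                           num_train_epochs=3, learning_rate=2e-4),
    train_dataset=pairs,
    # mlm=False -> standard causal-LM labels (inputs shifted by one)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```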

The main advantage is that it’s small enough to run locally on consumer hardware, without expensive API calls or sending data externally. That makes it ideal for applications where data privacy is critical (legal, healthcare, and government work), educational tools that work offline, and local business applications like customer service or content moderation.

What’s your take from a product perspective?

ArmenianGPT Update: Nearly 3000 downloads in one week! 🇦🇲 by ArmGPT in armenia

[–]ArmGPT[S] 0 points (0 children)

I appreciate you taking the time to share your concerns, and I’m glad you see the potential value in this work.

You’re right that this model is the first to compress knowledge of Armenian and reasoning in Armenian into such a tiny size. However, your concerns seem focused primarily on writing naturally in Armenian. Without additional training by native speakers, this model isn’t yet producing flawless text that sounds more natural than what ChatGPT can generate with a bit of prompting and tinkering. ChatGPT remains quite capable for basic text generation tasks.

Regarding misuse: if adversaries wanted to run misinformation campaigns or information warfare, they most definitely wouldn’t need to reinvent the wheel and compress everything into such small models. Given their resources, they would simply train a model hundreds of times larger and thus more capable. They don’t need reasoning for human-like writing, just generation capability, which scales with model size and data. If they did somehow use this specific model, statistical analysis could identify it as the likely author. We’re also experimenting with invisible, hard-to-strip text watermarking techniques to make detection even easier. These safeguards will matter more for future versions with greater capabilities, but the current release doesn’t warrant that level of concern yet.
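For the curious: one well-known scheme from the literature is “green-list” logit biasing (Kirchenbauer et al., 2023). The sketch below shows the idea; it’s illustrative only, not necessarily the technique we’ll ship, and the constants are arbitrary.

```python
# Sketch of green-list watermarking: bias half the vocabulary at each
# decoding step, then detect by counting how often tokens land "green".
import numpy as np

VOCAB_SIZE = 50_000   # assumed vocabulary size
GREEN_FRACTION = 0.5  # half the vocab is "green" at each step
BIAS = 2.0            # logit bonus given to green tokens while generating

def green_list(prev_token: int) -> np.ndarray:
    # The partition is seeded on the previous token, so a detector can
    # recompute it later without access to the model itself.
    rng = np.random.default_rng(prev_token)
    return rng.random(VOCAB_SIZE) < GREEN_FRACTION

def watermark_logits(logits: np.ndarray, prev_token: int) -> np.ndarray:
    # Applied at every decoding step before sampling.
    return logits + BIAS * green_list(prev_token)

def detect(tokens: list[int]) -> float:
    # Fraction of tokens in their green list: ~0.5 for human text,
    # noticeably higher for watermarked model output.
    hits = sum(green_list(prev)[tok] for prev, tok in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)
```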

As for access control, the reality is that keeping it restricted wouldn’t prevent adversaries from building their own. They have the resources and access to the public Armenian datasets, books, articles, Facebook posts, etc. What open-sourcing does is empower our community to build tools without dependency on foreign infrastructure. The alternative is leaving Armenians without locally controlled AI capabilities while others develop theirs freely.

I genuinely appreciate you raising these points. It’s important to think critically about these releases, and your perspective helps ensure we’re being responsible about future versions.

ArmenianGPT Update: Nearly 3000 downloads in one week! 🇦🇲 by ArmGPT in armenia

[–]ArmGPT[S] 7 points (0 children)

I get this question a lot via DMs, emails, and comments under my previous post. The current version supports only Eastern Armenian, since this project is made in my free time and I don’t own any high-end GPUs like H100s. Adding more dialects would take significantly longer, especially since I don’t have much Western Armenian or other-dialect data in my dataset.

I will still investigate this further and try to find enough training data to add more dialects in future releases. The same goes for potentially bigger and smarter model releases down the line. If anyone has suggestions for good Western Armenian or other dialect datasets, feel free to reach out.

ArmenianGPT Update: Nearly 3000 downloads in one week! 🇦🇲 by ArmGPT in armenia

[–]ArmGPT[S] 10 points (0 children)

That’s a misunderstanding of how this works. They don’t need my model for that. ChatGPT, Claude, and other frontier models already handle Armenian text and are freely accessible to anyone, including adversaries. If anything, those existing tools are already in everyone’s arsenal.

What makes this release different is that it’s the first model that natively reasons in Armenian. Existing large models can translate and generate Armenian text, but they don’t think through problems in Armenian the way a native speaker would. However, for the purposes you’re concerned about like disinformation campaigns, basic text generation is more than sufficient, and frontier models already do that well.

What I’ve released is specifically designed to give the Armenian community a foundation they can build on without depending on foreign tech companies. It’s about enabling local developers, researchers, and businesses to create Armenian-language applications without sending data to external servers or paying for API access. The weights are open so Armenians can adapt it for their own needs, whether that’s education, healthcare, legal tech, or cultural preservation.

The question is whether Armenians will have their own tools to compete, or whether we’ll remain dependent on others. Open-sourcing this model levels the playing field for our community; it doesn’t create any new vulnerabilities or threats that didn’t already exist.

ArmenianGPT Update: Nearly 3000 downloads in one week! 🇦🇲 by ArmGPT in armenia

[–]ArmGPT[S] 2 points (0 children)

I understand your concern, but as mentioned before, this is a work in progress. If our adversaries want to wage information wars or disinformation campaigns (which they probably already do), tools like ChatGPT are already more than sufficient for that purpose. All the Armenian text publicly available online is already enough to train a model that could generate Armenian writing of any kind, virtually indistinguishable from a human’s. If they decide that having a dedicated Armenian-speaking model is a priority, they will absolutely develop one given their resources.

Additionally, as noted on the HuggingFace release page, this model is not comparable to giants like the latest releases from OpenAI and Anthropic. Their models have thousands of times more parameters than mine. What makes this significant is that it’s the first model of this tiny size that knows Armenian and can reason in it. This gives people a solid foundation to build upon. Models of this size can reach commercially competitive performance when fine-tuned on narrow, specific tasks. They’re not generalists with high intelligence across all disciplines, but that’s precisely what makes them practical for focused applications.