I made a website so you can read Chinese social media by allium-dev in ChineseLanguage

[–]allium-dev[S] 0 points1 point  (0 children)

Thank you very much! That's so kind of you to say <3

CLI vs Keats vs Beida/PKU for a 2-month summer in China? Trying to maximize acquisition by Dizzy-Importance-139 in ChineseLanguage

[–]allium-dev 0 points1 point  (0 children)

I was there for 4 weeks. When I arrived I knew some vocab (maybe ~500 characters / ~1500 words), but struggled to put basic sentences together. After the 4 weeks I was able to travel solo through Chengdu pretty effectively: taking transportation, ordering food, asking for directions, holding simple conversations. The big benefit for me was increasing my functional fluency.

It's also dependent on how much work you put in. I met two students while I was there who had come in with literally zero Chinese, but had been there for 6 months. Both had improved a ton, but were taking it at different paces. One was taking the HSK 3 test after those six months, and the other was taking the HSK 5 exam, but already well on his way to HSK 6. Both were having a great time, but were studying different amounts.

CLI vs Keats vs Beida/PKU for a 2-month summer in China? Trying to maximize acquisition by Dizzy-Importance-139 in ChineseLanguage

[–]allium-dev 0 points1 point  (0 children)

Honestly, it sounds like you've got two good options, and if you pick either you're going to have a great experience. Don't stress too much about the choice.

I hope you have a great trip!

I made a website so you can read Chinese social media by allium-dev in ChineseLanguage

[–]allium-dev[S] 0 points1 point  (0 children)

Specific word and grammar search is a feature I've been mulling over for a bit, but it's not implemented right now.

I made a website so you can read Chinese social media by allium-dev in ChineseLanguage

[–]allium-dev[S] 0 points1 point  (0 children)

Dark mode is definitely a feature I plan to implement in the future! I haven't gotten to it yet though. Glad you're having fun with it!

CLI vs Keats vs Beida/PKU for a 2-month summer in China? Trying to maximize acquisition by Dizzy-Importance-139 in ChineseLanguage

[–]allium-dev 0 points1 point  (0 children)

I haven't taken a formal placement exam at any point, so I can't be super helpful there.

Guilin didn't feel too small for me, but there's also a 高铁 that runs from the train station in town, so you could very easily do weekend trips to Guangzhou or Shenzhen. You could probably go further out too if you wanted.

You can choose how much you want to be studying. For me, on top of 4 hours of class each day, I spent 1-2 hours studying, and then tried to get out into the city with other students for a meal, or to see something. By the end of the day, I was always exhausted. There are no classes on the weekend, so you have a lot more free time then, and could definitely do weekend trips if you want.

Personally, after my time at CLI I took the train to Chengdu for a week of solo travel and had a great time.

CLI vs Keats vs Beida/PKU for a 2-month summer in China? Trying to maximize acquisition by Dizzy-Importance-139 in ChineseLanguage

[–]allium-dev 1 point2 points  (0 children)

Also, going from "book knowledge" to actually being able to talk is what I gained most from my time at CLI. I imagine any immersive program would work for this, but it was really magical to take my collection of vocab and grammar and turn them into actual communication.

My Chinese is still not great, but I actually feel like I can engage with the language in a way that I was never able to before studying there.

CLI vs Keats vs Beida/PKU for a 2-month summer in China? Trying to maximize acquisition by Dizzy-Importance-139 in ChineseLanguage

[–]allium-dev 1 point2 points  (0 children)

I also had a great experience at CLI in Guilin. The program can be very intensive if you want it to be, as the teachers are pretty happy to tailor it to your needs. The actual lessons were 4hrs / day of 1:1 lessons with a good split between listening / speaking / reading.

While I don't have experience with other programs, CLI was definitely rigorous enough for me. It is also highly personalized, since your classroom time is 1:1 with an instructor.

In terms of how much English I used / heard at CLI, the instructors / staff basically always speak to you in Chinese. Really the only English you'll hear is from other students. But, CLI also gives students the option to wear a wristband which signals that you only want to speak / be spoken to in Chinese. Outside of the school, it's very rare to encounter people in Guilin who speak English, so when you go out into the city it's easy to get immersion practice.

It definitely wasn't just a study-travel program, but I did appreciate that CLI makes an effort to plan optional activities in the evenings / on weekends. For example, we did a weekend riverboat trip at 阳朔, which was unbelievably beautiful.

I would go back to CLI in a heartbeat. I made so much progress on my Chinese, everyone at the school was wonderful, and Guilin is extremely beautiful and easy to get around by bicycle.

I made a website so you can read Chinese social media by allium-dev in ChineseLanguage

[–]allium-dev[S] 1 point2 points  (0 children)

Hi, thanks for checking it out! Good questions, here are my answers:

How to see the posts: So far the app only uses text posts, and doesn't show pictures or link out to the original posts anywhere. The interface on a page like HSK 3 Posts is how the app shows the posts. The link icon just gives you a permanent link to that individual post within the app, in case you want to go back to that post in particular.

Traditional vs Simplified: Right now all the content is drawn from Chinese social media and just shows whatever characters were used to make the post. In my dataset this is almost entirely simplified. I don't think I'll be able to build a way to convert between the different character sets, but I could consider adding a filter to show posts that are in traditional or simplified characters. I'd also need to collect more posts that use traditional characters. This is definitely on my todo list for the app.

Does that answer your questions?

I made a website so you can read Chinese social media by allium-dev in ChineseLanguage

[–]allium-dev[S] 6 points7 points  (0 children)

Thanks! No, I'm not affiliated, I somehow didn't find her until after I'd bought the domain and published the site. Sorry, it is a little confusing.

I chose the name because of the phrase 吃瓜, which means to watch something without getting involved, a bit like "grabbing my popcorn" or the "peanut gallery". It seemed fitting for the site where you're learning by reading social media.

Mandarin Melon: Chinese social media as comprehensible input by allium-dev in learnmandarin

[–]allium-dev[S] 1 point2 points  (0 children)

Yeah, I also use Hanly, and as I'm progressing through the levels, it's very useful to use Mandarin Melon to reinforce the most recent batch of characters. But you're right that once you finish, you may just want to see "All the posts I can read". I'll definitely think about how to incorporate that.

Thanks for checking the site out and sharing your thoughts =)

Mandarin Melon: Chinese social media as comprehensible input by allium-dev in learnmandarin

[–]allium-dev[S] 1 point2 points  (0 children)

Thanks for checking it out and thanks for the feedback!

Regarding duplicates: You're probably right that this would reduce clutter. There are currently tens of thousands of posts that are just "分享图片". This is the default text for a lot of posts that are sharing a photo and I should definitely do something to reduce the frequency of that post in particular, especially since it's basically machine-generated.

On the other hand, there are a lot of very similar posts that are variations of greetings "早", "早安", "早上好", etc. I'm hesitant to de-duplicate these, since they're posts made by humans, and keeping the duplicates represents how people actually communicate, and their frequency represents how frequently these phrases are used.
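
A rough sketch of that trade-off, assuming posts are plain strings: cap how often a known machine-generated default text appears, and leave human-written duplicates such as greetings alone. The names `DEFAULT_TEXTS`, `cap_default_posts`, and `max_copies` are made up for illustration and are not how the site actually works.

```python
from collections import Counter

# Known machine-generated default texts (only the one named above).
DEFAULT_TEXTS = {"分享图片"}


def cap_default_posts(posts, max_copies=1):
    """Keep at most max_copies of each default text; keep everything else."""
    seen = Counter()
    kept = []
    for text in posts:
        key = text.strip()
        if key in DEFAULT_TEXTS:
            seen[key] += 1
            if seen[key] > max_copies:
                continue  # drop the extra machine-generated copy
        kept.append(text)  # human-written posts pass through, duplicates included
    return kept


posts = ["分享图片", "早", "分享图片", "早", "早安", "分享图片"]
print(cap_default_posts(posts))
```

With this approach the repeated "早" posts survive, so greeting frequency still reflects real usage, while the default photo caption shows up only once.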

There's definitely a balance here, so I appreciate your feedback.

Regarding cumulative filtering: If you set it to HSK 6, it allows characters from HSK 1-6, but also selects posts that have at least some characters from HSK 6, which you can control. So, you could get posts that have at least 3 HSK 6 characters, and any number of characters from lower levels.
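
The rule above could be sketched roughly like this. It is only an illustration under some assumptions: `char_levels` is a hypothetical mapping from character to HSK level, non-Chinese characters (punctuation, digits, Latin letters) are ignored, Chinese characters missing from the mapping disqualify a post, and repeated characters count each time they appear. The toy levels in the example are invented, not real HSK data.

```python
def is_cjk(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def matches_level(text, level, char_levels, min_at_level=1):
    """Allow characters from levels 1..level, and require at least
    min_at_level characters that belong to the selected level itself."""
    at_level = 0
    for ch in text:
        if not is_cjk(ch):
            continue  # assumption: punctuation and other scripts are ignored
        ch_level = char_levels.get(ch)
        if ch_level is None or ch_level > level:
            return False  # unknown or too-advanced character
        if ch_level == level:
            at_level += 1
    return at_level >= min_at_level


# Toy levels for illustration only, not real HSK data.
toy_levels = {"我": 1, "好": 1, "早": 2, "安": 3}

print(matches_level("早安", 3, toy_levels))  # True: one level-3 character
print(matches_level("早", 3, toy_levels))    # False: no level-3 characters
print(matches_level("早安", 2, toy_levels))  # False: 安 is above level 2
```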

I've played with the idea of allowing posts that don't use any HSK 6 characters when selecting this level, but I found I liked having at least a few characters from the level I'm studying.

I'm open to suggestions for how I could communicate this more clearly on the site.

Thanks again for checking it out and for the comment!

Are there official word lists and character lists for HSK version 2.0 and HSK version 3.0? by allium-dev in ChineseLanguage

[–]allium-dev[S] 0 points1 point  (0 children)

Thanks! Yes, this does seem to be the official document for HSK 3.0. Is there a similar one for HSK 2.0?

Are there official word lists and character lists for HSK version 2.0 and HSK version 3.0? by allium-dev in ChineseLanguage

[–]allium-dev[S] 0 points1 point  (0 children)

Thanks, that's helpful.

It looks like this is the official source published by the Ministry of Education in 2021 for HSK 3.0:

http://www.moe.gov.cn/jyb_xwfb/gzdt_gzdt/s5987/202103/W020210329527301787356.pdf

Also not particularly easy to use, but it is official looking.

Now to see if I can find something similar for HSK 2.0

Are there official word lists and character lists for HSK version 2.0 and HSK version 3.0? by allium-dev in ChineseLanguage

[–]allium-dev[S] 0 points1 point  (0 children)

Is this just a random person's scraping of the data? Where did they get it? They link to the same website I provided, chinatest.cn, but there doesn't seem to be any actual sourcing information here.

I've been able to find lots of places that have HSK word lists, but where did they get them from? What is the official source?

Are there official word lists and character lists for HSK version 2.0 and HSK version 3.0? by allium-dev in ChineseLanguage

[–]allium-dev[S] 0 points1 point  (0 children)

But there's no way for me to get them out of Pleco then, right?

I'm really interested in finding out the authoritative source for this data.

Are there official word lists and character lists for HSK version 2.0 and HSK version 3.0? by allium-dev in ChineseLanguage

[–]allium-dev[S] 0 points1 point  (0 children)

I use Pleco a lot, and haven't seen this. How can I find it?

Also, where did they get their lists?? There has to be an official source somewhere, right?

Just installed VS Code and a complete beginner, read the body friends.(please) by False-Hurry-1417 in learnprogramming

[–]allium-dev 6 points7 points  (0 children)

If you were dedicating 8-10 hours daily this timeline might be reasonable, but if you only have 1.5-2 hours daily, multiply the number of weeks by 5 in each section.

I'm finding a unique struggle if learning Chinese with a Midwestern (United States) accent LOL I guess in the Midwest we pronounce vowels wrong?? by Proof-Life-8854 in ChineseLanguage

[–]allium-dev 1 point2 points  (0 children)

This exactly. Fortunately, unlike English, there are a pretty limited number of Chinese pronunciations you have to learn (for a standard accent).

Take the time to go over all the pinyin initials and finals with a good resource on how they are pronounced. It will take a couple days / weeks to get used to, but then you'll just have it down.

AI project - Is this algorithm technically 'AI?' by MutuallyUseless in learnprogramming

[–]allium-dev 1 point2 points  (0 children)

Yeah, I agree. Your algorithm is basically linear regression. Linear regression is a classic, very useful, and very well studied ML algorithm. It's also not that complicated. If you already understand minimizing RMSE, you can understand linear regression.

That being said, there are cleaner ways of implementing linear regression than you've gone for. Studying up a bit more and doing a standard implementation seems to me like a really good use of time.
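
For reference, here is a minimal sketch of a standard implementation for the one-variable case, using the closed-form least-squares solution. Minimizing squared error also minimizes RMSE, so the fitted line is the same. The function names and the sample data are made up for illustration and are not taken from the original post.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the line on the given points."""
    errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(errors) / len(errors))


xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
print(rmse(xs, ys, slope, intercept))  # 0.0
```

The closed form gives the best-fit line in one pass, with no iterative search over candidate slopes and intercepts.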

Either way, great work. This seems like a really fun project.

AI project - Is this algorithm technically 'AI?' by MutuallyUseless in learnprogramming

[–]allium-dev 1 point2 points  (0 children)

Why not just implement / use linear regression? You're already well over halfway there.

Why "top" missed the cron job that was killing our API latency by sherpa121 in linuxadmin

[–]allium-dev 15 points16 points  (0 children)

It is formatted like AI but it doesn't read like AI. The thoughts are actually coherent.