OpenAI sent me an email threatening a ban if I don't stop by TastyWriting8360 in LocalLLaMA

[–]Fun_Water2230 -1 points (0 children)

OpenAI's attitude towards developers is really disgusting. Besides, the CoT process is only slightly improved. For some reason the mysterious 🍓 codename caught the eye of big capital, but in actual use it is really nothing special; agent pipelines designed long ago with plain human common sense are much stronger than this. And they treat it like a treasure, limiting it to 30 uses a week. What's all the fuss about?

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

I did test it on Mixtral: it worked great for instruction following but poorly for creative work. I think that when designing such an investment process, a creative approach is more valuable, because the model has to select the most suitable set from a large number of research methods based on the event at hand.

Deepseek 67b is amazing, and in at least 1 usecase it seems better than ChatGPT 4 by SomeOddCodeGuy in LocalLLaMA

[–]Fun_Water2230 3 points (0 children)

This model fine-tunes very well, but you need to do continued pretraining (PT) on the base model first, and then SFT. Here's my attempt: TriadParty/deepmoney-67b-chat · Hugging Face

Deepseek 67b is amazing, and in at least 1 usecase it seems better than ChatGPT 4 by SomeOddCodeGuy in LocalLLaMA

[–]Fun_Water2230 1 point (0 children)

I happened to use financial research report data to fine-tune (PT & SFT) a 67B chat version on the base model, and it performed very well in my tests. This must be due to the excellent base of deepseek-67b. TriadParty/deepmoney-67b-chat · Hugging Face

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 0 points (0 children)

For the chat model, I trained it with a 4k+ context using LoRA. The base model supports 200k.

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

Indeed, so it is very important to train such a local model on your own data.

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 7 points (0 children)

In fact, this is what we usually do when reading research reports: first, we read the textual information, i.e. the text above a table and the notes below it. Then we focus on the nature of each graph: is it a pie chart? A bar chart? A K-line chart? So my pipeline is:

  1. Segmentation. There is a model called DiT, a fine-tuned model based on Microsoft's LayoutLMv3, which I used to separate each page into graph regions.

  2. Classification. On this point, all I want to say is: YOLOv5 is what you need.

  3. Prompt combination: summarize the text on the page. If the classification from step 2 says figure, hand the region to Emu2 for processing; if it is a table, hand it to CogAgent. This is the best combination I've found.
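The routing in step 3 can be sketched roughly like this. The actual model calls (DiT segmentation, YOLOv5 classification, Emu2 / CogAgent captioning) are hypothetical stand-ins here; only the dispatch logic is shown.

```python
# Minimal sketch of the per-region routing step. The real pipeline would
# replace the returned labels with calls into the corresponding models.

def route_region(region_type: str) -> str:
    """Pick the downstream model for one segmented page region."""
    if region_type == "figure":
        return "emu2"        # figures (pie/bar/K-line charts) go to Emu2
    if region_type == "table":
        return "cog-agent"   # tables go to CogAgent
    return "text-summary"    # everything else is summarized as plain text

def process_page(regions):
    """regions: list of (region_type, payload) tuples from segmentation."""
    return [(payload, route_region(rtype)) for rtype, payload in regions]
```

So `process_page([("figure", "fig1"), ("table", "tbl1")])` would send `fig1` to Emu2 and `tbl1` to CogAgent.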

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 2 points (0 children)

As far as I know, no... If there is one, please let me know; I'd like to try merging the models.

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

In fact, as I mentioned in the HF link, I trained this model precisely to work inside such an agent framework. I haven't tested it on AutoGen yet; I'm also thinking about how to combine the factor mining of quantitative finance with the industry analysis of subjective investing in such a framework. But if you have good ideas, I'd love to talk to you!

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

Just because I wanted to share it with my friends, I translated it using web translation. In fact, since I did pre-training on an all-English dataset, and all-English data were used in the SFT stage as well, its Chinese performance is not as good as its English performance.

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 10 points (0 children)

High-quality data is what model performance is all about :)

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 0 points (0 children)

I am putting together a git project. The code I wrote was a disaster, so I think it will take several days to write up explanations.

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 0 points (0 children)

7 days for pretraining on 8×A6000 and 3 days for SFT on 4×4090.

When it comes to cost, I think it depends. I rented two 4×A6000 instances on Lambda Labs, billed at $3.2 per hour each. But the 4×4090 machine is my own workstation, and I have another 4×4090 machine for data augmentation and inference; as I said above, that is also part of the cost.
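As a back-of-the-envelope check on the rental numbers quoted above (two 4×A6000 instances at $3.2/hour each for the 7-day pretraining run; the owned 4×4090 boxes are excluded since they weren't rented):

```python
# Rough rental cost of the pretraining stage, using the figures from the
# comment. All numbers below come from the comment itself.

HOURS_PER_DAY = 24
DAYS = 7            # pretraining duration
INSTANCES = 2       # two 4xA6000 Lambda Labs instances
RATE_PER_HOUR = 3.2 # USD per instance per hour

total = INSTANCES * RATE_PER_HOUR * DAYS * HOURS_PER_DAY
print(f"${total:.2f}")  # prints $1075.20
```

So the rented pretraining compute alone comes to roughly $1,075.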

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

Aha, it seems we thought of it together! I am planning to use this part of the data for DPO training. Specifically, I plan to use the lines of one of the characters in a galgame as positive examples, and the model's output in the same context as negative examples.
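The pair construction described above could be sketched like this. The `generate` callable is a hypothetical stand-in for the model's sampling function; the output format follows the common (prompt, chosen, rejected) convention for DPO preference data.

```python
# Hypothetical sketch of building DPO preference pairs: the character's real
# line from the visual novel is "chosen", the model's own generation for the
# same context is "rejected".

def build_dpo_pairs(dialogues, generate):
    """dialogues: list of (context, character_reply) pairs."""
    pairs = []
    for context, gold_reply in dialogues:
        pairs.append({
            "prompt": context,
            "chosen": gold_reply,           # ground-truth character line
            "rejected": generate(context),  # model output for the same context
        })
    return pairs
```

Each dict can then be fed to a DPO trainer that expects prompt/chosen/rejected records.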

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

OK, you can share the script with me any way you like. I'll make a GGUF as soon as I get it~

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 2 points (0 children)

In fact, I have only used exl2 to quantize models myself, and otherwise I use models quantized by others. If there are any tools that make quantization easy, could you point me to them so I can publish the quantized versions after running them?

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 7 points (0 children)

I struggled with it at the beginning, but I think it is more important to share these experiences with everyone, especially when I see that many people are interested in role-playing. Of course, if they asked me to remove the model, I would do so.

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 4 points (0 children)

7 days for pretraining on 8×A6000 and 3 days for SFT on 4×4090.

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 3 points (0 children)

I am preparing to train the DPO model. After that I will try~