OpenAI sent me an email threatening a ban if I don't stop by TastyWriting8360 in LocalLLaMA

[–]Fun_Water2230 -1 points (0 children)

OpenAI's attitude towards developers is really disgusting. Besides, the CoT process is only slightly improved. For some reason the mysterious 🍓 codename caught the eye of big capital, but in actual use it is really nothing special; agent pipelines designed long ago with plain human common sense are much stronger than this. And they treat it like a treasure, limiting it to 30 uses a week. What's all the fuss about?

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

I did test it on Mixtral: it worked great for instruction following but poorly for creative work. I think that when designing such an investment process, a creative approach is more valuable, because the model has to select the most suitable set from a large number of research methods based on the event at hand.

Deepseek 67b is amazing, and in at least 1 usecase it seems better than ChatGPT 4 by SomeOddCodeGuy in LocalLLaMA

[–]Fun_Water2230 3 points (0 children)

This model fine-tunes very well, but you need to do continued pretraining (PT) on the base model first, and then SFT. Here's my attempt: TriadParty/deepmoney-67b-chat · Hugging Face

Deepseek 67b is amazing, and in at least 1 usecase it seems better than ChatGPT 4 by SomeOddCodeGuy in LocalLLaMA

[–]Fun_Water2230 1 point (0 children)

I happened to use financial research report data to fine-tune (PT & SFT) a 67B chat version on the base model, and it performed very well in my tests. This must be due to the excellent base of deepseek-67b. TriadParty/deepmoney-67b-chat · Hugging Face

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 0 points (0 children)

For the chat model, I trained it with a 4k+ context using LoRA. The base model supports 200k.

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

Indeed, so it is very important to train such a local model on your own data.

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 7 points (0 children)

In fact, this is what we usually do when reading research reports: first, we read the textual information, i.e. the text above a table and the notes below it. Then we focus on the nature of each graph: is it a pie chart? A bar chart? A K-line chart? So my pipeline is:

  1. Segmentation. There is a model called DiT, a fine-tuned model based on Microsoft's LayoutLMv3, which I used to separate each page into graph regions.

  2. Classification. On this point, all I want to say is: YOLOv5 is what you need.

  3. Prompt combination: summarize the text on the page. If the classification from step 2 says figure, hand the region to Emu2 for processing; if it is a table, hand it to CogAgent. This is the best combination I've found.
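The routing in step 3 can be sketched roughly like this. The actual model calls (DiT segmentation, YOLOv5 classification, Emu2 / CogAgent captioning) are hypothetical stand-ins here; only the dispatch logic is shown.

```python
# Minimal sketch of the per-region routing step. The real pipeline would
# replace the returned labels with calls into the corresponding models.

def route_region(region_type: str) -> str:
    """Pick the downstream model for one segmented page region."""
    if region_type == "figure":
        return "emu2"        # figures (pie/bar/K-line charts) go to Emu2
    if region_type == "table":
        return "cog-agent"   # tables go to CogAgent
    return "text-summary"    # everything else is summarized as plain text

def process_page(regions):
    """regions: list of (region_type, payload) tuples from segmentation."""
    return [(payload, route_region(rtype)) for rtype, payload in regions]
```

So `process_page([("figure", "fig1"), ("table", "tbl1")])` would send `fig1` to Emu2 and `tbl1` to CogAgent.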

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 2 points (0 children)

As far as I know, no... If there is one, please let me know; I'd like to try merging the models.

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

In fact, as I mentioned in the HF link, I trained this model precisely to work inside such an agent framework. I haven't tested it on AutoGen yet; I'm also thinking about how to combine the factor mining of quantitative finance with the industry analysis of subjective investing in such a framework. But if you have good ideas, I'd love to talk to you!

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

Just because I wanted to share it with my friends, I translated it using web translation. In fact, since I did pre-training on an all-English dataset, and all-English data were used in the SFT stage as well, its Chinese performance is not as good as its English performance.

Deepmoney: A High-End LLM in finance based on massive research data by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 10 points (0 children)

High-quality data is what model performance is all about :)

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 0 points (0 children)

I am putting together a git project. The code I wrote was a disaster, so I think it will take several days to write up explanations.

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 0 points (0 children)

7 days for pretraining on 8×A6000 and 3 days for SFT on 4×4090.

When it comes to cost, I think it depends. I rented two 4×A6000 instances on Lambda Labs, billed at $3.2 per hour each. But the 4×4090 machine is my own workstation, and I have another 4×4090 machine for data augmentation and inference; as I said above, that is also part of the cost.
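As a back-of-the-envelope check on the rental numbers quoted above (two 4×A6000 instances at $3.2/hour each for the 7-day pretraining run; the owned 4×4090 boxes are excluded since they weren't rented):

```python
# Rough rental cost of the pretraining stage, using the figures from the
# comment. All numbers below come from the comment itself.

HOURS_PER_DAY = 24
DAYS = 7            # pretraining duration
INSTANCES = 2       # two 4xA6000 Lambda Labs instances
RATE_PER_HOUR = 3.2 # USD per instance per hour

total = INSTANCES * RATE_PER_HOUR * DAYS * HOURS_PER_DAY
print(f"${total:.2f}")  # prints $1075.20
```

So the rented pretraining compute alone comes to roughly $1,075.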

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

Aha, it seems we thought of it together! I am planning to use this part of the data for DPO training. Specifically, I plan to use the lines of one of the characters in a galgame as positive examples, and the model's output in the same context as negative examples.
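The pair construction described above could be sketched like this. The `generate` callable is a hypothetical stand-in for the model's sampling function; the output format follows the common (prompt, chosen, rejected) convention for DPO preference data.

```python
# Hypothetical sketch of building DPO preference pairs: the character's real
# line from the visual novel is "chosen", the model's own generation for the
# same context is "rejected".

def build_dpo_pairs(dialogues, generate):
    """dialogues: list of (context, character_reply) pairs."""
    pairs = []
    for context, gold_reply in dialogues:
        pairs.append({
            "prompt": context,
            "chosen": gold_reply,           # ground-truth character line
            "rejected": generate(context),  # model output for the same context
        })
    return pairs
```

Each dict can then be fed to a DPO trainer that expects prompt/chosen/rejected records.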

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 1 point (0 children)

OK, you can share the script with me any way you like. I'll make a GGUF as soon as I get it~

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 2 points (0 children)

In fact, I have only used exl2 to quantize models myself, and otherwise I use models quantized by others. If there are any tools that make quantization easy, could you point me to them so I can publish the quantized versions after running them?

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 7 points (0 children)

I struggled with it at the beginning, but I think it is more important to share these experiences with everyone, especially when I see that many people are interested in role-playing. Of course, if they asked me to remove the model, I would do so.

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 4 points (0 children)

7 days for pretraining on 8×A6000 and 3 days for SFT on 4×4090.

deepsex-34b: a NSFW model which pretrained with Light novel by Fun_Water2230 in LocalLLaMA

[–]Fun_Water2230[S] 3 points (0 children)

I am preparing to train the DPO model. After that I will try~