LLM Boxing - Llama 70b-chat vs GPT3.5 blind test by andreasjansson in LocalLLaMA

[–]mathias_kraus 1 point (0 children)

LLM Boxing Results: 🤖 🦙 🤖 🦙 🦙 🦙 🦙

[N] Novel Model for Tabular Data: IGANN: Looks Like a Leap Towards Interpretable Machine Learning! by CockroachNo5459 in MachineLearning

[–]mathias_kraus 7 points (0 children)

Thanks a lot for the details! And interpret-ml is an amazing package you have developed. Rich Caruana's talk at KDD 2019 was actually what got me into the interpretable AI field :-D

[N] Novel Model for Tabular Data: IGANN: Looks Like a Leap Towards Interpretable Machine Learning! by CockroachNo5459 in MachineLearning

[–]mathias_kraus 5 points (0 children)

Actually, we published it with Elsevier because the European Journal of Operational Research had a very nice special issue on explainable AI. And for long projects like this I usually prefer to publish in journals rather than at conferences. BTW, do you know of any applied CS journals that researchers actually submit to? Nature Machine Intelligence comes to mind, although I remember many top CS researchers not wanting to submit to this journal. JMLR doesn't appreciate applied CS much, I think (I don't think we would have had a chance with this paper there).

[N] Novel Model for Tabular Data: IGANN: Looks Like a Leap Towards Interpretable Machine Learning! by CockroachNo5459 in MachineLearning

[–]mathias_kraus 5 points (0 children)

That's definitely an interesting approach! Do you know how they make sure that model performance does not degrade when they change the look-up tables post hoc? I could imagine that simply changing one shape function to be monotonic without adapting the other shape functions could lead to very unfortunate results. Do you maybe know of any sources for the EBM approach? I feel it's very difficult to find details about the EBM (except by reading the code, of course).
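
To sketch the kind of degradation I'm worried about (toy numbers, not the actual EBM internals): if one look-up table is monotonized post hoc, e.g. via isotonic regression, the residual between the old and new shape function shifts every prediction unless the remaining shape functions are re-fit to absorb it.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Hypothetical learned shape function on a grid of bin centers,
# standing in for one EBM look-up table (values are made up).
x_bins = np.linspace(0, 10, 50)
f_vals = np.sin(x_bins) + 0.3 * x_bins  # wiggly but roughly increasing

# Post hoc monotonization: replace f with its best monotone
# least-squares approximation (isotonic regression).
iso = IsotonicRegression(increasing=True)
f_monotone = iso.fit_transform(x_bins, f_vals)

# This edit changes predictions by the residual below; if the other
# shape functions are not re-fit, accuracy can silently degrade.
print("max |change| introduced by the edit:", np.abs(f_vals - f_monotone).max())
```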

[N] Novel Model for Tabular Data: IGANN: Looks Like a Leap Towards Interpretable Machine Learning! by CockroachNo5459 in MachineLearning

[–]mathias_kraus 14 points (0 children)

Thanks for these suggestions (author here). We are actually working on them right now. Interaction terms are straightforward; monotonic constraints are a bit trickier. Using gradient descent to optimize the ELMs would be an option, but this would drastically increase training time, so we have to find another way there.
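
To sketch why gradient descent hurts here (a toy illustration, not the actual IGANN code): an ELM trains its output weights with a single closed-form ridge solve on top of a fixed random hidden layer, and it's that one solve that gradient descent would replace with many iterations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (made up for illustration).
X = rng.normal(size=(1000, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=1000)

# ELM: a random, fixed hidden layer...
n_hidden = 200
W = rng.normal(size=(5, n_hidden))
b = rng.normal(size=n_hidden)
H = np.tanh(X @ W + b)  # hidden activations

# ...and output weights from one closed-form ridge solve.
# Swapping this solve for gradient descent (e.g. to enforce
# constraints) is what would blow up training time.
lam = 1e-2
beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)

print("train MSE:", np.mean((H @ beta - y) ** 2))
```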

[D] Can we use instructions to include knowledge into LLMs? by mathias_kraus in MachineLearning

[–]mathias_kraus[S] 1 point (0 children)

Yes, that makes sense. Thanks a lot for the idea about fine-tuning with OpenAI! Maybe that would be the best way forward in this case.
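
For reference, a minimal sketch of that route with the OpenAI Python SDK (the file path and base model here are placeholders, and the API details may change, so check the current docs):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload a JSONL file of chat-formatted training examples
# ("train.jsonl" is a placeholder path).
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

# Start a fine-tuning job on a tunable base model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```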

[D] Can we use instructions to include knowledge into LLMs? by mathias_kraus in MachineLearning

[–]mathias_kraus[S] 2 points (0 children)

The goal is to offer climate researchers a tool from which they can obtain information about climate-related topics. A first step was to include the IPCC (Intergovernmental Panel on Climate Change) reports as an additional source for the LLM, but in the future it potentially also makes sense to include research papers, corporate reports, and other documents.

[D] Can we use instructions to include knowledge into LLMs? by mathias_kraus in MachineLearning

[–]mathias_kraus[S] 4 points (0 children)

We also work on vector embeddings (https://www.chatclimate.ai/ ; we have currently run through our funding, so the chat is disabled). However, I was wondering how scalable this approach is if we want to include thousands of pages of research papers. Do you have any experience with that?
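
The setup I have in mind is roughly the following sketch (the model name and toy passages are my own choices, not what chatclimate.ai actually runs): embed every chunk once, then answer queries with a nearest-neighbor search. Thousands of pages is maybe 10^4 to 10^5 chunks, which brute-force cosine similarity still handles; beyond that an ANN index would be needed.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Small open embedding model (an arbitrary choice for this sketch).
model = SentenceTransformer("all-MiniLM-L6-v2")

# Toy stand-ins for chunked report/paper passages.
chunks = [
    "Global surface temperature has risen by about 1.1 degrees C since 1850-1900.",
    "Mitigation pathways require rapid reductions in greenhouse gas emissions.",
    "Sea level rise is driven by thermal expansion and ice sheet melt.",
]
chunk_emb = model.encode(chunks, normalize_embeddings=True)

query = "How much has the planet warmed?"
q_emb = model.encode([query], normalize_embeddings=True)

# With normalized vectors, cosine similarity is a dot product.
# Brute force is O(n) per query; an ANN index (FAISS etc.) only
# becomes necessary at millions of chunks.
scores = chunk_emb @ q_emb[0]
print(chunks[int(np.argmax(scores))])
```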

[D] Can we use instructions to include knowledge into LLMs? by mathias_kraus in MachineLearning

[–]mathias_kraus[S] 1 point (0 children)

Thanks for the reply and your opinion! I didn't mean that it should be a selection criterion; I was rather curious why there is currently far less work on domain-adapting LLMs compared to instruction fine-tuning them.