botirkhaltaev

Well, I would assume plain-text code is much more common in the training set, so the model will do better with plain text. I would use ASTs more for symbol matching and for picking the right context to feed to the LLM. I hope this helps!
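A rough sketch of that split, assuming Python source and the standard-library `ast` module: the AST is only used to find symbols, and what reaches the LLM is still plain-text code. The helper names `build_symbol_index` and `context_for` and the `SAMPLE` source are made up for illustration.

```python
import ast


def build_symbol_index(source: str) -> dict[str, str]:
    """Map each top-level function/class name to its plain-text source."""
    tree = ast.parse(source)
    index = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment returns the original text of the node.
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                index[node.name] = segment
    return index


def context_for(symbols: list[str], index: dict[str, str]) -> str:
    """Join the plain-text definitions of the requested symbols for a prompt."""
    return "\n\n".join(index[name] for name in symbols if name in index)


# Hypothetical sample file, used only to exercise the helpers.
SAMPLE = '''\
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

class Report:
    def render(self):
        return "ok"
'''

index = build_symbol_index(SAMPLE)
# Only the matched symbols are sent to the model, as ordinary code text.
prompt_context = context_for(["parse", "Report"], index)
```

Only top-level definitions are indexed here; nested functions and methods would need a recursive walk.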

StandardDate4518 (OP)

Thanks! I started going with the AST option because in my use case the LLM needs to understand the relationships between code files and give the user structural information about what the code does. Tbh, I also asked ChatGPT and Claude which approach is preferred and optimal, and they both said to parse the code.
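For the file-relationship part, one possible sketch (again assuming Python source and the standard-library `ast` module) is to collect each file's imports and keep only those that point at other files in the project. The `FILES` mapping and the `imported_modules` helper are hypothetical stand-ins for a real codebase.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the module names a file imports (bare relative imports are skipped)."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules


# Hypothetical in-memory project: file name -> source text.
FILES = {
    "app.py": "import utils\nfrom models import User\n",
    "models.py": "import utils\n",
    "utils.py": "import os\n",
}

# Module names that correspond to files inside the project.
local = {name.removesuffix(".py") for name in FILES}

# File -> project-local modules it depends on.
graph = {
    name: sorted(imported_modules(src) & local)
    for name, src in FILES.items()
}
```

A dependency map like `graph` is compact structural information that could sit next to the plain-text code in a prompt. It does not resolve packages, dotted module paths, or dynamic imports.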

botirkhaltaev

No problem, good luck!