-
Notifications
You must be signed in to change notification settings - Fork 283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kotlin psi structure for FIM. #829
Conversation
This is cool, many thanks! I'll take a closer look sometime this week. |
Haven't been able to watch the Merge request yet? |
Sorry, not yet. I'll release the pending stuff first and then take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I like the fact that we're taking advantage of IntelliJ's PSI tree to extract important information. However, as you can see, it can get quite ugly quickly because you have to analyze the elements/modifiers separately and then put them all back together.
I think treesitter
would provide much help in scenarios where you need to query for methods, constructors, imports, package declarations, etc.
A while ago, I did a POC for a similar feature but didn't proceed with it (can't remember why), but it can give you an idea of what I'm talking about: 584a1cc#diff-cc18dd78aeb55a5c77b6baed61b6ee1e560828ab872c7e4dbcca94936a8204ecR75
@@ -142,6 +142,8 @@ configurationConfigurable.section.codeCompletion.multiLineCompletions.descriptio | |||
configurationConfigurable.section.codeCompletion.postProcess.title=Enable tree-sitter post-processing | |||
configurationConfigurable.section.codeCompletion.postProcess.description=If checked, the completion will be post-processed using the tree-sitter parser. | |||
configurationConfigurable.section.codeCompletion.gitDiff.title=Enable git diff context | |||
configurationConfigurable.section.codeCompletion.collectDependencyStructure.title=Collect the dependencies of the analyzed file. | |||
configurationConfigurable.section.codeCompletion.collectDependencyStructure.description=Enabling the setting allows the plugin to collect the dependency structure, which increases the accuracy of the proposed data, but consumes more tokens per request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a small note somewhere that this logic only applies to Java and Kotlin sources?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, of course! I will point out that only Kotlin is currently supported.
|
||
class ClassStructureSerializer { | ||
|
||
fun serialize(classStructure: ClassStructure): String = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a short JavaDoc what we're serializing here, or what's the end result?
import ee.carlrobert.codegpt.codecompletions.psi.structure.models.* | ||
import org.jetbrains.kotlin.utils.addToStdlib.ifNotEmpty | ||
|
||
class ClassStructureSerializer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be singleton - object ClassStructureSerializer {}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment, yes, but when multiple languages are supported, it won't make much sense to keep all the parsers in memory.
It seems that we will have to make a parser provider for each language that is included in the structure analysis.
I looked at the possibility of analyzing using this library. Ultimately, we will have the same limitations as in the current parsing implementation. companion object {
val myField = "123"
} If you like this approach, I plan to develop Kotlin analysis by connecting a compiler, which will make it possible to output types of complex constructions and generics. Looking ahead, I have colleagues who can advise me on working with Go, TypeScript, PhP, and JS languages. |
One more question, I wanted to add an analysis of the file structure when working in a chat. Should I do this through the setting or enable it by default? In the next pull request. |
39f3c18
to
24eb009
Compare
Not sure I follow. With proper Kotlin grammar rules, I believe it's possible to extract the necessary types and declarations. However, to correctly resolve declarations for specific references, you need to index the entire codebase first. I believe PSI trees provide an advantage over that, as IntelliJ handles this indexing automatically. For example, you can explore Kotlin's AST structure using the Tree-sitter playground: https://fwcd.github.io/tree-sitter-kotlin/
Yes, it looks good in general. If we have a clean interface for code analysis and post-processing/serialization, then we can easily replace the underlying logic with something else in later phases (if needed). |
Hmm, how about creating a new chat action tag, similar to how "Include Open Files" is done? |
I think it's a good idea. Do I need any more actions for merge pull request? |
Are you able to merge the PR? If not, then I might need to configure something. |
I can't |
I'll take a look at it later. |
* Add Kotlin dependency and configuration file * Add collect dependency structure option to code completion settings * Add Kotlin PSI structure analysis and serialization components * Add repository name and dependencies structure to InfillRequest and update prompt template for QWEN_2_5 --------- Co-authored-by: a.iudin <a.iudin@vk.team>
* Add Kotlin dependency and configuration file * Add collect dependency structure option to code completion settings * Add Kotlin PSI structure analysis and serialization components * Add repository name and dependencies structure to InfillRequest and update prompt template for QWEN_2_5 --------- Co-authored-by: a.iudin <a.iudin@vk.team>
I present the concept of collecting a class structure for dependencies. The structure analysis is currently implemented only for Kotlin and works using PSI tree analysis, without using the Kotlin compiler.
This approach increases the accuracy of prediction for auto-completion of the code.
It will also be useful to use the dependency structure collection when adding code to the chat, for more accurate answers.
For example, a simple structure of 3 classes is used.
As a result of the code analysis, the FIM request is transformed into the following format: