
Ask Command #663

Merged
merged 37 commits into ubiquity:development
Sep 21, 2023

Conversation

Keyrxng
Contributor

@Keyrxng Keyrxng commented Aug 23, 2023

Resolves #291

Adding the /ask command using OpenAI.

@netlify

netlify bot commented Aug 23, 2023

Deploy Preview for ubiquibot-staging ready!

Name Link
🔨 Latest commit 4e1a7b8
🔍 Latest deploy log https://app.netlify.com/sites/ubiquibot-staging/deploys/650baa1de12a760008e7a821
😎 Deploy Preview https://deploy-preview-663--ubiquibot-staging.netlify.app

    Simple prompt engineering, one response per user per question.
@Keyrxng Keyrxng marked this pull request as ready for review August 23, 2023 16:53
better initial prompt
src/handlers/comment/handlers/ask.ts (outdated review thread, resolved)
@0x4007
Member

0x4007 commented Aug 23, 2023

I have some concerns about this implementation.

Since creating the task, ChatGPT rolled out the capability to share full conversations, which has been pretty sufficient for this purpose.

However, in order to enhance the developer experience, we should automatically provide GPT-4 with as much context as possible (it will be faster than manually copy-pasting all of the relevant details into ChatGPT).

I know this isn't in the specification, my apologies. But the prompt you provided signals to me that the answers are low quality without all of the context, which makes this inferior to just going to ChatGPT directly and asking the question.

@Keyrxng
Contributor Author

Keyrxng commented Aug 23, 2023

So when it comes to "memory", you have to keep in mind that this one instance will be running over all of the issues/repos. In order to feed the previous context we'd need to store each prompt, feeding it into the next, so that GPT actually has context for a given interaction.

As for additional context, I tried to limit follow-ons and follow-ups because I don't think it's possible to feed prompt after prompt into GPT effectively here, unless you have an OpenAI key that can be run aggressively. The fact is, the bigger the input, the more tokens are used, and input token size is priced as well as output token size. So that sort of "memory" isn't cost effective long term.

The way I read the issue was this:

You are having a debate or discussion, or pondering a problem, solution, or idea, and while you and the team are doing the heavy lifting, maybe there's a one-off question or a one-off need for inspiration; then this solution is near perfect. You have all the context, as you are already debating the problem or solution.

@Keyrxng
Contributor Author

Keyrxng commented Aug 23, 2023

@pavlovcik My backend is running right now and will be for the next 5 mins if you want to fire a few different tester queries at the current prompt setup on my repo, but I'm happy to discuss and try to implement other setups.

I recently placed 2nd in a hackathon for building an AI Therapist ChatBot. It had the sort of "memory" that you speak of, but it was a lot easier as it was a single instance, so there was never the possibility of previous context being confused, as it would reset on window reload.

Thinking out loud: storing context for each line of questioning, for each user, on each issue, and tracking all of that would be hell.

Is the /ask command supposed to be well-versed in all things UBQ/Web3, or is it just an easy route to GPT directly in issues? The latter is obviously a more generalized approach and I guess would infer its own context, like GPT does with near free rein; the other would have to be guard-railed so it doesn't hallucinate its own context, as it often does in that sort of setting.

@0x4007
Member

0x4007 commented Aug 23, 2023

I am unfamiliar with what exactly is returned by the GitHub API to view a full conversation of an issue or pull request, but from what I understand it's a JSON object that describes every contributor and every interaction (reactions and comments).

We can scrub the unnecessary output, like the node IDs and reactions, and pass in this entire conversation as context to the /ask command behind the scenes.

Even better if we can pass in the original issue specification as well, which would require some page navigation if the /ask command is being invoked from the pull request. Fortunately, we already implemented similar logic for the "comment incentives on pull request" feature which was recently merged into development.
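
For illustration, a minimal sketch of this scrubbing step, assuming an Octokit client; the StreamlinedComment shape and function name are hypothetical, not the merged implementation:

```typescript
import { Octokit } from "@octokit/rest";

// Hypothetical streamlined shape: who said what, nothing else.
interface StreamlinedComment {
  login: string;
  body: string;
}

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

async function getStreamlinedConversation(
  owner: string,
  repo: string,
  issueNumber: number
): Promise<StreamlinedComment[]> {
  // Issue and pull request conversation comments share the same endpoint.
  const comments = await octokit.paginate(octokit.rest.issues.listComments, {
    owner,
    repo,
    issue_number: issueNumber,
  });

  // Scrub node IDs, reactions, timestamps, and bot chatter.
  return comments
    .filter((c) => c.user?.type !== "Bot" && c.body)
    .map((c) => ({ login: c.user!.login, body: c.body! }));
}
```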

My backend is running right now and will be for the next 5 mins

Thank you for the offer, but unfortunately we operate asynchronously and generally this strategy is not viable for reviewers as we are usually busy reviewing tens of issues/pull requests per hour in sort of a round-robin style.

@Keyrxng
Contributor Author

Keyrxng commented Aug 23, 2023

Hmm I'll have a think about it and try to set something up tomorrow.

Can anyone use the /ask command?

Is there a limit to questions per interaction?

@0x4007
Member

0x4007 commented Aug 24, 2023

In this implementation I had not considered limitations, but adding some seems responsible.

  • We could check if they are added as a contributor to the repository/organization.
  • We also have a simple access control system built in (which might already be able to take care of this?) please see the code for /allow although it might only be for adding labels.
  • No limits for now, a contributor can keep asking.

original context indexable across repos, issues and pull requests
@Keyrxng
Contributor Author

Keyrxng commented Aug 25, 2023

https://github.com/Keyrxng/didactic-octo-train/issues/15

So the way things are at the moment:

command = '/ask didactic-octo-train pull 16 1693816494 "Provide some examples"'

  • repoName
  • issue or pull
  • issue or pull number
  • commentID
  • question

If commentID is zero, it's a standalone single interaction with no previous context.

If it's defined then it's fed the original context and produces results; you could chain context together using the AI response's commentID, but it would be superficial, as it would only have the context of that response and not your query verbatim.
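
For illustration, a minimal sketch of parsing this interim argument format (all names are hypothetical, and this interface was later simplified to a plain /ask question):

```typescript
// Interim format: /ask <repoName> <issue|pull> <number> <commentID> "<question>"
interface AskArgs {
  repoName: string;
  kind: "issue" | "pull";
  number: number;
  commentId: number; // 0 = standalone question, no previous context
  question: string;
}

function parseAsk(body: string): AskArgs | null {
  const m = body.match(/^\/ask\s+(\S+)\s+(issue|pull)\s+(\d+)\s+(\d+)\s+"([^"]+)"/);
  if (!m) return null;
  return {
    repoName: m[1],
    kind: m[2] as "issue" | "pull",
    number: Number(m[3]),
    commentId: Number(m[4]),
    question: m[5],
  };
}

// parseAsk('/ask didactic-octo-train pull 16 1693816494 "Provide some examples"')
// → { repoName: "didactic-octo-train", kind: "pull", number: 16,
//     commentId: 1693816494, question: "Provide some examples" }
```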

Is that sufficient or should it be possible to reference a comment on another issue or pull, and then chain together say 4 or 5 questions relating to that?

The trouble I'm having right now is thinking of a way to allow multiple users to create multiple chains of interactions all on the one issue; so far it's been pretty messy.

I assume there will be multiple concurrent chains of interactions being run. The most rudimentary way is what we have currently, and it does the job, but is it to spec?

Would it be cleaner to allow multiple commentIDs to be passed in, so we manually give it context from potentially a variety of sources?

Or have the bot handle previous interactions on the issue or pull and try to find a way to distinguish the chain of interactions amongst potentially multiple other users' interactions, as well as potentially multiple single-user chains of interactions?

I need some perspective, cheers

comments etc
@0x4007
Member

0x4007 commented Aug 25, 2023

I'm going to spend some time researching today by looking through the GitHub API docs and passing in the full conversation context JSON of a single pull request or issue.

I imagine that the interface would only be:

/ask is it possible to implement the todo list with react hooks?

And ChatGPT would be able to read the full conversation leading up to that question before providing a response.

I'll post back here later today.
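
A sketch of what that call could look like with the openai Node SDK (v4-style API shown for illustration; the bot at the time may have used a different client version), feeding a scrubbed {login, body} conversation in as context:

```typescript
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function ask(
  question: string,
  conversation: { login: string; body: string }[]
): Promise<string | null> {
  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content:
          "You are a helpful assistant embedded in a GitHub repository.\n" +
          "Use the conversation below as context for the question.\n\n" +
          conversation.map((c) => `${c.login}: ${c.body}`).join("\n\n"),
      },
      { role: "user", content: question },
    ],
  });
  return response.choices[0].message.content;
}
```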

@Keyrxng
Contributor Author

Keyrxng commented Aug 25, 2023

So you want to be able to traverse repos, issues, and pulls, pull individual context from a comment onto a separate issue/pull, and then feed in the entire conversation object of the current issue/pull rather than the context source?

Feeding in the current issue/pull conversation is easily done as-is. I had it collecting previous answers it had posted for a specific user, along with all of the relevant questions, but once I started new interactions it got messy. The only thing to consider, if feeding an entire conversation into it with every call, is token usage and cost.

I think some more thought into it would be great, mate. I feel we're not really on the same page with it at the moment; the bounds and objectives are a bit unclear, for me at least.

No rush, I'm going to look at picking up something else in the meantime.

@0x4007
Member

0x4007 commented Aug 26, 2023

As a heads up, I haven't had the chance to look into it yet. But, at the risk of veering a bit off topic, and just so that you have perspective on what I have in mind for the future, I looked into ubiquity/ubiquibot-telegram#20, a tool from Microsoft which allows you to store a whole repository in context for ChatGPT. This could probably be useful for a future version and extension (Telegram chatbot version) of this command.

So you want to be able to traverse repos, issues, and pulls, pull individual context from a comment onto a separate issue/pull, and then feed in the entire conversation object of the current issue/pull rather than the context source?

Simpler. There are two types of locations where I anticipate this command being invoked:

  1. On an issue
  • Generally this is for when a bounty is being clarified and still being adjusted. Conversation usually only happens here early in the lifecycle of a bounty, if the original specification is not clear enough. I imagine that it would be very straightforward to collect the full conversation history from the GitHub API and pass it in as context to ChatGPT. At minimum it should pass in the original specification (the original comment) for context on the question asked.
  2. On a pull request
  • Pull requests are usually associated with an issue, which has a specification. It would be useful to pass in, at minimum, the issue specification, then the entire pull request conversation, then the entire issue conversation (if there's enough capacity).

I don't see a situation where a user needs to define which issue or pull request to pass in as arguments. This should be handled automatically based on the instructions above.
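
A sketch of this priority order with a crude token budget, assuming the rough heuristic of ~4 characters per token (all names here are hypothetical):

```typescript
// Context blocks in descending priority: the issue specification always
// goes in first, then the pull request conversation, then the issue conversation.
function assembleContext(
  issueSpecification: string,
  pullConversation: string[],
  issueConversation: string[],
  maxTokens = 6000
): string {
  const blocks = [
    { label: "Issue specification", text: issueSpecification },
    { label: "Pull request conversation", text: pullConversation.join("\n") },
    { label: "Issue conversation", text: issueConversation.join("\n") },
  ];

  const out: string[] = [];
  let budgetChars = maxTokens * 4; // ~4 characters per token
  for (const block of blocks) {
    if (block.text.length > budgetChars) break; // drop lower-priority context
    out.push(`## ${block.label}\n${block.text}`);
    budgetChars -= block.text.length;
  }
  return out.join("\n\n");
}
```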

Feeding in the current issue/pull conversation is easily done as-is. I had it collecting previous answers it had posted for a specific user, along with all of the relevant questions, but once I started new interactions it got messy. The only thing to consider, if feeding an entire conversation into it with every call, is token usage and cost.

I think you had a far more complex implementation in mind.

I think some more thought into it would be great, mate. I feel we're not really on the same page with it at the moment; the bounds and objectives are a bit unclear, for me at least.

No rush, I'm going to look at picking up something else in the meantime.

Seems like you haven't had great luck with the last two bounties, so I want to try and rectify this. Usually bounties are pretty straightforward; sorry about this!

@0x4007
Member

0x4007 commented Aug 26, 2023

https://api.github.com/repos/ubiquity/ubiquibot/issues/663/comments

Scrubbed:

[
  {
    "id": 1690288399,
    "user": {
      "login": "netlify[bot]",
      "id": 40209326,
      "type": "Bot"
    },
    "created_at": "2023-08-23T16:40:21Z",
    "updated_at": "2023-08-25T19:50:48Z",
    "body": "### <span aria-hidden=\"true\">❌</span> Deploy Preview for *ubiquibot-staging* failed.\n\n\n|  Name | Link |\n|:-:|------------------------|\n|<span aria-hidden=\"true\">🔨</span> Latest commit | 8e6b42b183dbff14c2d3bf63fb9cbbf066cf753e |\n|<span aria-hidden=\"true\">🔍</span> Latest deploy log | https://app.netlify.com/sites/ubiquibot-staging/deploys/64e905dee1b3550008d6b2d0 |"
  },
  {
    "id": 1690727556,
    "user": {
      "login": "pavlovcik",
      "id": 4975670,
      "type": "User"
    },
    "created_at": "2023-08-23T22:29:41Z",
    "updated_at": "2023-08-23T22:29:41Z",
    "body": "I have some concerns about this implementation. \n\nSince creating the task, ChatGPT rolled out the capability to share full conversations which has been pretty sufficient for this purpose. \n\nHowever in order to enhance the developer experience, we should automatically provide gpt4 with as much context as possible (it will be faster than manually copy pasting all of the relevant details into ChatGPT normally.)\n\nI know this isn't in the specification, my apologies. But the prompt you provided signals to me that the answers are low quality without all of the context, which makes this inferior to just going to ChatGPT directly and asking the question. "
  },
  {
    "id": 1690747118,
    "user": {
      "login": "Keyrxng",
      "id": 106303466,
      "type": "User"
    },
    "created_at": "2023-08-23T22:52:58Z",
    "updated_at": "2023-08-23T22:52:58Z",
    "body": "So when it comes to \"memory\", you have to keep in mind that this one instance will be running over all of the issues/repos. In order to feed the previous context we'd need to store each prompt, feeding it into the next so that gpt actually has context for a given interaction.\r\n\r\nAs for additional context, I tried to limit follow-ons and follow-ups because it's not possible to feed prompt after prompt into gpt effectively here I don't think unless you guys will have an openai key that can be aggressively ran as the fact that is, the bigger the input the more tokens which are used as input token size is priced in as well as output token size. So that sort of \"memory\" isn't cost effective long term.\r\n\r\nThe way I read the issue was this:\r\n\r\nYou are having a debate, discussion, pondering a problem, solution or idea and while you and team are all doing the heavy lifting maybe there's a one off question or one off need for inspiration then this solution is near perfect. You have all the context as you are debating the problem, solution whatever."
  },
  {
    "id": 1690757038,
    "user": {
      "login": "Keyrxng",
      "id": 106303466,
      "type": "User"
    },
    "created_at": "2023-08-23T23:06:54Z",
    "updated_at": "2023-08-23T23:06:54Z",
    "body": "@pavlovcik My backend is running right now and will be for the next 5 mins if you want to fire a few different tester queries at the current prompt setup on my repo but I'm happy to discuss and try to implement other setups.\r\n\r\nI recently placed 2nd in a hackathon for building an AI Therapist ChatBot, it had the sort of \"memory\" that you speak of but it was a lot easier as it was a single instance so there was never the possibility of previous context being confused as it would reset on window reload. \r\n\r\nThinking out loud: To store context, for each line of questioning for each user on each issue and track all of that would be hell.\r\n\r\nIs the /ask command supposed to be well-versed in all things UBQ/Web3 or is it just an easy route to GPT directly in issues? The latter obviously being a more generalized approach and I guess would infer it's own context like GPT does with near free reign and the other would have to be guard railed not to hallucinate it's own context like it often does in that sort of context."
  },
  {
    "id": 1690777642,
    "user": {
      "login": "pavlovcik",
      "id": 4975670,
      "type": "User"
    },
    "created_at": "2023-08-23T23:36:59Z",
    "updated_at": "2023-08-23T23:40:13Z"
  },
  {
    "id": 1690784310,
    "user": {
      "login": "Keyrxng",
      "id": 106303466,
      "type": "User"
    },
    "created_at": "2023-08-23T23:47:15Z",
    "updated_at": "2023-08-23T23:47:15Z",
    "body": "Hmm I'll have a think about it and try to set something up tomorrow.\r\n\r\nCan anyone use the /ask command?\r\n\r\nIs there a limit to questions per interaction?\r\n"
  },
  {
    "id": 1690799114,
    "user": {
      "login": "pavlovcik",
      "id": 4975670,
      "type": "User"
    },
    "created_at": "2023-08-24T00:09:06Z",
    "updated_at": "2023-08-24T00:09:55Z",
    "body": "In this implementation I had not considered limitations but it seems responsible. \n- We could check if they are added as a contributor to the repository/organization.\n- We also have a simple access control system built in (which might already be able to take care of this?) please see the code for `/allow` although it might only be for adding labels. \n- No limits for now, a contributor can keep asking. "
  },
  {
    "id": 1693875268,
    "user": {
      "login": "Keyrxng",
      "id": 106303466,
      "type": "User"
    },
    "created_at": "2023-08-25T20:07:36Z",
    "updated_at": "2023-08-25T20:08:39Z",
    "body": "https://github.com/Keyrxng/didactic-octo-train/issues/15\r\n\r\nSo the way things are at the moment:\r\n\r\ncommand = '/ask didactic-octo-train pull 16 1693816494 \"Provide some examples\"'\r\n\r\n- repoName\r\n- issue or pull\r\n- issue or pull number\r\n- commentID\r\n- question\r\n\r\nIf commentID is zero, it's a standalone single interaction with no previous context. \r\n\r\nIf it's defined then it's fed the original context and produces results, you could chain context together using the AI response commentID but it would be superficial as it would only have the context of that response but not your query verbatim.\r\n\r\nIs that sufficient or should it be possible to reference a comment on another issue or pull, and then chain together say 4 or 5 questions relating to that?\r\n\r\nThe trouble I'm having right now is trying to think of a way to allow for multiple users to create multiple chains of interactions all on the one issue but so far it's been pretty messy.\r\n\r\nI assume there will be multiple concurrent chains of interactions being run, the most rudimental way is what we have currently and it does the job but is it to spec?\r\n\r\nWould it be cleaner to allow multiple commentIDs to be passed in and we manually give it context from potentially a variety of sources? \r\n\r\nOr have the bot handle previous interactions on the issue or pull and try to find a way to distinguish the chain of interactions in amongst potentially multiple other user's interactions as well as potentially multiple single-user chains of interactions too?\r\n\r\nI need some perspective, cheers"
  },
  {
    "id": 1693955739,
    "user": {
      "login": "pavlovcik",
      "id": 4975670,
      "type": "User"
    },
    "created_at": "2023-08-25T21:29:07Z",
    "updated_at": "2023-08-25T21:29:07Z",
    "body": "I'm going to spend some time researching today by looking through the GitHub API docs and passing in the full conversation context JSON of a single pull request or issue. \n\nI imagine that the interface would only be:\n\n```\n/ask is it possible to implement the todo list with react hooks?\n```\n\nAnd ChatGPT would be able to read the full conversation leading up to that question before providing a response. \n\nI'll post back here later today. "
  },
  {
    "id": 1693966412,
    "user": {
      "login": "Keyrxng",
      "id": 106303466,
      "type": "User"
    },
    "created_at": "2023-08-25T21:43:10Z",
    "updated_at": "2023-08-25T21:43:10Z",
    "body": "So you are wanting to be able to traverse repos, issues and pulls and pull individual context from a comment onto a separate issue/pull and then feed the entire conversation object of the current issue/pull and not the context source?\r\n\r\nFeeding the current issue/pull conversation is easily done as is, I had it collecting previous answers it had posted for a specific user along with all of the relevant question but once I started new interactions it got messy. The only thing to consider if feeding an entire conversation into it with every call I think is token usage and costs in that sense.\r\n\r\nI think some more thought into it would be great mate I feel we are not really close to being on the same page with it at the moment, the bounds and objectives are a bit unclear for me at least. \r\n\r\nNo rush, I'm going to look at picking up something else in the meantime."
  },
  {
    "id": 1694182327,
    "user": {
      "login": "pavlovcik",
      "id": 4975670,
      "type": "User"
    },
    "created_at": "2023-08-26T05:59:24Z",
    "updated_at": "2023-08-26T06:03:44Z",
    "body": "As a heads up I haven't had the chance to look into it yet. But, at the risk of veering a bit off topic, just so that you have perspective of what I have in mind for the future, I looked into https://github.com/ubiquity/telegram-ubiquibot/issues/20 which is a tool from Microsoft which allows you to store a whole repository in context for ChatGPT. This could probably be useful for a future version and extension (Telegram Chatbot version) of this command.\r\n\r\n> So you are wanting to be able to traverse repos, issues and pulls and pull individual context from a comment onto a \r\nseparate issue/pull and then feed the entire conversation object of the current issue/pull and not the context source?\r\n\r\nMore simple. There are two types of locations I anticipate the invocation of this command:\r\n1. On an issue\r\n  - Generally this is for when a bounty is being clarified and still being adjusted. Conversation usually only happens here early in the lifecycle of a bounty if the original specification is not clear enough. I imagine that it would be very straightforward to collect the full conversation history from the GitHub API and pass it in as context to ChatGPT. At minimum it should pass in the original specification (the original comment) for context on the question asked.\r\n2. On a pull request\r\n  - Pull requests usually are associated with an issue, which has a specification. It would be useful to pass in at minimum the issue specification, then the pull request entire conversation, then the issue entire conversation (if there's enough capacity)\r\n\r\nI dont see a situation where a user needs to define which issue or pull request to pass in as arguments. This should be handled automatically based on the instructions above. \r\n\r\n> Feeding the current issue/pull conversation is easily done as is, I had it collecting previous answers it had posted for a specific user along with all of the relevant question but once I started new interactions it got messy. The only thing to consider if feeding an entire conversation into it with every call I think is token usage and costs in that sense.\r\n\r\nI think you had a far more complex implementation in mind. \r\n\r\n> I think some more thought into it would be great mate I feel we are not really close to being on the same page with it at the moment, the bounds and objectives are a bit unclear for me at least.\r\n> \r\n> No rush, I'm going to look at picking up something else in the meantime.\r\n\r\nSeems like you haven't had great luck with the last two bounties so I want to try and rectify this. Usually bounties are pretty straightforward sorry about this!"
  }
]

@0x4007
Member

0x4007 commented Aug 26, 2023

Enhanced scrubbing: removed bots and other metadata for super-streamlined information. Passing in the relevant context this way seems very straightforward to me. I think it's relevant to keep the username next to their text so that the bot knows to add "weight" to opinionated contributors' messages:

[
  {
    "login": "pavlovcik",
    "body": "I have some concerns about this implementation. \n\nSince creating the task, ChatGPT rolled out the capability to share full conversations which has been pretty sufficient for this purpose. \n\nHowever in order to enhance the developer experience, we should automatically provide gpt4 with as much context as possible (it will be faster than manually copy pasting all of the relevant details into ChatGPT normally.)\n\nI know this isn't in the specification, my apologies. But the prompt you provided signals to me that the answers are low quality without all of the context, which makes this inferior to just going to ChatGPT directly and asking the question. "
  },
  {
    "login": "Keyrxng",
    "body": "So when it comes to \"memory\", you have to keep in mind that this one instance will be running over all of the issues/repos. In order to feed the previous context we'd need to store each prompt, feeding it into the next so that gpt actually has context for a given interaction.\r\n\r\nAs for additional context, I tried to limit follow-ons and follow-ups because it's not possible to feed prompt after prompt into gpt effectively here I don't think unless you guys will have an openai key that can be aggressively ran as the fact that is, the bigger the input the more tokens which are used as input token size is priced in as well as output token size. So that sort of \"memory\" isn't cost effective long term.\r\n\r\nThe way I read the issue was this:\r\n\r\nYou are having a debate, discussion, pondering a problem, solution or idea and while you and team are all doing the heavy lifting maybe there's a one off question or one off need for inspiration then this solution is near perfect. You have all the context as you are debating the problem, solution whatever."
  },
  {
    "login": "Keyrxng",
    "body": "@pavlovcik My backend is running right now and will be for the next 5 mins if you want to fire a few different tester queries at the current prompt setup on my repo but I'm happy to discuss and try to implement other setups.\r\n\r\nI recently placed 2nd in a hackathon for building an AI Therapist ChatBot, it had the sort of \"memory\" that you speak of but it was a lot easier as it was a single instance so there was never the possibility of previous context being confused as it would reset on window reload. \r\n\r\nThinking out loud: To store context, for each line of questioning for each user on each issue and track all of that would be hell.\r\n\r\nIs the /ask command supposed to be well-versed in all things UBQ/Web3 or is it just an easy route to GPT directly in issues? The latter obviously being a more generalized approach and I guess would infer it's own context like GPT does with near free reign and the other would have to be guard railed not to hallucinate it's own context like it often does in that sort of context."
  },
  {
    "login": "pavlovcik",
    "body": "I am unfamiliar with what exactly is returned by the GitHub API to view a full conversation of an issue or pull request, but from what I understand it's a JSON object that describes every contributor and every interaction (reactions and comments.) \n\nWe can scrub the unnecessary output like the node IDs, and reactions, and pass in this entire conversation as context to the `/ask` command behind the scenes. \n\nEven better if we can pass in the original issue specification as well, which would require some page navigation if the `/ask` command is being invoked from the pull request. Fortunately, we already implemented similar logic for the \"comment incentives on pull request\" feature which was recently merged into `development`.\n\n> My backend is running right now and will be for the next 5 mins\n\nThank you for the offer, but unfortunately we operate asynchronously and generally this strategy is not viable for reviewers as we are usually busy reviewing tens of issues/pull requests per hour in sort of a round-robin style.  "
  },
  {
    "login": "Keyrxng",
    "body": "Hmm I'll have a think about it and try to set something up tomorrow.\r\n\r\nCan anyone use the /ask command?\r\n\r\nIs there a limit to questions per interaction?\r\n"
  },
  {
    "login": "pavlovcik",
    "body": "In this implementation I had not considered limitations but it seems responsible. \n- We could check if they are added as a contributor to the repository/organization.\n- We also have a simple access control system built in (which might already be able to take care of this?) please see the code for `/allow` although it might only be for adding labels. \n- No limits for now, a contributor can keep asking. "
  },
  {
    "login": "Keyrxng",
    "body": "https://github.com/Keyrxng/didactic-octo-train/issues/15\r\n\r\nSo the way things are at the moment:\r\n\r\ncommand = '/ask didactic-octo-train pull 16 1693816494 \"Provide some examples\"'\r\n\r\n- repoName\r\n- issue or pull\r\n- issue or pull number\r\n- commentID\r\n- question\r\n\r\nIf commentID is zero, it's a standalone single interaction with no previous context. \r\n\r\nIf it's defined then it's fed the original context and produces results, you could chain context together using the AI response commentID but it would be superficial as it would only have the context of that response but not your query verbatim.\r\n\r\nIs that sufficient or should it be possible to reference a comment on another issue or pull, and then chain together say 4 or 5 questions relating to that?\r\n\r\nThe trouble I'm having right now is trying to think of a way to allow for multiple users to create multiple chains of interactions all on the one issue but so far it's been pretty messy.\r\n\r\nI assume there will be multiple concurrent chains of interactions being run, the most rudimental way is what we have currently and it does the job but is it to spec?\r\n\r\nWould it be cleaner to allow multiple commentIDs to be passed in and we manually give it context from potentially a variety of sources? \r\n\r\nOr have the bot handle previous interactions on the issue or pull and try to find a way to distinguish the chain of interactions in amongst potentially multiple other user's interactions as well as potentially multiple single-user chains of interactions too?\r\n\r\nI need some perspective, cheers"
  },
  {
    "login": "pavlovcik",
    "body": "I'm going to spend some time researching today by looking through the GitHub API docs and passing in the full conversation context JSON of a single pull request or issue. \n\nI imagine that the interface would only be:\n\n```\n/ask is it possible to implement the todo list with react hooks?\n```\n\nAnd ChatGPT would be able to read the full conversation leading up to that question before providing a response. \n\nI'll post back here later today. "
  },
  {
    "login": "Keyrxng",
    "body": "So you are wanting to be able to traverse repos, issues and pulls and pull individual context from a comment onto a separate issue/pull and then feed the entire conversation object of the current issue/pull and not the context source?\r\n\r\nFeeding the current issue/pull conversation is easily done as is, I had it collecting previous answers it had posted for a specific user along with all of the relevant question but once I started new interactions it got messy. The only thing to consider if feeding an entire conversation into it with every call I think is token usage and costs in that sense.\r\n\r\nI think some more thought into it would be great mate I feel we are not really close to being on the same page with it at the moment, the bounds and objectives are a bit unclear for me at least. \r\n\r\nNo rush, I'm going to look at picking up something else in the meantime."
  },
  {
    "login": "pavlovcik",
    "body": "As a heads up I haven't had the chance to look into it yet. But, at the risk of veering a bit off topic, just so that you have perspective of what I have in mind for the future, I looked into https://github.com/ubiquity/telegram-ubiquibot/issues/20 which is a tool from Microsoft which allows you to store a whole repository in context for ChatGPT. This could probably be useful for a future version and extension (Telegram Chatbot version) of this command.\r\n\r\n> So you are wanting to be able to traverse repos, issues and pulls and pull individual context from a comment onto a \r\nseparate issue/pull and then feed the entire conversation object of the current issue/pull and not the context source?\r\n\r\nMore simple. There are two types of locations I anticipate the invocation of this command:\r\n1. On an issue\r\n  - Generally this is for when a bounty is being clarified and still being adjusted. Conversation usually only happens here early in the lifecycle of a bounty if the original specification is not clear enough. I imagine that it would be very straightforward to collect the full conversation history from the GitHub API and pass it in as context to ChatGPT. At minimum it should pass in the original specification (the original comment) for context on the question asked.\r\n2. On a pull request\r\n  - Pull requests usually are associated with an issue, which has a specification. It would be useful to pass in at minimum the issue specification, then the pull request entire conversation, then the issue entire conversation (if there's enough capacity)\r\n\r\nI dont see a situation where a user needs to define which issue or pull request to pass in as arguments. This should be handled automatically based on the instructions above. \r\n\r\n> Feeding the current issue/pull conversation is easily done as is, I had it collecting previous answers it had posted for a specific user along with all of the relevant question but once I started new interactions it got messy. The only thing to consider if feeding an entire conversation into it with every call I think is token usage and costs in that sense.\r\n\r\nI think you had a far more complex implementation in mind. \r\n\r\n> I think some more thought into it would be great mate I feel we are not really close to being on the same page with it at the moment, the bounds and objectives are a bit unclear for me at least.\r\n> \r\n> No rush, I'm going to look at picking up something else in the meantime.\r\n\r\nSeems like you haven't had great luck with the last two bounties so I want to try and rectify this. Usually bounties are pretty straightforward sorry about this!"
  }
]

@Keyrxng
Contributor Author

Keyrxng commented Aug 26, 2023

Seems like you haven't had great luck with the last two bounties, so I want to try and rectify this. Usually bounties are pretty straightforward; sorry about this!

Nothing to apologise for mate, I'm having a great time and it's been a smooth experience the whole way. You can't expect to go from A to Z without friction on every bounty, and anyone who does has lost it, but I appreciate it, cheers!

I'm following you much better now, but I'm still a bit unsure how you want to grab original context from another issue or pull request, which may be in another repo, without passing in an identifier for it (either the repo or the PR/issue number). Am I missing something lmao? Is every single fresh issue or PR that is opened guaranteed to have a backlink to the original context? What if the original context isn't where the hashtag URL points to; how do we then locate it with only the question body passed in?

I'm just thinking that, as you said, it's typically used at the beginning of the life cycle, but if it's used much further down the line on juicy threads we'll max out our token usage, unless you plan on running the 32K-token API, which I doubt could be maxed if it's only used early in the life cycle, as it's ~6k vs ~24k words.

"...where 1,000 tokens is about 750 words. "

| Model | Input | Output |
| --- | --- | --- |
| 8K context | $0.03 / 1K tokens | $0.06 / 1K tokens |
| 32K context | $0.06 / 1K tokens | $0.12 / 1K tokens |
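
A worked example of the arithmetic, using the GPT-4 rates quoted above:

```typescript
// GPT-4 pricing as quoted at the time; 1,000 tokens ≈ 750 words.
function estimateCostUsd(inputTokens: number, outputTokens: number, use32k = false): number {
  const inputRate = use32k ? 0.06 : 0.03; // $ per 1K input tokens
  const outputRate = use32k ? 0.12 : 0.06; // $ per 1K output tokens
  return (inputTokens / 1000) * inputRate + (outputTokens / 1000) * outputRate;
}

// A maxed-out 8K-context call (~6,000 words ≈ 8,000 tokens of input) with a
// 500-token answer: estimateCostUsd(8000, 500) → 0.27, i.e. ~$0.27 per /ask.
```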

@0x4007
Member

0x4007 commented Aug 26, 2023

but I'm still a bit unsure how you want to grab original context from another issue or pull request, which may be in another repo, without passing in an identifier for it (either the repo or the PR/issue number)?

https://github.com/ubiquity/ubiquibot/pull/657/files#diff-0031f10c8304cb81d02e83ba1b26c8f9273cc02fea992e27df58ea2abc63074eR29-R57

@wannacfuture not sure if you have any advice.

@Keyrxng
Contributor Author

Keyrxng commented Aug 27, 2023

Looking at your enhanced scrubbing, should I only add comments from users, or should I also include any GPT responses in the streamlined conversation?

So I'm thinking we don't have to declare issue or pull, as the number counter is shared between them for any one repo; we have the current repo context, so we can identify it as a pull or an issue, and we have the current issue/pull number.

If it's a PR then it'll have a backlink to the context issue, which may be on another repo, so we grab the entire conversation and feed that to GPT.

So that leaves identifying a specific comment and its context when it's on a different repo, without passing in the repoName, and it isn't called from a PR with a backlink.

Idk, but would it be impractical (possible, even?) to loop over all repos, all issues and pulls, for all comments, and match the passed-in commentID?

Without a comment ID I don't see a way to identify a comment that is sent from any user, arbitrarily numbered and located within its own issue/pull, and potentially on another repo, unless called from within a PR.

And without a repoName, when called within an issue it'll only be able to fetch the current repo's issue and pull comments.

ubiquity/ubiquibot/pull/663#issuecomment-1694355778
ubiquity/ubiquity-dollar/issues/728#issuecomment-1628379186
/* 
 Called from an issue: grabs specific comment, that convo as well as current convo
 Called from a pull: grabs specific comment, that convo, the hashtag convo and current convo
*/
/ask 1694355778 "provide some examples"  

/*
 Called from an issue: only grabs the current issue convo as context
 Called from a pull: uses hashtag url to grab issue convo as well as current
*/
/ask 0 "an error is occurring due to... how would you approach this?" 

^ I can't find any docs on whether commentIDs are unique across repos within an org/account, or if there is potentially a collision issue with that approach of looping over all repos, their issues and pulls, and their comments. I'm hoping you guys might know more. I'm thinking it may happen even when calling from within any scenario and just passing in the commentID; if both are true, then I think we'll need all three: repoName, issue/pull ID, and commentID.

I'm probably wrong so looking forward to any more insight.
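
For what it's worth, a sketch of inferring the linked issue from a PR body via closing keywords instead of passing identifiers (the regex and names are assumptions, not the merged code):

```typescript
// Matches "Resolves #291", "Fixes owner/repo#291", "Closed #5", etc.
const CLOSING_KEYWORD =
  /(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+(?:([\w.-]+)\/([\w.-]+))?#(\d+)/i;

function findLinkedIssue(
  pullBody: string,
  currentOwner: string,
  currentRepo: string
): { owner: string; repo: string; issueNumber: number } | null {
  const m = pullBody.match(CLOSING_KEYWORD);
  if (!m) return null;
  const [, owner, repo, num] = m;
  // "Resolves #291" stays in the current repo;
  // "Resolves owner/repo#291" points across repositories.
  return {
    owner: owner ?? currentOwner,
    repo: repo ?? currentRepo,
    issueNumber: Number(num),
  };
}
```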

waiting for more input
@0x4007
Member

0x4007 commented Aug 29, 2023

Looking at your enhanced scrubbing, should I only add comments from users, or should I also include any GPT responses in the streamlined conversation?

It's a good idea to preserve the ChatGPT replies for context as well.

So I'm thinking we don't have to declare issue or pull, as the number counter is shared between them for any one repo; we have the current repo context, so we can identify it as a pull or an issue, and we have the current issue/pull number.

I don't follow

If it's a PR then it'll have a backlink to the context issue, which may be on another repo, so we grab the entire conversation and feed that to GPT.

Yes sounds good.

So that leaves identifying a specific comment and its context when it's on a different repo, without passing in the repoName, and it isn't called from a PR with a backlink.

If the user's comment is saved in the JSON object that represents the conversation context, I don't see why our bot needs to know the comment node ID. There isn't a reason for the bot to edit the comment.

Idk, but would it be impractical (possible, even?) to loop over all repos, all issues and pulls, for all comments, and match the passed-in commentID?

Without a comment ID I don't see a way to identify a comment that is sent from any user, arbitrarily numbered and located within its own issue/pull, and potentially on another repo, unless called from within a PR.

And without a repoName, when called within an issue it'll only be able to fetch the current repo's issue and pull comments.


ubiquity/ubiquibot/pull/663#issuecomment-1694355778
ubiquity/ubiquity-dollar/issues/728#issuecomment-1628379186
/*
 Called from an issue: grabs specific comment, that convo as well as current convo
 Called from a pull: grabs specific comment, that convo, the hashtag convo and current convo
*/
/ask 1694355778 "provide some examples"

/*
 Called from an issue: only grabs the current issue convo as context
 Called from a pull: uses hashtag url to grab issue convo as well as current
*/
/ask 0 "an error is occurring due to... how would you approach this?"

^ I can't find any docs on whether commentIDs are unique across repos within an org/account, or if there is potentially a collision issue with that approach of looping over all repos, their issues and pulls, and their comments. I'm hoping you guys might know more. I'm thinking it may happen even when calling from within any scenario and just passing in the commentID; if both are true, then I think we'll need all three: repoName, issue/pull ID, and commentID.

I'm probably wrong so looking forward to any more insight.

The only interface is "/ask question"

Any other parameters should be automatically inferred. If you want to ask a question based on another user's input, then, just like when using normal ChatGPT, you repeat the input (I generally use markdown quote syntax to express that it is quoted) and ask your question after. There's no need for node IDs.

@0x4007 0x4007 added the ping label Aug 29, 2023
moved base ask and context call into helpers
@Keyrxng
Contributor Author

Keyrxng commented Sep 10, 2023

I've placed the base ask call to GPT, as well as the context-decision call, into /helpers/gpt, as these functions will be shared across /ask and /review, and I expect more in the future.
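
For reference, illustrative shapes for those shared helpers (signatures and prompts are assumptions, not the merged code):

```typescript
import OpenAI from "openai";

type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Base call: send a prepared chat history to OpenAI and return the answer.
export async function askGPT(openai: OpenAI, chat: ChatMessage[]): Promise<string | null> {
  const res = await openai.chat.completions.create({ model: "gpt-4", messages: chat });
  return res.choices[0].message.content;
}

// Context-decision call: ask the model whether linked-issue context is
// needed before answering, so extra conversations are fetched only on demand.
export async function decideContextGPT(openai: OpenAI, chat: ChatMessage[]): Promise<string | null> {
  return askGPT(openai, [
    { role: "system", content: "Decide if additional linked-issue context is required to answer. Reply YES or NO." },
    ...chat,
  ]);
}
```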

@rndquu rndquu self-requested a review September 13, 2023 16:16
Member

@rndquu rndquu left a comment


@Keyrxng If there is no linked PR in an issue then there is an error, pls fix

Overall the PR solves the original issue; we should move all other related enhancements (/review, better parsing of "Resolves #1", spec check) to another issue, because it's starting to be hard to follow the conversation history.

.env.example (outdated review thread, resolved)
src/bindings/config.ts (outdated review thread, resolved)
src/utils/private.ts (outdated review thread, resolved)
@Keyrxng
Contributor Author

Keyrxng commented Sep 13, 2023

@Keyrxng If there is no linked PR in an issue then there is an error, pls fix

Overall the PR solves the original issue; we should move all other related enhancements (/review, better parsing of "Resolves #1", spec check) to another issue, because it's starting to be hard to follow the conversation history.

'review' has been stripped and I tried to do my best with splitting things; let me know of anything else specifically, mate.

requested changes
src/configs/shared.ts (outdated review thread, resolved)
@rndquu
Member

rndquu commented Sep 14, 2023

@Keyrxng Is this error fixed? I mean the /ask command should work if there are no linked PRs

@Keyrxng
Contributor Author

Keyrxng commented Sep 14, 2023

@Keyrxng Is this error fixed? I mean the /ask command should work if there are no linked PRs.

https://github.com/Keyrxng/UbiquityBotTesting/issues/44#issuecomment-1719080540

Sure is mate, I accidentally returned instead of logging that nothing could be found.

rndquu previously approved these changes Sep 14, 2023
src/types/shared.ts (outdated review thread, resolved)
src/types/config.ts (outdated review thread, resolved)
@0xcodercrane
Contributor

To merge the PR,
1/ We should create an openai-api-key.
2/ Open config-related PRs in ubiquity and ubiquibot org with the openai-api-key. @rndquu would you open PRs once we get the key?

@rndquu
Member

rndquu commented Sep 14, 2023

To merge the PR, 1/ We should create an openai-api-key. 2/ Open config-related PRs in ubiquity and ubiquibot org with the openai-api-key. @rndquu would you open PRs once we get the key?

Yes, I just need the key :)

@0xcodercrane
Contributor

Yes, I just need the key :)

Sure. I am handling it with pavlovcik; will keep you updated in DM.

new types
@0x4007
Member

0x4007 commented Sep 20, 2023

Hey guys, I shared a key with both of you. What's the status with merging this in? @rndquu

@0xcodercrane 0xcodercrane merged commit c40efc5 into ubiquity:development Sep 21, 2023
Successfully merging this pull request may close these issues: Ask ChatGPT /ask