Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arihan/mar 8 ability to comment by line of code in pr #55

Merged
Merged
Show file tree
Hide file tree
Changes from 60 commits
Commits
Show all changes
64 commits
Select commit Hold shift + click to select a range
48df5f7
Specify Weaviate schema (#49)
arihanv Nov 29, 2023
88e74d3
dispatch coding agent
DexterStorey Nov 30, 2023
80ce114
Setup commands for sandbox
arihanv Nov 30, 2023
5088ffa
dispatch engineer instructions
DexterStorey Dec 1, 2023
5b1cef8
fix: sandbox clone repo
arihanv Dec 1, 2023
49d3193
fix: add git access token to sandbox
arihanv Dec 1, 2023
c01b888
fix: process.env usage
arihanv Dec 1, 2023
5f0696c
engineer prompt
DexterStorey Dec 1, 2023
5ddc312
Add reviewer agent for pull request
arihanv Dec 1, 2023
3f8b119
feat: add error handling to reviewer
arihanv Dec 3, 2023
72074e9
feat: create specific function for PR comments
arihanv Dec 3, 2023
0da48d1
feat: turn code reviewer into an agent
arihanv Dec 3, 2023
484e2f9
Fix labelling regression
tedspare Dec 5, 2023
f9092cc
Lint and minor refactors
tedspare Dec 5, 2023
4a651be
Fix instruction updater
tedspare Dec 5, 2023
707aac3
Merge branch 'staging' of https://github.com/RubricLab/maige into ari…
tedspare Dec 5, 2023
eea274f
Comment out dispatcher for now
tedspare Dec 5, 2023
fba6cfc
dep: install parse-diff
arihanv Dec 6, 2023
54179b8
feat: remove severity from prComment
arihanv Dec 6, 2023
4e34447
feat: create function for inline comments
arihanv Dec 6, 2023
836883e
feat: reviewer can use prComment & codeComment
arihanv Dec 6, 2023
2b5384e
feat: automatically review PRs and commits
arihanv Dec 6, 2023
9b6b922
fix: remove custom OctokitReqType
arihanv Dec 6, 2023
cd7d978
Merge branch 'staging' of https://github.com/RubricLab/maige into ari…
tedspare Dec 6, 2023
482f313
Add newline
tedspare Dec 6, 2023
6666976
Lint package.json
tedspare Dec 6, 2023
a009c12
Address 'pull_request of undefined' error
tedspare Dec 6, 2023
7eec620
Use camelCase
tedspare Dec 6, 2023
d291152
add filePath filter to code search function
arihanv Dec 7, 2023
d909615
improved agent tooling, prompting, and inputs
arihanv Dec 7, 2023
a1016db
refactor webhook route and add cleaner PR review
arihanv Dec 7, 2023
8a86bad
prompt engineering
arihanv Dec 7, 2023
a20406d
more prompt engineering
arihanv Dec 7, 2023
be974c6
Merge branch 'staging' of https://github.com/RubricLab/maige into ari…
tedspare Dec 7, 2023
b1626ed
Clarify repoFullName var name
tedspare Dec 7, 2023
a05c2b6
Comment out verbose param
tedspare Dec 7, 2023
3bd0960
Move footer to constants file
tedspare Dec 7, 2023
78ad53a
Use implied fields
tedspare Dec 7, 2023
16729c8
Shorten prompts. Refactor.
tedspare Dec 7, 2023
101df88
Rename commitId
tedspare Dec 8, 2023
b27846f
Trigger build
tedspare Dec 8, 2023
f429951
Trigger webhook
tedspare Dec 8, 2023
57bbc1e
Trigger webook again
tedspare Dec 8, 2023
10dde12
Remove unused type
tedspare Dec 8, 2023
b6e26e2
Allow synchronize event
tedspare Dec 8, 2023
6e8691e
Iterate review prompt
tedspare Dec 8, 2023
54dbda1
Parametrize code comment filepath
tedspare Dec 8, 2023
169e35d
Trigger webhook
tedspare Dec 8, 2023
9aaf357
Trigger webhook
tedspare Dec 8, 2023
2b592dd
Update Weaviate test
tedspare Dec 8, 2023
058459e
Trigger webhook
tedspare Dec 8, 2023
949214a
Call review agent once per diff
tedspare Dec 8, 2023
086083a
Update vector search test
tedspare Dec 8, 2023
9887f0e
Fix codesearch for reviewer
tedspare Dec 8, 2023
2462800
Trigger webhook
tedspare Dec 8, 2023
8b5dde6
Trigger webhook
tedspare Dec 8, 2023
4160457
Fix weaviate test
tedspare Dec 8, 2023
a9534f0
Add comment
tedspare Dec 8, 2023
c2e2a2e
Iterate prompt
tedspare Dec 8, 2023
0c4b925
Clean diff before passing to GPT
tedspare Dec 8, 2023
2f9e7f2
Avoid clogging agent with tool output
tedspare Dec 8, 2023
73b29f5
Trigger webhook
tedspare Dec 8, 2023
61631e0
Format whitespace in codesearch
tedspare Dec 8, 2023
129f403
Undo formatting changes
tedspare Dec 8, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
96 changes: 57 additions & 39 deletions app/api/webhook/github/route.ts
Original file line number Diff line number Diff line change
Expand Up @@ -167,60 +167,24 @@ export const POST = async (req: Request) => {
if (comment && !comment.body.toLowerCase().includes('maige'))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new changes in the webhook route do not include a check for the 'maige' keyword in the comment body, which was present in the original code. This could potentially allow irrelevant comments to trigger the webhook actions.

By Maige. How's my driving?

return new Response('Irrelevant comment', {status: 202})

if (issue?.pull_request) {
const {
pull_request: {diff_url: diffUrl},
node_id: pullId
} = issue

// Get GitHub app instance access token
const app = new App({
appId: process.env.GITHUB_APP_ID || '',
privateKey: process.env.GITHUB_PRIVATE_KEY || ''
})

const octokit = await app.getInstallationOctokit(instanceId)

const response = await fetch(diffUrl)

if (!response.ok) return new Response('Could not fetch diff', {status: 503})

const data = await response.text()

await reviewer({
octokit: octokit,
input: `Instruction: ${comment?.body}\n\nPR Diff:\n${data}`,
pullId
})

return new Response('Reviewed PR', {status: 200})
}

if (
!(
(action === 'opened' && payload?.issue) ||
(action === 'created' && payload?.comment)
(action === 'created' && payload?.comment) ||
(action === 'opened' && payload?.pull_request) ||
(action === 'synchronize' && payload?.pull_request)
)
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The App class is being instantiated (lines 177-180), but there are no references to it in the codebase. Ensure that the class is defined and used correctly, or remove the instantiation if it's not needed.

By Maige. How's my driving?

return new Response('Webhook received', {status: 202})

const {
issue: {
node_id: issueId,
title,
number: issueNumber,
body,
labels: existingLabels
},
repository: {
node_id: repoId,
name,
owner: {login: owner}
}
} = payload

const existingLabelNames = existingLabels?.map((l: Label) => l.name)

const customer = await prisma.customer.findUnique({
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prisma.customer.findUnique method is used (line 188), but there are no references to a customer model in the codebase. Verify that the model exists and is correctly defined in Prisma's schema, or adjust the code to use the correct model.

By Maige. How's my driving?

tedspare marked this conversation as resolved.
Show resolved Hide resolved
tedspare marked this conversation as resolved.
Show resolved Hide resolved
tedspare marked this conversation as resolved.
Show resolved Hide resolved
where: {
name: owner || undefined
Expand Down Expand Up @@ -289,6 +253,60 @@ export const POST = async (req: Request) => {

await incrementUsage(prisma, owner)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The incrementUsage function is being called (line 254), but there are no references to it in the codebase. Ensure that this function is defined and used appropriately, or remove it if it's not needed.

By Maige. How's my driving?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The incrementUsage function is defined and used in the app/api/webhook/github/route.ts file. It is utilized to track and increment the usage count for a customer when processing webhook events.

By Maige. How's my driving?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The incrementUsage function is being called without checking if the usage has exceeded the usageLimit. This could lead to overcounting usage and potentially allowing customers to exceed their usage limits without proper checks. Consider adding a check before incrementing usage to ensure that the limit has not been surpassed.

By Maige. How's my driving?


if ((action == 'opened' || action == 'synchronize') && payload.pull_request) {
tedspare marked this conversation as resolved.
Show resolved Hide resolved
const {
pull_request: {diff_url: diffUrl}
} = payload

const res = await fetch(diffUrl)
if (!res.ok) return new Response('Could not fetch diff', {status: 503})
const diff = await res.text()

await reviewer({
tedspare marked this conversation as resolved.
Show resolved Hide resolved
customerId,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The customerId parameter is missing in the reviewer function call within the POST method. This could lead to runtime errors when the reviewer function tries to access the customerId which would be undefined.

By Maige. How's my driving?

octokit,
input: `Instruction: ${comment?.body}\n\nPR Diff:\n${diff}`,
pullNumber: payload.number,
repoFullName: `${owner}/${name}`,
commitId: payload.pull_request.head.sha
})

return new Response('Reviewed PR', {status: 200})
}

if (issue?.pull_request) {
const {
pull_request: {diff_url: diffUrl},
node_id: pullId
} = issue

const res = await fetch(diffUrl)
if (!res.ok) return new Response('Could not fetch diff', {status: 503})
const diff = await res.text()

await reviewer({
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The customerId is not being passed to the reviewer function call for the case when issue?.pull_request is true. This is inconsistent with the other reviewer function calls and might cause issues if customerId is expected within the reviewer function.

By Maige. How's my driving?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The customerId parameter is not being passed to the reviewer function call within the POST method when handling the issue?.pull_request case. This could lead to runtime errors when the reviewer function tries to access the customerId which would be undefined.

By Maige. How's my driving?

customerId,
octokit,
input: `Instruction: ${comment?.body}\n\nPR Diff:\n${diff}`,
pullId,
repoFullName: `${owner}/${name}`
})

return new Response('Replied to PR comment', {status: 200})
}

const {
issue: {
node_id: issueId,
title,
number: issueNumber,
body,
labels: existingLabels
}
} = payload

const existingLabelNames = existingLabels?.map((l: Label) => l.name)

/**
* Repo commands
*/
Expand Down
Binary file modified bun.lockb
Binary file not shown.
123 changes: 100 additions & 23 deletions lib/agents/reviewer.tsx
Original file line number Diff line number Diff line change
@@ -1,7 +1,10 @@
import {initializeAgentExecutorWithOptions} from 'langchain/agents'
import {ChatOpenAI} from 'langchain/chat_models/openai'
import {SerpAPI} from 'langchain/tools'
import parse, {Change, Chunk} from 'parse-diff'
tedspare marked this conversation as resolved.
Show resolved Hide resolved
import env from '~/env.mjs'
import {codeComment} from '~/tools/codeComment'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the codeComment import on this line is not used anywhere in the provided code snippet. If it's not used in the subsequent code, consider removing the import to keep the code clean and maintainable.

By Maige. How's my driving?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YO THIS REVIEWING FEATURE IS SICK

@arihanv flagging this (as a much higher-level PR review than usual, which is SICK):

Since codeComment is used below, how could we prevent this kind of error? Eg. prompt engineering ("this snippet is not the full file and might be missing parts etc") or passing longer code snippets?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like you're importing codeComment from ~/tools/codeComment, but in the provided tools, codeComment is a function within the functions namespace. Please make sure you're importing from the correct module or using the correct namespace for codeComment.

By Maige. How's my driving?

import {codebaseSearch} from '~/tools/codeSearch'
import {prComment} from '~/tools/prComment'
tedspare marked this conversation as resolved.
Show resolved Hide resolved
import {isDev} from '~/utils'

Expand All @@ -12,36 +15,110 @@ const model = new ChatOpenAI({
})

export async function reviewer({
tedspare marked this conversation as resolved.
Show resolved Hide resolved
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prComment tool's schema in lib/agents/reviewer.tsx is outdated. The severity parameter is no longer used in the prComment function, as seen in lib/tools/prComment.ts. Update the schema and function call to match the current implementation of prComment which only requires a comment parameter.

By Maige. How's my driving?

customerId,
input,
octokit,
pullId
pullId,
repoFullName,
pullNumber,
commitId
}: {
customerId: string
tedspare marked this conversation as resolved.
Show resolved Hide resolved
input: string
octokit: any
pullId: string
pullId?: string
repoFullName?: string
pullNumber?: number
commitId?: string
}) {
const tools = [new SerpAPI(), prComment({octokit, pullId})]

const prefix = `
You are senior engineer reviewing a Pull Request in GitHub made by a junior engineer.
You MUST leave a comment on the PR according to the user's instructions using the prComment function.
Format your answer beautifully using markdown suitable for GitHub.
DO NOT use any emojis or non-Ascii characters.
{agent_scratchpad}
`.replaceAll('\n', ' ')

const executor = await initializeAgentExecutorWithOptions(tools, model, {
agentType: 'openai-functions',
returnIntermediateSteps: isDev,
handleParsingErrors: true,
verbose: false,
agentArgs: {
prefix
/**
* Comment on a PR
*/
if (pullId) {
const tools = [new SerpAPI(), codebaseSearch({customerId, repoFullName})]

const prefix = `
You are a 1000x senior engineer summarizing a pull request on GitHub.
Provide a high-level summary (maximum 5 sentences) of the diff.
If you write too much, the author will get overwhelmed.
Limit prose.

{agent_scratchpad}
`.replaceAll('\n', ' ')

const executor = await initializeAgentExecutorWithOptions(tools, model, {
tedspare marked this conversation as resolved.
Show resolved Hide resolved
agentType: 'openai-functions',
returnIntermediateSteps: isDev,
handleParsingErrors: true,
verbose: true,
agentArgs: {
prefix
}
})

const result = await executor.call({input})

// Forcefully call the prComment tool
await prComment({octokit, pullId}).func({
comment: result.output
})

return
} else {
/**
* New or updated PR
*/
const prefix = `
tedspare marked this conversation as resolved.
Show resolved Hide resolved
You are a 1000x senior engineer reviewing a pull request on GitHub.
Only comment on modified code.
Only flag the top few issues: bad patterns, clear mistakes, or potential breaking changes.
If it looks like new code is unused, try searching for it.
Think step by step.
Limit prose. If you write too much, the author will get overwhelmed.

{agent_scratchpad}
`.replaceAll('\n', ' ')

let files = parse(input)
let diff = ''

for (const file of files) {
let changes = `File Path: ${file.from}\n\n`

file.chunks.forEach((chunk: Chunk) => {
chunk.changes.forEach((change: Change & {ln2?: string; ln?: string}) => {
const line = change.content.replaceAll('\t', ' ')
changes += `${change.ln2 || change.ln} ${line}\n`
})

changes += '='.repeat(10) + '\n'
})

diff += changes + '='.repeat(20) + '\n\n'
}
})

const result = await executor.call({input})
const {output} = result
const tools = [
codebaseSearch({customerId, repoFullName}),
codeComment({
octokit,
repoFullName,
pullNumber,
commitId
})
]

const executor = await initializeAgentExecutorWithOptions(tools, model, {
agentType: 'openai-functions',
returnIntermediateSteps: isDev,
handleParsingErrors: true,
verbose: true,
agentArgs: {
prefix
}
})

await executor.call({input: diff})

return output
return
}
}
4 changes: 4 additions & 0 deletions lib/constants.ts
Original file line number Diff line number Diff line change
Expand Up @@ -14,3 +14,7 @@ export const TIERS = {
priceId: process.env.STRIPE_BASE_PRICE_ID || ''
tedspare marked this conversation as resolved.
Show resolved Hide resolved
}
}

export const COPY = {
tedspare marked this conversation as resolved.
Show resolved Hide resolved
tedspare marked this conversation as resolved.
Show resolved Hide resolved
FOOTER: `By [Maige](https://maige.app). How's my driving?`
}
21 changes: 12 additions & 9 deletions lib/tests/weaviate.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -7,23 +7,26 @@ test.skip('Bun test runner - Weaviate', () => {
expect(Bun.version).toInclude('1.0')
})

// To test this, pass the following env vars to the process:
// If Bun is unable to access env vars, pass these to the test runner:
// WEAVIATE_HOST, WEAVIATE_SCHEME, OPENAI_API_KEY
// since Bun seems to not be able to access env vars
test.skip(
test(
'Embed repo',
async () => {
const customerId = Math.random().toString(36).substring(7)
const repoUrl = 'https://github.com/RubricLab/shot'
const branch = 'main'
const customerId = 'clot5gx6a0000uvdovvfi1x9q'
const repoUrl = 'https://github.com/RubricLab/maige'
const branch = 'staging'
const query = 'codeSearch' // arbitrary - should change over time

const vectorDB = new Weaviate(customerId)
const docs = await vectorDB.embedRepo(repoUrl, branch)

const docs = await vectorDB.embedRepo(repoUrl, branch, true)
expect(docs?.length).toBeGreaterThan(0)

console.log(`Loaded ${docs?.length} docs`)
const search = await vectorDB.searchCode(query, repoUrl, 1, undefined, branch)

expect(docs.length).toBeGreaterThan(0)
expect(search?.length).toBeGreaterThan(0)

if (search?.length > 0) expect(search[0].text).toInclude('search')
},
15 * 1000
)
45 changes: 45 additions & 0 deletions lib/tools/codeComment.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
import {DynamicStructuredTool} from 'langchain/tools'
import {z} from 'zod'
import {COPY} from '~/constants'

/**
* Comment on code
*/
export function codeComment({
octokit,
repoFullName,
pullNumber,
commitId
}: {
octokit: any
repoFullName: string
pullNumber: number
commitId: string
}) {
return new DynamicStructuredTool({
description: 'Adds a comment to code in a PR',
func: async ({comment, line, side, path}) => {
const res = await octokit.request(
`POST /repos/${repoFullName}/pulls/${pullNumber}/comments`,
{
body: `${comment}\n\n${COPY.FOOTER}`,
commit_id: commitId,
path,
line,
side,
headers: {'X-GitHub-Api-Version': '2022-11-28'}
}
)
return JSON.stringify(res)
},
name: 'codeComment',
schema: z.object({
comment: z.string().describe('The comment to add'),
line: z.number().describe('The line number'),
side: z
.enum(['LEFT', 'RIGHT'])
.describe('If Deletion then LEFT, else RIGHT'),
path: z.string().describe('The path to the file')
})
})
}
7 changes: 4 additions & 3 deletions lib/tools/codeSearch.ts
Original file line number Diff line number Diff line change
Expand Up @@ -15,9 +15,9 @@ export function codebaseSearch({
return new DynamicStructuredTool({
tedspare marked this conversation as resolved.
Show resolved Hide resolved
description:
'Search the codebase by query. Uses vector similarity; format queries to make use of this.',
func: async ({query}) => {
func: async ({query, filePath}) => {
const db = new Weaviate(customerId)
const docs = await db.searchCode(query, repoFullName)
const docs = await db.searchCode(query, repoFullName, 3, filePath)

if (!docs?.length) return 'No results found'

Expand All @@ -30,7 +30,8 @@ export function codebaseSearch({
},
name: 'searchCode',
schema: z.object({
query: z.string().describe('The query to search')
query: z.string().describe('The query to search'),
filePath: z.string().optional().describe('The file path to search')
})
})
}
Loading