I have 4 .txt files that I want to build a knowledge graph (KG) from. I wrote a sanitize function to delete any citations between sentences, and I use the default neo4j prompt, but the resulting JSON is empty, and raw_response.txt contains only the context text, not JSON!
I used these two:
import re

def sanitize_chunk(chunk):
    """Remove citations, years, and escape problematic characters."""
    chunk = re.sub(r'\[\d+\]', '', chunk)  # remove [1], [2], ...
    chunk = re.sub(r'\(\d{4}\)', '', chunk)  # remove (2001), (1999), ...
    chunk = chunk.replace('"', "'")  # replace double quotes with single quotes
    chunk = re.sub(r"\s+", " ", chunk).strip()  # collapse runs of whitespace
    return chunk
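For reference, a minimal sketch of what sanitize_chunk does to one sentence. The sample text is made up for illustration and is not taken from the 4 files:

```python
import re

def sanitize_chunk(chunk):
    """Remove citations, years, and escape problematic characters."""
    chunk = re.sub(r'\[\d+\]', '', chunk)       # remove [1], [2], ...
    chunk = re.sub(r'\(\d{4}\)', '', chunk)     # remove (2001), (1999), ...
    chunk = chunk.replace('"', "'")             # replace double quotes
    chunk = re.sub(r"\s+", " ", chunk).strip()  # collapse whitespace
    return chunk

# Hypothetical sample input, only for illustration.
sample = 'Graph databases store relationships [1].  Smith (2001) called them "flexible" [12].'
cleaned = sanitize_chunk(sample)
print(cleaned)
# prints: Graph databases store relationships . Smith called them 'flexible' .
```

Note that a removed citation leaves a space before the period, so the cleaned text is still readable but not identical to the original punctuation.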
kg_prompt = (
    "You are a strict information extraction system.\n"
    "Return ONLY valid JSON in EXACTLY this format:\n"
    '{"nodes": [{"name": "<entity>", "label": "<type>"}], '
    '"relationships": [{"start_node": {"name": "<entity1>"}, '
    '"end_node": {"name": "<entity2>"}, "type": "<relation>"}]}\n'
    "If nothing found, return {\"nodes\": [], \"relationships\": []}.\n"
    "Do NOT include any explanation, citation, or other text.\n\n"
    f"Text:\n'''{rephrased}'''\n"
)
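One way to check whether a raw response contains any usable JSON at all is to scan it for the first object that has the two keys the prompt asks for. The sketch below assumes the response is available as a plain string; extract_kg_json is a hypothetical helper name, not part of neo4j or any other library, and the sample responses are invented:

```python
import json

def extract_kg_json(raw):
    """Return the first JSON object in `raw` with 'nodes' and 'relationships' keys.

    Falls back to an empty graph when no such object is found.
    """
    decoder = json.JSONDecoder()
    for start, ch in enumerate(raw):
        if ch != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores whatever text follows it.
            obj, _ = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict) and "nodes" in obj and "relationships" in obj:
            return obj
    return {"nodes": [], "relationships": []}

# Hypothetical responses, only for illustration.
wrapped = 'Here is the result:\n{"nodes": [{"name": "Neo4j", "label": "Database"}], "relationships": []}\nDone.'
only_context = "Graph databases store relationships between entities."

print(extract_kg_json(wrapped)["nodes"])
print(extract_kg_json(only_context))
```

If this returns the empty graph for every chunk, the model is echoing the context instead of answering in JSON, which points at the prompt or model settings rather than at the JSON parsing step.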
Many thanks for your response.