cristiano.hoshikawa/rfp_response_automation

mirror of https://github.com/hoshikawa2/rfp_response_automation.git synced 2026-03-03 16:09:35 +00:00

hoshikawa2 60f0dcaac4 first commit

2026-02-18 20:34:33 -03:00

🧠 Oracle GraphRAG RFP AI -- Complete Tutorial

Enterprise-grade deterministic RFP validation engine built with:

Oracle Autonomous Database 23ai
Oracle Property Graph
OCI Generative AI (LLMs + Embeddings)
FAISS Vector Search
Flask REST API
Hybrid Graph + Vector + JSON reasoning

📌 Introduction

This project implements a deterministic RFP validation engine.

Unlike traditional RAG systems that generate conceptual answers, this solution is designed to:

Validate contractual and compliance requirements
Produce only: YES / NO / PARTIAL
Provide exact documentary evidence
Minimize hallucination risk by grounding every answer in quoted evidence
Ensure full traceability

This tutorial walks through the full architecture and implementation.

🏗️ Full Architecture

PDF Documents
 └─► Semantic Chunking
     ├─► FAISS Vector Index
     ├─► LLM Triple Extraction
     │     └─► Oracle 23ai Property Graph
     │           ├─► Structured JSON Node Properties
     │           ├─► Edge Confidence Weights
     │           └─► Evidence Table
     └─► Hybrid Retrieval Layer
            ├─► Vector Recall
            ├─► Graph Filtering
            ├─► Oracle Text
            └─► Graph-aware Reranking
                  └─► Deterministic LLM Decision
                        └─► REST Response

🧩 Step 1 -- Environment Setup

You need:

Oracle Autonomous Database 23ai
OCI Generative AI enabled
Python 3.10+
FAISS installed
Oracle Python driver (oracledb)

Install dependencies:

pip install oracledb langchain faiss-cpu flask pypandoc

📄 Step 2 -- PDF Ingestion

Load PDFs
Perform semantic chunking
Normalize headings and tables
Store chunk metadata including:
- chunk_hash
- source_url

Chunks feed both:

FAISS
Graph extraction
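
The chunk metadata described above can be sketched as follows. This is a minimal illustration, not the project's ingestion code: it assumes the PDF text has already been extracted to a string, and it uses plain paragraph packing as a stand-in for semantic chunking. The function name `chunk_text` and the `max_chars` parameter are hypothetical.

```python
import hashlib
import re


def chunk_text(text: str, source_url: str, max_chars: int = 1000) -> list[dict]:
    """Pack blank-line-separated paragraphs into chunks and attach metadata.

    Paragraph packing is a simplification of semantic chunking. A single
    paragraph longer than max_chars is kept whole as one oversized chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow the chunk: close it, start a new one.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)

    return [
        {
            "text": chunk,
            # A content hash gives a stable identifier for deduplication and tracing.
            "chunk_hash": hashlib.sha256(chunk.encode("utf-8")).hexdigest(),
            "source_url": source_url,
        }
        for chunk in chunks
    ]
```

Because the hash is derived from the chunk text alone, re-ingesting an unchanged document yields the same `chunk_hash` values, which is one way to keep the vector index and the graph pointing at the same chunk.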

🧠 Step 3 -- Triple Extraction (Graph Creation)

The function:

create_knowledge_graph(chunks)

It uses an LLM to extract ONLY explicitly stated relationships:

SERVICE -[SUPPORTS_CAPABILITY]-> CAPABILITY
SERVICE -[DOES_NOT_SUPPORT]-> CAPABILITY
SERVICE -[HAS_LIMITATION]-> LIMITATION
SERVICE -[HAS_SLA]-> SLA_VALUE

No inference allowed.
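
One way to enforce the "explicit relationships only" rule after the LLM call is to discard every triple whose relation is outside the four types listed above. The sketch below assumes the LLM returns one triple per line in the arrow notation used in this step; that line format, the `Triple` class, and `parse_triples` are illustrative assumptions, not the internals of `create_knowledge_graph`.

```python
import re
from dataclasses import dataclass

# Relation types listed in this step; anything else is discarded.
ALLOWED_RELATIONS = {
    "SUPPORTS_CAPABILITY",
    "DOES_NOT_SUPPORT",
    "HAS_LIMITATION",
    "HAS_SLA",
}

# Assumed line format: "<subject> -[<RELATION>]-> <object>"
_TRIPLE_RE = re.compile(r"^\s*(.+?)\s*-\[(\w+)\]->\s*(.+?)\s*$")


@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    obj: str


def parse_triples(raw_output: str) -> list[Triple]:
    """Keep only well-formed triples with an allowed relation, without duplicates."""
    triples: list[Triple] = []
    seen: set[Triple] = set()
    for line in raw_output.splitlines():
        match = _TRIPLE_RE.match(line)
        if not match:
            continue  # free text or malformed output is ignored
        subject, relation, obj = match.groups()
        triple = Triple(subject, relation.upper(), obj)
        if triple.relation not in ALLOWED_RELATIONS or triple in seen:
            continue
        seen.add(triple)
        triples.append(triple)
    return triples
```

Filtering on a closed set of relation names does not prove that the model avoided inference, but it does keep unexpected edge types out of the graph.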

🏛️ Step 4 -- Oracle Property Graph Setup

The graph is created automatically:

CREATE PROPERTY GRAPH GRAPH_NAME
VERTEX TABLES (...)
EDGE TABLES (...)

Nodes are stored in:

KG_NODES_GRAPH_NAME

Edges in:

KG_EDGES_GRAPH_NAME

Evidence in:

KG_EVIDENCE_GRAPH_NAME
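
The naming convention above can be sketched as a small helper that derives the three table names from a graph name and renders a `CREATE PROPERTY GRAPH` statement. The column names `ID`, `SOURCE_ID`, and `TARGET_ID`, the upper-casing of the graph name, and both function names are assumptions made for illustration; the project's actual DDL is elided in this tutorial and may differ.

```python
import re


def graph_table_names(graph_name: str) -> dict[str, str]:
    """Derive node, edge, and evidence table names from a graph name."""
    # Accept only simple identifiers so the name is safe to embed in DDL.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", graph_name):
        raise ValueError(f"Invalid graph name: {graph_name!r}")
    name = graph_name.upper()
    return {
        "nodes": f"KG_NODES_{name}",
        "edges": f"KG_EDGES_{name}",
        "evidence": f"KG_EVIDENCE_{name}",
    }


def render_property_graph_ddl(graph_name: str) -> str:
    """Render a property graph DDL statement (column names are hypothetical)."""
    tables = graph_table_names(graph_name)
    return (
        f"CREATE PROPERTY GRAPH {graph_name.upper()}\n"
        f"  VERTEX TABLES (\n"
        f"    {tables['nodes']} KEY (ID)\n"
        f"  )\n"
        f"  EDGE TABLES (\n"
        f"    {tables['edges']} KEY (ID)\n"
        f"      SOURCE KEY (SOURCE_ID) REFERENCES {tables['nodes']}(ID)\n"
        f"      DESTINATION KEY (TARGET_ID) REFERENCES {tables['nodes']}(ID)\n"
        f"  )"
    )
```

Validating the graph name before interpolation matters here because DDL identifiers cannot be passed as bind variables.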

🧩 Step 5 -- Structured Node Properties (Important)

Each node includes structured JSON properties.

Default structure:

{
  "metadata": {
    "created_by": "RFP_AI_V2",
    "version": "2.0",
    "created_at": "UTC_TIMESTAMP"
  },
  "analysis": {
    "confidence_score": null,
    "source": "DOCUMENT_RAG",
    "extraction_method": "LLM_TRIPLE_EXTRACTION"
  },
  "governance": {
    "validated": false,
    "review_required": false
  }
}

Implementation:

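from datetime import datetime, timezone

def build_default_node_properties():
    """Return the default JSON property structure attached to every node."""
    return {
        "metadata": {
            "created_by": "RFP_AI_V2",
            "version": "2.0",
            # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12
            "created_at": datetime.now(timezone.utc).isoformat()
        },
        "analysis": {
            # Left unset until a confidence score is computed
            "confidence_score": None,
            "source": "DOCUMENT_RAG",
            "extraction_method": "LLM_TRIPLE_EXTRACTION"
        },
        "governance": {
            "validated": False,
            "review_required": False
        }
    }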

This guarantees:

No empty {} stored
Auditability
Governance extension capability
Enterprise extensibility

🔎 Step 6 -- Hybrid Retrieval Strategy

The system combines:

FAISS semantic recall
Graph filtering via Oracle Text
Graph-aware reranking
Deterministic LLM evaluation

This ensures:

High recall
High precision
Answers grounded in retrieved evidence rather than model guesses
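
The reranking stage can be illustrated with a simple additive scheme: each candidate keeps its vector similarity and receives a fixed boost when its chunk is also referenced by a matching graph node. The `Candidate` class, the `rerank` function, and the default boost of 0.2 are assumptions chosen for this sketch; the project's actual scoring may differ.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    chunk_hash: str
    text: str
    vector_score: float  # similarity from vector recall, higher is better


def rerank(
    candidates: list[Candidate],
    graph_chunk_hashes: set[str],
    graph_boost: float = 0.2,
    top_k: int = 5,
) -> list[Candidate]:
    """Order candidates by vector score plus a boost for graph-backed chunks."""

    def score(candidate: Candidate) -> float:
        boost = graph_boost if candidate.chunk_hash in graph_chunk_hashes else 0.0
        return candidate.vector_score + boost

    # sorted() is stable, so ties keep their original recall order.
    return sorted(candidates, key=score, reverse=True)[:top_k]
```

The boost lets a chunk with slightly lower similarity outrank a closer match when the graph confirms it is about the requested service or capability.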

🎯 Step 7 -- RFP Requirement Parsing

Each question becomes structured:

{
  "requirement_type": "NON_FUNCTIONAL",
  "subject": "authentication",
  "expected_value": "MFA",
  "keywords": ["authentication", "mfa"]
}

This structure guides retrieval and evaluation.
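
A typed container makes it harder for a malformed parse to reach retrieval. The sketch below assumes the parsed requirement arrives as a JSON string with the four fields shown above; the `Requirement` class and `parse_requirement` function are hypothetical names, not part of the project.

```python
import json
from dataclasses import dataclass, field

REQUIRED_FIELDS = ("requirement_type", "subject", "expected_value", "keywords")


@dataclass
class Requirement:
    requirement_type: str
    subject: str
    expected_value: str
    keywords: list[str] = field(default_factory=list)


def parse_requirement(payload: str) -> Requirement:
    """Validate a JSON requirement and normalize its keywords."""
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("Requirement must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"Missing fields: {', '.join(missing)}")

    # Lower-case, trim, and de-duplicate keywords while preserving order.
    cleaned = (str(k).strip().lower() for k in data["keywords"])
    keywords = list(dict.fromkeys(k for k in cleaned if k))

    return Requirement(
        requirement_type=str(data["requirement_type"]).strip().upper(),
        subject=str(data["subject"]).strip(),
        expected_value=str(data["expected_value"]).strip(),
        keywords=keywords,
    )
```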

📊 Step 8 -- Deterministic Decision Engine

LLM output format:

{
  "answer": "YES | NO | PARTIAL",
  "confidence": "HIGH | MEDIUM | LOW",
  "justification": "Short factual explanation",
  "evidence": [
    {
      "quote": "Exact document text",
      "source": "Document reference"
    }
  ]
}

Rules:

If not explicitly stated → NO
No inference
Must provide documentary evidence

🌐 Step 9 -- Running the Application

Run preprocessing once:

python graphrag_rerank.py

Run the web UI:

python app.py

Open:

http://localhost:8100

Or use REST:

curl -X POST http://localhost:8100/chat -H "Content-Type: application/json" -d '{"question": "Does the platform support MFA?"}'
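
The same request can be issued from Python using only the standard library. This sketch mirrors the curl call above; it assumes the `/chat` endpoint returns a JSON body, which the tutorial does not state explicitly, and the helper names `build_chat_request` and `ask` are illustrative.

```python
import json
import urllib.request


def build_chat_request(
    question: str, base_url: str = "http://localhost:8100"
) -> urllib.request.Request:
    """Build the POST request for the /chat endpoint."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(
    question: str, base_url: str = "http://localhost:8100", timeout: float = 60.0
) -> dict:
    """Send a question and decode the response, assuming it is JSON."""
    request = build_chat_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the application running, `ask("Does the platform support MFA?")` sends the same payload as the curl example.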

🧪 Example RFP Questions

Example questions typically cover Security, SLA, Performance, Compliance, Vendor Lock-in, Backup, and Governance.

The engine validates each question with the same deterministic logic.

🔐 Design Principles

Evidence-first
Deterministic outputs
Zero hallucination tolerance
Enterprise auditability
Structured graph reasoning

🚀 Future Extensions

Confidence scoring via graph density
Weighted edge scoring
SLA numeric comparison engine
JSON-based filtering
PGQL advanced reasoning
Enterprise governance workflows

📌 Conclusion

Oracle GraphRAG RFP AI is not a chatbot.

It is a compliance validation engine built for enterprise RFP automation, legal due diligence, and procurement decision support.

Deterministic. Traceable. Expandable.
