🧠 Oracle GraphRAG RFP AI -- Complete Tutorial

Enterprise-grade deterministic RFP validation engine built with:

  • Oracle Autonomous Database 23ai
  • Oracle Property Graph
  • OCI Generative AI (LLMs + Embeddings)
  • FAISS Vector Search
  • Flask REST API
  • Hybrid Graph + Vector + JSON reasoning

📌 Introduction

This project implements a deterministic RFP validation engine.

Unlike traditional RAG systems that generate conceptual answers, this solution is designed to:

  • Validate contractual and compliance requirements
  • Produce only: YES / NO / PARTIAL
  • Provide exact documentary evidence
  • Reduce hallucination risk by grounding every answer in quoted evidence
  • Ensure full traceability

This tutorial walks through the full architecture and implementation.


🏗️ Full Architecture

PDF Documents
 └─► Semantic Chunking
     ├─► FAISS Vector Index
     ├─► LLM Triple Extraction
     │     └─► Oracle 23ai Property Graph
     │           ├─► Structured JSON Node Properties
     │           ├─► Edge Confidence Weights
     │           └─► Evidence Table
     └─► Hybrid Retrieval Layer
            ├─► Vector Recall
            ├─► Graph Filtering
            ├─► Oracle Text
            └─► Graph-aware Reranking
                  └─► Deterministic LLM Decision
                        └─► REST Response

🧩 Step 1 -- Environment Setup

You need:

  • Oracle Autonomous Database 23ai
  • OCI Generative AI enabled
  • Python 3.10+
  • FAISS installed
  • Oracle Python driver (oracledb)

Install dependencies:

pip install oracledb langchain faiss-cpu flask pypandoc

📄 Step 2 -- PDF Ingestion

  • Load PDFs
  • Perform semantic chunking
  • Normalize headings and tables
  • Store chunk metadata including:
    • chunk_hash
    • source_url

Chunks feed both:

  • FAISS
  • Graph extraction
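The chunk metadata described above can be sketched as follows. This is a minimal illustration, not the project's actual ingestion code: the `Chunk` class, `normalize_text`, `make_chunk`, and the choice of SHA-256 are assumptions. Only the field names `chunk_hash` and `source_url` come from this tutorial.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical container for one chunk and its metadata."""
    text: str
    chunk_hash: str
    source_url: str


def normalize_text(text: str) -> str:
    # Collapse runs of whitespace so layout-only changes do not alter the hash.
    return " ".join(text.split())


def make_chunk(text: str, source_url: str) -> Chunk:
    # SHA-256 is an assumption; any stable content hash would serve the same purpose.
    normalized = normalize_text(text)
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return Chunk(text=normalized, chunk_hash=digest, source_url=source_url)
```

A content-derived hash gives the FAISS index and the graph a shared key for the same chunk, so evidence found through either path can be traced back to one source passage.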

🧠 Step 3 -- Triple Extraction (Graph Creation)

The function:

create_knowledge_graph(chunks)

It uses an LLM to extract only relationships that are explicitly stated in the text:

SERVICE -[SUPPORTS_CAPABILITY]-> CAPABILITY
SERVICE -[DOES_NOT_SUPPORT]-> CAPABILITY
SERVICE -[HAS_LIMITATION]-> LIMITATION
SERVICE -[HAS_SLA]-> SLA_VALUE

No inference is allowed: a relationship that the text does not state must not be extracted.
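One way to enforce the relation whitelist is to parse each line the LLM returns and discard anything that does not match the four relation types listed above. The sketch below assumes the LLM emits one triple per line in the arrow notation shown. The names `parse_triple`, `Triple`, and `ALLOWED_RELATIONS` are illustrative and not part of the project.

```python
import re
from typing import NamedTuple, Optional

# The four relation types listed in this step; everything else is rejected.
ALLOWED_RELATIONS = {
    "SUPPORTS_CAPABILITY",
    "DOES_NOT_SUPPORT",
    "HAS_LIMITATION",
    "HAS_SLA",
}

# Matches lines of the form: SOURCE -[RELATION]-> TARGET
TRIPLE_PATTERN = re.compile(r"^\s*(.+?)\s*-\[\s*([A-Z_]+)\s*\]->\s*(.+?)\s*$")


class Triple(NamedTuple):
    source: str
    relation: str
    target: str


def parse_triple(line: str) -> Optional[Triple]:
    """Return a Triple for a well-formed, whitelisted line, else None."""
    match = TRIPLE_PATTERN.match(line)
    if match is None:
        return None
    source, relation, target = match.groups()
    if relation not in ALLOWED_RELATIONS:
        # Unknown relation types are dropped rather than guessed at.
        return None
    return Triple(source, relation, target)
```

Rejecting unknown relations at parse time keeps the graph schema closed, which is what makes later graph filtering predictable.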


🏛️ Step 4 -- Oracle Property Graph Setup

The property graph is created automatically with a statement of this form:

CREATE PROPERTY GRAPH GRAPH_NAME
VERTEX TABLES (...)
EDGE TABLES (...)

Nodes are stored in:

KG_NODES_GRAPH_NAME

Edges in:

KG_EDGES_GRAPH_NAME

Evidence in:

KG_EVIDENCE_GRAPH_NAME
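The naming convention above can be captured in a small helper. The sketch below is an assumption about how such names might be derived and validated; the function names and the column names `ID`, `SOURCE_ID`, and `TARGET_ID` are hypothetical and should be replaced with the project's actual schema. The generated statement follows the general shape of a SQL property graph definition and is meant as a starting point only.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def graph_table_names(graph_name: str) -> dict:
    """Derive the node, edge, and evidence table names for a graph."""
    # DDL cannot use bind variables, so reject anything that is not a plain identifier.
    if not _IDENTIFIER.fullmatch(graph_name):
        raise ValueError(f"Unsafe graph name: {graph_name!r}")
    return {
        "nodes": f"KG_NODES_{graph_name}",
        "edges": f"KG_EDGES_{graph_name}",
        "evidence": f"KG_EVIDENCE_{graph_name}",
    }


def property_graph_ddl(graph_name: str) -> str:
    """Build a CREATE PROPERTY GRAPH statement over the derived tables."""
    tables = graph_table_names(graph_name)
    nodes, edges = tables["nodes"], tables["edges"]
    # ID, SOURCE_ID and TARGET_ID are assumed column names, not taken from the project.
    return "\n".join([
        f"CREATE PROPERTY GRAPH {graph_name}",
        "  VERTEX TABLES (",
        f"    {nodes} KEY (ID)",
        "  )",
        "  EDGE TABLES (",
        f"    {edges} KEY (ID)",
        f"      SOURCE KEY (SOURCE_ID) REFERENCES {nodes} (ID)",
        f"      DESTINATION KEY (TARGET_ID) REFERENCES {nodes} (ID)",
        "  )",
    ])
```

Validating the graph name before building the statement matters because identifiers are interpolated directly into the DDL text.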

🧩 Step 5 -- Structured Node Properties (Important)

Each node includes structured JSON properties.

Default structure:

{
  "metadata": {
    "created_by": "RFP_AI_V2",
    "version": "2.0",
    "created_at": "UTC_TIMESTAMP"
  },
  "analysis": {
    "confidence_score": null,
    "source": "DOCUMENT_RAG",
    "extraction_method": "LLM_TRIPLE_EXTRACTION"
  },
  "governance": {
    "validated": false,
    "review_required": false
  }
}

Implementation:

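from datetime import datetime, timezone


def build_default_node_properties():
    # Build a fresh dict on every call so nodes never share mutable state.
    return {
        "metadata": {
            "created_by": "RFP_AI_V2",
            "version": "2.0",
            # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12.
            "created_at": datetime.now(timezone.utc).isoformat()
        },
        "analysis": {
            "confidence_score": None,
            "source": "DOCUMENT_RAG",
            "extraction_method": "LLM_TRIPLE_EXTRACTION"
        },
        "governance": {
            "validated": False,
            "review_required": False
        }
    }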

This guarantees:

  • No empty {} stored
  • Auditability
  • Governance extension capability
  • Enterprise extensibility
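To keep the "no empty {}" guarantee when a caller supplies only some properties, the defaults can be merged with the caller's values before the node is stored. The sketch below is one possible approach: `merge_node_properties` and `_deep_update` are hypothetical helpers, and the default builder is repeated here only so the example runs on its own.

```python
import copy
from datetime import datetime, timezone
from typing import Optional


def build_default_node_properties() -> dict:
    # Same shape as the default structure shown in this step.
    return {
        "metadata": {
            "created_by": "RFP_AI_V2",
            "version": "2.0",
            "created_at": datetime.now(timezone.utc).isoformat(),
        },
        "analysis": {
            "confidence_score": None,
            "source": "DOCUMENT_RAG",
            "extraction_method": "LLM_TRIPLE_EXTRACTION",
        },
        "governance": {"validated": False, "review_required": False},
    }


def _deep_update(base: dict, overrides: dict) -> None:
    # Recurse into nested dicts so a partial override keeps sibling defaults.
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            _deep_update(base[key], value)
        else:
            base[key] = copy.deepcopy(value)


def merge_node_properties(overrides: Optional[dict] = None) -> dict:
    """Return the defaults with any caller-supplied values layered on top."""
    merged = build_default_node_properties()
    _deep_update(merged, overrides or {})
    return merged
```

For example, `merge_node_properties({"analysis": {"confidence_score": 0.9}})` sets the score while leaving `source` and `extraction_method` at their defaults.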

🔎 Step 6 -- Hybrid Retrieval Strategy

The system combines:

  1. FAISS semantic recall
  2. Graph filtering via Oracle Text
  3. Graph-aware reranking
  4. Deterministic LLM evaluation

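This combination is designed to provide:

  • High recall from the vector search
  • High precision from graph filtering and reranking
  • Answers grounded in retrieved evidence rather than model guesses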
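Graph-aware reranking can be illustrated with a simple weighted score. The formula below is an assumption made for illustration and is not the project's actual ranking function: it blends the vector similarity with a capped count of graph edges that mention the requirement's subject. `Candidate`, `rerank`, `graph_weight`, and `max_hits` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    chunk_hash: str
    vector_score: float  # similarity in [0, 1], higher is better
    graph_hits: int      # graph edges linking this chunk to the requirement subject


def rerank(
    candidates: List[Candidate],
    graph_weight: float = 0.3,
    max_hits: int = 3,
) -> List[Candidate]:
    """Sort candidates by a blend of vector similarity and graph support."""

    def combined(candidate: Candidate) -> float:
        # Cap the hit count so one heavily connected chunk cannot dominate.
        graph_score = min(candidate.graph_hits, max_hits) / max_hits
        return (1 - graph_weight) * candidate.vector_score + graph_weight * graph_score

    return sorted(candidates, key=combined, reverse=True)
```

With the default weight, a chunk with slightly lower vector similarity but strong graph support can outrank a chunk that is only semantically close. Setting `graph_weight=0` falls back to pure vector ordering.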

🎯 Step 7 -- RFP Requirement Parsing

Each question becomes structured:

{
  "requirement_type": "NON_FUNCTIONAL",
  "subject": "authentication",
  "expected_value": "MFA",
  "keywords": ["authentication", "mfa"]
}

This structure guides retrieval and evaluation.
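The structured form can be validated before it reaches retrieval. The sketch below assumes the parser returns the JSON shown above as a string. The `Requirement` dataclass and `parse_requirement` function are illustrative names, and the keyword normalization (lowercasing and de-duplication) is an added assumption.

```python
import json
from dataclasses import dataclass, field
from typing import List

REQUIRED_FIELDS = ("requirement_type", "subject", "expected_value", "keywords")


@dataclass
class Requirement:
    requirement_type: str
    subject: str
    expected_value: str
    keywords: List[str] = field(default_factory=list)


def parse_requirement(raw_json: str) -> Requirement:
    """Parse and validate the structured form of one RFP question."""
    data = json.loads(raw_json)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"Missing fields: {missing}")
    # Lowercase and de-duplicate keywords so retrieval filters match consistently.
    keywords = sorted({str(k).strip().lower() for k in data["keywords"] if str(k).strip()})
    return Requirement(
        requirement_type=data["requirement_type"],
        subject=data["subject"],
        expected_value=data["expected_value"],
        keywords=keywords,
    )
```

Failing fast on a missing field is preferable to passing an incomplete requirement downstream, where it would silently weaken the graph filter.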


📊 Step 8 -- Deterministic Decision Engine

LLM output format:

{
  "answer": "YES | NO | PARTIAL",
  "confidence": "HIGH | MEDIUM | LOW",
  "justification": "Short factual explanation",
  "evidence": [
    {
      "quote": "Exact document text",
      "source": "Document reference"
    }
  ]
}

Rules:

  • If not explicitly stated → NO
  • No inference
  • Must provide documentary evidence
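These rules can also be enforced in code after the LLM responds, rather than trusted to the prompt alone. The sketch below is one possible post-check, not the project's implementation: it keeps only evidence quotes that literally appear in the retrieved chunks and downgrades the answer to NO when a positive answer has no verified quote. The function name, the fallback confidence, and the fallback justification text are assumptions.

```python
from typing import List

ALLOWED_ANSWERS = {"YES", "NO", "PARTIAL"}
ALLOWED_CONFIDENCE = {"HIGH", "MEDIUM", "LOW"}


def _normalize(text: str) -> str:
    # Compare on collapsed whitespace and lowercase to tolerate layout differences.
    return " ".join(text.split()).lower()


def _fallback(reason: str) -> dict:
    return {"answer": "NO", "confidence": "LOW", "justification": reason, "evidence": []}


def enforce_decision_rules(decision: dict, retrieved_chunks: List[str]) -> dict:
    """Validate the LLM output against the rules listed in this step."""
    if decision.get("answer") not in ALLOWED_ANSWERS:
        return _fallback("Model output did not follow the required format.")
    if decision.get("confidence") not in ALLOWED_CONFIDENCE:
        return _fallback("Model output did not follow the required format.")

    haystack = [_normalize(chunk) for chunk in retrieved_chunks]
    verified = []
    for item in decision.get("evidence") or []:
        quote = _normalize(str(item.get("quote") or "")) if isinstance(item, dict) else ""
        # Keep a quote only if it appears verbatim in at least one retrieved chunk.
        if quote and any(quote in chunk for chunk in haystack):
            verified.append(item)

    if decision["answer"] in {"YES", "PARTIAL"} and not verified:
        return _fallback("No verifiable documentary evidence was found.")
    return {**decision, "evidence": verified}
```

Checking quotes against the retrieved text is what turns "must provide documentary evidence" from an instruction into a guarantee: a fabricated quote cannot survive the substring test.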

🌐 Step 9 -- Running the Application

Run preprocessing once:

python graphrag_rerank.py

Run web UI:

python app.py

Open:

http://localhost:8100

Or use REST:

curl -X POST http://localhost:8100/chat -H "Content-Type: application/json" -d '{"question": "Does the platform support MFA?"}'
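The same request can be sent from Python using only the standard library. This sketch mirrors the curl call above. It assumes the `/chat` endpoint returns a JSON body, which this tutorial does not state explicitly; `build_chat_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request


def build_chat_request(
    question: str, base_url: str = "http://localhost:8100"
) -> urllib.request.Request:
    """Build the POST request for the /chat endpoint."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = "http://localhost:8100", timeout: float = 60.0) -> dict:
    """Send one question and decode the response, assumed here to be JSON."""
    request = build_chat_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the application running locally, `ask("Does the platform support MFA?")` sends the same payload as the curl example.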

🧪 Example RFP Questions

Typical questions cover the following topics: Security, SLA, Performance, Compliance, Vendor Lock-in, Backup, Governance.

The engine evaluates each question with the same deterministic rules.


🔐 Design Principles

  • Evidence-first
  • Deterministic outputs
  • Zero hallucination tolerance
  • Enterprise auditability
  • Structured graph reasoning

🚀 Future Extensions

  • Confidence scoring via graph density
  • Weighted edge scoring
  • SLA numeric comparison engine
  • JSON-based filtering
  • PGQL advanced reasoning
  • Enterprise governance workflows

📌 Conclusion

Oracle GraphRAG RFP AI is not a chatbot.

It is a compliance validation engine built for enterprise RFP automation, legal due diligence, and procurement decision support.

Deterministic. Traceable. Expandable.