
# 🧠 Oracle GraphRAG RFP AI -- Complete Tutorial

Enterprise-grade deterministic RFP validation engine built with:

-   Oracle Autonomous Database 23ai
-   Oracle Property Graph
-   OCI Generative AI (LLMs + Embeddings)
-   FAISS Vector Search
-   Flask REST API
-   Hybrid Graph + Vector + JSON reasoning

------------------------------------------------------------------------

# 📌 Introduction

This project implements a **deterministic RFP validation engine**.

Unlike traditional RAG systems, which generate free-form conceptual
answers, this solution is designed to:

-   Validate contractual and compliance requirements
-   Return only one of three verdicts: YES, NO, or PARTIAL
-   Cite exact documentary evidence for every verdict
-   Minimize hallucination risk by forbidding inference beyond the source text
-   Keep every answer traceable to its source document

This tutorial walks through the full architecture and implementation.

------------------------------------------------------------------------

# 🏗️ Full Architecture

    PDF Documents
     └─► Semantic Chunking
         ├─► FAISS Vector Index
         ├─► LLM Triple Extraction
         │     └─► Oracle 23ai Property Graph
         │           ├─► Structured JSON Node Properties
         │           ├─► Edge Confidence Weights
         │           └─► Evidence Table
         └─► Hybrid Retrieval Layer
                ├─► Vector Recall
                ├─► Graph Filtering
                ├─► Oracle Text
                └─► Graph-aware Reranking
                      └─► Deterministic LLM Decision
                            └─► REST Response

------------------------------------------------------------------------

# 🧩 Step 1 -- Environment Setup

You need:

-   Oracle Autonomous Database 23ai
-   OCI Generative AI enabled
-   Python 3.10+
-   FAISS installed
-   Oracle Python driver (`oracledb`)

Install dependencies:

    pip install oracledb langchain faiss-cpu flask pypandoc

------------------------------------------------------------------------

# 📄 Step 2 -- PDF Ingestion

-   Load PDFs
-   Perform semantic chunking
-   Normalize headings and tables
-   Store chunk metadata including:
    -   chunk_hash
    -   source_url

Chunks feed both:

-   FAISS
-   Graph extraction
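
The chunk metadata listed above can be sketched as follows. This is only an illustration: the `Chunk` dataclass and the `make_chunk` helper are hypothetical names, and the use of SHA-256 for `chunk_hash` is an assumption, since the tutorial does not say which hash the project uses.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical container for one semantic chunk and its metadata."""
    text: str
    chunk_hash: str
    source_url: str


def make_chunk(text: str, source_url: str) -> Chunk:
    # Collapse whitespace so trivial layout differences hash identically.
    normalized = " ".join(text.split())
    # Assumption: SHA-256 over the UTF-8 bytes of the normalized text.
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return Chunk(text=normalized, chunk_hash=digest, source_url=source_url)
```

Because the same chunk feeds both FAISS and graph extraction, a stable hash like this could serve as the key that links a vector hit back to the graph evidence extracted from the same text.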

------------------------------------------------------------------------

# 🧠 Step 3 -- Triple Extraction (Graph Creation)

The function

    create_knowledge_graph(chunks)

uses an LLM to extract only explicit relationships:

    SERVICE -[SUPPORTS_CAPABILITY]-> CAPABILITY
    SERVICE -[DOES_NOT_SUPPORT]-> CAPABILITY
    SERVICE -[HAS_LIMITATION]-> LIMITATION
    SERVICE -[HAS_SLA]-> SLA_VALUE

No inference is allowed: a relationship is recorded only when the chunk
states it outright.
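
One way to back up the "explicit relationships only" rule is to filter whatever the LLM returns against the four relation types listed above. The sketch below assumes the LLM output has already been parsed into `(subject, relation, object)` tuples; `filter_triples` is a hypothetical helper, not necessarily part of `create_knowledge_graph`.

```python
ALLOWED_RELATIONS = {
    "SUPPORTS_CAPABILITY",
    "DOES_NOT_SUPPORT",
    "HAS_LIMITATION",
    "HAS_SLA",
}


def filter_triples(raw_triples):
    """Keep only well-formed triples whose relation is in the allowed set."""
    kept = []
    for triple in raw_triples:
        # Expect exactly (subject, relation, object); skip anything else.
        if len(triple) != 3:
            continue
        subject, relation, obj = (str(part).strip() for part in triple)
        # Drop triples with an empty subject or object.
        if not subject or not obj:
            continue
        relation = relation.upper()
        # Drop relation types the LLM made up.
        if relation not in ALLOWED_RELATIONS:
            continue
        kept.append((subject, relation, obj))
    return kept
```

A filter like this cannot tell whether a relationship was really stated in the text, but it does stop unexpected edge types from entering the graph.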

------------------------------------------------------------------------

# 🏛️ Step 4 -- Oracle Property Graph Setup

The graph is created automatically (`GRAPH_NAME` stands for the name of
your graph):

    CREATE PROPERTY GRAPH GRAPH_NAME
    VERTEX TABLES (...)
    EDGE TABLES (...)

Nodes are stored in:

    KG_NODES_GRAPH_NAME

Edges in:

    KG_EDGES_GRAPH_NAME

Evidence in:

    KG_EVIDENCE_GRAPH_NAME
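
The naming convention above can be captured in a small helper. This is a sketch under assumptions: `graph_table_names` is a hypothetical function, and the validation rule (letters, digits, and underscores only, starting with a letter) is a deliberately conservative choice. Validating the name matters because identifiers in DDL cannot be passed as bind variables, so the name ends up interpolated into the SQL text.

```python
import re

# Conservative subset of Oracle's unquoted-identifier rules (an assumption
# made for this sketch): a letter followed by letters, digits, or underscores.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def graph_table_names(graph_name: str) -> dict:
    """Derive the node, edge, and evidence table names for a graph."""
    if not _IDENTIFIER.match(graph_name):
        raise ValueError(f"unsafe graph name: {graph_name!r}")
    name = graph_name.upper()
    return {
        "nodes": f"KG_NODES_{name}",
        "edges": f"KG_EDGES_{name}",
        "evidence": f"KG_EVIDENCE_{name}",
    }
```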

------------------------------------------------------------------------

# 🧩 Step 5 -- Structured Node Properties (Important)

Each node includes structured JSON properties.

Default structure:

``` json
{
  "metadata": {
    "created_by": "RFP_AI_V2",
    "version": "2.0",
    "created_at": "UTC_TIMESTAMP"
  },
  "analysis": {
    "confidence_score": null,
    "source": "DOCUMENT_RAG",
    "extraction_method": "LLM_TRIPLE_EXTRACTION"
  },
  "governance": {
    "validated": false,
    "review_required": false
  }
}
```

Implementation:

``` python
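from datetime import datetime, timezone


def build_default_node_properties():
    """Return the default JSON property structure attached to every new node."""
    return {
        "metadata": {
            "created_by": "RFP_AI_V2",
            "version": "2.0",
            # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated
            # since Python 3.12.
            "created_at": datetime.now(timezone.utc).isoformat()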
        },
        "analysis": {
            "confidence_score": None,
            "source": "DOCUMENT_RAG",
            "extraction_method": "LLM_TRIPLE_EXTRACTION"
        },
        "governance": {
            "validated": False,
            "review_required": False
        }
    }
```

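This ensures that:

-   No node is stored with an empty `{}` property set
-   Every node records who created it, when, and by which extraction method
-   Governance fields (`validated`, `review_required`) exist from the start
-   New sections can be added later as enterprise requirements grow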
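
When extraction does produce node-specific values, they can be layered over the defaults rather than replacing them, so the full structure is always present. The helper below is a hypothetical sketch of such a recursive merge; it takes the defaults as a parameter (for example, the result of `build_default_node_properties()`) and is not claimed to be how the project does it.

```python
import copy
from typing import Optional


def merge_node_properties(defaults: dict, overrides: Optional[dict]) -> dict:
    """Return defaults with overrides applied recursively; inputs are not mutated."""
    merged = copy.deepcopy(defaults)
    for key, value in (overrides or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them.
            merged[key] = merge_node_properties(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

With this approach, passing `None` or `{}` as overrides still yields the complete default structure, which is what keeps empty property sets out of the node table.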

------------------------------------------------------------------------

# 🔎 Step 6 -- Hybrid Retrieval Strategy

The system combines:

1.  FAISS semantic recall
2.  Graph filtering via Oracle Text
3.  Graph-aware reranking
4.  Deterministic LLM evaluation

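This combination aims for:

-   High recall, from the vector search
-   High precision, from graph filtering and reranking
-   Answers grounded in retrieved evidence rather than model memory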
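
To make the graph-aware reranking step concrete, here is a minimal sketch. It assumes each vector candidate is a `(chunk_hash, score)` pair where a higher score means a closer match (note that some FAISS index types return distances, where lower is better, and would need converting first). The `rerank` function and the additive `graph_boost` are illustrative assumptions, not the project's actual scoring formula.

```python
def rerank(candidates, graph_chunk_hashes, graph_boost=0.25):
    """Boost vector hits whose chunk also backs an edge in the graph.

    candidates: iterable of (chunk_hash, similarity_score) pairs.
    graph_chunk_hashes: set of chunk hashes referenced by matching graph edges.
    """
    scored = []
    for chunk_hash, vector_score in candidates:
        # Chunks confirmed by the graph receive a fixed additive bonus.
        bonus = graph_boost if chunk_hash in graph_chunk_hashes else 0.0
        scored.append((chunk_hash, vector_score + bonus))
    # Highest combined score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

The additive bonus is the simplest possible choice; it lets a graph-confirmed chunk overtake a slightly better vector match without discarding chunks the graph does not know about.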

------------------------------------------------------------------------

# 🎯 Step 7 -- RFP Requirement Parsing

Each RFP question is parsed into a structured requirement:

``` json
{
  "requirement_type": "NON_FUNCTIONAL",
  "subject": "authentication",
  "expected_value": "MFA",
  "keywords": ["authentication", "mfa"]
}
```

This structure guides retrieval and evaluation.
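
A parsed requirement can be validated and normalized before it reaches retrieval. The sketch below assumes the parser emits JSON with the four fields shown above; the `Requirement` dataclass and `parse_requirement` function are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Requirement:
    requirement_type: str
    subject: str
    expected_value: str
    keywords: list = field(default_factory=list)


def parse_requirement(raw_json: str) -> Requirement:
    data = json.loads(raw_json)
    required = ("requirement_type", "subject", "expected_value")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Lowercase, trim, and de-duplicate keywords so matching is consistent.
    keywords = sorted(
        {str(k).strip().lower() for k in data.get("keywords", []) if str(k).strip()}
    )
    return Requirement(
        requirement_type=str(data["requirement_type"]).strip().upper(),
        subject=str(data["subject"]).strip().lower(),
        expected_value=str(data["expected_value"]).strip(),
        keywords=keywords,
    )
```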

------------------------------------------------------------------------

# 📊 Step 8 -- Deterministic Decision Engine

LLM output format:

``` json
{
  "answer": "YES | NO | PARTIAL",
  "confidence": "HIGH | MEDIUM | LOW",
  "justification": "Short factual explanation",
  "evidence": [
    {
      "quote": "Exact document text",
      "source": "Document reference"
    }
  ]
}
```

Rules:

-   If the requirement is not explicitly stated in the documents, the answer is NO
-   No inference beyond what the text says
-   Every answer must cite documentary evidence
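
These rules can also be enforced in code after the LLM responds, instead of relying on the prompt alone. The following sketch is an assumption about how such a guard might look: `enforce_decision_rules` is a hypothetical function, and downgrading an unsupported YES or PARTIAL to NO is one possible reading of the rules above.

```python
import json

ALLOWED_ANSWERS = {"YES", "NO", "PARTIAL"}
ALLOWED_CONFIDENCE = {"HIGH", "MEDIUM", "LOW"}


def _has_text(item, key):
    # True when item is a dict whose value for key is a non-blank string.
    return isinstance(item, dict) and bool(str(item.get(key) or "").strip())


def enforce_decision_rules(raw_json: str) -> dict:
    decision = json.loads(raw_json)
    answer = str(decision.get("answer") or "").strip().upper()
    confidence = str(decision.get("confidence") or "").strip().upper()
    if answer not in ALLOWED_ANSWERS:
        raise ValueError(f"invalid answer: {answer!r}")
    if confidence not in ALLOWED_CONFIDENCE:
        raise ValueError(f"invalid confidence: {confidence!r}")
    # Keep only evidence entries that carry both a quote and a source.
    evidence = [
        item for item in decision.get("evidence") or []
        if _has_text(item, "quote") and _has_text(item, "source")
    ]
    # Assumption: a positive verdict without usable evidence becomes NO.
    if answer != "NO" and not evidence:
        answer = "NO"
    return {
        "answer": answer,
        "confidence": confidence,
        "justification": str(decision.get("justification") or "").strip(),
        "evidence": evidence,
    }
```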

------------------------------------------------------------------------

# 🌐 Step 9 -- Running the Application

Run preprocessing once:

    python graphrag_rerank.py

Start the web UI:

    python app.py

Open:

    http://localhost:8100

Or call the REST endpoint directly:

    curl -X POST http://localhost:8100/chat \
      -H "Content-Type: application/json" \
      -d '{"question": "Does the platform support MFA?"}'
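
The same call can be made from Python using only the standard library. This sketch mirrors the curl command above; `build_chat_request` and `ask` are hypothetical helper names, and the assumption that the endpoint returns a JSON body is not confirmed by this tutorial.

```python
import json
import urllib.request


def build_chat_request(question: str, base_url: str = "http://localhost:8100"):
    """Build the POST request for the /chat endpoint without sending it."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, timeout: float = 60.0) -> dict:
    # Requires app.py to be running; assumes the response body is JSON.
    request = build_chat_request(question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Once the server is up, `ask("Does the platform support MFA?")` would send the request and decode the reply.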

------------------------------------------------------------------------

# 🧪 Example RFP Questions

Typical questions fall into these categories: security, SLA,
performance, compliance, vendor lock-in, backup, and governance.

The engine validates each question with the same deterministic logic.

------------------------------------------------------------------------

# 🔐 Design Principles

-   Evidence-first
-   Deterministic outputs
-   Zero hallucination tolerance
-   Enterprise auditability
-   Structured graph reasoning

------------------------------------------------------------------------

# 🚀 Future Extensions

-   Confidence scoring via graph density
-   Weighted edge scoring
-   SLA numeric comparison engine
-   JSON-based filtering
-   PGQL advanced reasoning
-   Enterprise governance workflows

------------------------------------------------------------------------

# 📌 Conclusion

Oracle GraphRAG RFP AI is not a chatbot.

It is a compliance validation engine built for enterprise RFP
automation, legal due diligence, and procurement decision support.

Deterministic. Traceable. Expandable.