# Master Data Management (MDM) Project Deployment Guide ## 1. Introduction This project implements a **Master Data Management (MDM) pipeline** powered by **AI agents** and **GPU acceleration**. Its purpose is to **normalize, validate, deduplicate, harmonize, and enrich master records** across multiple domains, such as: - **Customer records** (names, phone numbers, emails, addresses, etc.) - **Product data** (SKU, EAN, units, volumes, etc.) - **Supplier information** (legal entities, CNPJs, contact data) - **Financial data** (transaction codes, normalization rules) - **Address standardization** (postal codes, neighborhoods, city/state consistency) ### Example Use Cases - Consolidating duplicated **customer profiles** coming from multiple systems (CRM, ERP, Mobile App). - **Normalizing Brazilian addresses** with CEP validation via **ZipCodeBase API**. - Formatting **CPF, CNPJ, and phone numbers** into consistent formats. - Enriching records with **external data sources** (postal APIs, product catalogs). ### Infrastructure This deployment is designed for **NVIDIA A10 GPU instances** on **Oracle Cloud Infrastructure (OCI)**. OCI provides **specialized GPU compute shapes** that are CUDA-enabled, allowing high performance for **large language models (LLMs)** and **parallel inference workloads**. The system leverages **CUDA acceleration** to maximize throughput and process large amounts of records efficiently, distributing the workload across multiple GPU endpoints. --- ## 2. Prerequisites ### Hardware - **GPU**: NVIDIA A10 or higher (OCI `VM.GPU.A10.1` or `BM.GPU.A10.4`). - **vCPUs**: Minimum 16 cores. - **RAM**: Minimum 64 GB. - **Disk**: At least 200 GB SSD (recommended NVMe). ### Software - **Operating System**: Oracle Linux 8 or Ubuntu 22.04. - **CUDA Toolkit**: Version 12.2+ with NVIDIA drivers installed. - **Python**: Version 3.10 or higher. - **Ollama**: Serving local LLMs in GGUF format. - **Conda Environment**: ```bash conda create -n mdm python=3.10 -y conda activate mdm pip install -r requirements.txt ``` ### Required Python Packages - `fastapi` - `uvicorn` - `httpx` - `pydantic` - `orjson` - `rake-nltk` - `regex` - `numpy` ### External Services - **ZipCodeBase API key** for address enrichment. - Access to **OCI tenancy** with GPU compute shapes enabled. --- ## 3. Understand the Architecture The project follows a **modular architecture** with clear separation of responsibilities. ```mermaid flowchart TD A[Input Records] --> B[FastAPI App - mdm_app] B --> C[Normalize Service] B --> D[Validate Service] B --> E[Deduplication Service] B --> F[Address Parser Service] B --> G[ZipCodeBase Enrichment] C --> H[(Ollama GPU - CUDA A10)] D --> H E --> H F --> H G --> I[(ZipCodeBase API)] H --> J[Golden Record Consolidation] J --> K[Output JSON Results] ``` ### Module Responsibilities - **FastAPI App**: Orchestrates API requests and workflows. - **Normalize Service**: Uses LLM to reformat CPF, CNPJ, phone, and names. - **Validate Service**: Ensures compliance with domain-specific rules. - **Deduplication Service**: Detects and merges duplicate records. - **Address Parser Service**: Extracts structured components (street, city, neighborhood, state). - **ZipCodeBase Enrichment**: Complements address data with official postal information. - **Golden Record Consolidation**: Produces a unified, conflict-free record. --- ## 4. Deploy the Application ### Step 1 — Prepare Environment ```bash git clone https://github.com/your-org/mdm-server.git cd mdm-server conda activate mdm ``` ### Step 2 — Configure Environment Variables # Ollama Multi-GPU Setup on OCI A10 ## Systemd Services **/etc/systemd/system/ollama-gpu0.service** ``` bash [Unit] Description=Ollama on GPU0 (A10 #0) After=network.target [Service] User=opc Group=opc # <<< IMPORTANT: SAME MODEL FOLDER FOR BOTH >>> Environment=OLLAMA_MODELS=/home/opc/.ollama/models Environment=CUDA_VISIBLE_DEVICES=0 # On the server do not use "http://" Environment=OLLAMA_HOST=127.0.0.1:11434 Environment=OLLAMA_NUM_PARALLEL=4 # Keeps the model loaded between calls Environment=OLLAMA_KEEP_ALIVE=5m # Useful verbose logs (INFO/DEBUG) Environment=OLLAMA_DEBUG=INFO ExecStart=/usr/local/bin/ollama serve Restart=always RestartSec=2s LimitNOFILE=65535 [Install] WantedBy=multi-user.target ``` **/etc/systemd/system/ollama-gpu1.service** ``` bash [Unit] Description=Ollama on GPU1 (A10 #1) After=network.target [Service] User=opc Group=opc # <<< SAME MODEL FOLDER AS GPU0 >>> Environment=OLLAMA_MODELS=/home/opc/.ollama/models Environment=CUDA_VISIBLE_DEVICES=1 Environment=OLLAMA_HOST=127.0.0.1:11435 Environment=OLLAMA_NUM_PARALLEL=4 Environment=OLLAMA_KEEP_ALIVE=5m Environment=OLLAMA_DEBUG=INFO ExecStart=/usr/local/bin/ollama serve Restart=always RestartSec=2s LimitNOFILE=65535 [Install] WantedBy=multi-user.target ``` ## Ollama Activation ``` bash sudo systemctl daemon-reload sudo systemctl enable --now ollama-gpu0 ollama-gpu1 journalctl -u ollama-gpu0 -f & journalctl -u ollama-gpu1 -f & ``` ## CUDA Configuration Variables ``` sh # 2 endpoints for 2 GPUs export OLLAMA_ENDPOINTS="http://127.0.0.1:11434,http://127.0.0.1:11435" # Model-side settings (per server) export NUM_CTX=8192 export NUM_BATCH=1024 # later try 1280→1536→2048 if you have VRAM export NUM_GPU=999 # “all layers on GPU” export NUM_THREAD=48 # ~ useful vCPUs, not 600 # App concurrency export CONCURRENCY_NORMALIZE=24 export CONCURRENCY_ADDRESS=24 # Timeouts/logs export REQUEST_TIMEOUT=180 export LOG_LEVEL=INFO ``` ### Step 3 — Run FastAPI Application ```bash uvicorn mdm_app.app:app --host 0.0.0.0 --port 8080 --workers 4 ``` --- ## 5. Test ### Send a Test Request ```bash curl -X POST http://localhost:8080/mdm/process -H "Content-Type: application/json" -d '{ "domain": "customer", "operations": ["normalize", "validate", "dedupe", "consolidate"], "records": [ { "source": "CRM", "id": "cust-1001", "name": "Ana Paula", "cpf": "98765432100", "phone": "21988887777", "cep": "22041001", "address": "Rua Figueiredo Magalhaes, 123" } ] }' ``` ### Expected Output - **CPF** formatted as `987.654.321-00`. - **Phone** formatted as `+55 21 98888-7777`. - **CEP** formatted as `22041-001`. - **Address enriched** with neighborhood `Copacabana`, city `Rio de Janeiro`, state `RJ`. - **Golden record** returned with deduplication applied. --- ✅ At this point, the project should be fully deployed, running on **OCI A10 GPUs**, and producing clean, standardized, and enriched master data records. ## Reference - [Oracle Cloud GPU Instances](https://www.oracle.com/cloud/compute/gpu/) - [Using NVidia GPU with Oracle Cloud Infrastructure](https://docs.oracle.com/pt-br/iaas/Content/Compute/References/ngcimage.htm) ## Acknowledgments - **Author** - Cristiano Hoshikawa (Oracle LAD A-Team Solution Engineer)