Conversational AI for Digital Payment Analytics
✦ Project Overview
A full-stack analytics platform that lets users interrogate UPI transaction data without SQL or Python expertise. Users ask questions in plain English, and an LLM-powered backend routes each intent to the appropriate statistical or ML tool, returning aggregations, charts, K-Means clusters, regression models, and counterfactual 'What-If' simulations through a React dashboard.
✦ Key Features
- LLM-driven query routing: natural language questions are parsed by LangChain + Groq LLM and dispatched to specialized Python analytics tools, so users do not have to write queries by hand.
- Analytics suite covering descriptive aggregations, K-Means user segmentation, multivariate, Ridge, Lasso, and Poisson regression, polynomial curve fitting, and cross-validation, all triggered conversationally.
- What-If Lab powered by gradient boosting models for success rate, fraud risk, and expected transaction amount; supports scenario simulation, tornado-chart sensitivity analysis, and parameter-combination recommendations.
- React + TypeScript frontend with Recharts visualizations and a real-time conversational interface, making payment analytics accessible to product owners, fraud teams, and business stakeholders.
- Modular backend architecture separating data aggregation (algorithms.py), predictive simulation (stats_engine.py), and API routing (main.py) for maintainability and extensibility.
✦ Methodology
The platform uses an LLM-agentic framework that connects natural language intent to data science tooling, serving both descriptive and predictive analytics through a single conversational interface. A request passes through four stages:
Intent Parsing & Tool Routing
User questions enter the FastAPI backend, where LangChain orchestrates a Groq LLM to classify intent and select the appropriate analytics function (aggregation, clustering, regression, or what-if simulation), so the user never has to write a query by hand.
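The routing step can be sketched as a lookup from an intent label to a tool function. This is a minimal illustration, not the project's code: the tool functions, the `TOOLS` table, and the keyword-based `classify_intent` are all hypothetical stand-ins, and in the real backend the label would presumably come from the LangChain + Groq call rather than keyword matching.

```python
from typing import Callable, Dict

# Hypothetical tool functions. In the project these would live in
# algorithms.py and stats_engine.py; here they only echo their name.
def run_aggregation(question: str) -> dict:
    return {"tool": "aggregation", "question": question}

def run_clustering(question: str) -> dict:
    return {"tool": "clustering", "question": question}

def run_regression(question: str) -> dict:
    return {"tool": "regression", "question": question}

def run_what_if(question: str) -> dict:
    return {"tool": "what_if", "question": question}

# Intent label -> tool function (assumed labels, chosen for illustration).
TOOLS: Dict[str, Callable[[str], dict]] = {
    "aggregation": run_aggregation,
    "clustering": run_clustering,
    "regression": run_regression,
    "what_if": run_what_if,
}

def classify_intent(question: str) -> str:
    """Stand-in for the LLM classifier: simple keyword matching."""
    text = question.lower()
    if "what if" in text or "simulate" in text:
        return "what_if"
    if "cluster" in text or "segment" in text:
        return "clustering"
    if "regression" in text or "predict" in text:
        return "regression"
    return "aggregation"

def route(question: str,
          classifier: Callable[[str], str] = classify_intent) -> dict:
    """Classify the question, then dispatch it to the matching tool."""
    intent = classifier(question)
    # A classifier can return an unexpected label, so fall back to
    # aggregation instead of raising.
    tool = TOOLS.get(intent, TOOLS["aggregation"])
    return tool(question)

if __name__ == "__main__":
    print(route("Segment users by monthly spending"))
    print(route("What if we raise the transaction limit?"))
```

Passing the classifier in as an argument keeps the dispatch logic testable without a live LLM call; the fallback branch covers the case where the model returns a label outside the known set.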
Multi-Model Analytics Execution
The algorithms.py module executes the selected operation: time-series aggregations, K-Means segmentation, or regularized regression. Each function returns structured JSON optimized for direct chart rendering in the frontend.
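As an illustration of the "structured JSON optimized for chart rendering" idea, the sketch below computes a daily aggregation and returns it as a list of flat row objects, which is the shape Recharts accepts for its `data` prop. The field names (`timestamp`, `amount`, `status`), the `SUCCESS` status value, and the payload keys are assumptions made for the example, not the project's actual schema.

```python
import json
from collections import defaultdict

def daily_success_rate(transactions):
    """Group transactions by calendar day and return chart-ready rows."""
    buckets = defaultdict(lambda: {"count": 0, "success": 0, "amount": 0.0})
    for tx in transactions:
        day = tx["timestamp"][:10]  # ISO 8601 string -> "YYYY-MM-DD"
        bucket = buckets[day]
        bucket["count"] += 1
        if tx["status"] == "SUCCESS":
            bucket["success"] += 1
        bucket["amount"] += tx["amount"]

    # One flat object per day, sorted chronologically.
    rows = [
        {
            "date": day,
            "transactions": b["count"],
            "total_amount": round(b["amount"], 2),
            "success_rate": round(b["success"] / b["count"], 4),
        }
        for day, b in sorted(buckets.items())
    ]
    # Hypothetical envelope telling the frontend how to plot the rows.
    return {"chart": "line", "x": "date",
            "series": ["transactions", "success_rate"], "data": rows}

# Made-up sample records, for illustration only.
sample = [
    {"timestamp": "2024-01-01T09:15:00", "amount": 250.0, "status": "SUCCESS"},
    {"timestamp": "2024-01-01T18:40:00", "amount": 1200.0, "status": "FAILED"},
    {"timestamp": "2024-01-02T11:05:00", "amount": 75.5, "status": "SUCCESS"},
]
payload = daily_success_rate(sample)

if __name__ == "__main__":
    print(json.dumps(payload, indent=2))
```

Keeping each row flat and keyed by series name means the frontend can bind the x-axis to `date` and each line to a named field without reshaping the response.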
Counterfactual What-If Simulation
The stats_engine.py module trains gradient boosting models on the UPI dataset and exposes a scenario simulator where users override input parameters and instantly compare predicted outcomes — enabling business-impact analysis without a data scientist in the loop.
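The override-and-compare logic can be sketched independently of the model behind it. In the example below, `stand_in_model` is a toy formula standing in for a fitted gradient boosting model, and the feature names (`amount`, `is_weekend`, `network_latency_ms`) are hypothetical; any callable that maps a feature dictionary to a prediction could be substituted. The `sensitivity` helper shows one common way to produce tornado-chart data, one-at-a-time perturbation, which may differ from how the project computes it.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]

def stand_in_model(f: Features) -> float:
    """Toy success-probability formula; NOT learned from data."""
    score = (0.95
             - 0.00002 * f["amount"]
             - 0.05 * f["is_weekend"]
             - 0.0002 * f["network_latency_ms"])
    return max(0.0, min(1.0, score))

def simulate(predict: Callable[[Features], float],
             baseline: Features, overrides: Features) -> dict:
    """Predict for the baseline and for the baseline with overrides applied."""
    unknown = set(overrides) - set(baseline)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    scenario = {**baseline, **overrides}
    base_pred = predict(baseline)
    scen_pred = predict(scenario)
    return {"baseline": base_pred, "scenario": scen_pred,
            "delta": scen_pred - base_pred}

def sensitivity(predict: Callable[[Features], float],
                baseline: Features,
                ranges: Dict[str, Tuple[float, float]]) -> List[dict]:
    """Vary one feature at a time across its range, others held fixed.

    Rows are sorted by swing, largest first, which is the bar order
    of a tornado chart.
    """
    rows = []
    for name, (low, high) in ranges.items():
        p_low = predict({**baseline, name: low})
        p_high = predict({**baseline, name: high})
        rows.append({"feature": name, "low": p_low, "high": p_high,
                     "swing": abs(p_high - p_low)})
    return sorted(rows, key=lambda r: r["swing"], reverse=True)

# Made-up baseline and ranges, for illustration only.
baseline = {"amount": 1000.0, "is_weekend": 0.0, "network_latency_ms": 100.0}
result = simulate(stand_in_model, baseline, {"amount": 5000.0})
tornado = sensitivity(stand_in_model, baseline, {
    "amount": (100.0, 10000.0),
    "is_weekend": (0.0, 1.0),
    "network_latency_ms": (50.0, 500.0),
})

if __name__ == "__main__":
    print(result)
    for row in tornado:
        print(row)
```

One-at-a-time perturbation ignores interactions between features, so it is best read as a first-pass ranking of which parameters move the prediction most, not as a full attribution.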
Visualization & UI Rendering
The React + TypeScript frontend (Vite, Tailwind CSS, Recharts) renders structured API responses as interactive charts or conversational text, providing a polished, responsive dashboard experience for non-technical decision-makers.