Available for Projects & Consulting

Charuka Gunawardhane

Building AI, ML & Data Systems That Drive Real Business Impact

Senior Data Scientist & AI Engineer specializing in end-to-end ML systems, LLM-powered applications, and production-grade data pipelines. I turn complex data problems into scalable solutions that generate measurable value.



About Me

I build intelligent systems that solve real business problems, not just models

I'm a Senior Data Scientist and AI Engineer focused on solving real business problems end-to-end, from data pipelines to production ML systems to enterprise AI systems.

I've built and deployed scalable solutions across multiple industries, including forecasting systems, real-time data platforms, and AI assistants powered by LLMs.

My work has directly contributed to revenue growth, cost reduction, and operational efficiency, not just model performance improvements.

I focus on building systems that are production-ready, maintainable, and aligned with business outcomes.

Machine Learning, LLM Systems, RAG Architectures, Data Engineering, MLOps, Time Series, PySpark, Azure, AWS, Enterprise Solutions, CI/CD, Data Pipelines

Years of Experience

Across industry and research

20+

Production ML Systems

Shipped and maintained at scale

Industries Served

Retail, finance, enterprise

Services

What I Build

Specialized services across the full AI and data stack, from raw data to intelligent, production-ready systems.

Machine Learning Systems

End-to-end predictive modeling for forecasting, classification, and recommendation. From exploratory analysis through model optimization to production deployment.

Demand & time series forecasting
Classification & recommendation systems
Model optimization & performance tuning
MLOps: tracking, versioning, monitoring

LLM & AI Systems

Production-grade AI applications powered by large language models. RAG pipelines, intelligent assistants, and autonomous agent workflows.

RAG system design & implementation
AI assistants & chatbots for business
Agent-based workflow automation
LLM fine-tuning & prompt engineering

Data Engineering and BI

Scalable data infrastructure from ingestion to transformation. Batch and real-time pipelines designed for reliability and performance at scale.

Batch & real-time pipeline design
Data warehouse design & optimization
CDC architectures & event streaming
BI dashboards & reporting

End-to-End AI Products

From idea to production. I take AI concepts through rapid prototyping, validation, and deployment, fully integrated into your business workflow.

Concept → prototype → production
REST API development & deployment
Business workflow integration
Stakeholder-ready dashboards & reporting

Featured Projects

What I've Built

Real systems built for real businesses. Each project delivered measurable, sustained value.


Machine Learning

Enterprise Sales Forecasting Transformation for FMCG Retail

Problem

A large FMCG retailer lacked a reliable forecasting system to support inventory planning and promotion execution across thousands of SKUs and outlets. The existing process failed to properly separate baseline demand, seasonal effects, and promotional uplift. This resulted in:

- Frequent stock-outs during promotions, leading to lost sales
- Excess inventory for slow-moving items, increasing holding costs
- Heavy manual effort during promotion cycles
- Poor inventory balance across the network

The business needed an enterprise-grade forecasting solution that could operate at multiple granularities and support different decision points in the supply chain.

Approach

I designed a modular machine learning forecasting system that separates demand into baseline, promotional uplift, and uncertainty buffers. Key decisions:

- Use XGBoost for its ability to model non-linear demand patterns and handle mixed feature types
- Build a single global model across item × outlet combinations to improve generalization
- Introduce multi-horizon forecasting aligned with supply chain decision timelines
- Replace static safety stock rules with error-driven buffer calculations
- Automate the full pipeline from data ingestion to order management system integration
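The error-driven buffer idea can be sketched in a few lines. This is a minimal illustration, not the production implementation: the function names, the nearest-rank quantile, and the 90% service level are all assumptions made for the example. The buffer is sized from how far past demand exceeded the forecast, instead of from a fixed safety stock rule.

```python
import math


def error_driven_buffer(actuals, forecasts, service_level=0.9):
    """Size a safety buffer from historical under-forecast errors.

    Hypothetical helper: returns the nearest-rank quantile of the
    shortfalls (actual minus forecast, floored at zero).
    """
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    if not 0.0 < service_level <= 1.0:
        raise ValueError("service_level must be in (0, 1]")
    if not actuals:
        return 0.0

    # Only under-forecasts create stock-out risk, so over-forecasts count as 0.
    shortfalls = sorted(max(a - f, 0.0) for a, f in zip(actuals, forecasts))

    # Nearest-rank quantile: smallest value covering `service_level` of periods.
    rank = max(math.ceil(service_level * len(shortfalls)), 1)
    return shortfalls[rank - 1]


def order_quantity(baseline, promo_uplift, buffer):
    """Combine the three demand components into one order quantity."""
    return baseline + promo_uplift + buffer


# Illustrative history for one item x outlet pair (made-up numbers).
actuals = [100, 120, 90, 150, 110]
forecasts = [105, 100, 95, 120, 110]

buffer = error_driven_buffer(actuals, forecasts, service_level=0.9)
print(order_quantity(baseline=200, promo_uplift=50, buffer=buffer))
```

Because the buffer follows the observed error distribution, an item that is forecast well carries little extra stock, while a volatile item carries more.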

Tech Stack

Python, XGBoost, Machine Learning, Power BI, SQL, PySpark, +1 more


Agentic AI

Agentic AI-Powered Ecommerce Assistant for Retail

Problem

A large retailer's ecommerce platform was optimised for structured browsing but failed to handle real-world customer behaviour. Customers frequently attempted to build grocery baskets using free-text lists, photos of handwritten notes, and recipe-based requests — formats that keyword-based search and rule-driven flows could not handle. This led to high cart abandonment on multi-item orders, lost revenue from incorrect or incomplete baskets, heavy manual load on customer support for order changes and tracking, and a poor experience for mobile-first and time-constrained shoppers. Traditional chatbot architectures could not scale to this level of conversational commerce complexity.

Approach

We designed and delivered a hierarchical agentic AI system orchestrated using LangGraph, integrated with a custom ecommerce backend. A central supervisor agent routes user intent to specialised sub-agents responsible for product search, recipe intelligence, cart management, and order execution. Product retrieval combines lexical search (pg_trgm) with semantic vector search (pgvector) to handle misspellings, synonyms, and local grocery terminology, with business-aware re-ranking applied on top. A draft-and-confirm transaction model ensures no order is placed, modified, or cancelled without explicit user approval. Conversation state is persisted in PostgreSQL and rehydrated on each LLM turn, enabling stateful multi-turn interactions without relying on LLM memory. Multimodal inputs — text lists, conversational queries, and OCR-extracted image lists — are unified through a single downstream processing pipeline.
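One way to picture the hybrid retrieval step is as a merge of two ranked lists, one from lexical search and one from vector search. The project description does not say how the two lists are combined, so the sketch below uses reciprocal rank fusion (RRF) purely as an illustrative stand-in. The function name, the product identifiers, and the constant k=60 are assumptions, and the business-aware re-ranking step is not shown.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of product IDs into one ranking.

    Each list contributes 1 / (k + rank) per item, so products that
    rank well in more than one list rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, product_id in enumerate(ranking, start=1):
            scores[product_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


# Hypothetical results for the query "tomatos" (misspelled on purpose).
lexical_hits = ["tomato_paste", "tomato_sauce", "tomatoes"]      # e.g. trigram match
semantic_hits = ["tomatoes", "tomato_paste", "cherry_tomatoes"]  # e.g. vector match

print(reciprocal_rank_fusion([lexical_hits, semantic_hits]))
```

Rank-based fusion avoids having to put trigram similarity and vector distance on the same scale, which is why it is a common default when two retrievers are combined.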

Tech Stack

Python, LangGraph, FastAPI, React.js, PostgreSQL, pgvector, +5 more


Generative AI

myHR — Enterprise RAG Chatbot for Internal HR Policy

Problem

A large Sri Lankan conglomerate with 14,000+ employees across multiple business verticals had a persistent but invisible productivity drain: HR policy knowledge was locked inside dozens of policy PDFs, benefits documents, holiday calendars, and insurance plan files spread across S3 and Microsoft SharePoint. Employees asking routine questions — 'What are our paid holidays?', 'How does parental leave work?', 'What are my health insurance options?' — waited 1–2 days for email replies from HR staff who spent the majority of their time re-answering the same questions. There was no searchable, access-controlled, conversational layer over these documents. Standard keyword search failed because questions were natural language and documents were unstructured. The HR team could not scale policy communication as headcount and business units grew, and sensitive policy documents needed to remain visible only to employees in the relevant country and business unit.

Approach

We designed and implemented a two-pipeline RAG architecture on AWS. An offline batch ingestion pipeline (ECS Fargate) pulls HR documents from S3 and SharePoint, extracts and cleans text, chunks documents into 200–300 token windows with overlap, generates 1536-dimensional embeddings via Amazon Titan Embeddings v2, and bulk-indexes chunks with metadata (business unit, country, ACL groups) into Amazon OpenSearch's k-NN Vector Engine.

A real-time query pipeline handles each employee request: the authenticated query passes through API Gateway and AWS WAF to an AWS Lambda orchestrator, which embeds the question with Titan, runs a k-NN search on OpenSearch with RBAC metadata filters to restrict results to the user's business unit and country, constructs a grounded prompt from the top-K retrieved chunks, and calls Claude 3.7 Sonnet on Amazon Bedrock for answer generation.

The two pipelines share the same OpenSearch index cluster, keeping infrastructure lean while keeping ingestion fully decoupled from query-time logic. The implementation (GitHub) uses FastAPI, GPT-4o, and Qdrant as a portable local stack that mirrors the production AWS design.
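Two steps of this design, overlapping chunking and metadata-based access filtering, can be sketched without any cloud services. The code below is a simplified illustration under stated assumptions: the function names, the 250-token window, the 50-token overlap, and the metadata keys are hypothetical, and plain Python lists stand in for a real tokenizer and for the OpenSearch filter.

```python
def chunk_tokens(tokens, window=250, overlap=50):
    """Split a token list into fixed-size windows that share `overlap` tokens."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be >= 0 and smaller than window")
    step = window - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


def visible_chunks(chunks_with_meta, user):
    """Keep only chunks whose metadata matches the user's unit and country.

    Stand-in for the metadata filter applied alongside the k-NN search.
    """
    return [
        c for c in chunks_with_meta
        if c["business_unit"] == user["business_unit"]
        and c["country"] == user["country"]
    ]


# Integers stand in for tokens so the overlap is easy to inspect.
chunks = chunk_tokens(list(range(600)))
print([len(c) for c in chunks])

# Hypothetical index entries and user record.
index = [
    {"text": "Leave policy", "business_unit": "retail", "country": "LK"},
    {"text": "Bonus scheme", "business_unit": "finance", "country": "LK"},
]
user = {"business_unit": "retail", "country": "LK"}
print([c["text"] for c in visible_chunks(index, user)])
```

Applying the access filter at retrieval time, before any text reaches the prompt, means the language model never sees documents the employee is not entitled to read.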

Tech Stack

Python, FastAPI, React, TypeScript, Vite, Amazon Bedrock, +13 more



How I Work

How I Deliver Data & AI Systems

A practical, business-first approach to building scalable data platforms and AI systems that deliver measurable results.

Define the Business Problem

I start with the decision, not the data. We clarify the business goal, success metrics, and what "impact" actually means before building anything.

Design the Data & System Architecture

I map how data flows through the system. From ingestion to transformation to dashboards or ML, everything is designed for scalability and reliability.

Build & Deploy Production Systems

I build systems that actually run in production. Pipelines, models, APIs, and dashboards, all versioned, monitored, and ready for real usage.

Deliver Insights & Iterate

Dashboards, reports, and systems are refined continuously based on real usage and feedback.

Core Principles

Business outcomes over technical complexity
Reliable data over fancy models
Every model needs monitoring and a feedback loop
Clear communication to technical and non-technical audiences

“The best model is the one that solves the problem, not the most sophisticated one. Production readiness is not optional.”

What I Share

Insights on Building Intelligent Systems

I share practical insights on building data platforms, analytics systems, and AI applications, focused on what works in production.

RAG Systems · Article

Building Production-Ready RAG Pipelines

A deep dive into designing reliable RAG architectures, from chunking strategies and embedding selection to retrieval evaluation and hallucination mitigation.

MLOps · Guide

What Most ML Teams Get Wrong About Monitoring

Feature drift, model staleness, and silent failures. Practical patterns for building monitoring systems that actually catch problems before they reach production.

Data Engineering · Case Study

CDC at Scale: Moving Beyond Batch ETL

Why Change Data Capture transforms real-time analytics capability, plus a practical walkthrough of a Kafka + Debezium + Delta Lake architecture.

LinkedIn

Professional insights and project updates

GitHub

Open source projects and code samples

YouTube

Technical walkthroughs and tutorials

Contact

Let's Build Something Impactful

Have a data, analytics, or AI problem you're trying to solve? I work with teams to design and build scalable systems, from pipelines to production-ready AI.

Email Me Directly

hello@charuka.dev

Connect

LinkedIn: Professional network
GitHub: Open source work

What to Expect

Quick response within 24 hours
A short intro call to understand your problem
Clear recommendations and possible approach
Transparent scope, timeline, and next steps