sharmadaksh2001@gmail.com

Hi, my name is

Daksh Sharma.

AI-native Data Engineer with 2+ years specialising in cloud-native ETL/ELT pipelines, Snowflake, and ML-powered analytics. Currently building intelligent data systems at DS Group, Noida. First-Class MSc in Data Analytics from National College of Ireland.

01. About Me

I'm a data engineer and analyst based in Delhi, India — recently back after three years working across Dublin's FMCG and tech scene, and now diving into the Indian market headfirst.

My work spans cloud-native Snowflake pipelines (Snowpipe, CDC Streams, Cortex AI agents), machine learning (LightGBM, XGBoost, BERT, Neural Networks), and no-code/low-code BI tools that let non-technical teams actually interact with their data and not just receive a report about it. If a business analyst can answer their own questions without raising a ticket, I've done my job.

When I'm not at a keyboard I'm on a badminton court, travelling somewhere new, or working through a brain teaser that has no business taking this long to solve.

02. Experience

Apr 2026 – Present
Data Analytics Intern
DS Group, Noida
Delivered 3 end-to-end data science projects independently over 8 weeks on enterprise Snowflake: Flipkart analytics, Japan iPhone market analytics, and the SnowOps health advisor
Designed event-driven pipelines using Snowpipe, CDC Streams, and automated Tasks with Merge statements; built dimensional models and JSON parsing views
Built a 4-stage Cortex AI agent with classifier, injectable skills system, 10-turn rolling memory, verified query cache, and DML review loop
Completed 16 certifications across Snowflake, Databricks, and AWS during the internship
Feb 2025 – Feb 2026
Business Support Associate
Daybreak (Musgrave Group), Dublin
Served as the sole BI and data analytics specialist across 8 retail stores; built the entire analytics infrastructure from scratch
Automated ETL pipelines using SQL and Power Query integrated with Power BI, eliminating 8+ manual reporting hours per week
Reduced overstocking by 15% and improved supplier order accuracy by 20% through inventory modelling and self-serve dashboards
Jan 2024 – Jan 2025
Operations Supervisor
ACCHL Limited, Dublin
Built BI dashboards and optimised SQL views for a €500K+ annual B2B/B2C wholesale operation; ran EDA and statistical analysis on 100K+ POS transactions
Built predictive analytics and forecasting models that reduced stock-outs by 18% and operational waste by 12%

03. Featured Projects

📊
Apr – May 2026
Flipkart E-Commerce Analytics

End-to-end platform with 6 ML models. XGBoost price predictor R²=0.9988, RMSE=₹16.59. LightGBM category classifier 74% accuracy across 24 classes. TF-IDF recommendation system on 8,011 products. Full Snowflake ETL from GCS, 4-page Streamlit dashboard with Cortex AI.

Snowflake · XGBoost · LightGBM · Streamlit · Cortex AI
📱
May – Jun 2026
Japan iPhone Market Analytics

103K records from Snowflake Marketplace. LightGBM price predictor R²=0.9804, RMSE=¥6,739. Sell-speed classifier 62.1% accuracy, +9.3pp above baseline. Cortex AI agent with 4-stage architecture, rolling memory, verified query cache. 7-tab Streamlit dashboard.

Snowflake · Cortex AI · LightGBM · Streamlit · YAML
🔬
May 2026
SnowOps — Health Advisor

Self-initiated tool for DS Group's Snowflake migration. 4-layer architecture: ACCOUNT_USAGE → CORE views → RECOMMENDATIONS → Streamlit + Task. 5 business rules with HIGH/MEDIUM/LOW severity. Daily 9AM HTML email report. Presented to engineering leadership.

Snowflake · SQL · Streamlit · ACCOUNT_USAGE
🧠
Dec 2023
Ukraine-Russia Opinion Mining

MSc dissertation. Novel consensus-labelling: tweets agreed on by both VADER and RoBERTa used as pseudo ground-truth for BERT fine-tuning. Best model 83.95% accuracy, F1=74.47%. 18 hyperparameter combinations tested per model. Supervisor: Mrs. Harshani Nagahamulla.

BERT · RoBERTa · VADER · PyTorch · NLP
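A minimal sketch of the consensus-labelling idea described above, assuming both labellers emit the same label vocabulary. The function name `consensus_labels` and the sample tweets and labels are illustrative placeholders, not the dissertation code or data; in the project the two label lists would come from VADER and RoBERTa.

```python
from typing import List, Sequence, Tuple


def consensus_labels(
    texts: Sequence[str],
    labels_a: Sequence[str],
    labels_b: Sequence[str],
) -> List[Tuple[str, str]]:
    """Keep only the texts on which both labellers agree.

    The surviving (text, label) pairs act as pseudo ground-truth
    for fine-tuning a downstream classifier.
    """
    if not (len(texts) == len(labels_a) == len(labels_b)):
        raise ValueError("texts and both label lists must have the same length")
    return [
        (text, a)
        for text, a, b in zip(texts, labels_a, labels_b)
        if a == b  # agreement between the two labellers
    ]


# Illustrative inputs only (hypothetical tweets and labels).
tweets = ["tweet one", "tweet two", "tweet three"]
vader_labels = ["positive", "negative", "neutral"]
roberta_labels = ["positive", "neutral", "neutral"]

pseudo_ground_truth = consensus_labels(tweets, vader_labels, roberta_labels)
print(pseudo_ground_truth)
```

Tweets where the two labellers disagree are dropped rather than resolved, which trades dataset size for cleaner training labels.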
🚲

Last-Mile Delivery Simulation

SimPy discrete-event simulation with Floyd-Warshall shortest paths and TSP via PuLP ILP. 193 parcels simulated; 65.8% same-day delivery over 10 days. MSc — Modelling, Simulation & Optimisation.

SimPy · PuLP · Python
2023
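A minimal sketch of the Floyd-Warshall all-pairs shortest-path step used in the delivery simulation above, assuming an undirected road graph with non-negative edge weights. The node numbering, the `roads` list, and the function name are illustrative assumptions, not the project's actual map or code.

```python
import math
from typing import List, Tuple


def floyd_warshall(n: int, edges: List[Tuple[int, int, float]]) -> List[List[float]]:
    """Return the n x n matrix of shortest-path distances."""
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for u, v, w in edges:
        # Treat each road as two-way and keep the shortest parallel edge.
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Allow each node k in turn to act as an intermediate stop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


# Hypothetical 4-node road network: (from, to, length).
roads = [(0, 1, 4.0), (1, 2, 1.0), (0, 2, 7.0), (2, 3, 2.0)]
dist = floyd_warshall(4, roads)
print(dist[0][3])
```

The resulting distance matrix is the kind of input a TSP formulation needs, since it gives the travel cost between every pair of delivery stops.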
🌿

Plant Leaf Disease Detection — Custom CNN

Custom 3-block CNN from scratch in TensorFlow/Keras. Binary healthy/diseased classifier across 3,601 images from 11 plant species. 256×256 RGB, Adam optimiser, TensorBoard. MSc group project.

TensorFlow · Keras · CNN · OpenCV
2024
⚕️

Nurse Roster Optimisation

OR-Tools CP-SAT constraint solver. 12 constraints including EU EWTD 48-hour limit, team depletion prevention, night-shift continuity, weekend equity. All 10 nurses within 1 shift of optimal. Interview assessment task.

OR-Tools · CP-SAT · Python · pandas
Sep 2024

04. Technical Arsenal

Languages

SQL · Python · R

Data Platforms

Snowflake · PostgreSQL · MongoDB · BigQuery · AWS S3 · GCP · Databricks

Data Engineering

ETL/ELT Pipelines · Dimensional Modelling · dbt · Airflow · PySpark · CDC Streams · Snowpipe

AI & Machine Learning

Cortex AI · LightGBM · XGBoost · BERT · RoBERTa · PyTorch · TensorFlow · scikit-learn

BI & Visualisation

Streamlit · Power BI · Tableau · Plotly · Excel VBA

Tools & Methods

Git · JIRA · Agile/Scrum · CI/CD · A/B Testing · Prompt Engineering

05. Education

NCI
2023 – 2024
MSc Data Analytics
National College of Ireland · Dublin
First-Class Honours

Dissertation: Opinion Mining on Ukraine-Russia War Tweets — novel consensus-labelling, 83.95% BERT accuracy. Modules: ML, Simulation & Optimisation, Data Mining, Research Methods.

DU
2019 – 2022
BSc Statistics (Honours)
University of Delhi
GPA 9.243 / 10

Statistical inference, probability theory, regression analysis, time series, forecasting, operations research, and data analysis using R.

16 Certifications

06. What's Next?

Get In Touch

Open to data engineering, analytics, and ML engineering roles worldwide. Whether you have a question, an opportunity, or just want to say hi — my inbox is always open.