Data Science

At M2Lab CSDS, we design and build AI-powered data products — from composable agent frameworks and MCP servers to document intelligence pipelines, interactive dashboards, and reproducible research workflows.

Follow our latest thinking: Data by Michel on Substack.


Sample interactive visualisations — illustrating the type of data products and analyses M2Lab CSDS builds and deploys.

Anomaly Detection

2 anomalies flagged

Time series with 95% prediction interval and automated outlier detection — sample sensor data
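As a rough illustration of the idea in the caption above, the sketch below flags readings that fall outside an approximate 95% band around a fitted trend. It is not the method behind the sample chart: the straight-line trend, the robust scale estimate (median absolute deviation), the function name `flag_anomalies`, and the synthetic `readings` are all assumptions made for this example.

```python
import random
import statistics


def flag_anomalies(values, z=1.96):
    """Return (lower, upper, flags) for an approximate 95% band around a linear trend."""
    n = len(values)
    t = list(range(n))
    t_mean = sum(t) / n
    y_mean = sum(values) / n

    # Ordinary least-squares line through the series (assumed trend model).
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    fitted = [intercept + slope * ti for ti in t]

    # Robust spread of the residuals: median absolute deviation, scaled so it
    # matches the standard deviation when residuals are roughly normal.
    residuals = [yi - fi for yi, fi in zip(values, fitted)]
    centre = statistics.median(residuals)
    mad = statistics.median([abs(r - centre) for r in residuals])
    sigma = 1.4826 * mad

    lower = [fi - z * sigma for fi in fitted]
    upper = [fi + z * sigma for fi in fitted]
    flags = [yi < lo or yi > hi for yi, lo, hi in zip(values, lower, upper)]
    return lower, upper, flags


# Synthetic sensor-like series (hypothetical data) with two injected spikes.
rng = random.Random(0)
readings = [20 + 0.05 * i + rng.gauss(0, 0.5) for i in range(100)]
readings[30] += 6
readings[75] -= 6

lower, upper, flags = flag_anomalies(readings)
flagged_indices = [i for i, is_outlier in enumerate(flags) if is_outlier]
```

A 95% band is expected to flag roughly one ordinary point in twenty by construction, so a production detector would usually add a stricter threshold or a persistence rule on top of this.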

ML Classification

K-means-style cluster separation on a sample multi-dimensional dataset
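For readers who want to see what k-means-style separation involves, here is a minimal sketch of Lloyd's algorithm on a toy three-dimensional dataset. It is an illustration only, not the code behind the sample visualisation: the `kmeans` function, the farthest-point initialisation, and the `points` list are assumptions chosen to keep the example deterministic and dependency-free.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with farthest-point initialisation; returns (centroids, labels)."""
    # Initialisation: start from the first point, then repeatedly add the point
    # farthest from every centroid chosen so far (a deterministic, illustrative choice).
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(tuple(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


# Toy dataset (hypothetical): three well-separated groups in three dimensions.
points = [
    (0.0, 0.0, 0.0), (0.5, 0.2, 0.1), (0.1, 0.4, 0.3),
    (10.0, 10.0, 10.0), (10.3, 9.8, 10.1), (9.7, 10.2, 9.9),
    (0.0, 10.0, 0.0), (0.2, 9.7, 0.4), (0.3, 10.1, 0.1),
]
centroids, labels = kmeans(points, k=3)
```

K-means is an unsupervised clustering method, so the labels it returns are arbitrary group indices, not predicted classes.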

Selected projects

Research

On the Statistical Insignificance of Persona-Based Prompt Bloating in LLM-Driven Machine Translation

Michel d. S. Mesquita

Controlled empirical study with GPT-4o across four languages comparing full, stripped, and meta-optimised prompt variants. Key finding: a 45.7% token reduction yields no measurable quality loss (BLEU Δ −0.04, COMET Δ −0.0001) — prompt optimisation delivers economic, not qualitative, benefits.

Tools

Python
R
SQL
React
AWS SageMaker
Azure
Hadoop
Jupyter
Pandas
Scikit-learn
TensorFlow
Claude Code
Codex
LLMs
Generative AI tools

Methods

Machine learning & deep learning
Large language models (LLMs)
Generative AI & prompt engineering
Agent frameworks & MCP development
Document intelligence
Data pipelines & deployment
Time series forecasting
Computer vision
Interactive dashboards & web apps
Reproducible research