
data_engineer.init()

ALEJANDRO MARCHÁN

I turn messy data into things that actually work. Sometimes I explain it to my family. They nod politely.

Signal Source

┌─ subject.info ─────────────────────┐
│ name        Alejandro Marchán      │
│ role        Data Engineer          │
│ location    Madrid, Spain          │
│ education   CS · UC3M              │
│ status      ● employed @ Verisure  │
│ languages   Python · Scala · SQL   │
└────────────────────────────────────┘

I've spent 6+ years building data systems that handle millions of records daily — mostly at scale, occasionally at 3am.

I like problems where the interesting part isn't the algorithm, it's understanding what the data is actually trying to tell you.

Outside of work: amateur football stats nerd, Renfe schedule obsessive, and builder of things nobody asked for but someone ends up using.

Transformation Stages

STAGE [02] VERISURE
Sep 2024 — Present
Data Engineer
  • Speech Analytics: real-time call processing · 300K+ calls/month → automatic transcription + LLM analysis
  • Cloud architecture for new projects (image processing, ML pipelines)
  • ML model productionization and LLM solutions integrated in AWS
STACK: Python · AWS · LLM · Terraform · Jenkins · Azure · GCP

STAGE [01] STRATIOBD
Jun 2021 — Sep 2024
Big Data Engineer
  • NLP typeahead for 1.5M+ products → +15% search usage · +7% sales conversion
  • Relevance ranking · ~8GB/day → +60% improvement in global KPIs
  • Pattern detection 0.1ms · Spark/Scala + DSL
  • ETL visualizer → integrated into core product
  • Spark instructor · NUMA program
STACK: Python · Scala · Spark · Kafka · NLP

STAGE [00] DOMINION GLOBAL
Jun 2019 — May 2021
Data Engineer
  • ETL for 300K meals/day · 165+ hospitals
  • ML time series · -15% preparation cost
  • Full-stack Angular + Java/Spring Boot
  • Redis cache → -35% search time
STACK: PySpark · Python · Angular · Java

Output Layer

Collage Maker [LIVE]

K-Means clustering applied to aesthetics. Upload photos, get a mosaic.

tech     Python · Dash · K-Means
input    photo library
output   mosaic
url      collage-maker.amarchan.com

WhatsApp Wrapped [LIVE]

NLP on your chat history. Stats your contacts will argue about.

tech     Python · NLP · Dash
input    chat export
output   insights
url      whatsapp-wrapped.amarchan.com

La Liga Stats [LIVE]

Scraped, cleaned and served fresh. Interactive stats for people who argue with data.

tech     Python · Scraping · Dash
input    La Liga data
output   interactive viz
url      la-liga-player-stats.amarchan.com

Mi Cercanías [LIVE]

Real-time train positions. Because waiting on the platform is for people without a terminal.

tech     HTML · CSS · JS · Leaflet
input    Renfe API
output   live map
url      mi-cercanias.amarchan.com

BarkTracker [ARCHIVE]

Final thesis: bark detection with AI. Neural networks for a very specific problem.

tech     Python · TensorFlow · DSP
input    audio
output   bark / no bark
url      github.com/AlejandroMarchan

CacaTracker [PERSONAL]

Bathroom visit tracker during a trip to India. Science demands data, even the uncomfortable kind.

tech     Python · Dash
input    field data
output   insights
url      github.com/AlejandroMarchan

Data Dictionary

╔══ DATA DICTIONARY v2026 ════════════════════════╗
║ PRIMARY STACK
║ ─────────────
║ Python         production · daily
║ Apache Spark   3+ years · Scala/PySpark
║ SQL            it's always SQL
║ Scala          Spark DSLs & streaming
║
║ ALSO WORKS WITH
║ ───────────────
║ TypeScript     side projects
║ Java           legacy Spring Boot
║ HTML/CSS/JS    you're looking at it
║
║ TOOLING
║ ───────
║ Docker · nginx · Kafka · Airflow
║ Plotly · Dash · Marimo · Leaflet
╚═════════════════════════════════════════════════╝

Output Sink

If you've made it this far, you've seen the whole pipeline. Let's build something.