PRJ-02

Clustiq

Centralized contacts, assisted CSV/PDF/OCR import, mail campaigns.

Year: 2025
Status: In production
React · Tailwind · FastAPI · PostgreSQL · OpenAI · SendGrid · Brave Search API

Problem

Contact records for people, companies and institutions arrive from heterogeneous sources: customer CSV/XLSX files, supplier PDFs, paper scans. The goal is to centralize them in a single workspace where data cleaning, deduplication and outreach live together.

Architecture

  • React + Tailwind frontend for the operational backoffice (single-page workspace).
  • FastAPI Python backend: REST APIs, Pydantic validation, async tasks for OCR and imports.
  • PostgreSQL as source of truth: normalized schemas for people/companies/institutions, trigram indexes for fuzzy matching.
  • CSV/XLSX import wizard with guided mapping and explicit commit (preview before writing).
  • OCR pipeline on PDFs and scans with a draft review step before records are promoted into the database.
  • OpenAI extracts structured fields from the raw OCR text and generates personalized variants for mail campaigns.
  • Contact enrichment via the Brave Search API: given a company name, it fetches the official site, address and public contacts.
  • Duplicate detection with similarity scoring and tracked merge (audit log of merge decisions).
  • Mail campaign delivery via SendGrid with dynamic list segmentation and open/click tracking.
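
The duplicate detection and tracked merge described above can be illustrated with a minimal sketch. This is not the project's code. It approximates trigram similarity in pure Python (Jaccard overlap of padded 3-character shingles, loosely modeled on how PostgreSQL's pg_trgm extension scores strings), whereas the stack described here would presumably run the scoring in the database through its trigram indexes. The names `trigrams`, `similarity`, `find_duplicates` and `MergeLog`, and the default threshold, are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def trigrams(text: str) -> set[str]:
    """Lowercase, drop punctuation, and split each word into padded 3-grams."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    grams: set[str] = set()
    for word in cleaned.split():
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def find_duplicates(incoming: str, existing: dict[int, str],
                    threshold: float = 0.4) -> list[tuple[int, float]]:
    """Return (record_id, score) pairs at or above the threshold, best first."""
    scored = [(rid, similarity(incoming, name)) for rid, name in existing.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


@dataclass
class MergeLog:
    """Append-only record of merge decisions (hypothetical audit-log shape)."""
    entries: list[dict] = field(default_factory=list)

    def record(self, kept_id: int, merged_id: int, score: float,
               decided_by: str) -> None:
        self.entries.append({
            "kept_id": kept_id,
            "merged_id": merged_id,
            "score": score,
            "decided_by": decided_by,
            "decided_at": datetime.now(timezone.utc).isoformat(),
        })


if __name__ == "__main__":
    companies = {1: "Acme Corp", 2: "Globex"}
    log = MergeLog()
    for record_id, score in find_duplicates("Acme Corporation", companies):
        # In a review flow a person would confirm the merge before it is logged.
        log.record(kept_id=record_id, merged_id=99, score=score,
                   decided_by="reviewer")
    print(log.entries)
```

The threshold is a tuning choice: a lower value surfaces more candidates for human review, a higher one reduces noise at the risk of missing variants such as abbreviations.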

Stack

React · Tailwind · FastAPI · PostgreSQL · Python · OpenAI · SendGrid · Brave Search API

Outcome

A single backoffice that replaces the Excel + email + mail-marketing tool dance. It cuts onboarding time for new contact lists and catches duplicates at the first import, with every merge decision tracked.