FanumTag
This project set out to build a privacy‑first captioning and tagging engine capable of offline inference across multiple formats while maintaining a sleek cross‑platform UI.

About This Project
Building a privacy‑first captioning and tagging engine capable of offline inference across multiple formats while maintaining a sleek cross‑platform UI.
Integrated vision‑language models (SMOLVLM2, Qwen2‑VL) and KeyBERT in a modular Tauri/Rust backend, added OCR for documents, and built a responsive SolidJS interface with batch processing and progress tracking.
Offered a fast, offline captioning tool that respects user privacy and extends beyond simple image tagging.
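The batch processing and progress tracking described above can be illustrated with a minimal Python sketch. It is not FanumTag's actual code: the extension sets, the Result type, and the caption_image, ocr_document, and process_batch functions are hypothetical names, and the two model functions are placeholders standing in for the local vision‑language model and OCR calls.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterable, List

# Hypothetical extension sets; the formats FanumTag really supports are not listed here.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}
DOCUMENT_EXTS = {".pdf"}


@dataclass
class Result:
    path: str
    kind: str     # "image", "document", or "skipped"
    caption: str  # caption or extracted text; empty when skipped


def caption_image(path: Path) -> str:
    # Placeholder for a local vision-language model call (assumption).
    return f"caption for {path.name}"


def ocr_document(path: Path) -> str:
    # Placeholder for a local OCR pass (assumption).
    return f"text extracted from {path.name}"


def process_batch(
    paths: Iterable[str],
    on_progress: Callable[[int, int], None],
) -> List[Result]:
    """Route each file by extension and report progress after every item."""
    items = [Path(p) for p in paths]
    total = len(items)
    results: List[Result] = []
    for done, path in enumerate(items, start=1):
        ext = path.suffix.lower()
        if ext in IMAGE_EXTS:
            results.append(Result(str(path), "image", caption_image(path)))
        elif ext in DOCUMENT_EXTS:
            results.append(Result(str(path), "document", ocr_document(path)))
        else:
            # Unknown formats are recorded rather than aborting the batch.
            results.append(Result(str(path), "skipped", ""))
        on_progress(done, total)
    return results


if __name__ == "__main__":
    process_batch(
        ["cat.png", "scan.pdf", "notes.xyz"],
        lambda done, total: print(f"processed {done} of {total}"),
    )
```

Reporting progress through a callback keeps the processing loop independent of the UI, so a frontend could subscribe to the same events without the backend knowing how they are displayed.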
Backend Developer
2024
Public
Personal
Technology Stack
Insights & Takeaways
Highlights
- Case study content natively baked into the project dataset.
- Clear storytelling built around the specific problems faced and the technologies used.
Challenges
- Strict focus on performance and maintainability.
- Selecting standard tools to ensure scalability: Python, Rust, Tauri, SolidJS, Tailwind CSS, KeyBERT, SMOLVLM2, and Qwen2-VL.
- Integrating vision‑language models (SMOLVLM2, Qwen2‑VL) and KeyBERT into a modular Tauri/Rust backend, adding OCR for documents, and building a responsive SolidJS interface with batch processing and progress tracking.
Lessons Learned
- Offered a fast, offline captioning tool that respects user privacy and extends beyond simple image tagging.