Personal · Public · 2024

FanumTag

FanumTag is a privacy‑first captioning and tagging engine that runs inference offline across multiple file formats and pairs it with a sleek cross‑platform UI.

About This Project

Role

Backend Developer

Year

2024

Status

Public

Type

Personal

Technology Stack

Python, Rust, Tauri, SolidJS, Tailwind CSS, KeyBERT, SMOLVLM2, Qwen2-VL

Project Story

The Challenge

The goal was to build a privacy‑first captioning and tagging engine that runs inference offline across multiple formats while keeping a sleek cross‑platform UI.

The Approach

Integrated vision‑language models (SMOLVLM2, Qwen2‑VL) and KeyBERT in a modular Tauri/Rust backend, added OCR for documents, and built a responsive SolidJS interface with batch processing and progress tracking.
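The batch flow described above can be sketched roughly as follows. This is a minimal Python illustration, not the project's actual code: the real backend is described as Tauri/Rust, and every name here (`classify`, `process_batch`, `Result`, the extension sets, the handler keys) is a hypothetical stand-in. The handlers are plain callables so that a vision‑language model call or an OCR step could be plugged in without changing the loop.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Optional

# Assumed extension sets for illustration; the source does not list supported formats.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}
DOCUMENT_EXTS = {".pdf"}


@dataclass
class Result:
    path: str
    kind: str                      # "image", "document", or "unsupported"
    caption: Optional[str]         # text produced by the handler, if any
    error: Optional[str] = None    # failure message, if the item could not be processed


Handler = Callable[[Path], str]
ProgressFn = Callable[[int, int, Result], None]


def classify(path: Path) -> str:
    """Route a file to a handler kind based on its (case-insensitive) extension."""
    ext = path.suffix.lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in DOCUMENT_EXTS:
        return "document"
    return "unsupported"


def process_batch(
    paths: Iterable[str],
    handlers: Dict[str, Handler],
    on_progress: Optional[ProgressFn] = None,
) -> List[Result]:
    """Process every file in order, reporting progress after each item.

    A failure on one file is recorded on its Result and does not stop the batch.
    """
    items = [Path(p) for p in paths]
    total = len(items)
    results: List[Result] = []
    for done, path in enumerate(items, start=1):
        kind = classify(path)
        handler = handlers.get(kind)
        if handler is None:
            result = Result(str(path), kind, None, "no handler for this format")
        else:
            try:
                result = Result(str(path), kind, handler(path))
            except Exception as exc:  # keep the batch running on per-file errors
                result = Result(str(path), kind, None, str(exc))
        results.append(result)
        if on_progress is not None:
            on_progress(done, total, result)
    return results


if __name__ == "__main__":
    # Stand-in handlers; a real setup would call a captioning model or an OCR engine.
    demo_handlers = {
        "image": lambda p: f"caption for {p.name}",
        "document": lambda p: f"ocr text for {p.name}",
    }
    process_batch(
        ["cat.png", "scan.pdf", "notes.xyz"],
        demo_handlers,
        on_progress=lambda done, total, r: print(f"{done}/{total} {r.path} [{r.kind}]"),
    )
```

Reporting progress through a callback keeps the processing loop independent of the UI: in a desktop app the callback could forward each update to the frontend, while in a script it can simply print.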

The Outcome

Delivered a fast, offline captioning tool that respects user privacy and goes beyond simple image tagging.

Insights & Takeaways

Challenges

  • Strict focus on performance and maintainability.
  • Selecting standard tools to ensure scalability: Python, Rust, Tauri, SolidJS, Tailwind CSS, KeyBERT, SMOLVLM2, Qwen2-VL
