
GitHub Repo https://github.com/allaz002/focused-webcrawler

allaz002/focused-webcrawler

A modular topic-focused web crawler featuring pluggable relevance models (Boolean, Vector Space, Naive Bayes) and a lightweight crawling architecture designed for content-based link prioritization.
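As a rough illustration of content-based link prioritization with a Vector Space relevance model, the sketch below scores each link's surrounding text against a topic using cosine similarity over raw term counts and keeps the frontier in a priority queue. It is not taken from the repository: `tokenize`, `cosine_similarity`, and `Frontier` are hypothetical names, and the real project may weight terms differently (for example with TF-IDF).

```python
import heapq
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class Frontier:
    """Hypothetical crawl frontier that returns the most relevant URL first."""

    def __init__(self, topic):
        self.topic_vector = Counter(tokenize(topic))
        self._heap = []
        self._seen = set()

    def add(self, url, context_text):
        """Score a URL by the text around its link and queue it once."""
        if url in self._seen:
            return
        self._seen.add(url)
        score = cosine_similarity(self.topic_vector, Counter(tokenize(context_text)))
        # heapq is a min-heap, so the score is negated to pop the best URL first.
        heapq.heappush(self._heap, (-score, url))

    def pop(self):
        """Return (url, score) for the highest-scoring queued URL."""
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score
```

Swapping in a Boolean or Naive Bayes model would only require replacing the scoring call inside `add`, which is one way to read "pluggable" here.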

GitHub Repo https://github.com/weixuan0102/NCKU_RAG_BOT

weixuan0102/NCKU_RAG_BOT

A lightweight Python toolkit that scrapes the pages you list, splits the text into chunks, embeds them, and stores the results in a pluggable vector store (Chroma by default). It ships with a CLI for fully offline Q&A and a ready-to-deploy LINE Bot, so you can chat with your content on-premises or in the cloud without extra integration work.
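The chunking step in a pipeline like this is often a sliding window with some overlap, so that a sentence cut at a boundary still appears whole in a neighbouring chunk. The following is a minimal sketch under that assumption; `chunk_text` and its word-based `chunk_size` and `overlap` parameters are hypothetical and are not the toolkit's actual API.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each returned chunk would then be passed to an embedding model and written to the vector store together with its source URL.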

GitHub Repo https://github.com/cstroie/dokuwiki-plugin-dokullm

cstroie/dokuwiki-plugin-dokullm

DokuLLM is a DokuWiki plugin that adds Large Language Model (LLM) text processing to the DokuWiki editing environment and semantic search through ChromaDB. Wiki content is kept in a vector database so that it can be searched and retrieved semantically.

GitHub Repo https://github.com/pradeepgithubrepo/ai_wiki_assistant

pradeepgithubrepo/ai_wiki_assistant

A lightweight RAG-based assistant that converts internal documents into a searchable knowledge engine. It ingests PDFs/Docs, builds vector + keyword indexes, and provides an API to query content with citations. Supports Chroma, BM25, hybrid retrieval, and pluggable LLM/embedding providers.
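Hybrid retrieval means merging a keyword ranking (such as BM25) with a vector-similarity ranking. One common way to do that is reciprocal rank fusion (RRF), sketched below. This is an assumption about how such a merge could work, not the repository's confirmed method; `reciprocal_rank_fusion` is a hypothetical helper and `k=60` is a conventional default for RRF.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document gets 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top result of that list.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Because RRF uses only ranks, it avoids having to normalise BM25 scores and cosine similarities onto a shared scale.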

GitHub Repo https://github.com/ShantanuPaliwal2419/Semantic-crawler

ShantanuPaliwal2419/Semantic-crawler

An end-to-end data ingestion pipeline for AI systems: async web crawling, content extraction, chunking, embedding generation, and semantic search using a vector database (Qdrant). Built with FastAPI, httpx, and pluggable embedding providers.

GitHub Repo https://github.com/pld-linux/mozilla-plugin-svg

pld-linux/mozilla-plugin-svg

A Mozilla plug-in for viewing W3C SVG (Scalable Vector Graphics) content.