Showing results for Content: Plug Vector Vector Vector
GitHub Repo
https://github.com/allaz002/focused-webcrawler
allaz002/focused-webcrawler
A modular topic-focused web crawler featuring pluggable relevance models (Boolean, Vector Space, Naive Bayes) and a lightweight crawling architecture designed for content-based link prioritization.
GitHub Repo
https://github.com/weixuan0102/NCKU_RAG_BOT
weixuan0102/NCKU_RAG_BOT
A lightweight Python toolkit that scrapes the pages you list, breaks the text into chunks, embeds them, and drops everything into a pluggable vector store (Chroma by default). Ships with both a CLI for totally offline Q&A and a ready-to-deploy Line Bot, so you can chat with your content on-prem or in the cloud, no extra wiring required.
GitHub Repo
https://github.com/cstroie/dokuwiki-plugin-dokullm
cstroie/dokuwiki-plugin-dokullm
DokuLLM is a comprehensive DokuWiki plugin that integrates Large Language Model (LLM) capabilities with semantic search functionality through ChromaDB integration. The project enables advanced text processing directly within the DokuWiki editing environment while maintaining content in a vector database for semantic search and retrieval.
GitHub Repo
https://github.com/pradeepgithubrepo/ai_wiki_assistant
pradeepgithubrepo/ai_wiki_assistant
A lightweight RAG-based assistant that converts internal documents into a searchable knowledge engine. It ingests PDFs/Docs, builds vector + keyword indexes, and provides an API to query content with citations. Supports Chroma, BM25, hybrid retrieval, and pluggable LLM/embedding providers.
GitHub Repo
https://github.com/ShantanuPaliwal2419/Semantic-crawler
ShantanuPaliwal2419/Semantic-crawler
An end-to-end data ingestion pipeline for AI systems: async web crawling, content extraction, chunking, embedding generation, and semantic search using a vector database (Qdrant). Built with FastAPI, httpx, and pluggable embedding providers.
GitHub Repo
https://github.com/pld-linux/mozilla-plugin-svg