ihnkenmjaghoplblibibgpllganhoenc
Use Local LLM extension: run LLMs locally (Llama 70B or DeepSeek with WebLLM + Gemini Nano) and ask AI models about your tabs - private AI.

Meet a browser helper that makes running local models feel natural, fast, and private. If you like these chats and want a personal one that runs on your own computer, this is it. You are no longer limited to external services and can download the models instead. Chat about a website, a PDF, or any other file right in your browser. Get a good summary of pages. It is built for people who want a smooth and clean local LLM UI and a practical way to ask an assistant while browsing.

🚀 Models available for local use:
- Google’s Gemini Nano - embedded in Chrome
- DeepSeek R1 Distill Llama 8B - deep thinking, running on your local machine
- Qwen 3 - available from the smallest 0.6B up to the strong 8B
- Llama 3.1 70B

✅ How to start
1. Add the extension and open the sidebar
2. Pick a model from the list in the local LLM UI
3. Tap download once
4. Then keep using the same setup every day

The available models are always visible, so switching is quick and predictable. You can rotate between lightweight and stronger options, keep your preferred local LLM suite, and stay productive without tool hopping. The available choices include different pre-built WebLLM models for an in-browser experience. This helps you chat on any website and use models with tab context, while keeping private AI as the default goal.

📦 Download once, then run always
1️⃣ Get a local WebLLM model a single time
2️⃣ Keep your GPT-style chat ready across sessions
3️⃣ Reduce setup friction for AI LLM tasks
4️⃣ Stay consistent even when offline

Website context is available to the models, so the extension behaves like a real browser assistant. You can ask the AI about your tabs, selected text, and current page content, then get grounded answers that fit your workflow.
What you can do every day:
• Get summaries, highlights, and action items from a webpage, with no internet required
• Rewrite selected text in your tone while the LLM runs on your local machine
• Carry a conversation across your whole session
• Turn a tab into notes or a checklist
• Draft quick replies with page context

Why it feels like an anything LLM helper:
➤ Fast, repeatable prompts with AI
➤ Clear status and model switching in the local LLM UI
➤ An easier flow for research and writing
➤ A good fit if you just want to run an LLM on your local machine

📺 More smart things you can do right away:
🔄 Turn messy pages into structured outlines for study or work
🔍 Extract definitions, names, and key facts without leaving the tab
❓ Create follow-up questions for the LLM to deepen understanding in seconds
🛡️ Summarize multiple sources, then compare conclusions side by side
🎁 Build a short brief using only local context from your open websites

Privacy and control, without extra complexity:
▸ No query leaves your computer while you are using open-source models
▸ Private AI, local by design
▸ WebLLM - a framework for running LLMs in the browser

For builders and explorers, it offers open-source LLMs 🧩 - it feels like constructing your own local one.
If you like to tinker, you can treat your setup like a personal lab:
➤ Keep a lightweight suite for speed, and a stronger one for reasoning
➤ Test different local LLMs for different tasks and writing styles
➤ Follow updates to track improvements in quality and stability
➤ Explore open-source LLM trends without changing your daily flow

Updates stay manageable and user-friendly:
💎 Updates that improve output quality
🚀 Simple model switching without breaking your flow

Extra details that help in real life:
• Clear separation between tab context and your own notes on your local machine
• A repeatable workflow that supports research, writing, and planning
• Consistent behavior even across long sessions and many open pages
• A simple path from downloading AI to daily use, without extra steps

If you are teaching someone an easy way to install their own LLM, this is a clean starting point. It also matches a downloadable AI repository mindset, so you can download AI once, keep AI models ready, and avoid repeating the same setup steps.

Use cases people love:
◆ Summarize long articles
◆ Extract action items from docs
◆ Rewrite emails from page context
◆ Compare sources on a website
◆ Create structured notes fast

Bonus ideas for everyday browsing:
1️⃣ Create a quick glossary from technical pages
2️⃣ Turn documentation into a checklist you can follow
3️⃣ Rewrite content for different audiences or levels
4️⃣ Generate a short briefing from a tab in one go
5️⃣ Capture key takeaways for your personal knowledge base
Offline GPT: Offline (local) AI Chat Assistant
Chat with AI using locally downloaded models (WebLLM) and the current page context.

Offline AI Chat Assistant - LLM in your browser

🎯 KEY FEATURES
• 100% Offline AI - All processing happens locally on your device
• Multiple Models - Choose from Phi-3-mini (2.3GB), Llama-3.1-8B (4.9GB), or Qwen2.5-7B (4.3GB)
• Page Context Aware - Automatically reads the current webpage to answer questions about it
• Complete Privacy - No data is sent to external servers and no API keys are needed. In fact, no internet connection is needed once the model is downloaded.

🔧 HOW IT WORKS
1. Select a model from the dropdown
2. The model downloads once and is cached permanently
3. Start chatting - the AI runs entirely in your browser using WebGPU
4. Ask questions about the current webpage or general topics

💡 USE CASES
• Summarize articles and documentation
• Answer questions about webpage content
• Code assistance and explanations
• General knowledge queries
• All without an internet connection (after initial setup)

⚙️ REQUIREMENTS
• Chrome 113+ (Stable, Beta, Dev, or Canary)
• WebGPU support (enabled by default)
• 2-7GB storage for models
• 2-6GB available RAM

🔒 PRIVACY
• All AI processing happens locally
• No external API calls (except the one-time model download)
• No user data collection or tracking
• Conversations stored locally in your browser only

Models powered by WebLLM and MLC AI.
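The storage requirement in the Offline GPT listing can be turned into a simple model picker. What follows is a minimal TypeScript sketch, assuming the download sizes quoted in the listing. The `ModelOption` type and the `pickLargestModelThatFits` function are hypothetical names used for illustration; they are not part of the extension or of WebLLM.

```typescript
// Download sizes in GB, as quoted in the listing. The names are plain labels
// and are not necessarily the model IDs that WebLLM itself uses.
interface ModelOption {
  name: string;
  downloadGb: number;
}

const MODELS: ModelOption[] = [
  { name: "Phi-3-mini", downloadGb: 2.3 },
  { name: "Qwen2.5-7B", downloadGb: 4.3 },
  { name: "Llama-3.1-8B", downloadGb: 4.9 },
];

// Return the largest listed model whose download fits in the free storage,
// or null when even the smallest model does not fit.
function pickLargestModelThatFits(freeStorageGb: number): ModelOption | null {
  const fitting = MODELS.filter((m) => m.downloadGb <= freeStorageGb);
  if (fitting.length === 0) return null;
  return fitting.reduce((best, m) => (m.downloadGb > best.downloadGb ? m : best));
}
```

In a browser, the free-storage figure could plausibly come from `navigator.storage.estimate()` (quota minus usage), though whether the extension does this is not stated in the listing.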
Ollama Sidekick
Chat with your local Ollama AI models in a side panel. Includes webpage content as context. All data stays on your device.

Ollama Sidekick is a browser side panel interface for chatting with your locally-hosted Ollama AI models. All conversations happen entirely on your machine - no data is sent to external servers.

WHAT THIS EXTENSION DOES
This extension provides a chat interface that connects to Ollama running on your local computer. You can ask questions about any webpage you're viewing, and the extension will include the page content as context for the AI.

Key capabilities:
- Chat with local Ollama models in a side panel
- Automatically extract content from the current webpage as context
- Highlight text on any page to include it in your question
- Manage multiple chat conversations
- Switch between any Ollama models you have installed

This extension requires Ollama to be installed and running locally:
1. Install Ollama from ollama.com
2. Download a model: ollama pull llama3.2
3. Start Ollama with CORS enabled:
   - macOS/Linux: OLLAMA_ORIGINS='*' ollama serve
   - Windows (Command Prompt): set OLLAMA_ORIGINS=* && ollama serve
   - Windows (PowerShell): $env:OLLAMA_ORIGINS='*'; ollama serve
4. Click the extension icon to open the side panel

The extension connects only to localhost:11434 (the default Ollama port). No external servers are contacted.

HOW PAGE CONTEXT WORKS
When you visit a webpage, the extension can read the page content and send it to your local Ollama server along with your question. This lets you ask things like:
- "Summarize this article"
- "Explain this code"
- "What are the main points?"

You can enable or disable page context with a toggle in the extension.

- All processing happens locally on your device
- No account or sign-up required
- No data collection or analytics
- No external API calls
- Chat history is stored only in your browser

Works with any model available in Ollama, including Llama, Mistral, Gemma, CodeLlama, and others.
The extension detects which models you have installed and lets you switch between them.

If you see a connection error:
2. Stop any existing Ollama process:
   - macOS/Linux: pkill ollama
   - Windows: Close Ollama from the system tray or use Task Manager
3. Start Ollama with CORS enabled:
   - macOS/Linux: OLLAMA_ORIGINS='*' ollama serve
   - Windows (Command Prompt): set OLLAMA_ORIGINS=* && ollama serve
   - Windows (PowerShell): $env:OLLAMA_ORIGINS='*'; ollama serve
4. Click the retry button in the extension

Windows users: If you installed Ollama as a desktop app, you may need to set the OLLAMA_ORIGINS environment variable in System Settings > Environment Variables, then restart the Ollama app.
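The page-context and model-detection behavior described for Ollama Sidekick can be sketched against Ollama's REST API (`POST /api/chat` and `GET /api/tags`). This is a minimal TypeScript sketch, not the extension's actual code: the function names, the system-prompt wording, and the `MAX_CONTEXT_CHARS` cap are all assumptions made for illustration.

```typescript
// Request shape for Ollama's POST /api/chat endpoint.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  stream: boolean;
}

const OLLAMA_BASE_URL = "http://localhost:11434"; // default port named in the listing
const MAX_CONTEXT_CHARS = 8000; // hypothetical cap to keep prompts small

// Build the URL and JSON body for a chat call.
// pageText is null when the page-context toggle is off.
function buildChatRequest(
  model: string,
  question: string,
  pageText: string | null
): { url: string; body: ChatRequest } {
  const messages: ChatMessage[] = [];
  if (pageText !== null && pageText.trim().length > 0) {
    // Page content goes in as a system message, truncated to the cap.
    messages.push({
      role: "system",
      content: "Answer using this page content:\n" + pageText.slice(0, MAX_CONTEXT_CHARS),
    });
  }
  messages.push({ role: "user", content: question });
  return { url: `${OLLAMA_BASE_URL}/api/chat`, body: { model, messages, stream: false } };
}

// Extract model names from the JSON returned by GET /api/tags.
function installedModelNames(tagsResponse: { models?: { name: string }[] }): string[] {
  return (tagsResponse.models ?? []).map((m) => m.name);
}
```

A caller would send `body` as JSON with `fetch(url, { method: "POST", ... })` and, with `stream: false`, read the reply from the `message.content` field of the response. A failed request to `/api/tags` is one plausible trigger for the connection error steps listed above.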
Better Tab Split
Open links on the other side of Split View. When using Chrome's Split View, clicking a link normally opens a new tab. This extension opens it in the opposite pane instead.
• Auto-activates when entering Split View (configurable in Options)
• Click the icon to toggle on/off
• Optionally, you can also open links in the current split view
Ollama Client - Chat with Local LLM Models
Local-first Chrome extension for private LLM chat with Ollama, LM Studio, and llama.cpp, including local RAG workflows.

Ollama Client – Local LLM Chat in Your Browser

Ollama Client is a privacy-focused browser extension for interacting with locally hosted AI models. Connect to supported local LLM servers and chat directly inside your browser without relying on cloud-based inference.

Features
• Connect and manage multiple local AI providers
• Switch models and monitor provider status
• Streaming chat responses with stop and regenerate controls
• Session history and chat management
• Local file attachments and optional webpage context
• Custom prompt templates and model parameter controls
• Responsive interface optimized for desktop workflows

Privacy
• No cloud inference
• No external data transfer required
• Data stays on your device and local network

Who It’s For
• Developers working with local AI models
• Researchers testing self-hosted LLMs
• Students learning offline AI workflows
• Privacy-conscious users

1. Install the extension
2. Run a supported local LLM server
3. Connect using localhost or a LAN IP
4. Start chatting

Important Notes
• This extension is a frontend client and does not include AI models
• Performance depends on your hardware and backend server configuration
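The "connect using localhost or a LAN IP" step for Ollama Client can be illustrated with a small URL builder. This TypeScript sketch assumes the commonly used default ports for each backend (11434 for Ollama, 1234 for LM Studio, 8080 for the llama.cpp server); the listing names no ports, and `Provider`, `DEFAULT_PORTS`, and `providerBaseUrl` are hypothetical names, not the extension's API.

```typescript
type Provider = "ollama" | "lmstudio" | "llamacpp";

// Commonly used default ports. These are assumptions: the listing names none,
// and any of them can be changed in the server's own configuration.
const DEFAULT_PORTS: Record<Provider, number> = {
  ollama: 11434,
  lmstudio: 1234,
  llamacpp: 8080,
};

// Build a base URL from a provider and a host such as "localhost" or a LAN IP.
// An explicit port overrides the assumed default.
function providerBaseUrl(provider: Provider, host: string = "localhost", port?: number): string {
  const trimmed = host.trim();
  if (trimmed.length === 0) throw new Error("host must not be empty");
  return `http://${trimmed}:${port ?? DEFAULT_PORTS[provider]}`;
}
```

For example, `providerBaseUrl("lmstudio", "192.168.1.20")` yields `http://192.168.1.20:1234`. Reaching a server over a LAN IP generally also requires that the server listens on a non-loopback address, which is configured on the server side rather than in the client.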