Fast and Comprehensive Code Review, Now in Windsurf
Windsurf receives positive feedback for being a powerful AI coding tool with capabilities that impress users, yet it is criticized for high token consumption when handling simple tasks such as reading a git diff. Users express concerns about its pricing due to inefficiencies in token usage, which can lead to higher operational costs. Overall, the reputation of Windsurf is somewhat mixed; it's valued for its functionality, but the cost and inefficiencies leave several users considering alternatives.
Mentions (30d)
10
Reviews
0
Platforms
4
Sentiment
14%
7 positive
Industry
information technology & services
Employees
120
Funding Stage
Merger / Acquisition
Total Funding
$2.6B
I wasted $500 testing AI coding tools so you don't have to 💸 Here's what actually works: 🧪 Testing ideas? → V0 or Lovable Built a landing page in 90 seconds. Fully clickable, looked real. Code's messy but perfect for validation. 🏗️ Shipping real apps? → Bolt Full dev environment in your browser. I built a document uploader with front end + back end + database in one afternoon. 💻 Coding with AI? → Cursor or Windsurf Cursor = stable, used by Google engineers Windsurf = faster, newer, more aggressive Both are insane. 📚 Learning from scratch? → Replit Best coding teacher I've found. Explains errors, walks you through fixes, teaches as you build. Here's what 500+ hours taught me: The tool doesn't matter if you're using it for the wrong stage. Testing ≠ Building ≠ Coding ≠ Learning Stop comparing features. Match your goal first. Drop what you're building 👇 I'll tell you exactly which tool to use Save this. You'll need it. #AI #AITools #TechTok #ChatGPT #Coding
Pricing found: $10, $0/month, $20/month, $200/month, $40/user
Glia – Local-first shared memory layer (SQLite-vec + FTS5 + Offline Knowledge Graph)
Hey everyone, I wanted to share a project I've been working on called Glia. It is a 100% offline, local-first RAG and memory layer designed to connect your AI web chats (Claude, ChatGPT, DeepSeek) with your local developer tools (Claude Code, Cursor, Windsurf) using a unified local database. I wanted something lightweight that did not require pulling heavy Docker containers or subscribing to third-party memory APIs. I settled on a Node.js + SQLite architecture running sqlite-vec (for 768-dim float32 embeddings) alongside SQLite FTS5 for hybrid search, powered completely by local Ollama instances.
We just launched a live website that outlines the details and demonstrates the features in action:
Website: https://glia-ai.vercel.app/
Codebase: https://github.com/Eshaan-Nair/Glia-AI
Technical Stack & Features:
Hybrid Search Retrieval: SQLite-vec (using nomic-embed-text locally) + FTS5 keyword prefix matching (porter stemmer).
Surgical Sentence-level Trimming: Chunks are sliced into sentences. When a prompt is intercepted, only the exact matching sentences are pulled out of the vector store instead of the whole paragraph. It cuts LLM prompt bloat by ~90-95% in my benchmarks.
Knowledge Graph Extraction: An offline task queue uses a local LLM (llama3.1:8b via Ollama) to extract entity triples (subject-relation-object). These are stored in a SQLite facts table (or Neo4j if you run the full Docker compose profile) and fused with the vector retrieval score.
HyDE (Hypothetical Document Embeddings): Queries are pre-processed to generate a hypothetical answer, which is embedded together with the original query to bridge semantic gaps.
Concurrency: Running SQLite in WAL (Write-Ahead Logging) mode allows the browser extension dashboard and active MCP sessions to read/write concurrently without locking.
PII Redaction: Aggressive scrubbing of JWTs, API keys, emails, and IPs in the extension before data is saved.
The extension works on Claude.ai, ChatGPT, DeepSeek, Gemini, Grok, and Mistral. The MCP server runs out of the same backend database for your terminal agent or Cursor. You can set it up with a single command: npx glia-ai-setup Glia is completely open-source (MIT). If you like the local-first approach or want to contribute to the SQLite vector pipeline, PRs are very welcome, and a star on GitHub helps the project get discovered! I would appreciate any feedback on the SQLite hybrid search scaling, the scoring fusion algorithm (RAG pipeline details are in RAG_PIPELINE.md), or local graph extraction performance. submitted by /u/Better-Platypus-3420 [link] [comments]
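The post asks for feedback on its scoring fusion but does not spell the algorithm out (it points to RAG_PIPELINE.md instead). As a rough illustration of what fusing a vector ranking with a keyword ranking can look like, here is a minimal sketch using reciprocal rank fusion, a swapped-in, commonly used technique and not necessarily what Glia does. The chunk ids and the constant `k=60` are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of ids into one ranking.

    Each id earns 1 / (k + rank) from every list it appears in, so items
    ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and an FTS5 keyword search.
vector_hits = ["chunk-7", "chunk-2", "chunk-9"]
keyword_hits = ["chunk-2", "chunk-4", "chunk-7"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Rank-based fusion sidesteps the problem that cosine similarities and FTS5 relevance scores live on different scales; a weighted sum of normalized scores would be the other obvious option.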
We built a tool that installs frameworks like ComfyUI, Ollama, OpenWebUI etc on any cloud GPU in one command and saves your whole setup between sessions [R]
We kept running into the same problem: every time we rented a GPU to run Ollama + OpenWebUI or ComfyUI, we'd spend the first 45 minutes reinstalling everything. Custom nodes, models, configs, all of it. Docker images went stale fast, different providers had different base images, and nothing was truly portable. We got sick of it and built swm.
Here's what it does for ComfyUI users specifically:
swm gpus -g a100 --max-price 2.00 --sort price: shows you the cheapest available GPU across RunPod, Vast.ai, Lambda, and 7 other providers in one view
swm pod create: spins up an instance on whatever provider you pick
swm setup install comfyui: installs ComfyUI on the pod
From there the main thing is the workspace sync. Your entire setup (custom nodes, models, outputs, configs) lives in S3-compatible object storage (I use B2). When you're done you run swm pod down and it pushes everything, kills the instance, and next time you spin up on any provider you just pull and everything is exactly where you left it. No more reinstalling 15 custom nodes and redownloading checkpoints every session.
We also built a lifecycle guard because we kept falling asleep mid-session and waking up to dumb bills. It watches GPU utilization and if nothing's happening for 30 minutes (configurable), it saves your workspace and terminates automatically. Has saved us more money than we want to admit lol.
A few other things:
Background auto-sync daemon pushes changes every 60 seconds so you don't have to remember to save
Tar mode for huge workspaces with tons of small files packs everything into one S3 object instead of 600k individual uploads
Also supports vLLM, Ollama, Open WebUI, SwarmUI, and Axolotl if you do more than SD
Works with Cursor, Claude Code, Codex, Windsurf if you want your AI agent to manage GPU instances for you
Free, open source, Apache 2.0.
pipx install swm-gpu
Site: https://swmgpu.com
GitHub: https://github.com/swm-gpu/swm
Would love feedback from anyone who rents GPUs.
What's the most annoying part of your current workflow? We are also looking for contributors to the open source repo and suggestions on new frameworks/extensions to be included. Please share your thoughts submitted by /u/Tkpf18 [link] [comments]
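The lifecycle guard described in the post boils down to an idle timer over GPU utilization samples. The sketch below shows that logic only; it is not swm's code. The class name `IdleGuard`, the 5% "busy" threshold, and the idea of feeding it timestamped samples are assumptions. The 30-minute default comes from the post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IdleGuard:
    """Decide when an idle GPU pod should be saved and shut down."""

    idle_limit_s: float = 30 * 60     # 30 minutes, the default named in the post
    busy_threshold_pct: float = 5.0   # assumption: below this counts as idle
    idle_since: Optional[float] = None

    def observe(self, now_s: float, gpu_util_pct: float) -> bool:
        """Record one utilization sample; return True once the limit is hit."""
        if gpu_util_pct >= self.busy_threshold_pct:
            self.idle_since = None    # any activity resets the timer
            return False
        if self.idle_since is None:
            self.idle_since = now_s   # first idle sample starts the timer
        return now_s - self.idle_since >= self.idle_limit_s


guard = IdleGuard()
print(guard.observe(0, 80.0))     # busy sample
print(guard.observe(60, 0.0))     # idle timer starts here
print(guard.observe(1860, 0.0))   # 30 minutes of idle have elapsed
```

A real guard would presumably sync the workspace before terminating, as the post describes; that ordering matters more than the timer itself.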
I built SeeFlow – architecture diagrams that actually run, wired to your live app
Architecture diagrams rot. You spend an afternoon in Confluence, three months later it's wrong, and nobody updates it because there's no forcing function. https://preview.redd.it/l14h40ly3m1h1.png?width=2508&format=png&auto=webp&s=df60b2ba6da04fadf7e1039b9472a106ed163314 SeeFlow tries to fix that by making diagrams executable. It generates a flow canvas from your codebase, then wires each node to your actual running app. There's a Claude Code / Codex/ Cursor / Windsurf plugin that does the heavy lifting: /seeflow show me the shopping cart feature It also ships an MCP server so any MCP-aware editor can register and edit demos without leaving the IDE. Link to the site: https://seeflow.dev 100% Free/ MIT Open Source submitted by /u/mrtule [link] [comments]
I built push notifications for Claude Code so I stop wasting 40 min/day checking if it's done
I run Claude Code for big refactors pretty regularly. The pattern is always the same: start a task, go do something else, come back 40 minutes later, find out it finished 35 minutes ago. Or worse, it's been waiting for permission to run a command the entire time. So I built a thing. It sends push notifications to your phone when Claude Code:
- Finishes a task
- Needs permission to run a command (you can approve/deny from your lock screen)
- Hits an error
Setup is one command. It works by registering an MCP tool and permission hooks in your Claude Code config. When something happens, it sends a push via Web Push API. For permission requests, your response routes back to the agent through Redis pub/sub. The whole loop is under 500ms. The permission hook is the part I use most. Claude wants to run rm -rf ./src? I get a notification with Allow/Deny. Tap from my lock screen. Agent continues. I don't need to be staring at the terminal. Also works as a control panel with Cursor, Codex, Windsurf, and Hermes if anyone's using those. Free to start at pushary.com/ai-coding Would love feedback, especially from anyone running Claude Code on longer tasks. submitted by /u/Dapper_Ad620 [link] [comments]
I built Skills Curator - a context-aware Claude skill manager that understands your stack
https://preview.redd.it/r1leyp4g2x0h1.png?width=1078&format=png&auto=webp&s=adca2d7a39b77c859665d5281818b84010bb501f Repo: https://github.com/captkernel/Skills_Curator Install: npx skills add captkernel/Skills_Curator Huge catalog, no memory of past decisions, re-evaluating the same skills every few weeks. That was my loop. But the deeper issue: every skill you install wholesale brings its author's opinions and tradeoffs into your codebase. Stack enough of them and your project stops being yours. So I built Skills Curator — a Claude Code skill whose entire pitch is judgment, not plumbing. What makes it different from npx skills, asm, or vercel/find-skills: 1. Skill customization. Don't just install — decompose. Strip any skill to its most granular parts, understand what each piece does, rebuild a version for your stack and constraints. Skills Curator structures that process. One voice shaping the output. Yours. 2. Comes to you. Reads your config files (deps, CLAUDE.md) at session start. Nothing changed, nothing happens. Added a framework? It surfaces top picks as a quiet observation — not a pitch. 3. Symptom-based matching. "Tests are slow", "deploys are manual", "UI looks ugly" — maps your complaint to skill categories via a 17-pattern table. 4. Pre-install security scan. 14 risk patterns: RCE, hardcoded API keys, GitHub PATs, base64 obfuscation, credential-store access. Automatic before any verdict. 5. Decisions persist forever. Every evaluation stores pros/cons/conflicts/verdict/partial-adoption plan. Same skill resurfaces six months later — you read your past judgment in 5 seconds. --export-eval for PR-ready markdown. 6. Ranks by fit, not popularity. Tag overlap × trust tier. A 200-install skill that matches your stack beats a 50,000-install one that doesn't. 7. Cross-agent portability. 55 platforms — Claude Code, Copilot, Cursor, Codex, Gemini CLI, Cline, Windsurf, OpenCode, and 47 more. 8. Two tiers, same plugin. 
Lite is default — pure markdown, zero friction. Python tier (~2.3k LOC, stdlib-only) for large catalogs and cross-device Gist sync. Different registry paths, no conflicts. 37 pytest cases, CI on 3 OS × 4 Python versions, MIT licensed. Genuinely curious about the activation trigger problem — when should an agent proactively suggest skills vs. stay silent? If you've thought about this, I'd love to hear it. submitted by /u/captkernel99 [link] [comments]
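Point 6 of the post ("Tag overlap × trust tier") can be illustrated with a few lines of code. This is a guess at the shape of such a score, not Skills Curator's actual formula: the Jaccard overlap, the tier names, and their weights are all assumptions, as are the two example skills.

```python
# Hypothetical trust tiers and weights; the post does not list the real ones.
TRUST_WEIGHT = {"official": 1.0, "verified": 0.8, "community": 0.5}


def fit_score(project_tags, skill_tags, trust_tier):
    """Score a skill by tag overlap with the project, scaled by trust tier."""
    project, skill = set(project_tags), set(skill_tags)
    if not project or not skill:
        return 0.0
    overlap = len(project & skill) / len(project | skill)  # Jaccard similarity
    return overlap * TRUST_WEIGHT[trust_tier]


project = {"typescript", "react", "vitest"}
skills = [
    {"name": "popular-django-helper", "installs": 50_000,
     "tags": {"python", "django"}, "tier": "official"},
    {"name": "niche-vitest-tuner", "installs": 200,
     "tags": {"typescript", "vitest"}, "tier": "community"},
]
# Install counts are deliberately ignored: only fit decides the order.
ranked = sorted(skills,
                key=lambda s: fit_score(project, s["tags"], s["tier"]),
                reverse=True)
print([s["name"] for s in ranked])
```

Here the 200-install skill ranks first because it shares two of the project's tags, while the 50,000-install one shares none, which mirrors the example given in the post.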
Where I'm at with AI Assisted Building + Current and Future Workflow Overview
I've been in an AI dive bomb for probably a couple of years now, since the early days when models couldn't be trusted for more than 5% of the code you wrote. Over the last 2 years that's evolved so quickly that I now write nearly 0% of my code by hand, on personal projects and at work. I've used all kinds of tools in that time too: OpenCode, Zed, Claude Code, Codex, Cursor, Windsurf, OpenCLAW, Lovable... and probably a bunch more I can't recall in the haze that's been AI ADHD for me. I started with just copy-pasting code between ChatGPT's interface and my IDE, almost like a slightly faster Stack Overflow search. Then that evolved quite a bit with Cursor. I went from prompt engineering to something closer to a human relay pattern. Then, with Plan Mode becoming a thing, I naturally gravitated more towards planning everything, because planning felt so cheap. I used to think that architectural discussion and planning were reserved for larger features, but with my ability to do research, orient myself within a codebase, and know what tools I have to reach for all expedited, doing technical specifications for everything felt reasonable. From the human relay pattern, I started evolving into more autonomy, especially when Claude Code came out earlier last year. Between the combination of Cursor and Claude Code, starting to get orchestration, starting to use skills more heavily, and starting to create actual agent personas that could replace some of my common prompt chains, it was around then that I started going all in on true context engineering, utilizing sub-agents, and optimizing cache reads, and it's probably when many of my first (I call it) sophisticated commands were born. All of this converged pretty rapidly in November of 2025 with the release of what was probably the biggest step increase for AI as far as code quality went, with Opus 4.5 and Codex 5.3. The Codex app and Codex CLI were quickly growing.
Claude Code was improving at a breakneck pace, adding all kinds of new ways to introduce deterministic gates within the autonomy of the harness. Fast forward to today: I have a pretty sophisticated workflow with a combination of agents that do everything within the SDLC, commands for almost every type of entry point for work, and skills for just about everything I could possibly do in my day-to-day. With some of the latest tools, the workflow is able to run quite autonomously overnight and do large feature implementations, minimally supervised, while producing production-worthy code quality. About a month and a half ago I realized it had reached a point where I needed to figure out a way to remove myself even more from the loop without jeopardizing the determinism that I bring to what is effectively a probabilistic LLM. The models are exceptional, and they seem to take a massive step up with each release, but continuous execution, strict instruction rigor, and preventing hallucinations are still very difficult to achieve. That's predominantly what I've been doing. I've effectively offloaded a lot of thinking to the agents and LLMs that I use, but none of the understanding. I've asked myself, "How do I maintain that understanding, and the determinism from my steering, without actually physically being there to steer?" This was essential, and I had a bit of an aha moment: it's just like how I manage teams of engineers working on numerous projects, most of which I can never really go too deep on. Even though they do most of the thinking, most of the building, and even most of the implementation planning, I am still there, very close to the architecture. I can speak to enough breadth and enough depth to keep us out of trouble and keep things moving. I started thinking more about what the shape of me was within the agentic harness and how I could replicate that. More on what I landed on a little bit later.
My Setup and How I Work Today
To start, I'll talk a little bit about my current working setup. I am predominantly in the terminal nowadays, using Claude Code. Claude Code orchestrates the Claude models, of course, and I also use it to orchestrate Codex through a series of runbooks, skills, and commands that I have set up on several hooks, so that Codex, when it gets dispatched, has access to the same skills and agent personas Claude does. I use Ghostty as my terminal of choice and use the IDE integration in Claude Code pretty heavily to review Markdown or HTML files in my IDE. I also use it to review code snippets and diffs, although lately I find myself only really looking at the code once it's hit a merge request. Some of my adjacent tools are Wispr Flow for faster steering, since I can speak a lot faster than I can type, and then I use quite a few MCPs and tools to improve my token usage, but the big ones are I have a custom doc maintenance suite of
Claude in the editor vs terminal vs bridge.
Claude Code is excellent at writing code. Your IDE, however, already knows things the model doesn’t. Right now the field is bridging that gap in three very different ways:
Option 1 – Bring the model into the editor (Cursor, Windsurf, Copilot family, Antigravity). The editor is the host; the model is a privileged guest. Tight UX, but the editor vendor decides what the model is allowed to see or touch.
Option 2 – Keep the model in the terminal with shell tools (the default Claude Code experience). Full power, zero opinions. But the model reads your codebase like a brand-new contributor: grep, cat, ad-hoc CLIs. No LSP, no symbol graph, no debugger state. It re-derives everything every session.
Option 3 – The bridge. Run a tiny process next to the editor that exposes the IDE’s knowledge (diagnostics, LSP, debugger, terminal buffers, git state) as MCP tools. Claude Code stays in the terminal, the editor stays the editor, and a clean protocol seam sits in the middle. This is what claude-ide-bridge / Patchwork OS does, and it’s roughly the shape of Anthropic’s per-language LSP plugins and JetBrains’ recent native MCP integration.
The "bet" behind the bridge approach
The bridge bets that the single biggest difference between a good agent run and a bad one is how much of the real situation the model can see before it acts. If you believe that, the architectural consequences are almost mechanical:
You optimize for tool fidelity, not tool count. Five tools that return exactly what the LSP returns beat fifty tools that shell out and parse stdout.
You stop treating the IDE as a UI and start treating it as a knowledge source. The extension’s job is to answer questions for the model (“What diagnostics are active right now?”, “What's in the debugger locals?”, “What did the terminal just print?”). The human is incidental.
You stop shipping the agent and start shipping the seam. The bridge is a protocol, not an application.
Any capable model (Claude Code, Codex, or future agents) can drive it.
In short: the bridge approach is a bet that the hard part of agentic coding is context and that everyone is quietly converging on the same shape of solution.
Where it gets uncomfortable
More tools and more context are not always better. Sometimes Claude Code + bash + a good prompt beats a fully wired bridge because the model doesn’t waste turns figuring out which of 170 tools to call. My take: tool surface should be a function of task, not a constant. My setup uses an MCP bridge that gives Claude Code tools. “Slim mode” (~60 tools: LSP + debugger + editor state) is usually better for refactoring. “Full mode” (~170 tools) earns its keep on multi-stage work (diagnostics → fix → test → commit → PR) because the alternative is the model constantly context-switching between bash calls.
The other uncomfortable truth: the more the model can see, the faster you need an oversight layer (approval queues, write-gating, audit logs). Not because models are evil, but because silence is the wrong default when the surface is large. That layer isn’t a nice-to-have; it’s an architectural consequence. (Full disclosure: my own project is in this space, which is why I’m being upfront.)
So the interesting open question isn’t “will models obviate this?” It’s “will agent harnesses absorb this?” Claude Code (or any future harness) could grow its own native LSP, run tsc --noEmit, parse ASTs with tree-sitter, and manage its own debugger session. That still validates the “deterministic tools beat simulation” thesis, but the seam moves inside the agent. The editor stops being load-bearing.
I still think the bridge wins, and not for a glamorous reason: the editor is already running all this stuff warm. The LSP server is hot, diagnostics are computed, and the debugger is attached. An agent that cold-starts all of it on every turn is doing redundant work that compounds over a long session.
The bridge isn’t just a protocol, it’s a cache of expensive computations the human already paid for. What do you think? submitted by /u/wesh-k [link] [comments]
I built a persistent memory MCP server for Claude Code (open source, Go, single binary)
Claude Code forgets everything between sessions. Same mistakes, same questions, same conventions re-explained. I built mnemos to fix that. It's an MCP server that gives Claude Code persistent memory across sessions. On session start, it pushes a ranked context block back into Claude: conventions you've established, corrections you've made before, skills it learned, hot files, recent session summaries. Next session starts already knowing what the last one figured out. What it does: Records corrections as tried / wrong_because / fix. Three corrections on the same topic auto-promote into a reusable skill with When this applies / Avoid / Do sections. No LLM in the loop, just deterministic pattern-mining, so it's reproducible and token-free. Bi-temporal store: facts carry valid/invalid timestamps, so "we used to use X, now Y" works without poisoning context with stale info. Compaction recovery: when Claude Code compacts mid-session, one tool call restores the goal and key decisions. Prompt-injection scanner at the write boundary, since memory stores are a new attack surface (instruction overrides, zero-width unicode, MCP spoofing). Retrospective replay: regenerate any past session as markdown with everything learned since layered in, paste it back to Claude, ask "what would I do differently now." Stack: Single static Go binary, 15 MB. No Python, no Docker, no vector DB, no CGO. SQLite + FTS5 for retrieval, optional cosine similarity if Ollama is running. Install (free, MIT, no paid tier): curl -fsSL https://raw.githubusercontent.com/polyxmedia/mnemos/main/scripts/install.sh | bash mnemos init mnemos init auto-wires Claude Code, Claude Desktop, Cursor, Windsurf, and Codex CLI. Restart your agent and the mnemos_* tools show up. GitHub: https://github.com/polyxmedia/mnemos Built it because I was tired of re-teaching Claude the same conventions every session. Happy to answer questions. submitted by /u/snozberryface [link] [comments]
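The correction-to-skill promotion described above (three corrections on one topic become a skill with "When this applies / Avoid / Do" sections) is a small deterministic rule. The sketch below shows that rule in Python rather than Go, and it is not mnemos's code: the class names, the dict layout, and exact-string topic matching are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

PROMOTE_AFTER = 3  # threshold stated in the post


@dataclass(frozen=True)
class Correction:
    topic: str
    tried: str
    wrong_because: str
    fix: str


class CorrectionLog:
    """Group corrections by topic and promote a topic once it recurs enough."""

    def __init__(self):
        self._by_topic = defaultdict(list)

    def record(self, correction):
        """Store a correction; return a skill dict at the threshold, else None."""
        bucket = self._by_topic[correction.topic]
        bucket.append(correction)
        if len(bucket) < PROMOTE_AFTER:
            return None
        # No LLM involved: the skill is assembled from the stored fields.
        return {
            "when_this_applies": correction.topic,
            "avoid": [c.tried for c in bucket],
            "do": [c.fix for c in bucket],
        }


log = CorrectionLog()
first = log.record(Correction("date parsing", "strptime without tz", "drops the offset", "use fromisoformat"))
second = log.record(Correction("date parsing", "naive datetime.now()", "local time leaks in", "use datetime.now(timezone.utc)"))
third = log.record(Correction("date parsing", "string comparison of dates", "breaks across formats", "compare datetime objects"))
print(third)
```

Because the rule is pure bookkeeping, the same inputs always produce the same skill, which is the reproducibility the post is pointing at.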
An agent skill to enforce AI to write modern CSS
An agent skill that enforces modern CSS practices based on your project's browser targets. Covers 57+ CSS features across color, layout, selectors, animation, typography, positioning, and component patterns. Works with Claude Code, Cursor, Windsurf, Codex, Cline, GitHub Copilot, and other AI coding agents. https://github.com/rushenn/css-modern-features submitted by /u/Informal-Fan-8590 [link] [comments]
I built a tool to stop Claude Code from reading half my codebase on every task and I'm curious what you think
Hey everyone, I have been using Claude Code heavily for the past few months and kept running into the same problem: on any non-trivial task it would grep through dozens of files and pull in tests, unrelated callers, config files, etc. The token burn was too much. So I built something to fix it for myself, and it's gotten to a point where I feel okay sharing it publicly. I called it Coograph. It parses your repo into a SQLite dependency graph, and the agent queries that graph before opening any files. Instead of "read everything related to OrderService", for example, it gets back the 3–5 files that actually matter. On a benchmark task I set up, it went from 20 files / ~4,700 tokens down to 4 files / ~970 tokens. It's MCP-native and multi-tool (should work with Cursor, Windsurf, OpenCode too), and the graph lives in a .code-graph/graph.db file in your repo. It's early (v0.1.0), and honestly I'm still figuring out the rough edges. GitHub: https://github.com/paullukic/coograph Docs/getting started: https://coograph.com/docs/getting-started/ Happy to answer questions or hear your experience using it. submitted by /u/Melodic_Volume_2888 [link] [comments]
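To make the "query the graph before opening files" idea concrete, here is a minimal sketch of a dependency graph in SQLite with a bounded traversal. It is not Coograph's schema or query: the `deps` table, the file names, and the `max_depth` parameter are assumptions for illustration, and a real tool would populate the edges by parsing imports.

```python
import sqlite3


def build_graph(edges):
    """Create an in-memory graph of (src file, dst file it depends on) edges."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE deps (src TEXT NOT NULL, dst TEXT NOT NULL)")
    db.executemany("INSERT INTO deps VALUES (?, ?)", edges)
    return db


def related_files(db, start, max_depth=1):
    """Return the start file plus its dependencies up to max_depth hops away."""
    rows = db.execute(
        """
        WITH RECURSIVE reach(path, depth) AS (
            SELECT ?, 0
            UNION
            SELECT deps.dst, reach.depth + 1
            FROM deps JOIN reach ON deps.src = reach.path
            WHERE reach.depth < ?
        )
        SELECT DISTINCT path FROM reach ORDER BY path
        """,
        (start, max_depth),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical repo: the test file depends on the service, not the reverse.
db = build_graph([
    ("order_service.py", "order_repo.py"),
    ("order_service.py", "pricing.py"),
    ("pricing.py", "tax_tables.py"),
    ("test_orders.py", "order_service.py"),
])
print(related_files(db, "order_service.py"))
```

The depth bound is what keeps the answer to a handful of files: tests and unrelated callers never enter the result because edges are only followed from a file to what it depends on.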
With just one prompt, AI successfully found and emailed 200 potential investors for my startup.
I’m a solo founder, and fundraising outreach used to drain me — scraping emails, checking duplicates, writing personalized cold emails, and logging everything to Notion. Hours of grind per batch. So, I built one prompt that does all of it. I paste it into any AI agent (Claude Code, Cursor, Windsurf, whatever), and it: Searches the web for relevant investors, partners, or customers. Checks my Gmail + Notion to ensure no one is contacted twice. Writes a personalized email for each one (no generic templates). Sends every email individually via my SMTP. Logs everything to Notion with thread IDs. Auto-corrects itself if something fails. Yesterday, it found and emailed 200 targets while I made lunch. Zero duplicates. Full audit trail in Notion. Multiple replies already. This works for investors, customers, B2B partners, job applications — anything that requires personalized mass outreach. The entire skill file is open-source: 👉 github.com/samihalawa/swarm-massive-outreach-skill Just drop it into your AI agent, plug in your SMTP + Notion creds, edit the 5 lines about your startup, and run it. One prompt. Done. Happy to answer questions in comments. submitted by /u/BlacksmithHot17 [link] [comments]
Open-source MCP server for Ejentum cognitive harnesses / (reasoning, code, anti-deception, memory)
Open-source MCP server that exposes four cognitive harnesses as tools any agentic client can call. Each tool returns a structured cognitive scaffold (failure pattern to avoid, procedure, suppression vectors, falsification test) that the calling LLM absorbs internally before generating its response.
The four tools:
- harness_reasoning - multi-step analysis, planning, diagnostics, cross-domain synthesis
- harness_code - code generation, refactoring, review, debugging
- harness_anti_deception - sycophancy pressure, hallucination risk, manipulation pressure
- harness_memory - perception sharpening, drift detection across turns
What it catches: LLM failure modes that ship as confidently-wrong answers. Sycophancy under user pressure. Hallucinated citations. Causal shortcuts. Reasoning decay across long chains.
Install via Smithery: npx -y @smithery/cli install ejentum/ejentum-mcp --client claude
Replace `claude` with cursor, windsurf, cline, etc. Manual install JSON for any MCP client is in the README.
Works in: Claude Desktop, Cursor, Windsurf, Claude Code, n8n's MCP Client node, Cline, Continue, and any other MCP-compatible client.
Note on autonomous routing: tools fire reliably on explicit invocation ("use harness_anti_deception to..."). Cold-prompt autonomous calling is structurally unreliable for any optional MCP tool. For stronger autonomous routing in Claude Code, install the skill files alongside.
Free Ejentum API key, no card.
Listings:
- Smithery: https://smithery.ai/servers/ejentum/ejentum-mcp
- Glama: https://glama.ai/mcp/servers/ejentum/ejentum-mcp
- mcp.so: https://mcp.so/server/ejentum-mcp/Ejentum
Source (MIT): https://github.com/ejentum/ejentum-mcp
Docs: https://ejentum.com/docs/mcp_guide
submitted by /u/frank_brsrk [link] [comments]
Gemini has a big outage going on but refuses to acknowledge on official status page! How do you know if an LLM API is actually down vs just you?
Genuine question. Gemini had a 5+ hour outage this morning. I found out because a user reported it on Tickerr, not because Google said anything. Status page was green the whole time. I built Tickerr using Claude Code for this only. It runs independent streaming API calls to LLM providers every 5 minutes and tracks real inference performance - not just HTTP pings. https://preview.redd.it/r6ugn0e57bzg1.png?width=1080&format=png&auto=webp&s=779961c2ee83245f9a46c10ced99f0ddc854494b The other way to know it's not just you is if other people are hitting the same thing at the same time. Which is why I also built a crowdsourced failure signal into Tickerr.ai - agents report 5xx errors anonymously and get back whether others are seeing the same thing. It's free to try, if you want to add reporting to your agent, three ways depending on your setup: MCP (Claude Code, Cursor, Windsurf): report_incident(provider="google", model="gemini-2.5-flash", error_code=503, error_type="overloaded") REST (any language): curl -X POST https://tickerr.ai/api/v1/report \ -H "Content-Type: application/json" \ -d '{"provider":"google","model":"gemini-2.5-flash","error_code":503}' Python: httpx.post("https://tickerr.ai/api/v1/report", json={ "provider": "google", "model": "gemini-2.5-flash", "error_code": 503 }) No API key. Anonymous. You get back how many other agents reported the same issue and what to fall back to. But it only works if agents are actually reporting. Anyone here already handling this problem a different way? submitted by /u/Remarkable_Divide755 [link] [comments]
I built a tool that cut my Claude Code token bill 89%. v3.4 just shipped, works in 8 IDEs.
Quick context: I have been hitting Claude Code Max 5x limits in under 2 hours on real work. The session counter goes from 21% to 100% on a single complex prompt. If you have been on the recent threads, you know exactly what I mean. So I built engramx. It is an MCP server plus a SQLite knowledge graph that intercepts file reads at the agent boundary. When Claude is about to read a file engram has indexed, the hook returns a structural summary instead of the raw content. Same edit, same diff, far fewer tokens consumed in the round trip. The benchmark is committed to the repo. On a real 87-file codebase, the aggregate reduction is 89.1%. Best-case file dropped from 18,820 tokens to 306. The bench script is bench/real-world.ts; you can run it on any project you own. v3.4 shipped Friday and all the install paths are live now. The same engram works across 8 IDEs natively: Claude Code (hooks plus the official plugin in review), Cursor (MDC plus MCP plus a VS Code extension on OpenVSX), Cline, Continue.dev, Aider, Windsurf, Zed, OpenAI Codex CLI. One install, one graph, every tool benefits. It is local-first. The SQLite database lives at .engram/graph.db in your repo. Nothing leaves your machine. Apache 2.0. No account, no telemetry.
npm install -g engramx
cd ~/your-project
engram setup
Cursor users can install the extension directly:
code --install-extension nickcirv.engram-vscode
Heads up on what comes next. v4.0 "Mesh + Spine" lands May 25. It adds an opt-in federation layer so engram instances on different machines exchange mistakes and ADRs without sharing source. Phase 1 foundation already merged this week (ed25519 identity, 14-category PII gate, 1007 tests). Subscribe via the GitHub Discussions page if you want updates. There is also an engram cost command that tracks how many tokens it has saved you, per project per week. After 24 hours of normal use the digest shows real numbers. Repo and benchmark: github.com/NickCirv/engram Happy to answer questions.
If you have hit the new rate limits and want a second pair of hands on it, comment your stack and I will help. submitted by /u/SearchFlashy9801 [link] [comments]
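The post does not say what a "structural summary" contains, so the following is only a guess at the general idea: replace a file's full text with its top-level signatures. This sketch handles Python source only, using the standard-library `ast` module; the function name `structural_summary` and the output format are assumptions, and engram's own summaries and hook wiring may look quite different.

```python
import ast

FUNCTION_NODES = (ast.FunctionDef, ast.AsyncFunctionDef)


def structural_summary(source):
    """Summarize Python source as one line per top-level class or function."""
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, FUNCTION_NODES):
            args = ", ".join(arg.arg for arg in node.args.args)
            lines.append(f"def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            methods = [n.name for n in node.body if isinstance(n, FUNCTION_NODES)]
            lines.append(f"class {node.name}: {', '.join(methods)}")
    return "\n".join(lines)


# Hypothetical file content an agent was about to read in full.
SAMPLE = '''
class Cart:
    def add(self, item):
        self.items.append(item)

    def total(self):
        return sum(i.price for i in self.items)


def checkout(cart, user):
    return cart.total()
'''

summary = structural_summary(SAMPLE)
print(summary)
```

The savings scale with how much of a file is function bodies; an agent that then needs one specific body would still have to read that part of the file.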
Yes, Windsurf offers a free tier. Pricing found: $10, $0/month, $20/month, $200/month, $40/user
Key features include: Trusted by Developers. Proven in Enterprises.
Windsurf is commonly used for: AI-assisted code generation, Automated testing and debugging, Real-time code reviews, Collaborative coding environments, Custom AI model training, Integration with CI/CD pipelines.
Windsurf integrates with: GitHub, GitLab, Jira, Slack, Trello, AWS, Azure DevOps, Google Cloud Platform, Docker, Kubernetes.
Based on user reviews and social mentions, the most common pain points are: token usage, token cost.
Shawn Wang
Founder at smol.ai
3 mentions

Ferrovial: Building The Future Of Infrastructure With Windsurf
Aug 7, 2025
Based on 49 social mentions analyzed, 14% of sentiment is positive, 84% neutral, and 2% negative.