Aider is an open-source AI coding tool, which makes it appealing to developers looking for help with coding projects. Users credit it with reducing complexity and streamlining code management. A frequent complaint is that memory and context are not retained across sessions, a problem also reported with other tools such as Claude Code. Users also raise concerns about the rising cost and resource usage of running such tools. Overall, Aider has a positive reputation for innovation in AI coding, though memory efficiency and running costs remain open challenges.
Mentions (30d): 12 (2 this week)
Reviews: 0
Platforms: 4
GitHub Stars: 42,600 (4,101 forks)
Industry: accounting
Employees: 1
GitHub followers: 641
GitHub repos: 7
GitHub stars: 42,600
npm packages: 20
HuggingFace models: 14
Six agents running. Three are paused waiting for me. I haven't written a line of code in two hours.
I've been running parallel Claude Code agents for a few months. The promise was speed - 5× the throughput because 5× the agents.

What actually happens by hour two: One agent stops on a yes/no. You alt-tab to it, approve, alt-tab back. Two more pause within the next minute. You scroll through their context, lose your place in the first one. Now there are four waiting. You're not writing code anymore - you're processing a decision queue you accidentally built for yourself. The agents aren't slow. You are.

I started calling this the bottleself: the point where parallelism stops adding output and starts adding approvals you can't process fast enough. The ceiling on your system isn't tokens, model speed, or context window. It's the human in the loop.

So I built a layer above the agents - a planner that:
- takes a high-level goal
- decomposes it into parallel subtasks
- spawns parallel Claude Code sub-agents - one per task
- has a QA sub-agent review the output
- pings you only when it actually can't decide

Right now it's Claude Code only. Codex / Cursor / Aider integrations next. For a fresh repo with Claude Code, the planner handles decomposition + parallel execution end-to-end without me touching the keyboard.

Source: github.com/gekto-dev/gekto
Try: npx gekto

Honest question to anyone running 5+ agents: how much of your day is actually writing code vs clearing the queue your agents created? Where does the bottleself hit for you?

submitted by /u/OptimisticYogurt42
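The planner loop described in this post can be sketched roughly as follows. This is a minimal Python illustration, not gekto's actual code: `decompose`, `run_agent`, and `qa_review` are hypothetical stand-ins for model calls, and the rule that marks one task as "unsure" exists only to show the escalation path.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Result:
    task: str
    output: str
    verdict: str  # "approve" or "unsure"


def decompose(goal: str) -> list[str]:
    # Hypothetical planner: a real one would ask a model for subtasks.
    return [f"{goal}: part {i}" for i in range(1, 4)]


def run_agent(task: str) -> str:
    # Stand-in for spawning one coding sub-agent per task.
    return f"patch for {task}"


def qa_review(task: str, output: str) -> str:
    # Stand-in QA reviewer; "part 2" is treated as undecidable on purpose.
    return "unsure" if task.endswith("part 2") else "approve"


def run_goal(goal: str) -> tuple[list[Result], list[Result]]:
    tasks = decompose(goal)
    # Independent subtasks run in parallel; map() keeps the input order.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        outputs = list(pool.map(run_agent, tasks))
    results = [Result(t, o, qa_review(t, o)) for t, o in zip(tasks, outputs)]
    # Only results the reviewer could not decide are escalated to the human.
    needs_human = [r for r in results if r.verdict == "unsure"]
    return results, needs_human
```

Calling `run_goal("add login")` returns all three results plus a one-item escalation list, so the human sees a single question instead of three approvals.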
Max20 user: anyone running Opus 4.7 as orchestrator + DeepSeek V4 as the worker via OpenRouter?
I'm on the Max20 plan, thinking about a setup before I sink time into it. Want to hear from anyone actually running it, not theorycraft.

The idea: Opus 4.7 in Claude Code as the orchestrator. It plans, breaks down tasks, reviews code quality, catches mistakes. The actual implementation, the bulk token spend, gets delegated to DeepSeek V4 Pro through OpenRouter. DeepSeek lands credibly close to Opus 4.7 on agentic coding benchmarks at a fraction of the output-token cost, so the bet is: keep Opus for the judgment-heavy parts, don't burn it on routine implementation.

I'm not expecting huge savings. Realistically maybe an extra 30% (guessing here) effective Opus headroom if delegation works cleanly, and even less margin now that the limits situation has loosened a bit. So part of the question is genuinely whether 30% is worth the integration friction at all, or whether it's a fun idea that doesn't pay for itself.

Pre-empting the obvious responses, because I've already thought about these:
- "Just use Sonnet for the cheap parts." The easy answer. But I'm specifically curious whether an external model's cost delta beats the friction, and whether anyone's actually measured it.
- "Max20 already gives generous Opus limits, why bother." Fair. But I'd rather use Opus where it earns its keep and not think about rationing for the rest. It's about allocation, not desperation.
- "The quality gap means Opus spends all its effort fixing DeepSeek's output." This is the actual question. DeepSeek reportedly drifts more than Opus on long agentic loops with many sequential tool calls. So does a tight review loop close that gap, or does it eat the 30%? That's what I want real data on.
- "This fights how Claude Code is built." Probably. Claude Code's subagents run on Claude models, so I assume this needs a different tool (Aider, Cline, Kilo) or a custom routing layer. If the real answer is "don't do this in Claude Code at all," tell me what you'd use instead.

I know the single-model answer. I'm after whether the split specifically works in practice.

submitted by /u/theargen
We built a free tool that generates a DESIGN.md from any live URL, keeps AI coding agents on-brand
The Google Labs DESIGN.md spec launched last month. It's a machine-readable markdown file your AI coding agent reads to understand your design system. This tool automates creating it. Paste any public URL: the tool extracts CSS variables, typography, Tailwind classes, and component patterns, then an AI assembles them into a spec-compliant DESIGN.md. A visual editor lets you fine-tune tokens before you download. Drop the file in your repo root and your agent has a consistent design reference across every session. Works with Cursor, Claude Code, GitHub Copilot, Aider, and Continue. Free, no signup. https://www.masumi.network/tools/design-md https://reddit.com/link/1tb2tki/video/tlqzrvm1sp0h1/player submitted by /u/thinkgrowcrypto
I built a “Living Docs” system for long-term AI coding workflows
English is not my first language. AI actually told me to post this here, and also helped write this post 😅

After months of AI-assisted coding, I kept running into the same problems:
- repeating architecture context every session
- stale docs
- conflicting rules
- context drift
- AI modifying wrong parts of the project
- knowledge disappearing between sessions

So I started building a documentation system specifically for AI workflows. The idea became something I now call “Living Docs”.

Core idea: The same agent that changes the code is also responsible for maintaining the documentation and operational memory.

But there is one important constraint: Documentation is NOT updated automatically after every task. The human confirms the code is correct first. Then the agent performs a deliberate “doc sweep” to sync the docs. Otherwise wrong code can mutate the docs, and then future sessions start treating incorrect behavior as truth.

Some core rules from the system:
- One file owns each rule. No duplication. If a rule exists in two places, you now have two sources of truth, which means you have none.
- Code is primary truth for behavior. Docs are primary truth for intent.
- The docs are not static reference material. They act as institutional memory shared between humans and AI across sessions.

The architecture has 3 layers:
- codebase
- LLM-maintained docs
- governance/schema layer

The governance layer tells the agent:
- which docs to load
- which file owns what
- when documentation updates are allowed
- how to prevent duplication and context drift

Still experimental, but it already improved long-session stability a lot for me on larger projects.

Repo: https://github.com/Diew/living-docs

Would genuinely love feedback from people working with Cursor, Claude Code, Aider, Roo, OpenHands, etc.

submitted by /u/RenAzure
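The "one file owns each rule" constraint lends itself to a mechanical check. Below is a minimal Python sketch of such a check. It is an assumption about how a governance layer could enforce the rule, not code from the living-docs repo; the file paths and rule IDs are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_owners(docs: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return every rule ID that is declared in more than one doc file."""
    owners: dict[str, set[str]] = defaultdict(set)
    for path, rule_ids in docs.items():
        for rule_id in rule_ids:
            owners[rule_id].add(path)
    # A rule with two owners has two sources of truth, so flag it.
    return {rule: sorted(paths) for rule, paths in owners.items() if len(paths) > 1}


# Hypothetical ownership map: doc file -> rule IDs it declares.
docs = {
    "docs/architecture.md": ["api-versioning", "db-migrations"],
    "docs/conventions.md": ["naming", "api-versioning"],
}
duplicates = find_duplicate_owners(docs)
```

Here `duplicates` flags `api-versioning` because two files claim it; a doc sweep could refuse to proceed until only one owner remains.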
For system designers
Open-source spec studio for Claude Code. Draft a Markdown spec + an architecture diagram in the browser, then hand it off three ways: paste your API key, copy to claude.ai, or run a generated CLI snippet if you only have Claude Code. Optional: drop a GitHub PAT and it pushes CLAUDE.md straight to a branch + PR. I built the whole thing with Claude Code: the Vite migration, the BYOK integration, the pluggable storage layer, 95 tests, the wiki, even the screenshots (Playwright drives the real app). Free, MIT, no signup, no telemetry, keys stay in your browser. https://github.com/Hesper-Labs/architect submitted by /u/hsperus
I got tired of AI agents destroying my codebase and eating tokens, so I built a self-bootstrapping Markdown protocol to fix their memory.
Hi everyone,

If you use Claude, Cursor, Copilot, or Gemini for large projects, you know the pain: after 20 messages, the AI's context window gets bloated. It forgets the architecture, hallucinates features, or worse, overwrites perfectly good code because it didn't read the right files. I realized the problem isn't the models; it's how we manage their memory.

So I created BEMYAGENT: a single, lightweight Markdown file (BEMYAGENT.md) that acts as an "Agent OS". You just drop it into your project root, tell your AI to "Execute BEMYAGENT.md bootstrap", and it automatically generates a strictly separated file structure:
- docs/ (Immutable truth): 01-overview, 02-architecture, 03-code-map. The AI is forced to use Lazy Loading (it's instructed never to read feature specs unless strictly required for the current task).
- work/ (Volatile memory): Uses a Fractal TTE (Think-Task-Execute) workflow based on Hierarchical Task Networks (HTN). If a task is too big, the AI must decompose it into sub-folders instead of executing blindly.

The coolest feature? Model Handoff / Pacing. I built a configuration state right into the rules. You can tell the AI to switch to INTERACTIVE mode. It will use a heavy model (like o1 or Claude 3.5 Sonnet) to write the 01_think.md strategy, then it pauses. You swap to a fast/cheap model (like Haiku or Flash) in your UI or CLI, and tell it to execute the code. Massive token/cost savings.

It works with any AI UI or CLI tool (Aider, Cline, etc.) because it's just Markdown. I’d love for you to try it out or tear the architecture apart.

Repo here: https://github.com/vitotafuni/bemyagent

submitted by /u/vitotafuni
I built a free local MCP server that cut my Claude Code PR review prompt from 63K to 8.7K tokens
Every time I asked Claude Code something about my codebase ("how does the v2 pipeline work?", "what calls this function?", "is this PR safe?"), the agent walked the repo from scratch. Glob, Grep, Read, Read, 8–10 sequential tool calls per question. Same structure rediscovered every time, and the input-token bill kept growing.

So I built graphify-ts. It builds a local knowledge graph of your code at index time (tree-sitter AST + Louvain communities + BM25 + optional local ONNX rerank) and exposes it as an MCP stdio server. Instead of 8–10 tool calls, Claude Code makes one `retrieve` call and gets the relevant slice back. Fully local: your code never leaves the laptop.

Numbers I actually measured (verify.sh in the repo re-derives all of them from committed evidence):

Real production NestJS + Next.js codebase, 1,268 files, same Claude Opus 4.7 question both runs:
- Tool-call turns: 9 → 3
- Input tokens: 615,190 → 233,508 (2.6× fewer)
- Latency: 96 sec → 35 sec (2.8× faster)
- Both numbers from `claude --output-format json` usage field, not local estimates

Real 36-file production PR review:
- Prompt tokens: 63,024 → 8,690 (7.25× smaller)
- Same reviewer, same diff, same review depth; both runs flagged the same hotspots

Multi-repo question across 3 repos:
- Estimated naive prompt: ~1.5M tokens (literally couldn't fit in any window)
- With graphify-ts: 2,800 tokens
- Caveat up front: the 1.5M is a structural estimate, not a sent prompt. Calling that out so it's not buried.

Install:

    npm install -g @mohammednagy/graphify-ts
    cd your-project
    graphify-ts generate .
    graphify-ts claude install

Also works with Cursor, Copilot, Gemini CLI, Aider, OpenCode via ` install`.

Honest trade-offs:
- Cold-start sessions cost about 13% more than no-graph baseline because the MCP server adds ~5K of tool-schema overhead at session init. Multi-question sessions amortize this. The default `core` profile ships 6 tools to keep that overhead small; opt into the full 21-tool surface with `GRAPHIFY_TOOL_PROFILE=full`.
- Deep extraction is best on JS/TS with framework-aware passes for Express, NestJS, Next.js, Redux Toolkit, React Router. Python/Ruby/Go/Java/Rust use plain tree-sitter AST. C/Kotlin/C#/Scala/PHP/Swift/Zig use a generic structural extractor.
- It's a structural map for an agent, not a complete program-analysis database. Heavily meta-programmed routes fall back to the base AST.

GitHub: https://github.com/mohanagy/graphify-ts (MIT, Node 20+)

I'd genuinely like counterexamples: the cases where structural slicing breaks. If you've got a repo where this approach should fail, I want to know before someone bets a real review pipeline on it.

submitted by /u/CaptainProud4703
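The reduction factors quoted in this post can be re-derived from the before/after figures it gives. The short Python sketch below does only that arithmetic. It is not the repo's verify.sh, and the dictionary keys are labels chosen here for illustration.

```python
# Before/after pairs copied from the post.
measurements = {
    "tool-call turns": (9, 3),
    "input tokens": (615_190, 233_508),
    "latency (s)": (96, 35),
    "PR review prompt tokens": (63_024, 8_690),
}


def reduction_factor(before: float, after: float) -> float:
    """How many times smaller the 'after' value is than the 'before' value."""
    return before / after


ratios = {
    name: round(reduction_factor(before, after), 2)
    for name, (before, after) in measurements.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
```

The token ratios come out at about 2.63 and 7.25, matching the post. The latency ratio computed from the whole-second figures is about 2.74, slightly under the quoted 2.8×, which may simply reflect rounding in the reported seconds.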
I built a tool that cut my Claude Code token bill 89%. v3.4 just shipped, works in 8 IDEs.
Quick context: I have been hitting Claude Code Max 5x limits in under 2 hours on real work. The session counter goes from 21% to 100% on a single complex prompt. If you have been on the recent threads, you know exactly what I mean.

So I built engramx. It is an MCP server plus a SQLite knowledge graph that intercepts file reads at the agent boundary. When Claude is about to read a file engram has indexed, the hook returns a structural summary instead of the raw content. Same edit, same diff, far fewer tokens consumed in the round trip.

The benchmark is committed to the repo. On a real 87-file codebase, the aggregate reduction is 89.1%. Best-case file dropped from 18,820 tokens to 306. The bench script is bench/real-world.ts, you can run it on any project you own.

v3.4 shipped Friday and all the install paths are live now. The same engram works across 8 IDEs natively. Claude Code (hooks plus the official plugin in review), Cursor (MDC plus MCP plus a VS Code extension on OpenVSX), Cline, Continue.dev, Aider, Windsurf, Zed, OpenAI Codex CLI. One install, one graph, every tool benefits.

It is local-first. SQLite database lives at .engram/graph.db in your repo. Nothing leaves your machine. Apache 2.0. No account, no telemetry.

    npm install -g engramx
    cd ~/your-project
    engram setup

Cursor users can install the extension directly:

    code --install-extension nickcirv.engram-vscode

Heads up on what comes next. v4.0 "Mesh + Spine" lands May 25. Adds an opt-in federation layer so engram instances on different machines exchange mistakes and ADRs without sharing source. Phase 1 foundation already merged this week (ed25519 identity, 14-category PII gate, 1007 tests). Subscribe via the GitHub Discussions page if you want updates.

There is also an engram cost command that tracks how many tokens it has saved you, per project per week. After 24 hours of normal use the digest shows real numbers.

Repo and benchmark: github.com/NickCirv/engram

Happy to answer questions. If you have hit the new rate limits and want a second pair of hands on it, comment your stack and I will help.

submitted by /u/SearchFlashy9801
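The idea of answering a file read with a structural summary instead of raw content can be illustrated in a few lines. The Python sketch below uses the standard `ast` module on Python source only. It is not engram's implementation (which the post describes as an MCP server plus a SQLite graph); `structural_summary` and `read_for_agent` are hypothetical names introduced here.

```python
import ast


def structural_summary(source: str) -> str:
    """List top-level classes and functions with their line numbers."""
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            lines.append(f"def {node.name}({args})  # line {node.lineno}")
        elif isinstance(node, ast.ClassDef):
            methods = [
                n.name
                for n in node.body
                if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
            lines.append(
                f"class {node.name} [methods: {', '.join(methods)}]  # line {node.lineno}"
            )
    return "\n".join(lines)


def read_for_agent(path: str, source: str, indexed: set[str]) -> str:
    # Indexed files are answered with the summary; anything else is passed through.
    return structural_summary(source) if path in indexed else source
```

The summary is much shorter than the file for anything non-trivial, which is where a token saving would come from. What it discards is the function bodies, so an agent still needs the raw file when it has to edit them.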
I kept losing track of work, insights, and improvement ideas I deferred mid-task. Built a Claude Code skill to track, surface, and manage them across scattered project files.
Every project I work on accumulates deferred items in several places: a Deferred.md at the repo root, plan files in some "deferred" folder, audit-tool ledgers, code comments like TODO: come back to this, memory entries for AI assistants, and paused plan files in ~/.claude/plans/. Later, when I have time to address deferred items, I find some have gone stale. Some got fixed when other things got fixed. Some probably will sit forever because I didn't remember them.

I worked with Claude Code to find patterns that fixed this for my app (Stuffolio, a Universal Swift codebase shipping to iOS, iPadOS, and macOS), and developed the results into a standalone Claude Code skill: unforget, a single source of truth for deferred work. The full format (four sections, ten columns, color-coded ratings) is in the README and SKILL.md. Quick, but worth a read if you want to see the structure.

The skill is functional today via Claude Code's /skill invocation. Drop SKILL.md in your skill path, then run /skill unforget init (or /skill unforget add "...") in any session. Claude follows the seven-phase spec to do the work. Same pattern as other SKILL.md-based tools like /skill humanizer or /skill prompter. The seven-phase init flow has been validated against two real projects (one complex Universal app, one minimal third-party skill).

v0.2 will ship as a polished Claude Code plugin (.claude-plugin/ install) so you can invoke /unforget add without the /skill prefix. Functionality unchanged; ergonomics improved.

Beta testers willing to try the format on their own projects, especially:
- Minimal repos (small libraries, single-purpose tools). The format was designed against a complex codebase; I want to catch where it doesn't fit small projects.
- Non-Apple-platform projects (web, Android, backend services, libraries). The Target/release-cycle column is most natural for App Store submission cadences; want to validate it works for other deploy patterns too.
- Projects using non-CLAUDE.md AI instruction files (Warp's WARP.md, Cursor's .cursorrules, Aider's .aider.conf.yml). Early testing already revealed the wiring step shouldn't hardcode filenames; want more variety in what the format encounters.
- Continuous-deployment workflows. The spec has a "Continuous" preset (Window column instead of Target with NOW / THIS WEEK / THIS MONTH / SOMEDAY values) but it's the least field-tested of the three presets.

If you try it and something in the skill breaks down, opening an issue describing the failure mode is the highest-value feedback right now. Real-project gaps shape what v0.2's runtime implementation actually does.

Repo: https://github.com/Terryc21/unforget

Apache 2.0 licensed. The README has the full caveats and a Companion Skills section linking to the other skills the same project family produced. Happy to answer questions in the comments.

Engagement plan after posting

If the post lands and gets traction, the highest-value comment threads to engage with:
- "Why not just use [GitHub Issues / Linear / Jira / etc.]?": those are for tracked work; this is for deferred work that doesn't deserve a ticket but shouldn't be lost. The Target column is the differentiator.
- "What's v0.2 going to add then?": v0.2 packages the skill as a Claude Code plugin so you can run /unforget add "..." without the /skill prefix. The functionality is the same; v0.2 is about install ergonomics. The seven-phase flow, the four sections, the 10-column table, the promotion ritual are all working today via /skill unforget.
- "How does this differ from [other Claude Code skills doing similar things]?": likely no direct competitor exists; the closest things are general task trackers (not deferred-specific) or per-project Deferred.md conventions (not standardized). The single-source-of-truth plus Target-column promotion ritual is genuinely novel as far as I've seen.
- "Is this just a todo system?": see "Why a Target column" section in the body. Most todo systems collapse Urgency and Release into Priority. This skill keeps them separate, which is the actual mental model for "we know it's bad but the calendar says next sprint."

submitted by /u/BullfrogRoyal7422
Lessons from building a coding agent for 8k context windows: token budgeting, parallel executors, and per-file isolation
Most AI coding tools (Cursor, Aider, Claude Code) assume you have a 200k-token model. If you're running local LLMs through Ollama or LM Studio, or hitting free-tier cloud APIs like Groq or OpenRouter, you've got around 8k tokens to work with. That doesn't fit a whole project, barely fits a single large file.

I spent the last few weeks building a CLI coding agent that's designed around the 8k constraint instead of fighting it. Wanted to share what I learned, because some of it surprised me.

The core insight: the LLM never needs to see your whole project. Most agents try to stuff as much context as possible into a single call. With 8k tokens that's a non-starter. The approach that worked for me is splitting the work into roles:
- A planner call that only sees a lightweight project map (Markdown summaries of each folder, ~300-500 tokens for the whole project) plus the user's request, and outputs a task list.
- Executor calls that each see exactly one file plus one task. Never two files in the same call.
- An orchestrator that's pure code, absolutely no LLM, building a dependency graph between tasks and deciding what runs in parallel vs sequential.

This split means the LLM only ever reasons about a small, bounded amount of code at any one time. The planner doesn't need to see code at all (just file summaries), and the executor only sees one file. Multi-file refactors stop being a context-window problem and become a scheduling problem.

Token budgeting has to be enforced in code, not promised in a prompt. Every LLM call goes through a canFit() check that measures: system prompt + reserved output tokens + memory + actual code. If the code doesn't fit, the agent automatically falls back to a per-file line index (generated once for files over ~150 lines) and pulls only the relevant section.

Concrete budget math for 8192 tokens:
- System prompt + instructions: ~1000
- Reserved for response: ~2000
- Short-term memory (4 entries): ~360
- Available for actual code: ~4800 (about 140-190 lines)

Parallel execution is the speed multiplier that makes 8k usable. Because each executor sees only one file, independent edits across files can run simultaneously. A 5-file refactor that would be slow if run sequentially completes in roughly the time of the longest single edit. The dependency graph (built in pure code from the planner's task list) decides which tasks have to wait for which.

A few things that tripped me up along the way:
- Question-style requests overwriting files. The first version had no concept of read-only operations, so asking "how many lines does X have?" caused the executor to write the answer into the file. Fixed by adding an action_type: "query" field to the planner's output that routes through a separate code path that never touches disk.
- Stale project maps causing silent misroutes. If the user named a file in their request that wasn't in the context map (because they just renamed it, or hadn't refreshed), the planner would silently route the action to the closest match. Now the orchestrator validates that mentioned file paths actually exist on disk and throws a clear error if they don't.
- Markdown fences in executor output. Even when explicitly told not to, smaller models love wrapping code in triple backticks. Strip them in post-processing rather than fighting the prompt.
- Memory token cost. Initially didn't budget for it; persistent memory is great but it's another ~80-90 tokens per entry that has to come out of the code budget. Now folder context is dropped first when the budget is tight, then memory, before the actual code gets cut.

What I'm still figuring out: Whether the planner/executor split scales cleanly to codebases over 50 files. The dependency graph stays manageable, but the project map starts costing real tokens once you have enough folders. Currently dropping folder context first when budget is tight, but that means deeper edits get less context. Curious if anyone else has run into this and how they handle it.

Open-sourced the implementation if anyone wants to dig in: https://github.com/razvanneculai/litecode

submitted by /u/BestSeaworthiness283
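The budget check and drop order described in this post can be written down in a few lines. The Python sketch below is a rough illustration under the post's quoted numbers (8192-token window, ~1000 for the system prompt, ~2000 reserved for output). It is not litecode's code: `can_fit` mirrors the post's canFit() in name only, and `plan_context` is a hypothetical helper.

```python
CONTEXT_WINDOW = 8192
SYSTEM_PROMPT = 1000     # system prompt + instructions (approximate, from the post)
RESERVED_OUTPUT = 2000   # tokens held back for the model's response


def can_fit(code_tokens: int, memory_tokens: int = 0, folder_tokens: int = 0) -> bool:
    """True if every part of the prompt fits inside the context window."""
    used = SYSTEM_PROMPT + RESERVED_OUTPUT + memory_tokens + folder_tokens + code_tokens
    return used <= CONTEXT_WINDOW


def plan_context(code_tokens: int, memory_tokens: int, folder_tokens: int) -> dict:
    """Drop folder context first, then memory; fall back to a partial file last."""
    if can_fit(code_tokens, memory_tokens, folder_tokens):
        return {"folder": True, "memory": True, "full_file": True}
    if can_fit(code_tokens, memory_tokens):
        return {"folder": False, "memory": True, "full_file": True}
    if can_fit(code_tokens):
        return {"folder": False, "memory": False, "full_file": True}
    # Even the bare file is too large: the caller should pull only a section.
    return {"folder": False, "memory": False, "full_file": False}


# With 360 tokens of memory, the space left for code is 8192 - 1000 - 2000 - 360.
available_for_code = CONTEXT_WINDOW - SYSTEM_PROMPT - RESERVED_OUTPUT - 360
```

`available_for_code` evaluates to 4832, consistent with the post's "~4800". Enforcing the check in code means a file that is one token over budget is handled deterministically, with no reliance on the model respecting a prompt instruction.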
I added voting to my AI tools library, now the ratings are community-driven, not just mine
a few weeks ago I posted about building a library that tracks 120+ AI coding tools by how long their free tier actually lasts. the response was good but the most common feedback was "your scores are subjective." fair point.

so I rebuilt the rating system. you can now sign in with Google and vote on any tool directly. the scores update in real time based on actual user votes, not just my personal assessment. if you think I rated something wrong, you can now do something about it instead of just commenting.

also shipped dark mode because apparently I was the only person who thought the default looked fine.

what Tolop actually is if you're new: every AI tool claims to be free. most aren't, or at least not for long. Tolop tracks the real limits: how many completions, how many requests, how long until you hit the wall under light use vs heavy use vs agentic sessions. it also flags the tools where "free" means you're still paying Anthropic or OpenAI through your own API key.

120+ tools across coding assistants, browser builders, CLI agents, frameworks, self-hosted tools, local models, and a new niche tools category for single-purpose utilities that don't fit anywhere else.

a few things the data shows that I found genuinely interesting:
- Gemini Code Assist offers 180,000 free completions per month. GitHub Copilot Free offers 2,000. same category, 90x difference
- several of the most popular tools (Cline, Aider, Continue) are free to install but require paid API keys, so "free" is misleading
- self-hosted tools have by far the most generous free tiers because the cost is on your hardware, not a server

would genuinely appreciate votes on tools you've actually used, the more real usage data behind the scores, the more useful the ratings get for everyone.

tolop.space: no account needed to browse, Google login to vote.

submitted by /u/DAK12_YT
Got into Anthropic's Opus 4.7 hackathon - pushing Verified Skill (security + evals + package manager for AI agent skills, 49 platforms) this week
Approved at 1:39 AM this morning. 500 builders, $100K pool, virtual, judges from the Claude Code team. Apr 21-28.

The product (already shipping, this week I push harder)

Verified Skill is what every AI agent ecosystem is missing: security + quality + distribution for AI skills.
- Security: skills execute code, touch your tools, read your files. 52 known attack patterns. We scan and grade every skill into 3 tiers (Scanned / Verified / Certified) before install.
- Quality: Skill Studio (npx vskill studio) is a 100% local eval framework. Plain-English test cases. A/B vs baseline. Multi-model (Claude, GPT, Gemini, Llama, Ollama). Nothing similar exists for AI skills today.
- Distribution: vskill CLI. Universal package manager. Works across 49 agent platforms (Claude Code, Cursor, Copilot, Windsurf, Codex, Gemini CLI, Cline, Aider, and more).

The bet

Every agent platform runs SKILL.md now. The question isn't "which format wins": it has. The question is who builds the infrastructure around it.

This week with Opus 4.7
- Agent-aware generation: one skill source → tailored outputs per agent
- Smarter routing based on target-agent capabilities
- Tighter eval loops
- Daily ships

Stack: Node.js ESM CLI, Cloudflare Workers + D1 + Prisma, Next.js 15 dashboard. Orchestrated through SpecWeave, my spec-driven dev framework (open source): https://spec-weave.com

Links
- Verified Skill: https://verified-skill.com
- SpecWeave: https://spec-weave.com

Swap notes

Anyone else in the cohort? Anyone shipping developer tooling who wants to compare notes this week?

submitted by /u/OwenAnton84
engram v1.0 - my Claude Code sessions now use 88% fewer tokens (proven, not estimated)
I got tired of watching Claude re-read the same files over and over in a single session. Not occasionally, constantly. Every agent task would burn thousands of tokens just re-loading context it already had.

So I built engram. It intercepts every Read call before it hits the file system, and serves a structured context packet instead: file summary, call graph, git history, past mistakes I've logged, dependency edges. The agent gets more useful signal in ~600 tokens than it would from reading the cold file.

The numbers (10 tasks, run it yourself with npm run bench):

| Task | Before | After | Saved |
|---|---|---|---|
| Bug fix | 18,400 | 1,980 | 89.2% |
| New endpoint | 22,100 | 2,640 | 88.1% |
| Refactor | 15,800 | 2,010 | 87.3% |
| PR review | 31,200 | 3,890 | 87.5% |
| Total aggregate | | | 88.1% |

Install in 3 commands:

    npm install -g engramx
    engram init
    engram install-hook

A few things I found genuinely useful after daily use:
- Survives context compaction: PreCompact hook re-injects the context spine before Claude compacts, so you don't lose your map mid-session
- Auto-switches projects: CwdChanged hook detects when you move between repos and re-wires the graph automatically
- Mistake memory: log past errors with engram learn "bug: X happened because Y", and they surface with a warning the next time you're near that code

v1.0 also ships with 5 IDE integrations (Claude Code, Continue.dev, Cursor, Zed, Aider) and an HTTP API if you want to build on top of it. Zero cloud, zero API keys, local SQLite.

GitHub: https://github.com/NickCirv/engram

What's your token spend per session on a typical coding task? Curious what everyone's baseline loo

submitted by /u/SearchFlashy9801
I stopped hoarding notes in Obsidian and started saving only conclusions. My AI actually remembers now.
I use Claude Code across ~10 projects. Obsidian vault alongside. The usual setup.

The problem wasn't that I didn't have enough notes; I had too many. Research dumps, session logs, raw context. And somehow Claude and I would still reach the exact same conclusion we'd already made two weeks ago. Different words, same answer.

The AI wasn't forgetting. I was saving the wrong things: 80% of it was context the AI could gather from the code or was smart enough to know already. Claude Code is smart enough to re-derive most context from your codebase. What it can't figure out on its own is why you chose Supabase over Firebase, or that you built your own MCP instead of using the community one because you only needed a few key parts and not the whole install, etc.

So I stopped saving notes and started saving only conclusions: [D] decisions, [I] insights, [E] errors, and [S] seeds (ideas that activate when a condition is met), or DIES for short. Each one with a specific trigger that makes it invalid, so the system knows when to question itself.

That one shift turned into Memento OS, a plugin that gives your AI sessions persistent, validated memory.

What it looks like in practice:

Session 1: you make a decision → captured as [D] Use Supabase over Firebase - invalidates if Firebase adds RLS

Session 5:
Session Briefing - My Project Memory: 6.8/10 | 14 artifacts | 2 seeds | streak: 5
Active Decisions:
[D] OAuth via Supabase Auth - invalidates if rate limits hit [critical]
[D] Mobile-first, no desktop v1 - invalidates if desktop demand >30%
Seeds Ready:
[S] Consider Redis caching - activates when: API p95 > 200ms ← CONDITION MET

No re-explaining. No "could you remind me what we discussed?" Just conclusions that survive across sessions, with built-in expiration dates.

Works with Claude Code (full plugin), Codex, Cursor, Windsurf, Cline, Gemini, Aider, and Continue. Same vault, different integration depth.

Named after the Nolan film, because unlike the guy in Memento, your AI actually remembers correctly, and knows when to forget. The grill-me skill (stress-tests your plan before you commit) is based on Matt Pocock's grill-me prompt; credit where it's due.

Would love feedback. This started as a personal system and I'm curious if the "conclusions not notes" framing clicks for anyone else.

https://github.com/Aiyo28/memento-os

submitted by /u/aiyo28
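The "conclusion plus trigger" idea in this post maps naturally onto a small data structure. The Python sketch below is one possible shape for it, assuming triggers can be expressed as predicates over a metrics dictionary. It is not Memento OS's implementation; `Artifact`, `briefing`, and the metric names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Artifact:
    kind: str                      # "D", "I", "E", or "S"
    text: str                      # the conclusion itself
    trigger: str                   # human-readable condition
    fired: Callable[[dict], bool]  # predicate over current project metrics


def briefing(artifacts: list[Artifact], metrics: dict) -> list[str]:
    """Render one line per artifact, flagging any whose trigger has fired."""
    lines = []
    for a in artifacts:
        hit = a.fired(metrics)
        if a.kind == "S":
            # Seeds stay dormant until their condition is met.
            label = "activates when"
            status = "CONDITION MET" if hit else "dormant"
        else:
            # Other artifacts stay valid until their trigger invalidates them.
            label = "invalidates if"
            status = "NEEDS REVIEW" if hit else "active"
        lines.append(f"[{a.kind}] {a.text} ({label}: {a.trigger}) [{status}]")
    return lines


artifacts = [
    Artifact("D", "Mobile-first, no desktop v1", "desktop demand >30%",
             lambda m: m["desktop_demand"] > 0.30),
    Artifact("S", "Consider Redis caching", "API p95 > 200ms",
             lambda m: m["api_p95_ms"] > 200),
]
report = briefing(artifacts, {"desktop_demand": 0.12, "api_p95_ms": 240})
```

With these sample metrics the decision stays active and the seed is reported as ready. Storing the trigger next to the conclusion is what lets a later session question a stale decision without re-deriving it.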
Here is what most people get wrong about saving tokens with AST tools
I spent the last day benchmarking codebase context tools against a real AI agent. Not synthetic token counts. Actual multi-turn agentic conversations on a real codebase. The results were not what I expected.

Most tools in this space (codebones, codesight, repomix, aider's repo map) show impressive numbers on their READMEs. 8x, 22x, even 90x token savings compared to raw source. Those numbers are real, but they compare the wrong things. They measure "structural skeleton vs reading every file." No real agent reads every file. It greps, reads specific functions, follows imports. The baseline is already efficient.

I ran two Claude Sonnet agents on the same tasks on FastAPI (107K LOC). One had grep, cat, find, ls. The other had the same plus a structural indexer: symbol search, targeted get, dependency graph, file outlines. Three tasks. The indexer lost one of the three.

Task 1 - Implement CORS middleware:
Standard agent: 58K tokens, 25 calls, 13 turns
With indexer: 37K tokens, 19 calls, 9 turns
Result: 1.6x fewer tokens, 31% fewer turns

Task 2 - Check refactoring impact on routing.py:
Standard agent: 163K tokens, 41 calls, 20 turns
With indexer: 31K tokens, 14 calls, 6 turns
Result: 5.2x fewer tokens (one graph call replaced 41 grep/ls calls), 70% fewer turns

Task 3 - Trace async generator bug:
Standard agent: 110K tokens, 28 calls, 20 turns
With indexer: 196K tokens, 28 calls, 19 turns
Result: the indexer lost. It used ~80% more tokens for the same task, with nearly the same number of turns.

Three things I took away.

Conversation history is the real cost, not individual tool calls. Every tool result stays in history and gets re-sent every subsequent turn. A tool returning 200 lines per call accumulates context 40x faster than one returning 5 lines. Synthetic token counts are misleading because they measure one call in isolation. Real cost is multiplicative.

Dependency graphs are the one feature that genuinely saves tokens. Grep cannot give you "what breaks if I change this file" without manually tracing imports. A structural indexer does it in one call.

Agents don't follow usage guidelines. This surprised me the most. The tools work fine. The problem is the agent picks whatever gives the most information per call. Locally optimal, globally expensive.

I looked at how other tools solve this. Some intercept the prompt before it reaches the agent and pre-compute context. Others use PageRank on the dependency graph to rank files by relevance. Both bypass the agent's tool selection entirely. Basically they don't trust the agent to choose well either.

If you're evaluating codebase context tools for AI agents, run your benchmarks with a real agent doing real tasks. The numbers will be more modest and more honest. I published all conversation logs with full tool calls and token counts. Happy to share.

submitted by /u/creynir
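The first takeaway (every tool result is re-sent on each later turn) can be made concrete with a toy model. This is a simplified sketch and not the benchmark's own accounting: it assumes one tool call per turn, a fixed result size, roughly 10 tokens per line, and it ignores model output tokens and any prompt caching. The function name `total_input_tokens` is made up for the example.

```python
def total_input_tokens(turns: int, tokens_per_result: int, base_prompt: int = 0) -> int:
    """Sum the input tokens sent over a conversation in which each turn's
    tool result stays in history and is re-sent on every later turn."""
    history = 0
    total = 0
    for _ in range(turns):
        total += base_prompt + history  # what gets sent on this turn
        history += tokens_per_result    # this turn's result joins the history
    return total


TOKENS_PER_LINE = 10  # rough assumption, not a measured value

verbose = total_input_tokens(turns=20, tokens_per_result=200 * TOKENS_PER_LINE)
terse = total_input_tokens(turns=20, tokens_per_result=5 * TOKENS_PER_LINE)

print(verbose)           # 380000
print(terse)             # 9500
print(verbose // terse)  # 40
```

Under these assumptions the 40x ratio between a 200-line and a 5-line result holds at any conversation length, but the absolute cost grows with the square of the turn count: in this model, going from 10 to 20 turns raises the re-sent total from 45 to 190 result-sizes.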
Repository Audit Available
Deep analysis of Aider-AI/aider — architecture, costs, security, dependencies & more
Aider itself is open source and free to use; the running cost comes from the LLM API it is connected to. Visit their website for current details.
Key features include: Cloud and local LLMs, Maps your codebase, 100+ code languages, Git integration, In your IDE, Images & web pages, Voice-to-code, Linting & testing.
Aider is commonly used for: Pair programming with LLMs for new project initiation, Enhancing existing codebases with AI suggestions, Automating code linting and testing processes, Generating documentation for codebases, Translating voice commands into code snippets, Integrating AI-generated images and web pages into projects.
Aider integrates with: GitHub, GitLab, Bitbucket, Jira, Slack, Trello, Visual Studio Code, JetBrains IDEs, Eclipse, Notion.
Aider has a public GitHub repository with 42,600 stars.
Marc Raibert
Founder at Boston Dynamics
1 mention
Based on user reviews and social mentions, the most common pain points are: token cost, token usage, API costs, large language model.
Based on 35 social mentions analyzed, 23% of sentiment is positive, 74% neutral, and 3% negative.