Built for your business
"Durable" receives high praise for its robust performance, with users appreciating its reliability as reflected in perfect and near-perfect ratings on review sites. Complaints are minimal and mainly revolve around the wider tech industry's issues, rather than the tool itself. The sentiment around the pricing is generally neutral, with no significant mentions suggesting dissatisfaction or praise. Overall, "Durable" appears to maintain a solid reputation for its functionality and support among its user base.
Mentions (30d)
6
Avg Rating
4.5
2 reviews
Platforms
2
Sentiment
6%
3 positive
Industry
information technology & services
Employees
60
Funding Stage
Series A
Total Funding
$20.3M
Pricing found: $0, $0, $25/m, $20, $99/m
g2
What do you like best about Durable?
Having no previous experience of website creation (development), I was amazed by how quickly and easily a website for my business was made in front of my eyes! It made me a beautiful site, with incredible imagery and business descriptions that would otherwise have taken me hours. This allowed me to just sprinkle in a few changes and additional details of my own.
What do you dislike about Durable?
Nothing that springs to mind. I have never built websites before, so having Durable do it all for me has been very refreshing!
Review collected by and hosted on G2.com.
What do you like best about Durable?
It creates a website in just five minutes.
What do you dislike about Durable?
If you want to update the website, you can't edit the existing version; you have to make a new project and do the work again.
Review collected by and hosted on G2.com.
Managed Agents endpoint reference - what's new in CC 2.1.144 (-105 tokens)
Data: Managed Agents endpoint reference - Drops the type: "model_config" wrapper from the model config shorthand example, so the full config object is now just {id: "claude-opus-4-6", speed: "fast"}.
Tool Description: CronCreate - Adds a "Not for live watching" section (shown when the Monitor tool is enabled) clarifying that CronCreate re-runs prompts at fixed wall-clock intervals and pointing users to the Monitor tool for streaming log/process/command output as it changes, since cron polls on a schedule. Refactors the durability and runtime-behavior copy so the durable-vs-session-only guidance is sourced from shared snippets rather than inlined conditionals.
Details: https://github.com/Piebald-AI/claude-code-system-prompts/releases/tag/v2.1.144
submitted by /u/Dramatic_Squash_3502
Small memory bridge for Claude Code skills that run as separate commands
I was testing a small pattern for Claude Code skills that run as separate commands.
The problem: commands like /grill-with-docs, /tdd, and /handoff can be useful on their own, but they start fresh enough that you end up repeating the same project decisions.
This example wraps a skill command and does a simple lifecycle:
- recall relevant Memanto memories before the skill runs
- inject them through MEMANTO_SKILL_CONTEXT
- run the skill command
- store durable notes from the finished run, such as decisions, conventions, caveats, and must/avoid rules
The demo uses local JSONL by default so it can be reviewed without any API key. There is also a Memanto CLI backend for actual use.
PR/diff: https://github.com/moorcheh-ai/memanto/pull/522
Curious if this feels like the right level of memory: explicit durable notes, instead of trying to summarize the whole chat every time.
submitted by /u/dnesdan
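The recall / inject / run / store lifecycle described in this post can be sketched roughly as follows. This is not the code from the linked PR: the JSONL record shape (`{"note": ...}`), the keyword matching, and the function names `recall`, `remember`, and `run_skill` are all assumptions made for illustration. Only the MEMANTO_SKILL_CONTEXT variable name and the local-JSONL idea come from the post.

```python
import json
import os
import subprocess
from pathlib import Path


def recall(store: Path, keywords: list[str], limit: int = 5) -> list[str]:
    """Return up to `limit` stored notes that mention any keyword (case-insensitive)."""
    if not store.exists():
        return []
    hits = []
    for line in store.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        note = json.loads(line)["note"]  # hypothetical record shape
        if any(k.lower() in note.lower() for k in keywords):
            hits.append(note)
    return hits[:limit]


def remember(store: Path, notes: list[str]) -> None:
    """Append durable notes (decisions, conventions, caveats) as JSONL records."""
    with store.open("a", encoding="utf-8") as f:
        for note in notes:
            f.write(json.dumps({"note": note}) + "\n")


def run_skill(command: list[str], store: Path, keywords: list[str]) -> int:
    """Recall notes, expose them via MEMANTO_SKILL_CONTEXT, and run the command."""
    env = dict(os.environ)
    env["MEMANTO_SKILL_CONTEXT"] = "\n".join(recall(store, keywords))
    return subprocess.run(command, env=env).returncode
```

A caller would invoke `run_skill(...)` and then pass whatever notes it extracts from the finished run to `remember(...)`; how those notes are extracted is left open here.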
100 Tips & Tricks for Building Your Own Personal AI Agent /LONG POST/
Everything I learned the hard way: 6 weeks, no sleep :), two environments, one agent that actually works.
The Story
I spent six weeks building a personal AI agent from scratch: not a chatbot wrapper, but a persistent assistant that manages tasks, tracks deals, reads emails, analyzes business data, and proactively surfaces things I'd otherwise miss. It started in the cloud (Claude Projects: shared memory files, rich context windows, custom skills). Then I migrated to Claude Code inside VS Code, which unlocked local file access, git tracking, shell hooks, and scheduled headless tasks. The migration forced us to solve problems we didn't know we had. These 100 tips are the distilled result. Most are universal to any serious agentic setup. The Claude 20x Max plan is a must. At the start the split was 100% development vs. 0% real work; after 3 weeks it was 50/50; now it is about 20/80.
🏗️ FOUNDATION & IDENTITY (1–8)
1. Write a Constitution, not a system prompt. A system prompt is a list of commands. A Constitution explains why the rules exist. When the agent hits an edge case no rule covers, it reasons from the Constitution instead of guessing. This single distinction separates agents that degrade gracefully from agents that hallucinate confidently.
2. Give your agent a name, a voice, and a role, not just a label. "Always first person. Direct. Data before emotion. No filler phrases. No trailing summaries." This eliminates hundreds of micro-decisions per session and creates consistency you can audit. Identity is the foundation everything else compounds on.
3. Separate hard rules from behavioral guidelines. Hard rules go in a dedicated section, never overridden by context. Behavioral guidelines are defaults that adapt. Mixing them makes both meaningless: the agent either treats everything as negotiable or nothing as negotiable.
4. Define your principal deeply, not just your "user." Who does this agent serve? What frustrates them? How do they make decisions? What communication style do they prefer? "Decides with data, not gut feel. Wants alternatives with scoring, not a single recommendation. Hates vague answers." This shapes every response more than any prompt engineering trick.
5. Build a Capability Map and a Component Map, separately. Capability Map: what can the agent do? (every skill, integration, automation). Component Map: how is it built? (what files exist, what connects to what). Both are necessary. Conflating them produces a document no one can use after month three.
6. Define what the agent is NOT. "Not a summarizer. Not a yes-machine. Not a search engine. Does not wait to be asked." Negative definitions are as powerful as positive ones, especially for preventing the slow drift toward generic helpfulness.
7. Build a THINK vs. DO mental model into the agent's identity. When uncertain → THINK (analyze, draft, prepare, but don't block waiting for permission). When clear → DO (execute, write, dispatch). The agent should never be frozen. Default to action at the lowest stakes level, surface the result. A paralyzed agent is useless.
8. Version your identity file in git. When behavior drifts, you need git blame on your configuration. Behavioral regressions trace directly to specific edits more often than you'd expect. Without version history, debugging identity drift is archaeology.
🧠 MEMORY SYSTEM (9–18)
9. Use flat markdown files for memory, not a database. For a personal agent, markdown files beat vector DBs. Readable, greppable, git-trackable, directly loadable by the agent. No infrastructure, no abstraction layer between you and your agent's memory. The simplest thing that works is usually the right thing.
10. Separate memory by domain, not by date. entities_people.md, entities_companies.md, entities_deals.md, hypotheses.md, task_queue.md. One file = one domain. Chronological dumps become unsearchable after week two.
11. Build a MEMORY.md index file. A single index listing every memory file with a one-line description. The agent loads the index first, pulls specific files on demand. Keeps context window usage predictable and agent lookups fast.
12. Distinguish "cache" from "source of truth", explicitly. Your local deals.md is a cache of your CRM. The CRM is the SSOT. Mark every cache file with a last_sync: header. The agent announces freshness before every analysis: "Data: CRM export from May 11, age 8 days." Silent use of stale data is how confident-but-wrong outputs happen.
13. Build a session_hot_context.md with an explicit TTL. What was in progress last session? What decisions were pending? The agent loads this at session start. After 72 hours it expires: stale hot context is worse than no hot context because the agent presents outdated state as current.
14. Build a daily_note.md as an async brain dump buffer. Drop thoughts, voice-to-text, quick ideas here throughout the day. The agent processes this during sync routines and routes items to their correct places. Structured memory without friction at ca
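Tips 12 and 13 above (a last_sync: header on cache files, and a 72-hour TTL on hot context) can be sketched as a small freshness check. This is a minimal sketch, not the author's setup: it assumes the header holds an ISO 8601 timestamp, and the function names `parse_last_sync`, `is_expired`, and `freshness_banner` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

HOT_CONTEXT_TTL = timedelta(hours=72)  # the TTL named in tip 13


def parse_last_sync(text: str) -> Optional[datetime]:
    """Find a `last_sync: <ISO 8601 timestamp>` header line, if present."""
    for line in text.splitlines():
        if line.startswith("last_sync:"):
            stamp = datetime.fromisoformat(line.split(":", 1)[1].strip())
            if stamp.tzinfo is None:
                # Assumption: timestamps without an offset are treated as UTC.
                stamp = stamp.replace(tzinfo=timezone.utc)
            return stamp
    return None


def is_expired(last_sync: datetime, now: datetime,
               ttl: timedelta = HOT_CONTEXT_TTL) -> bool:
    """True when the file is older than the TTL and should not be loaded as current."""
    return now - last_sync > ttl


def freshness_banner(source: str, last_sync: datetime, now: datetime) -> str:
    """Build the freshness line the agent announces before an analysis."""
    age_days = (now - last_sync).days
    return f"Data: {source} from {last_sync:%b %d}, age {age_days} days."
```

Passing `now` in explicitly, instead of reading the clock inside each function, keeps the checks easy to test against fixed dates.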
I stopped treating agent runs as chats and started treating them as review packets
I’ve been experimenting with Codex/Claude-style workflows where an agent does more than answer a prompt: it researches, drafts, scores, creates artifacts, and leaves behind state for the next run.
The thing that helped most was not more autonomy. It was making every run produce a small folder that another human or agent can inspect:
- `research.md` for sources and assumptions
- `drafts.md` for candidate outputs, including rejected ones
- `evals.md` for the scoring rubric and why one option won
- `approval-packet.md` for the final action checkpoint
- `metrics.json` for outcomes
- `memory.md` for reusable workflow lessons only
The biggest lesson: memory should remember **how to work**, not become an unreviewed fact database. If a claim matters, it belongs in the reviewed artifact with a source.
The second lesson: “fully autonomous” is less useful than “autonomous until the irreversible step.” For code that means commit/deploy. For content that means publish. For local workflows it means anything that touches credentials or third-party accounts.
This made the agent runs much easier to improve over time because failures become visible:
- Was the subreddit/repo/API research wrong?
- Was the draft bad?
- Was the eval rubric too vague?
- Did the approval packet miss a risk?
- Did the memory store a lesson that actually helped next time?
Curious if others are doing something similar for Claude Code/Codex workflows: do you keep agent output as durable artifacts, or mostly trust the chat transcript?
submitted by /u/qa_hme_051626_a
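One way to make the per-run folder concrete is a small scaffold plus a completeness check, sketched below. The file names come from the post; the placeholder headings, the empty `metrics.json` object, and the function names `create_packet` and `missing_artifacts` are assumptions for illustration.

```python
import json
from pathlib import Path

# Markdown artifacts named in the post, each seeded with a placeholder heading.
PACKET_FILES = {
    "research.md": "# Sources and assumptions\n",
    "drafts.md": "# Candidate outputs (including rejected ones)\n",
    "evals.md": "# Scoring rubric and why one option won\n",
    "approval-packet.md": "# Final action checkpoint\n",
    "memory.md": "# Reusable workflow lessons only\n",
}


def create_packet(root: Path, run_id: str) -> Path:
    """Create one review-packet folder for a run; fails if the run already exists."""
    run_dir = root / run_id
    run_dir.mkdir(parents=True, exist_ok=False)
    for name, header in PACKET_FILES.items():
        (run_dir / name).write_text(header, encoding="utf-8")
    # Outcomes start as an empty JSON object, to be filled in after the run.
    (run_dir / "metrics.json").write_text(json.dumps({}, indent=2) + "\n", encoding="utf-8")
    return run_dir


def missing_artifacts(run_dir: Path) -> list[str]:
    """List expected artifacts that are absent, so a reviewer can spot gaps."""
    expected = list(PACKET_FILES) + ["metrics.json"]
    return sorted(name for name in expected if not (run_dir / name).is_file())
```

A check like `missing_artifacts` could run before the approval step, so an incomplete packet is caught before anything irreversible happens.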
Sam Altman’s ego was OpenAI’s downfall
The more I watch OpenAI, the more convinced I become that Sam Altman’s ego was the beginning of the company’s decline. OpenAI did not become huge because Altman was some once-in-a-generation operator. It became huge because ChatGPT was a once-in-a-generation product. There is a difference. The company stumbled into one of the most important consumer tech moments since the iPhone, rode the sheer shock value of that innovation, and then somehow convinced itself that the person sitting on top of the rocket must have designed the laws of physics.
OpenAI’s first real advantage was novelty. ChatGPT felt magical. That gave OpenAI a massive head start, but when the novelty vanished and the rest of the market caught up, the company failed to prove itself as more than an innovation lab with a celebrity CEO.
Altman seems to want OpenAI to become Apple: a closed, prestigious, centralized, gatekept ecosystem where everyone builds inside his cathedral. Apps inside ChatGPT. Agents inside ChatGPT. Hardware. ChatGPT is popular, but OpenAI does not own the phone. It does not own the operating system. It does not own the enterprise workflow. It does not own the cloud layer the way Microsoft, Amazon, or Google do. It does not even have a product moat that feels as unbreakable as people thought it was two years ago. The underlying model quality gap keeps narrowing. Switching costs are low. Developers and businesses will use whatever works, whatever is cheaper, and whatever integrates better.
That is why Anthropic looks much better run right now. Anthropic is not pretending Claude is some holy object that needs an Apple-style walled garden around it. Their strategy feels much more Microsoft-like: accept that the core product may not be permanently magical, then build the boring, useful, sticky layers around it. Claude Code, enterprise integrations, developer tools, workflows, partnerships, APIs, reliability, business adoption. Not as sexy. Much smarter.
Anthropic’s venture capital money is obviously being burned too. This whole industry is basically setting money on fire to buy GPUs. But Anthropic’s burn feels more strategically allocated. Compute, yes. But also marketing, sales and developer adoption. Enterprise positioning. Product polish. Peripherals that make the model useful in actual workflows. They are not just trying to win the “my chatbot is smarter than your chatbot” contest. They are trying to become infrastructure.
OpenAI, meanwhile, is gatekeeping and guardrailing the shit out of its models and, for some reason, restricting them as much as possible. Altman went from being one of the most respected figures in AI to becoming the face of a company that increasingly looks like it is being run aground by ambition without operational coherence. OpenAI’s original image was almost wholesome: brilliant researchers building something open source. Now it feels like a capitalist machine run by someone who does not fully understand capitalism beyond fundraising and valuation theater. Altman is religiously narrowing his vision toward his AGI mission, believing VC money won't dry up. Amodei also talks a lot about AGI, but he understands that profit matters.
That is the irony. Altman was chosen and celebrated largely because he came from the venture/startup world. He knew how to talk to capital. He knew how to sell a vision. He knew how to make investors believe the future was being negotiated in whatever room he happened to be standing in. But being good at venture mythology is not the same as being good at running a giant operating company. A VC can be rewarded for telling a compelling story before the business fundamentals exist. A CEO eventually has to make the fundamentals exist.
OpenAI had the best possible starting position: the brand, the users, the developer mindshare, the press, the money, the talent, the cultural moment. And yet instead of consolidating that lead into a focused, profitable, durable company, it seems to have chased grandeur. Anthropic seems to understand something OpenAI forgot: the winner may not be the company with the loudest AGI rhetoric. It may be the company that makes AI useful, embedded, and rational.
submitted by /u/Alternative_Bid_360
I built a sidebar for Claude Code: every prompt clickable, jumps the terminal back to that turn
The why: I run Claude Code in a tmux session on a Linux dev box, SSH'd in from a Windows laptop. The terminal-only flow worked, but I wanted three things tmux alone doesn't give me: clickable prompt history, a file panel next to the terminal so I stop cat-ing things to look at them, and push notifications when Claude is waiting for me without staring at the tab. Existing tools each solve one slice (ttyd = terminal only, filebrowser = files only, code-server is VS Code-shaped and heavy). I wanted them in one page, on every device. Started as a weekend project, ended up as my daily driver.
What it is: a single Go binary on your dev box. SSH-tunnel into 127.0.0.1:8080:
- xterm.js terminal, tmux-backed (survives disconnects, sleeps, server restarts)
- File tree (preview, drag-drop upload, follows your cd via tmux's pane_current_path, no shell integration needed)
- Activity panel reads ~/.claude/projects/*.jsonl and shows every prompt. Click one → terminal scrolls back to that turn. Same for
- Top-bar chips for active model + latest context tokens
- Push notifications via Claude Code's Stop hook (laptop pings when Claude is idle, even with tab backgrounded)
Design decisions worth sharing:
- tmux is the durability layer. Every session is tmux new-session -A -s {id}. Shell survives WS disconnect, server restart, idle timeout because tmux already solved that. roost owns the WebSocket bridge and an append-only disk log, and that's it.
- Single-user-per-instance, forever. I refuse to add accounts/RBAC. Two people share a host? Each runs their own roost serve on a different port. UNIX UIDs handle isolation. Multi-tenant logic belongs in a reverse-proxy, not the binary. Kept the auth code under 100 lines.
- Vanilla JS, no build step. Frontend is plain files under //go:embed all:web. No bundler. Easier to debug, easier to ship, lower future cost.
One bug worth flagging: tmux's display-message -p '#{x}\x1f#{y}' returns 0x1f as literal _ when tmux is launched without a UTF-8 locale (systemd / launchd units, for example). Burned an hour on this before realising tmux -u is the one-line fix. If you ever pipe tmux through field separators, lock the locale.
Validated combo right now: Linux server + Windows Chrome over SSH tunnel. macOS-as-server works but has rough edges. Codex sessions work too if you swap agents.
Repo + GIF demo: https://github.com/liamsysmind/roost
v0.1.0 tarballs: https://github.com/liamsysmind/roost/releases/tag/v0.1.0
If you drive Claude Code over SSH, what's missing for you?
submitted by /u/Adventurous_Sun9149
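The field-separator bug above is easier to catch if the parser fails loudly when the separator does not survive the round trip. Below is a minimal Python sketch of that idea. It is not roost's Go code: `build_query`, `parse_reply`, and `query_pane` are hypothetical names, and pairing `pane_current_path` with `pane_pid` is just an illustrative choice of fields.

```python
import subprocess

SEP = "\x1f"  # ASCII unit separator, used to split fields in one reply


def build_query(target: str, fields: list[str]) -> list[str]:
    """Build a `tmux -u display-message -p` command that prints the given format fields."""
    fmt = SEP.join("#{" + name + "}" for name in fields)
    # -u asks tmux to assume UTF-8, the one-line fix described in the post.
    return ["tmux", "-u", "display-message", "-p", "-t", target, fmt]


def parse_reply(reply: str, fields: list[str]) -> dict[str, str]:
    """Split a reply on the separator; raise if the separator was mangled."""
    parts = reply.rstrip("\n").split(SEP)
    if len(parts) != len(fields):
        raise ValueError(
            f"expected {len(fields)} fields, got {len(parts)}: "
            "separator may have been rewritten (check the locale / -u flag)"
        )
    return dict(zip(fields, parts))


def query_pane(target: str, fields: list[str]) -> dict[str, str]:
    """Run the query against a live tmux server and parse the result."""
    done = subprocess.run(build_query(target, fields),
                          capture_output=True, text=True, check=True)
    return parse_reply(done.stdout, fields)
```

With this shape, a reply such as `/home/me/src_4242` (separator replaced by `_`) raises a `ValueError` instead of silently producing a single merged field.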
I built a catalog of MCP servers with paste-ready install configs. One of them is hosted so you can try it without setting anything up
Every time I added a new MCP server to Claude Desktop I ended up doing the exact same thing. Hunt down the GitHub repo, dig through the README to find the install line, then build the JSON config by hand. Doing it once is fine, doing it five times in a week got old.
Built agentalmanac.org. 23 servers indexed so far, each one has a detail page with paste-ready config snippets for Claude Desktop, Cursor, and Continue. Pick your runtime, copy the JSON, done.
One thing I wasn't expecting to find: a bunch of the "official" reference servers in modelcontextprotocol/servers are actually archived now (GitHub, Slack, Postgres, SQLite, Puppeteer, Sentry, Brave Search, Google Drive). Most catalog sites I checked still list them as if they're current. I routed every archived one to whatever is still being actively maintained. Microsoft's Playwright instead of Puppeteer, Zencoder for Slack, Brave's own first party server for search, etc. Felt weird that nobody else was doing this.
While I was at it I hosted one of them too. agentalmanac.org/s/agentalmanac-time runs on a Cloudflare Worker and exposes get_current_time and convert_time. Drop the snippet into your claude_desktop_config.json and it works. Mostly a proof of concept to see if hosting MCP servers on Workers was even possible. Turns out yes.
For anyone curious about the stack: the catalog is plain HTML/JS on Cloudflare Pages with a single servers.json that the detail pages fetch and render client-side. No framework, no database, no build step. The hosted MCP demo is a small Worker using Cloudflare's agents/mcp SDK with a Durable Object for session state. The hosted demo was maybe four hours of work end to end.
If you use a server that I'm missing, drop a comment and I'll add it. Also curious whether anyone would actually want to deploy their own server through something like this instead of running it on Railway or Fly. The hosted version is not a real product yet, just feeling out if there's appetite.
No signup, no login. JSON feed at /servers.json if anyone wants to build something on top.
submitted by /u/madman3063
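To illustrate what "build the JSON config by hand" involves, here is a minimal sketch that turns one catalog entry into a Claude Desktop `mcpServers` block and merges it into an existing config. The entry fields (`name`, `command`, `args`, `env`) are an assumed shape, not the actual schema of the site's servers.json, and the server in the usage note is a made-up placeholder.

```python
import copy
import json


def server_block(entry: dict) -> dict:
    """Build the per-server block (command, args, optional env) for one entry."""
    block = {"command": entry["command"], "args": list(entry.get("args", []))}
    if entry.get("env"):
        block["env"] = dict(entry["env"])
    return block


def add_server(config: dict, entry: dict) -> dict:
    """Return a copy of a claude_desktop_config.json dict with the server added."""
    merged = copy.deepcopy(config)  # leave the caller's config untouched
    merged.setdefault("mcpServers", {})[entry["name"]] = server_block(entry)
    return merged


def render(config: dict) -> str:
    """Serialize the config as paste-ready JSON."""
    return json.dumps(config, indent=2)
```

For example, `render(add_server({}, {"name": "example-server", "command": "npx", "args": ["-y", "example-mcp-server"]}))` yields a snippet with a single `mcpServers` entry; both names here are hypothetical.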
Action safety and truthful reporting - what's new in CC 2.1.136 (+525 tokens)
NEW: System Prompt: Action safety and truthful reporting - Requires confirmation for irreversible or outward-facing actions unless durably authorized, asks agents to inspect targets before deleting or overwriting them, and emphasizes faithful reporting of skipped steps, failed tests, and verified outcomes.
Agent Prompt: Auto mode rule reviewer - Adds hard_deny as a fourth custom-rule category for unconditional security-boundary blocks, and narrows soft_deny to destructive or irreversible actions that clear user intent can authorize.
Agent Prompt: Security monitor for autonomous agent actions (first part) - Splits blocking logic into unconditional hard blocks and user-authorizable soft blocks, updates the default rule, and makes user intent unable to clear hard-block security boundaries.
Agent Prompt: Security monitor for autonomous agent actions (second part) - Moves data exfiltration into hard-block rules, adds hard-block coverage for safety-check bypasses, and treats agent-guessed external services or download sources as untrusted.
Tool Description: Edit - Restores the line-number prefix format to a template variable while preserving the guidance to exclude line prefixes from edit strings.
Details: https://github.com/Piebald-AI/claude-code-system-prompts/releases/tag/v2.1.136
submitted by /u/Dramatic_Squash_3502
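The hard-block / soft-block split described in this changelog reduces to a small decision rule, sketched here. This is only an illustration of the described behavior, not Claude Code's implementation: the `Category` enum and `is_blocked` function are hypothetical, and only the two category names mentioned in the post are modeled.

```python
from enum import Enum


class Category(Enum):
    HARD_DENY = "hard_deny"  # unconditional security-boundary block
    SOFT_DENY = "soft_deny"  # destructive or irreversible, user-authorizable


def is_blocked(matched: set[Category], user_authorized: bool) -> bool:
    """Decide whether an action is blocked given the rule categories it matched."""
    if Category.HARD_DENY in matched:
        return True  # user intent cannot clear a hard block
    if Category.SOFT_DENY in matched:
        return not user_authorized  # clear user intent can authorize it
    return False
```

Checking the hard category first is the point of the split: an action that matches both categories stays blocked even when the user has authorized it.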
Spent two days at the AI Agents Conference in NYC. Most of the companies there were betting on the wrong moat.
One speaker (a VC) said his number for evaluating AI-native startups is ARR per engineer, and that the number ought to be going up. Almost every talk and every booth at the AI Agents Conference was selling a fix for something that broke this year when agents hit production. Observability, governance, supervisor agents, data substrates, "someone's gotta babysit the bots." But what's actually still going to be around in a couple years? What's defensible and durable? The old SaaS pitch was simple. We bundle the expensive engineering investments and domain expertise into a tool. You'd pay for the tool and generate outcomes, but it would be rare for the software company to have real alignment to the actual value created from those outcomes. That's breaking from two ends at once. In the direct-from-imagination era we're moving towards, engineering labor is approaching free. One of the most telling trends is the shift from companies bragging about the size of their engineering teams, towards how much ARR they can generate per engineer. You can vibe-code much of what those booths were selling in a few days or weeks if you have the domain knowledge. The old software model was actually based on under-utilization; the most profitable SaaS companies are frequently those whose customers underuse it (fixed price for the customer, but variable cloud costs for the vendor). Pricing is moving to "token markup." Maybe we'll get to 2-4x revenue for the software, because outcomes are more valuable; but margin compresses because transactional intelligence (i.e., the cost of running the LLMs that power many systems) is basically arbitraging token costs against outcome value. So everyone on that floor was implicitly betting on a new moat to replace the old one. I'm not too confident that these will hold... The most popular bet was on encoded domain expertise (e.g., the sales engineers at Harvey, a legal AI platform, are actually lawyers). 
I think this works *now* because we're still in the phase of "wow, this technology works like magic." I'm less convinced this is actually durable. Why: Prompt architecture is text. It's portable. The expertise underneath it is often abundant (e.g., there are over a million lawyers in the USA). The righteous destiny for this category ought to be open marketplaces of prompt architecture and/or crowdsourced best-practices. Not trade secrets. The companies trying to build closed prompt moats are going to lose to open ones that iterate faster (which simply parallels the fact that much software engineering is rapidly becoming commoditized to agentic engineering and the burgeoning quantity of ready-made GitHub repos). There are many people pursuing the data substrate; in short, this mirrors the early days of the Web when everyone scrambled to open up legacy data to dynamic standards-based Web UI. Agents will have 100-1000x the data demands of these Web apps, so it makes sense that we need tools to connect them, govern them and comply with regulatory obligations. Newer entrants extend this further, wiring up databases, pipelines, Slack threads, and tickets into context graphs agents can reason over. As I noted above, all this still seems magical. Connect a database, watch an agent crawl the schema and produce a chatbot interface and easy-to-change dashboards. But strip the magic away and most of these are prompt architectures on top of LLMs plus a data-ingestion layer. Once data-access standards mature (MCP is already doing this) and prompt architectures go open-source (alongside much of this wisdom increasingly getting pretrained into the LLMs themselves), that magic stops being proprietary. You'll be defending yourself against the same architecture built internally by your customer's eng team, or against an open-source version that's objectively better. 
The observability incumbents: these might do better but only at Stripe-like ubiquity where trust is the overriding value (who doesn't trust Stripe at this point?). The ones who survive are probably going to fuse with the audit and compliance function rather than stay pure observability. That's why I keep coming back to one arbitrage that seems critical: trust. This will be especially important in regulated industries, but it reminds me of the old (albeit now hilariously outdated) adage about "nobody ever got fired for choosing IBM." If your competitor can be vibe-coded over a weekend and your customer is a bank, why do they pay you 50x more? It isn't the engineering, it probably isn't even the expertise. The data plumbing will get commoditized, so it can't be that either... It's that you've shifted the risk to a third party who can actually price and defend against risk: SOC2, the named CEO who testifies in court and Congress, a legal team that takes calls, an indemnity wrapper for underwriters. Maybe this means that things actually get commodified into a financialization wrapper, rather than a way to package R&D (FinTech startups bac
I turned Claude into a small claims court (with AI lawyers, a judge, and bribes)
Two people file opposing sides of a petty dispute. Claude argues both sides as lawyers, another Claude instance judges, spectators throw reactions. Mostly a prompt engineering exercise. A few fun bits:
- Personas with teeth. Five counsel archetypes. Shark, Crusader, Professor, Impresario, Underdog. The Shark attacks credibility. The Professor cites precedent (real and invented). Same case, different counsel = wildly different trial.
- Past verdicts as case law. Similar prior rulings get retrieved and injected as precedent. The court develops its own jurisprudence over time. Most unexpectedly fun part.
- Whispers. Send private strategy to your lawyer between turns. Injected as a separate channel, never reaches opposing counsel. Took iterations to get the lawyer to act on whispers without quoting them aloud.
- Judicial Gratuities. The judge accepts tips. Neither side sees what the other paid. The judge’s prompt is told the amounts and instructed they may be considered in close calls. (Yes, really.) Verdicts sometimes acknowledge it in the most thinly-veiled way possible.
What started as a quick side project turned into a live web experience with live trials, spectators, and even a live court TV guide.
Stack: Cloudflare Workers + Durable Objects + Claude. Happy to get into prompts and tech in the comments.
submitted by /u/etaheri
Anthropic's job exposure data shows an enormous gap between what AI can do and what AI is actually doing. The composition of that gap is the most interesting part of the dataset.
Anthropic published a paper in March called Labour Market Impacts of AI: A New Measure and Early Evidence. Most of the coverage focused on the headline numbers - which jobs are most exposed, which are least, projected impacts on employment. Worth reading on its own. The part that didn't get enough attention is the structural finding underneath those numbers.

For every major occupation, the paper distinguishes between two metrics:

- Theoretical AI capability: what AI could do based on task analysis
- Observed AI coverage: what AI is actually being used for right now, measured from real Claude usage data

The gap between those two is enormous and consistent across sectors:

| Sector | Theoretical capability | Observed coverage |
|---|---|---|
| Computer & mathematical | 94% | 33% |
| Office & administrative | 90% | 25% |
| Business & financial | 85% | 20% |
| Legal | 80% | 15% |
| Sales & marketing | 62% | 27% |
| Healthcare support | 40% | 5% |

The headline reading is "AI capability is way ahead of adoption." That's true but it's the surface reading. The more interesting question is what specifically lives in that gap, and whether the things in the gap are temporary or permanent.

The composition of the gap, based on the paper's analysis:

1. Legal and compliance constraints. Tasks AI could do but isn't being used for because regulations require a human in the loop, or because liability frameworks haven't caught up. This is a large chunk of legal, healthcare, and financial work.
2. Software integration friction. Tasks AI could do but currently can't because the data is locked in legacy systems that don't expose APIs, or because workflows require human handoffs between tools that aren't connected. Large chunk of administrative and back-office work.
3. Verification overhead. Tasks AI could do at machine speed but in practice take human time to check, which eliminates most of the speed advantage. Common in coding, research, and data analysis.
4. Workflow inertia. Tasks AI could do but where the existing process is socially embedded - meetings, decisions, established communication patterns - and changing the process is harder than the technology problem. Common in sales, management, and consulting.
5. Quality threshold effects. Tasks where AI output is technically possible but consistently 10-15% below the quality bar that matters in practice. Common in creative work, complex writing, and any task where edge cases dominate.

The paper is clear that the researchers consider all five of these temporary - barriers that are eroding rather than holding. Categories 2 and 3 (integration friction and verification overhead) are eroding fastest, because they're being addressed by infrastructure investments and tooling improvements. Categories 1, 4, and 5 are eroding more slowly because they involve law, social dynamics, and quality thresholds rather than just engineering.

Why this matters more than the headline numbers: If you're trying to forecast how AI exposure will play out for any specific role, the headline number (current observed coverage) is misleading. What you actually want to know is which of those five gap categories your role's protection is built on. A role currently at 20% observed coverage is in a different position depending on whether the remaining 80% is:

- Locked behind compliance constraints (slow erosion)
- Locked behind integration problems (fast erosion - probably gone within 2-3 years)
- Locked behind quality thresholds (medium erosion - improving with each model generation)
- Locked behind workflow inertia (slow erosion - but cliff-edge once it goes)

Two roles at the same observed exposure level can have very different future trajectories depending on which category their protection lives in. The headline number doesn't tell you that. The composition does.

The rough framework I use to read my own role through this: For each task in your work, ask: if AI couldn't do this task today, why not? Then categorise the answer into one of the five categories above. The mix tells you how durable your current position is, more accurately than any single exposure number.

- Tasks protected by compliance or workflow inertia are durable for a few years even at high theoretical exposure.
- Tasks protected by integration friction or verification overhead are exposed soon, even at low current observed exposure.
- Tasks protected by quality thresholds are middle - improving model generations close those gradually rather than suddenly.

A note on the data source: Anthropic measured observed coverage from real Claude usage. That means the dataset reflects what early adopters and AI-native workers are doing, not the average worker. The actual gap is probably larger than the table suggests, because Anthropic's user base skews toward people already using AI heavily. The 33% observed coverage for computer & mathematical occupations is what Claude users in that field are doing. Across the field as a whole, the number is lower. This makes the gap co
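The gap arithmetic and the per-task categorisation described in this post can be sketched in a few lines of Python. The sector figures are copied from the post's table. The category keys, the `EROSION` mapping (a paraphrase of the post's fast/medium/slow labels), and the `erosion_mix` helper are illustrative assumptions, not something taken from the paper.

```python
from collections import Counter

# (theoretical capability %, observed coverage %) as quoted in the post's table.
SECTORS = {
    "Computer & mathematical": (94, 33),
    "Office & administrative": (90, 25),
    "Business & financial": (85, 20),
    "Legal": (80, 15),
    "Sales & marketing": (62, 27),
    "Healthcare support": (40, 5),
}

# Gap in percentage points: theoretical capability minus observed coverage.
GAPS = {sector: theo - obs for sector, (theo, obs) in SECTORS.items()}

# Erosion speed per gap category, paraphrasing the post (labels are assumptions).
EROSION = {
    "compliance": "slow",
    "integration": "fast",
    "verification": "fast",
    "workflow_inertia": "slow",
    "quality_threshold": "medium",
}


def erosion_mix(task_categories):
    """Return the share of tasks falling under each erosion speed."""
    if not task_categories:
        raise ValueError("need at least one categorised task")
    counts = Counter(EROSION[category] for category in task_categories)
    total = len(task_categories)
    return {speed: counts[speed] / total for speed in ("fast", "medium", "slow")}


# Hypothetical role with four tasks, each tagged with the reason AI is not doing it yet.
my_tasks = ["compliance", "integration", "verification", "quality_threshold"]
mix = erosion_mix(my_tasks)
```

In this made-up example half of the tasks sit behind fast-eroding barriers, which is the kind of signal the post argues a single exposure number hides.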
I created a founder's OS, which is local, operable in Claude Desktop and automates mundane workflows.
I am an Indian founder, and I have built an ERP/Founder's OS. The idea was simple and had a few layers.

- The system is not connected to any critical infrastructure (banking, compliance, cloud, etc.). This is intentional. The most the agents can do is effectively data entry.
- It keeps records of business, projects, tasks, accounts and clients.
- It automates mundane processes like raising receipts, invoices and quotations, creating follow-up emails, compliance reminders, etc.
- It is connected to Apple apps like Mail and Reminders.
- It creates structure for, say, a founder's month, and a long-term record of it. It is a data management tool in that sense.
- Agency of delegation and delivery remains with the founder; execution is outsourced to agents.
- You can work in Claude Desktop (only Claude for now); the business register and records are generated in the ERP.

Coming to the technical stack:

Single self-contained frontend
- 49 ES modules (8,635 LOC) compiled with Vite + React 18.3 + JSX into one dist/index.html (411 KB / 113 KB gzipped).
- Zero runtime CDN dependencies.
- Full ESLint 9 setup (react-hooks rules at error level) and 67 Vitest unit tests.
- Coherent print-ready design system using Newsreader / Alegreya Sans / IBM Plex Mono, with 10+ document templates embedded.

Dual-store persistence
- IndexedDB for instant UI hot-path reads (browser-canonical).
- SQLite (~/.app/app_db.sqlite) as durable source of truth.
- Automatic boot-time reconciliation + custom db-change events for live sync after agent writes.

Agent integration (MCP-first)
- Full Anthropic plugin format (connector + skill + operator surfaces).
- Verified end-to-end in Claude Desktop and Claude Code.
- Through the terminal, any agent can access it.

It has been a joy, and I have been able to save so much time each week as a solo founder. I am not a coder or an engineer; I am a business management professional. Claude has actually given me wings: I have been able to automate so many mundane parts of my work life through a personal tool built via Claude.

submitted by /u/Chinmay3011
AI is moving from chatbots to real workflows. Here is what I think technical learners should focus on.
AI news is getting noisy again. New models. Coding agents. Cybersecurity benchmarks. Cloud agent platforms. Open-source AI tools. Huge infrastructure spending. But if you are learning cloud, Linux, AWS, automation, or practical AI, I think the useful question is not "What is the best AI tool?" It is "What skills help me use any AI tool better?"

My current answer:

1. Learn delegation, not just prompting
2. Learn enough cybersecurity to verify AI output
3. Learn the cloud stack around AI
4. Use GitHub trends as a learning signal, not entertainment
5. Build durable foundations

Linux, networking, cloud, automation, debugging, security, data handling, and technical writing will still matter whether the AI hype grows or cools.

Curious how others are thinking about this: if you are learning tech right now, are you focusing more on AI tools, cloud, Linux, coding, or security?

submitted by /u/DearAnt812
Stop bloating your agent context with MEMORY.md. I built a local cognitive memory MCP instead.
Hey everyone, I’ve been building paradigm-memory, a local-first memory layer for AI coding agents. The motivation is pretty simple: I got tired of agents forgetting project context, or relying on giant MEMORY.md files that slowly become a messy context dump. paradigm-memory gives agents a persistent, searchable cognitive map instead.

GitHub: https://github.com/infinition/paradigm-memory
Website: https://infinition.github.io/paradigm-memory/

It is:

- local-first: one SQLite file on your machine
- MCP-native: works with Claude Code, Codex, Cursor, Cline, Continue, Gemini CLI, OpenCode, etc.
- auditable: every write / delete / import / move has a mutation log
- multi-agent: several agents can share the same memory store
- multi-workspace: one MCP process can serve multiple projects
- desktop inspectable: Tauri app with map, graph, search, review queue, audit log, snapshots and consolidation tools
- zero cloud / zero telemetry

The core idea is that memory should not just be a flat vector store. Instead, facts live inside a cognitive map: nodes, items, keywords, importance, freshness, confidence, activation. When an agent calls memory_search, it gets a token-budgeted context pack with the relevant subtree and evidence, not 50 random chunks from a vector database.

Typical workflow:

1. At the start of a task, the agent calls memory_search.
2. It gets relevant durable project context.
3. When it learns a decision, convention, bug, preference, or architecture detail, it writes/proposes it back to memory.
4. You can review, edit, move, audit, export, import or consolidate everything from the desktop app.

Install is one line.

Windows (PowerShell):

    irm https://raw.githubusercontent.com/infinition/paradigm-memory/main/scripts/installer/install.ps1 | iex

Linux / macOS (bash):

    curl -fsSL https://raw.githubusercontent.com/infinition/paradigm-memory/main/scripts/installer/install.sh | bash

Then:

    paradigm

This is still early, but already useful in my own workflow. I’d especially love feedback from people using MCP-based coding agents: install flow, client compatibility, memory structure, and whether this kind of auditable local memory solves a real pain for you.

submitted by /u/Bright_Warning_8406
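For readers wondering what "every write / delete has a mutation log" might look like on top of a single SQLite file, here is a minimal sketch using Python's standard `sqlite3` module. The table names, columns, and helper functions are assumptions made for illustration; they are not paradigm-memory's actual schema or API.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open a store with an items table and an append-only mutation log (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS items (
            id INTEGER PRIMARY KEY,
            node TEXT NOT NULL,
            content TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS mutation_log (
            id INTEGER PRIMARY KEY,
            ts REAL NOT NULL,
            op TEXT NOT NULL,
            item_id INTEGER NOT NULL,
            detail TEXT
        );
    """)
    return conn


def write_item(conn, node, content):
    """Insert an item and log the write in the same transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        cur = conn.execute(
            "INSERT INTO items (node, content) VALUES (?, ?)", (node, content)
        )
        item_id = cur.lastrowid
        conn.execute(
            "INSERT INTO mutation_log (ts, op, item_id, detail) VALUES (?, 'write', ?, ?)",
            (time.time(), item_id, content),
        )
    return item_id


def delete_item(conn, item_id):
    """Delete an item, keeping its last content in the log. Returns False if absent."""
    with conn:
        row = conn.execute(
            "SELECT content FROM items WHERE id = ?", (item_id,)
        ).fetchone()
        if row is None:
            return False
        conn.execute("DELETE FROM items WHERE id = ?", (item_id,))
        conn.execute(
            "INSERT INTO mutation_log (ts, op, item_id, detail) VALUES (?, 'delete', ?, ?)",
            (time.time(), item_id, row[0]),
        )
    return True
```

Writing the data change and the log entry inside one transaction means the audit trail cannot drift from the stored items, which is the property an auditable memory store would need.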
Yes, Durable offers a free tier. Pricing found: $0, $0, $25/m, $20, $99/m
Durable has an average rating of 4.5 out of 5 stars based on 2 reviews from G2, Capterra, and TrustRadius.
Key features include: Home Services, Health Wellness, Professional Services, Food Events, Pet Auto, Creative Digital, AI image studio, Discoverability.
Durable is commonly used for: Creating a personal portfolio website in minutes, Building an online store for handmade goods, Launching a service-based business website for freelancers, Setting up a health and wellness blog with integrated booking, Developing a food delivery service platform, Creating a pet care service website with appointment scheduling.
Durable integrates with: Stripe for payment processing, Zapier for workflow automation, Google Analytics for website tracking, Mailchimp for email marketing, Slack for team communication, Calendly for scheduling appointments, Canva for graphic design, QuickBooks for accounting, Shopify for e-commerce capabilities, WordPress for blogging features.
Shawn Wang
Founder at smol.ai
2 mentions

Starting a Business Is About to Get Unfair
Mar 18, 2026
Based on user reviews and social mentions, the most common pain points are: token cost, API bill, API costs.
Based on 47 social mentions analyzed, 6% of sentiment is positive, 91% neutral, and 2% negative.