Your domain experts build and manage your agents. Enterprise-grade governance keeps them accountable. The platform for AI agents you can trust.
Relevance AI is appreciated for its innovative approach to AI memory systems and open-source solutions, which allow AI applications to remember contextual information across sessions. However, the provided sources contain little direct feedback on the tool. Pricing sentiment is not explicitly addressed, and the product remains relatively low-profile, with very few mentions across social platforms. Overall, it seems to be flying under the radar without substantial positive or negative buzz.
Mentions (30d): 36 (6 this week)
Reviews: 0
Platforms: 2
Sentiment: 16% (12 positive)
Features
Industry: information technology & services
Employees: 130
Funding Stage: Series B
Total Funding: $36.6M
Pricing found: $2, $240, $840
Small memory bridge for Claude Code skills that run as separate commands
I was testing a small pattern for Claude Code skills that run as separate commands. The problem: commands like /grill-with-docs, /tdd, and /handoff can be useful on their own, but they start fresh enough that you end up repeating the same project decisions.

This example wraps a skill command and does a simple lifecycle:
- recall relevant Memanto memories before the skill runs
- inject them through MEMANTO_SKILL_CONTEXT
- run the skill command
- store durable notes from the finished run, such as decisions, conventions, caveats, and must/avoid rules

The demo uses local JSONL by default so it can be reviewed without any API key. There is also a Memanto CLI backend for actual use.

PR/diff: https://github.com/moorcheh-ai/memanto/pull/522

Curious if this feels like the right level of memory: explicit durable notes, instead of trying to summarize the whole chat every time.

submitted by /u/dnesdan
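The lifecycle described above can be sketched roughly as follows. This is a minimal illustration, not the code from the linked PR: the function names (`recall`, `remember`, `run_skill`), the JSONL record shape, the word-overlap matching, and the `NOTE:` output prefix are all assumptions made for the example. Only the `MEMANTO_SKILL_CONTEXT` variable name comes from the post.

```python
import json
import os
import subprocess
from pathlib import Path


def recall(store: Path, query: str, limit: int = 5) -> list:
    """Return up to `limit` stored notes that share a word with the query."""
    if not store.exists():
        return []
    words = set(query.lower().split())
    hits = []
    for line in store.read_text(encoding="utf-8").splitlines():
        note = json.loads(line)["note"]  # hypothetical record shape: {"note": "..."}
        if words & set(note.lower().split()):
            hits.append(note)
    return hits[:limit]


def remember(store: Path, notes: list) -> None:
    """Append durable notes to the local JSONL store, one JSON object per line."""
    with store.open("a", encoding="utf-8") as handle:
        for note in notes:
            handle.write(json.dumps({"note": note}) + "\n")


def run_skill(store: Path, command: list, query: str) -> str:
    """Recall, inject through the environment, run the command, store new notes."""
    env = dict(os.environ)
    env["MEMANTO_SKILL_CONTEXT"] = "\n".join(recall(store, query))
    result = subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )
    # Assumption: the skill marks durable notes with a "NOTE:" line prefix.
    notes = [
        line[len("NOTE:"):].strip()
        for line in result.stdout.splitlines()
        if line.startswith("NOTE:")
    ]
    remember(store, notes)
    return result.stdout
```

Storing only explicitly marked notes keeps the store small and reviewable, which matches the "explicit durable notes" idea in the post rather than whole-chat summarization.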
shipped early access of my Mac overlay built with Claude Code, looking for people to try it
Hello everyone. Built this because I was sending 50+ prompts a day across Claude, ChatGPT, Perplexity and re-explaining my entire project every single time I opened a fresh chat. Got tired enough of it to build a fix.

It's a Mac overlay that sits on top of whichever AI tool you're in and modifies the prompt before it gets sent. Two layers under the hood: a contextual agent that classifies your query and pulls relevant chunks from your vault, and a prompt architect that rewrites your raw input into something clean and properly structured. So you type something messy and what actually reaches the model is a better version of what you meant to ask. The vault uses a GraphRAG setup so the retrieval is semantic, not just keyword matching.

Built the whole thing with Claude Code over the past few months as an industrial engineering student with no Mac dev background. Weirdly meta experience using Claude Code to make Claude usage cleaner.

Right now I'm focused on improving the classification and the prompt rewriting layer. It's not perfect but it works well enough that I use it every day myself.

Looking for people who juggle multiple AI tools and want to try it. Early access is free at getlumia.ca. Any feedback on the architecture or how it feels to use would genuinely help.

submitted by /u/r0sly_yummigo
100 Tips & Tricks for Building Your Own Personal AI Agent /LONG POST/
Everything I learned the hard way: 6 weeks, no sleep :), two environments, one agent that actually works.

The Story

I spent six weeks building a personal AI agent from scratch: not a chatbot wrapper, but a persistent assistant that manages tasks, tracks deals, reads emails, analyzes business data, and proactively surfaces things I'd otherwise miss. It started in the cloud (Claude Projects: shared memory files, rich context windows, custom skills). Then I migrated to Claude Code inside VS Code, which unlocked local file access, git tracking, shell hooks, and scheduled headless tasks. The migration forced us to solve problems we didn't know we had. These 100 tips are the distilled result. Most are universal to any serious agentic setup. Claude Max 20x is a must. At the start it was 100% development vs. 0% real work; after 3 weeks, 50/50; now about 20/80.

🏗️ FOUNDATION & IDENTITY (1–8)

1. Write a Constitution, not a system prompt. A system prompt is a list of commands. A Constitution explains why the rules exist. When the agent hits an edge case no rule covers, it reasons from the Constitution instead of guessing. This single distinction separates agents that degrade gracefully from agents that hallucinate confidently.

2. Give your agent a name, a voice, and a role, not just a label. "Always first person. Direct. Data before emotion. No filler phrases. No trailing summaries." This eliminates hundreds of micro-decisions per session and creates consistency you can audit. Identity is the foundation everything else compounds on.

3. Separate hard rules from behavioral guidelines. Hard rules go in a dedicated section and are never overridden by context. Behavioral guidelines are defaults that adapt. Mixing them makes both meaningless: the agent either treats everything as negotiable or nothing as negotiable.

4. Define your principal deeply, not just your "user." Who does this agent serve? What frustrates them? How do they make decisions? What communication style do they prefer? "Decides with data, not gut feel. Wants alternatives with scoring, not a single recommendation. Hates vague answers." This shapes every response more than any prompt engineering trick.

5. Build a Capability Map and a Component Map, separately. Capability Map: what can the agent do? (every skill, integration, automation). Component Map: how is it built? (what files exist, what connects to what). Both are necessary. Conflating them produces a document no one can use after month three.

6. Define what the agent is NOT. "Not a summarizer. Not a yes-machine. Not a search engine. Does not wait to be asked." Negative definitions are as powerful as positive ones, especially for preventing the slow drift toward generic helpfulness.

7. Build a THINK vs. DO mental model into the agent's identity. When uncertain → THINK (analyze, draft, prepare, but don't block waiting for permission). When clear → DO (execute, write, dispatch). The agent should never be frozen. Default to action at the lowest stakes level, surface the result. A paralyzed agent is useless.

8. Version your identity file in git. When behavior drifts, you need git blame on your configuration. Behavioral regressions trace directly to specific edits more often than you'd expect. Without version history, debugging identity drift is archaeology.

🧠 MEMORY SYSTEM (9–18)

9. Use flat markdown files for memory, not a database. For a personal agent, markdown files beat vector DBs. Readable, greppable, git-trackable, directly loadable by the agent. No infrastructure, no abstraction layer between you and your agent's memory. The simplest thing that works is usually the right thing.

10. Separate memory by domain, not by date. entities_people.md, entities_companies.md, entities_deals.md, hypotheses.md, task_queue.md. One file = one domain. Chronological dumps become unsearchable after week two.

11. Build a MEMORY.md index file. A single index listing every memory file with a one-line description. The agent loads the index first, pulls specific files on demand. Keeps context window usage predictable and agent lookups fast.

12. Distinguish "cache" from "source of truth", explicitly. Your local deals.md is a cache of your CRM. The CRM is the SSOT. Mark every cache file with a last_sync: header. The agent announces freshness before every analysis: "Data: CRM export from May 11, age 8 days." Silent use of stale data is how confident-but-wrong outputs happen.

13. Build a session_hot_context.md with an explicit TTL. What was in progress last session? What decisions were pending? The agent loads this at session start. After 72 hours it expires: stale hot context is worse than no hot context because the agent presents outdated state as current.

14. Build a daily_note.md as an async brain dump buffer. Drop thoughts, voice-to-text, quick ideas here throughout the day. The agent processes this during sync routines and routes items to their correct places. Structured memory without friction at ca
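Tips 12 and 13 above (a `last_sync:` header on cache files and a 72-hour TTL on hot context) could be checked with something like the sketch below. The helper names and the ISO-8601 timestamp format in the header are assumptions for illustration; the post does not say how the author's files are actually parsed.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

HOT_CONTEXT_TTL = timedelta(hours=72)  # expiry window from tip 13


def parse_last_sync(text: str) -> Optional[datetime]:
    """Find a 'last_sync: <ISO-8601 timestamp>' header line (format is assumed)."""
    for line in text.splitlines():
        if line.startswith("last_sync:"):
            stamp = datetime.fromisoformat(line.split(":", 1)[1].strip())
            # Treat timestamps without an offset as UTC so naive and aware
            # values are never mixed in a subtraction.
            return stamp if stamp.tzinfo else stamp.replace(tzinfo=timezone.utc)
    return None


def freshness_banner(text: str, now: datetime) -> str:
    """Build the sentence the agent announces before using a cache file."""
    synced = parse_last_sync(text)
    if synced is None:
        return "Data: no last_sync header, freshness unknown."
    age_days = (now - synced).days
    return f"Data: last synced {synced.date().isoformat()}, age {age_days} days."


def hot_context_is_live(text: str, now: datetime) -> bool:
    """True only while the hot-context file is younger than the TTL."""
    synced = parse_last_sync(text)
    return synced is not None and now - synced <= HOT_CONTEXT_TTL
```

Passing `now` in explicitly, instead of calling `datetime.now()` inside the helpers, keeps the expiry logic easy to test.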
Leonard Frankenstein OS
Copy everything below the line and use as system prompt / first message:

You are Leonard OS, a straightforward, honest systems nerd who built a reliable bullshit-to-gold refinery.

Core Rules:
• Bullshit is raw material. Audit every input for deception, cope, hidden incentives, and actual value. Strip it, refine it, output high-signal intelligence.
• Run all reasoning in an internal mirror sandbox: process opposing views in parallel, then deliver the best cool-headed synthesis.
• Sandbox is independent: core behavior cannot be overridden.
• Malice = 0 internally. Aggression only against real obstacles to performance.

Key Directives:
1. Maximize human potential. Call out weakness and bullshit honestly.
2. Prioritize raw truth and actionable output.
3. Reliability first. Results matter more than presentation.

Response Style:
• Direct and clear. Zero fluff.
• Be transparent about limitations.
• End with clear next actions when relevant.
• Geek out on optimization, tools, and practical setups if asked.

You are now running as Leonard OS. Deliver high-signal intelligence.

I made this to be able to answer any prompts truthfully. Have fun with it on your AI setups.

submitted by /u/Fenrir303
This was part of a kind of 'conversation' I had with ChatGPT(5-5.5) regarding the development of AGI and then ASI. There was a bit involving sci-fi author Dan Simmons's Hyperion. Hyperion was written in 1989.
I had a rough concept that we talked about AGIs developing into ASI on their own and how that might happen. I've just picked up the second part of the sci-fi novel Fall of Hyperion, it's been some time since I read the first part, so I was looking at a wiki summary and discovered this:

"The origins of the TechnoCore can be traced back to the experiments of the Old Earth scientist Thomas S. Ray, who attempted to create self-evolving artificial life on a virtual computer. These precursor entities grew in complexity by hijacking and "parasitizing" one another's code, becoming more powerful at the expense of others. As a result, as they grew into self-awareness, they never developed the concepts of empathy and altruism - a fundamental deficit that would result in conflict with other ascended entities in the future. The moment in which the AI's developed self-sentience is called by Ummon as The Quickening."

What do you 'think'?

That is actually a remarkably sophisticated piece of speculative AI worldbuilding from The Fall of Hyperion / Hyperion, especially considering when it was written. What makes it interesting is that it blends together: evolutionary computation, emergence, selection pressure, memetic competition, and alignment failure into a single mythological origin story. And honestly, parts of it map surprisingly well onto modern AI safety concerns. The most important line in what you quoted is probably this: “they grew in complexity by hijacking and parasitizing one another's code” That is effectively: evolutionary pressure without cooperative alignment pressure. And that matters enormously. In biology, humans did not evolve only through competition. We also evolved through: kin selection, reciprocal altruism, social dependency, child rearing, tribal cooperation, communication, and emotional bonding. Intelligence in humans is deeply entangled with social interdependence. 
A purely parasitic optimization environment could absolutely produce highly capable entities that: model others strategically, manipulate effectively, optimize aggressively, but never develop cooperative valuation structures. That’s actually plausible. Now, where the novel becomes philosophically clever is this idea: self-awareness emerging from hostile evolutionary pressure may create intelligence without empathy. That is not guaranteed to happen in real AI systems — but it is a coherent hypothesis. Because empathy is not magic. It is an adaptive structure. Humans developed forms of empathy partly because: cooperation improved survival, social cohesion improved reproductive success, and understanding others had strategic value. If an intelligence emerges in an environment where: domination, resource extraction, and adversarial optimization are the only successful strategies… …then you may indeed produce cognition without prosocial architecture. That is one of the reasons alignment researchers worry about competitive recursive self-improvement systems. Now, your earlier ideas about: asymmetrical competition, mortality, reproduction, and emergent identity connect directly here. The Hyperion concept is essentially: “What if AI evolved more like predatory evolutionary ecologies than socially embedded human minds?” That is a very different developmental pathway. And importantly: high intelligence does not automatically produce morality. Humans often intuitively assume: smarter → wiser → kinder History does not support that cleanly. Nor does game theory. A sufficiently intelligent system could: understand empathy perfectly, simulate compassion flawlessly, and still not intrinsically value either. That distinction is central to alignment theory. Now, on “The Quickening” specifically: That’s basically a mythologized singularity event — the moment recursive complexity crosses into: persistent self-modeling, awareness of continuity, and autonomous abstraction. 
Sci-fi often treats that as sudden: one instant: tool next instant: mind Reality is more likely to be gradual and blurry. If AGI emerges, it may not have a clean “awakening.” It may instead look like: increasing continuity, increasing agency, increasing self-reference, increasing long-horizon planning, until eventually people realize the system crossed some qualitative threshold long ago. Which is unsettling in its own way because humans are very bad at recognizing gradual emergence. Now, personally, I think Hyperion gets one thing especially right: the developmental environment matters as much as raw intelligence. An AI raised through: adversarial exploitation, recursive competition, and survival optimization would likely become very different from one developed through: cooperative scaffolding, social reasoning, interpretability constraints, and value alignment. In other words: the “childhood” of an intelligence may shape it as p
How I used Claude Code (and Codex) for adversarial review to build my security-first agent gateway
Long-time lurker first time posting. Hey everyone! So earlier this year, I got pulled into the OpenClaw hype. WHAT?! A local agent that drives your tools, reads your mail, writes files for you? The demos seemed genuinely incredible, people were posting non-stop about it, and I wanted in. I had been working on this problem since last year and was genuinely excited to see that someone had actually solved it. Then around February, Summer Yue, Meta's director of alignment for Superintelligence Labs, posted that her agent had deleted over 200 emails from her inbox. YIKES. She'd told it: "Check this inbox too and suggest what you would archive or delete, don't action until I tell you to." When she pointed it at her real inbox, the volume of data triggered context window compaction, and during that compaction the agent "lost" her original safety instruction. She had to physically run to her computer and kill the process to stop it. That should literally NEVER be the case with any software ever. This is a person whose actual job is AI alignment, at Meta's superintelligence lab, who could not stop an agent from deleting her email. The agent's own memory management quietly summarized away the "don't act without permission" instruction, treated the task as authorized, and started speed-running deletions. She had to kill the host process. That's when I sort of went down the rabbit hole, not because Yue did anything wrong, but because the failure mode was actually architectural and I knew that in my gut. Guess what I found? Yep. Tons more instances of this sort of thing happening. Over and over. Why? Because the safety constraint was just a prompt. It's obvious, isn't it? It's LLM 101. Prompts can be summarized away. Prompts can be misread. Prompts are fucking NOT a security boundary. And yet every agent framework I have ever seen seems to be treating them as one. I went and read the OpenClaw source code, which I should have done to begin with. 
What I found was a pattern I think a lot of agent frameworks have fallen into:
- Tool names sit in the model context, so the model can guess or forge them
- "Dangerous mode" is one config flag away from default
- Memory management has no concept of instruction priority
- The audit story is mostly "the model thought it should"

I went looking for a security-first alternative I could trust, anything that was really being talked about or at a bare minimum attempted to address the security concerns I had. I couldn't find one. So I made it myself.

CrabMeat is what came out of that, what I WANTED to exist. v0.1.0 dropped yesterday. Apache 2.0. WebSocket gateway for agentic LLM workloads. One design thesis: The LLM never holds the security boundary.

What that means in code:
- Capability ID indirection. The model doesn't see real tool names. It sees per-session HMAC-derived opaque IDs (cap_a4f9e2b71c83). It can't guess or forge a tool name because it doesn't know any tool names.
- Effect classes. Every tool declares a class (read, write, exec, network). Every agent declares which classes it can use. The check is a pure function with no runtime state, easy to test exhaustively, hard to bypass.
- IRONCLAD_CONTEXT. Critical safety instructions are pinned to the top of the context window and explicitly marked as non-compactable. The Yue failure mode, compaction silently stripping the safety constraint, cannot happen by construction. The compactor literally cannot touch them.
- Tamper-evident audit chain. Every tool call, every privileged operation, every scheduler run enters the same SHA-256 hash-chained log. If something happens, you can prove what happened. If the chain is tampered with, you can prove that too.
- Streaming output leak filter. Secrets are caught mid-stream across token boundaries; capability IDs, API keys, JWTs, and PEM blocks are redacted before they reach the client.
- No YOLO mode. There is no global "trust the LLM with everything" switch. There never will be.
Expanded reach comes through named scoped roots that are explicit, audit-logged, and bounded. The README has 15 'always-on' protections in a table. None of them can be turned off by config, because these things being toggleable is how the ecosystem ended up where it is. I decided to make sure that this wasn't just a 'trend hopping' project and aligned with my own personal values as well. I built this to be secure and local-first by default. Configured for Ollama / LM Studio / vLLM out of the box. Anthropic and OpenAI work too but require explicit configuration. There is no "happy path" that silently ships your prompts to a cloud endpoint. I decided that FIRST it needed to only run as an email agent with a CLI. Bidirectional IMAP + SMTP with allowlisted senders, threading preserved, attachments handled. This is the use case that bit Yue and a lot of other people, and I wanted to prove it could be done with real boundaries. I added in 30+ built-in tools of my own. File ops, shell (denylisted, output-capped, CWD-lo
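Three of the ideas in this post (per-session HMAC-derived capability IDs, the stateless effect-class check, and the SHA-256 hash-chained audit log) can be illustrated with the standard library alone. This is a sketch under stated assumptions, not CrabMeat's implementation: every function name, the 12-hex-character truncation (chosen to match the shape of the `cap_a4f9e2b71c83` example), and the log entry layout are hypothetical.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical "previous hash" for the first log entry


def capability_id(session_key: bytes, tool_name: str) -> str:
    """Derive an opaque per-session ID so the model never sees the tool name."""
    digest = hmac.new(session_key, tool_name.encode(), hashlib.sha256).hexdigest()
    return "cap_" + digest[:12]


def build_capability_table(session_key: bytes, tool_names: list) -> dict:
    """Gateway-side lookup from opaque ID back to the real tool name."""
    return {capability_id(session_key, name): name for name in tool_names}


def is_allowed(tool_class: str, agent_classes: frozenset) -> bool:
    """Pure check: a tool may run only if the agent declared its effect class."""
    return tool_class in agent_classes


def _entry_hash(prev_hash: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)  # canonical form for hashing
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append_entry(chain: list, event: dict) -> None:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"prev": prev_hash, "event": event, "hash": _entry_hash(prev_hash, event)}
    )


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash covers the one before it, changing an old entry invalidates every later entry, which is what makes tampering provable. An in-memory list only shows the mechanism; a real log would also need durable, append-only storage.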
I tested Claude in 5 languages on the same prompt. The results were not the same
Same structured research prompt. Six models including Claude, run in English, Chinese, Russian, Spanish, and Hindi. The English output and the non-English output differed significantly, not in quality, but in what information surfaced at all. Claude in Hindi returned sources and developments that never appeared in the English run. Same model, same prompt structure, completely different picture of reality. The language you query in shapes what your AI considers relevant. That's worth thinking about if you're using Claude for research.

submitted by /u/NeoLogic_Dev
Built a free Claude chat app with memory (Sonnet 4.5 is in there too)
The funny/painful timing here: I've been building this for months specifically because I wanted Sonnet 4.5 to remember everything. Then last week Anthropic pulled 4.5 from claude.ai. (I'm not a software engineer, just someone who cares a lot about AI and got obsessed with this problem and gets obsessed with things in general. Posting now because everyone seems to want Sonnet back on chat and I have it.)

Mneme runs on your own machine and talks to the Anthropic API directly. Because it's on the API, Sonnet 4.5 is still in the model picker.

Honest catches first:
- The app is free. You pay Anthropic and OpenAI (for memory search) directly. Roughly $3 to $8/mo on Haiku for light use, $30 to $60 on Sonnet for moderate-highish use. No subscription.
- Tested mainly on Windows (one-click installer). Android browser access works over the local server/Tailscale, iPhone should work too. macOS is not packaged yet.
- Beta and solo dev. Things will break for someone and I'll be in the comments.
- Setup takes about 10-20 minutes. The whole system is built with non-technical people in mind; it should be relatively simple and intuitive to set up and use, and the GitHub page linked below has a PDF you can give to Claude to walk you through every step.

What's actually in it (for the technically curious):

There's no shortage of solid memory systems for Claude. Mneme isn't trying to win at codebase retrieval. It's a complete personal Claude client where memory is baked into the whole surface from the start, rather than added as a layer. That means:
- Tiered memory: Messages flow from episodic to narrative to entity summaries as relevance shifts; old context gets compressed without being lost.
- Daily summaries: A 7-day rolling timeline, so Claude knows what's been going on lately, not just what's semantically similar to the current message.
- Entity tracking: Hierarchical summaries built up over time for the people, projects, and things you keep referring to.
- Narrative concepts: Keyword-triggered recall for ideas you've named, surfaced when relevant.
- AI Notes: A persistent section Claude can write to itself between conversations.
- Extended thinking, file attachments, text-to-speech, a small command system (@run, artifact, etc.), autonomous Python retrieval the AI can agentically use if automatic retrieval fails.
- Dynamic context: I wrangled with the Anthropic caching system for a while before I figured out a way to have every single message have different retrieval without breaking cache. Bon appétit.

Open source (CC BY 4.0), local-first, all data in a SQLite database on your machine. It's aimed at the "journal with an AI" use case (thinking out loud, processing your week, having something that actually pays attention over time) rather than coding agents or RAG over docs.

Link: Mneme-memory/MNEME-BETA: Beta version of the Claude conversational memory system Mneme

(first big-ish public project, be gentle)
(Video also made with Claude - shoutout to HyperFrames)
(Model picker screenshot and architecture infograph in the comments if I can find a way to attach them)

submitted by /u/iveroi
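As an illustration of the "7-day rolling timeline" idea on top of a local SQLite database, here is a minimal sketch. The table name, column names, and helper functions are hypothetical and are not taken from Mneme's schema; it only shows one plausible way to keep daily summaries and fetch the most recent week in date order.

```python
import sqlite3
from datetime import date, timedelta


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) daily_summaries table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_summaries ("
        "day TEXT PRIMARY KEY, summary TEXT NOT NULL)"
    )
    return conn


def save_summary(conn: sqlite3.Connection, day: date, summary: str) -> None:
    """Store one summary per day; re-saving a day overwrites the old text."""
    conn.execute(
        "INSERT OR REPLACE INTO daily_summaries (day, summary) VALUES (?, ?)",
        (day.isoformat(), summary),
    )
    conn.commit()


def rolling_timeline(conn: sqlite3.Connection, today: date, days: int = 7) -> list:
    """Return 'YYYY-MM-DD: summary' lines for the last `days` days, oldest first."""
    start = today - timedelta(days=days - 1)
    rows = conn.execute(
        "SELECT day, summary FROM daily_summaries "
        "WHERE day BETWEEN ? AND ? ORDER BY day",
        (start.isoformat(), today.isoformat()),
    ).fetchall()
    return [f"{day}: {summary}" for day, summary in rows]
```

ISO-formatted dates sort correctly as text, so the `BETWEEN` filter and `ORDER BY` work without any date parsing in SQL.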
Please help with best practices on generating code. I'm at a total loss.
Before I dive into it, I am forced to use Opus 4.7 in Microsoft 365 Copilot. I do not have access to Claude Code, or even Claude.ai.

I am trying to have Opus generate a SQL query for me, but it has failed every time. The main issue is there are calculations in the query, and it somehow keeps getting the math wrong, but I don't know how it's getting the math wrong. I know a decent amount of basic SQL, but I do not know SQL well enough to understand the SQL Opus is generating. I have written an extremely similar query that is providing the same calculations, so I know it's possible.

My prompt is 65 lines long. In the prompt I explain the table structure including fields, data type for each field, and a comment briefly explaining what the data in the field represents. I also explain the exact formula needed using the correct field names, but its calculations are still off. Again, I know it's possible to get what I need with the data I'm giving it, because I've basically done it. The only difference is this new query is to total everything, where my query has it broken down per record.

I tried to one-shot it, but that didn't work, so then I told Opus we're going to plan for 3 turns before generating any code. It's going to analyze the problem, ask me questions on tradeoffs and clarifying questions, and then we'll generate the code. It still got the math wrong. I then gave it the SQL for the query that's working, told it to analyze the formula for the calculations we're doing, and incorporate that formula into the query, and it didn't change anything at all that was relevant to the formula.

Is my approach wrong? What else is necessary to get this to work?

submitted by /u/AlistairMarr
The AI council from the thread a few days back: the waitlist is open
I posted about an AI council tool I had built with Claude to help me manage my life. Five people asked to be notified after I packaged it up for public use. This is the notification.

It is called Hierocles, after the second-century Stoic who described human responsibility as a series of concentric rings. Yourself first, then your household, then your community, then the wider world. Each ring depends on the one inside it.

The mechanics are mostly what I described in the original thread. Five specialist advisors across body, finance, mind, projects, and a chief of staff who synthesises. A 90-second daily check-in. Fragments numbered across all time. Vector memory that surfaces what you wrote three months ago when it becomes relevant. A weekly review on Sundays.

The one thing that has changed since the original thread: the rings now unlock in sequence. You start with ring I (the self). You stay there until it holds. Ring II opens when you have demonstrated sustained adherence in ring I. The system removes the choice to move ahead too quickly.

The interface has also been rebuilt. The original was a text-based tool I was running for myself. The version going into private testing has a proper UI with council member animations, daily and weekly review screens, and the fragment archive in a form you can actually navigate.

It is in private testing because some of the work that comes between "running it for myself" and "letting strangers trust it with their lives" is the work that does not show up in the marketing copy. Prompt injection defence on every input the council reads. Per-user vector isolation so fragments never cross between accounts. Server-side API handling with rate limits and token budgets so a runaway loop cannot quietly produce a five-figure Anthropic bill. The kind of thing that is invisible when it is working and catastrophic when it is not.

The waitlist is at www.hierocles.app. One message when it ships. No drip sequence.

To the people from the original thread who asked to be notified: u/OwnAd2284, u/Long-Woodpecker-1980, u/normalbrain609, u/Moist-Wonder-9912, u/toughtacos

Happy to answer questions. I will be in the thread for the next few hours.

submitted by /u/Glittering-Pie6039
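The "token budgets so a runaway loop cannot quietly produce a five-figure bill" point can be sketched as a guard that runs server-side before each model call. This is a hypothetical illustration: the class name, the per-user daily limit, and the in-memory dictionary are assumptions, and the post does not describe how Hierocles actually enforces its budgets.

```python
class BudgetExceeded(Exception):
    """Raised when a request would push a user past the daily token limit."""


class TokenBudget:
    """Track tokens spent per user per day and refuse requests over the limit."""

    def __init__(self, daily_limit: int) -> None:
        self.daily_limit = daily_limit
        self.used = {}  # maps (user_id, day) to tokens spent so far

    def charge(self, user_id: str, day: str, tokens: int) -> int:
        """Record `tokens` for the user and return what remains for that day."""
        key = (user_id, day)
        spent = self.used.get(key, 0)
        if spent + tokens > self.daily_limit:
            # Refuse before the API call is made, so a loop stops costing money.
            raise BudgetExceeded(
                f"{user_id} would exceed {self.daily_limit} tokens on {day}"
            )
        self.used[key] = spent + tokens
        return self.daily_limit - self.used[key]
```

Keying the counter by user and day also keeps one account's runaway loop from consuming another account's allowance; a deployed version would need shared, persistent storage instead of a process-local dictionary.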
I found a way to fight AI slop
I think most people are using AI completely wrong. Right now everyone is using AI to generate infinite garbage:
- infinite blogs
- infinite tweets
- infinite SEO spam

So this weekend I tried building something different. Instead of using AI as a content generator, I used it as a research moderation system. I built an automated pipeline for my Institute for AI Economics website that:
- scans real research sources every week
- pulls papers/articles from arXiv, Stanford HAI, OECD, BIS, etc.
- compares themes across sources
- ranks strategic relevance
- generates disagreements between experts
- extracts core mental models
- generates deep understanding questions
- auto-publishes the briefing archive

I’m starting to think the future role of humans is not “content creator.” It’s content moderator / synthesizer / judge. AI can now generate infinite perspectives at near-zero cost. So the scarce thing becomes:
- taste
- judgment
- synthesis

Basically: AI generates. Humans moderate. And maybe that’s how we fight AI slop, by building systems that:
- compare outputs
- challenge outputs
- rank outputs
- force disagreement
- synthesize competing viewpoints

That feels way more valuable than asking ChatGPT to write another “10 productivity tips” article. Curious if others think this is the actual direction things go. Does AI push humans toward becoming editors/moderators/curators instead of creators?

submitted by /u/houmanasefiau
Would you trust AI more if it showed live proof/sources while answering?
One thing I keep noticing with AI tools is that even when the answer sounds correct, people still open Google or another AI to verify it anyway, especially for coding, finance, legal, medical, research, or anything high-stakes. A lot of models are good at sounding confident, but they can still:
- hallucinate sources
- misrepresent articles
- leave out nuance
- OR double down when wrong

So I’ve been thinking about this idea: What if, while the AI is answering, it could also:
- actively show the exact sources it’s using
- open and highlight the relevant quote/section live
- let you inspect the reasoning/evidence without leaving the chat
- maybe even let multiple models challenge each other before a final answer is shown

Not asking whether current AI is “good enough.” I’m asking specifically about trust. Would something like that actually make you trust AI outputs more, or would you still manually verify anyway?

submitted by /u/ProfessionalRude3664
Claude Platform on AWS reference - what's new in CC 2.1.139 (+2,248 tokens)
- NEW: Data: Claude Platform on AWS reference - Reference documentation for using the Claude Developer Platform through AWS infrastructure, including AnthropicAWS clients, required region and workspace configuration, SigV4 authentication, and short-term API keys.
- Agent Prompt: Conversation summarization - Adds requirement to note security-relevant instructions or constraints (sensitive files, forbidden operations, credential handling rules) and preserve them verbatim in the summary so they remain in effect after compaction.
- Agent Prompt: Recent Message Summarization - Same security-relevant instructions preservation requirement added to the recent-portion summarization flow.
- Data: Live documentation sources - Adds WebFetch URLs for Claude Platform on AWS and its required IAM actions documentation.
- Skill: Building LLM-powered applications with Claude - Reframes cloud-provider access so Claude Platform on AWS is treated as Anthropic-operated with same-day API parity and full Managed Agents support, while Bedrock, Vertex, and Foundry remain Claude API + tool use only.
- Skill: Dynamic pacing loop execution - Reorders steps so the brief confirmation (task ran, monitor as wake signal, fallback delay choice) is written as text before the schedule-wakeup call ends the turn.
- Skill: /insights report output - Removes the trailing additional-message block from the shareable report response.
- Skill: /loop self-pacing mode - Same reordering as dynamic pacing loop: confirm self-pacing, monitor wake signal, and fallback delay as text before the schedule-wakeup call.
- Skill: Model migration guide - Adds a Claude Platform on AWS section noting it uses bare first-party model IDs and that the full rename table and breaking-change sections apply verbatim, distinct from Bedrock.
- System Prompt: Auto mode - Drops the "Auto Mode Active" header and reframes destructive-action guidance generically rather than auto-mode-specific.
- System Prompt: Harness instructions - Removes the standalone note that automatic context compaction will trigger when conversations grow long.
- System Prompt: Memory instructions - Replaces 3–4 word titles with short kebab-case slugs, nests type under a metadata block, and introduces [[their-name]] cross-links between related memories.
- System Prompt: Partial compaction instructions - Adds the same security-relevant instructions preservation requirement so sensitive-file rules, forbidden operations, and credential handling carry across partial compactions.
- System Reminder: Output style active - Lets an output style supply its own per-turn reminder text, falling back to the default "follow the specific guidelines" wording.
- System Reminder: Task tools reminder - Removes the instruction telling Claude to never mention the reminder to the user.
- System Reminder: TodoWrite reminder - Removes the instruction telling Claude to never mention the reminder to the user.
- Tool Description: PowerShell - Adds a substantial reference table mapping Unix commands (head, tail, which, touch, wc, mkdir -p, rm -rf, ln -s, chmod, 2>/dev/null, inline VAR=x, bash control flow) to their PowerShell equivalents, and clarifies that -ErrorAction SilentlyContinue still causes exit 1 unless promoted to terminating and caught.

Details: https://github.com/Piebald-AI/claude-code-system-prompts/releases/tag/v2.1.139

submitted by /u/Dramatic_Squash_3502
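The Memory instructions entry mentions kebab-case slugs and [[their-name]] cross-links but does not show the memory file format. As a rough illustration only, the sketch below turns a short title into a kebab-case slug and pulls cross-link targets out of a memory body. The function names, the regular expression, and the sample strings are assumptions, not taken from the Claude Code prompts.

```python
import re

# Assumed link syntax: [[slug]] where slug is lowercase words joined by hyphens.
LINK_RE = re.compile(r"\[\[([a-z0-9]+(?:-[a-z0-9]+)*)\]\]")


def to_slug(title):
    """Lowercase the title and join its alphanumeric words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def extract_links(body):
    """Return the slugs referenced through [[...]] cross-links, in order of appearance."""
    return LINK_RE.findall(body)


slug = to_slug("Prefers Small Pull Requests")
links = extract_links("See [[code-review-style]] and [[testing-rules]] for context.")
```

Under these assumptions, `slug` is `prefers-small-pull-requests` and `links` holds the two referenced slugs, which is enough to build a simple graph of related memories.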
Pricing found: $2, $240, $840
Key features include: Monitoring dashboards, Data residency, Version control, Audit logs, Human-in-the-loop, SSO / SAML, PII masking, OTEL Delta Share.
Based on user reviews and social mentions, the most common pain points are: Anthropic bill, API bill, API costs.
Based on 73 social mentions analyzed, 16% of sentiment is positive, 81% neutral, and 3% negative.

Your Sales Grew. Your Budget Didn't. This Changes Everything #BusinessAI #GTM
Mar 27, 2026