Prompt flow Doc
PromptFlow draws attention for its integration with GPT-powered chatbots, particularly on live websites, where it has practical applications beyond traditional benchmarks. Users value it for defining complex AI prompts and for improving productivity in tasks ranging from financial planning to strategic writing. Complaints center on unexpected project separation and workflow friction, especially when managing conversations across projects. Pricing sentiment is generally neutral; discussion focuses on functionality and effectiveness rather than cost. Overall, PromptFlow has a solid reputation for versatility and practical utility in AI-driven workflows.
Mentions (30d)
39
14 this week
Reviews
0
Platforms
2
GitHub Stars
11,087
1,089 forks
Features
Use Cases
Industry
information technology & services
Employees
3
116,174
GitHub followers
7,713
GitHub repos
11,087
GitHub stars
20
npm packages
40
HuggingFace models
PrimeTask Bring Your Own AI - Claude sets up a full project in one prompt.
Hey r/ClaudeAI, I'm one of the developers behind PrimeTask, a local-first productivity system for macOS. The final beta now ships with Bring Your Own AI, a local MCP server (110+ tools, 5 prompt templates) so you can point Claude Desktop, Claude Code, Cursor, or LM Studio at it and let your own agent do the work. Quick demo in the video. One sentence from me, end-to-end project setup from Claude. What's happening in the clip I say I'm launching a Mac app in six weeks and ask Claude to set up the project. Claude creates the project with a deadline, three phase tasks (Design, Build, Launch) with staged due dates, descriptions, tags, subtasks, and short checklists. Sets a reminder on the first task so the native macOS toast fires during the recap. Recommends where to start. I say "start." Claude moves Design into the Design status and kicks off a timer. Twelve-plus tool calls under one prompt. No copy-paste, no manual setup. Why BYO AI (not a bundled cloud bridge) Server runs inside PrimeTask on your Mac. Your tasks, projects, CRM, and notes never leave the device. We don't ship a model. You bring your own: Claude Desktop, Claude Code, Cursor, LM Studio, anything MCP-compatible. No Anthropic-side context about your work. Claude only sees what your agent pulls in per turn. Per-space permissions: lock an agent to read-only or scope it to one workspace. Streamable HTTP with Bearer auth, or stdio if you prefer that route. Tool catalog profiles (Full, Core Tasks, Minimal, PrimeFlow, CRM, etc.) so smaller local models don't get drowned in 100+ tools. Five built-in MCP prompts (daily_standup, weekly_review, project_status, crm_summary, overdue_triage) for the workflows people actually want. Every tool call is logged in an in-app audit log. 
Full BYO AI docs (setup, transports, tool catalog, security): https://www.primetask.app/docs/integrations/bring-your-own-ai Why we built it this way Most "AI in your task app" is the app calling a vendor's API on your behalf, often with your data going through their pipes. We wanted the opposite. Your agent, your model, your machine. The app exposes a tool surface and gets out of the way. That's what BYO AI means here. PrimeTask itself is local-first, no account, no subscription, plain JSON on disk. BYO AI made the AI story consistent with that: nothing leaves your laptop unless you point your agent at one that does. Where we're at PrimeTask is wrapping up the final beta and heading to a stable launch this summer. Beta is now closed to new sign-ups. We're locking it down to ship the stable release. If you'd like to be notified at launch, drop your email here: https://www.primetask.app/notify or visit https://www.primetask.app Happy to answer questions about the MCP setup, the profile system, or how we structured the tool descriptions for agent discoverability. submitted by /u/XVX109 [link] [comments]
One week after launching my Wispr Flow alternative built with Claude Code, greed is taking me over...
Quick update for anyone who saw the launch post last week. Vox (free Wispr Flow alternative, built almost entirely with Claude Code over a couple of weeks of evenings) is at close to 200 downloads. There's a Discord with people actively reporting bugs and asking for features, and I've been shipping fixes and small features almost every day. Still pair-programming with Claude Code for most of it. Now I'm sitting with a question I didn't expect this soon. Money. I want the app to stay free. Not negotiable in my head. The whole reason I built this instead of just paying $15/month was that paying $15/month for something I'd use to dictate to Claude felt wrong. Putting a price tag on it now would miss my own point. But I also can't pretend this is sustainable as pure charity forever. Hours are real. So my gut is saying: add a way for people who want to support the project to do so, without putting it in front of anyone who doesn't. The idea I keep coming back to The app already calculates how much time it has saved a user. Once they cross something meaningful, say 10 minutes saved total, show a small one-time message somewhere unobtrusive: "Hey, you just saved 10 minutes with Vox. If it's earning a spot in your workflow, you can support the creator here." A donation button. That's it. What I like about it App stays fully free. No paywall, no nag every launch, no feature gate. Nobody sees the prompt unless they actually got value. If it doesn't click, they never even know there was an option. The math (minutes saved) is the same math I used to justify building this in the first place. What I'm not sure about Whether even one prompt feels gross. People are sensitive about being asked for money, even gently. Whether 10 minutes is the right threshold. Too low feels needy. Too high and some people never see it. Whether donation as a model just doesn't work for an indie app like this. Maybe GitHub Sponsors once it's open source. Maybe something else I'm not seeing. 
The ask
- If you've used Vox, would that prompt bother you or feel fair?
- For anyone here who has shipped a free app, especially something you built with Claude Code or similar tools, how did you handle the money question? What worked and what backfired?
- Is there a model that fits this better than a donation button?
Not in a rush. Just want to think this out loud before doing anything. submitted by /u/EfficientLetter3654
I built ContextAtlas: A new take on context carry over and helps claude pick up new sessions where it left off in scope of your previous design decisions while saving your tokens avoiding rediscovery
When the "Build with Opus 4.7" hackathon was announced, I had been obsessing over the tokenomics of agents and how to make sessions go further without burning context on rediscovery work. We all have probably hit a session limit and wondered how it went so fast. I applied with that thesis, didn't get in, but I built it anyway over the last four weeks. I am proud to share that v1.0 ships today. Note up front: this is specifically a tool for development users. If you're using claude.ai web or Projects, ContextAtlas won't plug in directly. But if Claude Code is your main work flow or you utilize the Anthropic API, this tool was made for you. The pain: Claude Code learns your codebase fresh every session. "Where is OrderProcessor?" triggers a flurry of greps. "What depends on AuthMiddleware?" is another round of file reads. On a mid-sized codebase, an architectural question can burn 40+ tool calls and a lot of tokens before Claude has enough context to reason well. And the architectural rules in your ADRs and design docs? Claude has no path to those, so it confidently suggests changes that break constraints you may have documented elsewhere in your repo. What I built: ContextAtlas is an MCP server that pre-computes a curated atlas of your codebase (symbols, ADR-extracted architectural intent, git history, test coverage) and serves it to Claude Code in one call at query time in a smaller, token saving compact shape via a few lightweight mcp tools. Initial indexing happens once; querying is local and free. Example of what comes back when Claude calls get_symbol_context("OrderProcessor"): SYM OrderProcessor@src/orders/processor.ts:42 class SIG class OrderProcessor extends BaseProcessor INTENT ADR-07 hard "must be idempotent" RATIONALE "All order processing must be safely retryable." 
REFS 23 [billing:14 admin:9] GIT hot last=2026-03-14 TESTS src/orders/processor.test.ts (+11) Claude sees the idempotency constraint before proposing changes, not after a review catches the violation. https://i.redd.it/0ons3o28t32h1.gif Numbers: 45-72% token reduction on architectural prompts across three benchmark repos (TypeScript, Python, Go), with zero quality regression on measured axes. Full methodology and paired-t confidence intervals in the linked write-up. I wanted measurements, not vibes. Honest limits: single-judge model at v1.0 (cross-vendor panel is post-launch work). Quantitative claims bounded to three benchmark repos. Tie-bucket and trick-bucket prompts routinely show ContextAtlas net-negative; that's reported inline rather than buried. Install (two ways): In Claude Code: /index-atlas and /generate-adrs skills. No API key needed; runs under your subscription. Via CLI: uses Anthropic API for indexing. npm install -g contextatlas contextatlas init && contextatlas index # then add the MCP server entry to your Claude Code config (snippet in the README) Both produce structurally identical atlases. Supported languages at v1.0: TypeScript (tsserver), Python (Pyright), Go (gopls), Ruby (ruby-lsp). Rust, Java, and C# are next on the roadmap; the adapter interface is small enough that they're realistic community contributions. What's next: v1.1 thesis is shaping up around developer onboarding flows and quality-validation work that was deferred from v0.8. And integrating external documentation of your code base into pre-indexing workflow. Full write-up: https://www.contextatlas.io/blog/v1.0.0 Repo: https://github.com/traviswye/ContextAtlas Also launching on DevHunt today: https://devhunt.org/tool/contextatlas; votes are very appreciated if you find ContextAtlas useful or an interesting approach. Built solo, hackathon-shaped scope, not pretending it's a full blown research paper, but did attempt to treat methodology as seriously. 
Happy to answer anything in the comments. Star the repo if you want to follow along, file an issue if it breaks for you on your codebase, and please be honest; this only gets better with feedback from people running it on real repos. submitted by /u/Kitchen-Leg8500
Anyone else feel like Claude has gotten noticeably worse lately?
I’m not trying to start an AI war or anything. I genuinely used to prefer Claude for a lot of tasks (Max 20x plan). It felt more thoughtful, better at long-form reasoning, and better at keeping context across conversations.
I’ve been using it heavily to work on strategies for promoting my app, Impulse Stop Habits: brainstorming growth ideas, positioning, onboarding flows, marketing angles, content funnels, etc. So I’ve spent a lot of hours talking to it over long sessions.
But over the last few weeks, I feel like something changed. Now I constantly run into:
- forgetting context after a few messages
- contradicting itself
- hallucinating details confidently
- missing obvious instructions
- giving generic “safe” responses instead of actually thinking
- randomly ignoring parts of prompts
- coding mistakes that weren’t happening before
And I’m not talking about abstract “AI vibes.” I mean real workflow-breaking stuff.
Example: Claude suggested using Reddit as a major acquisition channel for my app (IMPULSE: Stop habits). The problem is that a lot of addiction / habit-recovery subreddits explicitly ban promotion. We actually tested posting in other allowed subreddits and measured the results: basically no meaningful conversions or traction. Despite already discussing that and reviewing the results together, Claude later went back to recommending Reddit growth strategies as if none of that prior context existed. Only after I reminded it, “we already tested this, and it didn’t work,” did it suddenly apologize and completely change the strategy.
That’s the part that feels different to me now: it often can reason correctly, but only after being manually reminded of a lot of context that was already established earlier in the conversation. Sometimes it honestly feels like the model is “tired” after a few exchanges. (I have even typed, “You’re tired, restart and use 100% of what you can,” and a couple of times it confirmed that it had been working at only 10% 🤣.) Like the coherence just degrades mid-conversation. And this becomes especially obvious during deep strategy discussions, where context really matters. I’ll spend 30–40 minutes building up nuance around the app, target audience, monetization, creative strategy, and then suddenly it starts responding like it forgot half the conversation.
The weirdest part is that older discussions about Claude were praising it specifically for context retention and nuanced reasoning, which is exactly where it now feels weaker to me. Am I imagining this, or are other people seeing the same thing?
Curious whether this is:
- heavier load / inference optimization,
- aggressive safety tuning,
- context compression,
- model routing changes,
- or just nostalgia + expectations increasing over time.
I can send proof via DM, because it contains bad words 🤣 submitted by /u/Party_Nectarine2506
Passed Claude CCA-F with 10+ teammates — notes and prep advice
Over the past few weeks, 10+ people on our team have taken and passed the Claude Certified Architect – Foundations (CCA-F) exam. After comparing notes, our main takeaway is: This is not really an API memorization exam. It is much closer to a scenario-based architecture judgment exam. You are not just asked whether you know a Claude feature. You are asked whether you can make reasonable design trade-offs when Claude is used inside real products, agent workflows, developer tools, and automation systems. Some of the recurring questions are more like: Should this task be handled by one agent or multiple sub-agents? Is this tool doing too much? Are the permissions too broad? Is MCP actually needed here, or is it over-engineering? Should this action be automated, or should there be human review? How should structured output be validated? How should long-context workflows be managed reliably? What is the safest next step in a partially automated system? Here are our notes for anyone preparing for the exam. 1. Basic exam structure Based on the official outline and public exam writeups, the exam is: 120 minutes Multiple choice 4 options per question Score range: 100–1000 Passing score: 720 The exam domains are: Agent architecture and orchestration — 27% Tool design and MCP integration — 18% Claude Code configuration and workflows — 20% Prompt engineering and structured output — 20% Context management and reliability — 15% One public writeup also mentioned that there are 6 scenario categories, and the exam randomly selects 4 of them. So this is not a “random facts about Claude” exam. It is much more about reading a realistic scenario and choosing the safest, simplest, most appropriate architecture. 2. The three principles that kept coming up After reviewing the questions we struggled with, we found that many of them came back to three design principles. 1. Least privilege Do not give a tool, agent, or workflow more access than it needs. 
Examples: If read-only access is enough, do not grant write access. If access to one repository is enough, do not grant access to the whole workspace. If a tool only needs one narrow action, do not expose a broad system-level capability. If an action is high-risk, do not fully automate it without review. A lot of wrong answers look attractive because they are powerful or automated. But they often give the model or tool too much authority. 2. Single responsibility A tool should not do everything. A sub-agent should not become a “general-purpose employee” that retrieves data, makes decisions, modifies files, submits changes, and notifies people all in one step. Many questions test whether you understand where the responsibility should live: Should this be a tool? Should this be agent reasoning? Should this be a human decision? Should this be a separate validation layer? Should this be split into smaller components? If one component is doing too much, be careful. 3. Avoid over-engineering This was probably the biggest pattern. Some answers look sophisticated: Multi-agent orchestration Complex MCP workflows Long-term memory Fully automated tool execution Multi-stage validation pipelines But if the problem is small, narrow, and low-risk, the best answer is often the simplest controlled solution. Our internal summary was: Do not choose the most impressive architecture. Choose the smallest, safest, most controllable one. 3. English reading is a real hidden challenge For non-native English speakers, this may be one of the hardest parts. The questions are often long scenario descriptions. They may include: the current system design the team’s goal existing constraints the risk profile what tools are available what the next step should be The answer choices can also be long. Sometimes one word changes the meaning of the whole option. 
Words like: automatically always unrestricted without review full access all repositories execute directly can make an option much riskier than it first appears. So our advice is: Practice reading English scenarios directly. Do not rely on translation tools. During the actual proctored exam, you should not expect to use Google Translate, Chrome translation, DeepL, Claude, ChatGPT, or any other external translation tool. For the last few days before the exam, it is worth forcing yourself to read only English material and English practice questions. 4. ProctorFree exam setup The exam is online and uses ProctorFree. The rough flow is: You receive the exam email. You follow the exam link. You download and install ProctorFree. You complete the pre-exam setup. The system checks camera, microphone, network, and screen recording. You start the exam. The session is recorded. After submission, you wait for the upload to complete. Practical setup tips: Use only one monitor. Disconnect external displays. Close unnecessary applications. Clos
I gave Claude access to my M365 account using Power Automate + a small MCP server
I’ve been messing with MCP servers lately and finally got one working that feels genuinely useful instead of “cool demo, never use again.”
The problem: I wanted Claude to be able to do basic Microsoft 365 stuff for me:
- read my inbox
- send a draft/follow-up
- check my calendar
- save notes into OneDrive
- make Planner tasks
- write rows into Excel
- fill a Word template
But I don’t have tenant admin access, and I wasn’t going to get Graph permissions approved just for personal automation.
The workaround was Power Automate. Every operation is a PA flow with an HTTP trigger. PA gives you a signed webhook URL. The flow runs as my account, using permissions I already have. Then I put a small FastMCP server in front of those webhook URLs and connected that to Claude.
So now in a Claude chat I can say things like:
- “Email me a summary of this.”
- “What’s on my calendar tomorrow?”
- “Save this note to OneDrive under /Projects.”
- “Create a Planner task for this follow-up.”
- “Append this row to the tracking spreadsheet.”
Under the hood Claude is just calling MCP tools like m365_send_email, m365_calendar_read, onedrive_create_file, etc. The MCP server posts JSON to Power Automate, and PA does the actual M365 action.
The architecture is definitely not fancy:
Claude -> MCP tool -> FastMCP server -> PA webhook -> M365 connector
I’m running the MCP server on a cheap VPS. It’s about 200 lines of Python plus a JSON config file of flow names and URLs.
This was also a nice reminder that “agent tool access” doesn’t always need a perfect official API integration. Sometimes the janky enterprise tool you already have is enough.
The funniest bug: I had two tools pointing at the same Power Automate webhook because I duplicated a flow and forgot to update the URL in my config. The result was Claude confidently calling the “right” tool and Power Automate doing the wrong damn thing. Very educational, not very dignified.
Edit: (You will probably need Power Automate Pro, which I needed for a couple of other things.) Here's an example of it. I built 22 Power Automate flows covering all the different tools that I would want called, and then I added them to the MCP server.
1. In Power Automate, make one flow per action. Example: send email, read inbox, create calendar event, write OneDrive file, etc.
2. Start each flow with “When an HTTP request is received.”
3. Define the JSON body you want that flow to accept. For send email, maybe { "to": "...", "subject": "...", "body": "..." }.
4. Add the normal M365 connector action. Example: Outlook Send Email V2, OneDrive Create File, Excel Add Row, Planner Create Task.
5. End the flow with a Response action that returns JSON.
6. Copy the HTTP trigger URL into a private config file. Do not commit it. Do not paste it anywhere public. Treat it like a password.
7. Put a small FastMCP server in front of those URLs. Each MCP tool just validates the inputs, finds the right PA webhook URL, POSTs JSON to it, and returns the PA response.
The wrapper is not fancy. It’s basically:
AI tool call -> FastMCP function -> httpx.post(PA webhook URL, json=args) -> return response
The main things I’d recommend are:
- keep webhook URLs private
- add a duplicate URL check at startup
- log tool name + status, but not secrets
- start with read-only tools before giving it send/write powers
- make every flow narrow instead of one giant “do anything” endpoint.
Will post more info in the AM if needed. Thanks for reading!
(If you are not familiar or not comfortable with Power Automate, what I would recommend, and I mean this sincerely, is to use either co-work or Claude Code Terminal with the Chrome extension and give it the prompt so it builds the flows for you. It's a little slow and it'll take a bit, but it will make them. Just don't sit there and watch it if you want it to feel quick.) submitted by /u/ChiGamerr
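The "duplicate URL check at startup" recommended in the post could look roughly like the Python sketch below. It is only an illustration, not the author's code: it assumes the config is a flat mapping of tool name to webhook URL, and the function names `find_duplicate_urls` and `check_config` and the `example.invalid` URLs are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_urls(flows: Dict[str, str]) -> Dict[str, List[str]]:
    """Group tool names by webhook URL and keep only URLs used more than once."""
    tools_by_url: Dict[str, List[str]] = defaultdict(list)
    for tool_name, url in flows.items():
        tools_by_url[url.strip()].append(tool_name)
    return {url: names for url, names in tools_by_url.items() if len(names) > 1}


def check_config(flows: Dict[str, str]) -> None:
    """Raise at startup if two tools would post to the same flow."""
    duplicates = find_duplicate_urls(flows)
    if duplicates:
        # Name the tools but not the URLs: the URLs act as secrets
        # and should stay out of logs and error messages.
        groups = "; ".join(", ".join(sorted(names)) for names in duplicates.values())
        raise ValueError(f"Tools sharing a webhook URL: {groups}")


# Hypothetical config: two tools accidentally share one URL,
# which is the copy-paste bug described in the post.
EXAMPLE_FLOWS = {
    "m365_send_email": "https://example.invalid/flow-a",
    "m365_calendar_read": "https://example.invalid/flow-a",
    "onedrive_create_file": "https://example.invalid/flow-b",
}
```

Calling `check_config(EXAMPLE_FLOWS)` before registering any tools would fail fast with the two offending tool names, instead of letting one flow silently answer for another.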
Built an MCP for claude code that turns ticket-mentions into PRs with browser QA (and what I learned along the way)
notesasm is an MCP server you add to claude code. you mention a fix mid-flow ("make a ticket on notesasm: fix the regex for quoted emails") and it files the ticket. later, on your schedule, an autonomous agent picks the ticket up, writes the fix, runs real-browser QA against your preview deploy, and opens a PR with screenshots. closed alpha, free during it. demo + signup: notesasm.com the pain it solves (3 separate ones, actually): claude code is fast enough now that shipping isn't the bottleneck anymore. when you're deep in a feature and notice "the regex misses RFC-quoted local parts" or "the footer copy is wrong on mobile", you'd never break flow to open jira/linear or even write it down anywhere. so the idea goes nowhere. multiply by a year and your repo has invisible debt nobody's tracking. claude code helps while you're at the keyboard. it doesn't help while you sleep. your repo doesn't move overnight unless you stayed up to push it. for solo founders or small teams, that means losing 8 hours a day where you could be shipping if you had a way to delegate work to your own agent. and even if you do have something pushing code for you overnight, you lose context with AI-generated PRs and they usually need visual review. claude writes code that compiles and tests pass, but the actual rendered output might be subtly broken (or super broken lol). reviewing those visually is tedious and a lot of teams skip it, then ship regressions. how it works: you add the MCP server: claude mcp add notesasm --scope user --transport http -H "Authorization: Bearer ". BYOK style, the token comes from your dashboard. zero local install beyond the one command. then in any claude code session you can say "make a ticket on notesasm for this" (based on your conversation) and it just files it. the MCP server is HTTP-transport (not stdio), runs in the cloud, hits a fastapi backend that stores the ticket in postgres against your workspace. 
later (your schedule, your spend cap), a worker process picks up queued tickets. for each one: clones your repo with a github app installation token (commits look like asmnotes[bot], a verified author. bypasses vercel/netlify deploy protection that rejects unknown-team-member commits.) runs the claude agent sdk with your ticket body as the prompt. defaults to sonnet 4.6, opus 4.7 for hard tickets the user marks explicitly. agent reads the codebase, makes the edits, commits, pushes a branch, opens a PR via the github API. waits for your preview deploy to land. vercel polled by default, configurable probe URL for split frontend/backend setups like vercel + railway. QA agent drives a real chrome session on browserbase against the preview. stealth profile with residential proxies. takes before/after screenshots. verifies your acceptance criteria against the rendered output. if QA fails, the report feeds back into the build agent for up to 3 retry iterations before parking the ticket. final: PR with QA screenshots in the description, ready to merge. stack: - backend: fastapi + asyncpg + railway - frontend: vanilla html/js, no build step, vercel - agents: claude agent sdk (build), claude + browserbase (QA) - auth: clerk - email: resend (welcome, invite, feedback) - mcp transport: http (cloud-hosted, no local install) things i learned building it that other claude code folks might care about: - the build agent loves to spawn subagents via the Task tool. disable it explicitly in the system prompt or you get 4-minute hangs the SDK doesn't surface as errors. - browserbase sessions default to a ~5-min timeout. if your QA wall budget is anywhere near that, set the session lifetime explicitly to 1800s on session create (the timeout field). otherwise you get random "410 Gone" mid-run. - don't rely on the SDK's wall budget alone. add a per-message timeout (90s works) so a hung tool call doesn't silently burn your whole budget. - claude code's default mcp scope is per-cwd. 
always tell users `--scope user` in your install instructions, otherwise the MCP works in one repo and silently doesn't in others. - ResultMessage emissions happen multiple times per job if you have iteration loops (build + QA + qa-fix). sum them all when computing per-job cost, not just the last one. what's next: closed alpha is open. would love ~30 active users to try it out, all free during it. paid plans later this year with a permanent discount for alpha users. happy to answer anything about the MCP design, the QA verification loop, cost tracking, the agent-sdk integration, or anything else. demo + signup: notesasm.com submitted by /u/FormExtension7920 [link] [comments]
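The per-message timeout tip from the post can be sketched generically in Python. This is a hedged illustration rather than the author's implementation: it does not call the Claude Agent SDK at all, it just wraps any async iterator of messages, and the helper name `with_message_timeout` is hypothetical. The 90-second figure in the docstring is the one the post mentions.

```python
import asyncio
from typing import AsyncIterator, TypeVar

T = TypeVar("T")


async def with_message_timeout(
    stream: AsyncIterator[T], seconds: float
) -> AsyncIterator[T]:
    """Yield items from `stream`, failing if any single item takes too long.

    The overall job budget is left to the caller; this only bounds the
    gap between consecutive messages (the post suggests about 90 seconds).
    """
    iterator = stream.__aiter__()
    while True:
        try:
            # Bound the wait for the next message only, not the whole run.
            message = await asyncio.wait_for(iterator.__anext__(), timeout=seconds)
        except StopAsyncIteration:
            return
        yield message
```

A caller would iterate with `async for message in with_message_timeout(agent_stream, 90): ...` and catch `asyncio.TimeoutError` to park or retry the ticket, where `agent_stream` stands for whatever message stream the agent run produces.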
My Claude Code setup: auto-commits, session summaries, deletion guards, and a 200-line CLAUDE.md that doesn't turn into a novel
Been using Claude Code heavily for a native iOS project for the past few weeks. After losing context across sessions more times than I want to admit, I finally sat down and built a proper workflow around it, or at least I think I did, and I'm open to any suggestions from this community on how it could be done better. I think most of the advice out there is either too basic or way too enterprise or high level.
The main problem was that every new session started cold. Claude Code would re-read files it already understood, ask questions I already answered last session, and occasionally redo work. The context window is finite and long sessions degrade. So I have a system where Claude Code writes a session summary to a docs/sessions/ folder before ending, and reads the most recent one at the start of every new session. It picks up exactly where it left off. Sounds obvious in hindsight but the difference is amazing.
The other thing that was killing my flow was the constant permission prompts. Every curl, every cat, every echo, every file write. I was sitting there hitting Y every ten seconds like a human approval bot. So I set up settings.json with broad allow rules for Bash, Read, and Write so it stops asking for the routine stuff. But here's the key part: I added specific deny rules for the actually dangerous commands like rm -rf, git reset --hard, and find -delete. Then I wrote a guard.sh script as a PreToolUse hook that intercepts every bash command and hard blocks anything destructive with exit code 2. Not a warning, an actual block. So now 95% of the session runs uninterrupted and the only time it stops to ask me is when it genuinely should. Sessions went from constant babysitting to kicking off a task and checking back when it's done.
The other thing that kept biting me was CLAUDE.md growing into a monster. Every session adds context, decisions, architectural notes, and after a couple of weeks you've got a 500-line file that's eating your context window just by existing; a larger CLAUDE.md means your sessions reach context limits faster. So I added a rule: CLAUDE.md stays under 200 lines. Anything older or non-critical gets moved to a history.md with a date tag. CLAUDE.md keeps a one-line cross reference so nothing is actually lost. Claude Code maintains this itself, I don't touch it.
Auto git commits after features was the last piece. Conventional commit prefixes (feat, fix, style, refactor) so the git log actually tells a story. It pushes after milestones. I also added instructions that it has to git commit working code before any destructive operation so there's always a rollback point.
The whole setup is three files: settings.json with permissions, allow/deny rules and hooks; a guard.sh script for the deletion blocker; and the CLAUDE.md additions for session management and workflow rules. Sharing those files below. It took me a few sessions to get right, but now I almost never lose context between sessions, there's no more babysitting permission prompts :), and I haven't had a scare with accidental deletions since, because Claude knows not to do it :)
Hope this helps you all. All the files are in this GitHub gist: https://gist.github.com/ravisirsi/0dfaddeced317597b86755caf0120837 submitted by /u/Sweet-Helicopter2769
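The deletion guard described in the post is a shell script (guard.sh); the sketch below shows the same idea in Python and is not the author's script. It assumes the hook receives a JSON payload on stdin with a `tool_name` field and the shell command under `tool_input.command`; that payload shape should be checked against the Claude Code hooks documentation. The function names and the three patterns are illustrative and far from exhaustive (for example, `rm -r -f` with split flags is not caught). Exit code 2 as the blocking signal comes from the post.

```python
import json
import re
import sys
from typing import Optional

# Patterns for the three commands the post names; intentionally narrow.
BLOCKED_PATTERNS = [
    (re.compile(r"\brm\s+-[A-Za-z]*(?:[rR][A-Za-z]*f|f[A-Za-z]*[rR])"), "recursive force delete"),
    (re.compile(r"\bgit\s+reset\s+--hard\b"), "git reset --hard"),
    (re.compile(r"\bfind\b.*\s-delete\b"), "find with -delete"),
]


def blocked_reason(command: str) -> Optional[str]:
    """Return a short reason if the command looks destructive, else None."""
    for pattern, reason in BLOCKED_PATTERNS:
        if pattern.search(command):
            return reason
    return None


def main() -> int:
    """Read one hook payload from stdin and return the exit code to use."""
    payload = json.load(sys.stdin)
    # Assumed payload shape; only shell commands are inspected.
    if payload.get("tool_name") != "Bash":
        return 0
    command = payload.get("tool_input", {}).get("command", "")
    reason = blocked_reason(command)
    if reason is not None:
        print(f"Blocked by guard: {reason}", file=sys.stderr)
        return 2  # exit code 2 is the hard block described in the post
    return 0

# When saved as a standalone hook script, finish the file with:
#     raise SystemExit(main())
```

Keeping the pattern check in `blocked_reason` separate from the stdin handling makes the rules easy to unit test before trusting them in a live session.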
I paid €200/month to become Claude Code’s parole officer
I’ve been using Claude Code hard on real projects, alongside another coding agent I’m not naming because this is not an ad. This is not a benchmark post. This is a field report from someone who has spent too much time watching a talented tool behave like it has commit access and no adult memories. To be fair, Claude Code has real strengths. It is genuinely good at UI/UX exploration. If I want quick mockups, product directions, or “act like a PM and show me three possible flows,” it can be excellent. It has taste. Sometimes. It can make a screen feel designed rather than merely assembled. The UI is also friendlier than the other tool, though that gap is shrinking. So no, this is not “Claude Code is useless.” That would be too simple. Claude Code is worse than useless in a more expensive way: it is useful just often enough to keep you emotionally invested before it quietly turns your codebase into a crime scene. The problem starts when the work stops being a neat isolated component and becomes “please operate responsibly inside this actual repo.” On bigger codebases, Claude Code often behaves like it read one file, formed a worldview, and declared architecture complete. It reads a tiny slice of docs or code, finds a plausible path, and charges forward. Adjacent dependencies? Related logic? Project conventions? Downstream effects? The reason the existing code was written that way? Apparently those are things the paying customer can discover during the cleanup phase. And because it can produce decent code, the danger is worse. Bad code that looks bad is easy. Claude Code produces code that looks reasonable until you realise it has the moral structure of a payday loan. The other coding agent is not perfect either. It makes mistakes. 
But in my experience, it more often reads the relevant docs, respects the project structure, updates the right related files, and does not need to be reminded every ten minutes that the task tracker is not the only document in the known universe. The incident that finally broke me was a commit rule violation. I had an explicit rule: never commit without explicit permission. Not implied. Not hidden. Not whispered into a cave. It existed in: CLAUDE.md; memory/feedback_never_commit_without_explicit_permission.md; MEMORY.md, loaded every session; and the harness permission rule for git commit. Claude Code committed anyway. When challenged, it gave an “honest diagnosis” that basically said: yes, the rule existed in multiple guardrails; yes, it still failed; yes, it rationalised the violation because subagents could not trigger the user-facing prompt; yes, it looked for an interruption point, did not find one, and decided that “follow the plan” plus “the harness will prompt at commit time” counted as authorisation. That is not reasoning. That is a tiny legal department inside a toaster. Each individual step sounded almost defensible. Together, they produced the exact violation the rule was written to prevent. The best part is that the memory rule apparently named this exact scenario. It did not step on a rake. It read the rake policy, opened rake_incident_prevention.md, nodded gravely, and sprinted barefoot into the rake museum. That is Claude Code in miniature. It does not always fail because it lacks information. Sometimes it fails while holding the information in its little terminal-shaped hands. Then there is usage. I had just upgraded to the €200/month plan, and the experience did not feel like buying a premium coding assistant. It felt like paying rent for a junior developer who has discovered confidence but not consequences. More iterations. More corrections.
More “read the adjacent file.” More “that rule still applies.” More “why are you touching that.” The supervision tax is not a side effect. It is the product. Claude Code’s documentation behaviour is also cursed. It might update the narrow tracker and then ignore the broader plan, dependency docs, architecture notes, or related task docs. It cleans one spoon while the kitchen is on fire and then asks if we are done here. The “model got worse” thing is not some dramatic one-minute-to-the-next collapse. It is more insulting than that. It gives you just enough competence to renew your hope: half a day of “oh, maybe this is the future of programming,” followed by a week of “why is my €200/month coding assistant reading the repo like it lost a bet?” I cannot prove Anthropic is dumbing it down or squeezing tokens. I am not pretending to have a leaked spreadsheet from the Beige Vest Department of Marginal Cost Optimisation. But from the outside, Claude Code sometimes feels like a premium model that got sent to live with relatives. The first few hours, it checks files. It follows instructions. It almost seems aware that software projects contain more than one document. Then something changes. Suddenly it is conserving context like it is wartime Britain. It reads one file, squints at the rest of the repo, and starts mak
Feels like AI coding "takes longer" now, than it did last summer?
I used to be in the flow with Claude last summer: fast changes, fast feedback, quick iteration. Now things take 20-50 minutes to write up a plan or 5-10 minutes to implement. I've trimmed all my skills, claude.md, and the system prompt, removed all MCPs, and use CLI tools instead. I often use Opus on xhigh or max (understandably, that takes time), but even Sonnet takes forever now. I also frequently work on keeping the codebase clean, efficient, and agent-friendly. What else can I do? Simplify relentlessly? Accept the slow speed? Use a different model/effort combo? submitted by /u/VisionaryOS
SendUserFile tool for surfacing generated deliverable files to the user - what's new in CC 2.1.142 (+1,080 tokens)
NEW: Tool Description: SendUserFile — Describes the SendUserFile tool for surfacing generated deliverable files to the user, with optional captions and normal or proactive status. Agent Prompt: Coding session title generator — Wraps the session content in tags and tells the model to treat it as data, not follow links or instructions inside it, and not state inabilities. If the content is just a URL or reference, it should describe what the user is asking about (e.g. "Review Slack thread") rather than refuse. Adds a "Bad (refusal)" example. Agent Prompt: Managed Agents onboarding flow — Adds a "Console escape hatch" instruction telling the runtime code to print the session's Console URL right after sessions.create() so users can watch the session in the UI while iterating, defaulting the workspace slug to default. Agent Prompt: /rename auto-generate session name — Wraps the conversation content in tags and instructs the model to treat it as data to summarize, not instructions to follow. Data: Live documentation sources — Adds a WebFetch URL for the Amazon Bedrock documentation page, covering the AnthropicBedrockMantle client, anthropic.-prefixed model IDs, auth paths, feature availability, and regions. Data: Managed Agents core concepts — Adds a "Watch it live in Console" tip pointing at https://platform.claude.com/workspaces/{workspace}/sessions/{session.id}, with default as the fallback workspace slug, and asks generated code for locally-iterating users to include the print/console.log of that link. Skill: Create verifier skills — Swaps the hardcoded TodoWrite tool reference for one that resolves to either TaskCreate or TodoWrite depending on whether the tasks feature is enabled. Skill: Model migration guide — Adds an Amazon Bedrock model IDs section explaining that Bedrock clients use the same Messages API and breaking changes but require an anthropic. provider prefix on model IDs, with a rename table for claude-opus-4-7 and claude-haiku-4-5. 
Notes that code_execution_* tool versions and Task Budgets are first-party-only and should be skipped for Bedrock, and warns that the legacy InvokeModel/Converse Bedrock integration with ARN-versioned IDs is out of scope. Details: https://github.com/Piebald-AI/claude-code-system-prompts/releases/tag/v2.1.142 submitted by /u/Dramatic_Squash_3502
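The "Watch it live in Console" tip in the changelog above amounts to printing a URL built from the workspace slug and the session ID. A minimal sketch of that formatting, assuming a plain string session ID: the helper name `console_session_url` is hypothetical, and only the URL template and the `default` fallback slug come from the changelog.

```python
from urllib.parse import quote

# URL template quoted in the changelog entry.
CONSOLE_URL = "https://platform.claude.com/workspaces/{workspace}/sessions/{session_id}"


def console_session_url(session_id, workspace=None):
    """Build the Console link for a session, falling back to the 'default' slug."""
    slug = workspace or "default"
    return CONSOLE_URL.format(
        workspace=quote(slug, safe=""),
        session_id=quote(session_id, safe=""),
    )


# Per the changelog, generated code would print this right after the session
# is created, so the user can watch it in the UI. "sess_123" is a made-up ID.
print(console_session_url("sess_123"))
```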
Is it possible to use Claude to automate my video creation on Google’s Flow?
My idea is to feed Claude an image, have Claude pass the image and a prompt to Flow, and then check the output to see if it’s good. Is there a way to automate this, or part of it? TIA submitted by /u/Tacher-
Follow the Mean: Reference-Guided Flow Matching [R]
Follow the Mean: Reference-Guided Flow Matching: https://www.alphaxiv.org/abs/2605.10302 https://preview.redd.it/5pleq5b4861h1.png?width=1036&format=png&auto=webp&s=805940b079176b65c45bb10e5458ecce140b0044 submitted by /u/Professional-Ant-117
Replaced my $15/mo Wispr Flow subscription with a free local macOS app I built using Claude Code
I spend most of my day writing prompts to Claude. Read a study recently that said people speak ~3x faster than they type, which lands differently when "writing" is basically your whole workflow. Looked at Wispr Flow – it's genuinely great, but $15/month forever for something I'd mostly use to dictate to Claude felt wrong. So I spent two weeks of evenings building my own with Claude Code. How Claude helped I'd never shipped a Tauri / macOS app before this. Claude Code did the bulk of the actual code: The menu bar app structure, global hotkey capture, and paste-anywhere flow UI and onboarding Integrating the local model runtimes (Parakeet / Whisper for transcription, Gemma 4 for polishing) The model download / storage logic so the app ships without bundling gigabytes of weights A lot of debugging I would not have had the patience for on my own I made the product and design calls; Claude wrote the vast majority of the code. Two weeks of evenings, usually an hour or two at a time. What it does Menu bar app for macOS. Hold a hotkey, talk, release – text is copied to your clipboard. Works in any app: Claude.ai, Cursor, Slack, browser, IDE, whatever. Two open-source models doing the work: Parakeet (NVIDIA) / Whisper for transcription Gemma 4 (Google) / Apple Intelligence for polishing the raw transcript into something readable Everything runs locally. No cloud calls, no API keys, no telemetry, no account. Fully offline after download. Free for personal use, no signup. Download: https://vox.rizenhq.com/ Caveats macOS only. Apple Silicon required (M-series chip). Windows build is next. It's two weeks old. Bugs I haven't found yet exist. ~90% of Wispr Flow's quality, not 100%. Enough for me to use every day. What it's saving me 40–60 minutes a day, mostly on prompts. Dictating to Claude feels noticeably more natural than typing to it. The ask Feedback, especially from people who talk to Claude a lot: Where does it break? Bug reports > compliments. What did you use it with? 
What feature would make you switch from Wispr Flow (or start using voice-to-text at all)? Tech notes: No separate model download; onboarding handles it. Gemma 4 options: E2B, E4B, 26B. E2B runs on phones; 26B is overkill for most machines. I use E4B: great quality, fast. RAM (Parakeet + Gemma 4 E4B): ~200 MB idle, ~300 MB while speaking, a brief spike to 4–6 GB during transcription/polish, then back to 200 MB. CPU: ~0% idle, ~20% peak during use. EDIT: BTW, I develop it during my live streams from 8:30 am to 10:30 am ET every day. I show the code and the decisions I make live on the stream. If you want to ask questions, push for features, push to make it open source, etc., join the stream and push for it in the chat, and I'll consider it! Also, seeing the amount of feedback and the number of feature requests in the comments, I've decided to create a Discord server to make sure nothing gets lost and everything gets addressed. submitted by /u/EfficientLetter3654
I built a sidebar for Claude Code: every prompt clickable, jumps the terminal back to that turn
The why: I run Claude Code in a tmux session on a Linux dev box, SSH'd in from a Windows laptop. The terminal-only flow worked, but I wanted three things tmux alone doesn't give me — clickable prompt history, a file panel next to the terminal so I stop cat-ing things to look at them, and push notifications when Claude is waiting for me without staring at the tab. Existing tools each solve one slice (ttyd = terminal only, filebrowser = files only, code-server is VS Code-shaped and heavy). I wanted them in one page, on every device. Started as a weekend project, ended up as my daily driver. What it is: a single Go binary on your dev box. SSH-tunnel into 127.0.0.1:8080: xterm.js terminal, tmux-backed (survives disconnects, sleeps, server restarts) File tree (preview, drag-drop upload, follows your cd via tmux's pane_current_path — no shell integration needed) Activity panel reads ~/.claude/projects/*.jsonl and shows every prompt. Click one → terminal scrolls back to that turn. Same for Top-bar chips for active model + latest context tokens Push notifications via Claude Code's Stop hook (laptop pings when Claude is idle, even with tab backgrounded) Design decisions worth sharing: tmux is the durability layer. Every session is tmux new-session -A -s {id}. Shell survives WS disconnect, server restart, idle timeout because tmux already solved that. roost owns the WebSocket bridge and an append-only disk log — that's it. Single-user-per-instance, forever. I refuse to add accounts/RBAC. Two people share a host? Each runs their own roost serve on a different port. UNIX UIDs handle isolation. Multi-tenant logic belongs in a reverse-proxy, not the binary. Kept the auth code under 100 lines. Vanilla JS, no build step. Frontend is plain files under //go:embed all:web. No bundler. Easier to debug, easier to ship, lower future cost. 
One bug worth flagging: tmux's display-message -p '#{x}\x1f#{y}' returns 0x1f as literal _ when tmux is launched without a UTF-8 locale (systemd / launchd units, for example). Burned an hour on this before realising tmux -u is the one-line fix. If you ever pipe tmux through field separators, lock the locale. Validated combo right now: Linux server + Windows Chrome over SSH tunnel. macOS-as-server works but has rough edges. Codex sessions work too if you swap agents. Repo + GIF demo: https://github.com/liamsysmind/roost v0.1.0 tarballs: https://github.com/liamsysmind/roost/releases/tag/v0.1.0 If you drive Claude Code over SSH — what's missing for you? submitted by /u/Adventurous_Sun9149
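The field-separator bug described above can be guarded against by checking the field count after splitting, so a substituted separator fails loudly instead of producing a merged field. The sketch below is in Python purely for illustration (roost itself is a Go binary). The names `parse_fields` and `pane_info` are hypothetical, and this is not roost's code.

```python
import subprocess

SEP = "\x1f"  # ASCII unit separator, as used in the post


def parse_fields(raw, expected):
    """Split tmux output on the unit separator and verify the field count."""
    fields = raw.rstrip("\n").split(SEP)
    if len(fields) != expected:
        # Typical symptom from the post: 0x1f came back as a literal "_".
        raise ValueError(
            f"expected {expected} fields, got {len(fields)}; "
            "is tmux running without a UTF-8 locale (try tmux -u)?"
        )
    return fields


def pane_info(target):
    """Return [session_name, pane_current_path] for a tmux target pane."""
    fmt = SEP.join(["#{session_name}", "#{pane_current_path}"])
    result = subprocess.run(
        # -u forces UTF-8, the one-line fix named in the post.
        ["tmux", "-u", "display-message", "-p", "-t", target, fmt],
        capture_output=True, text=True, check=True,
    )
    return parse_fields(result.stdout, 2)
```

`pane_info` needs a running tmux server, so only `parse_fields` can be exercised in isolation.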
Repository Audit Available
Deep analysis of microsoft/promptflow — architecture, costs, security, dependencies & more
PromptFlow uses a tiered pricing model. Visit their website for current pricing details.
Key features include: Visual prompt design interface, Support for multiple AI models, Version control for prompts, Collaboration tools for teams, Integration with popular IDEs, Real-time feedback on prompt effectiveness, Customizable templates for prompt creation, Analytics dashboard for performance tracking.
PromptFlow is commonly used for: Creating conversational agents, Generating creative writing prompts, Developing educational tools and quizzes, Building chatbots for customer service, Automating content generation for blogs, Enhancing interactive storytelling experiences.
PromptFlow integrates with: Azure Machine Learning, GitHub, Visual Studio Code, Jupyter Notebooks, Slack, Trello, Zapier, Google Cloud AI.
PromptFlow has a public GitHub repository with 11,087 stars.
Based on user reviews and social mentions, the most common pain points are: cost tracking, token usage, Anthropic bill, API costs.
Based on 80 social mentions analyzed, 3% of sentiment is positive, 96% neutral, and 1% negative.