
The AI Toolkit for TypeScript, from the creators of Next.js.
The Vercel AI SDK receives high praise for its simplicity and effectiveness, as reflected in its consistently high ratings on platforms like G2. Users laud it for integration ease, particularly its ability to significantly reduce token usage. However, some concerns are mentioned regarding the obligatory use of the Responses API in the tool, which can feel limiting. Pricing information is not frequently discussed, but overall, the SDK enjoys a strong reputation for enhancing AI functionality and developer productivity.
Mentions (30d)
0
Avg Rating
4.8
20 reviews
Platforms
2
GitHub Stars
23,126
4,086 forks
The Vercel AI SDK receives high praise for its simplicity and effectiveness, as reflected in its consistently high ratings on platforms like G2. Users laud it for integration ease, particularly its ability to significantly reduce token usage. However, some concerns are mentioned regarding the obligatory use of the Responses API in the tool, which can feel limiting. Pricing information is not frequently discussed, but overall, the SDK enjoys a strong reputation for enhancing AI functionality and developer productivity.
Features
Use Cases
27,110
GitHub followers
227
GitHub repos
23,126
GitHub stars
20
npm packages
25
HuggingFace models
11,880,060
npm downloads/wk
8,419
PyPI downloads/mo
g2
What do you like best about Vercel?I use Vercel to deploy all of my websites and my clients' websites. I love how easy it is to use, with a clean and simple UI that makes navigation a breeze. The fast deployment makes everything efficient, allowing me to quickly implement changes that my clients request, which keeps them pleased. One of my favorite features is the instant rollback, which is invaluable for correcting mistakes swiftly without causing worry for myself or my clients. The initial setup was really easy, especially with the CLI tool that integrates seamlessly. Review collected by and hosted on G2.com.What do you dislike about Vercel?Honestly, I have nothing bad to say apart from it could be cheaper. Review collected by and hosted on G2.com.
What do you like best about Vercel?Vercel is a great tool for managing everything from deployments to analytics. It offers a wide range of features, including one-click deployments for our Next and React applications, which makes the overall workflow much smoother. Review collected by and hosted on G2.com.What do you dislike about Vercel?So far, there’s nothing about Vercel that I haven’t liked. Review collected by and hosted on G2.com.
What do you like best about Vercel?What I like most about Vercel is how simple it makes the entire deployment workflow. You push code, get a live deployment quickly, and can validate changes in preview environments without a lot of extra setup. It feels especially polished for frontend-heavy projects and for teams that want to move fast. I also appreciate that performance and visibility are built into the platform. Having analytics, speed insights, logs, and deployment details all in one place makes it much easier to spot issues early and keep improving the product without having to juggle a bunch of separate tools. Review collected by and hosted on G2.com.What do you dislike about Vercel?What I don’t like is that as a project grows, pricing and usage can start to feel a bit less predictable. Also, if you need very custom control over your infrastructure, Vercel can feel more opinionated than a fully self managed setup. Review collected by and hosted on G2.com.
What do you like best about Vercel?For me, it is easier to create/deploy project portfolios and connect it with github Review collected by and hosted on G2.com.What do you dislike about Vercel?It costs insanely and unpredictably high, making it unaffordable to students Review collected by and hosted on G2.com.
What do you like best about Vercel?Vercel has completely transformed how I deploy full-stack and AI-powered applications. As a Lead AI Engineer working with Next.js, React, and LLM workflow pipelines, the GitHub integration is flawless push to main and the app is live in under a minute. Preview deployments on every PR make client demos and stakeholder reviews effortless. Edge functions, environment variable management, and built-in CDN make it the perfect platform for production-grade applications like my Nexus LLM Workflow builder. Review collected by and hosted on G2.com.What do you dislike about Vercel?Pricing scales up quickly for teams with high bandwidth or serverless function usage. The free tier limitations on build minutes can be restrictive for active projects. More granular control over cold start behavior for serverless functions would be appreciated, especially for latency-sensitive AI applications. Review collected by and hosted on G2.com.
What do you like best about Vercel?The developer experience is genuinely hard to beat. I connected my GitHub repo and that was basically it every push deploys automatically, with preview URLs included. As a solo developer running a real production project, the Hobby plan gives you more than you’d expect. The firewall tooling is surprisingly mature for a free tier, Speed Insights and Analytics are built in without any extra setup, and the dashboard feels clean and intuitive. The documentation is some of the best I’ve encountered on any platform: thorough, well organized, and actually kept up to date. I briefly tried the Pro plan and loved it too, but even on its own the Hobby plan is already a serious offering. Overall, it’s clear the team cares about the product. Review collected by and hosted on G2.com.What do you dislike about Vercel?The biggest limitation of the Hobby plan is how restricted team collaboration is, along with some more advanced features being locked behind Pro. For a solo project it works well enough, but as soon as you want to bring someone else in and collaborate properly, the jump to Pro becomes hard to ignore, especially given the price difference. That said, the Pro tier does offer real value I just wish there were an in-between option. Review collected by and hosted on G2.com.
What do you like best about Vercel?The developer experience (DX) is unmatched. The git-push-to-deploy workflow and automatic SSL provisioning allow me to focus entirely on building features rather than managing infrastructure. The Preview Deployments are essential for testing UI changes in a live environment before merging to production, which significantly speeds up my iteration cycles. Review collected by and hosted on G2.com.What do you dislike about Vercel?The "Serverless Function Execution Timeout" on the Pro plan can be a bottleneck for heavier background tasks or complex API calls. Additionally, while the usage-based pricing for bandwidth and functions is fair, it can become unpredictable if a project experiences a sudden, unoptimized traffic spike, requiring close monitoring of the dashboard. Review collected by and hosted on G2.com.
What do you like best about Vercel?The best part is the creation of a subdomain for each connected branch, so I can easily see which branch an issue is coming from. That makes it easier to test that specific branch and then deliver the final build. It also connects with both GitLab and GitHub, and provides a CI/CD setup for builds within it, along with domain connection between them. Review collected by and hosted on G2.com.What do you dislike about Vercel?I’m fine with everything, but building on the basis of credit can sometimes be costly. Review collected by and hosted on G2.com.
What do you like best about Vercel?I like that Vercel just works. It makes storing data in buckets and Postgres stupidly easy, especially when using Supabase. Supabase also helps Auth0 for authentication play well with Vercel. Switching from AWS to Vercel fixed the hard provisioning and the pain of managing AWS, making things much smoother. The initial setup with Vercel was incredibly easy, just as simple as a single CLI command. Review collected by and hosted on G2.com.What do you dislike about Vercel?Incredibly expensive Review collected by and hosted on G2.com.
What do you like best about Vercel?I like how Vercel makes deployment easier. I appreciate the secure, high-performing, and easy deployment of our Next.js site. Https://www.exibify.com Review collected by and hosted on G2.com.What do you dislike about Vercel?Easy to use Review collected by and hosted on G2.com.
The Hybrid Method: how I split tasks between the chat (Claude.ai) and a background agent (Claude Code)
After a month of running this daily, I've settled on what I call the Hybrid Method: keep Claude.ai (the chat) as my only surface, and delegate engineering work in the background to Claude Code. The chat writes the engineering prompt, launches the executor, supervises through the filesystem and git log, and reports back without me ever opening a terminal. The piece I find most useful to share is the **allocation matrix** — which kind of work goes to which engine. Took weeks of measurement to stabilize. **Background agent (Claude Code) handles:** Large refactors across many files Tedious mechanical work (renaming patterns, applying fixes from a list) Anything that needs filesystem + git access without back-and-forth Tasks that take more than ~2 minutes of pure execution **Chat (Claude.ai) handles:** Architecture decisions and tradeoffs Reviewing the agent's diff and discussing the output Sprint planning while the agent runs the current sprint Quick edits where the round-trip to a background process is wasted Anything where the answer needs human reading anyway **The hand-off:** The chat writes a detailed prompt for the background agent (including a fail-fast spec and what to commit at the end). It launches `claude --headless --instruction "..."` as a subprocess via a small MCP bash bridge (~200 lines of Python using Anthropic's MCP SDK; community implementations exist too). Then it polls the git log and a status file every 30–60 seconds while I plan the next thing. When the agent finishes, the chat reads the diff and reports. **Why "hybrid":** The analogy is the hybrid car. Two engines with different load profiles. The chat is electric — instant startup, smooth low-load, great for transitions and decisions. The background agent is combustion — cold-start cost (5–15 seconds while it loads the project's memory file and explores the repo), but sustained throughput once running. They specialize, they hand off, the user never feels the seam. **What changes from running Claude Code alone:** Context-switching cost drops to near-zero — I never leave the chat session Strategic and execution work happen in parallel (the chat plans the next sprint while the current one runs) The chat acts as supervisor — better wired for high-level reasoning than the executor agent which is wired for action **Caveats:** This is the operator pattern Anthropic has documented elsewhere; the specific assembly (Claude.ai web as the chat + an MCP bash bridge + Claude Code as the executor) is what I haven't found written up specifically No sandboxing on personal hardware; if any of this ever runs on someone else's machine, careful sandboxing is non-negotiable The chat saturates beyond ~2 parallel background tasks — past that, the supervision quality drops Curious whether anyone else has converged on something similar, or what variations work for you. submitted by /u/Krycekk [link] [comments]
View originalFeels like AI tooling is evolving faster than developer experience lately give full pist content
Feels like AI tooling is evolving faster than developer experience lately Every week there’s a new framework, orchestration layer, observability tool, memory system, agent SDK, or infrastructure stack. The ecosystem is moving insanely fast, but sometimes it feels like the actual developer experience is becoming more complicated instead of simpler. Curious if others feel the same or if I’m just approaching things the wrong way. submitted by /u/Bladerunner_7_ [link] [comments]
View originalOpenAI Agents SDK Sandboxes: Which Provider Should You Actually Use?
submitted by /u/jitendra_nirnejak [link] [comments]
View originalBuilt an AI flat-finder in a weekend. Indian rental sites are 70% broker spam so I scraped Reddit instead.
Weekend build, ~10 hours. Demo: https://trurent-five.vercel.app/ Problem I was poking at: every major Indian rental site (NoBroker, MagicBricks, 99acres) is infested with brokers even when you filter "direct owner." Reddit actually has honest listings posted by owners themselves but the posts are completely unsearchable. So I built TruRent. You chat with it, it parses the query into a structured search, runs it, the map updates live, and follow-ups carry context. Ask "compare the top two" and the model reasons over the actual listings instead of just filtering. Stack and the boring decisions: Next.js 16 with raw fetch to Anthropic. No SDK, I wanted full control of the streaming loop Claude Haiku 4.5, not Sonnet. The task doesn't need Sonnet and Haiku is 5x cheaper Two tools only (search, get_details). Comparison and ranking happen in the model's prose, not as separate tools. More tools = more failure modes NDJSON to the browser, way easier than parsing SSE Scrape pipeline: PullPush API to pull Reddit posts, then Haiku again to extract structured listings from raw post text, Nominatim for geocoding Honest numbers: 1,412 posts scraped, ~600 passed a local pre-filter, only 131 ended up being real listings. Dataset is tiny but the pipeline is source-agnostic, swap the fetcher and the rest doesn't change. Most curious about: anyone else built agents where they deliberately used fewer tools and let the model reason over richer tool outputs instead of adding more tools? Happy to get into any of it. submitted by /u/Scary-Alternative-81 [link] [comments]
View originalAnthropic just bought the company that generates most production MCP servers
Anthropic acquired Stainless on Monday for a reported $300M+. Most coverage is framing this as a developer tools acquisition. Stainless is best known for generating the official Python and Node SDKs that ship with OpenAI, Google, Meta, Cloudflare, and Anthropic. The SDK story is real. The MCP side is the part that matters here. Stainless was one of the first vendors to extend their compiler to produce MCP servers from the same OpenAPI specs that produce their SDKs. MCP hit ~97M monthly SDK downloads by December 2025 and around 10,000 production servers by early 2026. A lot of that production code was Stainless-generated. Anthropic now owns the dominant MCP server generator. What actually changed hands on Monday: The engineering team. Roughly 40-50 people including founder Alex Rattray, who previously built Stripe's patented SDK generation system. Now reporting to Katelyn Lesse in Anthropic's Platform Engineering org. The technology. The generator, the templates, the language-specific runtimes, the OpenAPI extensions Stainless invented for SDK-specific edge cases. The hosted product is winding down. New signups stopped Monday. New SDK and MCP server generations stopped Monday. Existing customers keep what they've already generated but the pipeline is closed. My read: this is closer to what Google did with Kubernetes than to a normal acquisition. Anthropic created MCP. Anthropic donated MCP to the Linux Foundation last December. Anthropic now owns the dominant implementation toolchain. The protocol is vendor-neutral on paper. The implementation toolchain isn't. Six months of Anthropic M&A starts looking less coincidental: December 2025: Bun, the JS runtime, pulled into Claude Code February 2026: Vercept, computer-use AI April 2026: Coefficient Bio, ~$400M healthcare AI May 2026: Stainless, SDK and MCP plumbing They're not buying training infrastructure or GPU clusters. They're buying the integration layers around the model. The bet seems to be that frontier models are converging faster than anyone expected, so the moat is everywhere except the model. If you're building on MCP today, tooling quality probably improves. Stainless's generator was already the cleanest in the space and the team that built it is now at Anthropic. Patterns will standardize faster as Stainless-derived templates become the de facto reference. The flip side is concentration risk. Cloudflare's MCP server framework, Pulse MCP, and the open-source generators Stainless released during the transition all become strategically important if you want any diversity in your stack. Sources: Anthropic announcement Why Anthropic actually did this, and migration math Curious whether Stainless ending up inside Anthropic reads as good news (better tooling) or concentration risk (one company owns the standard and the reference implementation) from your seat. submitted by /u/Ok-Constant6488 [link] [comments]
View originalHow do you share Claude HTML artifacts with non-technical people?
I keep generating these awesome HTML/React artifacts with Claude (dashboards, mini-tools, visual reports) but I'm constantly stuck when it comes to actually sharing them with clients or colleagues. Current options I've tried, all annoying in some way: - Download and share to be opened into browser → people doesn't know they have to download it - Share Claude Url published artefact → Not really client friendly (AI is a monster) - Copy the code → they can't open it - Screenshot → loses interactivity - Github Pages / Vercel → too technical for most people - Tiiny.host → works but feels like a generic file host What's frustrating: if I need to fix a typo or tweak a number, I have to re-prompt Claude (which sometimes breaks other things) or edit code manually and re-upload. How are you handling this? Am I missing an obvious solution? submitted by /u/Hairy-Fisherman8008 [link] [comments]
View originalGlia – Local-first shared memory layer (SQLite-vec + FTS5 + Offline Knowledge Graph)
Hey everyone, I wanted to share a project I've been working on called Glia. It is a 100% offline, local-first RAG and memory layer designed to connect your AI web chats (Claude, ChatGPT, DeepSeek) with your local developer tools (Claude Code, Cursor, Windsurf) using a unified local database. I wanted something lightweight that did not require pulling heavy Docker containers or subscribing to third-party memory APIs. I settled on a Node.js + SQLite architecture running sqlite-vec (for 768-dim float32 embeddings) alongside SQLite FTS5 for hybrid search, powered completely by local Ollama instances. We just launched a live website that outlines the details and demonstrates the features in action: Website: https://glia-ai.vercel.app/ Codebase: https://github.com/Eshaan-Nair/Glia-AI Technical Stack & Features: Hybrid Search Retrieval: SQLite-vec (using nomic-embed-text locally) + FTS5 keyword prefix matching (porter stemmer). Surgical Sentence-level Trimming: Chunks are sliced into sentences. When a prompt is intercepted, only the exact matching sentences are pulled out of the vector store instead of the whole paragraph. It cuts LLM prompt bloat by ~90-95% in my benchmarks. Knowledge Graph Extraction: An offline task queue uses a local LLM (llama3.1:8b via Ollama) to extract entity triples (subject-relation-object). These are stored in a SQLite facts table (or Neo4j if you run the full Docker compose profile) and fused with the vector retrieval score. HyDE (Hypothetical Document Embeddings): Queries are pre-processed to generate a hypothetical answer, which is embedded together with the original query to bridge semantic gaps. Concurrency: Running SQLite in WAL (Write-Ahead Logging) mode allows the browser extension dashboard and active MCP sessions to read/write concurrently without locking. PII Redaction: Aggressive scrubbing of JWTs, API keys, emails, and IPs in the extension before data is saved. The extension works on Claude.ai, ChatGPT, DeepSeek, Gemini, Grok, and Mistral. The MCP server runs out of the same backend database for your terminal agent or Cursor. You can set it up with a single command: npx glia-ai-setup Glia is completely open-source (MIT). If you like the local-first approach or want to contribute to the SQLite vector pipeline, PRs are very welcome, and a star on GitHub helps the project get discovered! I would appreciate any feedback on the SQLite hybrid search scaling, the scoring fusion algorithm (RAG pipeline details are in RAG_PIPELINE.md), or local graph extraction performance. submitted by /u/Better-Platypus-3420 [link] [comments]
View originalBuilt an MCP for claude code that turns ticket-mentions into PRs with browser QA (and what I learned along the way)
notesasm is an MCP server you add to claude code. you mention a fix mid-flow ("make a ticket on notesasm: fix the regex for quoted emails") and it files the ticket. later, on your schedule, an autonomous agent picks the ticket up, writes the fix, runs real-browser QA against your preview deploy, and opens a PR with screenshots. closed alpha, free during it. demo + signup: notesasm.com the pain it solves (3 separate ones, actually): claude code is fast enough now that shipping isn't the bottleneck anymore. when you're deep in a feature and notice "the regex misses RFC-quoted local parts" or "the footer copy is wrong on mobile", you'd never break flow to open jira/linear or even write it down anywhere. so the idea goes nowhere. multiply by a year and your repo has invisible debt nobody's tracking. claude code helps while you're at the keyboard. it doesn't help while you sleep. your repo doesn't move overnight unless you stayed up to push it. for solo founders or small teams, that means losing 8 hours a day where you could be shipping if you had a way to delegate work to your own agent. and even if you do have something pushing code for you overnight, you lose context with AI-generated PRs and they usually need visual review. claude writes code that compiles and tests pass, but the actual rendered output might be subtly broken (or super broken lol). reviewing those visually is tedious and a lot of teams skip it, then ship regressions. how it works: you add the MCP server: claude mcp add notesasm --scope user --transport http -H "Authorization: Bearer ". BYOK style, the token comes from your dashboard. zero local install beyond the one command. then in any claude code session you can say "make a ticket on notesasm for this" (based on your conversation) and it just files it. the MCP server is HTTP-transport (not stdio), runs in the cloud, hits a fastapi backend that stores the ticket in postgres against your workspace. later (your schedule, your spend cap), a worker process picks up queued tickets. for each one: clones your repo with a github app installation token (commits look like asmnotes[bot], a verified author. bypasses vercel/netlify deploy protection that rejects unknown-team-member commits.) runs the claude agent sdk with your ticket body as the prompt. defaults to sonnet 4.6, opus 4.7 for hard tickets the user marks explicitly. agent reads the codebase, makes the edits, commits, pushes a branch, opens a PR via the github API. waits for your preview deploy to land. vercel polled by default, configurable probe URL for split frontend/backend setups like vercel + railway. QA agent drives a real chrome session on browserbase against the preview. stealth profile with residential proxies. takes before/after screenshots. verifies your acceptance criteria against the rendered output. if QA fails, the report feeds back into the build agent for up to 3 retry iterations before parking the ticket. final: PR with QA screenshots in the description, ready to merge. stack: - backend: fastapi + asyncpg + railway - frontend: vanilla html/js, no build step, vercel - agents: claude agent sdk (build), claude + browserbase (QA) - auth: clerk - email: resend (welcome, invite, feedback) - mcp transport: http (cloud-hosted, no local install) things i learned building it that other claude code folks might care about: - the build agent loves to spawn subagents via the Task tool. disable it explicitly in the system prompt or you get 4-minute hangs the SDK doesn't surface as errors. - browserbase sessions default to a ~5-min timeout. if your QA wall budget is anywhere near that, set the session lifetime explicitly to 1800s on session create (the timeout field). otherwise you get random "410 Gone" mid-run. - don't rely on the SDK's wall budget alone. add a per-message timeout (90s works) so a hung tool call doesn't silently burn your whole budget. - claude code's default mcp scope is per-cwd. always tell users `--scope user` in your install instructions, otherwise the MCP works in one repo and silently doesn't in others. - ResultMessage emissions happen multiple times per job if you have iteration loops (build + QA + qa-fix). sum them all when computing per-job cost, not just the last one. what's next: closed alpha is open. would love ~30 active users to try it out, all free during it. paid plans later this year with a permanent discount for alpha users. happy to answer anything about the MCP design, the QA verification loop, cost tracking, the agent-sdk integration, or anything else. demo + signup: notesasm.com submitted by /u/FormExtension7920 [link] [comments]
View originalLLM-Rosetta — format conversion library across LLM API standards, doubles as a proxy
This started because we had a proprietary internal LLM API that spoke none of the standard formats. Built an internal conversion layer to bridge it, maintained that for over a year. As colleagues started adopting more and more coding tools — Claude Code, opencode, Codex, VS Code plugins, Goose, and whatever came out that week — each with its own API format expectations, maintaining separate adapters for each became the actual problem. That's what pushed the internal conversion layer into a proper generalized design, and llm-rosetta is the result. It's a Python library that converts between LLM API formats — OpenAI Chat, Responses/Open Responses, Anthropic, and Google GenAI. The idea is you convert through a shared IR so you don't end up writing N² adapters. The key difference from LiteLLM: LiteLLM is a unified calling layer that takes OpenAI-style input and transforms it into provider-native requests — one direction. llm-rosetta uses a hub-and-spoke IR, so each provider only needs one converter, and you get any-to-any conversion for free. Anthropic → Google, OpenAI Chat → Anthropic, whatever direction you need. Use it as a library — pip install and call convert() directly, no server needed. Or run the gateway if you want a proxy that handles the format translation for you. Zero required runtime dependencies either way. The HTTP server, client, and persistence layer are vendored from zerodep (https://github.com/Oaklight/zerodep), another project of mine — stdlib-only single-file modules, not someone else's library repackaged. The gateway ships with a Docker image if you'd rather not deal with Python env setup. You can also deploy it on HuggingFace Spaces or anything similar — admin panel, dashboard, request log, config management all included. Screenshots: https://llm-rosetta.readthedocs.io/en/latest/gateway/admin-panel/ We've been running it in production for about 5 months as the conversion layer for an internal multi-model access platform — needed to support various API standards and coding tool integrations before the upstream APIs were fully standardized. The Responses converter passes all 6 official Open Responses compliance tests (schema + semantic) from the spec repo. So if you're running Ollama, vLLM, or LM Studio with Responses endpoints, it should just work as one side of the conversion. There's a shim layer for provider-specific quirks — built-in shims for OpenRouter, DeepSeek, Qwen, xAI, Volcengine, etc. Converters stay generic per API standard, shims handle the edge cases declaratively. 24 cross-provider examples in the repo covering all provider pairs, SDK + REST, streaming, tool calls, image inputs, multi-turn with provider switching mid-conversation. GitHub: https://github.com/Oaklight/llm-rosetta Docs: https://llm-rosetta.readthedocs.io arXiv: https://arxiv.org/abs/2604.09360 Gateway screenshot: https://preview.redd.it/qzzjr2dcdw1h1.png?width=949&format=png&auto=webp&s=bce4293aae81059f794909fc37f85071cee34378 submitted by /u/Oaklight_dp [link] [comments]
View originalHow would you build a conversational control layer for client/brand workflows?
I’m working on a system for managing AI workflows across different brands/clients and I’m trying to figure out the best architecture before I build too much. The rough idea: I’d have a dashboard where each client has: workspaces agents/workflows run history outputs analytics approvals But I also want a conversational interface where I can talk to the system and trigger actions like: “Show me what changed for Client A this week” “Run the SEO report for Client B” “Add a cold email workflow to this client” “Summarize failed agent runs” “Create a GitHub issue/PR for this workflow change” “Draft the monthly client report” The part I’m unsure about is where this conversational layer should live. Options I’m considering: Slack bot Good for teams, approvals, internal notifications, and client-facing workspaces later. Telegram bot Fast, simple, mobile-first, easier for me to use as an operator command center. Chat panel inside the web dashboard More controlled, better permissions, easier to connect directly to client/workflow state. Some combination For example: dashboard chat as the main interface, Telegram for quick commands, Slack later for team/client collaboration. The backend would probably be something like: Vercel for the dashboard Railway or similar for the API/orchestrator Postgres for state GitHub for code/config changes LLM API for reasoning background workers for workflow runs The main thing I need help with: How would you design the communication layer between the conversational bot and the actual deployed workflows? For example: Should the bot directly call workflow APIs? Should it create jobs in a queue? Should every action require approval first? Should Slack/Telegram only be a thin command layer while the dashboard/database remains the source of truth? How would you handle permissions, audit logs, and avoiding accidental production changes? I’m not looking to promote anything. I’m trying to avoid building the wrong architecture early. If you’ve built internal tools, AI agents, workflow automation, Slack bots, Telegram bots, or client dashboards, what setup would you choose? submitted by /u/SeNorMat [link] [comments]
View originalMade a tool that tells you what your AI agent actually did to your codebase
After a few incidents, a hardcoded key here, a DEBUG=True there, I started auditing my sessions more carefully. Eventually I just automated it. shipcheck reads your Claude Code or Cursor session logs and gives you a cost breakdown, a heatmap of which files the agent touched most, and a security scan of anything it introduced. Runs in under a second, fully offline. The rule that's caught me the most: hallucinated package imports. Claude regularly writes u/anthropic/sdk when the real package is u/anthropic-ai/sdk. Subtle enough to miss in review, breaks at install time. https://www.shipcheck.space/ submitted by /u/mr_vengeance_72 [link] [comments]
View originalAm I stupid for pivoting to Transparency with Agents over Memory after 6 months?
built an open source memory layer for ai agents. thought the obvious feature people would care about was persistent memory across restarts and shared memory between agents. that was the whole pitch. few months of actual user data in. most of the api calls aren't about memory at all. they're hitting the audit trail (what did the agent do and when), the loop detector (catching when an agent is stuck doing the same thing 20 times in a row), and the per-agent performance dashboard (which agent is wasting tokens, which one keeps crashing, who's drifting off goal). basically people don't really care that their agent remembers stuff across restarts. they care that they can see what it did and pull the plug when it goes off the rails. so i'm wondering if i should just flip the pitch. lead with "observability and accountability for ai agents" instead of "memory for ai agents". memory is table stakes at this point and mem0/zep already dominate that framing. loop detection + audit trail + performance scoring per agent feels like open territory. am i stupid? or is this the obvious move i somehow missed for 3 months submitted by /u/DetectiveMindless652 [link] [comments]
View originalIs the new monthly Agent SDK credit applicable for the VS Code extension?
Source Could not find an answer to that, and knowing Anthropic - I wouldn't be surprised if there isn't one. submitted by /u/arvigeus [link] [comments]
View originalAnthropic was supposed to be different. They're not anymore.l.
Paying Max subscriber here, building agent orchestration on top of claude -p and the Agent SDK. So this week's announcement directly hits what I'm working on. Over the last few months, Anthropic has moved like this: Jan 9: server-side block against OAuth tokens used outside Claude.ai and the Claude Code CLI. OpenClaw, OpenCode, Goose, Roo Code - all broken instantly. No real announcement, just an error message. Feb 19: legal docs quietly updated. Agent SDK now needs an API key. A new phrase appears: "ordinary, individual usage." Anthropic staff jump on X to say "nothing is changing." Docs say what they say. April 4: full ban on third-party agents using subscription credentials. Fair point on their side - some people were running 24/7 bots on a $200 plan burning thousands in tokens. But the rollout was rough and the comms were rougher. April 21: someone notices Claude Code is gone from the Pro plan on the pricing page. Support docs changed too. After the backlash, Anthropic calls it a "2% test of new prosumer signups." Reverted in 24 hours, but the trial balloon got popped. May 13: reversal. claude -p and the Agent SDK come back, but now under a separate credit pool that matches your plan price 1:1 - $20 / $100 / $200. Non-rollover. Billed at API rates. Effective June 15. If you were running real automation on Max, your effective inference value just dropped on the order of 25-40x by what the community is calculating. In the background: spring outages and quota tightening, and last fall's privacy pivot where consumer chat training defaulted on. Opt-out exists, but retention went from 30 days to 5 years for anyone who didn't opt out. Here's what's been bothering me. A lot of us paid Anthropic specifically because of the positioning. The lab that does things differently - safety-first, transparency-first, the responsible alternative to whoever else you thought was extracting from users at every turn. I knew part of it was marketing. The operational behavior backed it up, though. For a while. What's happening now is the playbook of every other AI company. Quiet doc edits. Three policy flips in two months. A 25-40x devaluation framed as a "simplification" and a "perk." Staff on X publicly contradicting their own docs in the same week. The vocabulary has shifted from "here's what we're building" to "here's what we're clarifying" - and that shift is the tell. Could be capacity panic from a company that grew faster than its infrastructure. Could be something quieter - if model improvements get harder to differentiate, business growth has to come from somewhere, and "somewhere" usually means tightening on the customers you already have. I don't know which one it is. What I do know is that the lab that sold itself as the alternative is now running the same playbook. Anyone else reading it this way? submitted by /u/rmmadl [link] [comments]
View originalAnthropic built the agentic features. Now they're billing them separately.
Starting June 15, Claude subscribers get a separate monthly credit for Agent SDK and claude -p usage: $200/mo for Max 20x, $100 for Max 5x, $20 for Pro. Once you burn through it, programmatic usage stops unless you've opted into extra usage billing at API rates. Your interactive Claude Code and chat usage stays on the subscription pool, untouched. I spent the last day digging into the community reaction across Reddit, GitHub, HN, and tech press. Tracked roughly 120 distinct opinions. Here's what I found. The sentiment split About 60% negative (credit is too small, feels like a value regression) About 25% pragmatic ("this was inevitable, the old model was broken") About 15% neutral to supportive ("interactive use is untouched, this is fair") Theo Browne (T3.gg) put it bluntly: anyone using T3 Code, Conductor, Zed, or claude -p in CI scripts had their effective usage cut by 25x. He said he now has to make the Claude Code experience on T3 Code "significantly worse." Ben Hylak (co-founder of Raindrop.ai) responded: "This is either really silly, or shows how bad of a spot Anthropic is in re: GPUs." Theo also said: "Framing this as a free credit instead of a regression for users is wild." That tracks with what I'm seeing across the threads. The telco parallel This follows the exact playbook telcos used with "unlimited" data plans. Sell unlimited. Watch users actually use it. Introduce a Fair Usage Policy that throttles heavy users. Continue marketing the plan as unlimited. Anthropic marketed Claude Code as an all-in-one agentic platform. They shipped Routines, /goal, /loop, scheduled tasks, and cloud sessions as headline features. Users adopted those patterns. Then the compute math didn't work out, and instead of solving the infrastructure problem, they drew a billing boundary inside their own product. Where the telco analogy breaks: Anthropic is capacity-constrained in ways telcos never were. They're spending aggressively on compute, and the resource contention isn't fabricated. But resource contention is an infrastructure problem, not a billing problem. And as we'll see, Anthropic did build the infrastructure to solve it. The question is why claude -p doesn't benefit from it. The contradiction that cuts deepest Here's what most people haven't articulated yet. Anthropic's product roadmap over the last 3 months has been aggressively agentic: Routines (cloud-hosted, schedule/webhook/GitHub triggers, no human in the loop) /goal (autonomous execution with minimal input) /loop (persistent in-session repetition) Scheduled tasks (desktop recurring prompts) Agent View (multi-session monitoring dashboard) Remote Control (manage sessions from phone) Every one of these features trains users to treat Claude Code as an always-on autonomous system. Anthropic productized exactly the usage pattern that the "you should use the API" crowd says doesn't belong on a subscription. But here's the catch. Routines draw from your regular subscription pool. claude -p doing the same work draws from the new capped credit. The billing line isn't "interactive vs agentic." It's "first-party agentic vs everything else." claude -p is the unix-philosophy composable interface for Claude Code. Penalizing users for calling the same primitive directly instead of wrapping it in Anthropic's GUI is anti-composability. If it were purely about cost management, Routines would also draw from the SDK credit. They don't. The distinction is about who controls the agent runtime. Then there's Managed Agents, Anthropic's API-side agent harness that entered public beta in April. Fully hosted runtime with cloud containers, built-in tools, and prompt caching baked in. API billing, pay-as-you-go. So now there are three tiers: Tier 1: Routines (subscription). Anthropic-hosted, flat-rate. They control the runtime, they optimize caching. Tier 2: Agent SDK / claude -p (credit). Your runtime, your code. Hard-capped. Caching APIs exist but you're on your own to implement them. Tier 3: Managed Agents (API). Anthropic-hosted again. Pay-as-you-go, but with full caching and compaction. Tiers 1 and 3, where Anthropic controls the runtime, get either flat-rate billing or optimized infrastructure. Tier 2, where you control the runtime, gets the worst deal. The strategy isn't "interactive vs programmatic." It's "managed vs unmanaged." The credit system is the squeeze play pushing you toward one of their managed options. Here's the nuance: prompt caching IS publicly available via the API. Agent SDK developers can use it. Cache reads cost 10% of base input token price. The optimization isn't gated behind Managed Agents. So why did third-party tools burn so many tokens? Many were unoptimized for Anthropic's caching compared to first-party tools. That resource contention was partly a third-party engineering gap. But that raises the obvious question: claude -p is Anthropic's own tool. They could bake caching into its runtime the same way they
View originalRepository Audit Available
Deep analysis of vercel/ai — architecture, costs, security, dependencies & more
Vercel AI SDK uses a tiered pricing model. Visit their website for current pricing details.
Vercel AI SDK has an average rating of 4.8 out of 5 stars based on 20 reviews from G2, Capterra, and TrustRadius.
Key features include: The Framework Agnostic AI Toolkit, Scale with confidence.
Vercel AI SDK is commonly used for: Building AI chatbots with persistence, Creating multi-modal chat applications, Developing Slackbots for direct message responses, Integrating natural language processing with PostgreSQL databases, Implementing long-running AI agents that can suspend and resume, Generating structured objects and tool calls with LLMs.
Vercel AI SDK integrates with: OpenAI, AWS Lambda, Slack, PostgreSQL, React, Next.js, Vue, Svelte, Node.js, GitHub.
Jerry Liu
CEO at LlamaIndex
1 mention
Vercel AI SDK has a public GitHub repository with 23,126 stars.
Based on user reviews and social mentions, the most common pain points are: API bill, token usage, cost tracking.
Based on 69 social mentions analyzed, 23% of sentiment is positive, 75% neutral, and 1% negative.