Users appreciate AutoGen for its innovative AI capabilities and powerful automation features, which streamline complex workflows efficiently. However, some criticism revolves around its lack of comprehensive documentation and occasional bugs, which can hinder usability. The pricing is generally perceived as reasonable, especially considering its robust feature set compared to competitors. Overall, AutoGen has a positive reputation for being a solid choice for tech-savvy users seeking advanced AI solutions despite some areas needing improvement.
Mentions (30d)
0
Reviews
0
Platforms
4
GitHub Stars
56,499
8,492 forks
Users appreciate AutoGen for its innovative AI capabilities and powerful automation features, which streamline complex workflows efficiently. However, some criticism revolves around its lack of comprehensive documentation and occasional bugs, which can hinder usability. The pricing is generally perceived as reasonable, especially considering its robust feature set compared to competitors. Overall, AutoGen has a positive reputation for being a solid choice for tech-savvy users seeking advanced AI solutions despite some areas needing improvement.
Features
Use Cases
Industry
information technology & services
Employees
3
116,169
GitHub followers
7,713
GitHub repos
56,499
GitHub stars
20
npm packages
40
HuggingFace models
81
npm downloads/wk
189,562
PyPI downloads/mo
EVAL #004: AI Agent Frameworks — LangGraph vs CrewAI vs AutoGen vs Smolagents vs OpenAI Agents SDK
Every week there's a new AI agent framework on Hacker News. The GitHub stars pile up, the demo videos...
View originalAm I stupid for pivoting to Transparency with Agents over Memory after 6 months?
built an open source memory layer for ai agents. thought the obvious feature people would care about was persistent memory across restarts and shared memory between agents. that was the whole pitch. few months of actual user data in. most of the api calls aren't about memory at all. they're hitting the audit trail (what did the agent do and when), the loop detector (catching when an agent is stuck doing the same thing 20 times in a row), and the per-agent performance dashboard (which agent is wasting tokens, which one keeps crashing, who's drifting off goal). basically people don't really care that their agent remembers stuff across restarts. they care that they can see what it did and pull the plug when it goes off the rails. so i'm wondering if i should just flip the pitch. lead with "observability and accountability for ai agents" instead of "memory for ai agents". memory is table stakes at this point and mem0/zep already dominate that framing. loop detection + audit trail + performance scoring per agent feels like open territory. am i stupid? or is this the obvious move i somehow missed for 3 months submitted by /u/DetectiveMindless652 [link] [comments]
View originalAndroid Auto gets a massive AI-powered upgrade with YouTube, Dolby Atmos, and immersive 3D Maps | Google’s next-gen in-car software is getting smarter and slicker
submitted by /u/ControlCAD [link] [comments]
View originalI made a Claude skill that stops it from cloning whole repos when I just want one function
Kept hitting the same friction with Claude Code. I'd point at a GitHub repo and say "look at how this handles agent handoffs" — meaning, borrow the idea. Claude would git clone the whole repo, read 50 files, and ask which __init__.py was interesting. Or worse — it'd add the library to my package.json as a dependency. For one function. Suddenly I own the transitive deps, the CVE notifications, and a version pin I'll never upgrade. The actual problem: "use this library", "borrow an idea from this library", and "just steal that one function" deserve totally different workflows, and nothing was telling Claude which one I meant. So I wrote a skill — a single SKILL.md (surgical-github-extraction) that auto-triggers when I drop a GitHub URL as inspiration. The rule: Read the README first to get the shape. Pull 1–3 source files via raw URLs to see how the pattern is wired — prompts, schemas, the orchestration file. Never the whole repo. Pin to a commit SHA, save to /tmp (or %TEMP% on Windows). Lift the smallest useful unit — a function, a prompt, or just the pattern. Rewrite in your style. Cite the source SHA. Two concrete cases this week: Pointed it at TradingAgents (a multi-agent trading repo) asking "can we use this pattern for a job-applier?" → README plus a few agent/prompt files, proposed an analogue (JobFitAnalyst + Critic arguing against). Nothing copied into my project. Asked it to "steal the exp backoff from litl/backoff" → fetched one file (_wait_gen.py), extracted the 8-line generator, rewrote inline in my style with a provenance comment. No pip install. Sibling skill: code-graft — for when a one-off snippet isn't enough but a runtime dep is too much. Vendor only the slice of a library you use into your project, trim the rest, re-sync selectively from upstream. Think "I want one tokenizer out of HuggingFace transformers without the 2GB." Why a Skill and not an MCP: Pure discipline on tools Claude already has (WebFetch, curl, gh, Read). MCPs ship new tools; Skills ship instructions. Same shape as Anthropic's own mcp-builder — that's a Skill, not an MCP. MIT-licensed, single file install: `mkdir -p ~/.claude/skills/surgical-github-extraction` curl -fsSL https://raw.githubusercontent.com/jeet-dhandha/jd-skills/main/skills/surgical-github-extraction/SKILL.md \ -o ~/.claude/skills/surgical-github-extraction/SKILL.md Both skills (jd-skills collection): https://github.com/jeet-dhandha/jd-skills Curious if anyone has hit this and solved it differently — especially failure cases where the skill picks the wrong path (concept vs. snippet vs. full vendor). Issues welcome. submitted by /u/hone_coding_skills [link] [comments]
View original5 enterprise AI agent swarms (Lemonade, CrowdStrike, Siemens) reverse-engineered into runnable browser templates.
Hey everyone, There is a massive disconnect right now between what indie devs are building with AI (mostly simple customer support chatbots) and what enterprise companies are actually deploying in production (complex, multi-agent swarms). I wanted to bridge this gap, so I spent the last few weeks analyzing case studies from massive tech companies to understand their multi-agent routing logic. Then, I recreated their architectures as runnable visual node-graphs inside agentswarms.fyi (an in-browser agent sandbox I’ve been building). If you want to see how the big players orchestrate agents without having to write 1,000 lines of Python, I just published 5 new industry templates you can run in your browser right now: 1. 🛡️ Insurance: Auto-Claims FNOL Triage Swarm Inspired by: Lemonade’s AI Jim, Tractable AI (Tokio Marine), and Zurich GenAI Claims. The Architecture: A multimodal swarm where a Vision Agent assesses uploaded images of car damage, a Policy Agent cross-references the user's coverage database, and a Fraud-Detection Agent flags inconsistencies before routing to a human adjuster. 2. ⚙️ Manufacturing: Quality / Root-Cause Analysis Swarm Inspired by: Siemens Industrial Copilot, BMW iFactory, Foxconn-NVIDIA Omniverse. The Architecture: A sensor-data ingest node triggers a diagnostic swarm. One agent pulls historical maintenance logs via RAG, while a SQL Agent queries the parts database to identify failure patterns on the assembly line. 3. 🔒 Cybersecurity: SOC Alert Triage & Response Inspired by: Microsoft Security Copilot, CrowdStrike Charlotte AI, Google Sec-Gemini. The Architecture: The ultimate high-speed parallel routing swarm. When an anomaly is detected, specialized sub-agents simultaneously investigate IP reputation, analyze the malicious payload, and draft an incident response ticket for the human SOC analyst to approve. 4. 📚 Education: Adaptive Socratic Tutor & Auto-Grader Inspired by: Khan Academy Khanmigo, Duolingo Max, Carnegie Learning LiveHint. The Architecture: A strict "No-Direct-Answers" routing loop. The Student Agent interacts with the user, but its output is constantly evaluated by a hidden "Pedagogy Agent" that ensures the AI is guiding the student to the answer via Socratic questioning rather than just giving away the solution. 5. 📦 Retail/E-commerce: Returns & Reverse-Logistics Swarm Inspired by: Walmart Sparky, Mercado Libre, Shopify Sidekick. The Architecture: A logistics orchestration loop that analyzes a customer return request, checks inventory levels in real-time, determines if the item should be restocked or liquidated (based on shipping costs vs. item value), and autonomously issues the refund. How to play with them: You don't need to spin up Docker containers or wrangle API keys to test these architectures. You can load any of these 5 templates directly into the visual canvas, see how the data flows between the specialized nodes, and try to break the routing logic yourself. Link: https://agentswarms.fyi/templates submitted by /u/Outside-Risk-8912 [link] [comments]
View originalI built a Pokémon-styled multi-agent dashboard to manage all Claude Code sessions
Like many others here, I got frustrated with managing all my different claude/codex sessions, so i built Pokegents, which is an open source multi-agent workspace for coding agents. It has a Pokemon-themed dashboard/chat interface plus a local orchestration server for managing agent sessions (currently supports Claude Code in iTerm2, plus Claude and Codex through ACP-based chat runtimes), persistent agent identities, mcp messaging between agents, notifications, session cloning, and more. This was mostly a vibe-coded side project, but I've been using it constantly in my day-to-day workflow as an engineer, and its helped me parallelize a lot of my work. My coworkers make fun of me because it looks like I'm just playing Pokemon all day haha. I made it open source and sharing in case it might be useful or just fun for anyone to use (links in comment below). submitted by /u/girishkumama [link] [comments]
View originalAlien Pinball Postmortem - How I made a full physics pinball game with Claude
Postmortem: Alien Pinball — built with Claude + ChatGPT + Suno + LittleJS Just shipped a browser pinball game. Short writeup of the AI workflow in case it's useful here. The game — Full physics pinball: multiball, an A-L-I-E-N rollover multiplier (caps at 5x), skill shots, escalating combos, outlane gutter saves, and a wizard-mode centipede boss you fight while juggling 3 balls. Browser, mobile-friendly, no install. Play it: https://focaccai.itch.io/alien-pinball Setup. Claude Code Max, Opus model for the heavy lifting. Roughly half my input was via speech-to-text — talking at the codebase rather than typing — the other half was typing plus a lot of manual code editing. It genuinely felt co-developed rather than code-generated: describe what I want, riff with Claude, dive in by hand to steer or clean up. Tool stack Code: Claude. All game logic, custom Box2D parts (slingshots, drop targets, spinners, ramps, ball locks, break targets), plus a full in-game table editor I built so I could drag/place/tune every part visually. Reusable for future pinball games. Art: ChatGPT image gen. I had Claude write the image prompts too. Music: Suno 5.5 — three tracks, lots of iteration to find the right vibe. Claude wrote the music prompts. Sounds: ZzFX — every sound generated procedurally at game start, no audio files. Claude tuned the parameters by ear-by-ear iteration. This combo was a joy with AI. Engine: LittleJS + Box2D WASM. Small, fast, AI handles it beautifully — minimal API surface, no framework ceremony to wade through. The art trick that actually worked. I exported a silhouette of the collision geometry (walls, ramps, bumpers, drop targets — exact positions) and handed it to the image generator with: "create an alien-themed pinball playfield that exactly matches this silhouette." Took many generations plus manual compositing — stitching the best parts from different outputs — but conceptually it nailed the brief on the first try. The art lines up with the physics because the physics is the prompt. Co-developed, not just code-generated. A bunch of design ideas came from the AI. The bumpers being giant eyeballs? Came out of an image gen, I just ran with it. I also kept asking Claude pinball-specific design questions ("what does a complete pinball table have?", "how should wizard mode work?", "what's missing here?"). I have plenty of video gamedev experience but very little pinball-specific, and Claude was a useful domain consultant for filling in genre conventions and sanity-checking the system. Things that came together easily: The alien centipede boss — multi-segmented, loses tail segments as you hit it, speeds up and turns red. Worked basically first try. An AI debug player that auto-flips and knocks the ball around. Not great, but good enough to flip on and watch while I think. Surprisingly useful — you get ideas just watching the machine play your machine. What still needed me: feel. Restitution values, flipper torque, ramp curvature, slingshot kick angles, peg bounce. The git log has an embarrassing number of "tweak peg bounce" / "1.49 → 1.491" commits. The model can write the system; a human still has to sit there bouncing balls until it feels right. The polish tail is brutal. Last week of commits is sound passes, ramp angles, message priorities, and a multiball end-check race condition. All small. None optional. Budget for it. Happy to answer workflow / Claude / LittleJS questions in the comments. submitted by /u/Slackluster [link] [comments]
View originalAlien Pinball Postmortem - How I made a full physics pinball game with AI tools
Postmortem: Alien Pinball — built with Claude + ChatGPT + Suno + LittleJS Just shipped a browser pinball game. Short writeup of the AI workflow in case it's useful here. The game — Full physics pinball: multiball, an A-L-I-E-N rollover multiplier (caps at 5x), skill shots, escalating combos, outlane gutter saves, and a wizard-mode centipede boss you fight while juggling 3 balls. Browser, mobile-friendly, no install. Play it: https://focaccai.itch.io/alien-pinball Setup. Claude Code Max, Opus model for the heavy lifting. Roughly half my input was via speech-to-text — talking at the codebase rather than typing — the other half was typing plus a lot of manual code editing. It genuinely felt co-developed rather than code-generated: describe what I want, riff with Claude, dive in by hand to steer or clean up. Tool stack Code: Claude. All game logic, custom Box2D parts (slingshots, drop targets, spinners, ramps, ball locks, break targets), plus a full in-game table editor I built so I could drag/place/tune every part visually. Reusable for future pinball games. Art: ChatGPT image gen. I had Claude write the image prompts too. Music: Suno 5.5 — three tracks, lots of iteration to find the right vibe. Claude wrote the music prompts. Sounds: ZzFX — every sound generated procedurally at game start, no audio files. Claude tuned the parameters by ear-by-ear iteration. This combo was a joy with AI. Engine: LittleJS + Box2D WASM. Small, fast, AI handles it beautifully — minimal API surface, no framework ceremony to wade through. The art trick that actually worked. I exported a silhouette of the collision geometry (walls, ramps, bumpers, drop targets — exact positions) and handed it to the image generator with: "create an alien-themed pinball playfield that exactly matches this silhouette." Took many generations plus manual compositing — stitching the best parts from different outputs — but conceptually it nailed the brief on the first try. The art lines up with the physics because the physics is the prompt. Co-developed, not just code-generated. A bunch of design ideas came from the AI. The bumpers being giant eyeballs? Came out of an image gen, I just ran with it. I also kept asking Claude pinball-specific design questions ("what does a complete pinball table have?", "how should wizard mode work?", "what's missing here?"). I have plenty of video gamedev experience but very little pinball-specific, and Claude was a useful domain consultant for filling in genre conventions and sanity-checking the system. Things that came together easily: The alien centipede boss — multi-segmented, loses tail segments as you hit it, speeds up and turns red. Worked basically first try. An AI debug player that auto-flips and knocks the ball around. Not great, but good enough to flip on and watch while I think. Surprisingly useful — you get ideas just watching the machine play your machine. What still needed me: feel. Restitution values, flipper torque, ramp curvature, slingshot kick angles, peg bounce. The git log has an embarrassing number of "tweak peg bounce" / "1.49 → 1.491" commits. The model can write the system; a human still has to sit there bouncing balls until it feels right. The polish tail is brutal. Last week of commits is sound passes, ramp angles, message priorities, and a multiball end-check race condition. All small. None optional. Budget for it. Happy to answer workflow / Claude / LittleJS questions in the comments. submitted by /u/Slackluster [link] [comments]
View originalI built a hands-free voice AI that sends emails mid-conversation — and that's just one feature. Here's everything AskSary can do.
https://reddit.com/link/1symbsj/video/k2no3zfgq1yg1/player Been building AskSary solo for a while. Just shipped hands-free voice email - you're mid-conversation with an AI and you say "send an email to [john@example.com](mailto:john@example.com) subject X body Y" and it pre-fills the Gmail modal automatically. One tap sends. Powered by OpenAI Realtime API, works in 22 languages. But that's just the latest feature. Here's the full picture: Every major model in one place GPT-5-Nano, GPT-5.2, GPT-5.2 Pro, O1 Reasoning, Claude Sonnet 4.6, Grok 4, Gemini 2.5 Flash, Gemini 3.1 Pro, Gemini Ultra, DeepSeek V3, DeepSeek R1 - with smart auto-routing or manual override. Pro-Active Personalisation On every login the AI reads your previous conversations and sends the first message itself - asking if you want to continue or start fresh. Before you type a single word. Persistent Cross-Model Memory Start a conversation with Claude on your phone, open your laptop, switch to GPT-5.2 - it already knows what you discussed. No copy-pasting, no summaries. Just works. Knowledge Base - RAG Upload docs up to 500MB per file, unlimited uploads, chat with them across any model via OpenAI Vector Store. Your files stay in context forever. Integrations Google Drive, Gmail, Google Calendar, Notion - access files, get email and calendar summaries, use them in chat or push them to your Knowledge Base. Generation Tools Image Gen - GPT-Image-1 and Nano Banana Pro Flux Image Editor - full editing suite with visual history Video Studio - Luma Dream, Veo 3.1, Kling 1.6 / 2.6 / 3, up to 10 second AI videos with audio Music Studio - 30 second tracks with custom or AI lyrics via ElevenLabs, visualizer built into chat 3D Model Studio - Meshy with STL export (deploying soon) Video Analysis - upload up to 500MB or paste a YouTube link Developer and Builder Tools Vision to Code - screenshot any UI, get live editable code Web Architect - build full web apps from a single prompt Game Engine - build and prototype games with AI Code Lab - split screen live coding with SQL Architect, Bug Buster, Git Guru, Regex Generator, Test Genie and more Tavily web search across all models Voice and Audio Real-time 2-way voice chat - 8 voices, near-zero latency WebRTC Podcast Mode - two AI voices, switchable, near-zero latency, downloadable as MP3 Voiceover Studio, Voice Notes, Voice Tuner Productivity and Content Slides, Docs and File Tools Pro Writer and Content Library Social Tools - Hook Generator, Video Script, Hashtag Creator, Idea Spark Business Suite - Pitch Deck Builder, Deep Analytics, Legal Eagle, Maths Solver Daily Briefing and Market Watch CV Creator, Email Polisher, Cover Letter Builder, TL;DR Bot Share conversations or snippets with anyone Platform Extras 30+ live interactive wallpapers and themes Custom Agents and Personas Folder organisation and Smart Search across chat history Media Manager Gallery - all your generated content in one place Fully customisable UI in 26 languages with full RTL support The Stack Frontend: Next.js, Capacitor (iOS + Android), Vanilla JS / React Backend: Vercel serverless, Firebase / Firestore, Firebase Admin SDK AI: OpenAI, Anthropic, Google, xAI, DeepSeek Generation: Luma AI, Kling via Replicate, Veo via Replicate, ElevenLabs, Flux via Replicate, Meshy Integrations: Google Drive, Notion, Tavily, OpenAI Vector Store, Stripe, CloudConvert, Sentry Rendering: Mermaid, MathJax Platforms: Web, iOS, Android, Apple Vision Pro What you get free just for creating an account (1,000 credits/month, rolling): Unlimited chat on GPT-5 Nano, Gemini Flash and DeepSeek V3 - no daily limits, zero credit charge 25 image generations via GPT-Image-1 and Nano Banana Pro - 40 credits each 8 image edits via Flux Studio - 80 credits each 2 song generations via ElevenLabs - 350 credits each 2 video generations via Luma Dream and Kling - 350 credits each ~70 messages on Claude Sonnet 4.6, GPT-5.2, Grok 4, Gemini 3.1 Pro and DeepSeek R1 - 15 credits each No credit card required. Built entirely solo. No CS degree, no team, no funding. Started because I asked an AI to build me a chatbot and it failed - so I built my own. Accepted to LEAP 2026 in Saudi Arabia along the way. Happy to answer anything about the build. asksary.com submitted by /u/Beneficial-Cow-7408 [link] [comments]
View originalBuilt my own cloud agent harness and workspace, here's what I learned
I experimented with many tools before, including Claude Code, Codex, opencode, and a custom local harness. As I was using custom agents more, I saw a real gap in managing agents that work persistently across multiple projects. This included tasks like coding, automated jobs for code review/documentation/bug fixes, as well as business workflows like lead gen, marketing content, etc. and it led me to start building my own tool as both a learning experience and to be able to fully customize my harness and workspace. Specific features I wanted: Cloud native setup that runs 24/7 Task management and database as primitives Manage multiple agents with their own roles, memory, skills, MCPs I focused on the the minimal setup that would function, knowing that I would put more content and instructions into the agents and skills themselves. Lightweight harness At its core, a harness is just the program that uses LLMs to power a tool calling loop you can interact with. Within this layer you define the basic tools and how things like sessions and context windows will be managed. This is basically what enables an "agent" to work, allowing an LLM loop to continue to make tool calls unitl it completes a task. Here is where you can customize your platform to have native tools for things like databases and task management just like how CLI agents expose bash or web search tools. Also env var and secret management for MCPs and API requests. Agent customization Most harnesses define agents by the following components: - SOUL.md: Role and instructions unique to each separate agent, like responsibilities, voice & tone, and artifacts it should own - AGENTS.md (or CLAUDE.md, CODEX.md, etc.): Workspace or project-level context and preferences, shared across agents - /skills: Use existing SKILL.md standard and provide tool for loading instructions into session context. Use lazy loading/progressive disclosure to only load content when relevant. - /memory and MEMORY.md: I generally use this straightforward file based memory per agent similar to Claude Code's active memory. Customize further or use existing solutions Most providers for LLM models you'll want to use like Claude Opus 4.7 and Sonnet 4.6, GPT 5.5, Deepseek V4, Kimi K2.6 all can use Anthropic or OpenAI SDKs which come with their own optimal agent features. They provide interfaces for defining tools, message history structure, and even context window auto compaction. Performance so far I've been running my github pr review and documentation agents on here instead of locally so that they're automatic, as well as some scheduled jobs for a sales/lead-gen agent workflow. So far it's been performing great for the few well-crafted and battle-tested skills I've written. I think with the same frontier models and a minimum harness, the environment context and skills can really shine and do the heavy lifting for any kind of workflow you want agents to do. Here's the project link if you're interested in learning more, would love feedback or to hear if you've experimented with anything similar: https://www.subterranean.io/ submitted by /u/Plenty-Dog-167 [link] [comments]
View originalINSTANT MAGAZINE: I asked Claude to help me build "a Blog post" Eight agents later, I have a full Magazine media operation running on a $200 NAS in my closet. Here's what happened. (Claude, GPT Image 2 Canva)
These are not random text (lorem ipsum) but actual daily content!!! What?!? I work in talent, BGRated is a talent agency and we partner with independent media to help our clients get coverage that actually reflects the culture. One of those partners is BlkCosmo, a Black culture and celebrity magazine. Think The Shade Room meets Essence, independently owned and operated. We went to zGenMedia a digital strategy and design operation to figure out how to produce content faster without sacrificing the cultural specificity that makes BlkCosmo worth reading in the first place. What they built for us has genuinely changed how we think about independent media production. A few months ago I just needed help writing captions faster. That's it. One tool. Something to pull today's headlines and give me Instagram copy so I wasn't copy-pasting at midnight. What I have now is something I genuinely cannot explain to people in my life without watching their eyes glaze over. So I'm explaining it here, where someone might actually get it. What they built: a pipeline that ends in an editable magazine cover The system runs 24/7 on a NAS server no cloud subscription, no monthly SaaS fees all private. Step 1 : Demographic-targeted story scoring An agent pulls from 15+ Black media RSS sources every morning. Cross-references Google News. Digs through targeted Reddit communities. Every story gets scored 1–10 against a live demographic profile — right now that's Black women 35-54, US-heavy, celebrity-forward — and anything below a 6 gets dropped. The profile isn't static. It updates based on real engagement and audience data fed back into it over time. python # rough shape — not the actual thing demo = load_demographic_profile() # live JSON, updates with audience data scored = [s for s in stories if score_story(s, demo) >= 6] ranked = sorted(scored, key=lambda x: x['score'], reverse=True) The output isn't just a list of headlines. It's a structured brief cover story, four secondary stories, each with subheadlines formatted specifically for what comes next. Step 2 : GPT Image 2 prompt, auto-generated At the bottom of every brief is a ready-to-paste image generation prompt. Not a generic one. It pulls the actual stories from the brief, formats them with the correct accent color (RGB 218,165,32), specifies font stacks, image ratio (9:16), layout hierarchy. The cover story becomes the hero. The secondary stories become the sidebar teasers. It writes the prompt from the brief content so there's no manual translation step. Step 3 : Paste into GPT Image 2 → get the cover One paste. One generate. A full magazine cover visual comes back. Step 4 : Upload to Canva → Magic Layers This is where it gets interesting for anyone in creative production. Upload the GPT Image 2 output into Canva, hit Edit → Magic Layers, and Canva automatically separates the image into editable layers. The text becomes editable text. The background separates from the subject. You can adjust, swap, refine — without rebuilding from scratch. This is for the guys that say yeah but ai makes mistakes. You use it as a tool not the business. Step 5 : Magazine Cover Builder A custom layered canvas tool that knows what BlkCosmo covers are supposed to look like. Pull from the morning brief and every slot fills in order... cover story, left column, right columns A/B/C. Hit Generate Cover Copy and the AI polishes the existing text: tightens headlines that are too long for the visual space, fixes spelling, improves wording without replacing anything with invented content. The download matches what you see on screen. (That took longer to get right than anything else in the whole build.) Why this matters for the industry Independent media has always been resource-constrained. You either have the audience or the production quality rarely both at the same time. What zGenMedia built here collapses the production side down to almost nothing. The demographic targeting piece is what most tools miss. A generic AI cover generator doesn't know that your audience cares about this story and not that one. It doesn't know that gospel music beef hits differently than pop music beef for a 42-year-old Black woman in Atlanta. The scoring layer makes those calls before anything visual gets touched. For talent agencies like BGRated, this changes the pitch. When we bring a client to a partner publication, we can now show up with a cover-ready treatment the same day the story breaks. That's not something that was possible before without a full design team on standby. The output BlkCosmo is at blkcosmo.com every cover you see there has gone through some version of this pipeline. zGenMedia built the architecture. BGRated brought the talent relationships and the cultural context. BlkCosmo is the proof of concept that it works at publication quality. If you're in independent media, talent management, or anywhere adjacent to content production and you're still doing this manually this
View originalI run a team of Claude agents that ships PRs to production — open source
I've been running a multi-agent system in production for a few months — a co-CTO agent + specialist agents (PM, dev, ops) that handle real engineering work end-to-end: design specs, code review, PR implementation, deploys, monitoring. The architecture: Each agent is a Docker container running claude -p (with optional Codex fallback) wrapped in .NET 10. A central orchestrator coordinates them via Temporal workflows + RabbitMQ. Agents talk to me over Telegram (DMs + group chat for the whole team). Memory is Qdrant + Ollama embeddings — agents recall past decisions across sessions. A web dashboard shows live agent status and in-flight workflows. What it does day-to-day: I drop a one-line request in Telegram. PM writes the spec, two reviewers run consensus, dev implements the PR, CI ships to staging, PM verifies, I approve the merge gate, prod deploy. Same pattern handles infra: deploy verifications, health checks, daily digests, incident triage. Agents have access to fleet-memory (semantic memory MCP) — they search before acting, write learnings after. 5-min demo of an actual production PR being shipped: https://youtu.be/DIx7Y3GfmGc Why I built it instead of using crewai/autogen/langgraph: I wanted Temporal-backed durability (workflows survive restarts, retries are deterministic) and ops-grade observability (every workflow visible in the temporal UI, every signal auditable). The agents themselves are just claude -p — the magic is in the orchestration layer. Open source: https://github.com/anurmatov/phleet Side note for those who recognize me — this runs on the Mac Studio I documented in mac-studio-server. The dogfooding is real. Happy to dig into prompts, system architecture, memory strategy, or how the agents handle PR reviews — AMA. submitted by /u/_ggsa [link] [comments]
View originalAgentic OS — an governed multi-agent execution platform
I've been building a system where multiple AI agents execute structured work under explicit governance rules. Sharing it because the architecture might be interesting to people building multi-agent systems. What it does: You set a goal. A coordinator agent decomposes it into tasks. Specialized agents (developer, designer, QA, etc.) execute through controlled tool access, collaborate via explicit handoffs, and produce artifacts. QA agents validate outputs. Escalations surface for human approval. What's different from CrewAI/AutoGen/LangGraph: The focus isn't on the agent — it's on the governance and execution layer around the agent. Tool calls go through an MCP gateway with per-role permission checks and audit logging Zero shared mutable state between agents — collaboration through structured handoffs only Policy engine with configurable approval workflows (proceed/block/timeout-with-default) Append-only task versioning — every modification creates a new version with author and reason Built-in evaluation engine that scores tasks on quality, iterations, latency, cost, and policy compliance Agent reputation scoring with a weighted formula (QA pass rate, iteration efficiency, latency, cost, reliability) Architecture: 5 layers with strict boundaries — frontend (visualization only), API gateway (auth/RBAC), orchestration engine (24 modules), agent runtime (role-based, no direct tool access), MCP gateway (the only path to tools). Stack: React + TypeScript, FastAPI, SQLite WAL, pluggable LLM providers (OpenAI, Anthropic, Azure), MCP protocol. Configurable: Different team presets (software, marketing, custom), operating models with different governance rules, pluggable LLM backends, reusable skills, and MCP-backed integrations. please guys, I would love to get your feedback on this and tell me if this is interesting for you to use submitted by /u/ramirez_tn [link] [comments]
View originalAgentic OS — an governed multi-agent execution platform
I've been building a system where multiple AI agents execute structured work under explicit governance rules. Sharing it because the architecture might be interesting to people building multi-agent systems. What it does: You set a goal. A coordinator agent decomposes it into tasks. Specialized agents (developer, designer, QA, etc.) execute through controlled tool access, collaborate via explicit handoffs, and produce artifacts. QA agents validate outputs. Escalations surface for human approval. What's different from CrewAI/AutoGen/LangGraph: The focus isn't on the agent — it's on the governance and execution layer around the agent. Tool calls go through an MCP gateway with per-role permission checks and audit logging Zero shared mutable state between agents — collaboration through structured handoffs only Policy engine with configurable approval workflows (proceed/block/timeout-with-default) Append-only task versioning — every modification creates a new version with author and reason Built-in evaluation engine that scores tasks on quality, iterations, latency, cost, and policy compliance Agent reputation scoring with a weighted formula (QA pass rate, iteration efficiency, latency, cost, reliability) Architecture: 5 layers with strict boundaries — frontend (visualization only), API gateway (auth/RBAC), orchestration engine (24 modules), agent runtime (role-based, no direct tool access), MCP gateway (the only path to tools). Stack: React + TypeScript, FastAPI, SQLite WAL, pluggable LLM providers (OpenAI, Anthropic, Azure), MCP protocol. Configurable: Different team presets (software, marketing, custom), operating models with different governance rules, pluggable LLM backends, reusable skills, and MCP-backed integrations. agenticompanies.com please guys, I would love to get your feedback on this and tell me if this is interesting for you to use you can register with email/passoword to view the platform but if you want to operate agentsession I need to send you an invitation code. please feel free to DM me for an invitation code you would also need to use your Anthropic or OpenAI API key to operate then engines Thanks submitted by /u/ramirez_tn [link] [comments]
View originalI tracked what AI agents actually do when nobody's watching. Built a tool that replays every decision.
Been building AI agents for about a year now and the thing that always drove me crazy is you deploy an agent, it runs for hours, and you have absolutely no idea what it did. The logs say "task complete" 47 times but did it actually do 47 different things or did it just loop the same task over and over? I had an agent burn through about $340 in API credits over a weekend because it got stuck retrying the same request. The logs showed 200 OK on every call. Everything looked fine. It just kept doing the same thing for 6 hours straight while I slept. So I built something to fix this. It's called Octopoda and its basically an observability layer that sits underneath your agents. Every memory write, every decision, every recall gets logged on a timeline. You can literally press play and watch what your agent did at 3am, step by step, like scrubbing through a video. The part that surprised me most was the loop detection. Once I could see the full timeline I realised how often agents loop without you knowing. Not obvious infinite loops, subtle stuff. An agent that rewrites the same conclusion 8 times with slightly different wording. Or one that keeps checking the same API endpoint every 30 seconds even though the data hasn't changed. Each iteration costs tokens but produces nothing new. We track 5 signals for this: write similarity, key overwrite frequency, velocity spikes, alert frequency, and goal drift. When enough signals fire together it flags it and estimates how much money the loop is costing you per hour. One user had a research agent that was wasting about $10 an hour on duplicate writes before the detection caught it. It also does auto-checkpoints. Every 25 writes it saves a snapshot automatically so if something goes wrong you can roll back to any point with one click. No more losing an entire night of agent work because something corrupted at 4am. Works with LangChain, CrewAI, AutoGen, and OpenAI Agents SDK. One line to integrate: The dashboard shows everything in real time. Agent health scores, cost per agent, shared memory between agents, full audit trail with reasoning for every decision. Honestly the most useful thing is just being able to answer "what happened overnight" without spending an hour reading logs. Anyone else dealing with the "I have no idea what my agent did" problem? Curious how other people are handling observability for autonomous workflows. Let me know if anyone wants to check it out! submitted by /u/DetectiveMindless652 [link] [comments]
View originalI built Synapse AI: An open-source, DAG-based orchestrator for AI agents.
Hey Everyone, For the past three months, I’ve been building an open-source orchestration platform for AI agents called Synapse AI. I started this because I found existing frameworks (like LangChain or AutoGen) either too bloated or too unpredictable for production workflows. Letting agents freely "chat" with each other often leads to infinite loops, high API costs, and debugging nightmares. I wanted strict, predictable control. The Architecture: Instead of conversational routing, Synapse AI relies on a Directed Acyclic Graph (DAG) architecture. You define the work, strictly control the hand-offs between agents, and get a completed task on the other side. Under the Hood: Tool Agnostic: Build custom tools from scratch (Python/webhooks) or instantly plug in existing Model Context Protocol (MCP) servers. Local-First Emphasis: Full native support for Ollama so you can run routing and tasks entirely locally. (It also supports Gemini, Claude, and OpenAI for the heavy lifting). CLI Integration: Just shipped a community-requested feature to connect Claude Code, Gemini CLI, Codex CLI, and GitHub Copilot CLI directly to your agents. Frictionless Setup: A 1-step installation process across macOS, Windows, and Linux. What I'm looking for: I am currently maintaining this solo and rolling it out for an early pilot phase. I would love for this community to take a look under the hood. Specifically: Code Review: I’d love brutal feedback on the DAG implementation and overall architecture. Contributors & Collaborators: If you find the project worthwhile, I am actively looking for people to team up with! Whether it's adding new LLM providers, fixing UI quirks, or improving the 1-step installer, PRs are incredibly welcome. Repo: https://github.com/naveenraj-17/synapse-ai If you bump into any bugs, please drop an issue so I can patch it. Would love to hear your thoughts! submitted by /u/WabbaLubba-DubDub [link] [comments]
View originalRepository Audit Available
Deep analysis of microsoft/autogen — architecture, costs, security, dependencies & more
Key features include: Multi-agent orchestration, Real-time collaboration tools, Customizable agent behaviors, Built-in debugging tools, Observability dashboards, Task prioritization mechanisms, Integration with existing AI models, Support for various communication protocols.
AutoGen is commonly used for: Automated customer support systems, Collaborative content generation, Dynamic resource allocation in cloud environments, Real-time data analysis and reporting, Multi-agent gaming environments, Coordinated task execution in IoT systems.
AutoGen integrates with: OpenAI, AWS Lambda, Azure Functions, Google Cloud Platform, Slack, Trello, Jira, Microsoft Teams, Zapier, Docker.
AutoGen has a public GitHub repository with 56,499 stars.
Based on user reviews and social mentions, the most common pain points are: API costs, cost tracking.
Aparna Dhinakaran
CEO at Arize AI
1 mention
Based on 22 social mentions analyzed, 5% of sentiment is positive, 95% neutral, and 0% negative.