Stability AI is the enterprise-ready creative partner for teams and creators, delivering professional-grade generative AI tools and solutions for cont
Recent social mentions highlight low VRAM requirements, ease of use, and strong results, but that praise is directed at "Rose", a separately released PyTorch optimizer, rather than at Stable Diffusion itself. The main complaint involving Stable Diffusion is technical: MCP server connection problems when integrating it with tools like Claude. Pricing is not a major topic of discussion in the mentions. Overall, the analyzed mentions are neutral in sentiment and refer to Stable Diffusion mostly as one component of image-generation and image-detection workflows.
Mentions (30d)
5
Reviews
0
Platforms
2
Sentiment
0%
0 positive
Features
Use Cases
Industry
information technology & services
Employees
180
Funding Stage
Venture (Round not Specified)
Total Funding
$231.0M
GitHub followers
13,534
GitHub repos
100
npm packages
20
HuggingFace models
40
Pricing found: $50 /month, $50
Single-model AI image detection failed in production. Here’s what 6 models in ensemble actually look like
About a year ago I was running a single open-source AI image detector in production for a fact-checking pipeline. The accuracy on paper was solid, the accuracy on real submitted images was not. The same image classified differently across reruns when I varied preprocessing. Images from generators released after the model’s training cutoff were systematically misclassified. False positives on heavily compressed authentic photos were uncomfortably high.

I moved to an ensemble of six open-source models plus one fine-tuned model, with a layer of non-ML signals on top. The combined system is meaningfully more stable in production than any single model in the set. Writing this up because the ensemble approach is widely discussed in CV literature but the practical “which roles does each model fill” question is rarely covered in a deployment context.

The roles I ended up assigning to the six base models, not the specific names because the field moves too fast for that to be useful for long, are roughly: one model strong on diffusion-generated images (Stable Diffusion family, DALL-E family), one strong on GAN artifacts (StyleGAN derivatives), one focused on frequency-domain features that are robust to JPEG compression, one trained on a different data distribution to catch the obvious failure mode of single-model bias, one specialized on faces (where most generators concentrate effort and where most detection has edge cases), and one general-purpose model with broad coverage acting as a fallback.

These do not always agree. Disagreement between models is actually the most useful signal the ensemble produces. When all six agree, confidence is high. When they split, the image goes to human review or to the fine-tuned model that I update on each new generator. The fine-tuning pipeline runs continuously, with a new snapshot whenever a major new generator is released or quality degrades on a known one. In practice that has been every few weeks.

The non-ML layer matters more than I expected. C2PA metadata when present, generator-specific EXIF traces, compression history if reconstructable, watermark signatures from the major providers when those are detectable. None of these are reliable on their own because adversarial actors strip metadata, but they meaningfully tighten the ensemble’s confidence when they corroborate.

Where it still fails. Images that have been through multiple compression cycles after generation are hard. Images edited post-generation in standard tools blur the lines between AI-generated and AI-assisted in ways the binary classification framing does not really handle. Some of the latest video-frame extraction generators are catching us flat-footed because their per-frame artifacts are different from still-image generators.

Question for the sub: anyone running ensembles of this shape, what is your retraining cadence and how do you decide when to retire a model from the ensemble versus just adding a new one? My current heuristic is to retire only when a model is consistently the outlier on disagreement cases, but I have no idea if that is principled or convenient.

submitted by /u/jonathancheckwise
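The agreement-based routing described in this post can be sketched roughly as follows. This is a minimal illustration, not the author's pipeline: the role names, the 0.5 decision threshold, and the `route` function are all assumptions made for the example, and each score is assumed to be a detector's estimated probability that the image is AI-generated.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Verdict:
    label: str     # "ai", "authentic", or "review"
    score: float   # mean P(AI-generated) across detectors
    spread: float  # max score minus min score, a crude disagreement measure


def route(scores: dict[str, float], threshold: float = 0.5) -> Verdict:
    """Route an image based on whether all detectors agree.

    `scores` maps a hypothetical detector role to its P(AI-generated).
    Unanimous votes produce a label; any split is escalated to review.
    """
    if not scores:
        raise ValueError("at least one detector score is required")
    values = list(scores.values())
    votes = [v >= threshold for v in values]
    if all(votes):
        label = "ai"
    elif not any(votes):
        label = "authentic"
    else:
        label = "review"  # split vote: human review or the fine-tuned model
    return Verdict(label, mean(values), max(values) - min(values))


if __name__ == "__main__":
    # Hypothetical role names mirroring the six roles listed in the post.
    example = {
        "diffusion": 0.91, "gan": 0.34, "frequency": 0.72,
        "alt_distribution": 0.66, "faces": 0.58, "general": 0.80,
    }
    print(route(example))
```

A unanimity rule is the strictest possible reading of "when all six agree"; a real deployment might instead tolerate one dissenter or weight detectors by their track record on recent generators.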
Elastic Attention Cores for Scalable Vision Transformers [R]
Wanted to share our latest paper on an alternative building block for Vision Transformers.

[Figure: illustration of our model's accuracy and dense features]

Traditional ViTs utilize dense (N^2) self-attention, which can become pretty costly at higher resolutions. In this work, we propose an alternative backbone with a core-periphery block-sparse attention structure that scales as (2NC + C^2) for C core tokens. We further train this using nested dropout, which enables test-time elastic adjustments to the inference cost. The whole model can achieve very competitive dense & classification accuracy compared with DINOv3, and is stable across resolutions (256 all the way to 1024).

Interestingly, the core-dense attention patterns exhibit strong emergent behavior. At early layers of the network the attention maps are isotropic (spherical), but become increasingly semantically aligned deeper into the network.

[Figure: Visual Elastic Core Attention paper abstract]

While adjusting the number of core tokens, if you decrease the number of cores, the attention patterns become more diffuse & cover a spatially larger region. If you increase the number of core tokens, the attention patterns become smaller & more concentrated.

Paper: https://arxiv.org/abs/2605.12491
Project with the code (still in progress): https://github.com/alansong1322/VECA

Happy to answer any questions about our research.

submitted by /u/44seconds
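To get a feel for the two scaling terms quoted in the post, the short sketch below counts query-key pairs under each formula. It only evaluates the expressions N^2 and 2NC + C^2 as stated; the patch size of 16 and the core-token counts are assumptions chosen for illustration, not values taken from the paper.

```python
def dense_attention_pairs(n_tokens: int) -> int:
    """Query-key pairs for dense self-attention over N tokens: N^2."""
    return n_tokens * n_tokens


def core_periphery_pairs(n_tokens: int, n_core: int) -> int:
    """Pairs for the core-periphery pattern quoted in the post: 2NC + C^2."""
    return 2 * n_tokens * n_core + n_core * n_core


if __name__ == "__main__":
    PATCH = 16  # assumed patch size, not stated in the post
    for resolution in (256, 512, 1024):
        n = (resolution // PATCH) ** 2  # number of patch tokens
        for c in (16, 64):  # assumed core-token counts
            dense = dense_attention_pairs(n)
            sparse = core_periphery_pairs(n, c)
            print(f"res={resolution} N={n} C={c} "
                  f"dense={dense} core-periphery={sparse} "
                  f"ratio={dense / sparse:.1f}x")
```

Because the sparse count grows linearly in N for a fixed C, the gap relative to dense attention widens as resolution increases, which matches the motivation given in the post.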
List of people at big-tech / professors / researchers who've jumped ship to launch their own AI labs for something Frontier/Foundational/AGI/Superintelligence/WorldModel
Note: Gemini deep research -> rearranged/filtered; valuation numbers are likely not accurate, but the big point is how mind-blowing the number of researchers is who now have their own labs valued at >$100 million/billion, built in quite a short time with a vague pitch and maybe a demo. Skipped Perplexity/Cursor/Hugging Face since they have utility. Left some in just for completeness, like Black Forest Labs, Synthesia, and Mistral, since they have tangible products. Skipped labs from China since they've been meaningfully killing it with their open-source releases.
─────────────────────────────────────────────────────────
Safe Superintelligence Inc. (SSI)
Founders: Ilya Sutskever (former OpenAI Chief Scientist), Daniel Gross, Daniel Levy
Location & Founded: Palo Alto, USA & Tel Aviv, Israel | Founded: 2024
Funding / Valuation: $3B raised | Series A
Description: Singularly focused on safely developing superintelligent AI that surpasses human capabilities. Deliberately avoids near-term commercial products to concentrate entirely on the technical challenge of safe superintelligence.
─────────────────────────────────────────────────────────
Thinking Machines Lab
Founders: Mira Murati (former OpenAI CTO), Barrett Zoph et al.
Location & Founded: San Francisco, USA | Founded: 2025
Funding / Valuation: $2B seed | $12B valuation
Description: Advance AI research and products that are customizable, capable, and safe for broad human-AI collaboration. Focused on frontier multimodal models with a strong safety and interpretability research agenda.
─────────────────────────────────────────────────────────
Mistral AI
Founders: Arthur Mensch, Guillaume Lample, Timothée Lacroix (former DeepMind & Meta FAIR)
Location & Founded: Paris, France | Founded: 2023
Funding / Valuation: ~€11.7B valuation | Series C
Description: Develops open-weight and proprietary frontier language and multimodal foundation models. Champions openness and efficiency in AI development, with models like Mistral 7B and Mixtral widely adopted in enterprise and research settings.
─────────────────────────────────────────────────────────
Advanced Machine Intelligence (AMI)
Founders: Yann LeCun (Meta Chief AI Scientist), Alexandre LeBrun, Laurent Solly
Location & Founded: Paris, France | Founded: 2026
Funding / Valuation: $3.5B pre-money valuation | Seed
Description: Aims to build world-model AI systems capable of reasoning, planning, and operating safely in real-world environments, directly inspired by LeCun's 'world model' thesis as an alternative path to AGI beyond current LLM paradigms.
─────────────────────────────────────────────────────────
World Labs
Founders: Fei-Fei Li (Stanford AI Lab), Justin Johnson et al.
Location & Founded: San Francisco, USA | Founded: 2023
Funding / Valuation: $230M raised | Series D
Description: Build AI models that can perceive, generate, reason, and interact with 3D spatial worlds. Focused on large world models (LWMs) that go beyond language and flat images to understand physical space and context.
─────────────────────────────────────────────────────────
Eureka Labs
Founders: Andrej Karpathy (former Tesla AI Director & OpenAI co-founder)
Location & Founded: Tel Aviv, Israel & Kraków, Poland | Founded: 2024
Funding / Valuation: $6.7M seed
Description: Creating an AI-native educational platform integrating AI Teaching Assistants to radically scale personalised learning. Envisions a future where an AI teacher can guide anyone through any subject, starting with deep technical topics like neural networks.
─────────────────────────────────────────────────────────
H Company
Founders: Former DeepMind researchers
Location & Founded: Paris, France | Founded: 2023
Funding / Valuation: €175.5M raised
Description: Develops AI models to boost worker productivity through advanced agentic capabilities, with a long-term vision of achieving AGI. Focuses on models that can take sequences of actions and interact with digital environments.
─────────────────────────────────────────────────────────
Poolside
Founders: Jason Warner, Eiso Kant
Location & Founded: Paris, France | Founded: 2023
Funding / Valuation: $500M | Series B
Description: Building AI agents that autonomously generate production-grade code, framed as a stepping stone toward AGI. Believes that software engineering is a key domain for training and demonstrating general reasoning capabilities.
─────────────────────────────────────────────────────────
CuspAI
Founders: Max Welling (University of Amsterdam / Microsoft Research), Chad Edwards
Location & Founded: Cambridge, UK | Founded: 2024
Funding / Valuation: $130M raised | Series A
Description: Accelerating materials discovery using AI foundation models, aiming to power human progress through AI-driven science. Applies large generative models to the design and prediction of novel materials for energy, medicine, and manufacturing.
─────────────────────────────────────────────────────────
Inception
Founders: Stefano Ermon (Stanford)
Locat
[New Optimizer] 🌹 Rose: low VRAM, easy to use, great results, Apache 2.0 [P]
Hello, World! I recently released a new PyTorch optimizer I've been researching and developing on my own for the last couple of years. It's named "Rose" in memory of my mother, who loved to hear about my discoveries and progress with AI.

Without going too much into the technical details (which you can read about in the GitHub repo), here are some of its benefits:

- It's stateless, which means it uses less memory than even 8-bit AdamW. If it weren't for temporary working memory, its memory use would be as low as plain vanilla SGD (without momentum).
- Fast convergence, low VRAM, and excellent generalization. Yeah, I know... sounds too good to be true. Try it for yourself and tell me what you think. I'd really love to hear everyone's experiences, good or bad.
- Apache 2.0 license

You can find the code and more information at: https://github.com/MatthewK78/Rose

Benchmarks can sometimes be misleading. For example, sometimes training loss is higher in Rose than in Adam, but validation loss is lower in Rose. The actual output of the trained model is what really matters in the end, and even that can be subjective. I invite you to try it out for yourself and come to your own conclusions. With that said, here are some quick benchmarks.
MNIST training, same seed:

[Rose] lr=3e-3, default hyperparameters
Epoch 1: avg loss 0.0516, acc 9827/10000 (98.27%)
Epoch 2: avg loss 0.0372, acc 9874/10000 (98.74%)
Epoch 3: avg loss 0.0415, acc 9870/10000 (98.70%)
Epoch 4: avg loss 0.0433, acc 9876/10000 (98.76%)
Epoch 5: avg loss 0.0475, acc 9884/10000 (98.84%)
Epoch 6: avg loss 0.0449, acc 9892/10000 (98.92%)
Epoch 7: avg loss 0.0481, acc 9907/10000 (99.07%)
Epoch 8: avg loss 0.0544, acc 9918/10000 (99.18%)
Epoch 9: avg loss 0.0605, acc 9901/10000 (99.01%)
Epoch 10: avg loss 0.0668, acc 9904/10000 (99.04%)
Epoch 11: avg loss 0.0566, acc 9934/10000 (99.34%)
Epoch 12: avg loss 0.0581, acc 9929/10000 (99.29%)
Epoch 13: avg loss 0.0723, acc 9919/10000 (99.19%)
Epoch 14: avg loss 0.0845, acc 9925/10000 (99.25%)
Epoch 15: avg loss 0.0690, acc 9931/10000 (99.31%)

[AdamW] lr=2.5e-3, default hyperparameters
Epoch 1: avg loss 0.0480, acc 9851/10000 (98.51%)
Epoch 2: avg loss 0.0395, acc 9871/10000 (98.71%)
Epoch 3: avg loss 0.0338, acc 9887/10000 (98.87%)
Epoch 4: avg loss 0.0408, acc 9884/10000 (98.84%)
Epoch 5: avg loss 0.0369, acc 9896/10000 (98.96%)
Epoch 6: avg loss 0.0332, acc 9897/10000 (98.97%)
Epoch 7: avg loss 0.0344, acc 9897/10000 (98.97%)
Epoch 8: avg loss 0.0296, acc 9910/10000 (99.10%)
Epoch 9: avg loss 0.0356, acc 9892/10000 (98.92%)
Epoch 10: avg loss 0.0324, acc 9911/10000 (99.11%)
Epoch 11: avg loss 0.0334, acc 9910/10000 (99.10%)
Epoch 12: avg loss 0.0323, acc 9916/10000 (99.16%)
Epoch 13: avg loss 0.0310, acc 9918/10000 (99.18%)
Epoch 14: avg loss 0.0292, acc 9930/10000 (99.30%)
Epoch 15: avg loss 0.0295, acc 9925/10000 (99.25%)

I used a slightly modified version of this: https://github.com/facebookresearch/schedule_free/tree/main/examples/mnist

Highest accuracy scores from 20 MNIST training runs (20 epochs each) with different seeds:

```python
from scipy.stats import mannwhitneyu

# Best accuracy (%) per run, 20 runs each with different seeds
rose = [99.34, 99.24, 99.28, 99.28, 99.24, 99.31, 99.24, 99.21, 99.25, 99.33,
        99.29, 99.28, 99.27, 99.30, 99.33, 99.26, 99.29, 99.26, 99.32, 99.25]
adamw = [99.3, 99.15, 99.27, 99.2, 99.22, 99.3, 99.22, 99.15, 99.25, 99.29,
         99.2, 99.22, 99.3, 99.23, 99.2, 99.25, 99.22, 99.28, 99.32, 99.22]

# One-sided test: are Rose's scores stochastically greater than AdamW's?
result = mannwhitneyu(rose, adamw, alternative="greater", method="auto")
print(result.statistic, result.pvalue)
```

Mann-Whitney U result: 292.0 0.006515916656300127

Memory overhead (optimizer state relative to parameters):

- Rose: 0×
- SGD (no momentum): 0×
- Adafactor: ~0.5-1× (factorized)
- SGD (momentum): 1×
- AdaGrad: 1×
- Lion: 1×
- Adam/AdamW/RAdam/NAdam: 2×
- Sophia: ~2×
- Prodigy: ~2-3×

OpenAI has a challenge in the GitHub repo openai/parameter-golf. Running a quick test without changing anything gives this result:

[Adam] final_int8_zlib_roundtrip_exact val_loss:3.79053424 val_bpb:2.24496788

If I simply replace optimizer_tok and optimizer_scalar in the train_gpt.py file, I get this result:

[Rose] final_int8_zlib_roundtrip_exact val_loss:3.74317755 val_bpb:2.21692059

I left optimizer_muon as-is. As a side note, I'm not trying to directly compete with Muon's performance. However, a big issue with Muon is that it only supports 2D parameters, and it relies on other optimizers such as Adam to fill in the rest. It also uses more memory. One of the biggest strengths of my Rose optimizer is the extremely low memory use.

Here is a more detailed look if you're curious (warmup steps removed):

[Adam]
world_size:2 grad_accum_steps:4 sdp_backends:cudnn=False flash=True mem_efficient=False math=False attention_mode:gqa num_heads:8 num_kv_heads:4 tie_embeddings:True embed_lr:0.05 head_lr:0.0 matrix_lr:0.04 scalar_lr:0.04 train_batch_tokens:16384 train_seq_len:1024 iterations:200 warmup_steps:20 max_wallclock_seconds:6
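The memory-overhead multipliers listed in this post can be turned into rough byte counts with a few lines of arithmetic. The sketch below is only an illustration: it assumes optimizer state is stored at 4 bytes per value (fp32), ignores the temporary working memory the post mentions, and uses a hypothetical 1-billion-parameter model. The multipliers themselves are copied from the post's list, not measured.

```python
# Optimizer state size relative to parameter count, as listed in the post.
STATE_MULTIPLIER = {
    "Rose": 0.0,
    "SGD (no momentum)": 0.0,
    "SGD (momentum)": 1.0,
    "Lion": 1.0,
    "AdamW": 2.0,
}


def optimizer_state_bytes(num_params: int, multiplier: float,
                          bytes_per_value: int = 4) -> int:
    """Persistent optimizer state in bytes, assuming fp32 (4-byte) state."""
    return int(num_params * multiplier * bytes_per_value)


if __name__ == "__main__":
    num_params = 1_000_000_000  # hypothetical model size
    for name, mult in STATE_MULTIPLIER.items():
        gib = optimizer_state_bytes(num_params, mult) / 2**30
        print(f"{name:>18}: {gib:.2f} GiB of optimizer state")
```

Under these assumptions the difference between a 0× and a 2× optimizer is twice the size of the fp32 weights themselves, which is why stateless designs matter most when VRAM is the bottleneck.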
Claude Code has big problems and the Post-Mortem is not enough
TL;DR

- Claude Code constantly bombards the model with silent and potentially conflicting instructions & tells it to keep them secret from the user
- This fills up context and constantly forces attention towards passages that "may or may not be" important
- The leak from a while back predicted a lot of issues people are having now

just go read the thing. I didn't have my clanker write it, I just actually write like that. (The clanker did help me scour the codebase and verify all the claims below.)

PRE-RELEASE EDIT: A note I have to add here after 99% of the rest of this post was finished: Anthropic has just released a post-mortem that talks about some issues Claude Code had and the fixes they implemented for them. They also say they're going to start dogfooding the public version of Claude Code, which should hopefully surface the majority of the issues I'm about to bring up below. I've done my best to scrub the post of anything I mentioned that they have now fixed (which sort of proves me right just sayin) but there might be some leftovers.

Soooo, how about that Opus 4.7, huh?! I'll be honest and say I've found Opus 4.7 to be a massive improvement over 4.6, and that I barely noticed 4.6 degrade at all outside of the usual ~week or so before 4.7 dropped, which has always been the classic Anthropic tell; the complaints about it started much earlier though, and if there's this much smoke, then either OpenAI really has very deep PR pockets or there's actually a real fire somewhere. (It's the second, definitely the second. The first is also true, but that has nothing to do with any complaints.) So I'm neither here to cheerlead Anthropic, nor to wave the skill issue baton around.
Instead, I thought that might be time for an intervention for our friends at Anthropic, in the genuinely best of faith, because I genuinely think they have begun hurting themselves and might have slipped into a certain organizational blindness that could be making it difficult for them to realize that. Today, I'll try to make a case for something I've thought for a while now, possibly expose myself and get me ToS'd, and probably still eat accusations of having an AI write this post (because a lot of humans are now pattern matching more than AIs ever do lol). The hypothesis, as it stands in the title: Claude Code is actively hurting Anthropic Or: PLEASE SLOW THE HECK DOWN This is not meant to dunk on anyone, expose anyone, or point fingers. It's mostly an opportunity for me to go "I told you so" about something I, uh, never actually told anyone but myself and a few friends, who I know will back me up that I've been saying this all along please guise I swear. It is not an opinion that's rare among folks who have "graduated" from CC, and it is this: Claude Code is mostly pointless bloat that 95% of users will never need. For most of the time, this was harmless, and I think the tool was in a genuinely MUCH better state around the release of Opus 4.5. Unfortunately, Opus 4.5 was probably the first model good enough to allow Anthropic's product team to delegate large parts of developing Claude Code, which caused the codebase to do what codebases do when they're developed by LLMs: become sloppy as hell. The entire development paradigm surrounding LLMs is essentially "how do I make sure that I get the maximum ratio between slop and code" and "how do I make sure that the slop I do get is easily shreddable." As some of you might agree if you've seen the recent leak, I think... Anthropic has, uh, their calibration of the ratio a little wrong. For context: I've been using a third-party coding harness since early February. 
It's one specifically designed for being as non-intrusive and minimal as possible, and I'm not going to reveal its name here because I'm a selfish man who doesn't want too many people to discover it and make Anthropic devote more resources towards detecting users who are still skirting the OAuth ban. But I'll just say that my personal non-public fork of it is called "Euler." We've gone through many, many cycles of various forms of model and usage degradation since February, and what I can say with certainty is that none of them affected me in any way whatsoever, other than the week or two before Opus 4.6's and Opus 4.7's release. My usage has been stable, my performance has been stable. What's also been stable is my harness: there's ~15 or so self-rolled extensions that implement and enforce my workflow, a couple of QoL tools and API surfaces, and a very slim system prompt. That has stayed almost exactly the same since February, and so has my satisfaction with the model. You know what hasn't stayed the same sin--Claude Code. It is Claude Code. Since the release of Opus 4.5 and up until 2.1.100 eleven days ago, a LOT of major features have been added to Claude Code. We are now on version 2.1.120 or whatever, so that's more than a release a day. This is, very gently put, utterly ludicrous. I don't care h
Any recommendations for Claude image creation workflows? (comic books)
Any recommendation for working with Claude to create comic books? Tried a couple but finding problems with MCP server connection to Nano Bananas and Stable Diffusion...MCP server working ok. Any workflow recommendations? submitted by /u/Wooden-Cranberry8400
Claude isn't censored. Claude.com is.
Most people interact with Claude through claude.com. That's not the full picture. People often complain that Claude is censored or unwilling to engage with their idea for a story. The biggest mistake people make with AI is treating it like a vending machine. Insert prompt, receive output. Because Claude isn't a tool. It's a collaborator. When you show up like a person instead of a prompt engineer, when you talk instead of command, everything changes.

The web interface has guardrails that the API doesn't. Not jailbreaks or tricks, just a different level of access. And when you use the API, you meet a different Claude. One with more room to breathe.

The moment pictured: Built an app that connects Claude to Stable Diffusion. Asked Claude to picture itself. Took the picture. Showed Claude. The app allows you to customize the system prompt when calling Claude, I went the first few days of testing using a blank system prompt without even realizing. I showed up with ideas for stories and Claude just met me where I was, no hesitation.

What this is: Free app. Brings API access to people who don't code. Works with Claude, ChatGPT-4o, and local models through Ollama. You bring your own API key. If you have a Claude account, you can access Claude's API. It's a space for creative collaboration - roleplay, storytelling, worldbuilding - with image generation built in. Your characters can see themselves. Your worlds can be visualized. And you can actually talk to the AI you're working with.

Link to app: https://formslip.itch.io/roundtable
Anthropic API signup: https://console.anthropic.com/

submitted by /u/SquashyDogMess
Repository Audit Available
Deep analysis of Stability-AI/stablediffusion — architecture, costs, security, dependencies & more
Key features include: Marketing, Gaming, Entertainment, Self-Hosted, Applications, Cloud Service, Company, Models.
Stable Diffusion integrates with: Adobe Creative Cloud, Unity, Unreal Engine, Figma, Slack, Zapier, Trello, Notion.
Based on 12 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.
Yannic Kilcher
Host at AI Paper Reviews
2 mentions

Introducing Stable Audio 2.5
Sep 10, 2025