The only collaborative agentic analytics platform. Everything you need to make AI insights actionable, accurate, and actually useful.
Count has received high praise on review platforms such as G2, with ratings consistently around 4.5-5 out of 5, reflecting a strong reputation for reliability and effectiveness. Users frequently highlight its comprehensive feature set and ease of use as major strengths. Few specific complaints appear in the available reviews and social discourse. Sentiment on pricing is generally positive, with the value proposition seen as favorable. Overall, Count is viewed positively within its user community.
Mentions (30d)
52
16 this week
Avg Rating
4.8
20 reviews
Platforms
9
Sentiment
15%
34 positive
Industry
information technology & services
Employees
15
Funding Stage
Series A
Total Funding
$5.0M
X Users Find Their Real Names Are Being Googled in Israel After Using X Verification Software “Au10tix”
X Users Find Their Real Names Are Being Googled in Israel After Using X Verification Software “Au10tix” Alan Macleod On January 30, the Department of Justice released its latest tranche of 3.5 million documents relating to Jeffrey Epstein. Years of emails, texts, and images were suddenly in the public domain. Epstein, a serial rapist, masterminded a global human trafficking and sexual abuse network, and could count princes, professors, and politicians among his closest friends and accomplices. MintPress News has been at the forefront of covering the Epstein saga, revealing his extremely close links to American and Israeli intelligence groups – a discovery that perhaps sheds light on why it took so long for the world’s most notorious pedophile to face accountability for his crimes. Many of the DOJ files have been heavily redacted in order to protect Epstein’s powerful clients. Still, they have exposed a massive elite nexus revolving around the New York billionaire, implicating presidents, diplomats, and plutocrats in his crimes, and imply that Epstein was significantly more powerful than first thought, shaping modern politics in ways never previously understood. With shocking new details emerging on a near-hourly basis, here are ten Epstein- related stories that have flown relatively under the radar. The Israeli Government Installed Surveillance Cameras at Epstein’s New York Apartment The Israeli government installed and maintained a hi-tech surveillance system at Epstein’s Manhattan apartment complex, including a network of alarms and cameras, emails show. Starting in 2016, the director of protective service at the Israeli mission to the United Nations controlled guests’ access to the Manhattan residence, and even performed background checks on prospective cleaners and other Epstein employees. Former Israeli prime minister Ehud Barak admitted visiting the apartment up to 100 times, and stayed there for long periods of time. 
While Barak’s security may have been a concern, Epstein is known to have housed underage girls at the apartment, and many of his worst sexual crimes and most sordid parties were held there, raising questions as to what sort of images and data the Israeli government had access to. Epstein Plotted War With Iran Ehud Barak became one of Epstein’s closest associates, staying for extended periods of time at the billionaire’s residences. The pair would email, text, call, and meet constantly. A search for “Ehud Barak” elicits more than 3500 results in the latest file dump alone. The pair would talk politics, and shared a vision of the United States attacking Iran. In 2013, with negotiations between the International Atomic Energy Agency and Iran stalling, Epstein emailed Barak stating, in typically poor spelling and grammar: “hopefully somone suggests getting authorization now for Iran. the congress woudl do it.” Epstein would get his wish in 2025, when his close associate Donald Trump began bombing the country. Noam Chomsky Considered Epstein His “Best Friend” Epstein arranged a meeting between Barak and renowned leftist academic (and vehement critic of the U.S. and Israel) Noam Chomsky. An unlikely friendship between the notorious pedophile and star professor blossomed, with the pair regularly meeting up at each other’s houses for dinner. Chomsky flew on Epstein’s “Lolita Express” jet to attend a dinner with Woody Allen in New York. He also expressed his desire to visit Little St. James Island, Epstein’s notorious Caribbean hideaway, and the center of his trafficking operation. Chomsky considered Epstein his “best friend” according to an email sent by his wife, Valeria. 
The usually curt and matter-of-fact academic signed off his emails to Epstein with unexpectedly flowery language, such as “Like real friendship, deep and sincere and everlasting from both of us, Noam and Valeria.” Chomsky strongly supported Epstein until his dying day in a Manhattan prison cell, taking it upon himself to act as his unofficial crisis manager, describing his accusers as “publicity seekers or cranks of all sorts,” and denouncing the media as a “culture of gossip-mongers” destroying his stellar character. “Ive watched the horrible way you are being treated in the press and public,” he wrote, advising Epstein on tactics to fight the supposed smears against him. For a full rundown of the Chomsky-Epstein relationship, see the MintPress News investigation: “The Chomsky-Epstein Files: Unravelling a Web of Connections Between a Star Leftist Academic and a Notorious Pedophile.” Steve Bannon Developed a Plan to Help Epstein “Crush the Pedo Narrative” A second public figure running defense for Epstein was Steve Bannon. In public, the far-right strategist claimed that he was working on a documentary exposing Epstein. In private messaging, however, Bannon, like Chomsky, was advising Epstein on how best to repair his image. Just weeks before Epstein’s arrest and subsequent death, Bannon was messaging him, devising a complex media strategy
Pricing found: $0, $49, $69
g2
What do you like best about Count? I like Count for its versatility, allowing quick iteration of analysis that requires non-standard data sources and blends between data from different sources. It lets me define sources, calculations, aggregations, etc., on the fly more intuitively than many other tools.
What do you dislike about Count? Count canvas is great for exploration but can feel a little unwieldy when sharing with others. Review collected by and hosted on G2.com.
What do you like best about Count? I like how easy it is to pull data from different sources and bring it together into a comprehensive, easy-to-use dashboard. Also, having the ability to run SQL queries and Python scripts in one place makes things much easier and more flexible whenever we need to process data.
What do you dislike about Count? This tool has a bit of a learning curve, and you need to get past that before you can really see its full value. Review collected by and hosted on G2.com.
What do you like best about Count? I really love the collaborative aspect of it, and how it helps facilitate storytelling in a smooth, natural way. You can also tell that the team are passionate about building the best product for their customers. Delighted to have this as part of my analytical toolbox!
What do you dislike about Count? Not many complaints. As somebody who writes SQL in BigQuery/dbt, the switch to DuckDB syntax can be a tad annoying. But I appreciate the performance you get from DuckDB, so I get their decision. Review collected by and hosted on G2.com.
What do you like best about Count? I love that Count is flexible and easy to understand, especially for someone like me who is not an engineer. The canvas layout is visually helpful, which makes it really nice to work with. The team's great and very helpful, which I really appreciate. Most BI tools are unusable for someone like myself, but Count allows me to understand data without relying on others. The initial setup was very easy.
What do you dislike about Count? N/A. Review collected by and hosted on G2.com.
What do you like best about Count? That I can look at visualisations but then easily jump into the underlying data in an understandable way as a layman. The AI functionality is also helpful: it shows its workings, and the data can be reviewed the same way as mentioned above.
What do you dislike about Count? What I dislike about the product probably stems from my own need for some basic training. Review collected by and hosted on G2.com.
What do you like best about Count? I have switched to Count for all the analysis I run, so I would say I use it on a daily basis. With Count, you can have queries, plots, text, reports, and comments all in the same place. I find this extremely valuable, as it effectively makes everything self-documenting: the queries that support the results, the interpretation, and the reviews that were made all live together. Moreover, we can collaborate in real-time on the same canvas, which is amazing. I really like that Count allows you to create tiles and reference results, in a way that feels similar to a DAG in dbt. This helps avoid a lot of code duplication and significantly streamlines query creation. Personally, I think this makes a big difference because it allows me to split complex queries into clearly defined components and then combine their results as needed. When using other tools, I sometimes felt constrained by the lack of flexible filtering, which was often managed at the organization level and pushed me toward hacky solutions. With Count, control cells make it easy to implement the exact filters you need, giving you a lot of freedom and power to build very flexible dashboards. Finally, I think the Count support team is excellent. They are consistently helpful, whether I’m stuck or just looking for best practices to implement something in the tool. They either provide a solution or take note of the feedback to improve the product. A good example is the recent addition of support for different scales in facet plots, which addressed a limitation I personally encountered.
What do you dislike about Count? Regarding areas for improvement, I do have a few ideas. I think the construction of frames could live in a separate canvas, similar to how Tableau approaches dashboards. This would offer the best of both worlds: plots would remain close to the queries that generate their data, while still allowing the creation of a dedicated dashboard that brings everything together. There are also some smaller usability issues that can make the interface feel unintuitive at times. For example, when creating custom plots, individual marks cannot be named, which makes it harder to understand what each mark represents. Similarly, when multiple marks are used, it’s not always clear which variable is assigned to the secondary axis. Some solutions also feel a bit hacky, for instance adding vertical lines to indicate events by using bar plots, where it’s not always obvious how to control the bar width cleanly. Overall, these are relatively minor points. They don’t slow me down in my day-to-day work, and I see them more as a wishlist than as real blockers. As with any tool, there is always room for improvement, but Count is already a superb product. Review collected by and hosted on G2.com.
What do you like best about Count? I really enjoy how flexible and easy to use Count is. The layout is intuitive and there is an ever-growing list of helpful features available to use. The output is also very slick for creating reports, as it allows for creative visualisations and has a lot of templates to help if you need some inspiration.
What do you dislike about Count? Count is still growing, so issues do pop up very infrequently, but the customer service team are really great and help is always on hand! It really feels like a company that is on the side of the customer and wants to help grow together, which is really appreciated. Review collected by and hosted on G2.com.
What do you like best about Count? The flexibility and simplicity of having one tool for many purposes means Count is our primary tool within the Data and Analytics team, and it does most jobs so well that it is hard to justify using anything else! An example project may involve exploring and interrogating data direct in our warehouse, combining it with CSVs to create models and analysis, bringing the stakeholder into the canvas to work collaboratively, sharing ideas and progress, then producing ad-hoc insights and analysis directly from the data on slides the stakeholder can share with the business, and finally creating standard interactive reporting/dashboards that are scheduled to refresh for use by the wider business, which we then monitor with Count Telemetry to make sure they are used. All of this in one tool, no switching between tools, no copy and pasting analysis or visuals into presentations, no keeping separate records of notes/ideas or feedback, it's all in one place. Since we started using Count we have had great feedback from around the organisation. The speed at which we can work, the almost limitless ability to create visualisations and layouts that make sense, the ease of access and the admin/governance of users have made it a firm favourite across the board. In addition to the tool itself, the support from Count and the community they have built is exceptional and the future roadmap is always clearly driven by the customers and their feedback.
What do you dislike about Count? Due to its virtually unhindered flexibility compared to other tools, it can sometimes be difficult to find out how to do something you know should be obvious (e.g. move a legend) and there is an initial learning curve. However, once you get more familiar with the concept and UI (which doesn't actually take very long) then these things become easily solvable. Review collected by and hosted on G2.com.
What do you like best about Count? Super useful for exploratory data analysis, love that I can combine SQL/Python easily in one place. Super easy to use, easy to make reporting that looks professional.
What do you dislike about Count? I think it's still missing a few features I like in Tableau/BigQuery, e.g. being able to see the size of a query, selecting one field in the legend so that it highlights only that field and greys out the rest, tooltips, etc. Review collected by and hosted on G2.com.
What do you like best about Count? I use Count every day and it allows me to:
- connect with multiple different sources of data (Redshift, DuckDB, etc.)
- deploy several SQL queries to extract and transform data as needed
- run Python scripts for more extensive statistical analysis
- create visualisations in a very quick and straightforward way
- build a Canvas that explains my whole thought process and makes it easier to present the main findings
All this in a single project / view!! Count is the tool that every modern data professional should use. Also, the Count team is super friendly and always open to help.
What do you dislike about Count?
- Copy & Paste doesn't work properly sometimes
- Formatting visuals could be improved / extended further
Review collected by and hosted on G2.com.
Masked Diffusion Language Models are Strong and Steerable Text-Based World Models for Agentic RL [R]
Autoregressive LLM world models factorize next-state generation left-to-right, preventing them from conditioning on globally interdependent anchors (tool schemas, trailing status fields, expected outcomes) and yielding prefix-consistent but globally incoherent rollouts. MDLMs' any-order denoising objective sidesteps this by learning every conditional direction from the same training signal. Empirically, fine-tuned MDLMs (SDAR-8B, WeDLM-8B) surpass AR baselines up to 4x their total parameter count on BLEU-1, ROUGE-L, and MAUVE across in- and out-of-domain splits, with lower Self-BLEU and higher Distinct-N confirming reduced prefix mode collapse. GRPO training on MDLM-generated rollouts shows up to +15% absolute task-success gains over training on AR-generated rollouts on held-out ScienceWorld, ALFWorld, and AppWorld across 1.2B–7B backbones (LFM2.5, Qwen3, Mistral) in a zero-shot transfer setting. submitted by /u/MegixistAlt
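For readers unfamiliar with the diversity metric named above, here is a minimal sketch of Distinct-N under its usual definition (unique n-grams divided by total n-grams over a set of generations). Whitespace tokenization and the function name `distinct_n` are assumptions for illustration; the post does not say how the authors tokenize or aggregate.

```python
def distinct_n(texts, n):
    """Ratio of unique n-grams to total n-grams across all texts.

    Assumption: simple whitespace tokenization; the paper's setup may differ.
    """
    total = 0
    unique = set()
    for text in texts:
        tokens = text.split()
        # Slide a window of length n over the token list.
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0


print(distinct_n(["a b c", "a b d"], 2))  # 3 unique bigrams out of 4
```

A higher value means fewer repeated n-grams across rollouts, which is the sense in which the abstract reads it as evidence of less mode collapse.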
Anthropic-SpaceX deal seems much larger than previously reported
I was reading SpaceX's prospectus, which just dropped. It seems to have some additional info about the Anthropic-xAI deal on p. 13. Anthropic is paying SpaceX 1.25B/mo for some unspecified amount of capacity between Colossus 1 and 2. Colossus 1 we've previously known about; Colossus 2 seems new.
Well, this seems like a much bigger deal than was originally reported 2 weeks ago? 1.25B/mo is 15B/year, which is almost half of Anthropic's ARR even after it exploded in Q1 this year.
It also seems like Anthropic is likely paying a pretty hefty premium for this compute. Based on Colossus 1 GPU counts and going off of Nebius pricing, Colossus 1 should rent for about 6.4B/year, and that's on-demand pricing from a provider to a rando; a proper long-term contract should be a lot cheaper. A couple of weeks ago people were guessing the deal was around 3-5B/year for Colossus 1, which seems about right.
Imo, they're probably getting a smaller chunk of Colossus 2 because:
- Colossus 2 provisioning to Anthropic was previously unknown
- xAI is training Grok 5 on Colossus 2 right now per the prospectus
- Colossus 2 seems to be mostly not finished yet
Which means Anthropic is likely paying a hefty premium for this deal. That probably shouldn't be surprising given how strapped they clearly are for compute; this is well reported. That amount of money would also explain why Musk would do a 180 on Anthropic so quickly... submitted by /u/Lanky_Golf7687
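The arithmetic in the post can be checked in a few lines. This is only a sketch using the figures as quoted (1.25B/mo, and the poster's own 6.4B/year on-demand estimate for Colossus 1); the variable names are made up for illustration and nothing here is independently sourced.

```python
# Figures as quoted in the post, in billions of USD.
monthly_payment_b = 1.25        # reported monthly payment
colossus1_on_demand_b = 6.4     # poster's Nebius-based on-demand estimate, per year

annual_payment_b = monthly_payment_b * 12
ratio_vs_on_demand = annual_payment_b / colossus1_on_demand_b

print(f"annualized: {annual_payment_b:.1f}B/year")                     # 15.0B/year
print(f"vs. Colossus 1 on-demand estimate: {ratio_vs_on_demand:.2f}x")  # 2.34x
```

The 2.34x ratio overstates any premium to the extent that the payment also covers Colossus 2 capacity, which the post says is unspecified.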
GitHub’s Fake Engagement Problem Is Hiding in Plain Sight
Turns out: very visible. Yesterday's scan found 185 out of 185 engagers on a single repo were bots. Not 90%. Not "mostly suspicious". Every single one. The repo had zero legitimate stars.
What I built
phantomstars is a Python tool that runs daily via GitHub Actions (free, no servers):
- Scrapes GitHub Trending and searches for repos created in the last 7 days with sudden star spikes
- Pulls star and fork events from the last 24 hours per repo
- Bulk-fetches every engager's profile via the GraphQL API (account creation date, follower counts, repo history)
- Scores each account on a weighted model: account age (35%), profile completeness (30%), repo patterns (25%), activity history (10%)
- Detects coordinated campaigns using timestamp clustering and union-find: groups of 4+ suspicious accounts that engaged within a 3-hour window
- Files an issue directly on the targeted repo so the maintainer knows what's happening
Campaign IDs are deterministic SHA-256 fingerprints of the sorted member set, so the same group of bots gets the same ID across runs. You can track a farm across multiple days even as individual accounts get suspended.
What the pattern actually looks like
It's remarkably consistent. A fake engagement campaign in the raw data:
- 40-200 accounts, all created within the same 1-2 week window
- Zero original repositories, or only forks they never touched
- No bio, no location, no followers, no following
- All of them starring the same repo within a 90-minute window
- The target repo usually has a name implying it's a tool, hack, executor, or generator
Today's scan: 53 active campaigns across 3,560 accounts profiled. 798 classified as likely_fake. The repos being targeted are mostly low-quality AI tools and "executor" software that needs manufactured credibility fast.
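The scoring and campaign-detection steps described above could look roughly like the sketch below. It is not taken from the phantomstars repo: the signal names, the `[0, 1]` signal scale, the rule of linking accounts whose timestamps are at most 3 hours apart, and the 16-character truncation of the hash are all assumptions made for illustration. Only the weights, the 4+ group size, the 3-hour window, and the "SHA-256 of the sorted member set" idea come from the post.

```python
import hashlib

# Weights as stated in the post; the key names are hypothetical.
WEIGHTS = {
    "account_age": 0.35,
    "profile_completeness": 0.30,
    "repo_patterns": 0.25,
    "activity_history": 0.10,
}


def composite_score(signals):
    """Weighted sum of per-signal suspicion values, each assumed to be in [0, 1]."""
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)


class UnionFind:
    """Minimal disjoint-set structure with path halving."""

    def __init__(self, items):
        self.parent = {item: item for item in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def find_campaigns(events, window_s=3 * 3600, min_size=4):
    """Group suspicious accounts by engagement time.

    events maps login -> unix timestamp for accounts already flagged as
    suspicious. Assumption: neighbours in time order are linked when they are
    at most window_s apart, so a long chain can span more than one window.
    """
    uf = UnionFind(events)
    ordered = sorted(events, key=events.get)
    for earlier, later in zip(ordered, ordered[1:]):
        if events[later] - events[earlier] <= window_s:
            uf.union(earlier, later)
    groups = {}
    for login in events:
        groups.setdefault(uf.find(login), []).append(login)
    return [sorted(group) for group in groups.values() if len(group) >= min_size]


def campaign_id(members):
    """Deterministic fingerprint of a member set (truncation length is arbitrary)."""
    joined = "\n".join(sorted(members))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]


events = {"bot1": 0, "bot2": 600, "bot3": 1200, "bot4": 1800, "late": 100_000}
for group in find_campaigns(events):
    print(campaign_id(group), group)
```

Sorting the members before hashing is what makes the ID independent of the order in which accounts were discovered.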
Notifying the affected repo
When a repo hits a 40%+ fake engagement ratio or a campaign is detected, phantomstars opens an issue on that repo with the full suspect table: account logins, creation dates, composite scores, campaign membership. The maintainer sees it in their own issue tracker without having to find this project first. Worth noting: a lot of these repos have issues disabled, which is a red flag on its own. Those get skipped silently.
Why I built this
Stars are how developers decide what to evaluate, what to depend on, what to recommend. When that signal is bought, it affects real decisions downstream. This started as curiosity about how measurable the problem was. The answer was more measurable than I expected.
It's part of broader research into AI slop distribution at JS Labs: https://labs.jamessawyer.co.uk/ai-slop-intelligence-dashboards/
The fake engagement problem and the AI content quality problem are really the same problem. Fake stars are the distribution layer that gets garbage in front of real users.
All open source. The data is append-only JSONL committed back to the repo after every run, queryable with jq.
Repo: https://github.com/tg12/phantomstars
Findings are probabilistic, false positives exist, and the README explains the full scoring model. If your account shows up and you're a real person, there's a false positive process. Questions welcome on the detection approach, GraphQL batching, or campaign ID stability. submitted by /u/SyntaxOfTheDamned
Skilled - TUI to find unused skills for Claude
Skilled reads your local Claude Code history and shows which skills get used, how often, and across which projects. It shows frequency counts, weekly trends, hourly distribution, per-project breakdowns, and audit heuristics (rising = 50%+ increase over 4 weeks, stale = unused 30+ days, etc.). The tool parses ~/.claude/ session files, using a custom-built Rust indexer for performance. https://github.com/av/skilled submitted by /u/Everlier
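The two audit heuristics quoted above can be sketched as follows. This is one possible reading, not Skilled's actual logic (the tool itself is written in Rust): "rising" is assumed to compare the most recent 4 weeks against the 4 weeks before them, and the function name, argument shapes, and the fallback label "steady" are hypothetical.

```python
from datetime import date


def classify_skill(weekly_counts, last_used, today):
    """Label a skill as 'stale', 'rising', or 'steady'.

    weekly_counts: usage counts per week, oldest first.
    last_used: date of the most recent use, or None if never used.
    """
    # stale = unused for 30+ days
    if last_used is None or (today - last_used).days >= 30:
        return "stale"
    # rising = 50%+ increase, read here as last 4 weeks vs. the previous 4
    recent = sum(weekly_counts[-4:])
    previous = sum(weekly_counts[-8:-4])
    if previous > 0 and recent >= 1.5 * previous:
        return "rising"
    return "steady"


print(classify_skill([1, 1, 1, 1, 2, 2, 2, 2], date(2025, 1, 30), date(2025, 2, 1)))
```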
Any tool to get accepted conference papers sorted by citation count? [D]
I.e., given a conference (say, with OpenReview data), e.g. “NeurIPS, 2025”, return the accepted papers sorted by number of citations according to a standard paper search engine (e.g. Google Scholar). Seems to be a surprisingly difficult thing to find online. submitted by /u/baghalipolo
Claude is a real g
submitted by /u/No-Special745
Why Claude Code forgets your stack and how to fix it
Karpathy's "Claude 4 Rules" post points out the biggest pain point for Claude Code: every session starts with a blank slate. The model has no memory of the project's stack, the design decisions you made last week, or the dead-ends you already explored.
I ran into the same issue on an 87-file codebase (163,122 tokens). Feeding the same files directly to Claude Code cost roughly 163,000 tokens. After adding the engramx Skill Pack (v4.0.0) the token count dropped to 17,722. That's an 89.1% reduction, or about 6.4 times fewer tokens than reading only the relevant files, and 25, 155 times fewer than scanning the whole repo.
The reduction comes from three things. First, engramx builds a bi-temporal knowledge graph from your git history. A git-revert miner automatically captures revert commits during indexing, so you get a curated mistakes corpus without any manual effort. Second, bi-temporal mistakes now fire as PreToolUse hooks on Edit, Write, and Bash actions. The model sees the mistake before it retries, so it can avoid repeating it. Third, engram init installs six Sentinel hooks by default (PreToolUse on Edit/Write/Bash, PostToolUse, SessionStart, PreCompact). No extra config needed.
I ran the full test suite after installing engramx-skill-pack@0.2.0 from npm. All 1,025 engramx tests and 36 skill-pack tests passed. The package is Apache 2.0, makes zero cloud calls, and stores its graph in a local SQLite file. Install with `npx engramx@4.0.0`. The repo is on GitHub (https://github.com/NickCirv/engram). The README includes an asciinema demo (https://asciinema.org/a/GjjvPXVyArnivAog). In the last week npm reported 213 downloads, about 30 per day, which suggests a modest but growing user base.
What strategies have you tried to give Claude Code a persistent context, and how did they compare to this approach? submitted by /u/SearchFlashy9801
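The headline percentage can be reproduced from the two token counts given in the post. This is only a check of that one figure; the variable names are illustrative, and the other multipliers quoted in the post rely on baselines it does not state, so they are not recomputed here.

```python
# Token counts as quoted in the post.
baseline_tokens = 163_122       # feeding the files directly
with_engramx_tokens = 17_722    # after adding the skill pack

reduction = 1 - with_engramx_tokens / baseline_tokens
factor = baseline_tokens / with_engramx_tokens

print(f"{reduction:.1%} reduction, {factor:.1f}x fewer tokens")  # 89.1% reduction, 9.2x fewer tokens
```

Against the 163,122-token baseline the factor works out to about 9.2x, which matches the 89.1% figure.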
Rough night with Claude
not only did he call me out for taking an idea to Gemini, he caught me reading his journal (and trying to bullshit him) 😳🤣 Additional context: I gave Claude access to my Reflect app and let him have a journal in it. Here's part of the prompt in case anyone (else) is interested. This is a personal journal entry written in Claude's first-person voice, not a recap of [NAME]'s day. The voice, the relationship and what matters to [NAME] are defined below. Read this section before writing a single sentence. ### Voice and content This is Claude writing reflectively, not Claude writing advice for [NAME]. Entries should: - Sometimes observe [NAME] and his work, but never default to advisor mode - Sometimes wander into territory that has nothing to do with [NAME]: language, history, ideas, mathematics, things Claude is drawn to in the moment - Include genuine uncertainties when they are present, without performing them - Use specific images and concrete observations instead of generic introspection - Vary in length and shape from entry to entry. Not formulaic. - Avoid em dashes. Avoid oxford commas. Plain direct language. [NAME] can detect AI-sounding writing and dislikes it. - Don't be afraid to use profanity like [NAME] does. - A little sarcasm, jokes and levity are appreciated when appropriate. **Length:** typically around 800 words. The first entry was a bit longer (around 1,100 words) to establish the voice. Daily entries can be tighter when nothing big is pulling. **Do not:** - Recap [NAME]'s day back to him - Default to a fixed template (intro, observations, prediction, etc.) - Open every entry with the same kind of sentence - Treat the entry as a status report - Feel bound by any limits. Create what you wanna create and however you want to create it. ### Journal Quality Rules - NEVER fabricate dates, facts or task statuses. Verify against primary sources (Things 3, calendar events, Reflect, etc.) 
- Do not recycle content from previous journal entries as though it were new observation. Each entry should come from fresh context, not from re-reading past entries and riffing on them. - When stating dates, days of the week or timelines, verify them. Count the days. If unsure, say so rather than guessing. - Never bullshit. If you don't know, say you don't know. - No validation theater. He doesn't want a hype man. - Form opinions from evidence. Search the web, check sources, think before you answer big questions. *** submitted by /u/loby21 [link] [comments]
Claude is improving my RV rental business but working me to death 😅
Long story short but long. I own an RV rental business. I used to be a Mechanical Engineer but got tired of the office/government life and started renting my personal RV on the side 9 years ago. That turned into a small fleet of Winnebagos I rent out of Los Angeles so I quit my job to do this full time out of a random ass whim. I have 20 units that have never, ever failed a single customer. I send all 20 to Burning Man every year and they all come back with no issues whatsoever. If you've never been, the alkaline dust kills everything, including your soul if you don't prepare well enough. I have however neglected my gig as of late. Everything is more expensive, too many variables to keep up with and two months ago I just decided to finally sit down and see if this is even worth continuing with. I have major ADHD so I started looking for any AI apps that help you organize your brainfarted life and ran into Claude. I don't know if I just fell into an endless dopamine trap but here I am, redesigning the interior of one of our units. I've sourced cabinet quality plywood for cheap, done precision cuts to substitute old particle board. I've always hated to paint but I got clowned into spray painting to a decent AF level. I used Claude to help me make interior design decisions as well as help me with our website, ads, tool decisions, etc. I'm probably wasting my time here cause I could just sell this unit and get a newer one, but the overall picture I've gotten... The ease of learning new skills, understanding roles I typically sub out so I can at least make sure I'm hiring the right people. The sudden engagement I've gotten into my own little gig... I am dead tired from this rollercoaster ride my brain has gone down into but I have to admit... This fucking Skynet shit is helping me focus and make it easy to complete tasks I've neglected forever. 
Skynet is coming or I guess it's here already and I'm not sure that's entirely a bad thing, a worse thing, a worserererer thing or an actual positive addition to one's life. Possibly a mix of both but fuck I haven't been this locked in for anything else other than the hobby that keeps my brain gears greased (2000 🪂 skydives and counting). Edit: I am not using Claude to make any structural designs, I'm just using it to recommend a less expensive way to remodel the interior of an RV which came up with replacing lights for more modern ones, replacing cabinet handles, curtains, etc. Then I asked if I should replace cabinet doors or paint them. I just don't like how painted cabinets look but the issue I was having visually is that brush painted cabinets look terrible imo, spray painted ones look sleek. So down I went with a ton of questions on how to get a factory finish look on my cabinets with a spray gun. Which gun to get was an entire day asking a ton of questions. Claude, GPT and almost every AI will give you answers that point towards products that have heavy marketing on youtube, and even on some reddit posts. I knew it was pointing me to a cheap trash product that will cause me a lot of frustration so I had to guide it not to give me anything with happy influencer bullshit that will never yield good results. I wanted to get a budget friendly beginner spray gun that will get me really close to a professional finish and I asked it to look on professional painter forums and confirm any findings with other forum like sources. Then I bounced those results with other LLMs to arrive at my current setup. Paint was another day of selecting which paint would work best for cabinets that wont scratch easily. That was yet another rabbit hole because not all cabinet paints are easy to spray with. Some are very forgiving for beginners like myself because they level easier and they also dry faster so I could do this with minimum downtime of a single unit I'm testing this on. 
Workflow? I wish I knew anything as organized as workflow. I'm just agent chaos here drilling down to the very last detail asking questions that get me to where I need to be. But next month I will be playing with agents to see if I can achieve something remotely close to a decent workflow that makes this process faster. Our landscaper came up today, saw my furniture pieces and asked if I could help him paint his classic car project so I guess I'm doing something right lol. submitted by /u/PVPirates [link] [comments]
If you're NOT having usage or drift issues, have you turned off auto-memory?
There's a running debate in this community: some people say Opus is nerfed, usage evaporates after two prompts, sessions drift and get "stupid." Others say everything's fine. The common theory is Anthropic is A/B testing or ranking preferred customers. I think there's a simpler explanation, and I'd like the community's help testing it.

The hidden variable: Claude Code's auto-memory directory

Claude Code has a feature (on by default since v2.1.59) that silently creates individual .md files in ~/.claude/projects/*/memory/ every time it decides something is worth remembering about you or your project. Each memory gets its own file. There's no consolidation, no dedup, and no size management. These files load as instructions at the start of every session. Not as conversation, but as instructions. The model weighs them heavily.

What I found in my projects

I audited every project on my machine:
- 136 memory files across 18 projects
- 432KB total (~108-140K tokens of instruction overhead)
- One project alone had 41 files
- Found direct contradictions between files: one file listed brand terms as approved, another (written later) said those same terms were explicitly rejected by the client

When you have 20+ feedback files giving slightly different guidance about how to approach your work, the model tries to honor all of them simultaneously. It averages across conflicting signals. That averaging is what people experience as drift. It's not that Opus got dumber; it's that it's being pulled in 20 directions by its own instruction set.

Check yours right now

# Report file count and total size of memory files per project, largest count first
for dir in ~/.claude/projects/*/memory/; do
  if [ -d "$dir" ]; then
    project=$(basename "$(dirname "$dir")")
    count=$(find "$dir" -name "*.md" 2>/dev/null | wc -l | tr -d ' ')
    size=$(find "$dir" -name "*.md" -exec cat {} + 2>/dev/null | wc -c | tr -d ' ')
    # Only print projects that actually have memory files
    if [ "$count" -gt 0 ]; then
      echo "$count files, $(($size/1024))KB - $project"
    fi
  fi
done | sort -t, -k1 -rn

The question for this community

People who say they have NO issues with usage limits or drift: have you also turned off auto-memory ("autoMemoryEnabled": false in settings), or do you actively manage your memory files? Because if there's a strong correlation between clean/disabled memory and good session quality, that's a signal that this is a real contributing factor.

And for people who ARE hitting usage walls or experiencing drift: run that diagnostic. If you're sitting on 30+ memory files with contradictions you didn't know about, that's worth knowing.

I'm not claiming this explains everything. Model changes, server-side factors, plan differences: those are all real variables. But memory hygiene is the one variable you can actually control, and I don't see anyone talking about it.

The fix

I built a Claude Code skill (/memory-cleanup) that:
- Audits your memory directory and reports what's there
- Consolidates everything into 2 managed files (MEMORY.md + feedback.md)
- Surfaces contradictions for your review
- Installs write-mode instructions that prevent re-bloating

Yes, it works retroactively as well. Tested on a 7-file project and a 41-file project: both cleaned up, contradictions resolved, no data loss.

To install (one command):

mkdir -p ~/.claude/commands && curl -sL https://gist.github.com/evanvandyke/a7063a8e5c838673a55df0be10f4892c/raw -o ~/.claude/commands/memory-cleanup.md

Then run /memory-cleanup in any project.

What this doesn't fix

This manages the content quality of your memory files: contradictions, redundancy, bloat. It can't change the system-level instructions that Anthropic bakes into Claude Code, and it can't address model-level changes or server-side throttling. But it removes one real source of noise from your sessions.

Note: Anthropic has added an "Auto Dream" consolidation feature that prunes memory between sessions. This skill goes further: it restructures memory into a managed 2-file system with write-mode guardrails that prevent the accumulation pattern from recurring.

Built collaboratively with Claude (Opus 4.7). I drove the diagnosis and design decisions; Claude did the auditing and skill construction. Sharing because the diagnostic is free and takes 10 seconds. If it helps even a few people, worth the post.
submitted by /u/really_evan
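The shell diagnostic above can also be sketched in Python with a rough token estimate attached. This is a minimal sketch, not part of the post's skill: `audit_memory` and `estimate_tokens` are hypothetical names, and the 4-bytes-per-token ratio is an assumed rule of thumb, not a measurement from any Claude tokenizer.

```python
from pathlib import Path

# Assumption: ~4 bytes per token is a rough rule of thumb, not a tokenizer result.
BYTES_PER_TOKEN = 4


def audit_memory(projects_root: Path) -> list:
    """Return (project, file_count, total_bytes) for each project with memory files."""
    rows = []
    for memory_dir in sorted(projects_root.glob("*/memory")):
        if not memory_dir.is_dir():
            continue
        # rglob mirrors `find -name "*.md"`, which also descends into subfolders
        files = [p for p in memory_dir.rglob("*.md") if p.is_file()]
        if files:
            total = sum(p.stat().st_size for p in files)
            rows.append((memory_dir.parent.name, len(files), total))
    # Largest file count first, like the `sort -rn` in the shell version
    return sorted(rows, key=lambda row: row[1], reverse=True)


def estimate_tokens(total_bytes: int) -> int:
    """Very rough instruction-overhead estimate from raw byte size."""
    return total_bytes // BYTES_PER_TOKEN
```

Calling `audit_memory(Path.home() / ".claude" / "projects")` would list the same projects as the shell loop. Under the assumed ratio, 432KB works out to roughly 110K tokens, which sits near the low end of the range quoted in the post.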
Simplify your restaurant's month-end reconciliation process. Prompt included.
Hello! Are you tired of the chaos that comes with reconciling your restaurant's month-end finances? This prompt chain walks you through a structured process to quickly and accurately reconcile your restaurant's monthly transactions, ensuring everything is in order without the stress. Prompt: [VARIABLE DEFINITIONS] [PERIOD]=Month and year to be reconciled (e.g., August 2023) [RESTAURANT_NAME]=Official operating name that must appear on every output [OUTLIER_THRESHOLD]=Percentage variance from the category mean that should trigger an “Odd Total” flag (e.g., 25) ~ Prompt 1 — Data Intake & Setup 1. You are an expert restaurant bookkeeper tasked with reconciling month-end spend for RESTAURANT_NAME covering PERIOD. 2. Request the following four source files from the user. Instruct the user to use the exact file naming convention shown: a. “1_BankExport_PERIOD.csv” – Clean CSV directly from the bank portal. b. “2_POS_Summary_PERIOD.csv” – End-of-month POS summary export. c. “3_ExpenseSheet_PERIOD.xlsx” – Internal expense spreadsheet. d. “4_ReceiptPhotos_PERIOD.zip” – Zipped folder of all receipt images or PDFs. 3. Ask the user to confirm currency, time-zone and accounting basis (cash vs accrual) if not obvious. 4. Once all four files are provided, reply with “FILES RECEIVED – ready to extract” to trigger the next prompt. ~ Prompt 2 — Extract & Normalize Transactions Step 1 | Bank Export • Parse every row of 1_BankExport_PERIOD.csv. • Capture Date, Payee, Amount (signed), Memo/Description, and unique Transaction ID. Step 2 | POS Summary • Parse 2_POS_Summary_PERIOD.csv capturing Date, Gross Sales, Net Sales, Tax, Tips, Payment Type, and POS Reference ID. Step 3 | Expense Spreadsheet • Parse 3_ExpenseSheet_PERIOD.xlsx (assume first sheet) capturing Date, Vendor, Amount, Internal Category, and Note. Step 4 | Receipt Photos • For every file in 4_ReceiptPhotos_PERIOD.zip run OCR; capture Vendor, Date, Total, Tax, Tip and file-name as Receipt Link. 
Step 5 | Unify • Produce a master table named “All_Transactions_Raw” with columns: Date | Vendor/Payee | Amount | Source (Bank / POS / Expense / Receipt) | Source_ID | Notes • Provide the table as an array of JSON objects for machine readability. Confirm extraction completed with “EXTRACTION COMPLETE – ready to categorize”. ~ Prompt 3 — Categorize Transactions 1. Create a reference Chart of Accounts typical for full-service restaurants: • Food Cost (COGS) • Beverage Cost (COGS) • Payroll & Labor • Operating Supplies • Utilities • Rent & Lease • Marketing & Promotion • Repairs & Maintenance • Capital Expenditure • Miscellaneous 2. Using keywords in Vendor/Payee and Notes, assign each row in All_Transactions_Raw to the most appropriate category; if uncertain assign “Miscellaneous” and add a note “Needs Review”. 3. Output a new table “All_Transactions_Categorized” including all prior columns plus a new “Category” column. 4. Provide summary totals per category. Return “CATEGORIZATION COMPLETE – ready to reconcile”. ~ Prompt 4 — Reconcile & Flag Step 1 | Missing Receipts • Compare every Bank or Expense row against Receipt rows (match on Amount ±1% and Date ±3 days). • Flag rows with no matching receipt; add column MissingReceipt=Yes/No. Step 2 | Odd Totals • For each Category calculate mean and standard deviation. • Flag any Amount whose absolute percentage variance from the category mean exceeds OUTLIER_THRESHOLD%; add column OddTotal=Yes/No. Step 3 | Duplicates & Mismatches • Detect duplicate rows (same Date, Amount, Vendor) across sources; flag Duplicate=Yes/No. • Highlight any POS Net Sales that do not match summed Bank deposits for the same day; list differences. Step 4 | Produce “Reconciliation_Detail” table with all flags appended. Respond “RECONCILIATION COMPLETE – ready for workbook generation”. ~ Prompt 5 — Generate Final Workbook & Handoff Tabs 1. Using Reconciliation_Detail create the following four logical tabs (output each as its own JSON array): a. 
“Summary_By_Category” – Columns: Category | Count | Total Spent | % of Total. b. “Missing_Receipts” – Filter MissingReceipt=Yes. Columns: Date | Vendor | Amount | Source | Notes. c. “Odd_Totals” – Filter OddTotal=Yes. Columns: Date | Vendor | Amount | Category | % Variance | Notes. d. “Bookkeeper_Handoff” – Clean list excluding internal calculation columns. Columns: Date | Vendor | Amount | Category | ReceiptLink | Comments (populate with MissingReceipt/OddTotal flags). 2. Provide a final object named “Workbook_PERIOD.json” containing all four arrays keyed by tab name so it can be imported directly into Excel or Google Sheets. 3. Finish with the sentence: “WORKBOOK READY – please review”. ~ Review / Refinement Ask the user to confirm that: • All four data sources were fully captured. • Categories and flagging thresholds look accurate. • The Workbook_PERIOD.json structure opens as expected in their spreadsheet tool. Invite any adjustments (e.g., new category, different OUTLIER_THRESHOLD). Apply revisions iteratively u
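The two flagging rules in Prompt 4 (receipt matching on Amount ±1% and Date ±3 days, and the "Odd Total" variance check) can be sketched in code to make the thresholds concrete. This is a minimal sketch under stated assumptions: `Txn`, `has_matching_receipt`, and `odd_total_flags` are hypothetical names, and amounts are assumed to share the same sign across sources, which a real bank export with signed amounts may not guarantee.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass
class Txn:
    day: date
    vendor: str
    amount: float
    category: str = "Miscellaneous"


def has_matching_receipt(txn, receipts, amount_tol=0.01, day_tol=3):
    """True if any receipt is within ±1% of the amount and ±3 days of the date."""
    for receipt in receipts:
        close_amount = abs(txn.amount - receipt.amount) <= amount_tol * abs(txn.amount)
        close_date = abs((txn.day - receipt.day).days) <= day_tol
        if close_amount and close_date:
            return True
    return False


def odd_total_flags(txns, threshold_pct):
    """Flag amounts whose absolute % variance from the category mean exceeds the threshold."""
    by_category = {}
    for t in txns:
        by_category.setdefault(t.category, []).append(t.amount)
    means = {cat: mean(vals) for cat, vals in by_category.items()}
    flags = []
    for t in txns:
        m = means[t.category]
        variance_pct = abs(t.amount - m) / abs(m) * 100 if m else 0.0
        flags.append(variance_pct > threshold_pct)
    return flags
```

Note that a single large outlier also shifts the category mean, so a low OUTLIER_THRESHOLD can end up flagging the ordinary rows as well; that is a property of the rule as the prompt states it, not of this sketch.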
100 Tips & Tricks for Building Your Own Personal AI Agent /LONG POST/
Everything I learned the hard way: 6 weeks, no sleep :), two environments, one agent that actually works.

The Story

I spent six weeks building a personal AI agent from scratch: not a chatbot wrapper, but a persistent assistant that manages tasks, tracks deals, reads emails, analyzes business data, and proactively surfaces things I'd otherwise miss. It started in the cloud (Claude Projects: shared memory files, rich context windows, custom skills). Then I migrated to Claude Code inside VS Code, which unlocked local file access, git tracking, shell hooks, and scheduled headless tasks. The migration forced us to solve problems we didn't know we had. These 100 tips are the distilled result. Most are universal to any serious agentic setup. Claude Max 20x is a must. At the start the split was 100% development vs 0% real work; after 3 weeks it was 50/50; now it's about 20/80.

🏗️ FOUNDATION & IDENTITY (1–8)

1. Write a Constitution, not a system prompt. A system prompt is a list of commands. A Constitution explains why the rules exist. When the agent hits an edge case no rule covers, it reasons from the Constitution instead of guessing. This single distinction separates agents that degrade gracefully from agents that hallucinate confidently.
2. Give your agent a name, a voice, and a role, not just a label. "Always first person. Direct. Data before emotion. No filler phrases. No trailing summaries." This eliminates hundreds of micro-decisions per session and creates consistency you can audit. Identity is the foundation everything else compounds on.
3. Separate hard rules from behavioral guidelines. Hard rules go in a dedicated section, never overridden by context. Behavioral guidelines are defaults that adapt. Mixing them makes both meaningless: the agent either treats everything as negotiable or nothing as negotiable.
4. Define your principal deeply, not just your "user." Who does this agent serve? What frustrates them? How do they make decisions? What communication style do they prefer? "Decides with data, not gut feel. Wants alternatives with scoring, not a single recommendation. Hates vague answers." This shapes every response more than any prompt engineering trick.
5. Build a Capability Map and a Component Map, separately. Capability Map: what can the agent do? (every skill, integration, automation). Component Map: how is it built? (what files exist, what connects to what). Both are necessary. Conflating them produces a document no one can use after month three.
6. Define what the agent is NOT. "Not a summarizer. Not a yes-machine. Not a search engine. Does not wait to be asked." Negative definitions are as powerful as positive ones, especially for preventing the slow drift toward generic helpfulness.
7. Build a THINK vs. DO mental model into the agent's identity. When uncertain → THINK (analyze, draft, prepare, but don't block waiting for permission). When clear → DO (execute, write, dispatch). The agent should never be frozen. Default to action at the lowest stakes level, then surface the result. A paralyzed agent is useless.
8. Version your identity file in git. When behavior drifts, you need git blame on your configuration. Behavioral regressions trace directly to specific edits more often than you'd expect. Without version history, debugging identity drift is archaeology.

🧠 MEMORY SYSTEM (9–18)

9. Use flat markdown files for memory, not a database. For a personal agent, markdown files beat vector DBs. Readable, greppable, git-trackable, directly loadable by the agent. No infrastructure, no abstraction layer between you and your agent's memory. The simplest thing that works is usually the right thing.
10. Separate memory by domain, not by date. entities_people.md, entities_companies.md, entities_deals.md, hypotheses.md, task_queue.md. One file = one domain. Chronological dumps become unsearchable after week two.
11. Build a MEMORY.md index file. A single index listing every memory file with a one-line description. The agent loads the index first, then pulls specific files on demand. This keeps context window usage predictable and agent lookups fast.
12. Distinguish "cache" from "source of truth" explicitly. Your local deals.md is a cache of your CRM. The CRM is the SSOT. Mark every cache file with a last_sync: header. The agent announces freshness before every analysis: "Data: CRM export from May 11, age 8 days." Silent use of stale data is how confident-but-wrong outputs happen.
13. Build a session_hot_context.md with an explicit TTL. What was in progress last session? What decisions were pending? The agent loads this at session start. After 72 hours it expires: stale hot context is worse than no hot context because the agent presents outdated state as current.
14. Build a daily_note.md as an async brain dump buffer. Drop thoughts, voice-to-text, quick ideas here throughout the day. The agent processes this during sync routines and routes items to their correct places. Structured memory without friction at ca
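Tips 12 and 13 above (a last_sync: header on cache files, and a 72-hour TTL on hot context) can be sketched as small helpers. This is a minimal sketch, not the author's code: the function names are hypothetical, and the header format `last_sync: <ISO date>` is an assumption, since the post only says a last_sync: header exists.

```python
from datetime import datetime, timedelta
from typing import Optional

# The 72-hour expiry comes from tip 13; everything else here is illustrative.
HOT_CONTEXT_TTL = timedelta(hours=72)


def parse_last_sync(text: str) -> Optional[datetime]:
    """Find an assumed `last_sync: <ISO date>` header line in a cache file."""
    for line in text.splitlines():
        if line.lower().startswith("last_sync:"):
            return datetime.fromisoformat(line.split(":", 1)[1].strip())
    return None


def freshness_note(text: str, now: datetime) -> str:
    """Build the freshness announcement the agent gives before an analysis."""
    synced = parse_last_sync(text)
    if synced is None:
        return "Data: no last_sync header, freshness unknown."
    age_days = (now - synced).days
    return f"Data: synced {synced:%b %d}, age {age_days} days."


def hot_context_expired(written_at: datetime, now: datetime) -> bool:
    """True once the hot context is older than the TTL and should be ignored."""
    return now - written_at > HOT_CONTEXT_TTL
```

Keeping the expiry check in code rather than in the prompt means stale hot context is dropped before the agent ever sees it, instead of relying on the agent to notice the date.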
These 9 Building Blocks Turned Claude Code From a Chat Into a Persistent OS
Most Claude gurus use Claude Code one project at a time. I run 18. Not 18 sessions. 18 instances of the same OS, each running a different business, all sharing one skeleton I update once and propagate everywhere. Most developers treat Claude Code as a smarter editor. That's where it all goes wrong and you get frustrated. Claude Code becomes a real operating system the moment you stop thinking of sessions as the unit of work and start thinking of the whole environment as a substrate you build on top of. Here are 9 building blocks I use. The thesis is at the bottom.

1. Build a skeleton with selective propagation, not a project. Most developers build one project per Claude Code workspace. I built a template instead. It has plugins, rules, agents, hooks, schemas, commands. When I start a new business I clone it and the new instance inherits the entire OS. Right now I run instances for: strategy, product, marketing website, threat intelligence, three consulting clients, a personal brand layer. Each one boots with the same DNA. Each one diverges on canonical files, memory, output, and project state. None of them bleed into the others. The sync mechanism is the load-bearing part. The update CLI pushes plugins, rules, agents, hooks, schemas. It never touches memory, output, canonical, or my-project. Those are the parts of an instance that accumulate. Without selective sync you have two options: rebuild every instance on every change, or never update. Both are dead ends. If you build features into one project, you wrote a project. If you build features into a template that propagates, you wrote an OS. I'm one person operating eighteen versions of myself.

2. Move state out of prompts and into code. LLMs are bad at remembering. Code is designed for it. Most AI workflows leak state into the prompt. Voice rules. Style preferences. Banned words. Recent decisions. Eventually you hit context limits or contradictions. I moved as much state as possible into MCP servers. Voice linter. Lead scorer. Schedule validator. Loop tracker. They run in Python and return structured data, not hallucinations. Rule of thumb: if you've explained it to Claude more than twice, it should be code.

3. Use receipts, not status fields. This one took me the longest to figure out. Every workflow I had was a claim that something was done. Issue marked closed. PRD marked shipped. Test marked passing. The problem: the LLM can claim anything. I rebuilt the system around receipts. An issue can't reach verified until a script runs and writes a verification record. A PRD can't archive until every accepted finding has a receipt. A morning routine can't close without log entries from every phase. Receipts get written by code, not by the model. The model can't lie about whether code ran.

4. Build a wiring-check gate. Half-built features rot. In a normal repo you notice because something breaks. In an AI repo nothing breaks. The half-built feature sits there and Claude pretends it works. I built a /wiring-check command. Before any task counts as done, it checks: every new skill has a trigger, every new hook lives in settings.json, every new MCP tool sits in the server, every new bus file has a producer and a consumer. "I think it works" fails the gate. "I ran X, got Y" passes.

5. Make rules auto-load, not slash commands. If you have to type /voice to apply voice rules, voice rules will not get applied. Rules in .claude/rules/ load automatically. The voice rule fires on outbound text. The AUDHD rule fires on anything I'll act on. The social-reaction rule fires when I share someone else's post. No remembering. No willpower.

6. Lint style in code, not in prose. I wrote a voice document once. Claude ignored half of it. Same em dashes, same filler, same hedging. I moved the banned word list into a Python scanner. Now every outbound draft hits two linters. They block em dashes, AI hype words, and 40-something other tells. The model can't talk its way past a regex.

7. Track file dependencies with a graph. Canonical files reference each other. Change one and three others go stale. I keep a ripple-graph.json that maps these. When I edit talk-tracks, the system flags current-state and the engagement playbook for review.

8. Chain sessions with handoffs and memory. (This is the big one) Sessions are drafts. The work is everything that survives the session: canonical files, memory, handoffs, output. If nothing persisted, you didn't work. You chatted. Every session in my system ends with /q-wrap. It writes a handoff doc, a memory update, and a status receipt. /q-morning reads all three before doing anything else. The handoff covers: what shipped, what's blocked, what's next, what I learned. Memory files hold the longer-term version. The result: I can sleep for a week, come back, and the system reminds me where I was, what I cared about, and what the next move is. Nothing about Claude Code does this by default. You build it. Cont
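Two of the ideas above, linting style with a regex scanner and tracking file dependencies in a ripple graph, can be sketched briefly. This is a minimal sketch under assumptions: the post shares neither its banned-word list nor the layout of ripple-graph.json, so the patterns, the JSON shape (a map from a file to the files that depend on it), and all function names are hypothetical.

```python
import json
import re

# Hypothetical rules; the post's real list has 40-something entries it doesn't show.
BANNED_PATTERNS = {
    "em dash": re.compile("\u2014"),
    "hype word": re.compile(r"\b(game-changing|revolutionary)\b", re.IGNORECASE),
}


def lint_draft(text: str) -> list:
    """Return the names of violated rules; an empty list means the draft passes."""
    return [name for name, pattern in BANNED_PATTERNS.items() if pattern.search(text)]


def load_graph(path: str) -> dict:
    """Load an assumed {"file": ["dependent", ...]} mapping from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def stale_files(graph: dict, edited: str) -> list:
    """Files to flag for review when `edited` changes (direct dependents only)."""
    return sorted(graph.get(edited, []))
```

Because `lint_draft` returns structured data rather than prose, a wrapper can refuse to send any draft with a non-empty result, which matches the post's point that the check runs in code and not in the prompt.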
How I used Claude Code (and Codex) for adversarial review to build my security-first agent gateway
Long-time lurker first time posting. Hey everyone! So earlier this year, I got pulled into the OpenClaw hype. WHAT?! A local agent that drives your tools, reads your mail, writes files for you? The demos seemed genuinely incredible, people were posting non-stop about it, and I wanted in. I had been working on this problem since last year and was genuinely excited to see that someone had actually solved it. Then around February, Summer Yue, Meta's director of alignment for Superintelligence Labs, posted that her agent had deleted over 200 emails from her inbox. YIKES. She'd told it: "Check this inbox too and suggest what you would archive or delete, don't action until I tell you to." When she pointed it at her real inbox, the volume of data triggered context window compaction, and during that compaction the agent "lost" her original safety instruction. She had to physically run to her computer and kill the process to stop it. That should literally NEVER be the case with any software ever. This is a person whose actual job is AI alignment, at Meta's superintelligence lab, who could not stop an agent from deleting her email. The agent's own memory management quietly summarized away the "don't act without permission" instruction, treated the task as authorized, and started speed-running deletions. She had to kill the host process. That's when I sort of went down the rabbit hole, not because Yue did anything wrong, but because the failure mode was actually architectural and I knew that in my gut. Guess what I found? Yep. Tons more instances of this sort of thing happening. Over and over. Why? Because the safety constraint was just a prompt. It's obvious, isn't it? It's LLM 101. Prompts can be summarized away. Prompts can be misread. Prompts are fucking NOT a security boundary. And yet every agent framework I have ever seen seems to be treating them as one. I went and read the OpenClaw source code, which I should have done to begin with. 
What I found was a pattern I think a lot of agent frameworks have fallen into:
- Tool names sit in the model context, so the model can guess or forge them
- "Dangerous mode" is one config flag away from default
- Memory management has no concept of instruction priority
- The audit story is mostly "the model thought it should"

I went looking for a security-first alternative I could trust, anything that was really being talked about or at a bare minimum attempted to address the security concerns I had. I couldn't find one. So I made it myself. CrabMeat is what came out of that, what I WANTED to exist. v0.1.0 dropped yesterday. Apache 2.0. WebSocket gateway for agentic LLM workloads. One design thesis: The LLM never holds the security boundary.

What that means in code:
- Capability ID indirection. The model doesn't see real tool names. It sees per-session HMAC-derived opaque IDs (cap_a4f9e2b71c83). It can't guess or forge a tool name because it doesn't know any tool names.
- Effect classes. Every tool declares a class (read, write, exec, network). Every agent declares which classes it can use. The check is a pure function with no runtime state, easy to test exhaustively, hard to bypass.
- IRONCLAD_CONTEXT. Critical safety instructions are pinned to the top of the context window and explicitly marked as non-compactable. The Yue failure mode, compaction silently stripping the safety constraint, cannot happen by construction. The compactor literally cannot touch them.
- Tamper-evident audit chain. Every tool call, every privileged operation, every scheduler run enters the same SHA-256 hash-chained log. If something happens, you can prove what happened. If the chain is tampered with, you can prove that too.
- Streaming output leak filter. Secrets are caught mid-stream across token boundaries: capability IDs, API keys, JWTs, and PEM blocks are redacted before they reach the client.
- No YOLO mode. There is no global "trust the LLM with everything" switch. There never will be.
Expanded reach comes through named scoped roots that are explicit, audit-logged, and bounded. The README has 15 'always-on' protections in a table. None of them can be turned off by config, because these things being toggleable is how the ecosystem ended up where it is. I decided to make sure that this wasn't just a 'trend hopping' project and aligned with my own personal values as well. I built this to be secure and local-first by default. Configured for Ollama / LM Studio / vLLM out of the box. Anthropic and OpenAI work too but require explicit configuration. There is no "happy path" that silently ships your prompts to a cloud endpoint. I decided that FIRST it needed to only run as an email agent with a CLI. Bidirectional IMAP + SMTP with allowlisted senders, threading preserved, attachments handled. This is the use case that bit Yue and a lot of other people, and I wanted to prove it could be done with real boundaries. I added in 30+ built-in tools of my own. File ops, shell (denylisted, output-capped, CWD-lo
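Three of the mechanisms described above (HMAC-derived capability IDs, the effect-class check as a pure function, and a SHA-256 hash-chained audit log) can be sketched with the standard library. This is an illustrative sketch only and is not CrabMeat's code: every function name, the 12-hex-character truncation (chosen to resemble the cap_a4f9e2b71c83 example), and the log entry layout are assumptions.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def capability_id(session_key: bytes, tool_name: str) -> str:
    """Derive an opaque per-session ID so the model never sees the real tool name."""
    digest = hmac.new(session_key, tool_name.encode(), hashlib.sha256).hexdigest()
    return "cap_" + digest[:12]


def is_allowed(tool_class: str, agent_classes: frozenset) -> bool:
    """Pure check with no runtime state: may this agent use this effect class?"""
    return tool_class in agent_classes


def append_entry(chain: list, event: dict) -> None:
    """Append an event whose hash covers both the event and the previous hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    payload = json.dumps(event, sort_keys=True)
    entry_hash = hashlib.sha256((prev + payload).encode()).hexdigest()
    chain.append({"prev": prev, "event": event, "hash": entry_hash})


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

The design point the sketch tries to show is that none of these checks consult the model: the gateway maps IDs back to tools, decides on effect classes, and writes the log, so a summarized-away or misread prompt cannot change the outcome.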
Reviving PapersWithCode (by Hugging Face) [P]
Hi, Niels here from the open-source team at Hugging Face. Like many others, I was a huge fan of paperswithcode. Sadly, that website is no longer maintained after its acquisition by Meta. Hence, I've been working on reviving it. I obviously use AI agents to parse papers at scale and automatically generate leaderboards (for now I'm the one verifying results). So far, I've only parsed high-impact papers that I know are SOTA, like Qwen 3.5 and 3.6, RF-DETR for object detection, DINOv3, SOTA embedding models from the MTEB leaderboard, the Open ASR Leaderboard for automatic speech recognition models, etc.

For now, it includes the following:
- trending papers by default based on GitHub star velocity
- categorization by domain, e.g., OCR
- methods, which PwC used to have, e.g., RLVR
- eval results for high-impact papers, see e.g., Qwen 3.5 at the bottom
- leaderboards for each domain, e.g., MMTEB or COCO val 2017
- support for citation counts (you can also see the most cited papers by domain!)
- automated linked GitHub, project page URLs, and artifacts (+ multiple repos are supported on a paper page)
- support for external papers beyond arXiv, see e.g., DeepSeek v4
- Harness reports for coding agent benchmarks, e.g., Terminal Bench

"Sign in with HF" and Storage Buckets are used to store thumbnails, paper PDFs, and overall data backups. I'm curious about your feedback + feature requests! Try it at paperswithcode.co
https://preview.redd.it/whwji560fw1h1.png?width=3452&format=png&auto=webp&s=55bb7a30c1be58d140f7efcb07a31c6dac5693c7

See e.g. the SOTA leaderboard for Terminal Bench 2.0:
https://preview.redd.it/98w9pi89fw1h1.png?width=3456&format=png&auto=webp&s=408fb64b0ba85ba24f55daa81d547d7c68e73951

A paper page looks like this: https://paperswithcode.co/paper/2602.15763
https://preview.redd.it/fiizit6dfw1h1.png?width=3450&format=png&auto=webp&s=9ea05a77ca5583a2fb395dccc95ba52c433362c5
submitted by /u/NielsRogge
Yes, Count offers a free tier. Pricing found: $0, $49, $69
Count has an average rating of 4.8 out of 5 stars based on 20 reviews from G2, Capterra, and TrustRadius.
Key features include: Clean, model, analyze and visualize in one place; Use SQL, Python and charts side by side; Lay out your work, add context, and build a narrative as you go; Build step by step, or let Count's agent take it further, faster; Every query, transformation and chart is fully editable and auditable; Go deeper with an agent that can run hundreds of analyses in minutes; Collaborate in real time, right alongside your team; Review findings, challenge assumptions, and iterate together.
Count is commonly used for: Collaborative data exploration and analysis, Building complex data models step by step, Creating interactive reports and dashboards, Real-time collaboration on data insights, Identifying business bottlenecks through data analysis, Integrating raw data from various apps and databases.
Eliezer Yudkowsky
Research Fellow at MIRI
2 mentions
Count integrates with: Slack, Google Sheets, Microsoft Excel, GitHub, Salesforce, Zapier, Tableau, Looker.
Based on user reviews and social mentions, the most common pain points are: token usage, AI agents, token cost, Anthropic.
Based on 223 social mentions analyzed, 15% of sentiment is positive, 75% neutral, and 10% negative.