The multimodal lakehouse for AI. One table for raw data, embeddings, and features. Searchable, processable, trainable across every stage of the model
In recent community posts, LanceDB appears mainly as the local vector search index inside developer tools built for Claude Code, such as mnemo and Repowise. The codebase indexing, dependency graphs, dead code detection, and git intelligence that users praise are features of Repowise, which uses LanceDB for searchable docs. The concerns about high token usage on large code repositories are about Claude Code sessions, not about LanceDB itself. Overall, the sentiment around pricing is mixed, but LanceDB retains a positive reputation as a developer-oriented building block.
Mentions (30d)
0
Reviews
0
Platforms
2
GitHub Stars
10,115
863 forks
Industry
information services
Employees
45
Funding Stage
Series A
Total Funding
$41.1M
20
npm packages
10
HuggingFace models
Pricing found: $30, $30
mnemo - a local semantic memory for Claude Code (early stage, looking for testers and contributors)
Most "AI memory" tools make the vector database the source of truth, which means your knowledge is opaque, hard to inspect, and one corruption away from being gone.

I am building mnemo around a different idea: plain markdown files are the source of truth. LanceDB and SQLite are indexes built on top of them - both fully disposable, both rebuildable from the files in seconds.

The three layers each have a job:

- .mnemo/knowledge/ - one .md file per item. This is what you actually own. Open it in any editor, diff it, copy it to another machine.
- LanceDB - semantic search index. Turns mnemo search "why did we pick postgres" into ranked results. Holds no data that isn't already in the markdown files. If it breaks: mnemo reindex.
- SQLite - metadata index. Tracks when items were ingested, source URLs, tags, and staleness. This is what makes the mnemo stale command fast: instead of scanning every file, it queries a table with ingested_at and stale_after_days. Also rebuilt from the files if lost.

Staleness is a first-class concept because knowledge rots. You can set a threshold when you add a URL:

    mnemo add https://docs.stripe.com/webhooks --stale-days 30

After 30 days, mnemo stale surfaces it. mnemo refresh shows you what you wrote and prompts you to update it. Architectural decisions, API docs, third-party behavior - it all drifts, and the tool knows it.

The actual use case is Claude Code. Claude is stateless: every session starts cold. I put two hooks in CLAUDE.md: before each task Claude runs mnemo search " " and reads the results; while working it calls mnemo add "..." when it discovers something worth keeping. After that it runs invisibly.

Everything runs locally. No API key, no cloud, no telemetry. The embedding model (~25 MB) downloads once on first use.

GitHub: https://github.com/pixari/mnemo - early stage, feedback welcome.

submitted by /u/Alternative_One_4804
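The staleness lookup described in the post can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not mnemo's actual code: only the column names ingested_at and stale_after_days come from the post, while the table name items, the other columns, and the helper functions are assumptions made for the example.

```python
import sqlite3
from datetime import datetime

def ts(dt: datetime) -> str:
    """Format a datetime in a form SQLite's date functions understand."""
    return dt.strftime("%Y-%m-%d %H:%M:%S")

def build_index(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per markdown file in the knowledge folder.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        " path TEXT PRIMARY KEY,"
        " source_url TEXT,"
        " ingested_at TEXT NOT NULL,"
        " stale_after_days INTEGER"  # NULL means the item never goes stale
        ")"
    )

def stale_items(conn: sqlite3.Connection, now: datetime) -> list[str]:
    # One indexed-table query instead of opening every markdown file.
    rows = conn.execute(
        "SELECT path FROM items"
        " WHERE stale_after_days IS NOT NULL"
        " AND julianday(?) - julianday(ingested_at) >= stale_after_days"
        " ORDER BY path",
        (ts(now),),
    )
    return [row[0] for row in rows]
```

Because the table holds only metadata that also lives in the markdown files, dropping it and rebuilding it from those files loses nothing, which is the property the post relies on.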
How I cut Claude Code usage in half (open source)
Every time I start a Claude Code session on a real codebase, it burns through tokens just trying to understand the repo. Read the file tree, open 20 files, trace the imports, figure out how auth connects to the API layer. On a 50k+ LOC project that exploration phase eats your context window before any real work starts.

I built Repowise to fix this. It's a codebase intelligence layer that pre-computes the structural knowledge Claude Code needs and exposes it through MCP tools. Dependency graphs via AST parsing, searchable docs in LanceDB, git history tracking, architectural decision records. All local, nothing leaves your machine.

Instead of Claude spelunking through your files every session, it calls something like `get_context` or `get_overview` and gets the full picture in one shot. Eight MCP tools total, including `get_risk`, `search_codebase`, `get_dependency_path`, and `get_dead_code`.

The savings come from the exploration side. That caveman prompt post from last week was clever for cutting output tokens; this attacks the input/exploration side. Claude already has the map, so it stops burning context just to get oriented.

Setup is just `pip install repowise`, then `repowise init` in your repo. Works with Claude Code, Cursor, and Windsurf. Fully open source, AGPL-3.0, self-hostable.

GitHub: https://github.com/repowise-dev/repowise

Would love your feedback!

submitted by /u/Obvious_Gap_5768
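To make "dependency graphs via AST parsing" concrete, here is a rough sketch that uses Python's standard ast module as a stand-in parser. It is not Repowise's implementation: the function names, the in-memory `sources` mapping, and the path lookup (loosely inspired by the `get_dependency_path` tool name) are all assumptions for illustration.

```python
import ast
from collections import deque

def imported_modules(source: str) -> set[str]:
    """Collect absolute module names imported by a piece of Python source."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module)
    return found

def dependency_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each module to the other in-repo modules it imports."""
    return {
        name: {m for m in imported_modules(src) if m in sources and m != name}
        for name, src in sources.items()
    }

def dependency_path(graph, start, target):
    """Breadth-first search for a shortest import chain, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Computing this graph once and serving lookups from it is the kind of pre-computation the post credits for the token savings: the assistant asks one question instead of opening files to trace imports itself.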
Built a CLI that indexes codebases: dependency graphs, dead code, git intelligence, wiki generation
Been working on this for a while. It's a CLI that runs analysis on any codebase.

    pip install repowise
    repowise init --index-only
    repowise serve

What you get, both as a web UI and as MCP tools for Claude:

- Interactive dependency graph (D3.js, handles 2000+ nodes)
- Dead code detection with confidence scores
- Git hotspots and code ownership
- Bus factor per module

Optionally point it at an LLM and it generates wiki docs for every file too.

Tech: Python/FastAPI backend, Next.js frontend, tree-sitter for parsing, LanceDB for vector search, SQLite.

GitHub: https://github.com/repowise-dev/repowise

What would you add? Trying to figure out the next useful feature.

submitted by /u/aiandchai
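The post does not say how "bus factor per module" is computed. One common heuristic is the smallest number of authors who together account for more than half of the commits touching a module. The sketch below assumes that definition; the function name, the `(module, author)` input shape, and the `threshold` parameter are hypothetical and not taken from Repowise.

```python
from collections import Counter, defaultdict

def bus_factor(commits, threshold=0.5):
    """Return {module: bus factor} from (module, author) commit records.

    The bus factor here is the smallest number of top contributors whose
    combined commit count exceeds `threshold` of the module's commits.
    """
    per_module = defaultdict(Counter)
    for module, author in commits:
        per_module[module][author] += 1

    result = {}
    for module, counts in per_module.items():
        total = sum(counts.values())
        covered = 0
        # Walk authors from most to least active until the share is covered.
        for rank, (_, count) in enumerate(counts.most_common(), start=1):
            covered += count
            if covered > total * threshold:
                result[module] = rank
                break
    return result
```

A module with a bus factor of 1 depends on a single contributor for most of its history, which is the kind of ownership risk the git hotspot and code ownership features in the list are meant to surface.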
Repository Audit Available
Deep analysis of lancedb/lancedb: architecture, costs, security, dependencies & more
LanceDB is commonly used for: The new columnar standard for multimodal data.
LanceDB integrates with: TensorFlow, PyTorch, Keras, Scikit-learn, Hugging Face, Apache Spark, Django, Flask, FastAPI, Streamlit.
LanceDB has a public GitHub repository with 10,115 stars.
Robert Nishihara
Co-founder at Anyscale / Ray
1 mention