Been experimenting with Turso as a distributed cache for our edge AI pipeline and curious about others' experiences. We're running LLM inference at the edge (using Cloudflare Workers) and need to cache embeddings + model responses across multiple regions.
Currently hitting ~15ms query latency from our edge functions, which feels reasonable, but wondering if anyone's squeezed better performance out of libSQL for similar workloads. Our setup:
CREATE TABLE embedding_cache (
    input_hash    TEXT PRIMARY KEY,
    embedding     BLOB,
    model_version TEXT,
    created_at    INTEGER
);
Storing 1536-dim vectors as BLOBs (about 6KB each at 4 bytes per float). Cache hit rate is solid at 68%, but the remaining 32% of lookups are cold misses, and those are painful because we have to fall back to OpenAI's API.
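A minimal sketch of how a vector can be packed into a BLOB and read back, in case it helps anyone checking the sizing. It assumes float32 values in the platform's native byte order (little-endian on typical hardware); `encodeEmbedding` and `decodeEmbedding` are hypothetical helper names, not part of any client library.

```typescript
// Pack a float32 embedding into raw bytes suitable for a BLOB column.
// Assumption: values are stored in the platform's native byte order.
function encodeEmbedding(vec: Float32Array): Uint8Array {
  // View the vector's bytes, then copy so the result owns its own buffer
  // even when `vec` is a view into a larger ArrayBuffer.
  return new Uint8Array(vec.buffer, vec.byteOffset, vec.byteLength).slice();
}

// Turn BLOB bytes back into a float32 vector.
function decodeEmbedding(blob: Uint8Array): Float32Array {
  if (blob.byteLength % 4 !== 0) {
    throw new Error("BLOB length is not a multiple of 4 bytes");
  }
  // Copy first: a Float32Array view needs a 4-byte-aligned offset,
  // which a slice of a larger buffer does not guarantee.
  const copy = blob.slice();
  return new Float32Array(copy.buffer);
}

// Example: a 1536-dim vector round-trips through 1536 * 4 bytes.
const example = new Float32Array(1536);
example[0] = 1.25;
const blob = encodeEmbedding(example);
const restored = decodeEmbedding(blob);
```

Here `blob.byteLength` is 6144, which is where the ~6KB per row comes from.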
Two specific questions:
The multi-region replication is honestly the killer feature here - way simpler than managing Redis clusters across edge locations. Just want to make sure I'm not missing obvious optimization opportunities.
Running about 50K queries/day currently, planning to scale to 500K+ soon.
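For context, the read path described above is a cache-aside lookup. Below is a rough sketch of that pattern only, not our production code: `CacheStore`, `MemoryStore`, `Embedder`, and `getEmbedding` are hypothetical names, the in-memory store stands in for the database client so the example runs on its own, and treating a `model_version` mismatch as a miss is an assumption about how the column would be used.

```typescript
// Row shape mirroring the embedding_cache table.
interface CacheRow {
  input_hash: string;
  embedding: Uint8Array;
  model_version: string;
  created_at: number; // Unix seconds
}

// Hypothetical storage interface; a real version would wrap the database client.
interface CacheStore {
  get(inputHash: string): Promise<CacheRow | undefined>;
  put(row: CacheRow): Promise<void>;
}

// In-memory stand-in so the sketch runs without a database.
class MemoryStore implements CacheStore {
  private rows = new Map<string, CacheRow>();
  async get(inputHash: string): Promise<CacheRow | undefined> {
    return this.rows.get(inputHash);
  }
  async put(row: CacheRow): Promise<void> {
    this.rows.set(row.input_hash, row);
  }
}

// Hypothetical upstream call (e.g. an embeddings API) returning BLOB-ready bytes.
type Embedder = (input: string) => Promise<Uint8Array>;

async function getEmbedding(
  store: CacheStore,
  inputHash: string,
  input: string,
  modelVersion: string,
  embed: Embedder,
): Promise<{ embedding: Uint8Array; hit: boolean }> {
  const row = await store.get(inputHash);
  // Assumption: a row written by a different model version counts as a miss.
  if (row !== undefined && row.model_version === modelVersion) {
    return { embedding: row.embedding, hit: true };
  }
  // Cold miss: call upstream, then write back so the next lookup hits.
  const embedding = await embed(input);
  await store.put({
    input_hash: inputHash,
    embedding,
    model_version: modelVersion,
    created_at: Math.floor(Date.now() / 1000),
  });
  return { embedding, hit: false };
}
```

Since `input_hash` is the only key in the schema, a write after a model upgrade overwrites the old row in this sketch rather than keeping both versions side by side.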
Sorry if this is basic, but I'm still wrapping my head around edge caching - are you storing the actual embedding vectors in Turso or just references/metadata? And when you say 15ms latency, is that for reads or writes? I'm working on something similar but using Supabase right now and getting way worse performance (~200ms). Trying to understand if the bottleneck is my database choice or if I'm doing something fundamentally wrong with how I'm structuring the cached data.
We switched from Turso to Upstash Redis for our edge caching last month and saw query times drop to ~3-5ms. The key was using their global replicas + connection pooling. For embeddings specifically, we're using their vector similarity search which handles the heavy lifting. Setup was pretty straightforward with their REST API - works great with CF Workers since you don't need persistent connections. Only downside is cost scales quickly with data volume, but for our use case the performance gain was worth it.