Delivering fast and accurate search is crucial for platforms like Instacart, which serves 14 million daily users searching across billions of products.
The challenge goes beyond simple keyword matching, demanding semantic understanding to accurately interpret user intent behind ambiguous queries like “healthy food”.
The system must identify relevant products beyond exact text matches. Moreover, it must reflect real-time inventory, price, and ranking changes, which subjects the search database to heavy write and read workloads while still demanding up-to-date, precise results.
Instacart previously relied on Elasticsearch for search and Facebook AI Similarity Search (FAISS) for semantic search. However, the company moved to a hybrid search stack with Postgres and pgvector, significantly boosting search performance. The details of this process were outlined in a blog post published last month.
Because Instacart’s catalogue changes constantly with inventory, its denormalised data model in Elasticsearch necessitated frequent partial writes to billions of items. “Over time, the indexing load and throughput caused the cluster to struggle so much that fixing erroneous data would take days to be corrected,” the company stated.
Instacart also aimed to add machine learning models to its search features, which further increased the already high indexing load and costs. This took a toll on read performance, making overall search unsustainable, Instacart added.
‘Somewhat Unconventional, But Made Sense for Our Case’
Instacart then migrated its text retrieval stack to sharded Postgres instances with a high degree of data normalisation. “While this might seem somewhat unconventional, it made sense for our use case,” the company explained.
“A normalised data model allowed us to have a 10x reduction in write workload compared to the denormalised data model that we had to use in Elasticsearch,” Instacart said, indicating that Postgres led to substantial savings in storage and indexing.
A key advantage of using Postgres was the ability to store ML features and model coefficients in separate tables. This architecture meant each dataset could have a different update frequency and be combined on demand using SQL, providing the flexibility needed for more sophisticated ML retrieval models.
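As an illustration of that pattern, the minimal sketch below (hypothetical table and column names, not Instacart’s actual schema) keeps catalogue items, ML feature values, and model coefficients in separate normalised tables, each free to refresh on its own cadence:

```python
import psycopg2  # assumes a reachable Postgres instance

conn = psycopg2.connect("dbname=catalog")  # hypothetical database
cur = conn.cursor()

# Each table can be written on its own cadence (items rarely, feature
# values hourly, coefficients per model release) without rewriting a
# denormalised document for every item, which is where the write
# savings come from.
cur.execute("""
    CREATE TABLE IF NOT EXISTS items (
        item_id bigint PRIMARY KEY,
        name    text NOT NULL
    );
    CREATE TABLE IF NOT EXISTS item_features (
        item_id    bigint REFERENCES items,
        feature_id int,
        value      real,
        PRIMARY KEY (item_id, feature_id)
    );
    CREATE TABLE IF NOT EXISTS model_coefficients (
        model_id   int,
        feature_id int,
        weight     real,
        PRIMARY KEY (model_id, feature_id)
    );
""")
conn.commit()
```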
Furthermore, moving compute closer to storage by running Postgres on non-volatile memory express (NVMe) storage made search twice as fast for Instacart. Pushing logic down to the data layer eliminated multiple network calls and data overfetching, halving latency and simplifying the application.
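Pushing ranking logic into the data layer then amounts to a single SQL statement that joins and scores in-database, so one round trip replaces several network calls plus client-side scoring. A sketch continuing the hypothetical schema above:

```python
import psycopg2

conn = psycopg2.connect("dbname=catalog")
cur = conn.cursor()

model_id = 1  # hypothetical model version

# One statement joins feature values to model coefficients and ranks
# items inside Postgres, replacing several round trips plus
# client-side scoring with a single network call.
cur.execute("""
    SELECT i.item_id,
           i.name,
           SUM(f.value * c.weight) AS score
    FROM items i
    JOIN item_features f ON f.item_id = i.item_id
    JOIN model_coefficients c ON c.feature_id = f.feature_id
    WHERE c.model_id = %s
    GROUP BY i.item_id, i.name
    ORDER BY score DESC
    LIMIT 50;
""", (model_id,))
top_items = cur.fetchall()
```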
FAISS to pgvector Migration ‘Was a Great Success’
Instacart initially implemented semantic search using a standalone FAISS service for Approximate Nearest Neighbour (ANN) search, while full-text search remained on Postgres. This hybrid setup combined results in the application layer, improving search quality.
However, FAISS’ limited support for attribute filtering, the resulting overfetching, and the overhead of maintaining two separate, potentially inconsistent systems led Instacart to seek a unified solution.
They opted for pgvector, a Postgres extension, to consolidate both retrieval mechanisms. This move eliminated data duplication, reduced operational complexity, enabled finer-grained control over result sets, and leveraged Postgres’ existing capabilities for real-time filtering, ultimately boosting search performance and user satisfaction.
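A rough sketch of what such a consolidated query can look like (the vector type, the `<=>` cosine-distance operator, and the HNSW index are pgvector’s; the products schema, toy dimensions, and query text are hypothetical):

```python
import psycopg2

conn = psycopg2.connect("dbname=catalog")
cur = conn.cursor()

# pgvector adds a vector column type; an HNSW index accelerates
# approximate nearest-neighbour search over it.
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS products (
        product_id bigint PRIMARY KEY,
        name       text,
        in_stock   boolean,
        embedding  vector(3)  -- toy dimension; real embeddings are far wider
    );
    CREATE INDEX IF NOT EXISTS products_embedding_idx
        ON products USING hnsw (embedding vector_cosine_ops);
""")
conn.commit()

query_vec = "[0.1, 0.2, 0.3]"  # stand-in for the query's embedding

# ANN similarity (<=> is pgvector's cosine distance), keyword matching,
# and a real-time attribute filter run in one statement, so the two
# retrieval paths can no longer drift out of sync.
cur.execute("""
    SELECT product_id, name
    FROM products
    WHERE in_stock
      AND to_tsvector('english', name) @@ plainto_tsquery('english', %s)
    ORDER BY embedding <=> %s
    LIMIT 20;
""", ("healthy food", query_vec))
print(cur.fetchall())
```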
“Based on the offline performance of pgvector, we launched a production A/B test to a section of users. We saw a 6% drop in the number of searches with zero results due to better recall,” Instacart said. “This led to a substantial increase in incremental revenue for the platform as users ran into fewer dead-end searches and were able to better find the items they were looking for.”
A Modern Search Infra is the Need of the Hour
Besides Instacart, several companies worldwide have adopted modern search infrastructures. Last year, Shopify, another e-commerce giant, outlined in a blog post how it improved its understanding of consumer search intent with real-time machine learning capabilities.
Shopify enhanced its storefront search with AI-powered semantic capabilities, moving beyond keyword matching to better understand consumer intent. This was achieved by building foundational machine learning assets, particularly real-time text and image embeddings.
Shopify’s real-time embedding pipeline processes 2,500 embeddings per second on Google Cloud Dataflow, but scaling GPU-accelerated streaming inference presented critical optimisation challenges.
Dataflow spawned 16 processes with 12 threads each, loading 192 images into memory simultaneously and causing frequent crashes. Rather than paying 14% more for high-memory instances, Shopify dialled down the thread count to four. This reduced concurrent images to 64, cutting memory usage by 2.6x without hurting performance, since the GPU was already the bottleneck.
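In Beam’s Python SDK, that knob is exposed as a Dataflow pipeline option; a minimal sketch of how such a cap might be set (the project and region values are placeholders, and Shopify has not published its exact flags):

```python
from apache_beam.options.pipeline_options import PipelineOptions

# Dataflow's Python workers default to one SDK process per vCPU with
# 12 threads each; capping the harness threads bounds how many images
# sit decoded in memory at once (16 processes x 4 threads = 64).
options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-gcp-project",   # placeholder project ID
    "--region=us-central1",       # placeholder region
    "--number_of_worker_harness_threads=4",
])
```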
Each process loaded its own copy of the ML model, eating up GPU memory but keeping inference fast through parallelism. On the other hand, sharing a single model across processes saved memory but significantly slowed throughput.
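The per-process pattern corresponds to loading the model in a DoFn’s setup() hook, which Beam calls once per worker process; a hedged sketch with a hypothetical model loader, not Shopify’s actual code:

```python
import apache_beam as beam

class EmbedImages(beam.DoFn):
    """GPU inference with one model copy per worker process."""

    def setup(self):
        # Beam calls setup() once per process, so every process holds
        # its own model copy: more GPU memory used, but lock-free
        # parallel inference. (apache_beam.utils.shared.Shared could
        # share one copy to save memory, at the cost of throughput.)
        self.model = load_embedding_model()  # hypothetical loader

    def process(self, image_batch):
        # image_batch is a list of images produced by an upstream
        # batching stage.
        yield self.model.embed(image_batch)  # hypothetical inference call
```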
Meanwhile, unpredictable traffic bursts meant images arrived one at a time instead of in efficient batches. Although forcing batches helped GPU utilisation, it added too much latency.
Shopify’s solution embraced these trade-offs. It kept multiple model copies running because its models were lightweight enough, and it accepted inefficient batching because parallel processing still kept the GPUs busy enough to meet performance targets.
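Beam’s built-in BatchElements transform captures that compromise; a sketch reusing the EmbedImages DoFn above, with illustrative parameter values rather than Shopify’s actual configuration:

```python
import apache_beam as beam

with beam.Pipeline() as pipeline:
    _ = (
        pipeline
        | "ReadImages" >> beam.Create(["img-1", "img-2"])  # stand-in source
        # min_batch_size=1 lets a lone image flow through immediately
        # during traffic lulls (the accepted inefficiency), while
        # max_batch_size=64 caps memory usage during bursts.
        | "Batch" >> beam.BatchElements(min_batch_size=1, max_batch_size=64)
        | "Embed" >> beam.ParDo(EmbedImages())
    )
```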
“When a merchant edits their products or uploads a new image, they want these updates to be available on their website instantly. Additionally, the ultimate objective is to boost sales for our merchants and offer pleasant interactive experiences for their consumers,” Shopify stated.
“Our data suggests that up-to-date embeddings achieved through a streaming pipeline allow us to optimise for this, despite the additional complexity it incurs when compared with a batch solution,” it added.