Beyond LIKE: An Introduction to Full-Text Search with Elasticsearch

The Problem with SQL's LIKE

If you've ever tried to build a search feature on top of a SQL database, you probably reached for the LIKE operator, for example SELECT * FROM articles WHERE content LIKE '%search term%'; This works for simple cases, but it quickly falls apart:

It's Slow: A leading wildcard (%...) prevents the database from using an ordinary B-tree index, forcing a full table scan whose cost grows with every row you add.
It's Not "Smart": It has no concept of relevance. A document that mentions the search term once is treated the same as one that mentions it 20 times.
It's Inflexible: It can't handle typos (search temr), synonyms (search word), or different forms of a word (plurals, verb tenses like searching vs. searched).
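To make the last point concrete, here is a minimal sketch that imitates LIKE '%term%' with a plain substring test. The likeContains helper and the sample articles are made up for illustration, and real LIKE behavior (case sensitivity in particular) depends on the database and its collation.

```javascript
// Toy stand-in for SQL's LIKE '%term%': a case-insensitive substring test.
// (Whether LIKE itself is case-sensitive depends on the database and collation.)
function likeContains(text, term) {
  return text.toLowerCase().includes(term.toLowerCase());
}

// Hypothetical sample data.
const articles = [
  'Search is hard. Good search needs ranking, and search needs analysis.',
  'We searched the archive for hours.',
  'A short note that mentions search once.',
];

const matches = (term) => articles.filter((a) => likeContains(a, term));

console.log(matches('search').length);    // 3: every article contains the substring
console.log(matches('searching').length); // 0: a different word form finds nothing
console.log(matches('serach').length);    // 0: a single typo finds nothing
```

Note that the three matches for 'search' come back unranked: nothing tells you that the first article mentions the term three times and the last only once.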

To solve these problems, you need a dedicated full-text search engine.

What is Elasticsearch?

Elasticsearch is a distributed, open-source search and analytics engine built on top of a library called Apache Lucene. It's designed from the ground up to solve the problems listed above.

Think of it like the index card catalog in a library, but on steroids. Instead of just looking up book titles, it analyzes the entire content of every book, understands the language, and can instantly find not just exact matches, but the most relevant passages, even if you make a spelling mistake.

Core Concepts

Document: A JSON object that represents a single piece of data you want to make searchable (e.g., a product, a user profile, a log entry).
Index: A collection of documents with a similar structure. It's roughly analogous to a table in a SQL database.
Inverted Index: This is the secret sauce. Instead of a regular index that maps a Document ID to its content, an inverted index maps each word to a list of Document IDs where that word appears. When you search for a word, Elasticsearch looks it up in this dictionary and reads off the matching Document IDs, rather than scanning the text of every document.
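The idea behind an inverted index can be sketched in a few lines of JavaScript. This is a toy illustration only: buildInvertedIndex and the sample documents are hypothetical, and Lucene's real data structures are far more compact and sophisticated.

```javascript
// Build a toy inverted index: term -> list of IDs of the documents containing it.
function buildInvertedIndex(docs) {
  const index = new Map();
  for (const [id, text] of Object.entries(docs)) {
    // Naive tokenization: lowercase, then split on anything that is not a letter or digit.
    const terms = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
    // Use a Set so each document is listed at most once per term.
    for (const term of new Set(terms)) {
      if (!index.has(term)) index.set(term, []);
      index.get(term).push(id);
    }
  }
  return index;
}

// Hypothetical sample documents keyed by ID.
const docs = {
  1: 'Super fast laptop',
  2: 'Fast charging phone',
  3: 'Laptop sleeve',
};

const invertedIndex = buildInvertedIndex(docs);
console.log(invertedIndex.get('fast'));   // [ '1', '2' ]
console.log(invertedIndex.get('laptop')); // [ '1', '3' ]
console.log(invertedIndex.get('tablet')); // undefined (no document contains it)
```

A lookup is a single Map access, no matter how many documents were indexed. That is the property that makes this structure attractive for search.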

The Two-Step Process: Indexing and Searching

Working with Elasticsearch involves two main actions.

1. Indexing: Sending Your Data to Elasticsearch

Before you can search for data, you have to put it into an Elasticsearch index. This process is called indexing. During this step, Elasticsearch runs each text field through an analyzer. Depending on which analyzer the field is configured with, analysis can include:

Tokenization: Breaks text down into individual words (tokens). "The quick brown fox" -> [the, quick, brown, fox].
Lowercasing: Converts all tokens to lowercase.
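Stop Word Removal: Removes common words like the, a, is. The default standard analyzer keeps stop words unless you configure it otherwise; language analyzers such as english remove them.
Stemming: Reduces words to their root form. searching, searched, searches all become search. The default standard analyzer does not stem, so this step also requires a language analyzer such as english.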

This analysis is what makes the search so "smart" and flexible.
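The four steps can be sketched as a small pipeline. This is a toy analyzer written for illustration, not Elasticsearch's implementation: the stop word list is the three words from the example above, and naiveStem is a deliberately crude suffix stripper standing in for a real stemming algorithm.

```javascript
// Tiny stop word list taken from the example above; real lists are longer.
const STOP_WORDS = new Set(['the', 'a', 'is']);

// Crude suffix stripping; real stemmers (for example Porter) apply many more rules.
function naiveStem(token) {
  for (const suffix of ['ing', 'ed', 'es', 's']) {
    // Only strip when at least three characters remain, to avoid mangling short words.
    if (token.endsWith(suffix) && token.length - suffix.length >= 3) {
      return token.slice(0, -suffix.length);
    }
  }
  return token;
}

function analyze(text) {
  return text
    .split(/[^A-Za-z0-9]+/)            // 1. tokenization
    .filter(Boolean)
    .map((t) => t.toLowerCase())       // 2. lowercasing
    .filter((t) => !STOP_WORDS.has(t)) // 3. stop word removal
    .map(naiveStem);                   // 4. stemming
}

console.log(analyze('The quick brown fox is searching'));
// [ 'quick', 'brown', 'fox', 'search' ]
console.log(analyze('searched searches Searching'));
// [ 'search', 'search', 'search' ]
```

The same pipeline has to run on the query text at search time. Otherwise a query for Searching would never meet the stored term search.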

Example of indexing a document using cURL:

Bash

curl -X PUT "localhost:9200/products/_doc/1" -H 'Content-Type: application/json' -d'
{
  "name": "Super Fast Laptop",
  "description": "A very quick laptop for all your development and gaming needs.",
  "price": 1299.99,
  "in_stock": true
}
'
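If the index does not exist yet, the request above creates it with a dynamically generated mapping, and text fields get the default standard analyzer. To get stop word removal and stemming on a field, one option is to create the index with an explicit mapping before indexing any documents. The sketch below assumes version 8 of the @elastic/elasticsearch client, where mappings is passed at the top level of the request; the field types are a guess based on the sample document, and createProductsIndex is a hypothetical helper name.

```javascript
// Explicit mapping for the `products` index used in this article.
// Field types are assumptions inferred from the sample document.
const productsIndexRequest = {
  index: 'products',
  mappings: {
    properties: {
      name: { type: 'text' },                             // default `standard` analyzer
      description: { type: 'text', analyzer: 'english' }, // built-in analyzer: stop words + stemming
      price: { type: 'float' },
      in_stock: { type: 'boolean' },
    },
  },
};

// `client` is assumed to be an @elastic/elasticsearch v8 Client instance.
async function createProductsIndex(client) {
  return client.indices.create(productsIndexRequest);
}
```

The analyzer of an existing field cannot be changed in place, so this has to happen before the first document is indexed (or the data has to be reindexed into a new index).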

2. Searching: Querying Your Index

Once your data is indexed, you can run queries against it. Elasticsearch provides a rich JSON-based Query DSL (Domain-Specific Language). The most common query is the match query, which performs a full-text search on a field.

Example of a search query using cURL:

Bash

curl -X GET "localhost:9200/products/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "description": "quick develop"
    }
  }
}
'

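Elasticsearch analyzes the query text the same way it analyzed the field, then finds documents whose description contains "quick" or "develop". By default a match query combines its terms with OR, so the laptop document matches on "quick" alone. For "develop" to also match "development", the description field must use an analyzer that stems, such as english; with the default standard analyzer they are two different terms. The results are returned sorted by a relevance score (_score), which estimates how well each document matches the query.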
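Analysis alone does not handle typos. For that, the match query accepts a fuzziness parameter, and an operator parameter controls whether all terms must match. The sketch below only builds the request body, so it runs without a cluster; buildMatchQuery is a hypothetical helper, not part of any Elasticsearch library.

```javascript
// Build a search request body for a `match` query with optional typo tolerance.
// `fuzziness` and `operator` are standard match query parameters.
function buildMatchQuery(field, text, { fuzziness, operator = 'or' } = {}) {
  const options = { query: text, operator };
  // Only include fuzziness when the caller asked for it.
  if (fuzziness !== undefined) options.fuzziness = fuzziness;
  return { query: { match: { [field]: options } } };
}

// "quikc" is a deliberate typo for "quick".
const request = buildMatchQuery('description', 'quikc laptop', { fuzziness: 'AUTO' });
console.log(JSON.stringify(request, null, 2));
```

The printed object can be sent as the body of the _search request shown above. With fuzziness set to AUTO, Elasticsearch picks an allowed edit distance based on each term's length, so a one-character slip like "quikc" should still find "quick".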

Code Snippet: Using the Elasticsearch Client in Node.js

JavaScript

import { Client } from '@elastic/elasticsearch';

// --- Setup ---
const client = new Client({ node: 'http://localhost:9200' });

async function runExample() {
  const indexName = 'products';

  // --- 1. Index a document ---
  console.log('Indexing a document...');
  await client.index({
    index: indexName,
    id: '1',
    document: {
      name: 'Super Fast Laptop',
      description: 'A very quick laptop for all your development needs.',
      price: 1299.99
    }
  });
  // Force a refresh so the new document is visible to search right away (handy in a demo)
  await client.indices.refresh({ index: indexName });

  // --- 2. Search for the document ---
  console.log('Searching for documents...');
  const { hits } = await client.search({
    index: indexName,
    query: {
      match: {
        description: 'quick develop' // "develop" matches "development" only if the field uses a stemming analyzer
      }
    }
  });

  console.log('Search results:');
  console.log(hits.hits); // The 'hits' array contains the search results
}

runExample().catch(console.error);
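Each entry in hits.hits carries the document ID, the relevance score, and the original document under _source. A small helper can reduce that to what a UI typically needs. This is a sketch only: summarizeHits is a hypothetical name, and the sample hit below is hand-written, with a made-up score, so the snippet runs without a cluster.

```javascript
// Reduce raw search hits to the fields a result list usually displays.
function summarizeHits(hits) {
  return hits.map((hit) => ({
    id: hit._id,
    score: hit._score,
    name: hit._source?.name, // _source holds the document as it was indexed
  }));
}

// Hand-written sample shaped like `hits.hits`; the score is made up for illustration.
const sampleHits = [
  {
    _index: 'products',
    _id: '1',
    _score: 0.5,
    _source: { name: 'Super Fast Laptop', price: 1299.99 },
  },
];

console.log(summarizeHits(sampleHits));
```

In runExample, the same helper could be called as summarizeHits(hits.hits) in place of logging the raw array.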

LearnCodePro
