MySQL full-text search has been a reliable option since version 3.23, and for many web applications it handles the job well enough. But as datasets grow and search expectations increase, MySQL's built-in capabilities start showing their limits. Elasticsearch is a dedicated search engine built on Apache Lucene that handles text search at a scale and speed MySQL cannot match. Knowing which to use, and when to migrate between them, affects both application performance and long-term development costs.

This guide covers how each search method works, where their practical differences lie, and how to make a decision that fits your application rather than following generic advice.

How MySQL Full-Text Search Works

MySQL's full-text index creates an inverted index over TEXT and VARCHAR columns. When you add a FULLTEXT index to a column, MySQL builds a data structure that maps each word to the rows that contain it. Searching becomes a matter of looking up the search terms in the index and returning matching rows sorted by relevance.

To add a full-text index to an existing table, you use an ALTER TABLE statement. For new tables, you include the index in your table definition.

ALTER TABLE articles ADD FULLTEXT(title, content);

SELECT * FROM articles
WHERE MATCH(title, content) AGAINST('search term' IN NATURAL LANGUAGE MODE);

MySQL full-text search supports two primary modes. Natural language mode returns results ranked by relevance score, which is useful for straightforward keyword matching. Boolean mode supports operators like AND, OR, and NOT, allowing more precise query control.

The search engine handles stemming automatically, so searching for "running" also matches "run" and "ran". Very common words (stop words) are ignored by default. For a database with hundreds of thousands of records and basic keyword matching needs, MySQL full-text is often the practical choice because it requires no additional services or infrastructure.

MySQL Full-Text Search Limitations

MySQL full-text search has meaningful constraints for demanding use cases. Understanding these limits helps you recognise when your application is approaching them.

Fuzzy Matching and Typos

MySQL full-text does not support fuzzy matching by default. If a user types "runing" instead of "running", a standard MySQL full-text query returns nothing. You can work around this with phonetic algorithms or Levenshtein distance functions, but these add complexity and performance overhead that defeats the simplicity of using MySQL in the first place.

Large Dataset Performance

Performance degrades on very large datasets, typically tens of millions of rows. The full-text index must be stored on the same server as your data, and as the index grows, query response times increase. If your server runs out of RAM, the filesystem cache cannot keep the index in memory, and queries slow down noticeably. Our MySQL slow query optimisation guide covers how to diagnose and address performance issues in MySQL, which becomes relevant once your dataset grows large enough to stress the database.

Field Weighting and Relevance Control

MySQL full-text has no built-in way to weight fields. If you want matches in the title to rank higher than matches in the body content, you need to work around this with custom SQL, separate queries, or application-level merging. This is technically possible but adds significant complexity.

Multi-Language and Specialized Stemming

MySQL's stemming support is basic. For English text it works reasonably well, but for languages with more complex morphology, or for domain-specific vocabulary, the built-in stemmer often produces poor results. If you need language-specific analyzers, custom dictionaries, or synonym support, MySQL full-text requires substantial customisation.

No Business Signal Integration

If your search results need to factor in user behaviour, click-through rates, recency, or other business signals, MySQL full-text has no native mechanism. You would need to build this yourself, at which point you are rebuilding functionality that dedicated search engines already handle better and faster.

How Elasticsearch Works

Elasticsearch is a distributed search engine built on top of Lucene. It stores documents as JSON and builds a Lucene index over the fields you specify. Unlike MySQL, Elasticsearch is designed from the ground up for search: it handles relevance scoring, supports complex queries, and scales horizontally across multiple nodes.

When you index a document in Elasticsearch, the engine analyses the text through several steps. First, tokenisation splits the text into individual terms. Then stemming reduces words to their root form. Stop words are removed. The analysed terms are stored in a structure optimised for fast retrieval. Queries are distributed across all shards (Elasticsearch's unit of data distribution), and results are merged and ranked by relevance score.

Elasticsearch's query DSL supports very complex queries that MySQL cannot handle natively:

  • Nested queries: Search within arrays of objects with proper relationship handling.
  • Geo-distance queries: Filter results by proximity to a location.
  • Aggregated faceted search: Count how many results match each category, price range, or other attribute.
  • Completion suggester: Purpose-built for autocomplete functionality.
  • Multi-index search: Query across multiple indices simultaneously.

The MySQL 8 full-text search guide covers MySQL's built-in capabilities in more detail, which is worth reading if you want to understand exactly what you would be giving up before moving to a dedicated search engine.

When MySQL Full-Text Is the Right Choice

MySQL full-text is the practical choice in several scenarios. If your dataset is small enough that performance is not a concern, generally under a few hundred thousand records, MySQL full-text handles it without issue. If your search requirements are straightforward keyword matching without fuzzy logic, complex relevance ranking, or multi-language support, MySQL does the job.

If you want to avoid the operational complexity of running a separate search service, keeping everything in MySQL is sensible. There is no sync to manage, no second service to monitor, and no additional failure point. When your search and transactional data are the same dataset, and you do not need to search across multiple data sources, MySQL's simplicity is worth preserving.

If you are building a blog, a simple product catalogue, or an application where search is a secondary feature rather than a core differentiator, MySQL full-text is almost certainly sufficient. The operational simplicity of keeping search and data in one system is valuable and should not be surrendered without good reason.

When Elasticsearch Is the Right Choice

Elasticsearch becomes necessary when your search requirements exceed MySQL's capabilities. If you need fuzzy matching, autocomplete, or typo tolerance, Elasticsearch handles these out of the box. If your dataset is large enough that MySQL full-text performance is inadequate, Elasticsearch's distributed architecture scales to handle billions of documents across multiple nodes.

When you need to search across multiple different entity types at once, such as users, posts, products, and comments in a single query, Elasticsearch's unified index model makes this straightforward. If search relevance needs to incorporate business signals like click-through rate, recency, or user-defined boost factors, Elasticsearch's function score queries handle this naturally.

Elasticsearch also excels when you need to aggregate data from multiple sources. If your application has a primary database but also needs to search across uploaded documents, third-party API data, or legacy system data, Elasticsearch's ingestion pipelines can aggregate all of this into a single searchable index without forcing you to migrate data into your primary database.

The Operational Cost of Elasticsearch

Elasticsearch is not free to operate. A production Elasticsearch cluster requires at least three nodes for high availability. Each node needs sufficient RAM to keep index segments in the filesystem cache. The operational complexity is significantly higher than MySQL: you need to manage cluster health, monitor shard allocation, handle index lifecycle management, and back up the search index separately from your primary database.

Managed Elasticsearch services reduce operational overhead but add cost. Amazon OpenSearch Service, Elastic Cloud, and similar offerings handle the infrastructure management but still require expertise to configure correctly, tune performance, and troubleshoot issues.

Before adding Elasticsearch to your stack, confirm that the search capability improvements justify the infrastructure cost and operational complexity. A dedicated search engine that nobody can maintain is worse than a simpler solution that your team understands.

Synchronising Data Between MySQL and Elasticsearch

If you choose Elasticsearch, you need to keep it synchronised with your primary database. Several approaches exist, each with different trade-offs.

Application-Level Sync

Your application writes to MySQL and simultaneously indexes to Elasticsearch. This is the most common approach and the most straightforward to debug. Your application treats Elasticsearch as a write-through cache: every time you write to MySQL, you also update the corresponding Elasticsearch document.

# Example: Application-level sync pseudocode
def save_article(article):
    db.insert('articles', article)
    es.index('articles', article.id, article.to_json())
    return article

This approach keeps search results fresh and is the simplest pattern to understand and maintain. The downside is that your application becomes coupled to both systems, and write operations take longer since they involve two services.

Change Data Capture

Tools like Debezium read the MySQL binlog and push changes to Elasticsearch automatically. This decouples your application from the sync process but introduces another service to manage. Change data capture is more resilient to application bugs because the sync happens regardless of what the application code does.

Scheduled Bulk Reindexing

Periodically reindex everything from scratch. This approach is acceptable for datasets that do not change frequently, such as product catalogues that update once daily. It is simple to implement but means search results are always out of date by at least the interval between reindex runs.

Performance Considerations for Both Approaches

Both MySQL full-text and Elasticsearch require careful configuration to perform well. For MySQL, the index must fit in memory or query performance drops significantly. For Elasticsearch, shard count and allocation strategy directly affect query performance and cluster stability.

MySQL full-text queries cannot use covering indexes in the same way that regular B-tree indexes can. The query still needs to access the actual row data after the full-text index identifies matching rows, which adds latency on large tables.

Elasticsearch shards should be sized to hold roughly 20-50GB of data each. Too many small shards creates overhead; too few large shards limits parallelism. Getting this right requires understanding your data growth patterns and query patterns.

Making the Decision for Your Application

The choice between MySQL full-text and Elasticsearch is not binary. Many production systems use MySQL as the system of record and Elasticsearch as the search index. MySQL holds the authoritative data; Elasticsearch holds a copy optimised for search. This pattern works well for applications that have outgrown MySQL full-text but need to preserve MySQL for transactional operations.

Start with MySQL full-text. It is simpler, requires no additional infrastructure, and handles most common search needs adequately. When you hit its limits in practice, not theoretically, migrate to Elasticsearch. The pain of outgrowing MySQL full-text is a good problem to have because it means your application is succeeding.

If you are building a new application and know from the start that search will be a core feature with complex requirements, fuzzy matching, autocomplete, and multi-language support, start with Elasticsearch from day one. Retrofitting Elasticsearch onto an existing application is more disruptive than building it in from the beginning.