A Comprehensive Guide to RediSearch: Full-Text Search Engine for Redis

Introduction

RediSearch is a powerful full-text search engine available as a module for Redis. It offers advanced search capabilities, making it a compelling choice for those who need to integrate search functionality into their applications. In this guide, we will explore RediSearch through a hands-on tutorial, covering key full-text search concepts and how they differ from traditional wildcard searches.

Prerequisites

To follow this guide, you should have a basic understanding of programming and databases. While some knowledge of Redis is helpful, it is not mandatory.

What is Full-Text Search?

Full-text search (FTS) allows a search engine to examine all words within stored documents to match user-specified search criteria. Unlike wildcard searches, which rely on pattern matching (e.g., the LIKE operator in SQL), full-text search applies language processing such as tokenization and stemming before matching. Here’s a comparison between wildcard search and full-text search:

Wildcard Search (LIKE Query)	Full-Text Search
Supported by databases like PostgreSQL, MySQL	Supported by RediSearch, Elasticsearch, PostgreSQL
Based on wildcards (e.g., `%ang%` matches `mango`, `angel`)	Based on language processing (e.g., “going” can match “go”)
Slower on large datasets, because a leading wildcard usually forces a full scan	Generally faster, because terms are looked up in an inverted index
Typically limited to a single attribute	Can search across multiple attributes in a single query
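
To make the contrast in the table concrete, here is a minimal Ruby sketch. It is a toy and not how RediSearch is implemented: `DOCS` is made-up sample data, and `stem` is a hypothetical, deliberately crude suffix stripper standing in for a real stemmer.

```ruby
# Toy comparison of substring matching and token-based matching.
# DOCS, stem, and build_index are illustrative names, not RediSearch APIs.
DOCS = {
  1 => 'Fresh mango juice',
  2 => 'An angel investor',
  3 => 'Going to the market'
}.freeze

# Wildcard-style search: scan every document for a raw substring.
def wildcard_search(docs, fragment)
  docs.select { |_id, text| text.downcase.include?(fragment.downcase) }.keys
end

# Crude stand-in for a stemmer: strips a few common English suffixes.
def stem(word)
  word.downcase.sub(/(ing|es|s)\z/, '')
end

# Inverted index: stemmed term => ids of the documents containing it.
def build_index(docs)
  docs.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(id, text), index|
    text.scan(/\w+/).each { |word| index[stem(word)] << id }
  end
end

# Full-text-style search: one hash lookup on the stemmed term.
def full_text_search(index, term)
  index.fetch(stem(term), [])
end

index = build_index(DOCS)
wildcard_search(DOCS, 'ang')   # => [1, 2]
full_text_search(index, 'go')  # => [3]
full_text_search(index, 'ang') # => []
```

The substring scan finds `ang` inside unrelated words, while the indexed lookup only matches whole (stemmed) terms and never touches documents that do not contain them.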

Installing RediSearch

For this tutorial, you can install RediSearch using Docker. Ensure Docker is installed on your local environment and run the following command:

docker run -p 6379:6379 redislabs/redisearch:latest

Then, in a separate terminal tab, execute:

docker ps # Get the container id
docker exec -it {container_id} redis-cli

Alternatively, you can follow the official quick start guide.

Hands-On with Redis-CLI

Once installed, you can start using Redis-CLI with the RediSearch module. Below are steps for creating indexes, adding documents, and performing searches.

Creating Indexes

Indexes in RediSearch are akin to tables or collections in a database. They organize records for efficient searching. To create an index, use the FT.CREATE command:

FT.CREATE products ON HASH PREFIX 1 product: SCHEMA name TEXT SORTABLE quantity NUMERIC SORTABLE description TEXT

This command creates an index called products, with fields for name, quantity, and description. The prefix product: ensures that all Redis hash keys starting with this prefix are indexed.
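
When the schema lives in application code, the same command can be assembled from a hash. The sketch below only builds the argument list and does not contact Redis; `build_create_command` and its keyword arguments are hypothetical names chosen for illustration.

```ruby
# Builds the argument list for FT.CREATE from a Ruby hash.
# build_create_command is a hypothetical helper, not part of any gem.
def build_create_command(index:, prefix:, schema:)
  fields = schema.flat_map do |name, options|
    # Each field becomes: name, then its type and flags in upper case.
    [name.to_s, *Array(options).map { |option| option.to_s.upcase }]
  end
  ['FT.CREATE', index, 'ON', 'HASH', 'PREFIX', '1', prefix, 'SCHEMA', *fields]
end

command = build_create_command(
  index: 'products',
  prefix: 'product:',
  schema: {
    name: %i[text sortable],
    quantity: %i[numeric sortable],
    description: :text
  }
)
puts command.join(' ')
# FT.CREATE products ON HASH PREFIX 1 product: SCHEMA name TEXT SORTABLE quantity NUMERIC SORTABLE description TEXT
```

Keeping the command as an array of arguments, rather than one string, avoids problems later when a value contains spaces.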

Adding Documents

Documents represent individual records within an index. To add a document, store a hash under a key that starts with the index prefix. HSET accepts multiple field-value pairs; the older HMSET still works but has been deprecated since Redis 4.0.0:

HSET product:1 name "Apple Juice" quantity 2 description "Fresh apple juice"
HSET product:2 name "Mango Juice" quantity 4 description "Fresh mango juice"
HSET product:3 name "Grape Smoothie" quantity 5 description "Fresh grape smoothie"

These commands create three documents within the products index.
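
As a rough sketch, the same documents could be written from Ruby by building the hash-write arguments from a Ruby hash. The example uses HSET, which accepts multiple field-value pairs since Redis 4.0.0. `build_hset_command` is a hypothetical helper; it only constructs the argument list and does not talk to Redis.

```ruby
# Builds the argument list for writing one document as a Redis hash.
# build_hset_command is a hypothetical helper for illustration.
def build_hset_command(prefix:, id:, attributes:)
  # The key must start with the index prefix, or the index will ignore it.
  key = "#{prefix}#{id}"
  pairs = attributes.flat_map { |field, value| [field.to_s, value.to_s] }
  ['HSET', key, *pairs]
end

build_hset_command(
  prefix: 'product:',
  id: 1,
  attributes: { name: 'Apple Juice', quantity: 2, description: 'Fresh apple juice' }
)
# => ["HSET", "product:1", "name", "Apple Juice", "quantity", "2",
#     "description", "Fresh apple juice"]
```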

Performing Searches

You can search for documents using the FT.SEARCH command. Besides exact term matching, this guide covers two matching modes: prefix-based search and fuzzy search.

Prefix-based Search: Matches documents based on the prefix of individual terms. Of the queries below, the first returns the product named "Apple Juice", the second returns "Apple Juice" and "Mango Juice", and the third restricts the match to the name field using the @field:term syntax.

  FT.SEARCH products app*
  FT.SEARCH products jui*
  FT.SEARCH products @name:app*

Fuzzy Search: Matches terms within a given Levenshtein Distance (L.D.), which is the number of single-character edits (insertions, deletions, or substitutions) needed to change one word into another. Each pair of % signs around the term raises the maximum distance by one, so the queries below allow an L.D. of up to 1, 2, and 3 respectively.

  FT.SEARCH products %jui%
  FT.SEARCH products %%jui%%
  FT.SEARCH products %%%jui%%%

Be cautious when using a high L.D.: the larger the allowed distance, the more unrelated terms will match.
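
To get a feel for what a given distance allows, the Levenshtein Distance can be computed directly. The following is a standard dynamic-programming sketch written for this guide, not code taken from RediSearch.

```ruby
# Classic dynamic-programming Levenshtein distance, keeping two rows.
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution
      ].min
    end
    previous = current
  end
  previous.last
end

levenshtein('juce', 'juice')  # => 1
levenshtein('jui', 'juice')   # => 2
levenshtein('grope', 'grape') # => 1
```

Since "jui" is two edits away from "juice", this suggests that a query allowing a distance of 1 would not reach that term, while one allowing a distance of 2 would.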

Listing All Entries

You can list all entries in the index using:

FT.SEARCH products *

Pagination and Sorting

Implement basic pagination and sorting with the following syntax, where LIMIT takes a zero-based offset followed by the maximum number of results to return:

FT.SEARCH products {term}* LIMIT {offset} {num} SORTBY {sort_field} {sort_direction}
FT.SEARCH products jui* LIMIT 0 10 SORTBY quantity DESC

For more detailed search queries, refer to the official documentation on query syntax and advanced concepts.
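
Applications usually think in pages rather than offsets. The sketch below converts a page number into the LIMIT arguments and appends SORTBY only when a sort field is given. `build_search_command` and its keyword arguments are hypothetical names, and the helper only returns the argument list.

```ruby
# Builds FT.SEARCH arguments with page-based pagination and optional sorting.
# build_search_command is a hypothetical helper for illustration.
def build_search_command(index:, query:, page: 1, per_page: 10, sort_by: nil, direction: :asc)
  raise ArgumentError, 'page must be >= 1' if page < 1

  # LIMIT expects a zero-based offset followed by the page size.
  offset = (page - 1) * per_page
  command = ['FT.SEARCH', index, query, 'LIMIT', offset.to_s, per_page.to_s]
  command += ['SORTBY', sort_by.to_s, direction.to_s.upcase] if sort_by
  command
end

build_search_command(
  index: 'products', query: 'jui*', page: 3, per_page: 10,
  sort_by: :quantity, direction: :desc
)
# => ["FT.SEARCH", "products", "jui*", "LIMIT", "20", "10", "SORTBY", "quantity", "DESC"]
```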

Tokenization and Escaping

Understanding tokenization is crucial for implementing effective search functionality, especially when handling special characters. RediSearch tokenizes text by splitting on whitespace and punctuation, so a term like Apple-Juice is indexed as two tokens, Apple and Juice.
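
The effect can be approximated in a few lines of Ruby. This is only a sketch: `SEPARATORS` is an assumed approximation of the default separator characters, and the real tokenizer does more work (for example stemming) than this `tokenize` helper.

```ruby
# Rough imitation of splitting text into lower-cased tokens.
# SEPARATORS is an assumption made for illustration, not RediSearch's exact list.
SEPARATORS = /[\s,.<>{}\[\]"':;!@\#$%^&*()\-+=~]+/

def tokenize(text)
  text.downcase.split(SEPARATORS).reject(&:empty?)
end

tokenize('Apple-Juice')      # => ["apple", "juice"]
tokenize('user@example.com') # => ["user", "example", "com"]
```

Because the hyphen is consumed during indexing, an unescaped query for Apple-Juice is not looking for a single hyphenated token.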

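To search for a term that contains special characters, escape each of them with a backslash. Inside a quoted string literal the backslash itself has to be escaped, which is why it appears as a double backslash (\\) in code: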

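module StringExtensions
  refine String do
    # Prefixes every special character with a single backslash.
    def escape_special_characters
      # Characters to escape before a term is sent to RediSearch.
      pattern = /[,.<>{}\[\]"':;!@\#$%^&*()\-+=~\/\\]/
      gsub(pattern) { |match| '\\' + match }
    end
  end
end

# Sample Usage (a refinement must be activated with `using` before the call)
using StringExtensions
'Apple-Juice'.escape_special_characters # => "Apple\\-Juice"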

Demo Application

There are no official libraries for Ruby or Ruby on Rails to interact with RediSearch, so custom implementations are necessary. Below are some code snippets to get started:

Include the Redis library in the Gemfile (the project is called redis-rb, but the gem is named redis):

gem 'redis'

Initialize a Redis connection:

REDIS = Redis.new(url: 'redis://redis:6379')

Create an index:

module RediSearch
  class Index
    def self.create(name:, prefix:, schema:)
      command = "FT.CREATE #{name} ON HASH PREFIX 1 #{prefix} SCHEMA #{schema.to_a.flatten.join(' ')}"
      REDIS.call(command.split(' '))
    end
  end
end

Add a record:

REDIS.mapped_hmset("products:1", { id: 1, name: 'mango', quantity: 2 })

Search the database:

module RediSearch
  class Document
    def self.search(index_name:, term: nil, filters: {}, paging: {}, sort: {})
      # Fall back to "*" (match everything) when no term is given.
      command = "FT.SEARCH #{index_name} #{term || '*'}"
      # Apply filters, pagination, and sorting as needed
      REDIS.call(command.split(' '))
    end
  end
end
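
The raw reply from FT.SEARCH is a flat array, so a small parser makes the results easier to work with. The sketch below assumes the default reply shape (total count, then alternating key and field-value list, without options such as NOCONTENT or WITHSCORES). `parse_search_reply` is a hypothetical helper and `sample_reply` is hand-written data, not captured output.

```ruby
# Converts a flat FT.SEARCH reply into a total and a list of hashes.
# parse_search_reply is a hypothetical helper; the reply shape is an assumption.
def parse_search_reply(reply)
  total, *rows = reply
  documents = rows.each_slice(2).map do |key, fields|
    # fields is a flat list: [field, value, field, value, ...]
    { 'key' => key }.merge(fields.each_slice(2).to_h)
  end
  { total: total, documents: documents }
end

sample_reply = [
  2,
  'product:1', ['name', 'Apple Juice', 'quantity', '2'],
  'product:2', ['name', 'Mango Juice', 'quantity', '4']
]

result = parse_search_reply(sample_reply)
result[:total]                   # => 2
result[:documents].first['name'] # => "Apple Juice"
```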

The full implementation is available in this GitHub repository.

Deployment Options

RediSearch can be deployed using:

Self-Managed Redis: Deploy using the official Docker image.
Redis Cloud: Redis Labs’ cloud-based offering.

Note: AWS ElastiCache does not support adding modules like RediSearch.

Conclusion

Deciding between RediSearch and Elasticsearch depends on your specific use case. Here are some considerations:

Pros of RediSearch:

Faster performance in certain scenarios.
Seamless integration with existing Redis deployments.

Cons of RediSearch:

Fewer search algorithms compared to Elasticsearch.
Limited tokenizer options.
Fewer libraries and lower community adoption.

Understanding these trade-offs will help you make an informed decision when choosing between RediSearch and other search engines.
