Introduction
RediSearch is a powerful full-text search engine available as a module for Redis. It offers advanced search capabilities, making it a compelling choice for those who need to integrate search functionality into their applications. In this guide, we will explore RediSearch through a hands-on tutorial, covering key full-text search concepts and how they differ from traditional wildcard searches.
Prerequisites
To follow this guide, you should have a basic understanding of programming and databases. While some knowledge of Redis is helpful, it is not mandatory.
What is Full-Text Search?
Full-text search (FTS) allows a search engine to examine all words within stored documents to match user-specified search criteria. Unlike wildcard searches, which rely on pattern matching (e.g., using the LIKE query in SQL), full-text search applies language processing such as tokenization and stemming to both the documents and the query. Here’s a comparison between wildcard search and full-text search:
Wildcard Search (LIKE Query) | Full-Text Search |
---|---|
Supported by databases like PostgreSQL, MySQL | Supported by RediSearch, Elasticsearch, PostgreSQL |
Based on wildcards (e.g., %ang% matches mango, angel) | Based on language processing (e.g., “going” can match “go”) |
Slower performance on large datasets | Generally faster performance |
Typically limited to a single attribute | Can search across multiple attributes in a single query |
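The difference in the table can be illustrated with a toy Ruby sketch. It is not how RediSearch works internally: the sample documents, the method names, and the crude suffix-stripping stemmer are all assumptions made for illustration.

```ruby
# Toy comparison of substring matching and term-based matching.
# The stemmer is a deliberately crude stand-in, not RediSearch's real one.

# Wildcard-style search: scan every document for a substring (like LIKE '%ang%').
def substring_search(docs, fragment)
  docs.select { |_id, text| text.downcase.include?(fragment.downcase) }.keys
end

# Crude stemmer: strips a few common English suffixes.
def toy_stem(word)
  word.downcase.sub(/(ing|es|s)\z/, '')
end

# Build an inverted index: stemmed term => ids of documents containing it.
def build_index(docs)
  index = Hash.new { |hash, key| hash[key] = [] }
  docs.each do |id, text|
    text.scan(/[[:alnum:]]+/).each { |word| index[toy_stem(word)] << id }
  end
  index
end

# Full-text-style search: stem the query term and look it up in the index.
def term_search(index, term)
  index.fetch(toy_stem(term), []).uniq
end

docs = {
  1 => 'Fresh mango juice',
  2 => 'An angel investor',
  3 => 'Going to the market'
}
index = build_index(docs)

p substring_search(docs, 'ang') # documents containing "ang" anywhere: [1, 2]
p term_search(index, 'go')      # matches "Going" through the shared stem: [3]
p term_search(index, 'ang')     # no document contains the whole term "ang": []
```

The substring search has to scan every document, while the term search is a single hash lookup, which is one reason full-text search tends to scale better on large datasets.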
Installing RediSearch
For this tutorial, you can install RediSearch using Docker. Ensure Docker is installed on your local environment and run the following command:
docker run -p 6379:6379 redislabs/redisearch:latest
Then, in a separate terminal tab, execute:
docker ps # Get the container id
docker exec -it {container_id} redis-cli
Alternatively, you can follow the official quick start guide.
Hands-On with Redis-CLI
Once installed, you can start using Redis-CLI with the RediSearch module. Below are steps for creating indexes, adding documents, and performing searches.
Creating Indexes
Indexes in RediSearch are akin to tables or collections in a database. They organize records for efficient searching. To create an index, use the FT.CREATE command:
FT.CREATE products ON HASH PREFIX 1 product: SCHEMA name TEXT SORTABLE quantity NUMERIC SORTABLE description TEXT
This command creates an index called products, with fields for name, quantity, and description. The prefix product: ensures that all Redis hash keys starting with this prefix are indexed.
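When the schema lives in application code, the same command can be assembled from a hash. The following is a minimal sketch; ft_create_args and the shape of the schema hash are hypothetical and not part of any client library.

```ruby
# Builds the argument list for FT.CREATE from a Ruby hash.
# `ft_create_args` and the schema layout are illustrative assumptions.
def ft_create_args(index:, prefix:, schema:)
  args = ['FT.CREATE', index, 'ON', 'HASH', 'PREFIX', '1', prefix, 'SCHEMA']
  schema.each do |field, options|
    args << field.to_s << options.fetch(:type).to_s.upcase
    args << 'SORTABLE' if options[:sortable]
  end
  args
end

create_args = ft_create_args(
  index: 'products',
  prefix: 'product:',
  schema: {
    name: { type: :text, sortable: true },
    quantity: { type: :numeric, sortable: true },
    description: { type: :text }
  }
)

# Prints the same command that was typed into redis-cli above.
puts create_args.join(' ')
```

Keeping the arguments in an array rather than a single string makes it easier to pass them to a Redis client one by one later.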
Adding Documents
Documents represent individual records within an index. To add a document, use the HMSET command with a key that starts with the index prefix (HMSET is deprecated since Redis 4.0; HSET accepts the same arguments):
HMSET product:1 name "Apple Juice" quantity 2 description "Fresh apple juice"
HMSET product:2 name "Mango Juice" quantity 4 description "Fresh mango juice"
HMSET product:3 name "Grape Smoothie" quantity 5 description "Fresh grape smoothie"
These commands create three documents within the products index.
Performing Searches
You can search for documents using the FT.SEARCH command. RediSearch supports two main search algorithms: prefix-based search and fuzzy search.
- Prefix-based Search: Matches documents based on the prefix of individual terms.
FT.SEARCH products app* // Returns product with the name "Apple Juice"
FT.SEARCH products jui* // Returns products with the names "Apple Juice" and "Mango Juice"
FT.SEARCH products @name:app* // Restricts the search to the name field
- Fuzzy Search: Matches documents based on Levenshtein Distance (L.D.), which measures the number of single-character edits needed to change one word into another.
FT.SEARCH products %jui% // L.D up to 1
FT.SEARCH products %%jui%% // L.D up to 2
FT.SEARCH products %%%jui%%% // L.D up to 3
Be cautious when using a high L.D.: the more edits you allow, the more unrelated terms match, which makes the results less relevant.
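To get a feel for what a given distance allows, here is a small Ruby sketch of the standard dynamic-programming Levenshtein algorithm. It is a textbook version written for illustration, not RediSearch's own code.

```ruby
# Levenshtein distance: the minimum number of single-character insertions,
# deletions, and substitutions needed to turn `source` into `target`.
def levenshtein(source, target)
  # previous[j] is the distance between the processed prefix of source
  # and the first j characters of target.
  previous = (0..target.length).to_a
  source.each_char.with_index(1) do |source_char, i|
    current = [i]
    target.each_char.with_index(1) do |target_char, j|
      cost = source_char == target_char ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution or match
      ].min
    end
    previous = current
  end
  previous.last
end

puts levenshtein('juice', 'juise') # 1: one substitution
puts levenshtein('jui', 'juice')   # 2: two insertions
puts levenshtein('mango', 'mango') # 0: identical words
```

Under this definition jui is two edits away from juice, so a query term would be expected to need a distance of at least 2 to reach it.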
Listing All Entries
You can list all entries in the index using:
FT.SEARCH products *
Pagination and Sorting
Implement basic pagination and sorting with the following syntax:
FT.SEARCH products {term}* LIMIT {OFFSET} {LIMIT} SORTBY {sort_field} {sort_direction}
FT.SEARCH products jui* LIMIT 0 10 SORTBY quantity desc
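LIMIT takes an offset followed by the number of results to return, so a page number has to be converted first. A minimal sketch of that conversion, assuming 1-based page numbers; paging_args is a hypothetical helper, not part of any client library.

```ruby
# Converts a 1-based page number into FT.SEARCH LIMIT and SORTBY arguments.
# `paging_args` is an illustrative helper, not a library API.
def paging_args(page:, per_page:, sort_field: nil, direction: :asc)
  raise ArgumentError, 'page and per_page must be positive' if page < 1 || per_page < 1

  offset = (page - 1) * per_page
  args = ['LIMIT', offset.to_s, per_page.to_s]
  args += ['SORTBY', sort_field.to_s, direction.to_s.upcase] if sort_field
  args
end

command = %w[FT.SEARCH products jui*] +
          paging_args(page: 3, per_page: 10, sort_field: :quantity, direction: :desc)

# Page 3 with 10 results per page starts at offset 20.
puts command.join(' ')
```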
For more detailed search queries, refer to the official documentation on query syntax and advanced concepts.
Tokenization and Escaping
Understanding tokenization is crucial for implementing effective search functionality, especially when handling special characters. RediSearch splits text into terms on whitespace and punctuation, so a value like Apple-Juice is indexed as the two terms Apple and Juice.
To search with special characters, escape each one with a backslash (written as '\\' inside a Ruby string literal):
module StringExtensions
  refine String do
    # Prefix every special character with a backslash.
    def escape_special_characters
      pattern = /[,.<>{}\[\]"':;!@\#$%^&*()\-+=~\/\\]/
      gsub(pattern) { |match| '\\' + match }
    end
  end
end

# Refinements must be activated before they can be used
using StringExtensions

# Sample Usage
puts 'Apple-Juice'.escape_special_characters # prints Apple\-Juice
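Why escaping helps can be seen with a toy tokenizer. The sketch below only approximates the behaviour described above: it treats every character other than letters, digits, and underscores as a separator unless it is escaped. That rule and the name toy_tokenize are assumptions for illustration, so check the RediSearch documentation for the exact separator list.

```ruby
# Toy tokenizer: lowercases the text and splits it on anything that is not
# a letter, digit, or underscore. A backslash-escaped character is kept as
# part of the term. This approximates, but does not reproduce, RediSearch.
def toy_tokenize(text)
  text.downcase
      .scan(/(?:\\.|[[:alnum:]_])+/)        # runs of word characters or escaped characters
      .map { |term| term.gsub(/\\(.)/, '\1') } # drop the escaping backslashes
end

p toy_tokenize('Apple-Juice')  # the hyphen splits the text: ["apple", "juice"]
p toy_tokenize('Apple\-Juice') # the escaped hyphen is kept: ["apple-juice"]
```

The same reasoning applies to queries: an unescaped hyphen in a search term is not treated as part of the word.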
Demo Application
There are no official libraries for Ruby or Ruby on Rails to interact with RediSearch, so custom implementations are necessary. Below are some code snippets to get started:
- Include the Redis library in the Gemfile:
gem 'redis'
- Initialize a Redis connection:
REDIS = Redis.new(url: 'redis://redis:6379')
- Create an index:
module RediSearch
  class Index
    def self.create(name:, prefix:, schema:)
      command = "FT.CREATE #{name} ON HASH PREFIX 1 #{prefix} SCHEMA #{schema.to_a.flatten.join(' ')}"
      REDIS.call(command.split(' '))
    end
  end
end
- Add a record:
REDIS.mapped_hmset("products:1", { id: 1, name: 'mango', quantity: 2 })
- Search the database:
module RediSearch
  class Document
    def self.search(index_name:, term: nil, filters: {}, paging: {}, sort: {})
      command = "FT.SEARCH #{index_name} #{term}"
      # Apply filters, pagination, and sorting as needed
      REDIS.call(command.split(' '))
    end
  end
end
The full implementation is available in this GitHub repository.
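The search method above builds one string and splits it on spaces, which breaks as soon as the query itself contains a space. One possible way to fill in the filters, pagination, and sorting is to assemble the arguments as an array instead. In this sketch, ft_search_args and the shapes of the filters, paging, and sort hashes are assumptions; only numeric range filters are covered.

```ruby
# Builds FT.SEARCH arguments as an array so that the query stays one argument
# even when it contains spaces. `ft_search_args` and the option hashes are
# illustrative assumptions, not the API of the linked repository.
def ft_search_args(index_name:, term: nil, filters: {}, paging: {}, sort: {})
  clauses = []
  clauses << term if term && !term.empty?
  # Numeric range filter, e.g. { quantity: 2..5 } becomes @quantity:[2 5]
  filters.each do |field, range|
    clauses << "@#{field}:[#{range.begin} #{range.end}]"
  end
  query = clauses.empty? ? '*' : clauses.join(' ')

  args = ['FT.SEARCH', index_name, query]
  if paging[:offset] && paging[:limit]
    args += ['LIMIT', paging[:offset].to_s, paging[:limit].to_s]
  end
  if sort[:field]
    args += ['SORTBY', sort[:field].to_s, sort.fetch(:direction, :asc).to_s.upcase]
  end
  args
end

search_args = ft_search_args(
  index_name: 'products',
  term: 'jui*',
  filters: { quantity: 2..5 },
  paging: { offset: 0, limit: 10 },
  sort: { field: :quantity, direction: :desc }
)
p search_args
# With the REDIS connection from the earlier step, the array could then be
# sent with REDIS.call(*search_args) (not executed here).
```

Escaping user input before it is passed in as term remains the caller's responsibility in this sketch.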
Deployment Options
RediSearch can be deployed using:
- Self-Managed Redis: Deploy using the official Docker image.
- Redis Cloud: Redis Labs’ cloud-based offering.
Note: AWS ElastiCache does not support adding modules like RediSearch.
Conclusion
Deciding between RediSearch and Elasticsearch depends on your specific use case. Here are some considerations:
Pros of RediSearch:
- Faster performance in certain scenarios.
- Seamless integration with existing Redis deployments.
Cons of RediSearch:
- Fewer search algorithms compared to Elasticsearch.
- Limited tokenizer options.
- Fewer libraries and lower community adoption.
Understanding these trade-offs will help you make an informed decision when choosing between RediSearch and other search engines.