Full-Text Search in AEM Pages and Assets including PDF, Excel and PowerPoint

 

Search is an important feature of any website. Implementing an efficient search on your website can considerably improve the experience of your visitors. For websites on AEM, creating a custom search component without creating any new indexes has been a challenge. 

We took up the challenge 

We created Full-Text Search – a custom search component to help end users search through all your web pages and published assets. This includes searching through PDFs, Excel files, PowerPoint presentations, asset metadata and SEO tags. This is a generic search component which can be used to search within any content and DAM hierarchy.

Compared with AEM's out-of-the-box (OOTB) search component, the custom search component matches the full sentence rather than the individual words in it. For asset search, it can even report the page number on which the text appears.

The objective behind creating a custom search component

The goal was to create an AEM search component that lets users search for any word, number, sentence or even special character, both in AEM website pages and in DAM assets (PDFs, Excel files, PowerPoint presentations).

The approach taken to create our AEM Search component 

We used the Omnisearch API together with QueryBuilder, which in turn uses Lucene indexes to search effectively and efficiently.

Prerequisites for creating Full-Text Search in AEM

For efficient searching, verify that your AEM instance has the following index nodes:

  1. /oak:index/lucene
  2. /oak:index/cmLucene
  3. /oak:index/damAssetLucene
  4. /oak:index/nodetype
  5. /oak:index/cqPageLucene
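
The check above can be scripted. The following is a minimal sketch, not part of the original component: the class name `IndexCheck` and the method `missing` are hypothetical, and the repository lookup is passed in as a plain predicate so the example runs without AEM on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

class IndexCheck {

    // Index paths taken from the prerequisites list above.
    static final List<String> REQUIRED = List.of(
            "/oak:index/lucene",
            "/oak:index/cmLucene",
            "/oak:index/damAssetLucene",
            "/oak:index/nodetype",
            "/oak:index/cqPageLucene");

    /** Returns the required index paths for which nodeExists reports false. */
    static List<String> missing(Predicate<String> nodeExists) {
        List<String> result = new ArrayList<>();
        for (String path : REQUIRED) {
            if (!nodeExists.test(path)) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for a real repository: pretend only two of the indexes exist.
        Set<String> present = Set.of("/oak:index/lucene", "/oak:index/damAssetLucene");
        System.out.println("Missing indexes: " + missing(present::contains));
    }
}
```

On a real instance, the predicate would typically wrap `javax.jcr.Session.nodeExists(String)`, which throws `RepositoryException` and therefore needs a try/catch inside the lambda.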

How to implement our Full-Text Search component in your AEM instance?
  1. Create a component that takes the search directory in its dialog and shows a text field and a submit button on the display layer.

    The component dialog will look like this:

    [Image: component dialog box for Full-Text Search in AEM]

    The basic UI will look like this, but you can customize it the way you want:

    [Image: basic UI of the component with search directory]

  2. Add an AJAX call on the submit button click that sends the search string and the search location.

  3. Create a servlet that reads the search parameters:

    1. Search string
    2. Search location

  4. Create a query using QueryBuilder to perform the search.

  5. Parse the result into the required format.

  6. Send the response as JSON.

  7. Render the result on screen.
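
To make step 4 more concrete, here is a minimal sketch of how the servlet might turn the two request parameters into QueryBuilder predicates. It is an illustration under assumptions, not the component's actual code: the class `FullTextPredicates` and its methods are hypothetical, and wrapping the search string in double quotes is one assumed way to get phrase matching instead of word-by-word matching. The predicate keys `path`, `type`, `fulltext` and `p.limit` are standard QueryBuilder predicates. The map is built with plain Java so the example runs without AEM.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FullTextPredicates {

    /**
     * Wraps the search string in double quotes so it is treated as one phrase.
     * Assumption: backslashes and embedded quotes are escaped with a backslash.
     */
    static String asPhrase(String term) {
        String escaped = term.trim()
                .replace("\\", "\\\\")
                .replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    /**
     * Builds a predicate map for one node type (for example cq:Page or
     * dam:Asset) below the given search location.
     */
    static Map<String, String> build(String term, String searchRoot,
                                     String nodeType, int limit) {
        Map<String, String> predicates = new LinkedHashMap<>();
        predicates.put("path", searchRoot);          // search location from the dialog
        predicates.put("type", nodeType);            // restrict to pages or assets
        predicates.put("fulltext", asPhrase(term));  // phrase search on the index
        predicates.put("p.limit", String.valueOf(limit));
        return predicates;
    }

    public static void main(String[] args) {
        // One query for pages, one for assets; the paths are example values.
        build("annual report 2019", "/content", "cq:Page", 20)
                .forEach((k, v) -> System.out.println(k + "=" + v));
        System.out.println();
        build("annual report 2019", "/content/dam", "dam:Asset", 20)
                .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Inside AEM, a map like this would usually be passed to `PredicateGroup.create(map)` and then to `QueryBuilder.createQuery(predicateGroup, session)`; the hits from `getResult()` can then be converted to JSON for steps 5 and 6.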

Advantages of our Custom Search Component

With Full-Text Search, you can improve the user journey on your AEM website because users can find the specific item they're looking for. This custom search component also helps with site personalization, since you can implement a user-permission-based search. You can even integrate analytics with the search to understand your users' demands at a more granular level.

Check out other articles in our blog to learn about the different tools and features that we’ve created around and for different Adobe Experience Cloud solutions.


Arpit Rathi

October 23, 2019

3 thoughts on “Full-Text Search in AEM Pages and Assets including PDF, Excel and PowerPoint”

  1. nice blog, well elucidated.

    One quick question: is there any specific reason to use the Omnisearch API with QueryBuilder? I believe AEM by default uses Lucene indexes to perform effective and efficient searching.

    • Yes, you are correct: AEM uses Lucene indexes for efficient searching, and we are using Lucene indexes as well. Omnisearch is an API that lets different search modules (or locations) plug in to a common, unified search interface.

  2. Nice blog.
    I have a question: is there a way to get the excerpt of the PDF?

    I am using the following query:
    /jcr:root/content/dam/site/documents//element(*, dam:Asset)
    [
    (jcr:contains(., 'hello'))
    and (_x002e_./_x002e_./jcr:content/@jcr:primaryType != 'dam:AssetContent')
    ]
