The Attribute Loader lets us attach additional metadata to the URLs crawled from a website.
For example, when a PDF document is crawled from the website, the crawler can supply only the PDF URL and file name. Details such as the title and description are not available from the crawl, but this additional metadata can be supplied through the Attribute Loader.
The Attribute Loader values are merged with the crawled data during indexing, using the primary key value (here, the URL) to match the records.
PDF URL crawled from the website - https://www.example.com/test/Albin.pdf
Attribute Loader data -
URL - https://www.example.com/test/Albin.pdf (primary key)
Title - test PDF
Description - test PDF
The Attribute Loader is executed before the actual indexing, and the metadata values are merged based on the primary key during indexing.
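The merge described above can be pictured with a small Python sketch. This is only an illustration of joining two data sets on a primary key, not the actual Search&Promote implementation. The function name merge_by_primary_key, the field names, and the choice to let Attribute Loader values override crawled values are all assumptions made for the example.

```python
def merge_by_primary_key(crawled, loader_rows, key="url"):
    """Merge Attribute Loader rows into crawled records that share the same key."""
    # Index the loader rows by primary key for direct lookup.
    extra = {row[key]: row for row in loader_rows}
    merged = []
    for doc in crawled:
        combined = dict(doc)
        # Assumption: loader fields fill in (or override) fields for a matching URL.
        combined.update(extra.get(doc[key], {}))
        merged.append(combined)
    return merged


# Hypothetical records modelled on the example in this post.
crawled = [
    {"url": "https://www.example.com/test/Albin.pdf", "filename": "Albin.pdf"},
]
loader_rows = [
    {
        "url": "https://www.example.com/test/Albin.pdf",
        "title": "test PDF",
        "description": "test PDF",
    },
]

merged = merge_by_primary_key(crawled, loader_rows)
print(merged[0])
```

A crawled document whose URL has no matching Attribute Loader row is left unchanged in this sketch, which is why the URL in the feed needs to match the crawled URL exactly.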
Defining Attribute Loader:
Sample Feed XML
<attributes xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
<channel>
<title>Attribute Loader Feed</title>
<Item>
<title>test PDF1</title>
<desc>test PDF1</desc>
<url>https://www.example.com/test/Albin1.pdf</url>
</Item>
<Item>
<title>test PDF2</title>
<desc>test PDF2</desc>
<url>https://www.example.com/test/Albin2.pdf</url>
</Item>
</channel>
</attributes>
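Before uploading a feed like the one above, it can help to check locally that it is well-formed and that every item carries the primary key. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the read_feed helper is hypothetical and is not part of Search&Promote. It assumes the element names used in the sample feed (Item, title, desc, url).

```python
import xml.etree.ElementTree as ET

# Copy of the sample feed from this post.
FEED = """<attributes xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">
<channel>
<title>Attribute Loader Feed</title>
<Item>
<title>test PDF1</title>
<desc>test PDF1</desc>
<url>https://www.example.com/test/Albin1.pdf</url>
</Item>
<Item>
<title>test PDF2</title>
<desc>test PDF2</desc>
<url>https://www.example.com/test/Albin2.pdf</url>
</Item>
</channel>
</attributes>"""


def read_feed(xml_text):
    """Return {url: {"title": ..., "desc": ...}} for every Item in the feed."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    rows = {}
    for item in root.iter("Item"):
        url = item.findtext("url")
        if not url:
            raise ValueError("Item is missing the <url> primary key")
        rows[url.strip()] = {
            "title": item.findtext("title", default=""),
            "desc": item.findtext("desc", default=""),
        }
    return rows


rows = read_feed(FEED)
for url, fields in rows.items():
    print(url, fields)
```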
To preview the Attribute Loader data, click Load Attribute Loader Data and then Start Load.
Then click Preview.
Make sure the Content-Types for the required document types are selected so that the crawler can crawl those document types from the website.
Configure the URL entry point (the website URL from which the documents should be crawled) and the URL mask (the URL pattern that should be considered for crawling).
Sample entry point page - https://www.example.com/home.html
<html>
<body>
<a href="https://www.example.com/test/Albin1.pdf">test1 pdf</a>
<a href="https://www.example.com/test/Albin2.pdf">test2 pdf</a>
</body>
</html>
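Since the merge depends on the URLs matching exactly, it can be useful to list the PDF links that an entry point page actually exposes and compare them with the URLs in the feed. The sketch below uses Python's standard html.parser module on a copy of the sample page. The PdfLinkCollector class is a hypothetical helper for local checking and does not reflect how the Search&Promote crawler itself works.

```python
from html.parser import HTMLParser

# Copy of the sample entry point page from this post.
PAGE = """<html>
<body>
<a href="https://www.example.com/test/Albin1.pdf">test1 pdf</a>
<a href="https://www.example.com/test/Albin2.pdf">test2 pdf</a>
</body>
</html>"""


class PdfLinkCollector(HTMLParser):
    """Collect the href of every anchor tag that points to a .pdf file."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.lower().endswith(".pdf"):
                self.links.append(href)


collector = PdfLinkCollector()
collector.feed(PAGE)
pdf_links = collector.links
print(pdf_links)
```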
Run the live index with the website URL entry point that references the PDF documents. The search results now display the metadata provided by the Attribute Loader for the PDF documents.
The Attribute Loader is not enabled by default; it must be enabled in the Search&Promote (S&P) account by your Adobe account representative or by Adobe Support.