In some cases, we may need to index only specific types of URLs from a website and exclude all the other URLs available.
URL masks can be used in Adobe Search&Promote (S&P) to achieve this.
A URL mask defines a rule that includes or excludes specific URLs during indexing.
Two kinds of rules can be defined:
Include - a pattern that specifies the URLs that will be indexed.
Exclude - a pattern that specifies the URLs that will be excluded from indexing.
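How include and exclude rules combine can be sketched in Python. This is only an illustration, not S&P's implementation: the `Rule` type, the `should_index` helper, the `/archive` path, the first-match-wins ordering, and the choice to skip URLs that match no rule are all assumptions made for the example.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    action: str  # "include" or "exclude"
    prefix: str  # URL prefix the rule applies to


def should_index(url: str, rules: List[Rule]) -> bool:
    # Assumption: rules are checked in order and the first match decides.
    for rule in rules:
        if url.startswith(rule.prefix):
            return rule.action == "include"
    # Assumption: URLs that match no rule are not indexed.
    return False


# Hypothetical rule set: skip an archive folder, index the rest of /content/doc.
rules = [
    Rule("exclude", "https://server.com/content/doc/archive"),
    Rule("include", "https://server.com/content/doc"),
]

print(should_index("https://server.com/content/doc/sample.html", rules))
print(should_index("https://server.com/content/doc/archive/old.html", rules))
print(should_index("https://server.com/content/other/page.html", rules))
```

In this sketch the more specific exclude rule is listed first so that it is not shadowed by the broader include rule.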
To index the URLs that start with a given mask:
The crawler will index all the URLs that start with https://server.com/content/doc
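A starts-with mask behaves like a simple prefix test. The snippet below is a minimal sketch of that idea, not S&P code; the helper name `matches_prefix_mask` and the second test URL are hypothetical.

```python
def matches_prefix_mask(url: str, mask: str) -> bool:
    # A URL is selected when it begins with the mask text.
    return url.startswith(mask)


MASK = "https://server.com/content/doc"

print(matches_prefix_mask("https://server.com/content/doc/sample.html", MASK))
print(matches_prefix_mask("https://server.com/content/other/page.html", MASK))
```

Note that a plain prefix test would also accept a URL such as https://server.com/content/docs/page.html, so ending the mask with a trailing slash may be worth considering when only one folder is intended.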
To index the URLs that follow a particular format:
The crawler will index all the URLs matching https://server.com/content/doc/*.html?id=*
e.g. https://server.com/content/doc/sample.html?id=123
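To check which URLs a wildcard mask like this would select, the mask can be translated into a regular expression. The sketch below assumes that `*` stands for any run of characters, that every other character (including `?`) is literal, and that the whole URL must match; the helper name `matches_wildcard_mask` is hypothetical and S&P's own matching may differ.

```python
import re


def matches_wildcard_mask(url: str, mask: str) -> bool:
    # Escape the literal pieces so "." and "?" are not treated as regex syntax,
    # then join them with ".*" wherever the mask had a "*".
    pieces = [re.escape(piece) for piece in mask.split("*")]
    pattern = ".*".join(pieces)
    return re.fullmatch(pattern, url) is not None


MASK = "https://server.com/content/doc/*.html?id=*"

print(matches_wildcard_mask("https://server.com/content/doc/sample.html?id=123", MASK))
print(matches_wildcard_mask("https://server.com/content/doc/sample.html", MASK))
```

Under these assumptions the first URL is selected and the second is not, because it has no id parameter.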
A regex can also be used to match the URLs for indexing:
The crawler will index all the URLs matching the regex ^.*/content/doc/.*\.html$
e.g. https://server.com/content/doc/sample.html
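The regex can be tried out locally before it is added as a mask. The sketch below uses Python's `re` module, which may differ in small details from the regex engine S&P uses; the helper name `matches_regex_mask` and the extra test URLs are hypothetical.

```python
import re

# Same expression as in the mask: any URL under /content/doc/ ending in .html
DOC_HTML = re.compile(r"^.*/content/doc/.*\.html$")


def matches_regex_mask(url: str) -> bool:
    return DOC_HTML.search(url) is not None


print(matches_regex_mask("https://server.com/content/doc/sample.html"))
print(matches_regex_mask("https://server.com/content/doc/sample.html?id=123"))
print(matches_regex_mask("https://server.com/content/doc/guide.pdf"))
```

Because the expression ends with `$`, a URL that carries a query string after .html is not matched by this pattern in Python.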