You often have a case where you want to use the Impression service provided by CQ for a custom operation, for example finding the top 10 most viewed pages or sorting all pages by their popularity.
Your impression data (page views) might live in an external system, and you want to import that data into CQ as impressions to have more application context.
You want to aggregate data across all publish instances.
Solutions:
Approach 1:
Creating your Own Impression service
You can create your own impression service by extending com.day.crx.statistics.Entry. Here is an example.
Supporting class
import com.day.crx.statistics.Entry;
import com.day.crx.statistics.PathBuilder;
/**
* Custom Impression Path Builder
* @author Yogesh Upadhyay
*
*/
public class ImpressionsPathBuilder extends PathBuilder {
/** The name of the node that contains the statistical data about a page */
public static final String STATS_NAME = ".stats";
/** The path of the page. */
private final String path;
/** Default constructor */
public ImpressionsPathBuilder(String path) {
super("yyyy/MM/dd");
this.path = path;
}
/**
* Formats the path for a {@link CustomImpressionEntry} instance.
*
* @param entry
* a {@link CustomImpressionEntry} instance
* @param buffer
* where to write the path to
*/
public void formatPath(Entry entry, StringBuffer buffer) {
CustomImpressionEntry pv = (CustomImpressionEntry) entry;
buffer.append(pv.getPathPrefix());
buffer.append(path);
buffer.append("/").append(STATS_NAME).append("/");
// add date nodes as specified in constructor pattern
super.formatPath(pv, buffer);
}
}
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import javax.jcr.Item;
import javax.jcr.Node;
import javax.jcr.NodeIterator;
import javax.jcr.PathNotFoundException;
import javax.jcr.RepositoryException;
import javax.jcr.Session;
import javax.jcr.ValueFormatException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.day.crx.statistics.Entry;
import com.day.crx.statistics.PathBuilder;
/**
* Custom Impression Entry
* @author Yogesh Upadhyay
*
*/
public class CustomImpressionEntry extends Entry {
/** default log */
private final Logger log = LoggerFactory.getLogger(getClass());
/** Name of the property that contains the view count */
public static final String VIEWS = "views";
/** Name of the property that contains the rolling week count */
public static final String ROLLING_WEEK_COUNT = "rollingWeekViews";
/** Name of the property that contains the rolling month count */
public static final String ROLLING_MONTH_COUNT = "rollingMonthViews";
/** The page */
private final String pagePath;
private final long count;
public CustomImpressionEntry(String pathPrefix, String pagePath,
String date, long count) {
super(pathPrefix);
this.pagePath = pagePath;
DateFormat format = new SimpleDateFormat("yyyy-MM-dd");
try {
this.setTimestamp(format.parse(date).getTime());
} catch (ParseException e) {
log.error("error while parsing date for impressionsentry", e);
}
this.count = count;
}
@Override
protected PathBuilder getPathBuilder() {
return new ImpressionsPathBuilder(pagePath);
}
@Override
public void write(Node node) throws RepositoryException {
log.info("writing impressions node " + node.getPath());
// A count of 1 means "increment": add 1 to the existing view count if there is one.
// In every other case (including the first view of the day) store the count as given.
if (this.count == 1 && node.hasProperty(VIEWS)) {
long currentCount = node.getProperty(VIEWS).getLong();
node.setProperty(VIEWS, currentCount + 1);
} else {
node.setProperty(VIEWS, count);
}
// set month value
Node month = node.getParent();
NodeIterator dayIter = month.getNodes();
long monthCount = 0;
while (dayIter.hasNext()) {
Node tmp = dayIter.nextNode();
if (tmp.hasProperty(VIEWS)) {
monthCount += tmp.getProperty(VIEWS).getLong();
}
}
month.setProperty(VIEWS, monthCount);
// set year value
Node year = month.getParent();
NodeIterator monthIter = year.getNodes();
long yearCount = 0;
while (monthIter.hasNext()) {
Node tmp = monthIter.nextNode();
if (tmp.hasProperty(VIEWS)) {
yearCount += tmp.getProperty(VIEWS).getLong();
}
}
year.setProperty(VIEWS, yearCount);
// set cumulative values for week and month
node.setProperty(ROLLING_WEEK_COUNT, getCumulativeCount(node, 7, VIEWS));
node.setProperty(ROLLING_MONTH_COUNT, getCumulativeCount(node, 30, VIEWS));
}
/**
* Calculates the cumulative view count on the <code>node</code>.
*
* @param node
* the node where to update the cumulative view count
* @param numDays
* the number of days back in time that are cumulated
* @param propName
* the name of the count property
* @throws RepositoryException
* if an error occurs while reading or updating.
*/
private long getCumulativeCount(Node node, int numDays, String propName)
throws RepositoryException, ValueFormatException {
long viewCount = 0;
Session session = node.getSession();
PathBuilder builder = getPathBuilder();
Calendar date = Calendar.getInstance();
date.setTimeInMillis(getTimestamp());
CustomImpressionEntry entry = new CustomImpressionEntry(
getPathPrefix(), pagePath, "1970-01-01", 0);
StringBuffer buffer = new StringBuffer();
for (int i = 0; i < numDays; i++) {
// re-use buffer
buffer.setLength(0);
entry.setTimestamp(date.getTimeInMillis());
builder.formatPath(entry, buffer);
String path = buffer.toString();
try {
Item item = session.getItem(path);
if (item.isNode()) {
Node n = (Node) item;
if (n.hasProperty(propName)) {
viewCount += n.getProperty(propName).getLong();
}
}
} catch (PathNotFoundException e) {
// no statistics found for that day
}
// go back one day
date.add(Calendar.DAY_OF_MONTH, -1);
}
return viewCount;
}
}
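To make the node layout concrete, here is a minimal sketch of the path that formatPath above composes: the path prefix, the page path, the .stats node, and then a yyyy/MM/dd date. It is plain Java with no CQ dependency. The class name StatsPathSketch, the prefix /var/statistics/pages and the fixed UTC time zone are assumptions made for illustration; the real PathBuilder may format the date in a different time zone.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** Hypothetical helper that mirrors the layout built by ImpressionsPathBuilder. */
final class StatsPathSketch {

    /** Same node name as ImpressionsPathBuilder.STATS_NAME. */
    static final String STATS_NAME = ".stats";

    /** Builds prefix + pagePath + "/.stats/" + yyyy/MM/dd for the given timestamp. */
    static String statsPath(String pathPrefix, String pagePath, long timestampMillis) {
        SimpleDateFormat datePart = new SimpleDateFormat("yyyy/MM/dd");
        // Assumption: UTC is used here only to keep the sketch deterministic.
        datePart.setTimeZone(TimeZone.getTimeZone("UTC"));
        return pathPrefix + pagePath + "/" + STATS_NAME + "/"
                + datePart.format(new Date(timestampMillis));
    }

    public static void main(String[] args) {
        // The prefix is a made-up example value, not read from StatisticsService.
        System.out.println(statsPath("/var/statistics/pages", "/content/site/en", 0L));
    }
}
```

Each day gets its own node under the page's .stats node, which is why write() can sum the sibling day nodes into the month node and the month nodes into the year node.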
You need to embed the following dependency for this
<dependency>
<groupId>com.day.cq</groupId>
<artifactId>cq-statistics</artifactId>
<scope>provided</scope>
</dependency>
<!--All Other Dependencies -->
Here is an example of how you can use this service
import java.util.Date;
import org.apache.sling.api.resource.Resource;
/**
* Custom Page Impression Service
* @author Yogesh Upadhyay
*
*/
public interface CustomImpressionService {
/**
* Records impressions for a path on a given date.
* If this method is called multiple times for the same date, the value is overwritten.
* The date must always be in the form yyyy-MM-dd.
* @param resourcePath
* @param date (in the form yyyy-MM-dd)
* @param count
*/
public void recordImpression(String resourcePath, String date, long count);
/**
* Records impressions for a resource on a given date.
* If this method is called multiple times for the same date, the value is overwritten.
* The date must always be in the form yyyy-MM-dd.
* @param resource
* @param date (in the form yyyy-MM-dd)
* @param count
*/
public void recordImpression(Resource resource, String date, long count);
/**
* Records impressions for a resource on a given date.
* If this method is called multiple times for the same date, the value is overwritten.
* @param resource
* @param date
* @param count
*/
public void recordImpression(Resource resource, Date date, long count);
/**
* Records a single impression for a resource on a given date.
* Each call for the same day increases the impression count by 1.
* @param resource
* @param date
*/
public void recordImpression(Resource resource, Date date);
/**
* Records a single impression for a path on a given date.
* Each call for the same day increases the impression count by 1.
* The date must be in the form yyyy-MM-dd.
* @param resourcePath
* @param date (in the form yyyy-MM-dd)
*/
public void recordImpression(String resourcePath, String date);
/**
* Returns the date formatted for use with the impression methods.
* @param date
* @return formatted date in the form yyyy-MM-dd
*/
public String getFormattedDateForImpression(Date date);
}
import java.text.SimpleDateFormat;
import java.util.Date;
import javax.jcr.RepositoryException;
import org.apache.felix.scr.annotations.Activate;
import org.apache.felix.scr.annotations.Component;
import org.apache.felix.scr.annotations.Deactivate;
import org.apache.felix.scr.annotations.Reference;
import org.apache.felix.scr.annotations.Service;
import org.apache.sling.api.resource.LoginException;
import org.apache.sling.api.resource.NonExistingResource;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ResourceResolver;
import org.apache.sling.api.resource.ResourceResolverFactory;
import org.osgi.service.component.ComponentContext;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.day.cq.statistics.StatisticsService;
/**
*
* @author Yogesh Upadhyay
*
*/
@Component
@Service
public class CustomPageImpressionImpl implements CustomImpressionService {
private final Logger log = LoggerFactory.getLogger(getClass());
private static final String STATISTICS_PATH = "/pages";
@Reference
private StatisticsService statisticsService;
@Reference
private ResourceResolverFactory resourceResolverFactory;
private ResourceResolver resourceResolver;
private String statisticsPath;
/**
* Records an impression.
* It creates an impression entry and adds it through the OOTB statistics service.
*/
@Override
public void recordImpression(String resourcePath, String date, long count) {
Resource resource;
ResourceResolver resourceResolver = null;
try {
resourceResolver = getAdminResourceResolver();
resource = resourceResolver.resolve(resourcePath);
if(!(resource instanceof NonExistingResource)){
CustomImpressionEntry customImpressionEntry = new CustomImpressionEntry(statisticsPath, resource.getPath(), date, count);
statisticsService.addEntry(customImpressionEntry);
}
} catch (LoginException e) {
log.error("Could not obtain a resource resolver to record impression for " + resourcePath, e);
} catch (RepositoryException e) {
log.error("Could not record impression for " + resourcePath, e);
} finally{
closeResourceResolver(resourceResolver);
}
}
@Override
public void recordImpression(Resource resource, String date, long count) {
if(null!=resource){
recordImpression(resource.getPath(), date,count);
}else{
log.error("Resource Provided is Null ");
}
}
@Override
public void recordImpression(Resource resource, Date date, long count) {
recordImpression(resource, getFormattedDateForImpression(date),count);
}
@Override
public void recordImpression(Resource resource, Date date) {
recordImpression(resource, getFormattedDateForImpression(date),1);
}
@Override
public void recordImpression(String resourcePath, String date) {
recordImpression(resourcePath, date,1);
}
private synchronized ResourceResolver getAdminResourceResolver() throws LoginException{
return resourceResolverFactory.getAdministrativeResourceResolver(null);
}
private synchronized void closeResourceResolver(ResourceResolver resourceResolver){
if(null!=resourceResolver && resourceResolver.isLive()){
resourceResolver.close();
}
}
public String getFormattedDateForImpression(Date date){
if(date!=null){
SimpleDateFormat simpleDateFormat = new SimpleDateFormat("yyyy-MM-dd");
return simpleDateFormat.format(date);
}
return null;
}
@Activate
protected void activate(ComponentContext ctx) {
statisticsPath = statisticsService.getPath() + STATISTICS_PATH;
}
@Deactivate
protected void deactivate(ComponentContext ctx) {
if (resourceResolver != null && resourceResolver.isLive()) {
resourceResolver.close();
}
}
}
Now you can import data from an external system (GA, SiteCatalyst, Kafka) and populate your instance with it using this service.
Once all the data is in place, you can use the following service to read it:
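As a rough illustration of such an import, the sketch below feeds rows of the form path,yyyy-MM-dd,count into a sink with the same signature as the three-argument recordImpression method. The ImpressionSink interface, the class name and the comma-separated row format are assumptions made for this example; in a real bundle you would pass the CustomImpressionService reference instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical importer for externally collected page views. */
final class ImpressionImportSketch {

    /** Stand-in for CustomImpressionService.recordImpression(String, String, long). */
    interface ImpressionSink {
        void recordImpression(String resourcePath, String date, long count);
    }

    /** Feeds "path,yyyy-MM-dd,count" rows into the sink and returns how many were accepted. */
    static int importRows(List<String> rows, ImpressionSink sink) {
        int imported = 0;
        for (String row : rows) {
            String[] cols = row.split(",");
            if (cols.length != 3) {
                continue; // skip malformed rows
            }
            try {
                long count = Long.parseLong(cols[2].trim());
                sink.recordImpression(cols[0].trim(), cols[1].trim(), count);
                imported++;
            } catch (NumberFormatException e) {
                // skip rows whose count is not a number
            }
        }
        return imported;
    }

    public static void main(String[] args) {
        List<String> recorded = new ArrayList<>();
        ImpressionSink sink = (path, date, count) -> recorded.add(path + " " + date + " " + count);
        int imported = importRows(
                Arrays.asList("/content/site/en,2014-03-01,120", "broken row"), sink);
        System.out.println(imported + " row(s) imported: " + recorded);
    }
}
```

Keep in mind that the write method of CustomImpressionEntry treats a count of exactly 1 as an increment, so an imported row with one view is added to the existing value rather than overwriting it.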
import java.util.Iterator;
import java.util.Set;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.resource.Resource;
/**
* Custom Impression Provider Service
* @author Yogesh Upadhyay
*
*/
public interface CustomImpressionProvider {
/**
* Get Iterator of all popular resource
* @param root_path
* @param isDeep
* @param num_days
* @return {@link Iterator<Resource>}
*/
public Iterator<Resource> getPopularResource(String root_path, boolean isDeep, int num_days);
/**
* Get Page impression count based on page path
* @param page_path
* @param num_days
* @return
*/
public int getPageImpressionCount(String page_path,int num_days);
/**
* Get most popular Resource based on root path.
* @param root_path
* @param num_days
* @return {@link Resource}
*/
public Resource getMostPopularResource(String root_path,int num_days);
/**
* Returns the set of popular resources, sorted by their impression count
* @param root_path
* @param isDeep
* @param num_days
* @param total_count
* @return
*/
public Set<Resource> getPopularResource(String root_path,boolean isDeep,int num_days, int total_count);
/**
* Get Json Output of all popular resource under a path
* Json Output give page path and impression count for all resource under root path sorted by impression count
* @param httpServletRequest
* @param root_path
* @param num_days
* @return
*/
public String getJsonForPopularString(SlingHttpServletRequest httpServletRequest, String root_path,int num_days);
}
Actual Implementation
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.TreeMap;
import javax.jcr.RepositoryException;
import org.apache.commons.lang3.StringUtils;
import org.apache.felix.scr.annotations.Component;
import org.apache.felix.scr.annotations.Reference;
import org.apache.felix.scr.annotations.Service;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.resource.LoginException;
import org.apache.sling.api.resource.NonExistingResource;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ResourceResolver;
import org.apache.sling.api.resource.ResourceResolverFactory;
import org.apache.sling.commons.json.JSONException;
import org.apache.sling.commons.json.io.JSONStringer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.day.cq.statistics.StatisticsService;
import com.day.cq.wcm.api.Page;
import com.day.cq.wcm.api.PageFilter;
import com.day.cq.wcm.api.PageManager;
import com.day.cq.wcm.api.PageManagerFactory;
import com.day.cq.wcm.api.WCMMode;
import com.day.cq.wcm.core.stats.PageViewReport;
/**
* Custom Impression Provider implementation
* @author Yogesh Upadhyay
*
*/
@Component
@Service
public class CustomImpressionProviderImpl implements CustomImpressionProvider {
private final Logger log = LoggerFactory.getLogger(getClass());
protected static final int MAX_RESULT_COUNT = 3000;
@Reference
protected StatisticsService statisticsService;
@Reference
protected ResourceResolverFactory resourceResolverFactory;
protected ResourceResolver resourceResolver;
protected String stat_path;
/**
* Get all popular resources.
* We use admin session for now to get popular resources
* TODO: may be not use admin session to get impression
* TODO: cache result some where so that we don't have to run this all the time.
*/
@Override
public Iterator<Resource> getPopularResource(final String root_path,
final boolean isDeep, final int num_days) {
Map<String, Integer> all_page_impression = null;
Iterator<Resource> popular_resource_iterator = null;
try {
resourceResolver = resourceResolverFactory.getAdministrativeResourceResolver(null);
Map<String, Integer> sorted_map = getpoularResourceMap(resourceResolver, root_path, isDeep, num_days);
if (sorted_map == null) {
// the root path did not resolve to a page
return null;
}
// LinkedHashSet keeps the most-viewed-first order of the sorted map (a HashSet would lose it)
Set<Resource> all_popular_resource_set = new java.util.LinkedHashSet<Resource>();
for (String each_popular_resource : sorted_map.keySet()) {
Resource each_popular_resource_object = resourceResolver.resolve(each_popular_resource);
if (!(each_popular_resource_object instanceof NonExistingResource)) {
all_popular_resource_set.add(each_popular_resource_object);
}
}
popular_resource_iterator = all_popular_resource_set.iterator();
} catch (LoginException e) {
log.error(e.getMessage());
e.printStackTrace();
}finally {
if (null != resourceResolver && resourceResolver.isLive()) {
resourceResolver.close();
}
}
return popular_resource_iterator;
}
/**
* Method to get page impression count
* We use admin session here as well to get count
*/
@Override
public int getPageImpressionCount(final String page_path, final int num_days) {
int total_count = 0;
try {
resourceResolver = resourceResolverFactory.getAdministrativeResourceResolver(null);
total_count = getPageImpressionCount(resourceResolver, page_path, num_days);
} catch (LoginException e) {
log.error(e.getMessage());
e.printStackTrace();
}catch (RepositoryException e) {
log.error(e.getMessage());
}finally{
if(this.resourceResolver!=null && resourceResolver.isLive()){
resourceResolver.close();
}
}
return total_count;
}
/**
* Method to get most popular resource
*/
@Override
public Resource getMostPopularResource(final String root_path,final int num_days) {
Iterator<Resource> most_popular_resource = getPopularResource(root_path, true, num_days);
if(most_popular_resource!=null && most_popular_resource.hasNext()){
return most_popular_resource.next();
}
return null;
}
/**
* get popular resource based on total count
*/
@Override
public Set<Resource> getPopularResource(final String root_path,final boolean isDeep, final int num_days, final int total_count) {
Iterator<Resource> popular_resources = getPopularResource(root_path, isDeep, num_days);
Set<Resource> popular_resource_set=null;
if(popular_resources!=null){
int temp_count=0;
//LinkedHashSet keeps the order in which the resources are returned
popular_resource_set = new java.util.LinkedHashSet<Resource>();
while(popular_resources.hasNext()){
//Stop once total_count resources have been collected
if(temp_count>=total_count) break;
popular_resource_set.add(popular_resources.next());
temp_count++;
}
}
return popular_resource_set;
}
/**
* Utility method to get page impression using resource resolver
* @param resourceResolver
* @param page_path
* @param num_days
* @return
* @throws RepositoryException
*/
protected int getPageImpressionCount(ResourceResolver resourceResolver,String page_path, int num_days) throws RepositoryException{
if(null==resourceResolver || StringUtils.isBlank(page_path)){
return 0;
}
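Page page = resourceResolver.resolve(page_path).adaptTo(Page.class);
if(page==null){
//the path does not resolve to a page, so there is nothing to report
return 0;
}
stat_path = statisticsService.getPath() + "/pages";
//use the OOTB page view report
PageViewReport pageViewReport = new PageViewReport(stat_path, page,WCMMode.DISABLED);
//report on the requested number of days instead of a fixed 30
pageViewReport.setPeriod(num_days);
//this is where the report is run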
Iterator stats = statisticsService.runReport(pageViewReport);
int totalPageViews = 0;
while (stats.hasNext()) {
Object[] res = (Object[]) stats.next();
totalPageViews = totalPageViews + Integer.parseInt(res[1].toString());
}
log.debug("Total page view for path "+page_path+" is "+totalPageViews);
return totalPageViews;
}
/**
* Get Json string using JsonStringer
*/
@Override
public String getJsonForPopularString(SlingHttpServletRequest httpServletRequest, String root_path, int num_days) {
JSONStringer jsonStringer = new JSONStringer();
log.debug("Root path is "+root_path);
jsonStringer.setTidy(true);
try {
jsonStringer.array();
jsonStringer.object().key("rootpath").value(root_path);
jsonStringer.key("num_days").value(num_days).endObject();
Map<String, Integer> all_popular_resource = getpoularResourceMap(httpServletRequest.getResourceResolver(), root_path, true, num_days);
//the map is null when the root path does not resolve to a page
if(all_popular_resource!=null){
log.debug(all_popular_resource.toString());
for(Entry<String, Integer> each_resource_entry:all_popular_resource.entrySet()){
jsonStringer.object();
jsonStringer.key("path").value(each_resource_entry.getKey());
jsonStringer.key("impression_count").value(each_resource_entry.getValue());
jsonStringer.endObject();
}
}
jsonStringer.endArray();
} catch (JSONException e) {
log.error(e.getMessage());
e.printStackTrace();
}
return jsonStringer.toString();
}
/**
* Helper method to get sorted map for popular resources
* @param resourceResolver
* @param root_path
* @param isDeep
* @param num_days
* @return
*/
protected Map<String, Integer> getpoularResourceMap(final ResourceResolver resourceResolver, final String root_path,
final boolean isDeep, final int num_days){
Map<String, Integer> sorted_map = null;
Map<String, Integer> all_page_impression = null;
try {
PageManager pageManager = resourceResolver.adaptTo(PageManager.class);
Page page = resourceResolver.resolve(root_path).adaptTo(Page.class);
if(page==null){
log.error("Root path is not present "+root_path);
return null;
}
//get all children including subchildren
Iterator<Page> all_child_pages = page.listChildren(new PageFilter(), isDeep);
all_page_impression = new HashMap<String, Integer>();
int all_result_count = 0;
while (all_child_pages.hasNext()) {
//Stop once MAX_RESULT_COUNT pages have been examined. This makes sure that performance is not impacted.
if (all_result_count >= MAX_RESULT_COUNT) {
break;
}
Page each_page = all_child_pages.next();
all_result_count++;
if (null != each_page) {
int totalPageViews = getPageImpressionCount(resourceResolver, each_page.getPath(), num_days);
//Only add pages whose view count is more than 0
if (totalPageViews > 0) {
log.debug("Adding "+each_page.getPath()+" to Map with value "+totalPageViews);
all_page_impression.put(each_page.getPath(), totalPageViews);
}
}
}
// Now create resource
log.debug("Unsorted Popular Map size is "+all_page_impression.size());
//Once we have whole map need to sort them based on impression
ValueComparator valueComparator = new ValueComparator(all_page_impression);
sorted_map = new TreeMap<String, Integer>(valueComparator);
sorted_map.putAll(all_page_impression);
log.debug("Sorted Popular Map size is "+sorted_map.size());
}catch (RepositoryException e) {
log.error(e.getMessage());
}
return sorted_map;
}
}
/**
* Helper class to sort map based on impression count
* @author Yogesh Upadhyay
*
*/
class ValueComparator implements Comparator<String> {
Map<String, Integer> base;
public ValueComparator(Map<String, Integer> base) {
this.base = base;
}
// Note: this comparator imposes orderings that are inconsistent with equals.
public int compare(String a, String b) {
if (base.get(a) >= base.get(b)) {
return -1;
} else {
return 1;
} // returning 0 would merge keys
}
}
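The comparator above never returns 0, which keeps pages with equal counts from being merged but also means lookups such as get() on the resulting TreeMap cannot find a key. As one possible alternative, the sketch below sorts the entries in a list and copies them into a LinkedHashMap, breaking ties by path. The class and method names are hypothetical and not part of the original service.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical alternative to ValueComparator + TreeMap. */
final class SortByImpressionsSketch {

    /** Returns a map that iterates from the highest count to the lowest; ties are ordered by path. */
    static Map<String, Integer> sortByCountDescending(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue()); // higher count first
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // LinkedHashMap preserves insertion order and still supports normal get() lookups.
        Map<String, Integer> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : entries) {
            sorted.put(entry.getKey(), entry.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("/content/site/en/a", 5);
        counts.put("/content/site/en/b", 9);
        counts.put("/content/site/en/c", 5);
        System.out.println(sortByCountDescending(counts).keySet());
    }
}
```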
Approach 2:
Suppose you don't want to write your own service as described in Approach 1 and would rather use the OOTB service available to you. The only problem is that you have multiple publish instances and need to combine their data to get an accurate picture. It is tricky to collect the data from every publish instance (through reverse replication), combine it on author, and then push it out again. However, you can use one instance to collect all the stat data (a kind of single source of truth) and then replicate it back to all instances every day.
Make sure that you enable page view tracking by adding the following line:
<cq:include script="/libs/foundation/components/page/stats.jsp" />
Then configure all publish instances to point to one DNS name using the following config (you can always override this under /apps):
/apps/wcm/core/config.publish/com.day.cq.wcm.core.stats.PageViewStatistics
/apps/wcm/core/config.publish/com.day.cq.wcm.core.stats.PageViewStatisticsImpl
Make sure that pageviewstatistics.trackingurl points to a single domain (you need to create a domain, something like impression.mydomain.com, served by a standalone CQ instance that takes all impression requests).
Now you have consolidated page impressions on one machine.
You can easily write a scheduler that runs every night and reverse replicates all the data to the author instance.
Once the data is on the author instance, you can use the replicator service to replicate it to all other publish instances.
Then you can use the code mentioned in Approach 1 to get popular resources.
Note: You can always use GA or a similar tool to track data. This approach is more useful if you want to do something internally and don't want to share data with GA.
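If you do end up collecting counts on each publish instance separately instead of using a single tracking instance, the combine step described at the start of this approach amounts to summing the counts per page. The sketch below shows only that arithmetic on plain maps; the class name and the map-based input are assumptions, and how the per-instance data reaches the author instance is left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical merge of page view counts collected on several publish instances. */
final class MergeImpressionsSketch {

    /** Sums the counts of every page path across all per-instance maps. */
    static Map<String, Long> merge(List<Map<String, Long>> perInstanceCounts) {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> counts : perInstanceCounts) {
            for (Map.Entry<String, Long> entry : counts.entrySet()) {
                total.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> publish1 = new HashMap<>();
        publish1.put("/content/site/en", 10L);
        Map<String, Long> publish2 = new HashMap<>();
        publish2.put("/content/site/en", 7L);
        publish2.put("/content/site/fr", 3L);
        System.out.println(merge(Arrays.asList(publish1, publish2)));
    }
}
```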