1. Explain the entire process of replication with respect to the data flow.
A :
- the author requests that certain content be published (activated); this can be initiated by a manual request, or by automatic triggers which have been preconfigured.
- the request is passed to the appropriate default replication agent; an environment can have several default agents which will always be selected for such actions. These agents use the buildContent method to build the content for replication. This method uses a session to read from and write to the content repository. The replication agent also has a replicate method that takes the replication content, type and options as arguments. If this method fails to replicate the content, the content is kept in the replication queue. A preprocessor runs before the replication and performs actions such as creating versions of the pages; it also checks the permissions. A ReplicationContentFilter is used to filter out content before the replication, e.g. to check whether a child of a page should be replicated or not. This filter uses the replication filter chain.
- the replication agent "packages" the content and places it in the replication queue.
- in the Websites tab the colored status indicator is set for the individual pages.
- the content is lifted from the queue and transported to the publish environment using the configured protocol; usually this is HTTP.
- a servlet in the publish environment receives the request and publishes the received content; the default servlet is http://localhost:4503/bin/receive.
- multiple author and publish environments can be configured.
- Features such as comments and forms allow users to enter information on a publish instance. For this, a type of replication is needed to return this information to the author environment, from where it is redistributed to the other publish environments. However, due to security considerations, any traffic from the publish to the author environment must be strictly controlled.
- This is known as reverse replication and functions using an agent in the publish environment which references the author environment. This agent places the input into an outbox. This outbox is matched with replication listeners in the author environment. The listeners (PollingTransportHandler) poll the outboxes to collect any input made and then distribute it as necessary. This ensures that the author environment controls all traffic.
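The forward (author to publish) steps above can be sketched in plain Java. This is only a simplified model, not AEM code: the class `ReplicationFlowSketch`, its `QueueItem` type and the `transport` predicate are hypothetical names introduced for illustration, and the assumption that a failed item stays at the head of the queue until a later retry is part of the sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Simplified, hypothetical model of an author-side replication agent.
// None of these types are AEM classes; they only mirror the steps listed above.
class ReplicationFlowSketch {

    // Stand-in for the packaged content an agent places in its queue.
    static final class QueueItem {
        final String path;
        final String payload;

        QueueItem(String path, String payload) {
            this.path = path;
            this.payload = payload;
        }
    }

    private final Deque<QueueItem> queue = new ArrayDeque<>();
    private final List<String> published = new ArrayList<>();

    // The activation request reaches the agent, which "packages" the
    // content and places it in the replication queue.
    void activate(String path, String content) {
        queue.addLast(new QueueItem(path, "packaged:" + content));
    }

    // Items are lifted from the queue in order and handed to the transport.
    // An item is removed only after a successful delivery, so a failed item
    // stays queued (assumption of this sketch) and can be retried later.
    int processQueue(Predicate<QueueItem> transport) {
        int delivered = 0;
        while (!queue.isEmpty()) {
            QueueItem head = queue.peekFirst();
            if (!transport.test(head)) {
                break; // delivery failed: keep the item for a retry
            }
            queue.removeFirst();
            published.add(head.path);
            delivered++;
        }
        return delivered;
    }

    int pending() {
        return queue.size();
    }

    List<String> published() {
        return published;
    }
}
```

Calling `processQueue(item -> false)` models an unreachable publish instance: nothing is delivered and both the order and the content of the queue are preserved for the next attempt.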
2. Explain serialization types in detail. And how they're handled.
A :
Default : For the default replication as explained above.
Dispatcher Flush : For the dispatcher cache flush.
Binary-less : Makes the publish instances point to the same datastore as the author, which reduces the storage cost and the time taken to activate pages. To use binary-less replication, change the data store to FileDataStore in repository.xml and then point all the instances to the same FileDataStore on the network.
Static Content Builder : For the file system view of the repository.
3. How to get the properties in activate method.
A :
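PropertiesUtil.toStringArray(componentContext.getProperties().get(name), defaultArray);
Here the second argument is the default array that is returned when the property is not set.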
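To make the idea concrete without pulling in the OSGi or Sling libraries, here is a minimal plain-Java stand-in. The class `ActivatePropertiesSketch` and both of its methods are hypothetical; the coercion rules (single String, array, collection, fallback to the defaults) are an approximation for illustration, not the library's actual implementation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in (hypothetical) for reading a multi-valued component
// property with a fallback; it does not use the OSGi or Sling APIs.
class ActivatePropertiesSketch {

    // Coerces a raw property value to String[], or returns the defaults
    // when the property is absent or of an unsupported type.
    static String[] toStringArray(Object value, String[] defaults) {
        if (value == null) {
            return defaults;
        }
        if (value instanceof String) {
            return new String[] { (String) value };
        }
        if (value instanceof Object[]) {
            List<String> out = new ArrayList<>();
            for (Object o : (Object[]) value) {
                if (o != null) {
                    out.add(o.toString());
                }
            }
            return out.toArray(new String[0]);
        }
        if (value instanceof Collection) {
            List<String> out = new ArrayList<>();
            for (Object o : (Collection<?>) value) {
                if (o != null) {
                    out.add(o.toString());
                }
            }
            return out.toArray(new String[0]);
        }
        return defaults;
    }

    // Mimics an activate method that receives the component's properties
    // as a map and reads one named property from it.
    static String[] activate(Map<String, Object> properties, String name) {
        return toStringArray(properties.get(name), new String[0]);
    }
}
```

Passing an explicit default keeps the activate method free of null checks, which is the main reason to go through a helper rather than casting the raw property value.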
4. Explain the difference between workflow launcher and sling event listener and observation manager with respect to their performance.
A :
- Workflow launcher is used at AEM level : This is an ideal way of listening to events. Launchers are cluster aware and work at the AEM level. The approach is well documented and is useful where many customizations are made by non-admin users.
- Event Listener (Handler) is used at Sling level : These are application level events. These events must have a topic registered to them. The event can be sent by EventAdmin either synchronously or asynchronously: postEvent for async and sendEvent for sync. An event handler can listen to such events as well as replication events. A handler gets blacklisted if it does not finish handling an event within the timeout configured in EventAdmin. Note also that the handler is called for each and every event on its topic. An event filter can be used to avoid this: EventAdmin checks the filter before calling the event handler.
- Observation manager is used at JCR level : This works at the lowest level, i.e. the node level (node added, property added, etc.). A live session is required for the listener to observe the repository. These listeners are not cluster aware: each time a node changes, the code executes n times, once per cluster node.
There are two more ways :
- Scheduled events
- Post to repository
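The synchronous/asynchronous split and the role of the event filter at the Sling level can be modelled in a few lines of plain Java. This is a hedged sketch: `EventAdminSketch` and its `register` method are hypothetical and are not the OSGi EventAdmin API; only the method names `sendEvent` and `postEvent` are borrowed from the description above.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical, simplified model of topic-based event delivery.
class EventAdminSketch {

    // One registered handler together with its topic and filter.
    private static final class Registration {
        final String topic;
        final Predicate<Map<String, Object>> filter;
        final Consumer<Map<String, Object>> handler;

        Registration(String topic,
                     Predicate<Map<String, Object>> filter,
                     Consumer<Map<String, Object>> handler) {
            this.topic = topic;
            this.filter = filter;
            this.handler = handler;
        }
    }

    private final List<Registration> registrations = new CopyOnWriteArrayList<>();

    void register(String topic,
                  Predicate<Map<String, Object>> filter,
                  Consumer<Map<String, Object>> handler) {
        registrations.add(new Registration(topic, filter, handler));
    }

    // Synchronous delivery: returns only after every matching handler ran.
    // The filter is checked first, so non-matching events never reach the handler.
    int sendEvent(String topic, Map<String, Object> properties) {
        int called = 0;
        for (Registration r : registrations) {
            if (r.topic.equals(topic) && r.filter.test(properties)) {
                r.handler.accept(properties);
                called++;
            }
        }
        return called;
    }

    // Asynchronous delivery: returns immediately, handlers run on another thread.
    CompletableFuture<Integer> postEvent(String topic, Map<String, Object> properties) {
        return CompletableFuture.supplyAsync(() -> sendEvent(topic, properties));
    }
}
```

A handler registered with a filter such as `p -> "/content/site".equals(p.get("path"))` is skipped for every other path, which is the saving the filter is meant to provide.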
5. TarMK vs MongoMK. Which one to prefer when?
A : MongoMK provides better scalability, while TarMK provides better performance. The choice can be based on the number of author instances required: when more than one author instance is needed, MongoMK is the answer. If there is a huge number of page updates per day, TarMK is the right choice. The choice also depends on the number of concurrent users, assets, page edits and searches per day. For a publish instance, the selection should be based on whether there is any user-generated content: if the site requires user-generated content, go for MongoMK.
6. Explain content builder and its methods.
A: A ContentBuilder assembles the data for replication. It has two variations of the create method, plus the getTitle and getName methods. The content builder is registered with the name property.
7. Global objects
8. What is the usage of SlingSafeMethodsServlet?
A:
- Used for read-only servlets
- Doesn't support POST, PUT and DELETE
- Supports GET, HEAD, TRACE and OPTIONS
9. What is the usage of SlingAllMethodsServlet?
A:
- Used for data modifying servlets
- Supports all methods, including POST, PUT and DELETE
10. What is Tar farm?
11. What is fragment in OSGI bundles?
A : Fragments are used to customize an OSGi bundle. They depend on the host bundle and use the same classloader as the host bundle.
12. What is bundle context and what is component context?
A :
13. How are large images replicated to publish?
A : Use binary-less replication. The same data store is then shared by the author and publish instances, so no copies of the binaries are created; only the metadata is copied between the instances. TarMK should be the choice since binary-less replication is used here.
14. Your system architecture in terms of author publisher servers.
A : Please prepare this answer according to your system.
15. How is the load balancing done on publishers with different dispatchers?
A : refer https://cqdump.wordpress.com/2015/01/12/connecting-dispatchers-and-publishers/
16. If there are multiple implementations of a service, which one will be picked up? The one that was used most recently or the one that is older?
A : The service with the highest service ranking will be picked up. If more than one implementation has the same service ranking, the one with the lowest service id will be picked up. Also, if you have a specific requirement for which service is used, there should be a service type defined as a property, and the same should be called using
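The ranking rule stated in this answer (highest service ranking first, ties broken by the lowest service id) can be expressed as a comparator. The following is a plain-Java sketch with hypothetical names (`ServiceSelectionSketch`, `ServiceRef`); it models the ordering only and does not use the OSGi service registry.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical model of how one service is chosen among several candidates.
class ServiceSelectionSketch {

    // Stand-in for a registered service with its ranking and id.
    static final class ServiceRef {
        final String name;
        final int ranking;
        final long serviceId;

        ServiceRef(String name, int ranking, long serviceId) {
            this.name = name;
            this.ranking = ranking;
            this.serviceId = serviceId;
        }
    }

    // Highest ranking sorts first; equal rankings fall back to the lowest id.
    static final Comparator<ServiceRef> PREFERRED_FIRST =
            Comparator.comparingInt((ServiceRef r) -> r.ranking)
                    .reversed()
                    .thenComparingLong(r -> r.serviceId);

    // Returns the preferred candidate, or empty when none is registered.
    static Optional<ServiceRef> pick(List<ServiceRef> candidates) {
        return candidates.stream().min(PREFERRED_FIRST);
    }
}
```

With candidates ranked 10, 100 and 100, the two services ranked 100 tie, and the one with the smaller id is returned.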
16. Why sling?
A : Sling is a web application framework based on REST principles that makes it easy to develop content-centric systems.
17. Why JCR?
A : Sling uses a JCR repository such as Jackrabbit; in the case of AEM, it is CRX.
18. Why OSGI?
A :
- Defines an architecture for modular applications
- Dynamic discovery of services and contracts
- Allows dynamic loading, unloading and configuration of bundles
A : Higher-level APIs such as the Sling API provide an easier way of accessing the data. The code is more readable, maintainable and productive, with less boilerplate. Exceptions are handled more elegantly, and the APIs include newer and more useful features.
20. What is new in AEM 6.4?
21. How are the Replication Queues maintained in case of system failure?
A : The replication queue is stored under /var/replication/data and is retrieved from this location in case of system failure.
Also AEM uses a proprietary binary format for replication called Durbo. Durbo includes the necessary checksumming to ensure that replicated content is not corrupted during transport.
22. Managing repository growth.
A : First of all, the root cause of the repository growth has to be identified. The following steps can be taken for this:
1. Configure a logger for org.apache.jackrabbit.oak.jcr.operations.writes at TRACE log level in http://aemhost:port/system/console/slinglog
2. Run the disk usage report at http://host:port/etc/reports/diskusage.html
3. Perform CPU profiling, capture thread dumps and analyse the AEM thread dumps.
23. Tell about some of the Sling APIs you've used.
ResourceResolver
ValueMap
Resource
ResourceWrapper
24. Why OAK?
25. Why do we need component and why do we need service?
26. Explain Query API in detail in java.
27. Explain how the permissions are evaluated on a node/page with internal logic.
28. How is micro kernel different from persistence manager?
29. Indexing in AEM.