April 26, 2020

Multilingual Search (MLS) – Breaking the Language Barrier

Adobe is committed to giving its AEM customers the ability to serve people across different countries and regions with a streamlined product. In pursuit of that goal of removing the language barrier, the AEM team at Adobe introduced Multilingual Search (MLS) with the release of AEM 6.2.

Let's try to understand it better through an example.

Suppose an automobile giant in Germany uses AEM as a comprehensive content management platform solution. We have three people here to demonstrate how it works:
  • A businessman in London
  • An engineer in Germany
  • A car enthusiast in China
The businessman is having some trouble with his car’s transmission system, so he goes to the online community for help. Here is how the conversation between the three gentlemen goes:

Businessman: Hi! I am facing a problem with my transmission system. The shifts are making an unusual noise. Any help?

The engineer searches for “Getriebe problem” (transmission problem) to see the queries people have posted, and he finds this post.

Engineer: Wann haben Sie das letzte Mal das Motoröl gewechselt? (When was the last time you changed the engine oil?)

Meanwhile, the Car enthusiast from China responds to this:

Car Enthusiast: 即使我面一些与我汽速器,我他写了和它取代 (I had an issue with my car’s transmission too. I wrote to them and got it replaced.)

Businessman: I did change it recently. I guess I will go for a complete replacement. Thank you.

Conversation Over.

The businessman comes back and searches for ‘Transmission system’ and he gets all the above replies, even though they were written in different languages.

This was a simple demonstration of how MLS works.

AEM 6.2 and FP4 for 6.1 come equipped with this powerful feature. It is offered in two variants:
  • Simple MLS
  • Advanced MLS
The major difference between the two is that Advanced MLS can detect the language of the query, modify the query, and search in the appropriate index. Note that MLS currently works only with a Mongo backend, with Solr as the search platform.

Whenever a user makes a contribution, for example a comment, reply, or question/answer, the User Generated Content (UGC) is stored in the verbatim_default index. Once the system detects the language, the content is stored in the matching language-specific index, such as verbatim_en, verbatim_fr, or verbatim_de.

If we have Simple MLS deployed, the system searches in the following indexes:
  • verbatim_default
  • verbatim_lang, where lang is the user's preferred language
The user can choose the language from a dropdown. If he chooses German (de), for example, the indexes searched are:
  • verbatim_default
  • verbatim_de
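The index selection described above can be sketched in a few lines of Python. This is only an illustration of the naming rule, not AEM's actual implementation; the function names simple_mls_fields and verbatim_clauses are made up for this example.

```python
def simple_mls_fields(preferred_lang: str) -> list[str]:
    """Return the fields Simple MLS searches for a preferred language.

    Hypothetical helper: verbatim_default is always searched, and the
    second field follows the language picked in the dropdown.
    """
    return ["verbatim_default", f"verbatim_{preferred_lang}"]


def verbatim_clauses(text: str, preferred_lang: str) -> str:
    """Build the verbatim part of a Solr-style query (illustrative only)."""
    fields = simple_mls_fields(preferred_lang)
    return " ".join(f"{field}:({text})" for field in fields)


print(verbatim_clauses("sprechen Sie laut", "de"))
# verbatim_default:(sprechen Sie laut) verbatim_de:(sprechen Sie laut)
```

Choosing a different language in the dropdown only changes the suffix of the second field; verbatim_default stays in the query either way.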
Sample query generated by AEM for the search text “sprechen Sie laut”:

INFO – 2016-07-26 10:19:51.171; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/select params={q=%2B(cqtags_ss(sprechen+Sie+laut)+author_username:sprechen+Sie+laut+verbatim_default:(sprechen+Sie+laut)+verbatim_de:(sprechen+Sie+laut)+title_t:(sprechen+Sie+laut)+author_display_name(sprechen+Sie+laut))+%2Bprovider_id:\/content/usergenerated/asi/mongo/content/sites/checkMLS/en/*+%2Bresource_type_s:*&df=provider_id&el=de&start=0&trf=verbatim&sort=timestamp+desc&fq={!cost%3D100}report_suite:mongo&rows=10&wt=javabin&version=2} hits=1 status=0 QTime=4
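The q parameter in that log line is URL-encoded, which is most of what makes it hard to read: + stands for a space and %2B for a literal plus sign. A small Python sketch using only the standard library can decode it and list the fields being searched. The string below is a shortened excerpt of the logged query with several clauses left out, so treat it as an illustration rather than the full query.

```python
import re
from urllib.parse import unquote_plus

# Shortened excerpt of the logged q parameter (some clauses omitted).
logged_q = (
    "%2B(verbatim_default:(sprechen+Sie+laut)"
    "+verbatim_de:(sprechen+Sie+laut))"
    "+%2Bresource_type_s:*"
)

# unquote_plus turns "+" into a space and "%2B" into a literal "+".
decoded = unquote_plus(logged_q)
print(decoded)
# +(verbatim_default:(sprechen Sie laut) verbatim_de:(sprechen Sie laut)) +resource_type_s:*

# Collect every field that is followed by a parenthesised search text.
fields = re.findall(r"(\w+):\(", decoded)
print(fields)
# ['verbatim_default', 'verbatim_de']
```

Once decoded, it is easy to see that the search text is matched against both verbatim_default and verbatim_de.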

In the case of Advanced MLS:
The system detects the language of the query itself and, after some modification, generates a query that searches in the appropriate index and returns the search results.

Sample query generated in this case for the search text “sprechen Sie laut”:

INFO – 2016-07-26 10:47:53.633; com.adobe.tat.LangDetectRequestHandler; FOR TAT LOG,params:{params(q=%2B(cqtags_ss:*(sprechen+Sie+laut)+author_username:sprechen+Sie+laut+verbatim_default:(sprechen+Sie+laut)+verbatim_en:(sprechen+Sie+laut)+title_t:(sprechen+Sie+laut)+author_display_name(sprechen+Sie+laut))+%2Bprovider_id:\/content/usergenerated/asi/mongo/content/sites/checkMLS/en/*+%2Bresource_type_s:*&df=provider_id&start=0&trf=verbatim&bl=en&sort=timestamp+desc&fq{!cost%3D100}report_suite:mongo&pl=en&rows=10&wt=javabin&version=2),defaults(df=text&echoParams=explicit&rows=10)}

INFO – 2016-07-26 10:47:53.633; com.adobe.tat.LangDetectRequestHandler; FOR TAT LOG, q={+(cqtags_ss:*(sprechen Sie laut) author_username:sprechen Sie laut verbatim_default:(sprechen Sie laut) verbatim_en:(sprechen Sie laut) title_t:(sprechen Sie laut) author_display_name:(sprechen Sie laut))+provider_id:\/content/usergenerated/asi/mongo/content/sites/checkMLS/en/*+resource_type_s:*}

INFO – 2016-07-26 10:47:53.633; com.adobe.tat.LangDetectRequestHandler; translate fileds name :verbatim

INFO – 2016-07-26 10:47:53.633; com.adobe.tat.langdetect.ShortTextLangDetector; There are signals for this language detector

INFO – 2016-07-26 10:47:53.636; com.adobe.nlp.core.processor.AdobeNLPComponentRunner; using: 2315431 ns, to run process method: process

INFO – 2016-07-26 10:47:53.636; com.adobe.tat.LangDetectRequestHandler; new_q={+(cqtags_ss:*(sprechen Sie laut) author_username:sprechen Sie laut verbatim_default:(sprechen Sie laut) verbatim_de:(sprechen Sie laut) verbatim_en:(sprechen Sie laut) title_t:(sprechen Sie laut) author_display_name:(sprechen Sie laut))
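Comparing q with new_q in the logs, the visible change is an extra verbatim_de clause added next to verbatim_default once German is detected. The Python sketch below mimics that rewrite on a simplified query string. It is an assumption-laden illustration, not Adobe's code: add_language_field is a hypothetical name, and the detected language is passed in directly because the real detection happens inside AEM's language detector.

```python
import re


def add_language_field(query: str, detected_lang: str) -> str:
    """Add a verbatim_<lang> clause after verbatim_default if it is missing.

    Hypothetical helper that imitates the q -> new_q change in the logs.
    """
    new_field = f"verbatim_{detected_lang}"
    if f"{new_field}:" in query:
        return query  # this language field is already being searched

    # Find the verbatim_default clause and reuse its search text.
    match = re.search(r"verbatim_default:\(([^)]*)\)", query)
    if match is None:
        return query  # nothing to anchor the new clause to

    clause = f" {new_field}:({match.group(1)})"
    return query[: match.end()] + clause + query[match.end():]


q = "verbatim_default:(sprechen Sie laut) verbatim_en:(sprechen Sie laut)"
print(add_language_field(q, "de"))
# verbatim_default:(sprechen Sie laut) verbatim_de:(sprechen Sie laut) verbatim_en:(sprechen Sie laut)
```

As in the logged new_q, the existing verbatim_en clause is kept, so the rewritten query searches both the detected language and the original one.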

While the sample queries above may look like incomprehensible gibberish at first glance, with Multilingual Search Adobe has certainly broken down the language barrier between community members who speak different languages. Stay tuned for more insights into this!


By aem4beginner
