Tags
Last modified by Thomas Mortagne on 2016/09/08 12:30
search
All pages tagged with search
Activity Stream for pages tagged with search
29 Jan
- Solr Search Query API 29 Jan, 18:02
14 Dec 2018
- Solr Search Application 14 Dec, 10:59 2018
07 Sep 2018
- Solr Search Application 07 Sep, 11:48 2018
09 Mar 2018
- Solr Search API 09 Mar, 11:00 2018
23 Nov 2017
- Solr Search Application 23 Nov, 16:54 2017
13 Nov 2017
- Solr Search Application 13 Nov, 16:56 2017
08 Sep 2017
- Solr Search API 08 Sep, 09:53 2017
06 Sep 2017
- Solr Search API 06 Sep, 09:01 2017
17 Mar 2017
- Solr Search Application 17 Mar, 21:17 2017
24 Jan 2017
- Solr Search Query API 24 Jan, 11:11 2017
20 Jul 2016
- Solr Search Application 20 Jul, 10:28 2016
19 Jul 2016
- Solr Search Query API 19 Jul, 16:28 2016
- Solr Search API 19 Jul, 16:23 2016
- Solr Search Application 19 Jul, 16:22 2016
29 Apr 2016
- Solr Search API 29 Apr, 11:25 2016
25 Sep 2015
- Solr Search Query API 25 Sep, 12:28 2015
- Solr Search API 25 Sep, 11:22 2015
- Solr Search Application 25 Sep, 11:00 2015
15 Sep 2015
07 Aug 2015
- Solr Search Query API 07 Aug, 14:54 2015
This page includes RemoveDuplicatesTokenFilterFactory in the example analyzer and claims that the three "flip" terms will be reduced to one term by that filter.
That is NOT what happens. The filter only removes a term when an identical term has already been seen at the same *position*, which normally happens only when something like the synonym filter stacks several tokens at one position.
After running through the standard tokenizer and the snowball filter, the three "flip" terms will have different positions. This screenshot shows what happens if the exact analyzer documented on this page is used with "flip flipped flipping" as the input:
https://www.dropbox.com/s/k4hbtzkthse122i/xwiki-remove-duplicates-analysis-results.png?dl=0
The documentation for the filter does say that terms must have the same position to be considered duplicates:
https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-RemoveDuplicatesTokenFilter
There is no filter included with Solr or Lucene that will remove duplicate terms at different positions.
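To make the position rule concrete, here is a small Python model of the behavior described above. It is only a sketch, not Lucene's code: the `remove_duplicates` function, the `(term, position_increment)` pairs, and the "turn" synonym are all made up for illustration. A position increment of 1 means the token starts a new position, and 0 means it is stacked on the previous token's position.

```python
from typing import List, Tuple

# Hypothetical token model: (term, position_increment).
Token = Tuple[str, int]


def remove_duplicates(tokens: List[Token]) -> List[Token]:
    """Drop a token only if the same term was already seen at the same position."""
    seen_at_position = set()
    kept = []
    for term, pos_inc in tokens:
        if pos_inc > 0:
            # The stream moved to a new position, so forget the terms
            # collected at the previous one.
            seen_at_position.clear()
        if term in seen_at_position:
            # Same term at the same position: a true duplicate, skip it.
            continue
        seen_at_position.add(term)
        kept.append((term, pos_inc))
    return kept


# "flip flipped flipping" after tokenizing and stemming: three identical
# terms, but each one advances the position by 1.
stemmed = [("flip", 1), ("flip", 1), ("flip", 1)]

# An assumed stacked stream, such as a synonym filter might produce:
# the second and third tokens share the first token's position.
stacked = [("flip", 1), ("flip", 0), ("turn", 0)]

print(remove_duplicates(stemmed))  # [('flip', 1), ('flip', 1), ('flip', 1)]
print(remove_duplicates(stacked))  # [('flip', 1), ('turn', 0)]
```

In this model the three stemmed "flip" terms all survive because each sits at its own position, while the stacked duplicate is dropped.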