Solr Search API

Last modified by Thomas Mortagne on 2026/05/28 15:11

Solr engine management: configuration, indexing, listeners, script service, etc. This module does not handle the search queries.

Type: JAR
Developed by: XWiki Development Team
License: GNU Lesser General Public License 2.1
Bundled With: XWiki Standard

Installable with the Extension Manager

Description

Check out the Solr Core to understand what information is being indexed.

Configuration

The following properties can be configured in the xwiki.properties file for the Solr API:

#-------------------------------------------------------------------------------------
# Solr Search
#-------------------------------------------------------------------------------------

#-# [Since 4.5M1]
#-# The Solr server type. Currently accepted values are "embedded" (default) and "remote".
# solr.type=embedded

#-# [Since 4.5M1]
#-# The location where the embedded Solr instance home folder is located.
#-# The default is the subfolder "store/solr" inside folder defined by the property "environment.permanentDirectory".
# solr.embedded.home=/var/local/xwiki/store/solr

#-# [Since 12.2]
#-# The URL of the Solr server (the root server and not the URL of a core).
#-# The default value assumes that the remote Solr server is started in a different process on the same machine, using the default port.
# solr.remote.baseURL=http://localhost:8983/solr

#-# [Since 5.1M1]
#-# Elements to index are not sent to the Solr server one by one but in batch to improve performances.
#-# It's possible to configure this behavior with the following properties:
#-#
#-# The maximum number of elements sent at the same time to the Solr server
#-# The default is 50.
# solr.indexer.batch.size=50
#-# The maximum number of characters in the batch of elements to send to the Solr server.
#-# The default is 10000.
# solr.indexer.batch.maxLength=10000

#-# [Since 5.1M1]
#-# The maximum number of elements in the background queue of elements to index/delete
#-# The default is 10000.
# solr.indexer.queue.capacity=100000

#-# [Since 6.1M2]
#-# Indicating if a synchronization between SOLR index and XWiki database should be run at startup.
#-# Synchronization can be started from search administration.
#-# The default is true.
# solr.synchronizeAtStartup=false

#-# [Since 12.5RC1]
#-# Indicates which wiki synchronization to perform when the "solr.synchronizeAtStartup" property is set to true.
#-# Two modes are available:
#-#   - WIKI: indicate that the synchronization is performed when each wiki is accessed for the first time.
#-#   - FARM: indicate that the synchronization is performed once for the full farm when XWiki is started.
#-# For large farms and in order to spread the machine's indexing load, the WIKI value is recommended, especially if
#-# some wikis are not used.
#-# The default is:
# solr.synchronizeAtStartupMode=FARM

#-# [Since 17.2.0RC1]
#-# [Since 16.10.5]
#-# [Since 16.4.7]
#-# Indicates the batch size for the synchronization between SOLR index and XWiki database. This defines how many
#-# documents will be loaded from the database and Solr in each step. Higher values lead to fewer queries and thus
#-# better performance but increase the memory usage. The expected memory usage is around 1KB per document, but
#-# depends highly on the length of the document names.
#-# The default is 1000.
# solr.synchronizeBatchSize=1000

XWiki 16.10.9+, 17.4.1+, 17.5.0+: The default batch sizes have been increased to 1000 documents and 10 million characters to speed up indexing, as commits are quite slow. Solr is configured to perform a soft commit every 3 seconds, so this should not have much influence on how fast documents become available in the search.

XWiki 17.5.0+: The indexer keeps up to two full batches of data to index in memory. When configuring the batch size, set the maximum number of characters so that your XWiki instance has enough free memory for this data.
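As a rough illustration of the sizing advice above, the following sketch estimates the memory held by the indexer batches and by one synchronization step. It assumes 2 bytes per character (the worst case for a Java String) and takes the "around 1KB per document" figure as 1024 bytes; both are assumptions for the sake of the arithmetic, and the class and method names are made up for this example.

```java
/**
 * Back-of-the-envelope memory estimates for the Solr indexer settings.
 * The class and its constants are illustrative assumptions, not part of XWiki.
 */
final class SolrIndexerMemoryEstimate
{
    /** Assumed cost of one character in memory (UTF-16 worst case for a Java String). */
    static final long BYTES_PER_CHAR = 2;

    /** The indexer keeps up to two full batches in memory (XWiki 17.5.0+). */
    static final long BATCHES_IN_MEMORY = 2;

    /** "Around 1KB per document" during synchronization, taken here as 1024 bytes. */
    static final long SYNC_BYTES_PER_DOCUMENT = 1024;

    /** Estimated bytes held by the indexer for a given solr.indexer.batch.maxLength. */
    static long indexerBatchBytes(long batchMaxLength)
    {
        return BATCHES_IN_MEMORY * batchMaxLength * BYTES_PER_CHAR;
    }

    /** Estimated bytes used by one synchronization step for a given solr.synchronizeBatchSize. */
    static long synchronizationBytes(long synchronizeBatchSize)
    {
        return synchronizeBatchSize * SYNC_BYTES_PER_DOCUMENT;
    }

    public static void main(String[] args)
    {
        long batchMaxLength = 10_000_000L; // 10 million characters
        long synchronizeBatchSize = 1000L; // default synchronization batch size

        System.out.printf("Indexer batches: about %d MiB%n",
            indexerBatchBytes(batchMaxLength) / (1024 * 1024));
        System.out.printf("Synchronization step: about %d KiB%n",
            synchronizationBytes(synchronizeBatchSize) / 1024);
    }
}
```

The real footprint depends on the content being indexed and on JVM internals, so treat these numbers as an order of magnitude only.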

Set up a remote Solr server

XWiki is only tested with the version of Solr that is embedded in XWiki, so it's usually the one with which you will have the fewest surprises, but it should be possible to use a higher version of Solr as a standalone instance. Note that Solr supports only the current and previous major versions (Solr 9 should be usable with XWiki versions that expect Solr 8.x or Solr 9.x).

Warning: It is not recommended to use Solr 10.x until XWIKI-24326 is fixed.

Here is a compatibility matrix to help with the choice:

XWiki version      Embedded/Tested Solr version
11.4 to 11.5       7.7.1
11.6 to 12.2       8.1.1
12.3 to 13.0       8.5.1
13.1 to 14.7       8.8.0
14.8 to 16.1.0     8.11.2
16.2.0+            9.4.1

Download and install Solr.

Warning (XWiki 16.6.0+): You will need to enable the analysis-extras module.

Debian based system

If your Solr instance is installed on a Debian/Ubuntu system, take a look at InstallationViaAPT.

Manual install

The Solr REST API is unfortunately too limited, so you will need to create several cores manually on your Solr instance. For each one, download the zip file synchronized with your version of XWiki and unzip its content into a new folder, located alongside the other Solr cores, with the following names:

XWiki <16.2.0

Solr8:

Indicate in the xwiki.properties file that you want to use a remote Solr instance, and provide its URL:

solr.type=remote

solr.remote.baseURL=http://solrhost/solr

When using solr.remote.baseURL, you can control the name of the search core (and the prefix for the other cores) using the solr.remote.corePrefix property (by default, the main core is "xwiki" and the others are prefixed with "xwiki_").
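After creating the cores, it can be useful to check that the remote server actually exposes them under the configured base URL. The sketch below builds a request to Solr's CoreAdmin STATUS action and fetches the response with the JDK HTTP client (Java 11+). The class name, the helper methods and the example core name passed in are illustrative; this is not how XWiki itself performs the check.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Illustrative helper (not part of XWiki) to query the status of a core on a remote Solr. */
final class RemoteSolrCoreCheck
{
    /**
     * Builds the CoreAdmin STATUS URI for a core.
     *
     * @param baseURL the value of solr.remote.baseURL, for example http://solrhost/solr
     * @param coreName the name of the core to check
     */
    static URI coreStatusUri(String baseURL, String coreName)
    {
        // Tolerate a trailing slash in the configured base URL.
        String base = baseURL.endsWith("/") ? baseURL.substring(0, baseURL.length() - 1) : baseURL;
        String encodedCore = URLEncoder.encode(coreName, StandardCharsets.UTF_8);

        return URI.create(base + "/admin/cores?action=STATUS&core=" + encodedCore + "&wt=json");
    }

    /** Sends the STATUS request and returns the raw JSON body for manual inspection. */
    static String fetchCoreStatus(String baseURL, String coreName) throws IOException, InterruptedException
    {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(coreStatusUri(baseURL, coreName)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        return response.body();
    }
}
```

For example, `RemoteSolrCoreCheck.fetchCoreStatus("http://solrhost/solr", "xwiki")` returns the status JSON for the default main core name. If the Solr instance requires authentication, the request would need to be adapted accordingly.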

Data transfer upon moving the Solr of an existing instance to a remote Solr

TODO: add a note about how to move data for data cores (ratings & events) from the embedded Solr to the remote Solr

Backup remote Solr data

TODO: add a note about what and how to backup the data from the external Solr server.

Performance

By default, XWiki ships with an embedded Solr. This is mostly for ease of use: the embedded instance is not really recommended by the Solr team, so you might want to externalize it once your wiki has a lot of pages. Solr uses a lot of memory, and a standalone Solr instance is generally faster than the embedded one. The difference should not be very noticeable in a small wiki, but if you start to have memory issues and slow search results, you should probably install and set up an external Solr instance using the guide above.

The speed of the drive where the Solr index is located can also be very important, because Solr/Lucene is quite filesystem intensive. For example, putting the index on an SSD might give a noticeable boost.

You can also find more Solr-specific performance details on https://wiki.apache.org/solr/SolrPerformanceProblems. Standalone Solr also comes with a very nice UI, along with monitoring and test tools.

Size on disk

It depends on the size of each document but an instance like the http://www.myxwiki.org farm (mostly standard documents in lots of wikis) uses 3.2GB of disk space to store around 180000 documents, which gives approximately 18KB per document.
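The per-document figure above can be reproduced, and extrapolated to another instance, with a few lines of arithmetic. The sketch below assumes decimal units (1GB = 10^9 bytes) and a document mix similar to the reference farm; the class and method names are made up for this example.

```java
/** Illustrative disk-size extrapolation based on the reference farm figures. */
final class SolrIndexSizeEstimate
{
    /** 3.2GB, assuming decimal gigabytes. */
    static final double REFERENCE_INDEX_BYTES = 3.2e9;

    /** Around 180000 documents in the reference farm. */
    static final long REFERENCE_DOCUMENTS = 180_000L;

    /** Average index size per document in the reference farm, in bytes. */
    static double bytesPerDocument()
    {
        return REFERENCE_INDEX_BYTES / REFERENCE_DOCUMENTS;
    }

    /** Linear extrapolation of the index size for another number of documents, in bytes. */
    static double estimateIndexBytes(long documentCount)
    {
        return documentCount * bytesPerDocument();
    }

    public static void main(String[] args)
    {
        System.out.printf("Per document: about %.1f KB%n", bytesPerDocument() / 1000);
        System.out.printf("1,000,000 documents: about %.1f GB%n", estimateIndexBytes(1_000_000L) / 1e9);
    }
}
```

This is a linear extrapolation only: wikis with many large attachments or very long pages can deviate significantly from it.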

Extensibility

The Solr module provides several ways for extensions to use or modify its behavior.

Dedicated Solr core

It is possible to "reserve" and initialize a dedicated Solr core by implementing the component role org.xwiki.search.solr.SolrCoreInitializer. You can also access a specific core using org.xwiki.search.solr.Solr#getClient(String).

An org.xwiki.search.solr.AbstractSolrCoreInitializer is provided to make it easier to implement org.xwiki.search.solr.SolrCoreInitializer. It comes with the following features:

  • calls getVersion() to know the current specification version of the core
  • calls createSchema() when the schema does not exist yet
  • calls migrateSchema(long cversion) when the schema already exists but is at an older version
  • provides a lot of helper methods to create field types and fields, including support for a virtual Map field.

An org.xwiki.search.solr.SolrUtils component also exists to provide helpers to manipulate Solr cores beyond the schema manipulation.

One thing which is currently not automated is the creation of the core in the case of a remote Solr instance.
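The version-driven lifecycle listed above (getVersion(), createSchema(), migrateSchema(long cversion)) can be illustrated with a small self-contained sketch. The abstract class below is a hypothetical stand-in written for this example: it is not org.xwiki.search.solr.AbstractSolrCoreInitializer, it does not talk to Solr, and the way the stored version is passed in is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in mimicking the described lifecycle; NOT the XWiki class. */
abstract class VersionedCoreInitializerSketch
{
    /** @return the current specification version of the core */
    protected abstract long getVersion();

    /** Called when the schema does not exist yet. */
    protected abstract void createSchema();

    /** Called when the schema exists but is at an older version. */
    protected abstract void migrateSchema(long cversion);

    /**
     * @param storedVersion the version found in the core, or null when there is no schema yet
     *            (how the version is stored is an assumption of this sketch)
     */
    final void initialize(Long storedVersion)
    {
        if (storedVersion == null) {
            createSchema();
        } else if (storedVersion < getVersion()) {
            migrateSchema(storedVersion);
        }
        // Same or newer version: nothing to do.
    }
}

/** Example initializer that only records which step was triggered. */
final class ExampleCoreInitializerSketch extends VersionedCoreInitializerSketch
{
    final List<String> log = new ArrayList<>();

    @Override
    protected long getVersion()
    {
        return 2L;
    }

    @Override
    protected void createSchema()
    {
        // A real initializer would declare its field types and fields here.
        log.add("create v2");
    }

    @Override
    protected void migrateSchema(long cversion)
    {
        // Apply only the changes introduced after the stored version.
        if (cversion < 2L) {
            log.add("migrate " + cversion + " -> 2");
        }
    }

    /** Runs the lifecycle for a given stored version and returns the recorded steps. */
    static List<String> run(Long storedVersion)
    {
        ExampleCoreInitializerSketch initializer = new ExampleCoreInitializerSketch();
        initializer.initialize(storedVersion);
        return initializer.log;
    }
}
```

Keeping each migration step guarded by a version comparison lets a core that is several versions behind catch up in a single call.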

Customize the "search" core indexing

It's possible to implement a component which is called every time an entity is indexed in the "search" core. For that, you can implement the role org.xwiki.search.solr.SolrEntityMetadataExtractor (and make sure you use a unique role hint). The currently supported entities (and corresponding roles to implement) are:

  • org.xwiki.search.solr.SolrEntityMetadataExtractor<DocumentReference>
  • org.xwiki.search.solr.SolrEntityMetadataExtractor<AttachmentReference>
  • org.xwiki.search.solr.SolrEntityMetadataExtractor<ObjectReference>
  • org.xwiki.search.solr.SolrEntityMetadataExtractor<ObjectPropertyReference>

Prerequisites & Installation Instructions

We recommend using the Extension Manager to install this extension (Make sure that the text "Installable with the Extension Manager" is displayed at the top right location on this page to know if this extension can be installed with the Extension Manager).

You can also use the manual method which involves dropping the JAR file and all its dependencies into the WEB-INF/lib folder and restarting XWiki.

Dependencies

Dependencies for this extension (org.xwiki.platform:xwiki-platform-search-solr-api 18.4.0):
