Solr Search API
| Solr engine management: configuration, indexing, listeners, script service, etc. This module does not handle the search queries |
| Type | JAR |
| Category | |
| Developed by | |
| Rating | |
| License | GNU Lesser General Public License 2.1 |
| Bundled With | XWiki Standard |
Description
Check out the Solr Core to understand what information is being indexed.
Configuration
The following properties can be configured in the xwiki.properties file for the Solr API:
#-------------------------------------------------------------------------------------
# Solr Search
#-------------------------------------------------------------------------------------
#-# [Since 4.5M1]
#-# The Solr server type. Currently accepted values are "embedded" (default) and "remote".
# solr.type=embedded
#-# [Since 4.5M1]
#-# The location where the embedded Solr instance home folder is located.
#-# The default is the subfolder "store/solr" inside folder defined by the property "environment.permanentDirectory".
# solr.embedded.home=/var/local/xwiki/store/solr
#-# [Since 12.2]
#-# The URL of the Solr server (the root server and not the URL of a core).
#-# The default value assumes that the remote Solr server is started in a different process on the same machine, using the default port.
# solr.remote.baseURL=http://localhost:8983/solr
#-# [Since 5.1M1]
#-# Elements to index are not sent to the Solr server one by one but in batch to improve performances.
#-# It's possible to configure this behavior with the following properties:
#-#
#-# The maximum number of elements sent at the same time to the Solr server
#-# The default is 50.
# solr.indexer.batch.size=50
#-# The maximum number of characters in the batch of elements to send to the Solr server.
#-# The default is 10000.
# solr.indexer.batch.maxLength=10000
#-# [Since 5.1M1]
#-# The maximum number of elements in the background queue of elements to index/delete
#-# The default is 10000.
# solr.indexer.queue.capacity=100000
#-# [Since 6.1M2]
#-# Indicating if a synchronization between SOLR index and XWiki database should be run at startup.
#-# Synchronization can be started from search administration.
#-# The default is true.
# solr.synchronizeAtStartup=false
#-# [Since 12.5RC1]
#-# Indicates which wiki synchronization to perform when the "solr.synchronizeAtStartup" property is set to true.
#-# Two modes are available:
#-# - WIKI: indicate that the synchronization is performed when each wiki is accessed for the first time.
#-# - FARM: indicate that the synchronization is performed once for the full farm when XWiki is started.
#-# For large farms and in order to spread the machine's indexing load, the WIKI value is recommended, especially if
#-# some wikis are not used.
#-# The default is:
# solr.synchronizeAtStartupMode=FARM
#-# [Since 17.2.0RC1]
#-# [Since 16.10.5]
#-# [Since 16.4.7]
#-# Indicates the batch size for the synchronization between SOLR index and XWiki database. This defines how many
#-# documents will be loaded from the database and Solr in each step. Higher values lead to fewer queries and thus
#-# better performance but increase the memory usage. The expected memory usage is around 1KB per document, but
#-# depends highly on the length of the document names.
#-# The default is 1000.
# solr.synchronizeBatchSize=1000
XWiki 16.10.9+, 17.4.1+, 17.5.0+ The default batch sizes have been increased to 1000 documents and 10 million characters to speed up indexing, as commits are quite slow. Solr is configured to perform a soft commit every 3 seconds, so this shouldn't have much influence on how fast documents become available in the search.
XWiki 17.5.0+ The indexer keeps up to two full batches of data to index in memory. When configuring the batch size, configure the maximum number of characters such that your XWiki instance has enough free memory for this data.
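As an illustration of how these settings combine, the fragment below sketches an xwiki.properties excerpt for a large farm running with limited memory. Only the property names come from the list above; every value is an assumption chosen for the example and not a recommendation, so adjust them to the memory actually available to your instance.

```properties
# Hypothetical tuning for a large farm; all values are illustrative assumptions.

# Synchronize each wiki when it is first accessed instead of the whole farm at startup.
solr.synchronizeAtStartup=true
solr.synchronizeAtStartupMode=WIKI

# Cap each indexing batch at 500 elements and 5 million characters.
# Since XWiki 17.5.0 up to two full batches are kept in memory, so this
# bounds the text waiting to be sent at about 10 million characters.
solr.indexer.batch.size=500
solr.indexer.batch.maxLength=5000000

# Load 500 documents per synchronization step
# (around 1KB per document according to the property description).
solr.synchronizeBatchSize=500
```

Smaller batches trade indexing speed for a lower memory footprint, since more commits are needed to index the same amount of content.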
Set up a remote Solr server
XWiki is only tested with the version of Solr that is embedded in XWiki, so it's usually the one with which you will have the fewest surprises, but it should be possible to use a higher version of Solr as a standalone instance. Note that Solr supports only the current and previous major versions (Solr 9 should be usable with XWiki versions that expect Solr 8.x or Solr 9.x).
Here is a compatibility matrix to help with the choice:
| XWiki version | Embedded/Tested Solr version |
|---|---|
| 11.4 to 11.5 | 7.7.1 |
| 11.6 to 12.2 | 8.1.1 |
| 12.3 to 13.0 | 8.5.1 |
| 13.1 to 14.7 | 8.8.0 |
| 14.8 to 16.1.0 | 8.11.2 |
| 16.2.0+ | 9.4.1 |
Debian based system
If your Solr instance is installed on a Debian/Ubuntu system, take a look at InstallationViaAPT.
Manual install
The Solr REST API is unfortunately too limited, so you will need to create several cores on your Solr instance yourself. For each core, download the zip file matching your version of XWiki and unzip its content into a new folder, placed alongside the other Solr cores, using the following names:
XWiki <16.2.0
Solr8:
- xwiki: use https://maven.xwiki.org/releases/org/xwiki/platform/xwiki-platform-search-solr-server-core/ as core. The file name has the format xwiki-platform-search-solr-server-core-<version>.jar (append .zip to the name if your zip application does not let you unzip it otherwise)
- xwiki_extension_index: use https://maven.xwiki.org/releases/org/xwiki/platform/xwiki-platform-search-solr-server-core-minimal/
- xwiki_ratings: use https://maven.xwiki.org/releases/org/xwiki/platform/xwiki-platform-search-solr-server-core-minimal/
- xwiki_events: use https://maven.xwiki.org/releases/org/xwiki/platform/xwiki-platform-search-solr-server-core-minimal/
XWiki 16.2.0+
Solr9:
- xwiki_search_9: use https://maven.xwiki.org/releases/org/xwiki/platform/xwiki-platform-search-solr-server-core-search/
- xwiki_extension_index_9: use https://maven.xwiki.org/releases/org/xwiki/platform/xwiki-platform-search-solr-server-core-minimal/
- xwiki_ratings_9: use https://maven.xwiki.org/releases/org/xwiki/platform/xwiki-platform-search-solr-server-core-minimal/
- xwiki_events_9: use https://maven.xwiki.org/releases/org/xwiki/platform/xwiki-platform-search-solr-server-core-minimal/
Indicate in the xwiki.properties file that you want to use a remote Solr instance, and provide its URL:
solr.type=remote
solr.remote.baseURL=http://solrhost/solr
When using solr.remote.baseURL, you can control the name of the search core (and the prefix of the other cores) with the solr.remote.corePrefix property. By default the main core is named "xwiki" and the other cores are prefixed with "xwiki_".
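The remote setup and the core prefix can be combined as in the sketch below. The host name "solrhost", the port (taken from the default URL shown in the configuration section) and the prefix "intranet" are placeholders invented for this example. By analogy with the default, the main core would then be expected to be named "intranet" and the other cores to start with "intranet_", but check the exact core names against your XWiki version before creating them on the Solr server.

```properties
# Use a standalone Solr server instead of the embedded one.
solr.type=remote

# Root URL of the Solr server, not the URL of a core ("solrhost" is a placeholder).
solr.remote.baseURL=http://solrhost:8983/solr

# Hypothetical prefix replacing the default "xwiki" in the core names.
solr.remote.corePrefix=intranet
```

A custom prefix is mainly useful when several XWiki instances share the same Solr server and their cores must not collide.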
Data transfer upon moving the Solr of an existing instance to a remote Solr
TODO: add a note about how to move data for data cores (ratings & events) from the embedded Solr to the remote Solr
Backup remote Solr data
TODO: add a note about what and how to backup the data from the external Solr server.
Performances
By default XWiki ships with an embedded Solr. This is mostly for ease of use, but the embedded instance is not really recommended by the Solr team, so you might want to externalize it once your wiki has a lot of pages. Solr uses a lot of memory, and a standalone Solr instance is generally faster than the embedded one. The difference should not be very noticeable in a small wiki, but if you start to run into memory issues and slow search results, you should probably install and set up an external Solr instance by following the remote Solr server guide above.
The speed of the drive where the Solr index is located can also be very important, because Solr/Lucene is quite filesystem intensive. For example, putting the index on an SSD might give a noticeable boost.
You can also find more Solr-specific performance details on https://wiki.apache.org/solr/SolrPerformanceProblems. Standalone Solr also comes with a very nice UI, along with monitoring and test tools.
Size on disk
It depends on the size of each document but an instance like the http://www.myxwiki.org farm (mostly standard documents in lots of wikis) uses 3.2GB of disk space to store around 180000 documents, which gives approximately 18KB per document.
Extensibility
The Solr module provides several ways for extensions to use or modify its behavior.
Dedicated Solr core
It is possible to "reserve" and initialize a dedicated Solr core by implementing the component role org.xwiki.search.solr.SolrCoreInitializer. You can also access a specific core using org.xwiki.search.solr.Solr#getClient(String).
An org.xwiki.search.solr.AbstractSolrCoreInitializer is provided to make it easier to implement org.xwiki.search.solr.SolrCoreInitializer. It comes with the following features:
- calls getVersion() to know the current specification version of the core
- calls createSchema() when the schema does not exist yet
- calls migrateSchema(long cversion) when the schema already exists but is in an older version
- provides a lot of helper methods to create field types and fields, including support for a virtual Map field.
An org.xwiki.search.solr.SolrUtils component also exists to provide helpers to manipulate Solr cores beyond the schema manipulation.
One thing which is currently not automated is the creation of the core in the case of a remote Solr instance.
Customize the "search" core indexing
It's possible to implement a component which is called every time an entity is indexed in the "search" core. For that, you can implement the role org.xwiki.search.solr.SolrEntityMetadataExtractor (and make sure you use a unique role hint). The currently supported entities (and corresponding roles to implement) are:
- org.xwiki.search.solr.SolrEntityMetadataExtractor<DocumentReference>
- org.xwiki.search.solr.SolrEntityMetadataExtractor<AttachmentReference>
- org.xwiki.search.solr.SolrEntityMetadataExtractor<ObjectReference>
- org.xwiki.search.solr.SolrEntityMetadataExtractor<ObjectPropertyReference>
Prerequisites & Installation Instructions
We recommend using the Extension Manager to install this extension (Make sure that the text "Installable with the Extension Manager" is displayed at the top right location on this page to know if this extension can be installed with the Extension Manager).
You can also use the manual method which involves dropping the JAR file and all its dependencies into the WEB-INF/lib folder and restarting XWiki.
Dependencies
Dependencies for this extension (org.xwiki.platform:xwiki-platform-search-solr-api 18.4.0):
- org.apache.solr:solr-solrj 9.4.1
- org.apache.commons:commons-lang3 3.20.0
- org.xwiki.platform:xwiki-platform-tika-parsers 18.4.0
- org.xwiki.commons:xwiki-commons-component-api 18.4.0
- org.xwiki.commons:xwiki-commons-environment-api 18.4.0
- org.xwiki.platform:xwiki-platform-model-api 18.4.0
- org.xwiki.platform:xwiki-platform-oldcore 18.4.0
- org.xwiki.platform:xwiki-platform-bridge 18.4.0
- org.xwiki.platform:xwiki-platform-query-manager 18.4.0
- org.xwiki.platform:xwiki-platform-link 18.4.0