Import the contents of a MediaWiki instance into an XWiki instance
License: GNU Lesser General Public License 2.1


The recommended way to import MediaWiki content is now through the MediaWiki input filter.

Note that this application generates only XWiki 1.0 syntax. XWiki now supports MediaWiki syntax as page source and provides a generic wiki syntax converter; see Rendering Module for more details.

There is also an extension for migrating pages and attachments to XWiki 2.0 syntax: see Mediawiki To XWiki Migration Toolkit.

This application uses:

  • A MediaWiki XML dump (for instance the Wikipedia one, downloaded from 1)
  • Dom4J for parsing Wikipedia XML contents
  • WikiModel for converting MediaWiki syntax to XWiki syntax.
  • The Groovy script below

This script should work with any MediaWiki export. It still needs to be improved to handle revisions, talk pages, images, etc. Currently it imports only the text of the latest revision.
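To make the element names the script relies on concrete, here is a minimal sketch that reads the same fields (title, revision text, contributor user name) from a single page. The sample XML is hand-written for illustration and much smaller than a real export, the JDK's DOM parser is swapped in for Dom4J so that the snippet runs without extra libraries, and `textOf` is a hypothetical helper.

```groovy
import javax.xml.parsers.DocumentBuilderFactory

// Hand-written sample (an assumption) mimicking one <page> of a MediaWiki XML export.
def sample = '''<mediawiki>
  <page>
    <title>Main Page</title>
    <id>1</id>
    <revision>
      <id>42</id>
      <contributor><username>Alice</username></contributor>
      <text>Hello [[World]]</text>
    </revision>
  </page>
</mediawiki>'''

def doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new ByteArrayInputStream(sample.getBytes('UTF-8')))

// Hypothetical helper: text of the first descendant element with the given
// tag name, or null when there is no such element.
def textOf = { node, String tag ->
    def list = node.getElementsByTagName(tag)
    list.length > 0 ? list.item(0).textContent : null
}

def page = doc.getElementsByTagName('page').item(0)
def revision = page.getElementsByTagName('revision').item(0)

// Same fields as the import script: spaces in the title become underscores.
def title = textOf(page, 'title').replaceAll(' ', '_')
def revtext = textOf(revision, 'text')
def username = textOf(revision, 'username')

println "${title} by ${username}: ${revtext}"
```

The import script does the same work with Dom4J's `elementText` and `element` calls, but from inside a streaming handler, so that a large dump is never loaded in full.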

You can also find a Perl script that does something similar: Part I and Part II.

Groovy script

import groovy.net.xmlrpc.XMLRPCServerProxy
import org.dom4j.*
import org.dom4j.io.SAXReader

import org.wikimodel.wem.mediawiki.MediaWikiParser
import org.wikimodel.wem.xwiki.*

// Handles each <page> element as soon as it has been parsed, then detaches it
// so that the whole dump never has to be held in memory.
class PruningPageHandler implements ElementHandler {
    def proxy, token
    def counter = 0      // number of pages processed so far
    def max = 10000      // upper bound on the number of pages to import
    def messages = []    // errors collected during the import

    PruningPageHandler(proxy, token) {
        this.proxy = proxy
        this.token = token
    }

    public void onStart(ElementPath path) { }

    public void onEnd(ElementPath path) {
        def page = path.current
        def title = page.elementText('title')
        title = title.replaceAll(' ', '_')
        def id = page.elementText('id')
        println(title + '(' + counter + ')')
        def revision = page.element('revision')
        def revid = revision.elementText('id')
        def revtext = revision.elementText('text') ?: ''
        def contributor = revision.element('contributor')
        // Anonymous edits carry no <username> element.
        def username = contributor?.elementText('username')

        // Skip redirect pages: look for "redirect" in the first 30 characters.
        def index = revtext.substring(0, Math.min(30, revtext.length())).toLowerCase().indexOf("redirect")
        if (counter < max && index < 0) {
            counter++

            // Pre-process the text before conversion.
            revtext = revtext.replaceFirst("^-", "*")
            revtext = revtext.replaceAll("__", "")
            revtext = revtext.replaceAll("[\\|][\\+]", "")

            // Convert MediaWiki syntax to XWiki syntax.
            def buffer = new StringBuffer()
            try {
                def reader = new StringReader(revtext)
                def parser = new MediaWikiParser()
                def listener = new XWikiSerializer(buffer)
                parser.parse(reader, listener)
            } catch (Exception e) {
                messages << "Conversion failed for ${title}: ${e.message}"
            }

            // Store the page through the Confluence-compatible XML-RPC API.
            def map = new HashMap()
            map.put('space', 'Main')   // assumption: target space, adjust as needed
            map.put('title', title)
            map.put('content', buffer.toString())
            map.put('modifier', username)
            try {
                proxy.confluence1.storePage(token, map)
            } catch (Exception e) {
                messages << "Import failed for ${title}: ${e.message}"
            }
        }

        page.detach() // prune the tree
    }
}

def proxy = new XMLRPCServerProxy("http://xwikiserver/xwiki/xmlrpc/confluence")
def token = proxy.confluence1.login("", "")   // fill in the user name and password

def reader = new SAXReader()
def handler = new PruningPageHandler(proxy, token)

File f = new File("/home/slauriere/enwiki-20070908-pages-articles.xml.bz2.1.out")

FileInputStream fis = new FileInputStream(f)
reader.addHandler('/mediawiki/page', handler)
reader.read(fis)   // start parsing; the handler is called once per page

handler.messages.each { println it }

Tags: migration
