Create a Wikifarm

From SaruWiki

On Wikifarms

What is a Wikifarm?

Wikifarm and the verb Wikifarming refer to the setup and operation of multiple wikis on a single server. This could be because on a single website you want to run multiple wikis (e.g. http://www.saruman.biz/linuxwiki and http://www.saruman.biz/bsdwiki). Or it could be because you're running multiple websites on a single server (each behind its own IP, or all on the same IP number using virtual hosts), and more than one of these websites needs to have its own wiki.

Types of Wikifarms

There are multiple ways to set up a wikifarm; which one suits you best depends on how you want to use the farm. We can roughly distinguish the following types of farms:

  • Multiple wikis, each running their own code and their own database. This is simple to set up when you install the wiki code base from source - oddly enough, under Debian it's not easy. It is also not a recommended solution from a maintenance perspective: each wiki must be updated separately when e.g. a security patch is issued for the wiki software.
  • Multiple wikis, running on a shared code base, but each with their own database. This is a good solution for multiple wikis for multiple websites.
  • Multiple wikis, running on a shared code base, and on a shared database. This could be a good solution for multiple wikis for a single website, because a user can register on one wiki, and automatically have the same account on the other wikis as well.

We're not going to go into the first type, since the shared codebase is more maintainable. Instead, we're going to show how you can share the Mediawiki code base under Debian, so that a Debian update to the Mediawiki package is automatically applied to all instances of your wikifarm. And we'll see that from there, the choice between separate databases or shared databases is quite trivial.

Wikifarming under Debian

As with most Linux-based software packages, the MediaWiki package mediawiki that Debian offers is modified to suit the Debian policies and needs. This means that it has moved all configuration files to reside under /etc/mediawiki, but also that it offers a somewhat better way to include extensions. The downside of these modifications is that creating a wikifarm under Debian is somewhat more complicated than when you're using vanilla MediaWiki sources - mostly because there isn't much documentation on the Internet about it. In the following we'll be documenting one possible way to create a wikifarm. Note that our method is particularly well-suited to "small" wikifarms (a handful of wikis on a single server), and not exactly fit for "big" wikifarms (thousands of wikis on one or more servers).

General procedure to create a wikifarm wiki instance

Setting up a single wiki under Debian is quite easy - as you might have seen from this wiki's entry Basic MediaWiki Installation. This is because the Debian customization includes specific changes to some MediaWiki files, and to the default configuration files. However, those same customizations mean that the usual methods of creating a wiki instance next to the default one run into an additional obstacle. The way around it is this:

  • We suppose you've already installed your first wiki. It'll probably be just as described in the Basic MediaWiki Installation.
  • Make sure you adapt this first wiki to support virtual hosts, as described in the section on placing MediaWiki inside a Virtual Host.
  • If your first wiki is running OK, follow the instructions in the next set of sections to move the wiki from its default place to the wikifarm. As a result, your first wiki should still run OK, but the place where Debian expects your wiki to live is now "empty" again.
  • Set up your second wiki as if it were the first one, and move it over to the wiki farm.
  • et cetera.

Because of the manual work involved, this system is not very well suited to "big" wikifarms; for those instances you'd be better off with one of the scripts that creates the wikifarm instance, and populates it with a default database and default configuration file. Scripts and instructions can be found on the Internet, like this one on jirp.nl.
However, the method is quite doable when you've only got a handful of wikis to create, and remains largely compatible with Debian's mechanisms for updating packages.

Debian Mediawiki wikifarm structure

The standard Debian MediaWiki structure

After installation of the mediawiki package, your server will be expanded with the MediaWiki directory structure. Within this structure, there are four main branches (if you haven't installed the mediawiki-extensions package yet):

  • /etc/mediawiki contains the configuration files for MediaWiki.
  • /usr/share/doc/mediawiki contains the documentation files.
  • /usr/share/mediawiki contains the source code. It is not used directly, rather it is linked to from the /var/lib/mediawiki directory.
  • /var/lib/mediawiki contains the site, including links to the configuration files in /etc/mediawiki.

Within these branches, we should not delete or move any file or directory, since that will confuse the Debian package manager when an update to the mediawiki package becomes available; we therefore strive to minimize the changes to this structure. Furthermore, the default configuration files explicitly declare the wiki to "live" in /var/lib/mediawiki, so we'll designate the wiki in that location as the "default wiki", as opposed to "wikifarm instances" that live e.g. under /var/wikifarm/<instancename>.
Another important element of the MediaWiki structure is hidden in its use of the MySQL database: during setup, you either need to have a database user account, or a MySQL superuser account (e.g. 'root'@'localhost') so as to be able to create the wiki database and a database user account.

Extending the structure for a wikifarm

First we need a label <instancename>, which designates the name for each wikifarm instance, so that we can label them as we're creating them. Each name needs to be unique because we're going to create directories based on the labels. Note that this label need not be the Wiki name, as you fill it in when configuring the default wiki instance, although you could of course use that name. In our example, we use the domain name of the website on which the wiki instance is running. Thus, this wiki named "SaruWiki" is instance "saruman.biz" in our little wikifarm.

Furthermore, you need a database name for each wiki instance that gets its own database, and it should be different from the default database name "wikidb" (unless you want to put your wiki instance into this default database, together with the wiki or wikis that are already in there). You could use the label you've chosen, or different ones, or derived ones. Remember to observe the MySQL limits on database names (no more than 64 characters, no slashes or dots in the name, not ending in a space et cetera).
In addition, you need a database user name and password. This should not be the same user name and password that you've used to set up any other wiki database, or you'd have a security risk. In the unfortunate case in which you'd feel compelled to reuse a database user name (e.g. the standard "wikiuser"), be sure to reuse the original password of "wikiuser", or risk losing access to the earlier wikis that use that same user.

Thus, you could have labels, database names and database username/passwords like this:

  • saruman.biz and saruman-biz-wiki and saruwikiuser/Gp5Zmull01
  • iceditch.nl and iceditchNLwiki and icewikiuser/Br3thReeka1
  • ann.example.org and wiki-ann and ann/Tr4p7qqhl2
  • bob.example.org and wiki-bob and bobwiki/Gnff3Prkks
  • charles.example.org and wikiCharles and charles/Pzmel55Fgzxm

And remember, they are just labels. You could just as easily use user names, numbers or whatever. For maintainability though, we recommend a consistent scheme with recognizable labels. On the other hand, there are a number of labels you must avoid, because they will conflict with directories that already exist: most notably config, extensions and images. So before you begin, sit down and write out what your label and database name convention will be. This is trivial if you have just a handful of wikis, but can become very important should you ever grow to hundreds of instances.
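The naming rules mentioned so far can be checked mechanically before any directory or database is created. The following is a minimal sketch, assuming Bash; the function names valid_label and valid_dbname are our own invention, and the rules encoded are only the ones listed in the text above (reserved directory names for labels; length, slashes, dots and trailing spaces for database names).

```bash
#!/bin/bash
# Sketch: sanity-check a wikifarm label and a database name before use.
# Both function names are hypothetical; the rules come from the text above.

# Labels become directory names, so reject empty labels, labels containing
# a slash, and the names that clash with existing MediaWiki directories.
valid_label() {
    case "$1" in
        ''|*/*|config|extensions|images) return 1 ;;
        *) return 0 ;;
    esac
}

# Database names: at most 64 characters, no slashes or dots,
# and not ending in a space.
valid_dbname() {
    local name="$1"
    [ -n "$name" ] || return 1
    [ "${#name}" -le 64 ] || return 1
    case "$name" in
        */*|*.*|*' ') return 1 ;;
        *) return 0 ;;
    esac
}

# Example:
#   valid_label saruman.biz && valid_dbname saruman-biz-wiki && echo "names OK"
```

Note that a label such as saruman.biz is fine as a directory name but not as a database name, which is why the examples above use a derived name like saruman-biz-wiki.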

Each wikifarm instance will need its own set of configuration files, so for each new wiki instance we'll create directories under /etc/mediawiki using the chosen wiki labels:

  • /etc/mediawiki/saruman.biz
  • /etc/mediawiki/iceditch.nl
  • et cetera.

Make each of these directories owned by the Apache2 user (chown www-data:www-data saruman.biz et cetera).

Each wikifarm instance will need its own directory holding links to the shared code base, so for each new wiki instance we'll create directories under /opt/mediawikifarm using the wiki labels:

  • /opt/mediawikifarm/saruman.biz
  • /opt/mediawikifarm/iceditch.nl
  • et cetera.

The choice for /opt/mediawikifarm is quite arbitrary; one could make a case to use /var/www or /var/lib/mediawikifarm/ or anything like this, but /opt is meant for optional software, which under Debian is software that's not in the standard repositories. And since the wikifarm isn't, it makes sense to create the links to the codebase there.

Each wikifarm instance may have its own upload directory for images. This directory should not be placed under /opt, but rather in a place where you expect data, e.g. /data if you have that, or /var/www, or something like this. We have chosen /data/wikifarm/<wikilabel>/images.
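The three per-instance directories described above can be created in one go. This is a minimal sketch, assuming Bash; make_instance_dirs is a hypothetical helper, and its optional ROOT prefix exists only so that the layout can be tried out in a scratch directory first. The paths are the ones chosen in this article; adjust them if you picked other locations.

```bash
#!/bin/bash
# Sketch: create the per-instance directories described above.
# make_instance_dirs is a hypothetical helper, not part of any package.
#   $1  label of the instance, e.g. saruman.biz
#   $2  optional prefix for a dry run (leave empty on the real system)
#   $3  optional owner for the configuration directory, e.g. www-data:www-data
#       (changing ownership requires root privileges)
make_instance_dirs() {
    local label="$1" root="${2:-}" owner="${3:-}"
    mkdir -p "$root/etc/mediawiki/$label" \
             "$root/opt/mediawikifarm/$label" \
             "$root/data/wikifarm/$label/images" || return 1
    if [ -n "$owner" ]; then
        chown "$owner" "$root/etc/mediawiki/$label" || return 1
    fi
}

# Example (as root, on the real system):
#   make_instance_dirs iceditch.nl "" www-data:www-data
```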

Linking to the shared code base

As you can see from the MediaWiki directory structure, the code that is actually served in the webserver is in two separate trees: by default the webserver thinks the full tree is in /var/lib/mediawiki, but a large part of what you find in that spot is actually symlinked to files and directories in /usr/share/mediawiki. Actually, the tree /usr/share/mediawiki can be thought of as a source code library, and /var/lib/mediawiki as the instance that we're actually serving. Since an instance needs both source code and some files of its own, we find that /var/lib/mediawiki contains both links to that source code, as well as actual files and directories.

Now we can create links to the code base for each wikifarm instance. You need to create symlinks in your chosen wiki directory (e.g. /opt/mediawikifarm/saruman.biz) to all files and folders in /usr/share/mediawiki, except the directory images and the files AdminSettings.php (if it even exists) and LocalSettings.php. The simplest way to do this is with a little script like this:

<source lang="bash">
WSRC='/usr/share/mediawiki'
WDEST='/opt/mediawikifarm/saruman.biz'
# Link every file and directory of the shared code base into the instance
cd "$WSRC"
for s in *; do
  ln -s "$WSRC/$s" "$WDEST/$s"
done
# Remove the links that each instance must provide itself;
# rm -f does not complain when a link (e.g. AdminSettings.php) is absent
cd "$WDEST"
rm -f images AdminSettings.php LocalSettings.php
</source>

Either run this at the command prompt, or bake it into a little Bash script (it could take the destination directory as a parameter to make it reusable).
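The reusable variant suggested above could look roughly like the following. It is a sketch under a few assumptions: Bash and GNU ln are available, link_codebase is a name we made up, and the source directory is given as an absolute path (relative paths would produce broken links). Instead of linking everything and removing three links afterwards, it skips the three per-instance entries while linking.

```bash
#!/bin/bash
# Sketch: link every entry of the shared code base into an instance directory,
# skipping the entries that each instance must own itself.
# link_codebase is a hypothetical name.
#   $1  destination directory, e.g. /opt/mediawikifarm/saruman.biz
#   $2  source directory (absolute path), defaults to /usr/share/mediawiki
link_codebase() {
    local dest="$1" src="${2:-/usr/share/mediawiki}" entry name
    [ -n "$dest" ] || { echo "usage: link_codebase DEST [SRC]" >&2; return 1; }
    mkdir -p "$dest" || return 1
    for entry in "$src"/*; do
        [ -e "$entry" ] || continue
        name=$(basename "$entry")
        case "$name" in
            images|AdminSettings.php|LocalSettings.php) continue ;;
        esac
        # -f replaces an existing link; -n keeps a re-run from descending
        # into a directory link that is already in place
        ln -sfn "$entry" "$dest/$name" || return 1
    done
}

# Example:
#   link_codebase /opt/mediawikifarm/iceditch.nl
```

Because existing links are simply replaced, the function can be run again after a package update that adds new files to the code base.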

Recreating (moving over) your first MediaWiki instance

We'll assume you've followed the other tips in this wiki. Most notably, you've taken all steps described in the article on "how to Place MediaWiki inside a Virtual Host". The steps are simple.

Move the Wiki configuration within /etc

Your Wiki config file will be /etc/mediawiki/LocalSettings.php, and the settings for Apache2 will be in /etc/mediawiki/apache.conf. We'll create a subdirectory under /etc/mediawiki that hosts the configuration of your Wiki. Suppose we name the subdirectory after the Wiki we're handling, in this example the SaruWiki. Furthermore, suppose that you also have an AdminSettings.php file. If you don't: no worries! Just skip the steps involved with that file!

cd /etc/mediawiki
mkdir SaruWiki
mv LocalSettings.php SaruWiki
mv AdminSettings.php SaruWiki
mv apache.conf SaruWiki

Note: make sure that after moving the files, the owner and permissions are still correct. A plain mv normally preserves ownership, but files that were copied or newly created while you are root end up owned by root, so it is worth checking. LocalSettings.php must be owned by user:group www-data:www-data in order for Apache2 to be able to read the file. The AdminSettings.php file, if it exists, must be owned by root:root and NOT be readable by world (permissions 600) because it should contain the MySQL "root" username and password, with which maintenance scripts can perform their tasks if you instruct them to (while you are root, of course). To keep the Wiki running, we're going to need to tell Apache2 where the two files went to. In this case:

  • change the path to apache.conf in the Include statement in the site declaration (e.g. in /etc/apache2/sites-available/000-saruman.biz), so that the Include line reads:
Include /etc/mediawiki/SaruWiki/apache.conf
  • change the symlink in /var/lib/mediawiki that points to LocalSettings.php; and if you have one, change the AdminSettings.php symlink as well:
cd /var/lib/mediawiki
unlink LocalSettings.php
ln -s /etc/mediawiki/SaruWiki/LocalSettings.php
unlink AdminSettings.php
ln -s /etc/mediawiki/SaruWiki/AdminSettings.php

If you now restart Apache2, your Wiki will still work - but the two primary configuration files have moved over to a dedicated spot. Also note that you're not yet using the new codebase - that comes later.
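The unlink-and-relink steps above can also be done in a single, repeatable step per file. The following is a minimal sketch, assuming Bash and GNU ln; relink_config is a hypothetical helper, and it silently skips files that do not exist, which covers the case of a missing AdminSettings.php.

```bash
#!/bin/bash
# Sketch: point a configuration symlink at its new target in one step.
# relink_config is a hypothetical helper.
#   $1  directory that now holds the real file, e.g. /etc/mediawiki/SaruWiki
#   $2  directory that holds the symlink, e.g. /var/lib/mediawiki
#   $3  file name, e.g. LocalSettings.php
relink_config() {
    local conf_dir="$1" site_dir="$2" file="$3"
    # Nothing to do for files that do not exist (e.g. no AdminSettings.php)
    [ -f "$conf_dir/$file" ] || return 0
    # -f replaces the old link, -n treats an existing link as a plain file
    ln -sfn "$conf_dir/$file" "$site_dir/$file"
}

# Example:
#   relink_config /etc/mediawiki/SaruWiki /var/lib/mediawiki LocalSettings.php
#   relink_config /etc/mediawiki/SaruWiki /var/lib/mediawiki AdminSettings.php
#   readlink -f /var/lib/mediawiki/LocalSettings.php   # shows where the link ends up
```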

Move the Wiki website

What we now need to do is tell the webserver that the full tree can be served from the codebase we prepared. To this end, we need to change the Apache2 configuration file apache.conf of the Wiki, which we've previously moved to the MediaWiki configuration directory /etc/mediawiki/SaruWiki and which is included in the Virtual Host file of the saruman.biz website. This file needs to point to the new shared code location; it will look something like this:

Alias /wiki /opt/mediawikifarm/saruman.biz

<Directory /opt/mediawikifarm/saruman.biz/>
        Options +FollowSymLinks
        AllowOverride All
        order allow,deny
        allow from all
</Directory>

# some directories must be protected
<Directory /opt/mediawikifarm/saruman.biz/config>
        Options -FollowSymLinks
        AllowOverride None
</Directory>
<Directory /opt/mediawikifarm/saruman.biz/upload>
        Options -FollowSymLinks
        AllowOverride None
</Directory>

Note that we haven't created the upload directory (but we can if we ever need it). Also note that we haven't created the config directory; that is because we don't need that in this wikifarm instance, because this particular instance has already been configured.

Furthermore, we now need to make sure Apache2 can find the (working) LocalSettings.php from the new wiki location /opt/mediawikifarm/saruman.biz, AND we need to change a setting within LocalSettings.php to reflect the new wiki location. For the first bit, we make the correct link(s):

cd /opt/mediawikifarm/saruman.biz
ln -s /etc/mediawiki/SaruWiki/AdminSettings.php
ln -s /etc/mediawiki/SaruWiki/LocalSettings.php

(Even if you don't have the AdminSettings.php file yet, you can still make that link.) For the second bit, we edit LocalSettings.php so that the MW_INSTALL_PATH definition points to the new location:

# We define this to allow the configuration file to be explicitly
# located in /etc/mediawiki.
# Change this if you are setting up multisite wikis on your server.
define('MW_INSTALL_PATH','/opt/mediawikifarm/saruman.biz');
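If you prefer not to edit the file by hand, the same change can be scripted. This is a minimal sketch, assuming GNU sed and a definition line formatted like the one shown above; set_install_path is a hypothetical helper. It should be pointed at the real file under /etc/mediawiki rather than at a symlink, because GNU sed -i replaces a symlink with a regular file unless told otherwise. The new path must not contain the characters | or &.

```bash
#!/bin/bash
# Sketch: rewrite the MW_INSTALL_PATH definition in a LocalSettings.php file.
# set_install_path is a hypothetical helper.
#   $1  the real LocalSettings.php (not a symlink to it)
#   $2  the new install path, e.g. /opt/mediawikifarm/saruman.biz
set_install_path() {
    local file="$1" new_path="$2"
    # Refuse to continue when the file has no such definition line
    grep -q "^define( *'MW_INSTALL_PATH'" "$file" || return 1
    sed -i "s|^define( *'MW_INSTALL_PATH'.*|define('MW_INSTALL_PATH','$new_path');|" "$file"
}

# Example:
#   set_install_path /etc/mediawiki/SaruWiki/LocalSettings.php /opt/mediawikifarm/saruman.biz
```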

This should be enough! Restart Apache2 and test the result:

invoke-rc.d apache2 restart
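If the wiki does not come up after the restart, a quick check of the wiring may help narrow things down. The following is a sketch only, assuming Bash; check_instance is a hypothetical helper that tests three things described in this section: that LocalSettings.php is reachable from the instance directory, that the code base has been linked in (index.php is used as the indicator), and that MW_INSTALL_PATH is defined exactly in the form shown above.

```bash
#!/bin/bash
# Sketch: check that an instance directory is wired up as described above.
# check_instance is a hypothetical helper, not part of the Debian package.
#   $1  instance directory, e.g. /opt/mediawikifarm/saruman.biz
check_instance() {
    local dir="$1" settings="$1/LocalSettings.php"
    # -f follows symlinks, so a dangling link fails here
    [ -f "$settings" ] || { echo "missing or dangling: $settings" >&2; return 1; }
    [ -e "$dir/index.php" ] || { echo "code base not linked in $dir" >&2; return 1; }
    grep -qF "define('MW_INSTALL_PATH','$dir');" "$settings" \
        || { echo "MW_INSTALL_PATH does not point to $dir" >&2; return 1; }
}

# Example:
#   check_instance /opt/mediawikifarm/saruman.biz && echo "instance looks OK"
```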

Adding more MediaWiki instances to your farm

So how do you now prepare the next wiki in your farm? Quite simple. We're now going to set up the next wiki in the default place, being /var/lib/mediawiki, after which we can return to the previous step of creating another copy of the code base and again move the new wiki over from the default location to its location in our fresh wiki farm.

So we need to set up the next website with access to the (now empty) default wiki location. We say "now empty" because the default location is lacking a LocalSettings.php file. In essence then, it is again an empty wiki, for which we'll have to run through the steps of basic configuration, as described in basic MediaWiki installation.

First off, we'll link the next virtual website to the default wiki location. Suppose the next virtual website is called "iceditch.nl", and our wiki database will be called IceditchWiki. We'll start off by creating a location for the IceditchWiki configuration files under the MediaWiki configuration directory:

cd /etc/mediawiki
mkdir iceditch.nl
cp apache.conf iceditch.nl/apache.conf

Next, we'll instruct our virtual host to load this new MediaWiki apache configuration file:

cd /etc/apache2/sites-available
vi 010-iceditch.nl

Inside of this virtual host file, we include the MediaWiki configuration; the virtual host file then looks something like this:

<VirtualHost *:80>
       ServerName www.iceditch.nl
       ServerAdmin webmaster@iceditch.nl

       DocumentRoot /var/www/iceditch.nl
       <Directory />
               Options FollowSymLinks
               AllowOverride None
       </Directory>
       <Directory /var/www/iceditch.nl>
               Options Indexes FollowSymLinks MultiViews
               AllowOverride None
               Order allow,deny
               allow from all
       </Directory>

       Include /etc/mediawiki/iceditch.nl/apache.conf

       ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
       <Directory "/usr/lib/cgi-bin">
               AllowOverride None
               Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
               Order allow,deny
               Allow from all
       </Directory>

       ErrorLog /var/appslog/apache2/iceditch.nl-error.log

       # Possible values include: debug, info, notice, warn, error, crit,
       # alert, emerg.
       LogLevel notice

       CustomLog /var/appslog/apache2/iceditch.nl-access.log combined

</VirtualHost>

Now restart Apache2, and visit the URL of the new wiki, in this case http://www.iceditch.nl/mediawiki. You'll get the familiar "please set up the wiki first" message. Click it, and manually configure your new wiki. When you're done, move the created LocalSettings.php to /etc/mediawiki and you're set! Your new wiki is running in the default position, and as soon as you've moved it to the wikifarm, you can create yet another wiki!

Debian extensions and your wiki farm

So how about those interesting MediaWiki extensions? Well, if you've installed package mediawiki-extensions then you've got a default structure for handling extensions, that's slightly more intelligent than the standard MediaWiki way, and helps us a long way towards different extension configurations for our wikis in the farm.

In advance we offer you the following information: after installation of mediawiki-extensions, the Debian version of MediaWiki allows you to enable and disable extensions by simply running the command mwenext <extension> or mwdisext <extension>, e.g.

mwenext ConfirmEdit.php

It does this by having a directory with symlinks to all available extensions, and another directory where you can symlink those extensions that you want enabled. Now this mechanism can be extended to the wikifarm in two different ways:

  1. Per wiki a different selection of all installed extensions is made available. Per wiki in the farm, a selection of those extensions that are available to THAT wiki instance can be enabled.
  2. All extensions that are installed are available to all wikis in the wikifarm, but for each wiki a different selection can be enabled.

We will not describe the first case, but rather the second one.
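To make the idea of the second case concrete: with one shared directory of available extensions and one "enabled" directory per wiki, enabling an extension for a single wiki comes down to creating one symlink. The following is a sketch of that mechanism only, assuming Bash; it is not the mwenext script itself, the function names are our own, and the directories are passed in as parameters because the actual locations depend on your setup.

```bash
#!/bin/bash
# Sketch of the symlink mechanism only; this is not the mwenext script itself.
# enable_extension and disable_extension are hypothetical names.
#   available  directory shared by all wikis, holding every installed extension
#   enabled    directory belonging to ONE wiki, holding links to its selection
enable_extension() {
    local available="$1" enabled="$2" ext="$3"
    [ -e "$available/$ext" ] || { echo "unknown extension: $ext" >&2; return 1; }
    mkdir -p "$enabled" && ln -sfn "$available/$ext" "$enabled/$ext"
}

disable_extension() {
    local enabled="$1" ext="$2"
    # Only the link is removed; the extension stays available to other wikis
    rm -f "$enabled/$ext"
}

# Example (directory names are placeholders):
#   enable_extension  /path/to/available /path/to/enabled/saruman.biz ConfirmEdit.php
#   disable_extension /path/to/enabled/saruman.biz ConfirmEdit.php
```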