Mediawiki collection extension installation
Definition
The Mediawiki collection extension allows a user to organize personal selections of pages in a collection. Collections can be:
- edited and structured using chapters
- persisted, loaded and shared
- rendered as PDF (see Extension:PDF_Writer)
- exported as ODF Text Document (see Extension:OpenDocument_Export)
- exported as DocBook XML (see Extension:XML_Bridge)
- ordered as a printed book at http://pediapress.com/
This page includes some centralized help links and installation tips made for our own use - Daniel K. Schneider 16:31, 4 May 2009 (UTC).
See also:
- Wiki book
- Help:Books (for collection authors and users)
- Mediawiki for a short list of useful extensions.
Help pages
Bugs and feature requests:
- http://meta.wikimedia.org/wiki/Book_tool/Feedback
- http://code.pediapress.com/wiki/report/1 (Issue tracker for technical people)
Help (can also be used for informal bug reports):
- http://groups.google.com/group/mwlib/topics (Google forums)
Information about the collection extension and related server-side software:
- http://www.mediawiki.org/wiki/Extension:Collection
- http://www.mediawiki.org/wiki/Extension:PDF_Writer
- http://code.pediapress.com/wiki/wiki (PediaPress Open Source Repository, Wiki and Bug Tracking System)
- http://code.pediapress.com/git/mwlib/raw-file/tip/docs/commands.txt (command line options, important!)
- http://code.pediapress.com/wiki/wiki/Examples
Installing the whole suite requires some installation skills, but should go fairly smoothly on any Unix system and should be easy on a Debian-based Linux.
Collection extension installation and tuning
Base installation of the extension
The collection extension installs like any other MediaWiki extension. It is really easy with MediaWiki >= 1.14 (spring 2009).
- (1) Installing
Get it from Extension:Collection on Mediawiki.org.
You may try the latest version, but it sometimes does not work with your MW installation. E.g. it broke with MW 1.16.4 on April 20, 2011. Get it from Git:
cd extensions
git clone https://gerrit.wikimedia.org/r/p/mediawiki/extensions/Collection.git
Formerly, the code was available in SVN:
svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/Collection/
Read:
README.txt
You can then just leave all the defaults, and the PDF will be generated by PediaPress. However, if you have a slow or high-traffic server, you should also install a local render server (read the whole rest of this page). When we first installed this extension in 2008, the PediaPress server lost pages due to server overload. By now, this problem should be fixed (not tested ....)
- (2) Tuning
If you have an old and slow server, I suggest editing the file Collection.i18n.php and changing the following strings (adjust to the power of your server):
'coll-rendering_text' => "<p><strong>Please wait while the document is being generated.
Depending on the size of book you may have to wait 5, 10, 15 minutes or longer.
</strong></p>
......"
'coll-save_collection_text' => 'Choose a storage location for your book and enter a name:',
- (3) Permissions (important !)
If you want users to be able to save and share collections, add these permissions to the file LocalSettings.php:
$wgGroupPermissions['user']['collectionsaveasuserpage'] = true;
$wgGroupPermissions['user']['collectionsaveascommunitypage'] = true;
or maybe (only sysops may save community pages):
$wgGroupPermissions['user']['collectionsaveasuserpage'] = true;
$wgGroupPermissions['sysop']['collectionsaveascommunitypage'] = true;
- (4) Define templates
You should define the following (language-dependent) templates and categories:
- Template:Saved_book (Grab a copy from a Wikipedia and modify it).
- Category:Books
- Category:Book tool (not really needed)
For conditional inclusion of text within articles, you must define these templates (inspired by the French Wikipedia; the English one uses a parser extension that we probably should also install and test at some point - Daniel K. Schneider)
Template:Hide in print:
<includeonly><span class="noprint">{{{1}}}</span></includeonly>
Template:Only in print:
<includeonly><span class="hidden">{{{1}}}</span></includeonly>
Then, edit MediaWiki:Common.css and add the following CSS rule:
.hidden {display:none}
Tweaking the collection extension
Also read README.txt again!
Add to LocalSettings.php (if not already done) the rendering engines you will support. If you also installed a local render server, typing mw-render --list-writers will list the ones you installed.
Example setting for LocalSettings.php:
$wgCollectionFormats = array(
    'rl' => 'PDF',
    'odf' => 'ODT',
);
In LocalSettings.php, if not already done, add the name and port of the render server (alternatively, you could also install a CGI script).
$wgCollectionMWServeURL = "http://xxx.yyy:8899";
Extra stuff:
For the license, make sure to give the correct RAW wiki URL. Or if this doesn't work, remove the line and the user will see a URL.
$wgLicenseURL = "http://edutechwiki.unige.ch/mediawiki/index.php?title=EduTech_Wiki:Copyrights&action=raw";
Limit the number of articles that a book can contain, e.g.
$wgCollectionMaxArticles = 150;
mwlib installation on Ubuntu
The easy way
Prerequisites
You must have Python installed, plus the pip package manager:
apt-get install python
apt-get install python-pip
pip is a tool for installing and managing Python packages, such as those found in the Python Package Index. It's a replacement for easy_install that was used in the past.
There are probably some other dependencies, but since I had installed earlier versions of mwlib, I can't remember them.
PediaPress suggests the following:
apt-get install -y gcc g++ make python python-dev python-virtualenv \
  libjpeg-dev libz-dev libfreetype6-dev liblcms-dev \
  libxml2-dev libxslt-dev \
  ocaml-nox git-core \
  python-imaging python-lxml \
  texlive-latex-recommended ploticus dvipng imagemagick \
  pdftk
Install
pip install -i http://pypi.pediapress.com/simple/ pil
pip install -i http://pypi.pediapress.com/simple/ mwlib
pip install -i http://pypi.pediapress.com/simple/ mwlib.rl
Upgrades / trouble
Since I tried the old way first (see below), I ran into conflicts. The following helped:
pip install -i http://pypi.pediapress.com/simple/ --upgrade mwlib
pip install -i http://pypi.pediapress.com/simple/ --upgrade mwlib.rl
mwlib installation on Ubuntu more manually
Written in 2010, should still work somewhat
Not needed if you have a fast server, want to use the PediaPress server, don't need the latest tip of the code, and don't want to customize.
In Sept. 2009 we moved to Ubuntu, because Linux is so much easier than Solaris. It is solid enough for a little academic web server too.
Follow the instructions from PediaPress. In fact, there are several variations of how you can do this.
Below are some installation notes (quick and dirty for now). The install is done by a user with admin rights, not root (Ubuntu fashion):
Looking at the versions:
- The repository is here: http://code.pediapress.com/git/
Prerequisites
- Python
- Cython
apt-get install cython
Installing mwlib
Before installing, you might edit mwlib/tagext.py around line 100 to exclude extension tags that you use and that mwlib can't handle. E.g. at some point I had to exclude manually: 'pageby', 'uml', 'graphviz', 'categorytree', 'summary' (but this has been fixed since sometime at the end of 2009).
sudo aptitude install make g++ perl python python-dev python-setuptools python-imaging re2c
sudo aptitude install git-core
git clone git://github.com/pediapress/mwlib
cd mwlib
sudo python setup.py build
sudo python setup.py install
Install mwlib.rl
(with a variant in procedure, i.e. we create a new directory where git will put the files)
git clone git://code.pediapress.com/mwlib.rl mwlib.rl.new
cd mwlib.rl.new
sudo python setup.py build install
Install the source code formatting package from http://pygments.org/ (Debian package: python-pygments)
sudo aptitude install python-pygments
Install extra fonts
sudo aptitude install ttf-indic-fonts ttf-unfonts ttf-farsiweb ttf-arphic-uming ttf-gfs-artemisia ttf-sil-ezra ttf-thai-arundina linux-libertine
In the file fontconfig.py:
font_paths = [os.path.dirname(mwlib.fonts.__file__), os.path.expanduser('~/mwlibfonts/')]
This assumes that the fonts are in the directory mwlib.rl.hg/mwlib/fonts, or in ~/mwlibfonts/ of the user who installs (root's home if you install as root).
Check what it can render
mw-render --list-writers
You then can configure the mediawiki extension accordingly
Install tables of contents for mwlib.rl
apt-get install pdftk
This will produce a visible TOC at the beginning of the book if installed (nothing else to do, mwlib.rl will detect it). So far this TOC does not have hyperlinks (but users can open the bookmarks to the left in order to navigate).
Configure styling
Customizing the resulting PDFs is possible by adding a custom configuration file. The file needs to be named customconfig.py and should reside next to the pdfstyles.py file. Basically, you can override anything in the pdfstyles.py file with your custom configuration.
Since I don't talk Python, I just modified /mwlib.rl/mwlib/rl/pdfstyles.py in the source, i.e. added the third line below to make the font for code smaller :)
if mode == 'source' or mode == 'preformatted':
    style.fontName = mono_font
    style.fontSize = small_font_size
Then rebuild and reinstall.
Updating mwlib and mwlib.rl
This is one way of doing it:
ssh to your server as root (else add "sudo" to each line ...)
cd path-to/src
- consider creating a backup of the old source
mv mwlib mwlib.old
mv mwlib.rl mwlib.rl.old
- Get and install mwlib
git clone git://code.pediapress.com/mwlib
cd mwlib
python setup.py build install
cd ..
- Then get and install mwlib.rl (the PDF renderer)
git clone git://code.pediapress.com/mwlib.rl
cd mwlib.rl
python setup.py build install
mwlib installation on Solaris
Note: In 2010 Pediapress changed from hg to git. Therefore, some links below may be wrong - Daniel K. Schneider 17:50, 20 January 2010 (UTC).
Not needed if you have a fast server and want to use the pediapress server. Installation notes made for Solaris. This is more difficult.
Prerequisites
Install these if you don't have them (usually you do)
- Python >= 2.5
- Perl >= 5
- g++
- LaTeX
Install Blahtexml
- (not done so far)
Install setuptools-0.6c9-py2.5.egg
sh setuptools-0.6c9-py2.5.egg
Install python imaging library (PIL)
- http://www.pythonware.com/products/pil/
- Get Python Imaging Library 1.1.6 Source Kit
- Unzip and cd Imaging-1.1.6
python setup.py install
Install odfpy 0.7.0 (not 0.8.0)
- http://opendocumentfellowship.com/projects/odfpy
- Get it from http://odfpy.forge.osor.eu/
- I.e. as a tarball from http://forge.osor.eu/frs/?group_id=33
python setup.py build
python setup.py install
Install re2c
- http://re2c.org/
- Get it from http://www.sunfreeware.com/
pkgadd -d re2c-0.13.5-sol10-sparc-local
Install ocaml
- http://caml.inria.fr/
- Get it from http://www.sunfreeware.com/
pkgadd -d ocaml-3.10.2-sol10-sparc-local
Mwlib
Mwlib can be installed from a tarball or, alternatively, through Mercurial with easy_install.
I had to install manually, since I wanted to make some light patches to the code.
- Get it from http://code.pediapress.com/git/mwlib/ (click on gz)
- Unpack: gtar zxf mwlib-db30ecca003a.tar.gz (or whatever the file name is)
- Slowing down the page pulling
Problem: One ought to be able to slow down the server. mw-render has an option for reducing threads, but no parameter can be set in the MW extension or in the mwserver itself. Therefore, one has to build mwlib from source. Changes made:
OLD: In older versions, it was possible to edit the file mwlib/options.py at about line 60 and change the default for --num-threads, e.g. from 10 to 3 if your server couldn't cope. This is no longer possible, but the server now behaves in a nicer way :) - Daniel K. Schneider 18:29, 16 June 2009 (UTC)
- Unsupported (unimportant) tags
- Added in mwlib/tagext.py around line 100 a tag to exclude:
'pageby', 'uml', 'graphviz', 'categorytree', 'summary'
- Note: 'uml' and 'graphviz' could be fixed by using the cached picture instead
- Setup.py problem
- none currently :)
Then go:
python setup.py install
Alternative if you don't plan any changes
easy_install mwlib && rehash
Other libraries needed by mwlib
- pygments
easy_install Pygments
- Fribidi - both a library and the Python bindings
./configure --prefix=/usr/local
make
make install
(this is difficult to install)
setenv fribidi_CFLAGS "-L/usr/local/lib -I/usr/local/include"
setenv fribidi_LIBS -lfribidi
./configure
make
make install
- Ploticus
(not installed)
- mwlib.rl
easy_install mwlib.rl
Alternatively from a tarball:
- texvc
- Is in your mediawiki installation
- Compile it with gmake if not already done (needs ocaml, see above)
cd /XXX/mediawiki/math
gmake
./texvc_test
- Add the directory to the system path
Testing
- mw-render --config=http://edutechwiki.unige.ch/mediawiki/ --writer=odf --output=./edutech.odt Educational_technology
- OK - Daniel K. Schneider 16:23, 4 May 2009 (UTC) (using version mwlib-41c207e76b28/)
mw-render --config=http://edutechwiki.unige.ch/mediawiki/ --writer=rl --output=./flash-cs3.pdf Flash_CS3_desktop_tutorial
- OK - Daniel K. Schneider 16:23, 4 May 2009 (UTC) (using version mwlib-41c207e76b28/)
- mw-zip --config=http://edutechwiki.unige.ch/mediawiki/ --output=./edutech.zip Educational_technology
- OK
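The mw-render checks above can be repeated for several articles by generating the command lines from a small script. The following is only a sketch: the helper name mw_render_command and the writer-to-extension table are assumptions made for illustration, and only the --config, --writer and --output options shown above are used.

```python
# Hypothetical helper (not part of mwlib): builds the argument list for
# one mw-render test run, using only options that appear in the tests above.
WRITER_EXTENSIONS = {"rl": "pdf", "odf": "odt"}  # writer name -> file extension


def mw_render_command(config_url, writer, article, output_stem):
    """Return the mw-render command as a list suitable for subprocess.run."""
    extension = WRITER_EXTENSIONS[writer]
    return [
        "mw-render",
        f"--config={config_url}",
        f"--writer={writer}",
        f"--output=./{output_stem}.{extension}",
        article,
    ]


cmd = mw_render_command(
    "http://edutechwiki.unige.ch/mediawiki/",
    "odf",
    "Educational_technology",
    "edutech",
)
print(" ".join(cmd))
```

On a machine where mwlib is installed, the returned list could be passed to subprocess.run(cmd, check=True) so that a failed rendering stops the script.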
mw-serve
mw-serve provides a server interface for the mw-render engine and mw-zip.
- Run the server
- http://code.pediapress.com/git/mwlib/raw-file/tip/docs/commands.txt
- By default this service runs on port 8899
Type in a console:
mw-serve --cache-dir=/data/mwcache/mwlibcache/ \
  --logfile=/data/mwcache/logs/mwserve.log \
  --mwrender-logfile=/data/mwcache/logs/mwrender.log \
  --mwzip-logfile=/data/mwcache/logs/mwzip.log \
  --mwpost-logfile=/data/mwcache/logs/mwpost.log
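Since all cache and log paths share one base directory, the command line can also be derived from that directory, which makes it harder to repeat or forget a log option. A minimal sketch, assuming the directory layout used above (mwlibcache/ and logs/ under one base); the function name mw_serve_command is made up for illustration and the script only prints the command, it does not start the server.

```python
import shlex


def mw_serve_command(base_dir, executable="mw-serve"):
    """Build the mw-serve argument list for a given base directory.

    Assumes the layout used on this page: <base>/mwlibcache/ for the
    cache and <base>/logs/ for all log files.
    """
    base = base_dir.rstrip("/")
    logs = f"{base}/logs"
    args = [
        executable,
        f"--cache-dir={base}/mwlibcache/",
        f"--logfile={logs}/mwserve.log",
    ]
    # One log file option per helper program, each given exactly once.
    for tool in ("mwrender", "mwzip", "mwpost"):
        args.append(f"--{tool}-logfile={logs}/{tool}.log")
    return args


cmd = mw_serve_command("/data/mwcache")
print(shlex.join(cmd))
```

The printed line can be pasted into a console or into an init script.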
An init script
If you are happy with your server, you should make it start up automatically, e.g. do the following:
- Create a user, e.g. something like
useradd -u 70002 -g 16100 -s /bin/sh -d /data/mwcache mwserv
- Chown the cache and log directories to this user
chown -R mwserv mwcache/
- /etc/init.d script
See the example at http://svn.wikimedia.org/viewvc/mediawiki/trunk/tools/mw-serve/mw-serve.sh. Add it to the appropriate run-levels with chkconfig (see man chkconfig).
Here is a simple script that will do:
#! /bin/bash
# Simple init script for mw-serve
PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
DAEMON=/usr/local/bin/mw-serve
# Exit quietly if mw-serve is not installed
test -x "$DAEMON" || exit 0
case "$1" in
  start)
    # Start the server in the background
    "$DAEMON" --cache-dir=/data/mwcache/mwlibcache/ --logfile=/data/mwcache/logs/mwserve.log --mwrender-logfile=/data/mwcache/logs/mwrender.log --mwzip-logfile=/data/mwcache/logs/mwzip.log --mwpost-logfile=/data/mwcache/logs/mwpost.log &
    ;;
  stop)
    killall mw-serve  # or nothing
    ;;
  force-reload|restart)
    "$0" stop
    "$0" start
    ;;
  *)
    echo "Usage: /etc/init.d/mw-serve {start|stop|restart|force-reload}"
    exit 1
    ;;
esac
exit 0
Tips for creating wiki books
Creating a serious book
It is best just to "hand edit" a stored book. E.g. you may start by adding a category or isolated articles to a book with the collection interface, but once you have most of the articles:
- Save the book
- Then select it from Category:Books and just edit the page in the "normal way". The structure and syntax to respect is demonstrated by the following example:
{{saved_book}}
== My Book ==
=== Example ===
;Foo
:[[First article]]
:[[Second article]]
:[[Third article|This article renamed in the book]]
;Bar
:[[Fourth article]]
:[[Fifth article]]
:[{{fullurl:Sixth article|oldid=20}} Sixth article version:20]
[[Category:Books]]
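For long books, the same page structure can be generated from a list of chapters and then pasted into the saved book page. This is only a sketch that follows the example above; the function name build_saved_book and its parameters are assumptions made for illustration, and pinned revisions (the fullurl form) are left out.

```python
def build_saved_book(title, subtitle, chapters, category="Books"):
    """Return saved-book wikitext following the structure of the example.

    chapters: list of (chapter_title, articles) pairs. Each article is
    either a page name or a (page_name, display_title) pair.
    """
    lines = ["{{saved_book}}", f"== {title} =="]
    if subtitle:
        lines.append(f"=== {subtitle} ===")
    for chapter_title, articles in chapters:
        lines.append(f";{chapter_title}")  # chapter heading
        for article in articles:
            if isinstance(article, tuple):
                page, display = article
                lines.append(f":[[{page}|{display}]]")  # renamed in the book
            else:
                lines.append(f":[[{article}]]")
    lines.append(f"[[Category:{category}]]")
    return "\n".join(lines)


book = build_saved_book(
    "My Book",
    "Example",
    [
        ("Foo", ["First article", "Second article",
                 ("Third article", "This article renamed in the book")]),
        ("Bar", ["Fourth article", "Fifth article"]),
    ],
)
print(book)
```

The category name should match the (language-dependent) book category defined in your wiki.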
Install the Mediawiki source extension
If you have articles that include computer source code (XML, ActionScript or whatever), you should also install the source extension (CodeSyntaxHighlight MediaWiki) for formatting computer code.
If you use XML within "pre" tags, the parser may become confused.
Also, the printed PediaPress book will look much prettier. Editing all your wiki pages may mean some work, but users reading the on-line pages will also be grateful: colored and indented code is really much more readable!
Pictures and drawings
(1) Readjust some image sizes
(I'll have to be more precise about this, but I'll first need to analyze both a wiki book I got and the generated PDF...)
- Don't use large pictures when smaller ones are readable
- Create screendumps from smallest possible areas if you plan to show hairy details (see also the screen capture tutorial).
(2) Color
Printed PDF is mostly grey (PediaPress books certainly are). E.g. when you create drawings or annotate images with text, make sure that shades of grey still allow readers to identify critical elements. A related and very difficult issue is writing the text: avoid writing about "blue" and "green" arrows and "green" or "yellow" dots ...
- To select colors that show on gray printers, you may use a color scheme designer such as colorschemedesigner.com and simulate color vision deficiency with "full colorblindness". Otherwise, print your drawing before you import it into the wiki.
- A radical solution is to use gray images already upfront in the wiki. This way you are sure to get it somewhat right. Note: "grey" is spelled "gray" in CSS and X11.
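A quick numeric check can complement the visual test. The sketch below converts colors to an approximate gray value with the common Rec. 601 luma weights and flags pairs that end up too close. The weights are a standard approximation and not necessarily what a given printer or PDF renderer does, and the threshold of 50 is an arbitrary assumption to be tuned by eye.

```python
def gray_level(hex_color):
    """Approximate gray value (0-255) of an RGB color like '#ff0000'.

    Uses the Rec. 601 luma weights; real printers may differ.
    """
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def distinguishable_in_gray(color_a, color_b, min_difference=50):
    """True if the two colors differ by at least min_difference gray levels.

    min_difference is an assumed threshold, not a standard value.
    """
    return abs(gray_level(color_a) - gray_level(color_b)) >= min_difference


# Red and (CSS) green are very different on screen but nearly identical in gray.
print(gray_level("#ff0000"), gray_level("#008000"))
print(distinguishable_in_gray("#ff0000", "#008000"))
print(distinguishable_in_gray("#000000", "#ffff00"))
```

A pair that fails this check, such as red and green arrows, is a good candidate for different line styles or labels instead of color alone.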
Stuff that the parser doesn't handle well
As of mid May 2009 (so this may change)
- Some extensions (like graphviz) are not supported, see above. There is no solution for this, except not using them.
- Some unsupported extensions won't matter, e.g. pageby. But you will have to modify the source code to filter them if you run your own server (see above), or else file a request.
- prints "as is", therefore avoid! Use ":" and "::" etc. to indent lines for example
- Source code (within either "pre" or "source" tags) that follows a picture will be printed over the picture. Probable reason: the renderer will reduce a picture and try to wrap text around it, but source code cannot be wrapped, or not as well. Workaround: move the image either 20 lines above or after the code.
Conditional inclusion/exclusion
Warning: This wasn't tested with Pediapress. If you plan to buy a book, use these on the very first pages, then have a look at the preview.
1) Exclude templates
You may exclude any template you wish from the PDF generation by adding it to the Category:Exclude in print. Use with care, since this will filter it for all users!
2) Exclude certain specific content
By using Template:Hide in print, certain specific content, such as a few words or an image, can be excluded from printing.
This content will be printed.{{Hide in print|This content will not be printed.}}This content will be printed.
Alternative solution: use the class="noprint" within a div or span tag.
3) Include certain specific content only in print versions
The Template:Only in print can be used to insert content that shall only be visible in offline versions.
Example: print this and display it in the browser {{Only in print|this is visible only in PDFs or printed books, not in the browser}} this is visible in the browser and in print as well.
Alternative solution: use the class="hidden" within a div or span tag.
4) Substitute templates
You can create a print version of a template under the name "TEMPLATENAME/Print" with TEMPLATENAME being the name of the original template.
(more to come ....) - Daniel K. Schneider 12:54, 20 May 2009 (UTC)