Tuesday, January 31, 2006


It just struck me that wikis are ideally suited to editing code, as each function name can be a link to the definition of that function, making code browsing a snap. This would be a trivial MediaWiki extension. If you know of any code wiki, please let me know! I also suspect that some IDEs already have this functionality.
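To make the idea concrete, here is a minimal sketch in Python rather than as an actual MediaWiki extension. It assumes the code on the wiki page is itself Python and that a plain [[name]] link is enough to reach the definition; the function names find_definitions and linkify_calls are made up for illustration and are not part of any existing tool.

```python
import re


def find_definitions(source: str) -> set[str]:
    """Collect the names of all functions defined with 'def' in the source."""
    return set(re.findall(r"^\s*def\s+(\w+)\s*\(", source, flags=re.MULTILINE))


def linkify_calls(source: str, defined: set[str]) -> str:
    """Wrap every call to a known function in [[...]] wiki-link brackets.

    The definition line itself ('def name(') is left alone, so only call
    sites become links.
    """
    if not defined:
        return source
    names = "|".join(re.escape(name) for name in sorted(defined))
    # \b on both sides stops 'parse' from matching inside 'parse_all';
    # the lookahead requires an opening parenthesis, i.e. a call.
    pattern = re.compile(r"(?<!def )\b(" + names + r")\b(?=\s*\()")
    return pattern.sub(r"[[\1]]", source)


example = (
    "def area(r):\n"
    "    return square(r) * 3\n"
    "\n"
    "def square(x):\n"
    "    return x * x\n"
)
linked = linkify_calls(example, find_definitions(example))
```

A regular expression is of course too naive for real code (it will happily link names inside strings and comments); a proper version would use the language's own parser.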

Friday, January 27, 2006

GUI to queue and edit jobs to be done over ssh

Had the idea for an app that allows you to:
  • list jobs you want done over ssh (no point running more than one job at a time on single-cpu machines)
  • edit the cmdline arguments of each job before it is submitted
  • paste or even drag-and-drop jobs (e.g. between different machines) into the queue
  • reorder jobs

P2P RSS network

Have you ever been annoyed that when you subscribe to a new RSS feed, there may be only 10 articles available? Granted, this problem is addressed by some of the online RSS readers out there (I've posted before), but it might just be interesting to have the html content distributed by a P2P network. I believe this would significantly speed up loading of html pages for those feeds that use html. If you know of any such effort, please let me know!


Thursday, January 26, 2006

Reinstalling Linux: a checklist

I thought it would be nice for people to have a checklist of what they need to back up before installing a different Linux flavour.
  • /etc - system configuration
  • /home - your files and configurations
  • /boot/grub/menu.lst or grub.conf
  • /var/httpd - only if you're running web services and using the global directory; on some distributions, this is not placed in /var (e.g. in Arch it's in /home/httpd)
  • do a mysqldump if you're using mysql; similarly for any other relational database
  • dump your package list - sometimes this is done by reading file names from /var/cache/pkg or similar after cleaning out old package files; some package managers will output a list (something like dpkg --get-selections on Debian and derivatives)
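
For the last item, a minimal sketch of dumping the package list on Debian and derivatives, assuming the usual output of dpkg --get-selections (one package name and one state per line, separated by whitespace). The function names and the output file are my own choice, not part of any tool.

```python
import subprocess


def parse_selections(text: str) -> list[str]:
    """Return the names of packages whose selection state is 'install'."""
    packages = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1] == "install":
            packages.append(fields[0])
    return packages


def dump_package_list(path: str) -> int:
    """Write one installed package name per line to path; return the count."""
    result = subprocess.run(
        ["dpkg", "--get-selections"],
        capture_output=True,
        text=True,
        check=True,  # raise if dpkg is missing or fails
    )
    names = parse_selections(result.stdout)
    with open(path, "w") as handle:
        handle.write("\n".join(names) + "\n")
    return len(names)
```

Call dump_package_list with a path inside whatever you are backing up anyway (e.g. somewhere under /home), so that the list survives the reinstall.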

Monday, January 23, 2006

Remote desktop client with zoom

Wouldn't it be great to have a remote desktop (e.g. VNC) client that compensated for different desktop sizes by means of zoom? Libraries for such zooming exist in Mac OS X, GNOME and KDE, even Windows! Even better if the server supported it. If there is a way to do this, let me know!

Update 07/05/2006:
krdc has this feature.
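
For what it's worth, the arithmetic a client needs for this is small. The sketch below is my own illustration, not how krdc does it: it picks one zoom factor that fits the remote desktop into the local window while keeping the aspect ratio, and maps a local mouse position back to remote coordinates. The function names are hypothetical.

```python
def fit_scale(remote: tuple[int, int], local: tuple[int, int],
              allow_upscale: bool = False) -> float:
    """Zoom factor that fits a remote (width, height) into a local one."""
    scale = min(local[0] / remote[0], local[1] / remote[1])
    if not allow_upscale:
        # Never enlarge a remote desktop that already fits.
        scale = min(scale, 1.0)
    return scale


def to_remote(point: tuple[int, int], scale: float) -> tuple[int, int]:
    """Map a position in the zoomed local view back to remote pixels."""
    return (round(point[0] / scale), round(point[1] / scale))
```

For example, a 1600x1200 remote desktop viewed in a 1024x768 window gets a factor of 0.64, and a click in the middle of the window, at (512, 384), lands on remote pixel (800, 600).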

Friday, January 13, 2006

Linux repository classification schemes

Originally, I was going to write this as an extended paper with a detailed review of how the ten main Linux distributions organise their software into repositories. However, I don't seem to have the time to do so, and will provide a brief overview here instead.

This is relevant to developing tools, such as whohas, that allow comparisons of repositories, e.g. comparison of software availability (how many software packages are available, how quickly new versions are released, how current the current versions are, how many versions are released in a given time - three sides of a triangle). Other comparisons might take into account stability and other criteria.

In any case, there are three main ways to classify repositories:
  • Maturity
  • Provenance
  • Function
The classic example of repositories organised by maturity would be Debian, which at any given time has three branches which may be more or less distinct (there is a graph of the relationship over time somewhere on the web...). A peculiarity - indeed, a feature - of Debian is that one can almost freely mix packages from different repositories; so while one may be running a stable kernel, one could have an "unstable" version of Mozilla-based (and -dependent) products (the quality of unstable, a.k.a. still-in-development, software is actually fairly high in Debian). Many distributions (except source-based and advanced binary-based ones such as Arch Linux) occur as distinct releases in the wild, but Debian is the only one in which mixing repositories is common enough practice to actually work (in terms of documentation and being taken into account in development, if marginally).

A classic example of a provenance-based repository classification is given by Fedora, which is now distributed as Core and Extras. Another common classification, especially used by RPM-based distros (for no technical reason as far as I know), is "Contrib", sometimes called Community.

Arch Linux has a hybrid of these two, in that Current corresponds to Core, Extra and Community are self-explanatory provenance-based contrasts, but there are also Testing and Unstable repositories, which are code-maturity classifications and mostly contain packages that would otherwise be found in Core. To make things entirely confusing, there is a repository Unsupported, to which users can contribute buildscripts, so it is actually a fourth kind of classification, which I might phrase as binary-source-buildscript. Note that distributions will provide either source or buildscripts, but not both separately.

But to return to the original big three, the most prominent example of a functional classification would be Slackware, which classifies packages into base, latex, gnome etc.; however, these are not strictly repositories in that they would be separately specified in a package manager config file. Again, many hybrids exist - in Arch Linux, we also find an underlying functional classification into "categories", which resemble those in Slackware: x11, system, network, gnome etc.

Being aware of the different classification schemes used, one can get the full benefit of tools such as whohas.

Sunday, January 08, 2006

Does Wikipedia change the rules of language evolution?

I happened upon this example today:

It is argued that the term "Williams revolution" was coined either on Wikipedia or in newsgroups. I find the Wikipedia hypothesis quite plausible.

Picture this:
  1. Some editor of, say, http://en.wikipedia.org/wiki/History_of_evolutionary_thought writes something like, "George C. Williams' book led to a small revolution [...]"
  2. Over time, this becomes "George C. Williams' revolution", then "Williams' revolution"; someone thinks the apostrophe superfluous and the next person feels there should be something written about this "Williams revolution", so puts the infamous [[]] around it.
  3. Finally, someone budges and writes a stub about it.
  4. Meanwhile, the term "Williams revolution" has started being used in other articles because it's a more handy moniker than "advent of the gene-centric view of evolution". By now, putting it in [[]] is completely uncontroversial, because an article has already been written about it. So it appears on every imaginable page, ranging from "Scientific skepticism" over "Evolutionary theory and the political left" through to "The Vicar of Bray" (I am NOT kidding you!)
  5. Eventually, someone feels that Williams may be being given undue credit and does some research. All Google hits point to Wikipedia, including those from scholar.google.com. Web of Science doesn't return a single hit. One contributor claims to have heard the term on a newsgroup, but this is hardly evidence of common usage. Various people including myself check their textbooks and the books of Dawkins, who is now suddenly being credited with having invented the term. Nothing. Nada. Puzzlingly, the German Wikipedia mentions the term in spite of not having an entry about it.
  6. Due to lack of opposition, it is decided that it is not Wikipedia's business to have the power to coin useful phrases crediting someone who should not solely be credited.
  7. Someone works their arse off for an afternoon to eliminate all trace of the Williams revolution.
If you know of similar events, please let me know - this topic has only just begun to get interesting!