Saturday, 6 June 2009

Legal Māori Archive


Now that the Legal Māori Archive is live, I thought I'd highlight a couple of my favourite texts from the corpus.

The first is a great example of reinforcing cultural confusion.
"The Laws of England, Compiled and translated into the Māori language" by Judge Francis Dart Fenton is a bilingual compendium of the laws of England but, extraordinarily, uses Bible quotes as examples.

The second example is actually a collection of texts, the works of Rev. Henry Hanson Turton, who compiled thousands of pages of land deeds and associated documents into six volumes. I can see these getting a lot of use by Treaty researchers.

Tuesday, 5 May 2009

Why card-based records aren't good enough

Card catalogs have a long tradition in librarianship, dating back, I'm told, to the book stock-take in the French revolution. Librarians understand card catalogs in a deep way that comes from generations of librarians having used them as a core professional tool all their professional lives. Librarians understand card catalogs in ways that I, as a computer scientist, never will. I still recall that on one of my first visits to a university library I asked a librarian where I might find books by a particular author, and they found the work for me arguably as fast as I can now find works with the new wizzy electronic catalog.

It is natural, when faced with something new, to understand it in terms of what we already know and already understand. Unfortunately, understanding the new by analogy to the old can lead to the form of the old being assumed in the new. It was true that when libraries digitized their card catalogs in the 1970s and 1980s, they were more or less exactly digital versions of their card catalog predecessors, because their content was limited to old data from the cards and new data from cataloging processes (which were unchanged from the card catalog era) and because librarians and users had come to equate a library catalog with a card catalog---it was what they expected.

MARC is a perfect example of this kind of thing. As a data format to directly replace a card catalog of printed books, it can hardly be faulted.

Unfortunately, digital metadata has capabilities undreamt of at the time of the French revolution, and card catalogs and MARC do a poor job of handling these capabilities.

A whole range of people have come up with criticisms of MARC that involve materials and methodologies not routinely held in libraries at the time of the French revolution (digital journal subscriptions and music, for example), but I view these as postdating card catalogs and thus the criticism as unfair.

So what was held in libraries in 1789 that MARC struggles with? Here's a list:
  • Systematically linking discussion of particular works with instances of those works
  • Systematically linking discussion of particular instances with those instances ("Was person X the transcriber of manuscript Y?")
  • Handling ambiguity ("This play may have been written by Shakespeare. It might also have been a later forgery by Francis Bacon, Christopher Marlowe or Edward de Vere")

All of these relate to core questions which have been studied in libraries for centuries. They're well understood issues, which changed little in the hundred years until the invention of the computer (which is when all the usually-cited issues with MARC began).

The real question is why we're still expecting an approach that didn't solve the problems two hundred years ago to solve our problems now. Computers are not magic in this area; they just seem to be helping us do the wrong things faster, more reliably and for larger collections.

We need a new approach to bibliographic metadata, one which is not ontologically bound to little slips of paper. There are a whole range of different alternatives out there (including a bevy of RDF vocabularies), but I've yet to run into one which both allows clear representation of existing data (because let's face it, I'm not going to re-enter WorldCat, and neither are you, not in our lifetimes) and admits non-card-based metadata as first class elements.

</rant>

Friday, 1 May 2009

LoC gets semantic

This morning, the Library of Congress launched http://id.loc.gov/authorities/, their first serious entry into the semantic web.

The site makes the Library of Congress Subject Headings available as dereferenceable URLs. For example, http://id.loc.gov/authorities/sh90005545.

Wednesday, 4 February 2009

NDHA demo and the National Library

This morning I went to the NDHA demonstration where a National Library techie talked us through the NDHA ingest tools. The tools are the most visible piece of the NDHA infrastructure, and are designed to unify the ingest of digital documents, whether they are born-digital documents physically submitted (i.e. producers mail in CDs/DVDs etc); born-digital documents electronically submitted (i.e. producers upload content via wizzy web tools); or digital scans of current holdings produced as part of the on-going digitisation efforts. The tools have a unified system with different workflows for unpublished material (=archive) and published material (=library). The unification of library and archival functionality seemed like fertile ground for miscommunication.

The infrastructure is (correctly) embedded across the library, and uses all the current tools for collection maintenance, searching and access.

As a whole it looks like the system is going to save time and money for large content producers and capture better metadata for small donors of content, which is great. By moving the capture of metadata closer to the source (while still allowing professional curatorial processes for selection and cataloguing), it looks like more context is going to be captured, which is fabulous.

A couple of things struck me as odd:

  1. The first feedback to the producer/uploader is via a human. Despite having an elaborate validation suite, the user wasn't told immediately "that .doc file looks like a PDF, would you like to try again?" or "Yep, that *.xml file is valid and conforming XML". Time and again studies have shown that immediate feedback, which allows people to correct their mistakes straight away, is important and effective.
  2. The range of metadata fields available for tagging content was very limited. For example there was no Iwi/Hapu list, no Maori Subject Headings, no Gazetteer of Official Geographic Names. When I asked about these I was told "that's the CMS's role" (=Collection Management Software, i.e. those would be added by professional cataloguers later), but if you're going to move the metadata collection as close as possible to content generation, it makes sense to at least have the option of proper authority control over it.
Or that's my take, anyway. Maybe I'm missing something.
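
As a rough illustration of the kind of immediate feedback I mean, here is a minimal sketch in bash. It is not how the NDHA tools work; the function names (looks_like_pdf, check_upload) are made up for this example, and it only knows the PDF signature (a file starting with the bytes %PDF), where a real ingest tool would lean on its full validation suite.

```shell
#!/bin/bash
# Hypothetical sketch: compare a file's name with its leading bytes and
# tell the uploader straight away if the two disagree.
# Assumption: only the PDF signature ("%PDF") is checked.

looks_like_pdf() {
    # PDF files begin with the four bytes "%PDF"
    [ "$(head -c 4 "$1" 2>/dev/null)" = "%PDF" ]
}

check_upload() {
    local file="$1"
    case "$file" in
        *.pdf)
            if looks_like_pdf "$file"; then
                echo "ok: $file looks like a PDF"
            else
                echo "warning: $file is named .pdf but does not look like a PDF, would you like to try again?"
            fi
            ;;
        *)
            if looks_like_pdf "$file"; then
                echo "warning: $file looks like a PDF despite its name, would you like to try again?"
            else
                echo "ok: no obvious mismatch for $file"
            fi
            ;;
    esac
}
```

Calling check_upload on each file as it arrives would give the producer a message within seconds, well before a human curator gets involved.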

Monday, 2 February 2009

Report from the NDHA's International Perspectives on Digital Preservation

NOTE: I'm a computer scientist by training and this was largely a librarian/archivist gig, so it's entirely possible I've got the wrong end of the stick on one or more points in the summary below. It's also my own summary, and not the position of my employer, even though I was on work time during the event.

The NDHA is about to announce that the NDHA project has been completed on time and under budget. This is particularly pleasing in light of the history of government IT failures over the last 30 years, and is a tribute to all concerned. Indeed, when I was taking undergraduate courses in software engineering a contemporary national library project was used as a text-book example of how not to run a software development undertaking. It's good to see how far they've come.

The event itself was a one-day event in the national library auditorium, with a handful of overseas speakers. I'm not entirely certain that a handful of foreigners counts as "international," but maybe that's just me being a snob. Certainly there was a fine turn-out of locals, including many from the National Library, the Ministry of Culture and Heritage and from VUW, including a number of students, who couldn't possibly have been there for the free food.

There seemed to be an underlying tension between librarianship and archivistship running through the event. Personally, I see this as a really crazy turf war, since the chances of libraries and archives existing as separate entities and disciplines in fifty years seem pretty slim. The separation between the two, the "uniqueness" of objects in an archive, seems to me to be obliterated by the free duplication of digital objects. I've heard people say that archives also handle access controls and embargoes for their depositors, but then so can libraries, particularly those in the military and those working with classified documents.

It seemed to me that the word "reliability" was used in a confusing number of different ways by different people. Without naming the guilty parties:
  1. reliability as the truthfulness of the documents in the library/archive. This is the old problem of ingestors having to determine the absolute veracity of documents
  2. reliability as getting the same metadata every time. This seems odd to me, since systems with audit control give _different_ results every time, because information on the previous accesses is included in the metadata of subsequent accesses
  3. reliability as the degree to which the system conformed to a standard/specification
On reflection this may have been a symptom of the different vocabulary used by librarians and archivists. Whatever the cause, if we're wanting to spend public money, we have to be able to explain to the public what we're doing, and this isn't helping.

The organisers told us the presentations would be up by tonight (the evening of the presentation), but you won't find them on google if you go looking, because they tell google to please f**k off. I guess this is what someone was referring to when they said we had to work to make content accessible to google. The link is http://ndha-wiki.natlib.govt.nz/ndha/pages/IPoDP%2009%20Presentations and most were up at the time of writing.

I was hugely encouraged by the number of pieces of software that seemed to be being open sourced, as I see this as being a much better economic model than paying vendors for custom software, particularly since it's potentially scalable out from the national and top-tier libraries/archives/museums to the second and third tier libraries/archives/museums, which by dint of their much larger numbers actually serve the most users and have the most content. It was unfortunate that the national library hasn't looked beyond proprietary software for non-specialist software but continues to use Adobe Photoshop / Microsoft Windows, which are available only for limited periods of time on certain platforms (which will inevitably become obsolete), rather than OpenOffice, GIMP, etc, which are cross platform and licensed under perpetual licences which include the right to port the software from one platform to another. I guess Photoshop / Windows is what their clients and funders know and use.

With a number of participants I had conversations about preservation. Andrew Wilson in his presentation used the quote:

“traditionally, preserving things meant keeping them unchanged; however our digital environment has fundamentally changed our concept of preservation requirements. If we hold on to digital information without modifications, accessing the information will become increasingly difficult, if not impossible” Su-Sing Chen, “The Paradox of Digital Preservation”, Computer, March 2001, 2-6

If you think about what intellectual objects we have from the Greeks (which is where we Westerners traditionally trace our intellectual history from), the majority fall into two main classes: (a) art works, which have survived primarily through Roman copies and (b) texts, which have survived by copying, including a body of mathematics which was kept alive in Arabic translation during a period when we Westerners were burning the works in Latin and Greek and claiming that the Bible was the only book we needed. I'll grant you that a high-quality book will last maybe 500 years in a controlled environment, maybe even 1000, but for real permanence, you just can't get past physical ubiquity. If we have things truly worthy of long-term preservation, we should be striking deals with the Warehouse to get them into every home in the country, and setting them as translation exercises in our language learning courses.

I had some excellent conversations with other participants at the event, including Phillipa Tocker from Museums Aotearoa / Te Tari o Nga Whare Taonga o te Motu who told me about the http://www.nzmuseums.co.nz/ site they put together for their members.

Looking at the site I'm struck by how similar the search functionality is to http://www.nram.org.nz/. I'm not sure whether their relative similarity is a good thing (because it enables non-experts to search the holdings) or a bad thing (because by lowering themselves to the lowest common denominator they've devalued their uniqueness). While I'm certain that these websites have vital roles in the museums and archives community respectively, I can't help but feel that from an end user's perspective having two sites rather than one seems redundant, and the fact that they don't seem to reference/suggest any other information sources doesn't help. I can't imagine a librarian/archivist not being forthcoming with a suggestion of where to look next if they've run out of local relevant content---why should our websites be any different?

I recently changed the NZETC to point to likely-relevant memory institutions when a search returns no results (or when a user pages through to the end of any list of results).

I also talked to some chaps from Te Papa about the metadata they're using to represent place names (Getty Thesaurus of Geographic Names) and species names (ad-hoc). At the NZETC we have many place names marked up (in NZ, Europe and the Pacific), but are not currently syncing with an external authority. Doing so would hugely enable interoperability. Ideally we'd be using the shiny new New Zealand Gazetteer of Official Geographic Names, but it doesn't yet have enough of the places we need (it basically only covers places mentioned in legislation or treaty settlements). It does have macrons in all the right places though, which is an excellent start. We currently don't mark up species names, but would like to, and again an external authority would be great.

It might have been useful if the day had included an overview of what the NDHA actually was and what had been achieved (maybe I missed this?).

Sunday, 1 February 2009

flickr promoting the commons / creative commons

flickr is promoting photos in what it calls "The Commons", but only to logged in users. Normal users (who can't comment / tag the photos in the commons anyway) don't get an obvious link to them (except via about a billion third party sites such as blogs and google search). The page also shows how the national library's choice not to include text in their logo has come up trumps.

The confusingly similarly named "The Commons" and "Creative Commons" parts of the website apparently don't reference each other. Odd.

Friday, 9 January 2009

Excellent stuff from New Zealand Geographic Board Ngā Pou Taunaha o Aotearoa

A while ago, motivated by the need for an authoritative list of New Zealand place names for our work at the NZETC, I criticised the NZGB fairly roundly.
While they haven't produced what I/we want/need, in the last couple of months they've made huge progress in an unambiguously right direction.
Their primary work is the New Zealand Gazetteer of Official Geographic Names, a list of all official place names in New Zealand. It has a peculiar definition of "official" (= mentioned in legislation or a Treaty of Waitangi settlement), has very few names of inhabited places (and no linking with the much larger lists maintained by official bodies such as the police and fire service), has no elevation data for mountains and passes (which are defined by their height) and defines some things as points when they appear to be areas (such as Arthur's Pass National Park), but it's much better than the New Zealand Place Names Database since:
  1. It has a statutory reference for every place, giving the source of the officialness of the name
  2. It fully supports macrons
  3. It has a machine-readable list of DoC administered lands --- I can imagine this being used for all sorts of interesting things, such as getting people out into scenic and marine reserves.
NZGB sent around an email in which they explicitly addressed some of the points I'd earlier raised (I'm sure I wasn't the only one):
It should be noted that some of the naming practices of the past will have to be lived with, despite inconsistencies. Moving forward, the rules of nomenclature followed by the NZGB are designed to promote standardisation, consistency, and non-ambiguity. The modern format for dual names is '<Maori name> / <non-Maori name', which the NZGB has applied for the past 10 years, though Treaty settlement dual names sometimes deviate from this convention, because the decision is ultimately made by the Minister for Treaty of Waitangi Negotiations. Older forms of dual names, with brackets, will remain depicted as such until changed through the statutory processes of the NZGB Act 2008. These are not generally regarded as alternative names.
Macrons in Maori names have posed problems for electronic databases. Nevertheless they are part of the orthography, recommended by the Maori Language Commission, and the Board endorses their use. The Gazetteer will include macrons where they are formalised as part of the official name. When Section 32 of the new Act comes into force, official documents will be required to show official names, and these will need to include macrons where they have been included as part of the official name (unless the proviso is used). A list of those official names which have macrons is at http://www.linz.govt.nz/placenames/researching-place-names/macrons/index.aspx . LINZ's Customer Services has some solutions for showing macrons in LINZ's own databases and on published maps and charts, and is currently investigating how bulk data extracts might include information about macrons, for the customer's benefit.
Despite the name, it isn't clear in my mind exactly what's official and what isn't. Is the content of the "coordinates" column official? For railway lines this is a reference to the description, which in the case of railways is usually of the form "From X to Y", where X and Y are place names, frequently place names that aren't on the list, and so presumably not official. Unless I'm going blind there is also no indication of accuracy on the physical measurements.

Thursday, 9 October 2008

fuzzziness

I've been using topic maps in my day job, so I decided to try out http://www.fuzzzy.com/, a social bookmark engine that uses an underlying topic map engine.
I tried to approach fuzzzy with an open mind, but I kept stumbling on really annoying (mis-)features.
  1. This is the first bookmark engine I've ever used that doesn't let users migrate their bookmarks with them. This is perhaps the biggest single feature fuzzzy could add to attract new users, since it seems that most people who're likely to use a bookmark engine have already played with another one long enough to have dozens or hundreds of bookmarks they'd like to bring with them. I know this is non-ideal from the point of view of the social bookmark engine they're migrating to, since it makes it hard to do things completely differently, but users have baggage.
  2. While it's possible to vote up or vote down just about everything (bookmarks, tags, bookmark-tags, users, etc), very little is actually done with these votes. If I've viewed a bookmark once and voted it down, why is it added to my "most used Bookmarks"? Surely if I've indicated I don't like it the bookmark should be hidden from me, not advertised to me.
  3. For all the topic map goodness on the site, there is no obvious way to link from the fuzzzy topic map to other topic maps.
  4. There doesn't seem to be much in the way of interfacing with other semantic web standards (i.e. RDF).
  5. The help isn't. Admittedly this may be partly because many of the key participants have English as a second language.
  6. There's a spam problem. But then everywhere has a spam problem.
  7. It's not obvious that I can export my bookmarks out of fuzzzy in a form that any other bookmark engine understands.
These (mis-)features are a pity, because at NZETC we use topic maps for authority (in the librarianship sense), and it would be great to have a compatible third party that could be used for non-authoritative stuff and which would just work seamlessly.

Sunday, 5 October 2008

Place name inconsistencies

I've been looking at the "Dataset of New Zealand Geographic Place Names" from LINZ. This appears to be as close as New Zealand comes to an Official list of place names. I've been looking because it would be great to use as an authority in the NZETC.

Coming to the data I was aware of a number of issues:
  1. Unlike most geographical data users, I'm primarily interested in the names rather than the relative positions
  2. New Zealand is currently going through an extended period of renaming of geographic features to their original Māori names
  3. The names in the dataset are primarily map labels and are subject to cartographic licence
What I didn't expect was the insanity in the names. I know that there are some good historical reasons for this insanity, but that doesn't make it any less insane.
  1. Names can differ only by punctuation. There is a "No. 1 Creek" and a "No 1 Creek".
  2. Names can differ only by presentation. There is a "Crook Burn or 8 Mile Creek", an "Eight Mile Creek or Boundary Creek" and an "Eight Mile Creek" (each in a different province).
  3. There is no consistent presentation of alternative names. There is "Saddle (Mangaawai) Bivouac", "Te Towaka Bay (Burnside Bay)", "Queen Charlotte Sound (Totaranui)", "Manawatawhi/Three Kings Islands", "Mount Hauruia/Bald Rock", "Crook Burn or 8 Mile Creek" and "Omere, Janus or Toby Rock"
  4. There is no machine-readable source of the Māori place names with macrons, and the human readable version contains subtle differences from the machine-readable database (which contains no non-ASCII characters). For example "Franz Josef Glacier/Kā Roimata o Hine Hukatere (Glacier)" and "Franz Josef Glacier/Ka Roimata o Hine Hukatere" differ by more than the macrons. There appears to be no information on which are authoritative.
Right now I'm finding this rather frustrating.
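
For what it's worth, the first and third problems can be partly papered over mechanically. The following is a minimal sketch in bash, not anything LINZ provides; the function names (normalise_name, split_alternatives) are my own invention. It assumes names never contain a "|" character, and it deliberately ignores the harder cases: brackets in the middle of a name ("Saddle (Mangaawai) Bivouac"), bracketed feature types ("(Glacier)"), numerals versus words ("8 Mile" versus "Eight Mile"), and lists such as "Omere, Janus or Toby Rock" where the alternatives presumably share the final word.

```shell
#!/bin/bash
# Hypothetical helpers for comparing place names from the dataset.

# Reduce a name to a comparison key: drop full stops, lowercase the
# ASCII letters, and collapse runs of spaces.
normalise_name() {
    printf '%s\n' "$1" |
        tr -d '.' |
        tr '[:upper:]' '[:lower:]' |
        sed -e 's/  */ /g' -e 's/^ //' -e 's/ $//'
}

# Split the simpler alternative-name conventions into one name per line:
# "A (B)" with a trailing bracket, "A or B", "A, B" and "A/B".
# Assumption: "|" never occurs in a name, so it is safe as a temporary separator.
split_alternatives() {
    printf '%s\n' "$1" |
        sed -e 's/ *(\([^()]*\))$/|\1/' \
            -e 's/ or /|/g' \
            -e 's/, /|/g' \
            -e 's:/:|:g' |
        tr '|' '\n'
}

normalise_name 'No. 1 Creek'      # prints: no 1 creek
normalise_name 'No 1 Creek'       # prints: no 1 creek
split_alternatives 'Manawatawhi/Three Kings Islands'
```

None of this says which form is authoritative, of course; it only makes the duplicates easier to spot.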

(grammar edit)

Tuesday, 2 September 2008

Does anyone publish the Dataset of New Zealand Geographic Place Names already in XML form?

I've been playing with the Dataset of New Zealand Geographic Place Names which is a set of CSV files published by Toitū te whenua / Land Information New Zealand (LINZ). The data takes quite a bit of massaging, and I was wondering whether anyone else had already done the work of making acceptable XML out of the data rather than doing all the work myself.


I've attached the script I have so far, but it's not perfect. In particular:


  1. It doesn't include place names with Macrons
  2. It makes lots of ASCII-type assumptions
  3. Many of the element names are poorly named and map non-obviously to fields in the CSV files.
  4. The script isn't very generic and does little or no checking

Anyway, here's the script, hopefully it's successfully escaped. The basics are that it creates an sqlite database and streams the CSV files into it direct from the zip (which it expects to have been downloaded into the current directory). It then streams each point out using awk to transform it to XML.




#!/bin/bash
# script to import data from
# http://www.linz.govt.nz/placenames/search/place-names-dataset-download/index.aspx
# into an XML file.
# this script licensed under the GPL/BSD/Apache 2 licences

echo \(re\)creating the database, expect DROP errors the first time you run this
sqlite nzgeonames.db << EOF
DROP TABLE name;
CREATE TABLE name (id, name, east, north, pdescription, district, sheet, lat, long);

DROP TABLE district;
CREATE TABLE district (district, description);


DROP TABLE pdescription;
CREATE TABLE pdescription (pdescription, short, description);


DROP TABLE sheet;
CREATE TABLE sheet (edition, map, sheet);

VACUUM;
EOF

echo importing the names
unzip -p nznames_6Aug08.zip namedata.txt | sed 's/\r//' | sed 's/`/","/g' | awk -F^ '{print "INSERT INTO name VALUES (\"" $0 "\");"}' | sqlite nzgeonames.db

echo importing the districts
unzip -p nznames_6Aug08.zip landdist.txt | sed 's/\r//' | sed 's/`/","/g' | awk -F^ '{print "INSERT INTO district VALUES (\"" $0 "\");"}' | sqlite nzgeonames.db

echo importing the point descriptions \(expect two lines of errors\)
unzip -p nznames_6Aug08.zip pointdes.txt | sed 's/\r//' | sed 's/`/","/g' | awk -F^ '{print "INSERT INTO pdescription VALUES (\"" $0 "\");"}' | sed 's/:/","/' | sqlite nzgeonames.db

echo importing the sheet names
unzip -p nznames_6Aug08.zip sheetnam.txt | sed 's/\r//' | sed 's/`/","/g' | awk -F^ '{print "INSERT INTO sheet VALUES (\"" $0 "\");"}' | sqlite nzgeonames.db


# pick up the ugly duckling
sqlite nzgeonames.db << EOF
INSERT INTO pdescription VALUES ("MRFM","MARINE ROCK FORMATION","Marine Rock Formation");
EOF

echo exporting points as xml
echo "<document source=\"Sourced from Land Information New Zealand, [date]. Crown copyright reserved.\">" > nzgeonames.xml
sqlite nzgeonames.db "SELECT name.id, name.name, name.east, name.north, name.pdescription, name.district, name.sheet, name.lat, name.long, district.description, pdescription.short, pdescription.description AS descriptionA, sheet.edition, sheet.map FROM name, district, pdescription, sheet WHERE name.district = district.district AND name.pdescription = pdescription.pdescription AND name.sheet = sheet.sheet;" | awk -F\| '{print "<point><id>" $1 "</id><name>" $2 "</name><east>" $3 "</east><north>" $4 "</north><pdescription>" $5 "</pdescription><district>" $6 "</district><sheet>" $7 "</sheet><lat>" $8 "</lat><long>" $9 "</long><description>" $10 "</description><short>" $11 "</short><descriptionA>" $12 "</descriptionA> <edition>" $13 "</edition> <map>" $14 "</map> </point>"}' | sed 's/&/\&amp;/g' >> nzgeonames.xml
echo "</document>" >> nzgeonames.xml

echo formatting the points nicely
xmllint --format nzgeonames.xml > nzgeonames-formatted.xml
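
The export step above only deals with ampersands, and does so after the tags have been wrapped around the values, so a stray "<" in a name would break the output. A slightly more careful approach would escape each field before wrapping it. This is a minimal sketch under that assumption; xml_escape and xml_element are hypothetical helper names, not part of the script above.

```shell
#!/bin/bash
# Hypothetical helpers: escape a value, then wrap it in an element.

xml_escape() {
    # The ampersand is handled first, otherwise the "&" introduced by
    # "&lt;" and "&gt;" would itself be escaped a second time.
    printf '%s\n' "$1" |
        sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

xml_element() {
    # $1 is the element name, $2 is the unescaped text content
    printf '<%s>%s</%s>\n' "$1" "$(xml_escape "$2")" "$1"
}

xml_element name 'Crook Burn & 8 Mile Creek'
```

This only covers text content; attribute values would also need their quote characters escaped.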


Library of Congress flickr experiment

While processing the photos from my parents' ruby wedding anniversary, I ran into the Library of Congress's flickr experiment.



I probably shouldn't have been, but I was astounded. It looks like the bastion of old-school cataloguing is coming to bathe in the fountain of social tagging.



This is part of a larger effort described at http://www.flickr.com/commons/


Tuesday, 26 August 2008

Saxon joy!

I've just moved to saxon from libxml for some XSLT stuff I'm doing, and I'm really loving it.

Not only does saxon take much less memory, it also speaks XSLT 2.0.

Sunday, 10 August 2008

kowhai flowers at waikanae


Tuesday, 5 August 2008

moving back to google reader from bloglines

A couple of months ago I migrated to bloglines from google reader, not because I was necessarily unhappy with google reader, but because I was interested in seeing what else was available and how it might differ. I've just moved back to google reader.

OPML just worked. I was able to move my RSS "reading list" from google reader to bloglines and back again with no fuss, no hassle and no duplication.

The advantages of google reader over bloglines are:
  1. AJAX - whereas bloglines marks all items on a page as read when you browse to it, google reader marks them as read when you scroll past them.
  2. Ordering - google entwines items from all feeds in time order, bloglines presents items feed by feed
  3. Better integration with other services
The advantages of bloglines over google reader are:
  1. Fast scanning of voluminous feeds
  2. Fast browsing (it seems _much_ faster when there are thousands of items)
  3. Less integration with other services
You'll notice that better integration is both a positive and a negative.

The fact that I have several google accounts and only one of them is tied to my RSS reading means that there are tasks I can't multi-task between, even at the coarsest of levels, and also means that contacts from the other google accounts almost never get forwarded articles I discover via RSS.

The fact that my blogger.com account and my google reader accounts magically know about each other is great, as is being able to sign in once to a whole suite of tools.

In the end the reason for changing back was ordering. I read too many RSS feeds that cover the same topic for reading them out of order to make sense.

I've also just culled some of my RSS feeds, with a prime criterion being the quality of their RSS. A number of web comics require one to click a link to read the strip and I no longer read them, but I still read Unshelved, which has the strip (and an ad) in the RSS.

Monday, 4 August 2008

Decent editor for blogger.com?

Can someone recommend a decent replacement for the default editor for blogger.com?

Before it drives me insane...

KDE/Gnome Māori localisation on the rocks?

It looks like Maori localisation has been removed from the KDE 4.0 repository:

stuartyeates@stuartyeates:~/tmp/mi$ svn co svn://anonsvn.kde.org/home/kde/trunk/l10n-kde4/mi/messages
svn: URL 'svn://anonsvn.kde.org/home/kde/trunk/l10n-kde4/mi/messages' doesn't exist
stuartyeates@stuartyeates:~/tmp/mi$ svn co svn://anonsvn.kde.org/home/kde/trunk/l10n-kde4/mi/docmessages
svn: URL 'svn://anonsvn.kde.org/home/kde/trunk/l10n-kde4/mi/docmessages' doesn't exist
stuartyeates@stuartyeates:~/tmp/mi$ svn co svn://anonsvn.kde.org/home/kde/branches/stable/l10n-kde4/mi/messages
svn: URL 'svn://anonsvn.kde.org/home/kde/branches/stable/l10n-kde4/mi/messages' doesn't exist

Things don't look good for the upcoming 4.* releases, with the stats for translation at 0%: http://l10n.kde.org/stats/gui/trunk-kde4/team/

Gnome Māori localisation is not much better: stable at 1%: http://l10n.gnome.org/teams/mi

In the medium/long term there is hope that much of this localisation can be bootstrapped by application-centric localisation that appears to be thriving, particularly with respect to firefox, thunderbird and OOo.

Sunday, 3 August 2008

Leaving catalyst :( joining NZETC :)

Last week I gave notice at my current employer (Catalyst.net.nz) and accepted a job at Victoria University's New Zealand Electronic Text Centre. The NZETC is primarily a TEI/XSLT/Cocoon-house which publishes digital versions of culturally significant works. It also runs a number of other digital services for the university library (into which it is currently being integrated). As such it's significantly closer to what I've been doing previously in terms of environment, content and technology.
Exciting things about the NZETC from my point of view:
The commute to work will be slightly longer, with me either getting off the bus one stop earlier and catching the cablecar up the hill, or getting off at my current stop and walking up. I'm hoping to do mainly the latter.

Wednesday, 9 July 2008

Who should I nominate for the NZ Open Source Awards?

So nominations are open for the New Zealand Open Source Awards and I have to decide who I should nominate. There doesn't seem to be anything stopping me nominating several, but picking one contender and throwing my weight behind them seems like the right thing to do. The ideas I've come up with so far are:

Kiharoa Dear for excellent work in getting firefox, thunderbird and open office working in Māori contexts:

http://kiharoa.dear.maori.nz/

Standards New Zealand for sanity control in the OOXML fiasco:

http://www.standards.co.nz/news/Media+releases/NZ+maintains+negative+vote+on+OOXML+Standard.htm

Hagley Community College for rolling out Ubuntu in a secondary school:

http://computing.hagley.school.nz/about/opensource

Who should I nominate? Is there someone I've missed?

Monday, 30 June 2008

I'm confused about hardy heron and default applications

Back in the day you told your linux system which applications you wanted to use with environment variables, with things like:
export EDITOR=/usr/bin/emacs
Then along came the wonderful debianness of the apt-family and the alternatives system.
update-alternatives --config vi
Now this system too is being undermined by various systems, leaving me uncertain where to set things. What I'm trying to do is:
  • have Sound Juicer and not Music Player (RhythmBox) launched when a CD is inserted. There is an entry for "Multimedia" under the "Preferred Applications" menu option, but this seems to be about opening files, not responding to newly-mounted media, and Sound Juicer is not listed as an option. There doesn't seem to be anything about CDs under the "Removable Drives and Media Preferences" (although this is where the settings are that automatically load F-Spot when I attach my camera, which seems like the same kind of thing).
  • configure which applications I can launch on the .cr2/TIFF/Canon RAW files produced by my digital camera. I want the same applications to appear in both the file browser and F-Spot (which look like they're presenting the same interface but apparently aren't). ufraw seems to be the tool of choice here (either standalone or as a gimp plugin), but I'd like to pass it some command line args. I can find no entry for this under the "Preferred Applications" menu option.
There are lots of menus with a "Help" as an option, but very few of them seem to be.

Mike O'Connor at Friday drinks


Mike O'Connor
Originally uploaded by Stuart Yeates
I took some photos at Friday drinks, trying to do the whole wide-aperture-to-isolate-visual-elements thing. I wasn't really aware of just how much it is dependent on the relative position of the photographer, subject and background.

Some of them turned out better than others.