Author Archives: vhk10

Digital R&D Fund for the Arts Workshop

Innovation Centre, Exeter University, 24 April 2013

I attended this on behalf of a colleague who was thinking of applying. It was one of a number of similar events around the country for this fund, which aims to bring together researchers, arts professionals and technical support. There were about 50 people there from around the South West, with the technical side rather under-represented.

The day mixed presentations with exercises in small groups. We broke the ice by thinking of the stereotypical ideas we entertained about those in the two groups other than our own. Later we examined one another’s proposed projects. Firstly we analysed them using ‘stories’: the ‘story’ story, the ‘people’ story, the ‘platform’ story, the ‘impact’ story and the ‘money’ story. I wasn’t sure that all these headings really lent themselves to being made into ‘stories’ – platforms, for example, are fundamentally non-narrative things – so it didn’t differ much from just considering the bare headings, but it was a useful exercise.

Later our own proposal got pulled apart in a session on ‘user-centred design’ – looking at it through the senses of a prospective user of a certain demographic type. How do they first hear about it? What do they see when they get there? How does it make them feel? What do you want them to say about it? What do they think about afterwards? This exercise got us outside a mindset developers are particularly prone to – because we see the project from the inside, it takes an effort to put ourselves in the position of someone who knows nothing about it.

I talked to quite a few interesting people – someone trying to revive Weymouth’s museum, someone from the American Museum in Bath wanting to attract younger visitors, a representative of a theatre company on Dartmoor. Most people, particularly those who aren’t used to bidding for collaborative projects, seemed to find the day worthwhile.

Freecycle vs Freegle

I’ve used Freecycle/Freegle to pass on quite a few unwanted items. Freecycle is the original network, but in 2009 a number of Freecycle groups migrated to Yahoo and elsewhere and renamed themselves ‘Freegle’, in protest at restrictions imposed on them. New moderators were found for the groups on the Freecycle site, so now there are two systems running in parallel. I haven’t done a detailed comparison of Freecycle and Freegle but they seem to work in a very similar way, even down to the syntax of the messages and the wording of the disclaimers. One advantage of the Freecycle site is that you can withdraw an item which has been claimed; on Freegle you have to send a second message after the first and hope the membership (and the moderators!) connect the two.

It seems that users are showing a preference for the original Freecycle site. As I posted another offer today, I noticed that usage in the Bath Freegle Yahoo group has declined. When it split off in 2009, it was running at over 2500 messages a month. But through 2010 activity declined, and it is now about 1200 messages a month – still a considerable amount. Meanwhile Bath Freecycle has 4,000 more members than Bath Freegle and while I can’t find message stats, if today is typical it has significantly more messages. My experience of using both today is that moderators on Bath Freecycle are quicker to act – my message was moderated and put up within the hour (and the item then claimed), while no message has appeared on Bath Freegle for over 12 hours.

A quick check on the South Gloucestershire equivalent groups shows a similar pattern. The Bristol Freegle group exists but has never really taken off.

These sites are a real window on the sort of stuff people feel a need to offload and I often wonder at the story behind some items. Sadly but rather predictably, the literacy levels of the messages claiming items are often appalling.

I have one beef about the way Freecycle works. I have successfully posted offers in the Bath, West Wilts and South Gloucestershire groups (Bath is near the meeting point of these three counties). But when I tried offering something via the Bristol group I was told I couldn’t because I didn’t live in Bristol. Never mind that I work there and declared that a handover in the city centre or Clifton Triangle area could easily be arranged. This seems perverse and smacks of officious moderators liking to throw their weight around. I think it must have been Freecycle and not Freegle where this happened, for the reason given above.

But the situation remains that having two almost identical networks running in parallel doesn’t really benefit anyone. Whatever the restrictions which caused Freegle to spring up, they are less of a nuisance than the need for users to offer or place a wanted request in two separate places. I’d like to see a merger of the two, and if the current trends continue that will eventually happen.

http://groups.freecycle.org/bathfreecycle/posts/all
http://groups.yahoo.com/group/BathFreegle/

discouraging large email attachments

With the move to Google Mail, I am more conscious of how much space my email takes up, because the total is displayed at the foot of the screen. (In the version for current browsers it is given to the nearest 100 MB, but in the version for older browsers you get told to the nearest megabyte or so!) Along with this, I’ve noticed a regrettable increase in emails with large attachments which could have been avoided. Examples I have received:

  • Senders who think an email is a blank message with a large Word document attached
  • New parents sharing photos of their newborns with the whole of their address book
  • Choir people circulating flyers for their concert or other people’s to the choir mailing list
  • ‘I took this photo and thought you would like to see it’, such as a photo illustrating the large nose that runs in the family, and a photo of a cloud shaped like a pink elephant.

This is anti-social because it fills mailboxes, possibly preventing the recipient from receiving other mail; it could cost mobile phone users money to download; and it slows down the Internet for everyone, especially if the attachment is sent to a large number of people. (I estimate one such message distributed about 700MB of data in total!) And it is counter-productive, because some mailers refuse to deliver large messages, or even flag them as spam, which could cause the sender to be blacklisted.

There are usually simple alternatives:

  • uploading photographs on the internet and sending a link. (For example, Facebook albums have a URL that can be shared with anyone, not just a Facebook ‘friend’)
  • putting a link to a website with the flyer on
  • putting text into a message body, not an attachment
  • uploading a file to a sharing site such as Dropbox or Wuala

I have tried politely requesting that people use these alternatives, but the message doesn’t usually get across. It tends to inspire replies such as ‘I don’t know how to put a photo on Facebook’ (So how did your profile picture get there?), or ‘It’s too complicated’ (Cutting and pasting the URI of an image on a choir website? Really?)

No, a sterner approach is needed. I have written a message which I will send in reply to some of the unnecessary large emails that I get. It does not have a personal signature and says that it comes via the email client, talking about the recipient of the original message as ‘he/she’. This gives the impression of an automatically generated message. (True, my name will still appear in the header, but the sort of person who can’t upload an image or cite a URI won’t pick up on that.) It warns that the recipient may not read it straight away, requests a URL instead and links to a page about netiquette which makes the same point as this article does. I will send this as a response to some of the more outrageous examples of huge attachments (bearing in mind the circumstances under which the email was sent), and to persistent offenders, and see what happens. Of course, in an ideal world this effort will have been wasted and I won’t get a chance to send the message!

(This approach was inspired by my experience with hackers on a website I ran. Attempts to inject nasty content into search scripts were met with a message saying ‘Sorry, we can’t display this page. This has been recorded in the logs and will be investigated’. Not too nasty, just in case a bona fide user managed to do something to produce the message, but firm enough to show that we’d taken notice.)
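The idea of a firm but polite rejection can be sketched in a few lines of Python. This is only an illustration, not the code from the site in question: the allowed character set, the 100-character limit, the logger name and the `handle_search` and `run_search` names are all assumptions made for the example. A check like this is also no substitute for escaping or parameterising whatever the search script does with the query.

```python
import logging
import re

logger = logging.getLogger("search")  # hypothetical logger name

# Assumption: legitimate queries are short runs of letters, digits,
# spaces and a little punctuation. Anything else is treated as suspect.
SAFE_QUERY = re.compile(r"[\w\s'.,-]{1,100}")

REJECTION_MESSAGE = (
    "Sorry, we can't display this page. "
    "This has been recorded in the logs and will be investigated."
)


def handle_search(query, run_search):
    """Run the search for ordinary queries; log and refuse anything else."""
    if not SAFE_QUERY.fullmatch(query):
        # Record the raw query so that it really can be investigated later.
        logger.warning("Rejected search query: %r", query)
        return REJECTION_MESSAGE
    return run_search(query)
```

A bona fide user who trips the check sees the same message as an attacker, which is why the wording stays civil.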

Leaves, snow and birds: real-time updates and user content

I’ve been using three sites which crowdsource information about the natural world and put it out again:

  • Leaf Watch (Disclaimer: this was developed by some colleagues): collects data submitted from a phone to monitor the extent of leaf miner moth infestation in horse chestnut trees
  • UK Snow Map: collects tweets with the #uksnow hashtag to build up a picture of where snow is falling
  • RSPB’s Big Garden Birdwatch: collects data on birds seen in parks and gardens during an hour in one specific weekend

By their nature, Leaf Watch and the Big Garden Birdwatch take time to build up a picture of the data they record. Leaf Watch periodically publishes a map of aggregate data from its survey. The Big Garden Birdwatch publishes its results a couple of months after its survey, which is then repeated the following year.

Only the UK snow map attempts real-time display of information. Naturally enough you want to know where snow is falling NOW – how near to you is it?

The sites also vary in how much they allow users to submit their own content. Leaf Watch doesn’t do this at all. The Big Garden Birdwatch has a ‘Community Group’ forum where people who have signed up to the site can crow (pun intended!) about the birds they’ve seen. Posts appear to be reactively moderated.

The UK Snow Map has a live stream of tweets with the hashtag scrolling beside the map. And this is a problem. I’m not that prudish, but many people seem to be unable to refrain from using obscene language even when tweeting about the weather! Actually I wonder if some people make a point of using it, precisely because they know their tweet will appear on screens across the country. Because of this I can’t recommend the site to my children. (There is a facility for only displaying tweets with a positive rating, but you have to set it every time you start up the page and it doesn’t get rid of all the rude ones.)

It would be much better if there were some sort of filter which ensured that tweets with potentially offensive words in didn’t get displayed. The ‘Scunthorpe problem’, that of censoring innocuous messages because of a rude string of letters, doesn’t really arise because it doesn’t matter whether any particular tweet is displayed or not. It’s still possible to be obscene without using rude words, of course, but I suspect that the Twitter stream would be cleaned up a lot.
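A filter of this kind could be very small. The sketch below is in Python and makes no claim about how the UK Snow Map is actually built; the blocklist is a tame, made-up stand-in for a real one, and the function names are hypothetical. It deliberately matches substrings rather than whole words, accepting the occasional ‘Scunthorpe’ casualty for the reason given above.

```python
# Hypothetical blocklist; a real one would be longer and maintained by the site.
BLOCKED_STRINGS = {"damn", "bloody"}


def is_displayable(tweet, blocked=BLOCKED_STRINGS):
    """Return False if any blocked string occurs anywhere in the tweet."""
    lowered = tweet.lower()
    # Substring matching over-blocks innocent words on purpose:
    # losing a few harmless tweets does not matter for a weather map.
    return not any(word in lowered for word in blocked)


def filter_stream(tweets):
    """Keep only the tweets that pass the blocklist check."""
    return [tweet for tweet in tweets if is_displayable(tweet)]
```

For example, `filter_stream(["Snow in Bath 3/10 #uksnow", "Bloody freezing #uksnow"])` keeps only the first tweet.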

Moving to Google Mail

A couple of months ago IT Services moved to Google Mail, ahead of the rest of Bristol University doing so later in 2013. This has proved controversial but I’ve been on Google Mail long enough now to have collected some thoughts about it.

Rather to my surprise, I’ve enjoyed the change and prefer Google Mail to pine or the Mulberry or Thunderbird software clients which we used to use before. Mulberry hadn’t been updated in years, Thunderbird had fewer features and pine lacks a properly graphical interface, a real nuisance as messages come increasingly with embedded images.

What I love about Google Mail is that being Google it is brilliant at searching. Do I want all the messages over a certain size from one particular source? Do I want all the messages that mention Manchester which I’ve received in the last week? I can get these easily and quickly.

I have a few gripes. One is that you often have to scroll up or down to do commonplace actions such as sending a message or viewing the subject field of a message. The facility for doing these should be on the page at all times.

Being technically minded, I like noticing the ‘space used’ figure at the bottom of each page, and take pride in keeping it low by clearing out old email as the new stuff comes in. A peculiarity is that if you use the ‘old browser’ version of Google Mail, you get to see this figure in much more detail – on the newer one it is rounded down and displays to only .1 GB, and I can’t find any way of getting the more detailed figure.

Word to HTML – the Google way

I’ve been given a group of Word documents to turn into a Web site. As they are long and contain a lot of internal formatting I wanted to convert them from Word into  HTML before working on them, rather than saving the content as raw text and putting all the formatting back in.

This used not to be practical. It is possible using ‘Save As’ within Word, but the results are so stuffed with unwanted HTML that in practice it is not worth doing, because the resulting file is almost impossible to edit further. But there’s now an alternative way of doing it, using Google Docs, or Google Drive as you are now supposed to call it. Well, you know what I mean – the part of Google’s bid for world domination that consists of looking after your files for you.

Put your Word document in there, select it by checking the box by the name, then pull down ‘More’ from the header bar and select ‘Open With… Google Drive Viewer’. (Some Word docs appear on Google already automatically viewable with Google Drive Viewer, without your needing to select it.) You’ll see something that looks pretty much like your Word document with the formatting in. Select ‘File… Download As… Web Page’. This generates a zipped file with the suffix .html.

Using this method I get something with paragraphs, hyperlinks, headings etc. but there’s still quite a bit of work ahead before they are in a form I find acceptable.

Firstly, Google’s converter imposes its own stylesheet with all sorts of styles known only to itself. There’s a whole bunch of style information near the beginning which can be discarded and replaced with your own stylesheet. The rest of the HTML is littered with class="c12" and similar attributes. Sometimes Google’s styles join forces, and you get something like class="c1 c9 c11". Fortunately some Perl one-liners get rid of these quite easily.
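For anyone who would rather not use Perl, the same clean-up can be sketched in Python. This is not the one-liner actually used here, just an equivalent under two assumptions: that the generated classes always look like a ‘c’ followed by digits, and that the unwanted style information sits in a single style element. The function name is made up for the example.

```python
import re


def strip_generated_styles(html):
    """Remove the embedded stylesheet and the generated c-number classes."""
    # Drop the whole <style>...</style> block near the top of the file.
    html = re.sub(r"<style\b[^>]*>.*?</style>", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    # Drop class attributes made up only of generated names,
    # e.g. class="c12" or class="c1 c9 c11".
    html = re.sub(r'\s+class="c\d+(?:\s+c\d+)*"', "", html)
    return html
```

Class attributes containing anything other than these generated names are left alone, so a hand-written class would survive.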

Then there’s our friend the line break. This time I actually want some in the HTML source, to make the text more readable, but if I’m to get them I need to put them in myself after headings and paragraphs. Again, Perl one-liners help. The process shows up a few ghost headings of the form <h4> </h4> which can easily be deleted, and ghost paragraphs <p></p> which also serve no purpose.  Every so often a whole paragraph, or part of a paragraph, will leap out in <h3> or some other heading tag for no particular reason.  Perhaps some deleted formatting in the original document is being picked up?
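A rough Python equivalent of this tidying step is sketched below; the function name is hypothetical and the patterns assume the ghost elements contain nothing but whitespace or non-breaking spaces. Stray paragraphs that turn up inside heading tags still have to be found by eye.

```python
import re


def tidy_blocks(html):
    """Delete empty headings and paragraphs, then break lines after blocks."""
    # Ghost headings such as <h4> </h4> (the \1 matches the same heading level).
    html = re.sub(r"<(h[1-6])\b[^>]*>(?:\s|&nbsp;)*</\1>", "", html)
    # Ghost paragraphs such as <p></p>.
    html = re.sub(r"<p\b[^>]*>(?:\s|&nbsp;)*</p>", "", html)
    # Add a newline after each closing heading or paragraph tag,
    # unless one is already there.
    html = re.sub(r"(</(?:h[1-6]|p)>)(?!\n)", r"\1\n", html)
    return html
```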

The HTML converter has particular problems with hyperlinks. They tend to come out duplicated, one of the hyperlinks not enclosing any text; the closing </a> is also often not correctly positioned. For good measure, sometimes an unscheduled paragraph break gets thrown into the mix. A reliable pointer to duplication is the non-breaking space &nbsp; which usually marks a place where Google’s converter hasn’t really understood what’s going on. Even so, it would be a real pain to put all the links in by hand, so I prefer Google’s conversion as the lesser of two evils.
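Part of this can be automated. The Python sketch below (hypothetical function names, not the method used for this conversion) removes links that enclose no text and lists the lines containing a non-breaking space so they can be checked by hand. One caveat: it also removes empty named anchors, which may be wanted as targets for internal links, and it does nothing about a misplaced closing tag.

```python
import re

# A link whose content is empty, whitespace or non-breaking spaces only.
EMPTY_LINK = re.compile(r"<a\b[^>]*>(?:\s|&nbsp;)*</a>")


def drop_empty_links(html):
    """Remove hyperlinks that do not enclose any text."""
    return EMPTY_LINK.sub("", html)


def lines_with_nbsp(html):
    """Return the 1-based numbers of lines that contain &nbsp;."""
    return [number
            for number, line in enumerate(html.splitlines(), start=1)
            if "&nbsp;" in line]
```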

Lists really cause problems for the converter.  A panoply of ordered lists is generated, although many of them are on closer inspection lists containing only one item, because the next item in the list starts a new list.  In fact the original document contained only unordered lists, apart from a few which were ordered by letter (which the converter understands and handles correctly).

As well as adding unnecessary HTML, the conversion process also removes some formatting which I have had to replace, such as emphasis (used in quotations and some headings in the original document).

But I should end on a positive note. There were a number of tables containing text, something which Word-to-HTML conversion used to mangle spectacularly. These now come out correctly, and while they don’t automatically have borders (which become a box round the text inside), a simple global change using Perl puts borders back in.
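One way of making such a global change, sketched in Python rather than Perl, is to add a border attribute to every table tag that lacks one. Whether the original change used the attribute or a stylesheet rule is not recorded here, so treat this as an assumption; the attribute is obsolete in current HTML and a CSS rule in your own stylesheet would be the tidier route. The function name is made up.

```python
import re


def add_table_borders(html):
    """Add border="1" to each <table> tag that has no border attribute."""
    # The lookahead skips tags that already carry a border attribute;
    # any other attributes are captured and kept after the new one.
    return re.sub(r"<table\b(?![^>]*\bborder=)([^>]*)>",
                  r'<table border="1"\1>', html)
```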

So do I recommend using Google’s HTML conversion tool?  It boils down to one word: hyperlinks.  If you have more than a handful of these, it is worth converting the document to HTML, because the risk of creating an error if you cut and paste a link incorrectly outweighs the inconvenience of having formatting incorrectly rendered.  Incorrect formatting can easily be spotted when you look at the resulting page, but a mistake in a hyperlink can only be found by trying to follow it. And it knocks spots off Word’s own HTML conversion.

screen editors, good and bad

I’ve been editing some pages on a local CMS, and using the inbuilt screen editor (Zope’s). But there are some things about it that are really annoying and outdated.  To get my irritation off my chest, here are some of them:

  • No mass search-and-replace (I have been cutting and pasting the content into a Notepad window and editing it there!)
  • No way of searching all files in a directory together.
  • The editing window doesn’t wrap text.
  • When you save, you are taken back up to the top of the window.  This is a deterrent to regularly saving your work.
  • The HTML is validated when you save it, and you are told the line and column number of errors, but there is no quick way of navigating to (for example) line 200.
  • When you close a document after making unsaved changes, you are not asked whether you want to save them.

On the subject of editors, I usually use emacs when I can.  In a previous job I used joe, a really tough little editor that didn’t baulk at big files.

Forums: when to have one?

I’m currently on the steering group to review and redevelop the website for my children’s school. One of the issues raised has been whether we want to have a parents’ forum on the site. This led me to think about the kinds of situations in which forums are useful, and when they serve no purpose or are even harmful.

I’m not a great one for online forums, but I do contribute from time to time to one.  It’s a community of people who have a common interest, for some a professional one.  We have between us a spectrum of strongly held views on certain topics, which sometimes get argued over on the site, though many contributions are not contentious in any way.

It’s ‘reactively moderated’, in other words posts go online at once, but a post which breaks guidelines will be withdrawn or edited by a moderator of the list. (When I joined, my first few posts were viewed by a moderator before being published, before I was deemed to be trustworthy. In those days the forum was on a different platform though.)  Editing by a moderator doesn’t actually happen very often; the usual reason is that something unacceptable has been said about a particular person (their precise date of birth, or the reason why they didn’t get a certain job, for example).

Perhaps it’s that this particular topic attracts reasonable, restrained people, but arguments rarely get beyond polite and reasoned disagreement and the system has worked well to date.

But a forum for the school would be different. We see one another regularly face to face and any unpleasantness which broke out (e.g. because of an allegation of bullying, suspicion someone was having an affair or a complaint about a particular teacher) could be seen by many before a moderator had a chance to remove it, which could be very damaging. So reactive moderation wouldn’t be enough; each post would have to be seen by an editor and approved, which would slow discussion down, especially if the editors didn’t log on very often (which parents at the school tend not to).

In general, forums don’t take off when the members meet frequently in person or have other means of discussing matters of interest such as a shared email list.  We tried one at work, and it never took off for this reason.  They work best when there is a scattered group of people with a shared but unusual interest.

the NHS and mobile devices

I’ve been looking at the website of my local hospital.  It has become extensive and provides very detailed information about what the hospital does.  In fact in some cases, such as the hospital’s family history programmes for certain cancers, the Web page is the only public source of information that exists.  (Whether a Web page should be the only source of information for such vital services is a can of worms that I won’t open here.)

But the website appears to have been designed with the browsing habits of a few years ago in mind.  One issue stands out: an assumption that the person viewing it will be doing so on something the size of a desktop PC.

Let’s put ourselves in the position of someone wanting to get some information about what the local NHS provides.  It’s really quite likely that they might not want others in their household to find out which pages they’ve been looking at.  In this situation, they’re likely to consult the site on their own mobile phone or other handheld device rather than on a shared PC or laptop.

However, the hospital web pages do not have this sort of accessibility.  They are long and image-heavy and key information is often deeply buried (not a good idea, whatever hardware the visitor is using).  For example, a list of risk factors for one common and deadly disease is only reachable on a 17-page PDF which you first have to download and then scroll several pages into.  No thought has been given to the anxious person who might be accessing the page on a mobile phone with a small screen and who pays by the megabyte for everything they download.

I have the impression that the website was developed a few years ago when handheld devices were not routinely used to access the Web.  By the time the site was ready, the world had moved on and it no longer fully met the needs of the public (something possibly true of other aspects of the NHS too?)

a Potemkin village on the Web

I have been trawling round lots of UK university websites, principally those of the Russell Group, this week. There’s a rather discouraging uniformity to many of them and I felt that the sight of another group of cheerful students clustered round a laptop would make me feel positively ill. A surprising number of the sites tucked the name of the institution away so that it wasn’t easy to find. Some deliberately broke the mould by displaying a main picture of far-flung research, such as a graduate student standing on a glacier in New Zealand. A common trait was trumpeting a high position in a league table of universities compiled by a national newspaper or survey or (where possible) in the ‘World’s Top 100’ universities. There are so many tables and categories that there’s likely to be one to suit any reasonably good institution!

As regards look and feel, I liked Birmingham’s the best, though the one I browsed around most for pleasure was Liverpool’s, which had links to news releases and blog posts by its staff. (One of which admittedly said ‘I haven’t worked out how to write a blog without getting into trouble!’)

However what I was really after was information on research data management. I used this as a search term in the web sites of Russell Group universities and a few others, with interesting results.  I’ll share here a couple of things I found which won’t make it into my official report.  One was the Russell Group university which seems to equate research data management with using EndNote(!), at least to judge by the references to it on their site.

I was at first more impressed with the ‘Research Data Management homepage’ of another university (not in the Russell Group), which bristled with links, and I noted ‘This university means business!’ I returned to the page later and started to follow the links. Almost all of them led you away to other sites such as the DCC’s or other parts of the University. The ‘news’ was all culled from elsewhere; the most recent news item was two months old and related to a court case involving a Hollywood actor! The blog had a single posting, over three months old. A ‘Learn More’ link just took you back to the same page. Some dummy text could still be found on some of the pages. There was some generalised information about research data on the home page, but nothing specific about the University’s own provision, such as whether it had a research data management policy or a dedicated repository. I imagine that this site is intended mainly as a placemarker until more detailed information becomes available, but in its own way it does a very impressive job of disguising its absence – the Web equivalent of a Potemkin village, or of the lengthy football report I once read which concealed the fact that the two matches the local team were due to play that week had both been postponed.