Spelling Bee (Input from the Hive Mind)

For a researcher, curiosity is usually an advantage: in fact, it’s probably a survival characteristic. However, it’s sometimes a curse.

Somehow, after starting off the day by reading a fascinating novel by Alena Graedon called The Word Exchange, I’ve found myself reading post after post about linguistics, psycholinguistics and lexicography, which are always useful background for an editor (one of the less public hats I wear at ESET), but not something that I’d normally write about in a security blog.

I rarely review novels and, for reasons of time management, probably won’t make an exception for Graedon’s. However, her book belongs to a tradition of fictional writing that could be said to encompass Orwell’s 1984, several stories by Borges, Daniel Keyes’ Flowers for Algernon, Neal Stephenson’s Snow Crash and Carroll’s Alice in Wonderland, as well as more diffuse influences such as the writings of Hegel and Dawkins. That not only explains my personal interest in it, but also indicates that it covers quite a few themes that I’ve touched on in my own security-oriented writing.

While I tend to respond negatively to stories where a physical virus is transmitted electronically, on this occasion the metaphor worked well enough. Though if I quote the phrase, “Memes kill! Stop the spread of word flu!” I’ll need to point out that the ‘meme’ in this instance is one of a range of devices more advanced (especially in terms of interfacing with the user) but not so different in concept to the portable devices to which so many of us have become quasi-umbilically connected in recent years.

The aphasia that afflicts so many victims in the book does seem to tap into contemporary fears of loss of faculty (dementia and related conditions), and we are increasingly aware of conditions involving impaired focus, memory and empathy that may be associated with overreliance on technology as an aide-memoire, for entertainment, and for interpersonal communication that might formerly have been face-to-face. And certainly if you’re worried about how much operating system and service providers know about you, and about their attempts to predict and manipulate your actions and preferences, this book is probably not going to comfort you.

It’s a bit of a stretch from aphasia in fiction to the analysis of text for forensic purposes, but there is a linguistic (and editing) link to a couple of blog comments I noticed, appended to a recent article by Graham Cluley. The article refers to attempts to defraud customers of the UK’s Metro Bank via a fake Twitter account. One comment makes the point that one of the bogus tweets

“…spells apologise ‘apologize’ - which while it is accepted in English proper - it isn’t the norm (and you probably wouldn’t expect a UK bank to use -ize instead of -ise).”

While the commenter makes some good points, I have to differ on this one. It’s widely assumed in the US that the English usually prefer -ise where Americans use -ize. Since I responded directly to that point, I’ll just reproduce my response here:

I can’t agree that ‘apologize’ is a significant indicator in this instance: it’s actually the more traditional spelling and ‘correct’ enough to be preferred by the Oxford and Cambridge University Presses, Fowler, Hart’s Rules and so on, being derived from the Greek. Not to mention professional pedants like me. There’s enough confusion about what is ‘correct’, though, that I wouldn’t necessarily even see - for example - ‘advertize’ as an indicator, even though ‘advertize’ is incorrect in UK and US English, since it’s so widely (mis-)used. I’d definitely be suspicious if I saw ‘analyze’ from a UK source, though. :)

You may think this is a somewhat academic point, of interest only to those of us who have spent much of our authorial and editorial careers mediating between ‘two countries separated by a common language’. However, there’s a serious point here. Attributing malice to a particular region is difficult enough in terms of analysing (or analyzing!) malware. Spelling and grammar represent an even more dubious foundation. Some versions of Word’s spellchecker have irritated me by trying to impose the ‘-ise’ suffix inappropriately, while some copy-editors have infuriated me by assuming that automated spell- and grammar-checkers are authoritative in this respect (and others).

In this example ‘apologize’ is not necessarily an error at all, and a less ambiguously US-centric spelling like ‘behavior’ could be introduced by software, or editorial policy, especially in a global organization. While Metro’s origins are in the UK, there are obviously other institutions with a far wider spread, not all of them headquartered in the UK. Nor would I dare to second-guess any company’s policy on quirks of language based on region. Americanized English is very common in the UK nowadays.

I agree wholeheartedly with the commenter’s recommendation that detail should be considered, but sometimes – in fraud as in the analysis of poetry – interpretation is based on something that just isn’t there. I’m certainly not saying that language can’t be an indicator of something awry. Poor spelling and grammar or ‘foreign’ phrasing are often a characteristic of phishing and related scams, though perhaps less so than in the past.

Indeed, another commenter points out the absence of a full stop or colon in the phrasing “we apologize for the convenience please go to…” Fair comment; though again there’s scope for alternative punctuation, regionally speaking. But (wearing my editor hat again), even where there’s a rigorous editorial process, errors will creep in. Nowadays, sheer data glut (we see more words and read fewer) and the erosion of national borders, at least linguistically, probably mean that editorial rigour is less common than it used to be, certainly in the context of Twitter, with its hard-to-avoid limitation on character count.

It’s probably worth reiterating, though, that “Pidgin English or poor spelling is suspicious, but impeccable presentation doesn’t prove legitimacy” and that there are other cues and clues to take into account, some of which may be more reliable.
