Blogs make for appealing projects for students: the subject matter can be interesting, the styles are lively and personal, and the data couldn't be easier to collect (no taping and transcription). But from my experience with BA and MA students, there are some practical problems to consider, and these practical problems raise some interesting issues for discourse analysis in general.
Here are some topics studied by students at my university.
Comparisons of news blogs to news reports – e.g. Iraq, Katrina, a demonstration
Uses of narrative (tense, evaluation, reported speech) in personal diaries – e.g., Bitch PhD, dooce
Attempts of politicians to find an ethos for blogging, Facebook, and other on-line forms of presence – e.g. US Presidential candidates
Uses of informality, for instance colloquialisms and typos in comments
Community building (solidarity, banter, shared assumptions) – e.g., anorexics, soldiers, fans
Evaluative language in specialist blogs – e.g., cookery blogs
Language choice and code-switching (English no longer dominates)
There are also language issues to study in other recent innovations in Web 2.0, such as in the comments on YouTube and on photo sharing sites.
Blogs provide a vast source of data already in electronic form, so it is easy to download material, save it as text, and use concordancing tools. But there are some theoretically interesting practical problems:
As we have seen in my studies, blogs are hard to sample. There is no ‘representative sample’, so one usually has to explain a theoretically-motivated sample, as I did. One can choose the most popular, or blogs linked to each other, or blogs in some unusual form or style, or blogs on a topic.
Students always ask how much text they need, and there is no right answer. For my chapters, I tended to go for about 10,000 words from each of ten blogs. If I wanted to make a statistical argument contrasting blogs and posts, or one kind of blog and another, I would need a much larger sample. These corpora are easy enough to collect by cutting and pasting, but students are likely to blanch at the thought of qualitatively analysing 100,000 words, the length of an academic monograph.
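For students who want to check how close they are to a target such as 10,000 words per blog, a few lines of Python will do the counting. This is only a minimal sketch under stated assumptions: the posts are already saved as plain-text strings, a "word" is a run of letters or digits (with internal apostrophes or hyphens), and the names count_words and sample_posts are hypothetical, chosen for illustration. Corpus software may count words slightly differently.

```python
import re

# Assumption: a "word" is a run of letters/digits, optionally joined by
# an internal apostrophe or hyphen (so "it's" and "well-known" count once).
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:['’-][A-Za-z0-9]+)*")


def count_words(text):
    """Count word tokens in a plain-text string."""
    return len(WORD_RE.findall(text))


def sample_posts(posts, target=10_000):
    """Take whole posts, in the order given, until the running
    word count reaches the target. Returns (chosen_posts, total_words)."""
    chosen, total = [], 0
    for post in posts:
        if total >= target:
            break
        chosen.append(post)
        total += count_words(post)
    return chosen, total


if __name__ == "__main__":
    posts = ["one two three", "it's a well-known blog", "five six"]
    chosen, total = sample_posts(posts, target=5)
    print(len(chosen), "posts,", total, "words")
```

Taking whole posts rather than cutting at exactly the target keeps each text intact for qualitative analysis, at the cost of overshooting the target slightly.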
Students have raised the issue of just what they should cut and paste. It can be hard to collect the comments as well as the posts, because one usually has to follow the permanent links for each post, but I have shown in Chapters 4 and 7 that they can be very different kinds of texts (for instance, one is likely to find a lot more evaluative comments in posts). If one followed up trackbacks and links, one would be in for a couple of days of copying and pasting, and a lot of hard choices, rather than just an hour or two.
I copy all my texts as RTF files into qualitative analysis software, in my case ATLAS.ti. One nice effect of this conversion is that the links show up with their URLs, so I can tell what they are linking to. It looks messy, but it makes some kinds of analysis easier. Others analyse the texts with corpus software such as WordSmith Tools. For that, one needs text-only format.
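Those preparing text-only files can approximate the same effect, with each link's URL visible beside its text, when converting saved web pages. What follows is a minimal sketch in Python using only the standard library. It assumes the posts have been saved as HTML, and LinkKeepingTextExtractor and html_to_text are hypothetical names chosen for illustration. It is not how any of the packages mentioned above work, and real blog pages would need extra care to leave out sidebars and navigation.

```python
from html.parser import HTMLParser


class LinkKeepingTextExtractor(HTMLParser):
    """Collect visible text, writing each link's URL in angle
    brackets straight after its anchor text."""

    SKIP = {"script", "style"}          # content that is never visible text
    BREAKS = {"p", "br", "div"}         # tags that start a new line

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.href_stack = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag == "a":
            self.href_stack.append(dict(attrs).get("href"))
        elif tag in self.BREAKS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag == "a" and self.href_stack:
            href = self.href_stack.pop()
            if href:
                self.parts.append(f" <{href}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the plain text of an HTML string with link URLs kept."""
    parser = LinkKeepingTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


if __name__ == "__main__":
    sample = '<p>See <a href="http://example.org/">this post</a> today.</p>'
    print(html_to_text(sample))
```

The angle brackets are an arbitrary choice. Any marker will do, so long as it is easy to search for and to strip out before counting words.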
There are also ethical issues in collecting these data. I have chosen only popular blogs, where the authors obviously expected to have their words read by the widest possible audience, so I have not worried about asking their permission. But with more private blogs (for instance those in a support group for a medical condition, or those used by political dissidents), there could be serious issues of permission and confidentiality.
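Once the texts are in plain-text form, students without access to corpus software can still try out the basic idea of a concordance. The sketch below, in Python, prints each occurrence of a word with a little context on either side (a "keyword in context" display). It is an illustration under simple assumptions, not a substitute for a proper concordancer: the function name kwic and the width parameter are hypothetical, matching is case-insensitive and on whole words only, and context is measured in characters rather than words.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context lines: left context, the keyword
    in square brackets, then right context."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    flat = " ".join(text.split())  # collapse line breaks and runs of spaces
    lines = []
    for m in pattern.finditer(flat):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines


if __name__ == "__main__":
    sample = "I really love this recipe. Really, it is great."
    for line in kwic(sample, "really", width=15):
        print(line)
```

Lining the keyword up in a column makes it easier to scan for patterns, for instance which words tend to follow an evaluative adverb.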
As I have worked on this book, I have found that, not surprisingly, many of the most useful resources are online. Some of the main researchers on blogs have pages that link to many papers; these are particularly useful in starting students off on their reading. I have also listed some web resources, such as Technorati, the search engine and ranking tool, and Data Mining, which experiments with visualisations. My own blog is not updated frequently enough, but has some resources and comments as I work on the book.
my blog: http://thelanguageofblogs.typepad.com/
Technorati – the most-used search engine for blogs - http://www.technorati.com/
Global Voices – a carefully-edited directory translating and summarizing blogs from around the world – a good source for finding well-written blogs that give a non-US perspective - http://www.globalvoicesonline.org/
Data Mining: Text Mining, Visualisation, and Social Media – useful visualizations of the blogosphere by a Microsoft researcher - http://datamining.typepad.com/data_mining/graphs/index.html
Blog Herald – news on blogs, mostly business-related - http://www.blogherald.com/
The Sum of My Parts – the home page of Stephanie Hendricks, who is completing a PhD on blogs - http://www.sumofmyparts.org/blog/
Rebecca’s Pocket – Rebecca Blood’s blog links to lots of commentary on blogs - http://www.rebeccablood.net/
Jill Walker – another academic blog by a pioneer of blogging - http://jilltxt.net/
On the Media – the weekly radio programme from WNYC (New York public radio) regularly covers blogs and new media, and has podcasts and transcripts - http://www.onthemedia.org/
Journal of Computer-Mediated Communication – currently the best source of academic articles on blogs - http://jcmc.indiana.edu/
Bruns, A. and J. Jacobs, Eds. (2006). Uses of Blogs. New York, Peter Lang.