Topic-based variation and expat dialect change

My previous post on dialect change in my expat colleague showed a lot of variation, but the analysis revealed no clear patterns based on how long he’d been away from Scotland. This post starts on the second part of the analysis: is the variation constrained by the topic of conversation? That analysis is easier said than done…

The initial hypothesis when I started the project was that my colleague’s accent would become less Scottish the longer he had lived in the Netherlands. But as we did the recordings, my casual observation was that there was no real change. My observations were based on salient and easily spotted variables such as (ing), where indeed there was no real change, and (t), where my monitoring turns out to have been a bit less precise. Anyway, I decided to also offer the students a lot of reading on other reasons for intra-individual variation (addressee, audience, topic and referee design) and to ask them to look for topic-based variation in their research papers.

Topic-based variation. The main paper for inspiration here, and definitely a contender for Best. Sociolinguistics. Paper. Ever., is John Rickford and Faye McNair-Knox’s “Addressee- and topic-influenced style shift” (1994). For this study, they recorded several interviews with a single speaker, a young African American woman with the pseudonym Foxy Boston, but with different interviewers. Perhaps unsurprisingly, the woman used more African American English features when she was interviewed by an African American interviewer, and far fewer of these features when the interviewer was a European American. (This is the addressee-influenced style shift.) But there were also topic effects: there were more African American English features when they talked about friends, relationships and youth culture, and fewer when the topic was something like school or career plans. Interestingly, these effects remained constant regardless of the ethnicity of the interviewer!

(Side note: The Rickford & McNair-Knox paper is in a book and therefore unlinkable, but there’s a follow-up 20 years later by Rickford and Mackenzie Price that is also really quite excellent.)

What underlies topic-influenced style shift is the idea that each topic is associated with a social setting: a particular place, particular interlocutors, particular expectations about linguistic behaviour that then influence your speech even if you’re not in that setting. For Foxy Boston, talking about friends activates associations with those friends, who also happened to be predominantly African American and with whom African American English was the prime language of communication. The topic of education means the image of the school is activated, where you’re expected to speak Standard English and where African American English may be stigmatised. Those associations carried over to her speech in the interviews, leading to her different frequencies of African American English features for different topics.

Coding for topic. The difficult thing with topic-based variation is that you need to devise a coding scheme for topic. Six pairs of students actually looked at topic-influenced variation for their variable, and their lists of topics are in the table below. (Note that the team numbers do not match the team numbers in the previous post.)

Team 1: domestic, abroad, Dutch, education, work, other
Team 2: Scotland, politics, Netherlands, education
Team 3: informal, neutral, formal
Team 4: travels, meta, environment, academia, politics, childhood, Netherlands, Scotland, personal life, hobbies
Team 5: band, Scotland, Japan, Netherlands, teaching, UK, research
Team 6: book, Christmas, Dutch, earthquakes, EU, food, holiday, identity, miscellaneous, literature, metric system, personal, politics, research, science, Sinterklaas, teaching, travel, writing

Oh dear. There’s a lot of variation there. Team 6 have twenty-three topics on their list, Team 2 have only four. Team 3 just went for three levels of formality, more on which later. A few topics recur across teams, like the Netherlands, Scotland, and education (with a subdivision into teaching and research), but there are also topics that occur only once. Some of these topics really do occur only once across the conversations, like a very brief discussion about the metric system. That means it’s probably not the best choice as a category for topic-based variation across conversations… Team 5’s “band” does occur in more than one conversation, but it’s a lot more specific than their other categories. Tricky business, these topics.

And it gets more complicated. Going off our association game, we’d expect “Scotland” to produce more informal Scottish features but “teaching” to give more formal (and less stereotypically Scottish) features. But what would the expectation be if we’re talking about teaching Scottish literature? This is what led to Team 3’s coding scheme: if they had clear expectations about whether the speaker would use more formal or more informal language, the token would be coded as such, but if there was a conflict, it would be coded as “neutral”. Intuitively, I’d say there’s a bit too much researcher interpretation here. (Team 3’s results, by the way, patterned “formal” and “neutral” as pretty much identical, but “informal” was different. For what it’s worth.)
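To make Team 3’s rule concrete, here is a rough Python sketch of how I read it. The mapping from topic to expected register is my own illustration: only “Scotland” (informal) and “teaching” (formal) follow the expectations mentioned above, and treating a token with no expectation at all as “neutral” is an assumption on my part, not something the team specified.

```python
# Hypothetical mapping from a topic tag to the register we expect it to evoke.
# Only these two entries follow the expectations described in the post.
EXPECTED_REGISTER = {
    "Scotland": "informal",
    "teaching": "formal",
}


def code_formality(tags):
    """Return 'formal', 'informal' or 'neutral' for one token's topic tags."""
    # Collect the distinct expectations raised by the tags we have a view on.
    expectations = {EXPECTED_REGISTER[t] for t in tags if t in EXPECTED_REGISTER}
    if len(expectations) == 1:
        # All relevant tags point the same way: a clear expectation.
        return next(iter(expectations))
    # Conflicting expectations (or, by assumption, none at all).
    return "neutral"


print(code_formality({"Scotland"}))                            # clear expectation
print(code_formality({"Scotland", "teaching", "literature"}))  # conflict
```

Written out like this, the worry about researcher interpretation is easy to see: everything hinges on who fills in that mapping.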

Results. It’s a bit pointless to talk about results at this stage. For each individual feature, the results were pretty much as expected, so the more formal topics had more velar (ing) and less (t) glottaling, for example. But it’s impossible to compare across features because the coding schemes differ so vastly from paper to paper.

Remedy? One change to the course for next year will be a requirement that all groups get together and agree on a coding scheme for topic to be used (more or less) consistently across research projects. But Team 3’s problem is still there: standard variationist analysis in Rbrul allows us to attach one value only to a token for each category. So for “topic” we can have “Scottish” or “teaching” but not both. (Alternatively you can make ten yes/no values for ten topic categories, but that’s more unwieldy and a lot less elegant.) Ideally, what we would need is some sort of Instagram tag cloud: #Scottish #teaching #literature, etc. But how do you analyse that?
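For what it’s worth, the unwieldy yes/no route is at least easy to automate. The Python sketch below turns a set of hashtag-style labels per token into one yes/no column per topic. The topic list, the token IDs and the tags are all invented for illustration (they are not from the recordings), and the function name is my own; it only prepares the coding and says nothing about how best to model it afterwards.

```python
# Illustrative topic inventory; a real agreed-upon scheme would be longer.
TOPICS = ["Scottish", "teaching", "literature"]


def to_indicator_row(tags, topics=TOPICS):
    """Turn a set of topic tags into one yes/no value per topic."""
    unknown = set(tags) - set(topics)
    if unknown:
        # Catch typos and tags that are not part of the agreed scheme.
        raise ValueError(f"unknown topic tags: {sorted(unknown)}")
    return {f"topic_{t}": ("yes" if t in tags else "no") for t in topics}


# Made-up tokens, each carrying any number of topic tags.
tokens = [
    {"token_id": 1, "tags": {"Scottish"}},
    {"token_id": 2, "tags": {"teaching"}},
    {"token_id": 3, "tags": {"Scottish", "teaching", "literature"}},
]

# One row per token, with a separate yes/no column for every topic.
rows = [
    {"token_id": tok["token_id"], **to_indicator_row(tok["tags"])}
    for tok in tokens
]

for row in rows:
    print(row)
```

Each token keeps all of its tags this way, at the cost of one extra column per topic, which is exactly the inelegance I was complaining about.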

Please do let me know if you have any ideas…