Later this week, I am flying out to Minneapolis to present my research at a conference (Psychonomics, for those of you who care ... also, if anybody here lives in Minneapolis and wants to grab a beer or something let me know). It is always good to get a wider audience of people to look at the preliminary stuff to help iron it out, and I generally respect the intelligence of most of the users here. Basically, I am just going to lay out a reasonably brief description of my research, and I'd like any sort of advice, feedback, criticism, or whatever on anything.
Literally any facet of this I want you to feel free to weigh in on. Either how I present it/describe it (am I being clear enough? detailed enough?), the methodologies I used, the analytic techniques I used (there are users here who are far more comfortable with higher order mathematics than I, and I bet you can give me some useful information on whether I approached the analysis the right way, or better ways to test for the same things...), or even just general questions, theories, suggestions for follow-up experiments or alternative explanations to the ones I propose ... ANYTHING.
And thanks in advance.
Anyway, I will try to keep this relatively concise, but I am afraid it still will be relatively long.
So the research is focusing on semantic comprehension errors in a subject with pure word deafness (a.k.a. verbal auditory agnosia). More detail on the condition can be found through a quick Google search, but essentially it is a selective deficit in speech perception whereby the subject has "normal" hearing and preserved semantic knowledge yet is largely unable to properly process speech (caused by a brain lesion, usually as a result of a stroke). Reading, speech production, and nonverbal auditory perception are all within normal limits. However, although they are able to pick up on various nonverbal cues (and limited lipreading) in conversation, they have a very difficult time at simple speech perception tasks.
For example, in a word repetition task (where you say a word and ask them to repeat it), they are mostly unable to. To throw some numbers out, in one of our experiments the subject was presented with 120 words with instructions to repeat them. He got 12 of them right, and made 21 "errors" (that is, repeated a word that wasn't the correct target word). On the rest of the trials he was completely unable to provide a response (he just shook his head and said he couldn't understand it). So the deficit can be quite severe, but is still very selective. In a variety of tasks such as picture naming, reading and writing, word-picture matching, etc., he performs exactly as you would expect of someone without a deficit.
It is a very strange condition, and there aren't that many people in the world who suffer from it. The typical model of PWD is that it reflects problems in early stages of speech perception. However, in the word repetition task I mentioned above, we noticed certain patterns emerging in the types of error the subject makes. Although sometimes he will make errors that are clearly phonological in nature (if the target word is "window," he says "windle" or something of the sort), most of them seem to be semantic (if the target word is "hawk," he says "eagle" ... or "lobster->fish", "diamond->ring", "principal->teacher" to give a few examples). Similarly, on a spoken word-picture naming task (you say a word and show a picture, and ask him whether the picture is the same thing as the word), if you say "cat" and show him a picture of a dog he will say it is a cat.
First, we wanted to demonstrate that he is making these semantic errors above chance. That is, if PWD is a problem in early speech perception as has been previously suggested, then you might suppose that the errors are essentially random word production. The subject doesn't understand the word that was said at all. If the subject were just producing words at random, you would expect that some of them would seem to be semantically related (especially given the human propensity to read into relationships and patterns that might not otherwise exist) by chance. To test this, we used a type of permutation/Monte Carlo simulation. The semantic relatedness of two words can be calculated by a number of measures (usually pulled out from the NLP crowd) - the one we chose is called latent semantic analysis (LSA). Previous studies seem to show that LSA reflects standards of human judgment better than many other measures, and calculating it is relatively straightforward.
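For anyone unfamiliar with how an LSA relatedness score is read off: it is normally the cosine between two word vectors in the reduced LSA space. Here is a minimal sketch of just that last step. The vectors below are invented placeholders (not real LSA output), and `cosine` is simply a name I picked for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here: a zero vector is unrelated to everything
    return dot / (norm_u * norm_v)

# Placeholder 3-dimensional "word vectors"; real LSA spaces have hundreds of dimensions.
toy_vectors = {
    "diamond": [0.9, 0.1, 0.2],
    "ring":    [0.7, 0.3, 0.1],
    "lumber":  [0.0, 0.2, 0.9],
}

related = cosine(toy_vectors["diamond"], toy_vectors["ring"])
unrelated = cosine(toy_vectors["diamond"], toy_vectors["lumber"])
```

With these made-up vectors `related` comes out larger than `unrelated`, which is the only point of the toy; the actual values reported in this post come from a real LSA space, not from anything like this.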
Anyway, you can look that up on its own (though I can talk more about it in a separate post if you ask nicely). So we calculated the LSA relatedness value for each of his errors (the relatedness between the target word and his incorrect response). We had a couple different data sets, but for all of them the average relatedness value of his errors was around 0.2-0.3 (the absolute values are irrelevant for right now, just remember these for comparative purposes). Then, I ran a little program I coded that would randomly re-pair each of his response words with a different target word from the same data set, and calculate those relatedness values. In other words, it models the (null) hypothesis that he is just randomly producing words and some of them happen to be related. I ran this simulation a few thousand times to create a Monte Carlo distribution - the average response error relatedness values converge around 0.07-0.08 for each data set (with a spread from around 0.03 to 0.15). The average relatedness value of his actual errors is several standard deviations higher than that predicted by this model, indicating that it is extremely, extremely unlikely for these semantic pairings to be the result of pure chance. Really, not that shocking a revelation, but still nice that there is statistical backing.
For the sake of contextualizing those numbers, an LSA value of around 0.25 reflects a degree of relatedness akin to "diamond" and "ring" ... so two words that we conceptualize as being related to each other fairly closely. An LSA value of around 0.08 is the same degree of relatedness as "diamond" and "lumber."
This is basically what that data looks like in histogram form. The black bar at right is the "real" data, the blue is the simulation. "NL" is just the patient code.
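The re-pairing simulation described above can be sketched roughly as follows. This is not the original program, just a minimal reconstruction under some assumptions: `permutation_test` and `toy_relatedness` are names invented here, the relatedness function is a crude same-category stand-in for LSA, and the re-pairing is a plain shuffle of the targets (which can occasionally leave a response with its original target; a stricter version would reject those draws).

```python
import random
import statistics

# Crude stand-in for LSA: 1.0 if two words share a hand-assigned category, else 0.0.
TOY_CATEGORY = {
    "hawk": "bird", "eagle": "bird",
    "lobster": "sea", "fish": "sea",
    "diamond": "jewelry", "ring": "jewelry",
    "principal": "school", "teacher": "school",
}

def toy_relatedness(a, b):
    return 1.0 if TOY_CATEGORY[a] == TOY_CATEGORY[b] else 0.0

def permutation_test(pairs, relatedness, n_iter=5000, seed=0):
    """pairs: list of (target, response) errors.
    Compares the observed mean relatedness with the means obtained
    after randomly re-pairing responses with targets."""
    rng = random.Random(seed)
    targets = [t for t, _ in pairs]
    responses = [r for _, r in pairs]
    observed = statistics.mean([relatedness(t, r) for t, r in pairs])

    null_means = []
    for _ in range(n_iter):
        shuffled = targets[:]
        rng.shuffle(shuffled)  # random re-pairing of targets and responses
        null_means.append(
            statistics.mean([relatedness(t, r) for t, r in zip(shuffled, responses)])
        )

    null_mean = statistics.mean(null_means)
    null_sd = statistics.stdev(null_means)
    return {
        "observed": observed,
        "null_mean": null_mean,
        "null_sd": null_sd,
        # how many null standard deviations the real data sits above the null mean
        "z": (observed - null_mean) / null_sd,
        # one-sided Monte Carlo p-value with the usual +1 correction
        "p": (1 + sum(m >= observed for m in null_means)) / (n_iter + 1),
    }

errors = [("hawk", "eagle"), ("lobster", "fish"),
          ("diamond", "ring"), ("principal", "teacher")]
result = permutation_test(errors, toy_relatedness)
```

The "several standard deviations" statement corresponds to the `z` entry, and `p` is the share of simulated data sets at least as related as the real one. With only four toy pairs the numbers mean nothing in themselves; the real analysis used the full error sets and actual LSA values.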
Next, we need to test whether the manipulation of semantic factors/conditions can affect his speech comprehension. The MC simulations simply demonstrate that the perceived effect is non-random, not that there is a definite semantic effect. So we ran a couple more experiments. First, we wanted to manipulate semantic imageability and lexical frequency. The former is essentially the difference between "abstract" and "concrete" - it is easier to "image" a word like dog or cat than a word like prestige. The latter is simply the relative frequency with which a word tends to appear in a large text corpus. There is a lot of previous research on the effects of these two factors. The basic idea is that words that are highly imageable or used frequently have "richer" semantic representations in the brain. In our data, there is a significant effect for imageability. That is, when presented with target words classified as being highly imageable, the probability of the subject providing a correct response was significantly higher than with low imageability words (and the average LSA relatedness value of errors in the highly imageable condition was higher than for the low). Frequency had a lesser effect.
Here is a quick little bar chart of those results.
The next task was semantic priming. That is, two words are presented, a prime and then the target. Priming is widely used in psychological studies. A semantically related prime tends to build stronger semantic support for subsequent comprehension (basically, the semantic representation of a word stays in active memory long enough to shape your perception of what comes next). The subject showed a small but statistically significant improvement in speech comprehension following related primes (and the average LSA relatedness value of errors was higher for targets with related primes than unrelated). The asterisk next to this particular part of the data though is sample size ... as I mentioned earlier, in a task of 120 trials there will only be around 30 where he is able to respond. For the other parts of this research, we ran the subject on multiple data sets so we would have enough data to find meaningful patterns ... for a variety of logistical reasons I won't get into we couldn't replicate the priming task in time for this conference. So unfortunately rather low power on this finding.
Finally, we propose a model for future testing (and if you have any clever suggestions for a way to test this model they are more than welcome). Essentially, we believe that this deficit isn't an error in speech perception per se, but rather a rapid decay of phonological representations in the semantic network. This would account for the patterns we see in our data. The theory is that, when the brain hears a word, the phonological input activates a lexical network, which connects those sounds to a particular word. However, this activation is probabilistic. If you hear the word "cat", those phonemes activate the lexical representation of "cat" ... however, those same phonemes will also partially activate other lexical representations that sound similar ("bat", "cap", etc.). In a normal system without a deficit, "cat" has a stronger activation than other "less perfect" lexical matches. These lexical representations then activate the semantic network, and all the associations and meanings of a particular word (and this activation is also probabilistic, activating multiple possible meanings of varying strengths). The interaction and feedback of lexical and semantic networks is how your brain figures out what word was said, what it means, and how to respond to it.
We believe that PWD entails a rapid decay in the phonological representations feeding the lexical activation network. That is, when "cat" is heard, all the lexical representations are activated as normal. However, active memory "loses" the phonological information, which reduces the activation strength of the lexical network. The feedback from the semantic network, then, becomes a more powerful activation factor. Our data not only fits this model, but on a completely non-scientific subjective level it fits our subject's behavior. For example, sometimes in the word repetition tasks, he will begin to say the right word, then get frustrated and say, "I done lost it" or something of the sort. Sometimes he is even able to describe what the word is, but can't seem to figure it out (like in response to "cat" he is able to say it is a pet, furry, and an animal).
Here is a quick little graphic of that model. Blue is the relative strength of activation, and arrows are the direction of activation.
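To make the decay idea a bit more concrete, here is a deliberately tiny toy version of it. It is not the model in the graphic, just one possible way to write down the intuition: each candidate word's activation is its phonological match (which fades exponentially) plus a fixed amount of semantic feedback. Every number, the `feedback` weight, and the function names are assumptions made up for illustration.

```python
import math

# Hypothetical phonological match of each candidate to the heard word "cat".
PHON_MATCH = {"cat": 1.0, "cap": 0.6, "bat": 0.6, "dog": 0.0}
# Hypothetical semantic support fed back from the meaning of "cat".
SEM_SUPPORT = {"cat": 1.0, "dog": 0.7, "bat": 0.3, "cap": 0.0}

def activations(decay, t=1.0, feedback=0.5):
    """Activation of each candidate at readout time t.
    The phonological contribution decays as exp(-decay * t);
    the semantic feedback contribution is left undecayed."""
    retained = math.exp(-decay * t)
    return {
        word: PHON_MATCH[word] * retained + feedback * SEM_SUPPORT[word]
        for word in PHON_MATCH
    }

def ranking(decay):
    """Candidates ordered from most to least active."""
    act = activations(decay)
    return sorted(act, key=act.get, reverse=True)

def margin(decay):
    """Gap between the top candidate and the runner-up."""
    top, runner_up = sorted(activations(decay).values(), reverse=True)[:2]
    return top - runner_up

slow_decay_order = ranking(decay=0.1)  # stands in for an intact system
fast_decay_order = ranking(decay=3.0)  # stands in for rapid phonological decay
```

With these invented numbers, "cat" stays on top in both cases, but under fast decay the runner-up switches from a sound-alike ("bat") to a meaning-alike ("dog") and the gap to the winner shrinks a lot, so even a little noise would tip the response toward a semantic error. Whether the real system behaves anything like this is exactly what a follow-up experiment would need to test.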
Anyway, I apologize that this is basically a wall of text, and hope some of you take the initiative to read through it and tell me what you think.