A few months ago, I was working on a project that required me to look through a lot of search results in the Corpus of LDS General Conference Talks. I was surprised to find that some speakers not only told the same stories and made the same points in multiple talks, but frequently used exactly the same phrasing in doing so. In other words, they were clearly copying and pasting parts from one talk to another. Not that I blame them. I know General Authorities (GAs) are busy people, so in retrospect I probably shouldn’t have been surprised.
This got me wondering, though, whether some Conference speakers use this copy-and-paste strategy more than others. While reading Brian Christian’s fascinating book The Most Human Human, I hit on an easy way to measure how often they do this. The book is about the author’s preparation for participating in a Turing test, where his role is to serve as a chat partner for judges who are trying to distinguish between computer programs and people. His goal is to win the award that gives the book its title by convincing the most judges that he is a human and not a computer. One issue Christian discusses is redundancy in language. For example, when we’re reading, we can predict what word will come next in a sentence with accuracy far better than chance, and our accuracy goes up as the sentence goes on. More importantly for my purposes, compression software also works by spotting redundancies in language.
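To make that last point concrete, here is a minimal sketch in Python of how compression can stand in as a rough redundancy gauge. It assumes the standard-library zlib compressor and a simple compressed-size-to-original-size ratio. The function name compression_ratio and both sample strings are made up for illustration, and this is not necessarily the exact measure used for the Conference talks.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Return compressed size / original size; lower means more redundancy."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must be non-empty")
    # Level 9 asks zlib for its most thorough (slowest) compression.
    return len(zlib.compress(raw, 9)) / len(raw)


# Illustrative strings only; neither is real Conference text.
repetitive = "I know that families can be together forever. " * 20
varied = (
    "Compression programs look for repeated patterns. When a phrase shows "
    "up again, the program stores a short pointer back to the earlier copy "
    "instead of the full text. Writing with little repetition leaves fewer "
    "such shortcuts, so the compressed file stays closer to its original size."
)

print(f"repetitive: {compression_ratio(repetitive):.2f}")
print(f"varied:     {compression_ratio(varied):.2f}")
```

The text that repeats itself should come out with a much smaller ratio, because the compressor can replace each repeated sentence with a short reference to the first copy. Very short texts carry fixed overhead from the compression format, so a ratio like this is only meaningful for reasonably long inputs.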