
- Jacques Guy, Greenberg, simulation, significance

I have received this personal e-mail:

> Date: Sat, 07 May 94 00:37:50 +0200
> From: Stephen P Spackman <spackmandfki.uni-sb.de>
>
> Maybe there's much more to greenberg than I thought. When I first saw
> his stuff those of us in the classroom with a mathematical background
> had a severe giggling fit. But if you're coming out with odds like 0.2
> (and not 0.98) of chance resemblances, it looks like our intuitions
> were nearly as off as his (albeit in a different direction...).

Stephen Spackman is absolutely right, and I owe everyone my apologies. Indeed, when writing the simulation of semantic shifts, I defined semantic domains of size N, N being the number of word-meanings over which semantic shifts were allowed. Thus, for instance, with a 200-item wordlist and a fudge factor of 7 (i.e. a domain size of 8), you had 25 discrete domains within which semantic shifts were allowed. If you think about it, the "within" is pretty silly, because if you allow, as Greenberg does, a semantic shift breast-milk-suck-swallow-drink-chew-throat-neck, and thereby define a *closed* semantic domain, you are at the same time disallowing such semantic shifts as breast-nipple, throat-throttle-gag-stench, etc. Thus the figures obtained are a *gross* underestimate!

A better solution, and still an underestimate, is to allow for semantic shifts between any one item of the wordlist and the next N items. I have done that, and obtained results which agree with the extraordinary (to some) figure of 0.98 mentioned by Stephen Spackman, namely:

Ten languages, 200 words, 1/250 chance of accidental resemblance, fudge factor 7 (i.e. semantic domains of 8 items): 0.76 cases *per simulation* of exactly SIX languages showing the same word (and 0.05 of seven, and 0.003 of eight, none of nine or more). With twenty languages, and the same parameters, the number of cases of SIX languages showing the same word is... hold onto your hats... an astonishing 12.7 per simulation! (Good Lord, is that right? Let me check.
Just a moment...). Yes, that *is* right. And 2.0 of seven languages, 0.32 of eight (yes! one chance in three!), 0.036 of nine, and 0.006 of ten or more.

When I have some time, I will write a bit of explanation as a documentation file, and upload the lot with the source code in the pc/linguistics subdirectory at garbo.uwasa.fi, so that the simulation method is open to scrutiny and the experiments reproducible.

But another point:

> From: "Paul Purdom" <pwpcs.indiana.edu>
> Subject: Re: 5.521 Greenberg - Simulation with semantic shift
>
> I would like to raise a word of caution for the people that are
> attacking the work of Greenberg and followers using statistical
> arguments. It is very difficult to disprove things using statistics.
> Basically, the attackers set up a model that they believe is similar
> to the process that Greenberg goes through and show that by chance you
> get results somewhat similar to Greenberg's (see recent post by
> Jacques Guy for a good example of this type of work). In general,
> people doing such studies seem to make better use of statistics than
> Greenberg and followers. Such results should cause one to wonder
> whether there is any significant reason to believe the results of
> Greenberg and coworkers. On the other hand, the classification of
> Greenberg did match up rather well with the genetic relatedness of the
> speakers of the various languages. This should cause one to wonder
> whether the statistical models are missing something important.

I have already shown here that, even granting the accuracy of the data proffered, the correlation between genetics and language is at best nil, at worst *negative* (see my analysis of Cavalli-Sforza somewhere in the archives of LINGUIST).
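Pending that upload, here is a minimal sketch of the sliding-window model described above. It is an editorial illustration in Python, not my actual program: each word gets a random "form class" so that two words resemble each other by chance with probability 1/250, and a semantic shift lets a word at position i be matched against any word in the next 7 positions. The counting convention (one case per form class, at the window where it is shared by the most languages) is an assumption, so the absolute figures need not match those quoted above; the point is only that the model is a few lines long and open to scrutiny.

```python
import random
from collections import Counter

def simulate(num_langs=10, list_len=200, classes=250, window=8, seed=0):
    """One run: each (language, meaning) slot gets a random 'form class';
    two words resemble each other by chance with probability 1/classes.
    A semantic shift allows matches anywhere within a sliding window of
    `window` consecutive meanings (fudge factor = window - 1)."""
    rng = random.Random(seed)
    forms = [[rng.randrange(classes) for _ in range(list_len)]
             for _ in range(num_langs)]
    # best[c] = largest number of languages showing class c within one window
    best = {}
    for i in range(list_len - window + 1):
        seen = Counter()
        for lang in forms:
            for c in set(lang[i:i + window]):
                seen[c] += 1
        for c, k in seen.items():
            if k > best.get(c, 0):
                best[c] = k
    # tally cases of exactly k languages "showing the same word", k >= 2
    return Counter(k for k in best.values() if k >= 2)

tallies = simulate()
```

To estimate cases per simulation, one would average `tallies` over many runs with different seeds.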
I am not surprised to see a correlation between Greenberg's linguistic classification and speakers' genes: since the linguistic evidence proffered by Greenberg is demonstrably an artifact of allowing for semantic shifts, his classification must naturally have been influenced by what is known of the genetic relatedness of the speakers. Indeed, given three informants, one Spanish-speaking Basque, one Basque-speaking Basque, and one Rotokas-speaking Papuan, I would sooner look for, and find, resemblances between Basque and Spanish than between Basque or Spanish and Rotokas.

> Trying to prove or disprove something with statistical models can be
> quite tricky. Let me refer to an analysis I did of some data of Dana
> Nau to give a case that I understand completely. These results appear
> in International Journal of Parallel Programming 15 (1987) pp 163-183
> (Nau, Purdom, and Tzeng) and in Analysis of Algorithms (1985) pp
> 447-449 (Purdom and Brown). Nau measured how two algorithms did at
> playing a simple game. He had the algorithms play each other 3200
> times using random starting positions. Actually, he had 7 series of
> 3200 games each, because one of the algorithms had a parameter, and he
> wanted results as a function of the parameter. One of the results was
> that algorithm A won 1640 of 3200 games, significant at the level 0.16
> (i.e., not very significant). The other 6 cases also showed method A
> winning, but with even less significance.
>
> One could take two views on the data as I have presented it so far.
> Either method A is not noticeably different from method B, or it is
> strange that method A won in each of the seven series (particularly
> since the statistical test said the two methods had about the same
> ability).
Paul Purdom's interpretation is fallacious, but it is such an extremely common fallacious interpretation of statistics that I feel bound to explain it in detail and, doing so, heap even more curses and imprecations on Jane Edwards, who started all this. Many poxes and a googolplex curses on thee, Ma'am.

First, 1640 wins out of 3200 games (1600 wins expected) has no degree of significance *whatsoever*. I will not go into the statistics of it, because to those who know statistics the proof is trivial, and to those who do not it would not be convincing. Oh, stuff it, here is the proof. This is a fair game, so we are expecting 50% wins (like tossing a fair coin). So, using the normal approximation to the binomial distribution, the standard deviation of the proportion of wins is

    sqrt((0.5)(1 - 0.5)/3200) = 0.008839

Now, what we have observed is 1640 wins out of 3200, which is 1640/3200 = 0.5125, i.e. 1.41 standard deviations from expectancy, which means... oh pox again, my statistical books are at home and so is my HP41; anyway, it's approximately what Purdom quotes: one chance in six. So what's the big surprise? Remember: the simulation was run SEVEN times (I'll come to that later).

Now for a proof of sorts, understandable without the slightest knowledge of statistics. Toss a coin 3200 times. Result: 1640 heads. Are you surprised? Now toss a coin 3200 times. Result: 1600 heads. Toss it again 3200 times. Same result. Toss it again 3200 times. 1600 heads again. And again, and again. Aren't you surprised? Yes indeed, there is something very fishy about that coin!

Now you will say: yes, but, there were SEVEN simulations, and out of seven the same side always won! Well, *none* of those wins was significantly different from chance, and the chance of a named side winning all seven series of a fair game is one in 2 to the power 7, i.e. 1 in 128 (1 in 64 if either side will do). Go to a casino and watch the roulette wheel all evening. You will see many cases of red coming up seven times in a row.
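The back-of-the-envelope arithmetic above can be checked mechanically. Here is a sketch using nothing but the normal approximation already invoked (the variable names are illustrative):

```python
from math import sqrt, erf

def normal_sf(z):
    """Upper-tail probability P(Z >= z) of the standard normal."""
    return 0.5 * (1 - erf(z / sqrt(2)))

n, wins, p0 = 3200, 1640, 0.5
sd = sqrt(p0 * (1 - p0) / n)     # std. dev. of the win proportion: 0.008839
z = (wins / n - p0) / sd         # 0.5125 is about 1.41 std. devs. out
p_two_sided = 2 * normal_sf(z)   # about 0.16: Purdom's "one chance in six"

p_named_sweep = 0.5 ** 7         # a named side takes all seven series: 1/128
p_either_sweep = 2 * 0.5 ** 7    # either side takes all seven: 1/64
```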
In this discussion, I have implicitly granted that the game is fair, that is, that neither of the two strategies, A or B, is superior to the other. However:

> In this case, it turned out that the second explanation was correct.
> As Nau explained, his 3200 games consisted of 1600 pairs of games. For
> each position there were two games, one where method A made the first
> move and one where method B made the first move. If a particular
> position strongly favored the first player, you would expect that the
> first player might win even if it was not a very good player. An
> alternate way to analyze the data is to consider how many pairs were
> won by algorithm A and how many were won by algorithm B (disregarding
> the cases where each algorithm won one game of the pair). When the
> previous case is analyzed this way, we find that algorithm A won 140
> of 240 pairs. There is only a 0.00015 chance that this would happen by
> chance. Clearly algorithm A is better than algorithm B. (The other six
> series gave similar results.)

So what is this? A is better than B? Then it should win more than half the time, shouldn't it? Two questions, then:

1. Was there a formal proof that strategy A and strategy B were equally good? By formal proof I mean a mathematical proof by combinatorics, exact, not statistical, which is approximate.
2. If there was, then take a long, hard look at your random-number generator: it's showing cycles.

That said, Purdom's analogy is beside the point. The A vs B simulation exhibits small but significant variations from expectation: 140 out of 240 is 58%, when 50% would be expected if the two strategies were of equal strength (but are they?). What we have here, on the one hand, is someone claiming "there is one chance in ten billion of this happening even only *once*", and, on the other hand, a thousand simulations showing it to appear on the average *twelve* times, every time.
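An aside on the paired analysis quoted above: the tail probability of 140 wins out of 240 fair pairs can be computed exactly, with no normal approximation at all. A sketch (the function name is mine):

```python
from math import comb

def binom_sf(k, n, p=0.5):
    """Exact upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 140 of 240 pairs won by algorithm A, against a fair null of p = 0.5
p_tail = binom_sf(140, 240)
```

For what it is worth, this comes out near 0.005 rather than 0.00015; perhaps the quoted figure was computed differently, but it is worth rechecking.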
Greenberg says: you will see A win once in 10 billion games; the simulations show A winning in every single game. What follows is therefore inadmissible:

> I would urge those that are doing statistical studies of Greenberg's
> techniques to consider various ways to model the approach that you
> believe he uses. Small variations in how you model the process may
> have important effects on your conclusions.

As I have shown in this and the previous postings, large variations in the model all yield the same result: chance resemblances have (vulgarly speaking) infinitely greater probabilities of happening than Greenberg claims. "Large" variations: from allowing no semantic shifts at all, to allowing roughly as many as Greenberg allowed himself.

Right, that's enough now; I've had it. I really have better things to do than discuss this nonsense.