A report into several translations of the Holy Quran

Report author: John Olsson

 

This report has been commissioned by Ahmadiyya Anjuman Ishaat Islam, (Lahore) USA, the publishers of Maulana Muhammad Ali's various literary works, including his translations of the Holy Quran.  The publishers claim that the alleged translation by Mr. 'MH Shakir' is a direct and extensive plagiarism of the 1917 Maulvi Muhammad Ali translation into English from the original of the Holy Quran in the Arabic language.

Report findings

In this report I will show that the publishers’ claim is valid. The MH Shakir version of the text cannot realistically be anything more than an almost literal copy of the 1917 text, with some minor borrowings from other translations, especially the 1951 revision by Maulvi Muhammad Ali of his earlier translation.

MM Ali’s first translation was published in 1917. He had been working on it since 1909. He then issued a revised translation in 1951 which he said was the result of extensive further study. This revision is generally known as the ‘Maulana’ translation (here referred to as ‘M’ for the sake of brevity). As far as I can judge, MH Shakir’s translation first appeared in 1983. The Shakir translation is in the main a verbatim copy of the MM Ali 1917 translation, although there is also some material taken verbatim from the 1951 translation. It is intriguing to wonder why  Mr Shakir depended so heavily on two versions by just one  translator.

As a potential complication to this picture it should be noted that the Shakir version (here referred to as ‘Q’, i.e. ‘questioned document’) occasionally reverts to a more traditional interpretation of the Quran[1], but does not do so consistently[2]. Inconsistencies appear in regard to some items of doctrine, for example the belief that Jesus was taken to heaven alive (a doctrine of ascension). In one verse MM Ali has “but when thou didst cause me to die”, reflecting a strictly literal translation, whereas others have “when you took me up”, “when thou tookest me”, etc. Here Shakir fails to revert to the traditional interpretation and instead copies MM Ali. However, this contradicts what he did earlier, at 3:54/3:55, where he had already made precisely this change: at that point MM Ali has ‘I will cause you to die’, whereas Shakir has ‘and cause you to ascend unto me’. Thus, whereas MM Ali confines himself to a strict literal translation from the Arabic, Shakir, at that point, reverts to a traditional interpretation.

In other words, the Shakir translation seems to adopt two contradictory doctrinal positions[3]. If I have interpreted what has happened correctly between the two texts, then it is worth reflecting that this kind of inconsistency is not uncommon in the plagiarism process, where the usual practice is to copy blindly – and hence carelessly – thus producing incompatible or contradictory text. A plagiarised text is almost always logically and ideationally inferior to the source text, especially in the case of a scholarly document.

Method of sampling

The Quran consists of over 6,000 verses, divided into 114 chapters. This makes it a work of substantial length, and therefore, rather than testing each verse in each version, a sample of verses was taken. The sample was produced by building a random generator program in Visual Basic 6. The generator first produces a chapter number (between 1 and 114); the number of sections in that chapter is then input into the program (some chapters have as many as a dozen sections, while others have only one), and a section number is generated. Once the section has been chosen, the number of verses in that section is recorded and the section is added to the list of verses to be tested. In this way a list of the following randomly selected chapters and sections, given with the number of verses in the relevant section, was created[4]:

Table 1: List of randomly selected chapter sections to be tested for plagiarism

Chapter     Section     No of Verses

14          3           9
17          5           13
19          6           16
22          5           5
27          3           13
28          4           14
30          2           9
38          5           24
53          1           25
65          2           5
66          1           7
77          2           10
79          1           26
81          1           29
86          1           17
90          1           20
90          1           20
96          1           19
100         1           11
101         1           11
107         1           7
108         1           3

 

In all, 313 verses were randomly selected in this way, representing approximately five per cent of the total number of verses. As can be seen from the above table, chapters throughout the Quran have been chosen, and it is believed that this sample is likely to be representative of the work, in terms of the respective styles and vocabularies of the two texts. It should therefore provide ample possibility for testing whether the Shakir text (Q) was plagiarised from the Ali text/s (E and M).

It can be argued that the above method means that not every Quranic verse has an equal chance of being selected. However, the alternative would have been to number each verse individually, regardless of its chapter or chapter section. This would have been an onerous task and, on balance, it was felt that the method used did at least provide some chance for each verse to be selected.
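The two-stage selection described above, and the reason it does not give every verse an equal chance, can be sketched as follows. This is a minimal illustration in Python, not the Visual Basic 6 program actually used (whose source is not reproduced in this report). The miniature `SECTIONS` table and the function names are hypothetical and invented purely for the example.

```python
import random

# Hypothetical miniature "book": chapter number -> verses in each section.
# These figures are invented for illustration; they are not Quranic data.
SECTIONS = {
    1: [7],
    2: [10, 12, 8],
    3: [5, 5],
}

def draw_section(rng):
    """Pick a chapter uniformly, then a section within it uniformly."""
    chapter = rng.choice(sorted(SECTIONS))
    index = rng.randrange(len(SECTIONS[chapter]))
    # Return chapter, 1-based section number, and verses in that section.
    return chapter, index + 1, SECTIONS[chapter][index]

def inclusion_probability(chapter):
    """Chance that a given verse of `chapter` is covered by a single draw."""
    # 1/(number of chapters) * 1/(number of sections in that chapter)
    return 1 / len(SECTIONS) / len(SECTIONS[chapter])

rng = random.Random(0)  # fixed seed so that the draw is repeatable
print(draw_section(rng))
for chapter in sorted(SECTIONS):
    print(chapter, round(inclusion_probability(chapter), 3))
```

In this sketch a verse in a one-section chapter is three times as likely to be drawn as a verse in a three-section chapter, which is the imbalance conceded above; numbering every verse individually would remove it.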

What is plagiarism and how can it be established?

Several ways of defining plagiarism exist. A moral definition could be: ‘The theft of another’s work or ideas presented as one’s own’; on the other hand a legal definition could encompass ideas such as: ‘The intellectual infringement of the work of another constituting a copyright violation’. For linguists plagiarism is the presence in one text of substantial amounts of another text or the ideas contained in it, where the plagiariser’s text has been claimed to have been produced independently. All texts rely on other texts for their genesis and production. Novels in the same genre, for example, often have many similar features, such as scenes, characters, plots, etc. Research papers in a particular discipline also share many common features. The linguistic term for this phenomenon is intertextuality. We expect works of the same genre and of the same text type to share lexis (vocabulary) and elements of structure, such as, for example, headings in the case of an academic paper or plot in the case of a novel. In itself the process of intertextuality does not constitute plagiarism. It is an entirely normal process. However, plagiarism goes beyond intertextuality because it copies either the ideas of the source work or the language (or, sometimes, both) and, crucially, does not acknowledge its source, thereby falsely representing itself as an independently authored work.

In the case of translation we cannot really consider the notion of theft of ideas, except where a plagiarist copies an error from his/her source. So, for example, we may suspect plagiarism if the first translator misinterprets an idea expressed in the source language and the second translator copies this idea, but uses different language from the first translator: we would especially suspect plagiarism in such an instance if the first translator had been the first writer/translator to produce this specific error, which had then itself been copied in error. Previously we gave an example of apparent doctrinal inconsistency[5] in the case of Mr Shakir’s text. Here we appear to have something bizarrely like the theft of ideas: in this case the plagiarist sees what he considers to be a doctrinal error and reverts to what he believes to be a non‑heretical view. Later, he comes across another instance of the apparent doctrinal error, but fails, in the copying process, to ‘correct’ this error, and in this way inadvertently copies, not just the text, but a fundamental idea within the text, thus exposing the plagiarism.

Aside from the theft of ideas, and the inconsistencies which almost inevitably follow when a copyist attempts to avoid borrowing a specific error in one instance, but fails to do so in another, we also have word‑for‑word, or literal, plagiarism.

In any analysis the aim is to demonstrate a finding on the basis of probability. Even though a probability in a given case may be 99.99999999999999% (or, depending on the analysis, its counterpart of 0.00000000000001%) it is still classed as a probability. Generally, a five‑point probability scale is used, given as follows:

 

Figure 1: Showing the five‑point probability scale

Scale       Phrased as:

1           Very low probability
2           Low probability
3           Medium probability
4           High probability
5           Very high probability

 

In the Shakir translation of the Quran there are literally thousands upon thousands of word‑for‑word passages which are identical with their counterparts in MM Ali’s translation. Below I will detail how these can be measured and show that, through the use of statistical analysis, a very high probability of plagiarism can be proposed. Moreover, it will be seen that the plagiarism is at saturation levels, that is to say it is comprehensive, occurring across the entire work.

Preliminary steps: MM Ali’s text in the context of Quranic translations

As far as I have been able to judge, MM Ali’s translation of the Quran into English is the earliest of those under consideration here. Sarwar’s translation did not appear until three years later, in 1920. The next major translation was that of Pickthal (or Pickthall), which appeared in 1930. Yusuf Ali’s translation appeared in 1934, and was re-issued in 1937. Sherali’s work first saw the light of day in 1955, and Rashad’s work was not published until about 1970. The translation referred to here as ‘Khan’ is in fact a joint work by Al Hilali and Khan and is of relatively recent date (1995), although there was a translation by a Khan in 1905 (to which I can find no further references). Because MM Ali’s translation is the earliest of those under detailed comparison, it is clear he could not have depended on any of the above texts. However, I wondered whether there were any earlier translations that he might have depended on.

In the notes to MM Ali’s 1917 translation, I found mention of three earlier translations, cited for comparative purposes: those by JM Rodwell (1861), George Sale (1734) and Palmer (1876). Research appears to confirm that these were the best‑known translations of the Quran into English which were available at the time that MM Ali began his own translation.

Even a cursory glance shows Palmer’s translation to be derivative of Sale’s and closer examination leads me to believe that the scholarship of these three editions was not high. Furthermore, none of these translators was a Muslim, and therefore, given MM Ali’s preoccupation with rendering the message of the Quran faithfully for the benefit of western believers who did not speak Arabic, my first impression was that he was unlikely to have depended on any of these translations to any extent, although he was familiar with them – given his references to them.

I have looked at verses from each of these three works, Sale, Rodwell and Palmer, and below I quote Chapter 14 Verse 13 from each of them, followed by MM Ali’s own version. I will comment on these translation excerpts below.


Text Excerpts 1

Sale

And those who believed not said unto their apostles, we will surely expel you out of our land; or ye shall return unto our religion. And their LORD spake unto them by revelation, saying We will surely destroy the wicked doers;”

Rodwell 14

And they who believed not said to their Apostles, “Forth from our land will we surely drive you, or, to our religion shall ye return.” Then their Lord revealed to them, “We will certainly destroy the wicked doers,”

Palmer

And those who misbelieved said to their apostles, “We will drive you forth from our land; or else ye shall return to our faith!” And their Lord inspired them, ‘We will surely destroy the unjust;’

 

MM Ali

And those who disbelieved said to their apostles: we will most certainly drive you forth from our land, or else you shall come back into our religion.  So their Lord revealed to them: most certainly we will destroy the unjust:

 

I believe MM Ali’s translation differs quite clearly from these earlier versions[6]. Ali’s translation is less archaic, for instance there are no instances of ‘ye’, although he does use the slightly archaic place adverbial ‘forth from’ (as Palmer does). All of these translations, including that by MM Ali, use ‘apostles’, while most of the translations after him refer to ‘messengers’. Ali’s use of ‘disbelieve’ is interesting: he appears to use the word as meaning actively not believing, rather than failing to believe. Having read through many different translations of these verses, it does indeed seem that the Quran at this point is commenting on those who refuse to believe, who effectively actively (sic) dis‑believe rather than those who simply fail to believe. Therefore, despite its unusual appearance as a verb (the noun disbelief is more common), I can understand why MM Ali would have used ‘disbelieve’. Moreover, this word does not occur in any translation earlier than that of MM Ali. I cite his use of disbelieve as one example of MM Ali’s apparent efforts to search out the meaning of the text, rather than simply render it into English without considering its implications within the context of the type of work he was translating and its particular contextual significance[7].

While looking at MM Ali’s notes accompanying his translation, it seemed to me that, though he did not have any formal linguistic training, he nevertheless appears to have used sound translation principles. For example, he cross-references verses to other verses where the same or similar words, or words derived from the same etymological root are given; he cross‑references verses where the same or similar ideas are expressed; he gives alternative interpretations of phrases, synonyms for words, and – most crucially for a scholarly work – he cites the work of other translators and scholars, and in some cases gives reasons for accepting or rejecting their interpretations.

For the above reasons, it seems to me likely that MM Ali’s scholarship is genuine, and that he carried out his work as an authentic translation, rather than as a process of borrowing from other translations. This has been verified by many Muslim scholars and although some may disagree with a few of his interpretations, the quality of his scholarship has never, as far as I can tell, been in question.

My intention in this section has been to demonstrate MM Ali’s work as a genuine translation. I summarise my reasons for this view here:

1.     The English translations which occurred before MM Ali’s translation were written in a more archaic style, and with less sensitivity to nuances of meaning, e.g. the use of ‘disbelieve’ by MM Ali shows considerable attention to meaning.

2.     The other major English translations, e.g. Pickthal(l), occurred after MM Ali’s 1917 translation was published.

3.     MM Ali shows not only sensitivity to meaning, but scholarship with regard to choice of word, synonyms used, consideration of previous translations, and attention to the original text.

By definition, a work which is not in itself original or genuine cannot be plagiarised from. It would simply itself be a copy, and any simulation of it would be little more than a distorted reflection of the true, but obscured, original. Since, in my view, MM Ali’s work is genuine, then it follows that it can be plagiarised from.

Methods of detecting plagiarism

One of the best‑known software packages for detecting plagiarism is ‘Copycatch’, developed at the University of Birmingham. It attaches particular importance to unique lexical words (see Clough, 2000: 11). Texts are measured in terms of their lexical similarities. As Clough (citing a conference paper by Woolls, the developer who authored ‘Copycatch’) notes: a match of about 40% is normal in same genre, same text type, while anything over 70% requires further analysis. However, there are two important factors in the present instance which require us to be particularly careful regarding percentage of similarity between texts:

1.     The first consideration is the fact that we are comparing translated works, as opposed to works in their original language;

2.     The second consideration is that scriptural works employ a more restricted lexis than many other types of work. That is to say, the vocabulary used by scriptural texts is narrower in range than the vocabulary found in many (if not most) other types of work. To describe this more fully I must introduce two terms, type and token. If a text consists of 1000 words, then we say it has 1000 tokens. In this sense token is synonymous with word. In this 1000 word text, the author will have used hundreds of different words, or types. The type-token density is the number of types divided by the number of tokens. I measured a range of works and found the type-token density as follows:

 

 

 

Table 2: Range of type‑token densities for a sample of different genres of text

Type of work            Range of type-token densities (i.e. number of different words per text length, given as a decimal fraction)

English novels          0.37 – 0.41
Book of Genesis         0.08 – 0.109
St John’s Gospel        0.12 – 0.16
Book of Mormon          0.14 – 0.16
Quran                   0.19 – 0.2

Note: All excerpts were < 1000 words; the samples used to carry out the above comparisons are included in the directory folder labelled ‘Samples used to measure type token density’.
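The type-token density used in Table 2 can be illustrated with a short sketch. This is a minimal Python example, assuming a simple tokeniser (lower-cased runs of letters and apostrophes); the function names are hypothetical, and the tool actually used to produce the figures in the table is not specified in this report.

```python
import re

def tokens(text):
    """Split a text into lower-case word tokens, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_density(text):
    """Number of different words (types) divided by total words (tokens)."""
    toks = tokens(text)
    if not toks:
        return 0.0
    return len(set(toks)) / len(toks)

sample = "In the beginning God created the heaven and the earth"
print(len(tokens(sample)))         # 10 tokens
print(len(set(tokens(sample))))    # 8 types ('the' occurs three times)
print(type_token_density(sample))  # 0.8
```

Very short texts such as this one give a high density because few words have yet been repeated; the density falls as a text grows, which is why excerpts of comparable length should be compared.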

As can be seen from the above table, scriptural works appear to have much lower levels of vocabulary richness than, for example, English novels. The fact that most of the religious works mentioned in the table are translations may have a bearing on this. However, the focused nature of the written material is likely to be the more important factor, given that the Book of Mormon also has a low type‑token ratio but is not a translation.

More research would be needed to establish the reasons for these lower levels of lexical richness in scriptural works, but – nevertheless – the observation does seem to hold good across a range of scriptural works. Thus, while a common lexicon of 70% may raise questions in the minds of those attempting to determine the presence of plagiarism in non‑translated, non‑scriptural works, we cannot be sure that this measure would be useful when analysing translated scriptural works. Scriptural works necessarily have a limited range of topics, and express a finite set of attitudes, morals, beliefs and viewpoints. Therefore, we should not be surprised if, in translating a scriptural work, several writers working independently of each other turn out to have higher levels of common lexis than we would expect to find in, say, fiction works which appear in their original language. For this reason, comparisons were taken across the entire sample corpus of 22 chapters: each translator’s version of a particular chapter was measured against every other translator’s version of that chapter. This is described in more detail in the next section.

Methods of plagiarism detection used in the present instance

1. Lexical identity comparisons

Explanation: Lexical identity comparisons measure the number of lexical (or content) words in common between two texts. The present test goes one step further and measures unique lexical words in each text. Unique words are also called hapax legomena and, because they occur only once in a text, the chances of finding a high number of shared hapax legomena in two texts which were produced independently are very low: how low will depend on the genre and the text type, whether the text is a translation, and also the length of the text.

What happens is that the words unique to one text are matched with the unique words found in the test text. The higher the match, the greater the probability that the two texts were not independently produced. This approach, namely the comparison of unique lexical words across source and target text is well attested (see above references).

For a valid comparison to be made the two texts being measured should be of a similar length. It should be borne in mind that texts of the same type and genre will have a higher common lexis (vocabulary) than texts of different genres or types.
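The matching of unique words across two texts can be sketched as below. This is a minimal Python illustration under stated assumptions: it treats every word that occurs once as a hapax legomenon without first filtering out function words, and it divides by the number of unique words in the source text. Both choices, and the function names, are assumptions made for the example rather than a description of the software actually used.

```python
import re
from collections import Counter

def hapax_legomena(text):
    """Return the set of words that occur exactly once in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {word for word, count in counts.items() if count == 1}

def hapax_overlap(source, test):
    """Proportion of the source text's unique words also unique in the test text."""
    src = hapax_legomena(source)
    tst = hapax_legomena(test)
    if not src:
        return 0.0
    return len(src & tst) / len(src)

source = "the king drove the enemy forth"
test = "the king expelled the enemy"
print(sorted(hapax_legomena(source)))  # ['drove', 'enemy', 'forth', 'king']
print(hapax_overlap(source, test))     # 0.5 ('king' and 'enemy' are shared)
```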

It was decided to treat the nine different translations of the Quran as a corpus, and the chapters and sections selected above as a sample of that corpus. The aim was to establish what norms of similarity exist across this sample corpus, on the basis that this could be extrapolated to the entire corpus. As previously stated, given that these are scriptural translations, we would expect relatively high baselines, especially since it seems to be the case that scriptural works tend to have a somewhat narrow lexical focus.

The nine different translations used are as follows:

Khan (Hilali-Khan)

Maulana (the 1951 revision of MM Ali’s 1917 translation)

MM Ali (the 1917 translation)

Pickthal

Rashad

Sarwar

Shakir

Sherali

Yusufali

A comparison of every sample chapter or section across each author‑pair was undertaken. Thus, for example, Sherali was compared with Khan, Maulana, MM Ali, Pickthal, Rashad, Sarwar, Shakir, and Yusufali. The same applied to all of the other translators. In all 22 chapters or chapter sections were thus compared, obtaining over 400 possible pairwise comparisons.

Two measurements were taken. For the first measurement translations from MM Ali and Shakir were excluded. This would establish, for each chapter or section, what the ‘norm’ across the group would be. For the second measurement, only translations from MM Ali and Shakir were included. This would establish the degree of similarity between MM Ali and Shakir and it would be immediately apparent if this were very different from the proportion of similarity for the group.

The null hypothesis is that the 2 proportions are identical. The alternative hypothesis is that the MM Ali-Shakir proportion is higher and therefore it is a one-tailed test. A two proportions Z test was used as both samples are large and the combined p is fairly close to 0.5.

To describe the findings technically, I paraphrase from correspondence and discussions I had with my statistician: the null hypothesis was rejected in all 22 chapters because the Z value was in each case higher, and usually much higher, than the critical value of 1.645 for a one-tailed test at 5% significance. The p value in many instances was below 0.01, and so the null hypothesis would be rejected under much more stringent significance levels than the 5% level adopted for this test.
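For readers who wish to see the shape of the calculation, a one-tailed two proportions Z test can be sketched as follows. This is a minimal Python illustration using the standard pooled-proportion form of the test; it is not the statistician's own worksheet, and the counts in the example are hypothetical, chosen only so that the two proportions come out at 0.92 and 0.37.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """One-tailed two proportions Z test (alternative: p1 > p2).

    x1 of n1 items match in the first sample, x2 of n2 in the second.
    Returns the Z value and the upper-tail p value.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # combined proportion under the null
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z), standard normal
    return z, p_value

# Hypothetical counts: 46 of 50 unique words shared, against 148 of 400.
z, p = two_proportion_z(46, 50, 148, 400)
print(z > 1.645, p < 0.05)  # the null hypothesis is rejected at 5%
```

The critical value of 1.645 corresponds to the one-tailed 5% level; any Z value above it leads to rejection of the null hypothesis that the two proportions are identical.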

In plain language what this means is that there is a significant difference between the MM Ali-Shakir comparison and all the other comparisons across the corpus of nine Quran translations of 22 chapters and chapter sections:

 

Table 3: Results of lexical identity tests of sample chapters and sections

Chapter     MMAli-Shakir     Rest     Prob.

14          .92              .37      .001
17          .99              .39      .0001
19          .91              .32      .0003
22          .96              .27      .00005
27          .93              .3       .00004
28          .85              .36      .001
30          .86              .28      .0009
38          .9               .36      .001
53          .95              .34      .0001
65          .89              .41      .004
66          .93              .38      .0009
77          .94              .36      .007
79          .83              .3       .0006
81          .83              .33      .001
86          .87              .28      .0009
90          .96              .29      .0001
94          .8               .27      .01
96          .93              .32      .0007
100         .9               .33      .004
101         .95              .44      .02
107         .86              .24      .001
108         .67              .23      .04

 

The first column above gives the chapter number. This is followed by the density of identical, unique, lexical words found in Shakir in a given chapter which are also found in MM Ali. The third column gives the mean density of similarity across all the other translations. The final column gives the probability that the degree of similarity could have arisen by chance, i.e. that Shakir could have arrived at this degree of similarity across so many chapters and sections independently.

What do we notice from this table? The degree of similarity between Shakir and MM Ali is so high that it can safely be described as ‘overwhelmingly similar’. On average Shakir uses 89 per cent of the unique lexicon that MM Ali uses in each chapter and section. The average across the other translators is 33 per cent. This is roughly in line with predictions: recall that Clough (2000) was quoted earlier as saying that 40% was normal. We then find that the average of all the probabilities is below 1 per cent, i.e. that p (probability) < 0.01.

The above table is summarised from the Excel document, ‘Unique Lexical Words Test.xls’ which accompanies this report, and which gives the full results of the test. That document gives all the comparisons for each chapter on the left hand side of the worksheet, and then the ‘second proportion’ test (i.e. all of the comparisons excluding MM Ali-Shakir), and below that the comparison for MH Shakir (the ‘first proportion’). These two proportions are then compared with each other with key information displayed in the grey panel, and the significant data highlighted in yellow.
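The summary averages quoted for Table 3 can be re-derived from its three data columns. The short Python sketch below simply transcribes the figures from the table and takes their means; the variable names are my own and the accompanying Excel document remains the authoritative record.

```python
from statistics import mean

# Columns of Table 3, in chapter order (14, 17, 19, ... 107, 108).
ali_shakir = [.92, .99, .91, .96, .93, .85, .86, .9, .95, .89, .93,
              .94, .83, .83, .87, .96, .8, .93, .9, .95, .86, .67]
rest = [.37, .39, .32, .27, .3, .36, .28, .36, .34, .41, .38,
        .36, .3, .33, .28, .29, .27, .32, .33, .44, .24, .23]
prob = [.001, .0001, .0003, .00005, .00004, .001, .0009, .001, .0001,
        .004, .0009, .007, .0006, .001, .0009, .0001, .01, .0007,
        .004, .02, .001, .04]

print(round(mean(ali_shakir), 2))  # 0.89
print(round(mean(rest), 2))        # 0.33
print(mean(prob) < 0.01)           # True
```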

I suggest that the high degree of similarity shown here between Shakir and MM Ali is far beyond co-incidence or chance. Although we expect translations of a scriptural work to contain some common material, it is clear that the Shakir translation must have arisen as a result of plagiarism.

Opinion 1: For reasons given in this section it is my professional opinion that the author known as MH Shakir has extensively plagiarised the translation of the Quran by MM Ali.

2. Word for word plagiarism

Copycatch, as mentioned previously, measures the number of identical lexical words found between two texts. This is the method used above, though I am not familiar with Copycatch’s statistical or other calculation methods. Instead I have used a series of tried and tested methods, following the advice and with the assistance of our statistical department.

A more powerful method than the common unique lexical identity mentioned in the previous section is to search for identical strings of language across two texts. Identical strings of six words are considered to be unlikely to occur independently across two texts, unless consisting of fixed phrases, which are common in all languages. Tests I have previously carried out (see Olsson 2004) show that identical strings greater than 31 letters and spaces (excluding punctuation) are highly unlikely to occur independently.

However, as with the number of lexical words in common, as per the previous test, with scriptural text we must at least anticipate a higher than average occurrence of identical strings. Therefore, as before, we need to establish what the corpus of Quran translation excerpts reveals in terms of what is found across all the translations except MM Ali and Shakir.

As with the lexical identity tests reported in the previous section, the string tests revealed very high degrees of similarity between MM Ali and Shakir and, conversely, much lower degrees of similarity between the rest of the translations.

This is how the string test works: the first six words of a text are taken and searched for in the target text. If a match is found the count is incremented by 1, and the target string is deleted. The software then takes the next six words, searches for them, and increments and deletes, as before, if there is a match. If no match is found the software moves on to the next six words in the text. It is discrete strings that are searched for: the software does not take, for example, words 1-6, 2-7, 3-8, etc., but 1-6, 7-12, 13-18, etc. This means that there may be many matches which are missed: the point is we are taking a sample of the available population of strings, not measuring the entire population.

How similarity is calculated in string tests

In a text of, say, 100 words, if there are 16 identical discrete strings across two texts, then the similarity is calculated as (16 x 6)/100 = 96/100 = 0.96, or 96%. In other words, this is 96% of the possible discrete strings measured from the first word, not of all possible strings, or even of all possible discrete strings. Below, I will describe the statistical tests used to calculate the significance of the findings.
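The discrete string test and the similarity calculation can be sketched together as follows. This is a minimal Python reconstruction from the description given here, not the software actually used: the tokenisation (lower-cased words with punctuation ignored), the use of a placeholder to mark deleted strings, and the function names are all assumptions made for the example.

```python
import re

def words(text):
    """Lower-case word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def discrete_string_matches(source, target, n=6):
    """Count discrete n-word strings of `source` that occur in `target`."""
    src = words(source)
    # Pad with spaces so that only whole words can match.
    remaining = " " + " ".join(words(target)) + " "
    matches = 0
    # Take words 1-6, 7-12, 13-18, ... (never 2-7, 3-8, ...).
    for start in range(0, len(src) - n + 1, n):
        string = " " + " ".join(src[start:start + n]) + " "
        pos = remaining.find(string)
        if pos != -1:
            matches += 1
            # Delete the matched string; '#' keeps its neighbours apart.
            remaining = remaining[:pos] + " # " + remaining[pos + len(string):]
    return matches

def string_density(source, target, n=6):
    """Matched words (matches x n) divided by the length of the source."""
    total = len(words(source))
    if total == 0:
        return 0.0
    return discrete_string_matches(source, target, n) * n / total

a = "And he was unjustly proud in the land, he and his hosts"
b = "And he was unjustly proud in the earth, he and his armies"
print(discrete_string_matches(a, b))  # 1 (only words 1-6 match)
print(string_density(a, b))           # 0.5, i.e. (1 x 6)/12
```

Because only strings starting at words 1, 7, 13 and so on are tested, a single inserted or deleted word early in a text shifts every later string, so the count is a conservative sample of the matches actually present.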

Items likely to reduce the number of identical strings found

It was noticed that for all their similarities the Shakir and MM Ali texts do have some important differences. Shakir always writes names in their Arabic original. Thus, for example, Moses is Musa, Jesus is Isa, Mary is Miriam, and so on. MM Ali, on the other hand, uses the English versions, most of which have arrived in the language through Hebrew and Greek, rather than Arabic. Shakir will also use Arabic religious terms, like – for example – ‘kausur’, rather than their English equivalents. Also, Shakir uses US spellings, whereas MM Ali uses UK spellings. Other differences arise when, for instance, Shakir will differ in his interpretation of an issue, event or doctrine, from that of MM Ali. We also expect to find a lower level of similarity when the chapter being tested is very short. In such instances, we find Shakir will use Arabic terms not found elsewhere in the text. It seems possible he was highly aware that identicality of text is more easily observed when chapters are short. By using Arabic words and terms he is able to reduce, at least superficially, the risk of detection.

The above reasons all contribute to some chapters exhibiting a lower level of similarity than one would expect where plagiarism is literal: however, we must not lose sight of the fact that the plagiarism is by and large literal – but that this is on occasion obscured by the activity of resorting, I believe somewhat cynically, to the above devices.

Statistical tests on the source and plagiarising texts

The aim here is to estimate the genuine proportions, which is to say the proportions found across the rest of the corpus. For this purpose, all of the MM Ali and Shakir excerpts were excluded. For Chapter 14 this gives a total of 42 6‑word strings, comprising 252 words out of a total of 5487 words, yielding what we may term a ‘sample identical string density’ of 252/5487 or 0.045. For this chapter, the MM Ali sample is 281 words in length and Shakir has 37 identical 6 word strings, comprising 222 words in total (almost as much as the entire rest of the sample population for this chapter). This yields the ‘sample identical string density’ of 222/281 = 0.79. The probability of these two works being arrived at independently is then calculated.
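The two densities for Chapter 14 follow directly from the counts just given. The Python lines below merely repeat that arithmetic; the variable names are my own.

```python
STRING_LENGTH = 6  # words per string

# Rest of the corpus for Chapter 14 (MM Ali and Shakir excluded).
rest_strings, rest_words = 42, 5487
rest_density = rest_strings * STRING_LENGTH / rest_words

# MM Ali compared with Shakir for the same chapter.
pair_strings, pair_words = 37, 281
pair_density = pair_strings * STRING_LENGTH / pair_words

print(rest_strings * STRING_LENGTH)  # 252 words
print(pair_strings * STRING_LENGTH)  # 222 words
print(round(rest_density, 4))        # 0.0459, given in the text as 0.045
print(round(pair_density, 2))        # 0.79
```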

The corpus appears to tell us that there is a 0.045 probability of a common string occurring. The probability of obtaining 42 strings over a text of the same length is thus much more remote. SPSS[8] gives it at 0.0000000000000432. The statistics department suggests that this is right on the limits of SPSS precision, but that it is likely that the probability is of the order of 1 x 10^-14, on the assumption that the probability of a common string is 0.045. The full results for this test are given in the document ‘Six String Calcs with macro.xls’, the layout of which is similar to that described for the previous Excel document. A summary of these data is given below:


See Appendix Note on precision of this table.

 
Table 4: Summarising the results of the sample identical string test

Chapter           MMAli-Shakir            The Rest         Probability

14                .79                     .06              0.0000000000000000E+00
17                .97                     .05              0.0000000000000000E+00
19                .77                     .04              0.0000000000000000E+00
22                .94                     .02              0.0000000000000000E+00
27                .88                     .05              0.0000000000000000E+00
28                .72                     .07              0.0000000000000000E+00
30                .9                      .05              0.0000000000000000E+00
38                .86                     .04              0.0000000000000000E+00
53                .88                     .03              0.0000000000000000E+00
65                .83                     .05              0.0000000000000000E+00
66                .9                      .03              0.0000000000000000E+00
77                .91                     .01              0.0000000000000000E+00
79                .77                     .03              0.0000000000000000E+00
81                .75                     .02              0.0000000000000000E+00
86                .74                     .04              0.0000000000000000E+00
90                .94                     .04              0.0000000000000000E+00
94                .71                     .01              0.0000000000000000E+00
96                .9                      .05              0.0000000000000000E+00
100               .77                     .02              0.0000000000000000E+00
101               .79                     .02              0.0000000000000000E+00
107               .89                     .01              0.0000000000000000E+00
108               .44                     .02              0.0000000000000000E+00

 

As can be seen the sample identical string density (the number of identical strings per length of text for MM Ali‑Shakir) is on average almost twenty times the sample identical string density found across the rest of the corpus. This yields an extremely minute probability of the Shakir texts having been produced independently.

Opinion 2: I believe the above demonstrates absolutely overwhelming evidence in favour of extensive, almost total, plagiarism by MH Shakir. It is simply not possible to doubt that MM Ali’s translation was plagiarised by Shakir.

Did Shakir copy from Maulana?

There is some evidence that Shakir copied not only from the 1917 translation, but also from its 1951 revision. Below I give some examples of this copying. It should be noted that I have not looked through all of the sample chapters for this exercise, but only a few:

In Chapter 22 Verse 38 (hereafter, for example, 22:38) MM Ali has ‘Surely Allah will repel from those who believe...’ whereas Shakir has ‘Surely Allah defends those who believe’. This is very close to Maulana’s ‘Surely Allah defends (present tense) those who believe’. The Shakir wording has some similarities with some of the other translations, but it is closer to Maulana than to MM Ali.

In 27:38 Shakir has ‘...which of you can bring to me her throne...’ whereas MM Ali has ‘Which of you can bring to me a throne for her...’. Again the Shakir version is closer to Maulana’s version: ‘Which of you can bring me her throne...’

In 28:39 Shakir’s text is identical to Maulana’s for the entire verse, including punctuation and case. I reproduce the three versions here:

SHAKIR

028:039 And he was unjustly proud in the land, he and his hosts, and they deemed that they would not be brought back to Us.

MAULANA

028:039 And he was unjustly proud in the land, he and his hosts, and they deemed that they would not be brought back to Us.

MM Ali

And he was unjustly proud in the land, he and his hosts, then we cast them into the sea, and see how was the end of the unjust.

 

As can be seen from the last example given above, it is the MM Ali version in 28:39 which stands out as different in this group of three. Moreover, none of the other versions (Khan, Sarawar, Pickthal, etc.) is identical with this version.

A close investigation of the entire text for each author would doubtless yield further results, but I believe this section has shown that there is little doubt that some direct plagiarism has occurred from the Maulana text by Shakir. The last example given above, for example, represents a 24‑word string: elsewhere in this report I have spoken about the statistical significance of 6‑word strings. It is well observed (Olsson 2004) that with every additional word the string becomes less and less likely to be reproducible under independent conditions. By the time we reach the length of a 24‑word string we are stretching credibility far beyond possibility. For a more comprehensive picture of the Maulana‑Shakir progression of borrowings it would be necessary to do a separate study from the present, since the primary task of the present study was to assess the level of plagiarism from the 1917 version. However, I believe such an analysis would paint a very similar picture to that of the present study.
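The length of the shared string in 28:39 can be checked mechanically. The following is a minimal sketch, not the procedure used for this report: the helper names are hypothetical, and it assumes that case and punctuation are ignored when comparing words (a looser test than the character-for-character identity noted above).

```python
import re


def words(text: str) -> list[str]:
    """Lowercase the text and return its word tokens, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def longest_shared_string(a: str, b: str) -> int:
    """Length in words of the longest contiguous word string found in both texts."""
    wa, wb = words(a), words(b)
    best = 0
    prev = [0] * (len(wb) + 1)          # run lengths for the previous word of `a`
    for i in range(1, len(wa) + 1):
        cur = [0] * (len(wb) + 1)
        for j in range(1, len(wb) + 1):
            if wa[i - 1] == wb[j - 1]:
                cur[j] = prev[j - 1] + 1  # extend the run ending at these two words
                best = max(best, cur[j])
        prev = cur
    return best


# The three versions of 28:39 as reproduced in this section.
SHAKIR = ("And he was unjustly proud in the land, he and his hosts, "
          "and they deemed that they would not be brought back to Us.")
MAULANA = ("And he was unjustly proud in the land, he and his hosts, "
           "and they deemed that they would not be brought back to Us.")
MM_ALI = ("And he was unjustly proud in the land, he and his hosts, "
          "then we cast them into the sea, and see how was the end of the unjust.")

print(longest_shared_string(SHAKIR, MAULANA))  # 24
print(longest_shared_string(SHAKIR, MM_ALI))   # 12
```

On these excerpts the Shakir and Maulana texts share the full 24‑word verse, whereas Shakir and the 1917 text share only the opening 12 words.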

Did the other translators copy from MM Ali?

I took the unique lexical word matches from the first five sample chapters (Chapters 14, 17, 19, 22 and 27). As previously noted, Shakir’s borrowings from MM Ali regularly stand at around 90 per cent. Many other matches are also above 50 per cent, but it is not always easy to trace the provenance of these borrowings. For example, with regard to 14:13-21, Khan matches 56 per cent with Pickthal and 54 per cent with Maulana. The Maulana‑Pickthal match is 45 per cent. Do we conclude that Maulana borrowed from Pickthal? It is possible, but we note that the Maulana‑MM Ali match is 69 per cent, while the MM Ali‑Pickthal match is 37 per cent. It therefore seems that MM Ali may have consulted Pickthal’s version when revising his translation in 1951, but Pickthal will already have consulted MM Ali’s earlier translation for his own 1930 publication.

In fact Pickthal and Yusufali, the two translators who were closest to MM Ali in time and who were, as far as I understand, actually acquainted with him, appear to have borrowed least from him. Their matches average not much more than 40 per cent for unique lexical words, which is about the figure suggested by earlier researchers as being ‘normal’ when same‑genre, same‑topic texts are under consideration. If there is a name which seems to recur at above the 50 per cent level, it is that of Khan, who appears to have a close lexical relationship with MM Ali, Pickthal and Yusufali. However, I do not suggest, without further analysis, that this is statistically significant. Certainly, more research would be required to establish the exact nature of the translation history of the Quran with respect to plagiarism. Moreover, translations other than those mentioned here have also appeared in the last 80 years, and these would all need to be taken into account. From what I have seen, however, the greatest debt among all of them seems to be to MM Ali, Pickthal and Yusufali.

However, it is possible that in this context the notion of plagiarism would not be entirely appropriate. Many of the translators were/are not native speakers of English and would have felt bound to consult other editions. Few were/are native speakers of Arabic. Rashad, for example, was one of the few: an Egyptian who then spent many years in America, where he appears to have acquired virtual native‑speaker competence in English.

The extent of the borrowings from MM Ali and between other translators is, as I suggest, not likely – without further research – to prove significant, except, as noted, with regard to MH Shakir[9]. The extent to which MH Shakir has plagiarised from MM Ali and, to a lesser extent from the Maulana version, is both breathtaking and blatant. No other conclusion is possible. It was a deliberate plagiarism, which in parts he has attempted to disguise by the use of Arabic names and terminology. The use of such names gives the text a superficial air of authenticity, but I suggest their use is no more than a heartless and cynical ploy to disguise what was actually going on. The MH Shakir version cannot be called a translation at all: it is no more than a copy of MM Ali’s work.

Conclusion

I simply repeat here my earlier observations, based on the textual and statistical analyses of the similarities between MM Ali and MH Shakir presented in the accompanying documents. It is concluded that MH Shakir plagiarised almost the entire translation from MM Ali (1917) and from the 1951 revision of that translation. I estimate that on average he plagiarised 90 per cent of the text from each chapter, whereas the average amount of common material between the other translators was below 40 per cent, which I believe to be normal for same‑genre, same‑topic works, whether translated or in the language of the original.

If you have Microsoft Excel, you can view the data for this report by clicking on the links below, which will open the original Excel files.

http://www.muslim.org/intro/fli-6string.xls

http://www.muslim.org/intro/fli-unique.xls

 

 

John Olsson BSc, MA, MPhil, Member, International Association of Forensic Linguists

Director, Forensic Linguistics Institute, United Kingdom

Adjunct Professor, Nebraska Wesleyan University

Academic Advisor, Language and Law Program, University of Zagreb

Independent Consultant to Legal Professionals and Law enforcement agencies worldwide


References:

Clough P. 2000. Plagiarism in natural and programming languages: an overview of current tools and technologies, Dept of Computer Science, University of Sheffield, see: http://ir.shef.ac.UK/cloughie/papers/plagiarism2000.pdf

Olsson J. 2004. Forensic Linguistics: An introduction to language, crime and the law. Continuum. London, New York.

Woolls D, PAPERS: authors, affiliations and abstracts, IAFL Conference, 1999, University of Birmingham, 1999, http://www-clg.bham.ac.UK/forensic/IAFL99/conf_abst.html

 


Appendix Note

The probabilities given in Table 4 are of a very minute order, and therefore an approximation has been used. The method is correct, being binomial (i.e. the event either occurs or does not), but most computer programs cannot handle the level of accuracy required. See also the sheet ‘binoprobs’ in the document ‘Six String Calcs with macro.xls’. The point is that the probability values are of a very minute order.
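To illustrate the precision problem described in this note, the sketch below evaluates a binomial upper‑tail probability in log space, so that values far smaller than ordinary floating‑point numbers can represent are reported as a power of ten rather than as zero. It is an assumption‑laden illustration and not the calculation performed in SPSS or in the Excel workbook: the function names are hypothetical, the inputs in the usage lines are arbitrary demonstration values rather than figures from this report, and the code assumes 0 < p < 1.

```python
import math


def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Natural log of P(X = k) for X ~ Binomial(n, p), assuming 0 < p < 1."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log1p(-p)


def log10_binom_upper_tail(k: int, n: int, p: float) -> float:
    """log10 of P(X >= k), summed with the log-sum-exp trick to avoid underflow."""
    logs = [log_binom_pmf(i, n, p) for i in range(k, n + 1)]
    peak = max(logs)
    total = peak + math.log(sum(math.exp(v - peak) for v in logs))
    return total / math.log(10)


# Arbitrary demonstration values (NOT taken from the report).
naive = 0.05 ** 900                               # underflows to 0.0 in double precision
exponent = log10_binom_upper_tail(900, 1000, 0.05)

print(naive)
print(f"P(X >= 900) is roughly 10^{exponent:.0f}")
```

A direct calculation of such a probability returns exactly zero, which is one plausible reason why a table column of very minute probabilities would display as zeros; the log‑space version still returns a finite exponent.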

 



[1] Shakir gives the Arabic names for prophets e.g. Suleiman instead of Solomon, Isa for Jesus, Musa for Moses, whereas MM Ali/Maulana give the English versions of these and other Quranic names. This does sometimes make the Q text (i.e. the questioned text) appear to be less verbatim (of E/M) than it is.

[2] In this report I will refer to the translations as follows: the 1917 translation by MM Ali will be termed the Earlier text (abbreviated ‘E’); the 1951 revision will be referred to, as it is commonly known, as the Maulana translation (abbreviated ‘M’); and the Shakir translation will be referred to as the Q (i.e. the Questioned) text.

[3] I am grateful to various web sites for pointing this information out.

[4] Given here in ascending numerical sequence, not in the sequence in which they were generated.

[5] I should stress that I am not proposing that Mr MM Ali’s idea regarding Jesus’ death was ‘mistaken’ or ‘heretical’ in any way. I am not passing any opinion regarding doctrinal views. From my limited research on this subject, it appears that many leading Islamic authorities throughout history have also held this view. See www.muslim.org/bookspdf/deathj.pdf

[6] We note that these earlier versions, at least with respect to Chapter 14, Verse 13, do not always differ from each other. Rodwell’s translation, for example, seems to have several similarities to Sale’s work.

[7] Interestingly of the eight other translations which I will be comparing with that of MM Ali, seven also use either ‘disbelieve’ in Chapter 14, verse 13, or ‘disbelievers’. Only Yusufali differs by using ‘Unbelievers’.

[8] A well‑known statistical package.

[9] Although borrowings in terms of lexical words may not be significant, it appears that later translators may have benefited from MM Ali's understanding and interpretation of the Quranic verses. Pickthal's translation, particularly, has been viewed by some as a mere "revision" of MM Ali's work because of his apparent following of MM Ali's understanding of Islamic principles. See for example the Rev Samuel Zwemer’s references to this on the Internet. Even so, this issue requires further linguistic research to establish as full a picture as possible and should not be pre‑judged.