Index of Coincidence
First test on encoded text
The index of coincidence (IC or IoC) is an indicator used in cryptanalysis that makes it possible to evaluate the overall distribution of letters in an encrypted message for a given alphabet. It estimates the probability that two letters picked at random from the text are the same.
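It can easily be calculated using the following formula:

$$ IC = \frac{\sum_{i} n_i (n_i - 1)}{N (N - 1)} $$

with n_i the number of occurrences of the letter i in the text and N the total number of letters. The sum runs over every letter i of the alphabet.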
It can be computed for any encoded text to get a rough idea of the algorithm used to encode it.
Most classical ciphers can be classified into the following two main categories:
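In a monoalphabetic substitution cipher, each letter is either shifted by a fixed number of positions or replaced by one specific letter. These ciphers are generally easy to decode.
The IoC value of a monoalphabetic substitution is the same as that of the plain text, about 0.07 for English (more precisely, close to 0.067), because substitution only relabels the letters and leaves their counts unchanged.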
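In a polyalphabetic cipher, a character or group of characters is replaced using several substitution alphabets, so the same plaintext letter can map to different ciphertext letters. These ciphers are relatively difficult to decode.
The IoC value of a polyalphabetic cipher is smaller, around 0.03-0.04, which is close to the 1/26 ≈ 0.038 expected for uniformly random letters.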
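The contrast between the two categories can be sketched in Python. This is a minimal illustration, not a reference implementation: the helper names (`ioc`, `caesar`, `vigenere`), the shift of 3 and the key `KEY` are all assumptions chosen for the example, and the Vigenère cipher stands in here as a familiar polyalphabetic cipher.

```python
from collections import Counter

A = ord("A")


def ioc(text: str) -> float:
    """Index of coincidence over the letters A-Z (case is ignored)."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    n = len(letters)
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def caesar(plain: str, shift: int) -> str:
    """Monoalphabetic example: every letter is shifted by the same amount."""
    return "".join(chr((ord(ch) - A + shift) % 26 + A) for ch in plain)


def vigenere(plain: str, key: str) -> str:
    """Polyalphabetic example: the shift depends on the position in the key."""
    return "".join(
        chr((ord(ch) - A + ord(key[i % len(key)]) - A) % 26 + A)
        for i, ch in enumerate(plain)
    )


plain = "THISISANAFFINECIPHER"  # uppercase A-Z only, no spaces
print("plain   ", round(ioc(plain), 3))
print("caesar  ", round(ioc(caesar(plain, 3)), 3))
print("vigenere", round(ioc(vigenere(plain, "KEY")), 3))
```

On this sample the Caesar output keeps the plaintext value (24/380 ≈ 0.063), while the Vigenère output drops to 2/380 ≈ 0.005. A 20-letter text is far too short for stable estimates, so the direction of the change matters more than the exact figures.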
We can also calculate the IoC using the following Python code:
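A minimal sketch of such a calculation is given here. It assumes that only the letters A-Z are counted and that case is ignored; the function name `index_of_coincidence` is illustrative.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Return sum(n_i * (n_i - 1)) / (N * (N - 1)) over the letters A-Z."""
    # Keep letters only; spaces, digits and punctuation are ignored.
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    n = len(letters)
    if n < 2:
        raise ValueError("at least two letters are needed")
    counts = Counter(letters)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


if __name__ == "__main__":
    print(round(index_of_coincidence("Gwzd zd bo bqqzon hzuwna"), 3))  # 0.063
```

Values near 0.07 point towards a monoalphabetic substitution and values near 0.03-0.04 towards a polyalphabetic cipher, though short texts give noisy results.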
Let us take an example for a better understanding. Consider the ciphertext "Gwzd zd bo bqqzon hzuwna". Using the above Python code, we get an index of coincidence of 0.063, which is closer to 0.07 than to 0.03-0.04, so this is most likely a monoalphabetic substitution cipher. Further analysis, which will be covered in another section, gives the decoded message "This is an affine cipher".
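The arithmetic behind that value can be checked by hand. The ciphertext has N = 20 letters: z occurs 4 times, each of w, d, b, o, q and n occurs twice, and g, h, u and a occur once. Assuming the usual definition of the index of coincidence as the sum of n_i (n_i - 1) divided by N (N - 1), only the repeated letters contribute:

$$ IC = \frac{4 \cdot 3 + 6 \cdot (2 \cdot 1)}{20 \cdot 19} = \frac{24}{380} \approx 0.063 $$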
To study this topic further, you can refer to the following website
The index of coincidence is the first tool we can use to shorten the list of possible encryption algorithms, but it is not the only method, so keep on exploring ...