Friday, 28 September 2018

Vigenere Cipher - code - C++

This is a follow-up to my article on the Vigenere Cipher. Here, I have given the C++ code for encrypting / decrypting a text file using the Vigenere Cipher. The user has to select whether to encrypt or decrypt, and they must enter the key. The whole text is then encrypted or decrypted using the key, repeated until the length of the text is reached.

I hope the comments explain things sufficiently in the code, but I will provide an additional explanation after the code. The code is given below:

/* 
 * File:   main.cpp
 * Author: 88615XC
 *
 * Created on September 28, 2018, 3:34 PM
 */

#include <cstdlib>
#include <iostream>
#include <fstream>
#include <cstring>
#include <string>                       //For std::string
#include <cctype>                       //For isalnum

#define CODE_ARR_SIZE 63                //Size of the code array

std::string line;                       //For reading data from file
std::string outLine;                    //For coded data
std::string inCmd;                      //For reading data input from console
std::string key;                        //To store the key
int key_marker = 0;                     //marks current position along the key
char cipher[CODE_ARR_SIZE]={'0','1','2','3','4','5','6','7','8','9','A','a',
'B','b','C','c','D','d','E','e','F','f','G','g','H','h','I','i','J','j','K','k',
'L','l','M','m','N','n','O','o','P','p','Q','q','R','r','S','s','T','t','U','u',
'V','v','W','w','X','x','Y','y','Z','z',' '};


using namespace std;

/*
 * Returns the array index corresponding to the character
 */
int getNumber(char char1){
    for (int i = 0; i<CODE_ARR_SIZE; i++){
        if (char1==cipher[i]){
            return i;
        }
    }
    return -1;
}
/*
 * Returns the character corresponding to the array index
 */
char getChar(int num){
    if ((num>=0)&&(num<CODE_ARR_SIZE)){
        return cipher[num];
    }
    return '|';
}
/*
 * codes a line from the input line with given key.
 */
void encryptLine(void){
    int tmp1, tmp2;                     //Temporary variables
    char tmp3;
    outLine.clear();                    //Clear output string
    //Loop through the line one character at a time
    for (int i = 0; i<line.length(); i++){
        tmp1 = getNumber(line[i]);        //get numerical equivalent
        if (tmp1 != -1){                //If character is in code array
            //Get numerical equivalent for key. No validity checking needed,
            //As non-alphanumeric characters are already removed.
            tmp2 = getNumber(key[key_marker]);
            key_marker++;                //Update key marker
            //If key marker overflows, set it back to 0
            if (key_marker >= key.length()) key_marker=0;
            tmp1 = tmp1 + tmp2;         //Add the two values
            tmp1 = tmp1%CODE_ARR_SIZE;  //Get modulus
            tmp3 = getChar(tmp1);       //Get corresponding character
            outLine.push_back(tmp3);    //Add to output string
        }
        else{
            //If a character is not in the code array (periods, commas, etc.),
            //it is added to the output string as it is. If it is necessary to
            //remove such characters, it can be done as with the key in main.
            outLine.push_back(line[i]);
        }
    }
}
/*
 * decodes the input line using the given key
 */
void decryptLine(void){
    int tmp1, tmp2;                     //Temporary variables
    char tmp3;
    outLine.clear();                    //Clear output line
    for (int i = 0; i<line.length(); i++){
        tmp1 = getNumber(line[i]);        //get numerical equivalent
        if (tmp1 != -1){                //If character is in code array
            //Get numerical equivalent for key. No validity checking needed,
            //As non-alphanumeric characters are already removed.
            tmp2 = getNumber(key[key_marker]);
            key_marker++;               //Update key marker
            //If key marker overflows, set it back to 0
            if (key_marker >= key.length()) key_marker=0;
            tmp1 = tmp1 - tmp2;         //Subtract key value from character
            tmp1 = tmp1%CODE_ARR_SIZE;  //Get modulus
            //If the value is negative, add CODE_ARR_SIZE to it
            if (tmp1<0) tmp1=tmp1+CODE_ARR_SIZE;
            tmp3 = getChar(tmp1);       //Get corresponding character 
            outLine.push_back(tmp3);    //Add to output string
        }
        else{
            //If a character is not in the code array (periods, commas, etc.),
            //it is added to the output string as it is. If it is necessary to
            //remove such characters, it can be done as with the key in main.
            outLine.push_back(line[i]);
        }
    }
}
/*
 * 
 */
int main(int argc, char** argv) {
    ifstream input1;
    ofstream output1;
    
    //Open the input and output files
    input1.open("input1.txt",ios::in);
    output1.open("output1.txt",ios::out);
    
    //The user chooses whether the input file is coded or decoded.
    cout<<"Enter E to encrypt, D to decrypt, Q to quit:"<<endl;
    cin>>inCmd;
    //If a quit command is received, the program exits.
    if ((inCmd[0]=='Q')||(inCmd[0]=='q')){
        cout<<"Quit command received"<<endl;
        return EXIT_SUCCESS;
    }
    else if ((inCmd[0]=='E')||(inCmd[0]=='e')){
        cout<<"encrypt mode"<<endl;
    }
    else if ((inCmd[0]=='D')||(inCmd[0]=='d')){
        cout<<"decrypt mode"<<endl;
    }
    //If the command is not valid, the program exits.
    else{
        cout<<"Command not understood"<<endl;
        return EXIT_FAILURE;
    }
    
    //The user enters the key. No spaces allowed. characters that are not in the
    //specified range will be ignored.
    cout<<"Enter Key. Characters 0-9, A-Z, a-z:"<<endl;
    cin>>key;
    if (key.length()==0){               //Impossible but just in case
        cout<<"key length 0. Exiting."<<endl;
        return EXIT_FAILURE;
    }
    else{
        //Check every character to see whether it is alphanumeric. If any 
        //non-alphanumeric characters are detected, remove them.
        int j = 0;
        for (int i = 0; i<key.length(); i++){
            if (isalnum(key[i])){ 
                j++;
            }
            else{
                for (int k=i+1;k<key.length();k++){
                    key[k-1] = key[k];
                }
                key.erase(key.length()-1);
                i--;
            }
        }
        if (j==0){                  
            //If no alphanumeric characters are detected, key is invalid.
            cout<<"key not alphanumeric. Exiting."<<endl;
            return EXIT_FAILURE;
        }
        else{
            cout<<"Key :"<<key<<", Key length :"<<key.length()<<endl;
        }
    }
    
    //Loop until no more lines can be read. Checking eof() before reading
    //can process one extra, empty line at the end of the file.
    while (getline(input1,line)){
        if ((inCmd[0]=='E')||(inCmd[0]=='e')){
            encryptLine();                 //If code mode, encode input file
        }
        else if ((inCmd[0]=='D')||(inCmd[0]=='d')){
            decryptLine();               //If decode mode, decode input file
        }
        output1<<outLine<<endl;         //Write result to output file
    }
    
    input1.close();
    output1.close();

    return 0;
}

Results:
For the input file, the following text was pasted into input1.txt:

This is a test.
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
01234567891011121314151617181920

The purpose was to test the entire range of possible characters. The console output on running this file through it on encrypt mode:

Enter E to encrypt, D to decrypt, Q to quit:
encrypt mode
Enter Key. Characters 0-9, A-Z, a-z:
Key :Testing, Key length :7


Please note that my IDE does not show the characters that I enter.

The key used in this case is 'Testing'. If I enter something like '..T.e.s.t..ing.', the periods would be removed, and the key would still be 'Testing'. The encrypted result in output1.txt is:

lRalI17sKSmSf9.
Ykz2rxr9rBDy8ygyIKAGAnAPRH
TN0N37U UCUdf1b1J1kmcicQcr
tJoIVHvXmSHTFTUjOItgTWJqHwFwUNog

To decode this, rename the file to input1.txt, and put it through the program in decrypt mode using the same key. Console output:

Enter E to encrypt, D to decrypt, Q to quit:
decrypt mode
Enter Key. Characters 0-9, A-Z, a-z:
Key :Testing, Key length :7


The decrypted result in output1.txt should be the original text again, because decryption subtracts exactly the key values that encryption added.

So, for further explanation -

  • includes: iostream is needed for console input and output. fstream is needed for reading from and writing to files. cstring was included for the string operations, although std::string itself lives in the string header (many compilers pull it in through iostream, which is why this compiles).
  • The code array size is defined as a constant because it's needed to initialize the cipher array. You can change it if you want to expand or reduce the cipher array (and include punctuation marks or other symbols, if you want to).
  • variables: line is for reading the input data, and outLine is for storing the encrypted or decrypted data. key is the key. inCmd could be a character if necessary, but I went with a string here. key_marker is used to track which character of the key is used at the moment. It does not reset when a line is done, so if a line ends in the middle of the key, the next line continues where the previous one left off. The cipher array includes all characters that will be included in the cipher. Here it lists the numbers 0-9, all letters, both uppercase and lowercase, and the space character.
  • getNumber and getChar are both used to look things up in the cipher array. getNumber checks whether the char is in the array, and returns the array index if it is and -1 if it isn't. getChar reverses the process and returns '|' if the number doesn't correspond to an array index.
  • encryptLine and decryptLine encrypt/decrypt a line at a time. They go through the line one character at a time. encryptLine adds the key value, and decryptLine subtracts it. Both use the modulus operator (%) to put the value back in the 0 to CODE_ARR_SIZE-1 range. The decrypt function also needs to correct for possible negative numbers. The result is then appended to outLine. This is repeated for every character in the line.
  • The main program defines the input and output files, and then opens them. Then the user is prompted to select whether to run the encryption or the decryption. If the user selected encryption or decryption, they are then prompted to enter the key. If the key contains at least one alphanumeric character, it will be used. However, using a single character is not recommended, because then it will be the same as a Caesar cipher, and will be relatively easy to crack.
  • Once the key is obtained, the program reads the input file one line at a time, puts each line through the encryption or decryption function (as the user selected before), and writes the result to the output file.
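The add / subtract / wrap step described above can be isolated in a small sketch. This is not the program itself, only an illustration under a few assumptions: the names kAlphabet, shiftChar and vigenere are made up for this example, and the alphabet is written as a string in the same order as the cipher array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same 63 symbols as the cipher array above, in the same order:
// digits, then interleaved upper/lower case letters, then the space.
// (kAlphabet, shiftChar and vigenere are hypothetical names for this sketch.)
const std::string kAlphabet =
    "0123456789AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz ";

// Shifts one character by the key character. direction is +1 to encrypt
// and -1 to decrypt. Characters outside the alphabet are returned unchanged.
char shiftChar(char c, char keyChar, int direction) {
    const int size = static_cast<int>(kAlphabet.size());
    const std::size_t ci = kAlphabet.find(c);
    const std::size_t ki = kAlphabet.find(keyChar);
    if (ci == std::string::npos || ki == std::string::npos) return c;
    int shifted = (static_cast<int>(ci) + direction * static_cast<int>(ki)) % size;
    if (shifted < 0) shifted += size;   // % can give a negative remainder in C++
    return kAlphabet[shifted];
}

// Applies shiftChar to a whole string. The key position only advances when
// a character was actually shifted, as in encryptLine and decryptLine.
std::string vigenere(const std::string& text, const std::string& key, int direction) {
    assert(!key.empty());
    std::string out;
    std::size_t keyPos = 0;
    for (char c : text) {
        if (kAlphabet.find(c) == std::string::npos) {
            out.push_back(c);           // punctuation passes through untouched
            continue;
        }
        out.push_back(shiftChar(c, key[keyPos], direction));
        keyPos = (keyPos + 1) % key.size();
    }
    return out;
}
```

With the key 'Testing', vigenere("This is a test.", "Testing", +1) gives "lRalI17sKSmSf9.", the first line of the encrypted file shown in the results, and calling it again on that output with -1 returns the original sentence.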
I hope this helps. See you next time!






Monday, 17 September 2018

Dates and times

aka Calculating time for a novel


A story needs a timeline, especially if your characters are students/are working, and if the story covers a significant length of time (more than two weeks, at least given my writing style). You need to track the passage of time, you need to know what day of the week it is, and you have to track when the holidays are. It's a rather difficult problem if you actually count the days one by one.
This is one of the problems that I really enjoyed solving while writing. First, I love maths. Secondly, the maths is not complicated - you can always use a spreadsheet or something to do the calculations for you unless you are a masochist (if you are, go ahead and do it manually, but consider yourself warned).
So, first things first - you need a reference date. You need to know the date and the day of the week on that day, and if you're tracking the sun/moon/etc to track festivals (as I am doing), you will need to set these down as well. Basically, you have to start at an arbitrary point.
First up, the length of the year. My current biggest project does not take place on earth, so I had a little flexibility with this, so I went with approximately 372.058 days. If you're based on earth, the length of the year will be 365.25 days. Next up, you have to decide the nominal length of a year. I would recommend rounding it down to the nearest integer - for example, I went with 372 days. Again, if you're based on earth, that's 365 days.
From this, you can calculate the frequency of leap years. For earth, that will be every four years. Yes, that means that the calendar gradually gets out of sync with our actual position relative to the sun until it's effectively reset every leap year on February 29th. To calculate the frequency, it's 1 divided by the difference between the length of the actual year and the nominal year.
For example, for earth: 1 ÷ (365.25-365) = 1 ÷ 0.25 = 4.
For a planet with a 234.345 day long year, it will be 1 ÷ (234.345-234) = 1 ÷ 0.345 = 2.89855.
This translates to a leap year every 3 years (round it up to the nearest integer), but that will not get rid of the offset in this case. To deal with this, you simply have to repeat the process until the remainder is small enough / time periods involved are too long for it to matter. To continue with the previous example, that would give:
1 ÷ (234.345 - (234 + (1/3))) = 85.71429 -> another leap year every 86 years. Note that these two cycles will be independent of each other, so in the years that are a multiple of both 3 and 86 from the first leap year (every 258 years), there will be a double leap year - i.e., you have to add two days to the calendar in those years.
Continuing with this process, we get 1 ÷ (234.345 - (234 + (1/3) + (1/86))) = 25800. At this point, we can safely ignore the rest of the remainder, unless your story spans millions of years.
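If you would rather not repeat this by hand, the process can be sketched in a few lines of C++. This is only one way to do it, under my own assumptions: the function name leapCycles is made up, and the fractional part of the year is passed in as an exact fraction (345/1000 for the 234.345 day year) so that rounding errors in decimal arithmetic don't push a result like 25800 up to 25801.

```cpp
#include <cassert>
#include <vector>

// Returns the leap-year cycle lengths for a year whose fractional part is
// numerator/denominator days (e.g. 345/1000 for a 234.345 day year).
// Each step takes 1 / remainder, rounds it up, and then subtracts 1/cycle
// from the remainder. (leapCycles is a hypothetical name for this sketch.)
std::vector<long long> leapCycles(long long numerator, long long denominator,
                                  int maxSteps) {
    assert(numerator >= 0 && denominator > 0 && numerator < denominator);
    std::vector<long long> cycles;
    for (int step = 0; step < maxSteps && numerator > 0; ++step) {
        // Integer form of "round 1 / remainder up to the nearest integer".
        long long cycle = (denominator + numerator - 1) / numerator;
        cycles.push_back(cycle);
        // remainder - 1/cycle, kept as an exact fraction.
        numerator = numerator * cycle - denominator;
        denominator = denominator * cycle;
    }
    return cycles;
}
```

leapCycles(345, 1000, 5) returns 3, 86 and 25800, matching the numbers above, and leapCycles(25, 100, 5) returns just 4 for earth. The denominator grows quickly, so maxSteps should stay small to avoid overflow.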
Isn't the earth pretty neat?
Continuing on, after deciding how long the years will be, you have to decide how the year is divided. How many months are there? How long are the months? Do you have a wacky calendar like the Gregorian calendar? (Seriously, I don't care about the ego of Roman emperors, but having a 28 day month smack in the middle of the year while many other months have 31 days is simply insane. Also, there is simply no rhyme or reason as to how the number of days in a month is decided. It used to drive me insane as a child, and it still does).
Other things you have to decide before proceeding include the resolution of your calendar (seconds, or even shorter? It may be important if you're going to include festivals based on celestial events. Otherwise, days will work quite well), and the number of days in a week if you're going with the year-month-week-day system we use. If not, you can make anything up. If you're based on the earth, you can simply use the existing calendar.
That being done, you have to come up with a program, essentially, to output the day of the week once you input the date. This can be quite complicated, but the calculation can be used if you're going to include the movement of celestial objects, so bear with me.
The most important step, and the first thing you need to do, is to calculate the time elapsed since the reference date. I will use the earth for calculations, because it is both insanely complicated (thanks to humans, the universe made the starting point a lot easier) and familiar to everyone, so there is less information to absorb.
Say the reference point is the first of January, 2010. My computer says it's a Friday. So, armed with that information, we can proceed. Say we want to figure out what day it is on the 17th of September, 2018. We will use days for calculation. So, here are the steps:
Elapsed years = 2018-2010 = 8 (the full years 2010 to 2017. Remember, 2018 is not over yet, so it is not counted here).
No of leap years = 2 (2012 and 2016 are the leap years between 2010 and 2017. If the reference year is itself a leap year, this is (elapsed years - 1)/4, rounded down, plus 1. I would recommend starting with a leap year, otherwise this calculation will get complicated.)
No of days elapsed in the past years = 8*365 + 2 = 2922.
Elapsed months: January to August (The current month, September, isn't included). You have to specify this because the lengths of the months vary, so you have to add each of them individually. You have to account for February's changing length as well. Since 2018 is not a leap year, this gives 31+28+31+30+31+30+31+31=243 days.
Then you add the remaining no of days to the 17th - which is 16 days.
The total is 2922+243+16 = 3181 days.
At least the weeks are consistent here on earth, which makes the next part a lot easier. You simply divide the number of days you obtained by 7 and get the remainder, and map it to the days of the week. The reference day will be 0, so you get
0 = Friday
1 = Saturday
2 = Sunday
3 = Monday
4 = Tuesday
5 = Wednesday
6 = Thursday.
So, you get 3181%7 = 3 ->Monday.
You can try this with any reference date. If you're working backwards from the reference date, there are some changes - the leap year calculation changes, and you have to start counting months and days from the other end, but once you account for those changes, the algorithm, in its basic form, still works.
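If you prefer code to a spreadsheet, the forward calculation can be sketched as below. This is a minimal sketch for the earth calendar only, assuming the same reference point (1 January 2010, a Friday) and dates on or after it. The function names are my own, and it counts leap years with the full Gregorian rule instead of a division shortcut.

```cpp
#include <cassert>
#include <string>

// Gregorian leap-year rule.
bool isLeapYear(int year) {
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

// Length of a month (1 = January), accounting for February in leap years.
int daysInMonth(int year, int month) {
    static const int lengths[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    assert(month >= 1 && month <= 12);
    return (month == 2 && isLeapYear(year)) ? 29 : lengths[month - 1];
}

// Days elapsed from 1 January referenceYear to the given date.
// The date must not be earlier than the reference.
int daysSinceReference(int referenceYear, int year, int month, int day) {
    assert(year >= referenceYear);
    int days = 0;
    for (int y = referenceYear; y < year; ++y) days += isLeapYear(y) ? 366 : 365;
    for (int m = 1; m < month; ++m) days += daysInMonth(year, m);
    return days + (day - 1);            // days already passed in the current month
}

// Day of the week, given that 1 January 2010 (remainder 0) was a Friday.
std::string dayOfWeek(int year, int month, int day) {
    static const char* names[7] = {"Friday", "Saturday", "Sunday", "Monday",
                                   "Tuesday", "Wednesday", "Thursday"};
    return names[daysSinceReference(2010, year, month, day) % 7];
}
```

dayOfWeek(2018, 9, 17) returns "Monday". For a made-up calendar, the same structure should work once isLeapYear, the month lengths and the list of day names are swapped for your own.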
As I said, you can use a spreadsheet for this calculation - once you've set it up, basically, you just have to enter the date and you'll get all the information you need. For my work, I calculate everything to the second, because I track the movements of some celestial objects as well. The basis of this calculation is also the time stamp mentioned here. I will get to that calculation in another article.
Hope this helps.
Until next time!

Friday, 14 September 2018

Caesar cipher - a simple program

I did some previous articles on ciphers, the first one being simple substitution ciphers. There, I talked about the Caesar cipher, where a simple shift is used for encoding. The code given below is for a simple program for the purpose.

The code is in C++. First, the program reads the input file (input1.txt), codes it, and outputs the coded version to output1.txt. Then, it reads the output1.txt file, decodes it, and outputs the result to output2.txt. Decoding is done by using -1*original shift as the shift.
The explanation follows the code.


/* 
 * File:   main.cpp
 * Author: 88415XC
 *
 * Created on September 14, 2018, 3:52 PM
 */

#include <cstdlib>
#include <iostream>
#include <fstream>
#include <cstring>

using namespace std;

char line[250]={'\0'};                  //For input data
char line_c[250]={'\0'};                //For coded/decoded data
//array storing all simple and capital letters. this program is not case sensitive.
char code[2][26]={{'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'},
{'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'}};
//the 'value' assigned to the corresponding character in the code array.
int vals[26]={0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25};

/*
 * get the value corresponding to the char input
 */
int get_val(char character){
    int val = -1;
    for (int i=0; i<26;i++){
        if ((character==code[0][i])||(character==code[1][i])){
            val = vals[i];
            break;
        }
    }
    return val;
}

/*
 * gets the character corresponding to the value input
 */
char get_char(int val){
    char char_1 = '\0';
    for (int i=0; i<26;i++){
        if (val==vals[i]){
            char_1 = code[0][i];
            break;
        }
    }
    return char_1;
}

/*
 * Encodes/decodes the input data. 
 * All non-letter characters are ignored. All letters are shifted by the 
 * specified shift.
 */
void encode_caesar(int shift){
    int temp=0;
    char temp_1 = '\0';
    for (int i = 0; i<250; i++){
        if (line[i]=='\0') break;
        temp=get_val(line[i]);
        if (temp!=-1){
            temp = (temp+shift)%26;
            if (temp<0) temp=temp+26;
            temp_1=get_char(temp);
            line_c[i]=temp_1;
        }
        else{
            line_c[i]=line[i];
        }
    }
}

/*
 * 
 */
int main(int argc, char** argv) {
    ifstream input1;
    fstream output1;
    ofstream output2;
    
    input1.open("input1.txt",ios::in);
    output1.open("output1.txt",ios::out);
    
    //code input1 and put to output 1
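    while(true){
        memset(line,'\0',250);
        memset(line_c,'\0',250);
        //Stop once nothing more can be read. Checking eof() before reading
        //can run the loop one extra time on an empty word.
        if (!(input1>>line)) break;
        encode_caesar(8);
        output1<<line_c<<endl;
    }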
    input1.close();
    output1.close();
    
    output1.open("output1.txt",ios::in);
    output2.open("output2.txt",ios::out);
    
    //decode output1 and put to output 2
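    while(true){
        memset(line,'\0',250);
        memset(line_c,'\0',250);
        //Stop once nothing more can be read. Checking eof() before reading
        //can run the loop one extra time on an empty word.
        if (!(output1>>line)) break;
        encode_caesar(-8);
        output2<<line_c<<endl;
    }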
    output1.close();
    output2.close();
    

    return 0;
}

First, the includes:
iostream: mainly for debugging.
fstream: for reading and writing to files.
cstring: because memset doesn't work without it (explanation will follow later).

Global variables: I hope the comments explain things sufficiently. The reason for using the 'vals' array instead of using the array index of code is to allow more flexibility (as in, you could technically mix up the order as long as it uses each value from 0-25 once, and make it a little more difficult to decode than the standard Caesar cipher).

get_val and get_char: These two functions are used to look up the values necessary in the arrays. get_val looks up the number corresponding to the character it is given, and get_char does the opposite. Note that the program is not case sensitive, so get_char will return an upper case letter every time. In case of an error, get_val returns -1 and get_char returns the null character (\0).

encode_caesar: This function handles one line at a time. It goes through the line one character at a time. Each character is converted to the corresponding value. If the value is -1, the same character is written to the coded string (line_c). If not, the shift (positive or negative) is added, and the modulus is taken. Here, you have to add the divisor (26) if the value is negative, because in C++ the '%' operator is actually the remainder from division, which is negative when the value being divided is negative. The resulting value is then converted back to a character and written to the coded string. As you probably noticed, numbers, punctuation marks, etc. are not changed and are written back as they are.
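To see why the correction for negative values is needed, here is a small sketch of the wrap-around step on its own. The names wrap and shiftLetter are made up for this example, and it uses isalpha/toupper in place of the lookup arrays, so it is an illustration of the idea and not a drop-in replacement for the functions above.

```cpp
#include <cassert>
#include <cctype>

// Wraps value into the range 0..modulus-1, even when value is negative.
int wrap(int value, int modulus) {
    assert(modulus > 0);
    int r = value % modulus;        // e.g. -8 % 26 is -8 in C++, not 18
    return (r < 0) ? r + modulus : r;
}

// Shifts a letter and returns it in upper case, like get_char above does.
// Non-letters are returned unchanged.
char shiftLetter(char c, int shift) {
    unsigned char uc = static_cast<unsigned char>(c);
    if (!std::isalpha(uc)) return c;
    int value = std::toupper(uc) - 'A';
    return static_cast<char>('A' + wrap(value + shift, 26));
}
```

For example, shiftLetter('a', 8) gives 'I', and shiftLetter('I', -8) gives back 'A'. Without the correction in wrap, decoding 'A' with a shift of -8 would produce a value of -8, which is not a valid letter index.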

Moving on to the main program, its basic functionality is as described in the introduction. Going step by step, first, the file handles are defined. I really need only 2 here, but I used three for the three files I was using. ifstream is for input, fstream is for both input and output, and ofstream is for output.
Then input1.txt and output1.txt are opened, the former in input mode and the latter in output mode. The following while loop reads input1.txt piece by piece until the end of the file. I should probably mention that the >> operator treats any whitespace as the end of the input, so each word is read as if it were a separate line, and each word ends up on its own line in the output. memset is used to clear line and line_c before reading the input, to clear anything left in them from the previous line. Once the line is read, encode_caesar is called with the required shift (8 in this case, but you can, technically, use any value you like). Then the coded line is written to output1.txt.
The next part functions almost identically, with output1.txt as the input and output2.txt as the output. The only difference is, as this decodes the text, the previous shift multiplied by -1 is used with encode_caesar (-8 in this case).
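The difference between reading with >> and reading whole lines can be seen in a short sketch. It uses a string stream in place of a file so that it is self-contained, and the function names readWords and readLines are my own for this example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits text the way "stream >> word" does: on any whitespace.
std::vector<std::string> readWords(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) words.push_back(word);
    return words;
}

// Splits text with std::getline, which keeps the spaces inside each line.
std::vector<std::string> readLines(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) lines.push_back(line);
    return lines;
}
```

For the text "This is a test.\nSecond line\n", readWords returns six separate words, while readLines returns the two original lines with their spaces intact. If keeping the layout of the input file matters, reading with getline into a std::string would be one option.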

I hope this is helpful. Until next time!



How to write a character who is smarter than you

We all have that one character (or few) who is significantly smarter than the writer. So, as a writer, how do you write such a character con...