Search This Blog

Simplicity is the ultimate sophistication.” — Leonardo da Vinci
Contact me: sreramk360@gmail.com

Saturday, 14 February 2015

Learn files in C: tutorial 2

Learn files programming in C: practical tutorial 2 (Example programs in C: using fscanf and fprintf. Try practicing these problems)

"to follow this tutorial you must be familiar with the topic discussed in tutorial 1 in this series"
  
 Before we go on to the programming part, lets discuses about elegant ways to write a C program involving files. Files are used only for long term storage and the hard-disk cannot be used as a RAM memory. Using the hard-disk like your computers RAM memory slows down the run-time of your program to a great extent. This is because, communication between the processor and the RAM memory is several times faster than the communication between the processor and the hard-disk. The RAM memory sores its data magnetically using capacitors, which have a very short life; this demands it to be refreshed several times in just one second. But hard-disk has a disk writer head which physically imparts data on the hard-disk. It changes the magnetic field direction of the ferromagnetic substance to either be read as the digit one or zero by the reader. Storage and recollection of data in RAM happens electronically, so hence high speed memory ICs can be used to increase its speed; but disk read-write process happens physically, ans this causes the heavy time delay. 

So if you are attempting to do huge memory involving processing like image editing or signal processing, you usually take the entire file's memory into the RAM memory before processing the data. Reading parts of data from the hard-disk and modifying it and writing it back to the hard-disk immediately causes the hard disk to be used extensively and this will result in your program running very slowly compared to the algorithm which gets the entire data into the RAM memory before doing the modification.

To prevent issues such as these, C programming language has a built in feature that lets it read data in blocks and write data in blocks. The size of each block depends upon the hardware in use. As you saw in the previous tutorial, I had used the fscanf function and the fprintf function within a while loop which causes it to be called more than once. The smart developers of C language had built it in such a way that the second time fscanf is called, it doesn't read the hard-disk for a second time immediately, but instead refers to the buffer memory. Lets say that the compiler reads the statement 'fscanf(f, "%s", &text);' for the second time. Also let the size of the C string variable text to be lesser than the buffer size. Then, the program does not read the same file again instead reads the buffer. Even if the the size of the character string 'text' is lesser than the size of the buffer, the entire buffer gets filled into the buffer  memory. Now, when the function fscanf is called, it does not look into the hard-disk, instead looks into the buffer; if the data that's to be stored in the string 'text' exceeds past the data present in the buffer, another block of data is read from the hard-disk. This way, the fscanf function minimizes the number of times it reads the hard-disk. So on a longer run, this acts as a more efficient solution and users need not worry too much about their algorithm extensively reading the hard-disk unnecessary. 

But even after all these precautions, the programmer's design is very important while writing large programs. Today's market favors the fastest among the fast algorithms to solve a problem! The ways by which we may violate the rule "do not read from the hard-disk often or write to the hard-disk often" might not be clear at this point of this tutorial, but I promise it would soon make sense. Lets get back to programming.

Q1. Write a C program to read basic mathematical expressions form a text file and evaluate the final result. For example let the contents of the file read.txt be:
( 1 + ( 3 + 5 ) * 4 + 11 )
And  on executing the program, the result 44 should be displayed; Its enough if the program does the operations addition, subtraction, multiplication and division operations alone in an integer variable.   

please look at the question's solution only after you have tried it yourself for a considerable amount of time. 

The answer to the first question is the following program (I have provided the explanation below)
/*
* Program for solving basic arithmetic expressions using BODMAS rule, but excluding the order.
*/

#include <stdio.h> // for fscanf, fprintf, fseek, fopen and fclose
#include <conio.h> //for getch
#include <string.h> //for strlen, strcmp
#include <stdlib.h> // for exit function
#include <ctype.h> // for isdigit()
#define BOOL int // considered to be a boolean data type
#define TRUE 1
#define FALSE 0
#define STACK_MAX_SIZE 100 // there cannot be more than 100 symbols or 100 values

/**************************************************************************************
*               This enum enumerates these symbols progressively
***************************************************************************************
*/
enum { empty_block, number, B_open, B_close,/*O,*/D, M, A, S }; // order is not used in this example


/**************************************************************************************
***************************************************************************************
*      this following function gets the character symbol and determines if its a number
*      or a bracket (open or close) or addition operation or subtraction, etc.
***************************************************************************************
***************************************************************************************
*/
int symbolToIndex(char symbol) ///here the symbol is fed as a character variable
{
    switch (symbol)
    {
    case '(':
        return B_open;
    case ')':
        return B_close;
    case '/':
        return D;
    case '*':
        return M;
    case '+':
        return A;
    case '-':
        return S;
    default:
        if (isdigit(symbol))
            return number;
        else
            return empty_block; // for unidentified symbol
    }
}



/****************************************************************************************
*****************************************************************************************
*      The following push function is used to push the symbol read from the file or to
*      push the value read from the file into its respective stack. By default the symbol
*      or values are pushed front.
*      A better data structure design could be to include the parameters int *valueStack,
*      int* index, int stackMaxSize in a C data structure.
*****************************************************************************************
*****************************************************************************************
*/
void push(int *valueStack, int* index, int stackMaxSize, int data)
{
    if (((*index) + 1) >= stackMaxSize)
    {
        fprintf(stderr, "error: stack overflow in valuesStack");
        getch();
        exit(-1); // returns -1 to the system and terminates the program
    }

    valueStack[*index] = data;
    (*index)++;
}


/****************************************************************************************
*****************************************************************************************
*      The following pop function is used to pop the symbol from the stack's front end.
*      A better data structure design could be to include the parameters int *valueStack,
*      int* index, int stackMaxSize in a C data structure and the parameter that's passed
*      could be the structure variable.
*      this kind of design will be more readable compared to the one used below.
*****************************************************************************************
*****************************************************************************************
*/
void pop(int *index)
{
    if ((*index) != 0)
        --(*index);
}



/****************************************************************************************
*****************************************************************************************
*      The following function actually does the arithmetic operation. It evaluates the
*      top two values in the value stack with the operator at the top of the symbol stack
*      and once this operation is over, the result is written to the memory location just
*      below the top of the values stack. I.e., to say more formally, the pop() operation
*      is performed after the arithmetic operation and the result is pushed to the front
*      of the value stack, and later the top symbol is popped.
*****************************************************************************************
*****************************************************************************************
*/
BOOL evaluate(int *symbolStack, int* values, int* pointer_symbol, int* pointer_values)
{
    if (symbolStack[*pointer_symbol - 1] == D)
        values[*pointer_values - 2] = values[*pointer_values - 2] / values[*pointer_values - 1];
    else if (symbolStack[*pointer_symbol - 1] == M)
        values[*pointer_values - 2] = values[*pointer_values - 2] * values[*pointer_values - 1];
    else if (symbolStack[*pointer_symbol - 1] == A)
        values[*pointer_values - 2] = values[*pointer_values - 2] + values[*pointer_values - 1];
    else if (symbolStack[*pointer_symbol - 1] == S)
        values[*pointer_values - 2] = values[*pointer_values - 2] - values[*pointer_values - 1];
    else
        return FALSE;
    pop(pointer_symbol);
    pop(pointer_values);
    return TRUE;  // though this function detects error, the error that gets detected is not
                    // used in the main function which calls this function
}




/****************************************************************************************
*****************************************************************************************
********************      The main function    ******************************************
*****************************************************************************************
*****************************************************************************************
*/
int main()
{
    FILE* f = fopen( "read.txt", "r"); // opens the file for reading
    if (f == NULL) // checks if the file is opened correctly
    {
        fprintf(stderr, "error: unable to open file read.txt for reading");
        getch();
        return -1; // returns -1 to the system in case of error
    }
    char data_char[10]; // for reading the symbol or digit without any errors. it reads up-to 10 numeric digits
                        // but only the first one will be used. So a better design could be to make the array size 2
    int symbolStack[STACK_MAX_SIZE], values[STACK_MAX_SIZE], data_int, symbol; // the symbol and value stack cannot exceed 100
    int pointer_symbol = 0; // pointer to the location top+1 for symbol stack
    int pointer_values = 0; //pointer to the location top + 1 for values stack
    while (!feof(f)) // waits till the end of file
    {
        fscanf(f, "%s", data_char); // a symbol or value is being read
        symbol = symbolToIndex(data_char[0]); // determines the kind of symbol or determines if its a value
        if (symbol == empty_block)
        {
            fprintf(stderr, "error: unrecognised symbol %c", data_char[0]); // sends an error message
            return -1; // returns -1 in case of failure
        }
        else if (symbol == B_open) // on reading the open bracket symbol it just pushes it into the stack
            push(symbolStack, &pointer_symbol, STACK_MAX_SIZE, B_open);
        else if (symbol == B_close) // on reading the close bracket symbol, it evaluates
                                    // till the open bracket symbol is reached
        {
            while ((symbolStack[pointer_symbol - 1] != B_open) && !feof(f))
                evaluate(symbolStack, values, &pointer_symbol, &pointer_values);
            pop(&pointer_symbol); // pop the B_Close
        }
        else if ((pointer_symbol > 0) && (symbol == number)) // if a number is read,
                                                    //it simply pushed into the stack
        {
            fseek(f, -1 * strlen(data_char), SEEK_CUR);
            // remember, we have the digit formatted as a C string,
            // so we need to format it as a value. Hence, we re-read
            // the content from the file, but this time formatted as a
            // value (the string to value conversation takes place in
            // the fscanf function
            fscanf(f, "%d", &data_int);
            push(values, &pointer_values, STACK_MAX_SIZE, data_int);
        }
        else if ((symbol == D || symbol == M || symbol == A || symbol == S))
        {
            if (symbolStack[pointer_symbol - 1] <= symbol) // checks the symbol's
                //priority. So if the symbol read has a higher priority / same,
                // the operation is performed
            {
                evaluate(symbolStack, values, &pointer_symbol, &pointer_values);
                push(symbolStack, &pointer_symbol, STACK_MAX_SIZE, symbol);
            }
            else // else the symbol is just pushed
                push(symbolStack, &pointer_symbol, STACK_MAX_SIZE, symbol);
        }
        else
        {
            fprintf(stderr, "syntax error: %d", symbol);
            getch();
            return -1;
        }
    }
    if (pointer_symbol != 0 || pointer_values != 1)
    { // report syntax error
        fprintf(stderr, "syntax error");
        getch();
        return -1;
    }
    printf("result = %d", values[0]); //displays the results
    getch();
    return 0;
}


If you have written your own program for performing the basic arithmetic operations congratulations. If haven't, its not a problem; its a bit too complicated for beginners in C programming. Anyway, the second question is asked from this program.

Q2. in the program's comment, I had pinpointed 2 design mistakes; how will you overcome them, and what is the benefit of checking a program's design, such as the one pinpointed above?
  (I hadn't realized the mistakes until I was half way writing the above code. I didn't correct it because a mistake like this could caution you and prevent you from making the same mistake).   


                                                                                                           next tutorial >>>>                                              

 copyright (c) 2015 K Sreram. You may not distribute this article without the concerned permission from K Sreram.  

About my blog
            

No comments:

Post a Comment

Featured post

Why complicating matters is not good

“ Simplicity is the ultimate sophistication.” — Leonardo da Vinci Why is complicating things wrong ? - K Sr...