
Simplicity is the ultimate sophistication.” — Leonardo da Vinci
Contact me: sreramk360@gmail.com

Saturday 14 February 2015

Learn files in C: tutorial 2

Learn files programming in C: practical tutorial 2 (Example programs in C: using fscanf and fprintf. Try practicing these problems)

"to follow this tutorial you must be familiar with the topic discussed in tutorial 1 in this series"
  
 Before we get to the programming part, let's discuss good practice for writing C programs that work with files. Files are meant for long-term storage, and the hard disk should not be used as a substitute for RAM. Treating the hard disk like your computer's RAM slows your program down considerably, because communication between the processor and RAM is far faster than communication between the processor and the hard disk. RAM stores each bit as an electric charge in a tiny capacitor; the charge leaks away quickly, so the memory has to be refreshed many times every second. A hard disk, in contrast, has a read-write head that physically moves over the disk and changes the direction of the magnetic field in a ferromagnetic substance, which the head later reads back as the digit one or zero. Storing and retrieving data in RAM happens electronically, so high-speed memory ICs can be used to increase its speed; a disk read or write involves physical movement, and this causes the heavy time delay.

So if you are attempting memory-intensive processing such as image editing or signal processing, you usually load the entire file's contents into RAM before processing the data. Reading a small part of the data from the hard disk, modifying it and writing it back immediately, over and over, keeps the hard disk busy, and the program runs very slowly compared to one that brings the entire data into RAM before modifying it.
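The "load everything first" approach can be sketched as below. This is only a minimal illustration: the helper name load_file and the starting buffer size are my own assumptions and are not part of the tutorial's programs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* load_file is a hypothetical helper, not part of the tutorial's programs.
 * It reads the whole file at `path` into one heap buffer.
 * Returns NULL on failure; on success *size receives the byte count and
 * the caller must free() the returned buffer. */
unsigned char *load_file(const char *path, size_t *size)
{
    assert(path != NULL && size != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    size_t capacity = 4096; /* arbitrary starting size */
    size_t used = 0;
    unsigned char *data = malloc(capacity);
    if (data == NULL)
    {
        fclose(f);
        return NULL;
    }

    size_t got;
    while ((got = fread(data + used, 1, capacity - used, f)) > 0)
    {
        used += got;
        if (used == capacity) /* buffer is full: double it */
        {
            unsigned char *bigger = realloc(data, capacity * 2);
            if (bigger == NULL)
            {
                free(data);
                fclose(f);
                return NULL;
            }
            data = bigger;
            capacity *= 2;
        }
    }

    if (ferror(f)) /* the loop ended because of a read error */
    {
        free(data);
        fclose(f);
        return NULL;
    }

    fclose(f);
    *size = used;
    return data;
}
```

Once load_file has returned, all further processing works on the buffer in RAM, and the file is touched again only when the final result is written back.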

To reduce problems like these, the C standard library reads and writes data in blocks through a buffer. The size of each block depends on the implementation (the constant BUFSIZ in stdio.h gives the default size). As you saw in the previous tutorial, I used the fscanf and fprintf functions inside a while loop, so they were called more than once. The designers of the C library built it in such a way that the second call to fscanf does not go to the hard disk again immediately, but refers to the buffer in memory instead. Say the program executes the statement 'fscanf(f, "%s", text);' for the second time, and the string being read is shorter than the buffer. The first call has already filled the whole buffer with a block from the file, even though 'text' needed only a small part of it. So the second call does not look at the hard disk at all; only when the data wanted for 'text' runs past what is left in the buffer is another block read from the hard disk. This way, fscanf minimizes the number of times the hard disk is read. In the long run this is a more efficient solution, and users need not worry too much about their algorithm reading the hard disk unnecessarily.
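As a rough sketch of the buffering described above, the function below calls fscanf once per word while the library fetches the file block by block. The name count_words is hypothetical, and the setvbuf call only makes the default behaviour explicit; a stream opened on a file is normally fully buffered already.

```c
#include <assert.h>
#include <stdio.h>

/* count_words is a hypothetical helper. It counts the whitespace-separated
 * words in a text file and returns -1 if the file cannot be opened.
 * A word longer than 63 characters is counted in several pieces. */
long count_words(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    /* Ask for a fully buffered stream whose BUFSIZ-byte buffer is allocated
     * by the library. setvbuf must be called before any other operation on
     * the stream. */
    setvbuf(f, NULL, _IOFBF, BUFSIZ);

    char word[64];
    long count = 0;
    while (fscanf(f, "%63s", word) == 1) /* most calls are served from the buffer */
        count++;

    fclose(f);
    return count;
}
```

However many times fscanf is called here, the disk is read roughly once per block rather than once per word.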

But even with all these precautions, the programmer's design is very important when writing large programs. Today's market favors the fastest among the fast algorithms to solve a problem! The ways in which we might violate the rule "do not read from or write to the hard disk too often" may not be clear at this point of the tutorial, but I promise it will soon make sense. Let's get back to programming.

Q1. Write a C program to read basic mathematical expressions from a text file and evaluate the final result. For example, let the contents of the file read.txt be:
( 1 + ( 3 + 5 ) * 4 + 11 )
On executing the program, the result 44 should be displayed. It's enough if the program handles addition, subtraction, multiplication and division on integer values.

Please look at the solution only after you have tried the question yourself for a considerable amount of time.

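The answer to the first question is the following program (the explanation is given in its comments). It expects every symbol and number in the file to be separated by spaces and the whole expression to be enclosed in brackets, as in the example above.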
/*
* Program for solving basic arithmetic expressions using BODMAS rule, but excluding the order.
*/

#include <stdio.h> // for fscanf, fprintf, fseek, fopen and fclose
#include <conio.h> // for getch (non-standard header; available with DOS and Windows compilers)
#include <string.h> //for strlen, strcmp
#include <stdlib.h> // for exit function
#include <ctype.h> // for isdigit()
#define BOOL int // considered to be a boolean data type
#define TRUE 1
#define FALSE 0
#define STACK_MAX_SIZE 100 // there cannot be more than 100 symbols or 100 values

/**************************************************************************************
*               This enum enumerates these symbols progressively
***************************************************************************************
*/
enum { empty_block, number, B_open, B_close,/*O,*/D, M, A, S }; // order is not used in this example


/**************************************************************************************
***************************************************************************************
*      The following function takes the character symbol and determines if it's a number
*      or a bracket (open or close) or addition operation or subtraction, etc.
***************************************************************************************
***************************************************************************************
*/
int symbolToIndex(char symbol) ///here the symbol is fed as a character variable
{
    switch (symbol)
    {
    case '(':
        return B_open;
    case ')':
        return B_close;
    case '/':
        return D;
    case '*':
        return M;
    case '+':
        return A;
    case '-':
        return S;
    default:
        if (isdigit(symbol))
            return number;
        else
            return empty_block; // for unidentified symbol
    }
}



/****************************************************************************************
*****************************************************************************************
*      The following push function is used to push the symbol read from the file or to
*      push the value read from the file into its respective stack. The symbol or value
*      is always pushed onto the top of the stack.
*      A better data structure design could be to include the parameters int *valueStack,
*      int* index, int stackMaxSize in a C data structure.
*****************************************************************************************
*****************************************************************************************
*/
void push(int *valueStack, int* index, int stackMaxSize, int data)
{
    if (((*index) + 1) >= stackMaxSize)
    {
        fprintf(stderr, "error: stack overflow in valuesStack");
        getch();
        exit(-1); // returns -1 to the system and terminates the program
    }

    valueStack[*index] = data;
    (*index)++;
}


/****************************************************************************************
*****************************************************************************************
*      The following pop function is used to pop the symbol from the stack's front end.
*      A better data structure design could be to include the parameters int *valueStack,
*      int* index, int stackMaxSize in a C data structure and the parameter that's passed
*      could be the structure variable.
*      this kind of design will be more readable compared to the one used below.
*****************************************************************************************
*****************************************************************************************
*/
void pop(int *index)
{
    if ((*index) != 0)
        --(*index);
}



/****************************************************************************************
*****************************************************************************************
*      The following function actually does the arithmetic operation. It applies the
*      operator at the top of the symbol stack to the top two values of the value stack
*      and writes the result to the memory location just below the top of the value
*      stack. The top value and the top symbol are then popped, which leaves the result
*      on top of the value stack.
*****************************************************************************************
*****************************************************************************************
*/
BOOL evaluate(int *symbolStack, int* values, int* pointer_symbol, int* pointer_values)
{
    if (symbolStack[*pointer_symbol - 1] == D)
        values[*pointer_values - 2] = values[*pointer_values - 2] / values[*pointer_values - 1];
    else if (symbolStack[*pointer_symbol - 1] == M)
        values[*pointer_values - 2] = values[*pointer_values - 2] * values[*pointer_values - 1];
    else if (symbolStack[*pointer_symbol - 1] == A)
        values[*pointer_values - 2] = values[*pointer_values - 2] + values[*pointer_values - 1];
    else if (symbolStack[*pointer_symbol - 1] == S)
        values[*pointer_values - 2] = values[*pointer_values - 2] - values[*pointer_values - 1];
    else
        return FALSE;
    pop(pointer_symbol);
    pop(pointer_values);
    return TRUE;  // though this function detects error, the error that gets detected is not
                    // used in the main function which calls this function
}




/****************************************************************************************
*****************************************************************************************
********************      The main function    ******************************************
*****************************************************************************************
*****************************************************************************************
*/
int main()
{
    FILE* f = fopen( "read.txt", "r"); // opens the file for reading
    if (f == NULL) // checks if the file is opened correctly
    {
        fprintf(stderr, "error: unable to open file read.txt for reading");
        getch();
        return -1; // returns -1 to the system in case of error
    }
    char data_char[10]; // holds one symbol or number as text: up to 9 characters plus the terminating '\0'.
                        // Only data_char[0] is used to decide what kind of symbol was read
    int symbolStack[STACK_MAX_SIZE], values[STACK_MAX_SIZE], data_int, symbol; // the symbol and value stack cannot exceed 100
    int pointer_symbol = 0; // pointer to the location top+1 for symbol stack
    int pointer_values = 0; //pointer to the location top + 1 for values stack
    while (fscanf(f, "%9s", data_char) == 1) // reads a symbol or value; the loop ends when nothing more can be read
    {
        symbol = symbolToIndex(data_char[0]); // determines the kind of symbol or determines if its a value
        if (symbol == empty_block)
        {
            fprintf(stderr, "error: unrecognised symbol %c", data_char[0]); // sends an error message
            return -1; // returns -1 in case of failure
        }
        else if (symbol == B_open) // on reading the open bracket symbol it just pushes it into the stack
            push(symbolStack, &pointer_symbol, STACK_MAX_SIZE, B_open);
        else if (symbol == B_close) // on reading the close bracket symbol, it evaluates
                                    // till the open bracket symbol is reached
        {
            while ((pointer_symbol > 0) && (symbolStack[pointer_symbol - 1] != B_open))
                evaluate(symbolStack, values, &pointer_symbol, &pointer_values);
            pop(&pointer_symbol); // pop the matching B_open
        }
        else if ((pointer_symbol > 0) && (symbol == number)) // if a number is read,
                                                    // it is simply pushed into the stack
        {
            fseek(f, -(long)strlen(data_char), SEEK_CUR); // steps back over the text just read;
            // strlen returns an unsigned value, so it is cast before it is negated.
            // remember, we have the digit formatted as a C string,
            // so we need to format it as a value. Hence, we re-read
            // the content from the file, but this time formatted as a
            // value (the string to value conversion takes place in
            // the fscanf function)
            fscanf(f, "%d", &data_int);
            push(values, &pointer_values, STACK_MAX_SIZE, data_int);
        }
        else if ((symbol == D || symbol == M || symbol == A || symbol == S))
        {
            // D and M share one priority level and A and S share the next one, so
            // operators of equal priority are evaluated from left to right.
            // Dividing the enum value by 2 maps D and M to 2 and A and S to 3.
            // Every operator on the stack whose priority is at least that of the
            // symbol just read is evaluated before the new symbol is pushed.
            while ((pointer_symbol > 0) && (symbolStack[pointer_symbol - 1] >= D)
                   && (symbolStack[pointer_symbol - 1] / 2 <= symbol / 2))
                evaluate(symbolStack, values, &pointer_symbol, &pointer_values);
            push(symbolStack, &pointer_symbol, STACK_MAX_SIZE, symbol);
        }
        else
        {
            fprintf(stderr, "syntax error: %d", symbol);
            getch();
            return -1;
        }
    }
    if (pointer_symbol != 0 || pointer_values != 1)
    { // report syntax error
        fprintf(stderr, "syntax error");
        getch();
        return -1;
    }
    printf("result = %d", values[0]); //displays the results
    getch();
    return 0;
}


If you have written your own program for performing the basic arithmetic operations, congratulations. If you haven't, it's not a problem; it's a bit too complicated for beginners in C programming. Anyway, the second question is based on this program.

Q2. In the program's comments, I have pointed out two design mistakes. How would you overcome them, and what is the benefit of reviewing a program's design for mistakes such as these?
  (I hadn't realized the mistakes until I was halfway through writing the above code. I didn't correct them because seeing a mistake like this could caution you and prevent you from making the same mistake.)



 copyright (c) 2015 K Sreram. You may not distribute this article without the concerned permission from K Sreram.  


Friday 13 February 2015

Learn files in C: tutorial 1

Learn files programming in C: practical tutorial 1(Brief introduction to files in C)

Keywords- Learn files programming in C, C files programming, file random access in C, C Files, fopen, fclose, fopen C formats, C files programming, fread, fwrite, Programming in C tutorials.
"To follow this tutorial, you should be familiar with the following topics in C language:
  1.  variable declaration, static memory allocation, value assignment
  2. basic arithmetic and logical expressions in C
  3.  basic C pointers: their meaning, and assigning the address of a memory location to a pointer variable (i.e., making one variable refer to another variable).
  4. functions in C (all four combinations, i.e. functions with or without a return value and with or without arguments)
  5. passing memory address as arguments in C functions"
Now, assuming you are familiar with the above topics, let me proceed with the tutorial. To gain the maximum benefit while learning files, it is advisable to try out all the example programs on a computer. Until now, you have been storing values in variables and manipulating them in your program. But once your program terminates, it is impossible to retrieve the values that were stored in your program's memory. This is not always desirable. Let's assume that you are playing your favorite video game and you suddenly get interrupted by some work. You are left with two possibilities: one, you play the entire game again from the very beginning; two, you save the game at that particular place and time so that the next time you start the game, you resume it without any problems. It's obvious that the second option is more desirable; so how is this done? Your game stores the data that describes your level, your place in the game and any other associated parameters on the computer's hard disk as a file, giving it a suitable name. This file is read when the application is reopened, and the game resumes from your place.

So how is all this done? When we run our program on top of a modern operating system, like Windows 8.1, the computer won't give the program direct access to basic hardware such as the hard disk, the random access memory or the video memory (the memory associated with the screen display); instead, your program requests the operating system to communicate with this hardware. This ensures the overall stability of the system and reduces the risk from malware and viruses (imagine if malware had low-level access to the system, enabling it to communicate directly with the hardware; it would take no time for it to wipe out the entire data on the hard disk!). The requests to handle hardware are sent to the operating system through its run-time library functions, which perform the requested task. So when you run the same program on a different version of Windows, the run-time library functions used may vary, but that is not going to affect how your program behaves.

 Streams:

Streams are logical interfaces to the hardware devices in a computer. Such a device could be the output screen, the hard drive, the keyboard, the mouse or any other device. In C, a file is accessed through a stream, so the same functions work whichever device lies behind it. The streams stdin and stdout are the standard input and output streams; by default they handle the keyboard input and the program's screen output. They are declared in the header stdio.h and are already open when the program starts. In C we represent a pointer to a stream as
                                     FILE *varName; // the stream
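
Because every stream is reached through a FILE pointer, one function can write to the screen or to a file without any change. The sketch below illustrates this; the function name write_report and its message format are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* write_report is a hypothetical helper. It writes one line to whichever
 * stream it is given and returns the number of characters written, or a
 * negative value on error. */
int write_report(FILE *out, const char *name, int score)
{
    assert(out != NULL && name != NULL);
    return fprintf(out, "%s scored %d\n", name, score);
}
```

Calling write_report(stdout, "Asha", 42) shows the line on the screen, write_report(stderr, "Asha", 42) sends it to the error stream, and passing a pointer returned by fopen stores it in a file.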

Files programming:

Certain standard file handling functions are available in C. They are (note: at this point it's not necessary to know about the following functions; I'll be explaining them soon),

Opening and closing files:  
fopen()  - for opening the file stream
fclose()  - for closing the file stream
fcloseall() - for closing all the open streams other than the default ones such as stdin, stdout and stderr (a non-standard extension offered by some compilers)

For text based operations:
fgets()    - for getting a string from file
fputs()   -  for writing a string to file
fscanf() - for reading a formatted data from a stream
fprintf() - for writing a formatted data to a stream
fputc()  - for writing a character to the destined file
fgetc()  - for reading a character from the destined file

For binary based operations:
fread() - reads raw data from the stream
fwrite() - writes raw data to file

For file navigation:
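fseek() - moves the stream's position to a given offset in the file
ftell() - reports the stream's current position in the file

For stream buffer flushing:
fflush() - writes any buffered output data of the stream to the file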
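
To give a feel for the binary and navigation functions listed above before they are explained, here is a minimal sketch. The helper names save_ints, load_int_at and file_size are hypothetical. Measuring a file with fseek to SEEK_END followed by ftell works on mainstream systems, but the C standard does not guarantee it for every implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: stores `count` ints as raw bytes. Returns 1 on success. */
int save_ints(const char *path, const int *values, size_t count)
{
    assert(path != NULL && values != NULL);
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return 0;
    int ok = (fwrite(values, sizeof values[0], count, f) == count);
    if (fclose(f) != 0) /* fclose also writes out the buffered data */
        ok = 0;
    return ok;
}

/* Hypothetical helper: reads the int at position `index` (counting from 0)
 * without reading the ints stored before it. Returns 1 on success. */
int load_int_at(const char *path, long index, int *value)
{
    assert(path != NULL && value != NULL && index >= 0);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    int ok = fseek(f, index * (long)sizeof(int), SEEK_SET) == 0
             && fread(value, sizeof *value, 1, f) == 1;
    fclose(f);
    return ok;
}

/* Hypothetical helper: returns the file's size in bytes, or -1 on failure. */
long file_size(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    long size = -1;
    if (fseek(f, 0L, SEEK_END) == 0) /* jump to the end of the file */
        size = ftell(f);             /* the position there equals the size */
    fclose(f);
    return size;
}
```

Jumping straight to one value with fseek is what the phrase "random access" refers to: the values stored before it are never read.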

 Just for a brief introduction, let me write down an example program:

#include <stdio.h> // declares fprintf, printf, fopen, fclose and fgets
// #include <conio.h> is deliberately left out because it is not required here.
// Tell me, why should we avoid including library files that aren't used?
#include <string.h> // declares strcmp and strcspn
int main()
{
    FILE *f = fopen("a.txt", "w"); // opens the file a.txt in write mode
    char text[100] = ""; // starts out empty so that the first strcmp compares a valid string

    if(f == NULL) // checks if the opening operation was a success
    {
        printf("error opening the file in write mode");
        return -1; // return an error signal to the operating system
    }
    while(strcmp(text, "exit") != 0) // repeats until the text typed in is "exit"
    {
        if(fgets(text, sizeof text, stdin) == NULL) // reads a line of text from the user
            break; // stops when there is no more input
        text[strcspn(text, "\n")] = '\0'; // removes the newline character that fgets keeps
        fprintf(f, "%s\n", text); // writes the text to the file
    }
    fclose(f); // writes out any buffered data, closes the file and releases the stream f
    return 0;
}

Now, the above program gets data from the user and writes it to the file named a.txt. The function fopen takes two arguments: the file's name (optionally with its path) and the opening mode. Let me list the file open modes available:
  1. r - read mode. fopen fails (returns NULL) if the file does not exist
  2. w - write mode. Creates the file if it does not exist and erases its contents if it does
  3. a - append mode. Creates the file if it does not exist; everything written is added to the end of the file
  4. r+ - read/write mode. fopen fails (returns NULL) when the file does not exist
  5. w+ - write/read mode. Creates a new file if the file does not exist and erases its contents if it does
  6. a+ - read/append mode. Creates the file if it does not exist. Written data is always appended to the end of the file, wherever the file position currently is; reading can be done from any position set with fseek
  7. rb - read in binary mode
  8. wb - write in binary mode
  9. ab - append in binary mode
  10. r+b - read/write in binary mode (does not create the file if it does not exist)
  11. w+b - write/read in binary mode (here, it creates the file)
  12. a+b - read/append in binary mode (here, it creates the file if it does not exist; as with a+, written data always goes to the end of the file)
Now we have to check whether the file was actually opened by fopen. If it was opened successfully, 'f' holds a non-NULL pointer. If for some reason the file didn't open, fopen returns NULL; the error message is then displayed and the program terminates. Next, the function fprintf writes the data in the C string variable 'text' to the file a.txt. Finally, the file gets closed using fclose(). Calling fclose(f) writes out any data still held in the buffer and releases the resources associated with the stream 'f'.
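
The difference between the opening modes can be tried out with a small experiment. The sketch below is only an illustration; the helper names write_text and count_chars are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: opens `path` with the given mode, writes `text` and
 * closes the file. Returns 1 on success and 0 on failure. */
int write_text(const char *path, const char *mode, const char *text)
{
    assert(path != NULL && mode != NULL && text != NULL);
    FILE *f = fopen(path, mode);
    if (f == NULL) /* always check the result of fopen */
        return 0;
    int ok = (fputs(text, f) != EOF);
    if (fclose(f) != 0)
        ok = 0;
    return ok;
}

/* Hypothetical helper: counts the characters in a file. Returns -1 when the
 * file cannot be opened in "r" mode, for example because it does not exist. */
long count_chars(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    long n = 0;
    while (fgetc(f) != EOF)
        n++;
    fclose(f);
    return n;
}
```

Writing "abc" with mode "w" and then "de" with mode "a" leaves five characters in the file, while a second write with mode "w" starts again from an empty file.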

If you run the above program on your machine, you will find that a new text file gets created, and each line you type and confirm with enter is written to the file a.txt. After terminating the program by typing 'exit', check the file a.txt. The text you typed, along with the word exit, will have been saved in it.

 Now, let's see a program that reads from a file. We used fprintf for writing data to the file; now we may use fscanf to read formatted data from it. Like scanf, fscanf formats the data obtained from the file. With the "%s" format, fscanf skips any leading whitespace characters (' ', '\t', '\n', '\r' and so on) and stops reading at the next whitespace character. Since these characters are skipped, the information about where the spaces and line breaks were in the text file is lost, and it cannot be recovered unless the file is read again with a function that keeps them, such as fgets or fgetc. So use fscanf only if your program needs this whitespace to be ignored. Say you write a huge program like a C interpreter in C: in that case, it is necessary that these characters are ignored.
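
To see what is lost when whitespace is skipped, the following sketch copies a file in two ways. The names copy_words and copy_exact are hypothetical helpers written for this comparison only.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies the words of in_path to out_path separated by
 * single spaces, using fscanf. The original spacing and line breaks are lost.
 * Returns the number of words copied, or -1 on failure. */
long copy_words(const char *in_path, const char *out_path)
{
    assert(in_path != NULL && out_path != NULL);
    FILE *in = fopen(in_path, "r");
    if (in == NULL)
        return -1;
    FILE *out = fopen(out_path, "w");
    if (out == NULL)
    {
        fclose(in);
        return -1;
    }

    char word[64];
    long words = 0;
    while (fscanf(in, "%63s", word) == 1) /* skips whitespace before each word */
    {
        if (words > 0)
            fputc(' ', out);
        fputs(word, out);
        words++;
    }
    fputc('\n', out);
    fclose(in);
    fclose(out);
    return words;
}

/* Hypothetical helper: copies the file character by character with fgetc,
 * keeping every space and line break. Returns the number of characters
 * copied, or -1 on failure. */
long copy_exact(const char *in_path, const char *out_path)
{
    assert(in_path != NULL && out_path != NULL);
    FILE *in = fopen(in_path, "r");
    if (in == NULL)
        return -1;
    FILE *out = fopen(out_path, "w");
    if (out == NULL)
    {
        fclose(in);
        return -1;
    }

    int c;
    long copied = 0;
    while ((c = fgetc(in)) != EOF)
    {
        fputc(c, out);
        copied++;
    }
    fclose(in);
    fclose(out);
    return copied;
}
```

For a file that contains several spaces between two words and an empty line, copy_exact reproduces it as it is, while copy_words writes all the words on one line with single spaces between them.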

Now let's see the example for reading the text contents from the file.
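#include <stdio.h> // declares fopen, fclose, fscanf, printf and getchar
#include <string.h> // declares strcmp
int main()
{
    FILE *f = fopen("a.txt", "r"); // opens the file a.txt in read mode
    char text[100];

    if(f == NULL) // checks if the opening operation was a success
    {
        printf("error opening the file in read mode");
        return -1;
    }
    // fscanf returns 1 for every word it reads and EOF once the file's end is reached;
    // "%99s" keeps a long word from overflowing the array text
    while(fscanf(f, "%99s", text) == 1)
    {
        printf("%s\n", text); // displays the text that's been read
        if(strcmp(text, "exit") == 0) // stops after displaying the word "exit"
            break;
    }
    getchar(); // waits for the enter key so that the output stays visible
    fclose(f);
    return 0;
}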

Unlike the former program, this program reads the contents of the file a.txt, displaying the words one after another, and terminates once it reads the word 'exit' or reaches the file's end. The function feof(), as its name suggests, checks for the end of the file being read. Note that feof(f) becomes true only after a read operation has run into the file's end, so it is the return value of fscanf that tells whether a word was actually read. Note also that the file has been opened in read mode (i.e., "r"), so if the file does not exist, the pointer variable will be assigned the value NULL.
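
The end-of-file handling can be sketched as a reading loop that tests the return value of fscanf and consults feof only after the loop has ended. The function name sum_file is hypothetical, and the file is assumed to contain whole numbers separated by whitespace.

```c
#include <assert.h>
#include <stdio.h>

/* sum_file is a hypothetical helper. It adds up the integers in a text file
 * and stores the sum of the numbers read so far in *total.
 * Returns 1 if the whole file was read, and 0 if the file could not be
 * opened or reading stopped early at something that is not a number. */
int sum_file(const char *path, long *total)
{
    assert(path != NULL && total != NULL);
    *total = 0;

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return 0;

    int value;
    while (fscanf(f, "%d", &value) == 1) /* 1 means one number was read */
        *total += value;

    /* The loop has ended: feof tells us whether that was because the file's
     * end was reached or because fscanf met text it could not convert. */
    int reached_end = feof(f) && !ferror(f);
    fclose(f);
    return reached_end;
}
```

For a file containing "1 2 3" and "4" on two lines the function reports success with a total of 10; for a file containing "5 x 6" it stops at the letter and reports failure with a total of 5.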