
Bad coding practices

Unlike in the real world, where whoever wins is called good, in the programming world good programming wins. It is not uncommon to see developers shaking their heads in frustration, muttering all those unparliamentary four-letter words, while looking at code written by others or by themselves. If not always, then very often this arises from bad programming practices. While programmers universally accept that standards are very important, the disagreement lies in which standards should be followed. (Un)Fortunately, programming is not a science and still remains an art. I believe it shall remain so for quite a while, if not forever. Until that changes, the best that can be done is to follow some good programming practices.

Being of the opinion that there is no point in listing the good programming practices (which have been discussed ad nauseam by computer scholars, project leaders and Quality Assurance groups), I only wish to list the bad coding practices which I have experienced or heard of so far. Most of the discussion here is limited to C and C++ programming. Though some of the concepts can be applied to other languages, this document concentrates primarily on C and C++. It is not possible for any one programmer to program in all kinds of scenarios, so I wish to maintain a list of scenarios and the bad programming practices in each of them. As the saying goes, fools learn from their own experience and wise men learn from others'. I was foolish enough to implement some of these bad programming practices myself, and I learnt from my mistakes.

Oooops!!! This is not OOP


I have fallen prey to this many times. The temptation to write a program in the procedural way is sometimes too strong to resist, and I have exposed the internals of a class by making the member variables public or, worse, by grouping them as structures instead of classes. Before I proceed any further, let me say it for the record: I am not against procedural programming. It is the way we all started learning programming and is one of the easiest ways to express business logic in code. And what the heck, it is a lot simpler! There is no shame in calling yourself a procedural programmer. But then, for some reason like a stubborn project leader, an adamant project manager or a crazed visionary at the top, we are forced to write the program in an Object Oriented Programming language. Here starts the war of the worlds.

OOP is more than a group of classes. It is much vaster than the simple textbook definition of a class as a collection of related data and functions. If we go by this definition alone, then all the functions in any software package can be viewed as related and can all be made member functions of a single class. This might work. With some additional effort, it actually does work. But with all due respect, it ain't Object Oriented Programming. For example, consider the following program. This is one of the classic cases of procedural object-oriented jumbled-up garbage.

Example 1
#include <stdio.h>

class CSingleClass
{
    int i, j;
public:
    void Execute()
    {
        int k = 0;
        scanf("%d%d", &i, &j);
        k = i + j;
        printf("%d", k);
    }
};

void main()
{
    CSingleClass o;
    o.Execute();
}

Syntactically, the above program is object oriented. But that is only because it can't be compiled by a C compiler. Semantically, this program is not even close to what is called object oriented programming. The above is a fairly simple example, and I hope you are still with me. Now let us see a slightly trickier case, where I have seen programmers more experienced than me succumbing to the procedural programming temptation.

Example 2: Let's say I have a class which deals with rectangles, as shown below. A few months into the development, there was a need to support parallelograms too. It is sometimes tempting to just add an enum or a flag to the existing rectangle class and use it for parallelograms as well.
typedef struct
{
    double x;
    double y;
} sPoint;

class CRectangle
{
    sPoint m_BottomLeft;
    sPoint m_TopRight;
public:
    void Area();
};

This class can be reused for parallelograms with a few modifications, as shown below.


class CRectangle
{
    sPoint m_BottomLeft;
    sPoint m_TopRight;
    double m_dInclusiveAngle;
    bool   m_bIsParallelogram;

public:
    void Area()
    {
        if(m_bIsParallelogram)
        {...}
        else
        {...}
    }
};

This sure as hell works. It saves you a lot of time because you need not write the set/get code for the points and so on. The more such functions there are, the more time you save while writing code. It may even win you a pat on the back from your project leader for finishing the work quickly. But, with all due respect, it is neither object oriented nor procedural. It is badly cooked spaghetti. For every minute you saved by avoiding code duplication, the time spent maintaining this code is many times more than that. With a diligent analysis of the design, I am sure that such spaghetti code can be completely avoided with minimal code duplication. Note: I am not a student of probability, so I plead guilty to the arrogance of using the word completely.
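One possible refactoring, shown below as a minimal sketch of my own (the class names and layout are my assumptions, not taken from the original project), is to give the parallelogram its own type and let the rectangle reuse it through inheritance, so neither class needs a flag:

class CParallelogram
{
protected:
    sPoint m_BottomLeft;
    sPoint m_TopRight;
    double m_dInclusiveAngle;
public:
    virtual void Area();    // parallelogram area formula lives here
};

class CRectangle : public CParallelogram
{
public:
    void Area();            // rectangle-specific formula, no if(m_bIsParallelogram) checks
};

Each Area() now contains only its own formula, and supporting a new shape means adding a new class rather than yet another flag inside CRectangle.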

Enums or #defines?
We have all needed to use arrays of one kind or another in our programming. Many a time, we also know the index of the element we are interested in. It is very tempting to argue, "This index is not something which the user can modify. Why should I not hard-code the number?" One of the best examples I have seen of this is in UI programming using grids or tables. Let's say we are collecting user information in a table with the following columns: Name, Id, Date of Birth, City of Birth, Marital Status.

Then, it seems, we can safely write the assignments shown below.

Example 3A


Table[RowNumber][0] = strUserName;
Table[RowNumber][1] = nUserId;
Table[RowNumber][2] = DOB;
Table[RowNumber][3] = strCity;
Table[RowNumber][4] = bIsSingle ? "Single" : "Married";

This definitely works, and there seems to be no reason why it should not be done. But the reason is that there will always be some fool of a user who wants Marital Status to come before City of Birth. He might even be willing to pay for the change. Now imagine making this change for that one user's installation. Imagine the development effort of replacing 3 with 2 and vice versa. The nightmare continues into testing: a 1 mistakenly replaced with a 2, or an unreplaced 3, shall wreak havoc and keep you from your night's sleep. This gets even trickier when a field is a drop-down list (for example, Marital Status can be Single or Married). The sequence of selections might not really matter to the user, but then your project manager gets the novel idea of providing options like "Single and looking", "Married but interested" and so on. Lo! The aforementioned nightmare has materialized. A safe way to minimize trouble in these situations is to use a #define or an enum for the column numbers, the drop-down box entries and the like. Consider the same case, but now with #defines.

Example 3B
#define COL_USER_NAME       0
#define COL_USER_ID         1
#define COL_DOB             2
#define COL_CITY            3
#define COL_MARITAL_STATUS  4

Table[RowNumber][COL_USER_NAME]      = strUserName;
Table[RowNumber][COL_USER_ID]        = nUserId;
Table[RowNumber][COL_DOB]            = DOB;
Table[RowNumber][COL_CITY]           = strCity;
Table[RowNumber][COL_MARITAL_STATUS] = bIsSingle ? "Single" : "Married";

Now, changing the sequence of columns is limited to correcting the column numbers in the #defines; the rest is taken care of automatically. Corollary: Over a period of time, I have found myself leaning towards using enums instead of #defines. Enumerations have the unique advantage that they maintain their numbering all by themselves. When using #defines, reordering means that we not only need to remember the order in which the columns should appear, but also make sure that the numbering stays consistent. That is, if the user has asked for DOB as the first field and ID as the last field, the #defines shown in Example 3B change as follows.

Example 3C
#define COL_USER_NAME       1
#define COL_USER_ID         4
#define COL_DOB             0
#define COL_CITY            3
#define COL_MARITAL_STATUS  2

If the number of columns is large, say around 20, then keeping all the numbers straight becomes confusing while modifying the code. You may need three or four debugging sessions to get the sequence correct. Enums offer a better way of doing it. Consider Example 3B rewritten using enums.

Example 3D
enum
{
    COL_USER_NAME = 0,
    COL_USER_ID,
    COL_DOB,
    COL_CITY,
    COL_MARITAL_STATUS
};

Table[RowNumber][COL_USER_NAME]      = strUserName;
Table[RowNumber][COL_USER_ID]        = nUserId;
Table[RowNumber][COL_DOB]            = DOB;
Table[RowNumber][COL_CITY]           = strCity;
Table[RowNumber][COL_MARITAL_STATUS] = bIsSingle ? "Single" : "Married";

By virtue of being declared in an enum, COL_USER_ID and the others are automatically assigned values by the compiler, and these can be used as indices into the array. Now, changing the column sequence is a piece of cake: change the relative positions in the enum and you are done.

Example 3E
enum
{
    COL_DOB,
    COL_USER_NAME,
    COL_MARITAL_STATUS,
    COL_CITY,
    COL_USER_ID
};

Table[RowNumber][COL_USER_NAME]      = strUserName;
Table[RowNumber][COL_USER_ID]        = nUserId;
Table[RowNumber][COL_DOB]            = DOB;
Table[RowNumber][COL_CITY]           = strCity;
Table[RowNumber][COL_MARITAL_STATUS] = bIsSingle ? "Single" : "Married";

Note: Hard-coding the "Single" and "Married" strings into the code is itself a glaring bad practice. I have kept them here only for clarity's sake.
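For completeness, here is a minimal sketch of my own (the STR_MARITAL_* names are not part of the original example) of one way to keep those display strings out of the assignment itself:

#define STR_MARITAL_SINGLE   "Single"
#define STR_MARITAL_MARRIED  "Married"

Table[RowNumber][COL_MARITAL_STATUS] = bIsSingle ? STR_MARITAL_SINGLE
                                                 : STR_MARITAL_MARRIED;

If the wording ever changes, the edit is then confined to one place.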

Macros or inline functions


Why are C/C++ programmers so fond of writing macros? Sure, macros offer an easy way of writing reusable pieces of code. But they also have the side effect of making changes in the caller's context, and indiscriminate use of macros can cause rather unintended effects. Consider the following example. The macro checks a value and throws an exception under certain circumstances.

Example 4A
#define CheckValueForError            \
{                                     \
    if(i+j-k != 0)                    \
    {                                 \
        value = value/(i+j-k);        \
    }                                 \
    if(value > 1)                     \
        throw exception;              \
}

Can you reuse this macro? Of course you can! But how easy is it to reuse? That is a debatable question. This macro has the precondition that the variables i, j, k and value are declared in the context of the macro call and hold appropriate values. Moreover, the macro changes the values of these variables. If for any reason a variable name conflicts, then this macro cannot be used. And by the time the macro finishes executing, you have no idea whether or not it has done what it should. The solution is to use an inline function. The same macro can be rewritten as a function as shown below.

Example 4B

inline bool CheckValueForError(int i, int j, int k, double& value)
{
    bool bFlag = false;
    if(i+j-k != 0)
    {
        value = value/(i+j-k);
        bFlag = true;
    }
    if(value > 1)
        throw exception;
    return bFlag;
}

While it is true that you need to explicitly pass the variables to be checked as input parameters, this is actually helpful. All the above-mentioned evils of macros vanish with inline functions. The argument that a function has the overhead of stack maintenance is not valid here, since this is an inline function.

So, are macros devils in disguise? Nope, they are not. They have their uses. If you want a change to be made in the context of the call, and this change needs to be made in many places in the code, then use a macro. For example, suppose you want code which swaps the values of two variables. The same algorithm needs to be written over and over for many different data types, and there is no way for you to use function overloading, or perhaps function overloading is overkill. Then macros are the only way out (a sketch follows at the end of this section). Macros also come in handy for a clean and streamlined implementation of Single Entry Single Exit programming, which I shall discuss further on in my discourse.

If you observe carefully, I have cheated slightly above to put forth my point. The example I gave for bad macros does not have any input arguments, and when I spoke of macros being a good way in the previous paragraph, I was referring to macros which have input arguments. When using macros with input arguments, all the advantages I attributed to inline functions are available to macros too. Still, inline functions have one significant advantage: they can be stepped through in the debugger!

When a macro is called, it is processed by the preprocessor and the macro code is expanded into the source program. The expanded source is written to a file known as the intermediate file, and it is this intermediate file which is processed by the compiler. In the debug information stored in the PDB file, all these lines of code in the intermediate file are linked to the single line where the macro is written in the source C/C++ file. As a result, the debugger steps over your macro call as if it were a single statement. This behavior is both helpful and troublesome. When the macro call is working fine and you don't need to check the code flow inside it, this is good. But if you are debugging a problem and suspect that the macro itself has the bug, then it is very difficult to see inside the macro. As with everything, there are counter-arguments to this statement. It is certainly possible to instruct the compiler to save the intermediate file and use it in the debugger so that you can see the expanded code. But the catch is that the intermediate file contains code written in a way that is easy for the compiler to understand. It contains the expanded code of the various macros provided by the SDK you are using, written by you or your colleagues, and supplied by the various libraries you use. In the best cases, this code can be understood easily. At times, the intermediate file can be so cryptic that you find it difficult to locate the code dealing with your business logic.
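As a concrete illustration of the swap macro mentioned above, here is a minimal sketch of my own (the name SWAP and its exact layout are my assumptions, not something from the original text):

#define SWAP(type, a, b)     \
{                            \
    type _tmp = (a);         \
    (a)       = (b);         \
    (b)       = _tmp;        \
}

The same macro then works for any copyable type, e.g. SWAP(int, x, y); or SWAP(double, u, v); with no overloads to write.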

Single Entry Multiple exit


Single Entry Single Exit is a significant tenet of structured programming (please do not confuse structured programming with procedural programming). It says that a function or procedure should have a single entry point and a single exit point. In all cases of code flow, i.e. success, failure, error or exception, the control must enter the procedure at one point and exit at another, and there should not be any other entry or exit. I find it hard to imagine how one could write a procedure with multiple entry points, but writing code with multiple exit points is a very common practice. For example, consider the following program, which finds the square root.

Example 5A
bool SquareRoot(double dValue, double& dSquareRoot)
{
    if(dValue < 0)
    {
        return false;
    }
    dSquareRoot = pow(dValue, 0.5);
    return true;
}

Code like this is very common. This is a single entry, multiple exit program. Now let us consider a single entry, single exit counterpart of this function.

Example 5B
bool SquareRoot(double dValue, double& dSquareRoot)
{
    bool bRetVal = false;
    if(dValue < 0)
    {
        dSquareRoot = NAN;
        bRetVal = false;
    }
    else
    {
        dSquareRoot = pow(dValue, 0.5);
        bRetVal = true;
    }
    return bRetVal;
}

Though this is a simple example, chosen specifically for being compact, real-life programming is seldom this simple. The condition on which the procedure has to exit is often encountered in the middle of a nested loop three for() loops deep. Is having multiple return statements all over the place good or bad then? Consider the function below, where tan(arcsin()) is calculated for an array of numbers.

Example 6A
bool TanArcSine(double *pValues, int nLength, double **ppResult)
{
    if(nLength < 0)
        return false;

    *ppResult = new double[nLength];
    for(int inx = 0; inx < nLength; inx++)
    {
        if((-1 <= pValues[inx]) && (pValues[inx] <= 1))
        {
            (*ppResult)[inx] = asin(pValues[inx]);
            if(fabs((*ppResult)[inx]) != M_PI/2)
            {
                (*ppResult)[inx] = tan((*ppResult)[inx]);
            }
            else
            {
                return false;
            }
        }
        else
        {
            return false;
        }
    }
    return true;
}

Consider the same program written using a Single entry single exit mechanism. Example 6B
bool TanArcSine(double *pValues, int nLength, double **ppResult)
{
    bool bRetVal = false;

    if(nLength < 0)
        bRetVal = false;
    else
    {
        *ppResult = new double[nLength];
        bRetVal = true;    // assume success; the error paths below reset it
        for(int inx = 0; inx < nLength; inx++)
        {
            if((-1 <= pValues[inx]) && (pValues[inx] <= 1))
            {
                (*ppResult)[inx] = asin(pValues[inx]);
                if(fabs((*ppResult)[inx]) != M_PI/2)
                {
                    (*ppResult)[inx] = tan((*ppResult)[inx]);
                }
                else
                {
                    bRetVal = false;
                    break;
                }
            }
            else
            {
                bRetVal = false;
                break;
            }
        }
    }
    return bRetVal;
}

An interestingly large number of programmers believe that writing single entry, multiple exit programs is easier, safer, more compact, more reliable and so on. The argument they often provide is: why carry the excess baggage when it is no longer needed? Looking at the programs in Examples 6A and 6B, one tends to agree with them, and they can't be blamed for doing so. There certainly is no value in continuing to execute a procedure once an error condition has been encountered. But as responsible programmers, it is our responsibility to make sure that whenever we fail, we do so gracefully. No one should be given a chance to grumble when we fail. For this purpose, I find the single exit point concept extremely helpful.

As you can imagine, procedures are written to achieve genuinely complicated tasks, such as writing data to a table in a database, or entering a critical section and performing some operation in a thread-safe manner in a multi-threaded or multi-process system. When we return from the procedure, we must do things like committing or rolling back the database transaction, releasing the mutex and leaving the critical section. There might also be things to do like closing a file. If there is more than one exit point, then these exit tasks need to be performed at every one of them. In most cases this means blunt code duplication, and as you know very well, code duplication is a bad practice. Instead, if we maintain a single exit point, then by virtue of having a single exit, all the exit tasks are concentrated in one location in the code. This directly reflects on the code quality. So, with a little extra baggage, we can avoid code duplication and keep our code top class. A small sketch of this idea applied to a lock follows below.

I will not be so arrogant as to call single entry, multiple exit a bad coding practice all by myself; the jury is still out on this. Single Entry Single Exit is a good practice in my opinion, and all my remaining discussion is based on this style of programming. Fortunately or not, C, C++, C# and Java do not have any built-in support for single exit programming. If you want to employ this kind of programming, then you have to do it with carefully prepared standards and adhere to them strictly. Needless to say, this is very subjective and varies with taste.
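Here is a minimal sketch of my own (the function, the global table and the error condition are invented purely for illustration, and exception safety is ignored for brevity) of the single exit idea applied to a lock: every path funnels through the one place where the mutex is released.

#include <mutex>
#include <vector>

std::mutex       g_tableMutex;
std::vector<int> g_table;

bool AppendValue(int value)
{
    bool bRetVal = false;
    g_tableMutex.lock();
    if(value >= 0)               // pretend negative values are an error
    {
        g_table.push_back(value);
        bRetVal = true;
    }
    g_tableMutex.unlock();       // single exit point: the lock is always released
    return bRetVal;
}

Had there been early returns sprinkled through the body, each of them would have had to remember to unlock the mutex.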

Resource lock-up
One of the most easily committed mistakes is forgetting to free memory that was allocated before an error condition is encountered. In both Example 6A and 6B, whenever the error path is taken, the memory allocated to *ppResult is not freed. It is the procedure's responsibility to free all the resources it no longer uses; we cannot expect the caller to check the return value and free all the memory. And if the allocation is held only in a variable local to the procedure, the caller cannot free that memory at all.

Java, C# and the .NET extensions to C++ have introduced a Garbage Collector to take care of this situation. But there may be other resources, such as open files, network sockets or semaphores, which must also be released, and there is no built-in support for those. Most of the time programmers do free up resources, but they do it only on the success path. When employing single entry, multiple exit, very often some resource or other does not get freed in the error case. This can clog the resources and bring the system to its knees. If you adopt the Single Entry Single Exit style of coding, solving this problem becomes a bit easier: at the exit point you write the code for freeing the resources, and in all circumstances, i.e. success, failure, exception or error, you can be sure your resources are freed. Consider the following example.

Example 7
bool TanArcSine(double *pValues, int nLength, double **ppResult)
{
    bool bRetVal = false;

    if(nLength < 0)
        bRetVal = false;
    else
    {
        *ppResult = new double[nLength];
        bRetVal = true;    // assume success; the error paths below reset it
        for(int inx = 0; inx < nLength; inx++)
        {
            if((-1 <= pValues[inx]) && (pValues[inx] <= 1))
            {
                (*ppResult)[inx] = asin(pValues[inx]);
                if(fabs((*ppResult)[inx]) != M_PI/2)
                {
                    (*ppResult)[inx] = tan((*ppResult)[inx]);
                }
                else
                {
                    bRetVal = false;
                    break;
                }
            }
            else
            {
                bRetVal = false;
                break;
            }
        }

        if(false == bRetVal)
        {
            // error path: release the memory before leaving
            if(*ppResult)
            {
                delete[] *ppResult;
                *ppResult = NULL;
            }
        }
    }
    return bRetVal;
}

There is no such thing as a free lunch in this world. The code quality in this case comes at the price of extra baggage. In my opinion, the extra baggage is worth it.

The infamous goto


How many of you like goto? How many of you think goto is bad? How many of you believe goto is bad? What would you say if you saw a person writing code using gotos? With break, continue and return statements, it is almost always possible to write code without using goto. But goto comes in handy for a very easy and simple implementation of the Single Entry Single Exit style of programming. Consider the following example.

Example 8A
bool TanArcSine(double *pValues, int nLength, double **ppResult)
{
    bool bRetVal = false;
    *ppResult = NULL;

    if(nLength < 0)
        goto error_exit;

    *ppResult = new double[nLength];
    for(int inx = 0; inx < nLength; inx++)
    {
        if((1 < pValues[inx]) || (pValues[inx] < -1))
            goto error_exit;
        (*ppResult)[inx] = asin(pValues[inx]);
        if(fabs((*ppResult)[inx]) == M_PI/2)
            goto error_exit;
        (*ppResult)[inx] = tan((*ppResult)[inx]);
    }

    bRetVal = true;
    goto success_exit;

error_exit:
    bRetVal = false;
    if(*ppResult)
    {
        delete[] *ppResult;
        *ppResult = NULL;
    }

success_exit:
    return bRetVal;
}

This code is certainly more understandable than the one in Example 7, isn't it? The only goto statements present are those for success_exit and error_exit; no other goto is permitted. I know of many people who employ code quality check tools to trap all other usages of goto. Thereby Single Entry Single Exit is achieved with the minimum overhead of extra variables, while maintaining the ease of programming. A very controlled usage of goto is perhaps a viable trade-off in these cases. So the next time your quality assurance team screams at you for using a goto, ask them what they prefer: clean-looking code with all resources freed, at the cost of exactly two goto statements, or code that gets mired in if-then-else statements, is hard to write, understand and maintain, but has no goto. If seeing a goto statement still brings a tear to your eye, then try this: the same program as Example 8A, but slightly rewritten to use macros. This is one place where I find macros not just handy, but indispensable.

Example 8B
#define IF_TRUE_EXIT(expr)    \
{                             \
    if(expr)                  \
        goto error_exit;      \
}

#define SUCCESS_EXIT goto success_exit;

bool TanArcSine(double *pValues, int nLength, double **ppResult)
{
    bool bRetVal = false;
    *ppResult = NULL;

    IF_TRUE_EXIT(nLength < 0);
    *ppResult = new double[nLength];
    for(int inx = 0; inx < nLength; inx++)
    {
        IF_TRUE_EXIT((1 < pValues[inx]) || (pValues[inx] < -1));
        (*ppResult)[inx] = asin(pValues[inx]);
        IF_TRUE_EXIT(fabs((*ppResult)[inx]) == M_PI/2);
        (*ppResult)[inx] = tan((*ppResult)[inx]);
    }

    bRetVal = true;
    SUCCESS_EXIT;

error_exit:
    bRetVal = false;
    if(*ppResult)
    {
        delete[] *ppResult;
        *ppResult = NULL;
    }

success_exit:
    return bRetVal;
}

Return values
I add this section as an afterthought. Its intention is simply to remind people to check the return value of a function call. Even if you are sure the function cannot possibly fail, it is better to check the return value; you can't be too careful when it comes to coding. Assume the function you have called has failed, but has forgotten to free the memory allocated to an output pointer (probably because the called function is not Single Entry Single Exit). Now you don't check the return value but go ahead and access the output parameter. This can result in a memory access violation, an overflow and so on, which is a perfect exploit for a hacker. Sounding far-fetched? Think again.
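A small usage sketch of my own, reusing the TanArcSine() function from the earlier examples (the data values are invented for illustration):

double adValues[] = { 0.10, 0.25, 0.50 };
double *pResults  = NULL;

// Check the return value before touching the output parameter.
if(TanArcSine(adValues, 3, &pResults))
{
    printf("%f %f %f\n", pResults[0], pResults[1], pResults[2]);
    delete[] pResults;
    pResults = NULL;
}
// On failure, pResults is never dereferenced.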

Conclusion
Can bad coding practices ever be concluded? No, I don't think so. There isn't enough space in my brain to learn all the coding practices, and there certainly isn't enough computing power in it to judge them all and classify them as good and bad. Feeling rather proud of my experiences with C and C++ coding, I have started listing the practices which I found to be bad. Most of these bad practices were once followed by me. I make a diligent effort not to commit them nowadays, but that is a kind of perfection which has always remained beyond my grasp. I wish to keep this document updated whenever I learn a lesson.
