Why people must test programs is often debated. If everyone were ethical, we would not have to test for security risks, but the world is not entirely full of ethical people who ensure that correct data is entered into a system -- that is why safe practices need to be developed. The only way for this to take place is through bug testing. Testing affects two groups, the client and the programmer, and each has different needs and wants.
From the client's perspective, having a stable program that is guaranteed to perform its desired task is not only a reflection of the program, but also of the company itself. Poor products reflect badly on the company, so a solid and well-tested product needs to be ensured through bug testing before manufacturing takes place.
Management doesn't always know about product flaws; company directors assume that every function works smoothly without any defects. However, experience shows that no product/system can be deemed completely secure without controversy. There will always be bugs in a program; whether they are found or not is another question. Open Source makes it much easier to spot bugs and code flaws, and active security checks by the public help create a much more stable and operable program. This is one of the reasons why Microsoft products fail consistently when it comes to testing; their products are not Open Source, and therefore it is much harder to create a secure and flexible program without the aid of the programming community to help optimize code.
For the client, the purchaser of the software, a reliable program is without doubt key to performing daily tasks successfully. If the program is vulnerable to overflows, lack of input checks, or even lack of encryption, it will quickly become known for its instability, and product sales will drop dramatically. Customers will purchase alternative products that perform the same task and that have been carefully checked by multiple tests, as will be seen in the testing section of this document.
There is a high level of ethics involved when a programmer is contracted to develop a program. The programmer is the most important link in the chain when it comes to testing and coding a proficient software application. He/she is responsible for ensuring that all functions of the program work, and work efficiently; code optimization should be at its peak, with security functions in check. Better programs are known to have been thoroughly tested, with all sorts of data sets being properly dealt with. Operating systems like Linux are tested every day by programmers and crackers alike. Yes, security problems do exist in this environment, but most are patched or fixed, pushing towards one of the most stable systems currently around.
Sloppy programmers will not care about ethics, and will simply code the program to minimally function with all its client side requirements implemented. Some programmers deem financial security more important than ethical security -- be careful whom you contract to fulfill your programming requirements.
Goals should be adopted by programmers to ensure software quality assurance, but the customer has a responsibility to communicate to the programmer once a bug has been found.
The primary goal of a programmer is to complete a working program that serves its purpose to client-side requirements. Once this stage has been reached, the more advanced and less known methods should be then put into practice, adding functionality such as:
Adding security features is a must, and assures that code quality is evident within a program. Use of secure functions and methodologies/implementations should at this stage make itself known. This is where a gap between sloppy and aware programmers becomes apparent. All programs should aim for a level of code quality by utilizing the secure function calls within their specific programming languages, which helps create a more reliable and flexible program. Of course, one of the only certain ways to determine a program's reliability is through testing. Testing focuses on the need for rapid feedback and the evolving nature of the program under test. This is where clients/customers come into the picture.
Although programmers bear the most responsibility in terms of code reliability, clients and customers also need to be prepared to communicate with software engineers if a bug or flaw is observed in a program. If the observed output is different from what is expected, it's time to get in contact by means of a bug discussion list, email, phone -- whatever, but be sure to advise the correct people. It is important to inform product vendors before the public knows about it, especially if the bug could lead to increased privileges. This gives the vendors time to write patches/advisories for their clients before any damage can be done.
Testing software is always a step in the right direction. Effective bug testing by customers/clients will force the programmer to improve code quality and security in future products; that's why we must tolerate and thank the software task forces out there that make software vulnerabilities public, such as BUGTRAQ.
When reporting a bug, always be sure you can reproduce it, and always include a detailed description of exactly how the bug was found and the type of system that you tested the software on. The more information the better, but be sure not to obfuscate the description -- get as many of the basic facts down as possible. In particular, segmentation faults generally cause core dumps (a memory image of the terminated process when any of a variety of errors occur) which hold vast amounts of information to help the programmer locate where the bug took place. Remember, full disclosure is bliss.
Developing a program or system effectively needs to be thoroughly thought out before any raw code is actually written down. One of the most important methods of establishing functional requirements is a storyboard. Prototypes may consist of a storyboard, a sequence and series of screens, showing the end-user a typical scenario of using the program/system.
This is one of the most useful methods for making sure the programmer understands just what a program is intended to do. A functional prototype is a very limited version of the final program which gives some idea of the appearance of the final product, but with a lot of functions missing. Displaying a simple storyboard to a client or bug tester is necessary, as they will be able to comment on whether the "expected input" produces the output they expect when compared with the "observed output" resulting from running the program. This will also force the programmer to think through many of the details of what the program is meant to do.
Creating workable and effective sets of tests is intellectually challenging. Testing can almost never be exhaustive, and it may even be possible that not all programming flaws are evaluated even after very stringent testing has been covered. In the real "commercial" world, a significant source of program defects is created by people running tests and not checking the results carefully; the programmers run the tests, but do not take enough care in reviewing the results to see that the tests showed unexpected flaws in the programs.
Tests must be convincing, and must demonstrate a successful performance of the program. In a commercial setting, there are many methodologies used to produce a set of tests. One of the necessary tests that should be first evaluated covers the main function of the program. The programmer must decide on a set of tests that enable him/her to see if the code achieves its desired outcome.
All conditions of the program need to be thoroughly checked, including:
Naturally, sets of tests will exercise the same parts of the program repeatedly. This may seem like duplication, but grouping inputs into classes that the program should treat alike and testing representatives of each class (known as "equivalence partitioning") is standard, economical testing. Perhaps part of the code works in one scenario, but not another -- this needs to be carefully checked. The first thing a programmer needs to understand is that testing will demonstrate the presence of bugs, but it will not demonstrate the absence of bugs. Semantic errors fall into this category -- that is, errors in the logic of the program that the compiler or interpreter is unable to help you with.
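As a minimal sketch of the idea, the Python below picks one representative input per class plus a value on each side of every boundary. The validator `accepts_hostname` and its 64-character limit are hypothetical, invented only for illustration.

```python
def accepts_hostname(name: str, limit: int = 64) -> bool:
    """Hypothetical validator: accept non-empty names of at most `limit` characters."""
    return 0 < len(name) <= limit


# One representative per equivalence class, plus a value on each side of every boundary.
CASES = [
    ("", False),          # class: empty input (just outside the lower boundary)
    ("a", True),          # lower boundary, inside
    ("a" * 30, True),     # class: typical valid input
    ("a" * 64, True),     # upper boundary, inside
    ("a" * 65, False),    # upper boundary, outside
    ("a" * 1000, False),  # class: grossly oversized input
]


def failed_cases(validator=accepts_hostname):
    """Return the input lengths for which the validator disagrees with the expectation."""
    return [len(text) for text, expected in CASES if validator(text) != expected]


if __name__ == "__main__":
    print("failing input lengths:", failed_cases())
```

An off-by-one mistake such as writing `<` instead of `<=` passes the "typical" case and is caught only by the value sitting exactly on the boundary, which is why both sides of each limit are listed.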
Testing falls into two broad categories:
Defect testing tries to detect all the defects the program may have. All parts of the program should be tested, and if the programmer feels that one part of the code may not properly deal with unexpected input, more rigorous tests should be performed on that area of the code. One key point to remember is that nobody knows a program better than the programmer. The programmer will know the area of the program that is most likely defective, so a designed set of tests on that area should be run before a Beta release is produced.
Regression testing stems from defect testing, and is the process of testing changes within the programming environment to make sure the older program still works with the new implemented changes. Regression testing is a normal part of the program development process and, in the commercial world, is performed by code testing specialists. Test department coders develop code test scenarios and exercises that will test new units of code after they have been written. These test cases form what becomes the test bucket. Before a new version of a software product is released, the old test cases are run against the new version to make sure all the old capabilities still work. The reason they might not work is that changing or adding new code to a program can easily introduce errors into code that was not supposed to be changed, and thus will obscure test results. Recursive regression testing is a must!
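The test bucket can be sketched in a few lines of Python. Everything here is hypothetical: `slugify_v1` stands for the released code, `slugify_v2` for a new version that adds a feature, and the three cases for tests written during earlier releases.

```python
def slugify_v1(title: str) -> str:
    """Old, released behaviour: trim, lowercase, spaces become hyphens."""
    return title.strip().lower().replace(" ", "-")


def slugify_v2(title: str) -> str:
    """New version: also strips punctuation (and, by accident, existing hyphens)."""
    kept = "".join(ch for ch in title.lower() if ch.isalnum() or ch == " ")
    return kept.strip().replace(" ", "-")


# The test bucket: cases written for earlier releases, kept and re-run on every change.
TEST_BUCKET = [
    ("Hello World", "hello-world"),
    ("  Bug Testing ", "bug-testing"),
    ("re-test plan", "re-test-plan"),
]


def run_bucket(func, bucket=TEST_BUCKET):
    """Return (input, expected, actual) for every case the given version fails."""
    return [(arg, want, func(arg)) for arg, want in bucket if func(arg) != want]


if __name__ == "__main__":
    print("v1 failures:", run_bucket(slugify_v1))
    print("v2 failures:", run_bucket(slugify_v2))
```

Running the old bucket against the new version exposes that the punctuation change broke a capability nobody intended to touch, which is exactly the kind of error regression testing exists to catch.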
Acceptance testing is done in conjunction with defect testing, and runs an agreed set of tests with an agreed output. These should demonstrate that the code does an agreed task well enough for the programmer and client to be satisfied. In the commercial world, acceptance tests are part of the contract for defining what the customer insists on before money ever changes hands.
Prototyping of this nature is relatively simple. Structural prototyping is a stripped-down version of a program that shows a structure, in skeleton form, of the complete version. All major aspects of the code are written, but routines and sub programs are written only as stubs, comments/statements within the program that show the programmer that the actual routine has been called or executed.
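A structural prototype of this kind might look like the following Python sketch. The routine names (`read_input`, `validate`, `write_report`) and the file name are hypothetical; each stub does nothing except record that it was called.

```python
CALL_LOG = []


def stub(name, result=None):
    """Build a placeholder routine that only records that it was called."""
    def routine(*args, **kwargs):
        CALL_LOG.append(name)
        return result
    return routine


# The real routines do not exist yet; the skeleton is wired up with stubs.
read_input = stub("read_input", result=[])
validate = stub("validate", result=True)
write_report = stub("write_report")


def main():
    """Complete control flow of the program, with every routine still a stub."""
    CALL_LOG.clear()
    records = read_input("data.txt")
    if validate(records):
        write_report(records)
    return list(CALL_LOG)


if __name__ == "__main__":
    print(" -> ".join(main()))
```

The overall structure can be reviewed and exercised before any routine is written, and each stub can later be swapped for the real implementation without changing `main`.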
Maintaining effective code that is easily interpreted by the programmer and other developers (and allows further extensions to be added easily) requires three code characteristics:
Understandability means that programs that are easier to understand are considered to be better designed than ones that do the same task but are harder to understand. A key to developing stable code is a good functional prototype that allows the general idea of the program to be observed before code writing takes place. It may also be necessary to note that better code is clear and neatly presented, spaced out where necessary with comments to let the reader understand what is going on.
Adaptability refers to how easy it is to modify areas of the code to perform alternate tasks. This is directly linked to understandability. The more understandable the code, the easier it is to adapt.
Cohesion refers to a routine or sub program that does one clear task which is apparent to the reader and programmer. A well-defined task should give a clear indication of what the program is intended to do; this includes well-chosen names for variables, constants, headers, etc. As small as this concept may seem, it allows any coder to pick up the source and quickly scan through and understand what the program is about.
Whether you are checking the source for bugs or testing the executable for flaws, all of the above tests need to be considered and exercised. Bugs most commonly present themselves at boundary conditions. When designing a set of tests, it cannot be stressed enough that boundaries need to be checked on both sides of their "walls". Other flaws that should be checked before releasing a beta include the mishandling of format control strings such as %s. The programmer must employ capable input routines/parameters to correctly deal with user-supplied input, ensuring that all possible scenarios have been considered before adopting the most suitable code to perform the given command. This includes the function calls themselves: avoid getenv(), strcpy(), and sprintf() wherever possible in exchange for more secure methods like strncpy() or snprintf(); the "n" refers to the maximum number of bytes allowed to be copied to a buffer (note that strncpy() does not null-terminate the destination when the source fills it completely). Avoid the common mistakes sloppy programmers make when reading user-supplied variables from the terminal or environment. Establish your own method of setting or checking the environment, and make it insusceptible to malformed data that could lead to unexpected outcomes such as spawning a shell -- a definite security risk, and one that is often observed in many UNIX environments. (Early ZGV [a console graphics viewer] releases were always victim to getenv("HOME") problems.)
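The idea of checking the environment before trusting it can be sketched as follows. This is Python rather than C, so it illustrates the validation step only, not the bounded copy itself; the 255-byte limit, the `/tmp` fallback, and the rule of rejecting "%" are assumptions chosen for illustration, and the absolute-path check assumes a UNIX-style system.

```python
import os

MAX_HOME_LEN = 255  # assumed limit, chosen for illustration only


def safe_home(environ=None, default="/tmp"):
    """Return HOME only if it looks like a sane absolute path, else `default`."""
    env = os.environ if environ is None else environ
    home = env.get("HOME", "")
    if not home or len(home) > MAX_HOME_LEN:
        return default  # missing or oversized value
    if not home.startswith("/"):
        return default  # relative paths are not accepted
    if any(ord(ch) < 32 or ch == "%" for ch in home):
        return default  # control characters or format specifiers
    return home


if __name__ == "__main__":
    print(safe_home())
    print(safe_home({"HOME": "A" * 1024}))
```

Passing the environment in as a parameter keeps the check testable: malformed values can be supplied directly without changing the real environment of the test machine.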
Another way to use acceptance testing to expose flaws is to start from the proper data set meant to be sent to the program, but to send excessive data to a particular input command, such as sending 1024 bytes to a 512-byte buffer, causing an overflow.
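A rough sketch of such a probe is shown below. The handler is a hypothetical stand-in for the program under test, and it fails cleanly with an exception; a genuinely vulnerable program would more likely crash or misbehave, which the tester would have to detect by other means.

```python
BUFFER_SIZE = 512  # hypothetical fixed buffer inside the program under test


def handle_input(data: bytes) -> int:
    """Stand-in for the program under test: it rejects input larger than its buffer."""
    if len(data) > BUFFER_SIZE:
        raise OverflowError(f"{len(data)} bytes do not fit in {BUFFER_SIZE}")
    return len(data)


def probe_limit(handler, start: int = 64, max_size: int = 4096):
    """Double the payload size until the handler fails; return the first failing size."""
    size = start
    while size <= max_size:
        try:
            handler(b"A" * size)
        except Exception:
            return size
        size *= 2
    return None  # no failure observed up to max_size


if __name__ == "__main__":
    print("first failing size:", probe_limit(handle_input))
```

Doubling finds the rough neighbourhood of the limit quickly; the exact boundary can then be narrowed down by testing the sizes between the last success and the first failure.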
Sometimes, when a program appears to have decreased its efficiency in terms of speed or processing of the data, it may be directly linked to a heap or stack overflow caused by corrupt data being entered. At this stage, vital tests need to be conducted by the bug tester.
Let's take a real-life example of a program in which I exposed a flaw not long ago, the WinSMTPD mailer/pop3d daemon, versions 1.06f and 2.X.
After acceptance testing this program, everything worked well. All the desired tasks of the program were fulfilled, and the smtpd and pop3d servers performed their tasks efficiently. Now, here is where defect testing came into play:
To start an SMTP transaction, the client needs to send a "HELO %s" call, where the format string "%s" is your hostname. WinSMTPD only allowed a fixed buffer of 170 bytes before the expected output became unexpected. When I sent 150 bytes after the HELO field, the program noticeably paused before proceeding to function as normal. That told me that one of two things had happened:
As it turned out, WinSMTPD was vulnerable to a stack overflow. By sending 170+ bytes to the HELO field, I got:
WINSMTP caused a general protection fault in module WINSMTP.EXE at 0003:00002359.
Registers:
EAX=461e0001 CS=42e7 EIP=00002359 EFLGS=00000246
EBX=00807fe0 SS=4207 ESP=00007e36 EBP=00004141
ECX=00010283 DS=4207 ESI=0000544c FS=05c7
EDX=58600000 ES=461e EDI=00001547 GS=0000
Bytes at CS:EIP:
cb 49 73 49 63 6f 6e 69 63 00 00 58 4c 6f 63 00
Stack dump:
41414141 41414141 41414141 41414141 41414141 41414141 41414141 41414141
41414141 41414141 41414141 41414141 41414141 41414141 41414141 41414141
Obviously, this isn't what the programmer had in mind when performing an SMTP transaction. The 41414141 that appears on the stack is the hexadecimal ASCII value of "A" (0x41) repeated, which is what I had filled the buffer with. From this general protection fault, we as bug testers and programmers are able to ascertain that this 16-bit program (judged by the leading 0s within the memory registers) has successfully overwritten the EBP register (+4 bytes for EIP), and as ethical programmers/bug testers, that's all we need to know to fix this bug. If there were, say, an unethical cracker out there, loading up the stack with malicious data could allow arbitrary code to be executed from the stack, and anything is possible from there. This is why it is important to test for bugs, and especially to check the boundaries and the data that is allowed to be sent by the client/user.
Although I approve of people writing "proof of concept" exploits to expose the existence of a bug in a program (as I am a firm believer in full disclosure and an advocate for Open Source), it is neither ethical nor advisable to run these scripts without the direct consent of those you are exploiting. (POC exploits are necessary in whitehat cracker security firms to prove and demonstrate a code flaw.)
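The link between the filler bytes and the values in the dump can be checked with a short Python sketch. The helper names are hypothetical, and nothing here opens a network connection; it only builds the test line and decodes hex digits copied from the fault report.

```python
def build_helo(hostname_len: int, fill: bytes = b"A") -> bytes:
    """Build a HELO line whose hostname is `hostname_len` copies of `fill`."""
    return b"HELO " + fill * hostname_len + b"\r\n"


def hex_to_ascii(hex_digits: str) -> str:
    """Decode hex digits from a register or stack dump into ASCII text."""
    return bytes.fromhex(hex_digits).decode("ascii")


if __name__ == "__main__":
    print(len(build_helo(171)), "bytes in the test line")
    print(hex_to_ascii("41414141"))  # one word from the stack dump
    print(hex_to_ascii("4141"))      # low 16 bits of EBP=00004141
```

Filling the buffer with one recognisable byte makes it easy to see at a glance which registers and stack words were overwritten by the tester's own input.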
Data sets and tests fed to the program/system are effectively system calls executed by active processes. These include different kinds of programs (e.g., programs that run as daemons and those that do not), programs that vary widely in their size and complexity, and programs with different purposes. Spawns or fork()s by applications are tested when the maximum process limit is exhausted by various resource-depleting exploits; this too needs to be prepared for when making a heavily-used program. Normal input data can be "synthetic" or "live". Synthetic traces are collected in production environments by running a prepared script, often called a driver program. The program options are chosen solely for the purpose of exercising the program (acceptance testing), not to meet any real user's requests. Live traces of programs are collected during normal usage of a production computer system (manual code testing and boundary testing). Both methods are often used when testing software applications en masse.
So, you think you've found a bug? Then read on, here's what to do next:
If a user has somehow stumbled on a logical error or security vulnerability within a tested (beta or stable) product, it's necessary to report the bug immediately to the vendor. More of this was discussed in the "Development Goals" subtopic, but visually displaying a practical advisory was not. The bug report should include most, if not all, of the following information, generally in brief conceptual form:
Bug synopsis: a brief paragraph explaining the vulnerability
Description: the sequential steps taken to produce the bug
Attachments: any relevant materials, such as core dumps and message logs
Environment: system specifications and conditions used to test the bug
Contact info: how the vendor can contact you with further comments/queries
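One way to keep reports consistent is to fill in a small structure with the fields listed above and render it as plain text. The sketch below is only an illustration; the class name `BugReport` and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BugReport:
    synopsis: str
    description: List[str]  # sequential steps that reproduce the bug
    environment: str
    contact: str
    attachments: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the report as plain text, one labelled field per section."""
        steps = "\n".join(f"  {i}. {step}" for i, step in enumerate(self.description, 1))
        files = ", ".join(self.attachments) or "none"
        return (
            f"Bug synopsis: {self.synopsis}\n"
            f"Description:\n{steps}\n"
            f"Attachments: {files}\n"
            f"Environment: {self.environment}\n"
            f"Contact info: {self.contact}\n"
        )


if __name__ == "__main__":
    # Placeholder values only; replace with the facts of the actual bug.
    report = BugReport(
        synopsis="Server crashes on an overlong command argument",
        description=["Connect to the service", "Send the overlong argument"],
        environment="Test machine, default configuration",
        contact="reporter contact details",
        attachments=["core dump", "message log"],
    )
    print(report.render())
```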
If the bug has been accepted by the vendor as being a vulnerability that could lead to such problems as network/software penetration, increased privileges, or excessive system resource usage, the vendor should issue a public advisory through mailing lists, the vendor's Web site, and/or direct email to customers. It's then the responsibility of the programmer/manufacturer to offer instructions to the client to patch his/her software/system so the vulnerability is removed. The advisory should include the following information:
Date: date of the advisory's release
Affected systems: a list of the environments/settings in which the bug may occur
Description: similar to the client's description, but with more technical inside info
Patch: the URL of the patch or description of how to correct the bug
Contact: how clients can contact the vendor for more info -- phone, e-mail, URL
This communication link creates a much friendlier atmosphere between users and vendors, which helps software development become a more stable and reliable community -- one that excels in safe security practices.
I made a generic resource kit earlier this year. It consists of seven skeletal template scripts coded in Perl for various purposes of testing network services in a Linux/Unix environment. It includes tests for malformed HTTP "GET" requests, multiple thread connections, random data streaming, ICMP error generation, etc. It's mainly used as a research and development kit to help spot bugs more easily, particularly in server/router software; feel free to expand it. It can be downloaded from http://dethy.synnergy.net/reskit.tar.
Luke Andrews (firstname.lastname@example.org) works as a UNIX systems administrator for Errata Internet Solutions (a new Web hosting/shell/security company), and does Research and Development for Synnergy Research Labs as a hobby. His own code (and advisories) can be found at http://dethy.synnergy.net/.