Introduction
Creating good, successful software is hard - very hard.
For every programmer, it is therefore important to know, understand and apply fundamental software development pragmatics - practical advice and rules that have proved their usefulness and help us to create the best possible software in the shortest possible time.
In this article I have tried to assemble a set of basic pragmatics which I consider to be the most relevant ones. If you are an experienced programmer then you might be familiar with most/all of them. If you miss some advice then please share it with us by leaving a comment.
Please note:
-
The pragmatics in this article are about designing and writing code. There are other important aspects for software development projects to be successful (such as good user interfaces, dealing with people, etc.), but they are out of the scope of this document.
-
The following pragmatics only cover basic principles that are meant to be generally applicable. There is no advice for specific programming environments (programming languages, libraries, tools, and architectures).
The tips are divided into three categories:
-
General Guidelines
-
Data Design
-
Writing Code
Let’s get started.
General Guidelines
Everything should be as simple as possible, but not simpler! - Albert Einstein
Simple tools and concepts can be understood quickly, are easy to use, less error-prone and make us more productive.
We all like simplicity. Simplicity makes work and life more enjoyable.
However, we must beware of oversimplification, as the end of Einstein’s beautiful quote indicates.
Therefore: Keep it simple, but not simplistic!
Many famous people advocate simplicity. Here are a few examples:
Simplicity is prerequisite for reliability.
Simplicity and clarity … decide between success and failure.
Dutch computer scientist; author of 'Go To Statement Considered Harmful'
Controlling complexity is the essence of computer programming.
Computer scientist and writer; co-developer of Unix
Simplicity is the ultimate sophistication.
Simplicity transforms ordinary into amazing.
creator of Dilbert
If you can’t explain it simply, you don’t understand it well enough.
Physicist; genius
Truth is ever to be found in simplicity, and not in the multiplicity and confusion of things.
Mathematician; astronomer; theologian; author and physicist
If it’s doomed to fail, then 'Fail fast!'
Most software projects fail. This is a sad and undeniable fact.
If a project is doomed to fail, then it should fail as fast as possible, in order to limit the damage and free up time and resources for other projects that will (hopefully) not fail.
Failing fast in this context means that problems should be detected and dealt with as soon as possible. The longer we wait to address a problem, the more time, energy and resources will be wasted, and the cumulative loss tends to grow faster and faster over time.
For example, correcting a design flaw in the design phase is easy and cheap. But once the software is in production and used by many people, it is often very expensive and frustrating to fix the error.
The sooner we 'fail' and the faster we learn, the greater the chances for success. Failing early saves you time and money.
product manager at Adobe
Test fast, fail fast, adjust fast.
author of the bestselling book 'In Search of Excellence'
Strive for 'good enough', not for 'perfect'. Then release!
Creating perfect software (no bugs, all features fully implemented, optimal user interface, excellent documentation, etc.) is extremely time-consuming and expensive, unless we are working on a very small project. In most cases it is impossible to achieve perfection because of practical limitations.
Even the major players in the software industry with the best developers and dream-budgets don’t write perfect software. That’s why they constantly provide patches and new versions.
Therefore, proceed like this:
-
Set goals and priorities
-
Create a prototype
-
Deliver 'good enough' software
-
Improve continuously (see next item)
Prototype | Good enough | Better
You cannot write 100% perfect code. Even if you do, in 6 months time it will not be perfect.
Co-founder of CodeProject
No one in the brief history of computing has ever written a piece of perfect software.
Writer; co-author of 'The Pragmatic Programmer'
90% of the functionality delivered now is better than 100% of it delivered never.
Computer scientist and writer; co-developer of Unix
Searching for perfect solutions often will lead to stagnation and frustration. Perseverance, tolerance for less than perfection, the pursuit of improvement, and commitment to doing the very best you can, all are healthy, and most likely to yield the best results.
Psychotherapist
Listen to the users!
Practice shows:
-
Software developers can’t anticipate everything the users really want.
-
Users often don’t know exactly what they want, unless they have used the software for some time.
-
The users' satisfaction is a decisive factor for the software’s success.
We want happy users. Therefore the best approach is an iterative one that goes like this:
After spending a lot of time with trying out some marketing tricks he [Joel Spolsky, co-founder of Stack Exchange] concludes (after 5 years): Nothing works better than just improving your product. Make great software that people want and improve it constantly. Talk to your customers (users) and listen. Find out what they need.
... we build prototypes for every major product feature. We test with pre-release users and key customers very early on and definitely before we get to implementation. And yes, we do 'fail' a lot, which is great! It’s great because we learn a ton in the process and minimize the risk of failure in the long run. In the end, we tackle real customer needs, with empathy, in an innovative way.
product manager at Adobe
The most important property of a program is whether it accomplishes the intention of its user.
Computer scientist; ACM Turing Award 1980
Data Design
Before writing code, design your data carefully!
Whenever you create an application, start by carefully designing the data structures and their relationships. Do this before writing code.
Well-designed data structures lead to simpler and more maintainable code, fewer bugs, better performance and lower memory consumption.
The difference can be striking.
Study after study shows that the very best designers produce [data] structures that are faster, smaller, simpler, clearer, and produced with less effort. The differences between the great and the average approach an order of magnitude.
book 'No Silver Bullet'
Show me your flowcharts (code) and conceal your tables (data structures), and I shall continue to be mystified. Show me your tables (data structures), and I won’t usually need your flowcharts (code); they’ll be obvious.
book 'The Mythical Man-Month'
Rule of Representation: Fold knowledge into data so program logic can be stupid and robust.
Data dominates. If you’ve chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming
I’m a huge proponent of designing your code around the data, rather than the other way around.
Bad programmers worry about the code. Good programmers worry about data structures and their relationships.
creator of Linux
If you get the data structures and their invariants right, most of the code will just write itself.
book 'Coders at Work'
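As a minimal sketch of the 'fold knowledge into data' idea from the quotes above, the following Python snippet keeps the domain knowledge in a table, so that the logic shrinks to a lookup. The names `SHIPPING_RATES` and `shipping_cost`, as well as the regions and rates, are invented for illustration.

```python
# Hypothetical shipping rates per region, in cents per kilogram.
# The knowledge lives in the table, not in a chain of if/elif branches.
SHIPPING_RATES = {
    "domestic": 500,
    "europe": 1200,
    "worldwide": 2500,
}


def shipping_cost(region: str, weight_kg: int) -> int:
    """Return the cost in cents; raise for an unknown region or bad weight."""
    if region not in SHIPPING_RATES:
        raise ValueError(f"unknown region: {region!r}")
    if weight_kg <= 0:
        raise ValueError(f"weight must be positive, got {weight_kg}")
    return SHIPPING_RATES[region] * weight_kg
```

Adding a new region then means adding one table entry; the function itself stays untouched.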
Make all data structures immutable, unless there is a good reason to make them mutable!
Immutable data structures are simpler to understand, simpler to use and less error-prone because:
-
Once created, the state doesn’t change anymore. There are no state transitions, no temporary invalid states and there is no need for synchronization or locking or defensive copying.
-
Immutable data can be freely shared in concurrent/parallel computing environments - there is no risk of corrupted data, deadlocks or other nasty problems that can be very difficult to track down and fix.
-
Computed results based on immutable data can easily be cached for better performance.
However, immutable data structures are not always the best choice. For example, cloning a whole structure for every single change can be very costly (in time and space). In some cases (e.g. game or GUI applications) mutable data structures are better suited.
Moreover, if two objects need to directly refer to each other, then the objects (at least one of them) must be mutable. Examples would be a parent and child node in a tree referring directly to each other, mutual friends (A points to B, and B points to A), etc. If immutability must be preserved in such a case, then a possible solution would be to have an additional data structure that describes the relationships, such as a set of tuples that represent a pair of objects related to each other.
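The following Python sketch illustrates both points under simple assumptions: a frozen dataclass stands in for an immutable data structure, and mutual friendships are kept in a separate set of pairs, so the objects themselves never have to refer to each other. The `Person` type, the sample values and the `are_friends` helper are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Person:
    """An immutable record; assigning to a field raises an error."""
    name: str
    city: str


alice = Person("Alice", "Paris")
bob = Person("Bob", "Rome")

# A 'change' produces a new object; the original stays untouched.
alice_moved = replace(alice, city="Berlin")

# The relationship is stored outside the objects, as a set of pairs.
friendships = frozenset({(alice, bob)})


def are_friends(a: Person, b: Person) -> bool:
    """Friendship is mutual, so the pair is checked in both orders."""
    return (a, b) in friendships or (b, a) in friendships
```

Note that `alice_moved` is a different value than `alice`, so it is not covered by the stored friendship. This is one of the trade-offs of relating immutable values instead of mutable identities.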
Limit the set of allowed values to the smallest possible one!
Limiting the set of allowed values for a data type:
-
documents and helps to understand the data type
-
eliminates the risk of misbehavior or severe failures due to wrong values
-
simplifies the code that has to deal with the data
For example, consider the case of a name field for data type employee. By allowing any string to be stored in name, the following can happen:
-
A long string can lead to a buffer overflow (depending on the programming language), memory exhaustion, or other software failures.
-
Characters that are invalid in a name, but used as valid symbols in JavaScript and SQL (such as <, > and ") can open the door for attacks such as SQL and/or script injections.
-
An empty or null string can lead to bugs if the code doesn’t explicitly handle these values correctly.
To avoid these risks, the name field should be constrained. For example, the following simple regular expression eliminates all above mentioned problems:
[a-zA-Z ]{1,70}
This regex limits the name to a maximum of 70 characters, requires at least one character and allows only letters and spaces. Note, however, that the above regex would be over-simplified and not suitable in real-life applications that must allow names containing hyphens, apostrophes, and maybe other symbols.
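A minimal Python sketch of how the regular expression above could be enforced at the boundary of the application. The function name `validated_name` is an assumption made for this example, and the pattern is the deliberately simplistic one from the text, not a recommendation for real-life name validation.

```python
import re

# The pattern from the text: 1 to 70 letters or spaces.
NAME_PATTERN = re.compile(r"[a-zA-Z ]{1,70}")


def validated_name(value: object) -> str:
    """Return value unchanged if it is an allowed name, else raise ValueError."""
    # fullmatch requires the whole string to match, not just a prefix.
    if not isinstance(value, str) or NAME_PATTERN.fullmatch(value) is None:
        raise ValueError(f"invalid employee name: {value!r}")
    return value
```

Rejecting `None`, the empty string, overlong strings and symbols in one place means the rest of the code can rely on every stored name being well-formed.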
Protecting data against invalid values (especially in case of data read from external sources) is often cited as the most important rule for writing secure software. (for example, see OWASP Top 10 Critical Web App Vulnerabilities and Top 10 Secure Coding Practices)
Example of an attempt for SQL injection in a data entry form:
Example of a Javascript injection attempt:
Most of the time, a default value should be the strictest among the set of allowed values.
By choosing the strictest possible value as default value, we are always on the safe side.
More permissive values should have to be stated explicitly (in the code, in configuration files, etc.).
For example:
-
A function that writes to a file should, by default, not overwrite an existing file.
-
All compiler warnings should be enabled by default.
-
If a new user is added in a multi-user application then the lowest privileges should be granted by default.
The last example demonstrates that strict default values are not always the best choice. They can be annoying. If the multi-user application is only used by a single person on his/her PC, then the user obviously wants to have full rights by default. Hence, the best default value sometimes depends on several factors.
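The file-writing example from the list above could look roughly like this in Python. The function name `write_text_file` and its `overwrite` parameter are assumptions made for this sketch; the strict behavior relies on the built-in exclusive-creation mode "x" of `open`.

```python
from pathlib import Path


def write_text_file(path: Path, text: str, overwrite: bool = False) -> None:
    """Write text to path; refuse to replace an existing file by default."""
    # Mode "x" creates the file and raises FileExistsError if it already exists.
    mode = "w" if overwrite else "x"
    with open(path, mode, encoding="utf-8") as f:
        f.write(text)
```

A caller who really wants to replace a file has to say so with `overwrite=True`, which makes the more permissive behavior visible in the code.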
Example of a strict default value in a GUI:
Avoid data redundancy!
Storing copies of mutable data at different locations is error-prone, laborious and expensive because:
-
In case of data changes there is a risk of not updating all locations - due to forgetfulness, technical problems, security concerns, etc. This can lead to data inconsistencies and corruption, and can ultimately result in severe software malfunctions.
-
Sometimes a locking/synchronization mechanism needs to be implemented and activated for every data modification (create, update, and delete operations), in order to avoid accessing invalid data during the data modification process. Implementing these mechanisms can be very tricky and error-prone.
-
Additional memory is needed to store the copies.
Banal example: A salesman misses an important appointment. Reason: He entered the appointment into his agenda on his PC, but forgot to synchronize the data with another agenda on his mobile phone. If the data for both agendas were stored in one place (e.g. in the cloud), he wouldn’t have missed his appointment.
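The agenda example can be sketched in a few lines of Python. All names here (`Agenda`, `DeviceView`) are hypothetical; the point is only that both devices read from one shared store instead of each keeping a private copy.

```python
class Agenda:
    """The single place where appointments are stored."""

    def __init__(self) -> None:
        self._appointments = []

    def add(self, title: str) -> None:
        self._appointments.append(title)

    def titles(self) -> tuple:
        # Hand out a read-only snapshot, not the internal list.
        return tuple(self._appointments)


class DeviceView:
    """A device that displays the shared agenda instead of copying it."""

    def __init__(self, device_name: str, agenda: Agenda) -> None:
        self.device_name = device_name
        self._agenda = agenda

    def show(self) -> tuple:
        return self._agenda.titles()
```

An appointment added once through the agenda is visible on every device view, so there is nothing to synchronize and nothing to forget.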
Consider storing data in simple text files using universal, standardized formats!
Using text files as storage media has many advantages:
-
Text files are fully supported by all operating systems. There is no need to install and configure additional software such as a database server.
-
Text files can easily be read and manipulated by humans. This is very convenient for debugging purposes.
-
They can also easily be read and manipulated by many third-party applications (written in any programming language) and tools. For example, Unix provides many useful tools for text manipulation, such as grep, awk, sed, etc.
-
Using standard formats such as JSON, XML and CSV enables the data to be queried, sorted, filtered, searched, printed and converted by many existing applications. For example, a CSV file can easily be used in a spreadsheet application.
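As a rough illustration of the advantages listed above, the following Python sketch converts the same records to JSON and to CSV using only the standard library. The sample records are invented, and the text is kept in memory here; in a real application it would be written to a file.

```python
import csv
import io
import json

# Invented sample data.
employees = [
    {"name": "Ada", "department": "Engineering"},
    {"name": "Grace", "department": "Research"},
]

# JSON: human-readable text that can be parsed back without loss.
json_text = json.dumps(employees, indent=2)
restored = json.loads(json_text)

# CSV: the same records in a form that any spreadsheet application can open.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "department"])
writer.writeheader()
writer.writerows(employees)
csv_text = buffer.getvalue()
```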
However, text files also have severe limitations, especially in cases of big data. Complex queries (with filters and joins), update and delete operations, transactional processing, data encryption and other functionalities might need to be implemented manually and can be very inefficient. When it comes to big data, storing it in a database is often the only viable choice.
Moreover, storing data as characters (instead of bits) can consume much more space and time. Therefore, binary data are sometimes unavoidable.
A nice example of using text to represent graphics is SVG (Scalable Vector Graphics).
Here is a simple example of an SVG file (SVG_example.svg):
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" width="140" height="140">
  <circle cx="70" cy="70" r="40" stroke="red" stroke-width="4" fill="yellow" />
</svg>
Opening the file (for example with your web browser) displays the following image:
Write programs to handle text streams, because that is a universal interface.
Inventor of Unix pipes
Unix tradition strongly encourages writing programs that read and write simple, textual, stream-oriented, device-independent formats.
The Art of Unix Programming
Writing Code
Write beautiful code!
Beautiful code is short, simple, quickly understandable, modular, extensible, reliable and maintainable.
Beautiful code rarely needs comments and reads like prose.
Developers love to look at and work with beautiful code. It makes them feel good.
Writing beautiful code is an art that requires passion, deep commitment, and many years of practice.
Great software requires a fanatical devotion to beauty. If you look inside good software, you find that parts no one is ever supposed to see are beautiful too.
Current development speed is a function of past development quality.
Programs should be written for people to read, and only incidentally for machines to execute.
Because maintenance is so important and so expensive, write programs as if the most important communication they do is not to the computer that executes them but to the human beings who will read and maintain the source code in the future (including yourself).
Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
You can always recognize truth by its beauty and simplicity.
Use existing software whenever possible!
Don’t reinvent the wheel!
Before writing new code, be sure the functionality you need doesn’t exist already.
Look at existing libraries and frameworks (maybe written in different languages), tools, other applications and cloud services to find what you need, or to get ideas and become enlightened.
For example, Unix provides many powerful, reliable and flexible out-of-the-box tools (also available on Windows) for text manipulation. Use them in your application if they are appropriate.
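The same applies to the standard library of your programming language. As a small sketch, counting word frequencies in Python needs no hand-written counting loop, because `collections.Counter` already provides it. The function name `most_common_words` and the simple word pattern are assumptions made for this example.

```python
import re
from collections import Counter


def most_common_words(text: str, n: int) -> list:
    """Return the n most frequent words as (word, count) pairs."""
    # A deliberately simple notion of 'word': lowercase letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)
```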
... library code is likely to be better than code that you’d write yourself and is likely to improve over time. … library code receives far more attention than most developers could afford to devote to the same functionality.
book 'Effective Java'
Thou shalt study thy libraries and strive not to re-invent them without cause, that thy code may be short and readable and thy days pleasant and productive.
The Ten Commandments for C Programmers
The best code is no code at all.
If I have seen further than others, it is by standing upon the shoulders of giants.
Mathematician; astronomer; theologian; author and physicist
Fail fast and noisily!
Detect and fix problems as early as possible!
This helps to minimize development time and costs, and reduces the risk for failures and damages in production mode.
As many bugs as possible should be detected automatically. This can be achieved with smart IDEs, compilers, static code analyzers, fuzz testers, etc. Automatically detected bugs are very cheap to discover, and they can’t go into production and cause malfunctions.
Write good unit tests.
Bugs that stay undetected before running the application, as well as all kinds of runtime problems (e.g. file not found, invalid input data, etc.) should be detected as early as possible at runtime.
Therefore, check all input for valid values (function arguments, user input, resource input, etc.). Check all return values for error conditions.
If the runtime problem cannot be handled gracefully, then the application should consistently apply the Fail-fast principle by reporting the problem in an appropriate way (i.e. with a maximum of information that helps to solve the problem), and then aborting immediately. This is important because:
-
in development mode, it helps in debugging, as the application crashes noisily as soon as something goes wrong
-
in production mode, it leads to a crash of the application which is generally less awful than silently continuing and leading to outcomes that are worse, such as wrong data, wrong decision making, etc.
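A minimal Python sketch of this approach, assuming a hypothetical `parse_age` function that receives raw user input. Invalid input is rejected at once, with a message that says what was expected and what was received, instead of being passed on silently. The allowed range of 0 to 150 is an arbitrary choice for the example.

```python
def parse_age(text: str) -> int:
    """Convert user input to an age, or raise with a descriptive message."""
    try:
        age = int(text)
    except ValueError:
        raise ValueError(f"age must be a whole number, got {text!r}") from None
    # The range is an assumption for this sketch.
    if not 0 <= age <= 150:
        raise ValueError(f"age must be between 0 and 150, got {age}")
    return age
```

If no caller can handle the exception gracefully, it propagates to the top level and stops the program with a traceback, which is the noisy failure described above.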
Here is an example of what can happen with wrong HTML code, because the Fail-fast principle is not applied:
Instead of displaying the following life-saving warning …
... a life-threatening message is displayed:
For an explanation of the above outcome (and for more advice about how to fail fast) please read Introduction to the 'Fail Fast!' Principle in Software Development (hint: it’s just a missing > character in HTML code).
Ultimately, the Fail-fast principle helps to write more reliable and safe code in less time.
Therefore, prefer programming languages, libraries, frameworks, and tools that support the Fail-fast principle.
Repair what you can - but when you must fail, fail noisily and as soon as possible.
Rule of Repair
If a function be advertised to return an error code in the event of difficulties, thou shalt check for that code, yea, even though the checks triple the size of thy code and produce aches in thy typing fingers, for if thou thinkest "it cannot happen to me", the gods shall surely punish thee for thy arrogance.
The Ten Commandments for C Programmers
Do you fix bugs before writing new code?
The Joel Test: 12 Steps to Better Code
Every software component should be small and have a single responsibility.
Small, single-responsibility components have benefits for both the author and the user of the component.
-
Benefits for the author: They are easier to write, test and maintain. They are less error-prone and reduce dependencies (coupling).
-
Benefits for the user: They are easier to understand and use. They can be combined with other components in more flexible ways. It is easier to replace or extend them with another implementation if needed.
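As an illustration, here is a small Python sketch in which each function has exactly one job, and a fourth function only combines them. All function names and the report format are invented for this example.

```python
def parse_numbers(text: str) -> list:
    """Turn a comma-separated string into a list of floats."""
    return [float(part) for part in text.split(",") if part.strip()]


def mean(values: list) -> float:
    """Return the arithmetic mean; an empty list is an error."""
    if not values:
        raise ValueError("cannot compute the mean of no values")
    return sum(values) / len(values)


def format_report(average: float) -> str:
    """Render the result for display."""
    return f"average: {average:.2f}"


def report(text: str) -> str:
    """Combine the three single-purpose parts."""
    return format_report(mean(parse_numbers(text)))
```

Each part can be tested and replaced on its own. For instance, `parse_numbers` could later read another format without touching `mean` or `format_report`.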
The power of the whole is the result of many simple components working seamlessly together.
Write simple parts connected by clean interfaces.
Make each program do one thing well.
I developed a habit of writing very small functions.
FunctionLength
Don’t overdesign!
Resist the temptation to add features that might be used in the future.
Every feature increases complexity as well as the risk for malfunctions. Every feature needs time to be created, tested and maintained.
Moreover it is often difficult to foresee features that will actually be needed later, especially in projects with frequently changing requirements. Adding features that will never be used is pointless.
Remember: Keep it small and simple!
Before thinking "What else can we add?", ask yourself "Is there anything we can remove?"
When in doubt, leave it out. If there is a fundamental theorem of API design, this is it. It applies equally to functionality, classes, methods, and parameters. Every facet of an API should be as small as possible, but no smaller. You can always add things later, but you can’t take them away.
How to design a good API and why it matters
Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.
To attain knowledge, add things every day. To attain wisdom, remove things every day.
YAGNI: You aren’t gonna need it.
Wikipedia
Optimize only if necessary!
Instead of writing 'clever code that runs faster' it is better to write simple, correct and maintainable code that runs fast enough.
The approach should be:
-
Write simple code that does the job
-
If (and only if) there is a performance problem then:
-
Measure performance to reliably find the bottleneck (do not guess!)
-
Optimize the code that needs to run faster
-
Don’t forget: Bad performance is often just a consequence of bad data structures, bad code and/or bad architecture.
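The measuring step can be sketched with Python's standard `timeit` module. The two functions below are hypothetical candidates that compute the same result with different data structures, and `measure` is a small helper written for this example. No timings are shown because they depend on the machine.

```python
import timeit


def count_hits_list(haystack: list, needles: list) -> int:
    """Membership test against a list: scans the list for every needle."""
    return sum(1 for n in needles if n in haystack)


def count_hits_set(haystack: list, needles: list) -> int:
    """Same result, but membership is tested against a set."""
    lookup = set(haystack)
    return sum(1 for n in needles if n in lookup)


def measure(func, *args, repeat: int = 3) -> float:
    """Return the best wall-clock time in seconds of `repeat` single runs."""
    return min(timeit.repeat(lambda: func(*args), number=1, repeat=repeat))
```

Calling `measure(count_hits_list, haystack, needles)` and `measure(count_hits_set, haystack, needles)` on realistic data shows whether the change of data structure is worth making, instead of guessing.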
Premature optimization is the root of all evil.
author of 'The Art of Computer Programming'
Make it run, then make it right, then make it fast.
Creator of extreme programming
Rushing to optimize before the bottlenecks are known may be the only error to have ruined more designs than feature creep. From tortured code to incomprehensible data layouts, the results of obsessing about speed or memory or disk usage at the expense of transparency and simplicity are everywhere. They spawn innumerable bugs and cost millions of man-hours — often, just to get marginal gains in the use of some resource much less expensive than debugging time.
The Art of Unix Programming
Automate recurring tasks!
Doing the same thing again and again is error-prone, counter-productive and boring, and it increases the risk of skipping important tasks such as unit/integration tests, backups, etc.
Use the best tools you can get for frequent tasks, such as writing/editing code, handling files, etc.
Strive for single-click-execution or, even better, scheduled tasks that run automatically.
It is often easy to automate tasks. Create small scripts (written in your preferred language or using OS scripting), use automation tools such as AutoIt or Selenium, or use whichever other tool is best suited to quickly automate your or your users' recurring tasks.
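As a small sketch of such a script, the following Python function creates a dated zip archive of a directory using the standard `shutil` module. The function name `backup_directory` and the naming scheme of the archive are assumptions made for this example. The target directory should lie outside the source directory.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_directory(source: Path, target_dir: Path) -> Path:
    """Create a dated zip archive of source inside target_dir and return its path."""
    if not source.is_dir():
        raise NotADirectoryError(f"not a directory: {source}")
    target_dir.mkdir(parents=True, exist_ok=True)
    # Example archive name: project-2024-01-31.zip
    archive_base = target_dir / f"{source.name}-{date.today().isoformat()}"
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=source))
```

Once such a function exists, it can be started with a single click or registered as a scheduled task of the operating system.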
Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you’ve finished using them.
Inventor of Unix pipes
Rule of Economy: Programmer time is expensive; conserve it in preference to machine time.
Can you make a build in one step?
The Joel Test: 12 Steps to Better Code
Enjoy being in the flow!
Flow (psychology) is described as follows in Wikipedia:
Flow … is the mental state of operation in which a person performing an activity is fully immersed in a feeling of energized focus, full involvement, and enjoyment in the process of the activity. In essence, flow is characterized by complete absorption in what one does, and a resulting loss in one’s sense of space and time.
Being in the flow (also called being in the zone or in hack mode) can lead to a state of blissful feeling that needs to be experienced to be understood. Flow might be the reason a programmer once wrote in a forum: "I want to code, code, code - until I die."
For programmers, flow can best be achieved under the following conditions:
-
There are no distractions, such as incoming phone calls, emails, any other kind of notifications, noise, etc.
-
The programming task at hand is worthwhile and challenging, but not too difficult.
-
You see clearly before starting to code.
What does it mean to 'see clearly'? Close your eyes. Can you see the overall picture, the data structures with their relationships, as well as the functions and their interactions? Doubts must be eliminated first - by using paper and pencil, searching the net, asking for help, writing small snippets of test code, or doing whatever else is appropriate to get rid of the doubts.
Can you see clearly now? Then start coding and fully enjoy it!
Summary
Here is a summary of all pragmatics:
General Guidelines
-
Everything should be as simple as possible, but not simpler! - Albert Einstein
-
If it’s doomed to fail, then 'Fail fast!'
-
Strive for 'good enough', not for 'perfect'. Then release!
-
Listen to the users!
Data Design
-
Before writing code, design your data carefully!
-
Make all data structures immutable, unless there is a good reason to make them mutable!
-
Limit the set of allowed values to the smallest possible one!
-
Most of the time, a default value should be the strictest among the set of allowed values.
-
Avoid data redundancy!
-
Consider storing data in simple text files using universal, standardized formats!
Writing Code
-
Write beautiful code!
-
Use existing software whenever possible!
-
Fail fast and noisily!
-
Every software component should be small and have a single responsibility.
-
Don’t overdesign!
-
Optimize only if necessary!
-
Automate recurring tasks!
-
Enjoy being in the flow!