One of my friends is thinking about starting a programming project, and he asked me “how do I pick my technologies?” with an eye towards performance as well as getting the project done. This person has been writing code on and off for a while, but he hasn’t worked on any substantial projects. He isn’t wedded to any particular technology, and he has the luxury of a clean slate for this project. While he does have a preference for deploying to Linux due to his experience with it, this is not a final decision.
In this column, I share the advice I gave him. I am not going to recommend any particular technologies, but rather show you my approach for making technology and architectural decisions.
Step 1: Loosely design your application
Agile methodologies have become very popular, and some folks believe that means you don’t try to think beyond a two-week timeline. But there is a lot of value in preparing a loose design of your project. All you need is a tool with flowchart capabilities (such as Visio) to sketch a high-level overview of the logic. You do not need to get bogged down in details like “validate that this field contains at least five characters” or data layouts. But you do need a general idea of which parts of the application will handle which responsibilities.
Some examples of things that should appear in this diagram include:
- Any major batch processing tasks.
- Where data is stored (files, databases, “the cloud,” etc.) and which components retrieve it and expose it to the rest of the application.
- Where significant processing occurs (in the database, in a business logic layer, a Web service, the client, and so on) and what it does.
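If you would rather keep the loose design in a text file than in a drawing tool, the same information fits in a few lines of code. The following is only a minimal sketch: the `Component` class, the `describe` and `dangling_links` helpers, and every component name and responsibility are hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One box in the loose design diagram."""
    name: str
    responsibilities: list[str] = field(default_factory=list)
    talks_to: list[str] = field(default_factory=list)  # the arrows leaving this box


# Hypothetical example application; all names below are made up.
design = [
    Component("client", ["render reports"], ["api"]),
    Component("api", ["authenticate users", "serve report data"], ["business_layer"]),
    Component("business_layer", ["nightly batch import", "compute report totals"], ["database"]),
    Component("database", ["store orders and customers"], []),
]


def describe(components):
    """Return one summary line per component: who it talks to and what it does."""
    lines = []
    for c in components:
        targets = ", ".join(c.talks_to) or "nothing"
        lines.append(f"{c.name} -> {targets}: {'; '.join(c.responsibilities)}")
    return lines


def dangling_links(components):
    """Return (component, target) pairs whose target is not defined in the design."""
    names = {c.name for c in components}
    return [(c.name, t) for c in components for t in c.talks_to if t not in names]


if __name__ == "__main__":
    print("\n".join(describe(design)))
    print("Undefined targets:", dangling_links(design))
```

The point is the level of detail, not the tooling: each component gets a name, a short list of responsibilities, and the components it talks to, and nothing finer-grained than that.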
Step 2: Identify resource usage and characteristics
Once your diagram is complete, you can use it to identify resource usage. The resources that you want to be aware of are CPU, RAM, drive space, and bandwidth. Are you transferring a large amount of data to or from an external Web service? That consumes bandwidth on the connection between your application and that service. Perhaps you are doing an intense calculation within your database; that would be CPU use within the database. And so on.
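One way to make this step concrete is to write down a rough estimate for each part of the application and each resource, then list the heavy ones. This is a sketch under stated assumptions: the 0 to 3 scale, the threshold, the `heavy_users` helper, and all of the numbers are hypothetical guesses for an imaginary application, not measurements.

```python
RESOURCES = ("cpu", "ram", "disk", "bandwidth")

# Hypothetical rough estimates: 0 = negligible, 3 = heavy.
usage = {
    "database":       {"cpu": 3, "ram": 2, "disk": 1, "bandwidth": 1},
    "business_layer": {"cpu": 1, "ram": 1, "disk": 0, "bandwidth": 1},
    "third_party":    {"cpu": 0, "ram": 0, "disk": 0, "bandwidth": 3},
}


def heavy_users(usage, threshold=2):
    """Return (component, resource) pairs whose estimate meets the threshold."""
    return [
        (component, resource)
        for component, levels in usage.items()
        for resource in RESOURCES
        if levels.get(resource, 0) >= threshold
    ]


if __name__ == "__main__":
    for component, resource in heavy_users(usage):
        print(f"{component}: heavy {resource} use")
```

Even crude estimates like these are enough to show which parts of the diagram deserve a closer look.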
Step 3: Determine performance critical areas
Once you know where the resources will be used, you have found your potential performance bottlenecks. These are the places where your technology choices will have the biggest impact. If it turns out that you are storing very little in a database, you have more options for the database. If you will be performing CPU-heavy algorithms in the business logic layer, you will need a language and platform that support high-speed calculations. The chart below shows how each kind of bottleneck affects your decision making: find the row for the part of the application and the column for the resource it uses heavily.
Where the bottleneck is | CPU | RAM | Disk space | Bandwidth |
Database | DB must be a high performance system like PostgreSQL or Oracle. | All major databases should be able to work with lots of RAM. | Skip the low-end DBs with size limits, like SQL Server Express. | Locate your application in the same server room as the database. |
Business layer | Language must be fast, and may need excellent multithreading support. | N/A | Locate the application in the same server room as the disks. | Locate your application in the same server room as the database. |
API that you expose to clients | Language must be fast, and may need excellent multithreading support. | N/A | Reconsider your architecture. | Reconsider your strategy. |
Client-side software | Language must be fast, and may need excellent multithreading support. | Carefully consider your target market and their client capabilities. | Carefully consider your target market and their client capabilities. | Ensure that target market has the bandwidth (don’t sell to consumers in rural areas, for example). |
Third-party service | Pick vendor carefully. | N/A | Choose a vendor with low cost storage. | Reconsider your strategy. |
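The chart is really a lookup keyed on two things: the part of the application and the resource under pressure. Here is a minimal sketch of that idea, using only a handful of cells copied from the chart. The key names (`"database"`, `"business_layer"`, and so on), the `advise` helper, and the fallback message are hypothetical conventions chosen for this example.

```python
# A few cells from the chart, keyed by (part of the application, resource).
ADVICE = {
    ("database", "cpu"): "DB must be a high performance system like PostgreSQL or Oracle.",
    ("database", "disk"): "Skip the low-end DBs with size limits, like SQL Server Express.",
    ("business_layer", "cpu"): "Language must be fast, and may need excellent multithreading support.",
    ("api", "bandwidth"): "Reconsider your strategy.",
    ("third_party", "cpu"): "Pick vendor carefully.",
}

FALLBACK = "Not in this excerpt; check the full chart."


def advise(bottlenecks):
    """Map each (part, resource) bottleneck to the matching chart cell."""
    return {b: ADVICE.get(b, FALLBACK) for b in bottlenecks}


if __name__ == "__main__":
    found = [("database", "cpu"), ("api", "bandwidth")]
    for (part, resource), advice in advise(found).items():
        print(f"{part} / {resource}: {advice}")
```

Feeding in the bottlenecks you identified in Step 2 gives you a short list of constraints to carry into your technology search.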
Step 4: Scale your needs
Another thing that you can learn from your diagram is where your application needs to scale. If the bulk of your processing occurs in the client piece of the application, for example, your server architecture can be much more modest.
You will also be able to see what kind of scaling you need. Most databases have clustering capabilities. So if you have a choice, it is often easier and better to push things that need to scale (especially if they require a shared state between requests) into the database, where scaling is already handled. Alternatively, consider technologies higher up in the stack that also have clustering or scaling built in.
Conclusion
By starting your development process with a lightweight sketch of the application’s logic, you will be on the right path to select the best technologies for your needs. There are lots of non-technical considerations (such as your budget, experience in particular technologies, and so on), but you need to start somewhere, and this decision-making process will help you narrow down your choices and highlight any problem areas before they come up.
I’d love to hear from you in the comments section below to get your experiences with these kinds of issues.